litmus-rt.git - The LITMUS^RT kernel.

	Commit message	Author	Age
*	firewire: fw-ohci: plug dma memory leak in AR handler	Jarod Wilson	2008-03-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's an ugly little memory leak in firewire-ohci's ar_context_tasklet(), where we're not freeing up some of the memory we use for each ar_buffer, due to a moving pointer. The problem has been there for a while, but didn't get noticed until after converting the AR routines over to use coherent DMA and I started running into I/O stall-outs with the following message output repeatedly to the console: PCI-DMA: Out of IOMMU space for 53248 bytes at device 0000:04:09.0 Plugging this leak is definitely necessary, but unfortunately, isn't the entire answer to my problem, it only increases the amount of I/O that I can do before hitting the problem. Still working on tracking down the root cause.. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fix panic in handle_at_packet	Stefan Richter	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a use-after-free bug in the handling of split transactions. The AT DMA handler of the request was occasionally executed after the AR DMA handler of the response. The AT DMA handler then accessed an already freed packet. Reported by Johannes Berg. http://bugzilla.kernel.org/show_bug.cgi?id=9617 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Tested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-ohci: shut up false compiler warning on PPC32	Stefan Richter	2008-03-13
\| \| \| \| \| \| \|	Shut up "may be used uninitialised in this function" warnings due to PPC32's implementation of dma_alloc_coherent(). Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-ohci: use dma_alloc_coherent for ar_buffer	Jarod Wilson	2008-03-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we do nothing to guarantee we have a consistent DMA buffer for asynchronous receive packets. Rather than doing several sync's following a dma_map_single() to get consistent buffers, just switch to using dma_alloc_coherent(). Resolves constant buffer failures on my own x86_64 laptop w/4GB of RAM and likely to fix a number of other failures witnessed on x86_64 systems with 4GB of RAM or more. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: fix for SYM13FW500 bridge (Datafab disk)	Stefan Richter	2008-03-13
\| \| \| \| \| \| \| \| \|	Fix I/O errors due to SYM13FW500's inability to handle larger request sizes. Reported by Piergiorgio Sartor <piergiorgio.sartor@nexgo.de> in https://bugzilla.redhat.com/show_bug.cgi?id=436879 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: update Kconfig help text	Stefan Richter	2008-03-13
\| \| \| \| \| \| \|	Remove some less necessary information, point out that video1394 and dv1394 should be blacklisted along with ohci1394. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: warn on fatal condition in topology code	Stefan Richter	2008-03-13
\| \| \| \| \| \|	If this ever happens to anybody, we want to have it in his log. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: set single-phase retry_limit	Jarod Wilson	2008-03-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Per the SBP-2 specification, all SBP-2 target devices must have a BUSY_TIMEOUT register. Per the 1394-1995 specification, the retry_limit portion of the register should be set to 0x0 initially, and set on the target by a logged in initiator (i.e., a Linux host w/firewire controller(s)). Well, as it turns out, lots of devices these days have actually moved on to starting to implement SBP-3 compliance, which says that retry_limit should default to 0xf instead (yes, SBP-3 stomps directly on 1394-1995, oops). Prior to this change, the firewire driver stack didn't touch retry_limit, and any SBP-3 compliant device worked fine, while SBP-2 compliant ones were unable to retransmit when the host returned an ack_busy_X, which resulted in stalled out I/O, eventually causing the SCSI layer to give up and offline the device. The simple fix is for us to set retry_limit to 0xf in the register for all devices (which actually matches what the old ieee1394 stack did). Prior to this change, a hard disk behind an SBP-2 Prolific PL-3507 bridge chip would routinely encounter buffer I/O errors and wind up offlined by the SCSI layer. With this change, I've encountered zero I/O failures moving tens of GB of data around. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
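The register update described in this commit message can be sketched as below. This is only an illustration: it assumes that retry_limit occupies the low four bits of the 32-bit BUSY_TIMEOUT value, and the macro and function names are hypothetical, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: retry_limit is the low 4 bits of BUSY_TIMEOUT. */
#define RETRY_LIMIT_MASK 0xfu

/* Return reg with its retry_limit field replaced by limit (hypothetical helper). */
static uint32_t busy_timeout_set_retry_limit(uint32_t reg, uint32_t limit)
{
	assert(limit <= RETRY_LIMIT_MASK);
	return (reg & ~RETRY_LIMIT_MASK) | limit;
}
```

With limit set to 0xf, the other bits of the register value are left untouched, which is the behaviour the message asks for on all devices.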
*	firewire: fw-ohci: Apple UniNorth 1st generation support	Stefan Richter	2008-03-13
\| \| \| \| \| \| \|	Mostly copied from ohci1394.c. Necessary for some older Macs, e.g. PowerBook G3 Pismo and early PowerBook G4 Titanium. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-ohci: PPC PMac platform code	Stefan Richter	2008-03-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Copied from ohci1394.c. This code is necessary to prevent machine check exceptions when reloading or resuming the driver. Tested on a 1st generation PowerBook G4 Titanium, which also needs the pci_probe() hunk. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> I was able to reproduce the system exception on resume with a 3rd-gen Titanium PowerBook G4 667, and this patch does let the system resume successfully now. Not quite clear if there was possibly an updated version coming using pci_enable_device() instead of the pair of pmac_call_feature() calls, but either way, this is a definite must-have, at least for older ppc macs -- my Aluminum PowerBook G4/1.67 suspends and resumes without this patch just fine. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	firewire: endianess annotations	Stefan Richter	2008-03-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Kills warnings from 'make C=1 CHECKFLAGS="-D__CHECK_ENDIAN__" modules': drivers/firewire/fw-transaction.c:771:10: warning: incorrect type in assignment (different base types) drivers/firewire/fw-transaction.c:771:10: expected unsigned int [unsigned] [usertype] <noident> drivers/firewire/fw-transaction.c:771:10: got restricted unsigned int [usertype] <noident> drivers/firewire/fw-transaction.h:93:10: warning: incorrect type in assignment (different base types) drivers/firewire/fw-transaction.h:93:10: expected unsigned int [unsigned] [usertype] <noident> drivers/firewire/fw-transaction.h:93:10: got restricted unsigned int [usertype] <noident> drivers/firewire/fw-ohci.c:1490:8: warning: restricted degrades to integer drivers/firewire/fw-ohci.c:1490:35: warning: restricted degrades to integer drivers/firewire/fw-ohci.c:1516:5: warning: cast to restricted type Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: endianess fix	Stefan Richter	2008-03-13
\| \| \| \| \| \| \| \|	The generation of incoming requests was filled in in wrong byte order on machines with big endian CPU. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fix crash in automatic module unloading	Stefan Richter	2008-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"modprobe firewire-ohci; sleep .1; modprobe -r firewire-ohci" used to result in crashes like this: BUG: unable to handle kernel paging request at ffffffff8807b455 IP: [<ffffffff8807b455>] PGD 203067 PUD 207063 PMD 7c170067 PTE 0 Oops: 0010 [1] PREEMPT SMP CPU 0 Modules linked in: i915 drm cpufreq_ondemand acpi_cpufreq freq_table applesmc input_polldev led_class coretemp hwmon eeprom snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss button thermal processor sg snd_hda_intel snd_pcm snd_timer snd snd_page_alloc sky2 i2c_i801 rtc [last unloaded: crc_itu_t] Pid: 9, comm: events/0 Not tainted 2.6.25-rc2 #3 RIP: 0010:[<ffffffff8807b455>] [<ffffffff8807b455>] RSP: 0018:ffff81007dcdde88 EFLAGS: 00010246 RAX: ffff81007dc95040 RBX: ffff81007dee5390 RCX: 0000000000005e13 RDX: 0000000000008c8b RSI: 0000000000000001 RDI: ffff81007dee5388 RBP: ffff81007dc5eb40 R08: 0000000000000002 R09: ffffffff8022d05c R10: ffffffff8023b34c R11: ffffffff8041a353 R12: ffff81007dee5388 R13: ffffffff8807b455 R14: ffffffff80593bc0 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffffffff8055a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: ffffffff8807b455 CR3: 0000000000201000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process events/0 (pid: 9, threadinfo ffff81007dcdc000, task ffff81007dc95040) Stack: ffffffff8023b396 ffffffff88082524 0000000000000000 ffffffff8807d9ae ffff81007dc5eb40 ffff81007dc9dce0 ffff81007dc5eb40 ffff81007dc5eb80 ffff81007dc9dce0 ffffffffffffffff ffffffff8023be87 0000000000000000 Call Trace: [<ffffffff8023b396>] ? run_workqueue+0xdf/0x1df [<ffffffff8023be87>] ? worker_thread+0xd8/0xe3 [<ffffffff8023e917>] ? autoremove_wake_function+0x0/0x2e [<ffffffff8023bdaf>] ? worker_thread+0x0/0xe3 [<ffffffff8023e813>] ? kthread+0x47/0x74 [<ffffffff804198e0>] ? trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8020c008>] ? child_rip+0xa/0x12 [<ffffffff8020b6e3>] ? restore_args+0x0/0x3d [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171 [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171 [<ffffffff8023e7cc>] ? kthread+0x0/0x74 [<ffffffff8020bffe>] ? child_rip+0x0/0x12 Code: Bad RIP value. RIP [<ffffffff8807b455>] RSP <ffff81007dcdde88> CR2: ffffffff8807b455 ---[ end trace c7366c6657fe5bed ]--- Note that this crash happened _after_ firewire-core was unloaded. The shared workqueue tried to run firewire-core's device initialization jobs or similar jobs. The fix makes sure that firewire-ohci and hence firewire-core is not unloaded before all device shutdown jobs have been completed. This is determined by the count of device initializations minus device releases. Also skip useless retries in the node initialization job if the node is to be shut down. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
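The "initializations minus releases" rule in this commit message can be illustrated with a small user-space sketch. It uses C11 atomics as a stand-in for whatever the kernel code uses, and all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Device initializations minus device releases (hypothetical counter). */
static atomic_int live_devices;

/* Called when a device initialization job has been started. */
static void device_initialized(void)
{
	atomic_fetch_add(&live_devices, 1);
}

/* Called when a device has been fully released. */
static void device_released(void)
{
	int prev = atomic_fetch_sub(&live_devices, 1);

	assert(prev > 0); /* every release must match an earlier initialization */
	(void)prev;
}

/* Unloading is only safe once every initialized device has been released. */
static bool safe_to_unload(void)
{
	return atomic_load(&live_devices) == 0;
}
```

An unload path built on this idea would wait until safe_to_unload() returns true before letting the module text go away, so that no queued job can run into unmapped code.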
*	firewire: potentially invalid pointers used in fw_card_bm_work	Stefan Richter	2008-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \|	The bus management workqueue job was in danger to dereference NULL pointers. Also, after having temporarily lifted card->lock, a few node pointers and a device pointer may have become invalid. Add NULL pointer checks and get the necessary references. Also, move card->local_node out of fw_card_bm_work's sight during shutdown of the card. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device	Stefan Richter	2008-03-02
\| \| \| \| \| \| \| \| \| \| \| \| \|	Patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device" had the unintended effect that firewire-sbp2 could not be unloaded anymore until all SBP-2 devices were unplugged. We now fix the NULL pointer bug by reacquiring a reference to the sdev instead of holding a reference to the sdev (and to the module) all the time. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Tested-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fix NULL pointer deref. and resource leak	Stefan Richter	2008-02-21
\| \| \| \| \| \| \| \| \| \| \|	By supplying ioctl()s in the wrong order, a userspace client was able to trigger NULL pointer dereferences. Furthermore, by calling ioctl_create_iso_context more than once, new contexts could be created without ever freeing the previously created contexts. Thanks to Anders Blomdell for the report. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device	Stefan Richter	2008-02-19
\| \| \| \| \| \| \|	Fix a kernel bug when unplugging an SBP-2 device after having its scsi_device already removed via the "delete" sysfs attribute. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: fix NULL pointer deref. in slave_alloc	Stefan Richter	2008-02-19
\| \| \| \| \| \| \|	Fix a kernel bug when running rescan-scsi-bus while a FireWire disk is connected: http://bugzilla.kernel.org/show_bug.cgi?id=10008 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: (try to) avoid I/O errors during reconnect	Stefan Richter	2008-02-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While fw-sbp2 takes the necessary time to reconnect to a logical unit after bus reset, the SCSI core keeps sending new commands. They are all immediately completed with host busy status, and application clients or filesystems will break quickly. The SCSI device might even be taken offline: http://bugzilla.kernel.org/show_bug.cgi?id=9734 The only remedy seems to be to block the SCSI device until reconnect. Alas the SCSI core has no useful API to block only one logical unit i.e. the scsi_device, therefore we block the entire Scsi_Host. This currently corresponds to an SBP-2 target. In case of targets with multiple logical units, we need to satisfy the dependencies between logical units by carefully tracking the blocking state of the target and its units. We block all logical units of a target as soon as one of them needs to be blocked, and keep them blocked until all of them are ready to be unblocked. Furthermore, as the history of the old sbp2 driver has shown, the scsi_block_requests() API is a minefield with high potential of deadlocks. We therefore take extra measures to keep logical units unblocked during __scsi_add_device() and during shutdown. This avoids I/O errors during reconnect in many but alas not in all cases. There may still be errors after a re-login had to be performed. Also, some bridges have been seen to cease fetching management ORBs if I/O went on up until a bus reset. In these cases, all management ORBs time out after mgt_orb_timeout. The old sbp2 driver is less vulnerable or maybe not vulnerable to this, for as yet unknown reasons. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: enforce a retry of __scsi_add_device if bus generation changed	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fw-sbp2 is unable to reconnect while performing __scsi_add_device because there is only a single workqueue thread context available for both at the moment. This should be fixed eventually. An actual failure of __scsi_add_device is easy to handle, but an incomplete execution of __scsi_add_device with an sdev returned would remain undetected and leave the SBP-2 target unusable. Therefore we use a workaround: If there was a bus reset during __scsi_add_device (i.e. during the SCSI probe), we remove the new sdev immediately, log out, and attempt login and SCSI probe again. Tested-by: Jarod Wilson <jwilson@redhat.com> (earlier version) Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: sort includes	Stefan Richter	2008-02-16
\| \| \| \|	Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: logout and login after failed reconnect	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If fw-sbp2 was too late with requesting the reconnect, the target would reject this. In this case, log out before attempting the reconnect. Else several firmwares will deny the re-login because they somehow didn't invalidate the old login. Also, don't retry reconnects in this situation. The retries won't succeed either. These changes improve chances for successful re-login and shorten the period during which the logical unit is inaccessible. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-sbp2: don't add scsi_device twice	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a reconnect failed but re-login succeeded, __scsi_add_device was called again. In those cases, __scsi_add_device succeeded and returned the pointer to the existing scsi_device. fw-sbp2 then continued orderly, except that it missed to call sbp2_cancel_orbs. SCSI core would call fw-sbp2's eh_abort_handler eventually if there had been an outstanding command. This patch avoids the needless lookups and temporary allocations in SCSI core and I/O stall and timeout until eh_abort_handler hits. Also, __scsi_add_device tolerating calls for devices which already exist is undocumented behavior on which we shouldn't rely. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-sbp2: log bus_id at management request failures	Stefan Richter	2008-02-16
\| \| \| \| \| \| \|	for easier readable logs if more than one SBP-2 device is present. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-sbp2: wait for completion of fetch agent reset	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \|	Like the old sbp2 driver, wait for the write transaction to the AGENT_RESET to complete before proceeding (after login, after reconnect, or in SCSI error handling). There is one occasion where AGENT_RESET is written to from atomic context when getting DEAD status for a command ORB. There we still continue without waiting for the transaction to complete because this is more difficult to fix... Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: add INQUIRY delay workaround	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Several different SBP-2 bridges accept a login early while the IDE device is still powering up. They are therefore unable to respond to SCSI INQUIRY immediately, and the SCSI core has to retry the INQUIRY. One of these retries is typically successful, and all is well. But in case of Momobay FX-3A, the INQUIRY retries tend to fail entirely. This can usually be avoided by waiting a little while after login before letting the SCSI core send the INQUIRY. The old sbp2 driver handles this more gracefully for as yet unknown reasons (perhaps because it waits for fetch agent resets to complete, unlike fw-sbp2 which quickly proceeds after requesting the agent reset). Therefore the workaround is not as much necessary for sbp2. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: log GUID of new devices	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \|	This should help to interpret user reports. E.g. one can look up the vendor OUI (first three bytes of the GUID) and thus tell what is what. Also simplifies the math in the GUID sysfs attribute. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
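As a small illustration of the OUI lookup mentioned in this message, the sketch below extracts the first three bytes of a 64-bit GUID and formats the GUID for logging. The function names are hypothetical and the GUID used with them is an arbitrary example value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* The vendor OUI is the first three bytes (top 24 bits) of the 64-bit GUID. */
static uint32_t guid_to_oui(uint64_t guid)
{
	return (uint32_t)(guid >> 40);
}

/* Format a GUID as 16 hex digits; buf must hold at least 17 bytes. */
static void guid_to_string(uint64_t guid, char *buf, size_t len)
{
	assert(buf != NULL && len >= 17);
	snprintf(buf, len, "%016llx", (unsigned long long)guid);
}
```

For the example GUID 0x0001f2000000abcd, guid_to_oui() yields 0x0001f2, which is the value one would look up in a vendor OUI list.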
*	firewire: fw-sbp2: don't retry login or reconnect after unplug	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a device is being unplugged while fw-sbp2 had a login or reconnect on schedule, it would take about half a minute to shut the fw_unit down: Jan 27 18:34:54 stein firewire_sbp2: logged in to fw2.0 LUN 0000 (0 retries) <unplug> Jan 27 18:34:59 stein firewire_sbp2: sbp2_scsi_abort Jan 27 18:34:59 stein scsi 25:0:0:0: Device offlined - not ready after error recovery Jan 27 18:35:01 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:06 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:12 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:17 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:22 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:27 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:32 stein firewire_sbp2: orb reply timed out, rcode=0x11 Jan 27 18:35:32 stein firewire_sbp2: failed to login to fw2.0 LUN 0000 Jan 27 18:35:32 stein firewire_sbp2: released fw2.0 After this patch, typically only a few seconds spent in __scsi_add_device remain: Jan 27 19:05:50 stein firewire_sbp2: logged in to fw2.0 LUN 0000 (0 retries) <unplug> Jan 27 19:05:56 stein firewire_sbp2: sbp2_scsi_abort Jan 27 19:05:56 stein scsi 33:0:0:0: Device offlined - not ready after error recovery Jan 27 19:05:56 stein firewire_sbp2: released fw2.0 The benefit of this is less noise in the syslog. It furthermore avoids a few wasted CPU cycles and needlessly prolonged lifetime of a few driver objects. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fix "kobject_add failed for fw* with -EEXIST"	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a race between shutdown and creation of devices: fw-core may attempt to add a device with the same name of an already existing device. http://bugzilla.kernel.org/show_bug.cgi?id=9828 Impact of the bug: Happens rarely (when shutdown of a device coincides with creation of another), forces the user to unplug and replug the new device to get it working. The fix is obvious: Free the minor number after instead of before device_unregister(). This requires to take an additional reference of the fw_device as long as the IDR tree points to it. And while we are at it, we fix an additional race condition: fw_device_op_open() took its reference of the fw_device a little bit too late, hence was in danger to access an already invalid fw_device. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: fix logout before login retry	Stefan Richter	2008-02-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a "can't recognize device" kind of bug. If the SCSI INQUIRY failed and hence __scsi_add_device failed due to a bus reset, we tried a logout and then waited for the already scheduled login work to happen. So far so good, but the generation used for the logout was outdated, hence the logout never reached the target. The target might therefore deny the subsequent relogin attempt, which would also leave the target inaccessible. Therefore fetch a fresh device->generation for the logout. Use memory barriers to prevent our plan being foiled by compiler or hardware optimizations. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: unsigned int vs. unsigned	Stefan Richter	2008-02-16
\| \| \| \| \| \| \|	Standardize on "unsigned int" style. Sort some struct members thematically. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: Use sbp2 device-provided mgt orb timeout for logins	Jarod Wilson	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \|	To be more compliant with section 7.4.8 of the SBP-2 specification, use the mgt_ORB_timeout specified in the SBP-2 device's config rom for login ORB attempts (though with some sanity checks). A happy side-effect is that certain device and controller combinations that sometimes take more than 20 seconds to get synced up (like my laptop with just about any SBP-2 device) now function more reliably. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (silenced sparse)
*	firewire: fw-sbp2: increase login orb reply timeout, fix "failed to login"	Jarod Wilson	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \|	Increase (and rename) the login orb reply timeout value to 20s to match that of the old firewire stack. 2s simply didn't give many devices enough time to spin up and reply. Fixes inability to recognize some devices. Failure mode was "orb reply timed out"/"failed to login". Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (style, comments, changelog)
*	firewire: replace subtraction with bitwise and	Jarod Wilson	2008-01-30
\| \| \| \| \| \| \| \| \|	Replace an unnecessary subtraction with a bitwise AND when determining the value of ext_tcode in fw_fill_transaction() to save a cpu cycle or two in a somewhat critical path. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
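The equivalence this change relies on can be checked with a short sketch. It assumes, as an illustration not spelled out in the message, that the affected transaction codes are encoded as 0x10 plus a 4-bit ext_tcode; the macro and function names are hypothetical.

```c
#include <assert.h>

/* Assumption for this sketch: affected tcodes are 0x10 + ext_tcode (0x1..0xf). */
#define EXT_TCODE_BASE 0x10

/* Old style: recover ext_tcode by subtracting the base. */
static int ext_tcode_by_subtraction(int tcode)
{
	assert(tcode > EXT_TCODE_BASE && tcode <= EXT_TCODE_BASE + 0xf);
	return tcode - EXT_TCODE_BASE;
}

/* New style: clear the base bit with a bitwise AND. */
static int ext_tcode_by_mask(int tcode)
{
	assert(tcode > EXT_TCODE_BASE && tcode <= EXT_TCODE_BASE + 0xf);
	return tcode & ~EXT_TCODE_BASE;
}
```

Both functions agree for every value in the assumed range because bit 4 is always set there, so subtracting 0x10 and clearing that bit are the same operation.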
*	firewire: fw-core: react on bus resets while the config ROM is being fetched	Stefan Richter	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	read_rom() obtained a fresh new fw_device.generation for each read transaction. Hence it was able to continue reading in the middle of the ROM even if a bus reset happened. However the device may have modified the ROM during the reset. We would end up with a corrupt fetched ROM image then. Although all of this is quite unlikely, it is not impossible. Therefore we now restart reading the ROM if the bus generation changed. Note, the memory barrier in read_rom() is still necessary according to tests by Jarod Wilson, despite the ->generation access being moved up in the call chain. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> This is essentially what I've been beating on locally, and I've yet to hit another config rom read failure with it. Signed-off-by: Jarod Wilson <jwilson@redhat.com>
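The restart-on-reset idea from this message can be sketched in user space as follows. The rom_source interface, the fake bus, and all function names are hypothetical and exist only to make the example self-contained; the real driver reads the ROM through bus transactions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bus interface used only by this sketch. */
struct rom_source {
	int (*generation)(void *ctx);                              /* current bus generation */
	int (*read_quadlet)(void *ctx, size_t idx, uint32_t *out); /* 0 on success */
	void *ctx;
};

/*
 * Read count quadlets into rom. If the generation changes while reading,
 * discard the partial image and start over from index 0.
 * Returns 0 on success, -1 on read failure or too many restarts.
 */
static int read_rom_consistent(const struct rom_source *src, uint32_t *rom,
			       size_t count, int max_restarts)
{
	assert(src != NULL && rom != NULL);
	for (int attempt = 0; attempt <= max_restarts; attempt++) {
		int gen = src->generation(src->ctx);
		size_t i;

		for (i = 0; i < count; i++) {
			if (src->read_quadlet(src->ctx, i, &rom[i]) != 0)
				return -1;
			if (src->generation(src->ctx) != gen)
				break; /* bus reset: the image may be inconsistent */
		}
		if (i == count)
			return 0;
	}
	return -1;
}

/* Fake bus: bumps the generation during one chosen read to simulate a reset. */
struct fake_bus {
	int gen;
	int reads;
	int reset_at_read;
};

static int fake_generation(void *ctx)
{
	return ((struct fake_bus *)ctx)->gen;
}

static int fake_read(void *ctx, size_t idx, uint32_t *out)
{
	struct fake_bus *b = ctx;

	if (++b->reads == b->reset_at_read)
		b->gen++; /* ROM contents change along with the generation */
	*out = (uint32_t)(b->gen * 100 + (int)idx);
	return 0;
}
```

With the fake bus configured to reset during the second read, the first pass is abandoned and the returned image comes entirely from the new generation.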
*	firewire: enforce access order between generation and node ID, fix "giving up on config rom"	Stefan Richter	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fw_device.node_id and fw_device.generation are accessed without mutexes. We have to ensure that all readers will get to see node_id updates before generation updates. Fixes an inability to recognize devices after "giving up on config rom", https://bugzilla.redhat.com/show_bug.cgi?id=429950 Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Reviewed by Nick Piggin <nickpiggin@yahoo.com.au>. Verified to fix 'giving up on config rom' issues on multiple system and drive combinations that were previously affected. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Kristian Høgsberg <krh@redhat.com>
*	firewire: fw-cdev: use device generation, not card generation	Stefan Richter	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have to use the fw_device.generation here, not the fw_card.generation, because the generation must never be newer than the node ID when we emit a transaction. This cannot be guaranteed with fw_card.generation. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Verified in concert with subsequent memory barriers patch to fix 'giving up on config rom' issues on multiple system and drive combinations that were previously affected. Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-sbp2: use device generation, not card generation	Stefan Richter	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was a small window where a login or reconnect job could use an already updated card generation with an outdated node ID. We have to use the fw_device.generation here, not the fw_card.generation, because the generation must never be newer than the node ID when we emit a transaction. This cannot be guaranteed with fw_card.generation. Furthermore, the target's and initiator's node IDs can be obtained from fw_device and fw_card. Dereferencing their underlying topology objects is not necessary. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Verified in concert with subsequent memory barriers patch to fix 'giving up on config rom' issues on multiple system and drive combinations that were previously affected. Signed-off-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-sbp2: try to increase reconnect_hold (speed up reconnection)	Stefan Richter	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ask the target to grant 4 seconds instead of the standard and minimum of 1 second window after bus reset for reconnection. This accelerates reconnection if there is more than one target on the bus: If a login and inquiry to one target blocks the fw-sbp2 workqueue for more than 1s after bus reset, we now still can reconnect to the other target. Before that, fw-sbp2's reconnect attempts would be rejected with "error status: 0:9" (function rejected), and fw-sbp2 would finally re-login. All those futile reconnect attempts cost extra time until the target which needs re-login is ready for I/O again. The reconnect timeout field in the login ORB doesn't have to be honored by the target though. I found that we could get up to - allegedly 32768s from an old OXFW911 firmware - 256s from LSI bridges - 4s from OXUF922 and OXFW912 bridges, - 2s from TI bridges, - only the standard 1s from Initio and Prolific bridges and from Apple OpenFirmware in target mode. We just try to get 4 seconds which already covers the case of a few HDDs on the same bus quite nicely. A minor drawback occurs in the following (rare and impractical) border case: - two initiators are there, initiator 1 holds an exclusive login to a target, - initiator 1 goes off the bus, - target refuses login attempts from initiator 2 until reconnect_hold seconds after bus reset. An alternative approach to the issue at hand would be to parallelize fw-sbp2's reconnect and login work. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-sbp2: skip unnecessary logout	Stefan Richter	2008-01-30
\| \| \| \| \| \| \| \| \| \|	Don't attempt to send a logout ORB if the target was already unplugged or had its link switched off. If two targets are attached, this enhances the chance to quickly reconnect to the remaining target when one target is plugged out. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Jarod Wilson <jwilson@redhat.com>
*	firewire: fw-ohci: Dynamically allocate buffers for DMA descriptors	David Moore	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, the fw-ohci driver used fixed-length buffers for storing descriptors for isochronous receive DMA programs. If an application (such as libdc1394) generated a DMA program that was too large, fw-ohci would reach the limit of its fixed-sized buffer and return an error to userspace. This patch replaces the fixed-length ring-buffer with a linked-list of page-sized buffers. Additional buffers can be dynamically allocated and appended to the list when necessary. For a particular context, buffers are kept around after use and reused as necessary, so there is no allocation taking place after the DMA program is generated for the first time. In addition, the buffers it uses are coherent for DMA so there is no syncing required before and after writes. This syncing wasn't properly done in the previous version of the code. - This is the fourth version of my patch that replaces a fixed-length buffer for DMA descriptors with a dynamically allocated linked-list of buffers. As we discovered with the last attempt, new context programs are sometimes queued from interrupt context, making it unacceptable to call tasklet_disable() from context_get_descriptors(). This version of the patch uses ohci->lock for all locking needs instead of tasklet_disable/enable. There is a new requirement that context_get_descriptors() be called while holding ohci->lock. It was already held for the AT context, so adding the requirement for the iso context did not seem particularly onerous. In addition, this has the side benefit of allowing iso queue to be safely called from concurrent user-space threads, which previously was not safe. 
Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Jarod Wilson <jwilson@redhat.com> - Fixes the following issues: - Isochronous reception stopped prematurely if an application used a larger buffer. (Reproduced with coriander.) - Isochronous reception stopped after one or a few frames on VT630x in OHCI 1.0 mode. (Fixes reception in coriander, but dvgrab still doesn't work with these chips.) Patch update: struct member alignment, whitespace nits Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
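The buffer scheme described in the commit above (a linked list of page-sized buffers that grows on demand and is reused afterwards) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `desc_buffer`, `desc_ctx`, `ctx_get_descriptors()` and `ctx_rewind()` are hypothetical names, `calloc()` stands in for a coherent DMA allocation, descriptors are assumed to be 16 bytes, and the caller is assumed to serialize access (the commit requires holding ohci->lock).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DESC_SIZE 16u    /* assumed size of one descriptor in bytes */
#define BUF_DATA  4096u  /* assumed usable bytes per buffer (one page) */

/* One page-sized buffer in the list (hypothetical layout). */
struct desc_buffer {
    struct desc_buffer *next;
    size_t used;                   /* bytes already handed out */
    unsigned char data[BUF_DATA];
};

/* Per-context state: the list head and the buffer currently being filled. */
struct desc_ctx {
    struct desc_buffer *head;
    struct desc_buffer *cur;
    size_t nbuffers;               /* buffers allocated so far */
};

static struct desc_buffer *new_buffer(struct desc_ctx *ctx)
{
    struct desc_buffer *b = calloc(1, sizeof(*b));
    if (b)
        ctx->nbuffers++;
    return b;
}

/* Return room for `count` contiguous descriptors, or NULL on allocation failure.
 * A buffer already present in the list is reused before a new one is allocated. */
static void *ctx_get_descriptors(struct desc_ctx *ctx, size_t count)
{
    size_t bytes = count * DESC_SIZE;

    assert(bytes > 0 && bytes <= BUF_DATA);

    if (!ctx->cur) {
        if (!ctx->head && !(ctx->head = new_buffer(ctx)))
            return NULL;
        ctx->cur = ctx->head;
    }
    if (bytes > BUF_DATA - ctx->cur->used) {
        if (!ctx->cur->next && !(ctx->cur->next = new_buffer(ctx)))
            return NULL;
        ctx->cur = ctx->cur->next;
        ctx->cur->used = 0;
    }
    void *d = ctx->cur->data + ctx->cur->used;
    ctx->cur->used += bytes;
    return d;
}

/* Start filling from the first buffer again; nothing is freed, so a second
 * program of the same size needs no further allocation. */
static void ctx_rewind(struct desc_ctx *ctx)
{
    if (ctx->head)
        ctx->head->used = 0;
    ctx->cur = ctx->head;
}
```

In this sketch, queuing 600 single descriptors allocates three buffers; after `ctx_rewind()` the same 600 requests are served from the existing list. The rewind step is a simplification of how a real driver would recycle buffers as the hardware finishes with them.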
*	firewire: fw-ohci: CycleTooLong interrupt management	Stefan Richter	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The firewire-ohci driver so far lacked the ability to resume cycle master duty after that condition happened, as added to ohci1394 in Linux 2.6.18 by commit 57fdb58fa5a140bdd52cf4c4ffc30df73676f0a5. This ports this patch to fw-ohci. The "cycle too long" condition has been seen in practice - with IIDC cameras if a mode with packets too large for a speed is chosen, - sporadically when capturing DV on a VIA VT6306 card with ohci1394/ ieee1394/ raw1394/ dvgrab 2. https://bugzilla.redhat.com/show_bug.cgi?id=415841#c7 (This does not fix Fedora bug 415841.) Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: Fix extraction of source node id	Rabin Vincent	2008-01-30
\| \| \| \| \| \| \|	Fix extraction of the source node id from the packet header. Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-ohci: Bug fixes for packet-per-buffer support	David Moore	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch corrects a number of bugs in the current OHCI 1.0 packet-per-buffer support: 1. Correctly deal with payloads that cross a page boundary. The previous version would not split the descriptor at such a boundary, potentially corrupting unrelated memory. 2. Allow user-space to specify multiple packets per struct fw_cdev_iso_packet in the same way that dual-buffer allows. This is signaled by header_length being a multiple of header_size. This multiple determines the number of packets. The payload size allocated per packet is determined by dividing the total payload size by the number of packets. 3. Make sync support work properly for packet-per-buffer. I have tested this patch with libdc1394 by forcing my OHCI 1.1 controller to use the packet-per-buffer support instead of dual-buffer. I would greatly appreciate testing by those who have DV devices and other types of iso streamers to make sure I didn't cause any regressions. Stefan, with this patch, I'm hoping that libdc1394 will work with all your OHCI 1.0 controllers now. The one bit of future work that remains for packet-per-buffer support is the automatic compaction of short payloads that I discussed with Kristian. Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
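The arithmetic in points 1 and 2 of the commit above can be illustrated with a small standalone sketch. The helper names (`packet_count()`, `payload_per_packet()`, `split_at_pages()`) and the 4096-byte page size are assumptions made for this example and are not taken from the driver; only the rules themselves (packet count is header_length divided by header_size, payload is divided evenly, no chunk crosses a page boundary) come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u  /* assumed page size for the sketch */

/* One piece of a payload that lies entirely within a single page. */
struct chunk {
    size_t offset;  /* byte offset into the user buffer */
    size_t length;
};

/* header_length must be a multiple of header_size; the multiple is the
 * number of packets described by one queue request. */
static size_t packet_count(size_t header_length, size_t header_size)
{
    assert(header_size > 0 && header_length % header_size == 0);
    return header_length / header_size;
}

/* The total payload is divided evenly among the packets. */
static size_t payload_per_packet(size_t payload_length, size_t packets)
{
    assert(packets > 0);
    return payload_length / packets;
}

/* Split [offset, offset + length) so that no chunk crosses a page boundary.
 * Returns the number of chunks written, or -1 if `max` slots are too few. */
static int split_at_pages(size_t offset, size_t length,
                          struct chunk *out, size_t max)
{
    size_t n = 0;

    while (length > 0) {
        size_t room = PAGE_SZ - (offset % PAGE_SZ);
        size_t len = length < room ? length : room;

        if (n == max)
            return -1;
        out[n].offset = offset;
        out[n].length = len;
        n++;
        offset += len;
        length -= len;
    }
    return (int)n;
}
```

For instance, a 16-byte header area with 4-byte headers describes four packets; with 8192 bytes of total payload each packet gets 2048 bytes, and a 2048-byte payload starting at offset 4000 is split into one chunk of 96 bytes and one of 1952 bytes.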
*	firewire: fw-ohci: Fix for dualbuffer three-or-more buffers	David Moore	2008-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes the problem where different OHCI 1.1 controllers behave differently when a received iso packet straddles three or more buffers when using the dual-buffer receive mode. Two changes are made in order to handle this situation: 1. The packet sync DMA descriptor is given a non-zero header length and non-zero payload length. This is because zero-payload descriptors are not discussed in the OHCI 1.1 specs and their behavior is thus undefined. Instead we use a header size just large enough for a single header and a payload length of 4 bytes for this first descriptor. 2. As we process received packets in the context's tasklet, read the packet length out of the headers. Keep track of the running total of the packet length as "excess_bytes", so we can ignore any descriptors where no packet starts or ends. These descriptors may not have had their first_res_count or second_res_count fields updated by the controller so we cannot rely on those values. The main drawback of this patch is that the excess_bytes value might get "out of sync" with the packet descriptors if something strange happens to the DMA program. I'm not sure if such a thing could ever happen, but I appreciate any suggestions in making it more robust. Also, the packet-per-buffer support may need a similar fix to deal with issue 1, but I haven't done any work on that yet. Stefan, I'm hoping that with this patch, all your OHCI 1.1 controllers will work properly with an unmodified version of libdc1394. Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
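The "excess_bytes" bookkeeping from point 2 of the commit above can be modelled roughly as follows. This is a simplified sketch, not the driver's code: `ir_state`, `skip_interior()` and `account_descriptor()` are hypothetical names, and the model assumes packets fill the payload buffers back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Running count of payload bytes that belong to packets already seen in
 * the headers but not yet accounted for in a payload buffer. */
struct ir_state {
    size_t excess_bytes;
};

/* If more bytes are outstanding than this descriptor's payload buffer can
 * hold, no packet starts or ends here: consume the whole buffer and tell
 * the caller to skip the descriptor without looking at its residual counts.
 * Returns 1 if the descriptor was skipped, 0 otherwise. */
static int skip_interior(struct ir_state *s, size_t capacity)
{
    if (s->excess_bytes > capacity) {
        s->excess_bytes -= capacity;
        return 1;
    }
    return 0;
}

/* For a descriptor that is not skipped: add the lengths of the packets whose
 * headers were received with it, then subtract the bytes actually stored in
 * its payload buffer. `stored` must not exceed the outstanding total. */
static void account_descriptor(struct ir_state *s, const uint16_t *pkt_len,
                               size_t npkts, size_t stored)
{
    for (size_t i = 0; i < npkts; i++)
        s->excess_bytes += pkt_len[i];
    assert(stored <= s->excess_bytes);
    s->excess_bytes -= stored;
}
```

With 1024-byte payload buffers and a single 3000-byte packet, the first descriptor leaves 1976 bytes outstanding, the second is skipped as interior (952 bytes left), and the third absorbs the remaining 952 bytes, which brings the count back to zero.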
*	firewire: fw-sbp2: remove unused misleading macro	Stefan Richter	2008-01-30
\| \| \| \| \| \| \|	SBP2_MAX_SECTORS is nowhere used in fw-sbp2. It merely got copied over from sbp2 where it played a role in the past. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: prepare for s/g chaining	Stefan Richter	2008-01-30
\| \| \| \|	Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	firewire: fw-sbp2: refactor workq and kref handling	Stefan Richter	2008-01-30
\| \| \| \| \| \|	This somewhat reduces the size of firewire-sbp2.ko. Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
*	[SCSI] relax scsi dma alignment	James Bottomley	2008-01-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch relaxes the default SCSI DMA alignment from 512 bytes to 4 bytes. I remember from previous discussions that usb and firewire have sector size alignment requirements, so I upped their alignments in the respective slave allocs. The reason for doing this is so that we don't get such a huge amount of copy overhead in bio_copy_user() for udev. (basically all inquiries it issues can now be directly mapped). Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	firewire: OHCI 1.0 Isochronous Receive support	Jarod Wilson	2007-12-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Third rendition of FireWire OHCI 1.0 Isochronous Receive support, using a zero-copy method similar to OHCI 1.1 which puts the IR data payload directly into the userspace buffer. The zero-copy implementation eliminates the video artifacts, audio popping, and buffer underrun problems seen with version 1 of this patch, as well as fixing a regression in OHCI 1.1 support introduced by version 2 of this patch. Successfully tested in OHCI 1.1 mode on the following chipsets: - NEC uPD72847 (rev 01), OHCI 1.1 (PCI) - Ti XIO2200(A) (rev 01), OHCI 1.1 (PCIe) - Ti TSB41AB2 (rev 01), OHCI 1.1 (PCI on SB Audigy) - Apple UniNorth 2 (rev 81), OHCI 1.1 (PowerBook G4 onboard) Successfully tested in OHCI 1.0 mode on the following chipsets: - Agere FW323 (rev 06), OHCI 1.0 (Mac Mini onboard) - Agere FW323 (rev 06), OHCI 1.0 (PCI) - Via VT6306 (rev 46), OHCI 1.0 (PCI) - NEC OrangeLink (rev 01), OHCI 1.0 (PCI) - NEC uPD72847 (rev 01), OHCI 1.1 (PCI) - Ti XIO2200(A) (rev 01), OHCI 1.1 (PCIe) The bulk of testing was done in an x86_64 system, but was also successfully sanity-tested on other systems, including a PPC(32) PowerBook G4 and an i686 EPIA M10k. Crude benchmarking (watching top during capture) puts the cpu utilization during capture on the EPIA's 1GHz Via C3 processor around 13%, which is down from 30% with the v1 code. Some implementation details: To maintain the same userspace API as dual-buffer mode, we set up two descriptors for every incoming packet. The first is an INPUT_MORE descriptor, pointing to a buffer large enough to hold just the packet's iso headers, immediately followed by an INPUT_LAST descriptor, pointing to a chunk of the userspace buffer big enough for the packet's data payload. 
With this setup, each incoming packet fills in these two descriptors in a manner that very closely emulates dual-buffer receive, to the point where the bulk of the handle_ir_* code is now identical between the two (and probably primed for some restructuring to share code between them). The only caveat I have at the moment is that neither of my OHCI 1.0 Via VT6307-based FireWire controllers work particularly well with this code for reasons I have yet to figure out. Signed-off-by: Jarod Wilson <jwilson@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
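The two-descriptor layout described in the commit above can be sketched as follows. The struct and enum here are a deliberately simplified stand-in for the real OHCI descriptor format: `sketch_desc`, `CMD_INPUT_MORE`, `CMD_INPUT_LAST` and `build_packet_descriptors()` are hypothetical names, and the field set is an assumption chosen only to show which buffer each descriptor points at.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the two descriptor kinds named in the commit. */
enum desc_cmd { CMD_INPUT_MORE, CMD_INPUT_LAST };

/* Simplified descriptor: which command, how many bytes, and where to put them. */
struct sketch_desc {
    enum desc_cmd cmd;
    uint16_t req_count;     /* bytes requested for this buffer */
    uint32_t data_address;  /* bus address of the destination buffer */
};

/* Build the pair for one incoming packet: the first descriptor receives only
 * the iso headers, the second receives the payload directly into the
 * (already mapped) userspace buffer. */
static void build_packet_descriptors(struct sketch_desc pair[2],
                                     uint32_t header_addr, uint16_t header_size,
                                     uint32_t payload_addr, uint16_t payload_size)
{
    assert(header_size > 0);

    pair[0].cmd = CMD_INPUT_MORE;
    pair[0].req_count = header_size;
    pair[0].data_address = header_addr;

    pair[1].cmd = CMD_INPUT_LAST;
    pair[1].req_count = payload_size;
    pair[1].data_address = payload_addr;
}
```

Because headers and payload end up in separate buffers, the completion handling can treat the pair much like a single dual-buffer descriptor, which is the similarity the commit message points out.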