litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	[GFS2] Shrink gfs2_inode memory by half	Steven Whitehouse	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Here is something I spotted (while looking for something entirely different) the other day. Rather than using a completion in each and every struct gfs2_holder, this removes it in favour of hashed wait queues, thus saving a considerable amount of memory both on the stack (where a number of gfs2_holder structures are allocated) and in particular in the gfs2_inode which has 8 gfs2_holder structures embedded within it. As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to 1912 bytes, a saving of 576 bytes per inode (no thats not a typo!). In actual practice we get a much better result than that since now that a gfs2_inode is under the 2048 byte barrier, we get two per 4k slab page effectively halving the amount of memory required to store gfs2_inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] Remove max_atomic_write tunable	Steven Whitehouse	2007-02-05
\| \| \| \| \| \|	This removes an unused sysfs tunable parameter. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] Clean up/speed up readdir	Steven Whitehouse	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes the extra filldir callback which gfs2 was using to enclose an attempt at readahead for inodes during readdir. The code was too complicated and also hurts performance badly in the case that the getdents64/readdir call isn't being followed by stat() and it wasn't even getting it right all the time when it was. As a result, on my test box an "ls" of a directory containing 250000 files fell from about 7mins (freshly mounted, so nothing cached) to between about 15 to 25 seconds. When the directory content was cached, the time taken fell from about 3mins to about 4 or 5 seconds. Interestingly in the cached case, running "ls -l" once reduced the time taken for subsequent runs of "ls" to about 6 secs even without this patch. Now it turns out that there was a special case of glocks being used for prefetching the metadata, but because of the timeouts for these locks (set to 10 secs) the metadata was being timed out before it was being used and this the prefetch code was constantly trying to prefetch the same data over and over. Calling "ls -l" meant that the inodes were brought into memory and once the inodes are cached, the glocks are not disposed of until the inodes are pushed out of the cache, thus extending the lifetime of the glocks, and thus bringing down the time for subsequent runs of "ls" considerably. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] Add writepages for "data=writeback" mounts	Steven Whitehouse	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It occurred to me that although a gfs2 specific writepages for ordered writes and journaled data would be tricky, by hooking writepages only for "data=writeback" mounts we could take advantage of not needing buffer heads (we don't use them on the read side, nor have we for some time) and create much larger I/Os for the block layer. Using blktrace both before and after, its possible to see that for large I/Os, most of the requests generated through writepages are now 1024 sectors after this patch is applied as opposed to 8 sectors before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] fix master recovery	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If master recovery happens on an rsb in one recovery sequence, then that sequence is aborted before lock recovery happens, then in the next sequence, we rely on the previous master recovery (which may now be invalid due to another node ignoring a lookup result) and go on do to the lock recovery where we get stuck due to an invalid master value. recovery cycle begins: master of rsb X has left nodes A and B send node C an rcom lookup for X to find the new master C gets lookup from B first, sets B as new master, and sends reply back to B C gets lookup from A next, and sends reply back to A saying B is master A gets lookup reply from C and sets B as the new master in the rsb recovery cycle on A, B and C is aborted to start a new recovery B gets lookup reply from C and ignores it since there's a new recovery recovery cycle begins: some other node has joined B doesn't think it's the master of X so it doesn't rebuild it in the directory C looks up the master of X, no one is master, so it becomes new master B looks up the master of X, finds it's C A believes that B is the master of X, so it sends its lock to B B sends an error back to A A resends this repeats forever, the incorrect master value on A is never corrected The fix is to do master recovery on an rsb that still has the NEW_MASTER flag set from an earlier recovery sequence, and therefore didn't complete lock recovery. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] fix user unlocking	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a user process exits, we clear all the locks it holds. There is a problem, though, with locks that the process had begun unlocking before it exited. We couldn't find the lkb's that were in the process of being unlocked remotely, to flag that they are DEAD. To solve this, we move lkb's being unlocked onto a new list in the per-process structure that tracks what locks the process is holding. We can then go through this list to flag the necessary lkb's when clearing locks for a process when it exits. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] Use workqueues for dlm lowcomms	Patrick Caulfield	2007-02-05
\| \| \| \| \| \| \| \| \|	This patch converts the DLM TCP lowcomms to use workqueues rather than using its own daemon functions. Simultaneously removing a lot of code and making it more scalable on multi-processor machines. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] make gfs2_change_nlink_i() static	Adrian Bunk	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc3-mm1: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly globlal gfs2_change_nlink_i() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] gfs2 knows of directories which it chooses not to display	Robert Peterson	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is for Red Hat bugzilla bug bz #222302: Moving a virtual IP from node to node between two NFS-over-GFS2 servers was causing one of the GFS2 servers to become confused and reference a deleted inode. The problem was due to vfs dentries that did not reference the gfs2_dops and therefore didn't call the gfs2 revalidate code to revalidate a dentry after a directory had been deleted & recreated. This patch is a crosswrite from a RHEL4 bug found in GFS1 as bz #190756 and it is against the latest -nmw git tree. Signed-off-by: Robert Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] expose dlm_config_info fields in configfs	David Teigland	2007-02-05
\| \| \| \| \| \| \| \|	Make the dlm_config_info values readable and writeable via configfs entries. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] add config entry to enable log_debug	David Teigland	2007-02-05
\| \| \| \| \| \| \| \|	Add a new dlm_config_info field to enable log_debug output and change log_debug() to use it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] rename dlm_config_info fields	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \|	Add a "ci_" prefix to the fields in the dlm_config_info struct so that we can use macros to add configfs functions to access them (in a later patch). No functional changes in this patch, just naming changes. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] change some log_error to log_debug	David Teigland	2007-02-05
\| \| \| \| \| \| \| \|	Some common, non-error messages should use log_debug instead of log_error so they can be turned off. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] Fix gfs2_rename deadlock	S. Wendy Cheng	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \|	Second round of gfs2_rename lock re-ordering to allow Anaconda adding root partition on top of gfs2. Previous to this patch the recursive lock detector in glock.c can be triggered due to attempting to lock the rgrp twice. This fixes it by checking to see whether the rgrp is already locked. This fixes Red Hat bugzilla #221237 Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] BZ 217008 fsfuzzer fix.	Russell Cattelan	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update the quilt header comments to match the code changes. Change gfs2_lookup_simple to return an error in the case of a NULL inode. The callers of gfs2_lookup_simple do not check for NULL in the no entry case and such would end up dereferencing a NULL ptr. This fixes: http://projects.info-pull.com/mokb/MOKB-15-11-2006.html Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] Fix ordering of page disposal vs. glock_dq	Steven Whitehouse	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \|	In case of unlinked files with dirty pages GFS2 wasn't clearing the pages in quite the right order. This patch clears the pages earlier (before the qlock_dq) to avoid the situation that the release of the glock results in attempting to write back data that has already been deallocated. This fixes Red Hat bugzilla: #220117 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] Fix spin lock already unlocked bug	Patrick Caulfield	2007-02-05
\| \| \| \| \| \| \| \| \| \| \|	I just noticed this message when testing some other changes I'd made to lowcomms (to use workqueues) but the problem seems to be in the current git trees too. I'm amazed no-one has seen it. BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868 Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] Fix schedule() calls	Patrick Caulfield	2007-02-05
\| \| \| \| \| \| \| \| \|	I was a little over-enthusiastic turning schedule() calls int cond_sched() when fixing the DLM for Andrew Morton. These four should really be calls to schedule() or the dlm can busy-wait. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] Fix change nlink deadlock	S. Wendy Cheng	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bugzilla 215088 Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2 partition. The gfs2_rename() apparently needs block allocation for the new name (into the directory) where it requires rg locks. At the same time, while updating the nlink count for the replaced file, gfs2_change_nlink() tries to return the inode meta-data back to resource group where it needs rg locks too. Our logic doesn't allow process to acquire these locks recursively by the same process (RHEL installer) that results a BUG call. This only happens within rename code path and only if the destination file exists before the rename operation. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] Fail over to readpage for stuffed files	Steven Whitehouse	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is partially derrived from a patch written by Russell Cattelan. It fixes a bug where there is a race between readpages and truncate by ignoring readpages for stuffed files. This is ok because a stuffed file will never be more than one block (minus sizeof(struct gfs2_dinode)) in size and block size is always less than page size, so we do not lose anything efficiency-wise by not doing readahead for stuffed files. They will have already been "read ahead" by the action of reading the inode in, in the first place. This is the remaining part of the fix for Red Hat bugzilla #218966 which had not yet made it upstream. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Russell Cattelan <cattelan@redhat.com>
*	[GFS2] Fix DIO deadlock	Steven Whitehouse	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs due to trying to take the i_mutex while holding a glock. The correct locking order is defined as i_mutex -> glock in all cases. I've left dealing with allocating writes. I know that we need to do that, but for now this should do the trick. We don't need to take the i_mutex on write, because the VFS has already taken it for us. On read we don't need it since the glock is enough protection. The reason that I've made some of the checks into a separate function is that we'll need to do the checks again in the allocating write case eventually, so this is partly in preparation for this. Likewise the return value test of != 1 might look a bit odd and thats because we'll need a third return value in case of requiring an allocation. I've made the change to deferred mode on the glock to ensure flushing read caches on other nodes. I notice that (using blktrace to look at whats going on) we appear to do a better job of large I/Os than ext3 after this patch (in terms of not splitting up the I/Os). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>
*	[DLM] fs/dlm/lowcomms-tcp.c: remove 2 functions	Adrian Bunk	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \|	Remove the following unused functions: - lowcomms_send_message() - lowcomms_max_buffer_size() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] fix lost flags in stub replies	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \|	When the dlm fakes an unlock/cancel reply from a failed node using a stub message struct, it wasn't setting the flags in the stub message. So, in the process of receiving the fake message the lkb flags would be updated and cleared from the zero flags in the message. The problem observed in tests was the loss of the USER flag which caused the dlm to think a user lock was a kernel lock and subsequently fail an assertion checking the validity of the ast/callback field. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] fix receive_request() lvb copying	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \| \| \|	LVB's are not sent as part of new requests, but the code receiving the request was copying data into the lvb anyway. The space in the message where it mistakenly thought the lvb lived actually contained the resource name, so it wound up incorrectly copying this name data into the lvb. Fix is to just create the lvb, not copy junk into it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] fix send_args() lvb copying	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \|	The send_args() function is used to copy parameters into a message for a number different message types. Only some of those types are set up beforehand (in create_message) to include space for sending lvb data. send_args was wrongly copying the lvb for all message types as long as the lock had an lvb. This means that the lvb data was being written past the end of the message into unknown space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] add version check	David Teigland	2007-02-05
\| \| \| \| \| \| \| \|	Check if we receive a message from another lockspace member running a version of the dlm with an incompatible inter-node message protocol. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] fix old rcom messages	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A reply to a recovery message will often be received after the relevant recovery sequence has aborted and the next recovery sequence has begun. We need to ignore replies to these old messages from the previous recovery. There's already a way to do this for synchronous recovery requests using the rc_id number, but not for async. Each recovery sequence already has a locally unique sequence number associated with it. This patch adds a field to the rcom (recovery message) structure where this recovery sequence number can be placed, rc_seq. When a node sends a reply to a recovery request, it copies the rc_seq number it received into rc_seq_reply. When the first node receives the reply to its recovery message, it will check whether rc_seq_reply matches the current recovery sequence number, ls_recover_seq, and if not then it ignores the old reply. An old, inadequate approach to filtering out old replies (checking if the current stage of recovery has moved back to the start) has been removed from two spots. The protocol version number is changed to reflect the different rcom structures. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[DLM] fix resend rcom lock	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There's a chance the new master of resource hasn't learned it's the new master before another node sends it a lock during recovery. The node sending the lock needs to resend if this happens. - A sends a master lookup for resource R to C - B sends a master lookup for resource R to C - C receives A's lookup, assigns A to be master of R and sends a reply back to A - C receives B's lookup and sends a reply back to B saying that A is the master - B receives lookup reply from C and sends its lock for R to A - A receives lock from B, doesn't think it's the master of R and sends an error back to B - A receives lookup reply from C and becomes master of R - B gets error back from A and resends its lock back to A (this resending is what this patch does) - A receives lock from B, it now sees it's the master of R and takes the lock Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	[GFS2] don't try to lockfs after shutdown	David Teigland	2007-02-05
\| \| \| \| \| \| \| \| \| \| \| \|	If an fs has already been shut down, a lockfs callback should do nothing. An fs that's been shut down can't acquire locks or do anything with respect to the cluster. Also, remove FIXME comment in withdraw function. The missing bits of the withdraw procedure are now all done by user space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	Linux 2.6.20v2.6.20	Linus Torvalds	2007-02-04
\|
*	[PATCH] EFI x86: pass firmware call parameters on the stack	Frédéric Riss	2007-02-04
\| \| \| \| \| \| \| \| \| \|	When calling into the EFI firmware, the parameters need to be passed on the stack. The recent change to use -mregparm=3 breaks x86 EFI support. This patch is needed to allow the new Intel-based Macs to suspend to ram (efi.get_time is called during the suspend phase). Signed-off-by: Frederic Riss <frederic.riss@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] fix rtl8150	Al Viro	2007-02-03
\| \| \| \| \| \| \|	That code doesn't do what its author apparently thought it would do... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6	Linus Torvalds	2007-02-03
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash [SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect [SCSI] qla4xxx: bug fixes [SCSI] Fix scsi_add_device() for async scanning
\| *	[SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash	Nagendra Singh Tomar	2007-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sd_probe() calls class_device_add() even before initializing the sdkp->device variable. class_device_add() eventually results in the user mode udev program to be called. udev program can read the the allow_restart attribute of the newly created scsi device. This is resulting in a crash as the show function for allow_restart (i.e sd_show_allow_restart) returns the attribute value by reading the sdkp->device->allow_restart variable. As the sdkp->device is not initialized before calling the user mode hotplug helper, this results in a crash. The patch below solves it by calling class_device_add() only after the necessary fields in the scsi_disk structure are initialized properly. Signed-off-by: Nagendra Singh Tomar <nagendra_tomar@adaptec.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
\| *	[SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file ↵	Kai Makisara	2007-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	number to be incorrect On Wed, 24 Jan 2007, Andrew Morton wrote: > On Mon, 22 Jan 2007 13:07:20 -0800 > bugme-daemon@bugzilla.kernel.org wrote: > > > http://bugzilla.kernel.org/show_bug.cgi?id=7864 > > > > Summary: A MTIOCTOP/MTWEOF within the early warning will cause > > the file number to be incorrect > > Kernel Version: 2.6.19.2 > > Status: NEW > > Severity: low > > Owner: io_scsi@kernel-bugs.osdl.org > > Submitter: ce_reisinger@yahoo.com > > > > > > Write records to a SCSI tape until a write fails with a ENOSPC (you have reached > > early warning. > > Now perform a: > > struct mtget before, after; > > ioctl(fd, MTIOCGET, &before); > > struct mtop mtop = { MTWEOF, 1 }; > > ioctl(fd, MTIOCTOP, &mtop); > > ioctl(fd, MTIOCGET, &after); > > > > Check the value of mt_fileno in the before and after structures. Notice the > > after is 2 greater then the before. > > > > The problem appears to be in the block of code starting at line 2817 in st.c. > > This block is entered because the drive did return a CHECK CONDITION with NO > > SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But > > it has already been increased by the number of filemarks requested by the > > MTIOCTOP. I believe that the residue count in the sense data should be > > subtracted from fileno, not a increment as is done. > > > > Thanks. Could you please send us a tested patch to fix these things, as > per http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt ? > The analysis is basically correct and explains the bug. According to the SCSI standards, the sense code is NO SENSE or RECOVERED ERROR in case writing filemark(s) succeeds. If it fails (partly or completely) the sense code is VOLUME OVERFLOW. The patch below is tested to fix the case when one filemark is successfully written after the EOM early warning. It should also fix the case at real EOM but this has not been tested. Carl, thanks for reporting the bug and providing the analysis for the fix. Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
\| *	[SCSI] qla4xxx: bug fixes	David C Somayajulu	2007-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The included patch fixes the following issues: 1. qla3xxx/qla4xxx co-existence issue which can result in a lockup when qla3xxx driver is unloaded, or when ifdown; ifup is performed on one of the interfaces correponding to qla3xxx. This is because qla4xxx HBA supports one ethernet and iscsi interfaces per port. Both iscsi and ethernet interfaces share the same state machine. The problem has to do with synchronizing access to the state machine in the event of a reset 2. mutex_lock() is sometimes not followed by mutex_unlock() prior to invoking a msleep() in qla4xxx_mailbox_command() Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
\| *	[SCSI] Fix scsi_add_device() for async scanning	Matthew Wilcox	2007-01-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I had thought that all drivers which didn't call scsi_scan_host() called scsi_scan_target(). Some, such as sbp2, mptsas and libata-scsi, call scsi_add_device() or __scsi_add_device(). We just need to wait for the currently executing async scans to complete first. This is the same code that's in scsi_scan_target(), except that we have to return an error instead of void when we're declining to scan at all. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* \|	[PATCH] x86-64: define dma noncoherent API functions	Jeff Garzik	2007-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	x86-64 is missing these: Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	[PATCH] Altix: more ACPI PRT support	John Keller	2007-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SN Altix platform does not conform to the IOSAPIC IRQ routing model. Add code in acpi_unregister_gsi() to check if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) and return. Due to an oversight, this code was not added previously when similar code was added to acpi_register_gsi(). http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2 Signed-off-by: John Keller <jpk@sgi.com> Acked-by: Len Brown <lenb@kernel.org> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	[PATCH] revert blockdev direct io back to 2.6.19 version	Andrew Morton	2007-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Andrew Vasquez is reporting as-iosched oopses and a 65% throughput slowdown due to the recent special-casing of direct-io against blockdevs. We don't know why either of these things are occurring. The patch minimally reverts us back to the 2.6.19 code for a 2.6.20 release. Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	[PATCH] alpha: fix epoll syscall enumerations	Mike Frysinger	2007-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We went and named them __NR_sys_foo instead of __NR_foo. It may be too late to change this, but we can at least add the proper names now. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	[PATCH] net/smc911x: match up spin lock/unlock	Peter Korsgaard	2007-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	smc911x_phy_configure's error handling unconditionally unlocks the spinlock even if it wasn't locked. Patch fixes it. Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Cc: Jeff Garzik <jeff@garzik.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	[PATCH] kexec: Avoid migration of already disabled irqs (ia64)	Magnus Damm	2007-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes up ia64 kexec support for HP rx2620 hardware. It does this by skipping migration of already disabled irqs. This is most likely a problem on other ia64 platforms as well, but I've only been able to reproduce it on one machine so far. The full story is that handle_bad_irq() gets invoked before starting the new kernel without this patch. This seems to happen when fixup_irqs() calls generic_handle_irq() on already migrated (and disabled) irqs. So by avoiding migration of disabled irqs we stay away of handle_bad_irq(). The code has been tested on three different ia64 machines, all with good results. It is possible to trigger the same bug by offlining a processor using echo 0 > /sys/devices/system/cpu/cpuX/online. More detailed information is available in the following mail thread: http://lists.osdl.org/pipermail/fastboot/2007-January/thread.html#5774 Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Acked-by: Simon Horman <horms@verge.net.au> Acked-by: Zou, Nanhai <nanhai.zou@intel.com> Acked-by: Jay Lan <jlan@sgi.com> Acked-by: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	[PATCH] aio: fix buggy put_ioctx call in aio_complete - v2	Ken Chen	2007-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An AIO bug was reported that sleeping function is being called in softirq context: BUG: warning at kernel/mutex.c:132/__mutex_lock_common() Call Trace: [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0 [<a000000100577ba0>] mutex_lock+0x20/0x40 [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0 [<a00000010018c0c0>] __put_ioctx+0xc0/0x240 [<a00000010018d470>] aio_complete+0x2f0/0x420 [<a00000010019cc80>] finished_one_bio+0x200/0x2a0 [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200 [<a00000010019d260>] dio_bio_end_aio+0x60/0x80 [<a00000010014acd0>] bio_endio+0x110/0x1c0 [<a0000001002770e0>] __end_that_request_first+0x180/0xba0 [<a000000100277b90>] end_that_request_chunk+0x30/0x60 [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod] [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod] [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod] [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod] [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod] [<a000000100277d20>] blk_done_softirq+0x160/0x1c0 [<a000000100083e00>] __do_softirq+0x200/0x240 [<a000000100083eb0>] do_softirq+0x70/0xc0 See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2 flush_workqueue() is not allowed to be called in the softirq context. However, aio_complete() called from I/O interrupt can potentially call put_ioctx with last ref count on ioctx and triggers bug. It is simply incorrect to perform ioctx freeing from aio_complete. The bug is trigger-able from a race between io_destroy() and aio_complete(). A possible scenario: cpu0 cpu1 io_destroy aio_complete wait_for_all_aios { __aio_put_req ... ctx->reqs_active--; if (!ctx->reqs_active) return; } ... put_ioctx(ioctx) put_ioctx(ctx); __put_ioctx bam! Bug trigger! The real problem is that the condition check of ctx->reqs_active in wait_for_all_aios() is incorrect that access to reqs_active is not being properly protected by spin lock. This patch adds that protective spin lock, and at the same time removes all duplicate ref counting for each kiocb as reqs_active is already used as a ref count for each active ioctx. This also ensures that buggy call to flush_workqueue() in softirq context is eliminated. Signed-off-by: "Ken Chen" <kenchen@google.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Suparna Bhattacharya <suparna@in.ibm.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	[NETFILTER]: nf_conntrack_h323: fix compile error with CONFIG_IPV6=m, ↵	Adrian Bunk	2007-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CONFIG_NF_CONNTRACK_H323=y Fix this by letting NF_CONNTRACK_H323 depend on (IPV6 \|\| IPV6=n). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[NETFILTER]: ctnetlink: fix compile failure with NF_CONNTRACK_MARK=n	Patrick McHardy	2007-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CC net/netfilter/nf_conntrack_netlink.o net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_conntrack_event': net/netfilter/nf_conntrack_netlink.c:392: error: 'struct nf_conn' has no member named 'mark' make[3]: *** [net/netfilter/nf_conntrack_netlink.o] Error 1 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'upstream-linus' of ↵	Linus Torvalds	2007-02-02
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: Initialize nbytes for internal sg commands libata: Fix ata_busy_wait() kernel docs pata_via: Correct missing comments pata_atiixp: propogate cable detection hack from drivers/ide to the new driver ahci/pata_jmicron: fix JMicron quirk
\| * \|	libata: Initialize nbytes for internal sg commands	Brian King	2007-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some LLDDs, like ipr, use nbytes and pad_len to determine the total data transfer length of a command. Make sure nbytes gets initialized for internally generated commands. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	libata: Fix ata_busy_wait() kernel docs	Alan	2007-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	> Looks like you should use ata_busy_wait() here, rather than reproducing > the same code again. It waits in 10uS chunks while 1uS chunks were used in the workaround. Could indeed do that once I know the fix is right. While I'm at it the ata_busy_wait kerneldoc is borked so here's a fix Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	pata_via: Correct missing comments	Alan	2007-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 8237S was added to the chipsets but not to the comments. Fix this Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>