aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds2008-05-08
|\ | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc: Fix SA_ONSTACK signal handling.
| * sparc: Fix SA_ONSTACK signal handling.David S. Miller2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to be more liberal about the alignment of the buffer given to us by sigaltstack(). The user should not need to be mindful of all of the alignment constraints we have for the stack frame. This mirrors how we handle this situation in clone() as well. Also, we align the stack even in non-SA_ONSTACK cases so that signals due to bad stack alignment can be delivered properly. This makes such errors easier to debug and recover from. Finally, add the sanity check x86 has to make sure we won't overflow the signal stack. This fixes glibc testcases nptl/tst-cancel20.c and nptl/tst-cancelx20.c Signed-off-by: David S. Miller <davem@davemloft.net>
* | Revert "PCI: remove default PCI expansion ROM memory allocation"Linus Torvalds2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 9f8daccaa05c14e5643bdd4faf5aed9cc8e6f11e, which was reported to break X startup (xf86-video-ati-6.8.0). See http://bugs.freedesktop.org/show_bug.cgi?id=15523 for details. Reported-by: Laurence Withers <l@lwithers.me.uk> Cc: Gary Hade <garyhade@us.ibm.com> Cc: Greg KH <greg@kroah.com> Cc: Jan Beulich <jbeulich@novell.com> Cc: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of ↵Linus Torvalds2008-05-08
|\ \ | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-fixes: sched: fix weight calculations semaphore: fix
| * | sched: fix weight calculationsMike Galbraith2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The conversion between virtual and real time is as follows: dvt = rw/w * dt <=> dt = w/rw * dvt Since we want the fair sleeper granularity to be in real time, we actually need to do: dvt = - rw/w * l This bug could be related to the regression reported by Yanmin Zhang: | Comparing with kernel 2.6.25, sysbench+mysql(oltp, readonly) has lots | of regressions with 2.6.26-rc1: | | 1) 8-core stoakley: 28%; | 2) 16-core tigerton: 20%; | 3) Itanium Montvale: 50%. Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | semaphore: fixIngo Molnar2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yanmin Zhang reported: | Comparing with kernel 2.6.25, AIM7 (use tmpfs) has more th | regression under 2.6.26-rc1 on my 8-core stoakley, 16-core tigerton, | and Itanium Montecito. Bisect located the patch below: | | 64ac24e738823161693bf791f87adc802cf529ff is first bad commit | commit 64ac24e738823161693bf791f87adc802cf529ff | Author: Matthew Wilcox <matthew@wil.cx> | Date: Fri Mar 7 21:55:58 2008 -0500 | | Generic semaphore implementation | | After I manually reverted the patch against 2.6.26-rc1 while fixing | lots of conflicts/errors, aim7 regression became less than 2%. i reproduced the AIM7 workload and can confirm Yanmin's findings that -.26-rc1 regresses over .25 - by over 67% here. Looking at the workload i found and fixed what i believe to be the real bug causing the AIM7 regression: it was inefficient wakeup / scheduling / locking behavior of the new generic semaphore code, causing suboptimal performance. The problem comes from the following code. The new semaphore code does this on down(): spin_lock_irqsave(&sem->lock, flags); if (likely(sem->count > 0)) sem->count--; else __down(sem); spin_unlock_irqrestore(&sem->lock, flags); and this on up(): spin_lock_irqsave(&sem->lock, flags); if (likely(list_empty(&sem->wait_list))) sem->count++; else __up(sem); spin_unlock_irqrestore(&sem->lock, flags); where __up() does: list_del(&waiter->list); waiter->up = 1; wake_up_process(waiter->task); and where __down() does this in essence: list_add_tail(&waiter.list, &sem->wait_list); waiter.task = task; waiter.up = 0; for (;;) { [...] spin_unlock_irq(&sem->lock); timeout = schedule_timeout(timeout); spin_lock_irq(&sem->lock); if (waiter.up) return 0; } the fastpath looks good and obvious, but note the following property of the contended path: if there's a task on the ->wait_list, the up() of the current owner will "pass over" ownership to that waiting task, in a wake-one manner, via the waiter->up flag and by removing the waiter from the wait list. That is all and fine in principle, but as implemented in kernel/semaphore.c it also creates a nasty, hidden source of contention! The contention comes from the following property of the new semaphore code: the new owner owns the semaphore exclusively, even if it is not running yet. So if the old owner, even if just a few instructions later, does a down() [lock_kernel()] again, it will be blocked and will have to wait on the new owner to eventually be scheduled (possibly on another CPU)! Or if another task gets to lock_kernel() sooner than the "new owner" scheduled, it will be blocked unnecessarily and for a very long time when there are 2000 tasks running. I.e. the implementation of the new semaphores code does wake-one and lock ownership in a very restrictive way - it does not allow opportunistic re-locking of the lock at all and keeps the scheduler from picking task order intelligently. This kind of scheduling, with 2000 AIM7 processes running, creates awful cross-scheduling between those 2000 tasks, causes reduced parallelism, a throttled runqueue length and a lot of idle time. With increasing number of CPUs it causes an exponentially worse behavior in AIM7, as the chance for a newly woken new-owner task to actually run anytime soon is less and less likely. Note that it takes just a tiny bit of contention for the 'new-semaphore catastrophy' to happen: the wakeup latencies get added to whatever small contention there is, and quickly snowball out of control! I believe Yanmin's findings and numbers support this analysis too. The best fix for this problem is to use the same scheduling logic that the kernel/mutex.c code uses: keep the wake-one behavior (that is OK and wanted because we do not want to over-schedule), but also allow opportunistic locking of the lock even if a wakee is already "in flight". The patch below implements this new logic. With this patch applied the AIM7 regression is largely fixed on my quad testbox: # v2.6.25 vanilla: .................. Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 56096.4 91 207.5 789.7 0.4675 2000 55894.4 94 208.2 792.7 0.4658 # v2.6.26-rc1-166-gc0a1811 vanilla: ................................... Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 33230.6 83 350.3 784.5 0.2769 2000 31778.1 86 366.3 783.6 0.2648 # v2.6.26-rc1-166-gc0a1811 + semaphore-speedup: ............................................... Tasks Jobs/Min JTI Real CPU Jobs/sec/task 2000 55707.1 92 209.0 795.6 0.4642 2000 55704.4 96 209.0 796.0 0.4642 i.e. a 67% speedup. We are now back to within 1% of the v2.6.25 performance levels and have zero idle time during the test, as expected. Btw., interactivity also improved dramatically with the fix - for example console-switching became almost instantaneous during this workload (which after all is running 2000 tasks at once!), without the patch it was stuck for a minute at times. There's another nice side-effect of this speedup patch, the new generic semaphore code got even smaller: text data bss dec hex filename 1241 0 0 1241 4d9 semaphore.o.before 1207 0 0 1207 4b7 semaphore.o.after (because the waiter.up complication got removed.) Longer-term we should look into using the mutex code for the generic semaphore code as well - but i's not easy due to legacies and it's outside of the scope of v2.6.26 and outside the scope of this patch as well. Bisected-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2008-05-08
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: [ALSA] soc at91 minor bug fixes [ALSA] soc - at91-pcm - Fix line wrapping pcspkr: fix dependancies
| * | | [ALSA] soc at91 minor bug fixesPatrik Sevallius2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Found these two bugs while browsing through the code. The first one is a cut-n-paste bug, instead of disabling the clock when request_irq() fails, it enabled it once more. The second one fixes a debug printout, AT91_SSC_IER is write only, AT91_SSC_IMR is readable (the printed string actually says imr). Frank Mandarino was busy so he asked me to send these to this list. /Patrik Signed-off-by: Patrik Sevallius <patrik.sevallius@enea.com> Acked-by: Frank Mandarino <fmandarino@endrelia.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | | [ALSA] soc - at91-pcm - Fix line wrappingMark Brown2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's more checkpatch stuff to fix in the driver, this just fixes the minimum required for the following patch to be clean. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | | pcspkr: fix dependanciesStas Sergeev2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fix pcspkr dependancies: make the pcspkr platform drivers to depend on a platform device, and not the other way around. Signed-off-by: Stas Sergeev <stsp@aknet.ru> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Dmitry Torokhov <dtor@mail.ru> CC: Vojtech Pavlik <vojtech@suse.cz> CC: Michael Opdenacker <michael-lists@free-electrons.com> [fixed for 2.6.26-rc1 by tiwai] Signed-off-by: Takashi Iwai <tiwai@suse.de>
* | | | Remove duplicated include in net/sunrpc/svc.cHuang Weiyi2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | <linux/sched.h> we included twice. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | fs/proc/task_mmu.c: remove duplicated include filesHuang Weiyi2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removed duplicated include files <linux/ptrace.h> and <linux/seq_file.h> in fs/proc/task_mmu.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Fix drivers/media build for modular buildsIngo Molnar2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix allmodconfig build bug introduced in latest -git by commit 7c91f0624a9 ("V4L/DVB(7767): Move tuners to common/tuners"): LD kernel/built-in.o LD drivers/built-in.o ld: drivers/media/built-in.o: No such file: No such file or directory which happens if all media drivers are modular: http://redhat.com/~mingo/misc/config-Wed_Apr_30_09_24_48_CEST_2008.bad In that case there's no obj-y rule connecting all the built-in.o files and the link tree breaks. The fix is to add a guaranteed obj-y rule for the core vmlinux to build. (which results in an empty object file if all media drivers are modular) Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2008-05-08
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/ehca: Wait for async events to finish before destroying QP IB/ipath: Fix SDMA error recovery in absence of link status change IB/ipath: Need to always request and handle PIO avail interrupts IB/ipath: Fix count of packets received by kernel IB/ipath: Return the correct opcode for RDMA WRITE with immediate IB/ipath: Fix bug that can leave sends disabled after freeze recovery IB/ipath: Only increment SSN if WQE is put on send queue IB/ipath: Only warn about prototype chip during init RDMA/cxgb3: Fix severe limit on userspace memory registration size RDMA/cxgb3: Don't add PBL memory to gen_pool in chunks
| * | | | IB/ehca: Wait for async events to finish before destroying QPStefan Roscher2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is necessary because, in a multicore environment, a race between uverbs async handler and destroy QP could occur. Signed-off-by: Stefan Roscher <stefan.roscher at de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | IB/ipath: Fix SDMA error recovery in absence of link status changeJohn Gregor2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | What's fixed: in ipath_cancel_sends() We need to unconditionally set ABORTING. So, swap the tests so the set_bit() isn't shadowed by the &&. If we've disarmed the piobufs, then we need to unconditionally set DISARMED. So, move it out from the overly protective if at the bottom. in sdma_abort_task() Abort_task was written knowing that the SDMA engine would always be reset (and restarted) on error. A recent change broke that fundamental assumption by taking the restart portion and making it conditional on a link status change. But, SDMA can go boom without a link status change in some conditions. Signed-off-by: John Gregor <john.gregor@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | IB/ipath: Need to always request and handle PIO avail interruptsDave Olson2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we always use PIO for vl15 on 7220, we could get stuck forever if we happened to run out of PIO buffers from the verbs code, because the setup code wouldn't run; the interrupt was also ignored if SDMA was supported. We also have to reduce the pio update threshold if we have fewer kernel buffers than the existing threshold. Clean up the initialization a bit to get ordering safer and more sensible, and use the existing ipath_chg_kernavail call to do init, rather than doing it separately. Drop unnecessary clearing of pio buffer on pio parity error. Drop incorrect updating of pioavailshadow when exitting freeze mode (software state may not match chip state if buffer has been allocated and not yet written). If we couldn't get a kernel buffer for a while, make sure we are in sync with hardware, mainly to handle the exitting freeze case. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | IB/ipath: Fix count of packets received by kernelMichael Albaugh2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The loop in ipath_kreceive() that processes packets increments the loop-index 'i' once too often, because the exit condition does not depend on it, and is checked after the increment. By adding a check for !last to the iterator in the for loop, we correct that in a way that is not so likely to be re-broken by changes in the loop body. Signed-off-by: Michael Albaugh <micheal.albaugh@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | IB/ipath: Return the correct opcode for RDMA WRITE with immediateRalph Campbell2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug in the RC responder which generates a completion entry with the wrong opcode when an RDMA WRITE with immediate is received. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | IB/ipath: Fix bug that can leave sends disabled after freeze recoveryDave Olson2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The semantics of cancel_sends changed, but the code using it was missed. Don't leave sends and pioavail updates disabled, and add a comment as to why the force update is needed. Signed-off-by: Dave Olson <dave.olson@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | IB/ipath: Only increment SSN if WQE is put on send queueRalph Campbell2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a send work request has immediate errors and is not put on the send queue, we shouldn't update any of the QP state. The increment of the SSN wasn't obeying this. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | IB/ipath: Only warn about prototype chip during initMichael Albaugh2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We warn about prototype chips, but the function that checks for support is also called as a result of a get_portinfo request, which can clutter the logs. Restrict warning to only appear during initialization. Signed-off-by: Michael Albaugh <michael.albaugh@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | RDMA/cxgb3: Fix severe limit on userspace memory registration sizeRoland Dreier2008-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, iw_cxgb3 is severely limited on the amount of userspace memory that can be registered in in a single memory region, which causes big problems for applications that expect to be able to register 100s of MB. The problem is that the driver uses a single kmalloc()ed buffer to hold the physical buffer list (PBL) for the entire memory region during registration, which means that 8 bytes of contiguous memory are required for each page of memory being registered. For example, a 64 MB registration will require 128 KB of contiguous memory with 4 KB pages, and it unlikely that such an allocation will succeed on a busy system. This is purely a driver problem: the temporary page list buffer is not needed by the hardware, so we can fix this by writing the PBL to the hardware in page-sized chunks rather than all at once. We do this by splitting the memory registration operation up into several steps: - Allocate PBL space in adapter memory for the full registration - Copy PBL to adapter memory in chunks - Allocate STag and enable memory region This also allows several other cleanups to the __cxio_tpt_op() interface and related parts of the driver. This change leaves the reregister memory region and memory window operations broken, but they already didn't work due to other longstanding bugs, so fixing them will be left to a later patch. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | | | RDMA/cxgb3: Don't add PBL memory to gen_pool in chunksRoland Dreier2008-05-06
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current iw_cxgb3 code adds PBL memory to the driver's gen_pool in 2 MB chunks. This limits the largest single allocation that can be done to the same size, which means that with 4 KB pages, each of which takes 8 bytes of PBL memory, the largest memory region that can be allocated is 1 GB (256K PBL entries * 4 KB/entry). Remove this limit by adding all the PBL memory in a single gen_pool chunk, if possible. Add code that falls back to smaller chunks if gen_pool_add() fails, which can happen if there is not sufficient contiguous lowmem for the internal gen_pool bitmap. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | | | MN10300: Make cpu_relax() invoke barrier()David Howells2008-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Make cpu_relax() invoke barrier() to be the same as other arches. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2008-05-08
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Revert "relay: fix splice problem" docbook: fix bio missing parameter block: use unitialized_var() in bio_alloc_bioset() block: avoid duplicate calls to get_part() in disk stat code cfq-iosched: make io priorities inherit CPU scheduling class as well as nice block: optimize generic_unplug_device() block: get rid of likely/unlikely predictions in merge logic vfs: splice remove_suid() cleanup cfq-iosched: fix RCU race in the cfq io_context destructor handling block: adjust tagging function queue bit locking block: sysfs store function needs to grab queue_lock and use queue_flag_*()
| * | | | Revert "relay: fix splice problem"Jens Axboe2008-05-08
| | | | | | | | | | | | | | | | | | | | This reverts commit c3270e577c18b3d0e984c3371493205a4807db9d.
| * | | | docbook: fix bio missing parameterRandy Dunlap2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix fs/bio.c kernel-doc parameter warning: Warning(linux-2.6.25-git14//fs/bio.c:972): No description found for parameter 'reading' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: use unitialized_var() in bio_alloc_bioset()Jens Axboe2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Better than setting idx to some random value and it silences the same bogus gcc warning. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: avoid duplicate calls to get_part() in disk stat codeJens Axboe2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get_part() is fairly expensive, as it O(N) loops over partitions to find the right one. In lots of normal IO paths we end up looking up the partition twice, to make matters even worse. Change the stat add code to accept a passed in partition instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | cfq-iosched: make io priorities inherit CPU scheduling class as well as niceJens Axboe2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently set all processes to the best-effort scheduling class, regardless of what CPU scheduling class they belong to. Improve that so that we correctly track idle and rt scheduling classes as well. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: optimize generic_unplug_device()Jens Axboe2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Original patch from Mikulas Patocka <mpatocka@redhat.com> Mike Anderson was doing an OLTP benchmark on a computer with 48 physical disks mapped to one logical device via device mapper. He found that there was a slowdown on request_queue->lock in function generic_unplug_device. The slowdown is caused by the fact that when some code calls unplug on the device mapper, device mapper calls unplug on all physical disks. These unplug calls take the lock, find that the queue is already unplugged, release the lock and exit. With the below patch, performance of the benchmark was increased by 18% (the whole OLTP application, not just block layer microbenchmarks). So I'm submitting this patch for upstream. I think the patch is correct, because when more threads call simultaneously plug and unplug, it is unspecified, if the queue is or isn't plugged (so the patch can't make this worse). And the caller that plugged the queue should unplug it anyway. (if it doesn't, there's 3ms timeout). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: get rid of likely/unlikely predictions in merge logicJens Axboe2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They tend to depend a lot on the workload, so not a clear-cut likely or unlikely fit. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | vfs: splice remove_suid() cleanupMiklos Szeredi2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | generic_file_splice_write() duplicates remove_suid() just because it doesn't hold i_mutex. But it grabs i_mutex inside splice_from_pipe() anyway, so this is rather pointless. Move locking to generic_file_splice_write() and call remove_suid() and __splice_from_pipe() instead. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | cfq-iosched: fix RCU race in the cfq io_context destructor handlingJens Axboe2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | put_io_context() drops the RCU read lock before calling into cfq_dtor(), however we need to hold off freeing there before grabbing and dereferencing the first object on the list. So extend the rcu_read_lock() scope to cover the calling of cfq_dtor(), and optimize cfq_free_io_context() to use a new variant for call_for_each_cic() that assumes the RCU read lock is already held. Hit in the wild by Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: adjust tagging function queue bit lockingJens Axboe2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For most initialization purposes, calling blk_queue_init_tags() without the queue lock held is OK. Only if called for resizing an existing map must the lock be held. Ditto for tag cleanup, the maps are reference counted. So switch the general queue flag setting to the unlocked variant, but retain the locked variant for resizing. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: sysfs store function needs to grab queue_lock and use queue_flag_*()Jens Axboe2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Concurrency isn't a big deal here since we have requests in flight at this point, but do the locked variant to set a better example. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | Merge branch 'for_linus' of ↵Linus Torvalds2008-05-08
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: udf: Fix memory corruption when fs mounted with noadinicb option udf: Make udf exportable udf: fs/udf/partition.c:udf_get_pblock() mustn't be inline
| * | | | | udf: Fix memory corruption when fs mounted with noadinicb optionJan Kara2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When UDF filesystem is mounted with noadinicb mount option, it happens that we extend an empty directory with a block. A code in udf_add_entry() didn't count with this possibility and used uninitialized data leading to memory and filesystem corruption. Add a check whether file already has some extents before operating on them. Signed-off-by: Jan Kara <jack@suse.cz>
| * | | | | udf: Make udf exportableRasmus Rohde2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Rasmus Rohde <rohde@duff.dk> Signed-off-by: Jan Kara <jack@suse.cz>
| * | | | | udf: fs/udf/partition.c:udf_get_pblock() mustn't be inlineAdrian Bunk2008-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following build error with UML and gcc 4.3: <-- snip --> ... CC fs/udf/partition.o /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/udf/partition.c: In function ‘udf_get_pblock_virt15’: /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/udf/partition.c:32: sorry, unimplemented: inlining failed in call to ‘udf_get_pblock’: function body not available /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/udf/partition.c:102: sorry, unimplemented: called from here make[3]: *** [fs/udf/partition.o] Error 1 <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
* | | | | | Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6Linus Torvalds2008-05-08
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] guest page hinting light [S390] tty3270: fix put_char fail/success conversion. [S390] compat ptrace cleanup [S390] s390mach compile warning [S390] cio: Fix parsing mechanism for blacklisted devices. [S390] cio: Remove cio_msg kernel parameter. [S390] s390-kvm: leave sie context on work. Removes preemption requirement [S390] s390: Optimize user and work TIF check
| * | | | | | [S390] guest page hinting lightMartin Schwidefsky2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the existing arch_alloc_page/arch_free_page callbacks to do the guest page state transitions between stable and unused. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | | [S390] tty3270: fix put_char fail/success conversion.Heiko Carstens2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The wrong function got coverted ;) CC drivers/s390/char/tty3270.o drivers/s390/char/tty3270.c:1747: warning: initialization from incompatible pointer type Acked-by: Alan Cox <alan@redhat.com> Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | | [S390] compat ptrace cleanupRoland McGrath2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This removes redundant arch code for generic ptrace requests already handled by ptrace_request and compat_ptrace_request. It simplifies things to just have the standard entry points, and use the generic compat_sys_ptrace. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | | [S390] s390mach compile warningMartin Schwidefsky2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following compile warning: drivers/s390/s390mach.c: In function 's390_collect_crw_info': drivers/s390/s390mach.c:77: warning: ignoring return value of 'down_interruptibl Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | | [S390] cio: Fix parsing mechanism for blacklisted devices.Michael Ernst2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | New format cssid.ssid.devno is now parsed correctly. Signed-off-by: Michael Ernst <mernst@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | | [S390] cio: Remove cio_msg kernel parameter.Michael Ernst2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only sporadically used CIO_DEBUG messages are replaced by ordinary CIO_MSG_EVENT messages. The CIO_MSG_EVENT messages debug levels are consolidated. Signed-off-by: Michael Ernst <mernst@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | | [S390] s390-kvm: leave sie context on work. Removes preemption requirementChristian Borntraeger2008-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Martin Schwidefsky <schwidefsky@de.ibm.com> This patch fixes a bug with cpu bound guest on kvm-s390. Sometimes it was impossible to deliver a signal to a spinning guest. We used preemption as a circumvention. The preemption notifiers called vcpu_load, which checked for pending signals and triggered a host intercept. But even with preemption, a sigkill was not delivered immediately. This patch changes the low level host interrupt handler to check for the SIE instruction, if TIF_WORK is set. In that case we change the instruction pointer of the return PSW to rerun the vcpu_run loop. The kvm code sees an intercept reason 0 if that happens. This patch adds accounting for these types of intercept as well. The advantages: - works with and without preemption - signals are delivered immediately - much better host latencies without preemption Acked-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | | [S390] s390: Optimize user and work TIF checkMartin Schwidefsky2008-05-07
| | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On return from syscall or interrupt, we have to check if we return to userspace (likely) and if there is work todo (less likely) to decide if we handle the work. We can optimize this check: we first check for the less likely work case and then check for userspace. This patch is also a preparation for an additional patch, that fixes a bug in KVM dealing with cpu bound guests. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>