aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* [PATCH] add asm-generic/mman.hMichael S. Tsirkin2006-02-15
| | | | | | | | | | | | | | Make new MADV_REMOVE, MADV_DONTFORK, MADV_DOFORK consistent across all arches. The idea is to make it possible to use them portably even before distros include them in libc headers. Move common flags to asm-generic/mman.h Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Cc: Roland Dreier <rolandd@cisco.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] cpuset: oops in exit on null cpuset fixPaul Jackson2006-02-15
| | | | | | | | | | | | | | | Fix a latent bug in cpuset_exit() handling. If a task tried to allocate memory after calling cpuset_exit(), it oops'd in cpuset_update_task_memory_state() on a NULL cpuset pointer. So set the exiting tasks cpuset to the root cpuset instead of to NULL. A distro kernel hit this with an added kernel package that had just such a hook (allocating memory) in the exit code path. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] ide: touch softlockup detector during pioAndrew Morton2006-02-15
| | | | | | | | | | We're getting some softlockup false positives during heavy PIO operations. So poke the lockup detector. Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] kbuild: revert "fix make -jN with multiple targets with O=..."Benjamin LaHaise2006-02-15
| | | | | | | | | | | | | Commit 296e0855b0f9a4ec9be17106ac541745a55b2ce1: "kbuild: fix make -jN with multiple targets with O=..." causes a ~95% increase in build time for the kernel. Before: 4m21s after: 8m1.403s. Can we revert this until another approach is found? Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] neofb: avoid resetting display config on unblank (v2)Christian Trefzer2006-02-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were two mistakes in the register-read-on-(un)blank approach. - First, without proper register (un)locking the value read back will always be zero, and this is what I missed entirely until just now. Due to this, the logic could not be verified at all and I tried some bogus checks which are completely stupid. - Second, the LCD status bit will always be set to zero when the backlight has been turned off. Reading the value back during unblank will disable the LCD unconditionally, regardless of the state it is supposed to be in, since we set it to zero beforehand. So this is what we do now: - create a new variable in struct neofb_par, and use that to determine whether to read back registers (initialized to true) - before actually blanking the screen, read back the register to sense any possible change made through Fn key combo - use proper neoUnlock() / neoLock() to actually read something - every call to neofb_blank() determines if we read back next time: blanking disables readback, unblanking (FB_BLANK_UNBLANK) enables it This should give us a nice and clean state machine. Has been thoroughly tested on a Dell Latitude CPiA / NM220 Chip docked to a C/Dock2 with attached CRT in all possible combinations of LCD/CRT on/off. I changed the config via Fn key, let the console blank, unblanked by keypress - works flawlessly. Signed-off-by: Christian Trefzer <ctrefzer@gmx.de> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fix zap_thread's ptrace related problemsOleg Nesterov2006-02-15
| | | | | | | | | | | | | | | 1. The tracee can go from ptrace_stop() to do_signal_stop() after __ptrace_unlink(p). 2. It is unsafe to __ptrace_unlink(p) while p->parent may wait for tasklist_lock in ptrace_detach(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Christoph Hellwig <hch@lst.de> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fix kill_proc_info() vs fork() theoretical raceOleg Nesterov2006-02-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | copy_process: attach_pid(p, PIDTYPE_PID, p->pid); attach_pid(p, PIDTYPE_TGID, p->tgid); What if kill_proc_info(p->pid) happens in between? copy_process() holds current->sighand.siglock, so we are safe in CLONE_THREAD case, because current->sighand == p->sighand. Otherwise, p->sighand is unlocked, the new process is already visible to the find_task_by_pid(), but have a copy of parent's 'struct pid' in ->pids[PIDTYPE_TGID]. This means that __group_complete_signal() may hang while doing do ... while (next_thread() != p) We can solve this problem if we reverse these 2 attach_pid()s: attach_pid() does wmb() group_send_sig_info() calls spin_lock(), which provides a read barrier. // Yes ? I don't think we can hit this race in practice, but still. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fix kill_proc_info() vs CLONE_THREAD raceOleg Nesterov2006-02-15
| | | | | | | | | | | | | | | | | | | | | | | | | There is a window after copy_process() unlocks ->sighand.siglock and before it adds the new thread to the thread list. In that window __group_complete_signal(SIGKILL) will not see the new thread yet, so this thread will start running while the whole thread group was supposed to exit. I beleive we have another good reason to place attach_pid(PID/TGID) under ->sighand.siglock. We can do the same for release_task()->__unhash_process() de_thread()->switch_exec_pids() After that we don't need tasklist_lock to iterate over the thread list, and we can simplify things, see for example do_sigaction() or sys_times(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2006-02-15
|\
| * [BRIDGE]: Fix deadlock in br_stp_disable_bridgeAdrian Drzewiecki2006-02-15
| | | | | | | | | | | | | | | | | | | | Looks like somebody forgot to use the _bh spin_lock variant. We ran into a deadlock where br->hello_timer expired while br_stp_disable_br() walked br->port_list. Signed-off-by: Adrian Drzewiecki <z@drze.net> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [NETFILTER]: Fix xfrm lookup after SNATPatrick McHardy2006-02-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To find out if a packet needs to be handled by IPsec after SNAT, packets are currently rerouted in POST_ROUTING and a new xfrm lookup is done. This breaks SNAT of non-unicast packets to non-local addresses because the packet is routed as incoming packet and no neighbour entry is bound to the dst_entry. In general, it seems to be a bad idea to replace the dst_entry after the packet was already sent to the output routine because its state might not match what's expected. This patch changes the xfrm lookup in POST_ROUTING to re-use the original dst_entry without routing the packet again. This means no policy routing can be used for transport mode transforms (which keep the original route) when packets are SNATed to match the policy, but it looks like the best we can do for now. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [PATCH] CIFS: fix cifs_user_read oops when null SMB response on ↵Steve French2006-02-14
|/ | | | | | | | | | | forcedirectio mount This patch fixes an oops reported by Adrian Bunk in cifs_user_read when a null read response is returned on a forcedirectio mount. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] neofb: avoid resetting display config on unblankChristian Trefzer2006-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix issues with the NeoMagic framebuffer driver. It nicely complements my previous fix already in linus' tree. The only thing missing now is that the external CRT will not be activated at neofb init when external-only is selected, either by register read or module/kernel parameter. Testing was done on a Dell Latitude CPi-A/NM2200 chip. Previous behaviour: - before booting linux, set the preferred display config X via FN+F8 - boot linux, neofb stores the register values in a private variable - change the display config to Y via keystroke - leave the machine in peace until display is blanked - touching any key will result in display config X being restored - booting up, the BIOS will acknowledge config Y, though... Current behaviour: At the time of unblanking, config Y is honoured because we now read back register contents instead of just overwriting them with outdated values. Signed-off by: Christian Trefzer <ctrefzer@gmx.de> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] x86: gitignore some autogenerated files for i386Thomas Meyer2006-02-14
| | | | | | | | | Add some more gitignore files for i386 architecture. This files are created during the build process of a i386 kernel. Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] x86: document sysenter pathAlbert D. Cahalan2006-02-14
| | | | | | | | | This path isn't obvious. It looks as if the kernel will be taking three args from the user stack, but it only takes one from there. Signed-off-by: Albert Cahalan <acahalan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] FRV: Use virtual interrupt disablementDavid Howells2006-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the FRV arch use virtual interrupt disablement because accesses to the processor status register (PSR) are relatively slow and because we will soon have the need to deal with multiple interrupt controls at the same time (separate h/w and inter-core interrupts). The way this is done is to dedicate one of the four integer condition code registers (ICC2) to maintaining a virtual interrupt disablement state whilst inside the kernel. This uses the ICC2.Z flag (Zero) to indicate whether the interrupts are virtually disabled and the ICC2.C flag (Carry) to indicate whether the interrupts are physically disabled. ICC2.Z is set to indicate interrupts are virtually disabled. ICC2.C is set to indicate interrupts are physically enabled. Under normal running conditions Z==0 and C==1. Disabling interrupts with local_irq_disable() doesn't then actually physically disable interrupts - it merely sets ICC2.Z to 1. Should an interrupt then happen, the exception prologue will note ICC2.Z is set and branch out of line using one instruction (an unlikely BEQ). Here it will physically disable interrupts and clear ICC2.C. When it comes time to enable interrupts (local_irq_enable()), this simply clears the ICC2.Z flag and invokes a trap #2 if both Z and C flags are clear (the HI integer condition). This can be done with the TIHI conditional trap instruction. The trap then physically reenables interrupts and sets ICC2.C again. Upon returning the interrupt will be taken as interrupts will then be enabled. Note that whilst processing the trap, the whole exceptions system is disabled, and so an interrupt can't happen till it returns. If no pending interrupt had happened, ICC2.C would still be set, the HI condition would not be fulfilled, and no trap will happen. Saving interrupts (local_irq_save) is simply a matter of pulling the ICC2.Z flag out of the CCR register, shifting it down and masking it off. This gives a result of 0 if interrupts were enabled and 1 if they weren't. Restoring interrupts (local_irq_restore) is then a matter of taking the saved value mentioned previously and XOR'ing it against 1. If it was one, the result will be zero, and if it was zero the result will be non-zero. This result is then used to affect the ICC2.Z flag directly (it is a condition code flag after all). An XOR instruction does not affect the Carry flag, and so that bit of state is unchanged. The two flags can then be sampled to see if they're both zero using the trap (TIHI) as for the unconditional reenablement (local_irq_enable). This patch also: (1) Modifies the debugging stub (break.S) to handle single-stepping crossing into the trap #2 handler and into virtually disabled interrupts. (2) Removes superseded fixup pointers from the second instructions in the trap tables (there's no a separate fixup table for this). (3) Declares the trap #3 vector for use in .org directives in the trap table. (4) Moves irq_enter() and irq_exit() in do_IRQ() to avoid problems with virtual interrupt handling, and removes the duplicate code that has now been folded into irq_exit() (softirq and preemption handling). (5) Tells the compiler in the arch Makefile that ICC2 is now reserved. (6) Documents the in-kernel ABI, including the virtual interrupts. (7) Renames the old irq management functions to different names. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] FRV: Miscellaneous fixesDavid Howells2006-02-14
| | | | | | | | | | | | | | | | | | Make various alterations and fixes to the FRV arch: (1) Resyncs the FRV system call collection with the i386 arch. (2) Discards __iounmap() as it's not used. (3) Fixes the use of the SWAP/SWAPI instruction to get the arguments the right way around in atomic.h, and also to get the asm constraints correct. (4) Moves copy_to/from_user_page() to asm/cacheflush.h to be consistent with other archs. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] hrtimer: round up relative start time on low-res archesIngo Molnar2006-02-14
| | | | | | | | | | | | | CONFIG_TIME_LOW_RES is a temporary way for architectures to signal that they simply return xtime in do_gettimeoffset(). In this corner-case we want to round up by resolution when starting a relative timer, to avoid short timeouts. This will go away with the GTOD framework. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] s390: fix __delay implementationHeiko Carstens2006-02-14
| | | | | | | | | Fix __delay implementation. Called with an argument "1" or "0" it would loop nearly forever (since (1/2)-1 = 0xffffffff). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fix a typo in the CPU_H8300H dependenciesAdrian Bunk2006-02-14
| | | | | | | | | Jean-Luc Leger <reiga@dspnet.fr.eu.org> found this obvious typo. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] sched: revert "filter affine wakeups"Chen, Kenneth W2006-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert commit d7102e95b7b9c00277562c29aad421d2d521c5f6: [PATCH] sched: filter affine wakeups Apparently caused more than 10% performance regression for aim7 benchmark. The setup in use is 16-cpu HP rx8620, 64Gb of memory and 12 MSA1000s with 144 disks. Each disk is 72Gb with a single ext3 filesystem (courtesy of HP, who supplied benchmark results). The problem is, for aim7, the wake-up pattern is random, but it still needs load balancing action in the wake-up path to achieve best performance. With the above commit, lack of load balancing hurts that workload. However, for workloads like database transaction processing, the requirement is exactly opposite. In the wake up path, best performance is achieved with absolutely zero load balancing. We simply wake up the process on the CPU that it was previously run. Worst performance is obtained when we do load balancing at wake up. There isn't an easy way to auto detect the workload characteristics. Ingo's earlier patch that detects idle CPU and decide whether to load balance or not doesn't perform with aim7 either since all CPUs are busy (it causes even bigger perf. regression). Revert commit d7102e95b7b9c00277562c29aad421d2d521c5f6, which causes more than 10% performance regression with aim7. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] madvise MADV_DONTFORK/MADV_DOFORKMichael S. Tsirkin2006-02-14
| | | | | | | | | | | | | | | | | | | | | | | | Currently, copy-on-write may change the physical address of a page even if the user requested that the page is pinned in memory (either by mlock or by get_user_pages). This happens if the process forks meanwhile, and the parent writes to that page. As a result, the page is orphaned: in case of get_user_pages, the application will never see any data hardware DMA's into this page after the COW. In case of mlock'd memory, the parent is not getting the realtime/security benefits of mlock. In particular, this affects the Infiniband modules which do DMA from and into user pages all the time. This patch adds madvise options to control whether memory range is inherited across fork. Useful e.g. for when hardware is doing DMA from/into these pages. Could also be useful to an application wanting to speed up its forks by cutting large areas out of consideration. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Acked-by: Hugh Dickins <hugh@veritas.com> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] kprobes: Update Documentation/kprobes.txtJim Keniston2006-02-14
| | | | | | | | | | Update Documentation/kprobes.txt to reflect Kprobes enhancements and other recent developments. Acked-by: Ananth Mavinakayanahalli <mananth@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Fix NULL pointer dereference in isdn_tty_at_coutKarsten Keil2006-02-14
| | | | | | | | | | | The changes in the tty related code introduced wrong parenthesis in a if condition in the isdn_tty_at_cout function. This caused access to index -1 in the dev->drv[] array. This patch change it back to the correct condition from the previous versions. Signed-off-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fix x86 topology export in sysfs for subarchitecturesJames Bottomley2006-02-14
| | | | | | | | | | The correct way to export hyperthreading based functions is to predicate them on CONFIG_X86_HT. Without this, the topology exporting patch breaks the build on all non-PC x86 subarchitectures. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] NLM: Fix the NLM_GRANTED callback checksTrond Myklebust2006-02-14
| | | | | | | | | | | | | | | | | | | | If 2 threads attached to the same process are blocking on different locks on different files (maybe even on different servers) but have the same lock arguments (i.e. same offset+length - actually quite common, since most processes try to lock the entire file) then the first GRANTED call that wakes one up will also wake the other. Currently when the NLM_GRANTED callback comes in, lockd walks the list of blocked locks in search of a match to the lock that the NLM server has granted. Although it checks the lock pid, start and end, it fails to check the filehandle and the server address. By checking the filehandle and server IP address, we ensure that this only happens if the locks truly are referencing the same file. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] jbd: revert checkpoint list changesMark Fasheh2006-02-14
| | | | | | | | | | | | | | | | | | | | | This patch reverts commit f93ea411b73594f7d144855fd34278bcf34a9afc: [PATCH] jbd: split checkpoint lists This broke journal_flush() for OCFS2, which is its method of being sure that metadata is sent to disk for another node. And two related commits 8d3c7fce2d20ecc3264c8d8c91ae3beacdeaed1b and 43c3e6f5abdf6acac9b90c86bf03f995bf7d3d92 with the subjects: [PATCH] jbd: log_do_checkpoint fix [PATCH] jbd: remove_transaction fix These seem to be incremental bugfixes on the original patch and as such are no longer needed. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] HPET: handle multiple ACPI EXTENDED_IRQ resourcesBjorn Helgaas2006-02-14
| | | | | | | | | | | | | | | | | | | | | | When the _CRS for a single HPET contains multiple EXTENDED_IRQ resources, we overwrote hdp->hd_nirqs every time we found one. So the driver worked when all the IRQs were described in a single EXTENDED_IRQ resource, but failed when multiple resources were used. (Strictly speaking, I think the latter is actually more correct, but both styles have been used.) Someday we should remove all the ACPI stuff from hpet.c and use PNP driver registration instead. But currently PNP_MAX_IRQ is 2, and HPETs often have more IRQs. Hint, hint, Adam :-) Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Bob Picco <robert.picco@hp.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Adam Belay <ambx1@neo.rr.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] tty reference count fixPaul Fulghum2006-02-14
| | | | | | | | | | | | | | | | | | | | | | Fix hole where tty structure can be released when reference count is non zero. Existing code can sleep without tty_sem protection between deciding to release the tty structure (setting local variables tty_closing and otty_closing) and setting TTY_CLOSING to prevent further opens. An open can occur during this interval causing release_dev() to free the tty structure while it is still referenced. This should fix bugzilla.kernel.org [Bug 6041] New: Unable to handle kernel paging request In Bug 6041, tty_open() oopes on accessing the tty structure it has successfully claimed. Bug was on SMP machine with the same tty being opened and closed by multiple processes, and DEBUG_PAGEALLOC enabled. Signed-off-by: Paul Fulghum <paulkf@microgate.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] compound page: no access_process_vm checkHugh Dickins2006-02-14
| | | | | | | | | | | The PageCompound check before access_process_vm's set_page_dirty_lock is no longer necessary, so remove it. But leave the PageCompound checks in bio_set_pages_dirty, dio_bio_complete and nfs_free_user_pages: at least some of those were introduced as a little optimization on hugetlb pages. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] compound page: default destructorHugh Dickins2006-02-14
| | | | | | | | | | | | | | | | | | | | Somehow I imagined that calling a NULL destructor would free a compound page rather than oopsing. No, we must supply a default destructor, __free_pages_ok using the order noted by prep_compound_page. hugetlb can still replace this as before with its own free_huge_page pointer. The case that needs this is not common: rarely does put_compound_page's put_page_testzero bring the count down to 0. But if get_user_pages is applied to some part of a compound page, without immediate release (e.g. AIO or Infiniband), then it's possible for its put_page to come after the containing vma has been unmapped and the driver done its free_pages. That's just the kind of case compound pages are supposed to be guarding against (but Nick points out, nor did PageReserved handle this right). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] compound page: use page[1].lruHugh Dickins2006-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a compound page has its own put_page_testzero destructor (the only current example is free_huge_page), that is noted in page[1].mapping of the compound page. But that's rather a poor place to keep it: functions which call set_page_dirty_lock after get_user_pages (e.g. Infiniband's __ib_umem_release) ought to be checking first, otherwise set_page_dirty is liable to crash on what's not the address of a struct address_space. And now I'm about to make that worse: it turns out that every compound page needs a destructor, so we can no longer rely on hugetlb pages going their own special way, to avoid further problems of page->mapping reuse. For example, not many people know that: on 50% of i386 -Os builds, the first tail page of a compound page purports to be PageAnon (when its destructor has an odd address), which surprises page_add_file_rmap. Keep the compound page destructor in page[1].lru.next instead. And to free up the common pairing of mapping and index, also move compound page order from index to lru.prev. Slab reuses page->lru too: but if we ever need slab to use compound pages, it can easily stack its use above this. (akpm: decoded version of the above: the tail pages of a compound page now have ->mapping==NULL, so there's no need for the set_page_dirty[_lock]() caller to check that they're not compund pages before doing the dirty). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] pktcdvd: Reduce stack usagePeter Osterlund2006-02-14
| | | | | | | | | | Reduce stack usage in the pkt_start_write() function. Even though it's not currently a real problem, the pages and offsets arrays can be eliminated, which saves approximately 1000 bytes of stack space. Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] pktcdvd: Don't unlock the door if the disc is in usePeter Osterlund2006-02-14
| | | | | | | | | | Unlocking the door when the disc is in use is obviously not good, because then it's possible to eject the disc at the wrong time and cause severe disc data corruption. Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] pktcdvd: Allow non-writable media to be mountedPeter Osterlund2006-02-14
| | | | | | | | | If opening for write fails, the open method should return -EROFS. This makes "mount" try again with a read-only mount, instead of just giving up. Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] pktcdvd: Don't spam the kernel log when nothing is wrongPeter Osterlund2006-02-14
| | | | | | | | | Change some messages that don't indicate an error so that they are only printed when debugging is enabled. Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2006-02-14
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
| * IB/mthca: bump driver version and release dateRoland Dreier2006-02-13
| | | | | | | | Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IPoIB: Yet another fix for send-only joinsRoland Dreier2006-02-11
| | | | | | | | | | | | | | | | | | | | | | Even after the last fix, it's still possible for a send-only join to start before the join for the broadcast group has finished. This could cause us to create a multicast group using attributes from the broadcast group that haven't been initialized yet, so we would use garbage for the Q_Key, etc. Fix this by waiting until the broadcast group's attached flag is set before starting send-only joins. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Don't print debugging info until we have all valuesRoland Dreier2006-02-10
| | | | | | | | | | | | | | | | | | | | | | When debugging is enabled, the mthca_QUERY_DEV_LIM() firmware command function prints out some of the device limits that it queries. However the debugging prints happen before all of the fields are extracted from the firmware response, so some of the values that get printed are uninitialized junk. Move the prints to the end of the function to fix this. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IPoIB: Fix another send-only join raceMichael S. Tsirkin2006-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | Further, there's an additional issue that I saw in testing: ipoib_mcast_send may get called when priv->broadcast is NULL (e.g. if the device was downed and then upped internally because of a port event). If this happends and the send-only join request gets completed before priv->broadcast is set, we get an oops. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IPoIB: Don't start send-only joins while multicast thread is stoppedMichael S. Tsirkin2006-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following race scenario: - Device is up. - Port event or set mcast list triggers ipoib_mcast_stop_thread, this cancels the query and waits on mcast "done" completion. - Completion is called and "done" is set. - Meanwhile, ipoib_mcast_send arrives and starts a new query, re-initializing "done". Fix this by adding a "multicast started" bit and checking it before starting a send-only join. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mad: Handle DR SMPs with a LID routed partRalph Campbell2006-02-03
| | | | | | | | | | | | | | | | | | Fix handling of directed route SMPs with a beginning or ending LID routed part. Signed-off-by: Ralph Campbell <ralphc@pathscale.com> Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | [MIPS] Fix typo in _sys32_rt_sigreturn and _sysn32_rt_sigreturn.Atsushi Nemoto2006-02-14
| | | | | | | | | | Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | [MIPS] Fix CPU type bitmasks for MIPS III, IV and V.Maciej W. Rozycki2006-02-14
| | | | | | | | | | Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | [MIPS] RM9000: Fix buggy I-cache workaround.Thomas Koeller2006-02-14
| | | | | | | | | | Signed-off-by: Thomas Koeller <thomas.koeller@baslerweb.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | [MIPS] MT: Propagate config7 into VPE.Ralf Baechle2006-02-14
| | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | [MIPS] MT: Fix c0 back-to-back hazard.Ralf Baechle2006-02-14
| | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | [MIPS] Get rid of kludgery needed to keep stdargs of old compilers working.Ralf Baechle2006-02-14
| | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | [MIPS] More uaccess.h fixes with gcc >= 4.0.1.Ralf Baechle2006-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From Richard Sandiford <richard@codesourcery.com>: This patch caused a miscompilation of the restore_gp_regs() block in restore_sigcontext(). This was in a 32-bit kernel compiled with GCC CVS head. restore_gp_regs() copies 64-bit user fields into 32-bit variables, and in this combination, the new __get_user_asm_ll32() clobbers too many registers. It says: /* * Get a long long 64 using 32 bit registers. */ { \ __asm__ __volatile__( \ "1: lw %1, (%3) \n" \ "2: lw %D1, 4(%3) \n" \ " move %0, $0 \n" \ "3: .section .fixup,\"ax\" \n" \ "4: li %0, %4 \n" \ " move %1, $0 \n" \ " move %D1, $0 \n" \ " j 3b \n" \ " .previous \n" \ " .section __ex_table,\"a\" \n" \ " " __UA_ADDR " 1b, 4b \n" \ " " __UA_ADDR " 2b, 4b \n" \ " .previous \n" \ : "=r" (__gu_err), "=&r" (val) \ : "0" (0), "r" (addr), "i" (-EFAULT)); \ } and this requires val (%1) to be a 64-bit value. In the case I saw, gcc was using $3 for the 32-bit val, and wasn't expecting $4 to be clobbered. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>