aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux
Commit message (Collapse)AuthorAge
* [PATCH] Add pselect/ppoll system call implementationDavid Woodhouse2006-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following implementation of ppoll() and pselect() system calls depends on the architecture providing a TIF_RESTORE_SIGMASK flag in the thread_info. These system calls have to change the signal mask during their operation, and signal handlers must be invoked using the new, temporary signal mask. The old signal mask must be restored either upon successful exit from the system call, or upon returning from the invoked signal handler if the system call is interrupted. We can't simply restore the original signal mask and return to userspace, since the restored signal mask may actually block the signal which interrupted the system call. The TIF_RESTORE_SIGMASK flag deals with this by causing the syscall exit path to trap into do_signal() just as TIF_SIGPENDING does, and by causing do_signal() to use the saved signal mask instead of the current signal mask when setting up the stack frame for the signal handler -- or by causing do_signal() to simply restore the saved signal mask in the case where there is no handler to be invoked. The first patch implements the sys_pselect() and sys_ppoll() system calls, which are present only if TIF_RESTORE_SIGMASK is defined. That #ifdef should go away in time when all architectures have implemented it. The second patch implements TIF_RESTORE_SIGMASK for the PowerPC kernel (in the -mm tree), and the third patch then removes the arch-specific implementations of sys_rt_sigsuspend() and replaces them with generic versions using the same trick. The fourth and fifth patches, provided by David Howells, implement TIF_RESTORE_SIGMASK for FR-V and i386 respectively, and the sixth patch adds the syscalls to the i386 syscall table. This patch: Add the pselect() and ppoll() system calls, providing core routines usable by the original select() and poll() system calls and also the new calls (with their semantics w.r.t timeouts). Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Generic sys_rt_sigsuspend()David Woodhouse2006-01-18
| | | | | | | | | | | | The TIF_RESTORE_SIGMASK flag allows us to have a generic implementation of sys_rt_sigsuspend() instead of duplicating it for each architecture. This provides such an implementation and makes arch/powerpc use it. It also tidies up the ppc32 sys_sigsuspend() to use TIF_RESTORE_SIGMASK. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] vfs: *at functions: coreUlrich Drepper2006-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is a series of patches which introduce in total 13 new system calls which take a file descriptor/filename pair instead of a single file name. These functions, openat etc, have been discussed on numerous occasions. They are needed to implement race-free filesystem traversal, they are necessary to implement a virtual per-thread current working directory (think multi-threaded backup software), etc. We have in glibc today implementations of the interfaces which use the /proc/self/fd magic. But this code is rather expensive. Here are some results (similar to what Jim Meyering posted before). The test creates a deep directory hierarchy on a tmpfs filesystem. Then rm -fr is used to remove all directories. Without syscall support I get this: real 0m31.921s user 0m0.688s sys 0m31.234s With syscall support the results are much better: real 0m20.699s user 0m0.536s sys 0m20.149s The interfaces are for obvious reasons currently not much used. But they'll be used. coreutils (and Jeff's posixutils) are already using them. Furthermore, code like ftw/fts in libc (maybe even glob) will also start using them. I expect a patch to make follow soon. Every program which is walking the filesystem tree will benefit. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@ftp.linux.org.uk> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] nfsd4: rename lk_stateownerJ. Bruce Fields2006-01-18
| | | | | | | | | | | | | One of the things that's confusing about nfsd4_lock is that the lk_stateowner field could be set to either of two different lockowners: the open owner or the lock owner. Rename to lk_replay_owner and add a comment to make it clear that it's used for whichever stateowner has its sequence id bumped for replay detection. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] svcrpc: save and restore the daddr field when request deferredJ. Bruce Fields2006-01-18
| | | | | | | | | | | | The server code currently keeps track of the destination address on every request so that it can reply using the same address. However we forget to do that in the case of a deferred request. Remedy this oversight. >From folks at PolyServe. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] nfsd: check error status from nfsd_sync_dirYAMAMOTO Takashi2006-01-18
| | | | | | | | | | | | | | | | | Change nfsd_sync_dir to return an error if ->sync fails, and pass that error up through the stack. This involves a number of rearrangements of error paths, and care to distinguish between Linux -errno numbers and NFSERR numbers. In the 'create' routines, we continue with the 'setattr' even if a previous sync_dir failed. This patch is quite different from Takashi's in a few ways, but there is still a strong lineage. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] add missing syscall declarationsArnd Bergmann2006-01-18
| | | | | | | | | | All standard system calls should be declared in include/linux/syscalls.h. Add some of the new additions that were previously missed. Signed-off-by: Arnd Bergmann <arndb@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] NUMA policies in the slab allocator V2Christoph Lameter2006-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a regression in 2.6.14 against 2.6.13 that causes an imbalance in memory allocation during bootup. The slab allocator in 2.6.13 is not numa aware and simply calls alloc_pages(). This means that memory policies may control the behavior of alloc_pages(). During bootup the memory policy is set to MPOL_INTERLEAVE resulting in the spreading out of allocations during bootup over all available nodes. The slab allocator in 2.6.13 has only a single list of slab pages. As a result the per cpu slab cache and the spinlock controlled page lists may contain slab entries from off node memory. The slab allocator in 2.6.13 makes no effort to discern the locality of an entry on its lists. The NUMA aware slab allocator in 2.6.14 controls locality of the slab pages explicitly by calling alloc_pages_node(). The NUMA slab allocator manages slab entries by having lists of available slab pages for each node. The per cpu slab cache can only contain slab entries associated with the node local to the processor. This guarantees that the default allocation mode of the slab allocator always assigns local memory if available. Setting MPOL_INTERLEAVE as a default policy during bootup has no effect anymore. In 2.6.14 all node unspecific slab allocations are performed on the boot processor. This means that most of key data structures are allocated on one node. Most processors will have to refer to these structures making the boot node a potential bottleneck. This may reduce performance and cause unnecessary memory pressure on the boot node. This patch implements NUMA policies in the slab layer. There is the need of explicit application of NUMA memory policies by the slab allcator itself since the NUMA slab allocator does no longer let the page_allocator control locality. The check for policies is made directly at the beginning of __cache_alloc using current->mempolicy. The memory policy is already frequently checked by the page allocator (alloc_page_vma() and alloc_page_current()). So it is highly likely that the cacheline is present. For MPOL_INTERLEAVE kmalloc() will spread out each request to one node after another so that an equal distribution of allocations can be obtained during bootup. It is not possible to push the policy check to lower layers of the NUMA slab allocator since the per cpu caches are now only containing slab entries from the current node. If the policy says that the local node is not to be preferred or forbidden then there is no point in checking the slab cache or local list of slab pages. The allocation better be directed immediately to the lists containing slab entries for the allowed set of nodes. This way of applying policy also fixes another strange behavior in 2.6.13. alloc_pages() is controlled by the memory allocation policy of the current process. It could therefore be that one process is running with MPOL_INTERLEAVE and would f.e. obtain a new page following that policy since no slab entries are in the lists anymore. A page can typically be used for multiple slab entries but lets say that the current process is only using one. The other entries are then added to the slab lists. These are now non local entries in the slab lists despite of the possible availability of local pages that would provide faster access and increase the performance of the application. Another process without MPOL_INTERLEAVE may now run and expect a local slab entry from kmalloc(). However, there are still these free slab entries from the off node page obtained from the other process via MPOL_INTERLEAVE in the cache. The process will then get an off node slab entry although other slab entries may be available that are local to that process. This means that the policy if one process may contaminate the locality of the slab caches for other processes. This patch in effect insures that a per process policy is followed for the allocation of slab entries and that there cannot be a memory policy influence from one process to another. A process with default policy will always get a local slab entry if one is available. And the process using memory policies will get its memory arranged as requested. Off-node slab allocation will require the use of spinlocks and will make the use of per cpu caches not possible. A process using memory policies to redirect allocations offnode will have to cope with additional lock overhead in addition to the latency added by the need to access a remote slab entry. Changes V1->V2 - Remove #ifdef CONFIG_NUMA by moving forward declaration into prior #ifdef CONFIG_NUMA section. - Give the function determining the node number to use a saner name. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Zone reclaim: proc overrideChristoph Lameter2006-01-18
| | | | | | | | | | | | proc support for zone reclaim This patch creates a proc entry /proc/sys/vm/zone_reclaim_mode that may be used to override the automatic determination of the zone reclaim made on bootup. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Zone reclaim: Reclaim logicChristoph Lameter2006-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some bits for zone reclaim exists in 2.6.15 but they are not usable. This patch fixes them up, removes unused code and makes zone reclaim usable. Zone reclaim allows the reclaiming of pages from a zone if the number of free pages falls below the watermarks even if other zones still have enough pages available. Zone reclaim is of particular importance for NUMA machines. It can be more beneficial to reclaim a page than taking the performance penalties that come with allocating a page on a remote zone. Zone reclaim is enabled if the maximum distance to another node is higher than RECLAIM_DISTANCE, which may be defined by an arch. By default RECLAIM_DISTANCE is 20. 20 is the distance to another node in the same component (enclosure or motherboard) on IA64. The meaning of the NUMA distance information seems to vary by arch. If zone reclaim is not successful then no further reclaim attempts will occur for a certain time period (ZONE_RECLAIM_INTERVAL). This patch was discussed before. See http://marc.theaimsgroup.com/?l=linux-kernel&m=113519961504207&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=113408418232531&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=113389027420032&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=113380938612205&w=2 Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] mm: migration page refcounting fixNick Piggin2006-01-18
| | | | | | | | | | | | | | | | | Migration code currently does not take a reference to target page properly, so between unlocking the pte and trying to take a new reference to the page with isolate_lru_page, anything could happen to it. Fix this by holding the pte lock until we get a chance to elevate the refcount. Other small cleanups while we're here. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge master.kernel.org:/home/rmk/linux-2.6-serialLinus Torvalds2006-01-18
|\
| * [SERIAL] Add 8250 support for Decision Computer International Co. PCCOM2Alon Bar-Lev2006-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a new device which is look like: Serial controller: Decision Computer International Co. PCCOM2 (rev 02) (prog-if 02 [16550]) 0700: 6666:0004 (rev 02) (prog-if 02) Flags: medium devsel, IRQ 177 Memory at fe000000 (32-bit, non-prefetchable) [size=128] I/O ports at e880 [size=128] I/O ports at e400 [size=256] It has two 16550A, and is not listed in kernel, although the manufacturer clams that it is supported... I've created the following patch, it only add the new PCI id and the card to the repository, it seems to work. Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* | Merge git://tipc.cslab.ericsson.net/pub/git/tipcDavid S. Miller2006-01-18
|\ \ | |/ |/|
| * [TIPC] Move ethernet protocol id to linux/if_ether.hPer Liden2006-01-17
| | | | | | | | Signed-off-by: Per Liden <per.liden@ericsson.com>
| * [TIPC] Updated link priority macrosPer Liden2006-01-17
| | | | | | | | | | | | | | | | Added macros for min/default/max link priority in tipc_config.h. Also renamed TIPC_NUM_LINK_PRI to TIPC_MEDIA_LINK_PRI since that is a more accurate description of what it is used for. Signed-off-by: Per Liden <per.liden@ericsson.com>
* | Merge branch 'upstream-linus' of ↵Linus Torvalds2006-01-17
|\ \ | |/ |/| | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
| * [PATCH] libata: Fix heuristic typos add LBA48PIO flag and support code, add ↵Alan Cox2006-01-17
| | | | | | | | | | | | | | IRQ flag for next diff Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
| * [PATCH] libata: add a function to decide if we need iordyAlan Cox2006-01-17
| | | | | | | | | | | | | | This ought to be simple but for PIO2 we have to poke around the drive data to get it 100% correct. Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
* | [NET]: Make second arg to skb_reserved() signed.David S. Miller2006-01-17
| | | | | | | | | | | | | | | | | | | | | | Some subsystems, such as PPP, can send negative values here. It just happened to work correctly on 32-bit with an unsigned value, but on 64-bit this explodes. Figured out by Paul Mackerras based upon several PPP crash reports. Signed-off-by: David S. Miller <davem@davemloft.net>
* | [NETFILTER] ip6tables: remove unused definitionsYasuyuki Kozakai2006-01-17
| | | | | | | | | | | | | | | | | | These definitions ware used for only internal use in kernel <= 2.6.13, which had not introduced the unified parser of IPv6 extension header yet. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [IPV6]: Preserve procfs IPV6 address output formatYOSHIFUJI Hideaki2006-01-17
|/ | | | | | | | Procfs always output IPV6 addresses without the colon characters, and we cannot change that. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvbLinus Torvalds2006-01-17
|\
| * Merge branch 'master' of ↵Mauro Carvalho Chehab2006-01-16
| |\ | | | | | | | | | ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb
| | * Merge branch 'work'Mauro Carvalho Chehab2006-01-15
| | |\
| | | * V4L/DVB (3365): i2c ids for upd64031a saa717x upd64083 wm8739Tyler Trafford2006-01-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Add i2c ids for drivers: upd64031a saa717x upd64083 wm8739 Signed-off-by: Tyler Trafford <tatrafford@comcast.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
* | | | [PATCH] x86_64: add __meminit for memory hotplugMatt Tolentino2006-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add __meminit to the __init lineup to ensure functions default to __init when memory hotplug is not enabled. Replace __devinit with __meminit on functions that were changed when the memory hotplug code was introduced. Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] add /sys/fsMiklos Szeredi2006-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds an empty /sys/fs, which filesystems can use. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] sh: kexec() supportkogiidena2006-01-17
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds kexec() support for SH. Signed-off-by: kogiidena <kogiidena@eggplant.ddo.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Cc: <fastboot@lists.osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds2006-01-15
|\ \ \ | |/ / |/| |
| * | Fix "stuct", "strut", "struc" typosAlexey Dobriyan2006-01-14
| | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* | | Merge master.kernel.org:/home/rmk/linux-2.6-serialLinus Torvalds2006-01-14
|\ \ \
| * | | [SERIAL] convert uart_state.sem to uart_state.mutexIngo Molnar2006-01-13
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | semaphore to mutex conversion. the conversion was generated via scripts, and the result was validated automatically via a script as well. build and boot tested. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* | | [PATCH] When CONFIG_CC_OPTIMIZE_FOR_SIZE, allow gcc4 to control inliningIngo Molnar2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If optimizing for size (CONFIG_CC_OPTIMIZE_FOR_SIZE), allow gcc4 compilers to decide what to inline and what not - instead of the kernel forcing gcc to inline all the time. This requires several places that require to be inlined to be marked as such, previous patches in this series do that. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] mark several functions __always_inlineIngo Molnar2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arjan van de Ven <arjan@infradead.org> Mark a number of functions as 'must inline'. The functions affected by this patch need to be inlined because they use knowledge that their arguments are constant so that most of the function optimizes away. At this point this patch does not change behavior, it's for documentation only (and for future patches in the inline series) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] Make __always_inline actually force always inliningIngo Molnar2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is the first in a series that tries to optimize the kernel in terms of size (and thus cache behavior, both cpu and pagecache). This first patch changes __always_inline to be a forced inline instead of the "regular" inline it was on everything except alpha. This forced inline matches the intention of the define better as a matter of documentation. There is no change in behavior by this patch, since "inline" currently is mapped to a forced inline anyway. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] fbdev: Sanitize ->fb_mmap prototypeChristoph Hellwig2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No need for a file argument. If we'd really need it it's in vma->vm_file already. gbefb and sgivwfb used to set vma->vm_file to the file argument, but the kernel alrady did that. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] fbdev: Sanitize ->fb_ioctl prototypeChristoph Hellwig2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ioctl and file arguments to ->fb_mmap are totally unused and there's not reason a driver should need them. Also update the ->fb_compat_ioctl prototype to be the same as ->fb_mmap. Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] smbfs: remove kmalloc wrapperPekka Enberg2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | Remove the remaining kmalloc() wrapper bits from fs/smbfs/. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] ncpfs: remove kmalloc wrapperPekka Enberg2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | Remove remaining kmalloc wrapper bits from fs/ncpfs/. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] cpuset oom lock fixPaul Jackson2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem, reported in: http://bugzilla.kernel.org/show_bug.cgi?id=5859 and by various other email messages and lkml posts is that the cpuset hook in the oom (out of memory) code can try to take a cpuset semaphore while holding the tasklist_lock (a spinlock). One must not sleep while holding a spinlock. The fix seems easy enough - move the cpuset semaphore region outside the tasklist_lock region. This required a few lines of mechanism to implement. The oom code where the locking needs to be changed does not have access to the cpuset locks, which are internal to kernel/cpuset.c only. So I provided a couple more cpuset interface routines, available to the rest of the kernel, which simple take and drop the lock needed here (cpusets callback_sem). Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] s390: cputime misaccountingMartin Schwidefsky2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | finish_arch_switch needs to update the user cpu time as well, not just the system cpu time. Otherwise the partial user cpu time of a process that is stored in the lowcore will be (mis-)accounted to the next process. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] Add tmpfs options for memory placement policiesRobin Holt2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Anything that writes into a tmpfs filesystem is liable to disproportionately decrease the available memory on a particular node. Since there's no telling what sort of application (e.g. dd/cp/cat) might be dropping large files there, this lets the admin choose the appropriate default behavior for their site's situation. Introduce a tmpfs mount option which allows specifying a memory policy and a second option to specify the nodelist for that policy. With the default policy, tmpfs will behave as it does today. This patch adds support for preferred, bind, and interleave policies. The default policy will cause pages to be added to tmpfs files on the node which is doing the writing. Some jobs expect a single process to create and manage the tmpfs files. This results in a node which has a significantly reduced number of free pages. With this patch, the administrator can specify the policy and nodes for that policy where they would prefer allocations. This patch was originally written by Brent Casavant and Hugh Dickins. I added support for the bind and preferred policies and the mpol_nodelist mount option. Signed-off-by: Brent Casavant <bcasavan@sgi.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] Fix for CONFIG_NUMA without CONFIG_SWAPChristoph Lameter2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some people apparently run CONFIG_NUMA without CONFIG_SWAP. The migration code currently depends on swap. This patch provides a set of inline fallback functions so that the kernel properly compiles. However, calls to migration functions will fail. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] sched: add new SCHED_BATCH policyIngo Molnar2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new SCHED_BATCH (3) scheduling policy: such tasks are presumed CPU-intensive, and will acquire a constant +5 priority level penalty. Such policy is nice for workloads that are non-interactive, but which do not want to give up their nice levels. The policy is also useful for workloads that want a deterministic scheduling policy without interactivity causing extra preemptions (between that workload's tasks). Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] Altix: ioc3 serial supportPatrick Gefre2006-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add driver support for a 2 port PCI IOC3-based serial card on Altix boxes: This is a re-submission. On the original submission I was asked to organize the code so that the MIPS ioc3 ethernet and serial parts could be used with this driver. Stanislaw Skowronek was kind enough to provide the shim layer for this - thanks Stanislaw. This patch includes the shim layer and the Altix PCI ioc3 serial driver. The MIPS merged ioc3 ethernet and serial support is forthcoming. Signed-off-by: Patrick Gefre <pfg@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | [PATCH] convert /proc/devices to use seq_file interfaceNeil Horman2006-01-14
| |/ |/| | | | | | | | | | | | | | | | | | | | | A Christoph suggested that the /proc/devices file be converted to use the seq_file interface. This patch does that. I've obxerved one or two installation that had sufficiently large sans that they overran the 4k limit on /proc/devices. Signed-off-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds2006-01-14
|\ \
| * | [SCSI] fusion - adding support for FC949ESMoore, Eric2006-01-14
| | | | | | | | | | | | | | | | | | | | | Add software recognition for the new LSI Logic Fibre Channel controller. Signed-off-by: Eric Moore <Eric.Moore@lsil.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
| * | [SCSI] raid_class.c - adding RAID10 and RAID10 definesMoore, Eric2006-01-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Adding defines for RAID10 and RAID50 levels, in preparation of adding RAID Transport support in the mpt fusion drivers. (BTW: IME is RAID10, and IM is RAID1). Signed-off-by: Eric Moore <Eric.Moore@lsil.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>