aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/sysctl.c
Commit message (Collapse)AuthorAge
* Merge branch 'for-linus' of ↵Linus Torvalds2010-10-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) split invalidate_inodes() fs: skip I_FREEING inodes in writeback_sb_inodes fs: fold invalidate_list into invalidate_inodes fs: do not drop inode_lock in dispose_list fs: inode split IO and LRU lists fs: switch bdev inode bdi's correctly fs: fix buffer invalidation in invalidate_list fsnotify: use dget_parent smbfs: use dget_parent exportfs: use dget_parent fs: use RCU read side protection in d_validate fs: clean up dentry lru modification fs: split __shrink_dcache_sb fs: improve DCACHE_REFERENCED usage fs: use percpu counter for nr_dentry and nr_dentry_unused fs: simplify __d_free fs: take dcache_lock inside __d_path fs: do not assign default i_ino in new_inode fs: introduce a per-cpu last_ino allocator new helper: ihold() ...
| * fs: use percpu counter for nr_dentry and nr_dentry_unusedChristoph Hellwig2010-10-25
| | | | | | | | | | | | | | | | | | | | | | | | The nr_dentry stat is a globally touched cacheline and atomic operation twice over the lifetime of a dentry. It is used for the benfit of userspace only. Turn it into a per-cpu counter and always decrement it in d_free instead of doing various batching operations to reduce lock hold times in the callers. Based on an earlier patch from Nick Piggin <npiggin@suse.de>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * fs: Convert nr_inodes and nr_unused to per-cpu countersDave Chinner2010-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The number of inodes allocated does not need to be tied to the addition or removal of an inode to/from a list. If we are not tied to a list lock, we could update the counters when inodes are initialised or destroyed, but to do that we need to convert the counters to be per-cpu (i.e. independent of a lock). This means that we have the freedom to change the list/locking implementation without needing to care about the counters. Based on a patch originally from Eric Dumazet. [AV: cleaned up a bit, fixed build breakage on weird configs Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * fs: allow for more than 2^31 filesEric Dumazet2010-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrew, Could you please review this patch, you probably are the right guy to take it, because it crosses fs and net trees. Note : /proc/sys/fs/file-nr is a read-only file, so this patch doesnt depend on previous patch (sysctl: fix min/max handling in __do_proc_doulongvec_minmax()) Thanks ! [PATCH V4] fs: allow for more than 2^31 files Robin Holt tried to boot a 16TB system and found af_unix was overflowing a 32bit value : <quote> We were seeing a failure which prevented boot. The kernel was incapable of creating either a named pipe or unix domain socket. This comes down to a common kernel function called unix_create1() which does: atomic_inc(&unix_nr_socks); if (atomic_read(&unix_nr_socks) > 2 * get_max_files()) goto out; The function get_max_files() is a simple return of files_stat.max_files. files_stat.max_files is a signed integer and is computed in fs/file_table.c's files_init(). n = (mempages * (PAGE_SIZE / 1024)) / 10; files_stat.max_files = n; In our case, mempages (total_ram_pages) is approx 3,758,096,384 (0xe0000000). That leaves max_files at approximately 1,503,238,553. This causes 2 * get_max_files() to integer overflow. </quote> Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long integers, and change af_unix to use an atomic_long_t instead of atomic_t. get_max_files() is changed to return an unsigned long. get_nr_files() is changed to return a long. unix_nr_socks is changed from atomic_t to atomic_long_t, while not strictly needed to address Robin problem. Before patch (on a 64bit kernel) : # echo 2147483648 >/proc/sys/fs/file-max # cat /proc/sys/fs/file-max -18446744071562067968 After patch: # echo 2147483648 >/proc/sys/fs/file-max # cat /proc/sys/fs/file-max 2147483648 # cat /proc/sys/fs/file-nr 704 0 2147483648 Reported-by: Robin Holt <holt@sgi.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David Miller <davem@davemloft.net> Reviewed-by: Robin Holt <holt@sgi.com> Tested-by: Robin Holt <holt@sgi.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | printk: declare printk_ratelimit_state in ratelimit.hNamhyung Kim2010-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | Adding declaration of printk_ratelimit_state in ratelimit.h removes potential build breakage and following sparse warning: kernel/printk.c:1426:1: warning: symbol 'printk_ratelimit_state' was not declared. Should it be static? [akpm@linux-foundation.org: remove unneeded ifdef] Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | fs: allow for more than 2^31 filesEric Dumazet2010-10-26
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Robin Holt tried to boot a 16TB system and found af_unix was overflowing a 32bit value : <quote> We were seeing a failure which prevented boot. The kernel was incapable of creating either a named pipe or unix domain socket. This comes down to a common kernel function called unix_create1() which does: atomic_inc(&unix_nr_socks); if (atomic_read(&unix_nr_socks) > 2 * get_max_files()) goto out; The function get_max_files() is a simple return of files_stat.max_files. files_stat.max_files is a signed integer and is computed in fs/file_table.c's files_init(). n = (mempages * (PAGE_SIZE / 1024)) / 10; files_stat.max_files = n; In our case, mempages (total_ram_pages) is approx 3,758,096,384 (0xe0000000). That leaves max_files at approximately 1,503,238,553. This causes 2 * get_max_files() to integer overflow. </quote> Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long integers, and change af_unix to use an atomic_long_t instead of atomic_t. get_max_files() is changed to return an unsigned long. get_nr_files() is changed to return a long. unix_nr_socks is changed from atomic_t to atomic_long_t, while not strictly needed to address Robin problem. Before patch (on a 64bit kernel) : # echo 2147483648 >/proc/sys/fs/file-max # cat /proc/sys/fs/file-max -18446744071562067968 After patch: # echo 2147483648 >/proc/sys/fs/file-max # cat /proc/sys/fs/file-max 2147483648 # cat /proc/sys/fs/file-nr 704 0 2147483648 Reported-by: Robin Holt <holt@sgi.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David Miller <davem@davemloft.net> Reviewed-by: Robin Holt <holt@sgi.com> Tested-by: Robin Holt <holt@sgi.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sysctl: fix min/max handling in __do_proc_doulongvec_minmax()Eric Dumazet2010-10-07
| | | | | | | | | | | | | | | | | | | When proc_doulongvec_minmax() is used with an array of longs, and no min/max check requested (.extra1 or .extra2 being NULL), we dereference a NULL pointer for the second element of the array. Noticed while doing some changes in network stack for the "16TB problem" Fix is to not change min & max pointers in __do_proc_doulongvec_minmax(), so that all elements of the vector share an unique min/max limit, like proc_dointvec_minmax(). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Americo Wang <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* gcc-4.6: kernel/*: Fix unused but set warningsAndi Kleen2010-09-05
| | | | | | | | | | No real bugs I believe, just some dead code. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: andi@firstfloor.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds2010-08-10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits) fanotify: use both marks when possible fsnotify: pass both the vfsmount mark and inode mark fsnotify: walk the inode and vfsmount lists simultaneously fsnotify: rework ignored mark flushing fsnotify: remove global fsnotify groups lists fsnotify: remove group->mask fsnotify: remove the global masks fsnotify: cleanup should_send_event fanotify: use the mark in handler functions audit: use the mark in handler functions dnotify: use the mark in handler functions inotify: use the mark in handler functions fsnotify: send fsnotify_mark to groups in event handling functions fsnotify: Exchange list heads instead of moving elements fsnotify: srcu to protect read side of inode and vfsmount locks fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called fsnotify: use _rcu functions for mark list traversal fsnotify: place marks on object in order of group memory address vfs/fsnotify: fsnotify_close can delay the final work in fput fsnotify: store struct file not struct path ... Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.
| * sysctl extern cleanup: inotifyDave Young2010-07-28
| | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be move to their own head file, and then include them in relavant .c files. Move inotify_table extern declaration to linux/inotify.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eric Paris <eparis@redhat.com>
| * dnotify: move dir_notify_enable declarationAlexey Dobriyan2010-07-28
| | | | | | | | | | | | | | Move dir_notify_enable declaration to where it belongs -- dnotify.h . Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* | oom: move sysctl declarations to oom.hDavid Rientjes2010-08-09
| | | | | | | | | | | | | | | | | | | | | | | | The three oom killer sysctl variables (sysctl_oom_dump_tasks, sysctl_oom_kill_allocating_task, and sysctl_panic_on_oom) are better declared in include/linux/oom.h rather than kernel/sysctl.c. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2010-08-07
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (55 commits) workqueue: mark init_workqueues() as early_initcall() workqueue: explain for_each_*cwq_cpu() iterators fscache: fix build on !CONFIG_SYSCTL slow-work: kill it gfs2: use workqueue instead of slow-work drm: use workqueue instead of slow-work cifs: use workqueue instead of slow-work fscache: drop references to slow-work fscache: convert operation to use workqueue instead of slow-work fscache: convert object to use workqueue instead of slow-work workqueue: fix how cpu number is stored in work->data workqueue: fix mayday_mask handling on UP workqueue: fix build problem on !CONFIG_SMP workqueue: fix locking in retry path of maybe_create_worker() async: use workqueue for worker pool workqueue: remove WQ_SINGLE_CPU and use WQ_UNBOUND instead workqueue: implement unbound workqueue workqueue: prepare for WQ_UNBOUND implementation libata: take advantage of cmwq and remove concurrency limitations workqueue: fix worker management invocation without pending works ... Fixed up conflicts in fs/cifs/* as per Tejun. Other trivial conflicts in include/linux/workqueue.h, kernel/trace/Kconfig and kernel/workqueue.c
| * | slow-work: kill itTejun Heo2010-07-23
| |/ | | | | | | | | | | | | slow-work doesn't have any user left. Kill it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Howells <dhowells@redhat.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6Linus Torvalds2010-08-06
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (28 commits) driver core: device_rename's new_name can be const sysfs: Remove owner field from sysfs struct attribute powerpc/pci: Remove owner field from attribute initialization in PCI bridge init regulator: Remove owner field from attribute initialization in regulator core driver leds: Remove owner field from attribute initialization in bd2802 driver scsi: Remove owner field from attribute initialization in ARCMSR driver scsi: Remove owner field from attribute initialization in LPFC driver cgroupfs: create /sys/fs/cgroup to mount cgroupfs on Driver core: Add BUS_NOTIFY_BIND_DRIVER driver core: fix memory leak on one error path in bus_register() debugfs: no longer needs to depend on SYSFS sysfs: Fix one more signature discrepancy between sysfs implementation and docs. sysfs: fix discrepancies between implementation and documentation dcdbas: remove a redundant smi_data_buf_free in dcdbas_exit dmi-id: fix a memory leak in dmi_id_init error path sysfs: sysfs_chmod_file's attr can be const firmware: Update hotplug script Driver core: move platform device creation helpers to .init.text (if MODULE=n) Driver core: reduce duplicated code for platform_device creation Driver core: use kmemdup in platform_device_add_resources ...
| * | hotplug: Support kernel/hotplug sysctl variable when !CONFIG_NETIan Abbott2010-08-05
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel/hotplug sysctl variable (/proc/sys/kernel/hotplug file) was made conditional on CONFIG_NET by commit f743ca5e10f4145e0b3e6d11b9b46171e16af7ce (applied in 2.6.18) to fix problems with undefined references in 2.6.16 when CONFIG_HOTPLUG=y && !CONFIG_NET, but this restriction is no longer needed. This patch makes the kernel/hotplug sysctl variable depend only on CONFIG_HOTPLUG. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Acked-by: Randy Dunlap <randy.dunlap@oracle.COM> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | Merge branch 'perf/nmi' into perf/coreIngo Molnar2010-08-05
|\ \ | |/ |/| | | | | | | | | | | | | Conflicts: kernel/Makefile Merge reason: Add the now complete topic, fix the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * lockup_detector: Remove old softlockup codeDon Zickus2010-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that is no longer compiled or used, just remove it. Also move some of the code wrapped with DETECT_SOFTLOCKUP to the LOCKUP_DETECTOR wrappers because that is the code that uses it now. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Eric Paris <eparis@redhat.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> LKML-Reference: <1273266711-18706-4-git-send-email-dzickus@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| * lockup_detector: Touch_softlockup cleanups and softlockup_tick removalDon Zickus2010-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just some code cleanup to make touch_softlockup clearer and remove the softlockup_tick function as it is no longer needed. Also remove the /proc softlockup_thres call as it has been changed to watchdog_thres. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Eric Paris <eparis@redhat.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> LKML-Reference: <1273266711-18706-3-git-send-email-dzickus@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| * lockup_detector: Combine nmi_watchdog and softlockup detectorDon Zickus2010-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new nmi_watchdog (which uses the perf event subsystem) is very similar in structure to the softlockup detector. Using Ingo's suggestion, I combined the two functionalities into one file: kernel/watchdog.c. Now both the nmi_watchdog (or hardlockup detector) and softlockup detector sit on top of the perf event subsystem, which is run every 60 seconds or so to see if there are any lockups. To detect hardlockups, cpus not responding to interrupts, I implemented an hrtimer that runs 5 times for every perf event overflow event. If that stops counting on a cpu, then the cpu is most likely in trouble. To detect softlockups, tasks not yielding to the scheduler, I used the previous kthread idea that now gets kicked every time the hrtimer fires. If the kthread isn't being scheduled neither is anyone else and the warning is printed to the console. I tested this on x86_64 and both the softlockup and hardlockup paths work. V2: - cleaned up the Kconfig and softlockup combination - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI - seperated out the softlockup case from perf event subsystem - re-arranged the enabling/disabling nmi watchdog from proc space - added cpumasks for hardlockup failure cases - removed fallback to soft events if no PMU exists for hard events V3: - comment cleanups - drop support for older softlockup code - per_cpu cleanups - completely remove software clock base hardlockup detector - use per_cpu masking on hard/soft lockup detection - #ifdef cleanups - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR - documentation additions V4: - documentation fixes - convert per_cpu to __get_cpu_var - powerpc compile fixes V5: - split apart warn flags for hard and soft lockups TODO: - figure out how to make an arch-agnostic clock2cycles call (if possible) to feed into perf events as a sample period [fweisbec: merged conflict patch] Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Eric Paris <eparis@redhat.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
| * Merge commit 'v2.6.34-rc7' into perf/nmiFrederic Weisbecker2010-05-12
| |\ | | | | | | | | | Merge reason: catch up with latest softlockup detector changes.
| * | nmi_watchdog: Compile and portability fixesDon Zickus2010-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original patch was x86_64 centric. Changed the code to make it less so. ested by building and running on a powerpc. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: peterz@infradead.org Cc: gorcunov@gmail.com Cc: aris@redhat.com LKML-Reference: <1266013161-31197-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | pipe: change /proc/sys/fs/pipe-max-pages to byte sized interfaceJens Axboe2010-06-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This changes the interface to be based on bytes instead. The API matches that of F_SETPIPE_SZ in that it rounds up the passed in size so that the resulting page array is a power-of-2 in size. The proc file is renamed to /proc/sys/fs/pipe-max-size to reflect this change. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2010-05-25
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (63 commits) drivers/net/usb/asix.c: Fix pointer cast. be2net: Bug fix to avoid disabling bottom half during firmware upgrade. proc_dointvec: write a single value hso: add support for new products Phonet: fix potential use-after-free in pep_sock_close() ath9k: remove VEOL support for ad-hoc ath9k: change beacon allocation to prefer the first beacon slot sock.h: fix kernel-doc warning cls_cgroup: Fix build error when built-in macvlan: do proper cleanup in macvlan_common_newlink() V2 be2net: Bug fix in init code in probe net/dccp: expansion of error code size ath9k: Fix rx of mcast/bcast frames in PS mode with auto sleep wireless: fix sta_info.h kernel-doc warnings wireless: fix mac80211.h kernel-doc warnings iwlwifi: testing the wrong variable in iwl_add_bssid_station() ath9k_htc: rare leak in ath9k_hif_usb_alloc_tx_urbs() ath9k_htc: dereferencing before check in hif_usb_tx_cb() rt2x00: Fix rt2800usb TX descriptor writing. rt2x00: Fix failed SLEEP->AWAKE and AWAKE->SLEEP transitions. ...
| * | | proc_dointvec: write a single valueJ. R. Okajima2010-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 00b7c3395aec3df43de5bd02a3c5a099ca51169f "sysctl: refactor integer handling proc code" modified the behaviour of writing to /proc. Before the commit, write("1\n") to /proc/sys/kernel/printk succeeded. But now it returns EINVAL. This commit supports writing a single value to a multi-valued entry. Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp> Reviewed-and-tested-by: WANG Cong <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | mm: compaction: add a tunable that decides when memory should be compacted ↵Mel Gorman2010-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and when it should be reclaimed The kernel applies some heuristics when deciding if memory should be compacted or reclaimed to satisfy a high-order allocation. One of these is based on the fragmentation. If the index is below 500, memory will not be compacted. This choice is arbitrary and not based on data. To help optimise the system and set a sensible default for this value, this patch adds a sysctl extfrag_threshold. The kernel will only compact memory if the fragmentation index is above the extfrag_threshold. [randy.dunlap@oracle.com: Fix build errors when proc fs is not configured] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | mm: compaction: add /proc trigger for memory compactionMel Gorman2010-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a proc file /proc/sys/vm/compact_memory. When an arbitrary value is written to the file, all zones are compacted. The expected user of such a trigger is a job scheduler that prepares the system before the target application runs. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Rik van Riel <riel@redhat.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'for-2.6.35' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2010-05-21
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.35' of git://git.kernel.dk/linux-2.6-block: (86 commits) pipe: set lower and upper limit on max pages in the pipe page array pipe: add support for shrinking and growing pipes drbd: This is now equivalent to drbd release 8.3.8rc1 drbd: Do not free p_uuid early, this is done in the exit code of the receiver drbd: Null pointer deref fix to the large "multi bio rewrite" drbd: Fix: Do not detach, if a bio with a barrier fails drbd: Ensure to not trigger late-new-UUID creation multiple times drbd: Do not Oops when C_STANDALONE when uuid gets generated writeback: fix mixed up arguments to bdi_start_writeback() writeback: fix problem with !CONFIG_BLOCK compilation block: improve automatic native capacity unlocking block: use struct parsed_partitions *state universally in partition check code block,ide: simplify bdops->set_capacity() to ->unlock_native_capacity() block: restart partition scan after resizing a device buffer: make invalidate_bdev() drain all percpu LRU add caches block: remove all rcu head initializations writeback: fixups for !dirty_writeback_centisecs writeback: bdi_writeback_task() must set task state before calling schedule() writeback: ensure that WB_SYNC_NONE writeback with sb pinned is sync drivers/block/drbd: Use kzalloc ...
| * | | | Merge branch 'master' into for-2.6.35Jens Axboe2010-05-21
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: fs/ext3/fsync.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | pipe: set lower and upper limit on max pages in the pipe page arrayJens Axboe2010-05-21
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need at least two to guarantee proper POSIX behaviour, so never allow a smaller limit than that. Also expose a /proc/sys/fs/pipe-max-pages sysctl file that allows root to define a sane upper limit. Make it default to 16 times the default size, which is 16 pages. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | sysctl: fix kernel-doc notation and typosRandy Dunlap2010-05-21
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | Fix kernel-doc warnings, kernel-doc special characters, and typos in recent kernel/sysctl.c additions. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Amerigo Wang <amwang@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds2010-05-21
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1674 commits) qlcnic: adding co maintainer ixgbe: add support for active DA cables ixgbe: dcb, do not tag tc_prio_control frames ixgbe: fix ixgbe_tx_is_paused logic ixgbe: always enable vlan strip/insert when DCB is enabled ixgbe: remove some redundant code in setting FCoE FIP filter ixgbe: fix wrong offset to fc_frame_header in ixgbe_fcoe_ddp ixgbe: fix header len when unsplit packet overflows to data buffer ipv6: Never schedule DAD timer on dead address ipv6: Use POSTDAD state ipv6: Use state_lock to protect ifa state ipv6: Replace inet6_ifaddr->dead with state cxgb4: notify upper drivers if the device is already up when they load cxgb4: keep interrupts available when the ports are brought down cxgb4: fix initial addition of MAC address cnic: Return SPQ credit to bnx2x after ring setup and shutdown. cnic: Convert cnic_local_flags to atomic ops. can: Fix SJA1000 command register writes on SMP systems bridge: fix build for CONFIG_SYSFS disabled ARCNET: Limit com20020 PCI ID matches for SOHARD cards ... Fix up various conflicts with pcmcia tree drivers/net/ {pcmcia/3c589_cs.c, wireless/orinoco/orinoco_cs.c and wireless/orinoco/spectrum_cs.c} and feature removal (Documentation/feature-removal-schedule.txt). Also fix a non-content conflict due to pm_qos_requirement getting renamed in the PM tree (now pm_qos_request) in net/mac80211/scan.c
| * | | sysctl: add proc_do_large_bitmapOctavian Purdila2010-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new function can be used to read/write large bitmaps via /proc. A comma separated range format is used for compact output and input (e.g. 1,3-4,10-10). Writing into the file will first reset the bitmap then update it based on the given input. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | sysctl: refactor integer handling proc codeAmerigo Wang2010-05-16
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Based on Octavian's work, and I modified a lot.) As we are about to add another integer handling proc function a little bit of cleanup is in order: add a few helper functions to improve code readability and decrease code duplication. In the process a bug is also fixed: if the user specifies a number with more then 20 digits it will be interpreted as two integers (e.g. 10000...13 will be interpreted as 100.... and 13). Behavior for EFAULT handling was changed as well. Previous to this patch, when an EFAULT error occurred in the middle of a write operation, although some of the elements were set, that was not acknowledged to the user (by shorting the write and returning the number of bytes accepted). EFAULT is now treated just like any other errors by acknowledging the amount of bytes accepted. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2010-05-20
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (40 commits) Input: psmouse - small formatting changes to better follow coding style Input: synaptics - set dimensions as reported by firmware Input: elantech - relax signature checks Input: elantech - enforce common prefix on messages Input: wistron_btns - switch to using kmemdup() Input: usbtouchscreen - switch to using kmemdup() Input: do not force selecting i8042 on Moorestown Input: Documentation/sysrq.txt - update KEY_SYSRQ info Input: 88pm860x_onkey - remove invalid irq number assignment Input: i8042 - add a PNP entry to the aux device list Input: i8042 - add some extra PNP keyboard types Input: wm9712 - fix wm97xx_set_gpio() logic Input: add keypad driver for keys interfaced to TCA6416 Input: remove obsolete {corgi,spitz,tosa}kbd.c Input: kbtab - do not advertise unsupported events Input: kbtab - simplify kbtab_disconnect() Input: kbtab - fix incorrect size parameter in usb_buffer_free Input: acecad - don't advertise mouse events Input: acecad - fix some formatting issues Input: acecad - simplify usb_acecad_disconnect() ... Trivial conflict in Documentation/feature-removal-schedule.txt
| * | | Input: implement SysRq as a separate input handlerDmitry Torokhov2010-04-14
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of keeping SysRq support inside of legacy keyboard driver split it out into a separate input handler (filter). This stops most SysRq input events from leaking into evdev clients (some events, such as first SysRq scancode - not keycode - event, are still leaked into both legacy keyboard and evdev). [martinez.javier@gmail.com: fix compile error when CONFIG_MAGIC_SYSRQ is not defined] Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* / / [S390] debug: enable exception-trace debug facilityHeiko Carstens2010-05-17
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The exception-trace facility on x86 and other architectures prints traces to dmesg whenever a user space application crashes. s390 has such a feature since ages however it is called userprocess_debug and is enabled differently. This patch makes sure that whenever one of the two procfs files /proc/sys/kernel/userprocess_debug /proc/sys/debug/exception-trace is modified the contents of the second one changes as well. That way we keep backwards compatibilty but also support the same interface like other architectures do. Besides that the output of the traces is improved since it will now also contain the corresponding filename of the vma (when available) where the process caused a fault or trap. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | sysctl extern cleanup: lockdepDave Young2010-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be moved to their own header file, and then include them in relavant .c files. Move lockdep extern declarations to linux/lockdep.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | sysctl extern cleanup: rtmutexDave Young2010-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be moved to their own header file, and then include them in relavant .c files. Move max_lock_depth extern declaration to linux/rtmutex.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | sysctl extern cleanup: acctDave Young2010-03-12
| | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be moved to their own header file, and then include them in relavant .c files. Move acct_parm extern declaration to linux/acct.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | sysctl extern cleanup: sgDave Young2010-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be moved to their own header file, and then include them in relavant .c files. Move sg_big_buff extern declaration to scsi/sg.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: Doug Gilbert <dgilbert@interlog.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | sysctl extern cleanup: moduleDave Young2010-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be moved to their own header file, and then include them in relavant .c files. Move modprobe_path extern declaration to linux/kmod.h Move modules_disabled extern declaration to linux/module.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | sysctl extern cleanup: rcuDave Young2010-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be moved to their own header file, and then include them in relavant .c files. Move rcutorture_runnable extern declaration to linux/rcupdate.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: Josh Triplett <josh@joshtriplett.org> Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | sysctl extern cleanup: signalDave Young2010-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be moved to their own header file, and then include them in relavant .c files. Move print_fatal_signals extern declaration to linux/signal.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | sysctl extern cleanup: C_A_DDave Young2010-03-12
| | | | | | | | | | | | | | | | | | | | | | Extern declarations in sysctl.c should be moved to their own header file, and then include them in relavant .c files. Move C_A_D extern variable declaration to linux/reboot.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'perf-probes-for-linus-2' of ↵Linus Torvalds2010-03-05
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-probes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Issue at least one memory barrier in stop_machine_text_poke() perf probe: Correct probe syntax on command line help perf probe: Add lazy line matching support perf probe: Show more lines after last line perf probe: Check function address range strictly in line finder perf probe: Use libdw callback routines perf probe: Use elfutils-libdw for analyzing debuginfo perf probe: Rename probe finder functions perf probe: Fix bugs in line range finder perf probe: Update perf probe document perf probe: Do not show --line option without dwarf support kprobes: Add documents of jump optimization kprobes/x86: Support kprobes jump optimization on x86 x86: Add text_poke_smp for SMP cross modifying code kprobes/x86: Cleanup save/restore registers kprobes/x86: Boost probes when reentering kprobes: Jump optimization sysctl interface kprobes: Introduce kprobes jump optimization kprobes: Introduce generic insn_slot framework kprobes/x86: Cleanup RELATIVEJUMP_INSTRUCTION to RELATIVEJUMP_OPCODE
| * | kprobes: Jump optimization sysctl interfaceMasami Hiramatsu2010-02-25
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add /proc/sys/debug/kprobes-optimization sysctl which enables and disables kprobes jump optimization on the fly for debugging. Changes in v7: - Remove ctl_name = CTL_UNNUMBERED for upstream compatibility. Changes in v6: - Update comments and coding style. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Anders Kaseorg <andersk@ksplice.com> Cc: Tim Abbott <tabbott@ksplice.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jason Baron <jbaron@redhat.com> Cc: Mathieu Desnoyers <compudj@krystal.dyndns.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> LKML-Reference: <20100225133415.6725.8274.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* / sparc: Support show_unhandled_signals.David S. Miller2010-03-01
|/ | | | | | Just faults right now, will add other traps later. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-linus' of ↵Linus Torvalds2009-12-17
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: Keys: KEYCTL_SESSION_TO_PARENT needs TIF_NOTIFY_RESUME architecture support NOMMU: Optimise away the {dac_,}mmap_min_addr tests security/min_addr.c: make init_mmap_min_addr() static keys: PTR_ERR return of wrong pointer in keyctl_get_security()
| * NOMMU: Optimise away the {dac_,}mmap_min_addr testsDavid Howells2009-12-16
| | | | | | | | | | | | | | | | | | | | | | | | In NOMMU mode clamp dac_mmap_min_addr to zero to cause the tests on it to be skipped by the compiler. We do this as the minimum mmap address doesn't make any sense in NOMMU mode. mmap_min_addr and round_hint_to_min() can be discarded entirely in NOMMU mode. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>