author     Glauber Costa <glommer@openvz.org>   2013-08-27 20:17:53 -0400
committer  Al Viro <viro@zeniv.linux.org.uk>    2013-09-10 18:56:29 -0400
commit     3942c07ccf98e66b8893f396dca98f5b076f905f (patch)
tree       063ec7aa542d9fa812482c02e2436205fe6a9e8e /fs/inode.c
parent     da5338c7498556b760871661ffecb053cc6f708f (diff)
fs: bump inode and dentry counters to long
This series reworks our current object cache shrinking infrastructure in
two main ways:
* Noticing that a lot of users copy and paste their own version of LRU
lists for objects, we put some effort into providing a generic version
(sketched just after this list). It is modeled after the filesystem
users: dentries, inodes, and xfs (for various tasks), but we expect
that other users could benefit in the near future with little or no
modification. Let us know if you have any issues.
* The underlying list_lru being proposed automatically and
transparently keeps the elements in per-node lists, and can
manipulate each node's list individually. Given this infrastructure,
we are able to change the up-to-now hammer called shrink_slab to
proceed with per-node reclaim instead of always scanning memory from
all over as it has been doing.
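As an illustration, here is a minimal sketch of how a cache might sit
on top of the generic list_lru API proposed in this series (the
demo_lru cache and the demo_isolate callback are hypothetical; the
list_lru calls, the lru_status values, and the locking convention
follow the patches):

#include <linux/list_lru.h>

static struct list_lru demo_lru;	/* per-node lists kept internally */

/*
 * Isolate callback: invoked for each item during a walk, under the
 * internal per-node lru lock.  Returning LRU_REMOVED tells list_lru
 * that the callback itself took the item off the list.
 */
static enum lru_status demo_isolate(struct list_head *item,
				    spinlock_t *lock, void *cb_arg)
{
	list_del_init(item);
	return LRU_REMOVED;
}

static int __init demo_cache_init(void)
{
	return list_lru_init(&demo_lru);	/* allocates the node arrays */
}

static void demo_cache_track(struct list_head *item)
{
	/* both add and del return whether the list actually changed */
	list_lru_add(&demo_lru, item);
}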
Per-node lru lists are also expected to lead to less contention on the
lru locks during multi-node scans, since we no longer fight over a
global lock. The locks usually disappear from the profilers with this
change.
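To make the node awareness concrete, here is a rough sketch of a
shrinker under the count/scan API introduced later in the series,
reusing the hypothetical demo_lru and demo_isolate from the sketch
above; count_objects/scan_objects, shrink_control->nid, and
SHRINKER_NUMA_AWARE are the interfaces the patches add:

/* Count only the objects on the node currently being reclaimed. */
static unsigned long demo_count(struct shrinker *shrink,
				struct shrink_control *sc)
{
	return list_lru_count_node(&demo_lru, sc->nid);
}

/* Walk just that node's list, isolating at most sc->nr_to_scan items. */
static unsigned long demo_scan(struct shrinker *shrink,
			       struct shrink_control *sc)
{
	return list_lru_walk_node(&demo_lru, sc->nid, demo_isolate,
				  NULL, &sc->nr_to_scan);
}

static struct shrinker demo_shrinker = {
	.count_objects	= demo_count,
	.scan_objects	= demo_scan,
	.seeks		= DEFAULT_SEEKS,
	.flags		= SHRINKER_NUMA_AWARE,	/* shrink_slab scans per node */
};

With SHRINKER_NUMA_AWARE set, reclaim can ask each shrinker about one
node at a time instead of the whole machine.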
Although we have no official benchmarks for this version - be our guest
to independently evaluate it - earlier versions of this series were
performance tested (details at
http://permalink.gmane.org/gmane.linux.kernel.mm/100537), showing no
visible performance regressions while yielding better qualitative
behavior on NUMA machines.
With this infrastructure in place, we can use the list_lru entry point to
provide memcg isolation and per-memcg targeted reclaim. Historically,
those two pieces of work have been posted together. This version presents
only the infrastructure work, deferring the memcg work for a later time,
so we can focus on getting this part tested. You can see more about the
history of such work at http://lwn.net/Articles/552769/
Dave Chinner (18):
dcache: convert dentry_stat.nr_unused to per-cpu counters
dentry: move to per-sb LRU locks
dcache: remove dentries from LRU before putting on dispose list
mm: new shrinker API
shrinker: convert superblock shrinkers to new API
list: add a new LRU list type
inode: convert inode lru list to generic lru list code.
dcache: convert to use new lru list infrastructure
list_lru: per-node list infrastructure
shrinker: add node awareness
fs: convert inode and dentry shrinking to be node aware
xfs: convert buftarg LRU to generic code
xfs: rework buffer dispose list tracking
xfs: convert dquot cache lru to list_lru
fs: convert fs shrinkers to new scan/count API
drivers: convert shrinkers to new count/scan API
shrinker: convert remaining shrinkers to count/scan API
shrinker: Kill old ->shrink API.
Glauber Costa (7):
fs: bump inode and dentry counters to long
super: fix calculation of shrinkable objects for small numbers
list_lru: per-node API
vmscan: per-node deferred work
i915: bail out earlier when shrinker cannot acquire mutex
hugepage: convert huge zero page shrinker to new shrinker API
list_lru: dynamically adjust node arrays
This patch:
There are situations in very large machines in which we can have a large
quantity of dirty inodes, unused dentries, etc. This is particularly true
when umounting a filesystem, since every live object will eventually be
discarded.
Dave Chinner reported a problem with this while experimenting with the
shrinker revamp patchset, so we believe it is time for a change. This
patch just converts the counters from int to long. Machines where it
matters should have a 64-bit long anyway.
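As a user-space illustration (hypothetical demo code, not part of the
patch), this is the failure mode being avoided: an object count above
INT_MAX cannot survive a round trip through an int, while a 64-bit
long holds it comfortably:

#include <limits.h>
#include <stdio.h>

int main(void)
{
	long nr_objects = 3000000000L;	/* plausible on very large machines */

	/*
	 * Converting an out-of-range value to int is implementation-
	 * defined; on common ABIs it wraps to a negative number - the
	 * very corruption the sum < 0 checks in the diff below mask.
	 */
	int as_int = (int)nr_objects;

	printf("INT_MAX=%d  actual=%ld  as int=%d\n",
	       INT_MAX, nr_objects, as_int);
	return 0;
}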
Signed-off-by: Glauber Costa <glommer@openvz.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Diffstat (limited to 'fs/inode.c')
-rw-r--r--  fs/inode.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
index 93a0625b46e4..2a3c37ea823d 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -70,33 +70,33 @@ EXPORT_SYMBOL(empty_aops);
  */
 struct inodes_stat_t inodes_stat;
 
-static DEFINE_PER_CPU(unsigned int, nr_inodes);
-static DEFINE_PER_CPU(unsigned int, nr_unused);
+static DEFINE_PER_CPU(unsigned long, nr_inodes);
+static DEFINE_PER_CPU(unsigned long, nr_unused);
 
 static struct kmem_cache *inode_cachep __read_mostly;
 
-static int get_nr_inodes(void)
+static long get_nr_inodes(void)
 {
 	int i;
-	int sum = 0;
+	long sum = 0;
 	for_each_possible_cpu(i)
 		sum += per_cpu(nr_inodes, i);
 	return sum < 0 ? 0 : sum;
 }
 
-static inline int get_nr_inodes_unused(void)
+static inline long get_nr_inodes_unused(void)
 {
 	int i;
-	int sum = 0;
+	long sum = 0;
 	for_each_possible_cpu(i)
 		sum += per_cpu(nr_unused, i);
 	return sum < 0 ? 0 : sum;
 }
 
-int get_nr_dirty_inodes(void)
+long get_nr_dirty_inodes(void)
 {
 	/* not actually dirty inodes, but a wild approximation */
-	int nr_dirty = get_nr_inodes() - get_nr_inodes_unused();
+	long nr_dirty = get_nr_inodes() - get_nr_inodes_unused();
 	return nr_dirty > 0 ? nr_dirty : 0;
 }
 
@@ -109,7 +109,7 @@ int proc_nr_inodes(ctl_table *table, int write,
 {
 	inodes_stat.nr_inodes = get_nr_inodes();
 	inodes_stat.nr_unused = get_nr_inodes_unused();
-	return proc_dointvec(table, write, buffer, lenp, ppos);
+	return proc_doulongvec_minmax(table, write, buffer, lenp, ppos);
 }
 #endif
 
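For context, switching the handler to proc_doulongvec_minmax() only
works because the matching sysctl table entry (which lives in
kernel/sysctl.c, outside this diffstat) hands it longs. A hedged sketch
of what that entry would have to look like, with field values assumed
from the pattern in this hunk:

static struct ctl_table fs_stat_table[] = {
	{
		.procname	= "inode-nr",
		.data		= &inodes_stat,	/* two longs: nr_inodes, nr_unused */
		.maxlen		= 2*sizeof(long),	/* was 2*sizeof(int) */
		.mode		= 0444,
		.proc_handler	= proc_nr_inodes,
	},
	{ }
};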