Diffstat (limited to 'Documentation/vm/cleancache.txt')
-rw-r--r-- Documentation/vm/cleancache.txt 43
1 files changed, 21 insertions, 22 deletions
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt
index 142fbb0f325..36c367c7308 100644
--- a/Documentation/vm/cleancache.txt
+++ b/Documentation/vm/cleancache.txt
@@ -46,11 +46,10 @@ a negative return value indicates failure. A "put_page" will copy a
 the pool id, a file key, and a page index into the file. (The combination
 of a pool id, a file key, and an index is sometimes called a "handle".)
 A "get_page" will copy the page, if found, from cleancache into kernel memory.
-An "invalidate_page" will ensure the page no longer is present in cleancache;
-an "invalidate_inode" will invalidate all pages associated with the specified
-file; and, when a filesystem is unmounted, an "invalidate_fs" will invalidate
-all pages in all files specified by the given pool id and also surrender
-the pool id.
+A "flush_page" will ensure the page no longer is present in cleancache;
+a "flush_inode" will flush all pages associated with the specified file;
+and, when a filesystem is unmounted, a "flush_fs" will flush all pages in
+all files specified by the given pool id and also surrender the pool id.
 
 An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache
 to treat the pool as shared using a 128-bit UUID as a key. On systems
@@ -63,12 +62,12 @@ of the kernel (e.g. by "tools" that control cleancache). Or a
 cleancache implementation can simply disable shared_init by always
 returning a negative value.
 
-If a get_page is successful on a non-shared pool, the page is invalidated
-(thus making cleancache an "exclusive" cache). On a shared pool, the page
-is NOT invalidated on a successful get_page so that it remains accessible to
+If a get_page is successful on a non-shared pool, the page is flushed (thus
+making cleancache an "exclusive" cache). On a shared pool, the page
+is NOT flushed on a successful get_page so that it remains accessible to
 other sharers. The kernel is responsible for ensuring coherency between
 cleancache (shared or not), the page cache, and the filesystem, using
-cleancache invalidate operations as required.
+cleancache flush operations as required.
 
 Note that cleancache must enforce put-put-get coherency and get-get
 coherency. For the former, if two puts are made to the same handle but
@@ -78,22 +77,22 @@ if a get for a given handle fails, subsequent gets for that handle will
 never succeed unless preceded by a successful put with that handle.
 
 Last, cleancache provides no SMP serialization guarantees; if two
-different Linux threads are simultaneously putting and invalidating a page
+different Linux threads are simultaneously putting and flushing a page
 with the same handle, the results are indeterminate. Callers must
 lock the page to ensure serial behavior.
 
 CLEANCACHE PERFORMANCE METRICS
 
-If properly configured, monitoring of cleancache is done via debugfs in
-the /sys/kernel/debug/mm/cleancache directory. The effectiveness of cleancache
+Cleancache monitoring is done by sysfs files in the
+/sys/kernel/mm/cleancache directory. The effectiveness of cleancache
 can be measured (across all filesystems) with:
 
 succ_gets - number of gets that were successful
 failed_gets - number of gets that failed
 puts - number of puts attempted (all "succeed")
-invalidates - number of invalidates attempted
+flushes - number of flushes attempted
 
-A backend implementation may provide additional metrics.
+A backend implementatation may provide additional metrics.
 
 FAQ
 
@@ -144,7 +143,7 @@ systems.
 
 The core hooks for cleancache in VFS are in most cases a single line
 and the minimum set are placed precisely where needed to maintain
-coherency (via cleancache_invalidate operations) between cleancache,
+coherency (via cleancache_flush operations) between cleancache,
 the page cache, and disk. All hooks compile into nothingness if
 cleancache is config'ed off and turn into a function-pointer-
 compare-to-NULL if config'ed on but no backend claims the ops
@@ -185,15 +184,15 @@ or for real kernel-addressable RAM, it makes perfect sense for
 transcendent memory.
 
 4) Why is non-shared cleancache "exclusive"? And where is the
-  page "invalidated" after a "get"? (Minchan Kim)
+  page "flushed" after a "get"? (Minchan Kim)
 
 The main reason is to free up space in transcendent memory and
-to avoid unnecessary cleancache_invalidate calls. If you want inclusive,
+to avoid unnecessary cleancache_flush calls. If you want inclusive,
 the page can be "put" immediately following the "get". If
 put-after-get for inclusive becomes common, the interface could
-be easily extended to add a "get_no_invalidate" call.
+be easily extended to add a "get_no_flush" call.
 
-The invalidate is done by the cleancache backend implementation.
+The flush is done by the cleancache backend implementation.
 
 5) What's the performance impact?
 
@@ -223,7 +222,7 @@ Some points for a filesystem to consider:
   as tmpfs should not enable cleancache)
 - To ensure coherency/correctness, the FS must ensure that all
   file removal or truncation operations either go through VFS or
-  add hooks to do the equivalent cleancache "invalidate" operations
+  add hooks to do the equivalent cleancache "flush" operations
 - To ensure coherency/correctness, either inode numbers must
   be unique across the lifetime of the on-disk file OR the
   FS must provide an "encode_fh" function.
@@ -244,11 +243,11 @@ If cleancache would use the inode virtual address instead of
 inode/filehandle, the pool id could be eliminated. But, this
 won't work because cleancache retains pagecache data pages
 persistently even when the inode has been pruned from the
-inode unused list, and only invalidates the data page if the file
+inode unused list, and only flushes the data page if the file
 gets removed/truncated. So if cleancache used the inode kva,
 there would be potential coherency issues if/when the inode
 kva is reused for a different file. Alternately, if cleancache
-invalidated the pages when the inode kva was freed, much of the value
+flushed the pages when the inode kva was freed, much of the value
 of cleancache would be lost because the cache of pages in cleanache
 is potentially much larger than the kernel pagecache and is most
 useful if the pages survive inode cache removal.
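
Note (illustration only, not part of the patch): the handle semantics that
the patched text describes can be modelled in a few lines of Python. This is
a minimal sketch under stated assumptions. The class ToyCleancache, its
method signatures, and the dictionary storage are all hypothetical and are
not the kernel interface. The method names follow the "flush" terminology
on the "+" side of this diff. The toy never discards pages on its own,
whereas a real backend may, and it does no locking, so callers would have
to serialize access as the document requires.

```python
class ToyCleancache:
    """Hypothetical in-memory model of the documented semantics.

    A "handle" is (pool id, file key, page index). Nothing here is
    kernel code; it only mirrors the behaviour the text describes.
    """

    def __init__(self):
        # pool_id -> {"shared": bool, "pages": {(file_key, index): bytes}}
        self._pools = {}
        self._next_pool_id = 0

    def init_fs(self, shared=False):
        """Obtain a new pool id; shared=True models init_shared_fs."""
        pool_id = self._next_pool_id
        self._next_pool_id += 1
        self._pools[pool_id] = {"shared": shared, "pages": {}}
        return pool_id

    def put_page(self, pool_id, file_key, index, data):
        """Copy the page in; a later put to the same handle replaces it."""
        self._pools[pool_id]["pages"][(file_key, index)] = bytes(data)

    def get_page(self, pool_id, file_key, index):
        """Return the page or None.

        On a non-shared pool a successful get also flushes the page
        (the "exclusive" behaviour); on a shared pool the page stays.
        """
        pool = self._pools[pool_id]
        handle = (file_key, index)
        if pool["shared"]:
            return pool["pages"].get(handle)
        return pool["pages"].pop(handle, None)

    def flush_page(self, pool_id, file_key, index):
        """Ensure the page is no longer present."""
        self._pools[pool_id]["pages"].pop((file_key, index), None)

    def flush_inode(self, pool_id, file_key):
        """Flush every page associated with one file."""
        pages = self._pools[pool_id]["pages"]
        for handle in [h for h in pages if h[0] == file_key]:
            del pages[handle]

    def flush_fs(self, pool_id):
        """Flush all pages in the pool and surrender the pool id."""
        del self._pools[pool_id]


# Usage: two puts to one handle, then an exclusive get.
cache = ToyCleancache()
pool = cache.init_fs()
cache.put_page(pool, "inode-1", 0, b"old")
cache.put_page(pool, "inode-1", 0, b"new")
first = cache.get_page(pool, "inode-1", 0)   # the second put's data
second = cache.get_page(pool, "inode-1", 0)  # None: the get flushed it
```

In this model the second get fails because the first one removed the page,
and it keeps failing until another put to the same handle succeeds, which is
the get-get coherency rule stated in the documentation.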