Diffstat (limited to 'Documentation/vm/cleancache.txt')
-rw-r--r-- | Documentation/vm/cleancache.txt | 41 |
1 files changed, 21 insertions, 20 deletions
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt
index d5c615af10ba..142fbb0f325a 100644
--- a/Documentation/vm/cleancache.txt
+++ b/Documentation/vm/cleancache.txt
@@ -46,10 +46,11 @@ a negative return value indicates failure. A "put_page" will copy a | |||
46 | the pool id, a file key, and a page index into the file. (The combination | 46 | the pool id, a file key, and a page index into the file. (The combination |
47 | of a pool id, a file key, and an index is sometimes called a "handle".) | 47 | of a pool id, a file key, and an index is sometimes called a "handle".) |
48 | A "get_page" will copy the page, if found, from cleancache into kernel memory. | 48 | A "get_page" will copy the page, if found, from cleancache into kernel memory. |
49 | A "flush_page" will ensure the page no longer is present in cleancache; | 49 | An "invalidate_page" will ensure the page no longer is present in cleancache; |
50 | a "flush_inode" will flush all pages associated with the specified file; | 50 | an "invalidate_inode" will invalidate all pages associated with the specified |
51 | and, when a filesystem is unmounted, a "flush_fs" will flush all pages in | 51 | file; and, when a filesystem is unmounted, an "invalidate_fs" will invalidate |
52 | all files specified by the given pool id and also surrender the pool id. | 52 | all pages in all files specified by the given pool id and also surrender |
53 | the pool id. | ||
53 | 54 | ||
54 | An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache | 55 | An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache |
55 | to treat the pool as shared using a 128-bit UUID as a key. On systems | 56 | to treat the pool as shared using a 128-bit UUID as a key. On systems |
@@ -62,12 +63,12 @@ of the kernel (e.g. by "tools" that control cleancache). Or a | |||
62 | cleancache implementation can simply disable shared_init by always | 63 | cleancache implementation can simply disable shared_init by always |
63 | returning a negative value. | 64 | returning a negative value. |
64 | 65 | ||
65 | If a get_page is successful on a non-shared pool, the page is flushed (thus | 66 | If a get_page is successful on a non-shared pool, the page is invalidated |
66 | making cleancache an "exclusive" cache). On a shared pool, the page | 67 | (thus making cleancache an "exclusive" cache). On a shared pool, the page |
67 | is NOT flushed on a successful get_page so that it remains accessible to | 68 | is NOT invalidated on a successful get_page so that it remains accessible to |
68 | other sharers. The kernel is responsible for ensuring coherency between | 69 | other sharers. The kernel is responsible for ensuring coherency between |
69 | cleancache (shared or not), the page cache, and the filesystem, using | 70 | cleancache (shared or not), the page cache, and the filesystem, using |
70 | cleancache flush operations as required. | 71 | cleancache invalidate operations as required. |
71 | 72 | ||
72 | Note that cleancache must enforce put-put-get coherency and get-get | 73 | Note that cleancache must enforce put-put-get coherency and get-get |
73 | coherency. For the former, if two puts are made to the same handle but | 74 | coherency. For the former, if two puts are made to the same handle but |
@@ -77,20 +78,20 @@ if a get for a given handle fails, subsequent gets for that handle will | |||
77 | never succeed unless preceded by a successful put with that handle. | 78 | never succeed unless preceded by a successful put with that handle. |
78 | 79 | ||
79 | Last, cleancache provides no SMP serialization guarantees; if two | 80 | Last, cleancache provides no SMP serialization guarantees; if two |
80 | different Linux threads are simultaneously putting and flushing a page | 81 | different Linux threads are simultaneously putting and invalidating a page |
81 | with the same handle, the results are indeterminate. Callers must | 82 | with the same handle, the results are indeterminate. Callers must |
82 | lock the page to ensure serial behavior. | 83 | lock the page to ensure serial behavior. |
83 | 84 | ||
84 | CLEANCACHE PERFORMANCE METRICS | 85 | CLEANCACHE PERFORMANCE METRICS |
85 | 86 | ||
86 | Cleancache monitoring is done by sysfs files in the | 87 | If properly configured, monitoring of cleancache is done via debugfs in |
87 | /sys/kernel/mm/cleancache directory. The effectiveness of cleancache | 88 | the /sys/kernel/debug/mm/cleancache directory. The effectiveness of cleancache |
88 | can be measured (across all filesystems) with: | 89 | can be measured (across all filesystems) with: |
89 | 90 | ||
90 | succ_gets - number of gets that were successful | 91 | succ_gets - number of gets that were successful |
91 | failed_gets - number of gets that failed | 92 | failed_gets - number of gets that failed |
92 | puts - number of puts attempted (all "succeed") | 93 | puts - number of puts attempted (all "succeed") |
93 | flushes - number of flushes attempted | 94 | invalidates - number of invalidates attempted |
94 | 95 | ||
95 | A backend implementation may provide additional metrics. | 96 | A backend implementation may provide additional metrics. |
96 | 97 | ||
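Reviewer note: the counters listed in this hunk can be combined into a get hit ratio. The following is only a minimal userspace sketch, not part of the patch. It assumes debugfs is mounted, that the directory is the one named on the new side of this hunk, and that each counter file holds a single decimal integer. The helper names read_counter and get_hit_ratio are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read one decimal counter from <dir>/<name>.
 * Returns 0 on success, -1 if the file is missing or cannot be parsed. */
int read_counter(const char *dir, const char *name, unsigned long long *out)
{
	char path[256];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%llu", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Fraction of gets served from cleancache; 0.0 when no gets were made. */
double get_hit_ratio(unsigned long long succ_gets,
		     unsigned long long failed_gets)
{
	unsigned long long total = succ_gets + failed_gets;

	return total ? (double)succ_gets / (double)total : 0.0;
}

#ifdef CLEANCACHE_DEMO_MAIN
/* Build with -DCLEANCACHE_DEMO_MAIN for a standalone program. */
int main(void)
{
	/* Path as given in the patched text; assumed, not verified here. */
	const char *dir = "/sys/kernel/debug/mm/cleancache";
	unsigned long long succ, failed;

	if (read_counter(dir, "succ_gets", &succ) ||
	    read_counter(dir, "failed_gets", &failed)) {
		fprintf(stderr, "cleancache counters not readable in %s\n", dir);
		return 1;
	}
	printf("get hit ratio: %.3f\n", get_hit_ratio(succ, failed));
	return 0;
}
#endif
```

Reading debugfs typically requires root. A backend may expose further counters, which this sketch ignores.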
@@ -143,7 +144,7 @@ systems. | |||
143 | 144 | ||
144 | The core hooks for cleancache in VFS are in most cases a single line | 145 | The core hooks for cleancache in VFS are in most cases a single line |
145 | and the minimum set are placed precisely where needed to maintain | 146 | and the minimum set are placed precisely where needed to maintain |
146 | coherency (via cleancache_flush operations) between cleancache, | 147 | coherency (via cleancache_invalidate operations) between cleancache, |
147 | the page cache, and disk. All hooks compile into nothingness if | 148 | the page cache, and disk. All hooks compile into nothingness if |
148 | cleancache is config'ed off and turn into a function-pointer- | 149 | cleancache is config'ed off and turn into a function-pointer- |
149 | compare-to-NULL if config'ed on but no backend claims the ops | 150 | compare-to-NULL if config'ed on but no backend claims the ops |
@@ -184,15 +185,15 @@ or for real kernel-addressable RAM, it makes perfect sense for | |||
184 | transcendent memory. | 185 | transcendent memory. |
185 | 186 | ||
186 | 4) Why is non-shared cleancache "exclusive"? And where is the | 187 | 4) Why is non-shared cleancache "exclusive"? And where is the |
187 | page "flushed" after a "get"? (Minchan Kim) | 188 | page "invalidated" after a "get"? (Minchan Kim) |
188 | 189 | ||
189 | The main reason is to free up space in transcendent memory and | 190 | The main reason is to free up space in transcendent memory and |
190 | to avoid unnecessary cleancache_flush calls. If you want inclusive, | 191 | to avoid unnecessary cleancache_invalidate calls. If you want inclusive, |
191 | the page can be "put" immediately following the "get". If | 192 | the page can be "put" immediately following the "get". If |
192 | put-after-get for inclusive becomes common, the interface could | 193 | put-after-get for inclusive becomes common, the interface could |
193 | be easily extended to add a "get_no_flush" call. | 194 | be easily extended to add a "get_no_invalidate" call. |
194 | 195 | ||
195 | The flush is done by the cleancache backend implementation. | 196 | The invalidate is done by the cleancache backend implementation. |
196 | 197 | ||
197 | 5) What's the performance impact? | 198 | 5) What's the performance impact? |
198 | 199 | ||
@@ -222,7 +223,7 @@ Some points for a filesystem to consider: | |||
222 | as tmpfs should not enable cleancache) | 223 | as tmpfs should not enable cleancache) |
223 | - To ensure coherency/correctness, the FS must ensure that all | 224 | - To ensure coherency/correctness, the FS must ensure that all |
224 | file removal or truncation operations either go through VFS or | 225 | file removal or truncation operations either go through VFS or |
225 | add hooks to do the equivalent cleancache "flush" operations | 226 | add hooks to do the equivalent cleancache "invalidate" operations |
226 | - To ensure coherency/correctness, either inode numbers must | 227 | - To ensure coherency/correctness, either inode numbers must |
227 | be unique across the lifetime of the on-disk file OR the | 228 | be unique across the lifetime of the on-disk file OR the |
228 | FS must provide an "encode_fh" function. | 229 | FS must provide an "encode_fh" function. |
@@ -243,11 +244,11 @@ If cleancache would use the inode virtual address instead of | |||
243 | inode/filehandle, the pool id could be eliminated. But, this | 244 | inode/filehandle, the pool id could be eliminated. But, this |
244 | won't work because cleancache retains pagecache data pages | 245 | won't work because cleancache retains pagecache data pages |
245 | persistently even when the inode has been pruned from the | 246 | persistently even when the inode has been pruned from the |
246 | inode unused list, and only flushes the data page if the file | 247 | inode unused list, and only invalidates the data page if the file |
247 | gets removed/truncated. So if cleancache used the inode kva, | 248 | gets removed/truncated. So if cleancache used the inode kva, |
248 | there would be potential coherency issues if/when the inode | 249 | there would be potential coherency issues if/when the inode |
249 | kva is reused for a different file. Alternately, if cleancache | 250 | kva is reused for a different file. Alternately, if cleancache |
250 | flushed the pages when the inode kva was freed, much of the value | 251 | invalidated the pages when the inode kva was freed, much of the value |
251 | of cleancache would be lost because the cache of pages in cleanache | 252 | of cleancache would be lost because the cache of pages in cleanache |
252 | is potentially much larger than the kernel pagecache and is most | 253 | is potentially much larger than the kernel pagecache and is most |
253 | useful if the pages survive inode cache removal. | 254 | useful if the pages survive inode cache removal. |
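
Reviewer note: the renamed operations keep the semantics described in the hunks above. A get on a non-shared pool invalidates the page, a get on a shared pool leaves it in place, and a second put to the same handle replaces the first. The toy model below is a userspace sketch of those rules only. Every name in it (toy_cache, toy_put_page, and so on) is hypothetical, and it is not the kernel interface or any real backend.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TOY_PAGE_SIZE 16	/* tiny "page" to keep the sketch short */
#define TOY_SLOTS     8

/* A "handle": pool id, file key, and page index into the file. */
struct toy_handle { int pool_id; unsigned long key; unsigned long index; };

struct toy_slot {
	bool used;
	struct toy_handle h;
	char data[TOY_PAGE_SIZE];
};

struct toy_cache {
	bool shared;		/* models init_shared_fs versus init_fs */
	struct toy_slot slots[TOY_SLOTS];
};

static struct toy_slot *toy_find(struct toy_cache *c, struct toy_handle h)
{
	for (int i = 0; i < TOY_SLOTS; i++) {
		struct toy_slot *s = &c->slots[i];

		if (s->used && s->h.pool_id == h.pool_id &&
		    s->h.key == h.key && s->h.index == h.index)
			return s;
	}
	return NULL;
}

/* Ensure the page is no longer present. */
void toy_invalidate_page(struct toy_cache *c, struct toy_handle h)
{
	struct toy_slot *s = toy_find(c, h);

	if (s)
		s->used = false;
}

/* Copy the page in. A repeated put overwrites the old copy, so a later
 * get cannot return the older data. A full cache silently drops the put. */
void toy_put_page(struct toy_cache *c, struct toy_handle h, const char *page)
{
	struct toy_slot *s = toy_find(c, h);

	for (int i = 0; i < TOY_SLOTS && !s; i++)
		if (!c->slots[i].used)
			s = &c->slots[i];
	if (!s)
		return;
	s->used = true;
	s->h = h;
	memcpy(s->data, page, TOY_PAGE_SIZE);
}

/* Copy the page out if found; a negative return indicates failure.
 * On a non-shared pool the page is invalidated ("exclusive" cache). */
int toy_get_page(struct toy_cache *c, struct toy_handle h, char *page)
{
	struct toy_slot *s = toy_find(c, h);

	if (!s)
		return -1;
	memcpy(page, s->data, TOY_PAGE_SIZE);
	if (!c->shared)
		s->used = false;
	return 0;
}

#ifdef CLEANCACHE_DEMO_MAIN
/* Build with -DCLEANCACHE_DEMO_MAIN for a standalone program. */
int main(void)
{
	struct toy_cache c = { .shared = false };
	struct toy_handle h = { 1, 42, 0 };
	char in[TOY_PAGE_SIZE] = "BBB", out[TOY_PAGE_SIZE];

	toy_put_page(&c, h, in);
	printf("first get:  %d\n", toy_get_page(&c, h, out));
	printf("second get: %d\n", toy_get_page(&c, h, out));
	return 0;
}
#endif
```

The model makes no attempt at locking, which matches the documented lack of SMP serialization guarantees: callers would have to serialize puts and invalidates on the same handle themselves.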