Diffstat (limited to 'Documentation/vm/cleancache.txt')
-rw-r--r-- | Documentation/vm/cleancache.txt | 43 |
1 files changed, 21 insertions, 22 deletions
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt
index 142fbb0f325..36c367c7308 100644
--- a/Documentation/vm/cleancache.txt
+++ b/Documentation/vm/cleancache.txt
@@ -46,11 +46,10 @@ a negative return value indicates failure. A "put_page" will copy a | |||
46 | the pool id, a file key, and a page index into the file. (The combination | 46 | the pool id, a file key, and a page index into the file. (The combination |
47 | of a pool id, a file key, and an index is sometimes called a "handle".) | 47 | of a pool id, a file key, and an index is sometimes called a "handle".) |
48 | A "get_page" will copy the page, if found, from cleancache into kernel memory. | 48 | A "get_page" will copy the page, if found, from cleancache into kernel memory. |
49 | An "invalidate_page" will ensure the page no longer is present in cleancache; | 49 | A "flush_page" will ensure the page no longer is present in cleancache; |
50 | an "invalidate_inode" will invalidate all pages associated with the specified | 50 | a "flush_inode" will flush all pages associated with the specified file; |
51 | file; and, when a filesystem is unmounted, an "invalidate_fs" will invalidate | 51 | and, when a filesystem is unmounted, a "flush_fs" will flush all pages in |
52 | all pages in all files specified by the given pool id and also surrender | 52 | all files specified by the given pool id and also surrender the pool id. |
53 | the pool id. | ||
54 | 53 | ||
55 | An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache | 54 | An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache |
56 | to treat the pool as shared using a 128-bit UUID as a key. On systems | 55 | to treat the pool as shared using a 128-bit UUID as a key. On systems |
@@ -63,12 +62,12 @@ of the kernel (e.g. by "tools" that control cleancache). Or a | |||
63 | cleancache implementation can simply disable shared_init by always | 62 | cleancache implementation can simply disable shared_init by always |
64 | returning a negative value. | 63 | returning a negative value. |
65 | 64 | ||
66 | If a get_page is successful on a non-shared pool, the page is invalidated | 65 | If a get_page is successful on a non-shared pool, the page is flushed (thus |
67 | (thus making cleancache an "exclusive" cache). On a shared pool, the page | 66 | making cleancache an "exclusive" cache). On a shared pool, the page |
68 | is NOT invalidated on a successful get_page so that it remains accessible to | 67 | is NOT flushed on a successful get_page so that it remains accessible to |
69 | other sharers. The kernel is responsible for ensuring coherency between | 68 | other sharers. The kernel is responsible for ensuring coherency between |
70 | cleancache (shared or not), the page cache, and the filesystem, using | 69 | cleancache (shared or not), the page cache, and the filesystem, using |
71 | cleancache invalidate operations as required. | 70 | cleancache flush operations as required. |
72 | 71 | ||
73 | Note that cleancache must enforce put-put-get coherency and get-get | 72 | Note that cleancache must enforce put-put-get coherency and get-get |
74 | coherency. For the former, if two puts are made to the same handle but | 73 | coherency. For the former, if two puts are made to the same handle but |
@@ -78,22 +77,22 @@ if a get for a given handle fails, subsequent gets for that handle will | |||
78 | never succeed unless preceded by a successful put with that handle. | 77 | never succeed unless preceded by a successful put with that handle. |
79 | 78 | ||
80 | Last, cleancache provides no SMP serialization guarantees; if two | 79 | Last, cleancache provides no SMP serialization guarantees; if two |
81 | different Linux threads are simultaneously putting and invalidating a page | 80 | different Linux threads are simultaneously putting and flushing a page |
82 | with the same handle, the results are indeterminate. Callers must | 81 | with the same handle, the results are indeterminate. Callers must |
83 | lock the page to ensure serial behavior. | 82 | lock the page to ensure serial behavior. |
84 | 83 | ||
85 | CLEANCACHE PERFORMANCE METRICS | 84 | CLEANCACHE PERFORMANCE METRICS |
86 | 85 | ||
87 | If properly configured, monitoring of cleancache is done via debugfs in | 86 | Cleancache monitoring is done by sysfs files in the |
88 | the /sys/kernel/debug/mm/cleancache directory. The effectiveness of cleancache | 87 | /sys/kernel/mm/cleancache directory. The effectiveness of cleancache |
89 | can be measured (across all filesystems) with: | 88 | can be measured (across all filesystems) with: |
90 | 89 | ||
91 | succ_gets - number of gets that were successful | 90 | succ_gets - number of gets that were successful |
92 | failed_gets - number of gets that failed | 91 | failed_gets - number of gets that failed |
93 | puts - number of puts attempted (all "succeed") | 92 | puts - number of puts attempted (all "succeed") |
94 | invalidates - number of invalidates attempted | 93 | flushes - number of flushes attempted |
95 | 94 | ||
96 | A backend implementation may provide additional metrics. | 95 | A backend implementatation may provide additional metrics. |
97 | 96 | ||
98 | FAQ | 97 | FAQ |
99 | 98 | ||
@@ -144,7 +143,7 @@ systems. | |||
144 | 143 | ||
145 | The core hooks for cleancache in VFS are in most cases a single line | 144 | The core hooks for cleancache in VFS are in most cases a single line |
146 | and the minimum set are placed precisely where needed to maintain | 145 | and the minimum set are placed precisely where needed to maintain |
147 | coherency (via cleancache_invalidate operations) between cleancache, | 146 | coherency (via cleancache_flush operations) between cleancache, |
148 | the page cache, and disk. All hooks compile into nothingness if | 147 | the page cache, and disk. All hooks compile into nothingness if |
149 | cleancache is config'ed off and turn into a function-pointer- | 148 | cleancache is config'ed off and turn into a function-pointer- |
150 | compare-to-NULL if config'ed on but no backend claims the ops | 149 | compare-to-NULL if config'ed on but no backend claims the ops |
@@ -185,15 +184,15 @@ or for real kernel-addressable RAM, it makes perfect sense for | |||
185 | transcendent memory. | 184 | transcendent memory. |
186 | 185 | ||
187 | 4) Why is non-shared cleancache "exclusive"? And where is the | 186 | 4) Why is non-shared cleancache "exclusive"? And where is the |
188 | page "invalidated" after a "get"? (Minchan Kim) | 187 | page "flushed" after a "get"? (Minchan Kim) |
189 | 188 | ||
190 | The main reason is to free up space in transcendent memory and | 189 | The main reason is to free up space in transcendent memory and |
191 | to avoid unnecessary cleancache_invalidate calls. If you want inclusive, | 190 | to avoid unnecessary cleancache_flush calls. If you want inclusive, |
192 | the page can be "put" immediately following the "get". If | 191 | the page can be "put" immediately following the "get". If |
193 | put-after-get for inclusive becomes common, the interface could | 192 | put-after-get for inclusive becomes common, the interface could |
194 | be easily extended to add a "get_no_invalidate" call. | 193 | be easily extended to add a "get_no_flush" call. |
195 | 194 | ||
196 | The invalidate is done by the cleancache backend implementation. | 195 | The flush is done by the cleancache backend implementation. |
197 | 196 | ||
198 | 5) What's the performance impact? | 197 | 5) What's the performance impact? |
199 | 198 | ||
@@ -223,7 +222,7 @@ Some points for a filesystem to consider: | |||
223 | as tmpfs should not enable cleancache) | 222 | as tmpfs should not enable cleancache) |
224 | - To ensure coherency/correctness, the FS must ensure that all | 223 | - To ensure coherency/correctness, the FS must ensure that all |
225 | file removal or truncation operations either go through VFS or | 224 | file removal or truncation operations either go through VFS or |
226 | add hooks to do the equivalent cleancache "invalidate" operations | 225 | add hooks to do the equivalent cleancache "flush" operations |
227 | - To ensure coherency/correctness, either inode numbers must | 226 | - To ensure coherency/correctness, either inode numbers must |
228 | be unique across the lifetime of the on-disk file OR the | 227 | be unique across the lifetime of the on-disk file OR the |
229 | FS must provide an "encode_fh" function. | 228 | FS must provide an "encode_fh" function. |
@@ -244,11 +243,11 @@ If cleancache would use the inode virtual address instead of | |||
244 | inode/filehandle, the pool id could be eliminated. But, this | 243 | inode/filehandle, the pool id could be eliminated. But, this |
245 | won't work because cleancache retains pagecache data pages | 244 | won't work because cleancache retains pagecache data pages |
246 | persistently even when the inode has been pruned from the | 245 | persistently even when the inode has been pruned from the |
247 | inode unused list, and only invalidates the data page if the file | 246 | inode unused list, and only flushes the data page if the file |
248 | gets removed/truncated. So if cleancache used the inode kva, | 247 | gets removed/truncated. So if cleancache used the inode kva, |
249 | there would be potential coherency issues if/when the inode | 248 | there would be potential coherency issues if/when the inode |
250 | kva is reused for a different file. Alternately, if cleancache | 249 | kva is reused for a different file. Alternately, if cleancache |
251 | invalidated the pages when the inode kva was freed, much of the value | 250 | flushed the pages when the inode kva was freed, much of the value |
252 | of cleancache would be lost because the cache of pages in cleanache | 251 | of cleancache would be lost because the cache of pages in cleanache |
253 | is potentially much larger than the kernel pagecache and is most | 252 | is potentially much larger than the kernel pagecache and is most |
254 | useful if the pages survive inode cache removal. | 253 | useful if the pages survive inode cache removal. |
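
The handle and flush semantics that the patched text describes can be illustrated with a small user-space model. This is only a sketch, not kernel code and not a real backend: the class name ToyCleancache and its method signatures are hypothetical. The methods use the "flush" names from the right-hand side of the diff; the left-hand side calls the same operations "invalidate". The model assumes a handle is the tuple (pool id, file key, page index), that a successful get on a non-shared pool removes the page (the "exclusive" behaviour), and that a second put to the same handle replaces the first.

```python
class ToyCleancache:
    """User-space model of the documented semantics (hypothetical, for illustration)."""

    def __init__(self):
        self._pools = {}   # pool_id -> {"shared": bool, "pages": {(file_key, index): bytes}}
        self._next_id = 0

    def init_fs(self, shared=False):
        # Hand out a new pool id; shared pools keep pages after a get.
        pool_id = self._next_id
        self._next_id += 1
        self._pools[pool_id] = {"shared": shared, "pages": {}}
        return pool_id

    def put_page(self, pool_id, file_key, index, data):
        # A later put to the same handle overwrites the earlier one,
        # so a get can never return the stale data (put-put-get coherency).
        pool = self._pools.get(pool_id)
        if pool is not None:
            pool["pages"][(file_key, index)] = bytes(data)

    def get_page(self, pool_id, file_key, index):
        # Returns the page data, or None if the page is not present.
        pool = self._pools.get(pool_id)
        if pool is None:
            return None
        handle = (file_key, index)
        if pool["shared"]:
            return pool["pages"].get(handle)   # page stays for other sharers
        return pool["pages"].pop(handle, None)  # exclusive: flushed on success

    def flush_page(self, pool_id, file_key, index):
        pool = self._pools.get(pool_id)
        if pool is not None:
            pool["pages"].pop((file_key, index), None)

    def flush_inode(self, pool_id, file_key):
        # Drop every page that belongs to the given file.
        pool = self._pools.get(pool_id)
        if pool is not None:
            pool["pages"] = {h: d for h, d in pool["pages"].items()
                             if h[0] != file_key}

    def flush_fs(self, pool_id):
        # Drop all pages in the pool and surrender the pool id.
        self._pools.pop(pool_id, None)
```

In this model a get that fails keeps failing until another put with the same handle succeeds, which matches the get-get coherency rule quoted in the diff. The model does no locking, in line with the statement that callers must lock the page to get serial behaviour.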