author | Mike Rapoport <rppt@linux.vnet.ibm.com> | 2018-03-21 15:22:19 -0400 |
---|---|---|
committer | Jonathan Corbet <corbet@lwn.net> | 2018-04-16 16:18:11 -0400 |
commit | 5ef829e056c82579329ccec67a6f5fda2f724dc7 (patch) | |
tree | 0343ecbda537eeed614fd77fb0075f0a84af203e /Documentation/vm | |
parent | d04f9f5a78b836cc51f8000e2049f2709c0b61f6 (diff) |
docs/vm: cleancache.txt: convert to ReST format
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Diffstat (limited to 'Documentation/vm')
-rw-r--r-- | Documentation/vm/cleancache.txt | 105 |
1 files changed, 62 insertions, 43 deletions
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt index e4b49df7a048..68cba9131c31 100644 --- a/Documentation/vm/cleancache.txt +++ b/Documentation/vm/cleancache.txt | |||
@@ -1,4 +1,11 @@ | |||
1 | MOTIVATION | 1 | .. _cleancache: |
2 | |||
3 | ========== | ||
4 | Cleancache | ||
5 | ========== | ||
6 | |||
7 | Motivation | ||
8 | ========== | ||
2 | 9 | ||
3 | Cleancache is a new optional feature provided by the VFS layer that | 10 | Cleancache is a new optional feature provided by the VFS layer that |
4 | potentially dramatically increases page cache effectiveness for | 11 | potentially dramatically increases page cache effectiveness for |
@@ -21,9 +28,10 @@ Transcendent memory "drivers" for cleancache are currently implemented | |||
21 | in Xen (using hypervisor memory) and zcache (using in-kernel compressed | 28 | in Xen (using hypervisor memory) and zcache (using in-kernel compressed |
22 | memory) and other implementations are in development. | 29 | memory) and other implementations are in development. |
23 | 30 | ||
24 | FAQs are included below. | 31 | :ref:`FAQs <faq>` are included below. |
25 | 32 | ||
26 | IMPLEMENTATION OVERVIEW | 33 | Implementation Overview |
34 | ======================= | ||
27 | 35 | ||
28 | A cleancache "backend" that provides transcendent memory registers itself | 36 | A cleancache "backend" that provides transcendent memory registers itself |
29 | to the kernel's cleancache "frontend" by calling cleancache_register_ops, | 37 | to the kernel's cleancache "frontend" by calling cleancache_register_ops, |
@@ -80,22 +88,33 @@ different Linux threads are simultaneously putting and invalidating a page | |||
80 | with the same handle, the results are indeterminate. Callers must | 88 | with the same handle, the results are indeterminate. Callers must |
81 | lock the page to ensure serial behavior. | 89 | lock the page to ensure serial behavior. |
82 | 90 | ||
83 | CLEANCACHE PERFORMANCE METRICS | 91 | Cleancache Performance Metrics |
92 | ============================== | ||
84 | 93 | ||
85 | If properly configured, monitoring of cleancache is done via debugfs in | 94 | If properly configured, monitoring of cleancache is done via debugfs in |
86 | the /sys/kernel/debug/cleancache directory. The effectiveness of cleancache | 95 | the `/sys/kernel/debug/cleancache` directory. The effectiveness of cleancache |
87 | can be measured (across all filesystems) with: | 96 | can be measured (across all filesystems) with: |
88 | 97 | ||
89 | succ_gets - number of gets that were successful | 98 | ``succ_gets`` |
90 | failed_gets - number of gets that failed | 99 | number of gets that were successful |
91 | puts - number of puts attempted (all "succeed") | 100 | |
92 | invalidates - number of invalidates attempted | 101 | ``failed_gets`` |
102 | number of gets that failed | ||
103 | |||
104 | ``puts`` | ||
105 | number of puts attempted (all "succeed") | ||
106 | |||
107 | ``invalidates`` | ||
108 | number of invalidates attempted | ||
93 | 109 | ||
94 | A backend implementation may provide additional metrics. | 110 | A backend implementation may provide additional metrics. |
95 | 111 | ||
112 | .. _faq: | ||
113 | |||
96 | FAQ | 114 | FAQ |
115 | === | ||
97 | 116 | ||
98 | 1) Where's the value? (Andrew Morton) | 117 | * Where's the value? (Andrew Morton) |
99 | 118 | ||
100 | Cleancache provides a significant performance benefit to many workloads | 119 | Cleancache provides a significant performance benefit to many workloads |
101 | in many environments with negligible overhead by improving the | 120 | in many environments with negligible overhead by improving the |
@@ -137,8 +156,8 @@ device that stores pages of data in a compressed state. And | |||
137 | the proposed "RAMster" driver shares RAM across multiple physical | 156 | the proposed "RAMster" driver shares RAM across multiple physical |
138 | systems. | 157 | systems. |
139 | 158 | ||
140 | 2) Why does cleancache have its sticky fingers so deep inside the | 159 | * Why does cleancache have its sticky fingers so deep inside the |
141 | filesystems and VFS? (Andrew Morton and Christoph Hellwig) | 160 | filesystems and VFS? (Andrew Morton and Christoph Hellwig) |
142 | 161 | ||
143 | The core hooks for cleancache in VFS are in most cases a single line | 162 | The core hooks for cleancache in VFS are in most cases a single line |
144 | and the minimum set are placed precisely where needed to maintain | 163 | and the minimum set are placed precisely where needed to maintain |
@@ -168,9 +187,9 @@ filesystems in the future. | |||
168 | The total impact of the hooks to existing fs and mm files is only | 187 | The total impact of the hooks to existing fs and mm files is only |
169 | about 40 lines added (not counting comments and blank lines). | 188 | about 40 lines added (not counting comments and blank lines). |
170 | 189 | ||
171 | 3) Why not make cleancache asynchronous and batched so it can | 190 | * Why not make cleancache asynchronous and batched so it can more |
172 | more easily interface with real devices with DMA instead | 191 | easily interface with real devices with DMA instead of copying each |
173 | of copying each individual page? (Minchan Kim) | 192 | individual page? (Minchan Kim) |
174 | 193 | ||
175 | The one-page-at-a-time copy semantics simplifies the implementation | 194 | The one-page-at-a-time copy semantics simplifies the implementation |
176 | on both the frontend and backend and also allows the backend to | 195 | on both the frontend and backend and also allows the backend to |
@@ -182,8 +201,8 @@ are avoided. While the interface seems odd for a "real device" | |||
182 | or for real kernel-addressable RAM, it makes perfect sense for | 201 | or for real kernel-addressable RAM, it makes perfect sense for |
183 | transcendent memory. | 202 | transcendent memory. |
184 | 203 | ||
185 | 4) Why is non-shared cleancache "exclusive"? And where is the | 204 | * Why is non-shared cleancache "exclusive"? And where is the |
186 | page "invalidated" after a "get"? (Minchan Kim) | 205 | page "invalidated" after a "get"? (Minchan Kim) |
187 | 206 | ||
188 | The main reason is to free up space in transcendent memory and | 207 | The main reason is to free up space in transcendent memory and |
189 | to avoid unnecessary cleancache_invalidate calls. If you want inclusive, | 208 | to avoid unnecessary cleancache_invalidate calls. If you want inclusive, |
@@ -193,7 +212,7 @@ be easily extended to add a "get_no_invalidate" call. | |||
193 | 212 | ||
194 | The invalidate is done by the cleancache backend implementation. | 213 | The invalidate is done by the cleancache backend implementation. |
195 | 214 | ||
196 | 5) What's the performance impact? | 215 | * What's the performance impact? |
197 | 216 | ||
198 | Performance analysis has been presented at OLS'09 and LCA'10. | 217 | Performance analysis has been presented at OLS'09 and LCA'10. |
199 | Briefly, performance gains can be significant on most workloads, | 218 | Briefly, performance gains can be significant on most workloads, |
@@ -206,7 +225,7 @@ single-core systems with slow memory-copy speeds, cleancache | |||
206 | has little value, but in newer multicore machines, especially | 225 | has little value, but in newer multicore machines, especially |
207 | consolidated/virtualized machines, it has great value. | 226 | consolidated/virtualized machines, it has great value. |
208 | 227 | ||
209 | 6) How do I add cleancache support for filesystem X? (Boaz Harrash) | 228 | * How do I add cleancache support for filesystem X? (Boaz Harrash) |
210 | 229 | ||
211 | Filesystems that are well-behaved and conform to certain | 230 | Filesystems that are well-behaved and conform to certain |
212 | restrictions can utilize cleancache simply by making a call to | 231 | restrictions can utilize cleancache simply by making a call to |
@@ -217,26 +236,26 @@ not enable the optional cleancache. | |||
217 | 236 | ||
218 | Some points for a filesystem to consider: | 237 | Some points for a filesystem to consider: |
219 | 238 | ||
220 | - The FS should be block-device-based (e.g. a ram-based FS such | 239 | - The FS should be block-device-based (e.g. a ram-based FS such |
221 | as tmpfs should not enable cleancache) | 240 | as tmpfs should not enable cleancache) |
222 | - To ensure coherency/correctness, the FS must ensure that all | 241 | - To ensure coherency/correctness, the FS must ensure that all |
223 | file removal or truncation operations either go through VFS or | 242 | file removal or truncation operations either go through VFS or |
224 | add hooks to do the equivalent cleancache "invalidate" operations | 243 | add hooks to do the equivalent cleancache "invalidate" operations |
225 | - To ensure coherency/correctness, either inode numbers must | 244 | - To ensure coherency/correctness, either inode numbers must |
226 | be unique across the lifetime of the on-disk file OR the | 245 | be unique across the lifetime of the on-disk file OR the |
227 | FS must provide an "encode_fh" function. | 246 | FS must provide an "encode_fh" function. |
228 | - The FS must call the VFS superblock alloc and deactivate routines | 247 | - The FS must call the VFS superblock alloc and deactivate routines |
229 | or add hooks to do the equivalent cleancache calls done there. | 248 | or add hooks to do the equivalent cleancache calls done there. |
230 | - To maximize performance, all pages fetched from the FS should | 249 | - To maximize performance, all pages fetched from the FS should |
231 | go through the do_mpag_readpage routine or the FS should add | 250 | go through the do_mpag_readpage routine or the FS should add |
232 | hooks to do the equivalent (cf. btrfs) | 251 | hooks to do the equivalent (cf. btrfs) |
233 | - Currently, the FS blocksize must be the same as PAGESIZE. This | 252 | - Currently, the FS blocksize must be the same as PAGESIZE. This |
234 | is not an architectural restriction, but no backends currently | 253 | is not an architectural restriction, but no backends currently |
235 | support anything different. | 254 | support anything different. |
236 | - A clustered FS should invoke the "shared_init_fs" cleancache | 255 | - A clustered FS should invoke the "shared_init_fs" cleancache |
237 | hook to get best performance for some backends. | 256 | hook to get best performance for some backends. |
238 | 257 | ||
239 | 7) Why not use the KVA of the inode as the key? (Christoph Hellwig) | 258 | * Why not use the KVA of the inode as the key? (Christoph Hellwig) |
240 | 259 | ||
241 | If cleancache would use the inode virtual address instead of | 260 | If cleancache would use the inode virtual address instead of |
242 | inode/filehandle, the pool id could be eliminated. But, this | 261 | inode/filehandle, the pool id could be eliminated. But, this |
@@ -251,7 +270,7 @@ of cleancache would be lost because the cache of pages in cleanache | |||
251 | is potentially much larger than the kernel pagecache and is most | 270 | is potentially much larger than the kernel pagecache and is most |
252 | useful if the pages survive inode cache removal. | 271 | useful if the pages survive inode cache removal. |
253 | 272 | ||
254 | 8) Why is a global variable required? | 273 | * Why is a global variable required? |
255 | 274 | ||
256 | The cleancache_enabled flag is checked in all of the frequently-used | 275 | The cleancache_enabled flag is checked in all of the frequently-used |
257 | cleancache hooks. The alternative is a function call to check a static | 276 | cleancache hooks. The alternative is a function call to check a static |
@@ -262,14 +281,14 @@ global variable allows cleancache to be enabled by default at compile | |||
262 | time, but have insignificant performance impact when cleancache remains | 281 | time, but have insignificant performance impact when cleancache remains |
263 | disabled at runtime. | 282 | disabled at runtime. |
264 | 283 | ||
265 | 9) Does cleanache work with KVM? | 284 | * Does cleanache work with KVM? |
266 | 285 | ||
267 | The memory model of KVM is sufficiently different that a cleancache | 286 | The memory model of KVM is sufficiently different that a cleancache |
268 | backend may have less value for KVM. This remains to be tested, | 287 | backend may have less value for KVM. This remains to be tested, |
269 | especially in an overcommitted system. | 288 | especially in an overcommitted system. |
270 | 289 | ||
271 | 10) Does cleancache work in userspace? It sounds useful for | 290 | * Does cleancache work in userspace? It sounds useful for |
272 | memory hungry caches like web browsers. (Jamie Lokier) | 291 | memory hungry caches like web browsers. (Jamie Lokier) |
273 | 292 | ||
274 | No plans yet, though we agree it sounds useful, at least for | 293 | No plans yet, though we agree it sounds useful, at least for |
275 | apps that bypass the page cache (e.g. O_DIRECT). | 294 | apps that bypass the page cache (e.g. O_DIRECT). |
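
Note on the "Cleancache Performance Metrics" hunk above: the four debugfs counters it lists (``succ_gets``, ``failed_gets``, ``puts``, ``invalidates``) can be combined into a single get hit ratio. The following is a minimal sketch rather than anything taken from the patch. It assumes each counter file holds one decimal integer, that debugfs is mounted at the path given in the document, and that the caller has permission to read it. The helper names ``read_counters`` and ``hit_ratio`` are hypothetical, and the ratio itself is a derived figure that the document does not define.

```python
from pathlib import Path

# Counter file names as listed in the documentation hunk.
COUNTERS = ("succ_gets", "failed_gets", "puts", "invalidates")


def read_counters(debugfs_dir="/sys/kernel/debug/cleancache"):
    """Return the four documented counters as a dict of ints.

    Assumption: each file contains a single decimal integer.
    """
    base = Path(debugfs_dir)
    return {name: int((base / name).read_text().strip()) for name in COUNTERS}


def hit_ratio(counters):
    """Fraction of gets that succeeded, or None if no gets were recorded."""
    gets = counters["succ_gets"] + counters["failed_gets"]
    if gets == 0:
        return None
    return counters["succ_gets"] / gets
```

On a system where the directory exists, ``hit_ratio(read_counters())`` gives a rough view of how often a page-cache miss was satisfied from cleancache. Backend-specific counters, which the document says may also be present, are ignored here.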