aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2009-11-30 16:33:48 -0500
committerLinus Torvalds <torvalds@linux-foundation.org>2009-11-30 16:33:48 -0500
commit6e80133f7f247f313da1638af4ce30f2bac303cc (patch)
tree318afcc1c1c434135849cef50e3d89be505ad011 /Documentation
parente3a41d7b99e7f97d9a50bec2a8f4eb237ce1d504 (diff)
parent4fa9f4ede88b4e2ff135b6e5717499d734508c62 (diff)
Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (31 commits) FS-Cache: Provide nop fscache_stat_d() if CONFIG_FSCACHE_STATS=n SLOW_WORK: Fix GFS2 to #include <linux/module.h> before using THIS_MODULE SLOW_WORK: Fix CIFS to pass THIS_MODULE to slow_work_register_user() CacheFiles: Don't log lookup/create failing with ENOBUFS CacheFiles: Catch an overly long wait for an old active object CacheFiles: Better showing of debugging information in active object problems CacheFiles: Mark parent directory locks as I_MUTEX_PARENT to keep lockdep happy CacheFiles: Handle truncate unlocking the page we're reading CacheFiles: Don't write a full page if there's only a partial page to cache FS-Cache: Actually requeue an object when requested FS-Cache: Start processing an object's operations on that object's death FS-Cache: Make sure FSCACHE_COOKIE_LOOKING_UP cleared on lookup failure FS-Cache: Add a retirement stat counter FS-Cache: Handle pages pending storage that get evicted under OOM conditions FS-Cache: Handle read request vs lookup, creation or other cache failure FS-Cache: Don't delete pending pages from the page-store tracking tree FS-Cache: Fix lock misorder in fscache_write_op() FS-Cache: The object-available state can't rely on the cookie to be available FS-Cache: Permit cache retrieval ops to be interrupted in the initial wait phase FS-Cache: Use radix tree preload correctly in tracking of pages to be stored ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/filesystems/caching/fscache.txt110
-rw-r--r--Documentation/filesystems/caching/netfs-api.txt21
-rw-r--r--Documentation/slow-work.txt160
3 files changed, 284 insertions, 7 deletions
diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
index 9e94b9491d89..a91e2e2095b0 100644
--- a/Documentation/filesystems/caching/fscache.txt
+++ b/Documentation/filesystems/caching/fscache.txt
@@ -235,6 +235,7 @@ proc files.
235 neg=N Number of negative lookups made 235 neg=N Number of negative lookups made
236 pos=N Number of positive lookups made 236 pos=N Number of positive lookups made
237 crt=N Number of objects created by lookup 237 crt=N Number of objects created by lookup
238 tmo=N Number of lookups timed out and requeued
238 Updates n=N Number of update cookie requests seen 239 Updates n=N Number of update cookie requests seen
239 nul=N Number of upd reqs given a NULL parent 240 nul=N Number of upd reqs given a NULL parent
240 run=N Number of upd reqs granted CPU time 241 run=N Number of upd reqs granted CPU time
@@ -250,8 +251,10 @@ proc files.
250 ok=N Number of successful alloc reqs 251 ok=N Number of successful alloc reqs
251 wt=N Number of alloc reqs that waited on lookup completion 252 wt=N Number of alloc reqs that waited on lookup completion
252 nbf=N Number of alloc reqs rejected -ENOBUFS 253 nbf=N Number of alloc reqs rejected -ENOBUFS
254 int=N Number of alloc reqs aborted -ERESTARTSYS
253 ops=N Number of alloc reqs submitted 255 ops=N Number of alloc reqs submitted
254 owt=N Number of alloc reqs waited for CPU time 256 owt=N Number of alloc reqs waited for CPU time
257 abt=N Number of alloc reqs aborted due to object death
255 Retrvls n=N Number of retrieval (read) requests seen 258 Retrvls n=N Number of retrieval (read) requests seen
256 ok=N Number of successful retr reqs 259 ok=N Number of successful retr reqs
257 wt=N Number of retr reqs that waited on lookup completion 260 wt=N Number of retr reqs that waited on lookup completion
@@ -261,6 +264,7 @@ proc files.
261 oom=N Number of retr reqs failed -ENOMEM 264 oom=N Number of retr reqs failed -ENOMEM
262 ops=N Number of retr reqs submitted 265 ops=N Number of retr reqs submitted
263 owt=N Number of retr reqs waited for CPU time 266 owt=N Number of retr reqs waited for CPU time
267 abt=N Number of retr reqs aborted due to object death
264 Stores n=N Number of storage (write) requests seen 268 Stores n=N Number of storage (write) requests seen
265 ok=N Number of successful store reqs 269 ok=N Number of successful store reqs
266 agn=N Number of store reqs on a page already pending storage 270 agn=N Number of store reqs on a page already pending storage
@@ -268,12 +272,37 @@ proc files.
268 oom=N Number of store reqs failed -ENOMEM 272 oom=N Number of store reqs failed -ENOMEM
269 ops=N Number of store reqs submitted 273 ops=N Number of store reqs submitted
270 run=N Number of store reqs granted CPU time 274 run=N Number of store reqs granted CPU time
275 pgs=N Number of pages given store req processing time
276 rxd=N Number of store reqs deleted from tracking tree
277 olm=N Number of store reqs over store limit
278 VmScan nos=N Number of release reqs against pages with no pending store
279 gon=N Number of release reqs against pages stored by time lock granted
280 bsy=N Number of release reqs ignored due to in-progress store
281 can=N Number of page stores cancelled due to release req
271 Ops pend=N Number of times async ops added to pending queues 282 Ops pend=N Number of times async ops added to pending queues
272 run=N Number of times async ops given CPU time 283 run=N Number of times async ops given CPU time
273 enq=N Number of times async ops queued for processing 284 enq=N Number of times async ops queued for processing
285 can=N Number of async ops cancelled
286 rej=N Number of async ops rejected due to object lookup/create failure
274 dfr=N Number of async ops queued for deferred release 287 dfr=N Number of async ops queued for deferred release
275 rel=N Number of async ops released 288 rel=N Number of async ops released
276 gc=N Number of deferred-release async ops garbage collected 289 gc=N Number of deferred-release async ops garbage collected
290 CacheOp alo=N Number of in-progress alloc_object() cache ops
291 luo=N Number of in-progress lookup_object() cache ops
292 luc=N Number of in-progress lookup_complete() cache ops
293 gro=N Number of in-progress grab_object() cache ops
294 upo=N Number of in-progress update_object() cache ops
295 dro=N Number of in-progress drop_object() cache ops
296 pto=N Number of in-progress put_object() cache ops
297 syn=N Number of in-progress sync_cache() cache ops
298 atc=N Number of in-progress attr_changed() cache ops
299 rap=N Number of in-progress read_or_alloc_page() cache ops
300 ras=N Number of in-progress read_or_alloc_pages() cache ops
301 alp=N Number of in-progress allocate_page() cache ops
302 als=N Number of in-progress allocate_pages() cache ops
303 wrp=N Number of in-progress write_page() cache ops
304 ucp=N Number of in-progress uncache_page() cache ops
305 dsp=N Number of in-progress dissociate_pages() cache ops
277 306
278 307
279 (*) /proc/fs/fscache/histogram 308 (*) /proc/fs/fscache/histogram
@@ -299,6 +328,87 @@ proc files.
299 jiffy range covered, and the SECS field the equivalent number of seconds. 328 jiffy range covered, and the SECS field the equivalent number of seconds.
300 329
301 330
331===========
332OBJECT LIST
333===========
334
335If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a
336list of all the objects currently allocated and allow them to be viewed
337through:
338
339 /proc/fs/fscache/objects
340
341This will look something like:
342
343 [root@andromeda ~]# head /proc/fs/fscache/objects
344 OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA
345 ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
346 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 8 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a
347 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 8 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a
348
349where the first set of columns before the '|' describe the object:
350
351 COLUMN DESCRIPTION
352 ======= ===============================================================
353 OBJECT Object debugging ID (appears as OBJ%x in some debug messages)
354 PARENT Debugging ID of parent object
355 STAT Object state
356 CHLDN Number of child objects of this object
357 OPS Number of outstanding operations on this object
358 OOP Number of outstanding child object management operations
359 IPR
360 EX Number of outstanding exclusive operations
361 READS Number of outstanding read operations
362 EM Object's event mask
363 EV Events raised on this object
364 F Object flags
365 S Object slow-work work item flags
366
367and the second set of columns describe the object's cookie, if present:
368
369 COLUMN DESCRIPTION
370 =============== =======================================================
371 NETFS_COOKIE_DEF Name of netfs cookie definition
372 TY Cookie type (IX - index, DT - data, hex - special)
373 FL Cookie flags
374 NETFS_DATA Netfs private data stored in the cookie
375 OBJECT_KEY Object key } 1 column, with separating comma
376 AUX_DATA Object aux data } presence may be configured
377
378The data shown may be filtered by attaching the a key to an appropriate keyring
379before viewing the file. Something like:
380
381 keyctl add user fscache:objlist <restrictions> @s
382
383where <restrictions> are a selection of the following letters:
384
385 K Show hexdump of object key (don't show if not given)
386 A Show hexdump of object aux data (don't show if not given)
387
388and the following paired letters:
389
390 C Show objects that have a cookie
391 c Show objects that don't have a cookie
392 B Show objects that are busy
393 b Show objects that aren't busy
394 W Show objects that have pending writes
395 w Show objects that don't have pending writes
396 R Show objects that have outstanding reads
397 r Show objects that don't have outstanding reads
398 S Show objects that have slow work queued
399 s Show objects that don't have slow work queued
400
401If neither side of a letter pair is given, then both are implied. For example:
402
403 keyctl add user fscache:objlist KB @s
404
405shows objects that are busy, and lists their object keys, but does not dump
406their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is
407not implied.
408
409By default all objects and all fields will be shown.
410
411
302========= 412=========
303DEBUGGING 413DEBUGGING
304========= 414=========
diff --git a/Documentation/filesystems/caching/netfs-api.txt b/Documentation/filesystems/caching/netfs-api.txt
index 2666b1ed5e9e..1902c57b72ef 100644
--- a/Documentation/filesystems/caching/netfs-api.txt
+++ b/Documentation/filesystems/caching/netfs-api.txt
@@ -641,7 +641,7 @@ data file must be retired (see the relinquish cookie function below).
641 641
642Furthermore, note that this does not cancel the asynchronous read or write 642Furthermore, note that this does not cancel the asynchronous read or write
643operation started by the read/alloc and write functions, so the page 643operation started by the read/alloc and write functions, so the page
644invalidation and release functions must use: 644invalidation functions must use:
645 645
646 bool fscache_check_page_write(struct fscache_cookie *cookie, 646 bool fscache_check_page_write(struct fscache_cookie *cookie,
647 struct page *page); 647 struct page *page);
@@ -654,6 +654,25 @@ to see if a page is being written to the cache, and:
654to wait for it to finish if it is. 654to wait for it to finish if it is.
655 655
656 656
657When releasepage() is being implemented, a special FS-Cache function exists to
658manage the heuristics of coping with vmscan trying to eject pages, which may
659conflict with the cache trying to write pages to the cache (which may itself
660need to allocate memory):
661
662 bool fscache_maybe_release_page(struct fscache_cookie *cookie,
663 struct page *page,
664 gfp_t gfp);
665
666This takes the netfs cookie, and the page and gfp arguments as supplied to
667releasepage(). It will return false if the page cannot be released yet for
668some reason and if it returns true, the page has been uncached and can now be
669released.
670
671To make a page available for release, this function may wait for an outstanding
672storage request to complete, or it may attempt to cancel the storage request -
673in which case the page will not be stored in the cache this time.
674
675
657========================== 676==========================
658INDEX AND DATA FILE UPDATE 677INDEX AND DATA FILE UPDATE
659========================== 678==========================
diff --git a/Documentation/slow-work.txt b/Documentation/slow-work.txt
index ebc50f808ea4..52bc31433723 100644
--- a/Documentation/slow-work.txt
+++ b/Documentation/slow-work.txt
@@ -41,6 +41,13 @@ expand files, provided the time taken to do so isn't too long.
41Operations of both types may sleep during execution, thus tying up the thread 41Operations of both types may sleep during execution, thus tying up the thread
42loaned to it. 42loaned to it.
43 43
44A further class of work item is available, based on the slow work item class:
45
46 (*) Delayed slow work items.
47
48These are slow work items that have a timer to defer queueing of the item for
49a while.
50
44 51
45THREAD-TO-CLASS ALLOCATION 52THREAD-TO-CLASS ALLOCATION
46-------------------------- 53--------------------------
@@ -64,9 +71,11 @@ USING SLOW WORK ITEMS
64Firstly, a module or subsystem wanting to make use of slow work items must 71Firstly, a module or subsystem wanting to make use of slow work items must
65register its interest: 72register its interest:
66 73
67 int ret = slow_work_register_user(); 74 int ret = slow_work_register_user(struct module *module);
68 75
69This will return 0 if successful, or a -ve error upon failure. 76This will return 0 if successful, or a -ve error upon failure. The module
77pointer should be the module interested in using this facility (almost
78certainly THIS_MODULE).
70 79
71 80
72Slow work items may then be set up by: 81Slow work items may then be set up by:
@@ -93,6 +102,10 @@ Slow work items may then be set up by:
93 102
94 or: 103 or:
95 104
105 delayed_slow_work_init(&myitem, &myitem_ops);
106
107 or:
108
96 vslow_work_init(&myitem, &myitem_ops); 109 vslow_work_init(&myitem, &myitem_ops);
97 110
98 depending on its class. 111 depending on its class.
@@ -102,15 +115,92 @@ A suitably set up work item can then be enqueued for processing:
102 int ret = slow_work_enqueue(&myitem); 115 int ret = slow_work_enqueue(&myitem);
103 116
104This will return a -ve error if the thread pool is unable to gain a reference 117This will return a -ve error if the thread pool is unable to gain a reference
105on the item, 0 otherwise. 118on the item, 0 otherwise, or (for delayed work):
119
120 int ret = delayed_slow_work_enqueue(&myitem, my_jiffy_delay);
106 121
107 122
108The items are reference counted, so there ought to be no need for a flush 123The items are reference counted, so there ought to be no need for a flush
109operation. When all a module's slow work items have been processed, and the 124operation. But as the reference counting is optional, means to cancel
125existing work items are also included:
126
127 cancel_slow_work(&myitem);
128 cancel_delayed_slow_work(&myitem);
129
130can be used to cancel pending work. The above cancel function waits for
131existing work to have been executed (or prevent execution of them, depending
132on timing).
133
134
135When all a module's slow work items have been processed, and the
110module has no further interest in the facility, it should unregister its 136module has no further interest in the facility, it should unregister its
111interest: 137interest:
112 138
113 slow_work_unregister_user(); 139 slow_work_unregister_user(struct module *module);
140
141The module pointer is used to wait for all outstanding work items for that
142module before completing the unregistration. This prevents the put_ref() code
143from being taken away before it completes. module should almost certainly be
144THIS_MODULE.
145
146
147================
148HELPER FUNCTIONS
149================
150
151The slow-work facility provides a function by which it can be determined
152whether or not an item is queued for later execution:
153
154 bool queued = slow_work_is_queued(struct slow_work *work);
155
156If it returns false, then the item is not on the queue (it may be executing
157with a requeue pending). This can be used to work out whether an item on which
158another depends is on the queue, thus allowing a dependent item to be queued
159after it.
160
161If the above shows an item on which another depends not to be queued, then the
162owner of the dependent item might need to wait. However, to avoid locking up
163the threads unnecessarily be sleeping in them, it can make sense under some
164circumstances to return the work item to the queue, thus deferring it until
165some other items have had a chance to make use of the yielded thread.
166
167To yield a thread and defer an item, the work function should simply enqueue
168the work item again and return. However, this doesn't work if there's nothing
169actually on the queue, as the thread just vacated will jump straight back into
170the item's work function, thus busy waiting on a CPU.
171
172Instead, the item should use the thread to wait for the dependency to go away,
173but rather than using schedule() or schedule_timeout() to sleep, it should use
174the following function:
175
176 bool requeue = slow_work_sleep_till_thread_needed(
177 struct slow_work *work,
178 signed long *_timeout);
179
180This will add a second wait and then sleep, such that it will be woken up if
181either something appears on the queue that could usefully make use of the
182thread - and behind which this item can be queued, or if the event the caller
183set up to wait for happens. True will be returned if something else appeared
184on the queue and this work function should perhaps return, of false if
185something else woke it up. The timeout is as for schedule_timeout().
186
187For example:
188
189 wq = bit_waitqueue(&my_flags, MY_BIT);
190 init_wait(&wait);
191 requeue = false;
192 do {
193 prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
194 if (!test_bit(MY_BIT, &my_flags))
195 break;
196 requeue = slow_work_sleep_till_thread_needed(&my_work,
197 &timeout);
198 } while (timeout > 0 && !requeue);
199 finish_wait(wq, &wait);
200 if (!test_bit(MY_BIT, &my_flags)
201 goto do_my_thing;
202 if (requeue)
203 return; // to slow_work
114 204
115 205
116=============== 206===============
@@ -118,7 +208,8 @@ ITEM OPERATIONS
118=============== 208===============
119 209
120Each work item requires a table of operations of type struct slow_work_ops. 210Each work item requires a table of operations of type struct slow_work_ops.
121All members are required: 211Only ->execute() is required; the getting and putting of a reference and the
212describing of an item are all optional.
122 213
123 (*) Get a reference on an item: 214 (*) Get a reference on an item:
124 215
@@ -148,6 +239,16 @@ All members are required:
148 This should perform the work required of the item. It may sleep, it may 239 This should perform the work required of the item. It may sleep, it may
149 perform disk I/O and it may wait for locks. 240 perform disk I/O and it may wait for locks.
150 241
242 (*) View an item through /proc:
243
244 void (*desc)(struct slow_work *work, struct seq_file *m);
245
246 If supplied, this should print to 'm' a small string describing the work
247 the item is to do. This should be no more than about 40 characters, and
248 shouldn't include a newline character.
249
250 See the 'Viewing executing and queued items' section below.
251
151 252
152================== 253==================
153POOL CONFIGURATION 254POOL CONFIGURATION
@@ -172,3 +273,50 @@ The slow-work thread pool has a number of configurables:
172 is bounded to between 1 and one fewer than the number of active threads. 273 is bounded to between 1 and one fewer than the number of active threads.
173 This ensures there is always at least one thread that can process very 274 This ensures there is always at least one thread that can process very
174 slow work items, and always at least one thread that won't. 275 slow work items, and always at least one thread that won't.
276
277
278==================================
279VIEWING EXECUTING AND QUEUED ITEMS
280==================================
281
282If CONFIG_SLOW_WORK_PROC is enabled, a proc file is made available:
283
284 /proc/slow_work_rq
285
286through which the list of work items being executed and the queues of items to
287be executed may be viewed. The owner of a work item is given the chance to
288add some information of its own.
289
290The contents look something like the following:
291
292 THR PID ITEM ADDR FL MARK DESC
293 === ===== ================ == ===== ==========
294 0 3005 ffff880023f52348 a 952ms FSC: OBJ17d3: LOOK
295 1 3006 ffff880024e33668 2 160ms FSC: OBJ17e5 OP60d3b: Write1/Store fl=2
296 2 3165 ffff8800296dd180 a 424ms FSC: OBJ17e4: LOOK
297 3 4089 ffff8800262c8d78 a 212ms FSC: OBJ17ea: CRTN
298 4 4090 ffff88002792bed8 2 388ms FSC: OBJ17e8 OP60d36: Write1/Store fl=2
299 5 4092 ffff88002a0ef308 2 388ms FSC: OBJ17e7 OP60d2e: Write1/Store fl=2
300 6 4094 ffff88002abaf4b8 2 132ms FSC: OBJ17e2 OP60d4e: Write1/Store fl=2
301 7 4095 ffff88002bb188e0 a 388ms FSC: OBJ17e9: CRTN
302 vsq - ffff880023d99668 1 308ms FSC: OBJ17e0 OP60f91: Write1/EnQ fl=2
303 vsq - ffff8800295d1740 1 212ms FSC: OBJ16be OP4d4b6: Write1/EnQ fl=2
304 vsq - ffff880025ba3308 1 160ms FSC: OBJ179a OP58dec: Write1/EnQ fl=2
305 vsq - ffff880024ec83e0 1 160ms FSC: OBJ17ae OP599f2: Write1/EnQ fl=2
306 vsq - ffff880026618e00 1 160ms FSC: OBJ17e6 OP60d33: Write1/EnQ fl=2
307 vsq - ffff880025a2a4b8 1 132ms FSC: OBJ16a2 OP4d583: Write1/EnQ fl=2
308 vsq - ffff880023cbe6d8 9 212ms FSC: OBJ17eb: LOOK
309 vsq - ffff880024d37590 9 212ms FSC: OBJ17ec: LOOK
310 vsq - ffff880027746cb0 9 212ms FSC: OBJ17ed: LOOK
311 vsq - ffff880024d37ae8 9 212ms FSC: OBJ17ee: LOOK
312 vsq - ffff880024d37cb0 9 212ms FSC: OBJ17ef: LOOK
313 vsq - ffff880025036550 9 212ms FSC: OBJ17f0: LOOK
314 vsq - ffff8800250368e0 9 212ms FSC: OBJ17f1: LOOK
315 vsq - ffff880025036aa8 9 212ms FSC: OBJ17f2: LOOK
316
317In the 'THR' column, executing items show the thread they're occupying and
318queued threads indicate which queue they're on. 'PID' shows the process ID of
319a slow-work thread that's executing something. 'FL' shows the work item flags.
320'MARK' indicates how long since an item was queued or began executing. Lastly,
321the 'DESC' column permits the owner of an item to give some information.
322