author | Glenn Elliott <gelliott@cs.unc.edu> | 2012-03-04 19:47:13 -0500 |
---|---|---|
committer | Glenn Elliott <gelliott@cs.unc.edu> | 2012-03-04 19:47:13 -0500 |
commit | c71c03bda1e86c9d5198c5d83f712e695c4f2a1e (patch) | |
tree | ecb166cb3e2b7e2adb3b5e292245fefd23381ac8 /Documentation/filesystems/porting | |
parent | ea53c912f8a86a8567697115b6a0d8152beee5c8 (diff) | |
parent | 6a00f206debf8a5c8899055726ad127dbeeed098 (diff) |
Merge branch 'mpi-master' into wip-k-fmlp
Conflicts:
litmus/sched_cedf.c
Diffstat (limited to 'Documentation/filesystems/porting')
-rw-r--r-- | Documentation/filesystems/porting | 101 |
1 files changed, 95 insertions, 6 deletions
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index b12c89538680..6e29954851a2 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -216,7 +216,6 @@ had ->revalidate()) add calls in ->follow_link()/->readlink(). | |||
216 | ->d_parent changes are not protected by BKL anymore. Read access is safe | 216 | ->d_parent changes are not protected by BKL anymore. Read access is safe |
217 | if at least one of the following is true: | 217 | if at least one of the following is true: |
218 | * filesystem has no cross-directory rename() | 218 | * filesystem has no cross-directory rename() |
219 | * dcache_lock is held | ||
220 | * we know that parent had been locked (e.g. we are looking at | 219 | * we know that parent had been locked (e.g. we are looking at |
221 | ->d_parent of ->lookup() argument). | 220 | ->d_parent of ->lookup() argument). |
222 | * we are called from ->rename(). | 221 | * we are called from ->rename(). |
@@ -299,11 +298,14 @@ be used instead. It gets called whenever the inode is evicted, whether it has | |||
299 | remaining links or not. Caller does *not* evict the pagecache or inode-associated | 298 | remaining links or not. Caller does *not* evict the pagecache or inode-associated |
300 | metadata buffers; getting rid of those is responsibility of method, as it had | 299 | metadata buffers; getting rid of those is responsibility of method, as it had |
301 | been for ->delete_inode(). | 300 | been for ->delete_inode(). |
302 | ->drop_inode() returns int now; it's called on final iput() with inode_lock | 301 | |
303 | held and it returns true if filesystems wants the inode to be dropped. As before, | 302 | ->drop_inode() returns int now; it's called on final iput() with |
304 | generic_drop_inode() is still the default and it's been updated appropriately. | 303 | inode->i_lock held and it returns true if filesystems wants the inode to be |
305 | generic_delete_inode() is also alive and it consists simply of return 1. Note that | 304 | dropped. As before, generic_drop_inode() is still the default and it's been |
306 | all actual eviction work is done by caller after ->drop_inode() returns. | 305 | updated appropriately. generic_delete_inode() is also alive and it consists |
306 | simply of return 1. Note that all actual eviction work is done by caller after | ||
307 | ->drop_inode() returns. | ||
308 | |||
307 | clear_inode() is gone; use end_writeback() instead. As before, it must | 309 | clear_inode() is gone; use end_writeback() instead. As before, it must |
308 | be called exactly once on each call of ->evict_inode() (as it used to be for | 310 | be called exactly once on each call of ->evict_inode() (as it used to be for |
309 | each call of ->delete_inode()). Unlike before, if you are using inode-associated | 311 | each call of ->delete_inode()). Unlike before, if you are using inode-associated |
@@ -318,3 +320,90 @@ if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput( | |||
318 | may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly | 320 | may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly |
319 | free the on-disk inode, you may end up doing that while ->write_inode() is writing | 321 | free the on-disk inode, you may end up doing that while ->write_inode() is writing |
320 | to it. | 322 | to it. |
323 | |||
324 | --- | ||
325 | [mandatory] | ||
326 | |||
327 | .d_delete() now only advises the dcache as to whether or not to cache | ||
328 | unreferenced dentries, and is now only called when the dentry refcount goes to | ||
329 | 0. Even on 0 refcount transition, it must be able to tolerate being called 0, | ||
330 | 1, or more times (e.g. constant, idempotent). | ||
331 | |||
332 | --- | ||
333 | [mandatory] | ||
334 | |||
335 | .d_compare() calling convention and locking rules are significantly | ||
336 | changed. Read updated documentation in Documentation/filesystems/vfs.txt (and | ||
337 | look at examples of other filesystems) for guidance. | ||
338 | |||
339 | --- | ||
340 | [mandatory] | ||
341 | |||
342 | .d_hash() calling convention and locking rules are significantly | ||
343 | changed. Read updated documentation in Documentation/filesystems/vfs.txt (and | ||
344 | look at examples of other filesystems) for guidance. | ||
345 | |||
346 | --- | ||
347 | [mandatory] | ||
348 | dcache_lock is gone, replaced by fine grained locks. See fs/dcache.c | ||
349 | for details of what locks to replace dcache_lock with in order to protect | ||
350 | particular things. Most of the time, a filesystem only needs ->d_lock, which | ||
351 | protects *all* the dcache state of a given dentry. | ||
352 | |||
353 | -- | ||
354 | [mandatory] | ||
355 | |||
356 | Filesystems must RCU-free their inodes, if they can have been accessed | ||
357 | via rcu-walk path walk (basically, if the file can have had a path name in the | ||
358 | vfs namespace). | ||
359 | |||
360 | i_dentry and i_rcu share storage in a union, and the vfs expects | ||
361 | i_dentry to be reinitialized before it is freed, so an: | ||
362 | |||
363 | INIT_LIST_HEAD(&inode->i_dentry); | ||
364 | |||
365 | must be done in the RCU callback. | ||
366 | |||
367 | -- | ||
368 | [recommended] | ||
369 | vfs now tries to do path walking in "rcu-walk mode", which avoids | ||
370 | atomic operations and scalability hazards on dentries and inodes (see | ||
371 | Documentation/filesystems/path-lookup.txt). d_hash and d_compare changes | ||
372 | (above) are examples of the changes required to support this. For more complex | ||
373 | filesystem callbacks, the vfs drops out of rcu-walk mode before the fs call, so | ||
374 | no changes are required to the filesystem. However, this is costly and loses | ||
375 | the benefits of rcu-walk mode. We will begin to add filesystem callbacks that | ||
376 | are rcu-walk aware, shown below. Filesystems should take advantage of this | ||
377 | where possible. | ||
378 | |||
379 | -- | ||
380 | [mandatory] | ||
381 | d_revalidate is a callback that is made on every path element (if | ||
382 | the filesystem provides it), which requires dropping out of rcu-walk mode. This | ||
383 | may now be called in rcu-walk mode (nd->flags & LOOKUP_RCU). -ECHILD should be | ||
384 | returned if the filesystem cannot handle rcu-walk. See | ||
385 | Documentation/filesystems/vfs.txt for more details. | ||
386 | |||
387 | permission and check_acl are inode permission checks that are called | ||
388 | on many or all directory inodes on the way down a path walk (to check for | ||
389 | exec permission). These must now be rcu-walk aware (flags & IPERM_FLAG_RCU). | ||
390 | See Documentation/filesystems/vfs.txt for more details. | ||
391 | |||
392 | -- | ||
393 | [mandatory] | ||
394 | In ->fallocate() you must check the mode option passed in. If your | ||
395 | filesystem does not support hole punching (deallocating space in the middle of a | ||
396 | file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode. | ||
397 | Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set, | ||
398 | so the i_size should not change when hole punching, even when punching the end of | ||
399 | a file off. | ||
400 | |||
401 | -- | ||
402 | [mandatory] | ||
403 | |||
404 | -- | ||
405 | [mandatory] | ||
406 | ->get_sb() is gone. Switch to use of ->mount(). Typically it's just | ||
407 | a matter of switching from calling get_sb_... to mount_... and changing the | ||
408 | function type. If you were doing it manually, just switch from setting ->mnt_root | ||
409 | to some pointer to returning that pointer. On errors return ERR_PTR(...). | ||
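
The ->fallocate() rule added above (return -EOPNOTSUPP when FALLOC_FL_PUNCH_HOLE is set and the filesystem cannot punch holes) can be sketched as a userspace mock. This is only an illustration under stated assumptions, not code from this commit: myfs_fallocate_check_mode() is a hypothetical helper name, the flag values are copied locally from linux/falloc.h so the sketch builds outside the kernel, and rejecting every other unknown mode bit is an extra assumption, not something the text above requires.

```c
#include <errno.h>

/* Flag values as in linux/falloc.h; defined locally so this builds in userspace. */
#ifndef FALLOC_FL_KEEP_SIZE
#define FALLOC_FL_KEEP_SIZE  0x01
#endif
#ifndef FALLOC_FL_PUNCH_HOLE
#define FALLOC_FL_PUNCH_HOLE 0x02
#endif

/*
 * Hypothetical helper (not a kernel API): the mode check a filesystem
 * WITHOUT hole-punching support might run at the top of its ->fallocate().
 * Returns 0 if the mode is acceptable, -EOPNOTSUPP otherwise.
 */
int myfs_fallocate_check_mode(int mode)
{
	/* Hole punching is not supported by this (imaginary) filesystem. */
	if (mode & FALLOC_FL_PUNCH_HOLE)
		return -EOPNOTSUPP;

	/* Assumption: any bit other than KEEP_SIZE is also unsupported. */
	if (mode & ~FALLOC_FL_KEEP_SIZE)
		return -EOPNOTSUPP;

	return 0;
}
```

A real ->fallocate() would perform this check before touching any on-disk state, so that an unsupported request fails without side effects.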