author Glenn Elliott <gelliott@cs.unc.edu> 2012-03-04 19:47:13 -0500
committer Glenn Elliott <gelliott@cs.unc.edu> 2012-03-04 19:47:13 -0500
commit c71c03bda1e86c9d5198c5d83f712e695c4f2a1e
tree ecb166cb3e2b7e2adb3b5e292245fefd23381ac8 /Documentation/filesystems/porting
parent ea53c912f8a86a8567697115b6a0d8152beee5c8
parent 6a00f206debf8a5c8899055726ad127dbeeed098
Merge branch 'mpi-master' into wip-k-fmlp
Conflicts: litmus/sched_cedf.c
Diffstat (limited to 'Documentation/filesystems/porting')
-rw-r--r-- Documentation/filesystems/porting | 101
1 files changed, 95 insertions, 6 deletions
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index b12c89538680..6e29954851a2 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -216,7 +216,6 @@ had ->revalidate()) add calls in ->follow_link()/->readlink().
 ->d_parent changes are not protected by BKL anymore. Read access is safe
 if at least one of the following is true:
 	* filesystem has no cross-directory rename()
-	* dcache_lock is held
 	* we know that parent had been locked (e.g. we are looking at
 ->d_parent of ->lookup() argument).
 	* we are called from ->rename().
@@ -299,11 +298,14 @@ be used instead. It gets called whenever the inode is evicted, whether it has
 remaining links or not. Caller does *not* evict the pagecache or inode-associated
 metadata buffers; getting rid of those is responsibility of method, as it had
 been for ->delete_inode().
-	->drop_inode() returns int now; it's called on final iput() with inode_lock
-held and it returns true if filesystems wants the inode to be dropped. As before,
-generic_drop_inode() is still the default and it's been updated appropriately.
-generic_delete_inode() is also alive and it consists simply of return 1. Note that
-all actual eviction work is done by caller after ->drop_inode() returns.
+
+	->drop_inode() returns int now; it's called on final iput() with
+inode->i_lock held and it returns true if filesystems wants the inode to be
+dropped. As before, generic_drop_inode() is still the default and it's been
+updated appropriately. generic_delete_inode() is also alive and it consists
+simply of return 1. Note that all actual eviction work is done by caller after
+->drop_inode() returns.
+
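The ->drop_inode() contract described in this hunk can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct inode here is a two-field stand-in, and example_drop_inode() and example_delete_inode() are hypothetical names. The drop policy is loosely modelled on what a default might check and is not copied from the kernel source.

```c
/* Userspace stand-in for the kernel's struct inode (assumption). */
struct inode {
	unsigned int i_nlink;	/* remaining hard links */
	int unhashed;		/* nonzero if the inode left the inode hash */
};

/*
 * Hypothetical ->drop_inode(): only *decides* whether the inode should
 * be dropped. It does no eviction work itself; the caller does that
 * after this function returns.
 */
int example_drop_inode(struct inode *inode)
{
	return !inode->i_nlink || inode->unhashed;
}

/* The "always drop" variant: it consists simply of return 1. */
int example_delete_inode(struct inode *inode)
{
	(void)inode;
	return 1;
}
```

Because the method runs with a spinlock held on the final iput(), a real implementation would presumably have to avoid sleeping; the sketch sidesteps that by being a pure predicate.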
 	clear_inode() is gone; use end_writeback() instead. As before, it must
 be called exactly once on each call of ->evict_inode() (as it used to be for
 each call of ->delete_inode()). Unlike before, if you are using inode-associated
@@ -318,3 +320,90 @@ if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput(
 may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly
 free the on-disk inode, you may end up doing that while ->write_inode() is writing
 to it.
+
+---
+[mandatory]
+
+	.d_delete() now only advises the dcache as to whether or not to cache
+unreferenced dentries, and is now only called when the dentry refcount goes to
+0. Even on 0 refcount transition, it must be able to tolerate being called 0,
+1, or more times (eg. constant, idempotent).
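One way to read the "constant, idempotent" requirement is that .d_delete() should be a side-effect-free predicate over the dentry. The sketch below is a userspace mock: struct dentry is a stand-in with a single hypothetical field, and example_d_delete() is an invented name. It assumes the usual convention that a nonzero return means "do not cache".

```c
/* Userspace stand-in for struct dentry (assumption, not the kernel type). */
struct dentry {
	int is_negative;	/* hypothetical: nonzero if no inode is attached */
};

/*
 * Hypothetical .d_delete(): returns nonzero to advise the dcache not to
 * keep the unreferenced dentry, zero to let it be cached. It takes a
 * const pointer and modifies nothing, so calling it zero, one, or many
 * times for the same refcount transition gives the same answer.
 */
int example_d_delete(const struct dentry *dentry)
{
	return dentry->is_negative != 0;
}
```

Any cleanup that must happen exactly once would have to live elsewhere, since this hook gives no such guarantee.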
+
+---
+[mandatory]
+
+	.d_compare() calling convention and locking rules are significantly
+changed. Read updated documentation in Documentation/filesystems/vfs.txt (and
+look at examples of other filesystems) for guidance.
+
+---
+[mandatory]
+
+	.d_hash() calling convention and locking rules are significantly
+changed. Read updated documentation in Documentation/filesystems/vfs.txt (and
+look at examples of other filesystems) for guidance.
+
+---
+[mandatory]
+	dcache_lock is gone, replaced by fine grained locks. See fs/dcache.c
+for details of what locks to replace dcache_lock with in order to protect
+particular things. Most of the time, a filesystem only needs ->d_lock, which
+protects *all* the dcache state of a given dentry.
+
+--
+[mandatory]
+
+	Filesystems must RCU-free their inodes, if they can have been accessed
+via rcu-walk path walk (basically, if the file can have had a path name in the
+vfs namespace).
+
+	i_dentry and i_rcu share storage in a union, and the vfs expects
+i_dentry to be reinitialized before it is freed, so an:
+
+	INIT_LIST_HEAD(&inode->i_dentry);
+
+must be done in the RCU callback.
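The RCU-free rule above might look roughly like the following in a filesystem's inode teardown. This is a userspace sketch, not kernel code: list_head, rcu_head, container_of and INIT_LIST_HEAD are re-declared locally as stand-ins, mock_call_rcu() runs the callback immediately instead of waiting for a grace period, and mock_cache_free() only records the pointer. All example_* and mock_* names are hypothetical.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
struct list_head { struct list_head *next, *prev; };
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

struct inode {
	union {			/* i_dentry and i_rcu share storage */
		struct list_head i_dentry;
		struct rcu_head i_rcu;
	};
};

struct inode *last_freed;	/* records what the mock allocator "freed" */

/* Mock of returning the inode to its cache; nothing is released here. */
void mock_cache_free(struct inode *inode)
{
	last_freed = inode;
}

/* Mock of call_rcu(): invokes the callback at once, with no grace period. */
void mock_call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *))
{
	head->func = func;
	func(head);
}

/* RCU callback: reinitialize i_dentry first, then free the inode. */
void example_i_callback(struct rcu_head *head)
{
	struct inode *inode = container_of(head, struct inode, i_rcu);

	INIT_LIST_HEAD(&inode->i_dentry);
	mock_cache_free(inode);
}

/* Hypothetical ->destroy_inode(): defer the free through RCU. */
void example_destroy_inode(struct inode *inode)
{
	mock_call_rcu(&inode->i_rcu, example_i_callback);
}
```

The reinitialization matters because queuing the rcu_head overwrites the bytes that i_dentry occupies; without it the freed object would carry a corrupt list head.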
+
+--
+[recommended]
+	vfs now tries to do path walking in "rcu-walk mode", which avoids
+atomic operations and scalability hazards on dentries and inodes (see
+Documentation/filesystems/path-lookup.txt). d_hash and d_compare changes
+(above) are examples of the changes required to support this. For more complex
+filesystem callbacks, the vfs drops out of rcu-walk mode before the fs call, so
+no changes are required to the filesystem. However, this is costly and loses
+the benefits of rcu-walk mode. We will begin to add filesystem callbacks that
+are rcu-walk aware, shown below. Filesystems should take advantage of this
+where possible.
+
+--
+[mandatory]
+	d_revalidate is a callback that is made on every path element (if
+the filesystem provides it), which requires dropping out of rcu-walk mode. This
+may now be called in rcu-walk mode (nd->flags & LOOKUP_RCU). -ECHILD should be
+returned if the filesystem cannot handle rcu-walk. See
+Documentation/filesystems/vfs.txt for more details.
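For a filesystem that cannot revalidate without blocking, the simplest reading of the d_revalidate rule is "bail out with -ECHILD when LOOKUP_RCU is set". The following userspace sketch assumes stand-in definitions of struct dentry and struct nameidata, a placeholder value for LOOKUP_RCU, and a hypothetical example_d_revalidate(); only the flag test and the -ECHILD return come from the text above.

```c
#include <errno.h>

/* Placeholder flag value for this sketch; the real one lives in the kernel. */
#define LOOKUP_RCU 0x0040

/* Userspace stand-ins for the kernel structures (assumptions). */
struct nameidata { unsigned int flags; };
struct dentry { int still_valid; };	/* hypothetical validity marker */

/*
 * Hypothetical ->d_revalidate(): refuses to run in rcu-walk mode, so the
 * vfs can retry in the slower mode where the filesystem is allowed to
 * take locks or sleep. Otherwise it reports whether the dentry is valid.
 */
int example_d_revalidate(struct dentry *dentry, struct nameidata *nd)
{
	if (nd->flags & LOOKUP_RCU)
		return -ECHILD;

	return dentry->still_valid ? 1 : 0;
}
```

A filesystem that can answer from data safe to read without locks could skip the bail-out and keep the walk in rcu-walk mode, which is the case the preceding section encourages.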
+
+	permission and check_acl are inode permission checks that are called
+on many or all directory inodes on the way down a path walk (to check for
+exec permission). These must now be rcu-walk aware (flags & IPERM_FLAG_RCU).
+See Documentation/filesystems/vfs.txt for more details.
+
+--
+[mandatory]
+	In ->fallocate() you must check the mode option passed in. If your
+filesystem does not support hole punching (deallocating space in the middle of a
+file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode.
+Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set,
+so the i_size should not change when hole punching, even when punching the end of
+a file off.
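A minimal sketch of the mode check for a filesystem without hole-punching support is shown below. It is a userspace mock: the signature is simplified (no inode argument), example_fallocate() is a hypothetical name, and the flag values are defined locally only so the block stands alone.

```c
#include <errno.h>

/* Defined locally so the sketch is self-contained. */
#ifndef FALLOC_FL_KEEP_SIZE
#define FALLOC_FL_KEEP_SIZE	0x01
#endif
#ifndef FALLOC_FL_PUNCH_HOLE
#define FALLOC_FL_PUNCH_HOLE	0x02
#endif

/*
 * Hypothetical ->fallocate() for a filesystem that can preallocate but
 * cannot punch holes. The mode is inspected before anything else, and a
 * hole-punch request is rejected regardless of the other flags.
 */
long example_fallocate(int mode, long long offset, long long len)
{
	if (mode & FALLOC_FL_PUNCH_HOLE)
		return -EOPNOTSUPP;

	/* Preallocation of [offset, offset + len) would happen here. */
	(void)offset;
	(void)len;
	return 0;
}
```

A filesystem that does implement hole punching would instead deallocate the range while leaving i_size untouched, since the flag is only valid together with FALLOC_FL_KEEP_SIZE.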
+
+--
+[mandatory]
+
+--
+[mandatory]
+	->get_sb() is gone. Switch to use of ->mount(). Typically it's just
+a matter of switching from calling get_sb_... to mount_... and changing the
+function type. If you were doing it manually, just switch from setting ->mnt_root
+to some pointer to returning that pointer. On errors return ERR_PTR(...).
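The "return the pointer, or ERR_PTR(...) on error" convention for a hand-written ->mount() can be illustrated in userspace as below. ERR_PTR, PTR_ERR and IS_ERR are re-implemented locally as stand-ins for the kernel helpers, struct dentry is a stub, and example_mount() is a hypothetical method with a simplified argument list.

```c
#include <errno.h>
#include <stddef.h>

/* Local stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Stub dentry and a fake root for the sketch (assumptions). */
struct dentry { const char *name; };
struct dentry example_root = { "/" };

/*
 * Hypothetical ->mount(): where ->get_sb() used to store the root in
 * ->mnt_root and return an int, this returns the root dentry itself,
 * or an error encoded with ERR_PTR().
 */
struct dentry *example_mount(const char *dev_name)
{
	if (!dev_name)
		return ERR_PTR(-EINVAL);

	return &example_root;
}
```

Callers then distinguish the two outcomes with IS_ERR() and recover the errno with PTR_ERR() rather than checking for NULL.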