| -rw-r--r-- | Documentation/filesystems/Exporting | 115 | ||||
| -rw-r--r-- | fs/exportfs/expfs.c | 41 | ||||
| -rw-r--r-- | include/linux/exportfs.h | 26 |
3 files changed, 70 insertions, 112 deletions
diff --git a/Documentation/filesystems/Exporting b/Documentation/filesystems/Exporting index 31047e0fe14b..87019d2b5981 100644 --- a/Documentation/filesystems/Exporting +++ b/Documentation/filesystems/Exporting | |||
| @@ -2,9 +2,12 @@ | |||
| 2 | Making Filesystems Exportable | 2 | Making Filesystems Exportable |
| 3 | ============================= | 3 | ============================= |
| 4 | 4 | ||
| 5 | Most filesystem operations require a dentry (or two) as a starting | 5 | Overview |
| 6 | -------- | ||
| 7 | |||
| 8 | All filesystem operations require a dentry (or two) as a starting | ||
| 6 | point. Local applications have a reference-counted hold on suitable | 9 | point. Local applications have a reference-counted hold on suitable |
| 7 | dentrys via open file descriptors or cwd/root. However remote | 10 | dentries via open file descriptors or cwd/root. However remote |
| 8 | applications that access a filesystem via a remote filesystem protocol | 11 | applications that access a filesystem via a remote filesystem protocol |
| 9 | such as NFS may not be able to hold such a reference, and so need a | 12 | such as NFS may not be able to hold such a reference, and so need a |
| 10 | different way to refer to a particular dentry. As the alternative | 13 | different way to refer to a particular dentry. As the alternative |
| @@ -13,14 +16,14 @@ server-reboot (among other things, though these tend to be the most | |||
| 13 | problematic), there is no simple answer like 'filename'. | 16 | problematic), there is no simple answer like 'filename'. |
| 14 | 17 | ||
| 15 | The mechanism discussed here allows each filesystem implementation to | 18 | The mechanism discussed here allows each filesystem implementation to |
| 16 | specify how to generate an opaque (out side of the filesystem) byte | 19 | specify how to generate an opaque (outside of the filesystem) byte |
| 17 | string for any dentry, and how to find an appropriate dentry for any | 20 | string for any dentry, and how to find an appropriate dentry for any |
| 18 | given opaque byte string. | 21 | given opaque byte string. |
| 19 | This byte string will be called a "filehandle fragment" as it | 22 | This byte string will be called a "filehandle fragment" as it |
| 20 | corresponds to part of an NFS filehandle. | 23 | corresponds to part of an NFS filehandle. |
| 21 | 24 | ||
| 22 | A filesystem which supports the mapping between filehandle fragments | 25 | A filesystem which supports the mapping between filehandle fragments |
| 23 | and dentrys will be termed "exportable". | 26 | and dentries will be termed "exportable". |
| 24 | 27 | ||
| 25 | 28 | ||
| 26 | 29 | ||
| @@ -89,11 +92,9 @@ For a filesystem to be exportable it must: | |||
| 89 | 1/ provide the filehandle fragment routines described below. | 92 | 1/ provide the filehandle fragment routines described below. |
| 90 | 2/ make sure that d_splice_alias is used rather than d_add | 93 | 2/ make sure that d_splice_alias is used rather than d_add |
| 91 | when ->lookup finds an inode for a given parent and name. | 94 | when ->lookup finds an inode for a given parent and name. |
| 92 | Typically the ->lookup routine will end: | 95 | Typically the ->lookup routine will end with a: |
| 93 | if (inode) | 96 | |
| 94 | return d_splice(inode, dentry); | 97 | return d_splice_alias(inode, dentry); |
| 95 | d_add(dentry, inode); | ||
| 96 | return NULL; | ||
| 97 | } | 98 | } |
| 98 | 99 | ||
| 99 | 100 | ||
| @@ -101,67 +102,39 @@ For a filesystem to be exportable it must: | |||
| 101 | A file system implementation declares that instances of the filesystem | 102 | A file system implementation declares that instances of the filesystem |
| 102 | are exportable by setting the s_export_op field in the struct | 103 | are exportable by setting the s_export_op field in the struct |
| 103 | super_block. This field must point to a "struct export_operations" | 104 | super_block. This field must point to a "struct export_operations" |
| 104 | struct which could potentially be full of NULLs, though normally at | 105 | struct which has the following members: |
| 105 | least get_parent will be set. | 106 | |
| 106 | 107 | encode_fh (optional) | |
| 107 | The primary operations are decode_fh and encode_fh. | 108 | Takes a dentry and creates a filehandle fragment which can later be used |
| 108 | decode_fh takes a filehandle fragment and tries to find or create a | 109 | to find or create a dentry for the same object. The default |
| 109 | dentry for the object referred to by the filehandle. | 110 | implementation creates a filehandle fragment that encodes a 32bit inode |
| 110 | encode_fh takes a dentry and creates a filehandle fragment which can | 111 | and generation number for the inode encoded, and if necessary the |
| 111 | later be used to find/create a dentry for the same object. | 112 | same information for the parent. |
| 112 | 113 | ||
| 113 | decode_fh will probably make use of "find_exported_dentry". | 114 | fh_to_dentry (mandatory) |
| 114 | This function lives in the "exportfs" module which a filesystem does | 115 | Given a filehandle fragment, this should find the implied object and |
| 115 | not need unless it is being exported. So rather that calling | 116 | create a dentry for it (possibly with d_alloc_anon). |
| 116 | find_exported_dentry directly, each filesystem should call it through | 117 | |
| 117 | the find_exported_dentry pointer in it's export_operations table. | 118 | fh_to_parent (optional but strongly recommended) |
| 118 | This field is set correctly by the exporting agent (e.g. nfsd) when a | 119 | Given a filehandle fragment, this should find the parent of the |
| 119 | filesystem is exported, and before any export operations are called. | 120 | implied object and create a dentry for it (possibly with d_alloc_anon). |
| 120 | 121 | May fail if the filehandle fragment is too small. | |
| 121 | find_exported_dentry needs three support functions from the | 122 | |
| 122 | filesystem: | 123 | get_parent (optional but strongly recommended) |
| 123 | get_name. When given a parent dentry and a child dentry, this | 124 | When given a dentry for a directory, this should return a dentry for |
| 124 | should find a name in the directory identified by the parent | 125 | the parent. Quite possibly the parent dentry will have been allocated |
| 125 | dentry, which leads to the object identified by the child dentry. | 126 | by d_alloc_anon. The default get_parent function just returns an error |
| 126 | If no get_name function is supplied, a default implementation is | 127 | so any filehandle lookup that requires finding a parent will fail. |
| 127 | provided which uses vfs_readdir to find potential names, and | 128 | ->lookup("..") is *not* used as a default as it can leave ".." entries |
| 128 | matches inode numbers to find the correct match. | 129 | in the dcache which are too messy to work with. |
| 129 | 130 | ||
| 130 | get_parent. When given a dentry for a directory, this should return | 131 | get_name (optional) |
| 131 | a dentry for the parent. Quite possibly the parent dentry will | 132 | When given a parent dentry and a child dentry, this should find a name |
| 132 | have been allocated by d_alloc_anon. | 133 | in the directory identified by the parent dentry, which leads to the |
| 133 | The default get_parent function just returns an error so any | 134 | object identified by the child dentry. If no get_name function is |
| 134 | filehandle lookup that requires finding a parent will fail. | 135 | supplied, a default implementation is provided which uses vfs_readdir |
| 135 | ->lookup("..") is *not* used as a default as it can leave ".." | 136 | to find potential names, and matches inode numbers to find the correct |
| 136 | entries in the dcache which are too messy to work with. | 137 | match. |
| 137 | |||
| 138 | get_dentry. When given an opaque datum, this should find the | ||
| 139 | implied object and create a dentry for it (possibly with | ||
| 140 | d_alloc_anon). | ||
| 141 | The opaque datum is whatever is passed down by the decode_fh | ||
| 142 | function, and is often simply a fragment of the filehandle | ||
| 143 | fragment. | ||
| 144 | decode_fh passes two datums through find_exported_dentry. One that | ||
| 145 | should be used to identify the target object, and one that can be | ||
| 146 | used to identify the object's parent, should that be necessary. | ||
| 147 | The default get_dentry function assumes that the datum contains an | ||
| 148 | inode number and a generation number, and it attempts to get the | ||
| 149 | inode using "iget" and check it's validity by matching the | ||
| 150 | generation number. A filesystem should only depend on the default | ||
| 151 | if iget can safely be used this way. | ||
| 152 | |||
| 153 | If decode_fh and/or encode_fh are left as NULL, then default | ||
| 154 | implementations are used. These defaults are suitable for ext2 and | ||
| 155 | extremely similar filesystems (like ext3). | ||
| 156 | |||
| 157 | The default encode_fh creates a filehandle fragment from the inode | ||
| 158 | number and generation number of the target together with the inode | ||
| 159 | number and generation number of the parent (if the parent is | ||
| 160 | required). | ||
| 161 | |||
| 162 | The default decode_fh extract the target and parent datums from the | ||
| 163 | filehandle assuming the format used by the default encode_fh and | ||
| 164 | passed them to find_exported_dentry. | ||
| 165 | 138 | ||
| 166 | 139 | ||
| 167 | A filehandle fragment consists of an array of 1 or more 4byte words, | 140 | A filehandle fragment consists of an array of 1 or more 4byte words, |
| @@ -172,5 +145,3 @@ generated by encode_fh, in which case it will have been padded with | |||
| 172 | nuls. Rather, the encode_fh routine should choose a "type" which | 145 | nuls. Rather, the encode_fh routine should choose a "type" which |
| 173 | indicates the decode_fh how much of the filehandle is valid, and how | 146 | indicates the decode_fh how much of the filehandle is valid, and how |
| 174 | it should be interpreted. | 147 | it should be interpreted. |
| 175 | |||
| 176 | |||
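Reviewer sketch (not part of the patch): the documentation hunk above describes a filehandle fragment as an array of 4-byte words, with a "type" chosen by encode_fh that tells the decoder how much of the fragment is valid. The user-space C below illustrates that idea under stated assumptions. Every name here (demo_id, demo_encode_fh, demo_decode_fh, the DEMO_FH_* codes and their values) is hypothetical and is not the kernel's API or its default implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes: they tell the decoder how many words are valid. */
enum demo_fh_type {
	DEMO_FH_INVALID = 0,
	DEMO_FH_INO_GEN = 1,		/* 2 words: inode, generation */
	DEMO_FH_INO_GEN_PARENT = 2	/* 4 words: adds parent inode, generation */
};

/* Hypothetical stand-in for the identity a filesystem reads from an inode. */
struct demo_id {
	uint32_t ino;
	uint32_t gen;
};

/*
 * Pack obj (and parent, if non-NULL) into fh[], an array of 4-byte words.
 * *max_len is the capacity in words on entry and the words used on return.
 * Returns the type code, or DEMO_FH_INVALID if the buffer is too small.
 */
int demo_encode_fh(uint32_t *fh, size_t *max_len,
		   const struct demo_id *obj, const struct demo_id *parent)
{
	size_t need = parent ? 4 : 2;

	if (*max_len < need)
		return DEMO_FH_INVALID;

	fh[0] = obj->ino;
	fh[1] = obj->gen;
	if (parent) {
		fh[2] = parent->ino;
		fh[3] = parent->gen;
	}
	*max_len = need;
	return parent ? DEMO_FH_INO_GEN_PARENT : DEMO_FH_INO_GEN;
}

/*
 * Unpack a fragment. The type, not the buffer length alone, decides how
 * many words are interpreted, since a fragment may arrive padded.
 * Returns 0 on success and -1 if the type is unknown or the fragment is
 * too short for it. *has_parent reports whether *parent was filled in.
 */
int demo_decode_fh(const uint32_t *fh, size_t len, int type,
		   struct demo_id *obj, struct demo_id *parent,
		   int *has_parent)
{
	*has_parent = 0;

	if (type != DEMO_FH_INO_GEN && type != DEMO_FH_INO_GEN_PARENT)
		return -1;
	if (len < 2)
		return -1;

	obj->ino = fh[0];
	obj->gen = fh[1];

	if (type == DEMO_FH_INO_GEN_PARENT) {
		if (len < 4)
			return -1;	/* parent promised but fragment too small */
		parent->ino = fh[2];
		parent->gen = fh[3];
		*has_parent = 1;
	}
	return 0;
}
```

A real filesystem would return dentries rather than raw numbers from the decode side and would verify the generation number against the on-disk inode; this sketch only shows the word layout and the role of the type code.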
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c index 352465312398..109ab5e44eca 100644 --- a/fs/exportfs/expfs.c +++ b/fs/exportfs/expfs.c | |||
| @@ -1,4 +1,13 @@ | |||
| 1 | 1 | /* | |
| 2 | * Copyright (C) Neil Brown 2002 | ||
| 3 | * Copyright (C) Christoph Hellwig 2007 | ||
| 4 | * | ||
| 5 | * This file contains the code mapping from inodes to NFS file handles, | ||
| 6 | * and for mapping back from file handles to dentries. | ||
| 7 | * | ||
| 8 | * For details on why we do all the strange and hairy things in here | ||
| 9 | * take a look at Documentation/filesystems/Exporting. | ||
| 10 | */ | ||
| 2 | #include <linux/exportfs.h> | 11 | #include <linux/exportfs.h> |
| 3 | #include <linux/fs.h> | 12 | #include <linux/fs.h> |
| 4 | #include <linux/file.h> | 13 | #include <linux/file.h> |
| @@ -9,19 +18,19 @@ | |||
| 9 | #define dprintk(fmt, args...) do{}while(0) | 18 | #define dprintk(fmt, args...) do{}while(0) |
| 10 | 19 | ||
| 11 | 20 | ||
| 12 | static int get_name(struct dentry *dentry, char *name, | 21 | static int get_name(struct vfsmount *mnt, struct dentry *dentry, char *name, |
| 13 | struct dentry *child); | 22 | struct dentry *child); |
| 14 | 23 | ||
| 15 | 24 | ||
| 16 | static int exportfs_get_name(struct dentry *dir, char *name, | 25 | static int exportfs_get_name(struct vfsmount *mnt, struct dentry *dir, |
| 17 | struct dentry *child) | 26 | char *name, struct dentry *child) |
| 18 | { | 27 | { |
| 19 | const struct export_operations *nop = dir->d_sb->s_export_op; | 28 | const struct export_operations *nop = dir->d_sb->s_export_op; |
| 20 | 29 | ||
| 21 | if (nop->get_name) | 30 | if (nop->get_name) |
| 22 | return nop->get_name(dir, name, child); | 31 | return nop->get_name(dir, name, child); |
| 23 | else | 32 | else |
| 24 | return get_name(dir, name, child); | 33 | return get_name(mnt, dir, name, child); |
| 25 | } | 34 | } |
| 26 | 35 | ||
| 27 | /* | 36 | /* |
| @@ -85,7 +94,7 @@ find_disconnected_root(struct dentry *dentry) | |||
| 85 | * It may already be, as the flag isn't always updated when connection happens. | 94 | * It may already be, as the flag isn't always updated when connection happens. |
| 86 | */ | 95 | */ |
| 87 | static int | 96 | static int |
| 88 | reconnect_path(struct super_block *sb, struct dentry *target_dir) | 97 | reconnect_path(struct vfsmount *mnt, struct dentry *target_dir) |
| 89 | { | 98 | { |
| 90 | char nbuf[NAME_MAX+1]; | 99 | char nbuf[NAME_MAX+1]; |
| 91 | int noprogress = 0; | 100 | int noprogress = 0; |
| @@ -108,7 +117,7 @@ reconnect_path(struct super_block *sb, struct dentry *target_dir) | |||
| 108 | pd->d_flags &= ~DCACHE_DISCONNECTED; | 117 | pd->d_flags &= ~DCACHE_DISCONNECTED; |
| 109 | spin_unlock(&pd->d_lock); | 118 | spin_unlock(&pd->d_lock); |
| 110 | noprogress = 0; | 119 | noprogress = 0; |
| 111 | } else if (pd == sb->s_root) { | 120 | } else if (pd == mnt->mnt_sb->s_root) { |
| 112 | printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n"); | 121 | printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n"); |
| 113 | spin_lock(&pd->d_lock); | 122 | spin_lock(&pd->d_lock); |
| 114 | pd->d_flags &= ~DCACHE_DISCONNECTED; | 123 | pd->d_flags &= ~DCACHE_DISCONNECTED; |
| @@ -134,8 +143,8 @@ reconnect_path(struct super_block *sb, struct dentry *target_dir) | |||
| 134 | struct dentry *npd; | 143 | struct dentry *npd; |
| 135 | 144 | ||
| 136 | mutex_lock(&pd->d_inode->i_mutex); | 145 | mutex_lock(&pd->d_inode->i_mutex); |
| 137 | if (sb->s_export_op->get_parent) | 146 | if (mnt->mnt_sb->s_export_op->get_parent) |
| 138 | ppd = sb->s_export_op->get_parent(pd); | 147 | ppd = mnt->mnt_sb->s_export_op->get_parent(pd); |
| 139 | mutex_unlock(&pd->d_inode->i_mutex); | 148 | mutex_unlock(&pd->d_inode->i_mutex); |
| 140 | 149 | ||
| 141 | if (IS_ERR(ppd)) { | 150 | if (IS_ERR(ppd)) { |
| @@ -148,7 +157,7 @@ reconnect_path(struct super_block *sb, struct dentry *target_dir) | |||
| 148 | 157 | ||
| 149 | dprintk("%s: find name of %lu in %lu\n", __FUNCTION__, | 158 | dprintk("%s: find name of %lu in %lu\n", __FUNCTION__, |
| 150 | pd->d_inode->i_ino, ppd->d_inode->i_ino); | 159 | pd->d_inode->i_ino, ppd->d_inode->i_ino); |
| 151 | err = exportfs_get_name(ppd, nbuf, pd); | 160 | err = exportfs_get_name(mnt, ppd, nbuf, pd); |
| 152 | if (err) { | 161 | if (err) { |
| 153 | dput(ppd); | 162 | dput(ppd); |
| 154 | dput(pd); | 163 | dput(pd); |
| @@ -238,8 +247,8 @@ static int filldir_one(void * __buf, const char * name, int len, | |||
| 238 | * calls readdir on the parent until it finds an entry with | 247 | * calls readdir on the parent until it finds an entry with |
| 239 | * the same inode number as the child, and returns that. | 248 | * the same inode number as the child, and returns that. |
| 240 | */ | 249 | */ |
| 241 | static int get_name(struct dentry *dentry, char *name, | 250 | static int get_name(struct vfsmount *mnt, struct dentry *dentry, |
| 242 | struct dentry *child) | 251 | char *name, struct dentry *child) |
| 243 | { | 252 | { |
| 244 | struct inode *dir = dentry->d_inode; | 253 | struct inode *dir = dentry->d_inode; |
| 245 | int error; | 254 | int error; |
| @@ -255,7 +264,7 @@ static int get_name(struct dentry *dentry, char *name, | |||
| 255 | /* | 264 | /* |
| 256 | * Open the directory ... | 265 | * Open the directory ... |
| 257 | */ | 266 | */ |
| 258 | file = dentry_open(dget(dentry), NULL, O_RDONLY); | 267 | file = dentry_open(dget(dentry), mntget(mnt), O_RDONLY); |
| 259 | error = PTR_ERR(file); | 268 | error = PTR_ERR(file); |
| 260 | if (IS_ERR(file)) | 269 | if (IS_ERR(file)) |
| 261 | goto out; | 270 | goto out; |
| @@ -372,7 +381,7 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid, | |||
| 372 | * filesystem root. | 381 | * filesystem root. |
| 373 | */ | 382 | */ |
| 374 | if (result->d_flags & DCACHE_DISCONNECTED) { | 383 | if (result->d_flags & DCACHE_DISCONNECTED) { |
| 375 | err = reconnect_path(mnt->mnt_sb, result); | 384 | err = reconnect_path(mnt, result); |
| 376 | if (err) | 385 | if (err) |
| 377 | goto err_result; | 386 | goto err_result; |
| 378 | } | 387 | } |
| @@ -424,7 +433,7 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid, | |||
| 424 | * connected to the filesystem root. The VFS really doesn't | 433 | * connected to the filesystem root. The VFS really doesn't |
| 425 | * like disconnected directories.. | 434 | * like disconnected directories.. |
| 426 | */ | 435 | */ |
| 427 | err = reconnect_path(mnt->mnt_sb, target_dir); | 436 | err = reconnect_path(mnt, target_dir); |
| 428 | if (err) { | 437 | if (err) { |
| 429 | dput(target_dir); | 438 | dput(target_dir); |
| 430 | goto err_result; | 439 | goto err_result; |
| @@ -435,7 +444,7 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid, | |||
| 435 | * dentry for the inode we're after, make sure that our | 444 | * dentry for the inode we're after, make sure that our |
| 436 | * inode is actually connected to the parent. | 445 | * inode is actually connected to the parent. |
| 437 | */ | 446 | */ |
| 438 | err = exportfs_get_name(target_dir, nbuf, result); | 447 | err = exportfs_get_name(mnt, target_dir, nbuf, result); |
| 439 | if (!err) { | 448 | if (!err) { |
| 440 | mutex_lock(&target_dir->d_inode->i_mutex); | 449 | mutex_lock(&target_dir->d_inode->i_mutex); |
| 441 | nresult = lookup_one_len(nbuf, target_dir, | 450 | nresult = lookup_one_len(nbuf, target_dir, |
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h index 0b4a771b4903..51d214138814 100644 --- a/include/linux/exportfs.h +++ b/include/linux/exportfs.h | |||
| @@ -55,30 +55,8 @@ struct fid { | |||
| 55 | * @get_parent: find the parent of a given directory | 55 | * @get_parent: find the parent of a given directory |
| 56 | * @get_dentry: find a dentry for the inode given a file handle sub-fragment | 56 | * @get_dentry: find a dentry for the inode given a file handle sub-fragment |
| 57 | * | 57 | * |
| 58 | * Description: | 58 | * See Documentation/filesystems/Exporting for details on how to use |
| 59 | * The export_operations structure provides a means for nfsd to communicate | 59 | * this interface correctly. |
| 60 | * with a particular exported file system - particularly enabling nfsd and | ||
| 61 | * the filesystem to co-operate when dealing with file handles. | ||
| 62 | * | ||
| 63 | * export_operations contains two basic operation for dealing with file | ||
| 64 | * handles, decode_fh() and encode_fh(), and allows for some other | ||
| 65 | * operations to be defined which standard helper routines use to get | ||
| 66 | * specific information from the filesystem. | ||
| 67 | * | ||
| 68 | * nfsd encodes information use to determine which filesystem a filehandle | ||
| 69 | * applies to in the initial part of the file handle. The remainder, termed | ||
| 70 | * a file handle fragment, is controlled completely by the filesystem. The | ||
| 71 | * standard helper routines assume that this fragment will contain one or | ||
| 72 | * two sub-fragments, one which identifies the file, and one which may be | ||
| 73 | * used to identify the (a) directory containing the file. | ||
| 74 | * | ||
| 75 | * In some situations, nfsd needs to get a dentry which is connected into a | ||
| 76 | * specific part of the file tree. To allow for this, it passes the | ||
| 77 | * function acceptable() together with a @context which can be used to see | ||
| 78 | * if the dentry is acceptable. As there can be multiple dentrys for a | ||
| 79 | * given file, the filesystem should check each one for acceptability before | ||
| 80 | * looking for the next. As soon as an acceptable one is found, it should | ||
| 81 | * be returned. | ||
| 82 | * | 60 | * |
| 83 | * encode_fh: | 61 | * encode_fh: |
| 84 | * @encode_fh should store in the file handle fragment @fh (using at most | 62 | * @encode_fh should store in the file handle fragment @fh (using at most |
