-rw-r--r-- | Documentation/filesystems/Exporting | 115 |
-rw-r--r-- | fs/exportfs/expfs.c | 41 |
-rw-r--r-- | include/linux/exportfs.h | 26 |
3 files changed, 70 insertions, 112 deletions
diff --git a/Documentation/filesystems/Exporting b/Documentation/filesystems/Exporting index 31047e0fe14b..87019d2b5981 100644 --- a/Documentation/filesystems/Exporting +++ b/Documentation/filesystems/Exporting | |||
@@ -2,9 +2,12 @@ | |||
2 | Making Filesystems Exportable | 2 | Making Filesystems Exportable |
3 | ============================= | 3 | ============================= |
4 | 4 | ||
5 | Most filesystem operations require a dentry (or two) as a starting | 5 | Overview |
6 | -------- | ||
7 | |||
8 | All filesystem operations require a dentry (or two) as a starting | ||
6 | point. Local applications have a reference-counted hold on suitable | 9 | point. Local applications have a reference-counted hold on suitable |
7 | dentrys via open file descriptors or cwd/root. However remote | 10 | dentries via open file descriptors or cwd/root. However remote |
8 | applications that access a filesystem via a remote filesystem protocol | 11 | applications that access a filesystem via a remote filesystem protocol |
9 | such as NFS may not be able to hold such a reference, and so need a | 12 | such as NFS may not be able to hold such a reference, and so need a |
10 | different way to refer to a particular dentry. As the alternative | 13 | different way to refer to a particular dentry. As the alternative |
@@ -13,14 +16,14 @@ server-reboot (among other things, though these tend to be the most | |||
13 | problematic), there is no simple answer like 'filename'. | 16 | problematic), there is no simple answer like 'filename'. |
14 | 17 | ||
15 | The mechanism discussed here allows each filesystem implementation to | 18 | The mechanism discussed here allows each filesystem implementation to |
16 | specify how to generate an opaque (out side of the filesystem) byte | 19 | specify how to generate an opaque (outside of the filesystem) byte |
17 | string for any dentry, and how to find an appropriate dentry for any | 20 | string for any dentry, and how to find an appropriate dentry for any |
18 | given opaque byte string. | 21 | given opaque byte string. |
19 | This byte string will be called a "filehandle fragment" as it | 22 | This byte string will be called a "filehandle fragment" as it |
20 | corresponds to part of an NFS filehandle. | 23 | corresponds to part of an NFS filehandle. |
21 | 24 | ||
22 | A filesystem which supports the mapping between filehandle fragments | 25 | A filesystem which supports the mapping between filehandle fragments |
23 | and dentrys will be termed "exportable". | 26 | and dentries will be termed "exportable". |
24 | 27 | ||
25 | 28 | ||
26 | 29 | ||
@@ -89,11 +92,9 @@ For a filesystem to be exportable it must: | |||
89 | 1/ provide the filehandle fragment routines described below. | 92 | 1/ provide the filehandle fragment routines described below. |
90 | 2/ make sure that d_splice_alias is used rather than d_add | 93 | 2/ make sure that d_splice_alias is used rather than d_add |
91 | when ->lookup finds an inode for a given parent and name. | 94 | when ->lookup finds an inode for a given parent and name. |
92 | Typically the ->lookup routine will end: | 95 | Typically the ->lookup routine will end with a: |
93 | if (inode) | 96 | |
94 | return d_splice(inode, dentry); | 97 | return d_splice_alias(inode, dentry); |
95 | d_add(dentry, inode); | ||
96 | return NULL; | ||
97 | } | 98 | } |
98 | 99 | ||
99 | 100 | ||
@@ -101,67 +102,39 @@ For a filesystem to be exportable it must: | |||
101 | A file system implementation declares that instances of the filesystem | 102 | A file system implementation declares that instances of the filesystem |
102 | are exportable by setting the s_export_op field in the struct | 103 | are exportable by setting the s_export_op field in the struct |
103 | super_block. This field must point to a "struct export_operations" | 104 | super_block. This field must point to a "struct export_operations" |
104 | struct which could potentially be full of NULLs, though normally at | 105 | struct which has the following members: |
105 | least get_parent will be set. | 106 | |
106 | 107 | encode_fh (optional) | |
107 | The primary operations are decode_fh and encode_fh. | 108 | Takes a dentry and creates a filehandle fragment which can later be used |
108 | decode_fh takes a filehandle fragment and tries to find or create a | 109 | to find or create a dentry for the same object. The default |
109 | dentry for the object referred to by the filehandle. | 110 | implementation creates a filehandle fragment that encodes a 32bit inode |
110 | encode_fh takes a dentry and creates a filehandle fragment which can | 111 | and generation number for the inode encoded, and if necessary the |
111 | later be used to find/create a dentry for the same object. | 112 | same information for the parent. |
112 | 113 | ||
113 | decode_fh will probably make use of "find_exported_dentry". | 114 | fh_to_dentry (mandatory) |
114 | This function lives in the "exportfs" module which a filesystem does | 115 | Given a filehandle fragment, this should find the implied object and |
115 | not need unless it is being exported. So rather that calling | 116 | create a dentry for it (possibly with d_alloc_anon). |
116 | find_exported_dentry directly, each filesystem should call it through | 117 | |
117 | the find_exported_dentry pointer in it's export_operations table. | 118 | fh_to_parent (optional but strongly recommended) |
118 | This field is set correctly by the exporting agent (e.g. nfsd) when a | 119 | Given a filehandle fragment, this should find the parent of the |
119 | filesystem is exported, and before any export operations are called. | 120 | implied object and create a dentry for it (possibly with d_alloc_anon). |
120 | 121 | May fail if the filehandle fragment is too small. | |
121 | find_exported_dentry needs three support functions from the | 122 | |
122 | filesystem: | 123 | get_parent (optional but strongly recommended) |
123 | get_name. When given a parent dentry and a child dentry, this | 124 | When given a dentry for a directory, this should return a dentry for |
124 | should find a name in the directory identified by the parent | 125 | the parent. Quite possibly the parent dentry will have been allocated |
125 | dentry, which leads to the object identified by the child dentry. | 126 | by d_alloc_anon. The default get_parent function just returns an error |
126 | If no get_name function is supplied, a default implementation is | 127 | so any filehandle lookup that requires finding a parent will fail. |
127 | provided which uses vfs_readdir to find potential names, and | 128 | ->lookup("..") is *not* used as a default as it can leave ".." entries |
128 | matches inode numbers to find the correct match. | 129 | in the dcache which are too messy to work with. |
129 | 130 | ||
130 | get_parent. When given a dentry for a directory, this should return | 131 | get_name (optional) |
131 | a dentry for the parent. Quite possibly the parent dentry will | 132 | When given a parent dentry and a child dentry, this should find a name |
132 | have been allocated by d_alloc_anon. | 133 | in the directory identified by the parent dentry, which leads to the |
133 | The default get_parent function just returns an error so any | 134 | object identified by the child dentry. If no get_name function is |
134 | filehandle lookup that requires finding a parent will fail. | 135 | supplied, a default implementation is provided which uses vfs_readdir |
135 | ->lookup("..") is *not* used as a default as it can leave ".." | 136 | to find potential names, and matches inode numbers to find the correct |
136 | entries in the dcache which are too messy to work with. | 137 | match. |
137 | |||
138 | get_dentry. When given an opaque datum, this should find the | ||
139 | implied object and create a dentry for it (possibly with | ||
140 | d_alloc_anon). | ||
141 | The opaque datum is whatever is passed down by the decode_fh | ||
142 | function, and is often simply a fragment of the filehandle | ||
143 | fragment. | ||
144 | decode_fh passes two datums through find_exported_dentry. One that | ||
145 | should be used to identify the target object, and one that can be | ||
146 | used to identify the object's parent, should that be necessary. | ||
147 | The default get_dentry function assumes that the datum contains an | ||
148 | inode number and a generation number, and it attempts to get the | ||
149 | inode using "iget" and check it's validity by matching the | ||
150 | generation number. A filesystem should only depend on the default | ||
151 | if iget can safely be used this way. | ||
152 | |||
153 | If decode_fh and/or encode_fh are left as NULL, then default | ||
154 | implementations are used. These defaults are suitable for ext2 and | ||
155 | extremely similar filesystems (like ext3). | ||
156 | |||
157 | The default encode_fh creates a filehandle fragment from the inode | ||
158 | number and generation number of the target together with the inode | ||
159 | number and generation number of the parent (if the parent is | ||
160 | required). | ||
161 | |||
162 | The default decode_fh extract the target and parent datums from the | ||
163 | filehandle assuming the format used by the default encode_fh and | ||
164 | passed them to find_exported_dentry. | ||
165 | 138 | ||
166 | 139 | ||
167 | A filehandle fragment consists of an array of 1 or more 4byte words, | 140 | A filehandle fragment consists of an array of 1 or more 4byte words, |
@@ -172,5 +145,3 @@ generated by encode_fh, in which case it will have been padded with | |||
172 | nuls. Rather, the encode_fh routine should choose a "type" which | 145 | nuls. Rather, the encode_fh routine should choose a "type" which |
173 | indicates the decode_fh how much of the filehandle is valid, and how | 146 | indicates the decode_fh how much of the filehandle is valid, and how |
174 | it should be interpreted. | 147 | it should be interpreted. |
175 | |||
176 | |||
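Note on the documentation hunk above: the new text says the default encode_fh stores a 32bit inode number and generation number for the target, plus the same pair for the parent when needed, and that the "type" tells the decoder how much of the fragment is valid. The following is a minimal user-space sketch of that layout, not the kernel's implementation. Every name in it (`sketch_obj`, `sketch_encode_fh`, `sketch_decode_fh`, the `SKETCH_FH_*` tags) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags: they tell the decoder how many words are valid. */
enum sketch_fh_type {
	SKETCH_FH_INO_GEN = 1,		/* words: ino, gen */
	SKETCH_FH_INO_GEN_PARENT = 2	/* words: ino, gen, parent ino, parent gen */
};

/* Hypothetical stand-in for the (inode number, generation) pair. */
struct sketch_obj {
	uint32_t ino;
	uint32_t gen;
};

/*
 * Encode obj (and parent, if non-NULL) into fh, an array of 4-byte words.
 * On entry *words is the capacity of fh; on success it is set to the
 * number of words used and the type tag is returned. Returns -1 if the
 * buffer is too small.
 */
int sketch_encode_fh(const struct sketch_obj *obj,
		     const struct sketch_obj *parent,
		     uint32_t *fh, int *words)
{
	assert(obj != NULL && fh != NULL && words != NULL);

	int need = parent ? 4 : 2;

	if (*words < need)
		return -1;

	fh[0] = obj->ino;
	fh[1] = obj->gen;
	if (parent) {
		fh[2] = parent->ino;
		fh[3] = parent->gen;
	}
	*words = need;
	return parent ? SKETCH_FH_INO_GEN_PARENT : SKETCH_FH_INO_GEN;
}

/*
 * Decode a fragment produced by sketch_encode_fh. Only the words implied
 * by type are read; any padding after them is ignored.
 * Returns 1 if a parent was decoded, 0 if only the object was, and -1 if
 * the type is unknown or the fragment is too short for it.
 */
int sketch_decode_fh(const uint32_t *fh, int words, int type,
		     struct sketch_obj *obj, struct sketch_obj *parent)
{
	assert(fh != NULL && obj != NULL && parent != NULL);

	if (type == SKETCH_FH_INO_GEN && words >= 2) {
		obj->ino = fh[0];
		obj->gen = fh[1];
		return 0;
	}
	if (type == SKETCH_FH_INO_GEN_PARENT && words >= 4) {
		obj->ino = fh[0];
		obj->gen = fh[1];
		parent->ino = fh[2];
		parent->gen = fh[3];
		return 1;
	}
	return -1;
}
```

The decoder trusts the type rather than the length, which mirrors the documentation's point that a fragment may arrive padded and so its length alone does not say how much of it is meaningful.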
diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c index 352465312398..109ab5e44eca 100644 --- a/fs/exportfs/expfs.c +++ b/fs/exportfs/expfs.c | |||
@@ -1,4 +1,13 @@ | |||
1 | 1 | /* | |
2 | * Copyright (C) Neil Brown 2002 | ||
3 | * Copyright (C) Christoph Hellwig 2007 | ||
4 | * | ||
5 | * This file contains the code mapping from inodes to NFS file handles, | ||
6 | * and for mapping back from file handles to dentries. | ||
7 | * | ||
8 | * For details on why we do all the strange and hairy things in here | ||
9 | * take a look at Documentation/filesystems/Exporting. | ||
10 | */ | ||
2 | #include <linux/exportfs.h> | 11 | #include <linux/exportfs.h> |
3 | #include <linux/fs.h> | 12 | #include <linux/fs.h> |
4 | #include <linux/file.h> | 13 | #include <linux/file.h> |
@@ -9,19 +18,19 @@ | |||
9 | #define dprintk(fmt, args...) do{}while(0) | 18 | #define dprintk(fmt, args...) do{}while(0) |
10 | 19 | ||
11 | 20 | ||
12 | static int get_name(struct dentry *dentry, char *name, | 21 | static int get_name(struct vfsmount *mnt, struct dentry *dentry, char *name, |
13 | struct dentry *child); | 22 | struct dentry *child); |
14 | 23 | ||
15 | 24 | ||
16 | static int exportfs_get_name(struct dentry *dir, char *name, | 25 | static int exportfs_get_name(struct vfsmount *mnt, struct dentry *dir, |
17 | struct dentry *child) | 26 | char *name, struct dentry *child) |
18 | { | 27 | { |
19 | const struct export_operations *nop = dir->d_sb->s_export_op; | 28 | const struct export_operations *nop = dir->d_sb->s_export_op; |
20 | 29 | ||
21 | if (nop->get_name) | 30 | if (nop->get_name) |
22 | return nop->get_name(dir, name, child); | 31 | return nop->get_name(dir, name, child); |
23 | else | 32 | else |
24 | return get_name(dir, name, child); | 33 | return get_name(mnt, dir, name, child); |
25 | } | 34 | } |
26 | 35 | ||
27 | /* | 36 | /* |
@@ -85,7 +94,7 @@ find_disconnected_root(struct dentry *dentry) | |||
85 | * It may already be, as the flag isn't always updated when connection happens. | 94 | * It may already be, as the flag isn't always updated when connection happens. |
86 | */ | 95 | */ |
87 | static int | 96 | static int |
88 | reconnect_path(struct super_block *sb, struct dentry *target_dir) | 97 | reconnect_path(struct vfsmount *mnt, struct dentry *target_dir) |
89 | { | 98 | { |
90 | char nbuf[NAME_MAX+1]; | 99 | char nbuf[NAME_MAX+1]; |
91 | int noprogress = 0; | 100 | int noprogress = 0; |
@@ -108,7 +117,7 @@ reconnect_path(struct super_block *sb, struct dentry *target_dir) | |||
108 | pd->d_flags &= ~DCACHE_DISCONNECTED; | 117 | pd->d_flags &= ~DCACHE_DISCONNECTED; |
109 | spin_unlock(&pd->d_lock); | 118 | spin_unlock(&pd->d_lock); |
110 | noprogress = 0; | 119 | noprogress = 0; |
111 | } else if (pd == sb->s_root) { | 120 | } else if (pd == mnt->mnt_sb->s_root) { |
112 | printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n"); | 121 | printk(KERN_ERR "export: Eeek filesystem root is not connected, impossible\n"); |
113 | spin_lock(&pd->d_lock); | 122 | spin_lock(&pd->d_lock); |
114 | pd->d_flags &= ~DCACHE_DISCONNECTED; | 123 | pd->d_flags &= ~DCACHE_DISCONNECTED; |
@@ -134,8 +143,8 @@ reconnect_path(struct super_block *sb, struct dentry *target_dir) | |||
134 | struct dentry *npd; | 143 | struct dentry *npd; |
135 | 144 | ||
136 | mutex_lock(&pd->d_inode->i_mutex); | 145 | mutex_lock(&pd->d_inode->i_mutex); |
137 | if (sb->s_export_op->get_parent) | 146 | if (mnt->mnt_sb->s_export_op->get_parent) |
138 | ppd = sb->s_export_op->get_parent(pd); | 147 | ppd = mnt->mnt_sb->s_export_op->get_parent(pd); |
139 | mutex_unlock(&pd->d_inode->i_mutex); | 148 | mutex_unlock(&pd->d_inode->i_mutex); |
140 | 149 | ||
141 | if (IS_ERR(ppd)) { | 150 | if (IS_ERR(ppd)) { |
@@ -148,7 +157,7 @@ reconnect_path(struct super_block *sb, struct dentry *target_dir) | |||
148 | 157 | ||
149 | dprintk("%s: find name of %lu in %lu\n", __FUNCTION__, | 158 | dprintk("%s: find name of %lu in %lu\n", __FUNCTION__, |
150 | pd->d_inode->i_ino, ppd->d_inode->i_ino); | 159 | pd->d_inode->i_ino, ppd->d_inode->i_ino); |
151 | err = exportfs_get_name(ppd, nbuf, pd); | 160 | err = exportfs_get_name(mnt, ppd, nbuf, pd); |
152 | if (err) { | 161 | if (err) { |
153 | dput(ppd); | 162 | dput(ppd); |
154 | dput(pd); | 163 | dput(pd); |
@@ -238,8 +247,8 @@ static int filldir_one(void * __buf, const char * name, int len, | |||
238 | * calls readdir on the parent until it finds an entry with | 247 | * calls readdir on the parent until it finds an entry with |
239 | * the same inode number as the child, and returns that. | 248 | * the same inode number as the child, and returns that. |
240 | */ | 249 | */ |
241 | static int get_name(struct dentry *dentry, char *name, | 250 | static int get_name(struct vfsmount *mnt, struct dentry *dentry, |
242 | struct dentry *child) | 251 | char *name, struct dentry *child) |
243 | { | 252 | { |
244 | struct inode *dir = dentry->d_inode; | 253 | struct inode *dir = dentry->d_inode; |
245 | int error; | 254 | int error; |
@@ -255,7 +264,7 @@ static int get_name(struct dentry *dentry, char *name, | |||
255 | /* | 264 | /* |
256 | * Open the directory ... | 265 | * Open the directory ... |
257 | */ | 266 | */ |
258 | file = dentry_open(dget(dentry), NULL, O_RDONLY); | 267 | file = dentry_open(dget(dentry), mntget(mnt), O_RDONLY); |
259 | error = PTR_ERR(file); | 268 | error = PTR_ERR(file); |
260 | if (IS_ERR(file)) | 269 | if (IS_ERR(file)) |
261 | goto out; | 270 | goto out; |
@@ -372,7 +381,7 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid, | |||
372 | * filesystem root. | 381 | * filesystem root. |
373 | */ | 382 | */ |
374 | if (result->d_flags & DCACHE_DISCONNECTED) { | 383 | if (result->d_flags & DCACHE_DISCONNECTED) { |
375 | err = reconnect_path(mnt->mnt_sb, result); | 384 | err = reconnect_path(mnt, result); |
376 | if (err) | 385 | if (err) |
377 | goto err_result; | 386 | goto err_result; |
378 | } | 387 | } |
@@ -424,7 +433,7 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid, | |||
424 | * connected to the filesystem root. The VFS really doesn't | 433 | * connected to the filesystem root. The VFS really doesn't |
425 | * like disconnected directories.. | 434 | * like disconnected directories.. |
426 | */ | 435 | */ |
427 | err = reconnect_path(mnt->mnt_sb, target_dir); | 436 | err = reconnect_path(mnt, target_dir); |
428 | if (err) { | 437 | if (err) { |
429 | dput(target_dir); | 438 | dput(target_dir); |
430 | goto err_result; | 439 | goto err_result; |
@@ -435,7 +444,7 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid, | |||
435 | * dentry for the inode we're after, make sure that our | 444 | * dentry for the inode we're after, make sure that our |
436 | * inode is actually connected to the parent. | 445 | * inode is actually connected to the parent. |
437 | */ | 446 | */ |
438 | err = exportfs_get_name(target_dir, nbuf, result); | 447 | err = exportfs_get_name(mnt, target_dir, nbuf, result); |
439 | if (!err) { | 448 | if (!err) { |
440 | mutex_lock(&target_dir->d_inode->i_mutex); | 449 | mutex_lock(&target_dir->d_inode->i_mutex); |
441 | nresult = lookup_one_len(nbuf, target_dir, | 450 | nresult = lookup_one_len(nbuf, target_dir, |
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h index 0b4a771b4903..51d214138814 100644 --- a/include/linux/exportfs.h +++ b/include/linux/exportfs.h | |||
@@ -55,30 +55,8 @@ struct fid { | |||
55 | * @get_parent: find the parent of a given directory | 55 | * @get_parent: find the parent of a given directory |
56 | * @get_dentry: find a dentry for the inode given a file handle sub-fragment | 56 | * @get_dentry: find a dentry for the inode given a file handle sub-fragment |
57 | * | 57 | * |
58 | * Description: | 58 | * See Documentation/filesystems/Exporting for details on how to use |
59 | * The export_operations structure provides a means for nfsd to communicate | 59 | * this interface correctly. |
60 | * with a particular exported file system - particularly enabling nfsd and | ||
61 | * the filesystem to co-operate when dealing with file handles. | ||
62 | * | ||
63 | * export_operations contains two basic operation for dealing with file | ||
64 | * handles, decode_fh() and encode_fh(), and allows for some other | ||
65 | * operations to be defined which standard helper routines use to get | ||
66 | * specific information from the filesystem. | ||
67 | * | ||
68 | * nfsd encodes information use to determine which filesystem a filehandle | ||
69 | * applies to in the initial part of the file handle. The remainder, termed | ||
70 | * a file handle fragment, is controlled completely by the filesystem. The | ||
71 | * standard helper routines assume that this fragment will contain one or | ||
72 | * two sub-fragments, one which identifies the file, and one which may be | ||
73 | * used to identify the (a) directory containing the file. | ||
74 | * | ||
75 | * In some situations, nfsd needs to get a dentry which is connected into a | ||
76 | * specific part of the file tree. To allow for this, it passes the | ||
77 | * function acceptable() together with a @context which can be used to see | ||
78 | * if the dentry is acceptable. As there can be multiple dentrys for a | ||
79 | * given file, the filesystem should check each one for acceptability before | ||
80 | * looking for the next. As soon as an acceptable one is found, it should | ||
81 | * be returned. | ||
82 | * | 60 | * |
83 | * encode_fh: | 61 | * encode_fh: |
84 | * @encode_fh should store in the file handle fragment @fh (using at most | 62 | * @encode_fh should store in the file handle fragment @fh (using at most |