Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- | Documentation/filesystems/autofs4-mount-control.txt | 393 | ||||
-rw-r--r-- | Documentation/filesystems/ext3.txt | 8 | ||||
-rw-r--r-- | Documentation/filesystems/ext4.txt | 32 | ||||
-rw-r--r-- | Documentation/filesystems/nfsroot.txt | 2 | ||||
-rw-r--r-- | Documentation/filesystems/proc.txt | 40 | ||||
-rw-r--r-- | Documentation/filesystems/ramfs-rootfs-initramfs.txt | 2 | ||||
-rw-r--r-- | Documentation/filesystems/ubifs.txt | 9 |
7 files changed, 455 insertions, 31 deletions
diff --git a/Documentation/filesystems/autofs4-mount-control.txt b/Documentation/filesystems/autofs4-mount-control.txt new file mode 100644 index 000000000000..c6341745df37 --- /dev/null +++ b/Documentation/filesystems/autofs4-mount-control.txt | |||
@@ -0,0 +1,393 @@ | |||
1 | |||
2 | Miscellaneous Device control operations for the autofs4 kernel module | ||
3 | ==================================================================== | ||
4 | |||
5 | The problem | ||
6 | =========== | ||
7 | |||
8 | There is a problem with active restarts in autofs (that is to say | ||
9 | restarting autofs when there are busy mounts). | ||
10 | |||
11 | During normal operation autofs uses a file descriptor opened on the | ||
12 | directory that is being managed in order to be able to issue control | ||
13 | operations. Using a file descriptor gives ioctl operations access to | ||
14 | autofs specific information stored in the super block. The operations | ||
15 | are things such as setting an autofs mount catatonic, setting the | ||
16 | expire timeout and requesting expire checks. As is explained below, | ||
17 | certain types of autofs triggered mounts can end up covering an autofs | ||
18 | mount itself which prevents us being able to use open(2) to obtain a | ||
19 | file descriptor for these operations if we don't already have one open. | ||
20 | |||
21 | Currently autofs uses "umount -l" (lazy umount) to clear active mounts | ||
22 | at restart. While using lazy umount works for most cases, anything that | ||
23 | needs to walk back up the mount tree to construct a path, such as | ||
24 | getcwd(2) and the proc file system /proc/<pid>/cwd, no longer works | ||
25 | because the point from which the path is constructed has been detached | ||
26 | from the mount tree. | ||
27 | |||
28 | The actual problem with autofs is that it can't reconnect to existing | ||
29 | mounts. One immediately thinks that just adding the ability to remount | ||
30 | autofs file systems would solve it, but alas, that can't work. This is | ||
31 | because autofs direct mounts and the implementation of "on demand mount | ||
32 | and expire" of nested mount trees have the file system mounted directly | ||
33 | on top of the mount trigger directory dentry. | ||
34 | |||
35 | For example, there are two types of automount maps, direct (in the kernel | ||
36 | module source you will see a third type called an offset, which is just | ||
37 | a direct mount in disguise) and indirect. | ||
38 | |||
39 | Here is a master map with direct and indirect map entries: | ||
40 | |||
41 | /- /etc/auto.direct | ||
42 | /test /etc/auto.indirect | ||
43 | |||
44 | and the corresponding map files: | ||
45 | |||
46 | /etc/auto.direct: | ||
47 | |||
48 | /automount/dparse/g6 budgie:/autofs/export1 | ||
49 | /automount/dparse/g1 shark:/autofs/export1 | ||
50 | and so on. | ||
51 | |||
52 | /etc/auto.indirect: | ||
53 | |||
54 | g1 shark:/autofs/export1 | ||
55 | g6 budgie:/autofs/export1 | ||
56 | and so on. | ||
57 | |||
58 | For the above indirect map an autofs file system is mounted on /test and | ||
59 | mounts are triggered for each sub-directory key by the inode lookup | ||
60 | operation. So we see a mount of shark:/autofs/export1 on /test/g1, for | ||
61 | example. | ||
62 | |||
63 | The way that direct mounts are handled is by making an autofs mount on | ||
64 | each full path, such as /automount/dparse/g1, and using it as a mount | ||
65 | trigger. So when we walk on the path we mount shark:/autofs/export1 "on | ||
66 | top of this mount point". Since these are always directories we can | ||
67 | use the follow_link inode operation to trigger the mount. | ||
68 | |||
69 | But, each entry in direct and indirect maps can have offsets (making | ||
70 | them multi-mount map entries). | ||
71 | |||
72 | For example, an indirect mount map entry could also be: | ||
73 | |||
74 | g1 \ | ||
75 | / shark:/autofs/export5/testing/test \ | ||
76 | /s1 shark:/autofs/export/testing/test/s1 \ | ||
77 | /s2 shark:/autofs/export5/testing/test/s2 \ | ||
78 | /s1/ss1 shark:/autofs/export1 \ | ||
79 | /s2/ss2 shark:/autofs/export2 | ||
80 | |||
81 | and similarly a direct mount map entry could also be: | ||
82 | |||
83 | /automount/dparse/g1 \ | ||
84 | / shark:/autofs/export5/testing/test \ | ||
85 | /s1 shark:/autofs/export/testing/test/s1 \ | ||
86 | /s2 shark:/autofs/export5/testing/test/s2 \ | ||
87 | /s1/ss1 shark:/autofs/export2 \ | ||
88 | /s2/ss2 shark:/autofs/export2 | ||
89 | |||
90 | One of the issues with version 4 of autofs was that, when mounting an | ||
91 | entry with a large number of offsets, possibly with nesting, we needed | ||
92 | to mount and umount all of the offsets as a single unit. Not really a | ||
93 | problem, except for people with a large number of offsets in map entries. | ||
94 | This mechanism is used for the well known "hosts" map and we have seen | ||
95 | cases (in 2.4) where the available number of mounts is exhausted or | ||
96 | where the number of privileged ports available is exhausted. | ||
97 | |||
98 | In version 5 we mount only as we go down the tree of offsets and | ||
99 | similarly for expiring them which resolves the above problem. There is | ||
100 | somewhat more detail to the implementation but it isn't needed for the | ||
101 | sake of the problem explanation. The one important detail is that these | ||
102 | offsets are implemented using the same mechanism as the direct mounts | ||
103 | above and so the mount points can be covered by a mount. | ||
104 | |||
105 | The current autofs implementation uses an ioctl file descriptor opened | ||
106 | on the mount point for control operations. The references held by the | ||
107 | descriptor are accounted for in checks made to determine if a mount is | ||
108 | in use, and the descriptor is also used to access autofs file system | ||
109 | information held in the mount super block. So the use of a file handle | ||
110 | needs to be retained. | ||
111 | |||
112 | |||
113 | The Solution | ||
114 | ============ | ||
115 | |||
116 | To be able to restart autofs leaving existing direct, indirect and | ||
117 | offset mounts in place we need to be able to obtain a file handle | ||
118 | for these potentially covered autofs mount points. Rather than just | ||
119 | implement an isolated operation it was decided to re-implement the | ||
120 | existing ioctl interface and add new operations to provide this | ||
121 | functionality. | ||
122 | |||
123 | In addition, to be able to reconstruct a mount tree that has busy mounts, | ||
124 | the uid and gid of the last user that triggered the mount needs to be | ||
125 | available because these can be used as macro substitution variables in | ||
126 | autofs maps. They are recorded at mount request time and an operation | ||
127 | has been added to retrieve them. | ||
128 | |||
129 | Since we're re-implementing the control interface, a couple of other | ||
130 | problems with the existing interface have been addressed. First, when | ||
131 | a mount or expire operation completes a status is returned to the | ||
132 | kernel by either a "send ready" or a "send fail" operation. The | ||
133 | "send fail" operation of the ioctl interface could only ever send | ||
134 | ENOENT so the re-implementation allows user space to send an actual | ||
135 | status. Another expensive operation in user space, for those using | ||
136 | very large maps, is discovering if a mount is present. Usually this | ||
137 | involves scanning /proc/mounts and since it needs to be done quite | ||
138 | often it can introduce significant overhead when there are many entries | ||
139 | in the mount table. An operation to lookup the mount status of a mount | ||
140 | point dentry (covered or not) has also been added. | ||
141 | |||
142 | Current kernel development policy recommends avoiding the use of the | ||
143 | ioctl mechanism in favor of systems such as Netlink. An implementation | ||
144 | using this system was attempted to evaluate its suitability and it was | ||
145 | found to be inadequate, in this case. The Generic Netlink system was | ||
146 | used for this as raw Netlink would lead to a significant increase in | ||
147 | complexity. There's no question that the Generic Netlink system is an | ||
148 | elegant solution for common case ioctl functions but it's not a complete | ||
149 | replacement probably because its primary purpose in life is to be a | ||
150 | message bus implementation rather than specifically an ioctl replacement. | ||
151 | While it would be possible to work around this there is one concern | ||
152 | that led to the decision to not use it. This is that the autofs | ||
153 | expire in the daemon has become far too complex because umount | ||
154 | candidates are enumerated, almost for no other reason than to "count" | ||
155 | the number of times to call the expire ioctl. This involves scanning | ||
156 | the mount table which has proved to be a big overhead for users with | ||
157 | large maps. The best way to improve this is to try to get back to the | ||
158 | way the expire was done long ago. That is, when an expire request is | ||
159 | issued for a mount (file handle) we should continually call back to | ||
160 | the daemon until we can't umount any more mounts, then return the | ||
161 | appropriate status to the daemon. At the moment we just expire one | ||
162 | mount at a time. A Generic Netlink implementation would exclude this | ||
163 | possibility for future development due to the requirements of the | ||
164 | message bus architecture. | ||
165 | |||
166 | |||
167 | autofs4 Miscellaneous Device mount control interface | ||
168 | ==================================================== | ||
169 | |||
170 | The control interface is accessed by opening a device node, typically /dev/autofs. | ||
171 | |||
172 | All the ioctls use a common structure to pass the needed parameter | ||
173 | information and return operation results: | ||
174 | |||
175 | struct autofs_dev_ioctl { | ||
176 | __u32 ver_major; | ||
177 | __u32 ver_minor; | ||
178 | __u32 size; /* total size of data passed in | ||
179 | * including this struct */ | ||
180 | __s32 ioctlfd; /* automount command fd */ | ||
181 | |||
182 | __u32 arg1; /* Command parameters */ | ||
183 | __u32 arg2; | ||
184 | |||
185 | char path[0]; | ||
186 | }; | ||
187 | |||
188 | The ioctlfd field is a mount point file descriptor of an autofs mount | ||
189 | point. It is returned by the open call and is used by all calls except | ||
190 | the check for whether a given path is a mount point, where it may | ||
191 | optionally be used to check a specific mount corresponding to a given | ||
192 | mount point file descriptor, and when requesting the uid and gid of the | ||
193 | last successful mount on a directory within the autofs file system. | ||
194 | |||
195 | The fields arg1 and arg2 are used to communicate parameters and results of | ||
196 | calls made as described below. | ||
197 | |||
198 | The path field is used to pass a path where it is needed and the size field | ||
199 | is used to account for the increased structure length when translating the | ||
200 | structure sent from user space. | ||
201 | |||
202 | This structure can be initialized before setting specific fields by using | ||
203 | the void function call init_autofs_dev_ioctl(struct autofs_dev_ioctl *). | ||
204 | |||
205 | All of the ioctls perform a copy of this structure from user space to | ||
206 | kernel space and return -EINVAL if the size parameter is smaller than | ||
207 | the structure size itself, -ENOMEM if the kernel memory allocation fails | ||
208 | or -EFAULT if the copy itself fails. Other checks include a version check | ||
209 | of the compiled in user space version against the module version and a | ||
210 | mismatch results in a -EINVAL return. If the size field is greater than | ||
211 | the structure size then a path is assumed to be present and is checked to | ||
212 | ensure it begins with a "/" and is NULL terminated, otherwise -EINVAL is | ||
213 | returned. Following these checks, for all ioctl commands except | ||
214 | AUTOFS_DEV_IOCTL_VERSION_CMD, AUTOFS_DEV_IOCTL_OPENMOUNT_CMD and | ||
215 | AUTOFS_DEV_IOCTL_CLOSEMOUNT_CMD the ioctlfd is validated and if it is | ||
216 | not a valid descriptor or doesn't correspond to an autofs mount point | ||
217 | an error of -EBADF, -ENOTTY or -EINVAL (not an autofs descriptor) is | ||
218 | returned. | ||
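
The parameter checks described above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the structure is mirrored locally so the sketch builds without kernel headers, the names sketch_init_param() and sketch_validate_param() and the SKETCH_VER_* values are hypothetical, and any version difference is treated as a mismatch, which may be stricter than what the module does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the structure shown above (assumption: same layout). */
struct autofs_dev_ioctl {
	uint32_t ver_major;
	uint32_t ver_minor;
	uint32_t size;		/* total size of data passed in, including this struct */
	int32_t ioctlfd;	/* automount command fd */
	uint32_t arg1;		/* command parameters */
	uint32_t arg2;
	char path[];		/* C99 flexible array member in place of path[0] */
};

/* Hypothetical version numbers, for illustration only. */
#define SKETCH_VER_MAJOR 1u
#define SKETCH_VER_MINOR 0u

/* Zero the structure and fill in the fields every call needs. */
void sketch_init_param(struct autofs_dev_ioctl *p)
{
	assert(p != NULL);
	memset(p, 0, sizeof(*p));
	p->ver_major = SKETCH_VER_MAJOR;
	p->ver_minor = SKETCH_VER_MINOR;
	p->size = sizeof(*p);
	p->ioctlfd = -1;
}

/* Mirror of the documented checks: size, version, then the optional path. */
int sketch_validate_param(const struct autofs_dev_ioctl *p)
{
	assert(p != NULL);
	if (p->size < sizeof(*p))
		return -EINVAL;		/* smaller than the structure itself */
	if (p->ver_major != SKETCH_VER_MAJOR || p->ver_minor != SKETCH_VER_MINOR)
		return -EINVAL;		/* version mismatch (simplified) */
	if (p->size > sizeof(*p)) {
		size_t path_len = p->size - sizeof(*p);

		/* A path is assumed present: must start with "/" and end in NUL. */
		if (p->path[0] != '/' || p->path[path_len - 1] != '\0')
			return -EINVAL;
	}
	return 0;
}
```

The -ENOMEM and -EFAULT cases are not modelled, since they arise only from the kernel-side allocation and copy.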
219 | |||
220 | |||
221 | The ioctls | ||
222 | ========== | ||
223 | |||
224 | An example of an implementation which uses this interface can be seen | ||
225 | in autofs version 5.0.4 and later in file lib/dev-ioctl-lib.c of the | ||
226 | distribution tar available for download from kernel.org in directory | ||
227 | /pub/linux/daemons/autofs/v5. | ||
228 | |||
229 | The device node ioctl operations implemented by this interface are: | ||
230 | |||
231 | |||
232 | AUTOFS_DEV_IOCTL_VERSION | ||
233 | ------------------------ | ||
234 | |||
235 | Get the major and minor version of the autofs4 device ioctl kernel module | ||
236 | implementation. It requires an initialized struct autofs_dev_ioctl as an | ||
237 | input parameter and sets the version information in the passed in structure. | ||
238 | It returns 0 on success or the error -EINVAL if a version mismatch is | ||
239 | detected. | ||
240 | |||
241 | |||
242 | AUTOFS_DEV_IOCTL_PROTOVER_CMD and AUTOFS_DEV_IOCTL_PROTOSUBVER_CMD | ||
243 | ------------------------------------------------------------------ | ||
244 | |||
245 | Get the major and minor version of the autofs4 protocol version understood | ||
246 | by the loaded module. This call requires an initialized struct autofs_dev_ioctl | ||
247 | with the ioctlfd field set to a valid autofs mount point descriptor | ||
248 | and sets the requested version number in structure field arg1. These | ||
249 | commands return 0 on success or one of the negative error codes if | ||
250 | validation fails. | ||
251 | |||
252 | |||
253 | AUTOFS_DEV_IOCTL_OPENMOUNT and AUTOFS_DEV_IOCTL_CLOSEMOUNT | ||
254 | ---------------------------------------------------------- | ||
255 | |||
256 | Obtain and release a file descriptor for an autofs managed mount point | ||
257 | path. The open call requires an initialized struct autofs_dev_ioctl with | ||
258 | the path field set and the size field adjusted appropriately as well | ||
259 | as the arg1 field set to the device number of the autofs mount. The | ||
260 | device number can be obtained from the mount options shown in | ||
261 | /proc/mounts. The close call requires an initialized struct | ||
262 | autofs_dev_ioctl with the ioctlfd field set to the descriptor obtained | ||
263 | from the open call. The release of the file descriptor can also be done | ||
264 | with close(2) so any open descriptors will also be closed at process exit. | ||
265 | The close call is included in the implemented operations largely for | ||
266 | completeness and to provide for a consistent user space implementation. | ||
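
Preparing the parameter block for the open call might look like the sketch below. It assumes a local mirror of the structure, a hypothetical helper name sketch_alloc_openmount() and hypothetical SKETCH_VER_* values; the ioctl call itself is left out so that the sketch does not depend on /dev/autofs being present.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of the structure shown earlier (assumption: same layout). */
struct autofs_dev_ioctl {
	uint32_t ver_major;
	uint32_t ver_minor;
	uint32_t size;
	int32_t ioctlfd;
	uint32_t arg1;
	uint32_t arg2;
	char path[];
};

/* Hypothetical version numbers, for illustration only. */
#define SKETCH_VER_MAJOR 1u
#define SKETCH_VER_MINOR 0u

/*
 * Allocate a parameter block with room for the path, set the size field to
 * cover both the structure and the NUL-terminated path, and store the device
 * number of the autofs mount in arg1. The caller frees the result.
 */
struct autofs_dev_ioctl *sketch_alloc_openmount(const char *path, uint32_t devid)
{
	struct autofs_dev_ioctl *p;
	size_t path_size;

	assert(path != NULL);
	path_size = strlen(path) + 1;	/* include the terminating NUL */
	p = calloc(1, sizeof(*p) + path_size);
	if (p == NULL)
		return NULL;
	p->ver_major = SKETCH_VER_MAJOR;
	p->ver_minor = SKETCH_VER_MINOR;
	p->size = (uint32_t)(sizeof(*p) + path_size);
	p->ioctlfd = -1;		/* no descriptor yet; the open call returns one */
	p->arg1 = devid;
	memcpy(p->path, path, path_size);
	return p;
}
```

A caller would pass the returned block to the open ioctl on the control device and then read the new descriptor from the ioctlfd field.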
267 | |||
268 | |||
269 | AUTOFS_DEV_IOCTL_READY_CMD and AUTOFS_DEV_IOCTL_FAIL_CMD | ||
270 | -------------------------------------------------------- | ||
271 | |||
272 | Return mount and expire result status from user space to the kernel. | ||
273 | Both of these calls require an initialized struct autofs_dev_ioctl | ||
274 | with the ioctlfd field set to the descriptor obtained from the open | ||
275 | call and the arg1 field set to the wait queue token number, received | ||
276 | by user space in the foregoing mount or expire request. The arg2 field | ||
277 | is set to the status to be returned. For the ready call this is always | ||
278 | 0 and for the fail call it is set to the errno of the operation. | ||
279 | |||
280 | |||
281 | AUTOFS_DEV_IOCTL_SETPIPEFD_CMD | ||
282 | ------------------------------ | ||
283 | |||
284 | Set the pipe file descriptor used for kernel communication to the daemon. | ||
285 | Normally this is set at mount time using an option but when reconnecting | ||
286 | to an existing mount we need to use this to tell the autofs mount about | ||
287 | the new kernel pipe descriptor. In order to protect mounts against | ||
288 | incorrectly setting the pipe descriptor we also require that the autofs | ||
289 | mount be catatonic (see next call). | ||
290 | |||
291 | The call requires an initialized struct autofs_dev_ioctl with the | ||
292 | ioctlfd field set to the descriptor obtained from the open call and | ||
293 | the arg1 field set to descriptor of the pipe. On success the call | ||
294 | also sets the process group id used to identify the controlling process | ||
295 | (e.g. the owning automount(8) daemon) to the process group of the caller. | ||
296 | |||
297 | |||
298 | AUTOFS_DEV_IOCTL_CATATONIC_CMD | ||
299 | ------------------------------ | ||
300 | |||
301 | Make the autofs mount point catatonic. The autofs mount will no longer | ||
302 | issue mount requests, the kernel communication pipe descriptor is released | ||
303 | and any remaining waits in the queue are released. | ||
304 | |||
305 | The call requires an initialized struct autofs_dev_ioctl with the | ||
306 | ioctlfd field set to the descriptor obtained from the open call. | ||
307 | |||
308 | |||
309 | AUTOFS_DEV_IOCTL_TIMEOUT_CMD | ||
310 | ---------------------------- | ||
311 | |||
312 | Set the expire timeout for mounts within an autofs mount point. | ||
313 | |||
314 | The call requires an initialized struct autofs_dev_ioctl with the | ||
315 | ioctlfd field set to the descriptor obtained from the open call. | ||
316 | |||
317 | |||
318 | AUTOFS_DEV_IOCTL_REQUESTER_CMD | ||
319 | ------------------------------ | ||
320 | |||
321 | Return the uid and gid of the last process to successfully trigger a | ||
322 | mount on the given path dentry. | ||
323 | |||
324 | The call requires an initialized struct autofs_dev_ioctl with the path | ||
325 | field set to the mount point in question and the size field adjusted | ||
326 | appropriately as well as the arg1 field set to the device number of the | ||
327 | containing autofs mount. Upon return the struct field arg1 contains the | ||
328 | uid and arg2 the gid. | ||
329 | |||
330 | When reconstructing an autofs mount tree with active mounts we need to | ||
331 | re-connect to mounts that may have used the original process uid and | ||
332 | gid (or string variations of them) for mount lookups within the map entry. | ||
333 | This call provides the ability to obtain this uid and gid so they may be | ||
334 | used by user space for the mount map lookups. | ||
335 | |||
336 | |||
337 | AUTOFS_DEV_IOCTL_EXPIRE_CMD | ||
338 | --------------------------- | ||
339 | |||
340 | Issue an expire request to the kernel for an autofs mount. Typically | ||
341 | this ioctl is called until no further expire candidates are found. | ||
342 | |||
343 | The call requires an initialized struct autofs_dev_ioctl with the | ||
344 | ioctlfd field set to the descriptor obtained from the open call. In | ||
345 | addition an immediate expire, independent of the mount timeout, can be | ||
346 | requested by setting the arg1 field to 1. If no expire candidates can | ||
347 | be found the ioctl returns -1 with errno set to EAGAIN. | ||
348 | |||
349 | This call causes the kernel module to check the mount corresponding | ||
350 | to the given ioctlfd for mounts that can be expired, issues an expire | ||
351 | request back to the daemon and waits for completion. | ||
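
The "call until no further expire candidates are found" pattern can be sketched as a loop that stops on EAGAIN. The callback type sketch_expire_fn is a hypothetical stand-in for the actual expire ioctl on the control device, and the sketch assumes the call returns 0 after each successful expire; sketch_fake_expire() exists only to make the example self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one expire ioctl call: returns 0 when a mount
 * was expired, or -1 with errno set (EAGAIN when no candidates remain).
 */
typedef int (*sketch_expire_fn)(void *ctx);

/*
 * Keep requesting expires until the kernel reports no more candidates.
 * Returns the number of successful expires, or -1 on any other error.
 */
int sketch_expire_all(sketch_expire_fn expire_once, void *ctx)
{
	int expired = 0;

	assert(expire_once != NULL);
	for (;;) {
		if (expire_once(ctx) == 0) {
			expired++;
			continue;
		}
		if (errno == EAGAIN)
			return expired;	/* nothing left to expire */
		return -1;		/* unexpected failure; errno says why */
	}
}

/* Fake callback: succeeds *remaining times, then reports EAGAIN. */
int sketch_fake_expire(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return 0;
	}
	errno = EAGAIN;
	return -1;
}
```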
352 | |||
353 | AUTOFS_DEV_IOCTL_ASKUMOUNT_CMD | ||
354 | ------------------------------ | ||
355 | |||
356 | Checks if an autofs mount point is in use. | ||
357 | |||
358 | The call requires an initialized struct autofs_dev_ioctl with the | ||
359 | ioctlfd field set to the descriptor obtained from the open call and | ||
360 | it returns the result in the arg1 field, 1 for busy and 0 otherwise. | ||
361 | |||
362 | |||
363 | AUTOFS_DEV_IOCTL_ISMOUNTPOINT_CMD | ||
364 | --------------------------------- | ||
365 | |||
366 | Check if the given path is a mountpoint. | ||
367 | |||
368 | The call requires an initialized struct autofs_dev_ioctl. There are two | ||
369 | possible variations. Both use the path field set to the path of the mount | ||
370 | point to check and the size field adjusted appropriately. One uses the | ||
371 | ioctlfd field to identify a specific mount point to check while the other | ||
372 | variation uses the path and optionally arg1 set to an autofs mount type. | ||
373 | The call returns 1 if this is a mount point and sets arg1 to the device | ||
374 | number of the mount and field arg2 to the relevant super block magic | ||
375 | number (described below) or 0 if it isn't a mountpoint. In both cases | ||
376 | the device number (as returned by new_encode_dev()) is returned | ||
377 | in field arg1. | ||
378 | |||
379 | If supplied with a file descriptor we're looking for a specific mount, | ||
380 | not necessarily at the top of the mounted stack. In this case the path | ||
381 | the descriptor corresponds to is considered a mountpoint if it is itself | ||
382 | a mountpoint or contains a mount, such as a multi-mount without a root | ||
383 | mount. In this case we return 1 if the descriptor corresponds to a mount | ||
384 | point and also return the super magic of the covering mount if there | ||
385 | is one or 0 if it isn't a mountpoint. | ||
386 | |||
387 | If a path is supplied (and the ioctlfd field is set to -1) then the path | ||
388 | is looked up and is checked to see if it is the root of a mount. If a | ||
389 | type is also given we are looking for a particular autofs mount and if | ||
390 | a match isn't found a fail is returned. If the located path is the | ||
391 | root of a mount 1 is returned along with the super magic of the mount | ||
392 | or 0 otherwise. | ||
393 | |||
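
The device number returned in arg1 uses the kernel's new_encode_dev() layout. The sketch below reproduces that bit layout in user space so the major and minor numbers can be recovered; the sketch_* names are hypothetical, only the arithmetic follows the kernel helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Same layout as the kernel's new_encode_dev(): low 8 bits of the minor in
 * bits 0-7, the 12-bit major in bits 8-19, remaining minor bits in 20-31.
 */
uint32_t sketch_encode_dev(unsigned int major, unsigned int minor)
{
	assert(major < (1u << 12) && minor < (1u << 20));
	return (minor & 0xffu) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Extract the major number from an encoded device number. */
unsigned int sketch_dev_major(uint32_t dev)
{
	return (dev & 0xfff00u) >> 8;
}

/* Extract the minor number from an encoded device number. */
unsigned int sketch_dev_minor(uint32_t dev)
{
	return (dev & 0xffu) | ((dev >> 12) & 0xfff00u);
}
```

For small minors the encoded value is simply (major << 8) | minor, so an anonymous device with major 0 and minor 21 encodes as 21.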
diff --git a/Documentation/filesystems/ext3.txt b/Documentation/filesystems/ext3.txt index b45f3c1b8b43..9dd2a3bb2acc 100644 --- a/Documentation/filesystems/ext3.txt +++ b/Documentation/filesystems/ext3.txt | |||
@@ -96,6 +96,11 @@ errors=remount-ro(*) Remount the filesystem read-only on an error. | |||
96 | errors=continue Keep going on a filesystem error. | 96 | errors=continue Keep going on a filesystem error. |
97 | errors=panic Panic and halt the machine if an error occurs. | 97 | errors=panic Panic and halt the machine if an error occurs. |
98 | 98 | ||
99 | data_err=ignore(*) Just print an error message if an error occurs | ||
100 | in a file data buffer in ordered mode. | ||
101 | data_err=abort Abort the journal if an error occurs in a file | ||
102 | data buffer in ordered mode. | ||
103 | |||
99 | grpid Give objects the same group ID as their creator. | 104 | grpid Give objects the same group ID as their creator. |
100 | bsdgroups | 105 | bsdgroups |
101 | 106 | ||
@@ -193,6 +198,5 @@ kernel source: <file:fs/ext3/> | |||
193 | programs: http://e2fsprogs.sourceforge.net/ | 198 | programs: http://e2fsprogs.sourceforge.net/ |
194 | http://ext2resize.sourceforge.net | 199 | http://ext2resize.sourceforge.net |
195 | 200 | ||
196 | useful links: http://www.zip.com.au/~akpm/linux/ext3/ext3-usage.html | 201 | useful links: http://www-106.ibm.com/developerworks/linux/library/l-fs7/ |
197 | http://www-106.ibm.com/developerworks/linux/library/l-fs7/ | ||
198 | http://www-106.ibm.com/developerworks/linux/library/l-fs8/ | 202 | http://www-106.ibm.com/developerworks/linux/library/l-fs8/ |
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index eb154ef36c2a..174eaff7ded9 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt | |||
@@ -2,19 +2,24 @@ | |||
2 | Ext4 Filesystem | 2 | Ext4 Filesystem |
3 | =============== | 3 | =============== |
4 | 4 | ||
5 | This is a development version of the ext4 filesystem, an advanced level | 5 | Ext4 is an advanced level of the ext3 filesystem which incorporates |
6 | of the ext3 filesystem which incorporates scalability and reliability | 6 | scalability and reliability enhancements for supporting large filesystems |
7 | enhancements for supporting large filesystems (64 bit) in keeping with | 7 | (64 bit) in keeping with increasing disk capacities and state-of-the-art |
8 | increasing disk capacities and state-of-the-art feature requirements. | 8 | feature requirements. |
9 | 9 | ||
10 | Mailing list: linux-ext4@vger.kernel.org | 10 | Mailing list: linux-ext4@vger.kernel.org |
11 | Web site: http://ext4.wiki.kernel.org | ||
11 | 12 | ||
12 | 13 | ||
13 | 1. Quick usage instructions: | 14 | 1. Quick usage instructions: |
14 | =========================== | 15 | =========================== |
15 | 16 | ||
17 | Note: More extensive information for getting started with ext4 can be | ||
18 | found at the ext4 wiki site at the URL: | ||
19 | http://ext4.wiki.kernel.org/index.php/Ext4_Howto | ||
20 | |||
16 | - Compile and install the latest version of e2fsprogs (as of this | 21 | - Compile and install the latest version of e2fsprogs (as of this |
17 | writing version 1.41) from: | 22 | writing version 1.41.3) from: |
18 | 23 | ||
19 | http://sourceforge.net/project/showfiles.php?group_id=2406 | 24 | http://sourceforge.net/project/showfiles.php?group_id=2406 |
20 | 25 | ||
@@ -36,11 +41,9 @@ Mailing list: linux-ext4@vger.kernel.org | |||
36 | 41 | ||
37 | # mke2fs -t ext4 /dev/hda1 | 42 | # mke2fs -t ext4 /dev/hda1 |
38 | 43 | ||
39 | Or configure an existing ext3 filesystem to support extents and set | 44 | Or to configure an existing ext3 filesystem to support extents: |
40 | the test_fs flag to indicate that it's ok for an in-development | ||
41 | filesystem to touch this filesystem: | ||
42 | 45 | ||
43 | # tune2fs -O extents -E test_fs /dev/hda1 | 46 | # tune2fs -O extents /dev/hda1 |
44 | 47 | ||
45 | If the filesystem was created with 128 byte inodes, it can be | 48 | If the filesystem was created with 128 byte inodes, it can be |
46 | converted to use 256 byte for greater efficiency via: | 49 | converted to use 256 byte for greater efficiency via: |
@@ -104,8 +107,8 @@ exist yet so I'm not sure they're in the near-term roadmap. | |||
104 | The big performance win will come with mballoc, delalloc and flex_bg | 107 | The big performance win will come with mballoc, delalloc and flex_bg |
105 | grouping of bitmaps and inode tables. Some test results available here: | 108 | grouping of bitmaps and inode tables. Some test results available here: |
106 | 109 | ||
107 | - http://www.bullopensource.org/ext4/20080530/ffsb-write-2.6.26-rc2.html | 110 | - http://www.bullopensource.org/ext4/20080818-ffsb/ffsb-write-2.6.27-rc1.html |
108 | - http://www.bullopensource.org/ext4/20080530/ffsb-readwrite-2.6.26-rc2.html | 111 | - http://www.bullopensource.org/ext4/20080818-ffsb/ffsb-readwrite-2.6.27-rc1.html |
109 | 112 | ||
110 | 3. Options | 113 | 3. Options |
111 | ========== | 114 | ========== |
@@ -214,9 +217,6 @@ noreservation | |||
214 | bsddf (*) Make 'df' act like BSD. | 217 | bsddf (*) Make 'df' act like BSD. |
215 | minixdf Make 'df' act like Minix. | 218 | minixdf Make 'df' act like Minix. |
216 | 219 | ||
217 | check=none Don't do extra checking of bitmaps on mount. | ||
218 | nocheck | ||
219 | |||
220 | debug Extra debugging information is sent to syslog. | 220 | debug Extra debugging information is sent to syslog. |
221 | 221 | ||
222 | errors=remount-ro(*) Remount the filesystem read-only on an error. | 222 | errors=remount-ro(*) Remount the filesystem read-only on an error. |
@@ -253,8 +253,6 @@ nobh (a) cache disk block mapping information | |||
253 | "nobh" option tries to avoid associating buffer | 253 | "nobh" option tries to avoid associating buffer |
254 | heads (supported only for "writeback" mode). | 254 | heads (supported only for "writeback" mode). |
255 | 255 | ||
256 | mballoc (*) Use the multiple block allocator for block allocation | ||
257 | nomballoc disabled multiple block allocator for block allocation. | ||
258 | stripe=n Number of filesystem blocks that mballoc will try | 256 | stripe=n Number of filesystem blocks that mballoc will try |
259 | to use for allocation size and alignment. For RAID5/6 | 257 | to use for allocation size and alignment. For RAID5/6 |
260 | systems this should be the number of data | 258 | systems this should be the number of data |
diff --git a/Documentation/filesystems/nfsroot.txt b/Documentation/filesystems/nfsroot.txt index 31b329172343..68baddf3c3e0 100644 --- a/Documentation/filesystems/nfsroot.txt +++ b/Documentation/filesystems/nfsroot.txt | |||
@@ -169,7 +169,7 @@ They depend on various facilities being available: | |||
169 | 3.1) Booting from a floppy using syslinux | 169 | 3.1) Booting from a floppy using syslinux |
170 | 170 | ||
171 | When building kernels, an easy way to create a boot floppy that uses | 171 | When building kernels, an easy way to create a boot floppy that uses |
172 | syslinux is to use the zdisk or bzdisk make targets which use | 172 | syslinux is to use the zdisk or bzdisk make targets which use zimage |
173 | and bzimage images respectively. Both targets accept the | 173 | and bzimage images respectively. Both targets accept the |
174 | FDARGS parameter which can be used to set the kernel command line. | 174 | FDARGS parameter which can be used to set the kernel command line. |
175 | 175 | ||
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index b488edad743c..bcceb99b81dd 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt | |||
@@ -1321,6 +1321,18 @@ debugging information is displayed on console. | |||
1321 | NMI switch that most IA32 servers have fires unknown NMI up, for example. | 1321 | NMI switch that most IA32 servers have fires unknown NMI up, for example. |
1322 | If a system hangs up, try pressing the NMI switch. | 1322 | If a system hangs up, try pressing the NMI switch. |
1323 | 1323 | ||
1324 | panic_on_unrecovered_nmi | ||
1325 | ------------------------ | ||
1326 | |||
1327 | The default Linux behaviour on an NMI of either memory or unknown is to continue | ||
1328 | operation. For many environments such as scientific computing it is preferable | ||
1329 | that the box is taken out and the error dealt with than an uncorrected | ||
1330 | parity/ECC error get propagated. | ||
1331 | |||
1332 | A small number of systems do generate NMIs for bizarre random reasons such as | ||
1333 | power management so the default is off. That sysctl works like the existing | ||
1334 | panic controls already in that directory. | ||
1335 | |||
1324 | nmi_watchdog | 1336 | nmi_watchdog |
1325 | ------------ | 1337 | ------------ |
1326 | 1338 | ||
@@ -1372,15 +1384,18 @@ causes the kernel to prefer to reclaim dentries and inodes. | |||
1372 | dirty_background_ratio | 1384 | dirty_background_ratio |
1373 | ---------------------- | 1385 | ---------------------- |
1374 | 1386 | ||
1375 | Contains, as a percentage of total system memory, the number of pages at which | 1387 | Contains, as a percentage of the dirtyable system memory (free pages + mapped |
1376 | the pdflush background writeback daemon will start writing out dirty data. | 1388 | pages + file cache, not including locked pages and HugePages), the number of |
1389 | pages at which the pdflush background writeback daemon will start writing out | ||
1390 | dirty data. | ||
1377 | 1391 | ||
1378 | dirty_ratio | 1392 | dirty_ratio |
1379 | ----------------- | 1393 | ----------------- |
1380 | 1394 | ||
1381 | Contains, as a percentage of total system memory, the number of pages at which | 1395 | Contains, as a percentage of the dirtyable system memory (free pages + mapped |
1382 | a process which is generating disk writes will itself start writing out dirty | 1396 | pages + file cache, not including locked pages and HugePages), the number of |
1383 | data. | 1397 | pages at which a process which is generating disk writes will itself start |
1398 | writing out dirty data. | ||
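
The relationship between the two ratios and dirtyable memory described in this hunk can be sketched as simple arithmetic. This is only an illustration, not the kernel's implementation: the function name and the page counts are hypothetical, and the inputs are assumed to already exclude locked pages and HugePages.

```python
def dirty_threshold_pages(free_pages, mapped_pages, file_cache_pages, ratio_percent):
    """Return the page count at which writeback would start for a given ratio.

    Assumption: the three inputs already exclude locked pages and HugePages,
    so their sum is the "dirtyable" memory described in the text.
    """
    dirtyable = free_pages + mapped_pages + file_cache_pages
    return dirtyable * ratio_percent // 100


# Hypothetical page counts, chosen only for illustration.
free, mapped, cache = 100_000, 50_000, 250_000

# Threshold for background writeback (dirty_background_ratio = 5, hypothetical).
background = dirty_threshold_pages(free, mapped, cache, 5)

# Threshold at which a writing process starts writeback itself (dirty_ratio = 10, hypothetical).
foreground = dirty_threshold_pages(free, mapped, cache, 10)

print(background, foreground)
```

With the same ratios, a system that has much of its memory locked or in HugePages gets lower thresholds than a percentage of total memory would give.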
1384 | 1399 | ||
1385 | dirty_writeback_centisecs | 1400 | dirty_writeback_centisecs |
1386 | ------------------------- | 1401 | ------------------------- |
@@ -2400,24 +2415,29 @@ will be dumped when the <pid> process is dumped. coredump_filter is a bitmask | |||
2400 | of memory types. If a bit of the bitmask is set, memory segments of the | 2415 | of memory types. If a bit of the bitmask is set, memory segments of the |
2401 | corresponding memory type are dumped, otherwise they are not dumped. | 2416 | corresponding memory type are dumped, otherwise they are not dumped. |
2402 | 2417 | ||
2403 | The following 4 memory types are supported: | 2418 | The following 7 memory types are supported: |
2404 | - (bit 0) anonymous private memory | 2419 | - (bit 0) anonymous private memory |
2405 | - (bit 1) anonymous shared memory | 2420 | - (bit 1) anonymous shared memory |
2406 | - (bit 2) file-backed private memory | 2421 | - (bit 2) file-backed private memory |
2407 | - (bit 3) file-backed shared memory | 2422 | - (bit 3) file-backed shared memory |
2408 | - (bit 4) ELF header pages in file-backed private memory areas (it is | 2423 | - (bit 4) ELF header pages in file-backed private memory areas (it is |
2409 | effective only if the bit 2 is cleared) | 2424 | effective only if the bit 2 is cleared) |
2425 | - (bit 5) hugetlb private memory | ||
2426 | - (bit 6) hugetlb shared memory | ||
2410 | 2427 | ||
2411 | Note that MMIO pages such as frame buffer are never dumped and vDSO pages | 2428 | Note that MMIO pages such as frame buffer are never dumped and vDSO pages |
2412 | are always dumped regardless of the bitmask status. | 2429 | are always dumped regardless of the bitmask status. |
2413 | 2430 | ||
2414 | Default value of coredump_filter is 0x3; this means all anonymous memory | 2431 | Note that bits 0-4 don't affect any hugetlb memory. hugetlb memory is only |
2415 | segments are dumped. | 2432 | affected by bits 5-6. |
2433 | |||
2434 | Default value of coredump_filter is 0x23; this means all anonymous memory | ||
2435 | segments and hugetlb private memory are dumped. | ||
2416 | 2436 | ||
2417 | If you don't want to dump all shared memory segments attached to pid 1234, | 2437 | If you don't want to dump all shared memory segments attached to pid 1234, |
2418 | write 1 to the process's proc file. | 2438 | write 0x21 to the process's proc file. |
2419 | 2439 | ||
2420 | $ echo 0x1 > /proc/1234/coredump_filter | 2440 | $ echo 0x21 > /proc/1234/coredump_filter |
2421 | 2441 | ||
2422 | When a new process is created, the process inherits the bitmask status from its | 2442 | When a new process is created, the process inherits the bitmask status from its |
2423 | parent. It is useful to set up coredump_filter before the program runs. | 2443 | parent. It is useful to set up coredump_filter before the program runs. |
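
As a rough check of the values in this hunk, the bitmask can be built from the seven bit positions listed above. The helper names and the short type labels below are hypothetical and exist only for this sketch; only the bit numbers come from the documentation text.

```python
# Bit positions as listed in the coredump_filter documentation above.
# The dictionary keys are illustrative labels, not kernel identifiers.
COREDUMP_BITS = {
    "anon_private": 0,
    "anon_shared": 1,
    "file_private": 2,
    "file_shared": 3,
    "elf_headers": 4,
    "hugetlb_private": 5,
    "hugetlb_shared": 6,
}


def build_filter(*names):
    """Combine the named memory types into a coredump_filter bitmask."""
    mask = 0
    for name in names:
        mask |= 1 << COREDUMP_BITS[name]
    return mask


def decode_filter(mask):
    """List the memory types whose bits are set in the bitmask."""
    return [name for name, bit in COREDUMP_BITS.items() if mask & (1 << bit)]


# New default: all anonymous memory plus hugetlb private memory.
default_mask = build_filter("anon_private", "anon_shared", "hugetlb_private")

# Same as the default but without any shared memory segments.
no_shared_mask = build_filter("anon_private", "hugetlb_private")

print(hex(default_mask), hex(no_shared_mask))
```

Dropping the shared bits from the new default gives the value written to /proc/1234/coredump_filter in the example above.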
diff --git a/Documentation/filesystems/ramfs-rootfs-initramfs.txt b/Documentation/filesystems/ramfs-rootfs-initramfs.txt index 7be232b44ee4..62fe9b1e0890 100644 --- a/Documentation/filesystems/ramfs-rootfs-initramfs.txt +++ b/Documentation/filesystems/ramfs-rootfs-initramfs.txt | |||
@@ -263,7 +263,7 @@ User Mode Linux, like so: | |||
263 | sleep(999999999); | 263 | sleep(999999999); |
264 | } | 264 | } |
265 | EOF | 265 | EOF |
266 | gcc -static hello2.c -o init | 266 | gcc -static hello.c -o init |
267 | echo init | cpio -o -H newc | gzip > test.cpio.gz | 267 | echo init | cpio -o -H newc | gzip > test.cpio.gz |
268 | # Testing external initramfs using the initrd loading mechanism. | 268 | # Testing external initramfs using the initrd loading mechanism. |
269 | qemu -kernel /boot/vmlinuz -initrd test.cpio.gz /dev/zero | 269 | qemu -kernel /boot/vmlinuz -initrd test.cpio.gz /dev/zero |
diff --git a/Documentation/filesystems/ubifs.txt b/Documentation/filesystems/ubifs.txt index 6a0d70a22f05..dd84ea3c10da 100644 --- a/Documentation/filesystems/ubifs.txt +++ b/Documentation/filesystems/ubifs.txt | |||
@@ -86,6 +86,15 @@ norm_unmount (*) commit on unmount; the journal is committed | |||
86 | fast_unmount do not commit on unmount; this option makes | 86 | fast_unmount do not commit on unmount; this option makes |
87 | unmount faster, but the next mount slower | 87 | unmount faster, but the next mount slower |
88 | because of the need to replay the journal. | 88 | because of the need to replay the journal. |
89 | bulk_read read more in one go to take advantage of flash | ||
90 | media that read faster sequentially | ||
91 | no_bulk_read (*) do not bulk-read | ||
92 | no_chk_data_crc skip checking of CRCs on data nodes in order to | ||
93 | improve read performance. Use this option only | ||
94 | if the flash media is highly reliable. The effect | ||
95 | of this option is that corruption of the contents | ||
96 | of a file can go unnoticed. | ||
97 | chk_data_crc (*) do not skip checking CRCs on data nodes | ||
89 | 98 | ||
90 | 99 | ||
91 | Quick usage instructions | 100 | Quick usage instructions |