Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- | Documentation/filesystems/9p.txt | 40
-rw-r--r-- | Documentation/filesystems/ext4.txt | 24
-rw-r--r-- | Documentation/filesystems/ncpfs.txt | 6
-rw-r--r-- | Documentation/filesystems/nfs41-server.txt | 54
-rw-r--r-- | Documentation/filesystems/nfsroot.txt | 2
-rw-r--r-- | Documentation/filesystems/proc.txt | 27
-rw-r--r-- | Documentation/filesystems/sharedsubtree.txt | 220
7 files changed, 161 insertions, 212 deletions
diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index 6208f55c44c3..57e0b80a5274 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -18,11 +18,11 @@ the 9p client is available in the form of a USENIX paper: | |||
18 | 18 | ||
19 | Other applications are described in the following papers: | 19 | Other applications are described in the following papers: |
20 | * XCPU & Clustering | 20 | * XCPU & Clustering |
21 | http://www.xcpu.org/xcpu-talk.pdf | 21 | http://xcpu.org/papers/xcpu-talk.pdf |
22 | * KVMFS: control file system for KVM | 22 | * KVMFS: control file system for KVM |
23 | http://www.xcpu.org/kvmfs.pdf | 23 | http://xcpu.org/papers/kvmfs.pdf |
24 | * CellFS: A New ProgrammingModel for the Cell BE | 24 | * CellFS: A New Programming Model for the Cell BE |
25 | http://www.xcpu.org/cellfs-talk.pdf | 25 | http://xcpu.org/papers/cellfs-talk.pdf |
26 | * PROSE I/O: Using 9p to enable Application Partitions | 26 | * PROSE I/O: Using 9p to enable Application Partitions |
27 | http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf | 27 | http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf |
28 | 28 | ||
@@ -48,6 +48,7 @@ OPTIONS | |||
48 | (see rfdno and wfdno) | 48 | (see rfdno and wfdno) |
49 | virtio - connect to the next virtio channel available | 49 | virtio - connect to the next virtio channel available |
50 | (from lguest or KVM with trans_virtio module) | 50 | (from lguest or KVM with trans_virtio module) |
51 | rdma - connect to a specified RDMA channel | ||
51 | 52 | ||
52 | uname=name user name to attempt mount as on the remote server. The | 53 | uname=name user name to attempt mount as on the remote server. The |
53 | server may override or ignore this value. Certain user | 54 | server may override or ignore this value. Certain user |
@@ -59,16 +60,22 @@ OPTIONS | |||
59 | cache=mode specifies a caching policy. By default, no caches are used. | 60 | cache=mode specifies a caching policy. By default, no caches are used. |
60 | loose = no attempts are made at consistency, | 61 | loose = no attempts are made at consistency, |
61 | intended for exclusive, read-only mounts | 62 | intended for exclusive, read-only mounts |
63 | fscache = use FS-Cache for a persistent, read-only | ||
64 | cache backend. | ||
62 | 65 | ||
63 | debug=n specifies debug level. The debug level is a bitmask. | 66 | debug=n specifies debug level. The debug level is a bitmask. |
64 | 0x01 = display verbose error messages | 67 | 0x01 = display verbose error messages |
65 | 0x02 = developer debug (DEBUG_CURRENT) | 68 | 0x02 = developer debug (DEBUG_CURRENT) |
66 | 0x04 = display 9p trace | 69 | 0x04 = display 9p trace |
67 | 0x08 = display VFS trace | 70 | 0x08 = display VFS trace |
68 | 0x10 = display Marshalling debug | 71 | 0x10 = display Marshalling debug |
69 | 0x20 = display RPC debug | 72 | 0x20 = display RPC debug |
70 | 0x40 = display transport debug | 73 | 0x40 = display transport debug |
71 | 0x80 = display allocation debug | 74 | 0x80 = display allocation debug |
75 | 0x100 = display protocol message debug | ||
76 | 0x200 = display Fid debug | ||
77 | 0x400 = display packet debug | ||
78 | 0x800 = display fscache tracing debug | ||
72 | 79 | ||
73 | rfdno=n the file descriptor for reading with trans=fd | 80 | rfdno=n the file descriptor for reading with trans=fd |
74 | 81 | ||
@@ -100,6 +107,10 @@ OPTIONS | |||
100 | any = v9fs does single attach and performs all | 107 | any = v9fs does single attach and performs all |
101 | operations as one user | 108 | operations as one user |
102 | 109 | ||
110 | cachetag cache tag to use the specified persistent cache. | ||
111 | cache tags for existing cache sessions can be listed at | ||
112 | /sys/fs/9p/caches. (applies only to cache=fscache) | ||
113 | |||
103 | RESOURCES | 114 | RESOURCES |
104 | ========= | 115 | ========= |
105 | 116 | ||
@@ -118,7 +129,7 @@ and export. | |||
118 | A Linux version of the 9p server is now maintained under the npfs project | 129 | A Linux version of the 9p server is now maintained under the npfs project |
119 | on sourceforge (http://sourceforge.net/projects/npfs). The currently | 130 | on sourceforge (http://sourceforge.net/projects/npfs). The currently |
120 | maintained version is the single-threaded version of the server (named spfs) | 131 | maintained version is the single-threaded version of the server (named spfs) |
121 | available from the same CVS repository. | 132 | available from the same SVN repository. |
122 | 133 | ||
123 | There are user and developer mailing lists available through the v9fs project | 134 | There are user and developer mailing lists available through the v9fs project |
124 | on sourceforge (http://sourceforge.net/projects/v9fs). | 135 | on sourceforge (http://sourceforge.net/projects/v9fs). |
@@ -126,7 +137,8 @@ on sourceforge (http://sourceforge.net/projects/v9fs). | |||
126 | A stand-alone version of the module (which should build for any 2.6 kernel) | 137 | A stand-alone version of the module (which should build for any 2.6 kernel) |
127 | is available via (http://github.com/ericvh/9p-sac/tree/master) | 138 | is available via (http://github.com/ericvh/9p-sac/tree/master) |
128 | 139 | ||
129 | News and other information is maintained on SWiK (http://swik.net/v9fs). | 140 | News and other information is maintained on SWiK (http://swik.net/v9fs) |
141 | and the Wiki (http://sf.net/apps/mediawiki/v9fs/index.php). | ||
130 | 142 | ||
131 | Bug reports may be issued through the kernel.org bugzilla | 143 | Bug reports may be issued through the kernel.org bugzilla |
132 | (http://bugzilla.kernel.org) | 144 | (http://bugzilla.kernel.org) |
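
Note on the debug=n bitmask in the 9p.txt hunks above: because the debug level is a bitmask, several kinds of output are selected by OR-ing the listed values together. The following is a minimal Python sketch of that arithmetic. The numeric values are the ones in the table above; the dictionary keys and function names are illustrative assumptions, not kernel identifiers.

```python
# Bit values copied from the debug=n table in 9p.txt.
# The short names used as keys are hypothetical labels for this sketch.
P9_DEBUG_FLAGS = {
    "error": 0x01,        # display verbose error messages
    "current": 0x02,      # developer debug (DEBUG_CURRENT)
    "9p": 0x04,           # display 9p trace
    "vfs": 0x08,          # display VFS trace
    "marshalling": 0x10,  # display Marshalling debug
    "rpc": 0x20,          # display RPC debug
    "transport": 0x40,    # display transport debug
    "allocation": 0x80,   # display allocation debug
    "protocol": 0x100,    # display protocol message debug
    "fid": 0x200,         # display Fid debug
    "packet": 0x400,      # display packet debug
    "fscache": 0x800,     # display fscache tracing debug
}


def debug_mask(*names):
    """OR together the bits for the requested debug categories."""
    mask = 0
    for name in names:
        mask |= P9_DEBUG_FLAGS[name]
    return mask


def describe(mask):
    """List the categories whose bit is set in mask, in table order."""
    return [name for name, bit in P9_DEBUG_FLAGS.items() if mask & bit]


if __name__ == "__main__":
    mask = debug_mask("error", "9p", "fscache")
    print(hex(mask), describe(mask))
```

The four values added by this patch (0x100 through 0x800) do not overlap the older eight, so masks computed for the old table keep their meaning.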
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 7be02ac5fa36..18b5ec8cea45 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -134,15 +134,9 @@ ro Mount filesystem read only. Note that ext4 will | |||
134 | mount options "ro,noload" can be used to prevent | 134 | mount options "ro,noload" can be used to prevent |
135 | writes to the filesystem. | 135 | writes to the filesystem. |
136 | 136 | ||
137 | journal_checksum Enable checksumming of the journal transactions. | ||
138 | This will allow the recovery code in e2fsck and the | ||
139 | kernel to detect corruption in the kernel. It is a | ||
140 | compatible change and will be ignored by older kernels. | ||
141 | |||
142 | journal_async_commit Commit block can be written to disk without waiting | 137 | journal_async_commit Commit block can be written to disk without waiting |
143 | for descriptor blocks. If enabled older kernels cannot | 138 | for descriptor blocks. If enabled older kernels cannot |
144 | mount the device. This will enable 'journal_checksum' | 139 | mount the device. |
145 | internally. | ||
146 | 140 | ||
147 | journal=update Update the ext4 file system's journal to the current | 141 | journal=update Update the ext4 file system's journal to the current |
148 | format. | 142 | format. |
@@ -263,10 +257,18 @@ resuid=n The user ID which may use the reserved blocks. | |||
263 | 257 | ||
264 | sb=n Use alternate superblock at this location. | 258 | sb=n Use alternate superblock at this location. |
265 | 259 | ||
266 | quota | 260 | quota These options are ignored by the filesystem. They |
267 | noquota | 261 | noquota are used only by quota tools to recognize volumes |
268 | grpquota | 262 | grpquota where quota should be turned on. See documentation |
269 | usrquota | 263 | usrquota in the quota-tools package for more details |
264 | (http://sourceforge.net/projects/linuxquota). | ||
265 | |||
266 | jqfmt=<quota type> These options tell filesystem details about quota | ||
267 | usrjquota=<file> so that quota information can be properly updated | ||
268 | grpjquota=<file> during journal replay. They replace the above | ||
269 | quota options. See documentation in the quota-tools | ||
270 | package for more details | ||
271 | (http://sourceforge.net/projects/linuxquota). | ||
270 | 272 | ||
271 | bh (*) ext4 associates buffer heads to data pages to | 273 | bh (*) ext4 associates buffer heads to data pages to |
272 | nobh (a) cache disk block mapping information | 274 | nobh (a) cache disk block mapping information |
diff --git a/Documentation/filesystems/ncpfs.txt b/Documentation/filesystems/ncpfs.txt
index f12c30c93f2f..5af164f4b37b 100644
--- a/Documentation/filesystems/ncpfs.txt
+++ b/Documentation/filesystems/ncpfs.txt
@@ -7,6 +7,6 @@ ftp.gwdg.de/pub/linux/misc/ncpfs, but sunsite and its many mirrors | |||
7 | will have it as well. | 7 | will have it as well. |
8 | 8 | ||
9 | Related products are linware and mars_nwe, which will give Linux partial | 9 | Related products are linware and mars_nwe, which will give Linux partial |
10 | NetWare server functionality. Linware's home site is | 10 | NetWare server functionality. |
11 | klokan.sh.cvut.cz/pub/linux/linware; mars_nwe can be found on | 11 | |
12 | ftp.gwdg.de/pub/linux/misc/ncpfs. | 12 | mars_nwe can be found on ftp.gwdg.de/pub/linux/misc/ncpfs. |
diff --git a/Documentation/filesystems/nfs41-server.txt b/Documentation/filesystems/nfs41-server.txt
index 05d81cbcb2e1..5920fe26e6ff 100644
--- a/Documentation/filesystems/nfs41-server.txt
+++ b/Documentation/filesystems/nfs41-server.txt
@@ -11,6 +11,11 @@ the /proc/fs/nfsd/versions control file. Note that to write this | |||
11 | control file, the nfsd service must be taken down. Use your user-mode | 11 | control file, the nfsd service must be taken down. Use your user-mode |
12 | nfs-utils to set this up; see rpc.nfsd(8) | 12 | nfs-utils to set this up; see rpc.nfsd(8) |
13 | 13 | ||
14 | (Warning: older servers will interpret "+4.1" and "-4.1" as "+4" and | ||
15 | "-4", respectively. Therefore, code meant to work on both new and old | ||
16 | kernels must turn 4.1 on or off *before* turning support for version 4 | ||
17 | on or off; rpc.nfsd does this correctly.) | ||
18 | |||
14 | The NFSv4 minorversion 1 (NFSv4.1) implementation in nfsd is based | 19 | The NFSv4 minorversion 1 (NFSv4.1) implementation in nfsd is based |
15 | on the latest NFSv4.1 Internet Draft: | 20 | on the latest NFSv4.1 Internet Draft: |
16 | http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29 | 21 | http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29 |
@@ -25,6 +30,49 @@ are still under development out of tree. | |||
25 | See http://wiki.linux-nfs.org/wiki/index.php/PNFS_prototype_design | 30 | See http://wiki.linux-nfs.org/wiki/index.php/PNFS_prototype_design |
26 | for more information. | 31 | for more information. |
27 | 32 | ||
33 | The current implementation is intended for developers only: while it | ||
34 | does support ordinary file operations on clients we have tested against | ||
35 | (including the linux client), it is incomplete in ways which may limit | ||
36 | features unexpectedly, cause known bugs in rare cases, or cause | ||
37 | interoperability problems with future clients. Known issues: | ||
38 | |||
39 | - gss support is questionable: currently mounts with kerberos | ||
40 | from a linux client are possible, but we aren't really | ||
41 | conformant with the spec (for example, we don't use kerberos | ||
42 | on the backchannel correctly). | ||
43 | - no trunking support: no clients currently take advantage of | ||
44 | trunking, but this is a mandatory feature, and its use is | |
45 | recommended to clients in a number of places. (E.g. to ensure | ||
46 | timely renewal in case an existing connection's retry timeouts | ||
47 | have gotten too long; see section 8.3 of the draft.) | ||
48 | Therefore, lack of this feature may cause future clients to | ||
49 | fail. | ||
50 | - Incomplete backchannel support: incomplete backchannel gss | ||
51 | support and no support for BACKCHANNEL_CTL mean that | ||
52 | callbacks (hence delegations and layouts) may not be | ||
53 | available and clients confused by the incomplete | ||
54 | implementation may fail. | ||
55 | - Server reboot recovery is unsupported; if the server reboots, | ||
56 | clients may fail. | ||
57 | - We do not support SSV, which provides security for shared | ||
58 | client-server state (thus preventing unauthorized tampering | ||
59 | with locks and opens, for example). It is mandatory for | ||
60 | servers to support this, though no clients use it yet. | ||
61 | - Mandatory operations which we do not support, such as | ||
62 | DESTROY_CLIENTID, FREE_STATEID, SECINFO_NO_NAME, and | ||
63 | TEST_STATEID, are not currently used by clients, but will be | ||
64 | (and the spec recommends their uses in common cases), and | ||
65 | clients should not be expected to know how to recover from the | ||
66 | case where they are not supported. This will eventually cause | ||
67 | interoperability failures. | ||
68 | |||
69 | In addition, some limitations are inherited from the current NFSv4 | ||
70 | implementation: | ||
71 | |||
72 | - Incomplete delegation enforcement: if a file is renamed or | ||
73 | unlinked, a client holding a delegation may continue to | ||
74 | indefinitely allow opens of the file under the old name. | ||
75 | |||
28 | The table below, taken from the NFSv4.1 document, lists | 76 | The table below, taken from the NFSv4.1 document, lists |
29 | the operations that are mandatory to implement (REQ), optional | 77 | the operations that are mandatory to implement (REQ), optional |
30 | (OPT), and NFSv4.0 operations that are required not to implement (MNI) | 78 | (OPT), and NFSv4.0 operations that are required not to implement (MNI) |
@@ -142,6 +190,12 @@ NS*| CB_WANTS_CANCELLED | OPT | FDELG, | Section 20.10 | | |||
142 | 190 | ||
143 | Implementation notes: | 191 | Implementation notes: |
144 | 192 | ||
193 | DELEGPURGE: | ||
194 | * mandatory only for servers that support CLAIM_DELEGATE_PREV and/or | ||
195 | CLAIM_DELEG_PREV_FH (which allows clients to keep delegations that | ||
196 | persist across client reboots). Thus we need not implement this for | ||
197 | now. | ||
198 | |||
145 | EXCHANGE_ID: | 199 | EXCHANGE_ID: |
146 | * only SP4_NONE state protection supported | 200 | * only SP4_NONE state protection supported |
147 | * implementation ids are ignored | 201 | * implementation ids are ignored |
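
Note on the version-ordering warning added near the top of nfs41-server.txt: an older server reads "+4.1" as "+4" and "-4.1" as "-4", so the 4.1 setting has to be applied before the setting for version 4. The following is a minimal Python sketch of that reasoning. Both function names are hypothetical, and old_server_v4_enabled is only a toy model of the behaviour the warning describes, not code taken from nfsd or rpc.nfsd.

```python
def order_version_tokens(tokens):
    """Put minor-version tokens such as '+4.1' ahead of everything else.

    sorted() is stable, so tokens within each group keep their
    original relative order.
    """
    return sorted(tokens, key=lambda tok: 0 if "." in tok else 1)


def old_server_v4_enabled(tokens, initially=True):
    """Toy model of an older server that ignores the minor version.

    Every token whose major version is 4 is treated as '+4' or '-4',
    and the tokens are applied in the order given.
    """
    enabled = initially
    for tok in tokens:
        sign, version = tok[0], tok[1:]
        if version.split(".")[0] == "4":
            enabled = (sign == "+")
    return enabled


if __name__ == "__main__":
    wanted = ["+4", "-4.1"]  # goal: NFSv4 on, NFSv4.1 off
    print(old_server_v4_enabled(wanted))
    print(old_server_v4_enabled(order_version_tokens(wanted)))
```

In the toy model, applying "+4" and then "-4.1" leaves version 4 disabled on an older server, while the reordered sequence leaves it enabled, which is the outcome the caller asked for.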
diff --git a/Documentation/filesystems/nfsroot.txt b/Documentation/filesystems/nfsroot.txt
index 68baddf3c3e0..3ba0b945aaf8 100644
--- a/Documentation/filesystems/nfsroot.txt
+++ b/Documentation/filesystems/nfsroot.txt
@@ -105,7 +105,7 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf> | |||
105 | the client address and this parameter is NOT empty only | 105 | the client address and this parameter is NOT empty only |
106 | replies from the specified server are accepted. | 106 | replies from the specified server are accepted. |
107 | 107 | ||
108 | Only required for for NFS root. That is autoconfiguration | 108 | Only required for NFS root. That is autoconfiguration |
109 | will not be triggered if it is missing and NFS root is not | 109 | will not be triggered if it is missing and NFS root is not |
110 | in operation. | 110 | in operation. |
111 | 111 | ||
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index ffead13f9443..b5aee7838a00 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -176,6 +176,7 @@ read the file /proc/PID/status: | |||
176 | CapBnd: ffffffffffffffff | 176 | CapBnd: ffffffffffffffff |
177 | voluntary_ctxt_switches: 0 | 177 | voluntary_ctxt_switches: 0 |
178 | nonvoluntary_ctxt_switches: 1 | 178 | nonvoluntary_ctxt_switches: 1 |
179 | Stack usage: 12 kB | ||
179 | 180 | ||
180 | This shows you nearly the same information you would get if you viewed it with | 181 | This shows you nearly the same information you would get if you viewed it with |
181 | the ps command. In fact, ps uses the proc file system to obtain its | 182 | the ps command. In fact, ps uses the proc file system to obtain its |
@@ -229,6 +230,7 @@ Table 1-2: Contents of the statm files (as of 2.6.30-rc7) | |||
229 | Mems_allowed_list Same as previous, but in "list format" | 230 | Mems_allowed_list Same as previous, but in "list format" |
230 | voluntary_ctxt_switches number of voluntary context switches | 231 | voluntary_ctxt_switches number of voluntary context switches |
231 | nonvoluntary_ctxt_switches number of non voluntary context switches | 232 | nonvoluntary_ctxt_switches number of non voluntary context switches |
233 | Stack usage: stack usage high water mark (round up to page size) | ||
232 | .............................................................................. | 234 | .............................................................................. |
233 | 235 | ||
234 | Table 1-3: Contents of the statm files (as of 2.6.8-rc3) | 236 | Table 1-3: Contents of the statm files (as of 2.6.8-rc3) |
@@ -307,7 +309,7 @@ address perms offset dev inode pathname | |||
307 | 08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test | 309 | 08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test |
308 | 0804a000-0806b000 rw-p 00000000 00:00 0 [heap] | 310 | 0804a000-0806b000 rw-p 00000000 00:00 0 [heap] |
309 | a7cb1000-a7cb2000 ---p 00000000 00:00 0 | 311 | a7cb1000-a7cb2000 ---p 00000000 00:00 0 |
310 | a7cb2000-a7eb2000 rw-p 00000000 00:00 0 | 312 | a7cb2000-a7eb2000 rw-p 00000000 00:00 0 [threadstack:001ff4b4] |
311 | a7eb2000-a7eb3000 ---p 00000000 00:00 0 | 313 | a7eb2000-a7eb3000 ---p 00000000 00:00 0 |
312 | a7eb3000-a7ed5000 rw-p 00000000 00:00 0 | 314 | a7eb3000-a7ed5000 rw-p 00000000 00:00 0 |
313 | a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6 | 315 | a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6 |
@@ -343,6 +345,7 @@ is not associated with a file: | |||
343 | [stack] = the stack of the main process | 345 | [stack] = the stack of the main process |
344 | [vdso] = the "virtual dynamic shared object", | 346 | [vdso] = the "virtual dynamic shared object", |
345 | the kernel system call handler | 347 | the kernel system call handler |
348 | [threadstack:xxxxxxxx] = the stack of the thread, xxxxxxxx is the stack size | ||
346 | 349 | ||
347 | or if empty, the mapping is anonymous. | 350 | or if empty, the mapping is anonymous. |
348 | 351 | ||
@@ -375,6 +378,19 @@ of memory currently marked as referenced or accessed. | |||
375 | This file is only present if the CONFIG_MMU kernel configuration option is | 378 | This file is only present if the CONFIG_MMU kernel configuration option is |
376 | enabled. | 379 | enabled. |
377 | 380 | ||
381 | The /proc/PID/clear_refs is used to reset the PG_Referenced and ACCESSED/YOUNG | ||
382 | bits on both physical and virtual pages associated with a process. | ||
383 | To clear the bits for all the pages associated with the process | ||
384 | > echo 1 > /proc/PID/clear_refs | ||
385 | |||
386 | To clear the bits for the anonymous pages associated with the process | ||
387 | > echo 2 > /proc/PID/clear_refs | ||
388 | |||
389 | To clear the bits for the file mapped pages associated with the process | ||
390 | > echo 3 > /proc/PID/clear_refs | ||
391 | Any other value written to /proc/PID/clear_refs will have no effect. | ||
392 | |||
393 | |||
378 | 1.2 Kernel data | 394 | 1.2 Kernel data |
379 | --------------- | 395 | --------------- |
380 | 396 | ||
@@ -1032,9 +1048,9 @@ Various pieces of information about kernel activity are available in the | |||
1032 | since the system first booted. For a quick look, simply cat the file: | 1048 | since the system first booted. For a quick look, simply cat the file: |
1033 | 1049 | ||
1034 | > cat /proc/stat | 1050 | > cat /proc/stat |
1035 | cpu 2255 34 2290 22625563 6290 127 456 0 | 1051 | cpu 2255 34 2290 22625563 6290 127 456 0 0 |
1036 | cpu0 1132 34 1441 11311718 3675 127 438 0 | 1052 | cpu0 1132 34 1441 11311718 3675 127 438 0 0 |
1037 | cpu1 1123 0 849 11313845 2614 0 18 0 | 1053 | cpu1 1123 0 849 11313845 2614 0 18 0 0 |
1038 | intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] | 1054 | intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] |
1039 | ctxt 1990473 | 1055 | ctxt 1990473 |
1040 | btime 1062191376 | 1056 | btime 1062191376 |
@@ -1056,6 +1072,7 @@ second). The meanings of the columns are as follows, from left to right: | |||
1056 | - irq: servicing interrupts | 1072 | - irq: servicing interrupts |
1057 | - softirq: servicing softirqs | 1073 | - softirq: servicing softirqs |
1058 | - steal: involuntary wait | 1074 | - steal: involuntary wait |
1075 | - guest: running a guest | ||
1059 | 1076 | ||
1060 | The "intr" line gives counts of interrupts serviced since boot time, for each | 1077 | The "intr" line gives counts of interrupts serviced since boot time, for each |
1061 | of the possible system interrupts. The first column is the total of all | 1078 | of the possible system interrupts. The first column is the total of all |
@@ -1191,7 +1208,7 @@ The following heuristics are then applied: | |||
1191 | * if the task was reniced, its score doubles | 1208 | * if the task was reniced, its score doubles |
1192 | * superuser or direct hardware access tasks (CAP_SYS_ADMIN, CAP_SYS_RESOURCE | 1209 | * superuser or direct hardware access tasks (CAP_SYS_ADMIN, CAP_SYS_RESOURCE |
1193 | or CAP_SYS_RAWIO) have their score divided by 4 | 1210 | or CAP_SYS_RAWIO) have their score divided by 4 |
1194 | * if oom condition happened in one cpuset and checked task does not belong | 1211 | * if oom condition happened in one cpuset and checked process does not belong |
1195 | to it, its score is divided by 8 | 1212 | to it, its score is divided by 8 |
1196 | * the resulting score is multiplied by two to the power of oom_adj, i.e. | 1213 | * the resulting score is multiplied by two to the power of oom_adj, i.e. |
1197 | points <<= oom_adj when it is positive and | 1214 | points <<= oom_adj when it is positive and |
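
Note on the /proc/stat change in the proc.txt hunks above: the patch appends a ninth column, "guest", to each cpu line. The following is a minimal Python sketch of reading such a line by position. The names irq, softirq, steal and guest appear in the hunk; the first five (user, nice, system, idle, iowait) come from the part of proc.txt not shown in this diff. CPU_FIELDS and parse_cpu_line are hypothetical names for this sketch.

```python
# Column names from left to right, following the list in proc.txt.
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest")


def parse_cpu_line(line):
    """Split one 'cpu' line from /proc/stat into (label, {field: value}).

    zip() stops at the shorter sequence, so a line from an older kernel
    with fewer columns yields fewer keys, and any extra trailing
    columns are ignored.
    """
    parts = line.split()
    label = parts[0]
    values = [int(v) for v in parts[1:]]
    return label, dict(zip(CPU_FIELDS, values))


if __name__ == "__main__":
    # Sample line taken from the new side of the hunk above.
    label, fields = parse_cpu_line("cpu 2255 34 2290 22625563 6290 127 456 0 0")
    print(label, fields["idle"], fields["guest"])
```

Looking fields up by name rather than by index keeps callers working on kernels that predate the guest column, as long as they check for the key before using it.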
diff --git a/Documentation/filesystems/sharedsubtree.txt b/Documentation/filesystems/sharedsubtree.txt
index 736540045dc7..23a181074f94 100644
--- a/Documentation/filesystems/sharedsubtree.txt
+++ b/Documentation/filesystems/sharedsubtree.txt
@@ -4,7 +4,7 @@ Shared Subtrees | |||
4 | Contents: | 4 | Contents: |
5 | 1) Overview | 5 | 1) Overview |
6 | 2) Features | 6 | 2) Features |
7 | 3) smount command | 7 | 3) Setting mount states |
8 | 4) Use-case | 8 | 4) Use-case |
9 | 5) Detailed semantics | 9 | 5) Detailed semantics |
10 | 6) Quiz | 10 | 6) Quiz |
@@ -41,14 +41,14 @@ replicas continue to be exactly same. | |||
41 | 41 | ||
42 | Here is an example: | 42 | Here is an example: |
43 | 43 | ||
44 | Lets say /mnt has a mount that is shared. | 44 | Let's say /mnt has a mount that is shared. |
45 | mount --make-shared /mnt | 45 | mount --make-shared /mnt |
46 | 46 | ||
47 | note: mount command does not yet support the --make-shared flag. | 47 | Note: mount(8) command now supports the --make-shared flag, |
48 | I have included a small C program which does the same by executing | 48 | so the sample 'smount' program is no longer needed and has been |
49 | 'smount /mnt shared' | 49 | removed. |
50 | 50 | ||
51 | #mount --bind /mnt /tmp | 51 | # mount --bind /mnt /tmp |
52 | The above command replicates the mount at /mnt to the mountpoint /tmp | 52 | The above command replicates the mount at /mnt to the mountpoint /tmp |
53 | and the contents of both the mounts remain identical. | 53 | and the contents of both the mounts remain identical. |
54 | 54 | ||
@@ -58,8 +58,8 @@ replicas continue to be exactly same. | |||
58 | #ls /tmp | 58 | #ls /tmp |
59 | a b c | 59 | a b c |
60 | 60 | ||
61 | Now lets say we mount a device at /tmp/a | 61 | Now let's say we mount a device at /tmp/a |
62 | #mount /dev/sd0 /tmp/a | 62 | # mount /dev/sd0 /tmp/a |
63 | 63 | ||
64 | #ls /tmp/a | 64 | #ls /tmp/a |
65 | t1 t2 t2 | 65 | t1 t2 t2 |
@@ -80,21 +80,20 @@ replicas continue to be exactly same. | |||
80 | 80 | ||
81 | Here is an example: | 81 | Here is an example: |
82 | 82 | ||
83 | Lets say /mnt has a mount which is shared. | 83 | Let's say /mnt has a mount which is shared. |
84 | #mount --make-shared /mnt | 84 | # mount --make-shared /mnt |
85 | 85 | ||
86 | Lets bind mount /mnt to /tmp | 86 | Let's bind mount /mnt to /tmp |
87 | #mount --bind /mnt /tmp | 87 | # mount --bind /mnt /tmp |
88 | 88 | ||
89 | the new mount at /tmp becomes a shared mount and it is a replica of | 89 | the new mount at /tmp becomes a shared mount and it is a replica of |
90 | the mount at /mnt. | 90 | the mount at /mnt. |
91 | 91 | ||
92 | Now lets make the mount at /tmp; a slave of /mnt | 92 | Now let's make the mount at /tmp; a slave of /mnt |
93 | #mount --make-slave /tmp | 93 | # mount --make-slave /tmp |
94 | [or smount /tmp slave] | ||
95 | 94 | ||
96 | lets mount /dev/sd0 on /mnt/a | 95 | let's mount /dev/sd0 on /mnt/a |
97 | #mount /dev/sd0 /mnt/a | 96 | # mount /dev/sd0 /mnt/a |
98 | 97 | ||
99 | #ls /mnt/a | 98 | #ls /mnt/a |
100 | t1 t2 t3 | 99 | t1 t2 t3 |
@@ -104,9 +103,9 @@ replicas continue to be exactly same. | |||
104 | 103 | ||
105 | Note the mount event has propagated to the mount at /tmp | 104 | Note the mount event has propagated to the mount at /tmp |
106 | 105 | ||
107 | However lets see what happens if we mount something on the mount at /tmp | 106 | However let's see what happens if we mount something on the mount at /tmp |
108 | 107 | ||
109 | #mount /dev/sd1 /tmp/b | 108 | # mount /dev/sd1 /tmp/b |
110 | 109 | ||
111 | #ls /tmp/b | 110 | #ls /tmp/b |
112 | s1 s2 s3 | 111 | s1 s2 s3 |
@@ -124,12 +123,11 @@ replicas continue to be exactly same. | |||
124 | 123 | ||
125 | 2d) A unbindable mount is a unbindable private mount | 124 | 2d) A unbindable mount is a unbindable private mount |
126 | 125 | ||
127 | lets say we have a mount at /mnt and we make is unbindable | 126 | let's say we have a mount at /mnt and we make it unbindable |
128 | 127 | ||
129 | #mount --make-unbindable /mnt | 128 | # mount --make-unbindable /mnt |
130 | [ smount /mnt unbindable ] | ||
131 | 129 | ||
132 | Lets try to bind mount this mount somewhere else. | 130 | Let's try to bind mount this mount somewhere else. |
133 | # mount --bind /mnt /tmp | 131 | # mount --bind /mnt /tmp |
134 | mount: wrong fs type, bad option, bad superblock on /mnt, | 132 | mount: wrong fs type, bad option, bad superblock on /mnt, |
135 | or too many mounted file systems | 133 | or too many mounted file systems |
@@ -137,149 +135,15 @@ replicas continue to be exactly same. | |||
137 | Binding a unbindable mount is a invalid operation. | 135 | Binding a unbindable mount is a invalid operation. |
138 | 136 | ||
139 | 137 | ||
140 | 3) smount command | 138 | 3) Setting mount states |
141 | 139 | ||
142 | Currently the mount command is not aware of shared subtree features. | 140 | The mount command (util-linux package) can be used to set mount |
143 | Work is in progress to add the support in mount ( util-linux package ). | 141 | states: |
144 | Till then use the following program. | ||
145 | 142 | ||
146 | ------------------------------------------------------------------------ | 143 | mount --make-shared mountpoint |
147 | // | 144 | mount --make-slave mountpoint |
148 | //this code was developed my Miklos Szeredi <miklos@szeredi.hu> | 145 | mount --make-private mountpoint |
149 | //and modified by Ram Pai <linuxram@us.ibm.com> | 146 | mount --make-unbindable mountpoint |
150 | // sample usage: | ||
151 | // smount /tmp shared | ||
152 | // | ||
153 | #include <stdio.h> | ||
154 | #include <stdlib.h> | ||
155 | #include <unistd.h> | ||
156 | #include <string.h> | ||
157 | #include <sys/mount.h> | ||
158 | #include <sys/fsuid.h> | ||
159 | |||
160 | #ifndef MS_REC | ||
161 | #define MS_REC 0x4000 /* 16384: Recursive loopback */ | ||
162 | #endif | ||
163 | |||
164 | #ifndef MS_SHARED | ||
165 | #define MS_SHARED 1<<20 /* Shared */ | ||
166 | #endif | ||
167 | |||
168 | #ifndef MS_PRIVATE | ||
169 | #define MS_PRIVATE 1<<18 /* Private */ | ||
170 | #endif | ||
171 | |||
172 | #ifndef MS_SLAVE | ||
173 | #define MS_SLAVE 1<<19 /* Slave */ | ||
174 | #endif | ||
175 | |||
176 | #ifndef MS_UNBINDABLE | ||
177 | #define MS_UNBINDABLE 1<<17 /* Unbindable */ | ||
178 | #endif | ||
179 | |||
180 | int main(int argc, char *argv[]) | ||
181 | { | ||
182 | int type; | ||
183 | if(argc != 3) { | ||
184 | fprintf(stderr, "usage: %s dir " | ||
185 | "<rshared|rslave|rprivate|runbindable|shared|slave" | ||
186 | "|private|unbindable>\n" , argv[0]); | ||
187 | return 1; | ||
188 | } | ||
189 | |||
190 | fprintf(stdout, "%s %s %s\n", argv[0], argv[1], argv[2]); | ||
191 | |||
192 | if (strcmp(argv[2],"rshared")==0) | ||
193 | type=(MS_SHARED|MS_REC); | ||
194 | else if (strcmp(argv[2],"rslave")==0) | ||
195 | type=(MS_SLAVE|MS_REC); | ||
196 | else if (strcmp(argv[2],"rprivate")==0) | ||
197 | type=(MS_PRIVATE|MS_REC); | ||
198 | else if (strcmp(argv[2],"runbindable")==0) | ||
199 | type=(MS_UNBINDABLE|MS_REC); | ||
200 | else if (strcmp(argv[2],"shared")==0) | ||
201 | type=MS_SHARED; | ||
202 | else if (strcmp(argv[2],"slave")==0) | ||
203 | type=MS_SLAVE; | ||
204 | else if (strcmp(argv[2],"private")==0) | ||
205 | type=MS_PRIVATE; | ||
206 | else if (strcmp(argv[2],"unbindable")==0) | ||
207 | type=MS_UNBINDABLE; | ||
208 | else { | ||
209 | fprintf(stderr, "invalid operation: %s\n", argv[2]); | ||
210 | return 1; | ||
211 | } | ||
212 | setfsuid(getuid()); | ||
213 | |||
214 | if(mount("", argv[1], "dontcare", type, "") == -1) { | ||
215 | perror("mount"); | ||
216 | return 1; | ||
217 | } | ||
218 | return 0; | ||
219 | } | ||
220 | ----------------------------------------------------------------------- | ||
221 | |||
222 | Copy the above code snippet into smount.c | ||
223 | gcc -o smount smount.c | ||
224 | |||
225 | |||
226 | (i) To mark all the mounts under /mnt as shared execute the following | ||
227 | command: | ||
228 | |||
229 | smount /mnt rshared | ||
230 | the corresponding syntax planned for mount command is | ||
231 | mount --make-rshared /mnt | ||
232 | |||
233 | just to mark a mount /mnt as shared, execute the following | ||
234 | command: | ||
235 | smount /mnt shared | ||
236 | the corresponding syntax planned for mount command is | ||
237 | mount --make-shared /mnt | ||
238 | |||
239 | (ii) To mark all the shared mounts under /mnt as slave execute the | ||
240 | following | ||
241 | |||
242 | command: | ||
243 | smount /mnt rslave | ||
244 | the corresponding syntax planned for mount command is | ||
245 | mount --make-rslave /mnt | ||
246 | |||
247 | just to mark a mount /mnt as slave, execute the following | ||
248 | command: | ||
249 | smount /mnt slave | ||
250 | the corresponding syntax planned for mount command is | ||
251 | mount --make-slave /mnt | ||
252 | |||
253 | (iii) To mark all the mounts under /mnt as private execute the | ||
254 | following command: | ||
255 | |||
256 | smount /mnt rprivate | ||
257 | the corresponding syntax planned for mount command is | ||
258 | mount --make-rprivate /mnt | ||
259 | |||
260 | just to mark a mount /mnt as private, execute the following | ||
261 | command: | ||
262 | smount /mnt private | ||
263 | the corresponding syntax planned for mount command is | ||
264 | mount --make-private /mnt | ||
265 | |||
266 | NOTE: by default all the mounts are created as private. But if | ||
267 | you want to change some shared/slave/unbindable mount as | ||
268 | private at a later point in time, this command can help. | ||
269 | |||
270 | (iv) To mark all the mounts under /mnt as unbindable execute the | ||
271 | following | ||
272 | |||
273 | command: | ||
274 | smount /mnt runbindable | ||
275 | the corresponding syntax planned for mount command is | ||
276 | mount --make-runbindable /mnt | ||
277 | |||
278 | just to mark a mount /mnt as unbindable, execute the following | ||
279 | command: | ||
280 | smount /mnt unbindable | ||
281 | the corresponding syntax planned for mount command is | ||
282 | mount --make-unbindable /mnt | ||
283 | 147 | ||
284 | 148 | ||
285 | 4) Use cases | 149 | 4) Use cases |
@@ -350,7 +214,7 @@ replicas continue to be exactly same. | |||
350 | mount --rbind / /view/v3 | 214 | mount --rbind / /view/v3 |
351 | mount --rbind / /view/v4 | 215 | mount --rbind / /view/v4 |
352 | 216 | ||
353 | and if /usr has a versioning filesystem mounted, than that | 217 | and if /usr has a versioning filesystem mounted, then that |
354 | mount appears at /view/v1/usr, /view/v2/usr, /view/v3/usr and | 218 | mount appears at /view/v1/usr, /view/v2/usr, /view/v3/usr and |
355 | /view/v4/usr too | 219 | /view/v4/usr too |
356 | 220 | ||
@@ -390,7 +254,7 @@ replicas continue to be exactly same. | |||
390 | 254 | ||
391 | For example: | 255 | For example: |
392 | mount --make-shared /mnt | 256 | mount --make-shared /mnt |
393 | mount --bin /mnt /tmp | 257 | mount --bind /mnt /tmp |
394 | 258 | ||
395 | The mount at /mnt and that at /tmp are both shared and belong | 259 | The mount at /mnt and that at /tmp are both shared and belong |
396 | to the same peer group. Anything mounted or unmounted under | 260 | to the same peer group. Anything mounted or unmounted under |
@@ -558,7 +422,7 @@ replicas continue to be exactly same. | |||
558 | then the subtree under the unbindable mount is pruned in the new | 422 | then the subtree under the unbindable mount is pruned in the new |
559 | location. | 423 | location. |
560 | 424 | ||
561 | eg: lets say we have the following mount tree. | 425 | eg: let's say we have the following mount tree. |
562 | 426 | ||
563 | A | 427 | A |
564 | / \ | 428 | / \ |
@@ -566,7 +430,7 @@ replicas continue to be exactly same. | |||
566 | / \ / \ | 430 | / \ / \ |
567 | D E F G | 431 | D E F G |
568 | 432 | ||
569 | Lets say all the mount except the mount C in the tree are | 433 | Let's say all the mount except the mount C in the tree are |
570 | of a type other than unbindable. | 434 | of a type other than unbindable. |
571 | 435 | ||
572 | If this tree is rbound to say Z | 436 | If this tree is rbound to say Z |
@@ -683,13 +547,13 @@ replicas continue to be exactly same. | |||
683 | 'b' on mounts that receive propagation from mount 'B' and does not have | 547 | 'b' on mounts that receive propagation from mount 'B' and does not have |
684 | sub-mounts within them are unmounted. | 548 | sub-mounts within them are unmounted. |
685 | 549 | ||
686 | Example: Lets say 'B1', 'B2', 'B3' are shared mounts that propagate to | 550 | Example: Let's say 'B1', 'B2', 'B3' are shared mounts that propagate to |
687 | each other. | 551 | each other. |
688 | 552 | ||
689 | lets say 'A1', 'A2', 'A3' are first mounted at dentry 'b' on mount | 553 | let's say 'A1', 'A2', 'A3' are first mounted at dentry 'b' on mount |
690 | 'B1', 'B2' and 'B3' respectively. | 554 | 'B1', 'B2' and 'B3' respectively. |
691 | 555 | ||
692 | lets say 'C1', 'C2', 'C3' are next mounted at the same dentry 'b' on | 556 | let's say 'C1', 'C2', 'C3' are next mounted at the same dentry 'b' on |
693 | mount 'B1', 'B2' and 'B3' respectively. | 557 | mount 'B1', 'B2' and 'B3' respectively. |
694 | 558 | ||
695 | if 'C1' is unmounted, all the mounts that are most-recently-mounted on | 559 | if 'C1' is unmounted, all the mounts that are most-recently-mounted on |
@@ -710,7 +574,7 @@ replicas continue to be exactly same. | |||
710 | A cloned namespace contains all the mounts as that of the parent | 574 | A cloned namespace contains all the mounts as that of the parent |
711 | namespace. | 575 | namespace. |
712 | 576 | ||
713 | Lets say 'A' and 'B' are the corresponding mounts in the parent and the | 577 | Let's say 'A' and 'B' are the corresponding mounts in the parent and the |
714 | child namespace. | 578 | child namespace. |
715 | 579 | ||
716 | If 'A' is shared, then 'B' is also shared and 'A' and 'B' propagate to | 580 | If 'A' is shared, then 'B' is also shared and 'A' and 'B' propagate to |
@@ -759,11 +623,11 @@ replicas continue to be exactly same. | |||
759 | mount --make-slave /mnt | 623 | mount --make-slave /mnt |
760 | 624 | ||
761 | At this point we have the first mount at /tmp and | 625 | At this point we have the first mount at /tmp and |
762 | its root dentry is 1. Lets call this mount 'A' | 626 | its root dentry is 1. Let's call this mount 'A' |
763 | And then we have a second mount at /tmp1 with root | 627 | And then we have a second mount at /tmp1 with root |
764 | dentry 2. Lets call this mount 'B' | 628 | dentry 2. Let's call this mount 'B' |
765 | Next we have a third mount at /mnt with root dentry | 629 | Next we have a third mount at /mnt with root dentry |
766 | mnt. Lets call this mount 'C' | 630 | mnt. Let's call this mount 'C' |
767 | 631 | ||
768 | 'B' is the slave of 'A' and 'C' is a slave of 'B' | 632 | 'B' is the slave of 'A' and 'C' is a slave of 'B' |
769 | A -> B -> C | 633 | A -> B -> C |
@@ -794,7 +658,7 @@ replicas continue to be exactly same. | |||
794 | 658 | ||
795 | Q3 Why is unbindable mount needed? | 659 | Q3 Why is unbindable mount needed? |
796 | 660 | ||
797 | Lets say we want to replicate the mount tree at multiple | 661 | Let's say we want to replicate the mount tree at multiple |
798 | locations within the same subtree. | 662 | locations within the same subtree. |
799 | 663 | ||
800 | if one rbind mounts a tree within the same subtree 'n' times | 664 | if one rbind mounts a tree within the same subtree 'n' times |
@@ -803,7 +667,7 @@ replicas continue to be exactly same. | |||
803 | mounts. Here is a example. | 667 | mounts. Here is a example. |
804 | 668 | ||
805 | step 1: | 669 | step 1: |
806 | lets say the root tree has just two directories with | 670 | let's say the root tree has just two directories with |
807 | one vfsmount. | 671 | one vfsmount. |
808 | root | 672 | root |
809 | / \ | 673 | / \ |
@@ -875,7 +739,7 @@ replicas continue to be exactly same. | |||
875 | Unclonable mounts come in handy here. | 739 | Unclonable mounts come in handy here. |
876 | 740 | ||
877 | step 1: | 741 | step 1: |
878 | lets say the root tree has just two directories with | 742 | let's say the root tree has just two directories with |
879 | one vfsmount. | 743 | one vfsmount. |
880 | root | 744 | root |
881 | / \ | 745 | / \ |