author | J. Bruce Fields <bfields@citi.umich.edu> | 2009-10-27 14:41:35 -0400 |
---|---|---|
committer | J. Bruce Fields <bfields@citi.umich.edu> | 2009-10-27 19:34:04 -0400 |
commit | dc7a08166f3a5f23e79e839a8a88849bd3397c32 (patch) | |
tree | 2feb8aed7b6142467e6b8833fbfd9838bda69c39 /Documentation/filesystems/nfs | |
parent | e343eb0d60f74547e0aeb5bd151105c2e6cfe588 (diff) |
nfs: new subdir Documentation/filesystems/nfs
We're adding enough nfs documentation that it may as well have its own
subdirectory.
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Diffstat (limited to 'Documentation/filesystems/nfs')
-rw-r--r-- | Documentation/filesystems/nfs/00-INDEX | 12 |
-rw-r--r-- | Documentation/filesystems/nfs/Exporting | 147 |
-rw-r--r-- | Documentation/filesystems/nfs/nfs-rdma.txt | 271 |
-rw-r--r-- | Documentation/filesystems/nfs/nfs.txt | 98 |
-rw-r--r-- | Documentation/filesystems/nfs/nfs41-server.txt | 222 |
-rw-r--r-- | Documentation/filesystems/nfs/nfsroot.txt | 270 |
6 files changed, 1020 insertions, 0 deletions
diff --git a/Documentation/filesystems/nfs/00-INDEX b/Documentation/filesystems/nfs/00-INDEX
new file mode 100644
index 000000000000..6ff3d212027b
--- /dev/null
+++ b/Documentation/filesystems/nfs/00-INDEX
@@ -0,0 +1,12 @@
1 | 00-INDEX | ||
2 | - this file (nfs-related documentation). | ||
3 | Exporting | ||
4 | - explanation of how to make filesystems exportable. | ||
5 | nfs.txt | ||
6 | - nfs client, and DNS resolution for fs_locations. | ||
7 | nfs41-server.txt | ||
8 | - info on the Linux server implementation of NFSv4 minor version 1. | ||
9 | nfs-rdma.txt | ||
10 | - how to install and set up the Linux NFS/RDMA client and server software | ||
11 | nfsroot.txt | ||
12 | - short guide on setting up a diskless box with NFS root filesystem. | ||
diff --git a/Documentation/filesystems/nfs/Exporting b/Documentation/filesystems/nfs/Exporting
new file mode 100644
index 000000000000..87019d2b5981
--- /dev/null
+++ b/Documentation/filesystems/nfs/Exporting
@@ -0,0 +1,147 @@
1 | |||
2 | Making Filesystems Exportable | ||
3 | ============================= | ||
4 | |||
5 | Overview | ||
6 | -------- | ||
7 | |||
8 | All filesystem operations require a dentry (or two) as a starting | ||
9 | point. Local applications have a reference-counted hold on suitable | ||
10 | dentries via open file descriptors or cwd/root. However remote | ||
11 | applications that access a filesystem via a remote filesystem protocol | ||
12 | such as NFS may not be able to hold such a reference, and so need a | ||
13 | different way to refer to a particular dentry. As the alternative | ||
14 | form of reference needs to be stable across renames, truncates, and | ||
15 | server-reboot (among other things, though these tend to be the most | ||
16 | problematic), there is no simple answer like 'filename'. | ||
17 | |||
18 | The mechanism discussed here allows each filesystem implementation to | ||
19 | specify how to generate an opaque (outside of the filesystem) byte | ||
20 | string for any dentry, and how to find an appropriate dentry for any | ||
21 | given opaque byte string. | ||
22 | This byte string will be called a "filehandle fragment" as it | ||
23 | corresponds to part of an NFS filehandle. | ||
24 | |||
25 | A filesystem which supports the mapping between filehandle fragments | ||
26 | and dentries will be termed "exportable". | ||
27 | |||
28 | |||
29 | |||
30 | Dcache Issues | ||
31 | ------------- | ||
32 | |||
33 | The dcache normally contains a proper prefix of any given filesystem | ||
34 | tree. This means that if any filesystem object is in the dcache, then | ||
35 | all of the ancestors of that filesystem object are also in the dcache. | ||
36 | As normal access is by filename this prefix is created naturally and | ||
37 | maintained easily (by each object maintaining a reference count on | ||
38 | its parent). | ||
39 | |||
40 | However when objects are included into the dcache by interpreting a | ||
41 | filehandle fragment, there is no automatic creation of a path prefix | ||
42 | for the object. This leads to two related but distinct features of | ||
43 | the dcache that are not needed for normal filesystem access. | ||
44 | |||
45 | 1/ The dcache must sometimes contain objects that are not part of the | ||
46 | proper prefix, i.e. that are not connected to the root. | ||
47 | 2/ The dcache must be prepared for a newly found (via ->lookup) directory | ||
48 | to already have a (non-connected) dentry, and must be able to move | ||
49 | that dentry into place (based on the parent and name in the | ||
50 | ->lookup). This is particularly needed for directories as | ||
51 | it is a dcache invariant that directories only have one dentry. | ||
52 | |||
53 | To implement these features, the dcache has: | ||
54 | |||
55 | a/ A dentry flag DCACHE_DISCONNECTED which is set on | ||
56 | any dentry that might not be part of the proper prefix. | ||
57 | This is set when anonymous dentries are created, and cleared when a | ||
58 | dentry is noticed to be a child of a dentry which is in the proper | ||
59 | prefix. | ||
60 | |||
61 | b/ A per-superblock list "s_anon" of dentries which are the roots of | ||
62 | subtrees that are not in the proper prefix. These dentries, as | ||
63 | well as the proper prefix, need to be released at unmount time. As | ||
64 | these dentries will not be hashed, they are linked together on the | ||
65 | d_hash list_head. | ||
66 | |||
67 | c/ Helper routines to allocate anonymous dentries, and to help attach | ||
68 | loose directory dentries at lookup time. They are: | ||
69 | d_alloc_anon(inode) will return a dentry for the given inode. | ||
70 | If the inode already has a dentry, one of those is returned. | ||
71 | If it doesn't, a new anonymous (IS_ROOT and | ||
72 | DCACHE_DISCONNECTED) dentry is allocated and attached. | ||
73 | In the case of a directory, care is taken that only one dentry | ||
74 | can ever be attached. | ||
75 | d_splice_alias(inode, dentry) will make sure that there is a | ||
76 | dentry with the same name and parent as the given dentry, and | ||
77 | which refers to the given inode. | ||
78 | If the inode is a directory and already has a dentry, then that | ||
79 | dentry is d_moved over the given dentry. | ||
80 | If the passed dentry gets attached, care is taken that this is | ||
81 | mutually exclusive to a d_alloc_anon operation. | ||
82 | If the passed dentry is used, NULL is returned, else the used | ||
83 | dentry is returned. This corresponds to the calling pattern of | ||
84 | ->lookup. | ||
85 | |||
86 | |||
87 | Filesystem Issues | ||
88 | ----------------- | ||
89 | |||
90 | For a filesystem to be exportable it must: | ||
91 | |||
92 | 1/ provide the filehandle fragment routines described below. | ||
93 | 2/ make sure that d_splice_alias is used rather than d_add | ||
94 | when ->lookup finds an inode for a given parent and name. | ||
95 | Typically the ->lookup routine will end with a: | ||
96 | |||
97 | return d_splice_alias(inode, dentry); | ||
98 | } | ||
99 | |||
100 | |||
101 | |||
102 | A file system implementation declares that instances of the filesystem | ||
103 | are exportable by setting the s_export_op field in the struct | ||
104 | super_block. This field must point to a "struct export_operations" | ||
105 | struct which has the following members: | ||
106 | |||
107 | encode_fh (optional) | ||
108 | Takes a dentry and creates a filehandle fragment which can later be used | ||
109 | to find or create a dentry for the same object. The default | ||
110 | implementation creates a filehandle fragment that encodes a 32bit inode | ||
111 | and generation number for the inode encoded, and if necessary the | ||
112 | same information for the parent. | ||
113 | |||
114 | fh_to_dentry (mandatory) | ||
115 | Given a filehandle fragment, this should find the implied object and | ||
116 | create a dentry for it (possibly with d_alloc_anon). | ||
117 | |||
118 | fh_to_parent (optional but strongly recommended) | ||
119 | Given a filehandle fragment, this should find the parent of the | ||
120 | implied object and create a dentry for it (possibly with d_alloc_anon). | ||
121 | May fail if the filehandle fragment is too small. | ||
122 | |||
123 | get_parent (optional but strongly recommended) | ||
124 | When given a dentry for a directory, this should return a dentry for | ||
125 | the parent. Quite possibly the parent dentry will have been allocated | ||
126 | by d_alloc_anon. The default get_parent function just returns an error | ||
127 | so any filehandle lookup that requires finding a parent will fail. | ||
128 | ->lookup("..") is *not* used as a default as it can leave ".." entries | ||
129 | in the dcache which are too messy to work with. | ||
130 | |||
131 | get_name (optional) | ||
132 | When given a parent dentry and a child dentry, this should find a name | ||
133 | in the directory identified by the parent dentry, which leads to the | ||
134 | object identified by the child dentry. If no get_name function is | ||
135 | supplied, a default implementation is provided which uses vfs_readdir | ||
136 | to find potential names, and matches inode numbers to find the correct | ||
137 | match. | ||
138 | |||
139 | |||
140 | A filehandle fragment consists of an array of 1 or more 4byte words, | ||
141 | together with a one byte "type". | ||
142 | The decode_fh routine should not depend on the stated size that is | ||
143 | passed to it. This size may be larger than the original filehandle | ||
144 | generated by encode_fh, in which case it will have been padded with | ||
145 | nuls. Rather, the encode_fh routine should choose a "type" which | ||
146 | indicates to decode_fh how much of the filehandle is valid, and how | ||
147 | it should be interpreted. | ||
diff --git a/Documentation/filesystems/nfs/nfs-rdma.txt b/Documentation/filesystems/nfs/nfs-rdma.txt
new file mode 100644
index 000000000000..e386f7e4bcee
--- /dev/null
+++ b/Documentation/filesystems/nfs/nfs-rdma.txt
@@ -0,0 +1,271 @@
1 | ################################################################################ | ||
2 | # # | ||
3 | # NFS/RDMA README # | ||
4 | # # | ||
5 | ################################################################################ | ||
6 | |||
7 | Author: NetApp and Open Grid Computing | ||
8 | Date: May 29, 2008 | ||
9 | |||
10 | Table of Contents | ||
11 | ~~~~~~~~~~~~~~~~~ | ||
12 | - Overview | ||
13 | - Getting Help | ||
14 | - Installation | ||
15 | - Check RDMA and NFS Setup | ||
16 | - NFS/RDMA Setup | ||
17 | |||
18 | Overview | ||
19 | ~~~~~~~~ | ||
20 | |||
21 | This document describes how to install and set up the Linux NFS/RDMA client | ||
22 | and server software. | ||
23 | |||
24 | The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server | ||
25 | was first included in the following release, Linux 2.6.25. | ||
26 | |||
27 | In our testing, we have obtained excellent performance results (full 10Gbit | ||
28 | wire bandwidth at minimal client CPU) under many workloads. The code passes | ||
29 | the full Connectathon test suite and operates over both Infiniband and iWARP | ||
30 | RDMA adapters. | ||
31 | |||
32 | Getting Help | ||
33 | ~~~~~~~~~~~~ | ||
34 | |||
35 | If you get stuck, you can ask questions on the | ||
36 | |||
37 | nfs-rdma-devel@lists.sourceforge.net | ||
38 | |||
39 | mailing list. | ||
40 | |||
41 | Installation | ||
42 | ~~~~~~~~~~~~ | ||
43 | |||
44 | These instructions are a step by step guide to building a machine for | ||
45 | use with NFS/RDMA. | ||
46 | |||
47 | - Install an RDMA device | ||
48 | |||
49 | Any device supported by the drivers in drivers/infiniband/hw is acceptable. | ||
50 | |||
51 | Testing has been performed using several Mellanox-based IB cards, the | ||
52 | Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter. | ||
53 | |||
54 | - Install a Linux distribution and tools | ||
55 | |||
56 | The first kernel release to contain both the NFS/RDMA client and server was | ||
57 | Linux 2.6.25. Therefore, a distribution compatible with this and subsequent | ||
58 | Linux kernel releases should be installed. | ||
59 | |||
60 | The procedures described in this document have been tested with | ||
61 | distributions from Red Hat's Fedora Project (http://fedora.redhat.com/). | ||
62 | |||
63 | - Install nfs-utils-1.1.2 or greater on the client | ||
64 | |||
65 | An NFS/RDMA mount point can be obtained by using the mount.nfs command in | ||
66 | nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils | ||
67 | version with support for NFS/RDMA mounts, but for various reasons we | ||
68 | recommend using nfs-utils-1.1.2 or greater). To see which version of | ||
69 | mount.nfs you are using, type: | ||
70 | |||
71 | $ /sbin/mount.nfs -V | ||
72 | |||
73 | If the version is less than 1.1.2 or the command does not exist, | ||
74 | you should install the latest version of nfs-utils. | ||
75 | |||
76 | Download the latest package from: | ||
77 | |||
78 | http://www.kernel.org/pub/linux/utils/nfs | ||
79 | |||
80 | Uncompress the package and follow the installation instructions. | ||
81 | |||
82 | If you will not need the idmapper and gssd executables (you do not need | ||
83 | these to create an NFS/RDMA enabled mount command), the installation | ||
84 | process can be simplified by disabling these features when running | ||
85 | configure: | ||
86 | |||
87 | $ ./configure --disable-gss --disable-nfsv4 | ||
88 | |||
89 | To build nfs-utils you will need the tcp_wrappers package installed. For | ||
90 | more information on this see the package's README and INSTALL files. | ||
91 | |||
92 | After building the nfs-utils package, there will be a mount.nfs binary in | ||
93 | the utils/mount directory. This binary can be used to initiate NFS v2, v3, | ||
94 | or v4 mounts. To initiate a v4 mount, the binary must be called | ||
95 | mount.nfs4. The standard technique is to create a symlink called | ||
96 | mount.nfs4 to mount.nfs. | ||
97 | |||
98 | This mount.nfs binary should be installed at /sbin/mount.nfs as follows: | ||
99 | |||
100 | $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs | ||
101 | |||
102 | In this location, mount.nfs will be invoked automatically for NFS mounts | ||
103 | by the system mount command. | ||
104 | |||
105 | NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed | ||
106 | on the NFS client machine. You do not need this specific version of | ||
107 | nfs-utils on the server. Furthermore, only the mount.nfs command from | ||
108 | nfs-utils-1.1.2 is needed on the client. | ||
109 | |||
110 | - Install a Linux kernel with NFS/RDMA | ||
111 | |||
112 | The NFS/RDMA client and server are both included in the mainline Linux | ||
113 | kernel version 2.6.25 and later. This and other versions of the 2.6 Linux | ||
114 | kernel can be found at: | ||
115 | |||
116 | ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ | ||
117 | |||
118 | Download the sources and place them in an appropriate location. | ||
119 | |||
120 | - Configure the RDMA stack | ||
121 | |||
122 | Make sure your kernel configuration has RDMA support enabled. Under | ||
123 | Device Drivers -> InfiniBand support, update the kernel configuration | ||
124 | to enable InfiniBand support [NOTE: the option name is misleading. Enabling | ||
125 | InfiniBand support is required for all RDMA devices (IB, iWARP, etc.)]. | ||
126 | |||
127 | Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or | ||
128 | iWARP adapter support (amso, cxgb3, etc.). | ||
129 | |||
130 | If you are using InfiniBand, be sure to enable IP-over-InfiniBand support. | ||
131 | |||
132 | - Configure the NFS client and server | ||
133 | |||
134 | Your kernel configuration must also have NFS file system support and/or | ||
135 | NFS server support enabled. These and other NFS related configuration | ||
136 | options can be found under File Systems -> Network File Systems. | ||
137 | |||
138 | - Build, install, reboot | ||
139 | |||
140 | The NFS/RDMA code will be enabled automatically if NFS and RDMA | ||
141 | are turned on. The NFS/RDMA client and server are configured via the hidden | ||
142 | SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The | ||
143 | value of SUNRPC_XPRT_RDMA will be: | ||
144 | |||
145 | - N if either SUNRPC or INFINIBAND is N; in this case the NFS/RDMA client | ||
146 | and server will not be built | ||
147 | - M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M; | ||
148 | in this case the NFS/RDMA client and server will be built as modules | ||
149 | - Y if both SUNRPC and INFINIBAND are Y; in this case the NFS/RDMA client | ||
150 | and server will be built into the kernel | ||
151 | |||
152 | Therefore, if you have followed the steps above and turned on NFS and RDMA, | ||
153 | the NFS/RDMA client and server will be built. | ||
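The three rules above can be sketched as a small shell function. This is only an illustration of the stated dependency logic, not part of the kernel build system; the function name xprt_rdma_value is hypothetical, and it assumes each argument is one of y, m, or n (upper or lower case).

```shell
#!/bin/sh
# Hypothetical helper: derive the SUNRPC_XPRT_RDMA value from the
# SUNRPC and INFINIBAND settings, following the rules listed above.
# Usage: xprt_rdma_value <SUNRPC value> <INFINIBAND value>
xprt_rdma_value() {
    # Normalise to lower case so both "Y" and "y" are accepted.
    pair=$(printf '%s%s' "$1" "$2" | tr 'YMN' 'ymn')
    case "$pair" in
        yy)       echo y ;;   # both built in
        ym|my|mm) echo m ;;   # both on, at least one is a module
        *)        echo n ;;   # either is off (or the input is not y/m/n)
    esac
}
```

For example, "xprt_rdma_value y m" reports that the transport would be built as a module.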
154 | |||
155 | Build a new kernel, install it, boot it. | ||
156 | |||
157 | Check RDMA and NFS Setup | ||
158 | ~~~~~~~~~~~~~~~~~~~~~~~~ | ||
159 | |||
160 | Before configuring the NFS/RDMA software, it is a good idea to test | ||
161 | your new kernel to ensure that the kernel is working correctly. | ||
162 | In particular, it is a good idea to verify that the RDMA stack | ||
163 | is functioning as expected and standard NFS over TCP/IP and/or UDP/IP | ||
164 | is working properly. | ||
165 | |||
166 | - Check RDMA Setup | ||
167 | |||
168 | If you built the RDMA components as modules, load them at | ||
169 | this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel | ||
170 | card: | ||
171 | |||
172 | $ modprobe ib_mthca | ||
173 | $ modprobe ib_ipoib | ||
174 | |||
175 | If you are using InfiniBand, make sure there is a Subnet Manager (SM) | ||
176 | running on the network. If your IB switch has an embedded SM, you can | ||
177 | use it. Otherwise, you will need to run an SM, such as OpenSM, on one | ||
178 | of your end nodes. | ||
179 | |||
180 | If an SM is running on your network, you should see the following: | ||
181 | |||
182 | $ cat /sys/class/infiniband/driverX/ports/1/state | ||
183 | 4: ACTIVE | ||
184 | |||
185 | where driverX is mthca0, ipath5, ehca3, etc. | ||
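The per-port check above can be repeated for every device and port with a short loop. The following is a minimal sketch, assuming the sysfs layout shown above; the function name ib_port_states and its optional root-directory argument are hypothetical and exist only to make the example easy to try against a test directory.

```shell
#!/bin/sh
# Hypothetical helper: print the state of every InfiniBand port found
# under a sysfs root and return non-zero unless at least one port was
# found and all ports report "4: ACTIVE".
# Usage: ib_port_states [sysfs root, default /sys/class/infiniband]
ib_port_states() {
    root=${1:-/sys/class/infiniband}
    found=0
    rc=0
    for f in "$root"/*/ports/*/state; do
        [ -r "$f" ] || continue        # the glob matched nothing
        found=1
        state=$(cat "$f")
        port=${f#"$root"/}             # e.g. mthca0/ports/1/state
        port=${port%/state}            # e.g. mthca0/ports/1
        echo "$port: $state"
        case "$state" in
            "4: ACTIVE") ;;
            *) rc=1 ;;
        esac
    done
    [ "$found" -eq 1 ] || return 1
    return $rc
}
```

A non-zero return may mean that no Subnet Manager is running, or simply that no RDMA device was found under the given directory.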
186 | |||
187 | To further test the InfiniBand software stack, use IPoIB (this | ||
188 | assumes you have two IB hosts named host1 and host2): | ||
189 | |||
190 | host1$ ifconfig ib0 a.b.c.x | ||
191 | host2$ ifconfig ib0 a.b.c.y | ||
192 | host1$ ping a.b.c.y | ||
193 | host2$ ping a.b.c.x | ||
194 | |||
195 | For other device types, follow the appropriate procedures. | ||
196 | |||
197 | - Check NFS Setup | ||
198 | |||
199 | For the NFS components enabled above (client and/or server), | ||
200 | test their functionality over standard Ethernet using TCP/IP or UDP/IP. | ||
201 | |||
202 | NFS/RDMA Setup | ||
203 | ~~~~~~~~~~~~~~ | ||
204 | |||
205 | We recommend that you use two machines, one to act as the client and | ||
206 | one to act as the server. | ||
207 | |||
208 | One time configuration: | ||
209 | |||
210 | - On the server system, configure the /etc/exports file and | ||
211 | start the NFS/RDMA server. | ||
212 | |||
213 | Exports entries with the following formats have been tested: | ||
214 | |||
215 | /vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash) | ||
216 | /vol0 192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash) | ||
217 | |||
218 | The IP address(es) is(are) the client's IPoIB address for an InfiniBand | ||
219 | HCA or the client's iWARP address(es) for an RNIC. | ||
220 | |||
221 | NOTE: The "insecure" option must be used because the NFS/RDMA client does | ||
222 | not use a reserved port. | ||
223 | |||
224 | Each time a machine boots: | ||
225 | |||
226 | - Load and configure the RDMA drivers | ||
227 | |||
228 | For InfiniBand using a Mellanox adapter: | ||
229 | |||
230 | $ modprobe ib_mthca | ||
231 | $ modprobe ib_ipoib | ||
232 | $ ifconfig ib0 a.b.c.d | ||
233 | |||
234 | NOTE: use unique addresses for the client and server | ||
235 | |||
236 | - Start the NFS server | ||
237 | |||
238 | If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in | ||
239 | kernel config), load the RDMA transport module: | ||
240 | |||
241 | $ modprobe svcrdma | ||
242 | |||
243 | Regardless of how the server was built (module or built-in), start the | ||
244 | server: | ||
245 | |||
246 | $ /etc/init.d/nfs start | ||
247 | |||
248 | or | ||
249 | |||
250 | $ service nfs start | ||
251 | |||
252 | Instruct the server to listen on the RDMA transport: | ||
253 | |||
254 | $ echo rdma 20049 > /proc/fs/nfsd/portlist | ||
255 | |||
256 | - On the client system | ||
257 | |||
258 | If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in | ||
259 | kernel config), load the RDMA client module: | ||
260 | |||
261 | $ modprobe xprtrdma | ||
262 | |||
263 | Regardless of how the client was built (module or built-in), use this | ||
264 | command to mount the NFS/RDMA server: | ||
265 | |||
266 | $ mount -o rdma,port=20049 <IPoIB-server-name-or-address>:/<export> /mnt | ||
267 | |||
268 | To verify that the mount is using RDMA, run "cat /proc/mounts" and check | ||
269 | the "proto" field for the given mount. | ||
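The check of the "proto" field can also be scripted. The sketch below assumes the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options) and that the transport appears as a "proto=" option; the function name mount_proto and its optional second argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "proto=" mount option for
# a given mount point, or return non-zero if the mount point is not
# listed or has no "proto=" option.
# Usage: mount_proto <mount point> [mounts file, default /proc/mounts]
mount_proto() {
    mountpoint=$1
    mounts=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '
        $2 == mp {
            # Field 4 holds the comma-separated mount options.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^proto=/) {
                    proto = opts[i]
                    sub(/^proto=/, "", proto)
                }
        }
        END { if (proto != "") print proto; else exit 1 }
    ' "$mounts"
}
```

With the mount command shown above, "mount_proto /mnt" would be expected to print "rdma".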
270 | |||
271 | Congratulations! You're using NFS/RDMA! | ||
diff --git a/Documentation/filesystems/nfs/nfs.txt b/Documentation/filesystems/nfs/nfs.txt
new file mode 100644
index 000000000000..f50f26ce6cd0
--- /dev/null
+++ b/Documentation/filesystems/nfs/nfs.txt
@@ -0,0 +1,98 @@
1 | |||
2 | The NFS client | ||
3 | ============== | ||
4 | |||
5 | The NFS version 2 protocol was first documented in RFC1094 (March 1989). | ||
6 | Since then two more major releases of NFS have been published, with NFSv3 | ||
7 | being documented in RFC1813 (June 1995), and NFSv4 in RFC3530 (April | ||
8 | 2003). | ||
9 | |||
10 | The Linux NFS client currently supports all the above published versions, | ||
11 | and work is in progress on adding support for minor version 1 of the NFSv4 | ||
12 | protocol. | ||
13 | |||
14 | The purpose of this document is to provide information on some of the | ||
15 | upcall interfaces that are used in order to provide the NFS client with | ||
16 | some of the information that it requires in order to fully comply with | ||
17 | the NFS spec. | ||
18 | |||
19 | The DNS resolver | ||
20 | ================ | ||
21 | |||
22 | NFSv4 allows for one server to refer the NFS client to data that has been | ||
23 | migrated onto another server by means of the special "fs_locations" | ||
24 | attribute. See | ||
25 | http://tools.ietf.org/html/rfc3530#section-6 | ||
26 | and | ||
27 | http://tools.ietf.org/html/draft-ietf-nfsv4-referrals-00 | ||
28 | |||
29 | The fs_locations information can take the form of either an ip address and | ||
30 | a path, or a DNS hostname and a path. The latter requires the NFS client to | ||
31 | do a DNS lookup in order to mount the new volume, and hence the need for an | ||
32 | upcall to allow userland to provide this service. | ||
33 | |||
34 | Assuming that the user has the 'rpc_pipefs' filesystem mounted in the usual | ||
35 | /var/lib/nfs/rpc_pipefs, the upcall consists of the following steps: | ||
36 | |||
37 | (1) The process checks the dns_resolve cache to see if it contains a | ||
38 | valid entry. If so, it returns that entry and exits. | ||
39 | |||
40 | (2) If no valid entry exists, the helper script '/sbin/nfs_cache_getent' | ||
41 | (may be changed using the 'nfs.cache_getent' kernel boot parameter) | ||
42 | is run, with two arguments: | ||
43 | - the cache name, "dns_resolve" | ||
44 | - the hostname to resolve | ||
45 | |||
46 | (3) After looking up the corresponding ip address, the helper script | ||
47 | writes the result into the rpc_pipefs pseudo-file | ||
48 | '/var/lib/nfs/rpc_pipefs/cache/dns_resolve/channel' | ||
49 | in the following (text) format: | ||
50 | |||
51 | "<ip address> <hostname> <ttl>\n" | ||
52 | |||
53 | Where <ip address> is in the usual IPv4 (123.45.67.89) or IPv6 | ||
54 | (ffee:ddcc:bbaa:9988:7766:5544:3322:1100, ffee::1100, ...) format. | ||
55 | <hostname> is identical to the second argument of the helper | ||
56 | script, and <ttl> is the 'time to live' of this cache entry (in | ||
57 | units of seconds). | ||
58 | |||
59 | Note: If <ip address> is invalid, say the string "0", then a negative | ||
60 | entry is created, which will cause the kernel to treat the hostname | ||
61 | as having no valid DNS translation. | ||
62 | |||
63 | |||
64 | |||
65 | |||
66 | A basic sample /sbin/nfs_cache_getent | ||
67 | ===================================== | ||
68 | |||
69 | #!/bin/bash | ||
70 | # | ||
71 | ttl=600 | ||
72 | # | ||
73 | cut=/usr/bin/cut | ||
74 | getent=/usr/bin/getent | ||
75 | rpc_pipefs=/var/lib/nfs/rpc_pipefs | ||
76 | # | ||
77 | die() | ||
78 | { | ||
79 | echo "Usage: $0 cache_name entry_name" | ||
80 | exit 1 | ||
81 | } | ||
82 | |||
83 | [ $# -lt 2 ] && die | ||
84 | cachename="$1" | ||
85 | cache_path=${rpc_pipefs}/cache/${cachename}/channel | ||
86 | |||
87 | case "${cachename}" in | ||
88 | dns_resolve) | ||
89 | name="$2" | ||
90 | result="$(${getent} hosts ${name} | ${cut} -f1 -d\ )" | ||
91 | [ -z "${result}" ] && result="0" | ||
92 | ;; | ||
93 | *) | ||
94 | die | ||
95 | ;; | ||
96 | esac | ||
97 | echo "${result} ${name} ${ttl}" >${cache_path} | ||
98 | |||
diff --git a/Documentation/filesystems/nfs/nfs41-server.txt b/Documentation/filesystems/nfs/nfs41-server.txt
new file mode 100644
index 000000000000..1bd0d0c05171
--- /dev/null
+++ b/Documentation/filesystems/nfs/nfs41-server.txt
@@ -0,0 +1,222 @@
1 | NFSv4.1 Server Implementation | ||
2 | |||
3 | Server support for minorversion 1 can be controlled using the | ||
4 | /proc/fs/nfsd/versions control file. The string output returned | ||
5 | by reading this file will contain either "+4.1" or "-4.1" | ||
6 | correspondingly. | ||
7 | |||
8 | Currently, server support for minorversion 1 is disabled by default. | ||
9 | It can be enabled at run time by writing the string "+4.1" to | ||
10 | the /proc/fs/nfsd/versions control file. Note that to write this | ||
11 | control file, the nfsd service must be taken down. Use your user-mode | ||
12 | nfs-utils to set this up; see rpc.nfsd(8). | ||
13 | |||
14 | (Warning: older servers will interpret "+4.1" and "-4.1" as "+4" and | ||
15 | "-4", respectively. Therefore, code meant to work on both new and old | ||
16 | kernels must turn 4.1 on or off *before* turning support for version 4 | ||
17 | on or off; rpc.nfsd does this correctly.) | ||
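Reading the control file to see whether minorversion 1 is enabled can be sketched as follows. This is an illustration only: the function name nfsd41_state and its optional path argument are hypothetical, and it assumes the file contains whitespace-separated tokens including either "+4.1" or "-4.1" as described above.

```shell
#!/bin/sh
# Hypothetical helper: report whether NFSv4.1 server support is enabled
# by looking for the "+4.1" or "-4.1" token in the versions control file.
# Usage: nfsd41_state [versions file, default /proc/fs/nfsd/versions]
nfsd41_state() {
    versions=${1:-/proc/fs/nfsd/versions}
    for tok in $(cat "$versions"); do
        case "$tok" in
            +4.1) echo enabled;  return 0 ;;
            -4.1) echo disabled; return 0 ;;
        esac
    done
    # Neither token present, e.g. a kernel without minorversion support.
    echo unknown
    return 1
}
```

Changing the setting is best left to rpc.nfsd, which handles the ordering issue described in the warning above.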
18 | |||
19 | The NFSv4 minorversion 1 (NFSv4.1) implementation in nfsd is based | ||
20 | on the latest NFSv4.1 Internet Draft: | ||
21 | http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29 | ||
22 | |||
23 | Of the many new features in NFSv4.1, the current implementation | ||
24 | focuses on the mandatory-to-implement NFSv4.1 Sessions, providing | ||
25 | "exactly once" semantics and better control and throttling of the | ||
26 | resources allocated for each client. | ||
27 | |||
28 | Other NFSv4.1 features, Parallel NFS operations in particular, | ||
29 | are still under development out of tree. | ||
30 | See http://wiki.linux-nfs.org/wiki/index.php/PNFS_prototype_design | ||
31 | for more information. | ||
32 | |||
33 | The current implementation is intended for developers only: while it | ||
34 | does support ordinary file operations on clients we have tested against | ||
35 | (including the linux client), it is incomplete in ways which may limit | ||
36 | features unexpectedly, cause known bugs in rare cases, or cause | ||
37 | interoperability problems with future clients. Known issues: | ||
38 | |||
39 | - gss support is questionable: currently mounts with kerberos | ||
40 | from a linux client are possible, but we aren't really | ||
41 | conformant with the spec (for example, we don't use kerberos | ||
42 | on the backchannel correctly). | ||
43 | - no trunking support: no clients currently take advantage of | ||
44 | trunking, but this is a mandatory feature, and its use is | ||
45 | recommended to clients in a number of places. (E.g. to ensure | ||
46 | timely renewal in case an existing connection's retry timeouts | ||
47 | have gotten too long; see section 8.3 of the draft.) | ||
48 | Therefore, lack of this feature may cause future clients to | ||
49 | fail. | ||
50 | - Incomplete backchannel support: incomplete backchannel gss | ||
51 | support and no support for BACKCHANNEL_CTL mean that | ||
52 | callbacks (hence delegations and layouts) may not be | ||
53 | available and clients confused by the incomplete | ||
54 | implementation may fail. | ||
55 | - Server reboot recovery is unsupported; if the server reboots, | ||
56 | clients may fail. | ||
57 | - We do not support SSV, which provides security for shared | ||
58 | client-server state (thus preventing unauthorized tampering | ||
59 | with locks and opens, for example). It is mandatory for | ||
60 | servers to support this, though no clients use it yet. | ||
61 | - Mandatory operations which we do not support, such as | ||
62 | DESTROY_CLIENTID, FREE_STATEID, SECINFO_NO_NAME, and | ||
63 | TEST_STATEID, are not currently used by clients, but will be | ||
64 | (and the spec recommends their uses in common cases), and | ||
65 | clients should not be expected to know how to recover from the | ||
66 | case where they are not supported. This will eventually cause | ||
67 | interoperability failures. | ||
68 | |||
69 | In addition, some limitations are inherited from the current NFSv4 | ||
70 | implementation: | ||
71 | |||
72 | - Incomplete delegation enforcement: if a file is renamed or | ||
73 | unlinked, a client holding a delegation may continue to | ||
74 | indefinitely allow opens of the file under the old name. | ||
75 | |||
76 | The table below, taken from the NFSv4.1 document, lists | ||
77 | the operations that are mandatory to implement (REQ), optional | ||
78 | (OPT), and NFSv4.0 operations that must not be implemented (MNI) | ||
79 | in minor version 1. The first column indicates the operations that | ||
80 | are not supported yet by the linux server implementation. | ||
81 | |||
82 | The OPTIONAL features identified and their abbreviations are as follows: | ||
83 | pNFS Parallel NFS | ||
84 | FDELG File Delegations | ||
85 | DDELG Directory Delegations | ||
86 | |||
87 | The following abbreviations indicate the linux server implementation status. | ||
88 | I Implemented NFSv4.1 operations. | ||
89 | NS Not Supported. | ||
90 | NS* unimplemented optional feature. | ||
91 | P pNFS features implemented out of tree. | ||
92 | PNS pNFS features that are not supported yet (out of tree). | ||
93 | |||
94 | Operations | ||
95 | |||
 96 |     +----------------------+------------+--------------+----------------+
 97 |     | Operation            | REQ, REC,  | Feature      | Definition     |
 98 |     |                      | OPT, or    | (REQ, REC,   |                |
 99 |     |                      | MNI        | or OPT)      |                |
100 |     +----------------------+------------+--------------+----------------+
101 |     | ACCESS               | REQ        |              | Section 18.1   |
102 | NS  | BACKCHANNEL_CTL      | REQ        |              | Section 18.33  |
103 | NS  | BIND_CONN_TO_SESSION | REQ        |              | Section 18.34  |
104 |     | CLOSE                | REQ        |              | Section 18.2   |
105 |     | COMMIT               | REQ        |              | Section 18.3   |
106 |     | CREATE               | REQ        |              | Section 18.4   |
107 | I   | CREATE_SESSION       | REQ        |              | Section 18.36  |
108 | NS* | DELEGPURGE           | OPT        | FDELG (REQ)  | Section 18.5   |
109 |     | DELEGRETURN          | OPT        | FDELG,       | Section 18.6   |
110 |     |                      |            | DDELG, pNFS  |                |
111 |     |                      |            | (REQ)        |                |
112 | NS  | DESTROY_CLIENTID     | REQ        |              | Section 18.50  |
113 | I   | DESTROY_SESSION      | REQ        |              | Section 18.37  |
114 | I   | EXCHANGE_ID          | REQ        |              | Section 18.35  |
115 | NS  | FREE_STATEID         | REQ        |              | Section 18.38  |
116 |     | GETATTR              | REQ        |              | Section 18.7   |
117 | P   | GETDEVICEINFO        | OPT        | pNFS (REQ)   | Section 18.40  |
118 | P   | GETDEVICELIST        | OPT        | pNFS (OPT)   | Section 18.41  |
119 |     | GETFH                | REQ        |              | Section 18.8   |
120 | NS* | GET_DIR_DELEGATION   | OPT        | DDELG (REQ)  | Section 18.39  |
121 | P   | LAYOUTCOMMIT         | OPT        | pNFS (REQ)   | Section 18.42  |
122 | P   | LAYOUTGET            | OPT        | pNFS (REQ)   | Section 18.43  |
123 | P   | LAYOUTRETURN         | OPT        | pNFS (REQ)   | Section 18.44  |
124 |     | LINK                 | OPT        |              | Section 18.9   |
125 |     | LOCK                 | REQ        |              | Section 18.10  |
126 |     | LOCKT                | REQ        |              | Section 18.11  |
127 |     | LOCKU                | REQ        |              | Section 18.12  |
128 |     | LOOKUP               | REQ        |              | Section 18.13  |
129 |     | LOOKUPP              | REQ        |              | Section 18.14  |
130 |     | NVERIFY              | REQ        |              | Section 18.15  |
131 |     | OPEN                 | REQ        |              | Section 18.16  |
132 | NS* | OPENATTR             | OPT        |              | Section 18.17  |
133 |     | OPEN_CONFIRM         | MNI        |              | N/A            |
134 |     | OPEN_DOWNGRADE       | REQ        |              | Section 18.18  |
135 |     | PUTFH                | REQ        |              | Section 18.19  |
136 |     | PUTPUBFH             | REQ        |              | Section 18.20  |
137 |     | PUTROOTFH            | REQ        |              | Section 18.21  |
138 |     | READ                 | REQ        |              | Section 18.22  |
139 |     | READDIR              | REQ        |              | Section 18.23  |
140 |     | READLINK             | OPT        |              | Section 18.24  |
141 | NS  | RECLAIM_COMPLETE     | REQ        |              | Section 18.51  |
142 |     | RELEASE_LOCKOWNER    | MNI        |              | N/A            |
143 |     | REMOVE               | REQ        |              | Section 18.25  |
144 |     | RENAME               | REQ        |              | Section 18.26  |
145 |     | RENEW                | MNI        |              | N/A            |
146 |     | RESTOREFH            | REQ        |              | Section 18.27  |
147 |     | SAVEFH               | REQ        |              | Section 18.28  |
148 |     | SECINFO              | REQ        |              | Section 18.29  |
149 | NS  | SECINFO_NO_NAME      | REC        | pNFS files   | Section 18.45, |
150 |     |                      |            | layout (REQ) | Section 13.12  |
151 | I   | SEQUENCE             | REQ        |              | Section 18.46  |
152 |     | SETATTR              | REQ        |              | Section 18.30  |
153 |     | SETCLIENTID          | MNI        |              | N/A            |
154 |     | SETCLIENTID_CONFIRM  | MNI        |              | N/A            |
155 | NS  | SET_SSV              | REQ        |              | Section 18.47  |
156 | NS | TEST_STATEID | REQ | | Section 18.48 | | ||
157 | | VERIFY | REQ | | Section 18.31 | | ||
158 | NS*| WANT_DELEGATION | OPT | FDELG (OPT) | Section 18.49 | | ||
159 | | WRITE | REQ | | Section 18.32 | | ||
160 | |||
161 | Callback Operations | ||
162 | |||
163 | +-------------------------+-----------+-------------+---------------+ | ||
164 | | Operation | REQ, REC, | Feature | Definition | | ||
165 | | | OPT, or | (REQ, REC, | | | ||
166 | | | MNI | or OPT) | | | ||
167 | +-------------------------+-----------+-------------+---------------+ | ||
168 | | CB_GETATTR | OPT | FDELG (REQ) | Section 20.1 | | ||
169 | P | CB_LAYOUTRECALL | OPT | pNFS (REQ) | Section 20.3 | | ||
170 | NS*| CB_NOTIFY | OPT | DDELG (REQ) | Section 20.4 | | ||
171 | P | CB_NOTIFY_DEVICEID | OPT | pNFS (OPT) | Section 20.12 | | ||
172 | NS*| CB_NOTIFY_LOCK | OPT | | Section 20.11 | | ||
173 | NS*| CB_PUSH_DELEG | OPT | FDELG (OPT) | Section 20.5 | | ||
174 | | CB_RECALL | OPT | FDELG, | Section 20.2 | | ||
175 | | | | DDELG, pNFS | | | ||
176 | | | | (REQ) | | | ||
177 | NS*| CB_RECALL_ANY | OPT | FDELG, | Section 20.6 | | ||
178 | | | | DDELG, pNFS | | | ||
179 | | | | (REQ) | | | ||
180 | NS | CB_RECALL_SLOT | REQ | | Section 20.8 | | ||
181 | NS*| CB_RECALLABLE_OBJ_AVAIL | OPT | DDELG, pNFS | Section 20.7 | | ||
182 | | | | (REQ) | | | ||
183 | I | CB_SEQUENCE | OPT | FDELG, | Section 20.9 | | ||
184 | | | | DDELG, pNFS | | | ||
185 | | | | (REQ) | | | ||
186 | NS*| CB_WANTS_CANCELLED | OPT | FDELG, | Section 20.10 | | ||
187 | | | | DDELG, pNFS | | | ||
188 | | | | (REQ) | | | ||
189 | +-------------------------+-----------+-------------+---------------+ | ||
190 | |||
191 | Implementation notes: | ||
192 | |||
193 | DELEGPURGE: | ||
194 | * mandatory only for servers that support CLAIM_DELEGATE_PREV and/or | ||
195 | CLAIM_DELEG_PREV_FH (which allows clients to keep delegations that | ||
196 | persist across client reboots). Thus we need not implement this for | ||
197 | now. | ||
198 | |||
199 | EXCHANGE_ID: | ||
200 | * only SP4_NONE state protection supported | ||
201 | * implementation ids are ignored | ||
202 | |||
203 | CREATE_SESSION: | ||
204 | * backchannel attributes are ignored | ||
205 | * backchannel security parameters are ignored | ||
206 | |||
207 | SEQUENCE: | ||
208 | * no support for dynamic slot table renegotiation (optional) | ||
209 | |||
210 | NFSv4.1 COMPOUND rules: | ||
211 | The following cases aren't supported yet: | ||
212 | * Enforcement of NFS4ERR_NOT_ONLY_OP for: BIND_CONN_TO_SESSION, CREATE_SESSION, | ||
213 | DESTROY_CLIENTID, DESTROY_SESSION, EXCHANGE_ID. | ||
214 | * DESTROY_SESSION MUST be the final operation in the COMPOUND request. | ||
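The two rules listed above can be sketched as a small validity check. This is only an illustration under stated assumptions, not the server's code: it assumes the "only operation" rule applies when the COMPOUND does not start with SEQUENCE, it ignores session ids when checking DESTROY_SESSION, and the name `check_compound` and the label `DESTROY_SESSION_NOT_FINAL` are hypothetical (the latter is not a protocol error code).

```python
# Operations the text lists for NFS4ERR_NOT_ONLY_OP enforcement.
SOLE_OPS = {
    "BIND_CONN_TO_SESSION",
    "CREATE_SESSION",
    "DESTROY_CLIENTID",
    "DESTROY_SESSION",
    "EXCHANGE_ID",
}


def check_compound(ops):
    """Return None if the list of operation names passes both rules,
    otherwise a string naming the violated rule (simplified model)."""
    if not ops:
        return None
    if ops[0] != "SEQUENCE":
        # Assumption: without a leading SEQUENCE, a listed operation
        # must be the only operation in the COMPOUND.
        if len(ops) > 1 and any(op in SOLE_OPS for op in ops):
            return "NFS4ERR_NOT_ONLY_OP"
        return None
    # Simplification: session ids are not compared here.
    if "DESTROY_SESSION" in ops and ops[-1] != "DESTROY_SESSION":
        return "DESTROY_SESSION_NOT_FINAL"  # hypothetical label
    return None
```

For example, `check_compound(["EXCHANGE_ID", "PUTROOTFH"])` is rejected under the first rule, while `check_compound(["SEQUENCE", "DESTROY_SESSION"])` passes.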
215 | |||
216 | Nonstandard compound limitations: | ||
217 | * No support for a session's fore channel RPC compound that requires both a | ||
218 | ca_maxrequestsize request and a ca_maxresponsesize reply, so we may | ||
219 | fail to live up to the promise we made in CREATE_SESSION fore channel | ||
220 | negotiation. | ||
221 | * No more than one IO operation (read, write, readdir) allowed per | ||
222 | compound. | ||
diff --git a/Documentation/filesystems/nfs/nfsroot.txt b/Documentation/filesystems/nfs/nfsroot.txt new file mode 100644 index 000000000000..3ba0b945aaf8 --- /dev/null +++ b/Documentation/filesystems/nfs/nfsroot.txt | |||
@@ -0,0 +1,270 @@ | |||
1 | Mounting the root filesystem via NFS (nfsroot) | ||
2 | =============================================== | ||
3 | |||
4 | Written 1996 by Gero Kuhlmann <gero@gkminix.han.de> | ||
5 | Updated 1997 by Martin Mares <mj@atrey.karlin.mff.cuni.cz> | ||
6 | Updated 2006 by Nico Schottelius <nico-kernel-nfsroot@schottelius.org> | ||
7 | Updated 2006 by Horms <horms@verge.net.au> | ||
8 | |||
9 | |||
10 | |||
11 | In order to use a diskless system, such as an X-terminal or printer server, | ||
12 | it is necessary for the root filesystem to be present on a | ||
13 | non-disk device. This may be an initramfs (see Documentation/filesystems/ | ||
14 | ramfs-rootfs-initramfs.txt), a ramdisk (see Documentation/initrd.txt) or a | ||
15 | filesystem mounted via NFS. The following text describes how to use NFS | ||
16 | for the root filesystem. For the rest of this text 'client' means the | ||
17 | diskless system, and 'server' means the NFS server. | ||
18 | |||
19 | |||
20 | |||
21 | |||
22 | 1.) Enabling nfsroot capabilities | ||
23 | ----------------------------- | ||
24 | |||
25 | In order to use nfsroot, NFS client support needs to be selected as | ||
26 | built-in during configuration. Once this has been selected, the nfsroot | ||
27 | option will become available, which should also be selected. | ||
28 | |||
29 | In the networking options, kernel level autoconfiguration can be selected, | ||
30 | along with the types of autoconfiguration to support. Selecting all of | ||
31 | DHCP, BOOTP and RARP is safe. | ||
32 | |||
33 | |||
34 | |||
35 | |||
36 | 2.) Kernel command line | ||
37 | ------------------- | ||
38 | |||
39 | When the kernel has been loaded by a boot loader (see below) it needs to be | ||
40 | told what root fs device to use and, in the case of nfsroot, where to find | ||
41 | both the server and the name of the directory on the server to mount as root. | ||
42 | This can be established using the following kernel command line parameters: | ||
43 | |||
44 | |||
45 | root=/dev/nfs | ||
46 | |||
47 | This is necessary to enable the pseudo-NFS-device. Note that it's not a | ||
48 | real device but just a synonym to tell the kernel to use NFS instead of | ||
49 | a real device. | ||
50 | |||
51 | |||
52 | nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>] | ||
53 | |||
54 | If the `nfsroot' parameter is NOT given on the command line, | ||
55 | the default "/tftpboot/%s" will be used. | ||
56 | |||
57 | <server-ip> Specifies the IP address of the NFS server. | ||
58 | The default address is determined by the `ip' parameter | ||
59 | (see below). This parameter allows the use of different | ||
60 | servers for IP autoconfiguration and NFS. | ||
61 | |||
62 | <root-dir> Name of the directory on the server to mount as root. | ||
63 | If there is a "%s" token in the string, it will be | ||
64 | replaced by the ASCII-representation of the client's | ||
65 | IP address. | ||
66 | |||
67 | <nfs-options> Standard NFS options. All options are separated by commas. | ||
68 | The following defaults are used: | ||
69 | port = as given by server portmap daemon | ||
70 | rsize = 4096 | ||
71 | wsize = 4096 | ||
72 | timeo = 7 | ||
73 | retrans = 3 | ||
74 | acregmin = 3 | ||
75 | acregmax = 60 | ||
76 | acdirmin = 30 | ||
77 | acdirmax = 60 | ||
78 | flags = hard, nointr, noposix, cto, ac | ||
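The `nfsroot=` syntax described above can be illustrated with a short parser. This is a minimal sketch rather than the kernel's parsing code: the function name `parse_nfsroot` is hypothetical, the addresses in the usage note are made up, and it assumes only the first "%s" token is expanded.

```python
DEFAULT_NFSROOT = "/tftpboot/%s"  # default stated in the text above


def parse_nfsroot(value, client_ip):
    """Split an nfsroot= value into (server_ip, root_dir, options).

    server_ip is None when the optional "<server-ip>:" prefix is absent,
    in which case the address comes from the `ip' parameter instead.
    """
    if not value:
        value = DEFAULT_NFSROOT
    # Everything after the first comma is the NFS option list.
    path_part, _, opt_part = value.partition(",")
    server_ip = None
    if ":" in path_part:
        server_ip, _, path_part = path_part.partition(":")
    # Expand the "%s" token with the client's IP address in ASCII form.
    root_dir = path_part.replace("%s", client_ip, 1)
    options = [opt for opt in opt_part.split(",") if opt]
    return server_ip, root_dir, options
```

With `client_ip="192.168.1.20"`, the value `192.168.1.1:/export/%s,rsize=8192` yields the server `192.168.1.1`, the directory `/export/192.168.1.20`, and the single option `rsize=8192`.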
79 | |||
80 | |||
81 | ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf> | ||
82 | |||
83 | This parameter tells the kernel how to configure IP addresses of devices | ||
84 | and also how to set up the IP routing table. It was originally called | ||
85 | `nfsaddrs', but now the boot-time IP configuration works independently of | ||
86 | NFS, so it was renamed to `ip' and the old name remained as an alias for | ||
87 | compatibility reasons. | ||
88 | |||
89 | If this parameter is missing from the kernel command line, all fields are | ||
90 | assumed to be empty, and the defaults mentioned below apply. In general | ||
91 | this means that the kernel tries to configure everything using | ||
92 | autoconfiguration. | ||
93 | |||
94 | The <autoconf> parameter can appear alone as the value to the `ip' | ||
95 | parameter (without all the ':' characters before). If the value is | ||
96 | "ip=off" or "ip=none", no autoconfiguration will take place; otherwise | ||
97 | autoconfiguration will take place. The most common way to use this | ||
98 | is "ip=dhcp". | ||
99 | |||
100 | <client-ip> IP address of the client. | ||
101 | |||
102 | Default: Determined using autoconfiguration. | ||
103 | |||
104 | <server-ip> IP address of the NFS server. If RARP is used to determine | ||
105 | the client address and this parameter is NOT empty only | ||
106 | replies from the specified server are accepted. | ||
107 | |||
108 | Only required for NFS root. That is, autoconfiguration | ||
109 | will not be triggered if it is missing and NFS root is not | ||
110 | in operation. | ||
111 | |||
112 | Default: Determined using autoconfiguration. | ||
113 | The address of the autoconfiguration server is used. | ||
114 | |||
115 | <gw-ip> IP address of a gateway if the server is on a different subnet. | ||
116 | |||
117 | Default: Determined using autoconfiguration. | ||
118 | |||
119 | <netmask> Netmask for local network interface. If unspecified | ||
120 | the netmask is derived from the client IP address assuming | ||
121 | classful addressing. | ||
122 | |||
123 | Default: Determined using autoconfiguration. | ||
124 | |||
125 | <hostname> Name of the client. May be supplied by autoconfiguration, | ||
126 | but its absence will not trigger autoconfiguration. | ||
127 | |||
128 | Default: Client IP address is used in ASCII notation. | ||
129 | |||
130 | <device> Name of network device to use. | ||
131 | |||
132 | Default: If the host only has one device, it is used. | ||
133 | Otherwise the device is determined using | ||
134 | autoconfiguration. This is done by sending | ||
135 | autoconfiguration requests out of all devices, | ||
136 | and using the device that received the first reply. | ||
137 | |||
138 | <autoconf> Method to use for autoconfiguration. In the case of options | ||
139 | which specify multiple autoconfiguration protocols, | ||
140 | requests are sent using all protocols, and the first one | ||
141 | to reply is used. | ||
142 | |||
143 | Only autoconfiguration protocols that have been compiled | ||
144 | into the kernel will be used, regardless of the value of | ||
145 | this option. | ||
146 | |||
147 | off or none: don't use autoconfiguration | ||
148 | (do static IP assignment instead) | ||
149 | on or any: use any protocol available in the kernel | ||
150 | (default) | ||
151 | dhcp: use DHCP | ||
152 | bootp: use BOOTP | ||
153 | rarp: use RARP | ||
154 | both: use both BOOTP and RARP but not DHCP | ||
155 | (old option kept for backwards compatibility) | ||
156 | |||
157 | Default: any | ||
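The field layout of the `ip=` parameter, the `<autoconf>` values, and the classful netmask fallback can be sketched as follows. This is an illustrative model of the rules described above, not the kernel's implementation; the names `parse_ip_param`, `autoconf_protocols` and `classful_netmask` are hypothetical, and empty fields are represented as `None`, meaning "use the default".

```python
FIELDS = ("client_ip", "server_ip", "gw_ip", "netmask",
          "hostname", "device", "autoconf")

# Protocols requested by each <autoconf> value, as listed in the text.
AUTOCONF_PROTOCOLS = {
    "off": (), "none": (),
    "on": ("dhcp", "bootp", "rarp"), "any": ("dhcp", "bootp", "rarp"),
    "dhcp": ("dhcp",), "bootp": ("bootp",), "rarp": ("rarp",),
    "both": ("bootp", "rarp"),
}


def parse_ip_param(value=""):
    """Parse the value of ip= into a dict keyed by field name."""
    fields = dict.fromkeys(FIELDS)
    if value in AUTOCONF_PROTOCOLS:
        # <autoconf> appearing alone, e.g. "ip=dhcp".
        fields["autoconf"] = value
        return fields
    for name, part in zip(FIELDS, value.split(":")):
        fields[name] = part or None
    return fields


def autoconf_protocols(fields):
    """Protocols to try; a missing <autoconf> field defaults to "any"."""
    return AUTOCONF_PROTOCOLS[fields["autoconf"] or "any"]


def classful_netmask(client_ip):
    """Netmask implied by the address class of a dotted-quad address."""
    first_octet = int(client_ip.split(".")[0])
    if first_octet < 128:
        return "255.0.0.0"      # class A
    if first_octet < 192:
        return "255.255.0.0"    # class B
    if first_octet < 224:
        return "255.255.255.0"  # class C
    raise ValueError("no classful netmask for " + client_ip)
```

For instance, `parse_ip_param("::::::dhcp")` leaves every address field at its default and selects DHCP, which is equivalent to the short form `parse_ip_param("dhcp")`.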
158 | |||
159 | |||
160 | |||
161 | |||
162 | 3.) Boot Loader | ||
163 | ---------- | ||
164 | |||
165 | To get the kernel into memory different approaches can be used. | ||
166 | They depend on various facilities being available: | ||
167 | |||
168 | |||
169 | 3.1) Booting from a floppy using syslinux | ||
170 | |||
171 | When building kernels, an easy way to create a boot floppy that uses | ||
172 | syslinux is to use the zdisk or bzdisk make targets which use zimage | ||
173 | and bzimage images respectively. Both targets accept the | ||
174 | FDARGS parameter which can be used to set the kernel command line. | ||
175 | |||
176 | e.g. | ||
177 | make bzdisk FDARGS="root=/dev/nfs" | ||
178 | |||
179 | Note that the user running this command will need to have | ||
180 | access to the floppy drive device, /dev/fd0 | ||
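When the command line carries more than one parameter, the whole FDARGS value has to reach make as a single shell word. The sketch below assembles such a command with shell-safe quoting; the helper name `boot_disk_command` and the parameter values in the usage note are illustrative assumptions, not part of the kernel build system.

```python
import shlex


def boot_disk_command(target, kernel_args):
    """Build a make command line passing kernel_args via FDARGS.

    target is a make target such as "bzdisk"; kernel_args is a list of
    kernel command line parameters.
    """
    fdargs = " ".join(kernel_args)
    # shlex.join quotes the FDARGS word so embedded spaces survive
    # the shell (requires Python 3.8 or later).
    return shlex.join(["make", target, "FDARGS=" + fdargs])
```

Calling `boot_disk_command("bzdisk", ["root=/dev/nfs", "ip=dhcp"])` returns a string that a shell splits back into `make`, `bzdisk`, and one `FDARGS=...` argument.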
181 | |||
182 | For more information on syslinux, including how to create bootdisks | ||
183 | for prebuilt kernels, see http://syslinux.zytor.com/ | ||
184 | |||
185 | N.B: Previously it was possible to write a kernel directly to | ||
186 | a floppy using dd, configure the boot device using rdev, and | ||
187 | boot using the resulting floppy. Linux no longer supports this | ||
188 | method of booting. | ||
189 | |||
190 | 3.2) Booting from a cdrom using isolinux | ||
191 | |||
192 | When building kernels, an easy way to create a bootable cdrom that | ||
193 | uses isolinux is to use the isoimage target which uses a bzimage | ||
194 | image. Like zdisk and bzdisk, this target accepts the FDARGS | ||
195 | parameter which can be used to set the kernel command line. | ||
196 | |||
197 | e.g. | ||
198 | make isoimage FDARGS="root=/dev/nfs" | ||
199 | |||
200 | The resulting iso image will be arch/<ARCH>/boot/image.iso | ||
201 | This can be written to a cdrom using a variety of tools including | ||
202 | cdrecord. | ||
203 | |||
204 | e.g. | ||
205 | cdrecord dev=ATAPI:1,0,0 arch/i386/boot/image.iso | ||
206 | |||
207 | For more information on isolinux, including how to create bootdisks | ||
208 | for prebuilt kernels, see http://syslinux.zytor.com/ | ||
209 | |||
210 | 3.3) Using LILO | ||
211 | When using LILO all the necessary command line parameters may be | ||
212 | specified using the 'append=' directive in the LILO configuration | ||
213 | file. | ||
214 | |||
215 | However, to use the 'root=' directive you also need to create | ||
216 | a dummy root device, which may be removed after LILO is run. | ||
217 | |||
218 | mknod /dev/boot255 c 0 255 | ||
219 | |||
220 | For information on configuring LILO, please refer to its documentation. | ||
221 | |||
222 | 3.4) Using GRUB | ||
223 | When using GRUB, kernel parameters are simply appended after the kernel | ||
224 | specification: kernel <kernel> <parameters> | ||
225 | |||
226 | 3.5) Using loadlin | ||
227 | loadlin may be used to boot Linux from a DOS command prompt without | ||
228 | requiring a local hard disk to mount as root. This has not been | ||
229 | thoroughly tested by the authors of this document, but in general | ||
230 | it should be possible to configure the kernel command line similarly | ||
231 | to the configuration of LILO. | ||
232 | |||
233 | Please refer to the loadlin documentation for further information. | ||
234 | |||
235 | 3.6) Using a boot ROM | ||
236 | This is probably the most elegant way of booting a diskless client. | ||
237 | With a boot ROM the kernel is loaded using the TFTP protocol. The | ||
238 | authors of this document are not aware of any commercial boot | ||
239 | ROMs that support booting Linux over the network. However, there | ||
240 | are two free implementations of a boot ROM, netboot-nfs and | ||
241 | etherboot, both of which are available on sunsite.unc.edu, and both | ||
242 | of which contain everything you need to boot a diskless Linux client. | ||
243 | |||
244 | 3.7) Using pxelinux | ||
245 | Pxelinux may be used to boot Linux using the PXE boot loader | ||
246 | which is present on many modern network cards. | ||
247 | |||
248 | When using pxelinux, the kernel image is specified using | ||
249 | "kernel <relative-path-below /tftpboot>". The nfsroot parameters | ||
250 | are passed to the kernel by adding them to the "append" line. | ||
251 | It is common to use a serial console in conjunction with pxelinux, | ||
252 | see Documentation/serial-console.txt for more information. | ||
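As an illustration of the "kernel" and "append" lines mentioned above, the following sketch generates a small pxelinux configuration entry. The helper name `pxelinux_entry`, the label, the kernel file name and the addresses in the usage note are assumptions made for the example; consult the pxelinux documentation for the full set of directives.

```python
def pxelinux_entry(kernel_path, kernel_args, label="linux"):
    """Return the text of a minimal pxelinux configuration entry.

    kernel_path is relative to the TFTP root (for example /tftpboot);
    kernel_args is a list of kernel command line parameters.
    """
    lines = [
        "default " + label,
        "label " + label,
        "    kernel " + kernel_path,
        "    append " + " ".join(kernel_args),
    ]
    return "\n".join(lines) + "\n"
```

A call such as `pxelinux_entry("vmlinuz", ["root=/dev/nfs", "nfsroot=192.168.1.1:/export/client1", "ip=dhcp"])` produces an entry whose append line carries the nfsroot parameters; a serial console argument could be added to the same list.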
253 | |||
254 | For more information on pxelinux, including how to create bootdisks | ||
255 | for prebuilt kernels, see http://syslinux.zytor.com/ | ||
256 | |||
257 | |||
258 | |||
259 | |||
260 | 4.) Credits | ||
261 | ------- | ||
262 | |||
263 | The nfsroot code in the kernel and the RARP support have been written | ||
264 | by Gero Kuhlmann <gero@gkminix.han.de>. | ||
265 | |||
266 | The rest of the IP layer autoconfiguration code has been written | ||
267 | by Martin Mares <mj@atrey.karlin.mff.cuni.cz>. | ||
268 | |||
269 | For his help in writing the initial version of nfsroot I would like to thank | ||
270 | Jens-Uwe Mager <jum@anubis.han.de>. | ||