diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2008-04-24 14:45:00 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2008-04-24 14:45:00 -0400 |
commit | 10c993a6b5418cb1026775765ba4c70ffb70853d (patch) | |
tree | 717deba79b938c2f3f786ff6fe908d30582f06f8 /Documentation | |
parent | c328d54cd4ad120d76284e46dcca6c6cf996154a (diff) | |
parent | ca456252db0521e5e88024fa2b67535e9739e030 (diff) |
Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux
* 'for-linus' of git://linux-nfs.org/~bfields/linux: (52 commits)
knfsd: clear both setuid and setgid whenever a chown is done
knfsd: get rid of imode variable in nfsd_setattr
SUNRPC: Use unsigned loop and array index in svc_init_buffer()
SUNRPC: Use unsigned index when looping over arrays
SUNRPC: Update RPC server's TCP record marker decoder
SUNRPC: RPC server still uses 2.4 method for disabling TCP Nagle
NLM: don't let lockd exit on unexpected svc_recv errors (try #2)
NFS: don't let nfs_callback_svc exit on unexpected svc_recv errors (try #2)
Use a zero sized array for raw field in struct fid
nfsd: use static memory for callback program and stats
SUNRPC: remove svc_create_thread()
nfsd: fix comment
lockd: Fix stale nlmsvc_unlink_block comment
NFSD: Strip __KERNEL__ testing from unexported header files.
sunrpc: make token header values less confusing
gss_krb5: consistently use unsigned for seqnum
NFSD: Remove NFSv4 dependency on NFSv3
SUNRPC: Remove PROC_FS dependency
NFSD: Use "depends on" for PROC_FS dependency
nfsd: move most of fh_verify to separate function
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/filesystems/nfs-rdma.txt | 252 |
1 files changed, 252 insertions, 0 deletions
diff --git a/Documentation/filesystems/nfs-rdma.txt b/Documentation/filesystems/nfs-rdma.txt new file mode 100644 index 000000000000..1ae34879574b --- /dev/null +++ b/Documentation/filesystems/nfs-rdma.txt | |||
@@ -0,0 +1,252 @@ | |||
1 | ################################################################################ | ||
2 | # # | ||
3 | # NFS/RDMA README # | ||
4 | # # | ||
5 | ################################################################################ | ||
6 | |||
7 | Author: NetApp and Open Grid Computing | ||
8 | Date: February 25, 2008 | ||
9 | |||
10 | Table of Contents | ||
11 | ~~~~~~~~~~~~~~~~~ | ||
12 | - Overview | ||
13 | - Getting Help | ||
14 | - Installation | ||
15 | - Check RDMA and NFS Setup | ||
16 | - NFS/RDMA Setup | ||
17 | |||
18 | Overview | ||
19 | ~~~~~~~~ | ||
20 | |||
21 | This document describes how to install and setup the Linux NFS/RDMA client | ||
22 | and server software. | ||
23 | |||
24 | The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server | ||
25 | was first included in the following release, Linux 2.6.25. | ||
26 | |||
27 | In our testing, we have obtained excellent performance results (full 10Gbit | ||
28 | wire bandwidth at minimal client CPU) under many workloads. The code passes | ||
29 | the full Connectathon test suite and operates over both Infiniband and iWARP | ||
30 | RDMA adapters. | ||
31 | |||
32 | Getting Help | ||
33 | ~~~~~~~~~~~~ | ||
34 | |||
35 | If you get stuck, you can ask questions on the | ||
36 | |||
37 | nfs-rdma-devel@lists.sourceforge.net | ||
38 | |||
39 | mailing list. | ||
40 | |||
41 | Installation | ||
42 | ~~~~~~~~~~~~ | ||
43 | |||
44 | These instructions are a step by step guide to building a machine for | ||
45 | use with NFS/RDMA. | ||
46 | |||
47 | - Install an RDMA device | ||
48 | |||
49 | Any device supported by the drivers in drivers/infiniband/hw is acceptable. | ||
50 | |||
51 | Testing has been performed using several Mellanox-based IB cards, the | ||
52 | Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter. | ||
53 | |||
54 | - Install a Linux distribution and tools | ||
55 | |||
56 | The first kernel release to contain both the NFS/RDMA client and server was | ||
57 | Linux 2.6.25 Therefore, a distribution compatible with this and subsequent | ||
58 | Linux kernel release should be installed. | ||
59 | |||
60 | The procedures described in this document have been tested with | ||
61 | distributions from Red Hat's Fedora Project (http://fedora.redhat.com/). | ||
62 | |||
63 | - Install nfs-utils-1.1.1 or greater on the client | ||
64 | |||
65 | An NFS/RDMA mount point can only be obtained by using the mount.nfs | ||
66 | command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs | ||
67 | you are using, type: | ||
68 | |||
69 | > /sbin/mount.nfs -V | ||
70 | |||
71 | If the version is less than 1.1.1 or the command does not exist, | ||
72 | then you will need to install the latest version of nfs-utils. | ||
73 | |||
74 | Download the latest package from: | ||
75 | |||
76 | http://www.kernel.org/pub/linux/utils/nfs | ||
77 | |||
78 | Uncompress the package and follow the installation instructions. | ||
79 | |||
80 | If you will not be using GSS and NFSv4, the installation process | ||
81 | can be simplified by disabling these features when running configure: | ||
82 | |||
83 | > ./configure --disable-gss --disable-nfsv4 | ||
84 | |||
85 | For more information on this see the package's README and INSTALL files. | ||
86 | |||
87 | After building the nfs-utils package, there will be a mount.nfs binary in | ||
88 | the utils/mount directory. This binary can be used to initiate NFS v2, v3, | ||
89 | or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4. | ||
90 | The standard technique is to create a symlink called mount.nfs4 to mount.nfs. | ||
91 | |||
92 | NOTE: mount.nfs and therefore nfs-utils-1.1.1 or greater is only needed | ||
93 | on the NFS client machine. You do not need this specific version of | ||
94 | nfs-utils on the server. Furthermore, only the mount.nfs command from | ||
95 | nfs-utils-1.1.1 is needed on the client. | ||
96 | |||
97 | - Install a Linux kernel with NFS/RDMA | ||
98 | |||
99 | The NFS/RDMA client and server are both included in the mainline Linux | ||
100 | kernel version 2.6.25 and later. This and other versions of the 2.6 Linux | ||
101 | kernel can be found at: | ||
102 | |||
103 | ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ | ||
104 | |||
105 | Download the sources and place them in an appropriate location. | ||
106 | |||
107 | - Configure the RDMA stack | ||
108 | |||
109 | Make sure your kernel configuration has RDMA support enabled. Under | ||
110 | Device Drivers -> InfiniBand support, update the kernel configuration | ||
111 | to enable InfiniBand support [NOTE: the option name is misleading. Enabling | ||
112 | InfiniBand support is required for all RDMA devices (IB, iWARP, etc.)]. | ||
113 | |||
114 | Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or | ||
115 | iWARP adapter support (amso, cxgb3, etc.). | ||
116 | |||
117 | If you are using InfiniBand, be sure to enable IP-over-InfiniBand support. | ||
118 | |||
119 | - Configure the NFS client and server | ||
120 | |||
121 | Your kernel configuration must also have NFS file system support and/or | ||
122 | NFS server support enabled. These and other NFS related configuration | ||
123 | options can be found under File Systems -> Network File Systems. | ||
124 | |||
125 | - Build, install, reboot | ||
126 | |||
127 | The NFS/RDMA code will be enabled automatically if NFS and RDMA | ||
128 | are turned on. The NFS/RDMA client and server are configured via the hidden | ||
129 | SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The | ||
130 | value of SUNRPC_XPRT_RDMA will be: | ||
131 | |||
132 | - N if either SUNRPC or INFINIBAND are N, in this case the NFS/RDMA client | ||
133 | and server will not be built | ||
134 | - M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M, | ||
135 | in this case the NFS/RDMA client and server will be built as modules | ||
136 | - Y if both SUNRPC and INFINIBAND are Y, in this case the NFS/RDMA client | ||
137 | and server will be built into the kernel | ||
138 | |||
139 | Therefore, if you have followed the steps above and turned no NFS and RDMA, | ||
140 | the NFS/RDMA client and server will be built. | ||
141 | |||
142 | Build a new kernel, install it, boot it. | ||
143 | |||
144 | Check RDMA and NFS Setup | ||
145 | ~~~~~~~~~~~~~~~~~~~~~~~~ | ||
146 | |||
147 | Before configuring the NFS/RDMA software, it is a good idea to test | ||
148 | your new kernel to ensure that the kernel is working correctly. | ||
149 | In particular, it is a good idea to verify that the RDMA stack | ||
150 | is functioning as expected and standard NFS over TCP/IP and/or UDP/IP | ||
151 | is working properly. | ||
152 | |||
153 | - Check RDMA Setup | ||
154 | |||
155 | If you built the RDMA components as modules, load them at | ||
156 | this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel | ||
157 | card: | ||
158 | |||
159 | > modprobe ib_mthca | ||
160 | > modprobe ib_ipoib | ||
161 | |||
162 | If you are using InfiniBand, make sure there is a Subnet Manager (SM) | ||
163 | running on the network. If your IB switch has an embedded SM, you can | ||
164 | use it. Otherwise, you will need to run an SM, such as OpenSM, on one | ||
165 | of your end nodes. | ||
166 | |||
167 | If an SM is running on your network, you should see the following: | ||
168 | |||
169 | > cat /sys/class/infiniband/driverX/ports/1/state | ||
170 | 4: ACTIVE | ||
171 | |||
172 | where driverX is mthca0, ipath5, ehca3, etc. | ||
173 | |||
174 | To further test the InfiniBand software stack, use IPoIB (this | ||
175 | assumes you have two IB hosts named host1 and host2): | ||
176 | |||
177 | host1> ifconfig ib0 a.b.c.x | ||
178 | host2> ifconfig ib0 a.b.c.y | ||
179 | host1> ping a.b.c.y | ||
180 | host2> ping a.b.c.x | ||
181 | |||
182 | For other device types, follow the appropriate procedures. | ||
183 | |||
184 | - Check NFS Setup | ||
185 | |||
186 | For the NFS components enabled above (client and/or server), | ||
187 | test their functionality over standard Ethernet using TCP/IP or UDP/IP. | ||
188 | |||
189 | NFS/RDMA Setup | ||
190 | ~~~~~~~~~~~~~~ | ||
191 | |||
192 | We recommend that you use two machines, one to act as the client and | ||
193 | one to act as the server. | ||
194 | |||
195 | One time configuration: | ||
196 | |||
197 | - On the server system, configure the /etc/exports file and | ||
198 | start the NFS/RDMA server. | ||
199 | |||
200 | Exports entries with the following format have been tested: | ||
201 | |||
202 | /vol0 10.97.103.47(rw,async) 192.168.0.47(rw,async,insecure,no_root_squash) | ||
203 | |||
204 | Here the first IP address is the client's Ethernet address and the second | ||
205 | IP address is the clients IPoIB address. | ||
206 | |||
207 | Each time a machine boots: | ||
208 | |||
209 | - Load and configure the RDMA drivers | ||
210 | |||
211 | For InfiniBand using a Mellanox adapter: | ||
212 | |||
213 | > modprobe ib_mthca | ||
214 | > modprobe ib_ipoib | ||
215 | > ifconfig ib0 a.b.c.d | ||
216 | |||
217 | NOTE: use unique addresses for the client and server | ||
218 | |||
219 | - Start the NFS server | ||
220 | |||
221 | If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config), | ||
222 | load the RDMA transport module: | ||
223 | |||
224 | > modprobe svcrdma | ||
225 | |||
226 | Regardless of how the server was built (module or built-in), start the server: | ||
227 | |||
228 | > /etc/init.d/nfs start | ||
229 | |||
230 | or | ||
231 | |||
232 | > service nfs start | ||
233 | |||
234 | Instruct the server to listen on the RDMA transport: | ||
235 | |||
236 | > echo rdma 2050 > /proc/fs/nfsd/portlist | ||
237 | |||
238 | - On the client system | ||
239 | |||
240 | If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config), | ||
241 | load the RDMA client module: | ||
242 | |||
243 | > modprobe xprtrdma.ko | ||
244 | |||
245 | Regardless of how the client was built (module or built-in), issue the mount.nfs command: | ||
246 | |||
247 | > /path/to/your/mount.nfs <IPoIB-server-name-or-address>:/<export> /mnt -i -o rdma,port=2050 | ||
248 | |||
249 | To verify that the mount is using RDMA, run "cat /proc/mounts" and check the | ||
250 | "proto" field for the given mount. | ||
251 | |||
252 | Congratulations! You're using NFS/RDMA! | ||