Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--  Documentation/filesystems/knfsd-stats.txt   159
-rw-r--r--  Documentation/filesystems/nfs41-server.txt  161
2 files changed, 320 insertions, 0 deletions
diff --git a/Documentation/filesystems/knfsd-stats.txt b/Documentation/filesystems/knfsd-stats.txt
new file mode 100644
index 000000000000..64ced5149d37
--- /dev/null
+++ b/Documentation/filesystems/knfsd-stats.txt
@@ -0,0 +1,159 @@
1 | |||
2 | Kernel NFS Server Statistics | ||
3 | ============================ | ||
4 | |||
5 | This document describes the format and semantics of the statistics | ||
6 | which the kernel NFS server makes available to userspace. These | ||
7 | statistics are available in several text-format pseudo-files, each of | ||
8 | which is described separately below. | ||
9 | |||
10 | In most cases you don't need to know these formats, as the nfsstat(8) | ||
11 | program from the nfs-utils distribution provides a helpful command-line | ||
12 | interface for extracting and printing them. | ||
13 | |||
14 | All the files described here are formatted as a sequence of text lines, | ||
15 | separated by newline '\n' characters. Lines beginning with a hash | ||
16 | '#' character are comments intended for humans and should be ignored | ||
17 | by parsing routines. All other lines contain a sequence of fields | ||
18 | separated by whitespace. | ||
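
The line format described above can be read with a few lines of code. The following is a minimal sketch in Python, not part of the patch; the function name parse_stat_lines is hypothetical, and the sketch only assumes what the text states: newline-separated lines, '#' comment lines to be skipped, and whitespace-separated fields.

```python
def parse_stat_lines(text):
    """Split pseudo-file text into rows of whitespace-separated fields.

    Lines beginning with '#' are comments for humans and are skipped,
    as are empty lines. Fields are returned as strings; converting
    them to numbers is left to the caller.
    """
    rows = []
    for line in text.split("\n"):
        if not line.strip() or line.startswith("#"):
            continue
        rows.append(line.split())
    return rows
```

A caller would typically pass in the result of reading one of the files described below, for example open("/proc/fs/nfsd/pool_stats").read().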
19 | |||
20 | /proc/fs/nfsd/pool_stats | ||
21 | ------------------------ | ||
22 | |||
23 | This file is available in kernels from 2.6.30 onwards, if the | ||
24 | /proc/fs/nfsd filesystem is mounted (it almost always should be). | ||
25 | |||
26 | The first line is a comment which describes the fields present in | ||
27 | all the other lines. The other lines present the following data as | ||
28 | a sequence of unsigned decimal numeric fields. One line is shown | ||
29 | for each NFS thread pool. | ||
30 | |||
31 | All counters are 64 bits wide and wrap naturally. There is no way | ||
32 | to zero these counters; instead, applications should do their own | ||
33 | rate conversion. | ||
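
Rate conversion for a wrapping 64-bit counter can be sketched as follows. This is an illustrative Python helper, not code from the kernel or nfs-utils; the name counter_rate and the sampling interval parameter are assumptions. Taking the difference modulo 2^64 gives the correct delta even if the counter wrapped once between the two samples.

```python
COUNTER_MODULUS = 1 << 64  # counters are 64 bits wide and wrap naturally


def counter_rate(previous, current, interval_seconds):
    """Return the per-second rate of change between two counter samples.

    The subtraction is done modulo 2**64 so that a single wrap between
    the samples still yields the right delta.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = (current - previous) % COUNTER_MODULUS
    return delta / interval_seconds
```

For example, sampling a counter at 100 and then at 160 thirty seconds later gives a rate of 2.0 events per second.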
34 | |||
35 | pool | ||
36 | The id number of the NFS thread pool to which this line applies. | ||
37 | This number does not change. | ||
38 | |||
39 | Thread pool ids are a contiguous set of small integers starting | ||
40 | at zero. The maximum value depends on the thread pool mode, but | ||
41 | currently cannot be larger than the number of CPUs in the system. | ||
42 | Note that in the default case there will be a single thread pool | ||
43 | which contains all the nfsd threads and all the CPUs in the system, | ||
44 | and thus this file will have a single line with a pool id of "0". | ||
45 | |||
46 | packets-arrived | ||
47 | Counts how many NFS packets have arrived. More precisely, this | ||
48 | is the number of times that the network stack has notified the | ||
49 | sunrpc server layer that new data may be available on a transport | ||
50 | (e.g. a TCP or UDP socket or an NFS/RDMA endpoint). | ||
51 | |||
52 | Depending on the NFS workload patterns and various network stack | ||
53 | effects (such as Large Receive Offload) which can combine packets | ||
54 | on the wire, this may be either more or less than the number | ||
55 | of NFS calls received (a statistic which is available elsewhere). | ||
56 | However, this is a more accurate and less workload-dependent measure | ||
57 | of how much CPU load is being placed on the sunrpc server layer | ||
58 | due to NFS network traffic. | ||
59 | |||
60 | sockets-enqueued | ||
61 | Counts how many times an NFS transport is enqueued to wait for | ||
62 | an nfsd thread to service it, i.e. no nfsd thread was considered | ||
63 | available. | ||
64 | |||
65 | The circumstance this statistic tracks indicates that there was NFS | ||
66 | network-facing work to be done but it couldn't be done immediately, | ||
67 | thus introducing a small delay in servicing NFS calls. The ideal | ||
68 | rate of change for this counter is zero; significantly non-zero | ||
69 | values may indicate a performance limitation. | ||
70 | |||
71 | This can happen either because there are too few nfsd threads in the | ||
72 | thread pool for the NFS workload (the workload is thread-limited), | ||
73 | or because the NFS workload needs more CPU time than is available in | ||
74 | the thread pool (the workload is CPU-limited). In the former case, | ||
75 | configuring more nfsd threads will probably improve the performance | ||
76 | of the NFS workload. In the latter case, the sunrpc server layer is | ||
77 | already choosing not to wake idle nfsd threads because there are too | ||
78 | many nfsd threads which want to run but cannot, so configuring more | ||
79 | nfsd threads will make no difference whatsoever. The overloads-avoided | ||
80 | statistic (see below) can be used to distinguish these cases. | ||
81 | |||
82 | threads-woken | ||
83 | Counts how many times an idle nfsd thread is woken to try to | ||
84 | receive some data from an NFS transport. | ||
85 | |||
86 | This statistic tracks the circumstance where incoming | ||
87 | network-facing NFS work is being handled quickly, which is a good | ||
88 | thing. The ideal rate of change for this counter will be close | ||
89 | to but less than the rate of change of the packets-arrived counter. | ||
90 | |||
91 | overloads-avoided | ||
92 | Counts how many times the sunrpc server layer chose not to wake an | ||
93 | nfsd thread, despite the presence of idle nfsd threads, because | ||
94 | too many nfsd threads had been recently woken but could not get | ||
95 | enough CPU time to actually run. | ||
96 | |||
97 | This statistic counts a circumstance where the sunrpc layer | ||
98 | heuristically avoids overloading the CPU scheduler with too many | ||
99 | runnable nfsd threads. The ideal rate of change for this counter | ||
100 | is zero. Significant non-zero values indicate that the workload | ||
101 | is CPU limited. Usually this is associated with heavy CPU usage | ||
102 | on all the CPUs in the nfsd thread pool. | ||
103 | |||
104 | If a sustained large overloads-avoided rate is detected on a pool, | ||
105 | the top(1) utility should be used to check for the following | ||
106 | pattern of CPU usage on all the CPUs associated with the given | ||
107 | nfsd thread pool. | ||
108 | |||
109 | - %us ~= 0 (as you're *NOT* running applications on your NFS server) | ||
110 | |||
111 | - %wa ~= 0 | ||
112 | |||
113 | - %id ~= 0 | ||
114 | |||
115 | - %sy + %hi + %si ~= 100 | ||
116 | |||
117 | If this pattern is seen, configuring more nfsd threads will *not* | ||
118 | improve the performance of the workload. If this pattern is not | ||
119 | seen, then something more subtle is wrong. | ||
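
The distinction between a thread-limited and a CPU-limited pool, as described for sockets-enqueued and overloads-avoided, could be sketched like this. It is a rough Python heuristic under stated assumptions: the function name classify_pool is hypothetical, the inputs are rates already derived from two samples of the counters, and the threshold is an arbitrary placeholder, since the text gives no numeric definition of "significantly non-zero".

```python
def classify_pool(enqueued_rate, overloads_rate, threshold=1.0):
    """Guess why a thread pool is delaying NFS work.

    enqueued_rate  -- rate of change of sockets-enqueued (events/sec)
    overloads_rate -- rate of change of overloads-avoided (events/sec)
    threshold      -- hypothetical cut-off for "significantly non-zero"
    """
    if enqueued_rate <= threshold:
        # Transports are rarely queued: nothing obviously wrong.
        return "ok"
    if overloads_rate > threshold:
        # Idle threads exist but are deliberately not woken:
        # more nfsd threads would not help.
        return "cpu-limited"
    # Work is queued and no overload avoidance is happening:
    # more nfsd threads will probably help.
    return "thread-limited"
```

A "cpu-limited" result is only a hint; the top(1) check described above is still the way to confirm it.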
120 | |||
121 | threads-timedout | ||
122 | Counts how many times an nfsd thread triggered an idle timeout, | ||
123 | i.e. was not woken to handle any incoming network packets for | ||
124 | some time. | ||
125 | |||
126 | This statistic counts a circumstance where there are more nfsd | ||
127 | threads configured than can be used by the NFS workload. This is | ||
128 | a clue that the number of nfsd threads can be reduced without | ||
129 | affecting performance. Unfortunately, it's only a clue and not | ||
130 | a strong indication, for a couple of reasons: | ||
131 | |||
132 | - Currently the rate at which the counter is incremented is quite | ||
133 | slow; the idle timeout is 60 minutes. Unless the NFS workload | ||
134 | remains constant for hours at a time, this counter is unlikely | ||
135 | to be providing information that is still useful. | ||
136 | |||
137 | - It is usually a wise policy to provide some slack, | ||
138 | i.e. configure a few more nfsds than are currently needed, | ||
139 | to allow for future spikes in load. | ||
140 | |||
141 | |||
142 | Note that incoming packets on NFS transports will be dealt with in | ||
143 | one of three ways. An nfsd thread can be woken (threads-woken counts | ||
144 | this case), or the transport can be enqueued for later attention | ||
145 | (sockets-enqueued counts this case), or the packet can be temporarily | ||
146 | deferred because the transport is currently being used by an nfsd | ||
147 | thread. This last case is not very interesting and is not explicitly | ||
148 | counted, but can be inferred from the other counters thus: | ||
149 | |||
150 | packets-deferred = packets-arrived - ( sockets-enqueued + threads-woken ) | ||
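
The derived packets-deferred value can be computed from one parsed line of the file. The sketch below is illustrative Python, not part of the patch; the PoolStats type and pool_stats_from_fields helper are hypothetical names, and the field order is assumed to follow the order in which the fields are described above. That order should be checked against the comment line at the top of the file on the running kernel.

```python
from typing import NamedTuple


class PoolStats(NamedTuple):
    # Assumed field order: the order of the descriptions in this document.
    pool: int
    packets_arrived: int
    sockets_enqueued: int
    threads_woken: int
    overloads_avoided: int
    threads_timedout: int

    @property
    def packets_deferred(self):
        # packets-deferred = packets-arrived - (sockets-enqueued + threads-woken)
        return self.packets_arrived - (self.sockets_enqueued + self.threads_woken)


def pool_stats_from_fields(fields):
    """Build a PoolStats from the six decimal fields of one line."""
    return PoolStats(*(int(field) for field in fields))
```

For instance, a line "0 1000 50 900 3 1" describes pool 0 with 1000 arrivals, of which 50 were enqueued and 900 woke a thread, leaving 50 deferred.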
151 | |||
152 | |||
153 | More | ||
154 | ---- | ||
155 | Descriptions of the other statistics files should go here. | ||
156 | |||
157 | |||
158 | Greg Banks <gnb@sgi.com> | ||
159 | 26 Mar 2009 | ||
diff --git a/Documentation/filesystems/nfs41-server.txt b/Documentation/filesystems/nfs41-server.txt
new file mode 100644
index 000000000000..05d81cbcb2e1
--- /dev/null
+++ b/Documentation/filesystems/nfs41-server.txt
@@ -0,0 +1,161 @@
1 | NFSv4.1 Server Implementation | ||
2 | |||
3 | Server support for minorversion 1 can be controlled using the | ||
4 | /proc/fs/nfsd/versions control file. The string returned by reading | ||
5 | this file contains "+4.1" if minorversion 1 is enabled, or "-4.1" | ||
6 | if it is disabled. | ||
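
Checking the reported state can be sketched as below. This is a minimal Python illustration, not part of the patch; the function name minorversion1_state is hypothetical, and the sketch assumes the "+4.1" or "-4.1" marker appears as its own whitespace-separated token in the string read from the control file.

```python
def minorversion1_state(versions_text):
    """Interpret the contents of the versions control file.

    Returns True if "+4.1" is present, False if "-4.1" is present,
    and None if neither token is found (for example on a kernel
    without minorversion 1 support).
    """
    tokens = versions_text.split()
    if "+4.1" in tokens:
        return True
    if "-4.1" in tokens:
        return False
    return None
```

The input would normally come from open("/proc/fs/nfsd/versions").read().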
7 | |||
8 | Currently, server support for minorversion 1 is disabled by default. | ||
9 | It can be enabled at run time by writing the string "+4.1" to | ||
10 | the /proc/fs/nfsd/versions control file. Note that to write this | ||
11 | control file, the nfsd service must be taken down. Use your user-mode | ||
12 | nfs-utils to set this up; see rpc.nfsd(8). | ||
13 | |||
14 | The NFSv4 minorversion 1 (NFSv4.1) implementation in nfsd is based | ||
15 | on the latest NFSv4.1 Internet Draft: | ||
16 | http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-29 | ||
17 | |||
18 | From the many new features in NFSv4.1 the current implementation | ||
19 | focuses on the mandatory-to-implement NFSv4.1 Sessions, providing | ||
20 | "exactly once" semantics and better control and throttling of the | ||
21 | resources allocated for each client. | ||
22 | |||
23 | Other NFSv4.1 features, Parallel NFS operations in particular, | ||
24 | are still under development out of tree. | ||
25 | See http://wiki.linux-nfs.org/wiki/index.php/PNFS_prototype_design | ||
26 | for more information. | ||
27 | |||
28 | The table below, taken from the NFSv4.1 document, lists | ||
29 | the operations that are mandatory to implement (REQ), recommended | ||
30 | (REC), optional (OPT), and NFSv4.0 operations that must not be | ||
31 | implemented (MNI) in minor version 1. The first column indicates the | ||
32 | operations that are not supported yet by the Linux server implementation. | ||
33 | |||
34 | The OPTIONAL features identified and their abbreviations are as follows: | ||
35 | pNFS Parallel NFS | ||
36 | FDELG File Delegations | ||
37 | DDELG Directory Delegations | ||
38 | |||
39 | The following abbreviations indicate the Linux server implementation status. | ||
40 | I Implemented NFSv4.1 operations. | ||
41 | NS Not Supported. | ||
42 | NS* Unimplemented optional feature. | ||
43 | P pNFS features implemented out of tree. | ||
44 | PNS pNFS features that are not supported yet (out of tree). | ||
45 | |||
46 | Operations | ||
47 | |||
48 |    +----------------------+------------+--------------+----------------+ | ||
49 |    | Operation            | REQ, REC,  | Feature      | Definition     | | ||
50 |    |                      | OPT, or    | (REQ, REC,   |                | | ||
51 |    |                      | MNI        | or OPT)      |                | | ||
52 |    +----------------------+------------+--------------+----------------+ | ||
53 |    | ACCESS               | REQ        |              | Section 18.1   | | ||
54 | NS | BACKCHANNEL_CTL      | REQ        |              | Section 18.33  | | ||
55 | NS | BIND_CONN_TO_SESSION | REQ        |              | Section 18.34  | | ||
56 |    | CLOSE                | REQ        |              | Section 18.2   | | ||
57 |    | COMMIT               | REQ        |              | Section 18.3   | | ||
58 |    | CREATE               | REQ        |              | Section 18.4   | | ||
59 | I  | CREATE_SESSION       | REQ        |              | Section 18.36  | | ||
60 | NS*| DELEGPURGE           | OPT        | FDELG (REQ)  | Section 18.5   | | ||
61 |    | DELEGRETURN          | OPT        | FDELG,       | Section 18.6   | | ||
62 |    |                      |            | DDELG, pNFS  |                | | ||
63 |    |                      |            | (REQ)        |                | | ||
64 | NS | DESTROY_CLIENTID     | REQ        |              | Section 18.50  | | ||
65 | I  | DESTROY_SESSION      | REQ        |              | Section 18.37  | | ||
66 | I  | EXCHANGE_ID          | REQ        |              | Section 18.35  | | ||
67 | NS | FREE_STATEID         | REQ        |              | Section 18.38  | | ||
68 |    | GETATTR              | REQ        |              | Section 18.7   | | ||
69 | P  | GETDEVICEINFO        | OPT        | pNFS (REQ)   | Section 18.40  | | ||
70 | P  | GETDEVICELIST        | OPT        | pNFS (OPT)   | Section 18.41  | | ||
71 |    | GETFH                | REQ        |              | Section 18.8   | | ||
72 | NS*| GET_DIR_DELEGATION   | OPT        | DDELG (REQ)  | Section 18.39  | | ||
73 | P  | LAYOUTCOMMIT         | OPT        | pNFS (REQ)   | Section 18.42  | | ||
74 | P  | LAYOUTGET            | OPT        | pNFS (REQ)   | Section 18.43  | | ||
75 | P  | LAYOUTRETURN         | OPT        | pNFS (REQ)   | Section 18.44  | | ||
76 |    | LINK                 | OPT        |              | Section 18.9   | | ||
77 |    | LOCK                 | REQ        |              | Section 18.10  | | ||
78 |    | LOCKT                | REQ        |              | Section 18.11  | | ||
79 |    | LOCKU                | REQ        |              | Section 18.12  | | ||
80 |    | LOOKUP               | REQ        |              | Section 18.13  | | ||
81 |    | LOOKUPP              | REQ        |              | Section 18.14  | | ||
82 |    | NVERIFY              | REQ        |              | Section 18.15  | | ||
83 |    | OPEN                 | REQ        |              | Section 18.16  | | ||
84 | NS*| OPENATTR             | OPT        |              | Section 18.17  | | ||
85 |    | OPEN_CONFIRM         | MNI        |              | N/A            | | ||
86 |    | OPEN_DOWNGRADE       | REQ        |              | Section 18.18  | | ||
87 |    | PUTFH                | REQ        |              | Section 18.19  | | ||
88 |    | PUTPUBFH             | REQ        |              | Section 18.20  | | ||
89 |    | PUTROOTFH            | REQ        |              | Section 18.21  | | ||
90 |    | READ                 | REQ        |              | Section 18.22  | | ||
91 |    | READDIR              | REQ        |              | Section 18.23  | | ||
92 |    | READLINK             | OPT        |              | Section 18.24  | | ||
93 | NS | RECLAIM_COMPLETE     | REQ        |              | Section 18.51  | | ||
94 |    | RELEASE_LOCKOWNER    | MNI        |              | N/A            | | ||
95 |    | REMOVE               | REQ        |              | Section 18.25  | | ||
96 |    | RENAME               | REQ        |              | Section 18.26  | | ||
97 |    | RENEW                | MNI        |              | N/A            | | ||
98 |    | RESTOREFH            | REQ        |              | Section 18.27  | | ||
99 |    | SAVEFH               | REQ        |              | Section 18.28  | | ||
100 |    | SECINFO              | REQ        |              | Section 18.29  | | ||
101 | NS | SECINFO_NO_NAME      | REC        | pNFS files   | Section 18.45, | | ||
102 |    |                      |            | layout (REQ) | Section 13.12  | | ||
103 | I  | SEQUENCE             | REQ        |              | Section 18.46  | | ||
104 |    | SETATTR              | REQ        |              | Section 18.30  | | ||
105 |    | SETCLIENTID          | MNI        |              | N/A            | | ||
106 |    | SETCLIENTID_CONFIRM  | MNI        |              | N/A            | | ||
107 | NS | SET_SSV              | REQ        |              | Section 18.47  | | ||
108 | NS | TEST_STATEID         | REQ        |              | Section 18.48  | | ||
109 |    | VERIFY               | REQ        |              | Section 18.31  | | ||
110 | NS*| WANT_DELEGATION      | OPT        | FDELG (OPT)  | Section 18.49  | | ||
111 |    | WRITE                | REQ        |              | Section 18.32  | | ||
112 | |||
113 | Callback Operations | ||
114 | |||
115 |    +-------------------------+-----------+-------------+---------------+ | ||
116 |    | Operation               | REQ, REC, | Feature     | Definition    | | ||
117 |    |                         | OPT, or   | (REQ, REC,  |               | | ||
118 |    |                         | MNI       | or OPT)     |               | | ||
119 |    +-------------------------+-----------+-------------+---------------+ | ||
120 |    | CB_GETATTR              | OPT       | FDELG (REQ) | Section 20.1  | | ||
121 | P  | CB_LAYOUTRECALL         | OPT       | pNFS (REQ)  | Section 20.3  | | ||
122 | NS*| CB_NOTIFY               | OPT       | DDELG (REQ) | Section 20.4  | | ||
123 | P  | CB_NOTIFY_DEVICEID      | OPT       | pNFS (OPT)  | Section 20.12 | | ||
124 | NS*| CB_NOTIFY_LOCK          | OPT       |             | Section 20.11 | | ||
125 | NS*| CB_PUSH_DELEG           | OPT       | FDELG (OPT) | Section 20.5  | | ||
126 |    | CB_RECALL               | OPT       | FDELG,      | Section 20.2  | | ||
127 |    |                         |           | DDELG, pNFS |               | | ||
128 |    |                         |           | (REQ)       |               | | ||
129 | NS*| CB_RECALL_ANY           | OPT       | FDELG,      | Section 20.6  | | ||
130 |    |                         |           | DDELG, pNFS |               | | ||
131 |    |                         |           | (REQ)       |               | | ||
132 | NS | CB_RECALL_SLOT          | REQ       |             | Section 20.8  | | ||
133 | NS*| CB_RECALLABLE_OBJ_AVAIL | OPT       | DDELG, pNFS | Section 20.7  | | ||
134 |    |                         |           | (REQ)       |               | | ||
135 | I  | CB_SEQUENCE             | OPT       | FDELG,      | Section 20.9  | | ||
136 |    |                         |           | DDELG, pNFS |               | | ||
137 |    |                         |           | (REQ)       |               | | ||
138 | NS*| CB_WANTS_CANCELLED      | OPT       | FDELG,      | Section 20.10 | | ||
139 |    |                         |           | DDELG, pNFS |               | | ||
140 |    |                         |           | (REQ)       |               | | ||
141 |    +-------------------------+-----------+-------------+---------------+ | ||
142 | |||
143 | Implementation notes: | ||
144 | |||
145 | EXCHANGE_ID: | ||
146 | * only SP4_NONE state protection supported | ||
147 | * implementation ids are ignored | ||
148 | |||
149 | CREATE_SESSION: | ||
150 | * backchannel attributes are ignored | ||
151 | * backchannel security parameters are ignored | ||
152 | |||
153 | SEQUENCE: | ||
154 | * no support for dynamic slot table renegotiation (optional) | ||
155 | |||
156 | NFSv4.1 COMPOUND rules: | ||
157 | The following cases aren't supported yet: | ||
158 | * Enforcing of NFS4ERR_NOT_ONLY_OP for: BIND_CONN_TO_SESSION, CREATE_SESSION, | ||
159 | DESTROY_CLIENTID, DESTROY_SESSION, EXCHANGE_ID. | ||
160 | * Enforcing that DESTROY_SESSION MUST be the final operation in the COMPOUND request. | ||
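
The two checks listed as not yet enforced could look roughly like the sketch below. This is illustrative Python, not nfsd code, and it is a deliberately simplified reading of the rules: the function name compound_violations is hypothetical, a COMPOUND is modelled as a plain list of operation names, the "only operation" rule is applied only when one of the listed operations comes first, and the session-matching conditions from the NFSv4.1 document are ignored.

```python
# Operations listed above for NFS4ERR_NOT_ONLY_OP enforcement.
ONLY_OPS = frozenset({
    "BIND_CONN_TO_SESSION",
    "CREATE_SESSION",
    "DESTROY_CLIENTID",
    "DESTROY_SESSION",
    "EXCHANGE_ID",
})


def compound_violations(ops):
    """Return a list of rule violations for a COMPOUND's operation names.

    Simplifying assumption: an operation from ONLY_OPS that starts the
    COMPOUND must be the only operation in it.
    """
    violations = []
    if len(ops) > 1 and ops[0] in ONLY_OPS:
        violations.append("NFS4ERR_NOT_ONLY_OP")
    if "DESTROY_SESSION" in ops[:-1]:
        violations.append("DESTROY_SESSION is not the final operation")
    return violations
```

Under these assumptions, ["SEQUENCE", "DESTROY_SESSION"] passes, while ["EXCHANGE_ID", "CREATE_SESSION"] is flagged with NFS4ERR_NOT_ONLY_OP.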
161 | |||