Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/feature-removal-schedule.txt | 40
-rw-r--r-- | Documentation/filesystems/afs.txt | 214
-rw-r--r-- | Documentation/filesystems/proc.txt | 9
-rw-r--r-- | Documentation/keys.txt | 12
-rw-r--r-- | Documentation/networking/bonding.txt | 35
-rw-r--r-- | Documentation/networking/dccp.txt | 10
-rw-r--r-- | Documentation/networking/ip-sysctl.txt | 31
-rw-r--r-- | Documentation/networking/rxrpc.txt | 859
-rw-r--r-- | Documentation/networking/wan-router.txt | 1
9 files changed, 1093 insertions, 118 deletions
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 19b4c96b2a49..6da663607f7b 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -211,15 +211,6 @@ Who: Adrian Bunk <bunk@stusta.de> | |||
211 | 211 | ||
212 | --------------------------- | 212 | --------------------------- |
213 | 213 | ||
214 | What: IPv4 only connection tracking/NAT/helpers | ||
215 | When: 2.6.22 | ||
216 | Why: The new layer 3 independent connection tracking replaces the old | ||
217 | IPv4 only version. After some stabilization of the new code the | ||
218 | old one will be removed. | ||
219 | Who: Patrick McHardy <kaber@trash.net> | ||
220 | |||
221 | --------------------------- | ||
222 | |||
223 | What: ACPI hooks (X86_SPEEDSTEP_CENTRINO_ACPI) in speedstep-centrino driver | 214 | What: ACPI hooks (X86_SPEEDSTEP_CENTRINO_ACPI) in speedstep-centrino driver |
224 | When: December 2006 | 215 | When: December 2006 |
225 | Why: Speedstep-centrino driver with ACPI hooks and acpi-cpufreq driver are | 216 | Why: Speedstep-centrino driver with ACPI hooks and acpi-cpufreq driver are |
@@ -294,18 +285,6 @@ Who: Richard Purdie <rpurdie@rpsys.net> | |||
294 | 285 | ||
295 | --------------------------- | 286 | --------------------------- |
296 | 287 | ||
297 | What: Wireless extensions over netlink (CONFIG_NET_WIRELESS_RTNETLINK) | ||
298 | When: with the merge of wireless-dev, 2.6.22 or later | ||
299 | Why: The option/code is | ||
300 | * not enabled on most kernels | ||
301 | * not required by any userspace tools (except an experimental one, | ||
302 | and even there only for some parts, others use ioctl) | ||
303 | * pointless since wext is no longer evolving and the ioctl | ||
304 | interface needs to be kept | ||
305 | Who: Johannes Berg <johannes@sipsolutions.net> | ||
306 | |||
307 | --------------------------- | ||
308 | |||
309 | What: i8xx_tco watchdog driver | 288 | What: i8xx_tco watchdog driver |
310 | When: in 2.6.22 | 289 | When: in 2.6.22 |
311 | Why: the i8xx_tco watchdog driver has been replaced by the iTCO_wdt | 290 | Why: the i8xx_tco watchdog driver has been replaced by the iTCO_wdt |
@@ -313,3 +292,22 @@ Why: the i8xx_tco watchdog driver has been replaced by the iTCO_wdt | |||
313 | Who: Wim Van Sebroeck <wim@iguana.be> | 292 | Who: Wim Van Sebroeck <wim@iguana.be> |
314 | 293 | ||
315 | --------------------------- | 294 | --------------------------- |
295 | |||
296 | What: Multipath cached routing support in ipv4 | ||
297 | When: in 2.6.23 | ||
298 | Why: Code was merged, then submitter immediately disappeared leaving | ||
299 | us with no maintainer and lots of bugs. The code should not have | ||
300 | been merged in the first place, and many aspects of its | ||
301 | implementation are blocking more critical core networking | ||
302 | development. It's marked EXPERIMENTAL and no distribution | ||
303 | enables it because it causes obscure crashes due to unfixable bugs | ||
304 | (interfaces don't return errors so memory allocation can't be | ||
305 | handled, calling contexts of these interfaces make handling | ||
306 | errors impossible too because they get called after we've | ||
307 | totally committed to creating a route object, for example). | ||
308 | This problem has existed for years and no forward progress | ||
309 | has ever been made, and nobody steps up to try and salvage | ||
310 | this code, so we're going to finally just get rid of it. | ||
311 | Who: David S. Miller <davem@davemloft.net> | ||
312 | |||
313 | --------------------------- | ||
diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt
index 2f4237dfb8c7..12ad6c7f4e50 100644
--- a/Documentation/filesystems/afs.txt
+++ b/Documentation/filesystems/afs.txt
@@ -1,31 +1,82 @@ | |||
1 | ==================== | ||
1 | kAFS: AFS FILESYSTEM | 2 | kAFS: AFS FILESYSTEM |
2 | ==================== | 3 | ==================== |
3 | 4 | ||
4 | ABOUT | 5 | Contents: |
5 | ===== | 6 | |
7 | - Overview. | ||
8 | - Usage. | ||
9 | - Mountpoints. | ||
10 | - Proc filesystem. | ||
11 | - The cell database. | ||
12 | - Security. | ||
13 | - Examples. | ||
14 | |||
15 | |||
16 | ======== | ||
17 | OVERVIEW | ||
18 | ======== | ||
6 | 19 | ||
7 | This filesystem provides a fairly simple AFS filesystem driver. It is under | 20 | This filesystem provides a fairly simple secure AFS filesystem driver. It is |
8 | development and only provides very basic facilities. It does not yet support | 21 | under development and does not yet provide the full feature set. The features |
9 | the following AFS features: | 22 | it does support include: |
10 | 23 | ||
11 | (*) Write support. | 24 | (*) Security (currently only AFS kaserver and KerberosIV tickets). |
12 | (*) Communications security. | ||
13 | (*) Local caching. | ||
14 | (*) pioctl() system call. | ||
15 | (*) Automatic mounting of embedded mountpoints. | ||
16 | 25 | ||
26 | (*) File reading. | ||
17 | 27 | ||
28 | (*) Automounting. | ||
29 | |||
30 | It does not yet support the following AFS features: | ||
31 | |||
32 | (*) Write support. | ||
33 | |||
34 | (*) Local caching. | ||
35 | |||
36 | (*) pioctl() system call. | ||
37 | |||
38 | |||
39 | =========== | ||
40 | COMPILATION | ||
41 | =========== | ||
42 | |||
43 | The filesystem should be enabled by turning on the kernel configuration | ||
44 | options: | ||
45 | |||
46 | CONFIG_AF_RXRPC - The RxRPC protocol transport | ||
47 | CONFIG_RXKAD - The RxRPC Kerberos security handler | ||
48 | CONFIG_AFS - The AFS filesystem | ||
49 | |||
50 | Additionally, the following can be turned on to aid debugging: | ||
51 | |||
52 | CONFIG_AF_RXRPC_DEBUG - Permit AF_RXRPC debugging to be enabled | ||
53 | CONFIG_AFS_DEBUG - Permit AFS debugging to be enabled | ||
54 | |||
55 | They permit the debugging messages to be turned on dynamically by manipulating | ||
56 | the masks in the following files: | ||
57 | |||
58 | /sys/module/af_rxrpc/parameters/debug | ||
59 | /sys/module/afs/parameters/debug | ||
60 | |||
61 | |||
62 | ===== | ||
18 | USAGE | 63 | USAGE |
19 | ===== | 64 | ===== |
20 | 65 | ||
21 | When inserting the driver modules the root cell must be specified along with a | 66 | When inserting the driver modules the root cell must be specified along with a |
22 | list of volume location server IP addresses: | 67 | list of volume location server IP addresses: |
23 | 68 | ||
24 | insmod rxrpc.o | 69 | insmod af_rxrpc.o |
70 | insmod rxkad.o | ||
25 | insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91 | 71 | insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91 |
26 | 72 | ||
27 | The first module is a driver for the RxRPC remote operation protocol, and the | 73 | The first module is the AF_RXRPC network protocol driver. This provides the |
28 | second is the actual filesystem driver for the AFS filesystem. | 74 | RxRPC remote operation protocol and may also be accessed from userspace. See: |
75 | |||
76 | Documentation/networking/rxrpc.txt | ||
77 | |||
78 | The second module is the Kerberos RxRPC security driver, and the third module | ||
79 | is the actual filesystem driver for the AFS filesystem. | ||
29 | 80 | ||
30 | Once the module has been loaded, more cells can be added by the following | 81 | Once the module has been loaded, more cells can be added by the following |
31 | procedure: | 82 | procedure: |
@@ -33,7 +84,7 @@ procedure: | |||
33 | echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells | 84 | echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells |
34 | 85 | ||
35 | Where the parameters to the "add" command are the name of a cell and a list of | 86 | Where the parameters to the "add" command are the name of a cell and a list of |
36 | volume location servers within that cell. | 87 | volume location servers within that cell, with the latter separated by colons. |
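The "add" command described above can be assembled programmatically. The following is a minimal sketch, not part of the kernel documentation: the helper name `format_add_cell` is hypothetical, and it assumes IPv4 volume location server addresses, since those are the only kind the examples show.

```python
import ipaddress


def format_add_cell(cell, vl_servers):
    """Build the text written to /proc/fs/afs/cells to register a cell.

    cell       -- cell name, e.g. "grand.central.org"
    vl_servers -- iterable of IPv4 address strings (assumed IPv4 only)
    """
    if not cell or any(ch.isspace() for ch in cell):
        raise ValueError("cell name must be non-empty and contain no whitespace")
    # Validate each address; ipaddress raises ValueError on malformed input.
    addrs = [str(ipaddress.IPv4Address(a)) for a in vl_servers]
    if not addrs:
        raise ValueError("at least one volume location server is required")
    # The server list is colon-separated, as the text above states.
    return "add {} {}".format(cell, ":".join(addrs))


print(format_add_cell("grand.central.org", ["18.7.14.88", "128.2.191.224"]))
```

Writing the resulting string to /proc/fs/afs/cells would need root privileges and the kafs module loaded, so the sketch only builds the command.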
37 | 88 | ||
38 | Filesystems can be mounted anywhere by commands similar to the following: | 89 | Filesystems can be mounted anywhere by commands similar to the following: |
39 | 90 | ||
@@ -42,11 +93,6 @@ Filesystems can be mounted anywhere by commands similar to the following: | |||
42 | mount -t afs "#root.afs." /afs | 93 | mount -t afs "#root.afs." /afs |
43 | mount -t afs "#root.cell." /afs/cambridge | 94 | mount -t afs "#root.cell." /afs/cambridge |
44 | 95 | ||
45 | NB: When using this on Linux 2.4, the mount command has to be different, | ||
46 | since the filesystem doesn't have access to the device name argument: | ||
47 | |||
48 | mount -t afs none /afs -ovol="#root.afs." | ||
49 | |||
50 | Where the initial character is either a hash or a percent symbol depending on | 96 | Where the initial character is either a hash or a percent symbol depending on |
51 | whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O | 97 | whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O |
52 | volume, but are willing to use a R/W volume instead (percent). | 98 | volume, but are willing to use a R/W volume instead (percent). |
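To make the device-name convention concrete, here is a small sketch that splits a name such as "#root.afs." into its parts. It is an illustration only: the names `AfsDevice` and `parse_afs_device` are hypothetical, the trailing period is treated as optional because it appears in the examples without being explained here, and nothing beyond the forms shown in the mount commands is handled.

```python
from typing import NamedTuple, Optional


class AfsDevice(NamedTuple):
    prefer_readonly: bool   # True for '%' (R/O preferred), False for '#' (R/W wanted)
    cell: Optional[str]     # None means "look the volume up in the root cell"
    volume: str


def parse_afs_device(name: str) -> AfsDevice:
    """Split an AFS "device name" as passed to mount into its components."""
    if len(name) < 2 or name[0] not in "#%":
        raise ValueError("device name must start with '#' or '%'")
    body = name[1:]
    if body.endswith("."):
        body = body[:-1]            # drop the trailing period seen in the examples
    # An optional "cell:" prefix selects a cell other than the root cell.
    cell, sep, volume = body.rpartition(":")
    if not volume:
        raise ValueError("missing volume name")
    return AfsDevice(name[0] == "%", cell if sep else None, volume)


print(parse_afs_device("#root.afs."))
print(parse_afs_device("%cambridge.redhat.com:root.cell."))
```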
@@ -60,55 +106,66 @@ named volume will be looked up in the cell specified during insmod. | |||
60 | Additional cells can be added through /proc (see later section). | 106 | Additional cells can be added through /proc (see later section). |
61 | 107 | ||
62 | 108 | ||
109 | =========== | ||
63 | MOUNTPOINTS | 110 | MOUNTPOINTS |
64 | =========== | 111 | =========== |
65 | 112 | ||
66 | AFS has a concept of mountpoints. These are specially formatted symbolic links | 113 | AFS has a concept of mountpoints. In AFS terms, these are specially formatted |
67 | (of the same form as the "device name" passed to mount). kAFS presents these | 114 | symbolic links (of the same form as the "device name" passed to mount). kAFS |
68 | to the user as directories that have special properties: | 115 | presents these to the user as directories that have a follow-link capability |
116 | (i.e. symbolic link semantics). If anyone attempts to access them, they will | ||
117 | automatically cause the target volume to be mounted (if possible) on that site. | ||
69 | 118 | ||
70 | (*) They cannot be listed. Running a program like "ls" on them will incur an | 119 | Automatically mounted filesystems will be automatically unmounted approximately |
71 | EREMOTE error (Object is remote). | 120 | twenty minutes after they were last used. Alternatively they can be unmounted |
121 | directly with the umount() system call. | ||
72 | 122 | ||
73 | (*) Other objects can't be looked up inside of them. This also incurs an | 123 | Manually unmounting an AFS volume will cause any idle submounts upon it to be |
74 | EREMOTE error. | 124 | culled first. If all are culled, then the requested volume will also be |
125 | unmounted, otherwise error EBUSY will be returned. | ||
75 | 126 | ||
76 | (*) They can be queried with the readlink() system call, which will return | 127 | This can be used by the administrator to attempt to unmount the whole AFS tree |
77 | the name of the mountpoint to which they point. The "readlink" program | 128 | mounted on /afs in one go by doing: |
78 | will also work. | ||
79 | 129 | ||
80 | (*) They can be mounted on (which symbolic links can't). | 130 | umount /afs |
81 | 131 | ||
82 | 132 | ||
133 | =============== | ||
83 | PROC FILESYSTEM | 134 | PROC FILESYSTEM |
84 | =============== | 135 | =============== |
85 | 136 | ||
86 | The rxrpc module creates a number of files in various places in the /proc | ||
87 | filesystem: | ||
88 | |||
89 | (*) Firstly, some information files are made available in a directory called | ||
90 | "/proc/net/rxrpc/". These list the extant transport endpoint, peer, | ||
91 | connection and call records. | ||
92 | |||
93 | (*) Secondly, some control files are made available in a directory called | ||
94 | "/proc/sys/rxrpc/". Currently, all these files can be used for is to | ||
95 | turn on various levels of tracing. | ||
96 | |||
97 | The AFS module creates a "/proc/fs/afs/" directory and populates it: | 137 | The AFS module creates a "/proc/fs/afs/" directory and populates it: |
98 | 138 | ||
99 | (*) A "cells" file that lists cells currently known to the afs module. | 139 | (*) A "cells" file that lists cells currently known to the afs module and |
140 | their usage counts: | ||
141 | |||
142 | [root@andromeda ~]# cat /proc/fs/afs/cells | ||
143 | USE NAME | ||
144 | 3 cambridge.redhat.com | ||
100 | 145 | ||
101 | (*) A directory per cell that contains files that list volume location | 146 | (*) A directory per cell that contains files that list volume location |
102 | servers, volumes, and active servers known within that cell. | 147 | servers, volumes, and active servers known within that cell. |
103 | 148 | ||
149 | [root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/servers | ||
150 | USE ADDR STATE | ||
151 | 4 172.16.18.91 0 | ||
152 | [root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/vlservers | ||
153 | ADDRESS | ||
154 | 172.16.18.91 | ||
155 | [root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/volumes | ||
156 | USE STT VLID[0] VLID[1] VLID[2] NAME | ||
157 | 1 Val 20000000 20000001 20000002 root.afs | ||
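The listings above are plain whitespace-separated tables with one header line, so they are easy to consume from a script. The sketch below assumes exactly that layout, inferred from the sample output rather than from any stated format guarantee. The helper name `parse_proc_table` is hypothetical.

```python
def parse_proc_table(text):
    """Parse a header line plus whitespace-separated rows into a list of dicts.

    Assumes the layout of the sample /proc/fs/afs listings: first non-empty
    line is the header, every later non-empty line is one record.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Limit the number of splits so the last column keeps any remainder.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


sample = """USE STT VLID[0] VLID[1] VLID[2] NAME
  1 Val 20000000 20000001 20000002 root.afs
"""
for row in parse_proc_table(sample):
    print(row["NAME"], row["USE"])
```

On a system with kafs loaded, the same function could be fed the contents of /proc/fs/afs/cells or the per-cell files.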
104 | 158 | ||
159 | |||
160 | ================= | ||
105 | THE CELL DATABASE | 161 | THE CELL DATABASE |
106 | ================= | 162 | ================= |
107 | 163 | ||
108 | The filesystem maintains an internal database of all the cells it knows and | 164 | The filesystem maintains an internal database of all the cells it knows and the |
109 | the IP addresses of the volume location servers for those cells. The cell to | 165 | IP addresses of the volume location servers for those cells. The cell to which |
110 | which the computer belongs is added to the database when insmod is performed | 166 | the system belongs is added to the database when insmod is performed by the |
111 | by the "rootcell=" argument. | 167 | "rootcell=" argument or, if compiled in, using a "kafs.rootcell=" argument on |
168 | the kernel command line. | ||
112 | 169 | ||
113 | Further cells can be added by commands similar to the following: | 170 | Further cells can be added by commands similar to the following: |
114 | 171 | ||
@@ -118,20 +175,65 @@ Further cells can be added by commands similar to the following: | |||
118 | No other cell database operations are available at this time. | 175 | No other cell database operations are available at this time. |
119 | 176 | ||
120 | 177 | ||
178 | ======== | ||
179 | SECURITY | ||
180 | ======== | ||
181 | |||
182 | Secure operations are initiated by acquiring a key using the klog program. A | ||
183 | very primitive klog program is available at: | ||
184 | |||
185 | http://people.redhat.com/~dhowells/rxrpc/klog.c | ||
186 | |||
187 | This should be compiled by: | ||
188 | |||
189 | make klog LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils" | ||
190 | |||
191 | And then run as: | ||
192 | |||
193 | ./klog | ||
194 | |||
195 | Assuming it's successful, this adds a key of type RxRPC, named for the service | ||
196 | and cell, eg: "afs@<cellname>". This can be viewed with the keyctl program or | ||
197 | by cat'ing /proc/keys: | ||
198 | |||
199 | [root@andromeda ~]# keyctl show | ||
200 | Session Keyring | ||
201 | -3 --alswrv 0 0 keyring: _ses.3268 | ||
202 | 2 --alswrv 0 0 \_ keyring: _uid.0 | ||
203 | 111416553 --als--v 0 0 \_ rxrpc: afs@CAMBRIDGE.REDHAT.COM | ||
204 | |||
205 | Currently the username, realm, password and proposed ticket lifetime are | ||
206 | compiled in to the program. | ||
207 | |||
208 | It is not required to acquire a key before using AFS facilities, but if one is | ||
209 | not acquired then all operations will be governed by the anonymous user parts | ||
210 | of the ACLs. | ||
211 | |||
212 | If a key is acquired, then all AFS operations, including mounts and automounts, | ||
213 | made by a possessor of that key will be secured with that key. | ||
214 | |||
215 | If a file is opened with a particular key and then the file descriptor is | ||
216 | passed to a process that doesn't have that key (perhaps over an AF_UNIX | ||
217 | socket), then the operations on the file will be made with the key that was used to | ||
218 | open the file. | ||
219 | |||
220 | |||
221 | ======== | ||
121 | EXAMPLES | 222 | EXAMPLES |
122 | ======== | 223 | ======== |
123 | 224 | ||
124 | Here's what I use to test this. Some of the names and IP addresses are local | 225 | Here's what I use to test this. Some of the names and IP addresses are local |
125 | to my internal DNS. My "root.afs" partition has a mount point within it for | 226 | to my internal DNS. My "root.afs" partition has a mount point within it for |
126 | some public volumes. | 227 | some public volumes. |
127 | 228 | ||
128 | insmod -S /tmp/rxrpc.o | 229 | insmod /tmp/rxrpc.o |
129 | insmod -S /tmp/kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91 | 230 | insmod /tmp/rxkad.o |
231 | insmod /tmp/kafs.o rootcell=cambridge.redhat.com:172.16.18.91 | ||
130 | 232 | ||
131 | mount -t afs \%root.afs. /afs | 233 | mount -t afs \%root.afs. /afs |
132 | mount -t afs \%cambridge.redhat.com:root.cell. /afs/cambridge.redhat.com/ | 234 | mount -t afs \%cambridge.redhat.com:root.cell. /afs/cambridge.redhat.com/ |
133 | 235 | ||
134 | echo add grand.central.org 18.7.14.88:128.2.191.224 > /proc/fs/afs/cells | 236 | echo add grand.central.org 18.7.14.88:128.2.191.224 > /proc/fs/afs/cells |
135 | mount -t afs "#grand.central.org:root.cell." /afs/grand.central.org/ | 237 | mount -t afs "#grand.central.org:root.cell." /afs/grand.central.org/ |
136 | mount -t afs "#grand.central.org:root.archive." /afs/grand.central.org/archive | 238 | mount -t afs "#grand.central.org:root.archive." /afs/grand.central.org/archive |
137 | mount -t afs "#grand.central.org:root.contrib." /afs/grand.central.org/contrib | 239 | mount -t afs "#grand.central.org:root.contrib." /afs/grand.central.org/contrib |
@@ -141,15 +243,7 @@ mount -t afs "#grand.central.org:root.service." /afs/grand.central.org/service | |||
141 | mount -t afs "#grand.central.org:root.software." /afs/grand.central.org/software | 243 | mount -t afs "#grand.central.org:root.software." /afs/grand.central.org/software |
142 | mount -t afs "#grand.central.org:root.user." /afs/grand.central.org/user | 244 | mount -t afs "#grand.central.org:root.user." /afs/grand.central.org/user |
143 | 245 | ||
144 | umount /afs/grand.central.org/user | ||
145 | umount /afs/grand.central.org/software | ||
146 | umount /afs/grand.central.org/service | ||
147 | umount /afs/grand.central.org/project | ||
148 | umount /afs/grand.central.org/doc | ||
149 | umount /afs/grand.central.org/contrib | ||
150 | umount /afs/grand.central.org/archive | ||
151 | umount /afs/grand.central.org | ||
152 | umount /afs/cambridge.redhat.com | ||
153 | umount /afs | 246 | umount /afs |
154 | rmmod kafs | 247 | rmmod kafs |
248 | rmmod rxkad | ||
155 | rmmod rxrpc | 249 | rmmod rxrpc |
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 5484ab5efd4f..7aaf09b86a55 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1421,6 +1421,15 @@ fewer messages that will be written. Message_burst controls when messages will | |||
1421 | be dropped. The default settings limit warning messages to one every five | 1421 | be dropped. The default settings limit warning messages to one every five |
1422 | seconds. | 1422 | seconds. |
1423 | 1423 | ||
1424 | warnings | ||
1425 | -------- | ||
1426 | |||
1427 | This controls console messages from the networking stack that can occur because | ||
1428 | of problems on the network like duplicate addresses or bad checksums. Normally, | ||
1429 | this should be enabled, but if the problem persists the messages can be | ||
1430 | disabled. | ||
1431 | |||
1432 | |||
1424 | netdev_max_backlog | 1433 | netdev_max_backlog |
1425 | ------------------ | 1434 | ------------------ |
1426 | 1435 | ||
diff --git a/Documentation/keys.txt b/Documentation/keys.txt
index 60c665d9cfaa..81d9aa097298 100644
--- a/Documentation/keys.txt
+++ b/Documentation/keys.txt
@@ -859,6 +859,18 @@ payload contents" for more information. | |||
859 | void unregister_key_type(struct key_type *type); | 859 | void unregister_key_type(struct key_type *type); |
860 | 860 | ||
861 | 861 | ||
862 | Under some circumstances, it may be desirable to deal with a | ||
863 | bundle of keys. The facility provides access to the keyring type for managing | ||
864 | such a bundle: | ||
865 | |||
866 | struct key_type key_type_keyring; | ||
867 | |||
868 | This can be used with a function such as request_key() to find a specific | ||
869 | keyring in a process's keyrings. A keyring thus found can then be searched | ||
870 | with keyring_search(). Note that it is not possible to use request_key() to | ||
871 | search a specific keyring, so using keyrings in this way is of limited utility. | ||
872 | |||
873 | |||
862 | =================================== | 874 | =================================== |
863 | NOTES ON ACCESSING PAYLOAD CONTENTS | 875 | NOTES ON ACCESSING PAYLOAD CONTENTS |
864 | =================================== | 876 | =================================== |
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index de809e58092f..1da566630831 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -920,40 +920,9 @@ options, you may wish to use the "max_bonds" module parameter, | |||
920 | documented above. | 920 | documented above. |
921 | 921 | ||
922 | To create multiple bonding devices with differing options, it | 922 | To create multiple bonding devices with differing options, it |
923 | is necessary to load the bonding driver multiple times. Note that | 923 | is necessary to use bonding parameters exported by sysfs, documented |
924 | current versions of the sysconfig network initialization scripts | 924 | in the section below. |
925 | handle this automatically; if your distro uses these scripts, no | ||
926 | special action is needed. See the section Configuring Bonding | ||
927 | Devices, above, if you're not sure about your network initialization | ||
928 | scripts. | ||
929 | |||
930 | To load multiple instances of the module, it is necessary to | ||
931 | specify a different name for each instance (the module loading system | ||
932 | requires that every loaded module, even multiple instances of the same | ||
933 | module, have a unique name). This is accomplished by supplying | ||
934 | multiple sets of bonding options in /etc/modprobe.conf, for example: | ||
935 | |||
936 | alias bond0 bonding | ||
937 | options bond0 -o bond0 mode=balance-rr miimon=100 | ||
938 | |||
939 | alias bond1 bonding | ||
940 | options bond1 -o bond1 mode=balance-alb miimon=50 | ||
941 | |||
942 | will load the bonding module two times. The first instance is | ||
943 | named "bond0" and creates the bond0 device in balance-rr mode with an | ||
944 | miimon of 100. The second instance is named "bond1" and creates the | ||
945 | bond1 device in balance-alb mode with an miimon of 50. | ||
946 | |||
947 | In some circumstances (typically with older distributions), | ||
948 | the above does not work, and the second bonding instance never sees | ||
949 | its options. In that case, the second options line can be substituted | ||
950 | as follows: | ||
951 | |||
952 | install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \ | ||
953 | mode=balance-alb miimon=50 | ||
954 | 925 | ||
955 | This may be repeated any number of times, specifying a new and | ||
956 | unique name in place of bond1 for each subsequent instance. | ||
957 | 926 | ||
958 | 3.4 Configuring Bonding Manually via Sysfs | 927 | 3.4 Configuring Bonding Manually via Sysfs |
959 | ------------------------------------------ | 928 | ------------------------------------------ |
diff --git a/Documentation/networking/dccp.txt b/Documentation/networking/dccp.txt
index 387482e46c47..4504cc59e405 100644
--- a/Documentation/networking/dccp.txt
+++ b/Documentation/networking/dccp.txt
@@ -57,6 +57,16 @@ DCCP_SOCKOPT_SEND_CSCOV is for the receiver and has a different meaning: it | |||
57 | coverage value are also acceptable. The higher the number, the more | 57 | coverage value are also acceptable. The higher the number, the more |
58 | restrictive this setting (see [RFC 4340, sec. 9.2.1]). | 58 | restrictive this setting (see [RFC 4340, sec. 9.2.1]). |
59 | 59 | ||
60 | The following two options apply to CCID 3 exclusively and are getsockopt()-only. | ||
61 | In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned. | ||
62 | DCCP_SOCKOPT_CCID_RX_INFO | ||
63 | Returns a `struct tfrc_rx_info' in optval; the buffer for optval and | ||
64 | optlen must be set to at least sizeof(struct tfrc_rx_info). | ||
65 | DCCP_SOCKOPT_CCID_TX_INFO | ||
66 | Returns a `struct tfrc_tx_info' in optval; the buffer for optval and | ||
67 | optlen must be set to at least sizeof(struct tfrc_tx_info). | ||
68 | |||
69 | |||
60 | Sysctl variables | 70 | Sysctl variables |
61 | ================ | 71 | ================ |
62 | Several DCCP default parameters can be managed by the following sysctls | 72 | Several DCCP default parameters can be managed by the following sysctls |
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 702d1d8dd04a..af6a63ab9026 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -179,11 +179,31 @@ tcp_fin_timeout - INTEGER | |||
179 | because they eat maximum 1.5K of memory, but they tend | 179 | because they eat maximum 1.5K of memory, but they tend |
180 | to live longer. Cf. tcp_max_orphans. | 180 | to live longer. Cf. tcp_max_orphans. |
181 | 181 | ||
182 | tcp_frto - BOOLEAN | 182 | tcp_frto - INTEGER |
183 | Enables F-RTO, an enhanced recovery algorithm for TCP retransmission | 183 | Enables F-RTO, an enhanced recovery algorithm for TCP retransmission |
184 | timeouts. It is particularly beneficial in wireless environments | 184 | timeouts. It is particularly beneficial in wireless environments |
185 | where packet loss is typically due to random radio interference | 185 | where packet loss is typically due to random radio interference |
186 | rather than intermediate router congestion. | 186 | rather than intermediate router congestion. If set to 1, the basic |
187 | version is enabled. 2 enables SACK enhanced F-RTO, which is | ||
188 | EXPERIMENTAL. The basic version can also be used when SACK is | ||
189 | enabled for a flow through the tcp_sack sysctl. | ||
190 | |||
191 | tcp_frto_response - INTEGER | ||
192 | When F-RTO has detected that a TCP retransmission timeout was | ||
193 | spurious (i.e., the timeout would have been avoided had TCP set a | ||
194 | longer retransmission timeout), TCP has several options for what to do | ||
195 | next. Possible values are: | ||
196 | 0 Rate halving based; a smooth and conservative response, | ||
197 | results in halved cwnd and ssthresh after one RTT | ||
198 | 1 Very conservative response; not recommended because, | ||
199 | although valid, it interacts poorly with the rest of | ||
200 | Linux TCP, halves cwnd and ssthresh immediately | ||
201 | 2 Aggressive response; undoes congestion control measures | ||
202 | that are now known to be unnecessary (ignoring the | ||
203 | possibility of a lost retransmission that would require | ||
204 | TCP to be more cautious), cwnd and ssthresh are restored | ||
205 | to the values prior to the timeout | ||
206 | Default: 0 (rate halving based) | ||
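As an illustration of how the two F-RTO sysctls might be inspected, the sketch below reads their values and maps them to the meanings given in the text. It rests on several assumptions: the sysctls are taken to live under /proc/sys/net/ipv4 like the other tcp_* entries, a tcp_frto value of 0 is taken to mean "disabled" (the text only spells out 1 and 2), and the helper names are hypothetical. A kernel that lacks either file simply yields None.

```python
from pathlib import Path

# Meanings taken from the descriptions above; 0 for tcp_frto is an assumption.
FRTO_MODES = {
    0: "F-RTO disabled (assumed)",
    1: "basic F-RTO",
    2: "SACK enhanced F-RTO (EXPERIMENTAL)",
}
FRTO_RESPONSES = {
    0: "rate halving based: cwnd and ssthresh halved after one RTT",
    1: "very conservative: cwnd and ssthresh halved immediately",
    2: "aggressive: cwnd and ssthresh restored to pre-timeout values",
}


def read_int_sysctl(name, root="/proc/sys/net/ipv4"):
    """Return the integer value of a sysctl file, or None if unavailable."""
    try:
        return int(Path(root, name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def describe(table, value):
    """Map a sysctl value to its description, tolerating unknown values."""
    if value is None:
        return "not available on this kernel"
    return table.get(value, "unknown value {}".format(value))


print("tcp_frto:", describe(FRTO_MODES, read_int_sysctl("tcp_frto")))
print("tcp_frto_response:",
      describe(FRTO_RESPONSES, read_int_sysctl("tcp_frto_response")))
```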
187 | 207 | ||
188 | tcp_keepalive_time - INTEGER | 208 | tcp_keepalive_time - INTEGER |
189 | How often TCP sends out keepalive messages when keepalive is enabled. | 209 | How often TCP sends out keepalive messages when keepalive is enabled. |
@@ -995,7 +1015,12 @@ bridge-nf-call-ip6tables - BOOLEAN | |||
995 | Default: 1 | 1015 | Default: 1 |
996 | 1016 | ||
997 | bridge-nf-filter-vlan-tagged - BOOLEAN | 1017 | bridge-nf-filter-vlan-tagged - BOOLEAN |
998 | 1 : pass bridged vlan-tagged ARP/IP traffic to arptables/iptables. | 1018 | 1 : pass bridged vlan-tagged ARP/IP/IPv6 traffic to {arp,ip,ip6}tables. |
1019 | 0 : disable this. | ||
1020 | Default: 1 | ||
1021 | |||
1022 | bridge-nf-filter-pppoe-tagged - BOOLEAN | ||
1023 | 1 : pass bridged pppoe-tagged IP/IPv6 traffic to {ip,ip6}tables. | ||
999 | 0 : disable this. | 1024 | 0 : disable this. |
1000 | Default: 1 | 1025 | Default: 1 |
1001 | 1026 | ||
diff --git a/Documentation/networking/rxrpc.txt b/Documentation/networking/rxrpc.txt
new file mode 100644
index 000000000000..cae231b1c134
--- /dev/null
+++ b/Documentation/networking/rxrpc.txt
@@ -0,0 +1,859 @@ | |||
1 | ====================== | ||
2 | RxRPC NETWORK PROTOCOL | ||
3 | ====================== | ||
4 | |||
5 | The RxRPC protocol driver provides a reliable two-phase transport on top of UDP | ||
6 | that can be used to perform RxRPC remote operations. This is done over sockets | ||
7 | of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and | ||
8 | receive data, aborts and errors. | ||
9 | |||
10 | Contents of this document: | ||
11 | |||
12 | (*) Overview. | ||
13 | |||
14 | (*) RxRPC protocol summary. | ||
15 | |||
16 | (*) AF_RXRPC driver model. | ||
17 | |||
18 | (*) Control messages. | ||
19 | |||
20 | (*) Socket options. | ||
21 | |||
22 | (*) Security. | ||
23 | |||
24 | (*) Example client usage. | ||
25 | |||
26 | (*) Example server usage. | ||
27 | |||
28 | (*) AF_RXRPC kernel interface. | ||
29 | |||
30 | |||
31 | ======== | ||
32 | OVERVIEW | ||
33 | ======== | ||
34 | |||
35 | RxRPC is a two-layer protocol. There is a session layer which provides | ||
36 | reliable virtual connections using UDP over IPv4 (or IPv6) as the transport | ||
37 | layer, but implements a real network protocol; and there's the presentation | ||
38 | layer which renders structured data to binary blobs and back again using XDR | ||
39 | (as does SunRPC): | ||
40 | |||
41 | +-------------+ | ||
42 | | Application | | ||
43 | +-------------+ | ||
44 | | XDR | Presentation | ||
45 | +-------------+ | ||
46 | | RxRPC | Session | ||
47 | +-------------+ | ||
48 | | UDP | Transport | ||
49 | +-------------+ | ||
50 | |||
51 | |||
52 | AF_RXRPC provides: | ||
53 | |||
54 | (1) Part of an RxRPC facility for both kernel and userspace applications by | ||
55 | making the session part of it a Linux network protocol (AF_RXRPC). | ||
56 | |||
57 | (2) A two-phase protocol. The client transmits a blob (the request) and then | ||
58 | receives a blob (the reply), and the server receives the request and then | ||
59 | transmits the reply. | ||
60 | |||
61 | (3) Retention of the reusable bits of the transport system set up for one call | ||
62 | to speed up subsequent calls. | ||
63 | |||
64 | (4) A secure protocol, using the Linux kernel's key retention facility to | ||
65 | manage security on the client end. The server end must of necessity be | ||
66 | more active in security negotiations. | ||
67 | |||
68 | AF_RXRPC does not provide XDR marshalling/presentation facilities. That is | ||
69 | left to the application. AF_RXRPC only deals in blobs. Even the operation ID | ||
70 | is just the first four bytes of the request blob, and as such is beyond the | ||
71 | kernel's interest. | ||
72 | |||
73 | |||
74 | Sockets of AF_RXRPC family are: | ||
75 | |||
76 | (1) created as type SOCK_DGRAM; | ||
77 | |||
78 | (2) provided with a protocol of the type of underlying transport they're going | ||
79 | to use - currently only PF_INET is supported. | ||
80 | |||
81 | |||
82 | The Andrew File System (AFS) is an example of an application that uses this and | ||
83 | that has both kernel (filesystem) and userspace (utility) components. | ||
84 | |||
85 | |||
86 | ====================== | ||
87 | RXRPC PROTOCOL SUMMARY | ||
88 | ====================== | ||
89 | |||
90 | An overview of the RxRPC protocol: | ||
91 | |||
92 | (*) RxRPC sits on top of another networking protocol (UDP is the only option | ||
93 | currently), and uses this to provide network transport. UDP ports, for | ||
94 | example, provide transport endpoints. | ||
95 | |||
96 | (*) RxRPC supports multiple virtual "connections" from any given transport | ||
97 | endpoint, thus allowing the endpoints to be shared, even to the same | ||
98 | remote endpoint. | ||
99 | |||
100 | (*) Each connection goes to a particular "service". A connection may not go | ||
101 | to multiple services. A service may be considered the RxRPC equivalent of | ||
102 | a port number. AF_RXRPC permits multiple services to share an endpoint. | ||
103 | |||
104 | (*) Client-originating packets are marked, thus a transport endpoint can be | ||
105 | shared between client and server connections (connections have a | ||
106 | direction). | ||
107 | |||
108 | (*) Up to a billion connections may be supported concurrently between one | ||
109 | local transport endpoint and one service on one remote endpoint. An RxRPC | ||
110 | connection is described by seven numbers: | ||
111 | |||
112 | Local address } | ||
113 | Local port } Transport (UDP) address | ||
114 | Remote address } | ||
115 | Remote port } | ||
116 | Direction | ||
117 | Connection ID | ||
118 | Service ID | ||
119 | |||
120 | (*) Each RxRPC operation is a "call". A connection may make up to four | ||
121 | billion calls, but only up to four calls may be in progress on a | ||
122 | connection at any one time. | ||
123 | |||
124 | (*) Calls are two-phase and asymmetric: the client sends its request data, | ||
125 | which the service receives; then the service transmits the reply data | ||
126 | which the client receives. | ||
127 | |||
128 | (*) The data blobs are of indefinite size; the end of a phase is marked with a | ||
129 | flag in the packet. The number of packets of data making up one blob may | ||
130 | not exceed 4 billion, however, as this would cause the sequence number to | ||
131 | wrap. | ||
132 | |||
133 | (*) The first four bytes of the request data are the service operation ID. | ||
134 | |||
135 | (*) Security is negotiated on a per-connection basis. The connection is | ||
136 | initiated by the first data packet on it arriving. If security is | ||
137 | requested, the server then issues a "challenge" and then the client | ||
138 | replies with a "response". If the response is successful, the security is | ||
139 | set for the lifetime of that connection, and all subsequent calls made | ||
140 | upon it use that same security. In the event that the server lets a | ||
141 | connection lapse before the client, the security will be renegotiated if | ||
142 | the client uses the connection again. | ||
143 | |||
144 | (*) Calls use ACK packets to handle reliability. Data packets are also | ||
145 | explicitly sequenced per call. | ||
146 | |||
147 | (*) There are two types of positive acknowledgement: hard-ACKs and soft-ACKs. | ||
148 | A hard-ACK indicates to the far side that all the data received to a point | ||
149 | has been received and processed; a soft-ACK indicates that the data has | ||
150 | been received but may yet be discarded and re-requested. The sender may | ||
151 | not discard any transmittable packets until they've been hard-ACK'd. | ||
152 | |||
153 | (*) Reception of a reply data packet implicitly hard-ACK's all the data | ||
154 | packets that make up the request. | ||
155 | |||
156 | (*) A call is complete when the request has been sent, the reply has been | ||
157 | received and the final hard-ACK on the last packet of the reply has | ||
158 | reached the server. | ||
159 | |||
160 | (*) A call may be aborted by either end at any time up to its completion. | ||
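
Since the operation ID is simply the first four bytes of the request blob, the
application has to place it there itself. A minimal sketch follows, assuming
the ID is encoded as an XDR unsigned integer (big-endian); the helper names
put_operation_id() and get_operation_id() are hypothetical and not part of
AF_RXRPC:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/*
 * Write a 32-bit operation ID at the start of a request blob.
 * XDR encodes integers big-endian, so htonl() yields the wire order.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t put_operation_id(uint8_t *blob, size_t blob_len, uint32_t op_id)
{
	uint32_t wire = htonl(op_id);

	assert(blob != NULL);
	if (blob_len < sizeof(wire))
		return 0;
	memcpy(blob, &wire, sizeof(wire));
	return sizeof(wire);
}

/* Read the operation ID back from the first four bytes of a request blob. */
static uint32_t get_operation_id(const uint8_t *blob)
{
	uint32_t wire;

	assert(blob != NULL);
	memcpy(&wire, blob, sizeof(wire));
	return ntohl(wire);
}
```

The remaining request arguments would be marshalled after these four bytes by
whatever XDR code the application uses.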
161 | |||
162 | |||
163 | ===================== | ||
164 | AF_RXRPC DRIVER MODEL | ||
165 | ===================== | ||
166 | |||
167 | About the AF_RXRPC driver: | ||
168 | |||
169 | (*) The AF_RXRPC protocol transparently uses internal sockets of the transport | ||
170 | protocol to represent transport endpoints. | ||
171 | |||
172 | (*) AF_RXRPC sockets map onto RxRPC connection bundles. Actual RxRPC | ||
173 | connections are handled transparently. One client socket may be used to | ||
174 | make multiple simultaneous calls to the same service. One server socket | ||
175 | may handle calls from many clients. | ||
176 | |||
177 | (*) Additional parallel client connections will be initiated to support extra | ||
178 | concurrent calls, up to a tunable limit. | ||
179 | |||
180 | (*) Each connection is retained for a certain amount of time [tunable] after | ||
181 | the last call currently using it has completed in case a new call is made | ||
182 | that could reuse it. | ||
183 | |||
184 | (*) Each internal UDP socket is retained for a certain amount of time | ||
185 | [tunable] after the last connection using it is discarded, in case a new | ||
186 | connection is made that could use it. | ||
187 | |||
188 | (*) A client-side connection is only shared between calls if they have | ||
189 | the same key struct describing their security (and assuming the calls | ||
190 | would otherwise share the connection). Non-secured calls would also be | ||
191 | able to share connections with each other. | ||
192 | |||
193 | (*) A server-side connection is shared if the client says it is. | ||
194 | |||
195 | (*) ACK'ing is handled by the protocol driver automatically, including ping | ||
196 | replying. | ||
197 | |||
198 | (*) SO_KEEPALIVE automatically pings the other side to keep the connection | ||
199 | alive [TODO]. | ||
200 | |||
201 | (*) If an ICMP error is received, all calls affected by that error will be | ||
202 | aborted with an appropriate network error passed through recvmsg(). | ||
203 | |||
204 | |||
205 | Interaction with the user of the RxRPC socket: | ||
206 | |||
207 | (*) A socket is made into a server socket by binding an address with a | ||
208 | non-zero service ID. | ||
209 | |||
210 | (*) In the client, sending a request is achieved with one or more sendmsgs, | ||
211 | followed by the reply being received with one or more recvmsgs. | ||
212 | |||
213 | (*) The first sendmsg for a request to be sent from a client contains a tag to | ||
214 | be used in all other sendmsgs or recvmsgs associated with that call. The | ||
215 | tag is carried in the control data. | ||
216 | |||
217 | (*) connect() is used to supply a default destination address for a client | ||
218 | socket. This may be overridden by supplying an alternate address to the | ||
219 | first sendmsg() of a call (struct msghdr::msg_name). | ||
220 | |||
221 | (*) If connect() is called on an unbound client, a random local port will be | ||
222 | bound before the operation takes place. | ||
223 | |||
224 | (*) A server socket may also be used to make client calls. To do this, the | ||
225 | first sendmsg() of the call must specify the target address. The server's | ||
226 | transport endpoint is used to send the packets. | ||
227 | |||
228 | (*) Once the application has received the last message associated with a call, | ||
229 | the tag is guaranteed not to be seen again, and so it can be used to pin | ||
230 | client resources. A new call can then be initiated with the same tag | ||
231 | without fear of interference. | ||
232 | |||
233 | (*) In the server, a request is received with one or more recvmsgs, then the | ||
234 | reply is transmitted with one or more sendmsgs, and then the final ACK | ||
235 | is received with a last recvmsg. | ||
236 | |||
237 | (*) When sending data for a call, sendmsg is given MSG_MORE if there's more | ||
238 | data to come on that call. | ||
239 | |||
240 | (*) When receiving data for a call, recvmsg flags MSG_MORE if there's more | ||
241 | data to come for that call. | ||
242 | |||
243 | (*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg | ||
244 | to indicate the terminal message for that call. | ||
245 | |||
246 | (*) A call may be aborted by adding an abort control message to the control | ||
247 | data. Issuing an abort terminates the kernel's use of that call's tag. | ||
248 | Any messages waiting in the receive queue for that call will be discarded. | ||
249 | |||
250 | (*) Aborts, busy notifications and challenge packets are delivered by recvmsg, | ||
251 | and control data messages will be set to indicate the context. Receiving | ||
252 | an abort or a busy message terminates the kernel's use of that call's tag. | ||
253 | |||
254 | (*) The control data part of the msghdr struct is used for a number of things: | ||
255 | |||
256 | (*) The tag of the intended or affected call. | ||
257 | |||
258 | (*) Sending or receiving errors, aborts and busy notifications. | ||
259 | |||
260 | (*) Notifications of incoming calls. | ||
261 | |||
262 | (*) Sending debug requests and receiving debug replies [TODO]. | ||
263 | |||
264 | (*) When the kernel has received and set up an incoming call, it sends a | ||
265 | message to the server application to let it know there's a new call awaiting | ||
266 | its acceptance [recvmsg reports a special control message]. The server | ||
267 | application then uses sendmsg to assign a tag to the new call. Once that | ||
268 | is done, the first part of the request data will be delivered by recvmsg. | ||
269 | |||
270 | (*) The server application has to provide the server socket with a keyring of | ||
271 | secret keys corresponding to the security types it permits. When a secure | ||
272 | connection is being set up, the kernel looks up the appropriate secret key | ||
273 | in the keyring and then sends a challenge packet to the client and | ||
274 | receives a response packet. The kernel then checks the authorisation of | ||
275 | the packet and either aborts the connection or sets up the security. | ||
276 | |||
277 | (*) The name of the key a client will use to secure its communications is | ||
278 | nominated by a socket option. | ||
279 | |||
280 | |||
281 | Notes on recvmsg: | ||
282 | |||
283 | (1) If there's a sequence of data messages belonging to a particular call on | ||
284 | the receive queue, then recvmsg will keep working through them until: | ||
285 | |||
286 | (a) it meets the end of that call's received data, | ||
287 | |||
288 | (b) it meets a non-data message, | ||
289 | |||
290 | (c) it meets a message belonging to a different call, or | ||
291 | |||
292 | (d) it fills the user buffer. | ||
293 | |||
294 | If recvmsg is called in blocking mode, it will keep sleeping, awaiting the | ||
295 | reception of further data, until one of the above four conditions is met. | ||
296 | |||
297 | (2) MSG_PEEK operates similarly, but will return immediately if it has put any | ||
298 | data in the buffer rather than sleeping until it can fill the buffer. | ||
299 | |||
300 | (3) If a data message is only partially consumed in filling a user buffer, | ||
301 | then the remainder of that message will be left on the front of the queue | ||
302 | for the next taker. MSG_TRUNC will never be flagged. | ||
303 | |||
304 | (4) If there is more data to be had on a call (it hasn't copied the last byte | ||
305 | of the last data message in that phase yet), then MSG_MORE will be | ||
306 | flagged. | ||
307 | |||
308 | |||
309 | ================ | ||
310 | CONTROL MESSAGES | ||
311 | ================ | ||
312 | |||
313 | AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex | ||
314 | calls, to invoke certain actions and to report certain conditions. These are: | ||
315 | |||
316 | MESSAGE ID SRT DATA MEANING | ||
317 | ======================= === =========== =============================== | ||
318 | RXRPC_USER_CALL_ID sr- User ID App's call specifier | ||
319 | RXRPC_ABORT srt Abort code Abort code to issue/received | ||
320 | RXRPC_ACK -rt n/a Final ACK received | ||
321 | RXRPC_NET_ERROR -rt error num Network error on call | ||
322 | RXRPC_BUSY -rt n/a Call rejected (server busy) | ||
323 | RXRPC_LOCAL_ERROR -rt error num Local error encountered | ||
324 | RXRPC_NEW_CALL -r- n/a New call received | ||
325 | RXRPC_ACCEPT s-- n/a Accept new call | ||
326 | |||
327 | (SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message) | ||
328 | |||
329 | (*) RXRPC_USER_CALL_ID | ||
330 | |||
331 | This is used to indicate the application's call ID. It's an unsigned long | ||
332 | that the app specifies in the client by attaching it to the first data | ||
333 | message or in the server by passing it in association with an RXRPC_ACCEPT | ||
334 | message. recvmsg() passes it in conjunction with all messages except | ||
335 | those of the RXRPC_NEW_CALL message. | ||
336 | |||
337 | (*) RXRPC_ABORT | ||
338 | |||
339 | This can be used by an application to abort a call by passing it to | ||
340 | sendmsg, or it can be delivered by recvmsg to indicate a remote abort was | ||
341 | received. Either way, it must be associated with an RXRPC_USER_CALL_ID to | ||
342 | specify the call affected. If an abort is being sent, then error EBADSLT | ||
343 | will be returned if there is no call with that user ID. | ||
344 | |||
345 | (*) RXRPC_ACK | ||
346 | |||
347 | This is delivered to a server application to indicate that the final ACK | ||
348 | of a call was received from the client. It will be associated with an | ||
349 | RXRPC_USER_CALL_ID to indicate the call that's now complete. | ||
350 | |||
351 | (*) RXRPC_NET_ERROR | ||
352 | |||
353 | This is delivered to an application to indicate that an ICMP error message | ||
354 | was encountered in the process of trying to talk to the peer. An | ||
355 | errno-class integer value will be included in the control message data | ||
356 | indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call | ||
357 | affected. | ||
358 | |||
359 | (*) RXRPC_BUSY | ||
360 | |||
361 | This is delivered to a client application to indicate that a call was | ||
362 | rejected by the server due to the server being busy. It will be | ||
363 | associated with an RXRPC_USER_CALL_ID to indicate the rejected call. | ||
364 | |||
365 | (*) RXRPC_LOCAL_ERROR | ||
366 | |||
367 | This is delivered to an application to indicate that a local error was | ||
368 | encountered and that a call has been aborted because of it. An | ||
369 | errno-class integer value will be included in the control message data | ||
370 | indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call | ||
371 | affected. | ||
372 | |||
373 | (*) RXRPC_NEW_CALL | ||
374 | |||
375 | This is delivered to indicate to a server application that a new call has | ||
376 | arrived and is awaiting acceptance. No user ID is associated with this, | ||
377 | as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT. | ||
378 | |||
379 | (*) RXRPC_ACCEPT | ||
380 | |||
381 | This is used by a server application to attempt to accept a call and | ||
382 | assign it a user ID. It should be associated with an RXRPC_USER_CALL_ID | ||
383 | to indicate the user ID to be assigned. If there is no call to be | ||
384 | accepted (it may have timed out, been aborted, etc.), then sendmsg will | ||
385 | return error ENODATA. If the user ID is already in use by another call, | ||
386 | then error EBADSLT will be returned. | ||
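
The control messages above can be picked apart after recvmsg() with the
standard CMSG_*() macros. The following is a sketch only: the EX_-prefixed
constants are local stand-ins whose numeric values are assumed to match
SOL_RXRPC and the RXRPC_* message IDs in the kernel headers, and
struct ex_call_event and ex_parse_cmsgs() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values; check them against the installed kernel headers. */
#define EX_SOL_RXRPC	272
enum {
	EX_RXRPC_USER_CALL_ID	= 1,
	EX_RXRPC_ABORT		= 2,
	EX_RXRPC_ACK		= 3,
	EX_RXRPC_NET_ERROR	= 5,
	EX_RXRPC_BUSY		= 6,
	EX_RXRPC_LOCAL_ERROR	= 7,
	EX_RXRPC_NEW_CALL	= 8,
};

struct ex_call_event {
	int		have_call_id;	/* RXRPC_USER_CALL_ID was attached */
	unsigned long	call_id;
	int		type;		/* last non-ID message seen, 0 if none */
	uint32_t	abort_code;	/* valid when type == EX_RXRPC_ABORT */
	int		error;		/* valid for NET_ERROR and LOCAL_ERROR */
	int		terminal;	/* nonzero if the call's tag is now dead */
};

/* Walk the control data of a received message and summarise it in *ev. */
static void ex_parse_cmsgs(struct msghdr *msg, struct ex_call_event *ev)
{
	struct cmsghdr *c;

	assert(msg != NULL && ev != NULL);
	memset(ev, 0, sizeof(*ev));
	for (c = CMSG_FIRSTHDR(msg); c; c = CMSG_NXTHDR(msg, c)) {
		if (c->cmsg_level != EX_SOL_RXRPC)
			continue;
		switch (c->cmsg_type) {
		case EX_RXRPC_USER_CALL_ID:
			if (c->cmsg_len >= CMSG_LEN(sizeof(ev->call_id))) {
				memcpy(&ev->call_id, CMSG_DATA(c),
				       sizeof(ev->call_id));
				ev->have_call_id = 1;
			}
			break;
		case EX_RXRPC_ABORT:
			if (c->cmsg_len >= CMSG_LEN(sizeof(ev->abort_code)))
				memcpy(&ev->abort_code, CMSG_DATA(c),
				       sizeof(ev->abort_code));
			ev->type = c->cmsg_type;
			ev->terminal = 1;
			break;
		case EX_RXRPC_NET_ERROR:
		case EX_RXRPC_LOCAL_ERROR:
			if (c->cmsg_len >= CMSG_LEN(sizeof(ev->error)))
				memcpy(&ev->error, CMSG_DATA(c),
				       sizeof(ev->error));
			ev->type = c->cmsg_type;
			ev->terminal = 1;
			break;
		case EX_RXRPC_ACK:
		case EX_RXRPC_BUSY:
			ev->type = c->cmsg_type;
			ev->terminal = 1;
			break;
		case EX_RXRPC_NEW_CALL:
			ev->type = c->cmsg_type;	/* not terminal */
			break;
		}
	}
}
```

The terminal flag here follows the T column of the table: once a terminal
message has been seen, the application may reuse the user call ID.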
387 | |||
388 | |||
389 | ============== | ||
390 | SOCKET OPTIONS | ||
391 | ============== | ||
392 | |||
393 | AF_RXRPC sockets support a few socket options at the SOL_RXRPC level: | ||
394 | |||
395 | (*) RXRPC_SECURITY_KEY | ||
396 | |||
397 | This is used to specify the description of the key to be used. The key is | ||
398 | extracted from the calling process's keyrings with request_key() and | ||
399 | should be of "rxrpc" type. | ||
400 | |||
401 | The optval pointer points to the description string, and optlen indicates | ||
402 | how long the string is, without the NUL terminator. | ||
403 | |||
404 | (*) RXRPC_SECURITY_KEYRING | ||
405 | |||
406 | Similar to above but specifies a keyring of server secret keys to use (key | ||
407 | type "keyring"). See the "Security" section. | ||
408 | |||
409 | (*) RXRPC_EXCLUSIVE_CONNECTION | ||
410 | |||
411 | This is used to request that new connections should be used for each call | ||
412 | made subsequently on this socket. optval should be NULL and optlen 0. | ||
413 | |||
414 | (*) RXRPC_MIN_SECURITY_LEVEL | ||
415 | |||
416 | This is used to specify the minimum security level required for calls on | ||
417 | this socket. optval must point to an int containing one of the following | ||
418 | values: | ||
419 | |||
420 | (a) RXRPC_SECURITY_PLAIN | ||
421 | |||
422 | Encrypted checksum only. | ||
423 | |||
424 | (b) RXRPC_SECURITY_AUTH | ||
425 | |||
426 | Encrypted checksum plus packet padded and first eight bytes of packet | ||
427 | encrypted - which includes the actual packet length. | ||
428 | |||
429 | (c) RXRPC_SECURITY_ENCRYPTED | ||
430 | |||
431 | Encrypted checksum plus entire packet padded and encrypted, including | ||
432 | actual packet length. | ||
433 | |||
434 | |||
435 | ======== | ||
436 | SECURITY | ||
437 | ======== | ||
438 | |||
439 | Currently, only the Kerberos 4 equivalent protocol has been implemented | ||
440 | (security index 2 - rxkad). This requires the rxkad module to be loaded and, | ||
441 | on the client, tickets of the appropriate type to be obtained from the AFS | ||
442 | kaserver or the Kerberos server and installed as "rxrpc" type keys. This is | ||
443 | normally done using the klog program. An example simple klog program can be | ||
444 | found at: | ||
445 | |||
446 | http://people.redhat.com/~dhowells/rxrpc/klog.c | ||
447 | |||
448 | The payload provided to add_key() on the client should be of the following | ||
449 | form: | ||
450 | |||
451 | struct rxrpc_key_sec2_v1 { | ||
452 | uint16_t security_index; /* 2 */ | ||
453 | uint16_t ticket_length; /* length of ticket[] */ | ||
454 | uint32_t expiry; /* time at which expires */ | ||
455 | uint8_t kvno; /* key version number */ | ||
456 | uint8_t __pad[3]; | ||
457 | uint8_t session_key[8]; /* DES session key */ | ||
458 | uint8_t ticket[0]; /* the encrypted ticket */ | ||
459 | }; | ||
460 | |||
461 | Where the ticket blob is just appended to the above structure. | ||
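
Assembling such a payload might look like the sketch below. It assumes the
header fields are stored in host byte order and uses a C99 flexible array
member in place of ticket[0]; ex_build_rxkad_payload() is a hypothetical
helper, not something AF_RXRPC or klog provides:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct rxrpc_key_sec2_v1 {
	uint16_t	security_index;	/* 2 */
	uint16_t	ticket_length;	/* length of ticket[] */
	uint32_t	expiry;		/* time at which expires */
	uint8_t		kvno;		/* key version number */
	uint8_t		__pad[3];
	uint8_t		session_key[8];	/* DES session key */
	uint8_t		ticket[];	/* the encrypted ticket */
};

/*
 * Allocate a payload consisting of the header followed by the ticket blob.
 * On success the total size is stored in *payload_len and the caller must
 * free() the result; NULL is returned if the allocation fails.
 */
static struct rxrpc_key_sec2_v1 *
ex_build_rxkad_payload(const uint8_t session_key[8], uint8_t kvno,
		       uint32_t expiry, const void *ticket,
		       uint16_t ticket_len, size_t *payload_len)
{
	struct rxrpc_key_sec2_v1 *p;
	size_t len = sizeof(*p) + ticket_len;

	assert(session_key != NULL && ticket != NULL && payload_len != NULL);
	p = calloc(1, len);		/* zeroes __pad as well */
	if (!p)
		return NULL;
	p->security_index = 2;		/* rxkad */
	p->ticket_length = ticket_len;
	p->expiry = expiry;
	p->kvno = kvno;
	memcpy(p->session_key, session_key, 8);
	memcpy(p->ticket, ticket, ticket_len);
	*payload_len = len;
	return p;
}
```

The resulting buffer and length would then be handed to add_key() as the
payload of an "rxrpc" type key.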
462 | |||
463 | |||
464 | For the server, keys of type "rxrpc_s" must be made available to the server. | ||
465 | They have a description of "<serviceID>:<securityIndex>" (eg: "52:2" for an | ||
466 | rxkad key for the AFS VL service). When such a key is created, it should be | ||
467 | given the server's secret key as the instantiation data (see the example | ||
468 | below). | ||
469 | |||
470 | add_key("rxrpc_s", "52:2", secret_key, 8, keyring); | ||
471 | |||
472 | A keyring is passed to the server socket by naming it in a sockopt. The server | ||
473 | socket then looks the server secret keys up in this keyring when secure | ||
474 | incoming connections are made. This can be seen in an example program that can | ||
475 | be found at: | ||
476 | |||
477 | http://people.redhat.com/~dhowells/rxrpc/listen.c | ||
478 | |||
479 | |||
480 | ==================== | ||
481 | EXAMPLE CLIENT USAGE | ||
482 | ==================== | ||
483 | |||
484 | A client would issue an operation by: | ||
485 | |||
486 | (1) An RxRPC socket is set up by: | ||
487 | |||
488 | client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET); | ||
489 | |||
490 | Where the third parameter indicates the protocol family of the transport | ||
491 | socket used - usually IPv4 but it can also be IPv6 [TODO]. | ||
492 | |||
493 | (2) A local address can optionally be bound: | ||
494 | |||
495 | struct sockaddr_rxrpc srx = { | ||
496 | .srx_family = AF_RXRPC, | ||
497 | .srx_service = 0, /* we're a client */ | ||
498 | .transport_type = SOCK_DGRAM, /* type of transport socket */ | ||
499 | .transport.sin_family = AF_INET, | ||
500 | .transport.sin_port = htons(7000), /* AFS callback */ | ||
501 | .transport.sin_address = 0, /* all local interfaces */ | ||
502 | }; | ||
503 | bind(client, &srx, sizeof(srx)); | ||
504 | |||
505 | This specifies the local UDP port to be used. If not given, a random | ||
506 | non-privileged port will be used. A UDP port may be shared between | ||
507 | several unrelated RxRPC sockets. Security is handled on a basis of | ||
508 | per-RxRPC virtual connection. | ||
509 | |||
510 | (3) The security is set: | ||
511 | |||
512 | const char *key = "AFS:cambridge.redhat.com"; | ||
513 | setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key)); | ||
514 | |||
515 | This issues a request_key() to get the key representing the security | ||
516 | context. The minimum security level can be set: | ||
517 | |||
518 | unsigned int sec = RXRPC_SECURITY_ENCRYPTED; | ||
519 | setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL, | ||
520 | &sec, sizeof(sec)); | ||
521 | |||
522 | (4) The server to be contacted can then be specified (alternatively this can | ||
523 | be done through sendmsg): | ||
524 | |||
525 | struct sockaddr_rxrpc srx = { | ||
526 | .srx_family = AF_RXRPC, | ||
527 | .srx_service = VL_SERVICE_ID, | ||
528 | .transport_type = SOCK_DGRAM, /* type of transport socket */ | ||
529 | .transport.sin_family = AF_INET, | ||
530 | .transport.sin_port = htons(7005), /* AFS volume manager */ | ||
531 | .transport.sin_address = ..., | ||
532 | }; | ||
533 | connect(client, &srx, sizeof(srx)); | ||
534 | |||
535 | (5) The request data should then be posted to the client socket using a series | ||
536 | of sendmsg() calls, each with the following control message attached: | ||
537 | |||
538 | RXRPC_USER_CALL_ID - specifies the user ID for this call | ||
539 | |||
540 | MSG_MORE should be set in msghdr::msg_flags on all but the last part of | ||
541 | the request. Multiple requests may be made simultaneously. | ||
542 | |||
543 | If a call is intended to go to a destination other than the default | ||
544 | specified through connect(), then msghdr::msg_name should be set on the | ||
545 | first request message of that call. | ||
546 | |||
547 | (6) The reply data will then be posted to the client socket for recvmsg() to | ||
548 | pick up. MSG_MORE will be flagged by recvmsg() if there's more reply data | ||
549 | for a particular call to be read. MSG_EOR will be set on the terminal | ||
550 | read for a call. | ||
551 | |||
552 | All data will be delivered with the following control message attached: | ||
553 | |||
554 | RXRPC_USER_CALL_ID - specifies the user ID for this call | ||
555 | |||
556 | If an abort or error occurred, this will be returned in the control data | ||
557 | buffer instead, and MSG_EOR will be flagged to indicate the end of that | ||
558 | call. | ||
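
Attaching RXRPC_USER_CALL_ID to a request fragment can be sketched as follows.
The EX_ constants are local stand-ins with assumed values for SOL_RXRPC and
RXRPC_USER_CALL_ID, and struct ex_request, ex_prepare_request() and
ex_send_request() are hypothetical names introduced for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define EX_SOL_RXRPC		272	/* assumed value of SOL_RXRPC */
#define EX_RXRPC_USER_CALL_ID	1	/* assumed value of RXRPC_USER_CALL_ID */

/* One request fragment; msg points into this struct, so do not copy it. */
struct ex_request {
	struct msghdr	msg;
	struct iovec	iov;
	union {
		char		buf[CMSG_SPACE(sizeof(unsigned long))];
		struct cmsghdr	align;	/* forces correct alignment */
	} control;
};

/* Fill in a msghdr carrying one data buffer and the call's user ID. */
static void ex_prepare_request(struct ex_request *rq, unsigned long call_id,
			       void *data, size_t len)
{
	struct cmsghdr *cmsg;

	assert(rq != NULL);
	memset(rq, 0, sizeof(*rq));
	rq->iov.iov_base = data;
	rq->iov.iov_len = len;
	rq->msg.msg_iov = &rq->iov;
	rq->msg.msg_iovlen = 1;
	rq->msg.msg_control = rq->control.buf;
	rq->msg.msg_controllen = sizeof(rq->control.buf);

	cmsg = CMSG_FIRSTHDR(&rq->msg);
	cmsg->cmsg_level = EX_SOL_RXRPC;
	cmsg->cmsg_type = EX_RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));
}

/*
 * Send the fragment on an AF_RXRPC client socket.  On Linux, sendmsg()
 * takes its flags from the third argument, so MSG_MORE is passed there.
 */
static ssize_t ex_send_request(int client, struct ex_request *rq, int more)
{
	return sendmsg(client, &rq->msg, more ? MSG_MORE : 0);
}
```

A request sent in two parts would call ex_send_request() with more set to 1
for the first part and 0 for the last, using the same call_id both times.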
559 | |||
560 | |||
561 | ==================== | ||
562 | EXAMPLE SERVER USAGE | ||
563 | ==================== | ||
564 | |||
565 | A server would be set up to accept operations in the following manner: | ||
566 | |||
567 | (1) An RxRPC socket is created by: | ||
568 | |||
569 | server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET); | ||
570 | |||
571 | Where the third parameter indicates the address type of the transport | ||
572 | socket used - usually IPv4. | ||
573 | |||
574 | (2) Security is set up if desired by giving the socket a keyring with server | ||
575 | secret keys in it: | ||
576 | |||
577 | keyring = add_key("keyring", "AFSkeys", NULL, 0, | ||
578 | KEY_SPEC_PROCESS_KEYRING); | ||
579 | |||
580 | const char secret_key[8] = { | ||
581 | 0xa7, 0x83, 0x8a, 0xcb, 0xc7, 0x83, 0xec, 0x94 }; | ||
582 | add_key("rxrpc_s", "52:2", secret_key, 8, keyring); | ||
583 | |||
584 | setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7); | ||
585 | |||
586 | The keyring can be manipulated after it has been given to the socket. This | ||
587 | permits the server to add more keys, replace keys, etc. whilst it is live. | ||
588 | |||
589 | (3) A local address must then be bound: | ||
590 | |||
591 | struct sockaddr_rxrpc srx = { | ||
592 | .srx_family = AF_RXRPC, | ||
593 | .srx_service = VL_SERVICE_ID, /* RxRPC service ID */ | ||
594 | .transport_type = SOCK_DGRAM, /* type of transport socket */ | ||
595 | .transport.sin_family = AF_INET, | ||
596 | .transport.sin_port = htons(7000), /* AFS callback */ | ||
597 | .transport.sin_address = 0, /* all local interfaces */ | ||
598 | }; | ||
599 | bind(server, &srx, sizeof(srx)); | ||
600 | |||
601 | (4) The server is then set to listen out for incoming calls: | ||
602 | |||
603 | listen(server, 100); | ||
604 | |||
605 | (5) The kernel notifies the server of pending incoming connections by sending | ||
606 | it a message for each. This is received with recvmsg() on the server | ||
607 | socket. It has no data, and has a single dataless control message | ||
608 | attached: | ||
609 | |||
610 | RXRPC_NEW_CALL | ||
611 | |||
612 | The address that can be passed back by recvmsg() at this point should be | ||
613 | ignored since the call for which the message was posted may have gone by | ||
614 | the time it is accepted - in which case the first call still on the queue | ||
615 | will be accepted. | ||
616 | |||
617 | (6) The server then accepts the new call by issuing a sendmsg() with two | ||
618 | pieces of control data and no actual data: | ||
619 | |||
620 | RXRPC_ACCEPT - indicate connection acceptance | ||
621 | RXRPC_USER_CALL_ID - specify user ID for this call | ||
622 | |||
623 | (7) The first request data packet will then be posted to the server socket for | ||
624 | recvmsg() to pick up. At that point, the RxRPC address for the call can | ||
625 | be read from the address fields in the msghdr struct. | ||
626 | |||
627 | Subsequent request data will be posted to the server socket for recvmsg() | ||
628 | to collect as it arrives. All but the last piece of the request data will | ||
629 | be delivered with MSG_MORE flagged. | ||
630 | |||
631 | All data will be delivered with the following control message attached: | ||
632 | |||
633 | RXRPC_USER_CALL_ID - specifies the user ID for this call | ||
634 | |||
635 | (8) The reply data should then be posted to the server socket using a series | ||
636 | of sendmsg() calls, each with the following control messages attached: | ||
637 | |||
638 | RXRPC_USER_CALL_ID - specifies the user ID for this call | ||
639 | |||
640 | MSG_MORE should be set in msghdr::msg_flags on all but the last message | ||
641 | for a particular call. | ||
642 | |||
643 | (9) The final ACK from the client will be posted for retrieval by recvmsg() | ||
644 | when it is received. It will take the form of a dataless message with two | ||
645 | control messages attached: | ||
646 | |||
647 | RXRPC_USER_CALL_ID - specifies the user ID for this call | ||
648 | RXRPC_ACK - indicates final ACK (no data) | ||
649 | |||
650 | MSG_EOR will be flagged to indicate that this is the final message for | ||
651 | this call. | ||
652 | |||
653 | (10) Up to the point the final packet of reply data is sent, the call can be | ||
654 | aborted by calling sendmsg() with a dataless message with the following | ||
655 | control messages attached: | ||
656 | |||
657 | RXRPC_USER_CALL_ID - specifies the user ID for this call | ||
658 | RXRPC_ABORT - indicates abort code (4 byte data) | ||
659 | |||
660 | Any packets waiting in the socket's receive queue will be discarded if | ||
661 | this is issued. | ||
662 | |||
663 | Note that all the communications for a particular service take place through | ||
664 | the one server socket, using control messages on sendmsg() and recvmsg() to | ||
665 | determine the call affected. | ||
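
The acceptance step, a sendmsg() with two control messages and no data, can be
sketched like this. The EX_ constants are local stand-ins with assumed values
for SOL_RXRPC, RXRPC_USER_CALL_ID and RXRPC_ACCEPT, and struct ex_accept,
ex_prepare_accept() and ex_accept_call() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define EX_SOL_RXRPC		272	/* assumed value of SOL_RXRPC */
#define EX_RXRPC_USER_CALL_ID	1	/* assumed value of RXRPC_USER_CALL_ID */
#define EX_RXRPC_ACCEPT		9	/* assumed value of RXRPC_ACCEPT */

/* A dataless message; msg points into this struct, so do not copy it. */
struct ex_accept {
	struct msghdr	msg;
	union {
		char		buf[CMSG_SPACE(0) +
				    CMSG_SPACE(sizeof(unsigned long))];
		struct cmsghdr	align;	/* forces correct alignment */
	} control;
};

/* Build control data that accepts a call and assigns it a user ID. */
static void ex_prepare_accept(struct ex_accept *ac, unsigned long call_id)
{
	struct cmsghdr *cmsg;

	assert(ac != NULL);
	memset(ac, 0, sizeof(*ac));	/* CMSG_NXTHDR() needs a zeroed buffer */
	ac->msg.msg_control = ac->control.buf;
	ac->msg.msg_controllen = sizeof(ac->control.buf);

	cmsg = CMSG_FIRSTHDR(&ac->msg);
	cmsg->cmsg_level = EX_SOL_RXRPC;
	cmsg->cmsg_type = EX_RXRPC_ACCEPT;
	cmsg->cmsg_len = CMSG_LEN(0);	/* dataless */

	cmsg = CMSG_NXTHDR(&ac->msg, cmsg);
	cmsg->cmsg_level = EX_SOL_RXRPC;
	cmsg->cmsg_type = EX_RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));
}

/* Issue the acceptance on the server socket. */
static ssize_t ex_accept_call(int server, struct ex_accept *ac)
{
	return sendmsg(server, &ac->msg, 0);
}
```

Per the control message descriptions, a failed ex_accept_call() with errno set
to ENODATA would mean there was no call left to accept, and EBADSLT that the
chosen user ID is already in use.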
666 | |||
667 | |||
668 | ========================= | ||
669 | AF_RXRPC KERNEL INTERFACE | ||
670 | ========================= | ||
671 | |||
672 | The AF_RXRPC module also provides an interface for use by in-kernel utilities | ||
673 | such as the AFS filesystem. This permits such a utility to: | ||
674 | |||
675 | (1) Use different keys directly on individual client calls on one socket | ||
676 | rather than having to open a whole slew of sockets, one for each key it | ||
677 | might want to use. | ||
678 | |||
679 | (2) Avoid having RxRPC call request_key() at the point of issue of a call or | ||
680 | opening of a socket. Instead the utility is responsible for requesting a | ||
681 | key at the appropriate point. AFS, for instance, would do this during VFS | ||
682 | operations such as open() or unlink(). The key is then handed through | ||
683 | when the call is initiated. | ||
684 | |||
685 | (3) Request the use of something other than GFP_KERNEL to allocate memory. | ||
686 | |||
687 | (4) Avoid the overhead of using the recvmsg() call. RxRPC messages can be | ||
688 | intercepted before they get put into the socket Rx queue and the socket | ||
689 | buffers manipulated directly. | ||
690 | |||
691 | To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket, | ||
692 | bind an address as appropriate and listen if it's to be a server socket, but | ||
693 | then it passes this to the kernel interface functions. | ||
694 | |||
695 | The kernel interface functions are as follows: | ||
696 | |||
697 | (*) Begin a new client call. | ||
698 | |||
699 | struct rxrpc_call * | ||
700 | rxrpc_kernel_begin_call(struct socket *sock, | ||
701 | struct sockaddr_rxrpc *srx, | ||
702 | struct key *key, | ||
703 | unsigned long user_call_ID, | ||
704 | gfp_t gfp); | ||
705 | |||
706 | This allocates the infrastructure to make a new RxRPC call and assigns | ||
707 | call and connection numbers. The call will be made on the UDP port that | ||
708 | the socket is bound to. The call will go to the destination address of a | ||
709 | connected client socket unless an alternative is supplied (srx is | ||
710 | non-NULL). | ||
711 | |||
712 | If a key is supplied then this will be used to secure the call instead of | ||
713 | the key bound to the socket with the RXRPC_SECURITY_KEY sockopt. Calls | ||
714 | secured in this way will still share connections if at all possible. | ||
715 | |||
716 | The user_call_ID is equivalent to that supplied to sendmsg() in the | ||
717 | control data buffer. It is entirely feasible to use this to point to a | ||
718 | kernel data structure. | ||
719 | |||
720 | If this function is successful, an opaque reference to the RxRPC call is | ||
721 | returned. The caller now holds a reference on this and it must be | ||
722 | properly ended. | ||
723 | |||
724 | (*) End a client call. | ||
725 | |||
726 | void rxrpc_kernel_end_call(struct rxrpc_call *call); | ||
727 | |||
728 | This is used to end a previously begun call. The user_call_ID is expunged | ||
729 | from AF_RXRPC's knowledge and will not be seen again in association with | ||
730 | the specified call. | ||
731 | |||
732 | (*) Send data through a call. | ||
733 | |||
734 | int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg, | ||
735 | size_t len); | ||
736 | |||
737 | This is used to supply either the request part of a client call or the | ||
738 | reply part of a server call. msg.msg_iovlen and msg.msg_iov specify the | ||
739 | data buffers to be used. msg_iov may not be NULL and must point | ||
740 | exclusively to in-kernel virtual addresses. msg.msg_flags may be given | ||
741 | MSG_MORE if there will be subsequent data sends for this call. | ||
742 | |||
743 | The msg must not specify a destination address, control data or any flags | ||
744 | other than MSG_MORE. len is the total amount of data to transmit. | ||
745 | |||
746 | (*) Abort a call. | ||
747 | |||
748 | void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code); | ||
749 | |||
750 | This is used to abort a call if it's still in an abortable state. The | ||
751 | abort code specified will be placed in the ABORT message sent. | ||
752 | |||
753 | (*) Intercept received RxRPC messages. | ||
754 | |||
755 | typedef void (*rxrpc_interceptor_t)(struct sock *sk, | ||
756 | unsigned long user_call_ID, | ||
757 | struct sk_buff *skb); | ||
758 | |||
759 | void | ||
760 | rxrpc_kernel_intercept_rx_messages(struct socket *sock, | ||
761 | rxrpc_interceptor_t interceptor); | ||
762 | |||
763 | This installs an interceptor function on the specified AF_RXRPC socket. | ||
764 | All messages that would otherwise wind up in the socket's Rx queue are | ||
765 | then diverted to this function. Note that care must be taken to process | ||
766 | the messages in the right order to maintain DATA message sequentiality. | ||
767 | |||
768 | The interceptor function itself is provided with the address of the socket | ||
769 | handling the incoming message, the ID assigned by the kernel utility to | ||
770 | the call, and the socket buffer containing the message. | ||
771 | |||
772 | The skb->mark field indicates the type of message: | ||
773 | |||
774 | MARK MEANING | ||
775 | =============================== ======================================= | ||
776 | RXRPC_SKB_MARK_DATA Data message | ||
777 | RXRPC_SKB_MARK_FINAL_ACK Final ACK received for an incoming call | ||
778 | RXRPC_SKB_MARK_BUSY Client call rejected as server busy | ||
779 | RXRPC_SKB_MARK_REMOTE_ABORT Call aborted by peer | ||
780 | RXRPC_SKB_MARK_NET_ERROR Network error detected | ||
781 | RXRPC_SKB_MARK_LOCAL_ERROR Local error encountered | ||
782 | RXRPC_SKB_MARK_NEW_CALL New incoming call awaiting acceptance | ||
783 | |||
784 | The remote abort message can be probed with rxrpc_kernel_get_abort_code(). | ||
785 | The two error messages can be probed with rxrpc_kernel_get_error_number(). | ||
786 | A new call can be accepted with rxrpc_kernel_accept_call(). | ||
787 | |||
788 | Data messages can have their contents extracted with the usual bunch of | ||
789 | socket buffer manipulation functions. A data message can be determined to | ||
790 | be the last one in a sequence with rxrpc_kernel_is_data_last(). When a | ||
791 | data message has been used up, rxrpc_kernel_data_delivered() should be | ||
792 | called on it. | ||
793 | |||
794 | Non-data messages should be handed to rxrpc_kernel_free_skb() to dispose | ||
795 | of. It is possible to get extra refs on all types of message for later | ||
796 | freeing, but this may pin the state of a call until the message is finally | ||
797 | freed. | ||
798 | |||
799 | (*) Accept an incoming call. | ||
800 | |||
801 | struct rxrpc_call * | ||
802 | rxrpc_kernel_accept_call(struct socket *sock, | ||
803 | unsigned long user_call_ID); | ||
804 | |||
805 | This is used to accept an incoming call and to assign it a call ID. This | ||
806 | function is similar to rxrpc_kernel_begin_call() and calls accepted must | ||
807 | be ended in the same way. | ||
808 | |||
809 | If this function is successful, an opaque reference to the RxRPC call is | ||
810 | returned. The caller now holds a reference on this and it must be | ||
811 | properly ended. | ||
812 | |||
813 | (*) Reject an incoming call. | ||
814 | |||
815 | int rxrpc_kernel_reject_call(struct socket *sock); | ||
816 | |||
817 | This is used to reject the first incoming call on the socket's queue with | ||
818 | a BUSY message. -ENODATA is returned if there were no incoming calls. | ||
819 | Other errors may be returned if the call had been aborted (-ECONNABORTED) | ||
820 | or had timed out (-ETIME). | ||
821 | |||
822 | (*) Record the delivery of a data message and free it. | ||
823 | |||
824 | void rxrpc_kernel_data_delivered(struct sk_buff *skb); | ||
825 | |||
826 | This is used to record a data message as having been delivered and to | ||
827 | update the ACK state for the call. The socket buffer will be freed. | ||
828 | |||
829 | (*) Free a message. | ||
830 | |||
831 | void rxrpc_kernel_free_skb(struct sk_buff *skb); | ||
832 | |||
833 | This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC | ||
834 | socket. | ||
835 | |||
836 | (*) Determine if a data message is the last one on a call. | ||
837 | |||
838 | bool rxrpc_kernel_is_data_last(struct sk_buff *skb); | ||
839 | |||
840 | This is used to determine if a socket buffer holds the last data message | ||
841 | to be received for a call (true will be returned if it does, false | ||
842 | if not). | ||
843 | |||
844 | The data message will be part of the reply on a client call and the | ||
845 | request on an incoming call. In the latter case there will be more | ||
846 | messages, but in the former case there will not. | ||
847 | |||
848 | (*) Get the abort code from an abort message. | ||
849 | |||
850 | u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb); | ||
851 | |||
852 | This is used to extract the abort code from a remote abort message. | ||
853 | |||
854 | (*) Get the error number from a local or network error message. | ||
855 | |||
856 | int rxrpc_kernel_get_error_number(struct sk_buff *skb); | ||
857 | |||
858 | This is used to extract the error number from a message indicating that | ||
859 | either a local error or a network error occurred. | ||
diff --git a/Documentation/networking/wan-router.txt b/Documentation/networking/wan-router.txt index 653978dcea7f..07dd6d9930a1 100644 --- a/Documentation/networking/wan-router.txt +++ b/Documentation/networking/wan-router.txt | |||
@@ -250,7 +250,6 @@ PRODUCT COMPONENTS AND RELATED FILES | |||
250 | sdladrv.h SDLA support module API definitions | 250 | sdladrv.h SDLA support module API definitions |
251 | sdlasfm.h SDLA firmware module definitions | 251 | sdlasfm.h SDLA firmware module definitions |
252 | if_wanpipe.h WANPIPE Socket definitions | 252 | if_wanpipe.h WANPIPE Socket definitions |
253 | if_wanpipe_common.h WANPIPE Socket/Driver common definitions. | ||
254 | sdlapci.h WANPIPE PCI definitions | 253 | sdlapci.h WANPIPE PCI definitions |
255 | 254 | ||
256 | 255 | ||