author | Gerrit Renker <gerrit@erg.abdn.ac.uk> | 2006-11-27 14:10:57 -0500 |
---|---|---|
committer | David S. Miller <davem@sunset.davemloft.net> | 2006-12-03 00:22:46 -0500 |
commit | ba4e58eca8aa9473b44fdfd312f26c4a2e7798b3 (patch) | |
tree | 700f8f989f48da480beb83b983637cfd2b5a3f67 /Documentation/networking | |
parent | 6051e2f4fb68fc8e5343db58fa680ece376f405c (diff) |
[NET]: Supporting UDP-Lite (RFC 3828) in Linux
This is a revision of the previously submitted patch, which alters
the way files are organized and compiled in the following manner:
* UDP and UDP-Lite now use separate object files
* source file dependencies resolved via header files
net/ipv{4,6}/udp_impl.h
* order of inclusion files in udp.c/udplite.c adapted
accordingly
[NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828)
This patch adds support for UDP-Lite to the IPv4 stack, provided as an
extension to the existing UDPv4 code:
* generic routines are all located in net/ipv4/udp.c
* UDP-Lite specific routines are in net/ipv4/udplite.c
* MIB/statistics support in /proc/net/snmp and /proc/net/udplite
* shared API with extensions for partial checksum coverage
[NET/IPv6]: Extension for UDP-Lite over IPv6
It extends the existing UDPv6 code base with support for UDP-Lite
in the same manner as per UDPv4. In particular,
* UDPv6 generic and shared code is in net/ipv6/udp.c
* UDP-Litev6 specific extensions are in net/ipv6/udplite.c
* MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6
* support for IPV6_ADDRFORM
* aligned the coding style of protocol initialisation with af_inet6.c
* made the error handling in udpv6_queue_rcv_skb consistent;
to return `-1' on error on all error cases
* consolidation of shared code
[NET]: UDP-Lite Documentation and basic XFRM/Netfilter support
The UDP-Lite patch further provides
* API documentation for UDP-Lite
* basic xfrm support
* basic netfilter support for IPv4 and IPv6 (LOG target)
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'Documentation/networking')
-rw-r--r-- | Documentation/networking/udplite.txt | 281 |
1 files changed, 281 insertions, 0 deletions
diff --git a/Documentation/networking/udplite.txt b/Documentation/networking/udplite.txt
new file mode 100644
index 00000000000..dd6f46b83da
--- /dev/null
+++ b/Documentation/networking/udplite.txt
@@ -0,0 +1,281 @@ | |||
1 | =========================================================================== | ||
2 | The UDP-Lite protocol (RFC 3828) | ||
3 | =========================================================================== | ||
4 | |||
5 | |||
6 | UDP-Lite is a Standards-Track IETF transport protocol whose characteristic | ||
7 | is a variable-length checksum. This has advantages for transport of multimedia | ||
8 | (video, VoIP) over wireless networks, as partly damaged packets can still be | ||
9 | fed into the codec instead of being discarded due to a failed checksum test. | ||
10 | |||
11 | This file briefly describes the existing kernel support and the socket API. | ||
12 | For in-depth information, you can consult: | ||
13 | |||
14 | o The UDP-Lite Homepage: http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/ | ||
15 | From here you can also download some example application source code. | ||
16 | |||
17 | o The UDP-Lite HOWTO on | ||
18 | http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/UDP-Lite-HOWTO.txt | ||
19 | |||
20 | o The Wireshark UDP-Lite Wiki (with capture files): | ||
21 | http://wiki.wireshark.org/Lightweight_User_Datagram_Protocol | ||
22 | |||
23 | o The Protocol Spec, RFC 3828, http://www.ietf.org/rfc/rfc3828.txt | ||
24 | |||
25 | |||
26 | I) APPLICATIONS | ||
27 | |||
28 | Several applications have been ported successfully to UDP-Lite. Ethereal | ||
29 | (now called Wireshark) has UDP-Litev4/v6 support by default. The tarball on | ||
30 | |||
31 | http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/udplite_linux.tar.gz | ||
32 | |||
33 | has source code for several v4/v6 client-server and network testing examples. | ||
34 | |||
35 | Porting applications to UDP-Lite is straightforward: only socket level and | ||
36 | IPPROTO need to be changed; senders additionally set the checksum coverage | ||
37 | length (default = header length = 8). Details are in the next section. | ||
38 | |||
39 | |||
40 | II) PROGRAMMING API | ||
41 | |||
42 | UDP-Lite provides a connectionless, unreliable datagram service and hence | ||
43 | uses the same socket type as UDP. In fact, porting from UDP to UDP-Lite is | ||
44 | very easy: simply add `IPPROTO_UDPLITE' as the last argument of the socket(2) | ||
45 | call so that the statement looks like: | ||
46 | |||
47 | s = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDPLITE); | ||
48 | |||
49 | or, respectively, | ||
50 | |||
51 | s = socket(PF_INET6, SOCK_DGRAM, IPPROTO_UDPLITE); | ||
52 | |||
53 | With just the above change you are able to run UDP-Lite services or connect | ||
54 | to UDP-Lite servers. The kernel will assume that you are not interested in | ||
55 | using partial checksum coverage and so emulate UDP mode (full coverage). | ||
56 | |||
57 | Making use of the partial checksum coverage facilities requires setting a | ||
58 | single socket option, which takes an integer specifying the coverage length: | ||
59 | |||
60 | * Sender checksum coverage: UDPLITE_SEND_CSCOV | ||
61 | |||
62 | For example, | ||
63 | |||
64 | int val = 20; | ||
65 | setsockopt(s, SOL_UDPLITE, UDPLITE_SEND_CSCOV, &val, sizeof(int)); | ||
66 | |||
67 | sets the checksum coverage length to 20 bytes (12 bytes data + 8 bytes header). | ||
68 | Of each packet only the first 20 bytes (plus the pseudo-header) will be | ||
69 | checksummed. This is useful for RTP applications which have a 12-byte | ||
70 | base header. | ||
71 | |||
72 | |||
73 | * Receiver checksum coverage: UDPLITE_RECV_CSCOV | ||
74 | |||
75 | This option is the receiver-side analogue. It is truly optional, i.e. not | ||
76 | required to enable traffic with partial checksum coverage. Its function is | ||
77 | that of a traffic filter: when enabled, it instructs the kernel to drop | ||
78 | all packets which have a coverage _less_ than this value. For example, if | ||
79 | RTP and UDP headers are to be protected, a receiver can enforce that only | ||
80 | packets with a minimum coverage of 20 are admitted: | ||
81 | |||
82 | int min = 20; | ||
83 | setsockopt(s, SOL_UDPLITE, UDPLITE_RECV_CSCOV, &min, sizeof(int)); | ||
84 | |||
85 | The calls to getsockopt(2) are analogous. Being an extension and not a stand- | ||
86 | alone protocol, all socket options known from UDP can be used in exactly the | ||
87 | same manner as before, e.g. UDP_CORK or UDP_ENCAP. | ||
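The two options and the analogous getsockopt(2) calls can be combined in one small helper. The following is only a sketch: the function name udplite_set_coverage() is hypothetical and not part of any API, and the fallback constants are the ones from the `mini' header in section III, used here only when the system headers do not already provide them.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Fallback constants (values from the `mini' header in section III);
 * only used when the system headers do not define them. */
#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE     136
#endif
#ifndef SOL_UDPLITE
#define SOL_UDPLITE         136
#endif
#ifndef UDPLITE_SEND_CSCOV
#define UDPLITE_SEND_CSCOV  10
#endif
#ifndef UDPLITE_RECV_CSCOV
#define UDPLITE_RECV_CSCOV  11
#endif

/*
 * Hypothetical helper: set the sender coverage and the minimum receiver
 * coverage on socket `fd', then read both values back with getsockopt(2).
 * Returns 0 on success, -1 if any call fails (errno is left set).
 */
int udplite_set_coverage(int fd, int send_cov, int recv_min)
{
	int val;
	socklen_t len = sizeof(val);

	if (setsockopt(fd, SOL_UDPLITE, UDPLITE_SEND_CSCOV,
		       &send_cov, sizeof(send_cov)) < 0)
		return -1;
	if (setsockopt(fd, SOL_UDPLITE, UDPLITE_RECV_CSCOV,
		       &recv_min, sizeof(recv_min)) < 0)
		return -1;

	if (getsockopt(fd, SOL_UDPLITE, UDPLITE_SEND_CSCOV, &val, &len) < 0)
		return -1;
	printf("sender coverage:       %d\n", val);

	len = sizeof(val);
	if (getsockopt(fd, SOL_UDPLITE, UDPLITE_RECV_CSCOV, &val, &len) < 0)
		return -1;
	printf("min receiver coverage: %d\n", val);

	return 0;
}
```

A caller would typically pass a descriptor obtained from socket(PF_INET, SOCK_DGRAM, IPPROTO_UDPLITE), for instance udplite_set_coverage(s, 20, 20) for the RTP case above. The values read back may differ from the ones passed in if the kernel adjusts them.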
88 | |||
89 | A detailed discussion of UDP-Lite checksum coverage options is in section IV. | ||
90 | |||
91 | |||
92 | III) HEADER FILES | ||
93 | |||
94 | The socket API requires support through header files in /usr/include: | ||
95 | |||
96 | * /usr/include/netinet/in.h | ||
97 | to define IPPROTO_UDPLITE | ||
98 | |||
99 | * /usr/include/netinet/udplite.h | ||
100 | for UDP-Lite header fields and protocol constants | ||
101 | |||
102 | For testing purposes, the following can serve as a `mini' header file: | ||
103 | |||
104 | #define IPPROTO_UDPLITE 136 | ||
105 | #define SOL_UDPLITE 136 | ||
106 | #define UDPLITE_SEND_CSCOV 10 | ||
107 | #define UDPLITE_RECV_CSCOV 11 | ||
108 | |||
109 | Ready-made header files for various distros are in the UDP-Lite tarball. | ||
110 | |||
111 | |||
112 | IV) KERNEL BEHAVIOUR WITH REGARD TO THE VARIOUS SOCKET OPTIONS | ||
113 | |||
114 | To enable debugging messages, the log level needs to be set to 8, as most | ||
115 | messages use the KERN_DEBUG level (7). | ||
116 | |||
117 | 1) Sender Socket Options | ||
118 | |||
119 | If the sender specifies a value of 0 as coverage length, the module | ||
120 | assumes full coverage and transmits a packet with a coverage length | ||
121 | of 0 and the corresponding checksum. If the sender specifies a coverage < 8 and | ||
122 | different from 0, the kernel assumes 8 as default value. Finally, | ||
123 | if the specified coverage length exceeds the packet length, the packet | ||
124 | length is used instead as coverage length. | ||
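The sender rules above can be written down as a small pure function. This is an illustrative model of the behaviour as described in this section, not the kernel code itself; the name sender_effective_coverage() and the constant UDPLITE_HDR_LEN are made up for the example.

```c
#define UDPLITE_HDR_LEN 8	/* UDP-Lite header length in bytes */

/*
 * Number of bytes that end up covered by the checksum for one packet.
 * `requested' is the value set via UDPLITE_SEND_CSCOV, `pkt_len' is the
 * packet length including the 8-byte UDP-Lite header.
 * (Illustrative model of the rules in the text, not kernel code.)
 */
int sender_effective_coverage(int requested, int pkt_len)
{
	if (requested == 0)
		return pkt_len;			/* 0 means full coverage */
	if (requested < UDPLITE_HDR_LEN)
		return UDPLITE_HDR_LEN;		/* header must be covered */
	if (requested > pkt_len)
		return pkt_len;			/* clamp to packet length */
	return requested;
}
```

For example, a request of 5 on a 1032-byte packet is raised to 8, while a request of 856 on a 520-byte packet is reduced to 520.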
125 | |||
126 | 2) Receiver Socket Options | ||
127 | |||
128 | The receiver specifies the minimum value of the coverage length it | ||
129 | is willing to accept. A value of 0 here indicates that the receiver | ||
130 | always wants the whole of the packet covered. In this case, all | ||
131 | partially covered packets are dropped and an error is logged. | ||
132 | |||
133 | It is not possible to specify illegal values (negative values, or | ||
134 | values from 1 to 7); in these cases the default of 8 is assumed. | ||
135 | |||
136 | All packets arriving with a coverage value less than the specified | ||
137 | threshold are discarded; these events are also logged. | ||
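The receiver-side filter can be modelled in the same way. The sketch below follows the description in this section only; receiver_accepts() is a hypothetical name, and it assumes that a coverage field of 0 in an incoming packet means the whole packet is covered.

```c
#include <stdbool.h>

#define UDPLITE_HDR_LEN 8	/* UDP-Lite header length in bytes */

/*
 * Decide whether a packet passes the UDPLITE_RECV_CSCOV filter.
 * `min_cov'  value set by the application (0 = accept only full coverage)
 * `pkt_cov'  coverage field of the incoming packet (0 = full coverage)
 * `pkt_len'  length of the packet including the 8-byte header
 * (Illustrative model of the rules in the text, not kernel code.)
 */
bool receiver_accepts(int min_cov, int pkt_cov, int pkt_len)
{
	int covered = (pkt_cov == 0) ? pkt_len : pkt_cov;

	if (min_cov == 0)
		return covered == pkt_len;	/* only fully covered packets */
	if (min_cov < UDPLITE_HDR_LEN)
		min_cov = UDPLITE_HDR_LEN;	/* illegal value: default of 8 */

	return covered >= min_cov;
}
```

With a minimum of 20, a packet that announces a coverage of 12 is dropped, whereas one with a coverage of 20 or with full coverage is admitted.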
138 | |||
139 | 3) Disabling the Checksum Computation | ||
140 | |||
141 | On both sender and receiver, checksumming will always be performed | ||
142 | and can not be disabled using SO_NO_CHECK. Thus | ||
143 | |||
144 | setsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK, ... ); | ||
145 | |||
146 | will always be ignored, while the value of | ||
147 | |||
148 | getsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK, &value, ...); | ||
149 | |||
150 | is meaningless (as in TCP). Packets with a zero checksum field are | ||
151 | illegal (cf. RFC 3828, sec. 3.1) and will be silently discarded. | ||
152 | |||
153 | 4) Fragmentation | ||
154 | |||
155 | The checksum computation respects both buffersize and MTU. The size | ||
156 | of UDP-Lite packets is determined by the size of the send buffer. The | ||
157 | minimum size of the send buffer is 2048 (defined as SOCK_MIN_SNDBUF | ||
158 | in include/net/sock.h), the default value is configurable as | ||
159 | net.core.wmem_default or via setting the SO_SNDBUF socket(7) | ||
160 | option. The upper bound for the send buffer is determined | ||
161 | by net.core.wmem_max. | ||
162 | |||
163 | Given a payload size larger than the send buffer size, UDP-Lite will | ||
164 | split the payload into several individual packets, filling up the | ||
165 | send buffer size in each case. | ||
166 | |||
167 | The precise value also depends on the interface MTU. The interface MTU, | ||
168 | in turn, may trigger IP fragmentation. In this case, the generated | ||
169 | UDP-Lite packet is split into several IP packets, of which only the | ||
170 | first one contains the L4 header. | ||
171 | |||
172 | The send buffer size has implications on the checksum coverage length. | ||
173 | Consider the following example: | ||
174 | |||
175 | Payload: 1536 bytes Send Buffer: 1024 bytes | ||
176 | MTU: 1500 bytes Coverage Length: 856 bytes | ||
177 | |||
178 | UDP-Lite will ship the 1536 bytes in two separate packets: | ||
179 | |||
180 | Packet 1: 1024 payload + 8 byte header + 20 byte IP header = 1052 bytes | ||
181 | Packet 2: 512 payload + 8 byte header + 20 byte IP header = 540 bytes | ||
182 | |||
183 | The coverage length covers the UDP-Lite header and 848 bytes of the | ||
184 | payload in the first packet; the second packet is fully covered. Note | ||
185 | that for the second packet, the coverage length exceeds the packet | ||
186 | length. The kernel always re-adjusts the coverage length to the packet | ||
187 | length in such cases. | ||
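The arithmetic of this example can be reproduced with a few lines of C. The sketch below is a model of the splitting behaviour as described here, with hypothetical names (struct udplite_pkt, udplite_split()); it does not send anything.

```c
#define UDPLITE_HDR_LEN 8	/* UDP-Lite header length in bytes */

struct udplite_pkt {
	int len;	/* payload chunk + 8-byte UDP-Lite header */
	int coverage;	/* bytes of this packet that are checksummed */
};

/*
 * Split `payload' bytes into packets of at most `sndbuf' payload bytes
 * and compute the coverage applied to each one for a requested coverage
 * of `cscov'. Returns the number of entries written to `out'.
 * (Illustrative model of the rules in the text, not kernel code.)
 */
int udplite_split(int payload, int sndbuf, int cscov,
		  struct udplite_pkt *out, int max_pkts)
{
	int n = 0;

	if (sndbuf <= 0)
		return 0;

	while (payload > 0 && n < max_pkts) {
		int chunk = payload < sndbuf ? payload : sndbuf;
		int len = chunk + UDPLITE_HDR_LEN;
		int cov = cscov;

		if (cov == 0 || cov > len)
			cov = len;		/* full coverage */
		else if (cov < UDPLITE_HDR_LEN)
			cov = UDPLITE_HDR_LEN;	/* at least the header */

		out[n].len = len;
		out[n].coverage = cov;
		n++;
		payload -= chunk;
	}
	return n;
}
```

Called as udplite_split(1536, 1024, 856, pkts, 4), this yields two entries: one of 1032 bytes with 856 bytes covered, and one of 520 bytes that is fully covered. Adding the 20-byte IP header gives the 1052 and 540 bytes shown above.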
188 | |||
189 | As an example of what happens when one UDP-Lite packet is split into | ||
190 | several tiny fragments, consider the following. | ||
191 | |||
192 | Payload: 1024 bytes Send buffer size: 1024 bytes | ||
193 | MTU: 300 bytes Coverage length: 575 bytes | ||
194 | |||
195 | +-+-----------+--------------+--------------+--------------+ | ||
196 | |8|    272    |     280      |     280      |     192      | | ||
197 | +-+-----------+--------------+--------------+--------------+ | ||
198 |             280            560            840           1032 | ||
199 |                               ^ | ||
200 | *******checksum coverage******* | ||
201 | |||
202 | The UDP-Lite module generates one 1032 byte packet (1024 + 8 byte | ||
203 | header). According to the interface MTU, this packet is split into 4 IP | ||
204 | packets (up to 280 bytes IP payload + 20 byte IP header). The kernel | ||
205 | module sums the contents of the entire first two packets, plus 15 bytes | ||
206 | of the third packet, before releasing the fragments to the IP module. | ||
207 | |||
208 | To see the analogous case for IPv6 fragmentation, consider a link | ||
209 | MTU of 1280 bytes and a write buffer of 3356 bytes. If the checksum | ||
210 | coverage is less than 1232 bytes (MTU minus IPv6/fragment header | ||
211 | lengths), only the first fragment needs to be considered. When using | ||
212 | larger checksum coverage lengths, each eligible fragment needs to be | ||
213 | checksummed. Suppose we have a checksum coverage of 3062. The buffer | ||
214 | of 3356 bytes will be split into the following fragments: | ||
215 | |||
216 | Fragment 1: 1280 bytes carrying 1232 bytes of UDP-Lite data | ||
217 | Fragment 2: 1280 bytes carrying 1232 bytes of UDP-Lite data | ||
218 | Fragment 3: 948 bytes carrying 900 bytes of UDP-Lite data | ||
219 | |||
220 | The first two fragments have to be checksummed in full, of the last | ||
221 | fragment only 598 (= 3062 - 2*1232) bytes are checksummed. | ||
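Both fragmentation examples follow the same arithmetic: the coverage is consumed fragment by fragment, starting from the UDP-Lite header. A minimal sketch of that calculation is given below; covered_in_fragment() is a hypothetical name used for illustration, and the function only models the byte counts, not the checksum itself.

```c
/*
 * Number of bytes of fragment `index' (0-based) that fall under the
 * checksum coverage.
 * `total_len'     UDP-Lite packet length (header + payload)
 * `frag_payload'  UDP-Lite bytes carried per full-sized fragment
 * `coverage'      checksum coverage length
 * (Illustrative model of the examples in the text, not kernel code.)
 */
int covered_in_fragment(int total_len, int frag_payload,
			int coverage, int index)
{
	int start = index * frag_payload;	/* offset of this fragment */
	int frag_len, remaining;

	if (frag_payload <= 0 || start >= total_len)
		return 0;			/* no such fragment */

	frag_len = total_len - start;
	if (frag_len > frag_payload)
		frag_len = frag_payload;

	if (coverage <= start)
		return 0;			/* coverage ended earlier */

	remaining = coverage - start;
	return remaining < frag_len ? remaining : frag_len;
}
```

For the IPv4 example (total length 1032, 280 bytes per fragment, coverage 575) the four fragments contribute 280, 280, 15 and 0 bytes. For the IPv6 example (total length 3364, 1232 bytes per fragment, coverage 3062) the three fragments contribute 1232, 1232 and 598 bytes.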
222 | |||
223 | While it is important that such cases are dealt with correctly, they | ||
224 | are (annoyingly) rare: UDP-Lite is designed for optimising multimedia | ||
225 | performance over wireless (or generally noisy) links and thus smaller | ||
226 | coverage lengths are to be expected. | ||
227 | |||
228 | |||
229 | V) UDP-LITE RUNTIME STATISTICS AND THEIR MEANING | ||
230 | |||
231 | Exceptional and error conditions are logged to syslog at the KERN_DEBUG | ||
232 | level. Live statistics about UDP-Lite are available in /proc/net/snmp | ||
233 | and can (with newer versions of netstat) be viewed using | ||
234 | |||
235 | netstat -svu | ||
236 | |||
237 | This displays UDP-Lite statistics variables, whose meaning is as follows. | ||
238 | |||
239 | InDatagrams: Total number of received datagrams. | ||
240 | |||
241 | NoPorts: Number of packets received to an unknown port. | ||
242 | These cases are counted separately (not as InErrors). | ||
243 | |||
244 | InErrors: Number of erroneous UDP-Lite packets. Errors include: | ||
245 | * internal socket queue receive errors | ||
246 | * packet too short (less than 8 bytes or stated | ||
247 | coverage length exceeds received length) | ||
248 | * xfrm4_policy_check() returned with error | ||
249 | * application has specified larger min. coverage | ||
250 | length than that of incoming packet | ||
251 | * checksum coverage violated | ||
252 | * bad checksum | ||
253 | |||
254 | OutDatagrams: Total number of sent datagrams. | ||
255 | |||
256 | These statistics derive from the UDP MIB (RFC 2013). | ||
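Applications that want these counters without going through netstat can read them from /proc/net/snmp directly. The sketch below assumes the usual layout of that file, namely a line of counter names followed by a line of values, both starting with the same prefix (assumed here to be "UdpLite:"). The function name snmp_counter() is hypothetical, and the strings in the usage note are made-up sample data, not real output.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Look up counter `key' given the names line and the values line of one
 * protocol block from /proc/net/snmp, e.g.
 *   names:  "UdpLite: InDatagrams NoPorts InErrors OutDatagrams"
 *   values: "UdpLite: 12 3 1 7"
 * Returns the counter value, or -1 if the key is not found or the
 * lines are malformed. (Sketch under the layout assumption above.)
 */
long long snmp_counter(const char *names, const char *values,
		       const char *key)
{
	size_t klen = strlen(key);

	/* skip the "Proto:" prefix on both lines */
	names = strchr(names, ':');
	values = strchr(values, ':');
	if (!names || !values)
		return -1;
	names++;
	values++;

	for (;;) {
		char *end;
		long long v;
		size_t nlen;

		names += strspn(names, " \n");
		nlen = strcspn(names, " \n");
		if (nlen == 0)
			return -1;	/* ran out of names */

		v = strtoll(values, &end, 10);
		if (end == values)
			return -1;	/* ran out of values */

		if (nlen == klen && strncmp(names, key, klen) == 0)
			return v;

		names += nlen;
		values = end;
	}
}
```

With the sample lines from the comment, snmp_counter(names, values, "InErrors") returns 1. A real program would first read the two matching lines from /proc/net/snmp, for example with fgets(3).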
257 | |||
258 | |||
259 | VI) IPTABLES | ||
260 | |||
261 | There is packet match support for UDP-Lite as well as support for the LOG target. | ||
262 | If you copy and paste the following line into /etc/protocols, | ||
263 | |||
264 | udplite 136 UDP-Lite # UDP-Lite [RFC 3828] | ||
265 | |||
266 | then | ||
267 | iptables -A INPUT -p udplite -j LOG | ||
268 | |||
269 | will produce logging output to syslog. Dropping and rejecting packets also works. | ||
270 | |||
271 | |||
272 | VII) MAINTAINER ADDRESS | ||
273 | |||
274 | The UDP-Lite patch was developed at | ||
275 | University of Aberdeen | ||
276 | Electronics Research Group | ||
277 | Department of Engineering | ||
278 | Fraser Noble Building | ||
279 | Aberdeen AB24 3UE; UK | ||
280 | The current maintainer is Gerrit Renker, <gerrit@erg.abdn.ac.uk>. Initial | ||
281 | code was developed by William Stanislaus, <william@erg.abdn.ac.uk>. | ||