-rw-r--r-- | Documentation/networking/ip-sysctl.txt | 56 | ||||
-rw-r--r-- | Documentation/networking/tcp.txt | 69 |
2 files changed, 73 insertions, 52 deletions
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index a2c893a7475d..ab65714d95fc 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt | |||
@@ -304,57 +304,6 @@ tcp_low_latency - BOOLEAN | |||
304 | changed would be a Beowulf compute cluster. | 304 | changed would be a Beowulf compute cluster. |
305 | Default: 0 | 305 | Default: 0 |
306 | 306 | ||
307 | tcp_westwood - BOOLEAN | ||
308 | Enable TCP Westwood+ congestion control algorithm. | ||
309 | TCP Westwood+ is a sender-side only modification of the TCP Reno | ||
310 | protocol stack that optimizes the performance of TCP congestion | ||
311 | control. It is based on end-to-end bandwidth estimation to set | ||
312 | congestion window and slow start threshold after a congestion | ||
313 | episode. Using this estimation, TCP Westwood+ adaptively sets a | ||
314 | slow start threshold and a congestion window which takes into | ||
315 | account the bandwidth used at the time congestion is experienced. | ||
316 | TCP Westwood+ significantly increases fairness wrt TCP Reno in | ||
317 | wired networks and throughput over wireless links. | ||
318 | Default: 0 | ||
319 | |||
320 | tcp_vegas_cong_avoid - BOOLEAN | ||
321 | Enable TCP Vegas congestion avoidance algorithm. | ||
322 | TCP Vegas is a sender-side only change to TCP that anticipates | ||
323 | the onset of congestion by estimating the bandwidth. TCP Vegas | ||
324 | adjusts the sending rate by modifying the congestion | ||
325 | window. TCP Vegas should provide less packet loss, but it is | ||
326 | not as aggressive as TCP Reno. | ||
327 | Default:0 | ||
328 | |||
329 | tcp_bic - BOOLEAN | ||
330 | Enable BIC TCP congestion control algorithm. | ||
331 | BIC-TCP is a sender-side only change that ensures a linear RTT | ||
332 | fairness under large windows while offering both scalability and | ||
333 | bounded TCP-friendliness. The protocol combines two schemes | ||
334 | called additive increase and binary search increase. When the | ||
335 | congestion window is large, additive increase with a large | ||
336 | increment ensures linear RTT fairness as well as good | ||
337 | scalability. Under small congestion windows, binary search | ||
338 | increase provides TCP friendliness. | ||
339 | Default: 0 | ||
340 | |||
341 | tcp_bic_low_window - INTEGER | ||
342 | Sets the threshold window (in packets) where BIC TCP starts to | ||
343 | adjust the congestion window. Below this threshold BIC TCP behaves | ||
344 | the same as the default TCP Reno. | ||
345 | Default: 14 | ||
346 | |||
347 | tcp_bic_fast_convergence - BOOLEAN | ||
348 | Forces BIC TCP to more quickly respond to changes in congestion | ||
349 | window. Allows two flows sharing the same connection to converge | ||
350 | more rapidly. | ||
351 | Default: 1 | ||
352 | |||
353 | tcp_default_win_scale - INTEGER | ||
354 | Sets the minimum window scale TCP will negotiate for on all | ||
355 | connections. | ||
356 | Default: 7 | ||
357 | |||
358 | tcp_tso_win_divisor - INTEGER | 307 | tcp_tso_win_divisor - INTEGER |
359 | This allows control over what percentage of the congestion window | 308 | This allows control over what percentage of the congestion window |
360 | can be consumed by a single TSO frame. | 309 | can be consumed by a single TSO frame. |
@@ -368,6 +317,11 @@ tcp_frto - BOOLEAN | |||
368 | where packet loss is typically due to random radio interference | 317 | where packet loss is typically due to random radio interference |
369 | rather than intermediate router congestion. | 318 | rather than intermediate router congestion. |
370 | 319 | ||
320 | tcp_congestion_control - STRING | ||
321 | Set the congestion control algorithm to be used for new | ||
322 | connections. The algorithm "reno" is always available, but | ||
323 | additional choices may be available based on kernel configuration. | ||
324 | |||
371 | somaxconn - INTEGER | 325 | somaxconn - INTEGER |
372 | Limit of socket listen() backlog, known in userspace as SOMAXCONN. | 326 | Limit of socket listen() backlog, known in userspace as SOMAXCONN. |
373 | Defaults to 128. See also tcp_max_syn_backlog for additional tuning | 327 | Defaults to 128. See also tcp_max_syn_backlog for additional tuning |
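
The tcp_congestion_control entry added in the hunk above is a string sysctl. As a minimal userspace sketch, and assuming the sysctl is exposed at the conventional procfs path /proc/sys/net/ipv4/tcp_congestion_control (the patch itself only names the sysctl), the current algorithm could be read and changed roughly as follows. The helper names read_tcp_cc and write_tcp_cc are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Assumed procfs location of net.ipv4.tcp_congestion_control;
 * the patch names the sysctl but not this path. */
#define TCP_CC_SYSCTL_PATH "/proc/sys/net/ipv4/tcp_congestion_control"

/* Hypothetical helper: read the algorithm name stored in `path` into
 * `buf`, dropping the trailing newline.
 * Returns 0 on success, -1 on failure. */
int read_tcp_cc(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Hypothetical helper: write `name` to `path`.
 * Returns 0 on success, -1 on failure. The result of fclose() is
 * checked because stdio may only report a rejected write when the
 * buffer is flushed. */
int write_tcp_cc(const char *path, const char *name)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs(name, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling read_tcp_cc(TCP_CC_SYSCTL_PATH, buf, sizeof(buf)) would then return the name in use for new connections. Writing normally requires root, and tcp.txt in this patch states that asking for an unknown algorithm makes the sysctl attempt fail.
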
diff --git a/Documentation/networking/tcp.txt b/Documentation/networking/tcp.txt index 71749007091e..0fa300425575 100644 --- a/Documentation/networking/tcp.txt +++ b/Documentation/networking/tcp.txt | |||
@@ -1,5 +1,72 @@ | |||
1 | How the new TCP output machine [nyi] works. | 1 | TCP protocol |
2 | ============ | ||
3 | |||
4 | Last updated: 21 June 2005 | ||
5 | |||
6 | Contents | ||
7 | ======== | ||
8 | |||
9 | - Congestion control | ||
10 | - How the new TCP output machine [nyi] works | ||
11 | |||
12 | Congestion control | ||
13 | ================== | ||
14 | |||
15 | The following variables are used in the tcp_sock for congestion control: | ||
16 | snd_cwnd The size of the congestion window | ||
17 | snd_ssthresh Slow start threshold. We are in slow start if | ||
18 | snd_cwnd is less than this. | ||
19 | snd_cwnd_cnt A counter used to slow down the rate of increase | ||
20 | once we exceed slow start threshold. | ||
21 | snd_cwnd_clamp This is the maximum size that snd_cwnd can grow to. | ||
22 | snd_cwnd_stamp Timestamp for when the congestion window was last validated. | ||
23 | snd_cwnd_used Used as a highwater mark for how much of the | ||
24 | congestion window is in use. It is used to adjust | ||
25 | snd_cwnd down when the link is limited by the | ||
26 | application rather than the network. | ||
27 | |||
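
To illustrate how the variables listed above interact, here is a minimal userspace sketch of Reno-style window growth. It is not the kernel's code: struct cwnd_state is a hypothetical stand-in for the relevant tcp_sock fields, and the per-ACK rule (one increment per ACK in slow start, about one increment per window of ACKs afterwards) is assumed for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the tcp_sock fields described in the text;
 * this is not the kernel structure. */
struct cwnd_state {
    uint32_t snd_cwnd;        /* congestion window, in packets */
    uint32_t snd_ssthresh;    /* slow start threshold */
    uint32_t snd_cwnd_cnt;    /* ACK counter used past the threshold */
    uint32_t snd_cwnd_clamp;  /* upper bound for snd_cwnd */
};

/* Slow start applies while snd_cwnd is below snd_ssthresh. */
int in_slow_start(const struct cwnd_state *s)
{
    return s->snd_cwnd < s->snd_ssthresh;
}

/* Process one ACK with an assumed Reno-style growth rule. */
void on_ack(struct cwnd_state *s)
{
    if (in_slow_start(s)) {
        /* Slow start: grow by one packet per ACK, up to the clamp. */
        if (s->snd_cwnd < s->snd_cwnd_clamp)
            s->snd_cwnd++;
        return;
    }

    /* Past the threshold: snd_cwnd_cnt slows the rate of increase to
     * roughly one packet per window's worth of ACKs. */
    if (s->snd_cwnd_cnt >= s->snd_cwnd) {
        if (s->snd_cwnd < s->snd_cwnd_clamp)
            s->snd_cwnd++;
        s->snd_cwnd_cnt = 0;
    } else {
        s->snd_cwnd_cnt++;
    }
}
```

In this sketch snd_cwnd_clamp caps growth in both phases. snd_cwnd_stamp and snd_cwnd_used are left out because they only matter for window validation when the application, rather than the network, limits the flow.
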
28 | As of 2.6.13, Linux supports pluggable congestion control algorithms. | ||
29 | A congestion control mechanism can be registered through functions in | ||
30 | tcp_cong.c. The functions used by the congestion control mechanism are | ||
31 | registered via passing a tcp_congestion_ops struct to | ||
32 | tcp_register_congestion_control. At a minimum, name, ssthresh, | ||
33 | cong_avoid and min_cwnd must be valid. | ||
2 | 34 | ||
35 | Private data for a congestion control mechanism is stored in tp->ca_priv. | ||
36 | tcp_ca(tp) returns a pointer to this space. This is preallocated space - it | ||
37 | is important to check that your private data will fit in this space, or | ||
38 | alternatively space could be allocated elsewhere and a pointer to it could | ||
39 | be stored here. | ||
40 | |||
41 | There are three kinds of congestion control algorithms currently: The | ||
42 | simplest ones are derived from TCP reno (highspeed, scalable) and just | ||
43 | provide an alternative congestion window calculation. More complex | ||
44 | ones like BIC try to look at other events to provide better | ||
45 | heuristics. There are also round trip time based algorithms like | ||
46 | Vegas and Westwood+. | ||
47 | |||
48 | Good TCP congestion control is a complex problem because the algorithm | ||
49 | needs to maintain fairness and performance. Please review current | ||
50 | research and RFCs before developing new modules. | ||
51 | |||
52 | The congestion control mechanism that is used is determined by the | ||
53 | setting of the sysctl net.ipv4.tcp_congestion_control. | ||
54 | The default congestion control will be the last one registered (LIFO); | ||
55 | so if you built everything as modules, the default will be reno. If you | ||
56 | build with the defaults from Kconfig, then BIC will be builtin (not a module) | ||
57 | and it will end up the default. | ||
57 | and it will end up the default. | ||
58 | |||
59 | If you really want a particular default value then you will need | ||
60 | to set it with the sysctl. If you use a sysctl, the module will be autoloaded | ||
61 | if needed and you will get the expected protocol. If you ask for an | ||
62 | unknown congestion method, then the sysctl attempt will fail. | ||
63 | |||
64 | If you remove a tcp congestion control module, then you will get the next | ||
65 | available one. Since reno can not be built as a module, and can not be | ||
66 | deleted, it will always be available. | ||
67 | |||
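
The default-selection rules described above (last registered wins, removal falls back to the next available algorithm, reno can never be removed) can be modelled in userspace. The following is only a sketch of that behaviour under stated assumptions: struct ca_registry and the ca_* functions are hypothetical names and are not the kernel's tcp_congestion_ops registration API.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CA 8

/* Hypothetical userspace model of the registration list. */
struct ca_registry {
    const char *names[MAX_CA];  /* names[0] is the current default */
    size_t count;
};

/* reno is always present, so the registry starts with it. */
void ca_init(struct ca_registry *r)
{
    r->names[0] = "reno";
    r->count = 1;
}

/* Register at the head of the list (LIFO): the newest entry becomes
 * the default. Returns 0 on success, -1 if full or already present. */
int ca_register(struct ca_registry *r, const char *name)
{
    size_t i;

    if (r->count == MAX_CA)
        return -1;
    for (i = 0; i < r->count; i++)
        if (strcmp(r->names[i], name) == 0)
            return -1;
    memmove(&r->names[1], &r->names[0], r->count * sizeof(r->names[0]));
    r->names[0] = name;
    r->count++;
    return 0;
}

/* Remove an algorithm; the next entry in the list takes its place.
 * reno cannot be removed. Returns 0 on success, -1 otherwise. */
int ca_unregister(struct ca_registry *r, const char *name)
{
    size_t i;

    if (strcmp(name, "reno") == 0)
        return -1;
    for (i = 0; i < r->count; i++) {
        if (strcmp(r->names[i], name) == 0) {
            memmove(&r->names[i], &r->names[i + 1],
                    (r->count - i - 1) * sizeof(r->names[0]));
            r->count--;
            return 0;
        }
    }
    return -1;
}

/* The default is whatever sits at the head of the list. */
const char *ca_default(const struct ca_registry *r)
{
    return r->names[0];
}
```

In this model, registering "bic" after initialisation makes it the default, and unregistering it hands the default back to "reno", which matches the fallback behaviour the text describes.
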
68 | How the new TCP output machine [nyi] works. | ||
69 | =========================================== | ||
3 | 70 | ||
4 | Data is kept on a single queue. The skb->users flag tells us if the frame is | 71 | Data is kept on a single queue. The skb->users flag tells us if the frame is |
5 | one that has been queued already. To add a frame we throw it on the end. Ack | 72 | one that has been queued already. To add a frame we throw it on the end. Ack |