aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorStephen Hemminger <shemminger@osdl.org>2005-06-23 15:22:36 -0400
committerDavid S. Miller <davem@davemloft.net>2005-06-23 15:22:36 -0400
commit9d7bcfc6b8586ee5a52043f061e0411965e71b88 (patch)
treeef6aa8e6fd9dc0c7187b9cd1497d13e180ae36a8 /Documentation
parent056ede6cface66b400cd3b8e60ed077cc5b85c18 (diff)
[TCP]: Update sysctl and congestion control documentation.
Update the documentation to remove the old sysctl values and include the new congestion control infrastructure. Includes changes to tcp.txt by Ian McDonald. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/networking/ip-sysctl.txt56
-rw-r--r--Documentation/networking/tcp.txt69
2 files changed, 73 insertions, 52 deletions
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index a2c893a7475d..ab65714d95fc 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -304,57 +304,6 @@ tcp_low_latency - BOOLEAN
304 changed would be a Beowulf compute cluster. 304 changed would be a Beowulf compute cluster.
305 Default: 0 305 Default: 0
306 306
307tcp_westwood - BOOLEAN
308 Enable TCP Westwood+ congestion control algorithm.
309 TCP Westwood+ is a sender-side only modification of the TCP Reno
310 protocol stack that optimizes the performance of TCP congestion
311 control. It is based on end-to-end bandwidth estimation to set
312 congestion window and slow start threshold after a congestion
313 episode. Using this estimation, TCP Westwood+ adaptively sets a
314 slow start threshold and a congestion window which takes into
315 account the bandwidth used at the time congestion is experienced.
316 TCP Westwood+ significantly increases fairness wrt TCP Reno in
317 wired networks and throughput over wireless links.
318 Default: 0
319
320tcp_vegas_cong_avoid - BOOLEAN
321 Enable TCP Vegas congestion avoidance algorithm.
322 TCP Vegas is a sender-side only change to TCP that anticipates
323 the onset of congestion by estimating the bandwidth. TCP Vegas
324 adjusts the sending rate by modifying the congestion
325 window. TCP Vegas should provide less packet loss, but it is
326 not as aggressive as TCP Reno.
327 Default:0
328
329tcp_bic - BOOLEAN
330 Enable BIC TCP congestion control algorithm.
331 BIC-TCP is a sender-side only change that ensures a linear RTT
332 fairness under large windows while offering both scalability and
333 bounded TCP-friendliness. The protocol combines two schemes
334 called additive increase and binary search increase. When the
335 congestion window is large, additive increase with a large
336 increment ensures linear RTT fairness as well as good
337 scalability. Under small congestion windows, binary search
338 increase provides TCP friendliness.
339 Default: 0
340
341tcp_bic_low_window - INTEGER
342 Sets the threshold window (in packets) where BIC TCP starts to
343 adjust the congestion window. Below this threshold BIC TCP behaves
344 the same as the default TCP Reno.
345 Default: 14
346
347tcp_bic_fast_convergence - BOOLEAN
348 Forces BIC TCP to more quickly respond to changes in congestion
349 window. Allows two flows sharing the same connection to converge
350 more rapidly.
351 Default: 1
352
353tcp_default_win_scale - INTEGER
354 Sets the minimum window scale TCP will negotiate for on all
355 conections.
356 Default: 7
357
358tcp_tso_win_divisor - INTEGER 307tcp_tso_win_divisor - INTEGER
359 This allows control over what percentage of the congestion window 308 This allows control over what percentage of the congestion window
360 can be consumed by a single TSO frame. 309 can be consumed by a single TSO frame.
@@ -368,6 +317,11 @@ tcp_frto - BOOLEAN
368 where packet loss is typically due to random radio interference 317 where packet loss is typically due to random radio interference
369 rather than intermediate router congestion. 318 rather than intermediate router congestion.
370 319
320tcp_congestion_control - STRING
321 Set the congestion control algorithm to be used for new
322 connections. The algorithm "reno" is always available, but
323 additional choices may be available based on kernel configuration.
324
371somaxconn - INTEGER 325somaxconn - INTEGER
372 Limit of socket listen() backlog, known in userspace as SOMAXCONN. 326 Limit of socket listen() backlog, known in userspace as SOMAXCONN.
373 Defaults to 128. See also tcp_max_syn_backlog for additional tuning 327 Defaults to 128. See also tcp_max_syn_backlog for additional tuning
diff --git a/Documentation/networking/tcp.txt b/Documentation/networking/tcp.txt
index 71749007091e..0fa300425575 100644
--- a/Documentation/networking/tcp.txt
+++ b/Documentation/networking/tcp.txt
@@ -1,5 +1,72 @@
1How the new TCP output machine [nyi] works. 1TCP protocol
2============
3
4Last updated: 21 June 2005
5
6Contents
7========
8
9- Congestion control
10- How the new TCP output machine [nyi] works
11
12Congestion control
13==================
14
15The following variables are used in the tcp_sock for congestion control:
16snd_cwnd The size of the congestion window
17snd_ssthresh Slow start threshold. We are in slow start if
18 snd_cwnd is less than this.
19snd_cwnd_cnt A counter used to slow down the rate of increase
20 once we exceed slow start threshold.
21snd_cwnd_clamp This is the maximum size that snd_cwnd can grow to.
22snd_cwnd_stamp Timestamp for when congestion window last validated.
23snd_cwnd_used Used as a highwater mark for how much of the
24 congestion window is in use. It is used to adjust
25 snd_cwnd down when the link is limited by the
26 application rather than the network.
27
28As of 2.6.13, Linux supports pluggable congestion control algorithms.
29A congestion control mechanism can be registered through functions in
30tcp_cong.c. The functions used by the congestion control mechanism are
31registered via passing a tcp_congestion_ops struct to
32tcp_register_congestion_control. As a minimum name, ssthresh,
33cong_avoid, min_cwnd must be valid.
2 34
35Private data for a congestion control mechanism is stored in tp->ca_priv.
36tcp_ca(tp) returns a pointer to this space. This is preallocated space - it
37is important to check the size of your private data will fit this space, or
38alternatively space could be allocated elsewhere and a pointer to it could
39be stored here.
40
41There are three kinds of congestion control algorithms currently: The
42simplest ones are derived from TCP reno (highspeed, scalable) and just
43provide an alternative the congestion window calculation. More complex
44ones like BIC try to look at other events to provide better
45heuristics. There are also round trip time based algorithms like
46Vegas and Westwood+.
47
48Good TCP congestion control is a complex problem because the algorithm
49needs to maintain fairness and performance. Please review current
50research and RFC's before developing new modules.
51
52The method that is used to determine which congestion control mechanism is
53determined by the setting of the sysctl net.ipv4.tcp_congestion_control.
54The default congestion control will be the last one registered (LIFO);
55so if you built everything as modules. the default will be reno. If you
56build with the default's from Kconfig, then BIC will be builtin (not a module)
57and it will end up the default.
58
59If you really want a particular default value then you will need
60to set it with the sysctl. If you use a sysctl, the module will be autoloaded
61if needed and you will get the expected protocol. If you ask for an
62unknown congestion method, then the sysctl attempt will fail.
63
64If you remove a tcp congestion control module, then you will get the next
65available one. Since reno can not be built as a module, and can not be
66deleted, it will always be available.
67
68How the new TCP output machine [nyi] works.
69===========================================
3 70
4Data is kept on a single queue. The skb->users flag tells us if the frame is 71Data is kept on a single queue. The skb->users flag tells us if the frame is
5one that has been queued already. To add a frame we throw it on the end. Ack 72one that has been queued already. To add a frame we throw it on the end. Ack