diff options
Diffstat (limited to 'Documentation/networking/tcp.txt')
-rw-r--r-- | Documentation/networking/tcp.txt | 69 |
1 files changed, 68 insertions, 1 deletions
diff --git a/Documentation/networking/tcp.txt b/Documentation/networking/tcp.txt index 71749007091e..0fa300425575 100644 --- a/Documentation/networking/tcp.txt +++ b/Documentation/networking/tcp.txt | |||
@@ -1,5 +1,72 @@ | |||
1 | How the new TCP output machine [nyi] works. | 1 | TCP protocol |
2 | ============ | ||
3 | |||
4 | Last updated: 21 June 2005 | ||
5 | |||
6 | Contents | ||
7 | ======== | ||
8 | |||
9 | - Congestion control | ||
10 | - How the new TCP output machine [nyi] works | ||
11 | |||
12 | Congestion control | ||
13 | ================== | ||
14 | |||
15 | The following variables are used in the tcp_sock for congestion control: | ||
16 | snd_cwnd The size of the congestion window | ||
17 | snd_ssthresh Slow start threshold. We are in slow start if | ||
18 | snd_cwnd is less than this. | ||
19 | snd_cwnd_cnt A counter used to slow down the rate of increase | ||
20 | once we exceed slow start threshold. | ||
21 | snd_cwnd_clamp This is the maximum size that snd_cwnd can grow to. | ||
22 | snd_cwnd_stamp Timestamp for when congestion window last validated. | ||
23 | snd_cwnd_used Used as a highwater mark for how much of the | ||
24 | congestion window is in use. It is used to adjust | ||
25 | snd_cwnd down when the link is limited by the | ||
26 | application rather than the network. | ||
27 | |||
28 | As of 2.6.13, Linux supports pluggable congestion control algorithms. | ||
29 | A congestion control mechanism can be registered through functions in | ||
30 | tcp_cong.c. The functions used by the congestion control mechanism are | ||
31 | registered via passing a tcp_congestion_ops struct to | ||
32 | tcp_register_congestion_control. As a minimum name, ssthresh, | ||
33 | cong_avoid, min_cwnd must be valid. | ||
2 | 34 | ||
35 | Private data for a congestion control mechanism is stored in tp->ca_priv. | ||
36 | tcp_ca(tp) returns a pointer to this space. This is preallocated space - it | ||
37 | is important to check the size of your private data will fit this space, or | ||
38 | alternatively space could be allocated elsewhere and a pointer to it could | ||
39 | be stored here. | ||
40 | |||
41 | There are three kinds of congestion control algorithms currently: The | ||
42 | simplest ones are derived from TCP reno (highspeed, scalable) and just | ||
43 | provide an alternative the congestion window calculation. More complex | ||
44 | ones like BIC try to look at other events to provide better | ||
45 | heuristics. There are also round trip time based algorithms like | ||
46 | Vegas and Westwood+. | ||
47 | |||
48 | Good TCP congestion control is a complex problem because the algorithm | ||
49 | needs to maintain fairness and performance. Please review current | ||
50 | research and RFC's before developing new modules. | ||
51 | |||
52 | The method that is used to determine which congestion control mechanism is | ||
53 | determined by the setting of the sysctl net.ipv4.tcp_congestion_control. | ||
54 | The default congestion control will be the last one registered (LIFO); | ||
55 | so if you built everything as modules. the default will be reno. If you | ||
56 | build with the default's from Kconfig, then BIC will be builtin (not a module) | ||
57 | and it will end up the default. | ||
58 | |||
59 | If you really want a particular default value then you will need | ||
60 | to set it with the sysctl. If you use a sysctl, the module will be autoloaded | ||
61 | if needed and you will get the expected protocol. If you ask for an | ||
62 | unknown congestion method, then the sysctl attempt will fail. | ||
63 | |||
64 | If you remove a tcp congestion control module, then you will get the next | ||
65 | available one. Since reno can not be built as a module, and can not be | ||
66 | deleted, it will always be available. | ||
67 | |||
68 | How the new TCP output machine [nyi] works. | ||
69 | =========================================== | ||
3 | 70 | ||
4 | Data is kept on a single queue. The skb->users flag tells us if the frame is | 71 | Data is kept on a single queue. The skb->users flag tells us if the frame is |
5 | one that has been queued already. To add a frame we throw it on the end. Ack | 72 | one that has been queued already. To add a frame we throw it on the end. Ack |