path: root/Documentation/networking/ip-sysctl.txt
author Nandita Dukkipati <nanditad@google.com> 2013-03-11 06:00:43 -0400
committer David S. Miller <davem@davemloft.net> 2013-03-12 08:30:34 -0400
commit 6ba8a3b19e764b6a65e4030ab0999be50c291e6c
tree 57ba4b6411762d1124a3e08577e32e86769c024f /Documentation/networking/ip-sysctl.txt
parent 83e519b63480e691d43ee106547b10941bfa0232
tcp: Tail loss probe (TLP)
This patch series implements the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.

TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occurring due to tail
losses (losses at end of transactions) into fast recovery. TLP transmits
one packet in two round-trips when a connection is in Open state and isn't
receiving any ACKs. The transmitted packet, aka loss probe, can be either
new or a retransmission. When there is tail loss, the ACK from a loss probe
triggers FACK/early-retransmit based fast recovery, thus avoiding a costly
RTO. In the absence of loss, there is no change in the connection state.

PTO stands for probe timeout. It is a timer event indicating that an ACK
is overdue and triggers a loss probe packet. The PTO value is set to
max(2*SRTT, 10ms) and is adjusted to account for the delayed ACK timer
when there is only one outstanding packet.

TLP Algorithm

On transmission of new data in Open state:
  -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
  -> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
  -> PTO = min(PTO, RTO)

Conditions for scheduling PTO:
  -> Connection is in Open state.
  -> Connection is either cwnd limited or no new data to send.
  -> Number of probes per tail loss episode is limited to one.
  -> Connection is SACK enabled.

When PTO fires:
  new_segment_exists:
    -> transmit new segment.
    -> packets_out++. cwnd remains same.

  no_new_packet:
    -> retransmit the last segment.
       Its ACK triggers FACK or early retransmit based recovery.

ACK path:
  -> rearm RTO at start of ACK processing.
  -> reschedule PTO if need be.

In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans.
tcp_early_retrans==0; disables TLP and ER.
                 ==1; enables RFC5827 ER.
                 ==2; delayed ER.
                 ==3; TLP and delayed ER. [DEFAULT]
                 ==4; TLP only.

The TLP patch series has been extensively tested on Google Web servers.
It is most effective for short Web transactions, where it reduced RTOs by
15% and improved HTTP response time (average by 6%, 99th percentile by
10%). The transmitted probes account for <0.5% of the overall
transmissions.

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
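
The PTO scheduling rules in the commit message can be sketched in Python as follows. This is only an illustration of the arithmetic, not the kernel's C implementation. The function names `probe_timeout_ms` and `may_schedule_pto` and all parameter names are hypothetical, and the sketch assumes that "RTT" in the `packets_out == 1` rule refers to the same smoothed RTT used in the other rule.

```python
def probe_timeout_ms(srtt_ms, rto_ms, packets_out):
    """Return a probe timeout (PTO) in milliseconds.

    Hypothetical helper mirroring the rules in the commit message:
      packets_out > 1  -> max(2*SRTT, 10ms)
      packets_out == 1 -> max(2*RTT, 1.5*RTT + 200ms)
      PTO = min(PTO, RTO)
    Assumption: the same smoothed RTT value is used in both rules.
    """
    if packets_out < 1:
        raise ValueError("a PTO is only scheduled when data is outstanding")
    if packets_out == 1:
        # A single outstanding packet may be held back by the receiver's
        # delayed ACK timer, so the timeout is lengthened.
        pto = max(2 * srtt_ms, 1.5 * srtt_ms + 200)
    else:
        pto = max(2 * srtt_ms, 10)
    # The probe must never fire later than the regular RTO.
    return min(pto, rto_ms)


def may_schedule_pto(in_open_state, cwnd_limited, has_new_data,
                     probes_sent_this_episode, sack_enabled):
    """Check the four scheduling conditions listed in the commit message."""
    return (in_open_state
            and (cwnd_limited or not has_new_data)
            and probes_sent_this_episode < 1
            and sack_enabled)
```

For example, with an SRTT of 50 ms and an RTO of 1000 ms, three outstanding packets give a PTO of 100 ms, while a single outstanding packet gives 275 ms because of the delayed ACK allowance.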
Diffstat (limited to 'Documentation/networking/ip-sysctl.txt')
-rw-r--r-- Documentation/networking/ip-sysctl.txt 8
1 files changed, 6 insertions, 2 deletions
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index dc2dc87d2557..1cae6c383e1b 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -190,7 +190,9 @@ tcp_early_retrans - INTEGER
 	Enable Early Retransmit (ER), per RFC 5827. ER lowers the threshold
 	for triggering fast retransmit when the amount of outstanding data is
 	small and when no previously unsent data can be transmitted (such
-	that limited transmit could be used).
+	that limited transmit could be used). Also controls the use of
+	Tail loss probe (TLP) that converts RTOs occuring due to tail
+	losses into fast recovery (draft-dukkipati-tcpm-tcp-loss-probe-01).
 	Possible values:
 		0 disables ER
 		1 enables ER
@@ -198,7 +200,9 @@ tcp_early_retrans - INTEGER
 		  by a fourth of RTT. This mitigates connection falsely
 		  recovers when network has a small degree of reordering
 		  (less than 3 packets).
-	Default: 2
+		3 enables delayed ER and TLP.
+		4 enables TLP only.
+	Default: 3
 
 tcp_ecn - INTEGER
 	Control use of Explicit Congestion Notification (ECN) by TCP.
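
The five tcp_early_retrans values documented by this patch can be summarized as a small lookup table. The Python sketch below is illustrative only: the names `EarlyRetransMode`, `MODES`, and `decode_tcp_early_retrans` are hypothetical, and the mapping is taken from the value list in the commit message rather than from kernel source.

```python
from typing import NamedTuple


class EarlyRetransMode(NamedTuple):
    er: str    # "off", "rfc5827", or "delayed" (labels chosen for this sketch)
    tlp: bool  # True when Tail loss probe is enabled


# Mapping as described in the commit message; 3 is the new default.
MODES = {
    0: EarlyRetransMode("off", False),      # disables TLP and ER
    1: EarlyRetransMode("rfc5827", False),  # enables RFC 5827 ER
    2: EarlyRetransMode("delayed", False),  # delayed ER
    3: EarlyRetransMode("delayed", True),   # TLP and delayed ER
    4: EarlyRetransMode("off", True),       # TLP only
}


def decode_tcp_early_retrans(value):
    """Translate a tcp_early_retrans setting into (ER mode, TLP flag)."""
    try:
        return MODES[int(value)]
    except KeyError:
        raise ValueError(f"unsupported tcp_early_retrans value: {value!r}")
```

On a system that exposes this sysctl, the current setting could be passed in after reading it as text, for example from `sysctl net.ipv4.tcp_early_retrans`; the string form is accepted because the value is converted with `int()`.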