author Yuchung Cheng <ycheng@google.com> 2013-08-12 19:41:25 -0400
committer David S. Miller <davem@davemloft.net> 2013-08-13 19:08:33 -0400
commit 74c181d528bd8b5989f424a489262d0742ca31ae
tree db72e02385e949e1b3a9c305e661db65a27f639b
parent 98f1b7f3820a50a42e51f9bd3e7014cf9b2688a8
tcp: reset reordering est. selectively on timeout
On timeout the TCP sender unconditionally resets the estimated degree of network reordering (tp->reordering). The idea behind this is that the estimate is too large to trigger fast recovery (e.g., due to an IP path change).

But if, for example, the sender only had 2 packets outstanding, then a timeout doesn't tell much about reordering. A sender that learns about reordering on big writes and loses packets on small writes will end up falsely retransmitting again and again, especially when reordering is more likely on big writes.

Therefore the sender should only suspect that tp->reordering is too high if it could have gone into fast recovery with the (lower) default estimate.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-rw-r--r-- net/ipv4/tcp_input.c 9
1 files changed, 7 insertions, 2 deletions
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b61274b666f6..e965cc7b87ff 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1877,8 +1877,13 @@ void tcp_enter_loss(struct sock *sk, int how)
 	}
 	tcp_verify_left_out(tp);
 
-	tp->reordering = min_t(unsigned int, tp->reordering,
-			       sysctl_tcp_reordering);
+	/* Timeout in disordered state after receiving substantial DUPACKs
+	 * suggests that the degree of reordering is over-estimated.
+	 */
+	if (icsk->icsk_ca_state <= TCP_CA_Disorder &&
+	    tp->sacked_out >= sysctl_tcp_reordering)
+		tp->reordering = min_t(unsigned int, tp->reordering,
+				       sysctl_tcp_reordering);
 	tcp_set_ca_state(sk, TCP_CA_Loss);
 	tp->high_seq = tp->snd_nxt;
 	TCP_ECN_queue_cwr(tp);