author | Eric Dumazet <edumazet@google.com> | 2014-04-30 14:58:13 -0400 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2014-05-02 17:54:35 -0400 |
commit | e114a710aa5058c0ba4aa1dfb105132aefeb5e04 (patch) | |
tree | 3d7c656358bbc5cd37f7c2a973923e6be6ced1d9 /net/ipv4/tcp_cong.c | |
parent | 4e8bbb819d1594a01f91b1de83321f68d3e6e245 (diff) |
tcp: fix cwnd limited checking to improve congestion control
Yuchung discovered tcp_is_cwnd_limited() was returning false in the
slow start phase even if the application filled the socket write queue.
All congestion modules take into account tcp_is_cwnd_limited()
before increasing cwnd, so this behavior prevents slow start from
probing the bandwidth at full speed.
The problem is that even if the write queue is full (i.e. we are _not_
application limited), cwnd can be underutilized when TSO auto-defers
or TCP Small Queues decides to hold packets.
in_flight can then stay at a smaller value, to the point where
tcp_is_cwnd_limited() returns false.
With TCP Small Queues and FQ/pacing, this issue is more visible.
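The failure mode described above can be reproduced with a small user-space model of the check this patch removes. This is only a sketch: `struct old_model_tp`, `old_is_cwnd_limited()`, and the two constants are hypothetical stand-ins for the kernel's `tcp_sock` fields, `sk_can_gso()`, `sysctl_tcp_tso_win_divisor`, and `tcp_max_tso_deferred_mss()`, and the constant values are assumed defaults rather than values taken from this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the tcp_sock state the removed check read. */
struct old_model_tp {
	uint32_t snd_cwnd;            /* congestion window, in segments */
	uint32_t xmit_size_goal_segs; /* TSO size goal, in segments */
	bool can_gso;                 /* stands in for sk_can_gso(sk) */
};

/* Assumed values; the kernel equivalents are a sysctl and a helper. */
enum { MODEL_TSO_WIN_DIVISOR = 3, MODEL_MAX_TSO_DEFERRED_MSS = 3 };

/* Same logic as the removed tcp_is_cwnd_limited(), on the model struct. */
static bool old_is_cwnd_limited(const struct old_model_tp *tp,
				uint32_t in_flight)
{
	uint32_t left;

	assert(tp != NULL);

	if (in_flight >= tp->snd_cwnd)
		return true;

	left = tp->snd_cwnd - in_flight;
	if (tp->can_gso &&
	    left * MODEL_TSO_WIN_DIVISOR < tp->snd_cwnd &&
	    left < tp->xmit_size_goal_segs)
		return true;
	return left <= MODEL_MAX_TSO_DEFERRED_MSS;
}
```

With `snd_cwnd = 40`, `xmit_size_goal_segs = 16`, and GSO enabled, calling `old_is_cwnd_limited(&tp, 20)` returns false: if TSQ or TSO deferral holds in_flight at 20 segments, the sender is treated as not cwnd-limited even though its write queue may be full.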
We fix this by having tcp_cwnd_validate(), which is supposed to track
such things, take into account unsent_segs, the number of segs that we
are not sending at the moment due to TSO or TSQ, but intend to send
real soon. Then when we are cwnd-limited, remember this fact while we
are processing the window of ACKs that comes back.
For example, suppose we have a brand new connection with cwnd=10; we
are in slow start, and we send a flight of 9 packets. By the time we
have received ACKs for all 9 packets we want our cwnd to be 18.
We implement this by setting tp->lsnd_pending to 9, and
considering ourselves to be cwnd-limited while cwnd is less than
twice tp->lsnd_pending (2*9 -> 18).
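The arithmetic in this example can be checked with a minimal user-space sketch. It is not the kernel implementation: `struct new_model_tp`, `model_is_cwnd_limited()`, and `model_slow_start()` are hypothetical names, and the one-segment-per-ACKed-packet growth is the usual slow start rule, assumed here for illustration. Only `lsnd_pending` and the "twice" rule come from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two fields the slow start rule needs. */
struct new_model_tp {
	uint32_t snd_cwnd;     /* congestion window, in segments */
	uint32_t lsnd_pending; /* segments in the last flight sent */
};

/* In slow start: cwnd-limited while cwnd < 2 * lsnd_pending. */
static bool model_is_cwnd_limited(const struct new_model_tp *tp)
{
	assert(tp != NULL);
	return tp->snd_cwnd < 2 * tp->lsnd_pending;
}

/*
 * Process `acked` ACKed packets one at a time, growing cwnd by one
 * segment per ACK only while the sender counts as cwnd-limited.
 * Returns the resulting cwnd.
 */
static uint32_t model_slow_start(struct new_model_tp *tp, uint32_t acked)
{
	uint32_t i;

	for (i = 0; i < acked; i++) {
		if (model_is_cwnd_limited(tp))
			tp->snd_cwnd++;
	}
	return tp->snd_cwnd;
}
```

Starting from `snd_cwnd = 10` and `lsnd_pending = 9`, `model_slow_start(&tp, 9)` grows cwnd on the first eight ACKs and stops at 18, since 18 is no longer less than 2*9.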
This makes tcp_is_cwnd_limited() more understandable by removing
the GSO/TSO kludge that tried to work around the issue.
Note the in_flight parameter can be removed in a followup cleanup
patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv4/tcp_cong.c')
-rw-r--r-- | net/ipv4/tcp_cong.c | 20 |
1 files changed, 0 insertions, 20 deletions
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index 2b9464c93b88..a93b41ba05ff 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -276,26 +276,6 @@ int tcp_set_congestion_control(struct sock *sk, const char *name)
 	return err;
 }
 
-/* RFC2861 Check whether we are limited by application or congestion window
- * This is the inverse of cwnd check in tcp_tso_should_defer
- */
-bool tcp_is_cwnd_limited(const struct sock *sk, u32 in_flight)
-{
-	const struct tcp_sock *tp = tcp_sk(sk);
-	u32 left;
-
-	if (in_flight >= tp->snd_cwnd)
-		return true;
-
-	left = tp->snd_cwnd - in_flight;
-	if (sk_can_gso(sk) &&
-	    left * sysctl_tcp_tso_win_divisor < tp->snd_cwnd &&
-	    left < tp->xmit_size_goal_segs)
-		return true;
-	return left <= tcp_max_tso_deferred_mss(tp);
-}
-EXPORT_SYMBOL_GPL(tcp_is_cwnd_limited);
-
 /* Slow start is used when congestion window is no greater than the slow start
  * threshold. We base on RFC2581 and also handle stretch ACKs properly.
  * We do not implement RFC3465 Appropriate Byte Counting (ABC) per se but