author     Eric Dumazet <edumazet@google.com>    2014-04-30 14:58:13 -0400
committer  David S. Miller <davem@davemloft.net>    2014-05-02 17:54:35 -0400
commit     e114a710aa5058c0ba4aa1dfb105132aefeb5e04
tree       3d7c656358bbc5cd37f7c2a973923e6be6ced1d9 /include/net/tcp.h
parent     4e8bbb819d1594a01f91b1de83321f68d3e6e245
tcp: fix cwnd limited checking to improve congestion control
Yuchung discovered that tcp_is_cwnd_limited() was returning false in
the slow start phase even when the application had filled the socket
write queue.
All congestion control modules consult tcp_is_cwnd_limited() before
increasing cwnd, so this behavior keeps slow start from probing the
bandwidth at full speed.
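(For reference, the gating pattern referred to above looks roughly like the
sketch below. The hook name example_cong_avoid is made up for illustration;
tcp_slow_start() and tcp_cong_avoid_ai() are the usual Reno-style helpers,
but this is not a verbatim copy of any in-tree module.)

static void example_cong_avoid(struct sock *sk, u32 ack, u32 acked, u32 in_flight)
{
	struct tcp_sock *tp = tcp_sk(sk);

	/* cwnd was not the limiting factor, so do not grow it. */
	if (!tcp_is_cwnd_limited(sk, in_flight))
		return;

	if (tp->snd_cwnd <= tp->snd_ssthresh)
		tcp_slow_start(tp, acked);		/* exponential growth */
	else
		tcp_cong_avoid_ai(tp, tp->snd_cwnd);	/* additive increase */
}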
The problem is that even if the write queue is full (i.e. we are _not_
application limited), cwnd can be under-utilized if TSO auto-defers or
TCP Small Queues decides to hold packets back.
In that case in_flight stays at a smaller value, and we can reach the
point where tcp_is_cwnd_limited() returns false.
With TCP Small Queues and FQ/pacing, this issue is more visible.
We fix this by having tcp_cwnd_validate(), which is supposed to track
such things, take into account unsent_segs, the number of segs that we
are not sending at the moment due to TSO or TSQ, but intend to send
real soon. Then when we are cwnd-limited, remember this fact while we
are processing the window of ACKs that comes back.
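(The tcp_output.c side of the patch is outside the diffstat shown below, so
the following is only a hedged sketch of the bookkeeping described in this
paragraph; the exact accounting inside tcp_cwnd_validate() is an assumption,
not the real hunk.)

static void tcp_cwnd_validate(struct sock *sk, u32 unsent_segs)
{
	struct tcp_sock *tp = tcp_sk(sk);

	/* Assumed bookkeeping: remember the size of the last flight --
	 * segments actually in flight plus those held back by TSO/TSQ --
	 * so tcp_is_cwnd_limited() can compare snd_cwnd against twice it.
	 */
	tp->lsnd_pending = tp->packets_out + unsent_segs;

	/* ... existing RFC2861-style cwnd validation continues here ... */
}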
For example, suppose we have a brand new connection with cwnd=10; we
are in slow start, and we send a flight of 9 packets. By the time we
have received ACKs for all 9 packets we want our cwnd to be 18.
We implement this by setting tp->lsnd_pending to 9, and
considering ourselves to be cwnd-limited while cwnd is less than
twice tp->lsnd_pending (2*9 -> 18).
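(To make the arithmetic concrete, the snippet below is a standalone
user-space C program, illustrative only and not kernel code, that evaluates
the new check for this exact example.)

#include <stdio.h>

int main(void)
{
	unsigned int lsnd_pending = 9;	/* last flight was 9 packets */
	unsigned int cwnd;

	/* The flow counts as cwnd-limited while cwnd < 2 * lsnd_pending,
	 * so cwnd may keep growing until it reaches 18.
	 */
	for (cwnd = 10; cwnd <= 18; cwnd++)
		printf("cwnd=%u cwnd_limited=%d\n",
		       cwnd, cwnd < 2 * lsnd_pending);
	return 0;
}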
This makes tcp_is_cwnd_limited() more understandable by removing the
GSO/TSO kludge that tried to work around the issue.
Note the in_flight parameter can be removed in a followup cleanup
patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'include/net/tcp.h')
-rw-r--r--    include/net/tcp.h | 22
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 163d2b467d78..a9fe7bc4f4bb 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -974,7 +974,27 @@ static inline u32 tcp_wnd_end(const struct tcp_sock *tp)
 {
 	return tp->snd_una + tp->snd_wnd;
 }
-bool tcp_is_cwnd_limited(const struct sock *sk, u32 in_flight);
+
+/* We follow the spirit of RFC2861 to validate cwnd but implement a more
+ * flexible approach. The RFC suggests cwnd should not be raised unless
+ * it was fully used previously. But we allow cwnd to grow as long as the
+ * application has used half the cwnd.
+ * Example :
+ * cwnd is 10 (IW10), but application sends 9 frames.
+ * We allow cwnd to reach 18 when all frames are ACKed.
+ * This check is safe because it's as aggressive as slow start which already
+ * risks 100% overshoot. The advantage is that we discourage application to
+ * either send more filler packets or data to artificially blow up the cwnd
+ * usage, and allow application-limited process to probe bw more aggressively.
+ *
+ * TODO: remove in_flight once we can fix all callers, and their callers...
+ */
+static inline bool tcp_is_cwnd_limited(const struct sock *sk, u32 in_flight)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+
+	return tp->snd_cwnd < 2 * tp->lsnd_pending;
+}
 
 static inline void tcp_check_probe_timer(struct sock *sk)
 {