author: Guenter Roeck <guenter.roeck@ericsson.com> 2010-03-21 23:55:13 -0400
committer: David S. Miller <davem@davemloft.net> 2010-03-21 23:55:13 -0400
commit: 5e016cbf6cffd4a53b7922e0c91b775399d7fe47
tree: 6ad4b48375958de4f4c47e4ac674ec55da42f7f2 /net/ipv4
parent: e3a61d47cc37c51834abe537e0ed685829d56ee2
ipv4: Don't drop redirected route cache entry unless PMTU actually expired
TCP sessions over IPv4 can get stuck if routers between endpoints
do not fragment packets but implement PMTU instead, and we are using
those routers because of an ICMP redirect.

Setup is as follows:

       MTU1    MTU2   MTU1
    A--------B------C------D

with MTU1 > MTU2. A and D are endpoints, B and C are routers. B and C
implement PMTU and drop packets larger than MTU2 (for example because
DF is set on all packets). TCP sessions are initiated between A and D.
There is packet loss between A and D, causing frequent TCP
retransmits.

After the number of retransmits on a TCP session reaches tcp_retries1,
tcp calls dst_negative_advice() prior to each retransmit. This results
in route cache entries for the peer to be deleted in
ipv4_negative_advice() if the Path MTU is set.

If the outstanding data on an affected TCP session is larger than
MTU2, packets sent from the endpoints will be dropped by B or C, and
ICMP NEEDFRAG will be returned. A and D receive NEEDFRAG messages and
update PMTU.

Before the next retransmit, tcp will again call dst_negative_advice(),
causing the route cache entry (with correct PMTU) to be deleted. The
retransmitted packet will be larger than MTU2, causing it to be
dropped again.

This sequence repeats until the TCP session aborts or is terminated.

Problem is fixed by removing redirected route cache entries in
ipv4_negative_advice() only if the PMTU is expired.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
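
The retransmit loop described in the commit message can be sketched as a toy userspace model. This is not kernel code: `struct route_model`, `retransmits_until_delivered()`, the MTU values and the PMTU lifetime are all hypothetical, and the model tracks only the `expires` branch of the condition (the `RTCF_REDIRECTED` branch and jiffies wraparound are left out). It only illustrates why deleting an entry whose PMTU has not yet expired makes every retransmit oversized again.

```c
#include <stdbool.h>

/* Hypothetical values for the diagram's MTU1 > MTU2; not from the commit. */
enum { MTU1 = 1500, MTU2 = 1400 };

/* Hypothetical PMTU lifetime, in model ticks (one tick per retransmit). */
enum { PMTU_LIFETIME = 600 };

/* Toy stand-in for a route cache entry. */
struct route_model {
	int pmtu;              /* MTU the sender currently believes in */
	unsigned long expires; /* 0 means no PMTU has been learned */
};

/*
 * Returns the attempt number on which a retransmit fits through the
 * MTU2 link, or -1 if none of max_retries attempts gets through.
 * drop_only_if_expired == false models the old check (any non-zero
 * expires), true models the patched check (expires reached).
 */
static int retransmits_until_delivered(bool drop_only_if_expired,
				       int max_retries)
{
	struct route_model rt = { MTU1, 0 };
	unsigned long now = 0;

	for (int attempt = 1; attempt <= max_retries; attempt++, now++) {
		bool has_pmtu = rt.expires != 0;
		bool expired = has_pmtu && now >= rt.expires;

		/* Negative advice before the retransmit: maybe drop the entry. */
		if (drop_only_if_expired ? expired : has_pmtu) {
			rt.pmtu = MTU1;   /* fresh entry starts at the link MTU */
			rt.expires = 0;
		}

		/* The segment is sized to the MTU the sender believes in. */
		if (rt.pmtu <= MTU2)
			return attempt;

		/* Router drops the packet; NEEDFRAG teaches us MTU2. */
		rt.pmtu = MTU2;
		rt.expires = now + PMTU_LIFETIME;
	}
	return -1;
}
```

In this model, `retransmits_until_delivered(false, 15)` never gets a packet through, while `retransmits_until_delivered(true, 15)` succeeds on the second attempt, because the PMTU learned from the first NEEDFRAG survives the next negative-advice call.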
Diffstat (limited to 'net/ipv4')
-rw-r--r-- net/ipv4/route.c | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 32d396196df8..54fd68c14c87 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1510,7 +1510,8 @@ static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst)
 			ip_rt_put(rt);
 			ret = NULL;
 		} else if ((rt->rt_flags & RTCF_REDIRECTED) ||
-			   rt->u.dst.expires) {
+			   (rt->u.dst.expires &&
+			    time_after_eq(jiffies, rt->u.dst.expires))) {
 			unsigned hash = rt_hash(rt->fl.fl4_dst, rt->fl.fl4_src,
 						rt->fl.oif,
 						rt_genid(dev_net(dst->dev)));
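
For readers who want to experiment with the changed condition outside the kernel, the old and new predicates can be sketched in userspace C as below. The names `model_time_after_eq()`, `old_should_drop()`, `new_should_drop()` and the flag value `MODEL_REDIRECTED` are hypothetical stand-ins, not kernel symbols. The time comparison mirrors the signed-difference idiom behind the kernel's `time_after_eq()`, and assumes the usual two's-complement conversion from `unsigned long` to `long`.

```c
#include <stdbool.h>
#include <limits.h>

/* Placeholder bit standing in for RTCF_REDIRECTED; value is arbitrary. */
#define MODEL_REDIRECTED 0x1u

/*
 * Userspace stand-in for time_after_eq(a, b): true when a is at or
 * after b, even if the tick counter has wrapped in between.
 */
static bool model_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Condition before the patch: any non-zero expires drops the entry. */
static bool old_should_drop(unsigned int flags, unsigned long expires,
			    unsigned long now)
{
	(void)now; /* the old check never looked at the current time */
	return (flags & MODEL_REDIRECTED) || expires;
}

/* Condition after the patch: expires must be set and already reached. */
static bool new_should_drop(unsigned int flags, unsigned long expires,
			    unsigned long now)
{
	return (flags & MODEL_REDIRECTED) ||
	       (expires && model_time_after_eq(now, expires));
}
```

With `expires` set in the future, `old_should_drop()` returns true while `new_should_drop()` returns false. Both return true once `now` reaches `expires`, and both still return true whenever the redirected bit is set, since that half of the condition is unchanged by the diff.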