aboutsummaryrefslogtreecommitdiffstats
path: root/net/ipv4/tcp_ipv4.c
diff options
context:
space:
mode:
authorEric Dumazet <edumazet@google.com>2015-12-14 17:08:53 -0500
committerDavid S. Miller <davem@davemloft.net>2015-12-14 23:52:00 -0500
commit5037e9ef9454917b047f9f3a19b4dd179fbf7cd4 (patch)
treedc12a109e95664b9b2d3f0dbbf97f195c57595f0 /net/ipv4/tcp_ipv4.c
parent2274d3753f6c5a885be4cfdf8b39ae2045ba6e30 (diff)
net: fix IP early demux races
David Wilder reported crashes caused by dst reuse. <quote David> I am seeing a crash on a distro V4.2.3 kernel caused by a double release of a dst_entry. In ipv4_dst_destroy() the call to list_empty() finds a poisoned next pointer, indicating the dst_entry has already been removed from the list and freed. The crash occurs 18 to 24 hours into a run of a network stress exerciser. </quote> Thanks to his detailed report and analysis, we were able to understand the core issue. IP early demux can associate a dst to skb, after a lookup in TCP/UDP sockets. When socket cache is not properly set, we want to store into sk->sk_dst_cache the dst for future IP early demux lookups, by acquiring a stable refcount on the dst. Problem is this acquisition is simply using an atomic_inc(), which works well, unless the dst was queued for destruction from dst_release() noticing dst refcount went to zero, if DST_NOCACHE was set on dst. We need to make sure current refcount is not zero before incrementing it, or risk double free as David reported. This patch, being a stable candidate, adds two new helpers, and use them only from IP early demux problematic paths. It might be possible to merge in net-next skb_dst_force() and skb_dst_force_safe(), but I prefer having the smallest patch for stable kernels : Maybe some skb_dst_force() callers do not expect skb->dst can suddenly be cleared. Can probably be backported back to linux-3.6 kernels Reported-by: David J. Wilder <dwilder@us.ibm.com> Tested-by: David J. Wilder <dwilder@us.ibm.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv4/tcp_ipv4.c')
-rw-r--r--net/ipv4/tcp_ipv4.c5
1 files changed, 2 insertions, 3 deletions
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index db003438aaf5..d8841a2f1569 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1493,7 +1493,7 @@ bool tcp_prequeue(struct sock *sk, struct sk_buff *skb)
1493 if (likely(sk->sk_rx_dst)) 1493 if (likely(sk->sk_rx_dst))
1494 skb_dst_drop(skb); 1494 skb_dst_drop(skb);
1495 else 1495 else
1496 skb_dst_force(skb); 1496 skb_dst_force_safe(skb);
1497 1497
1498 __skb_queue_tail(&tp->ucopy.prequeue, skb); 1498 __skb_queue_tail(&tp->ucopy.prequeue, skb);
1499 tp->ucopy.memory += skb->truesize; 1499 tp->ucopy.memory += skb->truesize;
@@ -1721,8 +1721,7 @@ void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb)
1721{ 1721{
1722 struct dst_entry *dst = skb_dst(skb); 1722 struct dst_entry *dst = skb_dst(skb);
1723 1723
1724 if (dst) { 1724 if (dst && dst_hold_safe(dst)) {
1725 dst_hold(dst);
1726 sk->sk_rx_dst = dst; 1725 sk->sk_rx_dst = dst;
1727 inet_sk(sk)->rx_dst_ifindex = skb->skb_iif; 1726 inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
1728 } 1727 }