diff options
author | Eric Dumazet <edumazet@google.com> | 2015-12-14 17:08:53 -0500 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2015-12-14 23:52:00 -0500 |
commit | 5037e9ef9454917b047f9f3a19b4dd179fbf7cd4 (patch) | |
tree | dc12a109e95664b9b2d3f0dbbf97f195c57595f0 /net/ipv4/tcp_ipv4.c | |
parent | 2274d3753f6c5a885be4cfdf8b39ae2045ba6e30 (diff) |
net: fix IP early demux races
David Wilder reported crashes caused by dst reuse.
<quote David>
I am seeing a crash on a distro V4.2.3 kernel caused by a double
release of a dst_entry. In ipv4_dst_destroy() the call to
list_empty() finds a poisoned next pointer, indicating the dst_entry
has already been removed from the list and freed. The crash occurs
18 to 24 hours into a run of a network stress exerciser.
</quote>
Thanks to his detailed report and analysis, we were able to understand
the core issue.
IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.
When socket cache is not properly set, we want to store into
sk->sk_dst_cache the dst for future IP early demux lookups,
by acquiring a stable refcount on the dst.
Problem is this acquisition is simply using an atomic_inc(),
which works well, unless the dst was queued for destruction from
dst_release() noticing dst refcount went to zero, if DST_NOCACHE
was set on dst.
We need to make sure current refcount is not zero before incrementing
it, or risk double free as David reported.
This patch, being a stable candidate, adds two new helpers, and use
them only from IP early demux problematic paths.
It might be possible to merge in net-next skb_dst_force() and
skb_dst_force_safe(), but I prefer having the smallest patch for stable
kernels : Maybe some skb_dst_force() callers do not expect skb->dst
can suddenly be cleared.
Can probably be backported back to linux-3.6 kernels
Reported-by: David J. Wilder <dwilder@us.ibm.com>
Tested-by: David J. Wilder <dwilder@us.ibm.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv4/tcp_ipv4.c')
-rw-r--r-- | net/ipv4/tcp_ipv4.c | 5 |
1 files changed, 2 insertions, 3 deletions
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index db003438aaf5..d8841a2f1569 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c | |||
@@ -1493,7 +1493,7 @@ bool tcp_prequeue(struct sock *sk, struct sk_buff *skb) | |||
1493 | if (likely(sk->sk_rx_dst)) | 1493 | if (likely(sk->sk_rx_dst)) |
1494 | skb_dst_drop(skb); | 1494 | skb_dst_drop(skb); |
1495 | else | 1495 | else |
1496 | skb_dst_force(skb); | 1496 | skb_dst_force_safe(skb); |
1497 | 1497 | ||
1498 | __skb_queue_tail(&tp->ucopy.prequeue, skb); | 1498 | __skb_queue_tail(&tp->ucopy.prequeue, skb); |
1499 | tp->ucopy.memory += skb->truesize; | 1499 | tp->ucopy.memory += skb->truesize; |
@@ -1721,8 +1721,7 @@ void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb) | |||
1721 | { | 1721 | { |
1722 | struct dst_entry *dst = skb_dst(skb); | 1722 | struct dst_entry *dst = skb_dst(skb); |
1723 | 1723 | ||
1724 | if (dst) { | 1724 | if (dst && dst_hold_safe(dst)) { |
1725 | dst_hold(dst); | ||
1726 | sk->sk_rx_dst = dst; | 1725 | sk->sk_rx_dst = dst; |
1727 | inet_sk(sk)->rx_dst_ifindex = skb->skb_iif; | 1726 | inet_sk(sk)->rx_dst_ifindex = skb->skb_iif; |
1728 | } | 1727 | } |