diff options
author | Eric Dumazet <eric.dumazet@gmail.com> | 2011-11-09 02:24:35 -0500 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2011-11-09 16:36:27 -0500 |
commit | d826eb14ecef3574b6b3be55e5f4329f4a76fbf3 (patch) | |
tree | e072ee768f065be430543709d48f08a36f4eed2d /net/ipv4/udp.c | |
parent | acb32ba3dee66d58704caeeb8c6ff95f60efdc66 (diff) |
ipv4: PKTINFO doesnt need dst reference
Le lundi 07 novembre 2011 à 15:33 +0100, Eric Dumazet a écrit :
> At least, in recent kernels we dont change dst->refcnt in forwarding
> patch (usinf NOREF skb->dst)
>
> One particular point is the atomic_inc(dst->refcnt) we have to perform
> when queuing an UDP packet if socket asked PKTINFO stuff (for example a
> typical DNS server has to setup this option)
>
> I have one patch somewhere that stores the information in skb->cb[] and
> avoid the atomic_{inc|dec}(dst->refcnt).
>
OK I found it, I did some extra tests and believe its ready.
[PATCH net-next] ipv4: IP_PKTINFO doesnt need dst reference
When a socket uses IP_PKTINFO notifications, we currently force a dst
reference for each received skb. Reader has to access dst to get needed
information (rt_iif & rt_spec_dst) and must release dst reference.
We also forced a dst reference if skb was put in socket backlog, even
without IP_PKTINFO handling. This happens under stress/load.
We can instead store the needed information in skb->cb[], so that only
softirq handler really access dst, improving cache hit ratios.
This removes two atomic operations per packet, and false sharing as
well.
On a benchmark using a mono threaded receiver (doing only recvmsg()
calls), I can reach 720.000 pps instead of 570.000 pps.
IP_PKTINFO is typically used by DNS servers, and any multihomed aware
UDP application.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv4/udp.c')
-rw-r--r-- | net/ipv4/udp.c | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index ab0966df1e2a..6854f581313f 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c | |||
@@ -1357,7 +1357,7 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) | |||
1357 | if (inet_sk(sk)->inet_daddr) | 1357 | if (inet_sk(sk)->inet_daddr) |
1358 | sock_rps_save_rxhash(sk, skb); | 1358 | sock_rps_save_rxhash(sk, skb); |
1359 | 1359 | ||
1360 | rc = ip_queue_rcv_skb(sk, skb); | 1360 | rc = sock_queue_rcv_skb(sk, skb); |
1361 | if (rc < 0) { | 1361 | if (rc < 0) { |
1362 | int is_udplite = IS_UDPLITE(sk); | 1362 | int is_udplite = IS_UDPLITE(sk); |
1363 | 1363 | ||
@@ -1473,6 +1473,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) | |||
1473 | 1473 | ||
1474 | rc = 0; | 1474 | rc = 0; |
1475 | 1475 | ||
1476 | ipv4_pktinfo_prepare(skb); | ||
1476 | bh_lock_sock(sk); | 1477 | bh_lock_sock(sk); |
1477 | if (!sock_owned_by_user(sk)) | 1478 | if (!sock_owned_by_user(sk)) |
1478 | rc = __udp_queue_rcv_skb(sk, skb); | 1479 | rc = __udp_queue_rcv_skb(sk, skb); |