aboutsummaryrefslogtreecommitdiffstats
path: root/include
diff options
context:
space:
mode:
authorWillem de Bruijn <willemb@google.com>2012-10-26 07:52:08 -0400
committerDavid S. Miller <davem@davemloft.net>2012-10-31 13:56:40 -0400
commitecd5cf5dc5b07bc57aedf4c92453d6c343e3d840 (patch)
tree1e1b88386c2f7ceaabf56d362bf3b7789c91f9e1 /include
parent9add4d8174907c9977ad29cf6e71460829ec954e (diff)
net: compute skb->rxhash if nic hash may be 3-tuple
Network device drivers can communicate a Toeplitz hash in skb->rxhash, but devices differ in their hashing capabilities. All compute a 5-tuple hash for TCP over IPv4, but for other connection-oriented protocols, they may compute only a 3-tuple. This breaks RPS load balancing, e.g., for TCP over IPv6 flows. Additionally, for GRE and other tunnels, the kernel computes a 5-tuple hash over the inner packet if possible, but devices do not. This patch recomputes the rxhash in software in all cases where it cannot be certain that a 5-tuple was computed. Device drivers can avoid recomputation by setting the skb->l4_rxhash flag. Recomputing adds cycles to each packet when RPS is enabled or the packet arrives over a tunnel. A comparison of 200x TCP_STREAM between two servers running unmodified netnext with rxhash computation in hardware vs software (using ethtool -K eth0 rxhash [on|off]) shows how much time is spent in __skb_get_rxhash in this worst case: 0.03% swapper [kernel.kallsyms] [k] __skb_get_rxhash 0.03% swapper [kernel.kallsyms] [k] __skb_get_rxhash 0.05% swapper [kernel.kallsyms] [k] __skb_get_rxhash With 200x TCP_RR it increases to 0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash 0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash 0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash I considered having the patch explicitly skips recomputation when it knows that it will not improve the hash (TCP over IPv4), but that conditional complicates code without saving many cycles in practice, because it has to take place after flow dissector. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'include')
-rw-r--r--include/linux/skbuff.h2
1 files changed, 1 insertions, 1 deletions
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6a2c34e6d962..a2a0bdb95a8f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -643,7 +643,7 @@ extern unsigned int skb_find_text(struct sk_buff *skb, unsigned int from,
643extern void __skb_get_rxhash(struct sk_buff *skb); 643extern void __skb_get_rxhash(struct sk_buff *skb);
644static inline __u32 skb_get_rxhash(struct sk_buff *skb) 644static inline __u32 skb_get_rxhash(struct sk_buff *skb)
645{ 645{
646 if (!skb->rxhash) 646 if (!skb->l4_rxhash)
647 __skb_get_rxhash(skb); 647 __skb_get_rxhash(skb);
648 648
649 return skb->rxhash; 649 return skb->rxhash;