aboutsummaryrefslogtreecommitdiffstats
path: root/net/core
Commit message (Collapse)AuthorAge
...
| * | | | | | | flow_dissector: support L2 GREMichael Dalton2013-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for L2 GRE tunnels, so that RPS can be more effective. Signed-off-by: Michael Dalton <mwdalton@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | tunneling: Add generic Tunnel segmentation.Pravin B Shelar2013-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds generic tunneling offloading support for IPv4-UDP based tunnels. GSO type is added to request this offload for a skb. netdev feature NETIF_F_UDP_TUNNEL is added for hardware offloaded udp-tunnel support. Currently no device supports this feature, software offload is used. This can be used by tunneling protocols like VXLAN. CC: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | tunneling: Capture inner mac header during encapsulation.Pravin B Shelar2013-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds inner mac header. This will be used in next patch to find tunner header length. Header len is required to copy tunnel header to each gso segment. This patch does not change any functionality. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: Add skb_headers_offset_update helper function.Pravin B Shelar2013-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function will be used in next VXLAN_GSO patch. This patch does not change any functionality. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | tunnel: Inherit NETIF_F_SG for hw_enc_features.Pravin B Shelar2013-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inherit scatergather feature for tunnel devices to avoid copy for TSO packets of tunneling device like GRE. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: Kill link between CSUM and SG features.Pravin B Shelar2013-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Earlier SG was unset if CSUM was not available for given device to force skb copy to avoid sending inconsistent csum. Commit c9af6db4c11c (net: Fix possible wrong checksum generation) added explicit flag to force copy to fix this issue. Therefore there is no need to link SG and CSUM, following patch kills this link between there two features. This patch is also required following patch in series. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: generic fdb support for drivers without ndo_fdb_<op>Vlad Yasevich2013-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the driver does not support the ndo_op use the generic handler for it. This should work in the majority of cases. Eventually the fdb_dflt_add call gets translated into a __dev_set_rx_mode() call which should handle hardware support for filtering via the IFF_UNICAST_FLT flag. Namely IFF_UNICAST_FLT indicates if the hardware can do unicast address filtering. If no support is available the device is put into promisc mode. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: introduce NAPI_POLL_WEIGHTEric Dumazet2013-03-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some drivers use a too big NAPI poll weight. This patch adds a NAPI_POLL_WEIGHT default value and issues an error message if a driver attempts to use a bigger weight. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net/core: remove duplicate statements by do-while loopAkinobu Mita2013-04-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove duplicate statements by using do-while loop instead of while loop. - A; - while (e) { + do { A; - } + } while (e); Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | net/core: rename random32() to prandom_u32()Akinobu Mita2013-04-29
| |_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use preferable function name which implies using a pseudo-random number generator. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | net: rate-limit warn-bad-offload splats.Ben Greear2013-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If one does do something unfortunate and allow a bad offload bug into the kernel, this the skb_warn_bad_offload can effectively live-lock the system, filling the logs with the same error over and over. Add rate limitation to this so that box remains otherwise functional in this case. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | rtnetlink: Call nlmsg_parse() with correct header lengthMichael Riesch2013-04-08
| |_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Michael Riesch <michael.riesch@omicron.at> Cc: "David S. Miller" <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Benc <jbenc@redhat.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-kernel@vger.kernel.org Acked-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | netfilter: don't reset nf_trace in nf_reset()Patrick McHardy2013-04-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code to reset nf_trace in nf_reset(). This is wrong and unnecessary. nf_reset() is used in the following cases: - when passing packets up the the socket layer, at which point we want to release all netfilter references that might keep modules pinned while the packet is queued. nf_trace doesn't matter anymore at this point. - when encapsulating or decapsulating IPsec packets. We want to continue tracing these packets after IPsec processing. - when passing packets through virtual network devices. Only devices on that encapsulate in IPv4/v6 matter since otherwise nf_trace is not used anymore. Its not entirely clear whether those packets should be traced after that, however we've always done that. - when passing packets through virtual network devices that make the packet cross network namespace boundaries. This is the only cases where we clearly want to reset nf_trace and is also what the original patch intended to fix. Add a new function nf_reset_trace() and use it in dev_forward_skb() to fix this properly. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | net: count hw_addr syncs so that unsync works properly.Vlad Yasevich2013-04-05
| |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A few drivers use dev_uc_sync/unsync to synchronize the address lists from master down to slave/lower devices. In some cases (bond/team) a single address list is synched down to multiple devices. At the time of unsync, we have a leak in these lower devices, because "synced" is treated as a boolean and the address will not be unsynced for anything after the first device/call. Treat "synced" as a count (same as refcount) and allow all unsync calls to work. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2013-04-01
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) sadb_msg prepared for IPSEC userspace forgets to initialize the satype field, fix from Nicolas Dichtel. 2) Fix mac80211 synchronization during station removal, from Johannes Berg. 3) Fix IPSEC sequence number notifications when they wrap, from Steffen Klassert. 4) Fix cfg80211 wdev tracing crashes when add_virtual_intf() returns an error pointer, from Johannes Berg. 5) In mac80211, don't call into the channel context code with the interface list mutex held. From Johannes Berg. 6) In mac80211, if we don't actually associate, do not restart the STA timer, otherwise we can crash. From Ben Greear. 7) Missing dma_mapping_error() check in e1000, ixgb, and e1000e. From Christoph Paasch. 8) Fix sja1000 driver defines to not conflict with SH port, from Marc Kleine-Budde. 9) Don't call il4965_rs_use_green with a NULL station, from Colin Ian King. 10) Suspend/Resume in the FEC driver fail because the buffer descriptors are not initialized at all the moments in which they should. Fix from Frank Li. 11) cpsw and davinci_emac drivers both use the wrong interface to restart a stopped TX queue. Use netif_wake_queue not netif_start_queue, the latter is for initialization/bringup not active management of the queue. From Mugunthan V N. 12) Fix regression in rate calculations done by psched_ratecfg_precompute(), missing u64 type promotion. From Sergey Popovich. 13) Fix length overflow in tg3 VPD parsing, from Kees Cook. 14) AOE driver fails to allocate enough headroom, resulting in crashes. Fix from Eric Dumazet. 15) RX overflow happens too quickly in sky2 driver because pause packet thresholds are not programmed correctly. From Mirko Lindner. 16) Bonding driver manages arp_interval and miimon settings incorrectly, disabling one unintentionally disables both. Fix from Nikolay Aleksandrov. 17) smsc75xx drivers don't program the RX mac properly for jumbo frames. Fix from Steve Glendinning. 18) Fix off-by-one in Codel packet scheduler. From Vijay Subramanian. 19) Fix packet corruption in atl1c by disabling MSI support, from Hannes Frederic Sowa. 20) netdev_rx_handler_unregister() needs a synchronize_net() to fix crashes in bonding driver unload stress tests. From Eric Dumazet. 21) rxlen field of ks8851 RX packet descriptors not interpreted correctly (it is 12 bits not 16 bits, so needs to be masked after shifting the 32-bit value down 16 bits). Fix from Max Nekludov. 22) Fix missed RX/TX enable in sh_eth driver due to mishandling of link change indications. From Sergei Shtylyov. 23) Fix crashes during spurious ECI interrupts in sh_eth driver, also from Sergei Shtylyov. 24) dm9000 driver initialization is done wrong for revision B devices with DSP PHY, from Joseph CHANG. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits) DM9000B: driver initialization upgrade sh_eth: make 'link' field of 'struct sh_eth_private' *int* sh_eth: workaround for spurious ECI interrupt sh_eth: fix handling of no LINK signal ks8851: Fix interpretation of rxlen field. net: add a synchronize_net() in netdev_rx_handler_unregister() MAINTAINERS: Update netxen_nic maintainers list atl1e: drop pci-msi support because of packet corruption net: fq_codel: Fix off-by-one error net: calxedaxgmac: Wake-on-LAN fixes net: calxedaxgmac: fix rx ring handling when OOM net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb' smsc75xx: fix jumbo frame support net: fix the use of this_cpu_ptr bonding: fix disabling of arp_interval and miimon ipv6: don't accept node local multicast traffic from the wire sky2: Threshold for Pause Packet is set wrong sky2: Receive Overflows not counted aoe: reserve enough headroom on skbs line up comment for ndo_bridge_getlink ...
| * | | | | net: add a synchronize_net() in netdev_rx_handler_unregister()Eric Dumazet2013-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 35d48903e97819 (bonding: fix rx_handler locking) added a race in bonding driver, reported by Steven Rostedt who did a very good diagnosis : <quoting Steven> I'm currently debugging a crash in an old 3.0-rt kernel that one of our customers is seeing. The bug happens with a stress test that loads and unloads the bonding module in a loop (I don't know all the details as I'm not the one that is directly interacting with the customer). But the bug looks to be something that may still be present and possibly present in mainline too. It will just be much harder to trigger it in mainline. In -rt, interrupts are threads, and can schedule in and out just like any other thread. Note, mainline now supports interrupt threads so this may be easily reproducible in mainline as well. I don't have the ability to tell the customer to try mainline or other kernels, so my hands are somewhat tied to what I can do. But according to a core dump, I tracked down that the eth irq thread crashed in bond_handle_frame() here: slave = bond_slave_get_rcu(skb->dev); bond = slave->bond; <--- BUG the slave returned was NULL and accessing slave->bond caused a NULL pointer dereference. Looking at the code that unregisters the handler: void netdev_rx_handler_unregister(struct net_device *dev) { ASSERT_RTNL(); RCU_INIT_POINTER(dev->rx_handler, NULL); RCU_INIT_POINTER(dev->rx_handler_data, NULL); } Which is basically: dev->rx_handler = NULL; dev->rx_handler_data = NULL; And looking at __netif_receive_skb() we have: rx_handler = rcu_dereference(skb->dev->rx_handler); if (rx_handler) { if (pt_prev) { ret = deliver_skb(skb, pt_prev, orig_dev); pt_prev = NULL; } switch (rx_handler(&skb)) { My question to all of you is, what stops this interrupt from happening while the bonding module is unloading? What happens if the interrupt triggers and we have this: CPU0 CPU1 ---- ---- rx_handler = skb->dev->rx_handler netdev_rx_handler_unregister() { dev->rx_handler = NULL; dev->rx_handler_data = NULL; rx_handler() bond_handle_frame() { slave = skb->dev->rx_handler; bond = slave->bond; <-- NULL pointer dereference!!! What protection am I missing in the bond release handler that would prevent the above from happening? </quoting Steven> We can fix bug this in two ways. First is adding a test in bond_handle_frame() and others to check if rx_handler_data is NULL. A second way is adding a synchronize_net() in netdev_rx_handler_unregister() to make sure that a rcu protected reader has the guarantee to see a non NULL rx_handler_data. The second way is better as it avoids an extra test in fast path. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb'Shmulik Ladkani2013-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'nf_reset' is called just prior calling 'netif_rx'. No need to call it twice. Reported-by: Igor Michailov <rgohita@gmail.com> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: fix the use of this_cpu_ptrLi RongQing2013-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | flush_tasklet is not percpu var, and percpu is percpu var, and this_cpu_ptr(&info->cache->percpu->flush_tasklet) is not equal to &this_cpu_ptr(info->cache->percpu)->flush_tasklet 1f743b076(use this_cpu_ptr per-cpu helper) introduced this bug. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | rtnetlink: fix error return code in rtnl_link_fill()Wei Yongjun2013-03-27
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix to return a negative error code from the error handling case instead of 0(possible overwrite to 0 by ops->fill_xstats call), as returned elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge branch 'for-linus' of ↵Linus Torvalds2013-03-28
|\ \ \ \ \ | |/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns fixes from Eric W Biederman: "The bulk of the changes are fixing the worst consequences of the user namespace design oversight in not considering what happens when one namespace starts off as a clone of another namespace, as happens with the mount namespace. The rest of the changes are just plain bug fixes. Many thanks to Andy Lutomirski for pointing out many of these issues." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Restrict when proc and sysfs can be mounted ipc: Restrict mounting the mqueue filesystem vfs: Carefully propogate mounts across user namespaces vfs: Add a mount flag to lock read only bind mounts userns: Don't allow creation if the user is chrooted yama: Better permission check for ptraceme pid: Handle the exit of a multi-threaded init. scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.
| * | | | scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.Eric W. Biederman2013-03-17
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't allow spoofing pids over unix domain sockets in the corner cases where a user has created a user namespace but has not yet created a pid namespace. Cc: stable@vger.kernel.org Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | | net: remove a WARN_ON() in net_enable_timestamp()Eric Dumazet2013-03-24
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The WARN_ON(in_interrupt()) in net_enable_timestamp() can get false positive, in socket clone path, run from softirq context : [ 3641.624425] WARNING: at net/core/dev.c:1532 net_enable_timestamp+0x7b/0x80() [ 3641.668811] Call Trace: [ 3641.671254] <IRQ> [<ffffffff80286817>] warn_slowpath_common+0x87/0xc0 [ 3641.677871] [<ffffffff8028686a>] warn_slowpath_null+0x1a/0x20 [ 3641.683683] [<ffffffff80742f8b>] net_enable_timestamp+0x7b/0x80 [ 3641.689668] [<ffffffff80732ce5>] sk_clone_lock+0x425/0x450 [ 3641.695222] [<ffffffff8078db36>] inet_csk_clone_lock+0x16/0x170 [ 3641.701213] [<ffffffff807ae449>] tcp_create_openreq_child+0x29/0x820 [ 3641.707663] [<ffffffff807d62e2>] ? ipt_do_table+0x222/0x670 [ 3641.713354] [<ffffffff807aaf5b>] tcp_v4_syn_recv_sock+0xab/0x3d0 [ 3641.719425] [<ffffffff807af63a>] tcp_check_req+0x3da/0x530 [ 3641.724979] [<ffffffff8078b400>] ? inet_hashinfo_init+0x60/0x80 [ 3641.730964] [<ffffffff807ade6f>] ? tcp_v4_rcv+0x79f/0xbe0 [ 3641.736430] [<ffffffff807ab9bd>] tcp_v4_do_rcv+0x38d/0x4f0 [ 3641.741985] [<ffffffff807ae14a>] tcp_v4_rcv+0xa7a/0xbe0 Its safe at this point because the parent socket owns a reference on the netstamp_needed, so we cant have a 0 -> 1 transition, which requires to lock a mutex. Instead of refining the check, lets remove it, as all known callers are safe. If it ever changes in the future, static_key_slow_inc() will complain anyway. Reported-by: Laurent Chavey <chavey@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | flow_keys: include thoff into flow_keys for later usageDaniel Borkmann2013-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're doing the work here anyway, also store thoff for a later usage, e.g. in the BPF filter. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | rtnetlink: Mask the rta_type when range checkingVlad Yasevich2013-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Range/validity checks on rta_type in rtnetlink_rcv_msg() do not account for flags that may be set. This causes the function to return -EINVAL when flags are set on the type (for example NLA_F_NESTED). Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net/core: move vlan_depth out of while loop in skb_network_protocol()Li RongQing2013-03-12
|/ / | | | | | | | | | | | | | | | | | | | | | | [ Bug added added in commit 05e8ef4ab2d8087d (net: factor out skb_mac_gso_segment() from skb_gso_segment() ) ] move vlan_depth out of while loop, or else vlan_depth always is ETH_HLEN, can not be increased, and lead to infinite loop when frame has two vlan headers. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rtnl: fix info leak on RTM_GETLINK request for VF devicesMathias Krause2013-03-10
| | | | | | | | | | | | | | | | | | | | | | Initialize the mac address buffer with 0 as the driver specific function will probably not fill the whole buffer. In fact, all in-kernel drivers fill only ETH_ALEN of the MAX_ADDR_LEN bytes, i.e. 6 of the 32 possible bytes. Therefore we currently leak 26 bytes of stack memory to userland via the netlink interface. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridging: fix rx_handlers return codeCristian Bercaru2013-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | The frames for which rx_handlers return RX_HANDLER_CONSUMED are no longer counted as dropped. They are counted as successfully received by 'netif_receive_skb'. This allows network interface drivers to correctly update their RX-OK and RX-DRP counters based on the result of 'netif_receive_skb'. Signed-off-by: Cristian Bercaru <B43982@freescale.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: reduce net_rx_action() latency to 2 HZEric Dumazet2013-03-06
| | | | | | | | | | | | | | | | | | | | We should use time_after_eq() to get maximum latency of two ticks, instead of three. Bug added in commit 24f8b2385 (net: increase receive packet quantum) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: fix new kernel-doc warnings in net coreRandy Dunlap2013-03-06
|/ | | | | | | | | | Fix new kernel-doc warnings in net/core/dev.c: Warning(net/core/dev.c:4788): No description found for parameter 'new_carrier' Warning(net/core/dev.c:4788): Excess function parameter 'new_carries' description in 'dev_change_carrier' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* hlist: drop the node parameter from iteratorsSasha Levin2013-02-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2013-02-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle *and* ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...
| * new helper: file_inode(file)Al Viro2013-02-22
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2013-02-26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) ping_err() ICMP error handler looks at wrong ICMP header, from Li Wei. 2) TCP socket hash function on ipv6 is too weak, from Eric Dumazet. 3) netif_set_xps_queue() forgets to drop mutex on errors, fix from Alexander Duyck. 4) sum_frag_mem_limit() can deadlock due to lack of BH disabling, fix from Eric Dumazet. 5) TCP SYN data is miscalculated in tcp_send_syn_data(), because the amount of TCP option space was not taken into account properly in this code path. Fix from yuchung Cheng. 6) MLX4 driver allocates device queues with the wrong size, from Kleber Sacilotto. 7) sock_diag can access past the end of the sock_diag_handlers[] array, from Mathias Krause. 8) vlan_set_encap_proto() makes incorrect assumptions about where skb->data points, rework the logic so that it works regardless of where skb->data happens to be. From Jesse Gross. 9) Fix gianfar build failure with NET_POLL enabled, from Paul Gortmaker. 10) Fix Ipv4 ID setting and checksum calculations in GRE driver, from Pravin B Shelar. 11) bgmac driver does: int i; for (i = 0; ...; ...) { ... for (i = 0; ...; ...) { effectively corrupting the outer loop index, use a seperate variable for the inner loops. From Rafał Miłecki. 12) Fix suspend bugs in smsc95xx driver, from Ming Lei. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits) usbnet: smsc95xx: rename FEATURE_AUTOSUSPEND usbnet: smsc95xx: fix broken runtime suspend usbnet: smsc95xx: fix suspend failure bgmac: fix indexing of 2nd level loops b43: Fix lockdep splat on module unload Revert "ip_gre: propogate target device GSO capability to the tunnel device" IP_GRE: Fix GRE_CSUM case. VXLAN: Use tunnel_ip_select_ident() for tunnel IP-Identification. IP_GRE: Fix IP-Identification. net/pasemi: Fix missing coding style vmxnet3: fix ethtool ring buffer size setting vmxnet3: make local function static bnx2x: remove dead code and make local funcs static gianfar: fix compile fail for NET_POLL=y due to struct packing vlan: adjust vlan_set_encap_proto() for its callers sock_diag: Simplify sock_diag_handlers[] handling in __sock_diag_rcv_msg sock_diag: Fix out-of-bounds access to sock_diag_handlers[] vxlan: remove depends on CONFIG_EXPERIMENTAL mlx4_en: fix allocation of CPU affinity reverse-map mlx4_en: fix allocation of device tx_cq ...
| * | sock_diag: Simplify sock_diag_handlers[] handling in __sock_diag_rcv_msgMathias Krause2013-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | The sock_diag_lock_handler() and sock_diag_unlock_handler() actually make the code less readable. Get rid of them and make the lock usage and access to sock_diag_handlers[] clear on the first sight. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sock_diag: Fix out-of-bounds access to sock_diag_handlers[]Mathias Krause2013-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userland can send a netlink message requesting SOCK_DIAG_BY_FAMILY with a family greater or equal then AF_MAX -- the array size of sock_diag_handlers[]. The current code does not test for this condition therefore is vulnerable to an out-of-bound access opening doors for a privilege escalation. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sock: only define socket limit if mem cgroup configuredstephen hemminger2013-02-22
| | | | | | | | | | | | | | | | | | | | | | | | The mem cgroup socket limit is only used if the config option is enabled. Found with sparse Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: Fix locking bug in netif_set_xps_queueAlexander Duyck2013-02-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Smatch found a locking bug in netif_set_xps_queue in which we were not releasing the lock in the case of an allocation failure. This change corrects that so that we release the xps_map_mutex before returning -ENOMEM in the case of an allocation failure. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net/core: apply pm_runtime_set_memalloc_noio on network devicesMing Lei2013-02-23
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Deadlock might be caused by allocating memory with GFP_KERNEL in runtime_resume and runtime_suspend callback of network devices in iSCSI situation, so mark network devices and its ancestor as 'memalloc_noio' with the introduced pm_runtime_set_memalloc_noio(). Signed-off-by: Ming Lei <ming.lei@canonical.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Decotigny <david.decotigny@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <axboe@kernel.dk> Cc: Minchan Kim <minchan@kernel.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Oliver Neukum <oneukum@suse.de> Cc: Jiri Kosina <jiri.kosina@suse.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | ipv6: fix race condition regarding dst->expires and dst->from.YOSHIFUJI Hideaki / 吉藤英明2013-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eric Dumazet wrote: | Some strange crashes happen in rt6_check_expired(), with access | to random addresses. | | At first glance, it looks like the RTF_EXPIRES and | stuff added in commit 1716a96101c49186b | (ipv6: fix problem with expired dst cache) | are racy : same dst could be manipulated at the same time | on different cpus. | | At some point, our stack believes rt->dst.from contains a dst pointer, | while its really a jiffie value (as rt->dst.expires shares the same area | of memory) | | rt6_update_expires() should be fixed, or am I missing something ? | | CC Neil because of https://bugzilla.redhat.com/show_bug.cgi?id=892060 Because we do not have any locks for dst_entry, we cannot change essential structure in the entry; e.g., we cannot change reference to other entity. To fix this issue, split 'from' and 'expires' field in dst_entry out of union. Once it is 'from' is assigned in the constructor, keep the reference until the very last stage of the life time of the object. Of course, it is unsafe to change 'from', so make rt6_set_from simple just for fresh entries. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Neil Horman <nhorman@tuxdriver.com> CC: Gao Feng <gaofeng@cn.fujitsu.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reported-by: Steinar H. Gunderson <sesse@google.com> Reviewed-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: fix a wrong assignment in skb_split()Amerigo Wang2013-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c9af6db4c11ccc6c3e7f1 (net: Fix possible wrong checksum generation) has a suspicous piece: - skb_shinfo(skb1)->gso_type = skb_shinfo(skb)->gso_type; - + skb_shinfo(skb)->tx_flags = skb_shinfo(skb1)->tx_flags & SKBTX_SHARED_FRAG; skb1 is the new skb, therefore should be on the left side of the assignment. This patch fixes it. Cc: Pravin B Shelar <pshelar@nicira.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: fix a build failure when !CONFIG_PROC_FSCong Wang2013-02-19
| | | | | | | | | | | | | | | | | | | | | | | | When !CONFIG_PROC_FS dev_mcast_init() is not defined, actually we can just merge dev_mcast_init() into dev_proc_init(). Reported-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: move procfs code to net/core/net-procfs.cCong Wang2013-02-19
| | | | | | | | | | | | | | | | | | Similar to net/core/net-sysfs.c, group procfs code to a single unit. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: proc: change proc_net_remove to remove_proc_entryGao feng2013-02-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | proc_net_remove is only used to remove proc entries that under /proc/net,it's not a general function for removing proc entries of netns. if we want to remove some proc entries which under /proc/net/stat/, we still need to call remove_proc_entry. this patch use remove_proc_entry to replace proc_net_remove. we can remove proc_net_remove after this patch. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: proc: change proc_net_fops_create to proc_createGao feng2013-02-18
| | | | | | | | | | | | | | | | | | | | | | | | | | Right now, some modules such as bonding use proc_create to create proc entries under /proc/net/, and other modules such as ipv4 use proc_net_fops_create. It looks a little chaos.this patch changes all of proc_net_fops_create to proc_create. we can remove proc_net_fops_create after this patch. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: move ioctl functions into a separated fileCong Wang2013-02-18
| | | | | | | | | | | | | | | | They well deserve a separated unit. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: use skb_reset_mac_len() in dev_gro_receive()Eric Dumazet2013-02-15
| | | | | | | | | | | | | | We no longer need to use mac_len, lets cleanup things. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | v4 GRE: Add TCP segmentation offload for GREPravin B Shelar2013-02-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following patch adds GRE protocol offload handler so that skb_gso_segment() can segment GRE packets. SKB GSO CB is added to keep track of total header length so that skb_segment can push entire header. e.g. in case of GRE, skb_segment need to push inner and outer headers to every segment. New NETIF_F_GRE_GSO feature is added for devices which support HW GRE TSO offload. Currently none of devices support it therefore GRE GSO always fall backs to software GSO. [ Compute pkt_len before ip_local_out() invocation. -DaveM ] Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: factor out skb_mac_gso_segment() from skb_gso_segment()Pravin B Shelar2013-02-15
| | | | | | | | | | | | | | | | | | | | | | | | This function will be used in next GRE_GSO patch. This patch does not change any functionality. It only exports skb_mac_gso_segment() function. [ Use skb_reset_mac_len() -DaveM ] Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Don't write to current task flags on every packet received.David S. Miller2013-02-14
| | | | | | | | | | | | | | | | | | | | | | | | Even for non-pfmalloc SKBs, __netif_receive_skb() will do a tsk_restore_flags() on current unconditionally. Make __netif_receive_skb() a shim around the existing code, renamed to __netif_receive_skb_core(). Let __netif_receive_skb() wrap the __netif_receive_skb_core() call with the task flag modifications, if necessary. Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: Add vlan support to static neighborsVlad Yasevich2013-02-13
| | | | | | | | | | | | | | | | | | | | | | When a user adds bridge neighbors, allow him to specify VLAN id. If the VLAN id is not specified, the neighbor will be added for VLANs currently in the ports filter list. If no VLANs are configured on the port, we use vlan 0 and only add 1 entry. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>