aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* net/mlx4: Add mlx4_bitmap zone allocatorMatan Barak2014-12-11
| | | | | | | | | | | | | | | The zone allocator is a mechanism which manages a few mlx4_bitmaps. When allocating a resource, the user indicates the desired zone of which this resource will be allocated from. If possible, the resource will be allocated from this zone. Otherwise, the resource will be allocated from a less-than, equal-to, higher-than priority zone, according to the desired zone's properties with that respective allocation order. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4: Add a check if there are too many reserved QPsDotan Barak2014-12-11
| | | | | | | | | | | The number of reserved QPs is affected both from the firmware and from the driver's requirements. This patch adds a check that validates that this number is indeed feasable. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4: Change QP allocation schemeEugenia Emantayev2014-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset. The current Ethernet driver code reserves a Tx QP range with 256b alignment. This is wrong because if there are more than 64 Tx QPs in use, QPNs >= base + 65 will have bits 6/7 set. This problem is not specific for the Ethernet driver, any entity that tries to reserve more than 64 BF-enabled QPs should fail. Also, using ranges is not necessary here and is wasteful. The new mechanism introduced here will support reservation for "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs (when hypervisors support WC in VMs). The flow we use is: 1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation, and request "BF enabled QPs" if BF is supported for the function 2. In the ALLOC_RES FW command, change param1 to: a. param1[23:0] - number of QPs b. param1[31-24] - flags controlling QPs reservation Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits 6 and 7 unset in order to be used in Ethernet. Bits 24-30 of the flags are currently reserved. When a function tries to allocate a QP, it states the required attributes for this QP. Those attributes are considered "best-effort". If an attribute, such as Ethernet BF enabled QP, is a must-have attribute, the function has to check that attribute is supported before trying to do the allocation. In a lower layer of the code, mlx4_qp_reserve_range masks out the bits which are unsupported. If SRIOV is used, the PF validates those attributes and masks out unsupported attributes as well. In order to notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's mailbox is filled by the PF, which notifies which QP allocation attributes it supports. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4_core: Use tasklet for user-space CQ completion eventsMatan Barak2014-12-11
| | | | | | | | | | | | | | | | | | | | | | Previously, we've fired all our completion callbacks straight from our ISR. Some of those callbacks were lightweight (for example, mlx4_en's and IPoIB napi callbacks), but some of them did more work (for example, the user-space RDMA stack uverbs' completion handler). Besides that, doing more than the minimal work in ISR is generally considered wrong, it could even lead to a hard lockup of the system. Since when a lot of completion events are generated by the hardware, the loop over those events could be so long, that we'll get into a hard lockup by the system watchdog. In order to avoid that, add a new way of invoking completion events callbacks. In the interrupt itself, we add the CQs which receive completion event to a per-EQ list and schedule a tasklet. In the tasklet context we loop over all the CQs in the list and invoke the user callback. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4_core: Mask out host side virtualization features for guestsOr Gerlitz2014-12-11
| | | | | | | | | When VFs (guests in this context) issue the QUERY_DEV_CAP command, they need not be told that host side virtualization features such as VST, FSM (MAC anti-spoofing) and running > 80 VFs are supported by the device. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4_en: Set csum level for encapsulated packetsOr Gerlitz2014-12-11
| | | | | | | | This was dropped by mistake for the napi_gro_frags flow, fix that. Fixes: dd65beac48a5 ('net/mlx4_en: Extend usage of napi_gro_frags') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* be2net: Export tunnel offloads only when a VxLAN tunnel is createdSriharsha Basavapatna2014-12-11
| | | | | | | | | | | | | | The encapsulated offload flags shouldn't be unconditionally exported to the stack. The stack expects offloading to work across all tunnel types when those flags are set. This would break other tunnels (like GRE) since be2net currently supports tunnel offload for VxLAN only. Also, with VxLANs Skyhawk-R can offload only 1 UDP dport. If more than 1 UDP port is added, we should disable offloads in that case too. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* gianfar: Fix dma check map error when DMA_API_DEBUG is enabledKevin Hao2014-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to use dma_mapping_error() to check the dma address returned by dma_map_single/page(). Otherwise we would get warning like this: WARNING: at lib/dma-debug.c:1140 Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc2-next-20141029 #196 task: c0834300 ti: effe6000 task.ti: c0874000 NIP: c02b2c98 LR: c02b2c98 CTR: c030abc4 REGS: effe7d70 TRAP: 0700 Not tainted (3.18.0-rc2-next-20141029) MSR: 00021000 <CE,ME> CR: 22044022 XER: 20000000 GPR00: c02b2c98 effe7e20 c0834300 00000098 00021000 00000000 c030b898 00000003 GPR08: 00000001 00000000 00000001 749eec9d 22044022 1001abe0 00000020 ef278678 GPR16: ef278670 ef278668 ef278660 070a8040 c087f99c c08cdc60 00029000 c0840d44 GPR24: c08be6e8 c0840000 effe7e78 ef041340 00000600 ef114e10 00000000 c08be6e0 NIP [c02b2c98] check_unmap+0x51c/0x9e4 LR [c02b2c98] check_unmap+0x51c/0x9e4 Call Trace: [effe7e20] [c02b2c98] check_unmap+0x51c/0x9e4 (unreliable) [effe7e70] [c02b31d8] debug_dma_unmap_page+0x78/0x8c [effe7ed0] [c03d1640] gfar_clean_rx_ring+0x208/0x488 [effe7f40] [c03d1a9c] gfar_poll_rx_sq+0x3c/0xa8 [effe7f60] [c04f8714] net_rx_action+0xc0/0x178 [effe7f90] [c00435a0] __do_softirq+0x100/0x1fc [effe7fe0] [c0043958] irq_exit+0xa4/0xc8 [effe7ff0] [c000d14c] call_do_irq+0x24/0x3c [c0875e90] [c00048a0] do_IRQ+0x8c/0xf8 [c0875eb0] [c000ed10] ret_from_except+0x0/0x18 For TX, we need to unmap the pages which has already been mapped and free the skb before return. For RX, move the dma mapping and error check to gfar_new_skb(). We would reuse the original skb in the rx ring when either allocating skb failure or dma mapping error. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4/csiostor: Don't use MASTER_MUST for fw_hello callHariprasad Shenai2014-12-11
| | | | | | | | | Remove use of calls into t4_fw_hello() with MASTER_MUST, which results in FW_HELLO_CMD_MASTERFORCE being set. The firmware doesn't support this and of course any existing PF Drivers will totally go for a toss. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'fec-next'David S. Miller2014-12-10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fugang Duan says: ==================== net: fec: driver code clean and bug fix The patch serial include code clean and bug fix: Patch#1: avoid dummy operation during suspend/resume test. Patch#2: bug fix for i.MX6SX SOC that clean all interrupt events during MAC initial process. Patch#3: before phy device link status is up, only enable MDIO bus interrupt. V2: - Modify the comment form from David's suggestion. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: fec: only enable mdio interrupt before phy device link upNimrod Andy2014-12-10
| | | | | | | | | | | | | | | | Before phy device link up, we only enable FEC mdio interrupt, which is more reasonable. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: fec: clear all interrupt events to support i.MX6SXNimrod Andy2014-12-10
| | | | | | | | | | | | | | | | | | For i.MX6SX FEC controller, there have interrupt mask and event field extension. To support all SOCs FEC, we clear all interrupt events during MAVC initial process. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: fec: reset fep link status in suspend functionNimrod Andy2014-12-10
|/ | | | | | | | | | | | | | | | | | | | | | | | On some i.MX6 serial boards, phy power and refrence clock are supplied or controlled by SOC. When do suspend/resume test, the power and clock are disabled, so phy device link down. For current driver, fep->link is still up status, which cause extra operation like below code. To avoid the dumy operation, we set fep->link to down when phy device is real down. ... if (fep->link) { napi_disable(&fep->napi); netif_tx_lock_bh(ndev); fec_stop(ndev); netif_tx_unlock_bh(ndev); napi_enable(&fep->napi); fep->link = phy_dev->link; status_change = 1; } ... Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: sock: fix access via invalid file descriptorAlexei Starovoitov2014-12-10
| | | | | | | | | | | | | | 0day robot reported the following crash: [ 21.233581] BUG: unable to handle kernel NULL pointer dereference at 0000000000000007 [ 21.234709] IP: [<ffffffff8156ebda>] sk_attach_bpf+0x39/0xc2 It's due to bpf_prog_get() returning ERR_PTR. Check it properly. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: introduce helper macro for_each_cmsghdrGu Zheng2014-12-10
| | | | | | | | Introduce helper macro for_each_cmsghdr as a wrapper of the enumerating cmsghdr from msghdr, just cleanup. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4/cxgb4vf: global named must be uniqueStephen Rothwell2014-12-10
| | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-12-10
|\ | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/amd/xgbe/xgbe-desc.c drivers/net/ethernet/renesas/sh_eth.c Overlapping changes in both conflict cases. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: fix suspicious rcu_dereference_check in net/sched/sch_fq_codel.cValdis.Kletnieks@vt.edu2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 46e5da40ae (net: qdisc: use rcu prefix and silence sparse warnings) triggers a spurious warning: net/sched/sch_fq_codel.c:97 suspicious rcu_dereference_check() usage! The code should be using the _bh variant of rcu_dereference. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netfront: use correct linear area after linearizing an skbDavid Vrabel2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 97a6d1bb2b658ac85ed88205ccd1ab809899884d (xen-netfront: Fix handling packets on compound pages with skb_linearize) attempted to fix a problem where an skb that would have required too many slots would be dropped causing TCP connections to stall. However, it filled in the first slot using the original buffer and not the new one and would use the wrong offset and grant access to the wrong page. Netback would notice the malformed request and stop all traffic on the VIF, reporting: vif vif-3-0 vif3.0: txreq.offset: 85e, size: 4002, end: 6144 vif vif-3-0 vif3.0: fatal error; disabling device Reported-by: Anthony Wright <anthony@overnetdata.com> Tested-by: Anthony Wright <anthony@overnetdata.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: fix more NULL deref after prequeue changesEric Dumazet2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I cooked commit c3658e8d0f1 ("tcp: fix possible NULL dereference in tcp_vX_send_reset()") I missed other spots we could deref a NULL skb_dst(skb) Again, if a socket is provided, we do not need skb_dst() to get a pointer to network namespace : sock_net(sk) is good enough. Reported-by: Dann Frazier <dann.frazier@canonical.com> Bisected-by: Dann Frazier <dann.frazier@canonical.com> Tested-by: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: ca777eff51f7 ("tcp: remove dst refcount false sharing for prequeue mode") Signed-off-by: David S. Miller <davem@davemloft.net>
| * netback: don't store invalid vif pointerJan Beulich2014-12-09
| | | | | | | | | | | | | | | | | | | | When xenvif_alloc() fails, it returns a non-NULL error indicator. To avoid eventual races, we shouldn't store that into struct backend_info as readers of it only check for NULL. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge tag 'linux-can-fixes-for-3.18-20141207' of ↵David S. Miller2014-12-09
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://gitorious.org/linux-can/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2014-12-07 this is a pull request of three patches by Stephane Grosjean which fix several bugs in the peak_usb CAN drivers. Please queue, if possible for 3.18, if it's too late these patches takes the slow lane via net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * can: peak_usb: fix multi-byte values endianessStephane Grosjean2014-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the endianess definition as well as the usage of the multi-byte fields in the data structures exchanged with the PEAK-System USB adapters. By fixing the endianess, this patch also fixes the wrong usage of a 32-bits local variable for handling the error status 16-bits field, in function pcan_usb_pro_handle_error(). Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| | * can: peak_usb: fix cleanup sequence order in case of error during initStephane Grosjean2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch sets the correct reverse sequence order to the instructions set to run, when any failure occurs during the initialization steps. It also adds the missing unregistration call of the can device if the failure appears after having been registered. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| | * can: peak_usb: fix memset() usageStephane Grosjean2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patchs fixes a misplaced call to memset() that fills the request buffer with 0. The problem was with sending PCAN_USBPRO_REQ_FCT requests, the content set by the caller was thus lost. With this patch, the memory area is zeroed only when requesting info from the device. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * | bnx2x: Implement ndo_gso_check()Joe Stringer2014-12-09
| | | | | | | | | | | | | | | | | | | | | Use vxlan_gso_check() to advertise offload support for this NIC. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | amd-xgbe: Prevent Tx cleanup stallLendacky, Thomas2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When performing Tx cleanup, the dirty index counter is compared to the current index counter as one of the tests used to determine when to stop cleanup. The "less than" test will fail when the current index counter rolls over to zero causing cleanup to never occur again. Update the test to a "not equal" to avoid this situation. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Update old iproute2 and Xen Remus linksAndrew Shewmaker2014-12-09
| | | | | | | | | | | | | | | | | | Signed-off-by: Andrew Shewmaker <agshew@gmail.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge tag 'master-2014-12-01' of ↵David S. Miller2014-12-09
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-12-03 One last(?) batch of fixes hoping to make 3.18... In this episode, we have another trio of rtlwifi fixes repairing a little more damage from the major update of the rtlwifi-family of drivers. These editing mistakes caused some memory corruption and missed a flag critical to proper interrupt handling. Together, these fix the kernel regression reported at https://bugzilla.kernel.org/show_bug.cgi?id=88951 by Catalin Iacob. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bnx2x: Limit 1G link enforcementYaniv Rosner2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change 1G-SFP module detection by verifying not only that it's not compliant with 10G-Ethernet, but also that it's 1G-ethernet compliant. Signed-off-by: Yaniv Rosner <Yaniv.Rosner@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | stmmac: fix max coal timer parameterGiuseppe CAVALLARO2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is to fix the max coalesce timer setting that can be provided by ethtool. The default value (STMMAC_COAL_TX_TIMER) was used in the set_coalesce helper instead of the max one (STMMAC_MAX_COAL_TX_TICK, so defined but not used). Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: sctp: use MAX_HEADER for headroom reserve in output pathDaniel Borkmann2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To accomodate for enough headroom for tunnels, use MAX_HEADER instead of LL_MAX_HEADER. Robert reported that he has hit after roughly 40hrs of trinity an skb_under_panic() via SCTP output path (see reference). I couldn't reproduce it from here, but not using MAX_HEADER as elsewhere in other protocols might be one possible cause for this. In any case, it looks like accounting on chunks themself seems to look good as the skb already passed the SCTP output path and did not hit any skb_over_panic(). Given tunneling was enabled in his .config, the headroom would have been expanded by MAX_HEADER in this case. Reported-by: Robert Święcki <robert@swiecki.net> Reference: https://lkml.org/lkml/2014/12/1/507 Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | cxgb4: Update FW version string to match FW binary version 1.12.25.0Hariprasad Shenai2014-12-09
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | cxgb4: Add a check for flashing FW using ethtoolHariprasad Shenai2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't let T4 firmware flash on a T5 adapter and vice-versa using ethtool Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge branch 'amd-xgbe'David S. Miller2014-12-09
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver fixes 2014-12-02 The following series of patches includes two bug fixes. Unfortunately, the first patch will create a conflict when eventually merged into net-next but should be very easy to resolve. - Do not clear the interrupt bit in the xgbe_ring_data structure - Associate a Tx SKB with the proper xgbe_ring_data structure This patch series is based on net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | amd-xgbe: Associate Tx SKB with proper ring descriptorLendacky, Thomas2014-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SKB for a Tx packet is associated with an xgbe_ring_data structure in the xgbe_map_tx_skb function. However, it is being saved in the structure after the last structure used when the SKB is mapped. Use the last used structure to save the SKB value. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | amd-xgbe: Do not clear interrupt indicatorLendacky, Thomas2014-12-09
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The interrupt value within the xgbe_ring_data structure is used as an indicator of which Rx descriptor should have the INTE bit set to generate an interrupt when that Rx descriptor is used. This bit was mistakenly cleared in the xgbe_unmap_rdata function, effectively nullifying the ethtool rx-frames support. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | fib_trie: Fix /proc/net/fib_trie when CONFIG_IP_MULTIPLE_TABLES is not definedAlexander Duyck2014-12-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In recent testing I had disabled CONFIG_IP_MULTIPLE_TABLES and as a result when I ran "cat /proc/net/fib_trie" the main trie was displayed multiple times. I found that the problem line of code was in the function fib_trie_seq_next. Specifically the line below caused the indexes to go in the opposite direction of our traversal: h = tb->tb_id & (FIB_TABLE_HASHSZ - 1); This issue was that the RT tables are defined such that RT_TABLE_LOCAL is ID 255, while it is located at TABLE_LOCAL_INDEX of 0, and RT_TABLE_MAIN is 254 with a TABLE_MAIN_INDEX of 1. This means that the above line will return 1 for the local table and 0 for main. The result is that fib_trie_seq_next will return NULL at the end of the local table, fib_trie_seq_start will return the start of the main table, and then fib_trie_seq_next will loop on main forever as h will always return 0. The fix for this is to reverse the ordering of the two tables. It has the advantage of making it so that the tables now print in the same order regardless of if multiple tables are enabled or not. In order to make the definition consistent with the multiple tables case I simply masked the to RT_TABLE_XXX values by (FIB_TABLE_HASHSZ - 1). This way the two table layouts should always stay consistent. Fixes: 93456b6 ("[IPV4]: Unify access to the routing tables") Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: mvneta: fix race condition in mvneta_tx()Eric Dumazet2014-12-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mvneta_tx() dereferences skb to get skb->len too late, as hardware might have completed the transmit and TX completion could have freed the skb from another cpu. Fixes: 71f6d1b31fb1 ("net: mvneta: replace Tx timer with a real interrupt") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: mvneta: fix Tx interrupt delaywilly tarreau2014-12-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mvneta driver sets the amount of Tx coalesce packets to 16 by default. Normally that does not cause any trouble since the driver uses a much larger Tx ring size (532 packets). But some sockets might run with very small buffers, much smaller than the equivalent of 16 packets. This is what ping is doing for example, by setting SNDBUF to 324 bytes rounded up to 2kB by the kernel. The problem is that there is no documented method to force a specific packet to emit an interrupt (eg: the last of the ring) nor is it possible to make the NIC emit an interrupt after a given delay. In this case, it causes trouble, because when ping sends packets over its raw socket, the few first packets leave the system, and the first 15 packets will be emitted without an IRQ being generated, so without the skbs being freed. And since the socket's buffer is small, there's no way to reach that amount of packets, and the ping ends up with "send: no buffer available" after sending 6 packets. Running with 3 instances of ping in parallel is enough to hide the problem, because with 6 packets per instance, that's 18 packets total, which is enough to grant a Tx interrupt before all are sent. The original driver in the LSP kernel worked around this design flaw by using a software timer to clean up the Tx descriptors. This timer was slow and caused terrible network performance on some Tx-bound workloads (such as routing) but was enough to make tools like ping work correctly. Instead here, we simply set the packet counts before interrupt to 1. This ensures that each packet sent will produce an interrupt. NAPI takes care of coalescing interrupts since the interrupt is disabled once generated. No measurable performance impact nor CPU usage were observed on small nor large packets, including when saturating the link on Tx, and this fixes tools like ping which rely on too small a send buffer. If one wants to increase this value for certain workloads where it is safe to do so, "ethtool -C $dev tx-frames" will override this default setting. This fix needs to be applied to stable kernels starting with 3.10. Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | mips: bpf: Fix broken BPF_MODDenis Kirjanov2014-12-08
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | Remove optimize_div() from BPF_MOD | BPF_K case since we don't know the dividend and fix the emit_mod() by reading the mod operation result from HI register Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Reviewed-by: Markos Chandras <markos.chandras@imgtec.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | openvswitch: Fix flow mask validation.Pravin B Shelar2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following patch fixes typo in the flow validation. This prevented installation of ARP and IPv6 flows. Fixes: 19e7a3df72 ("openvswitch: Fix NDP flow mask validation") Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | uapi: fix to export linux/vm_sockets.hMasahiro Yamada2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | A typo "header=y" was introduced by commit 7071cf7fc435 (uapi: add missing network related headers to kbuild). Signed-off-by: Masahiro Yamada <yamada.m@jp.panasonic.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sky2: avoid pci write posting after disabling irqsLino Sanfilippo2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In sky2_change_mtu setting B0_IMSK to 0 may be delayed due to PCI write posting which could result in irqs being still active when synchronize_irq is called. Since we are not prepared to handle any further irqs after synchronize_irq (our resources are freed after that) force the write by a consecutive read from the same register. Similar situation in sky2_all_down: Here we disabled irqs by a write to B0_IMSK but did not ensure that this write took place before synchronize_irq. Fix that too. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | skge: Unmask interrupts in case of spurious interruptsLino Sanfilippo2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | In case of a spurious interrupt dont forget to reenable the interrupts that have been masked by reading the interrupt source register. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Acked-by: Mirko Lindner <mlindner@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pxa168: close race between napi and irq activationLino Sanfilippo2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In pxa168_eth_open() the irqs are enabled before napi. This opens a tiny time window in which the irq handler is processed, disables irqs but then is not able to schedule the not yet activated napi, leaving irqs disabled forever (since irqs are reenabled in napi poll function). Fix this race by activating napi before irqs are activated. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: x86: fix epilogue generation for eBPF programsAlexei Starovoitov2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | classic BPF has a restriction that last insn is always BPF_RET. eBPF doesn't have BPF_RET instruction and this restriction. It has BPF_EXIT insn which can appear anywhere in the program one or more times and it doesn't have to be last insn. Fix eBPF JIT to emit epilogue when first BPF_EXIT is seen and all other BPF_EXIT instructions will be emitted as jump. Since jump offset to epilogue is computed as: jmp_offset = ctx->cleanup_addr - addrs[i] we need to change type of cleanup_addr to signed to compute the offset as: (long long) ((int)20 - (int)30) instead of: (long long) ((unsigned int)20 - (int)30) Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | gre: Set inner mac header in gro completeTom Herbert2014-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set the inner mac header to point to the GRE payload when doing GRO. This is needed if we proceed to send the packet through GRE GSO which now uses the inner mac header instead of inner network header to determine the length of encapsulation headers. Fixes: 14051f0452a2 ("gre: Use inner mac length when computing tunnel length") Reported-by: Wolfgang Walter <linux@stwm.de> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | context_tracking: Restore previous state in schedule_userAndy Lutomirski2014-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It appears that some SCHEDULE_USER (asm for schedule_user) callers in arch/x86/kernel/entry_64.S are called from RCU kernel context, and schedule_user will return in RCU user context. This causes RCU warnings and possible failures. This is intended to be a minimal fix suitable for 3.18. Reported-and-tested-by: Dave Jones <davej@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | Merge branch 'i2c/for-current' of ↵Linus Torvalds2014-12-03
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c bugfixes from Wolfram Sang: "A few driver bugfixes for 3.18" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: omap: fix i207 errata handling i2c: designware: prevent early stop on TX FIFO empty i2c: omap: fix NACK and Arbitration Lost irq handling