aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* lan78xx: Fix to handle hard_header_len updateNisar Sayed2017-08-02
| | | | | | | | | | | | Fix to handle hard_header_len update When ifconfig up/down sequence is initiated hard_header_len get updated incrementally for each ifconfig up /down sequence, this leads invalid hard_header_len, moving to lan78xx_bind to have one time update of hard_header_len addresses the issue. Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* lan78xx: USB fast connect/disconnect crash fixNisar Sayed2017-08-02
| | | | | | | | | | | | | | | USB fast connect/disconnect crash fix When USB plugged/unplugged at fast rate, lan78xx_mdio_init() in lan78xx_bind() failing case is not handled. Whenever lan78xx_mdio_init() failed, dev->mdiobus will be freed, however since lan78xx_bind() not consider as error and try to proceed for further initialization in lan78xx_probe() which leads system hung/crash. Also when register_netdev() failed, netdev is freed without calling lan78xx_unbind(). Hence halting the failed cases right manner fixes the system crash/hung issue. Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'net-Fix-64-bit-statistics-seqcount-init'David S. Miller2017-08-01
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Florian Fainelli says: ==================== drivers: net: Fix 64-bit statistics seqcount init This patch series fixes a bunch of drivers to have their 64-bit statistics seqcount cookie be initialized correctly. Most of these drivers (except b44, gtp) are probably used on 64-bit only hosts and so the lockdep splat might have never been seen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipvlan: Fix 64-bit statistics seqcount initializationFlorian Fainelli2017-08-01
| | | | | | | | | | | | | | | | | | | | On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that by using the proper helper function: netdev_alloc_pcpu_stats(). Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netvsc: Initialize 64-bit stats seqcountFlorian Fainelli2017-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. In commit 6c80f3fc2398 ("netvsc: report per-channel stats in ethtool statistics") netdev_alloc_pcpu_stats() was removed in favor of open-coding the 64-bits statistics, except that u64_stats_init() was missed. Fixes: 6c80f3fc2398 ("netvsc: report per-channel stats in ethtool statistics") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * gtp: Initialize 64-bit per-cpu stats correctlyFlorian Fainelli2017-08-01
| | | | | | | | | | | | | | | | | | | | | | On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that by using netdev_alloc_pcpu_stats() instead of an open coded allocation. Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * nfp: Initialize RX and TX ring 64-bit stats seqcountsFlorian Fainelli2017-08-01
| | | | | | | | | | | | | | | | | | | | | | On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: 4c3523623dc0 ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ixgbe: Initialize 64-bit stats seqcountsFlorian Fainelli2017-08-01
| | | | | | | | | | | | | | | | | | | | On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: 4197aa7bb818 ("ixgbevf: provide 64 bit statistics") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * i40e: Initialize 64-bit statistics TX ring seqcountFlorian Fainelli2017-08-01
| | | | | | | | | | | | | | | | | | | | On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: 980e9b118642 ("i40e: Add support for 64 bit netstats") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * b44: Initialize 64-bit stats seqcountFlorian Fainelli2017-08-01
|/ | | | | | | | | | | On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: eeda8585522b ("b44: add 64 bit stats") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* gue: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDPK. Den2017-08-01
| | | | | | | | | | | | | | | | | | | | | In the case that GRO is turned on and the original received packet is CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last csum-unnecessary point, which for instance could occur if the packet comes from another Linux guest on the same Linux host, we have to do either remcsum_adjust or set up CHECKSUM_PARTIAL again with its csum_start properly reset considering RCO. However, since b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags in GRO") that barrier in such case could be skipped if GRO turned on, hence we pass over it and the inner L4 validation mistakenly reckons it as a bad csum. This patch makes remcsum_offload being reset at the same time of GRO remcsum cleanup, so as to make it work in such case as before. Fixes: b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags in GRO") Signed-off-by: Koichiro Den <den@klaipeden.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* vxlan: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDPK. Den2017-08-01
| | | | | | | | | | | | | | | | | | | | | In the case that GRO is turned on and the original received packet is CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last csum-unnecessary point, which for instance could occur if the packet comes from another Linux guest on the same Linux host, we have to do either remcsum_adjust or set up CHECKSUM_PARTIAL again with its csum_start properly reset considering RCO. However, since b7fe10e5ebac("gro: Fix remcsum offload to deal with frags in GRO") that barrier in such case could be skipped if GRO turned on, hence we pass over it and the inner L4 validation mistakenly reckons it as a bad csum. This patch makes remcsum_offload being reset at the same time of GRO remcsum cleanup, so as to make it work in such case as before. Fixes: b7fe10e5ebac ("gro: Fix remcsum offload to deal with frags in GRO") Signed-off-by: Koichiro Den <den@klaipeden.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Cipso: cipso_v4_optptr enter infinite loopyujuan.qi2017-08-01
| | | | | | | | | | in for(),if((optlen > 0) && (optptr[1] == 0)), enter infinite loop. Test: receive a packet which the ip length > 20 and the first byte of ip option is 0, produce this issue Signed-off-by: yujuan.qi <yujuan.qi@mediatek.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'ethernet-ti-cpts-fix-tx-timestamping-timeout'David S. Miller2017-08-01
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Grygorii Strashko says: ==================== net: ethernet: ti: cpts: fix tx timestamping timeout With the low Ethernet connection speed cpdma notification about packet processing can be received before CPTS TX timestamp event, which is set when packet actually left CPSW while cpdma notification is sent when packet pushed in CPSW fifo. As result, when connection is slow and CPU is fast enough TX timestamping is not working properly. Issue was discovered using timestamping tool on am57x boards with Ethernet link speed forced to 100M and on am335x-evm with Ethernet link speed forced to 10M. Patch3 - This series fixes it by introducing TX SKB queue to store PTP SKBs for which Ethernet Transmit Event hasn't been received yet and then re-check this queue with new Ethernet Transmit Events by scheduling CPTS overflow work more often until TX SKB queue is not empty. Patch 1,2 - As CPTS overflow work is time critical task it important to ensure that its scheduling is not delayed. Unfortunately, There could be significant delay in CPTS work schedule under high system load and on -RT which could cause CPTS misbehavior due to internal counter overflow and there is no way to tune CPTS overflow work execution policy and priority manually. The kthread_worker can be used instead of workqueues, as it creates separate named kthread for each worker and its its execution policy and priority can be configured using chrt tool. Instead of modifying CPTS driver itself it was proposed to it was proposed to add PTP auxiliary worker to the PHC subsystem [1], so other drivers can benefit from this feature also. [1] https://www.spinics.net/lists/netdev/msg445392.html changes in v4: - fixed memleak in ptp_clock_register() - undocumented change in cpts_find_ts() moved to separate patch (minor fix) changes in v3: - patch 1: added proper error handling in ptp_clock_register. minor comments applied. changes in v2: - added PTP auxiliary worker to the PHC subsystem Links v3: https://www.spinics.net/lists/netdev/msg446058.html v2: https://www.spinics.net/lists/netdev/msg445859.html v1: https://www.spinics.net/lists/netdev/msg445387.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: ethernet: ti: cpts: fix fifo read in cpts_find_tsGrygorii Strashko2017-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now the call chain cpts_find_ts() |- cpts_fifo_read(cpts, CPTS_EV_PUSH) will stop reading CPTS FIFO if PUSH event is found. But this is not expected and CPTS FIFI should be completely drained here. This is most probably copy-paste error and it has no negative impact as CPTS_EV_PUSH should not be present in FIFO without TS_PUSH request and cpts_systim_read() and cpts_find_ts() synchronized by spin_lock. Correct above by calling cpts_fifo_read() with -1 parameter, so it will read all CPTS event from FIFO. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: ethernet: ti: cpts: fix tx timestamping timeoutGrygorii Strashko2017-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the low speed Ethernet connection CPDMA notification about packet processing can be received before CPTS TX timestamp event, which is set when packet actually left CPSW while cpdma notification is sent when packet pushed in CPSW fifo. As result, when connection is slow and CPU is fast enough TX timestamping is not working properly. Fix it, by introducing TX SKB queue to store PTP SKBs for which Ethernet Transmit Event hasn't been received yet and then re-check this queue with new Ethernet Transmit Events by scheduling CPTS overflow work more often (every 1 jiffies) until TX SKB queue is not empty. Side effect of this change is: - User space tools require to take into account possible delay in TX timestamp processing (for example ptp4l works with tx_timestamp_timeout=400 under net traffic and tx_timestamp_timeout=25 in idle). Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: ethernet: ti: cpts: convert to use ptp auxiliary workerGrygorii Strashko2017-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There could be significant delay in CPTS work schedule under high system load and on -RT which could cause CPTS misbehavior due to internal counter overflow. Usage of own kthread_worker allows to avoid such kind of issues and makes it possible to tune priority of CPTS kthread_worker thread on -RT (thread name "cpts"). Hence, the CPTS driver is converted to use PTP auxiliary worker as PHC subsystem implements such functionality in a generic way now. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ptp: introduce ptp auxiliary workerGrygorii Strashko2017-08-01
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many PTP drivers required to perform some asynchronous or periodic work, like periodically handling PHC counter overflow or handle delayed timestamp for RX/TX network packets. In most of the cases, such work is implemented using workqueues. Unfortunately, Kernel workqueues might introduce significant delay in work scheduling under high system load and on -RT, which could cause misbehavior of PTP drivers due to internal counter overflow, for example, and there is no way to tune its execution policy and priority manuallly. Hence, The kthread_worker can be used insted of workqueues, as it create separte named kthread for each worker and its its execution policy and priority can be configured using chrt tool. This prblem was reported for two drivers TI CPSW CPTS and dp83640, so instead of modifying each of these driver it was proposed to add PTP auxiliary worker to the PHC subsystem. The patch adds PTP auxiliary worker in PHC subsystem using kthread_worker and kthread_delayed_work and introduces two new PHC subsystem APIs: - long (*do_aux_work)(struct ptp_clock_info *ptp) callback in ptp_clock_info structure, which driver should assign if it require to perform asynchronous or periodic work. Driver should return the delay of the PTP next auxiliary work scheduling time (>=0) or negative value in case further scheduling is not required. - int ptp_schedule_worker(struct ptp_clock *ptp, unsigned long delay) which allows schedule PTP auxiliary work. The name of kthread_worker thread corresponds PTP PHC device name "ptp%d". Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2017-08-01
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Handle notifier registry failures properly in tun/tap driver, from Tonghao Zhang. 2) Fix bpf verifier handling of subtraction bounds and add a testcase for this, from Edward Cree. 3) Increase reset timeout in ftgmac100 driver, from Ben Herrenschmidt. 4) Fix use after free in prd_retire_rx_blk_timer_exired() in AF_PACKET, from Cong Wang. 5) Fix SElinux regression due to recent UDP optimizations, from Paolo Abeni. 6) We accidently increment IPSTATS_MIB_FRAGFAILS in the ipv6 code paths, fix from Stefano Brivio. 7) Fix some mem leaks in dccp, from Xin Long. 8) Adjust MDIO_BUS kconfig deps to avoid build errors, from Arnd Bergmann. 9) Mac address length check and buffer size fixes from Cong Wang. 10) Don't leak sockets in ipv6 udp early demux, from Paolo Abeni. 11) Fix return value when copy_from_user() fails in bpf_prog_get_info_by_fd(), from Daniel Borkmann. 12) Handle PHY_HALTED properly in phy library state machine, from Florian Fainelli. 13) Fix OOPS in fib_sync_down_dev(), from Ido Schimmel. 14) Fix truesize calculation in virtio_net which led to performance regressions, from Michael S Tsirkin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits) samples/bpf: fix bpf tunnel cleanup udp6: fix jumbogram reception ppp: Fix a scheduling-while-atomic bug in del_chan Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config" virtio_net: fix truesize for mergeable buffers mv643xx_eth: fix of_irq_to_resource() error check MAINTAINERS: Add more files to the PHY LIBRARY section ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev() net: phy: Correctly process PHY_HALTED in phy_stop_machine() sunhme: fix up GREG_STAT and GREG_IMASK register offsets bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len tcp: avoid bogus gcc-7 array-bounds warning net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt" bpf: don't indicate success when copy_from_user fails udp6: fix socket leak on early demux net: thunderx: Fix BGX transmit stall due to underflow Revert "vhost: cache used event for better performance" team: use a larger struct for mac address net: check dev->addr_len for dev_set_mac_address() phy: bcm-ns-usb3: fix MDIO_BUS dependency ...
| * samples/bpf: fix bpf tunnel cleanupWilliam Tu2017-08-01
| | | | | | | | | | | | | | | | | | test_tunnel_bpf.sh fails to remove the vxlan11 tunnel device, causing the next geneve tunnelling test case fails. In addition, the geneve reserved bit in tcbpf2_kern.c should be zero, according to the RFC. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * udp6: fix jumbogram receptionPaolo Abeni2017-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 67a51780aebb ("ipv6: udp: leverage scratch area helpers") udp6_recvmsg() read the skb len from the scratch area, to avoid a cache miss. But the UDP6 rx path support RFC 2675 UDPv6 jumbograms, and their length exceeds the 16 bits available in the scratch area. As a side effect the length returned by recvmsg() is: <ingress datagram len> % (1<<16) This commit addresses the issue allocating one more bit in the IP6CB flags field and setting it for incoming jumbograms. Such field is still in the first cacheline, so at recvmsg() time we can check it and fallback to access skb->len if required, without a measurable overhead. Fixes: 67a51780aebb ("ipv6: udp: leverage scratch area helpers") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ppp: Fix a scheduling-while-atomic bug in del_chanGao Feng2017-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would trigger this bug when __sk_free is invoked in atomic context, because of the call path pptp_sock_destruct->del_chan->synchronize_rcu. Now move the synchronize_rcu to pptp_release from del_chan. This is the only one case which would free the sock and need the synchronize_rcu. The following is the panic I met with kernel 3.3.8, but this issue should exist in current kernel too according to the codes. BUG: scheduling while atomic __schedule_bug+0x5e/0x64 __schedule+0x55/0x580 ? ppp_unregister_channel+0x1cd5/0x1de0 [ppp_generic] ? dev_hard_start_xmit+0x423/0x530 ? sch_direct_xmit+0x73/0x170 __cond_resched+0x16/0x30 _cond_resched+0x22/0x30 wait_for_common+0x18/0x110 ? call_rcu_bh+0x10/0x10 wait_for_completion+0x12/0x20 wait_rcu_gp+0x34/0x40 ? wait_rcu_gp+0x40/0x40 synchronize_sched+0x1e/0x20 0xf8417298 0xf8417484 ? sock_queue_rcv_skb+0x109/0x130 __sk_free+0x16/0x110 ? udp_queue_rcv_skb+0x1f2/0x290 sk_free+0x16/0x20 __udp4_lib_rcv+0x3b8/0x650 Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config"Florian Fainelli2017-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 28b45910ccda ("net: bcmgenet: Remove init parameter from bcmgenet_mii_config") because in the process of moving from dev_info() to dev_info_once() we essentially lost the helpful printed messages once the second instance of the driver is loaded. dev_info_once() does not actually print the message once per device instance, but once period. Fixes: 28b45910ccda ("net: bcmgenet: Remove init parameter from bcmgenet_mii_config") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Doug Berger <opendmb@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * virtio_net: fix truesize for mergeable buffersMichael S. Tsirkin2017-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Seth Forshee noticed a performance degradation with some workloads. This turns out to be due to packet drops. Euan Kemp noticed that this is because we drop all packets where length exceeds the truesize, but for some packets we add in extra memory without updating the truesize. This in turn was kept around unchanged from ab7db91705e95 ("virtio-net: auto-tune mergeable rx buffer size for improved performance"). That commit had an internal reason not to account for the extra space: not enough bits to do it. No longer true so let's account for the allocated length exactly. Many thanks to Seth Forshee for the report and bisecting and Euan Kemp for debugging the issue. Fixes: 680557cf79f8 ("virtio_net: rework mergeable buffer handling") Reported-by: Euan Kemp <euan.kemp@coreos.com> Tested-by: Euan Kemp <euan.kemp@coreos.com> Reported-by: Seth Forshee <seth.forshee@canonical.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mv643xx_eth: fix of_irq_to_resource() error checkSergei Shtylyov2017-07-31
| | | | | | | | | | | | | | | | | | | | of_irq_to_resource() has recently been fixed to return negative error #'s along with 0 in case of failure, however the Marvell MV643xx Ethernet driver still only regards 0 as invalid IRQ -- fix it up. Fixes: 7a4228bbff76 ("of: irq: use of_irq_get() in of_irq_to_resource()") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * MAINTAINERS: Add more files to the PHY LIBRARY sectionFlorian Fainelli2017-07-31
| | | | | | | | | | | | | | | | | | Include missing files that are provided by, used, or directly maintained within the PHY LIBRARY, this include uapi header, header files used by Device Tree code etc. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev()Ido Schimmel2017-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Michał reported a NULL pointer deref during fib_sync_down_dev() when unregistering a netdevice. The problem is that we don't check for 'in_dev' being NULL, which can happen in very specific cases. Usually routes are flushed upon NETDEV_DOWN sent in either the netdev or the inetaddr notification chains. However, if an interface isn't configured with any IP address, then it's possible for host routes to be flushed following NETDEV_UNREGISTER, after NULLing dev->ip_ptr in inetdev_destroy(). To reproduce: $ ip link add type dummy $ ip route add local 1.1.1.0/24 dev dummy0 $ ip link del dev dummy0 Fix this by checking for the presence of 'in_dev' before referencing it. Fixes: 982acb97560c ("ipv4: fib: Notify about nexthop status changes") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: Correctly process PHY_HALTED in phy_stop_machine()Florian Fainelli2017-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Marc reported that he was not getting the PHY library adjust_link() callback function to run when calling phy_stop() + phy_disconnect() which does not indeed happen because we set the state machine to PHY_HALTED but we don't get to run it to process this state past that point. Fix this with a synchronous call to phy_state_machine() in order to have the state machine actually act on PHY_HALTED, set the PHY device's link down, turn the network device's carrier off and finally call the adjust_link() function. Reported-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Fixes: a390d1f379cf ("phylib: convert state_queue work to delayed_work") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sunhme: fix up GREG_STAT and GREG_IMASK register offsetsMark Cave-Ayland2017-07-31
| | | | | | | | | | | | | | Update the values to match those from the STP2002QFP documentation. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_lenDaniel Borkmann2017-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bpf_prog_size(prog->len) is not the correct length we want to dump back to user space. The code in bpf_prog_get_info_by_fd() uses this to copy prog->insnsi to user space, but bpf_prog_size(prog->len) also includes the size of struct bpf_prog itself plus program instructions and is usually used either in context of accounting or for bpf_prog_alloc() et al, thus we copy out of bounds in bpf_prog_get_info_by_fd() potentially. Use the correct bpf_prog_insn_size() instead. Fixes: 1e2709769086 ("bpf: Add BPF_OBJ_GET_INFO_BY_FD") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: avoid bogus gcc-7 array-bounds warningArnd Bergmann2017-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using CONFIG_UBSAN_SANITIZE_ALL, the TCP code produces a false-positive warning: net/ipv4/tcp_output.c: In function 'tcp_connect': net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds [-Werror=array-bounds] tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start; ^~ net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds [-Werror=array-bounds] tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~ I have opened a gcc bug for this, but distros have already shipped compilers with this problem, and it's not clear yet whether there is a way for gcc to avoid the warning. As the problem is related to the bitfield access, this introduces a temporary variable to store the old enum value. I did not notice this warning earlier, since UBSAN is disabled when building with COMPILE_TEST, and that was always turned on in both allmodconfig and randconfig tests. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81601 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge tag 'wireless-drivers-for-davem-2017-07-28' of ↵David S. Miller2017-07-29
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.13 Two fixes for for brcmfmac, the crash was reported by two people already so it's a high priority fix. brcmfmac * fix a crash in skb headroom handling in v4.13-rc1 * fix a memory leak due to a merge error in v4.6 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * brcmfmac: fix memleak due to calling brcmf_sdiod_sgtable_alloc() twiceArend Van Spriel2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to a bugfix in wireless tree and the commit mentioned below a merge was needed which went haywire. So the submitted change resulted in the function brcmf_sdiod_sgtable_alloc() being called twice during the probe thus leaking the memory of the first call. Cc: stable@vger.kernel.org # 4.6.x Fixes: 4d7928959832 ("brcmfmac: switch to new platform data") Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| | * brcmfmac: Don't grow SKB by negative sizeDaniel Stone2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit to rework the headroom check in start_xmit() now calls pxskb_expand_head() unconditionally if the header is CoW. Unfortunately, it does so with the delta between the extant headroom and the header length, which may be negative if there is already sufficient headroom. pskb_expand_head() does allow for size being 0, in which case it just copies, so clamp the header delta to zero. Opening Chrome (and all my tabs) on a PCIE device was enough to reliably hit this. Fixes: 270a6c1f65fe ("brcmfmac: rework headroom check in .start_xmit()") Signed-off-by: Daniel Stone <daniels@collabora.com> Cc: Arend Van Spriel <arend.vanspriel@broadcom.com> Cc: James Hughes <james.hughes@raspberrypi.org> Cc: Hante Meuleman <hante.meuleman@broadcom.com> Cc: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Cc: Franky Lin <franky.lin@broadcom.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
| * | net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt"Colin Ian King2017-07-29
| | | | | | | | | | | | | | | | | | | | | Trivial fix to spelling mistake in printk message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: don't indicate success when copy_from_user failsDaniel Borkmann2017-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | err in bpf_prog_get_info_by_fd() still holds 0 at that time from prior check_uarg_tail_zero() check. Explicitly return -EFAULT instead, so user space can be notified of buggy behavior. Fixes: 1e2709769086 ("bpf: Add BPF_OBJ_GET_INFO_BY_FD") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | udp6: fix socket leak on early demuxPaolo Abeni2017-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an early demuxed packet reaches __udp6_lib_lookup_skb(), the sk reference is retrieved and used, but the relevant reference count is leaked and the socket destructor is never called. Beyond leaking the sk memory, if there are pending UDP packets in the receive queue, even the related accounted memory is leaked. In the long run, this will cause persistent forward allocation errors and no UDP skbs (both ipv4 and ipv6) will be able to reach the user-space. Fix this by explicitly accessing the early demux reference before the lookup, and properly decreasing the socket reference count after usage. Also drop the skb_steal_sock() in __udp6_lib_lookup_skb(), and the now obsoleted comment about "socket cache". The newly added code is derived from the current ipv4 code for the similar path. v1 -> v2: fixed the __udp6_lib_rcv() return code for resubmission, as suggested by Eric Reported-by: Sam Edwards <CFSworks@gmail.com> Reported-by: Marc Haber <mh+netdev@zugschlus.de> Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: thunderx: Fix BGX transmit stall due to underflowSunil Goutham2017-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For SGMII/RGMII/QSGMII interfaces when physical link goes down while traffic is high is resulting in underflow condition being set on that specific BGX's LMAC. Which assets a backpresure and VNIC stops transmitting packets. This is due to BGX being disabled in link status change callback while packet is in transit. This patch fixes this issue by not disabling BGX but instead just disables packet Rx and Tx. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Revert "vhost: cache used event for better performance"Jason Wang2017-07-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 809ecb9bca6a9424ccd392d67e368160f8b76c92. Since it was reported to break vhost_net. We want to cache used event and use it to check for notification. The assumption was that guest won't move the event idx back, but this could happen in fact when 16 bit index wraps around after 64K entries. Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge tag 'mlx5-fixes-2017-07-27-V2' of ↵David S. Miller2017-07-29
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2017-07-27 This series contains some misc fixes to the mlx5 driver. Please pull and let me know if there's any problem. V1->V2: - removed redundant braces for -stable: 4.7 net/mlx5: Fix command bad flow on command entry allocation failure 4.9 net/mlx5: Consider tx_enabled in all modes on remap net/mlx5e: Fix outer_header_zero() check size 4.10 net/mlx5: Fix mlx5_add_flow_rules call with correct num of dests 4.11 net/mlx5: Fix mlx5_ifc_mtpps_reg_bits structure size net/mlx5e: Add field select to MTPPS register net/mlx5e: Fix broken disable 1PPS flow net/mlx5e: Change 1PPS out scheme net/mlx5e: Add missing support for PTP_CLK_REQ_PPS request net/mlx5e: Fix wrong delay calculation for overflow check scheduling net/mlx5e: Schedule overflow check work to mlx5e workqueue 4.12 net/mlx5: Fix command completion after timeout access invalid structure net/mlx5e: IPoIB, Modify add/remove underlay QPN flows I hope this is not too much, but most of the patches do apply cleanly on -stable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/mlx5: Fix mlx5_add_flow_rules call with correct num of destsPaul Blakey2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding ethtool steering rule with action DISCARD we wrongly pass a NULL dest with dest_num 1 to mlx5_add_flow_rules(). What this error seems to have caused is sending VPORT 0 (MLX5_FLOW_DESTINATION_TYPE_VPORT) as the fte dest instead of no dests. We have fte action correctly set to DROP so it might been ignored anyways. To reproduce use: # sudo ethtool --config-nfc <dev> flow-type ether \ dst aa:bb:cc:dd:ee:ff action -1 Fixes: 74491de93712 ("net/mlx5: Add multi dest support") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Schedule overflow check work to mlx5e workqueueEugenia Emantayev2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is done in order to ensure that work will not run after the cleanup. Fixes: ef9814deafd0 ('net/mlx5e: Add HW timestamping (TS) support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Fix wrong delay calculation for overflow check schedulingEugenia Emantayev2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The overflow_period is calculated in seconds. In order to use it for delayed work scheduling translation to jiffies is needed. Fixes: ef9814deafd0 ('net/mlx5e: Add HW timestamping (TS) support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Add missing support for PTP_CLK_REQ_PPS requestEugenia Emantayev2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the missing option to enable the PTP_CLK_PPS function. In this case pin should be configured as 1PPS IN first and then it will be connected to PPS mechanism. Events will be reported as PTP_CLOCK_PPSUSR events to relevant sysfs. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Change 1PPS out schemeEugenia Emantayev2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to fix the drift in 1PPS out need to adjust the next pulse. On each 1PPS out falling edge driver gets the event, then the event handler adjusts the next pulse starting time. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Fix broken disable 1PPS flowEugenia Emantayev2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Need to disable the MTPPS and unsubscribe from the pulse events when user disables the 1PPS functionality. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Add field select to MTPPS registerEugenia Emantayev2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to mark relevant fields while setting the MTPPS register add field select. Otherwise it can cause a misconfiguration in firmware. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5: Fix mlx5_ifc_mtpps_reg_bits structure sizeEugenia Emantayev2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix miscalculation in reserved_at_1a0 field. Fixes: ee7f12205abc ('net/mlx5e: Implement 1PPS support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: Fix outer_header_zero() check sizeIlan Tayari2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | outer_header_zero() routine checks if the outer_headers match of a flow-table entry are all zero. This function uses the size of whole fte_match_param, instead of just the outer_headers member, causing failure to detect all-zeros if any other members of the fte_match_param are non-zero. Use the correct size for zero check. Fixes: 6dc6071cfcde ("net/mlx5e: Add ethtool flow steering support") Signed-off-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| | * | net/mlx5e: IPoIB, Modify add/remove underlay QPN flowsAlex Vesker2017-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On interface remove, the clean-up was done incorrectly causing an error in the log: "SET_FLOW_TABLE_ROOT(0x92f) op_mod(0x0) failed...syndrome (0x7e9f14)" This was caused by the following flow: -ndo_uninit: Move QP state to RST (this disconnects the QP from FT), the QP cannot be attached to any FT unless it is in RTS. -mlx5_rdma_netdev_free: cleanup_rx: Destroy FT cleanup_tx: Destroy QP and remove QPN from FT This caused a problem when destroying current FT we tried to re-attach the QP to the next FT which is not needed. The correct flow is: -mlx5_rdma_netdev_free: cleanup_rx: remove QPN from FT & Destroy FT cleanup_tx: Destroy QP Fixes: 508541146af1 ("net/mlx5: Use underlay QPN from the root name space") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>