aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* neigh: convert parms to an arrayJiri Pirko2013-12-09
| | | | | | | | | This patch converts the neigh param members to an array. This allows easier manipulation which will be needed later on to provide better management of default values. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'phy_reset'David S. Miller2013-12-09
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Florian Fainelli says: ==================== net: phy: consolidate PHY reset This patchset consolidates the PHY reset through the MII BMCR register by using a central place were this is done. This patchset resumes the work Kyle Moffett started here: https://lkml.org/lkml/2011/10/20/301 Note that at this point, drivers doing funky things after issuing a PHY reset using phy_init_hw() will still suffer from PHY state machine problems, this will be taken care of later on. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: sh_eth: do not issue a wild PHY reset through BMCRFlorian Fainelli2013-12-09
| | | | | | | | | | | | | | | | | | | | The sh_eth driver issues an uncontrolled PHY reset through the MII register BMCR but fails to wait for the reset to complete, and will also implicitely wipe out all possible PHY fixups applied. Use phy_init_hw() which remedies both problems. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: tc35815: use phy_init_hw for PHY resetFlorian Fainelli2013-12-09
| | | | | | | | | | | | | | | | | | Instead of open-coding the PHY reset through MII BMCR, use phy_init_hw() which does that for us and also makes sure that any PHY specific fixups are applied. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: pxa168_eth: use phy_init_hw for PHY resetFlorian Fainelli2013-12-09
| | | | | | | | | | | | | | | | | | | | | | Instead of open-coding a PHY reset through the MII BMCR register, use phy_init_hw() which does this for us and ensures that PHY device fixups are also applied. We also remove a call to ethernet_phy_reset() which is now unncessary since phy_attach() calls phy_attach_direct() which in turns calls phy_init_hw(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mv643xx_eth: use phy_init_hw to reset PHYFlorian Fainelli2013-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of open-coding a PHY reset through the MII BMCR register, use phy_init_hw() which does that for us and will also make sure that PHY fixups are applied if required. We also remove a call to phy_reset() due to the following sequence of calls in the driver: phy_scan() -> phy_connect() -> phy_connect_direct() -> phy_attach_direct() -> phy_init_hw() and we only have a call to phy_init() after phy_scan(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: consolidate PHY reset in phy_init_hw()Florian Fainelli2013-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are quite a lot of drivers touching a PHY device MII_BMCR register to reset the PHY without taking care of: 1) ensuring that BMCR_RESET is cleared after a given timeout 2) the PHY state machine resuming to the proper state and re-applying potentially changed settings such as auto-negotiation Introduce phy_poll_reset() which will take care of polling the MII_BMCR for the BMCR_RESET bit to be cleared after a given timeout or return a timeout error code. In order to make sure the PHY is in a correct state, phy_init_hw() first issues a software reset through MII_BMCR and then applies any fixups. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: bfin_mac: do not reset PHY after phy_start()Florian Fainelli2013-12-09
| | | | | | | | | | | | | | | | | | | | The PHY is already reset during driver probing, and this manual reset after calling phy_start() will wipe out board-specific PHY fixups and driver specific configuration initialization. Remove that explicit PHY reset. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: greth: use phy_read_status()Florian Fainelli2013-12-09
| | | | | | | | | | | | | | | | | | | | | | In case the greth driver is bound to anything but the Generic PHY driver or the PHY has a special read_status callback implemented, unexpected things will happen. Make sure we that we use phy_read_status() which does the proper abstraction of calling the driver specific read_status() callback for a given PHY. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: use phy_init_hw instead of open-coding itFlorian Fainelli2013-12-09
| | | | | | | | | | | | | | | | | | Use phy_init_hw() instead of open-coding it in phy_mii_ioctl(), this improves consistenty and makes sure that we will not duplicate the same routine somewhere else. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: report link partner features through ethtoolFlorian Fainelli2013-12-09
|/ | | | | | | | | | The PHY library already reads the MII_STAT1000 and MII_LPA registers in genphy_read_status(), so extend it to also populate the PHY device link partner advertised features such that we can feed this back into ethtool when asked for it in phy_ethtool_gset(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tun: remove useless codes in tun_chr_aio_read() and tun_recvmsg()Zhi Yong Wu2013-12-09
| | | | | | | | By checking related codes, it is impossible that ret > len or total_len, so we should remove some useless codes in both above functions. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* macvtap: remove useless codes in macvtap_aio_read() and macvtap_recvmsg()Zhi Yong Wu2013-12-09
| | | | | | | | By checking related codes, it is impossible that ret > len or total_len, so we should remove some useless coeds in both above functions. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xen-netback: improve guest-receive-side flow controlPaul Durrant2013-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The way that flow control works without this patch is that, in start_xmit() the code uses xenvif_count_skb_slots() to predict how many slots xenvif_gop_skb() will consume and then adds this to a 'req_cons_peek' counter which it then uses to determine if the shared ring has that amount of space available by checking whether 'req_prod' has passed that value. If the ring doesn't have space the tx queue is stopped. xenvif_gop_skb() will then consume slots and update 'req_cons' and issue responses, updating 'rsp_prod' as it goes. The frontend will consume those responses and post new requests, by updating req_prod. So, req_prod chases req_cons which chases rsp_prod, and can never exceed that value. Thus if xenvif_count_skb_slots() ever returns a number of slots greater than xenvif_gop_skb() uses, req_cons_peek will get to a value that req_prod cannot possibly achieve (since it's limited by the 'real' req_cons) and, if this happens enough times, req_cons_peek gets more than a ring size ahead of req_cons and the tx queue then remains stopped forever waiting for an unachievable amount of space to become available in the ring. Having two routines trying to calculate the same value is always going to be fragile, so this patch does away with that. All we essentially need to do is make sure that we have 'enough stuff' on our internal queue without letting it build up uncontrollably. So start_xmit() makes a cheap optimistic check of how much space is needed for an skb and only turns the queue off if that is unachievable. net_rx_action() is the place where we could do with an accurate predicition but, since that has proven tricky to calculate, a cheap worse-case (but not too bad) estimate is all we really need since the only thing we *must* prevent is xenvif_gop_skb() consuming more slots than are available. Without this patch I can trivially stall netback permanently by just doing a large guest to guest file copy between two Windows Server 2008R2 VMs on a single host. Patch tested with frontends in: - Windows Server 2008R2 - CentOS 6.0 - Debian Squeeze - Debian Wheezy - SLES11 Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Annie Li <annie.li@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tipc: remove interface state mirroring in bearerErik Hugne2013-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct 'tipc_bearer' is a generic representation of the underlying media type, and exists in a one-to-one relationship to each interface TIPC is using. The struct contains a 'blocked' flag that mirrors the operational and execution state of the represented interface, and is updated through notification calls from the latter. The users of tipc_bearer are checking this flag before each attempt to send a packet via the interface. This state mirroring serves no purpose in the current code base. TIPC links will not discover a media failure any faster through this mechanism, and in reality the flag only adds overhead at packet sending and reception. Furthermore, the fact that the flag needs to be protected by a spinlock aggregated into tipc_bearer has turned out to cause a serious and completely unnecessary deadlock problem. CPU0 CPU1 ---- ---- Time 0: bearer_disable() link_timeout() Time 1: spin_lock_bh(&b_ptr->lock) tipc_link_push_queue() Time 2: tipc_link_delete() tipc_bearer_blocked(b_ptr) Time 3: k_cancel_timer(&req->timer) spin_lock_bh(&b_ptr->lock) Time 4: del_timer_sync(&req->timer) I.e., del_timer_sync() on CPU0 never returns, because the timer handler on CPU1 is waiting for the bearer lock. We eliminate the 'blocked' flag from struct tipc_bearer, along with all tests on this flag. This not only resolves the deadlock, but also simplifies and speeds up the data path execution of TIPC. It also fits well into our ongoing effort to make the locking policy simpler and more manageable. An effect of this change is that we can get rid of functions such as tipc_bearer_blocked(), tipc_continue() and tipc_block_bearer(). We replace the latter with a new function, tipc_reset_bearer(), which resets all links associated to the bearer immediately after an interface goes down. A user might notice one slight change in link behaviour after this change. When an interface goes down, (e.g. through a NETDEV_DOWN event) all attached links will be reset immediately, instead of leaving it to each link to detect the failure through a timer-driven mechanism. We consider this an improvement, and see no obvious risks with the new behavior. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <Paul.Gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* x25: convert printks to pr_<level>wangweidong2013-12-09
| | | | | | | | use pr_<level> instead of printk(LEVEL) Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* packet: introduce PACKET_QDISC_BYPASS socket optionDaniel Borkmann2013-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a PACKET_QDISC_BYPASS socket option, that allows for using a similar xmit() function as in pktgen instead of taking the dev_queue_xmit() path. This can be very useful when PF_PACKET applications are required to be used in a similar scenario as pktgen, but with full, flexible packet payload that needs to be provided, for example. On default, nothing changes in behaviour for normal PF_PACKET TX users, so everything stays as is for applications. New users, however, can now set PACKET_QDISC_BYPASS if needed to prevent own packets from i) reentering packet_rcv() and ii) to directly push the frame to the driver. In doing so we can increase pps (here 64 byte packets) for PF_PACKET a bit: # CPUs -- QDISC_BYPASS -- qdisc path -- qdisc path[**] 1 CPU == 1,509,628 pps -- 1,208,708 -- 1,247,436 2 CPUs == 3,198,659 pps -- 2,536,012 -- 1,605,779 3 CPUs == 4,787,992 pps -- 3,788,740 -- 1,735,610 4 CPUs == 6,173,956 pps -- 4,907,799 -- 1,909,114 5 CPUs == 7,495,676 pps -- 5,956,499 -- 2,014,422 6 CPUs == 9,001,496 pps -- 7,145,064 -- 2,155,261 7 CPUs == 10,229,776 pps -- 8,190,596 -- 2,220,619 8 CPUs == 11,040,732 pps -- 9,188,544 -- 2,241,879 9 CPUs == 12,009,076 pps -- 10,275,936 -- 2,068,447 10 CPUs == 11,380,052 pps -- 11,265,337 -- 1,578,689 11 CPUs == 11,672,676 pps -- 11,845,344 -- 1,297,412 [...] 20 CPUs == 11,363,192 pps -- 11,014,933 -- 1,245,081 [**]: qdisc path with packet_rcv(), how probably most people seem to use it (hopefully not anymore if not needed) The test was done using a modified trafgen, sending a simple static 64 bytes packet, on all CPUs. The trick in the fast "qdisc path" case, is to avoid reentering packet_rcv() by setting the RAW socket protocol to zero, like: socket(PF_PACKET, SOCK_RAW, 0); Tradeoffs are documented as well in this patch, clearly, if queues are busy, we will drop more packets, tc disciplines are ignored, and these packets are not visible to taps anymore. For a pktgen like scenario, we argue that this is acceptable. The pointer to the xmit function has been placed in packet socket structure hole between cached_dev and prot_hook that is hot anyway as we're working on cached_dev in each send path. Done in joint work together with Jesper Dangaard Brouer. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dev: move inline skb_needs_linearize helper to headerDaniel Borkmann2013-12-09
| | | | | | | | | | | As we need it elsewhere, move the inline helper function of skb_needs_linearize() over to skbuff.h include file. While at it, also convert the return to 'bool' instead of 'int' and add a proper kernel doc. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-12-09
|\ | | | | | | | | | | | | Merge 'net' into 'net-next' to get the AF_PACKET bug fix that Daniel's direct transmit changes depend upon. Signed-off-by: David S. Miller <davem@davemloft.net>
| * packet: fix send path when running with proto == 0Daniel Borkmann2013-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e40526cb20b5 introduced a cached dev pointer, that gets hooked into register_prot_hook(), __unregister_prot_hook() to update the device used for the send path. We need to fix this up, as otherwise this will not work with sockets created with protocol = 0, plus with sll_protocol = 0 passed via sockaddr_ll when doing the bind. So instead, assign the pointer directly. The compiler can inline these helper functions automagically. While at it, also assume the cached dev fast-path as likely(), and document this variant of socket creation as it seems it is not widely used (seems not even the author of TX_RING was aware of that in his reference example [1]). Tested with reproducer from e40526cb20b5. [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example Fixes: e40526cb20b5 ("packet: fix use after free race in send path when dev is released") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Salam Noureddine <noureddine@aristanetworks.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * virtio-net: free bufs correctly on invalid packet lengthMichael Dalton2013-12-06
| | | | | | | | | | | | | | | | | | | | | | When a packet with invalid length arrives, ensure that the packet is freed correctly if mergeable packet buffers and big packets (GUEST_TSO4) are both enabled. Signed-off-by: Michael Dalton <mwdalton@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Andrew Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mvneta: Fix incorrect DMA unmapping sizeEzequiel Garcia2013-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code unmaps the DMA mapping created for rx skb_buff's by using the data_size as the the mapping size. This is wrong since the correct size to specify should match the size used to create the mapping. This commit removes the following DMA_API_DEBUG warning: ------------[ cut here ]------------ WARNING: at lib/dma-debug.c:887 check_unmap+0x3a8/0x860() mvneta d0070000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002eb80000] [map size=1600 bytes] [unmap size=66 bytes] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.21-01444-ga88ae13-dirty #92 [<c0013600>] (unwind_backtrace+0x0/0xf8) from [<c0010fb8>] (show_stack+0x10/0x14) [<c0010fb8>] (show_stack+0x10/0x14) from [<c001afa0>] (warn_slowpath_common+0x48/0x68) [<c001afa0>] (warn_slowpath_common+0x48/0x68) from [<c001b01c>] (warn_slowpath_fmt+0x30/0x40) [<c001b01c>] (warn_slowpath_fmt+0x30/0x40) from [<c018d0fc>] (check_unmap+0x3a8/0x860) [<c018d0fc>] (check_unmap+0x3a8/0x860) from [<c018d734>] (debug_dma_unmap_page+0x64/0x70) [<c018d734>] (debug_dma_unmap_page+0x64/0x70) from [<c0233f78>] (mvneta_rx+0xec/0x468) [<c0233f78>] (mvneta_rx+0xec/0x468) from [<c023436c>] (mvneta_poll+0x78/0x16c) [<c023436c>] (mvneta_poll+0x78/0x16c) from [<c02db468>] (net_rx_action+0x94/0x160) [<c02db468>] (net_rx_action+0x94/0x160) from [<c0021e68>] (__do_softirq+0xe8/0x1d0) [<c0021e68>] (__do_softirq+0xe8/0x1d0) from [<c0021ff8>] (do_softirq+0x4c/0x58) [<c0021ff8>] (do_softirq+0x4c/0x58) from [<c0022228>] (irq_exit+0x58/0x90) [<c0022228>] (irq_exit+0x58/0x90) from [<c000e7c8>] (handle_IRQ+0x3c/0x94) [<c000e7c8>] (handle_IRQ+0x3c/0x94) from [<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4) [<c0008548>] (armada_370_xp_handle_irq+0x4c/0xb4) from [<c000dc20>] (__irq_svc+0x40/0x50) Exception stack(0xc04f1f70 to 0xc04f1fb8) 1f60: c1fe46f8 00000000 00001d92 00001d92 1f80: c04f0000 c04f0000 c04f84a4 c03e081c c05220e7 00000001 c05220e7 c04f0000 1fa0: 00000000 c04f1fb8 c000eaf8 c004c048 60000113 ffffffff [<c000dc20>] (__irq_svc+0x40/0x50) from [<c004c048>] (cpu_startup_entry+0x54/0x128) [<c004c048>] (cpu_startup_entry+0x54/0x128) from [<c04c1a14>] (start_kernel+0x29c/0x2f0) [<c04c1a14>] (start_kernel+0x29c/0x2f0) from [<00008074>] (0x8074) ---[ end trace d4955f6acd178110 ]--- Mapped at: [<c018d600>] debug_dma_map_page+0x4c/0x11c [<c0235d6c>] mvneta_setup_rxqs+0x398/0x598 [<c0236084>] mvneta_open+0x40/0x17c [<c02dbbd4>] __dev_open+0x9c/0x100 [<c02dbe58>] __dev_change_flags+0x7c/0x134 Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * br: fix use of ->rx_handler_data in code executed on non-rx_handler pathJiri Pirko2013-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | br_stp_rcv() is reached by non-rx_handler path. That means there is no guarantee that dev is bridge port and therefore simple NULL check of ->rx_handler_data is not enough. There is need to check if dev is really bridge port and since only rcu read lock is held here, do it by checking ->rx_handler pointer. Note that synchronize_net() in netdev_rx_handler_unregister() ensures this approach as valid. Introduced originally by: commit f350a0a87374418635689471606454abc7beaa3a "bridge: use rx_handler_data pointer to store net_bridge_port pointer" Fixed but not in the best way by: commit b5ed54e94d324f17c97852296d61a143f01b227a "bridge: fix RCU races with bridge port" Reintroduced by: commit 716ec052d2280d511e10e90ad54a86f5b5d4dcc2 "bridge: fix NULL pointer deref of br_port_get_rcu" Please apply to stable trees as well. Thanks. RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1025770 Reported-by: Laine Stump <laine@redhat.com> Debugged-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * virtio: delete napi structures from netdev before releasing memoryAndrey Vagin2013-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | free_netdev calls netif_napi_del too, but it's too late, because napi structures are placed on vi->rq. netif_napi_add() is called from virtnet_alloc_queues. general protection fault: 0000 [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables virtio_balloon pcspkr virtio_net(-) i2c_pii CPU: 1 PID: 347 Comm: rmmod Not tainted 3.13.0-rc2+ #171 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800b779c420 ti: ffff8800379e0000 task.ti: ffff8800379e0000 RIP: 0010:[<ffffffff81322e19>] [<ffffffff81322e19>] __list_del_entry+0x29/0xd0 RSP: 0018:ffff8800379e1dd0 EFLAGS: 00010a83 RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800379c2fd0 RCX: dead000000200200 RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000001 RDI: ffff8800379c2fd0 RBP: ffff8800379e1dd0 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800379c2f90 R13: ffff880037839160 R14: 0000000000000000 R15: 00000000013352f0 FS: 00007f1400e34740(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f464124c763 CR3: 00000000b68cf000 CR4: 00000000000006e0 Stack: ffff8800379e1df0 ffffffff8155beab 6b6b6b6b6b6b6b2b ffff8800378391c0 ffff8800379e1e18 ffffffff8156499b ffff880037839be0 ffff880037839d20 ffff88003779d3f0 ffff8800379e1e38 ffffffffa003477c ffff88003779d388 Call Trace: [<ffffffff8155beab>] netif_napi_del+0x1b/0x80 [<ffffffff8156499b>] free_netdev+0x8b/0x110 [<ffffffffa003477c>] virtnet_remove+0x7c/0x90 [virtio_net] [<ffffffff813ae323>] virtio_dev_remove+0x23/0x80 [<ffffffff813f62ef>] __device_release_driver+0x7f/0xf0 [<ffffffff813f6ca0>] driver_detach+0xc0/0xd0 [<ffffffff813f5f28>] bus_remove_driver+0x58/0xd0 [<ffffffff813f72ec>] driver_unregister+0x2c/0x50 [<ffffffff813ae65e>] unregister_virtio_driver+0xe/0x10 [<ffffffffa0036942>] virtio_net_driver_exit+0x10/0x6ce [virtio_net] [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220 [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0 [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 RIP [<ffffffff81322e19>] __list_del_entry+0x29/0xd0 RSP <ffff8800379e1dd0> ---[ end trace d5931cd3f87c9763 ]--- Fixes: 986a4f4d452d (virtio_net: multiqueue support) Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * virtio-net: determine type of bufs correctlyAndrey Vagin2013-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | free_unused_bufs must check vi->mergeable_rx_bufs before vi->big_packets, because we use this sequence in other places. Otherwise we allocate buffer of one type, then free it as another type. general protection fault: 0000 [#1] SMP Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables pcspkr virtio_balloon virtio_net(-) i2c_pii CPU: 0 PID: 400 Comm: rmmod Not tainted 3.13.0-rc2+ #170 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8800b6d2a210 ti: ffff8800aed32000 task.ti: ffff8800aed32000 RIP: 0010:[<ffffffffa00345f3>] [<ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net] RSP: 0018:ffff8800aed33dd8 EFLAGS: 00010202 RAX: ffff8800b1fe2c00 RBX: ffff8800b66a7240 RCX: 6b6b6b6b6b6b6b6b RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800b8419a68 RDI: ffff8800b66a1148 RBP: ffff8800aed33e00 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: ffff8800b66a1148 R14: 0000000000000000 R15: 000077ff80000000 FS: 00007fc4f9c4e740(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f63f432f000 CR3: 00000000b6538000 CR4: 00000000000006f0 Stack: ffff8800b66a7240 ffff8800b66a7380 ffff8800377bd3f0 0000000000000000 00000000023302f0 ffff8800aed33e18 ffffffffa00346e2 ffff8800b66a7240 ffff8800aed33e38 ffffffffa003474d ffff8800377bd388 ffff8800377bd390 Call Trace: [<ffffffffa00346e2>] remove_vq_common+0x22/0x40 [virtio_net] [<ffffffffa003474d>] virtnet_remove+0x4d/0x90 [virtio_net] [<ffffffff813ae303>] virtio_dev_remove+0x23/0x80 [<ffffffff813f62cf>] __device_release_driver+0x7f/0xf0 [<ffffffff813f6c80>] driver_detach+0xc0/0xd0 [<ffffffff813f5f08>] bus_remove_driver+0x58/0xd0 [<ffffffff813f72cc>] driver_unregister+0x2c/0x50 [<ffffffff813ae63e>] unregister_virtio_driver+0xe/0x10 [<ffffffffa0036852>] virtio_net_driver_exit+0x10/0x7be [virtio_net] [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220 [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0 [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b Code: c0 74 55 0f 1f 44 00 00 80 7b 30 00 74 7a 48 8b 50 30 4c 89 e6 48 03 73 20 48 85 d2 0f 84 bb 00 00 00 66 0f RIP [<ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net] RSP <ffff8800aed33dd8> ---[ end trace edb570ea923cce9c ]--- Fixes: 2613af0ed18a (virtio_net: migrate mergeable rx buffers to page frag allocators) Cc: Michael Dalton <mwdalton@google.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: fix packets_per_slave showingNikolay Aleksandrov2013-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's an issue when showing the value of packets_per_slave due to using signed integer. The value may be < 0 and thus not put through reciprocal_value() before showing. This patch makes it use unsigned integer when showing it. CC: Andy Gospodarek <andy@greyhouse.net> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Veaceslav Falico <vfalico@redhat.com> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: Free/delete pmacs (in be_clear()) only if they existSomnath Kotur2013-12-06
| | | | | | | | | | | | | | | | | | | | | | | | During suspend-resume and lancer error recovery we will cleanup and re-initialize the resources through be_clear() and be_setup() respectively. During re-initialisation in be_setup(), if be_get_config() fails, we'll again call be_clear() which will cause a NULL pointer dereference as adapter->pmac_id is already freed. Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * be2net: Fix Lancer error recovery to distinguish FW downloadSomnath Kotur2013-12-06
| | | | | | | | | | | | | | | | | | | | | | | | The Firmware update would be detected by looking at the sliport_error1/ sliport_error2 register values(0x02/0x00). If its not a FW reset the current messaging would take place. If the error is due to FW reset, log a message to user that "Firmware update in progress" and also do not log sliport_status and sliport_error register values. Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tun: update file current positionZhi Yong Wu2013-12-06
| | | | | | | | | | Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * macvtap: update file current positionZhi Yong Wu2013-12-06
| | | | | | | | | | Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: clear local_df when passing skb between namespacesHannes Frederic Sowa2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We must clear local_df when passing the skb between namespaces as the packet is not local to the new namespace any more and thus may not get fragmented by local rules. Fred Templin noticed that other namespaces do fragment IPv6 packets while forwarding. Instead they should have send back a PTB. The same problem should be present when forwarding DF-IPv4 packets between namespaces. Reported-by: Templin, Fred L <Fred.L.Templin@boeing.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp_memcontrol: Cleanup/fix cg_proto->memory_pressure handling.Eric W. Biederman2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kill memcg_tcp_enter_memory_pressure. The only function of memcg_tcp_enter_memory_pressure was to reduce deal with the unnecessary abstraction that was tcp_memcontrol. Now that struct tcp_memcontrol is gone remove this unnecessary function, the unnecessary function pointer, and modify sk_enter_memory_pressure to set this field directly, just as sk_leave_memory_pressure cleas this field directly. This fixes a small bug I intruduced when killing struct tcp_memcontrol that caused memcg_tcp_enter_memory_pressure to never be called and thus failed to ever set cg_proto->memory_pressure. Remove the cg_proto enter_memory_pressure function as it now serves no useful purpose. Don't test cg_proto->memory_presser in sk_leave_memory_pressure before clearing it. The test was originally there to ensure that the pointer was non-NULL. Now that cg_proto is not a pointer the pointer does not matter. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * forcedeth: run loopback test only on chipsets that support itIvan Vecera2013-12-05
| | | | | | | | | | | | | | | | | | The driver incorrectly run loopback test on chips that don't support it. Loopback test is only supported by chips that has DEV_HAS_TEST_EXTENDED flag and returns 4 (NV_TEST_COUNT_EXTENDED) as test count. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: add arp_ip_target checks when install the moduledingtianhong2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I install the bonding with the wrong arp_ip_target, just like arp_ip_target=500.500.500.500, the arp_ip_target was transfored to 245.245.245.244 and stored in the ip target success, it is uncorrect, so I add checks to avoid adding wrong address. The in4_pton() will set wrong ip address to 0.0.0.0 and return 0, also use the micro IS_IP_TARGET_UNUSABLE_ADDRESS to simplify the code. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnx2x: avoid null pointer dereference when enabling SR-IOVMichal Kalderon2013-12-05
| | | | | | | | | | | | | | | | | | | | Fixed NULL pointer dereference when dynamically activating SR-IOV after vf database failed to be allocated in probe stage (for example due to no ARI support in pci hub). Signed-off-by: Michal Kalderon <michals@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: disable max_burst when the max_burst is 0wangweidong2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As Michael pointed out that when max_burst is 0, it just disable max_burst. It declared in rfc6458#section-8.1.24. so add the check in sctp_transport_burst_limited, when it 0, just do nothing. Reviewed-by: Daniel Borkmann <dborkman@redhat.com> Suggested-by: Vlad Yasevich <vyasevich@gmail.com> Suggested-by: Michael Tuexen <Michael.Tuexen@lurchi.franken.de> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: davinci_emac: Fix platform data handling and make usable for am3517Tony Lindgren2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When booted with device tree, we may still have platform data passed as auxdata. For am3517 this is needed for passing the interrupt_enable and interrupt_disable callbacks that access the omap system control module registers. These callback functions will eventually go away when we have a separate system control module driver. Some of the things that are currently passed as platform data we don't need to set up as device tree properties as they are always the same on am3517. So let's use a new compatible flag for those so we can get those from the device tree match data. Also note that we need to fix setting of phy_dev to NULL instead of an empty string as the code later on uses that to find the first phy on the mdio bus. This seems to have been caused by 5d69e0076a72 (net: davinci_emac: switch to new mdio). Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * qlge: Update version to 1.00.00.34Jitendra Kalsaria2013-12-05
| | | | | | | | | | Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * qlge: Allow enable/disable rx/tx vlan acceleration independentlyJitendra Kalsaria2013-12-05
| | | | | | | | | | | | | | | | | | | | | | o Fix the driver to allow user to enable/disable rx/tx vlan acceleration independently. For example: ethtool -K ethX rxvlan on/off ethtool -K ethX txvlan on/off Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * qlge: Fix ethtool statisticsJitendra Kalsaria2013-12-05
| | | | | | | | | | | | | | o Receive mac error stat was getting overwritten by other stats. Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: fix fragment detection in checksum setupPaul Durrant2013-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code to detect fragments in checksum_setup() was missing for IPv4 and too eager for IPv6. (It transpires that Windows seems to send IPv6 packets with a fragment header even if they are not a fragment - i.e. offset is zero, and M bit is not set). This patch also incorporates a fix to callers of maybe_pull_tail() where skb->network_header was being erroneously added to the length argument. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> cc: David Miller <davem@davemloft.net> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * MAINTAINERS: Update Intel Wired Ethernet LAN MaintainersJeff Kirsher2013-12-05
| | | | | | | | | | | | | | | | Remove Tushar and Peter (PJ) as maintainers, since they have moved out of our group and no longer work on Wired Ethernet. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'net_sched_actions'David S. Miller2013-12-05
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jamal Hadi Salim says: ==================== net_sched: actions - Add default lookup and walker This set of patches provide defaults for lookup and walkers for actions. Users can override when needed. It also enforces mandatory action methods to be passed. Thanks to David Miller who paid attention and made this patch set much better. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net_sched: Use default action walker methodsJamal Hadi Salim2013-12-05
| | | | | | | | | | | | | | | Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net_sched: Provide default walker function for actionsJamal Hadi Salim2013-12-05
| | | | | | | | | | | | | | | Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net_sched: Use default action lookup functionsJamal Hadi Salim2013-12-05
| | | | | | | | | | | | | | | Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net_sched: Default action lookup method for actionsJamal Hadi Salim2013-12-05
| | | | | | | | | | | | | | | Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net_sched: Fail if missing mandatory action operation methodsJamal Hadi Salim2013-12-05
| |/ | | | | | | | | Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | pkt_sched: give visibility to mq slave qdiscsEric Dumazet2013-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 6da7c8fcbcbd ("qdisc: allow setting default queuing discipline") added the ability to change default qdisc from pfifo_fast to say fq But as most modern ethernet devices are multiqueue, we cant really see all the statistics from "tc -s qdisc show", as the default root qdisc is mq. This patch adds the calls to qdisc_list_add() to mq and mqprio Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2013-12-09
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to i40e only. Jacob provides a i40e patch to get 1588 work correctly by separating TSYNVALID and TSYNINDX fields in the receive descriptor. Jesse provides several i40e patches, first to correct the checking of the multi-bit state. The hash is reported correctly in the RSS field if and only if the filter status is 3. Other values of the filter status mean different things and we should not depend on a bitwise result. Then provides a patch to enable a couple of workarounds based on revision ID that allow the driver to work more fully on early hardware. Shannon provides several i40e patches as well. First sets the media type in the hardware structure based on the external connection type. Then provides a patch to only setup the rings that will be used. Lastly provides a fix where the TESTING state was still set when exiting the ethtool diagnostics. Kevin Scott provides one i40e patch to add a new flag to the i40e_add_veb() which allows the driver to request the hardware to filter on layer 2 parameters. Anjali provides four i40e patches, first refactors the reset code in order to re-size queues and vectors while the interface is still up. Then provides a patch to enable all PCTYPEs expect FCoE for RSS. Adds a message to notify the user of how many VFs are initialized on each port. Lastly adds a new variable to track the number of PF instances, this is a global counter on purpose so that each PF loaded has a unique ID. Catherine bumps the driver version. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>