litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	net: Limit socket I/O iovec total length to INT_MAX.	David S. Miller	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This helps protect us from overflow issues down in the individual protocol sendmsg/recvmsg handlers. Once we hit INT_MAX we truncate out the rest of the iovec by setting the iov_len members to zero. This works because: 1) For SOCK_STREAM and SOCK_SEQPACKET sockets, partial writes are allowed and the application will just continue with another write to send the rest of the data. 2) For datagram oriented sockets, where there must be a one-to-one correspondance between write() calls and packets on the wire, INT_MAX is going to be far larger than the packet size limit the protocol is going to check for and signal with -EMSGSIZE. Based upon a patch by Linus Torvalds. Signed-off-by: David S. Miller <davem@davemloft.net>
*	USB: gadget: fix ethernet gadget crash in gether_setup	Dmitry Artamonow	2010-10-28
\| \| \| \| \| \| \| \| \| \|	Crash is triggered by commit e6484930d7 ("net: allocate tx queues in register_netdevice"), which moved tx netqueue creation into register_netdev. So now calling netif_stop_queue() before register_netdev causes an oops. Move netif_stop_queue() after net device registration to fix crash. Signed-off-by: Dmitry Artamonow <mad_soft@inbox.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
*	fib: Fix fib zone and its hash leak on namespace stop	Pavel Emelyanov	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \|	When we stop a namespace we flush the table and free one, but the added fn_zone-s (and their hashes if grown) are leaked. Need to free. Tries releases all its stuff in the flushing code. Shame on us - this bug exists since the very first make-fib-per-net patches in 2.6.27 :( Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	cxgb3: Fix panic in free_tx_desc()	Krishna Kumar	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I got a few of these panics (on 2.6.36-rc7) when running high number of netperf sessions: BUG: unable to handle kernel paging request at 0000100000000000 IP: [<ffffffff813125f0>] skb_release_data+0xa0/0xd0 Oops: 0000 [#1] SMP Pid: 2155, comm: vhost-2115 Not tainted 2.6.36-rc7-ORG #1 49Y6512 /System x3650 M2 -[7947AC1]- RIP: 0010:[<ffffffff813125f0>] [<ffffffff813125f0>] skb_release_data+0xa0/0xd0 RSP: 0018:ffff880001803738 EFLAGS: 00010206 RAX: ffff880179b0fc00 RBX: ffff880178b441c0 RCX: 0000000000000000 RSP: 0018:ffff880001803738 EFLAGS: 00010206 RAX: ffff880179b0fc00 RBX: ffff880178b441c0 RCX: 0000000000000000 RDX: ffff880179b0fd40 RSI: 0000000000000000 RDI: 0000100000000000 RBP: ffff880001803748 R08: 0000000000000001 R09: ffff88017f117000 R10: ffff88017b990608 R11: ffff88017f117090 R12: ffff880178b441c0 R13: ffff88017f117090 R14: 0000000000000000 R15: ffff880178b441c0 FS: 0000000000000000(0000) GS:ffff880001800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000100000000000 CR3: 000000017ea64000 CR4: 00000000000026e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process vhost-2115 (pid: 2155, threadinfo ffff88017d872000, task ffff88017e954680) Stack: ffff880178b441c0 0000000000000007 ffff880001803768 ffffffff81312119 <0> 0000000000000000 0000000000000002 ffff880001803778 ffffffff813121f9 <0> ffff880001803818 ffffffffa012d14c ffffffffa02de076 ffff880001803700 Call Trace: <IRQ> [<ffffffff81312119>] __kfree_skb+0x19/0xa0 [<ffffffff813121f9>] kfree_skb+0x19/0x40 [<ffffffffa012d14c>] free_tx_desc+0x2fc/0x350 [cxgb3] [<ffffffffa02de076>] ? vhost_poll_wakeup+0x16/0x20 [vhost_net] [<ffffffffa01323db>] t3_eth_xmit+0x28b/0x380 [cxgb3] [<ffffffff8131ce47>] dev_hard_start_xmit+0x377/0x5a0 [<ffffffff81335a4a>] sch_direct_xmit+0xfa/0x1d0 [<ffffffff8131d1a9>] dev_queue_xmit+0x139/0x450 [<ffffffff81326225>] neigh_resolve_output+0x125/0x340 [<ffffffff8135a77c>] ip_finish_output+0x14c/0x320 [<ffffffff8135a9fe>] ip_output+0xae/0xc0 [<ffffffff8135620f>] ip_forward_finish+0x3f/0x50 [<ffffffff8135641f>] ip_forward+0x1ff/0x400 [<ffffffff81354789>] ip_rcv_finish+0x119/0x3e0 [<ffffffff81354c7d>] ip_rcv+0x22d/0x300 [<ffffffff8131a95b>] __netif_receive_skb+0x29b/0x570 [<ffffffff8131ba70>] ? netif_receive_skb+0x0/0x80 [<ffffffff8131bae8>] netif_receive_skb+0x78/0x80 [<ffffffffa02a96d8>] br_handle_frame_finish+0x198/0x260 [bridge] [<ffffffffa02aebc8>] br_nf_pre_routing_finish+0x238/0x380 [bridge] [<ffffffff813424bc>] ? nf_hook_slow+0x6c/0x100 [<ffffffffa02ae990>] ? br_nf_pre_routing_finish+0x0/0x380 [bridge] [<ffffffffa02afb08>] br_nf_pre_routing+0x698/0x7a0 [bridge] [<ffffffff81342414>] nf_iterate+0x64/0xa0 [<ffffffffa02a9540>] ? br_handle_frame_finish+0x0/0x260 [bridge] [<ffffffff813424bc>] nf_hook_slow+0x6c/0x100 [<ffffffffa02a9540>] ? br_handle_frame_finish+0x0/0x260 [bridge] [<ffffffffa02a9931>] br_handle_frame+0x191/0x240 [bridge] [<ffffffffa02a97a0>] ? br_handle_frame+0x0/0x240 [bridge] [<ffffffff8131a863>] __netif_receive_skb+0x1a3/0x570 [<ffffffff812ef3f6>] ? dma_issue_pending_all+0x76/0xa0 [<ffffffff8131ad32>] process_backlog+0x102/0x200 [<ffffffff8131c2d0>] net_rx_action+0x100/0x220 [<ffffffff810548ef>] __do_softirq+0xaf/0x140 [<ffffffff8100bcdc>] call_softirq+0x1c/0x30 [<ffffffff8100dfc5>] ? do_softirq+0x65/0xa0 [<ffffffff8131c6b8>] netif_rx_ni+0x28/0x30 [<ffffffffa02c305d>] tun_sendmsg+0x2cd/0x4b0 [tun] [<ffffffffa02e01af>] handle_tx+0x1df/0x340 [vhost_net] [<ffffffffa02e0340>] handle_tx_kick+0x10/0x20 [vhost_net] [<ffffffffa02de29b>] vhost_worker+0xbb/0x130 [vhost_net] [<ffffffffa02de1e0>] ? vhost_worker+0x0/0x130 [vhost_net] [<ffffffffa02de1e0>] ? vhost_worker+0x0/0x130 [vhost_net] [<ffffffff81069686>] kthread+0x96/0xa0 [<ffffffff8100bbe4>] kernel_thread_helper+0x4/0x10 [<ffffffff810695f0>] ? kthread+0x0/0xa0 [<ffffffff8100bbe0>] ? kernel_thread_helper+0x0/0x10 Code: 8b 94 24 d0 00 00 00 49 8b 84 24 d8 00 00 00 48 8d 14 10 0f b7 0a 39 d9 7f d1 48 8b 7a 10 48 85 ff 74 20 48 c7 42 10 00 00 00 00 <48> 8b 1f e8 e8 fb ff ff 48 85 db 48 89 df 75 f0 49 8b 84 24 d8 Patch below fixes the panic. cxgb4 and cxgb4vf already have this fix. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	cxgb3: fix crash due to manipulating queues before registration	Nishanth Aravamudan	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Along the same lines as "cxgb4: fix crash due to manipulating queues before registration" (8f6d9f40476895571df039b6f1f5230ec7faebad), before commit "net: allocate tx queues in register_netdevice" netif_tx_stop_all_queues and related functions could be used between device allocation and registration but now only after registration. cxgb4 has such a call before registration and crashes now. Move it after register_netdev. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: eric.dumazet@gmail.com Cc: sonnyrao@us.ibm.com Cc: Divy Le Ray <divy@chelsio.com> Cc: Dimitris Michailidis <dm@chelsio.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Tested-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	8390: Don't oops on starting dev queue	Pavel Emelyanov	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The __NS8390_init tries to start the device queue before the device is registered. This results in an oops (snipped): [ 2.865493] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 [ 2.866106] IP: [<ffffffffa000602a>] netif_start_queue+0xb/0x12 [8390] [ 2.881267] Call Trace: [ 2.881437] [<ffffffffa000624d>] __NS8390_init+0x102/0x15a [8390] [ 2.881999] [<ffffffffa00062ae>] NS8390_init+0x9/0xb [8390] [ 2.882237] [<ffffffffa000d820>] ne2k_pci_init_one+0x297/0x354 [ne2k_pci] [ 2.882955] [<ffffffff811c7a0e>] local_pci_probe+0x12/0x16 [ 2.883308] [<ffffffff811c85ad>] pci_device_probe+0xc3/0xef [ 2.884049] [<ffffffff8129218d>] driver_probe_device+0xbe/0x14b [ 2.884937] [<ffffffff81292260>] __driver_attach+0x46/0x62 [ 2.885170] [<ffffffff81291788>] bus_for_each_dev+0x49/0x78 [ 2.885781] [<ffffffff81291fbb>] driver_attach+0x1c/0x1e [ 2.886089] [<ffffffff812912ab>] bus_add_driver+0xba/0x227 [ 2.886330] [<ffffffff8129259a>] driver_register+0x9e/0x115 [ 2.886933] [<ffffffff811c8815>] __pci_register_driver+0x50/0xac [ 2.887785] [<ffffffffa001102c>] ne2k_pci_init+0x2c/0x2e [ne2k_pci] [ 2.888093] [<ffffffff81000212>] do_one_initcall+0x7c/0x130 [ 2.888693] [<ffffffff8106d74f>] sys_init_module+0x99/0x1da [ 2.888946] [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b This happens because the netif_start_queue sets respective bit on the dev->_tx array which is not yet allocated. As far as I understand the code removing the netif_start_queue from __NS8390_init is OK, since queue will be started later on device open. Plz, correct me if I'm wrong. Found in the Dave's current tree, so he's in Cc. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	dccp ccid-2: Stop polling	Gerrit Renker	2010-10-28
\| \| \| \| \| \| \| \|	This updates CCID-2 to use the CCID dequeuing mechanism, converting from previous continuous-polling to a now event-driven mechanism. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	dccp: Refine the wait-for-ccid mechanism	Gerrit Renker	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extends the existing wait-for-ccid routine so that it may be used with different types of CCID, addressing the following problems: 1) The queue-drain mechanism only works with rate-based CCIDs. If CCID-2 for example has a full TX queue and becomes network-limited just as the application wants to close, then waiting for CCID-2 to become unblocked could lead to an indefinite delay (i.e., application "hangs"). 2) Since each TX CCID in turn uses a feedback mechanism, there may be changes in its sending policy while the queue is being drained. This can lead to further delays during which the application will not be able to terminate. 3) The minimum wait time for CCID-3/4 can be expected to be the queue length times the current inter-packet delay. For example if tx_qlen=100 and a delay of 15 ms is used for each packet, then the application would have to wait for a minimum of 1.5 seconds before being allowed to exit. 4) There is no way for the user/application to control this behaviour. It would be good to use the timeout argument of dccp_close() as an upper bound. Then the maximum time that an application is willing to wait for its CCIDs to can be set via the SO_LINGER option. These problems are addressed by giving the CCID a grace period of up to the `timeout' value. The wait-for-ccid function is, as before, used when the application (a) has read all the data in its receive buffer and (b) if SO_LINGER was set with a non-zero linger time, or (c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ state (client application closes after receiving CloseReq). In addition, there is a catch-all case of __skb_queue_purge() after waiting for the CCID. This is necessary since the write queue may still have data when (a) the host has been passively-closed, (b) abnormal termination (unread data, zero linger time), (c) wait-for-ccid could not finish within the given time limit. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	dccp: Extend CCID packet dequeueing interface	Gerrit Renker	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extends the packet dequeuing interface of dccp_write_xmit() to allow 1. CCIDs to take care of timing when the next packet may be sent; 2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds). The main purpose is to take CCID-2 out of its polling mode (when it is network- limited, it tries every millisecond to send, without interruption). The mode of operation for (2) is as follows: * new packet is enqueued via dccp_sendmsg() => dccp_write_xmit(), * ccid_hc_tx_send_packet() detects that it may not send (e.g. window full), * it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER', * dccp_write_xmit() returns without further action; * after some time the wait-condition for CCID becomes true, * that CCID schedules the tasklet, * tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(), * since the wait-condition is now true, ccid_hc_tx_packet() returns "send now", * packet is sent, and possibly more (since dccp_write_xmit() loops). Code reuse: the taskled function calls dccp_write_xmit(), the timer function reduces to a wrapper around the same code. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	dccp: Return-value convention of hc_tx_send_packet()	Gerrit Renker	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch reorganises the return value convention of the CCID TX sending function, to permit more flexible schemes, as required by subsequent patches. Currently the convention is * values < 0 mean error, * a value == 0 means "send now", and * a value x > 0 means "send in x milliseconds". The patch provides symbolic constants and a function to interpret return values. In addition, it caps the maximum positive return value to 0xFFFF milliseconds, corresponding to 65.535 seconds. This is possible since in CCID-3/4 the maximum possible inter-packet gap is fixed at t_mbi = 64 sec. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	igbvf: fix panic on load	Emil Tantilov	2010-10-28
\| \| \| \| \| \| \| \| \| \| \|	Introduced by commit:e6484930d7c73d324bccda7d43d131088da697b9 net: allocate tx queues in register_netdevice Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Acked-by: Greg Rose <greg.v.rose@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ixgb: call pci_disable_device in ixgb_remove	Emil Tantilov	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ixgb fails to work after reload on recent kernels: rmmod ixgb (dev->current_state = PCI_UNKNOWN) modprobe ixgb (pci_enable_device will bail leaving current_state to PCI_UNKNOWN) ifup eth0 do_IRQ: 2.82 No irq handler for vector (irq -1) The issue was exposed by commit fcd097f31a6ee207cc0c3da9cccd2a86d4334785 PCI: MSI: Remove unsafe and unnecessary hardware access which avoids HW writes for power states != PCI_D0 CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ixgbe: DCB, fix TX hang occurring in stress condition with PFC	John Fastabend	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The DCB credits refill quantum _must_ be greater than half the max packet size. This is needed to guarantee that TX DMA operations are not attempted during a pause state. Additionally, the min IFG must be set correctly for DCB mode. If a DMA operation is requested unexpectedly during the pause state the HW data store may be corrupted leading to a DMA hang. The DMA hang requires a reset to correct. This fixes the HW configuration to avoid this condition. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	e1000e: Add check for reset flags before displaying reset message	Carolyn Wyborny	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \|	Some parts need to execute resets during normal operation. This flag check ensures that those parts reset without needlessly alarming the user. Other unexpected resets by other parts will dump debug info and message the reset action to the user, as originally intended. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Acked-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	e1000e: reset PHY after errors detected	Carolyn Wyborny	2010-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some errors can be induced in the PHY via environmental testing (specifically extreme temperature changes and electro static discharge testing), and in the case of the PHY hanging due to this input, this detects the problem and resets to continue. This issue only applies to 82574 silicon. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	pch_gbe: Select MII.	David S. Miller	2010-10-28
\| \| \| \| \|	Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	igb: Fix unused variable warning.	Jesse Gross	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \|	Commit eab6d18d "vlan: Don't check for vlan group before vlan_tx_tag_present" removed the need for the adapter variable in igb_xmit_frame_ring_adv(). This removes the variable as well to avoid the compiler warning. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ehea: Fixing statistics	Breno Leitao	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(Applied over Eric's "ehea: fix use after free" patch) Currently ehea stats are broken. The bytes counters are got from the hardware, while the packets counters are got from the device driver. Also, the device driver counters are resetted during the the down process, and the hardware aren't, causing some weird numbers. This patch just consolidates the packets and bytes on the device driver. Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bonding: Fix lockdep warning after bond_vlan_rx_register()	Jarek Poplawski	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix lockdep warning: [ 52.991402] ====================================================== [ 52.991511] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ] [ 52.991569] 2.6.36-04573-g4b60626-dirty #65 [ 52.991622] ------------------------------------------------------ [ 52.991696] ip/4842 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire: [ 52.991758] (&bond->lock){++++..}, at: [<efe4d300>] bond_set_multicast_list+0x60/0x2c0 [bonding] [ 52.991966] [ 52.991967] and this task is already holding: [ 52.992008] (&bonding_netdev_addr_lock_key){+.....}, at: [<c04e5530>] dev_mc_sync+0x50/0xa0 [ 52.992008] which would create a new lock dependency: [ 52.992008] (&bonding_netdev_addr_lock_key){+.....} -> (&bond->lock){++++..} [ 52.992008] [ 52.992008] but this new dependency connects a SOFTIRQ-irq-safe lock: [ 52.992008] (&(&mc->mca_lock)->rlock){+.-...} [ 52.992008] ... which became SOFTIRQ-irq-safe at: [ 52.992008] [<c0272beb>] __lock_acquire+0x96b/0x1960 [ 52.992008] [<c027415e>] lock_acquire+0x7e/0xf0 [ 52.992008] [<c05f356d>] _raw_spin_lock_bh+0x3d/0x50 [ 52.992008] [<c0584e40>] mld_ifc_timer_expire+0xf0/0x280 [ 52.992008] [<c024cee6>] run_timer_softirq+0x146/0x310 [ 52.992008] [<c024591d>] __do_softirq+0xad/0x1c0 [ 52.992008] [ 52.992008] to a SOFTIRQ-irq-unsafe lock: [ 52.992008] (&bond->lock){++++..} [ 52.992008] ... which became SOFTIRQ-irq-unsafe at: [ 52.992008] ... [<c0272c3b>] __lock_acquire+0x9bb/0x1960 [ 52.992008] [<c027415e>] lock_acquire+0x7e/0xf0 [ 52.992008] [<c05f36b8>] _raw_write_lock+0x38/0x50 [ 52.992008] [<efe4cbe4>] bond_vlan_rx_register+0x24/0x70 [bonding] [ 52.992008] [<c0598010>] register_vlan_dev+0xc0/0x280 [ 52.992008] [<c0599f3a>] vlan_newlink+0xaa/0xd0 [ 52.992008] [<c04ed4b4>] rtnl_newlink+0x404/0x490 [ 52.992008] [<c04ece35>] rtnetlink_rcv_msg+0x1e5/0x220 [ 52.992008] [<c050424e>] netlink_rcv_skb+0x8e/0xb0 [ 52.992008] [<c04ecbac>] rtnetlink_rcv+0x1c/0x30 [ 52.992008] [<c0503bfb>] netlink_unicast+0x24b/0x290 [ 52.992008] [<c0503e37>] netlink_sendmsg+0x1f7/0x310 [ 52.992008] [<c04cd41c>] sock_sendmsg+0xac/0xe0 [ 52.992008] [<c04ceb80>] sys_sendmsg+0x130/0x230 [ 52.992008] [<c04cf04e>] sys_socketcall+0xde/0x280 [ 52.992008] [<c0202d10>] sysenter_do_call+0x12/0x36 [ 52.992008] [ 52.992008] other info that might help us debug this: ... [ Full info at netdev: Wed, 27 Oct 2010 12:24:30 +0200 Subject: [BUG net-2.6 vlan/bonding] lockdep splats ] Use BH variant of write_lock(&bond->lock) (as elsewhere in bond_main) to prevent this dependency. Fixes commit f35188faa0fbabefac476536994f4b6f3677380f [v2.6.36] Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jay Vosburgh <fubar@us.ibm.com>
*	tunnels: Fix tunnels change rcu protection	Pavel Emelyanov	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After making rcu protection for tunnels (ipip, gre, sit and ip6) a bug was introduced into the SIOCCHGTUNNEL code. The tunnel is first unlinked, then addresses change, then it is linked back probably into another bucket. But while changing the parms, the hash table is unlocked to readers and they can lookup the improper tunnel. Respective commits are b7285b79 (ipip: get rid of ipip_lock), 1507850b (gre: get rid of ipgre_lock), 3a43be3c (sit: get rid of ipip6_lock) and 94767632 (ip6tnl: get rid of ip6_tnl_lock). The quick fix is to wait for quiescent state to pass after unlinking, but if it is inappropriate I can invent something better, just let me know. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	caif-u5500: Build config for CAIF shared mem driver	Amarnath Revanna	2010-10-27
\| \| \| \| \|	Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	caif-u5500: CAIF shared memory mailbox interface	Amarnath Revanna	2010-10-27
\| \| \| \| \|	Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	caif-u5500: CAIF shared memory transport protocol	sjur.brandeland@stericsson.com	2010-10-27
\| \| \| \| \|	Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	caif-u5500: Adding shared memory include	Amarnath Revanna	2010-10-27
\| \| \| \| \|	Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	drivers/isdn: delete double assignment	Julia Lawall	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Delete successive assignments to the same location. In the first case, the hscx array has two elements, so change the assignment to initialize the second one. In the second case, the two assignments are simply identical. Furthermore, neither is necessary, because the effect of the assignment is only visible in the next line, in the assignment in the if test. The patch inlines the right hand side value in the latter assignment and pulls that assignment out of the if test. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression i; @@ *i = ...; i = ...; // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	drivers/net/typhoon.c: delete double assignment	Julia Lawall	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Delete successive assignments to the same location. The current definition does not initialize the respRing structure, which has the same type as the cmdRing structure, so initialize that one instead. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression i; @@ *i = ...; i = ...; // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: David Dillow <dave@thedillows.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	drivers/net/sb1000.c: delete double assignment	Julia Lawall	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The other code around these duplicated assignments initializes the 0 1 2 and 3 elements of an array, so change the initialization of the rx_session_id array to do the same. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression i; @@ *i = ...; i = ...; // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	qlcnic: define valid vlan id range	Sony Chacko	2010-10-27
\| \| \| \| \| \| \| \|	4095 vlan id is reserved and should not be use. Signed-off-by: Sony Chacko <sony.chacko@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	qlcnic: reduce rx ring size	Sony Chacko	2010-10-27
\| \| \| \| \| \| \| \| \|	If eswitch is enabled, rcv ring size can be reduce, as physical port is partition-ed. Signed-off-by: Sony Chacko <sony.chacko@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	qlcnic: fix mac learning	amit salecha	2010-10-27
\| \| \| \| \| \| \| \| \|	In failover bonding case, same mac address can be programmed on other slave function. Fw will delete old entry (original func) associated with that mac address. Need to reporgram mac address, if failover again happen to original function. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ehea: fix use after free	Eric Dumazet	2010-10-27
\| \| \| \| \| \| \| \| \| \| \|	ehea_start_xmit() dereferences skb after its freeing in ehea_xmit3() to get vlan tags. Move the offending block before the potential ehea_xmit3() call. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	inetpeer: __rcu annotations	Eric Dumazet	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adds __rcu annotations to inetpeer (struct inet_peer)->avl_left (struct inet_peer)->avl_right This is a tedious cleanup, but removes one smp_wmb() from link_to_pool() since we now use more self documenting rcu_assign_pointer(). Note the use of RCU_INIT_POINTER() instead of rcu_assign_pointer() in all cases we dont need a memory barrier. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	fib_rules: __rcu annotates ctarget	Eric Dumazet	2010-10-27
\| \| \| \| \| \| \|	Adds __rcu annotation to (struct fib_rule)->ctarget Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tunnels: add __rcu annotations	Eric Dumazet	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add __rcu annotations to : (struct ip_tunnel)->prl (struct ip_tunnel_prl_entry)->next (struct xfrm_tunnel)->next struct xfrm_tunnel tunnel4_handlers struct xfrm_tunnel tunnel64_handlers And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: add __rcu annotations to protocol	Eric Dumazet	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \|	Add __rcu annotations to : struct net_protocol inet_protos struct net_protocol inet6_protos And use appropriate casts to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: add __rcu annotations to routes.c	Eric Dumazet	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \|	Add __rcu annotations to : (struct dst_entry)->rt_next (struct rt_hash_bucket)->chain And use appropriate rcu primitives to reduce sparse warnings if CONFIG_SPARSE_RCU_POINTER=y Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	qlge: bugfix: Restoring the vlan setting.	Ron Mercer	2010-10-27
\| \| \| \| \| \|	Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	be2net: Schedule/Destroy worker thread in probe()/remove() rather than ↵	Somnath Kotur	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \|	open()/close() When async mcc compls are rcvd on an i/f that is down (and so interrupts are disabled) they just lie unprocessed in the compl queue.The compl queue can eventually get filled up and cause the BE to lock up.The fix is to use be_worker to reap mcc compls when the i/f is down.be_worker is now launched in be_probe() and canceled in be_remove(). Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: fix refcnt problem related to POSTDAD state	Ursula Braun	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After running this bonding setup script modprobe bonding miimon=100 mode=0 max_bonds=1 ifconfig bond0 10.1.1.1/16 ifenslave bond0 eth1 ifenslave bond0 eth3 on s390 with qeth-driven slaves, modprobe -r fails with this message unregister_netdevice: waiting for bond0 to become free. Usage count = 1 due to twice detection of duplicate address. Problem is caused by a missing decrease of ifp->refcnt in addrconf_dad_failure. An extra call of in6_ifa_put(ifp) solves it. Problem has been introduced with commit f2344a131bccdbfc5338e17fa71a807dee7944fa. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: NETIF_F_HW_CSUM does not imply FCoE CRC offload	Ben Hutchings	2010-10-27
\| \| \| \| \| \| \| \| \| \|	NETIF_F_HW_CSUM indicates the ability to update an TCP/IP-style 16-bit checksum with the checksum of an arbitrary part of the packet data, whereas the FCoE CRC is something entirely different. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Cc: stable@kernel.org [2.6.32+] Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Fix some corner cases in dev_can_checksum()	Ben Hutchings	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dev_can_checksum() incorrectly returns true in these cases: 1. The skb has both out-of-band and in-band VLAN tags and the device supports checksum offload for the encapsulated protocol but only with one layer of encapsulation. 2. The skb has a VLAN tag and the device supports generic checksumming but not in conjunction with VLAN encapsulation. Rearrange the VLAN tag checks to avoid these. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	gianfar: Fix crashes on RX path (Was Re: [Bugme-new] [Bug 19692] New: ↵	Jarek Poplawski	2010-10-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	linux-2.6.36-rc5 crash with gianfar ethernet at full line rate traffic) The rx_recycle queue is global per device but can be accesed by many napi handlers at the same time, so it needs full skb_queue primitives (with locking). Otherwise, various crashes caused by broken skbs are possible. This patch resolves, at least partly, bugzilla bug 19692. (Because of some doubts that there could be still something around which is hard to reproduce my proposal is to leave this bug opened for a month.) Fixes commit: 0fd56bb5be6455d0d42241e65aed057244665e5e ("gianfar: Add support for skb recycling") Reported-by: emin ak <eminak71@gmail.com> Tested-by: emin ak <eminak71@gmail.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> CC: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	IPv6: Temp addresses are immediately deleted.	Glenn Wurster	2010-10-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a bug in the interaction between ipv6_create_tempaddr and addrconf_verify. Because ipv6_create_tempaddr uses the cstamp and tstamp from the public address in creating a private address, if we have not received a router advertisement in a while, tstamp + temp_valid_lft might be < now. If this happens, the new address is created inside ipv6_create_tempaddr, then the loop within addrconf_verify starts again and the address is immediately deleted. We are left with no temporary addresses on the interface, and no more will be created until the public IP address is updated. To avoid this, set the expiry time to be the minimum of the time left on the public address or the config option PLUS the current age of the public interface. Signed-off-by: Glenn Wurster <gwurster@scs.carleton.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
*	IPv6: Create temporary address if none exists.	Glenn Wurster	2010-10-26
\| \| \| \| \| \| \| \|	If privacy extentions are enabled, but no current temporary address exists, then create one when we get a router advertisement. Signed-off-by: Glenn Wurster <gwurster@scs.carleton.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
*	fib_hash: fix rcu sparse and logical errors	Eric Dumazet	2010-10-26
\| \| \| \| \| \| \| \| \| \| \|	While fixing CONFIG_SPARSE_RCU_POINTER errors, I had to fix accesses to fz->fz_hash for real. - &fz->fz_hash[fn_hash(f->fn_key, fz)] + rcu_dereference(fz->fz_hash) + fn_hash(f->fn_key, fz) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	fib: fix fib_nl_newrule()	Eric Dumazet	2010-10-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some panic reports in fib_rules_lookup() show a rule could have a NULL pointer as a next pointer in the rules_list. This can actually happen because of a bug in fib_nl_newrule() : It checks if current rule is the destination of unresolved gotos. (Other rules have gotos to this about to be inserted rule) Problem is it does the resolution of the gotos before the rule is inserted in the rules_list (and has a valid next pointer) Fix this by moving the rules_list insertion before the changes on gotos. A lockless reader can not any more follow a ctarget pointer, unless destination is ready (has a valid next pointer) Reported-by: Oleg A. Arkhangelsky <sysoleg@yandex.ru> Reported-by: Joe Buehler <aspam@cox.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	drivers/atm/eni.c: Remove multiple uses of KERN_<level>	Joe Perches	2010-10-26
\| \| \| \| \|	Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tg3: Do not call device_set_wakeup_enable() under spin_lock_bh	Rafael J. Wysocki	2010-10-26
\| \| \| \| \| \| \| \| \| \| \| \|	The tg3 driver calls device_set_wakeup_enable() under spin_lock_bh, which causes a problem to happen after the recent core power management changes, because this function can sleep now. Fix this by moving the device_set_wakeup_enable() call out of the spin_lock_bh-protected area. Reported-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'master' of ↵	David S. Miller	2010-10-26
\|\ \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
\| *	iwlwifi: quiet a noisy printk	Don Fry	2010-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Timing issues in microcode for some devices can cause a compressed BA to be sent to the driver prior to returning any a-MPDU notification. Traces show RTS-CTS is exchanged and then the timer fires which causes an empty BA to be sent which acknowledges nothing. This results in a noisy printk. Only print the message if the bitmap is non-zero. Signed-off-by: Don Fry <donald.h.fry@intel.com> Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>