aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* igb: add pci device pointer to ring structureAlexander Duyck2009-10-28
| | | | | | | | | This patch adds a pci device pointer to the ring structure. The main use of this pointer is for memory mapping/unmapping of the rings. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: change the head and tail offsets into pointersAlexander Duyck2009-10-28
| | | | | | | | | | | | Since we are writting to the head/tail pointers frequently we might as well save ourselves some processing time by converting the head and tail offsets directly to pointers. This will shave a few cycles off the rx/tx path and allows us to move one step closer to the rings being a bit more independant of each other. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: move SRRCTL register configuration into ring specific configAlexander Duyck2009-10-28
| | | | | | | | | | The SRRCTL register exists per ring. Instead of configuring all of them in the RCTL configuration which is meant to be global it makes more sense to move this out into the ring specific configuration. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: remove rx_ps_hdr_lenAlexander Duyck2009-10-28
| | | | | | | | | | | This patch removes the rx_ps_hdr_len which isn't really needed since we can now use rx_buffer_len less than 1K to indicate that we are in a packet split mode. We also don't need it since we always use a half page for the data buffers when receiving so we always know the size to map/unmap. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: move the tx and rx ring specific config into seperate functionsAlexander Duyck2009-10-28
| | | | | | | | | This change makes the tx and rx config a bit cleaner by breaking out the ring specific configuration from the generic rx and tx configuration. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: increase minimum rx buffer size to 1KAlexander Duyck2009-10-28
| | | | | | | | | | This update increases the minimum rx buffer size to 1K. The reason for this change is to support SR-IOV and avoid any conflicts with the rings being able to set their own MTU sizes. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: remove rx checksum good counterAlexander Duyck2009-10-28
| | | | | | | | | | | Counting packets with a good checksum can cause a significant amount of cache line bouncing due to the shared counter being written to by all of the queues. In order to avoid this I am removing the counter since we still have the checksum failed counter which will tell us if there are any issues. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: add new data structure for handling interrupts and NAPIAlexander Duyck2009-10-28
| | | | | | | | | | | Add a new igb_q_vector data structure to handle interrupts and NAPI. This helps to abstract the rings away from the adapter struct. In addition it allows for a bit of consolidation since a tx and rx ring can share a q_vector. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atl1c: duplicate atl1c_get_tpdJie Yang2009-10-28
| | | | | | | remove duplicate atl1c_get_tpd, it may cause hardware to send wrong packets. Signed-off-by: Jie Yang <jie.yang@atheros.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: Remove bond_dev from xmit_hash_policy call.Jasper Spaans2009-10-27
| | | | | | | | | | | Now that the bonding device is no longer used in determining the device to which to send packets, it can be dropped from the argument list of the various xmit_hash_policy calls. Signed-off-by: Jasper Spaans <spaans@fox-it.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-10-27
|\ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/sh_eth.c
| * sh_eth: Add asm/cacheflush.hNobuhiro Iwamatsu2009-10-26
| | | | | | | | | | | | | | | | Add include asm/cacheflush.h, because declaration of __flush_purge_region moved to asm/cacheflush.h. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * PPPoE: Fix flush/close races.Michal Ostrowski2009-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Be more careful about the state of pointers during tear-down. The "pppoe_dev" field can only be looked at safely while holding socket locks. This subsequently allows for the flush_lock to be killed. We depend on the PPPOX_CONNECTED state to tell us that that those fields are valid, so whoever clears that state (pppox_unbind_sock()) is responsible for the dev_put() call. We also have to ensure that we delete_item() on all sockets before they are cleaned up. The need for these changes has been exposed by scenarios wherein namespace bindings of ethernet devices change while there are ongoing PPPoE sessions, which resulted in oopses due to unusual socket connection termination paths, exposing these issues. Signed-off-by: Michal Ostrowski <mostrows@gmail.com> Reviewed-by: Cyril Gorcunov <gorcunov@gmail.com> Reported-by: Denys Fedoryschenko <denys@visp.net.lb> Tested-by: Denys Fedoryschenko <denys@visp.net.lb>
| * e1000e: allow for swflag to be held over consecutive PHY accessesBruce Allan2009-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PCH-based parts (82577/82578) and some ICH8-based parts (82566) need to hold the swflag (sw/fw/hw hardware semaphore) over consecutive PHY accesses in order to perform sw-driven PHY configuration during initialization to workaround known hardware issues (see follow-on patch). This patch provides new PHY read/write functions (and function pointers) that will allow accessing the PHY registers assuming the swflag has already been acquired. The actual PHY register access code has moved into helper functions that are called with a flag indicating whether or not the swflag has already been acquired and acquires/releases it if not. The functions called from within the updated PHY access functions had to be updated to assume the swflag was already acquired, and other functions that called those functions were also updated to acquire/release the swflag. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: separate mutex usage between NVM and PHY/CSR register for ICHx/PCHBruce Allan2009-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | Accesses to NVM and PHY/CSR registers on ICHx/PCH-based parts are protected from concurrent accesses with a mutex that is acquired when the access is initiated and released when the access has completed. However, the two types of accesses should not be protected by the same mutex because the driver may have to access the NVM while already holding the mutex over several consecutive PHY/CSR accesses which would result in livelock. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: 82577/82578 requires a different method to configure LPLUBruce Allan2009-10-26
| | | | | | | | | | | | | | | | | | | | Unlike previous ICHx-based parts, the PCH-based parts (82577/82578) require LPLU (Low Power Link Up, or "reverse auto-negotiation") to be configured in the PHY rather than the MAC. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: increase swflag acquisition timeout for ICHx/PCHBruce Allan2009-10-26
| | | | | | | | | | | | | | | | | | | | | | | | In some conditions (e.g. when AMT is enabled on the system), it is possible to take an extended period of time to for the driver to acquire the sw/fw/hw hardware semaphore used to protect against concurrent access of a shared resource (e.g. PHY registers). This could cause PHY registers to not get configured properly resulting in link issues. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: clear PHY wakeup bit after LCD reset on 82577/82578Bruce Allan2009-10-26
| | | | | | | | | | | | | | | | | | | | Performing a dummy read of the PHY Wakeup Control (WUC) register clears the wakeup enable bit set by an PHY reset. If this bit remains set, link problems may occur. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * igbvf: fix memory leak when ring size changed while interface downAlexander Duyck2009-10-26
| | | | | | | | | | | | | | | | | | This patch resolves a memory leak which occurs while changing the ring size while the interface is down. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ixgbe: fix memory leak when resizing rings while interface is downAlexander Duyck2009-10-26
| | | | | | | | | | | | | | | | | | This patch resolves a memory leak that occurs when you resize the rings via the ethtool -G option while the interface is down. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * igb: fix memory leak when setting ring size while interface is downAlexander Duyck2009-10-26
| | | | | | | | | | | | | | | | | | | | | | | | Changing ring sizes while the interface was down was causing a double allocation of the receive and transmit rings. This issue is amplified when there are multiple rings enabled. To prevent this we need to add an additional check which will just update the ring counts when the interface is not up and skip the allocation steps. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: Modify hash transmit policies to use the packet's source MAC addressJasper Spaans2009-10-24
| | | | | | | | | | | | | | | | | | | | | | | | Modify bonding hash transmit policies to use the psource MAC address of the packet instead of the MAC address configured for the bonding device. The old sitation conflicts with the documentation. Signed-off-by: Jasper Spaans <spaans@fox-it.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * pktgen: Dont leak kernel memoryEric Dumazet2009-10-24
| | | | | | | | | | | | | | | | | | | | | | | | While playing with pktgen, I realized IP ID was not filled and a random value was taken, possibly leaking 2 bytes of kernel memory. We can use an increasing ID, this can help diagnostics anyway. Also clear packet payload, instead of leaking kernel memory. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * DM9000: Fix revision ID for DM9000BBen Dooks2009-10-24
| | | | | | | | | | | | | | | | | | | | | | The DM9000B revision ID is 0x1A, not 0x1B as set in the curernt dm9000.h header. Fix bug reported by Paolo Zebelloni. Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: Simtec Linux Team <linux@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * r8169: fix Ethernet Hangup for RTL8110SC rev dSimon Wunderlich2009-10-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 8110SC rev d chip on our board shows a regression which the 8110SB chip did not have. When inbound traffic is overflowing the receive descriptor queue, "holes" in the ring buffer may occur which lead to a hangup until the buffer is filled again. The packets are than completely processed, but the ring remains porous and no packets are processed until the next overflow. Setting the interface down and up can fix the problem temporary from userspace. For some reason we don't know, this behaviour is not occuring if the RxVlan bit for hardware VLAN untagging is set. There is another "Work around for AMD plateform" in the current code which checks the VLAN status word in receive descriptors, but does never come to effect when hardware VLAN support is enabled. We assume that this is a bug in the chip. The following patch fixes the problem. Without the patch we could reproduce the hang within minutes (given other devices also generating lots of interrupts), without we couldn't reproduce within a few days of long term testing. This version contains minor style adjustments and is sent with mutt which will hopefully not destroy the formatting again. Signed-off-by: Bernhard Schmidt <bernhard.schmidt@saxnet.de> Signed-off-by: Simon Wunderlich <simon.wunderlich@saxnet.de> Acked-by: Francois Romieu <romieu@zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ifb: should not use __dev_get_by_index() without locksEric Dumazet2009-10-23
| | | | | | | | | | | | | | | | At this point (ri_tasklet()), RTNL or dev_base_lock are not held, we must use dev_get_by_index() instead of __dev_get_by_index() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: au1000_eth: add missing capability.hManuel Lauss2009-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | fixes the following build failure: CC drivers/net/au1000_eth.o /drivers/net/au1000_eth.c: In function 'au1000_set_settings': /drivers/net/au1000_eth.c:623: error: implicit declaration of function 'capable' /drivers/net/au1000_eth.c:623: error: 'CAP_NET_ADMIN' undeclared (first use in this function) /drivers/net/au1000_eth.c:623: error: (Each undeclared identifier is reported only once /drivers/net/au1000_eth.c:623: error: for each function it appears in. Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * myri10ge: improve port type reporting in ethtoolBrice Goglin2009-10-23
| | | | | | | | | | | | | | | | | | Improve the reporting of myri10ge port type in ethtool, and update for new boards. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Andrew Gallatin <gallatin@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: use WARN() for the WARN_ON in commit b6b39e8f3fbbbArjan van de Ven2009-10-23
| | | | | | | | | | | | | | | | | | | | | | | | Commit b6b39e8f3fbbb (tcp: Try to catch MSG_PEEK bug) added a printk() to the WARN_ON() that's in tcp.c. This patch changes this combination to WARN(); the advantage of WARN() is that the printk message shows up inside the message, so that kerneloops.org will collect the message. In addition, this gets rid of an extra if() statement. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: reset the PHY on 82577/82578 when going to SxBruce Allan2009-10-23
| | | | | | | | | | | | | | | | | | | | | | | | The PHY on 82577/82578 parts needs a soft reset when transitioning to Sx state in order for the PHY write which disables gigabit speed to take effect. Gigabit speed must be disabled in order for the PHY writes to registers on page 800 (the wakeup control registers) to work as expected otherwise the system might not wake via WoL. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * isdn: fix possible circular locking dependencyXiaotian Feng2009-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a circular locking dependency: ---> isdn_net_get_locked_lp     --->lock &nd->queue_lock     --->lock &nd->queue->xmit_lock     .....................     ---->unlock &nd->queue_lock ---> isdn_net_writebuf_skb (called with &nd->queue->xmit_lock locked)     ---->isdn_net_inc_frame_cnt          ---->isdn_net_device_busy               ----> lock &nd->queue_lock This will trigger lockdep warnings:  =======================================================  [ INFO: possible circular locking dependency detected ]  2.6.32-rc4-testing #7  -------------------------------------------------------  ipppd/28379 is trying to acquire lock:  (&netdev->queue_lock){......}, at: [<e62ad0fd>] isdn_net_device_busy+0x2c/0x74 [isdn]  but task is already holding lock:  (&netdev->local->xmit_lock){+.....}, at: [<e62aefc2>] isdn_net_write_super+0x3f/0x6e [isdn]  which lock already depends on the new lock. ....... We don't need to lock nd->queue->xmit_lock to protect single isdn_net_lp_busy(). This can fix above lockdep warnings. Reported-and-tested-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Xiaotian Feng <xtfeng@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netxen: avoid undue board config checkDhananjay Phadke2009-10-22
| | | | | | | | | | | | | | | | | | | | Old code assumed board config version in the flash to be 1. When this will get changed by tools, driver just refuses to attach. This is unnecessary since driver does not have to parse board config structure directly (maintained by firmware). Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netxen: fix tx timeout handling on firmware hangAmit Kumar Salecha2009-10-22
| | | | | | | | | | | | | | | | | | | | Clear NX_RESETING bit in netxen_tx_timeout_task() so that the firmware watchdog task can catch need_reset request from tx timeout. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netxen: fix i2c initDhananjay Phadke2009-10-22
| | | | | | | | | | | | | | | | Avoid resetting subsys ID in i2c block. Also remove duplicate check for address tranlsation error. Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was ↵Joyce Yu2009-10-21
| | | | | | | | | | | | | | copied to the head buffer in the Vlan packets case Signed-off-by: Joyce Yu <joyce.yu@sun.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICASTBen Dooks2009-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In ks8851_set_rx_mode() the case handling IFF_MULTICAST was also setting the RXCR1_AE bit by accident. This meant that all unicast frames where being accepted by the device. Remove RXCR1_AE from this case. Note, RXCR1_AE was also masking a problem with setting the MAC address properly, so needs to be applied after fixing the MAC write order. Fixes a bug reported by Doong, Ping of Micrel. This version of the patch avoids setting RXCR1_ME for all cases. Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * KS8851: Fix MAC address write orderBen Dooks2009-10-20
| | | | | | | | | | | | | | | | | | | | | | | | The MAC address register was being written in the wrong order, so add a new address macro to convert mac-address byte to register address and a ks8851_wrreg8() function to write each byte without having to worry about any difficult byte swapping. Fixes a bug reported by Doong, Ping of Micrel. Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * KS8851: Add soft reset at probe timeBen Dooks2009-10-20
| | | | | | | | | | | | | | | | | | | | | | | | Issue a full soft reset at probe time. This was reported by Doong Ping of Micrel, but no explanation of why this is necessary or what bug it is fixing. Add it as it does not seem to hurt the current driver and ensures that the device is in a known state when we start setting it up. Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: fix section mismatch in fec.cSteven King2009-10-20
| | | | | | | | | | | | | | | | fec_enet_init is called by both fec_probe and fec_resume, so it shouldn't be marked as __init. Signed-off-by: Steven King <sfking@fdwdc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Fix struct inet_timewait_sock bitfield annotationEric Dumazet2009-10-20
| | | | | | | | | | | | | | | | | | | | | | commit 9e337b0f (net: annotate inet_timewait_sock bitfields) added 4/8 bytes in struct inet_timewait_sock. Fix this by declaring tw_ipv6_offset in the 'flags' bitfield The 14 bits hole is named tw_pad to make it cleary apparent. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: Try to catch MSG_PEEK bugHerbert Xu2009-10-20
| | | | | | | | | | | | | | | | | | This patch tries to print out more information when we hit the MSG_PEEK bug in tcp_recvmsg. It's been around since at least 2005 and it's about time that we finally fix it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Fix IP_MULTICAST_IFEric Dumazet2009-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ipv4/ipv6 setsockopt(IP_MULTICAST_IF) have dubious __dev_get_by_index() calls. This function should be called only with RTNL or dev_base_lock held, or reader could see a corrupt hash chain and eventually enter an endless loop. Fix is to call dev_get_by_index()/dev_put(). If this happens to be performance critical, we could define a new dev_exist_by_index() function to avoid touching dev refcount. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bluetooth: static lock key fixDave Young2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When shutdown ppp connection, lockdep waring about non-static key will happen, it is caused by the lock is not initialized properly at that time. Fix with tuning the lock/skb_queue_head init order [ 94.339261] INFO: trying to register non-static key. [ 94.342509] the code is fine but needs lockdep annotation. [ 94.342509] turning off the locking correctness validator. [ 94.342509] Pid: 0, comm: swapper Not tainted 2.6.31-mm1 #2 [ 94.342509] Call Trace: [ 94.342509] [<c0248fbe>] register_lock_class+0x58/0x241 [ 94.342509] [<c024b5df>] ? __lock_acquire+0xb57/0xb73 [ 94.342509] [<c024ab34>] __lock_acquire+0xac/0xb73 [ 94.342509] [<c024b7fa>] ? lock_release_non_nested+0x17b/0x1de [ 94.342509] [<c024b662>] lock_acquire+0x67/0x84 [ 94.342509] [<c04cd1eb>] ? skb_dequeue+0x15/0x41 [ 94.342509] [<c054a857>] _spin_lock_irqsave+0x2f/0x3f [ 94.342509] [<c04cd1eb>] ? skb_dequeue+0x15/0x41 [ 94.342509] [<c04cd1eb>] skb_dequeue+0x15/0x41 [ 94.342509] [<c054a648>] ? _read_unlock+0x1d/0x20 [ 94.342509] [<c04cd641>] skb_queue_purge+0x14/0x1b [ 94.342509] [<fab94fdc>] l2cap_recv_frame+0xea1/0x115a [l2cap] [ 94.342509] [<c024b5df>] ? __lock_acquire+0xb57/0xb73 [ 94.342509] [<c0249c04>] ? mark_lock+0x1e/0x1c7 [ 94.342509] [<f8364963>] ? hci_rx_task+0xd2/0x1bc [bluetooth] [ 94.342509] [<fab95346>] l2cap_recv_acldata+0xb1/0x1c6 [l2cap] [ 94.342509] [<f8364997>] hci_rx_task+0x106/0x1bc [bluetooth] [ 94.342509] [<fab95295>] ? l2cap_recv_acldata+0x0/0x1c6 [l2cap] [ 94.342509] [<c02302c4>] tasklet_action+0x69/0xc1 [ 94.342509] [<c022fbef>] __do_softirq+0x94/0x11e [ 94.342509] [<c022fcaf>] do_softirq+0x36/0x5a [ 94.342509] [<c022fe14>] irq_exit+0x35/0x68 [ 94.342509] [<c0204ced>] do_IRQ+0x72/0x89 [ 94.342509] [<c02038ee>] common_interrupt+0x2e/0x34 [ 94.342509] [<c024007b>] ? pm_qos_add_requirement+0x63/0x9d [ 94.342509] [<c038e8a5>] ? acpi_idle_enter_bm+0x209/0x238 [ 94.342509] [<c049d238>] cpuidle_idle_call+0x5c/0x94 [ 94.342509] [<c02023f8>] cpu_idle+0x4e/0x6f [ 94.342509] [<c0534153>] rest_init+0x53/0x55 [ 94.342509] [<c0781894>] start_kernel+0x2f0/0x2f5 [ 94.342509] [<c0781091>] i386_start_kernel+0x91/0x96 Reported-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Tested-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bluetooth: scheduling while atomic bug fixDave Young2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to driver core changes dev_set_drvdata will call kzalloc which should be in might_sleep context, but hci_conn_add will be called in atomic context Like dev_set_name move dev_set_drvdata to work queue function. oops as following: Oct 2 17:41:59 darkstar kernel: [ 438.001341] BUG: sleeping function called from invalid context at mm/slqb.c:1546 Oct 2 17:41:59 darkstar kernel: [ 438.001345] in_atomic(): 1, irqs_disabled(): 0, pid: 2133, name: sdptool Oct 2 17:41:59 darkstar kernel: [ 438.001348] 2 locks held by sdptool/2133: Oct 2 17:41:59 darkstar kernel: [ 438.001350] #0: (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<faa1d2f5>] lock_sock+0xa/0xc [l2cap] Oct 2 17:41:59 darkstar kernel: [ 438.001360] #1: (&hdev->lock){+.-.+.}, at: [<faa20e16>] l2cap_sock_connect+0x103/0x26b [l2cap] Oct 2 17:41:59 darkstar kernel: [ 438.001371] Pid: 2133, comm: sdptool Not tainted 2.6.31-mm1 #2 Oct 2 17:41:59 darkstar kernel: [ 438.001373] Call Trace: Oct 2 17:41:59 darkstar kernel: [ 438.001381] [<c022433f>] __might_sleep+0xde/0xe5 Oct 2 17:41:59 darkstar kernel: [ 438.001386] [<c0298843>] __kmalloc+0x4a/0x15a Oct 2 17:41:59 darkstar kernel: [ 438.001392] [<c03f0065>] ? kzalloc+0xb/0xd Oct 2 17:41:59 darkstar kernel: [ 438.001396] [<c03f0065>] kzalloc+0xb/0xd Oct 2 17:41:59 darkstar kernel: [ 438.001400] [<c03f04ff>] device_private_init+0x15/0x3d Oct 2 17:41:59 darkstar kernel: [ 438.001405] [<c03f24c5>] dev_set_drvdata+0x18/0x26 Oct 2 17:41:59 darkstar kernel: [ 438.001414] [<fa51fff7>] hci_conn_init_sysfs+0x40/0xd9 [bluetooth] Oct 2 17:41:59 darkstar kernel: [ 438.001422] [<fa51cdc0>] ? hci_conn_add+0x128/0x186 [bluetooth] Oct 2 17:41:59 darkstar kernel: [ 438.001429] [<fa51ce0f>] hci_conn_add+0x177/0x186 [bluetooth] Oct 2 17:41:59 darkstar kernel: [ 438.001437] [<fa51cf8a>] hci_connect+0x3c/0xfb [bluetooth] Oct 2 17:41:59 darkstar kernel: [ 438.001442] [<faa20e87>] l2cap_sock_connect+0x174/0x26b [l2cap] Oct 2 17:41:59 darkstar kernel: [ 438.001448] [<c04c8df5>] sys_connect+0x60/0x7a Oct 2 17:41:59 darkstar kernel: [ 438.001453] [<c024b703>] ? lock_release_non_nested+0x84/0x1de Oct 2 17:41:59 darkstar kernel: [ 438.001458] [<c028804b>] ? might_fault+0x47/0x81 Oct 2 17:41:59 darkstar kernel: [ 438.001462] [<c028804b>] ? might_fault+0x47/0x81 Oct 2 17:41:59 darkstar kernel: [ 438.001468] [<c033361f>] ? __copy_from_user_ll+0x11/0xce Oct 2 17:41:59 darkstar kernel: [ 438.001472] [<c04c9419>] sys_socketcall+0x82/0x17b Oct 2 17:41:59 darkstar kernel: [ 438.001477] [<c020329d>] syscall_call+0x7/0xb Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: fix TCP_DEFER_ACCEPT retrans calculationJulian Anastasov2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | Fix TCP_DEFER_ACCEPT conversion between seconds and retransmission to match the TCP SYN-ACK retransmission periods because the time is converted to such retransmissions. The old algorithm selects one more retransmission in some cases. Allow up to 255 retransmissions. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPTJulian Anastasov2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change SYN-ACK retransmitting code for the TCP_DEFER_ACCEPT users to not retransmit SYN-ACKs during the deferring period if ACK from client was received. The goal is to reduce traffic during the deferring period. When the period is finished we continue with sending SYN-ACKs (at least one) but this time any traffic from client will change the request to established socket allowing application to terminate it properly. Also, do not drop acked request if sending of SYN-ACK fails. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: accept socket after TCP_DEFER_ACCEPT periodJulian Anastasov2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Willy Tarreau and many other folks in recent years were concerned what happens when the TCP_DEFER_ACCEPT period expires for clients which sent ACK packet. They prefer clients that actively resend ACK on our SYN-ACK retransmissions to be converted from open requests to sockets and queued to the listener for accepting after the deferring period is finished. Then application server can decide to wait longer for data or to properly terminate the connection with FIN if read() returns EAGAIN which is an indication for accepting after the deferring period. This change still can have side effects for applications that expect always to see data on the accepted socket. Others can be prepared to work in both modes (with or without TCP_DEFER_ACCEPT period) and their data processing can ignore the read=EAGAIN notification and to allocate resources for clients which proved to have no data to send during the deferring period. OTOH, servers that use TCP_DEFER_ACCEPT=1 as flag (not as a timeout) to wait for data will notice clients that didn't send data for 3 seconds but that still resend ACKs. Thanks to Willy Tarreau for the initial idea and to Eric Dumazet for the review and testing the change. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Revert "tcp: fix tcp_defer_accept to consider the timeout"David S. Miller2009-10-19
| | | | | | | | | | | | | | | | | | This reverts commit 6d01a026b7d3009a418326bdcf313503a314f1ea. Julian Anastasov, Willy Tarreau and Eric Dumazet have come up with a more correct way to deal with this. Signed-off-by: David S. Miller <davem@davemloft.net>
| * AF_UNIX: Fix deadlock on connecting to shutdown socketTomoki Sekiyama2009-10-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I found a deadlock bug in UNIX domain socket, which makes able to DoS attack against the local machine by non-root users. How to reproduce: 1. Make a listening AF_UNIX/SOCK_STREAM socket with an abstruct namespace(*), and shutdown(2) it. 2. Repeat connect(2)ing to the listening socket from the other sockets until the connection backlog is full-filled. 3. connect(2) takes the CPU forever. If every core is taken, the system hangs. PoC code: (Run as many times as cores on SMP machines.) int main(void) { int ret; int csd; int lsd; struct sockaddr_un sun; /* make an abstruct name address (*) */ memset(&sun, 0, sizeof(sun)); sun.sun_family = PF_UNIX; sprintf(&sun.sun_path[1], "%d", getpid()); /* create the listening socket and shutdown */ lsd = socket(AF_UNIX, SOCK_STREAM, 0); bind(lsd, (struct sockaddr *)&sun, sizeof(sun)); listen(lsd, 1); shutdown(lsd, SHUT_RDWR); /* connect loop */ alarm(15); /* forcely exit the loop after 15 sec */ for (;;) { csd = socket(AF_UNIX, SOCK_STREAM, 0); ret = connect(csd, (struct sockaddr *)&sun, sizeof(sun)); if (-1 == ret) { perror("connect()"); break; } puts("Connection OK"); } return 0; } (*) Make sun_path[0] = 0 to use the abstruct namespace. If a file-based socket is used, the system doesn't deadlock because of context switches in the file system layer. Why this happens: Error checks between unix_socket_connect() and unix_wait_for_peer() are inconsistent. The former calls the latter to wait until the backlog is processed. Despite the latter returns without doing anything when the socket is shutdown, the former doesn't check the shutdown state and just retries calling the latter forever. Patch: The patch below adds shutdown check into unix_socket_connect(), so connect(2) to the shutdown socket will return -ECONREFUSED. Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com> Signed-off-by: Masanori Yoshida <masanori.yoshida.tv@hitachi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ethoc: clear only pending irqsThomas Chou2009-10-19
| | | | | | | | | | | | | | | | | | This patch fixed the problem of dropped packets due to lost of interrupt requests. We should only clear what was pending at the moment we read the irq source reg. Signed-off-by: Thomas Chou <thomas@wytron.com.tw> Signed-off-by: David S. Miller <davem@davemloft.net>