aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* e1000e: fix IPMI trafficJeff Kirsher2008-11-16
| | | | | | | | | | | | | | | Some users reported that they have machines with BMCs enabled that cannot receive IPMI traffic after e1000e is loaded. http://marc.info/?l=e1000-devel&m=121909039127414&w=2 http://marc.info/?l=e1000-devel&m=121365543823387&w=2 This fixes the issue if they load with the new parameter = 0 by disabling crc stripping, but leaves the performance feature on for most users. Based on work done by Hong Zhang. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* e1000e: fix warn_on reload after phy_id errorJeff Kirsher2008-11-16
| | | | | | | | | | If the driver fails to initialize the first time due to the failure in the phy_id check the kernel triggers a warn_on on the second try to load the driver because the driver did not free the msi/x resources in the first load because of the previous failure in phy_id check. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* phy: fix phy address bugGiulio Benetti2008-11-16
| | | | | | | | PHYID returns 0xffff and not 0xffffffff when not found and in some case(at91sam9263) 0x0. Maybe this patch could be useful. Signed-off-by: Giulio Benetti <giulio.benetti@micronovasrl.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* e100: fix dma error in direction for mappingJesse Brandeburg2008-11-16
| | | | | | | | | | | | | | The e100 driver triggers BUG_ON(buf->direction != dir) by doing pci_map_single(..., PCI_DMA_BIDIRECTIONAL) and pci_dma_sync_single_for_device(..., PCI_DMA_TODEVICE). Changing the DMA direction, especially with dmabounce will result in unexpected behaviour. Reported-by: Anders Grafstrom <grfstrm@users.sourceforge.net> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: use dev_printk instead of printkBjorn Helgaas2008-11-16
| | | | | | | | Use dev_printk() instead of printk() to give a little more context and use consistent format. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* qla3xxx: Cleanup: Fix link print statements.Ron Mercer2008-11-16
| | | | | | | Removed debug print statements and improved conditionals around informational statements. Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* igb: Use device_set_wakeup_enable\"Rafael J. Wysocki\2008-11-16
| | | | | | | | | | | | Since dev->power.should_wakeup bit is used by the PCI core to decide whether the device should wake up the system from sleep states, set/unset this bit whenever WOL is enabled/disabled using igb_set_wol(). Accordingly, use device_can_wakeup() for checking if wake-up is supported by the device. Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* e1000: Use device_set_wakeup_enable\"Rafael J. Wysocki\2008-11-16
| | | | | | | | | | | | Since dev->power.should_wakeup bit is used by the PCI core to decide whether the device should wake up the system from sleep states, set/unset this bit whenever WOL is enabled/disabled using e1000_set_wol(). Accordingly, use device_can_wakeup() for checking if wake-up is supported by the device. Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* e1000e: Use device_set_wakeup_enable\"Rafael J. Wysocki\2008-11-16
| | | | | | | | | | | | Since dev->power.should_wakeup bit is used by the PCI core to decide whether the device should wake up the system from sleep states, set/unset this bit whenever WOL is enabled/disabled using e1000_set_wol(). Accordingly, use device_can_wakeup() for checking if wake-up is supported by the device. Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* via-velocity: enable perfect filtering for multicast packetsJoey Zhuo2008-11-16
| | | | | | Signed-off-by: Joey Zhuo <joeyzhuo@via.com.tw> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* phy: Add support for Marvell 88E1118 PHYRon Madrid2008-11-15
| | | | | | | This patch will add support for the Marvell 88E1118 PHY which supports gigabit ethernet among other things. Signed-off-by: Ron Madrid <ron_madrid@sbcglobal.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4_en: Pause parameters per portYevgeny Petrilin2008-11-15
| | | | | | | | Before the change the driver reported the same pause parameters for all the ports, even only one of them was modified. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'davem-fixes' of ↵David S. Miller2008-11-14
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
| * phylib: fix premature freeing of struct mii_busLennert Buytenhek2008-11-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 46abc02175b3c246dd5141d878f565a8725060c9 ("phylib: give mdio buses a device tree presence") added a call to device_unregister() in a situation where the caller did not intend for the device to be freed yet, but apart from just unregistering the device from the system, device_unregister() does an additional put_device() that is intended to free it. The right function to use in this situation is device_del(), which unregisters the device from the system like device_unregister() does, but without dropping the reference count an additional time. Bug report from Bryan Wu <cooloney@kernel.org>. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * atl1: Do not enumerate options unsupported by chipJ. K. Cliburn2008-11-14
| | | | | | | | | | | | | | | | | | Of the various WOL options provided in include/linux/ethtool.h, the L1 NIC supports only magic packet. Remove all options except magic packet from the atl1 driver. Signed-off-by: Jay Cliburn <jcliburn@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * atl1e: fix broken multicast by removing unnecessary crc inversionJ. K. Cliburn2008-11-14
| | | | | | | | | | | | | | | | | | | | Inverting the crc after calling ether_crc_le() is unnecessary and breaks multicast. Remove it. Tested-by: David Madore <david.madore@ens.fr> Signed-off-by: Jay Cliburn <jcliburn@gmail.com> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * gianfar: Fix DMA unmap invocationsAndy Fleming2008-11-14
| | | | | | | | | | | | | | | | We weren't unmapping DMA memory, which will break when gianfar gets used on systems with more than 32-bits of memory. Also, it's just plain wrong. Signed-off-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * net/ucc_geth: Fix oops in uec_get_ethtool_stats()Anton Vorontsov2008-11-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | p_{tx,rx}_fw_statistics_pram are special: they're available only when a device is open. If the device is closed, we should just fill the data with zeroes. Fixes the following oops: root@b1:~# ifconfig eth1 down root@b1:~# ethtool -S eth1 Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc01e1dcc Oops: Kernel access of bad area, sig: 11 [#1] [...] NIP [c01e1dcc] uec_get_ethtool_stats+0x98/0x124 LR [c0287cc8] ethtool_get_stats+0xfc/0x23c Call Trace: [cfaadde0] [c0287ca8] ethtool_get_stats+0xdc/0x23c (unreliable) [cfaade20] [c0288340] dev_ethtool+0x2fc/0x588 [cfaade50] [c0285648] dev_ioctl+0x290/0x33c [cfaadea0] [c0272238] sock_ioctl+0x80/0x2ec [cfaadec0] [c00b5ae4] vfs_ioctl+0x40/0xc0 [cfaadee0] [c00b5fa8] do_vfs_ioctl+0x78/0x20c [cfaadf10] [c00b617c] sys_ioctl+0x40/0x74 [cfaadf40] [c00142d8] ret_from_syscall+0x0/0x38 [...] ---[ end trace b941007b2dfb9759 ]--- Segmentation fault p.s. While at it, also remove u64 casts, they aren't needed. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* | scm: fix scm_fp_list->list initialization made in wrong placePavel Emelyanov2008-11-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the next page of the scm recursion story (the commit f8d570a4 net: Fix recursive descent in __scm_destroy()). In function scm_fp_dup(), the INIT_LIST_HEAD(&fpl->list) of newly created fpl is done *before* the subsequent memcpy from the old structure and thus the freshly initialized list is overwritten. But that's OK, since this initialization is not required at all, since the fpl->list is list_add-ed at the destruction time in any case (and is unused in other code), so I propose to drop both initializations, rather than moving it after the memcpy. Please, correct me if I miss something significant. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | niu: Bump driver version and release date.David S. Miller2008-11-14
| | | | | | | | | | | | | | | | | | | | This driver is pretty mature, and the worst of the known problems has been fixed (the 32-bit failures due to readq implementation). So let's finally give it a version of 1.0 Signed-off-by: David S. Miller <davem@davemloft.net>
* | NIU: Add Sun CP3260 ATCA blade supportSantwona Behera2008-11-14
| | | | | | | | | | | | | | | | | | This patch adds support for the Sun CP3260 ATCA blade which is a N2 based ATCA blade with 2 NIU ports. The NIU ports do not have on-board PHY. Signed-off-by: Santwona Behera <santwona.behera@sun.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | lockdep: include/linux/lockdep.h - fix warning in net/bluetooth/af_bluetooth.cIngo Molnar2008-11-14
|/ | | | | | | | | | | | | | | | | | | | fix this warning: net/bluetooth/af_bluetooth.c:60: warning: ‘bt_key_strings’ defined but not used net/bluetooth/af_bluetooth.c:71: warning: ‘bt_slock_key_strings’ defined but not used this is a lockdep macro problem in the !LOCKDEP case. We cannot convert it to an inline because the macro works on multiple types, but we can mark the parameter used. [ also clean up a misaligned tab in sock_lock_init_class_and_name() ] [ also remove #ifdefs from around af_family_clock_key strings - which were certainly added to get rid of the ugly build warnings. ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
* 9p: restrict RDMA usageRandy Dunlap2008-11-13
| | | | | | | | | | | | | | | | | | | | | | Make 9p's RDMA option depend on INET since it uses Infiniband rdma_* functions and that code depends on INET. Otherwise 9p can try to use symbols which don't exist. ERROR: "rdma_destroy_id" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_connect" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_create_id" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_create_qp" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_resolve_route" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_disconnect" [net/9p/9pnet_rdma.ko] undefined! ERROR: "rdma_resolve_addr" [net/9p/9pnet_rdma.ko] undefined! I used an if/endif block so that the menu items would remain presented together. Also correct an article adjective. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: shy netns_ok checkAlexey Dobriyan2008-11-13
| | | | | | | | | | | Failure to pass netns_ok check is SILENT, except some MIB counter is incremented somewhere. And adding "netns_ok = 1" (after long head-scratching session) is usually the last step in making some protocol netns-ready... Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: routing header fixesBrian Haley2008-11-13
| | | | | | | | | | | | | | | This patch fixes two bugs: 1. setsockopt() of anything but a Type 2 routing header should return EINVAL instead of EPERM. Noticed by Shan Wei (shanwei@cn.fujitsu.com). 2. setsockopt()/sendmsg() of a Type 2 routing header with invalid length or segments should return EINVAL. These values are statically fixed in RFC 3775, unlike the variable Type 0 was. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2: fix poll_controller to pass proper structures and check all rx queuesNeil Horman2008-11-12
| | | | | | | | | | | | | Fix bnx2 so that netpoll works properly. Specifically: 1) Fix parameters to bnx2_interrupt to be a struct bnx2_napi rather than a struct net_device 2) Fix poll_controller method to check every queue in the rx case so frames aren't missed Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2008-11-12
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
| * hostap: pad the skb->cb usage in lieu of a proper fixJohannes Berg2008-11-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like mac80211 did, this driver makes 'clever' use of skb->cb to pass information along with an skb as it is requeued from the virtual device to the physical wireless device. Unfortunately, that trick no longer works... Unlike mac80211, code complexity and driver apathy makes this hack the best option we have in the short run. Hopefully someone will eventually be motivated to code a proper fix before all the effected hardware dies. (Above text by me. Johannes officially disavows all knowledge of this hack. -- JWL) Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * rtl8187 : support for Sitecom WL-168 0001 v4Bob Jolliffe2008-11-12
| | | | | | | | | | | | | | the Sitecom 0001 v4 with product id 0x0df6:0028, uses Realtek's RTL8187B and work fine with new 2.6.27 driver. Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * mac80211: fix notify_mac functionJohannes Berg2008-11-12
| | | | | | | | | | | | | | | | | | | | The ieee80211_notify_mac() function uses ieee80211_sta_req_auth() which in turn calls ieee80211_set_disassoc() which calls a few functions that need to be able to sleep, so ieee80211_notify_mac() cannot use RCU locking for the interface list and must use rtnl locking instead. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * rtl8187: Add Abocom USB IDIvan Kuten2008-11-12
| | | | | | | | | | | | Signed-off-by: Ivan Kuten <ivan.kuten@promwad.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | niu: Fix readq implementation when architecture does not provide one.David S. Miller2008-11-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a TX hang reported by Jesper Dangaard Brouer. When an architecutre cannot provide a fully functional 64-bit atomic readq/writeq, the driver must implement it's own. This is because only the driver can say whether doing something like using two 32-bit reads to implement the full 64-bit read will actually work properly. In particular one of the issues is whether the top 32-bits or the bottom 32-bits of the 64-bit register should be read first. There could be side effects, and in fact that is exactly the problem here. The TX_CS register has counters in the upper 32-bits and state bits in the lower 32-bits. A read clears the state bits. We would read the counter half before the state bit half. That first read would clear the state bits, and then the driver thinks that no interrupts are pending because the interrupt indication state bits are seen clear every time. Fix this by reading the bottom half before the upper half. Tested-by: Jesper Dangaard Brouer <jdb@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: put_cmsg_compat + SO_TIMESTAMP[NS]: use same name for value as callerPatrick Ohly2008-11-12
| | | | | | | | | | | | | | | | In __sock_recv_timestamp() the additional SCM_TIMESTAMP[NS] is used. This has the same value as SO_TIMESTAMP[NS], so this is a purely cosmetic change. Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | tcp_htcp: last_cong bug fixDoug Leith2008-11-12
|/ | | | | | | | | | | | | | | | This patch fixes a minor bug in tcp_htcp.c which has been highlighted by Lachlan Andrew and Lawrence Stewart. Currently, the time since the last congestion event, which is stored in variable last_cong, is reset whenever there is a state change into TCP_CA_Open. This includes transitions of the type TCP_CA_Open->TCP_CA_Disorder->TCP_CA_Open which are not associated with backoff of cwnd. The patch changes last_cong to be updated only on transitions into TCP_CA_Open that occur after experiencing the congestion-related states TCP_CA_Loss, TCP_CA_Recovery, TCP_CA_CWR. Signed-off-by: Doug Leith <doug.leith@nuim.ie> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'davem-fixes' of ↵David S. Miller2008-11-11
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
| * [netdrvr] smc911x: fix for driver resume (and compilation warning)Dasgupta, Romit2008-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am trying out suspend, resume on an OMAP3 based board. What I see during resume is that the SMC911x driver resume routing gets stuck after trying to transmit the packet out of the controller. Some debug messages below: --> smc911x_drv_resume eth0: --> smc911x_reset eth0: smc911x_reset timeout waiting for PM restore eth0: --> smc911x_enable eth0: --> smc911x_phy_configure() eth0: --> smc911x_phy_reset() eth0: phy caps=0x782d eth0: phy advertised caps=0x0de1 eth0: --> smc911x_phy_check_media smc911x_phy_read: phyaddr=0x1, phyreg=0x01, phydata=0x7809 smc911x_phy_read: phyaddr=0x1, phyreg=0x01, phydata=0x7809 eth0: link down Restarting tasks ... eth0: --> smc911x_hard_start_xmit eth0: --> smc911x_hardware_send_pkt eth0: --> smc911x_hard_start_xmit eth0: --> smc911x_hardware_send_pkt eth0: --> smc911x_hard_start_xmit eth0: --> smc911x_hardware_send_pkt nfs: server 172.24.190.217 not responding, still trying nfs: server 172.24.190.217 not responding, still trying The following change makes it work fine: (The change within smc911x_drv_probe function was to get rid of a compilation warning). Signed-off-by: Romit Dasgupta <romit@ti.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * RDMA/cxgb3: deadlock in iw_cxgb3 can cause hang when configuring interface.Steve Wise2008-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the iw_cxgb3 module's cxgb3_client "add" func gets called by the cxgb3 module, the iwarp driver ends up calling the ethtool ops get_drvinfo function in cxgb3 to get the fw version and other info. Currently the iwarp driver grabs the rtnl lock around this down call to serialize. As of 2.6.27 or so, things changed such that the rtnl lock is held around the call to the netdev driver open function. Also the cxgb3_client "add" function doesn't get called if the device is down. So, if you load cxgb3, then load iw_cxgb3, then ifconfig up the device, the iw_cxgb3 add func gets called with the rtnl_lock held. If you load cxgb3, ifconfig up the device, then load iw_cxgb3, the add func gets called without the rtnl_lock held. The former causes the deadlock, the latter does not. In addition, there are iw_cxgb3 sysfs handlers that also can call down into cxgb3 to gather the fw and hw versions. These can be called concurrently on different processors and at any time. Thus we need to push this serialization down in the cxgb3 driver get_drvinfo func. The fix is to remove rtnl lock usage, and use a per-device lock in cxgb3. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * cxgb3 - Limit multiqueue setting to msi-xDivy Le Ray2008-11-11
| | | | | | | | | | | | | | Allow multiqueue setting in MSI-X mode only Signed-off-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * cxgb3 - eeprom read fixesDivy Le Ray2008-11-11
| | | | | | | | | | | | | | | | Protect against invalid phy entries in the eeprom. Extend eeprom access timeout. Signed-off-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
| * myri10ge: fix stop/go ordering even moreBrice Goglin2008-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | The doorbell writes may be seen out of order by the firmware if they are in WC memory since the tx spin(un)lock does not flush WC writes. Hence if the "stop" is written on a different CPU than the "go", it is possible that the stop will arrive after the go unless we add an explicit memory barrier (and mmiowb() is not enough). It fixes transmit hangs in multi tx queue mode. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* | Merge branch 'timers-fixes-for-linus' of ↵Linus Torvalds2008-11-11
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers: handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq context nohz: disable tick_nohz_kick_tick() for now irq: call __irq_enter() before calling the tick_idle_check x86: HPET: enter hpet_interrupt_handler with interrupts disabled x86: HPET: read from HPET_Tn_CMP() not HPET_T0_CMP x86: HPET: convert WARN_ON to WARN_ON_ONCE
| * | timers: handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq contextGautham R Shenoy2008-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix incorrect locking triggered during hotplug-intense stress-tests While migrating the the CB_IRQSAFE_UNLOCKED timers during a cpu-offline, we queue them on the cb_pending list, so that they won't go stale. Thus, when the callbacks of the timers run from the softirq context, they could run into potential deadlocks, since these callbacks assume that they're running with irq's disabled, thereby annoying lockdep! Fix this by emulating hardirq context while running these callbacks from the hrtimer softirq. ================================= [ INFO: inconsistent lock state ] 2.6.27 #2 -------------------------------- inconsistent {in-hardirq-W} -> {hardirq-on-W} usage. ksoftirqd/0/4 [HC0[0]:SC1[1]:HE1:SE0] takes: (&rq->lock){++..}, at: [<c011db84>] sched_rt_period_timer+0x9e/0x1fc {in-hardirq-W} state was registered at: [<c014103c>] __lock_acquire+0x549/0x121e [<c0107890>] native_sched_clock+0x88/0x99 [<c013aa12>] clocksource_get_next+0x39/0x3f [<c0139abc>] update_wall_time+0x616/0x7df [<c0141d6b>] lock_acquire+0x5a/0x74 [<c0121724>] scheduler_tick+0x3a/0x18d [<c047ed45>] _spin_lock+0x1c/0x45 [<c0121724>] scheduler_tick+0x3a/0x18d [<c0121724>] scheduler_tick+0x3a/0x18d [<c012c436>] update_process_times+0x3a/0x44 [<c013c044>] tick_periodic+0x63/0x6d [<c013c062>] tick_handle_periodic+0x14/0x5e [<c010568c>] timer_interrupt+0x44/0x4a [<c0150c9f>] handle_IRQ_event+0x13/0x3d [<c0151c14>] handle_level_irq+0x79/0xbd [<c0105634>] do_IRQ+0x69/0x7d [<c01041e4>] common_interrupt+0x28/0x30 [<c047007b>] aac_probe_one+0x1a3/0x3f3 [<c047ec2d>] _spin_unlock_irqrestore+0x36/0x39 [<c01512b4>] setup_irq+0x1be/0x1f9 [<c065d70b>] start_kernel+0x259/0x2c5 [<ffffffff>] 0xffffffff irq event stamp: 50102 hardirqs last enabled at (50102): [<c047ebf4>] _spin_unlock_irq+0x20/0x23 hardirqs last disabled at (50101): [<c047edc2>] _spin_lock_irq+0xa/0x4b softirqs last enabled at (50088): [<c0128ba6>] do_softirq+0x37/0x4d softirqs last disabled at (50099): [<c0128ba6>] do_softirq+0x37/0x4d other info that might help us debug this: no locks held by ksoftirqd/0/4. stack backtrace: Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.27 #2 [<c013f6cb>] print_usage_bug+0x13e/0x147 [<c013fef5>] mark_lock+0x493/0x797 [<c01410b1>] __lock_acquire+0x5be/0x121e [<c0141d6b>] lock_acquire+0x5a/0x74 [<c011db84>] sched_rt_period_timer+0x9e/0x1fc [<c047ed45>] _spin_lock+0x1c/0x45 [<c011db84>] sched_rt_period_timer+0x9e/0x1fc [<c011db84>] sched_rt_period_timer+0x9e/0x1fc [<c01210fd>] finish_task_switch+0x41/0xbd [<c0107890>] native_sched_clock+0x88/0x99 [<c011dae6>] sched_rt_period_timer+0x0/0x1fc [<c0136dda>] run_hrtimer_pending+0x54/0xe5 [<c011dae6>] sched_rt_period_timer+0x0/0x1fc [<c0128afb>] __do_softirq+0x7b/0xef [<c0128ba6>] do_softirq+0x37/0x4d [<c0128c12>] ksoftirqd+0x56/0xc5 [<c0128bbc>] ksoftirqd+0x0/0xc5 [<c0134649>] kthread+0x38/0x5d [<c0134611>] kthread+0x0/0x5d [<c0104477>] kernel_thread_helper+0x7/0x10 ======================= Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | nohz: disable tick_nohz_kick_tick() for nowThomas Gleixner2008-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: nohz powersavings and wakeup regression commit fb02fbc14d17837b4b7b02dbb36142c16a7bf208 (NOHZ: restart tick device from irq_enter()) causes a serious wakeup regression. While the patch is correct it does not take into account that spurious wakeups happen on x86. A fix for this issue is available, but we just revert to the .27 behaviour and let long running softirqs screw themself. Disable it for now. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | irq: call __irq_enter() before calling the tick_idle_checkThomas Gleixner2008-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: avoid spurious ksoftirqd wakeups The tick idle check which is called from irq_enter() was run before the call to __irq_enter() which did not set the in_interrupt() bits in preempt_count. That way the raise of a softirq woke up softirqd for nothing as the softirq was handled on return from interrupt. Call __irq_enter() before calling into the tick idle check code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86: HPET: enter hpet_interrupt_handler with interrupts disabledMatt Fleming2008-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some functions that may be called from this handler require that interrupts are disabled. Also, combining IRQF_DISABLED and IRQF_SHARED does not reliably disable interrupts in a handler, so remove IRQF_SHARED from the irq flags (this irq is not shared anyway). Signed-off-by: Matt Fleming <mjf@gentoo.org> Cc: mingo@elte.hu Cc: venkatesh.pallipadi@intel.com Cc: "Will Newton" <will.newton@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86: HPET: read from HPET_Tn_CMP() not HPET_T0_CMPMatt Fleming2008-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In hpet_next_event() we check that the value we just wrote to HPET_Tn_CMP(timer) has reached the chip. Currently, we're checking that the value we wrote to HPET_Tn_CMP(timer) is in HPET_T0_CMP, which, if timer is anything other than timer 0, is likely to fail. Signed-off-by: Matt Fleming <mjf@gentoo.org> Cc: mingo@elte.hu Cc: venkatesh.pallipadi@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86: HPET: convert WARN_ON to WARN_ON_ONCEMatt Fleming2008-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible to flood the console with call traces if the WARN_ON condition is true because of the frequency with which this function is called. Signed-off-by: Matt Fleming <mjf@gentoo.org> Cc: mingo@elte.hu Cc: venkatesh.pallipadi@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds2008-11-11
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: release buddies on yield fix for account_group_exec_runtime(), make sure ->signal can't be freed under rq->lock sched: clean up debug info
| * | | sched: release buddies on yieldPeter Zijlstra2008-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clear buddies on yield, so that the buddy rules don't schedule them despite them being placed right-most. This fixed a performance regression with yield-happy binary JVMs. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Lin Ming <ming.m.lin@intel.com>
| * | | fix for account_group_exec_runtime(), make sure ->signal can't be freed ↵Oleg Nesterov2008-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | under rq->lock Impact: fix hang/crash on ia64 under high load This is ugly, but the simplest patch by far. Unlike other similar routines, account_group_exec_runtime() could be called "implicitly" from within scheduler after exit_notify(). This means we can race with the parent doing release_task(), we can't just check ->signal != NULL. Change __exit_signal() to do spin_unlock_wait(&task_rq(tsk)->lock) before __cleanup_signal() to make sure ->signal can't be freed under task_rq(tsk)->lock. Note that task_rq_unlock_wait() doesn't care about the case when tsk changes cpu/rq under us, this should be OK. Thanks to Ingo who nacked my previous buggy patch. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Doug Chapman <doug.chapman@hp.com>