aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
...
* Revert "sunrpc: move the close processing after do recvfrom method"J. Bruce Fields2010-04-01
| | | | | | | | | | | | | commit 1b644b6e6f6160ae35ce4b52c2ca89ed3e356e18 upstream. This reverts commit b0401d725334a94d57335790b8ac2404144748ee, which moved svc_delete_xprt() outside of XPT_BUSY, and allowed it to be called after svc_xpt_recived(), removing its last reference and destroying it after it had already been queued for future processing. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Revert "sunrpc: fix peername failed on closed listener"J. Bruce Fields2010-04-01
| | | | | | | | | | | | | commit f5822754ea006563e1bf0a1f43faaad49c0d8bb2 upstream. This reverts commit b292cf9ce70d221c3f04ff62db5ab13d9a249ca8. The commit that it attempted to patch up, b0401d725334a94d57335790b8ac2404144748ee, was fundamentally wrong, and will also be reverted. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* NFS: Prevent another deadlock in nfs_release_page()Trond Myklebust2010-04-01
| | | | | | | | | | | | | commit d812e575822a2b7ab1a7cadae2571505ec6ec2bd upstream. We should not attempt to free the page if __GFP_FS is not set. Otherwise we can deadlock as per http://bugzilla.kernel.org/show_bug.cgi?id=15578 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* NFS: Avoid a deadlock in nfs_release_pageTrond Myklebust2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit bb6fbc4548b9ae7ebbd06ef72f00229df259d217 upstream. J.R. Okajima reports the following deadlock: INFO: task kswapd0:305 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kswapd0 D 0000000000000001 0 305 2 0x00000000 ffff88001f21d4f0 0000000000000046 ffff88001fdea680 ffff88001f21c000 ffff88001f21dfd8 ffff88001f21c000 ffff88001f21dfd8 ffff88001f21dfd8 ffff88001fdea040 0000000000014c00 0000000000000001 ffff88001fdea040 Call Trace: [<ffffffff8146155d>] io_schedule+0x4d/0x70 [<ffffffff810d2be5>] sync_page+0x65/0xa0 [<ffffffff81461b12>] __wait_on_bit_lock+0x52/0xb0 [<ffffffff810d2b80>] ? sync_page+0x0/0xa0 [<ffffffff810d2b64>] __lock_page+0x64/0x70 [<ffffffff81070ce0>] ? wake_bit_function+0x0/0x40 [<ffffffff810df1d4>] truncate_inode_pages_range+0x344/0x4a0 [<ffffffff810df340>] truncate_inode_pages+0x10/0x20 [<ffffffff8112cbfe>] generic_delete_inode+0x15e/0x190 [<ffffffff8112cc8d>] generic_drop_inode+0x5d/0x80 [<ffffffff8112bb88>] iput+0x78/0x80 [<ffffffff811bc908>] nfs_dentry_iput+0x38/0x50 [<ffffffff811285f4>] dentry_iput+0x84/0x110 [<ffffffff811286ae>] d_kill+0x2e/0x60 [<ffffffff8112912a>] dput+0x7a/0x170 [<ffffffff8111e925>] path_put+0x15/0x40 [<ffffffff811c3a44>] __put_nfs_open_context+0xa4/0xb0 [<ffffffff811cb5d0>] ? nfs_free_request+0x0/0x50 [<ffffffff811c3b0b>] put_nfs_open_context+0xb/0x10 [<ffffffff811cb5f9>] nfs_free_request+0x29/0x50 [<ffffffff81234b7e>] kref_put+0x8e/0xe0 [<ffffffff811cb594>] nfs_release_request+0x14/0x20 [<ffffffff811cf769>] nfs_find_and_lock_request+0x89/0xa0 [<ffffffff811d1180>] nfs_wb_page+0x80/0x110 [<ffffffff811c0770>] nfs_release_page+0x70/0x90 [<ffffffff810d18ee>] try_to_release_page+0x5e/0x80 [<ffffffff810e1178>] shrink_page_list+0x638/0x860 [<ffffffff810e19de>] shrink_zone+0x63e/0xc40 We can fix this by making the call to put_nfs_open_context() happen when we actually remove the write request from the inode (which is done by the nfsiod thread in this case). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* NFSv4: Don't ignore the NFS_INO_REVAL_FORCED flag in nfs_revalidate_inode()Trond Myklebust2010-04-01
| | | | | | | | | | | | commit b4d2314bb88b07e5a04e6c75b442a1dfcd60e340 upstream. If the NFS_INO_REVAL_FORCED flag is set, that means that we don't yet have an up to date attribute cache. Even if we hold a delegation, we must put a GETATTR on the wire. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* SCSI: scsi_transport_fc: Fix synchronization issue while deleting vportGal Rosen2010-04-01
| | | | | | | | | | | | | | | | | | | | | | commit 0d9dc7c8b9b7fa0f53647423b41056ee1beed735 upstream. The issue occur while deleting 60 virtual ports through the sys interface /sys/class/fc_vports/vport-X/vport_delete. It happen while in a mistake each request sent twice for the same vport. This interface is asynchronous, entering the delete request into a work queue, allowing more than one request to enter to the delete work queue. The result is a NULL pointer. The first request already delete the vport, while the second request got a pointer to the vport before the device destroyed. Re-create vport later cause system freeze. Solution: Check vport flags before entering the request to the work queue. [jejb: fixed int<->long problem on spinlock flags variable] Signed-off-by: Gal Rosen <galr@storwize.com> Acked-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* doc: add the documentation for mpol=localKOSAKI Motohiro2010-04-01
| | | | | | | | | | | | | | | | | | | | commit 5574169613b40b85d6f4c67208fa4846b897a0a1 upstream. commit 3f226aa1c (mempolicy: support mpol=local tmpfs mount option) added new mpol=local mount option. but it didn't add a documentation. This patch does it. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Ravikiran Thirumalai <kiran@scalex86.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* tmpfs: cleanup mpol_parse_str()KOSAKI Motohiro2010-04-01
| | | | | | | | | | | | | | | | | | | | commit 926f2ae04f183098cf9a30521776fb2759c8afeb upstream. mpol_parse_str() made lots 'err' variable related bug. Because it is ugly and reviewing unfriendly. This patch simplifies it. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Ravikiran Thirumalai <kiran@scalex86.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* tmpfs: handle MPOL_LOCAL mount option properlyKOSAKI Motohiro2010-04-01
| | | | | | | | | | | | | | | | | | | | | | commit 12821f5fb942e795f8009ece14bde868893bd811 upstream. commit 71fe804b6d5 (mempolicy: use struct mempolicy pointer in shmem_sb_info) added mpol=local mount option. but its feature is broken since it was born. because such code always return 1 (i.e. mount failure). This patch fixes it. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Ravikiran Thirumalai <kiran@scalex86.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* tmpfs: mpol=bind:0 don't cause mount error.KOSAKI Motohiro2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | commit d69b2e63e9172afb4d07c305601b79a55509ac4c upstream. Currently, following mount operation cause mount error. % mount -t tmpfs -ompol=bind:0 none /tmp Because commit 71fe804b6d5 (mempolicy: use struct mempolicy pointer in shmem_sb_info) corrupted MPOL_BIND parse code. This patch restore the needed one. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Ravikiran Thirumalai <kiran@scalex86.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* tmpfs: fix oops on mounts with mpol=defaultRavikiran G Thirumalai2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | commit 413b43deab8377819aba1dbad2abf0c15d59b491 upstream. Fix an 'oops' when a tmpfs mount point is mounted with the mpol=default mempolicy. Upon remounting a tmpfs mount point with 'mpol=default' option, the mount code crashed with a null pointer dereference. The initial problem report was on 2.6.27, but the problem exists in mainline 2.6.34-rc as well. On examining the code, we see that mpol_new returns NULL if default mempolicy was requested. This 'NULL' mempolicy is accessed to store the node mask resulting in oops. The following patch fixes it. Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* perf probe: Fix probe_point buffer overrunMasami Hiramatsu2010-04-01
| | | | | | | | | | | | | | | | | | | | | commit 594087a04eea544356f9c52e83c1a9bc380ce80f upstream. Fix probe_point array-size overrun problem. In some cases (e.g. inline function), one user-specified probe-point can be translated to many probe address, and it overruns pre-defined array-size. This also removes redundant MAX_PROBES macro definition. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> LKML-Reference: <20100312232217.2017.45017.stgit@localhost6.localdomain6> [ Note that only root can create new probes. Eventually we should remove the MAX_PROBES limit, but that is a larger patch not eligible to perf/urgent treatment. ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* perf_event: Fix oops triggered by cpu offline/onlinePaul Mackerras2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 220b140b52ab6cc133f674a7ffec8fa792054f25 upstream. Anton Blanchard found that he could reliably make the kernel hit a BUG_ON in the slab allocator by taking a cpu offline and then online while a system-wide perf record session was running. The reason is that when the cpu comes up, we completely reinitialize the ctx field of the struct perf_cpu_context for the cpu. If there is a system-wide perf record session running, then there will be a struct perf_event that has a reference to the context, so its refcount will be 2. (The perf_event has been removed from the context's group_entry and event_entry lists by perf_event_exit_cpu(), but that doesn't remove the perf_event's reference to the context and doesn't decrement the context's refcount.) When the cpu comes up, perf_event_init_cpu() gets called, and it calls __perf_event_init_context() on the cpu's context. That resets the refcount to 1. Then when the perf record session finishes and the perf_event is closed, the refcount gets decremented to 0 and the context gets kfreed after an RCU grace period. Since the context wasn't kmalloced -- it's part of a per-cpu variable -- bad things happen. In fact we don't need to completely reinitialize the context when the cpu comes up. It's sufficient to initialize the context once at boot, but we need to do it for all possible cpus. This moves the context initialization to happen at boot time. With this, we don't trash the refcount and the context never gets kfreed, and we don't hit the BUG_ON. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Tested-by: Anton Blanchard <anton@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* perf: Make the install relative to DESTDIR if specifiedJohn Kacur2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 7ae5f21361fea11f58c398701da635f778635d13 upstream. Without this change, the install path is relative to prefix/DESTDIR where prefix is automatically set to $HOME. This can produce unexpected results. For example: make -C tools/perf DESTDIR=/home/jkacur/tmp install-man creates the directory: /home/jkacur/home/jkacur/tmp/share/... instead of the expected: /home/jkacur/tmp/share/... Signed-off-by: John Kacur <jkacur@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Kyle McMartin <kyle@redhat.com> LKML-Reference: <1268312220-12880-1-git-send-email-jkacur@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* perf: Provide generic perf_sample_data initializationPeter Zijlstra2010-04-01
| | | | | | | | | | | | | | | | | | | | | This makes it easier to extend perf_sample_data and fixes a bug on arm and sparc, which failed to set ->raw to NULL, which can cause crashes when combined with PERF_SAMPLE_RAW. It also optimizes PowerPC and tracepoint, because the struct initialization is forced to zero out the whole structure. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jean Pihet <jpihet@mvista.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: Jamie Iles <jamie.iles@picochip.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <20100304140100.315416040@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* gigaset: correct range checking off by one errorTilman Schmidt2010-04-01
| | | | | | | | | | | | | | commit 6ad34145cf809384359fe513481d6e16638a57a3 upstream. Correct a potential array overrun due to an off by one error in the range check on the CAPI CONNECT_REQ CIPValue parameter. Found and reported by Dan Carpenter using smatch. Impact: bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* gigaset: fix build failureTilman Schmidt2010-04-01
| | | | | | | | | | | | | | | commit 22001a13d09d82772e831dcdac0553994a4bac5d upstream. Update the dummy LL interface to the LL interface change introduced by commit daab433c03c15fd642c71c94eb51bdd3f32602c8. This fixes the build failure occurring after that commit when enabling ISDN_DRV_GIGASET but neither ISDN_I4L nor ISDN_CAPI. Impact: bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* gigaset: avoid registering CAPI driver more than onceTilman Schmidt2010-04-01
| | | | | | | | | | | | | | | | | | commit bc35b4e347c047fb1c665bb761ddb22482539f7f upstream. Registering/unregistering the Gigaset CAPI driver when a device is connected/disconnected causes an Oops when disconnecting two Gigaset devices in a row, because the same capi_driver structure gets unregistered twice. Fix by making driver registration/unregistration a separate operation (empty in the ISDN4Linux case) called when the main module is loaded/unloaded. Impact: bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Acked-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* gigaset: prune use of tty_buffer_request_roomTilman Schmidt2010-04-01
| | | | | | | | | | | | | | | commit 873a69a358a6b393fd8d9d92e193ec8895cac4d7 upstream. Calling tty_buffer_request_room() before tty_insert_flip_string() is unnecessary, costs CPU and for big buffers can mess up the multi-page allocation avoidance. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Acked-by: Karsten Keil <keil@b1-systems.de> CC: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* gigaset: correct clearing of at_state strings on RINGTilman Schmidt2010-04-01
| | | | | | | | | | | | | | | commit 3a0a3a6b92edf181f849ebd8417122392ba73a96 upstream. In RING handling, clear the table of received parameter strings in a loop like everywhere else, instead of by enumeration which had already gotten out of sync. Impact: minor bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Acked-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ALSA: cmipci: work around invalid PCM pointerClemens Ladisch2010-04-01
| | | | | | | | | | | | | | | | | | commit 1c583063a5c769fe2ec604752e383972c69e6d9b upstream. When the CMI8738 FRAME2 register is read, the chip sometimes (probably when wrapping around) returns an invalid value that would be outside the programmed DMA buffer. This leads to an inconsistent PCM pointer that is likely to result in an underrun. To work around this, read the register multiple times until we get a valid value; the error state seems to be very short-lived. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Reported-and-tested-by: Matija Nalis <mnalis-alsadev@voyager.hr> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ALSA: hda: Fix 0 dB offset for HP laptops using CX20551 (Waikiki)Daniel T Chen2010-04-01
| | | | | | | | | | | | | | | | | commit 025f206c9e0f96cc41567b01c07fb852d8900da1 upstream. BugLink: https://launchpad.net/bugs/420578 The OR has verified that his hardware distorts because of the 0 dB offset not corresponding to the highest PCM level. Fix this by capping said PCM level to 0 dB similarly to what we do for CX20549 (Venice). Reported-by: Mike Pontillo <pontillo@gmail.com> Tested-by: Mike Pontillo <pontillo@gmail.com> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ALSA: hda - Fix secondary ADC of ALC260 basic modelTakashi Iwai2010-04-01
| | | | | | | | | | | | commit 9c4cc0bdede1c39bde60a0d5d9251aac71fbe719 upstream. Fix adc_nids[] for ALC260 basic model to match with num_adc_nids. Otherwise you get an invalid NID in the secondary "Input Source" mixer element. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ALSA: hda - Disable MSI for Nvidia controllerTakashi Iwai2010-04-01
| | | | | | | | | | | | | commit 80c43ed724797627d8f86855248c497a6161a214 upstream. Judging from the member of enable_msi white-list, Nvidia controller seems to cause troubles with MSI enabled, e.g. boot hang up or other serious issue may come up. It's safer to disable MSI as default for Nvidia controllers again for now. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ALSA: hda: Use LPIB and 6stack-dig for eMachines T5212Daniel T Chen2010-04-01
| | | | | | | | | | | | | | | | | commit 572c0e3c73341755f3e7dfaaef6b26df12bd709c upstream. BugLink: https://bugs.launchpad.net/bugs/538895 The OR has verified that both position_fix=1 and model=6stack-dig are necessary to have capture function properly. (The existing 3stack-6ch model quirk seems to be incorrect.) Reported-by: Reuben Bailey <reuben.e.bailey@gmail.com> Tested-by: Reuben Bailey <reuben.e.bailey@gmail.com> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sched: Fix SCHED_MC regression caused by change in sched cpu_powerSuresh Siddha2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit dd5feea14a7de4edbd9f36db1a2db785de91b88d upstream On platforms like dual socket quad-core platform, the scheduler load balancer is not detecting the load imbalances in certain scenarios. This is leading to scenarios like where one socket is completely busy (with all the 4 cores running with 4 tasks) and leaving another socket completely idle. This causes performance issues as those 4 tasks share the memory controller, last-level cache bandwidth etc. Also we won't be taking advantage of turbo-mode as much as we would like, etc. Some of the comparisons in the scheduler load balancing code are comparing the "weighted cpu load that is scaled wrt sched_group's cpu_power" with the "weighted average load per task that is not scaled wrt sched_group's cpu_power". While this has probably been broken for a longer time (for multi socket numa nodes etc), the problem got aggrevated via this recent change: | | commit f93e65c186ab3c05ce2068733ca10e34fd00125e | Author: Peter Zijlstra <a.p.zijlstra@chello.nl> | Date: Tue Sep 1 10:34:32 2009 +0200 | | sched: Restore __cpu_power to a straight sum of power | Also with this change, the sched group cpu power alone no longer reflects the group capacity that is needed to implement MC, MT performance (default) and power-savings (user-selectable) policies. We need to use the computed group capacity (sgs.group_capacity, that is computed using the SD_PREFER_SIBLING logic in update_sd_lb_stats()) to find out if the group with the max load is above its capacity and how much load to move etc. Reported-by: Ma Ling <ling.ma@intel.com> Initial-Analysis-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> [ -v2: build fix ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1266970432.11588.22.camel@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* perf annotate: Defer allocating sym_priv->hist arrayArnaldo Carvalho de Melo2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | commit 628ada0cb03666dd463f7c25947eaccdf440c309 upstream Because symbol->end is not fixed up at symbol_filter time, only after all symbols for a DSO are loaded, and that, for asm symbols, may be bogus, causing segfaults when hits happen in these symbols. Backported-from: 628ada0 Reported-by: David Miller <davem@davemloft.net> Reported-by: Anton Blanchard <anton@samba.org> Acked-by: David Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20100225155740.GB8553@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* can: fix bfin_can build error after alloc_candev() changeBarry Song2010-04-01
| | | | | | | | | | | | | | commit e9dcd1613f0ac0b3573b7d813a2c5672cd8302eb upstream. Looks like commit a6e4bc530403 didn't include updates to drivers so the Blackfin CAN driver fails to build now. Signed-off-by: Barry Song <barry.song@analog.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* iwlwifi: use dma_alloc_coherentStanislaw Gruszka2010-04-01
| | | | | | | | | | | | commit f36d04abe684f9e2b07c6ebe9f77ae20eb5c1e84 upstream. Change pci_alloc_consistent() to dma_alloc_coherent() so we can use GFP_KERNEL flag. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* virtio: fix out of range array accessMichael S. Tsirkin2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3119815912a220bdac943dfbdfee640414c0c611 upstream. I have observed the following error on virtio-net module unload: ------------[ cut here ]------------ WARNING: at kernel/irq/manage.c:858 __free_irq+0xa0/0x14c() Hardware name: Bochs Trying to free already-free IRQ 0 Modules linked in: virtio_net(-) virtio_blk virtio_pci virtio_ring virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan] Pid: 1957, comm: rmmod Not tainted 2.6.33-rc8-vhost #24 Call Trace: [<ffffffff8103e195>] warn_slowpath_common+0x7c/0x94 [<ffffffff8103e204>] warn_slowpath_fmt+0x41/0x43 [<ffffffff810a7a36>] ? __free_pages+0x5a/0x70 [<ffffffff8107cc00>] __free_irq+0xa0/0x14c [<ffffffff8107cceb>] free_irq+0x3f/0x65 [<ffffffffa0081424>] vp_del_vqs+0x81/0xb1 [virtio_pci] [<ffffffffa0091d29>] virtnet_remove+0xda/0x10b [virtio_net] [<ffffffffa0075200>] virtio_dev_remove+0x22/0x4a [virtio] [<ffffffff812709ee>] __device_release_driver+0x66/0xac [<ffffffff81270ab7>] driver_detach+0x83/0xa9 [<ffffffff8126fc66>] bus_remove_driver+0x91/0xb4 [<ffffffff81270fcf>] driver_unregister+0x6c/0x74 [<ffffffffa0075418>] unregister_virtio_driver+0xe/0x10 [virtio] [<ffffffffa0091c4d>] fini+0x15/0x17 [virtio_net] [<ffffffff8106997b>] sys_delete_module+0x1c3/0x230 [<ffffffff81007465>] ? old_ich_force_enable_hpet+0x117/0x164 [<ffffffff813bb720>] ? do_page_fault+0x29c/0x2cc [<ffffffff81028e58>] sysenter_dispatch+0x7/0x27 ---[ end trace 15e88e4c576cc62b ]--- The bug is in virtio-pci: we use msix_vector as array index to get irq entry, but some vqs do not have a dedicated vector so this causes an out of bounds access. By chance, we seem to often get 0 value, which results in this error. Fix by verifying that vector is legal before using it as index. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Anthony Liguori <aliguori@us.ibm.com> Acked-by: Shirley Ma <xma@us.ibm.com> Acked-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* mqueue: fix mq_open() file descriptor leak on user-space processesAndré Goddard Rosa2010-04-01
| | | | | | | | | | | commit 4294a8eedb17bbc45e1e7447c2a4d05332943248 upstream. We leak fd on lookup_one_len() failure Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ath9k: add support for 802.11n bonded out AR2427Luis R. Rodriguez2010-04-01
| | | | | | | | | | | | | | | | | | This is a backport of of upstream commit: 5ffaf8a361b4c9025963959a744f21d8173c7669 Some single chip family devices are sold in the market with 802.11n bonded out, these have no hardware capability for 02.11n but ath9k can still support them. These are called AR2427. Reported-by: Rolf Leggewie <bugzilla.kernel.org@rolf.leggewie.biz> Tested-by: Bernhard Reiter <ockham@raz.or.at> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ath9k: fix lockdep warning when unloading moduleMing Lei2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a9f042cbe5284f34ccff15f3084477e11b39b17b upstream. Since txq->axq_lock may be hold in softirq context, it must be acquired with spin_lock_bh() instead of spin_lock() if softieq is enabled. The patch fixes the lockdep warning below when unloading ath9k modules. ================================= [ INFO: inconsistent lock state ] 2.6.33-wl #12 --------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. rmmod/3642 [HC0[0]:SC0[0]:HE1:SE1] takes: (&(&txq->axq_lock)->rlock){+.?...}, at: [<ffffffffa03568c3>] ath_tx_node_cleanup+0x62/0x180 [ath9k] {IN-SOFTIRQ-W} state was registered at: [<ffffffff8107577d>] __lock_acquire+0x2f6/0xd35 [<ffffffff81076289>] lock_acquire+0xcd/0xf1 [<ffffffff813a7486>] _raw_spin_lock_bh+0x3b/0x6e [<ffffffffa0356b49>] spin_lock_bh+0xe/0x10 [ath9k] [<ffffffffa0358ec7>] ath_tx_tasklet+0xcd/0x391 [ath9k] [<ffffffffa0354f5f>] ath9k_tasklet+0x70/0xc8 [ath9k] [<ffffffff8104e601>] tasklet_action+0x8c/0xf4 [<ffffffff8104f459>] __do_softirq+0xf8/0x1cd [<ffffffff8100ab1c>] call_softirq+0x1c/0x30 [<ffffffff8100c2cf>] do_softirq+0x4b/0xa3 [<ffffffff8104f045>] irq_exit+0x4a/0x8c [<ffffffff813acccc>] do_IRQ+0xac/0xc3 [<ffffffff813a7d53>] ret_from_intr+0x0/0x16 [<ffffffff81302d52>] cpuidle_idle_call+0x9e/0xf8 [<ffffffff81008be7>] cpu_idle+0x62/0x9d [<ffffffff81391c1a>] rest_init+0x7e/0x80 [<ffffffff818bbd38>] start_kernel+0x3e8/0x3f3 [<ffffffff818bb2bc>] x86_64_start_reservations+0xa7/0xab [<ffffffff818bb3b8>] x86_64_start_kernel+0xf8/0x107 irq event stamp: 42037 hardirqs last enabled at (42037): [<ffffffff813a7b21>] _raw_spin_unlock_irqrestore+0x47/0x56 hardirqs last disabled at (42036): [<ffffffff813a72f4>] _raw_spin_lock_irqsave+0x2b/0x88 softirqs last enabled at (42000): [<ffffffffa0353ea6>] spin_unlock_bh+0xe/0x10 [ath9k] softirqs last disabled at (41998): [<ffffffff813a7463>] _raw_spin_lock_bh+0x18/0x6e other info that might help us debug this: 4 locks held by rmmod/3642: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff8132c10d>] rtnl_lock+0x17/0x19 #1: (&wdev->mtx){+.+.+.}, at: [<ffffffffa01e53f2>] cfg80211_netdev_notifier_call+0x28d/0x46d [cfg80211] #2: (&ifmgd->mtx){+.+.+.}, at: [<ffffffffa0260834>] ieee80211_mgd_deauth+0x3f/0x17e [mac80211] #3: (&local->sta_mtx){+.+.+.}, at: [<ffffffffa025a381>] sta_info_destroy_addr+0x2b/0x5e [mac80211] stack backtrace: Pid: 3642, comm: rmmod Not tainted 2.6.33-wl #12 Call Trace: [<ffffffff81074469>] valid_state+0x178/0x18b [<ffffffff81014f94>] ? save_stack_trace+0x2f/0x4c [<ffffffff81074e08>] ? check_usage_backwards+0x0/0x88 [<ffffffff8107458f>] mark_lock+0x113/0x230 [<ffffffff810757f1>] __lock_acquire+0x36a/0xd35 [<ffffffff8101018d>] ? native_sched_clock+0x2d/0x5f [<ffffffffa03568c3>] ? ath_tx_node_cleanup+0x62/0x180 [ath9k] [<ffffffff81076289>] lock_acquire+0xcd/0xf1 [<ffffffffa03568c3>] ? ath_tx_node_cleanup+0x62/0x180 [ath9k] [<ffffffff810732eb>] ? trace_hardirqs_off+0xd/0xf [<ffffffff813a7193>] _raw_spin_lock+0x36/0x69 [<ffffffffa03568c3>] ? ath_tx_node_cleanup+0x62/0x180 [ath9k] [<ffffffffa03568c3>] ath_tx_node_cleanup+0x62/0x180 [ath9k] [<ffffffff810749ed>] ? trace_hardirqs_on+0xd/0xf [<ffffffffa0353950>] ath9k_sta_remove+0x22/0x26 [ath9k] [<ffffffffa025a08f>] __sta_info_destroy+0x1ad/0x38c [mac80211] [<ffffffffa025a394>] sta_info_destroy_addr+0x3e/0x5e [mac80211] [<ffffffffa02605d6>] ieee80211_set_disassoc+0x175/0x180 [mac80211] [<ffffffffa026084d>] ieee80211_mgd_deauth+0x58/0x17e [mac80211] [<ffffffff813a60c1>] ? __mutex_lock_common+0x37f/0x3a4 [<ffffffffa01e53f2>] ? cfg80211_netdev_notifier_call+0x28d/0x46d [cfg80211] [<ffffffffa026786e>] ieee80211_deauth+0x1e/0x20 [mac80211] [<ffffffffa01f47f9>] __cfg80211_mlme_deauth+0x130/0x13f [cfg80211] [<ffffffffa01e53f2>] ? cfg80211_netdev_notifier_call+0x28d/0x46d [cfg80211] [<ffffffff810732eb>] ? trace_hardirqs_off+0xd/0xf [<ffffffffa01f7eee>] __cfg80211_disconnect+0x111/0x189 [cfg80211] [<ffffffffa01e5433>] cfg80211_netdev_notifier_call+0x2ce/0x46d [cfg80211] [<ffffffff813aa9ea>] notifier_call_chain+0x37/0x63 [<ffffffff81068c98>] raw_notifier_call_chain+0x14/0x16 [<ffffffff81322e97>] call_netdevice_notifiers+0x1b/0x1d [<ffffffff8132386d>] dev_close+0x6a/0xa6 [<ffffffff8132395f>] rollback_registered_many+0xb6/0x2f4 [<ffffffff81323bb8>] unregister_netdevice_many+0x1b/0x66 [<ffffffffa026494f>] ieee80211_remove_interfaces+0xc5/0xd0 [mac80211] [<ffffffffa02580a2>] ieee80211_unregister_hw+0x47/0xe8 [mac80211] [<ffffffffa035290e>] ath9k_deinit_device+0x7a/0x9b [ath9k] [<ffffffffa035bc26>] ath_pci_remove+0x38/0x76 [ath9k] [<ffffffff8120940a>] pci_device_remove+0x2d/0x51 [<ffffffff8129d797>] __device_release_driver+0x7b/0xd1 [<ffffffff8129d885>] driver_detach+0x98/0xbe [<ffffffff8129ca7a>] bus_remove_driver+0x94/0xb7 [<ffffffff8129ddd6>] driver_unregister+0x6c/0x74 [<ffffffff812096d2>] pci_unregister_driver+0x46/0xad [<ffffffffa035bae1>] ath_pci_exit+0x15/0x17 [ath9k] [<ffffffffa035e1a2>] ath9k_exit+0xe/0x2f [ath9k] [<ffffffff8108050a>] sys_delete_module+0x1c7/0x236 [<ffffffff813a7df5>] ? retint_swapgs+0x13/0x1b [<ffffffff810749b5>] ? trace_hardirqs_on_caller+0x119/0x144 [<ffffffff8109b9f6>] ? audit_syscall_entry+0x11e/0x14a [<ffffffff81009bb2>] system_call_fastpath+0x16/0x1b wlan1: deauthenticating from 00:23:cd:e1:f9:b2 by local choice (reason=3) PM: Removing info for No Bus:wlan1 cfg80211: Calling CRDA to update world regulatory domain PM: Removing info for No Bus:rfkill2 PM: Removing info for No Bus:phy1 ath9k 0000:16:00.0: PCI INT A disabled Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sh: Fix zImage boot using fixed PMB.Nobuhiro Iwamatsu2010-04-01
| | | | | | | | | | commit 319c2cc761505ee54a9536c5d0b9c2ee3fb33866 upstream. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com> Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sparc64: Make prom entry spinlock NMI safe.David S. Miller2010-04-01
| | | | | | | | | | | | | [ Upstream commit 8a4fd1e4922413cfdfa6c51a59efb720d904a5eb ] If we do something like try to print to the OF console from an NMI while we're already in OpenFirmware, we'll deadlock on the spinlock. Use a raw spinlock and disable NMIs when we take it. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* pci: add support for 82576NS serdes to existing SR-IOV quirkAlexander Duyck2010-04-01
| | | | | | | | | | | | | commit 7a0deb6bcda98c2a764cb87f1441eef920fd3663 upstream. This patch adds support for the 82576NS Serdes adapter to the existing pci quirk for 82576 parts. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KVM: x86: Add KVM_CAP_X86_ROBUST_SINGLESTEPJan Kiszka2010-04-01
| | | | | | | | | | | | Commit d2be1651b736002e0c76d7095d6c0ba77b4a897c upstream. This marks the guest single-step API improvement of 94fe45da and 91586a3b with a capability flag to allow reliable detection by user space. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* V4L/DVB (13961): em28xx-dvb: fix memleak in dvb_fini()Francesco Lavra2010-04-01
| | | | | | | | | | | | | | | commit 19f48cb105b7fa18d0dcab435919a3a29b7a7c4c upstream. this patch fixes a memory leak which occurs when an em28xx card with DVB extension is unplugged or its DVB extension driver is unloaded. In dvb_fini(), dev->dvb must be freed before being set to NULL, as is done in dvb_init() in case of error. Note that this bug is also present in the latest stable kernel release. Signed-off-by: Francesco Lavra <francescolavra@interfree.it> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* coredump: suppress uid comparison test if core output files are pipesNeil Horman2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 76595f79d76fbe6267a51b3a866a028d150f06d4 upstream. Modify uid check in do_coredump so as to not apply it in the case of pipes. This just got noticed in testing. The end of do_coredump validates the uid of the inode for the created file against the uid of the crashing process to ensure that no one can pre-create a core file with different ownership and grab the information contained in the core when they shouldn' tbe able to. This causes failures when using pipes for a core dumps if the crashing process is not root, which is the uid of the pipe when it is created. The fix is simple. Since the check for matching uid's isn't relevant for pipes (a process can't create a pipe that the uermodehelper code will open anyway), we can just just skip it in the event ispipe is non-zero Reverts a pipe-affecting change which was accidentally made in : commit c46f739dd39db3b07ab5deb4e3ec81e1c04a91af : Author: Ingo Molnar <mingo@elte.hu> : AuthorDate: Wed Nov 28 13:59:18 2007 +0100 : Commit: Linus Torvalds <torvalds@woody.linux-foundation.org> : CommitDate: Wed Nov 28 10:58:01 2007 -0800 : : vfs: coredumping fix Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* tracing: Do not record user stack trace from NMI contextSteven Rostedt2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b6345879ccbd9b92864fbd7eb8ac48acdb4d6b15 upstream. A bug was found with Li Zefan's ftrace_stress_test that caused applications to segfault during the test. Placing a tracing_off() in the segfault code, and examining several traces, I found that the following was always the case. The lock tracer was enabled (lockdep being required) and userstack was enabled. Testing this out, I just enabled the two, but that was not good enough. I needed to run something else that could trigger it. Running a load like hackbench did not work, but executing a new program would. The following would trigger the segfault within seconds: # echo 1 > /debug/tracing/options/userstacktrace # echo 1 > /debug/tracing/events/lock/enable # while :; do ls > /dev/null ; done Enabling the function graph tracer and looking at what was happening I finally noticed that all cashes happened just after an NMI. 1) | copy_user_handle_tail() { 1) | bad_area_nosemaphore() { 1) | __bad_area_nosemaphore() { 1) | no_context() { 1) | fixup_exception() { 1) 0.319 us | search_exception_tables(); 1) 0.873 us | } [...] 1) 0.314 us | __rcu_read_unlock(); 1) 0.325 us | native_apic_mem_write(); 1) 0.943 us | } 1) 0.304 us | rcu_nmi_exit(); [...] 1) 0.479 us | find_vma(); 1) | bad_area() { 1) | __bad_area() { After capturing several traces of failures, all of them happened after an NMI. Curious about this, I added a trace_printk() to the NMI handler to read the regs->ip to see where the NMI happened. In which I found out it was here: ffffffff8135b660 <page_fault>: ffffffff8135b660: 48 83 ec 78 sub $0x78,%rsp ffffffff8135b664: e8 97 01 00 00 callq ffffffff8135b800 <error_entry> What was happening is that the NMI would happen at the place that a page fault occurred. It would call rcu_read_lock() which was traced by the lock events, and the user_stack_trace would run. This would trigger a page fault inside the NMI. I do not see where the CR2 register is saved or restored in NMI handling. This means that it would corrupt the page fault handling that the NMI interrupted. The reason the while loop of ls helped trigger the bug, was that each execution of ls would cause lots of pages to be faulted in, and increase the chances of the race happening. The simple solution is to not allow user stack traces in NMI context. After this patch, I ran the above "ls" test for a couple of hours without any issues. Without this patch, the bug would trigger in less than a minute. Reported-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* tracing: Disable buffer switching when starting or stopping traceSteven Rostedt2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a2f8071428ed9a0f06865f417c962421c9a6b488 upstream. When the trace iterator is read, tracing_start() and tracing_stop() is called to stop tracing while the iterator is processing the trace output. These functions disable both the standard buffer and the max latency buffer. But if the wakeup tracer is running, it can switch these buffers between the two disables: buffer = global_trace.buffer; if (buffer) ring_buffer_record_disable(buffer); <<<--------- swap happens here buffer = max_tr.buffer; if (buffer) ring_buffer_record_disable(buffer); What happens is that we disabled the same buffer twice. On tracing_start() we can enable the same buffer twice. All ring_buffer_record_disable() must be matched with a ring_buffer_record_enable() or the buffer can be disable permanently, or enable prematurely, and cause a bug where a reset happens while a trace is commiting. This patch protects these two by taking the ftrace_max_lock to prevent a switch from occurring. Found with Li Zefan's ftrace_stress_test. Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* tracing: Use same local variable when resetting the ring bufferSteven Rostedt2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | commit 283740c619d211e34572cc93c8cdba92ccbdb9cc upstream. In the ftrace code that resets the ring buffer it references the buffer with a local variable, but then uses the tr->buffer as the parameter to reset. If the wakeup tracer is running, which can switch the tr->buffer with the max saved buffer, this can break the requirement of disabling the buffer before the reset. buffer = tr->buffer; ring_buffer_record_disable(buffer); synchronize_sched(); __tracing_reset(tr->buffer, cpu); If the tr->buffer is swapped, then the reset is not happening to the buffer that was disabled. This will cause the ring buffer to fail. Found with Li Zefan's ftrace_stress_test. Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* tracing: Fix warning in s_next of trace file opsLai Jiangshan2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ac91d85456372a90af5b85eb6620fd2efb1e431b upstream. This warning in s_next() can be triggered by lseek(): [<c018b3f7>] ? s_next+0x77/0x80 [<c013e3c1>] warn_slowpath_common+0x81/0xa0 [<c018b3f7>] ? s_next+0x77/0x80 [<c013e3fa>] warn_slowpath_null+0x1a/0x20 [<c018b3f7>] s_next+0x77/0x80 [<c01efa77>] traverse+0x117/0x200 [<c01eff13>] seq_lseek+0xa3/0x120 [<c01efe70>] ? seq_lseek+0x0/0x120 [<c01d7081>] vfs_llseek+0x41/0x50 [<c01d8116>] sys_llseek+0x66/0xa0 [<c0102bd0>] sysenter_do_call+0x12/0x26 The iterator "leftover" variable is zeroed in the opening of the trace file. But lseek can call s_start() which will call s_next() without reseting the "leftover" variable back to zero, which might trigger the WARN_ON_ONCE(iter->leftover) that is in s_next(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4B8CE06A.9090207@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* function-graph: Init curr_ret_stack with ret_stackSteven Rostedt2010-04-01
| | | | | | | | | | | | | | | | | | | | | | commit ea14eb714041d40fcc5180b5a586034503650149 upstream. If the graph tracer is active, and a task is forked but the allocating of the processes graph stack fails, it can cause crash later on. This is due to the temporary stack being NULL, but the curr_ret_stack variable is copied from the parent. If it is not -1, then in ftrace_graph_probe_sched_switch() the following: for (index = next->curr_ret_stack; index >= 0; index--) next->ret_stack[index].calltime += timestamp; Will cause a kernel OOPS. Found with Li Zefan's ftrace_stress_test. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* hw-breakpoints: Remove stub unthrottle callbackFrederic Weisbecker2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | commit 1e259e0a9982078896f3404240096cbea01daca4 upstream. We support event unthrottling in breakpoint events. It means that if we have more than sysctl_perf_event_sample_rate/HZ, perf will throttle, ignoring subsequent events until the next tick. So if ptrace exceeds this max rate, it will omit events, which breaks the ptrace determinism that is supposed to report every triggered breakpoints. This is likely to happen if we set sysctl_perf_event_sample_rate to 1. This patch removes support for unthrottling in breakpoint events to break throttling and restore ptrace determinism. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* x86/stacktrace: Don't dereference bad frame pointersFrederic Weisbecker2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 29044ad1509ecc229f1d5a31aeed7a8dc61a71c4 upstream. Callers of a stacktrace might pass bad frame pointers. Those are usually checked for safety in stack walking helpers before any dereferencing, but this is not the case when we need to go through one more frame pointer that backlinks the irq stack to the previous one, as we don't have any reliable address boudaries to compare this frame pointer against. This raises crashes when we record callchains for ftrace events with perf because we don't use the right helpers to capture registers there. We get wrong frame pointers as we call task_pt_regs() even on kernel threads, which is a wrong thing as it gives us the initial state of any kernel threads freshly created. This is even not what we want for user tasks. What we want is a hot snapshot of registers when the ftrace event triggers, not the state before a task entered the kernel. This requires more thoughts to do it correctly though. So first put a guardian to ensure the given frame pointer can be dereferenced to avoid crashes. We'll think about how to fix the callers in a subsequent patch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* x86_64, cpa: Don't work hard in preserving kernel 2M mappings when using 4K ↵Suresh Siddha2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | already commit 281ff33b7c1b1ba2a5f9b03425e5f692a94913fa upstream. We currently enforce the !RW mapping for the kernel mapping that maps holes between different text, rodata and data sections. However, kernel identity mappings will have different RWX permissions to the pages mapping to text and to the pages padding (which are freed) the text, rodata sections. Hence kernel identity mappings will be broken to smaller pages. For 64-bit, kernel text and kernel identity mappings are different, so we can enable protection checks that come with CONFIG_DEBUG_RODATA, as well as retain 2MB large page mappings for kernel text. Konrad reported a boot failure with the Linux Xen paravirt guest because of this. In this paravirt guest case, the kernel text mapping and the kernel identity mapping share the same page-table pages. Thus forcing the !RW mapping for some of the kernel mappings also cause the kernel identity mappings to be read-only resulting in the boot failure. Linux Xen paravirt guest also uses 4k mappings and don't use 2M mapping. Fix this issue and retain large page performance advantage for native kernels by not working hard and not enforcing !RW for the kernel text mapping, if the current mapping is already using small page mapping. Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1266522700.2909.34.camel@sbs-t61.sc.intel.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ring-buffer: Move disabled check into preempt disable sectionLai Jiangshan2010-04-01
| | | | | | | | | | | | | | | | | | | commit 52fbe9cde7fdb5c6fac196d7ebd2d92d05ef3cd4 upstream. The ring buffer resizing and resetting relies on a schedule RCU action. The buffers are disabled, a synchronize_sched() is called and then the resize or reset takes place. But this only works if the disabling of the buffers are within the preempt disabled section, otherwise a window exists that the buffers can be written to while a reset or resize takes place. Reported-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4B949E43.2010906@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ath5k: fix setup for CAB queueBob Copeland2010-04-01
| | | | | | | | | | | | | | | | | | | | | | commit a951ae2176b982574ffa197455db6c89359fd5eb upstream. The beacon sent gating doesn't seem to work with any combination of flags. Thus, buffered frames tend to stay buffered forever, using up tx descriptors. Instead, use the DBA gating and hold transmission of the buffered frames until 80% of the beacon interval has elapsed using the ready time. This fixes the following error in AP mode: ath5k phy0: no further txbuf available, dropping packet Add a comment to acknowledge that this isn't the best solution. Signed-off-by: Bob Copeland <me@bobcopeland.com> Acked-by: Nick Kossifidis <mickflemm@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ath5k: dont use external sleep clock in AP modeBob Copeland2010-04-01
| | | | | | | | | | | | | | | | | commit 5d6ce628f986d1a3c523cbb0a5a52095c48cc332 upstream. When using the external sleep clock in AP mode, the TSF increments too quickly, causing beacon interval to be much lower than it is supposed to be, resulting in lots of beacon-not-ready interrupts. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=14802. Signed-off-by: Bob Copeland <me@bobcopeland.com> Acked-by: Nick Kossifidis <mickflemm@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>