aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* ACPI / scan: Add second pass of companion offlining to hot-remove codeRafael J. Wysocki2013-06-01
| | | | | | | | | | | | | | | | As indicated by comments in mm/memory_hotplug.c:remove_memory(), if CONFIG_MEMCG is set, it may not be possible to offline all of the memory blocks held by one module (FRU) in one pass (because one of them may be used by the others to store page cgroup in that case and that block has to be offlined before the other ones). To handle that arguably corner case, add a second pass of companion device offlining to acpi_scan_hot_remove() and make it ignore errors returned in the first pass (and make it skip the second pass if the first one is successful). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Toshi Kani <toshi.kani@hp.com>
* Driver core / MM: Drop offline_memory_block()Rafael J. Wysocki2013-06-01
| | | | | | | | | | Since offline_memory_block(mem) is functionally equivalent to device_offline(&mem->dev), make the only caller of the former use the latter instead and drop offline_memory_block() entirely. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Toshi Kani <toshi.kani@hp.com>
* ACPI / processor: Pass processor object handle to acpi_bind_one()Rafael J. Wysocki2013-06-01
| | | | | | | | | | Make acpi_processor_add() pass the ACPI handle of the processor namespace object to acpi_bind_one() instead of setting it directly to allow acpi_bind_one() to catch possible bugs causing the ACPI handle of the processor device to be set earlier. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Toshi Kani <toshi.kani@hp.com>
* ACPI: Drop removal_type field from struct acpi_deviceRafael J. Wysocki2013-06-01
| | | | | | | | | | The ACPI processor driver was the only user of the removal_type field in struct acpi_device, but it doesn't use that field any more after recent changes. Thus, removal_type has no more users, so drop it along with the associated data type. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Toshi Kani <toshi.kani@hp.com>
* Driver core / memory: Simplify __memory_block_change_state()Rafael J. Wysocki2013-06-01
| | | | | | | | | | | | | | As noted by Tang Chen, the last_online field in struct memory_block introduced by commit 4960e05 (Driver core: Introduce offline/online callbacks for memory blocks) is not really necessary, because online_pages() restores the previous state if passed ONLINE_KEEP as the last argument. Therefore, remove that field along with the code referring to it. References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
* ACPI / processor: Initialize per_cpu(processors, pr->id) properlyRafael J. Wysocki2013-05-30
| | | | | | | | | | | | | | | | | | | | | Commit ac212b6 (ACPI / processor: Use common hotplug infrastructure) forgot about initializing the per-CPU 'processors' variables which lead to ACPI cpuidle failure to use C-states and caused boot slowdown on multi-CPU machines. Fix the problem by adding per_cpu(processors, pr->id) initialization to acpi_processor_add() and add make acpi_processor_remove() clean it up as appropriate. Also modify acpi_processor_stop() so that it doesn't clear per_cpu(processors, pr->id) on processor driver removal which would then cause problems to happen when the driver is loaded again. This version of the patch contains fixes from Yinghai Lu. Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org> Reported-and-tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* CPU: Fix sysfs cpu/online of offlined CPUsToshi Kani2013-05-29
| | | | | | | | | | | | | As reported by Dave Hansen, sysfs cpu/online shows 1 for offlined CPUs at boot. Fix this problem by initializing dev.offline with cpu_online() when registering a CPU. References: https://lkml.org/lkml/2013/5/29/403 Reported-and-tested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Driver core: Introduce offline/online callbacks for memory blocksRafael J. Wysocki2013-05-12
| | | | | | | | | | | | | | | | | | | Introduce .offline() and .online() callbacks for memory_subsys that will allow the generic device_offline() and device_online() to be used with device objects representing memory blocks. That, in turn, allows the ACPI subsystem to use device_offline() to put removable memory blocks offline, if possible, before removing memory modules holding them. The 'online' sysfs attribute of memory block devices will attempt to put them offline if 0 is written to it and will attempt to apply the previously used online type when onlining them (i.e. when 1 is written to it). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
* ACPI / memhotplug: Bind removable memory blocks to ACPI device nodesRafael J. Wysocki2013-05-12
| | | | | | | | | | | | | | | | | | | | | | | During ACPI memory hotplug configuration bind memory blocks residing in modules removable through the standard ACPI mechanism to struct acpi_device objects associated with ACPI namespace objects representing those modules. Accordingly, unbind those memory blocks from the struct acpi_device objects when the memory modules in question are being removed. When "offline" operation for devices representing memory blocks is introduced, this will allow the ACPI core's device hot-remove code to use it to carry out remove_memory() for those memory blocks and check the results of that before it actually removes the modules holding them from the system. Since walk_memory_range() is used for accessing all memory blocks corresponding to a given ACPI namespace object, it is exported from memory_hotplug.c so that the code in acpi_memhotplug.c can use it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
* ACPI / processor: Use common hotplug infrastructureRafael J. Wysocki2013-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split the ACPI processor driver into two parts, one that is non-modular, resides in the ACPI core and handles the enumeration and hotplug of processors and one that implements the rest of the existing processor driver functionality. The non-modular part uses an ACPI scan handler object to enumerate processors on the basis of information provided by the ACPI namespace and to hook up with the common ACPI hotplug infrastructure. It also populates the ACPI handle of each processor device having a corresponding object in the ACPI namespace, which allows the driver proper to bind to those devices, and makes the driver bind to them if it is readily available (i.e. loaded) when the scan handler's .attach() routine is running. There are a few reasons to make this change. First, switching the ACPI processor driver to using the common ACPI hotplug infrastructure reduces code duplication and size considerably, even though a new file is created along with a header comment etc. Second, since the common hotplug code attempts to offline devices before starting the (non-reversible) removal procedure, it will abort (and possibly roll back) hot-remove operations involving processors if cpu_down() returns an error code for one of them instead of continuing them blindly (if /sys/firmware/acpi/hotplug/force_remove is unset). That is a more desirable behavior than what the current code does. Finally, the separation of the scan/hotplug part from the driver proper makes it possible to simplify the driver's .remove() routine, because it doesn't need to worry about the possible cleanup related to processor removal any more (the scan/hotplug part is responsible for that now) and can handle device removal and driver removal symmetricaly (i.e. as appropriate). Some user-visible changes in sysfs are made (for example, the 'sysdev' link from the ACPI device node to the processor device's directory is gone and a 'physical_node' link is present instead and a corresponding 'firmware_node' is present in the processor device's directory, the processor driver is now visible under /sys/bus/cpu/drivers/ and bound to the processor device), but that shouldn't affect the functionality that users care about (frequency scaling, C-states and thermal management). Tested on my venerable Toshiba Portege R500. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
* ACPI / hotplug: Use device offline/online for graceful hot-removalRafael J. Wysocki2013-05-12
| | | | | | | | | | | | | | | | | | | | Modify the generic ACPI hotplug code to be able to check if devices scheduled for hot-removal may be gracefully removed from the system using the device offline/online mechanism introduced previously. Namely, make acpi_scan_hot_remove() handling device hot-removal call device_offline() for all physical companions of the ACPI device nodes involved in the operation and check the results. If any of the device_offline() calls fails, the function will not progress to the removal phase (which cannot be aborted), unless its (new) force argument is set (in case of a failing offline it will put the devices offlined by it back online). In support of 'forced' device hot-removal, add a new sysfs attribute 'force_remove' that will reside under /sys/firmware/acpi/hotplug/. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
* Driver core: Use generic offline/online for CPU offline/onlineRafael J. Wysocki2013-05-12
| | | | | | | | | | | | | | | | | | | Rework the CPU hotplug code in drivers/base/cpu.c to use the generic offline/online support introduced previously instead of its own CPU-specific code. For this purpose, modify cpu_subsys to provide offline and online callbacks for CONFIG_HOTPLUG_CPU set and remove the code handling the CPU-specific 'online' sysfs attribute. This modification is not supposed to change the user-observable behavior of the kernel (i.e. the 'online' attribute will be present in exactly the same place in sysfs and should trigger exactly the same actions as before). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
* Driver core: Add offline/online device operationsRafael J. Wysocki2013-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases, graceful hot-removal of devices is not possible, although in principle the devices in question support hotplug. For example, that may happen for the last CPU in the system or for memory modules holding kernel memory. In those cases it is nice to be able to check if the given device can be gracefully hot-removed before triggering a removal procedure that cannot be aborted or reversed. Unfortunately, however, the kernel currently doesn't provide any support for that. To address that deficiency, introduce support for offline and online operations that can be performed on devices, respectively, before a hot-removal and in case when it is necessary (or convenient) to put a device back online after a successful offline (that has not been followed by removal). The idea is that the offline will fail whenever the given device cannot be gracefully removed from the system and it will not be allowed to use the device after a successful offline (until a subsequent online) in analogy with the existing CPU offline/online mechanism. For now, the offline and online operations are introduced at the bus type level, as that should be sufficient for the most urgent use cases (CPUs and memory modules). In the future, however, the approach may be extended to cover some more complicated device offline/online scenarios involving device drivers etc. The lock_device_hotplug() and unlock_device_hotplug() functions are introduced because subsequent patches need to put larger pieces of code under device_hotplug_lock to prevent race conditions between device offline and removal from happening. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Toshi Kani <toshi.kani@hp.com>
* Merge branch 'acpi-fixes' into acpi-hotplugRafael J. Wysocki2013-05-12
|\ | | | | | | | | | | | | | | | | * acpi-fixes: ACPI / AC: Add sleep quirk for Thinkpad e530 ACPI / EC: Restart transaction even when the IBF flag set ACPI video: ignore BIOS initial backlight value for HP 1000 ACPI: Fix section to __init. Align with usage in acpixf.h ACPI / PM: Move processor suspend/resume to syscore_ops
| * ACPI / AC: Add sleep quirk for Thinkpad e530Lan Tianyu2013-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Thinkpad e530's BIOS notifies the AC device first and then sleeps for certain amount of time before doing real work in the EC event handler (_Qxx): Method (_Q27, 0, NotSerialized) { Notify (AC, 0x80) Sleep (0x03E8) Store (Zero, PWRS) PNOT () } This causes the AC driver to report an outdated AC state to user space, because it reads the state information from the device while the EC handler is sleeping. Introduce a quirk to cause the AC driver to wait in acpi_ac_notify() before calling acpi_ac_get_state() on systems known to have this problem and add Thinkpad e530 to the list of quirky machines (with a 1s delay which has been verified to be sufficient for that machine). [rjw: Changelog] References: https://bugzilla.kernel.org/show_bug.cgi?id=45221 Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * ACPI / EC: Restart transaction even when the IBF flag setLan Tianyu2013-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The EC driver works abnormally with IBF flag always set. IBF means "The host has written a byte of data to the command or data port, but the embedded controller has not yet read it". If IBF is set in the EC status and not cleared, this will cause all subsequent EC requests to fail with a timeout error. Change the EC driver so that it doesn't refuse to restart a transaction if IBF is set in the status. Also increase the number of transaction restarts to 5, as it turns out that 2 is not sufficient in some cases. This bug happens on several different machines (Asus V1S, Dell Latitude E6530, Samsung R719, Acer Aspire 5930G, Sony Vaio SR19VN and others). [rjw: Changelog] References: https://bugzilla.kernel.org/show_bug.cgi?id=14733 References: https://bugzilla.kernel.org/show_bug.cgi?id=15560 References: https://bugzilla.kernel.org/show_bug.cgi?id=15946 References: https://bugzilla.kernel.org/show_bug.cgi?id=42945 References: https://bugzilla.kernel.org/show_bug.cgi?id=48221 Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Cc: All <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * ACPI video: ignore BIOS initial backlight value for HP 1000Alex Hung2013-05-12
| | | | | | | | | | | | | | | | | | | | On HP 1000 lapops, BIOS reports minimum backlight on boot and causes backlight to dim completely. This ignores the initial backlight values and set to max brightness. References:: https://bugs.launchpad.net/bugs/1167760 Signed-off-by: Alex Hung <alex.hung@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * ACPI: Fix section to __init. Align with usage in acpixf.hJan-Simon Möller2013-05-12
| | | | | | | | | | | | | | | | Fixes warning during compilation with clang. [rjw: Subject and changelog] Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * ACPI / PM: Move processor suspend/resume to syscore_opsRafael J. Wysocki2013-05-12
|/ | | | | | | | | | | | | | | The system suspend routine of the ACPI processor driver saves the BUS_MASTER_RLD register and its resume routine restores it. However, there can be only one such register in the system and it really should be saved after non-boot CPUs have been offlined and restored before they are put back online during resume. For this reason, move the saving and restoration of BUS_MASTER_RLD to syscore suspend and syscore resume, respectively, and drop the no longer necessary suspend/resume callbacks from the ACPI processor driver. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Linux 3.10-rc1Linus Torvalds2013-05-11
|
* Merge tag 'trace-fixes-v3.10' of ↵Linus Torvalds2013-05-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing/kprobes update from Steven Rostedt: "The majority of these changes are from Masami Hiramatsu bringing kprobes up to par with the latest changes to ftrace (multi buffering and the new function probes). He also discovered and fixed some bugs in doing so. When pulling in his patches, I also found a few minor bugs as well and fixed them. This also includes a compile fix for some archs that select the ring buffer but not tracing. I based this off of the last patch you took from me that fixed the merge conflict error, as that was the commit that had all the changes I needed for this set of changes." * tag 'trace-fixes-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/kprobes: Support soft-mode disabling tracing/kprobes: Support ftrace_event_file base multibuffer tracing/kprobes: Pass trace_probe directly from dispatcher tracing/kprobes: Increment probe hit-count even if it is used by perf tracing/kprobes: Use bool for retprobe checker ftrace: Fix function probe when more than one probe is added ftrace: Fix the output of enabled_functions debug file ftrace: Fix locking in register_ftrace_function_probe() tracing: Add helper function trace_create_new_event() to remove duplicate code tracing: Modify soft-mode only if there's no other referrer tracing: Indicate enabled soft-mode in enable file tracing/kprobes: Fix to increment return event probe hit-count ftrace: Cleanup regex_lock and ftrace_lock around hash updating ftrace, kprobes: Fix a deadlock on ftrace_regex_lock ftrace: Have ftrace_regex_write() return either read or error tracing: Return error if register_ftrace_function_probe() fails for event_enable_func() tracing: Don't succeed if event_enable_func did not register anything ring-buffer: Select IRQ_WORK
| * tracing/kprobes: Support soft-mode disablingMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support soft-mode disabling on kprobe-based dynamic events. Soft-disabling is just ignoring recording if the soft disabled flag is set. Link: http://lkml.kernel.org/r/20130509054454.30398.7237.stgit@mhiramat-M0-7522 Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/kprobes: Support ftrace_event_file base multibufferMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support multi-buffer on kprobe-based dynamic events by using ftrace_event_file. Link: http://lkml.kernel.org/r/20130509054449.30398.88343.stgit@mhiramat-M0-7522 Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/kprobes: Pass trace_probe directly from dispatcherMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass the pointer of struct trace_probe directly from probe dispatcher to handlers. This removes redundant container_of macro uses. Same thing has already done in trace_uprobe. Link: http://lkml.kernel.org/r/20130509054441.30398.69112.stgit@mhiramat-M0-7522 Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/kprobes: Increment probe hit-count even if it is used by perfMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Increment probe hit-count for profiling even if it is used by perf tool. Same thing has already done in trace_uprobe. Link: http://lkml.kernel.org/r/20130509054436.30398.21133.stgit@mhiramat-M0-7522 Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/kprobes: Use bool for retprobe checkerMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Use bool instead of int for kretprobe checker. Link: http://lkml.kernel.org/r/20130509054431.30398.38561.stgit@mhiramat-M0-7522 Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace: Fix function probe when more than one probe is addedSteven Rostedt (Red Hat)2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the first function probe is added and the function tracer is updated the functions are modified to call the probe. But when a second function is added, it updates the function records to have the second function also update, but it fails to update the actual function itself. This prevents the second (or third or forth and so on) probes from having their functions called. # echo vfs_symlink:enable_event:sched:sched_switch > set_ftrace_filter # echo vfs_unlink:enable_event:sched:sched_switch > set_ftrace_filter # cat trace # tracer: nop # # entries-in-buffer/entries-written: 0/0 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | # touch /tmp/a # rm /tmp/a # cat trace # tracer: nop # # entries-in-buffer/entries-written: 0/0 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | # ln -s /tmp/a # cat trace # tracer: nop # # entries-in-buffer/entries-written: 414/414 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] d..3 2847.923031: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=bash next_pid=2786 next_prio=120 <...>-3114 [001] d..4 2847.923035: sched_switch: prev_comm=ln prev_pid=3114 prev_prio=120 prev_state=x ==> next_comm=swapper/1 next_pid=0 next_prio=120 bash-2786 [000] d..3 2847.923535: sched_switch: prev_comm=bash prev_pid=2786 prev_prio=120 prev_state=S ==> next_comm=kworker/0:1 next_pid=34 next_prio=120 kworker/0:1-34 [000] d..3 2847.923552: sched_switch: prev_comm=kworker/0:1 prev_pid=34 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 <idle>-0 [002] d..3 2847.923554: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=sshd next_pid=2783 next_prio=120 sshd-2783 [002] d..3 2847.923660: sched_switch: prev_comm=sshd prev_pid=2783 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120 Still need to update the functions even though the probe itself does not need to be registered again when added a new probe. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace: Fix the output of enabled_functions debug fileSteven Rostedt (Red Hat)2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The enabled_functions debugfs file was created to be able to see what functions have been modified from nops to calling a tracer. The current method uses the counter in the function record. As when a ftrace_ops is registered to a function, its count increases. But that doesn't mean that the function is actively being traced. /proc/sys/kernel/ftrace_enabled can be set to zero which would disable it, as well as something can go wrong and we can think its enabled when only the counter is set. The record's FTRACE_FL_ENABLED flag is set or cleared when its function is modified. That is a much more accurate way of knowing what function is enabled or not. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace: Fix locking in register_ftrace_function_probe()Steven Rostedt (Red Hat)2013-05-09
| | | | | | | | | | | | | | The iteration of the ftrace function list and the call to ftrace_match_record() need to be protected by the ftrace_lock. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Add helper function trace_create_new_event() to remove duplicate codeSteven Rostedt (Red Hat)2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Both __trace_add_new_event() and __trace_early_add_new_event() do basically the same thing, except that __trace_add_new_event() does a little more. Instead of having duplicate code between the two functions, add a helper function trace_create_new_event() that both can use. This will help against having bugs fixed in one function but not the other. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Modify soft-mode only if there's no other referrerMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Modify soft-mode flag only if no other soft-mode referrer (currently only the ftrace triggers) by using a reference counter in each ftrace_event_file. Without this fix, adding and removing several different enable/disable_event triggers on the same event clear soft-mode bit from the ftrace_event_file. This also happens with a typo of glob on setting triggers. e.g. # echo vfs_symlink:enable_event:net:netif_rx > set_ftrace_filter # cat events/net/netif_rx/enable 0* # echo typo_func:enable_event:net:netif_rx > set_ftrace_filter # cat events/net/netif_rx/enable 0 # cat set_ftrace_filter #### all functions enabled #### vfs_symlink:enable_event:net:netif_rx:unlimited As above, we still have a trigger, but soft-mode is gone. Link: http://lkml.kernel.org/r/20130509054429.30398.7464.stgit@mhiramat-M0-7522 Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: David Sharp <dhsharp@google.com> Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Indicate enabled soft-mode in enable fileMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Indicate enabled soft-mode event as "1*" in "enable" file for each event, because it can be soft-disabled when disable_event trigger is hit. Link: http://lkml.kernel.org/r/20130509054426.30398.28202.stgit@mhiramat-M0-7522 Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing/kprobes: Fix to increment return event probe hit-countMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix to increment probe hit-count for function return event. Link: http://lkml.kernel.org/r/20130509054424.30398.34058.stgit@mhiramat-M0-7522 Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace: Cleanup regex_lock and ftrace_lock around hash updatingMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cleanup regex_lock and ftrace_lock locking points around ftrace_ops hash update code. The new rule is that regex_lock protects ops->*_hash read-update-write code for each ftrace_ops. Usually, hash update is done by following sequence. 1. allocate a new local hash and copy the original hash. 2. update the local hash. 3. move(actually, copy) back the local hash to ftrace_ops. 4. update ftrace entries if needed. 5. release the local hash. This makes regex_lock protect #1-#4, and ftrace_lock to protect #3, #4 and adding and removing ftrace_ops from the ftrace_ops_list. The ftrace_lock protects #3 as well because the move functions update the entries too. Link: http://lkml.kernel.org/r/20130509054421.30398.83411.stgit@mhiramat-M0-7522 Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace, kprobes: Fix a deadlock on ftrace_regex_lockMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a deadlock on ftrace_regex_lock which happens when setting an enable_event trigger on dynamic kprobe event as below. ---- sh-2.05b# echo p vfs_symlink > kprobe_events sh-2.05b# echo vfs_symlink:enable_event:kprobes:p_vfs_symlink_0 > set_ftrace_filter ============================================= [ INFO: possible recursive locking detected ] 3.9.0+ #35 Not tainted --------------------------------------------- sh/72 is trying to acquire lock: (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810ba6c1>] ftrace_set_hash+0x81/0x1f0 but task is already holding lock: (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810b7cbd>] ftrace_regex_write.isra.29.part.30+0x3d/0x220 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(ftrace_regex_lock); lock(ftrace_regex_lock); *** DEADLOCK *** ---- To fix that, this introduces a finer regex_lock for each ftrace_ops. ftrace_regex_lock is too big of a lock which protects all filter/notrace_hash operations, but it doesn't need to be a global lock after supporting multiple ftrace_ops because each ftrace_ops has its own filter/notrace_hash. Link: http://lkml.kernel.org/r/20130509054417.30398.84254.stgit@mhiramat-M0-7522 Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> [ Added initialization flag and automate mutex initialization for non ftrace.c ftrace_probes. ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace: Have ftrace_regex_write() return either read or errorSteven Rostedt (Red Hat)2013-05-09
| | | | | | | | | | | | | | | | | | | | | | As ftrace_regex_write() reads the result of ftrace_process_regex() which can sometimes return a positive number, only consider a failure if the return is negative. Otherwise, it will skip possible other registered probes and by returning a positive number that wasn't read, it will confuse the user processes doing the writing. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Return error if register_ftrace_function_probe() fails for ↵Steven Rostedt (Red Hat)2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | event_enable_func() register_ftrace_function_probe() returns the number of functions it registered, which can be zero, it can also return a negative number if something went wrong. But event_enable_func() only checks for the case that it didn't register anything, it needs to also check for the case that something went wrong and return that error code as well. Added some comments about the code as well, to make it more understandable. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Don't succeed if event_enable_func did not register anythingMasami Hiramatsu2013-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return 0 instead of the number of activated ftrace function probes if event_enable_func succeeded and return an error code if it failed or did not register any functions. But it currently returns the number of registered functions and if it didn't register anything, it returns 0, but that is considered success. This also fixes the return value. As if it succeeds, it returns the number of functions that were enabled, which is returned back to the user in ftrace_regex_write (the write() return code). If only one function is enabled, then the return code of the write is one, and this can confuse the user program in thinking it only wrote 1 byte. Link: http://lkml.kernel.org/r/20130509054413.30398.55650.stgit@mhiramat-M0-7522 Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tom Zanussi <tom.zanussi@intel.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> [ Rewrote change log to reflect that this fixes two bugs - SR ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ring-buffer: Select IRQ_WORKSteven Rostedt (Red Hat)2013-05-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the wake up logic for waiters on the buffer has been moved from the tracing code to the ring buffer, it requires also adding IRQ_WORK as the wake up code is performed via irq_work. This fixes compile breakage when a user of the ring buffer is selected but tracing and irq_work are not. Link http://lkml.kernel.org/r/20130503115332.GT8356@rric.localhost Cc: Arnd Bergmann <arnd@arndb.de> Reported-by: Robert Richter <rric@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | Merge tag 'stable/for-linus-3.10-rc0-tag-two' of ↵Linus Torvalds2013-05-11
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen bug-fixes from Konrad Rzeszutek Wilk: - More fixes in the vCPU PVHVM hotplug path. - Add more documentation. - Fix various ARM related issues in the Xen generic drivers. - Updates in the xen-pciback driver per Bjorn's updates. - Mask the x2APIC feature for PV guests. * tag 'stable/for-linus-3.10-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pci: Used cached MSI-X capability offset xen/pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK xen: clear IRQ_NOAUTOEN and IRQ_NOREQUEST xen: mask x2APIC feature in PV xen: SWIOTLB is only used on x86 xen/spinlock: Fix check from greater than to be also be greater or equal to. xen/smp/pvhvm: Don't point per_cpu(xen_vpcu, 33 and larger) to shared_info xen/vcpu: Document the xen_vcpu_info and xen_vcpu xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.
| * | xen/pci: Used cached MSI-X capability offsetBjorn Helgaas2013-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We now cache the MSI-X capability offset in the struct pci_dev, so no need to find the capability again. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASKBjorn Helgaas2013-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PCI_MSIX_FLAGS_BIRMASK is mis-named because the BIR mask is in the Table Offset register, not the flags ("Message Control" per spec) register. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen: clear IRQ_NOAUTOEN and IRQ_NOREQUESTJulien Grall2013-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reset the IRQ_NOAUTOEN and IRQ_NOREQUEST flags that are enabled by default on ARM. If IRQ_NOAUTOEN is set, __setup_irq doesn't call irq_startup, that is responsible for calling irq_unmask at startup time. As a result event channels remain masked. The clear is already made in bind_evtchn_to_irq with commit a8636c0 but was missing on all others bind_*_to_irq. Move the clear in xen_irq_info_common_init. On x86, IRQ_NOAUTOEN and IRQ_NOREQUEST are cleared by default, so this commit doesn't impact this architecture. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen: mask x2APIC feature in PVZhenzhong Duan2013-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On x2apic enabled pvm, doing sysrq+l, got NULL pointer dereference as below. SysRq : Show backtrace of all active CPUs BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8125e3cb>] memcpy+0xb/0x120 Call Trace: [<ffffffff81039633>] ? __x2apic_send_IPI_mask+0x73/0x160 [<ffffffff8103973e>] x2apic_send_IPI_all+0x1e/0x20 [<ffffffff8103498c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0 [<ffffffff81501be4>] ? _raw_spin_lock_irqsave+0x34/0x50 [<ffffffff8131654e>] sysrq_handle_showallcpus+0xe/0x10 [<ffffffff8131616d>] __handle_sysrq+0x7d/0x140 [<ffffffff81316230>] ? __handle_sysrq+0x140/0x140 [<ffffffff81316287>] write_sysrq_trigger+0x57/0x60 [<ffffffff811ca996>] proc_reg_write+0x86/0xc0 [<ffffffff8116dd8e>] vfs_write+0xce/0x190 [<ffffffff8116e3e5>] sys_write+0x55/0x90 [<ffffffff8150a242>] system_call_fastpath+0x16/0x1b That's because apic points to apic_x2apic_cluster or apic_x2apic_phys but the basic element like cpumask isn't initialized. Mask x2APIC feature in pvm to avoid overwrite of apic pointer, update commit message per Konrad's suggestion. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Tested-by: Tamon Shiose <tamon.shiose@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen: SWIOTLB is only used on x86Arnd Bergmann2013-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enabling SWIOTLB_XEN on ARM results in build errors because the underlying SWIOTLB is only available on X86: drivers/xen/swiotlb-xen.c: In function 'is_xen_swiotlb_buffer': drivers/xen/swiotlb-xen.c:105:2: error: implicit declaration of function 'mfn_to_local_pfn Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/spinlock: Fix check from greater than to be also be greater or equal to.Konrad Rzeszutek Wilk2013-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During review of git commit cb9c6f15f318aa3aeb62fe525aa5c6dcf6eee159 ("xen/spinlock: Check against default value of -1 for IRQ line.") Stefano pointed out a bug in the patch. Unfortunatly due to vacation timing the fix was not applied and this patch fixes it up. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/smp/pvhvm: Don't point per_cpu(xen_vpcu, 33 and larger) to shared_infoKonrad Rzeszutek Wilk2013-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As it will point to some data, but not event channel data (the shared_info has an array limited to 32). This means that for PVHVM guests with more than 32 VCPUs without the usage of VCPUOP_register_info any interrupts to VCPUs larger than 32 would have gone unnoticed during early bootup. That is OK, as during early bootup, in smp_init we end up calling the hotplug mechanism (xen_hvm_cpu_notify) which makes the VCPUOP_register_vcpu_info call for all VCPUs and we can receive interrupts on VCPUs 33 and further. This is just a cleanup. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/vcpu: Document the xen_vcpu_info and xen_vcpuKonrad Rzeszutek Wilk2013-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They are important structures and it is not clear at first look what they are for. The xen_vcpu is a pointer. By default it points to the shared_info structure (at the CPU offset location). However if the VCPUOP_register_vcpu_info hypercall is implemented we can make the xen_vcpu pointer point to a per-CPU location. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> [v1: Added comments from Ian Campbell] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.Konrad Rzeszutek Wilk2013-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a user did: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online we would (this a build with DEBUG enabled) get to: smpboot: ++++++++++++++++++++=_---CPU UP 1 .. snip.. smpboot: Stack at about ffff880074c0ff44 smpboot: CPU1: has booted. and hang. The RCU mechanism would kick in an try to IPI the CPU1 but the IPIs (and all other interrupts) would never arrive at the CPU1. At first glance at least. A bit digging in the hypervisor trace shows that (using xenanalyze): [vla] d4v1 vec 243 injecting 0.043163027 --|x d4v1 intr_window vec 243 src 5(vector) intr f3 ] 0.043163639 --|x d4v1 vmentry cycles 1468 ] 0.043164913 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254 0.043164913 --|x d4v1 inj_virq vec 243 real [vla] d4v1 vec 243 injecting 0.043164913 --|x d4v1 intr_window vec 243 src 5(vector) intr f3 ] 0.043165526 --|x d4v1 vmentry cycles 1472 ] 0.043166800 --|x d4v1 vmexit exit_reason PENDING_INTERRUPT eip ffffffff81673254 0.043166800 --|x d4v1 inj_virq vec 243 real [vla] d4v1 vec 243 injecting there is a pending event (subsequent debugging shows it is the IPI from the VCPU0 when smpboot.c on VCPU1 has done "set_cpu_online(smp_processor_id(), true)") and the guest VCPU1 is interrupted with the callback IPI (0xf3 aka 243) which ends up calling __xen_evtchn_do_upcall. The __xen_evtchn_do_upcall seems to do *something* but not acknowledge the pending events. And the moment the guest does a 'cli' (that is the ffffffff81673254 in the log above) the hypervisor is invoked again to inject the IPI (0xf3) to tell the guest it has pending interrupts. This repeats itself forever. The culprit was the per_cpu(xen_vcpu, cpu) pointer. At the bootup we set each per_cpu(xen_vcpu, cpu) to point to the shared_info->vcpu_info[vcpu] but later on use the VCPUOP_register_vcpu_info to register per-CPU structures (xen_vcpu_setup). This is used to allow events for more than 32 VCPUs and for performance optimizations reasons. When the user performs the VCPU hotplug we end up calling the the xen_vcpu_setup once more. We make the hypercall which returns -EINVAL as it does not allow multiple registration calls (and already has re-assigned where the events are being set). We pick the fallback case and set per_cpu(xen_vcpu, cpu) to point to the shared_info->vcpu_info[vcpu] (which is a good fallback during bootup). However the hypervisor is still setting events in the register per-cpu structure (per_cpu(xen_vcpu_info, cpu)). As such when the events are set by the hypervisor (such as timer one), and when we iterate in __xen_evtchn_do_upcall we end up reading stale events from the shared_info->vcpu_info[vcpu] instead of the per_cpu(xen_vcpu_info, cpu) structures. Hence we never acknowledge the events that the hypervisor has set and the hypervisor keeps on reminding us to ack the events which we never do. The fix is simple. Don't on the second time when xen_vcpu_setup is called over-write the per_cpu(xen_vcpu, cpu) if it points to per_cpu(xen_vcpu_info). Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | Merge tag 'scsi-for-linus' of ↵Linus Torvalds2013-05-11
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull second SCSI update from James "Jaj B" Bottomley: "This is the final round of SCSI patches for the merge window. It consists mostly of driver updates (bnx2fc, ibmfc, fnic, lpfc, be2iscsi, pm80xx, qla4x and ipr). There's also the power management updates that complete the patches in Jens' tree, an iscsi refcounting problem fix from the last pull, some dif handling in scsi_debug fixes, a few nice code cleanups and an error handling busy bug fix." * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (92 commits) [SCSI] qla2xxx: Update firmware link in Kconfig file. [SCSI] iscsi class, qla4xxx: fix sess/conn refcounting when find fns are used [SCSI] sas: unify the pointlessly separated enums sas_dev_type and sas_device_type [SCSI] pm80xx: thermal, sas controller config and error handling update [SCSI] pm80xx: NCQ error handling changes [SCSI] pm80xx: WWN Modification for PM8081/88/89 controllers [SCSI] pm80xx: Changed module name and debug messages update [SCSI] pm80xx: Firmware flash memory free fix, with addition of new memory region for it [SCSI] pm80xx: SPC new firmware changes for device id 0x8081 alone [SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files [SCSI] pm80xx: MSI-X implementation for using 64 interrupts [SCSI] pm80xx: Updated common functions common for SPC and SPCv/ve [SCSI] pm80xx: Multiple inbound/outbound queue configuration [SCSI] pm80xx: Added SPCv/ve specific ids, variables and modify for SPC [SCSI] lpfc: fix up Kconfig dependencies [SCSI] Handle MLQUEUE busy response in scsi_send_eh_cmnd [SCSI] sd: change to auto suspend mode [SCSI] sd: use REQ_PM in sd's runtime suspend operation [SCSI] qla4xxx: Fix iocb_cnt calculation in qla4xxx_send_mbox_iocb() [SCSI] ufs: Correct the expected data transfersize ...