aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* xen/grant-table: add error-handling code on failure of gnttab_resumeJulia Lawall2012-04-17
| | | | | | | | | | Jump to the label ini_nomem as done on the failure of the page allocations above. The code at ini_nomem is modified to accommodate different return values. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/pcifront: avoid pci_frontend_enable_msix() falsely returning successJan Beulich2012-04-06
| | | | | | | | | | The original XenoLinux code has always had things this way, and for compatibility reasons (in particular with a subsequent pciback adjustment) upstream Linux should behave the same way (allowing for two distinct error indications to be returned by the backend). Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/pciback: fix XEN_PCI_OP_enable_msix resultJan Beulich2012-04-06
| | | | | | | | | | | | | | Prior to 2.6.19 and as of 2.6.31, pci_enable_msix() can return a positive value to indicate the number of vectors (less than the amount requested) that can be set up for a given device. Returning this as an operation value (secondary result) is fine, but (primary) operation results are expected to be negative (error) or zero (success) according to the protocol. With the frontend fixed to match the XenoLinux behavior, the backend can now validly return zero (success) here, passing the upper limit on the number of vectors in op->value. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/smp: Remove unnecessary call to smp_processor_id()Srivatsa S. Bhat2012-04-06
| | | | | | | | There is an extra and unnecessary call to smp_processor_id() in cpu_bringup(). Remove it. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus ↵Konrad Rzeszutek Wilk2012-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | io-apic entries' The above mentioned patch checks the IOAPIC and if it contains -1, then it unmaps said IOAPIC. But under Xen we get this: BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 IP: [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0 PGD 0 Oops: 0002 [#1] SMP CPU 0 Modules linked in: Pid: 1, comm: swapper/0 Not tainted 3.2.10-3.fc16.x86_64 #1 Dell Inc. Inspiron 1525 /0U990C RIP: e030:[<ffffffff8134e51f>] [<ffffffff8134e51f>] xen_irq_init+0x1f/0xb0 RSP: e02b: ffff8800d42cbb70 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000ffffffef RCX: 0000000000000001 RDX: 0000000000000040 RSI: 00000000ffffffef RDI: 0000000000000001 RBP: ffff8800d42cbb80 R08: ffff8800d6400000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000ffffffef R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000010 FS: 0000000000000000(0000) GS:ffff8800df5fe000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0:000000008005003b CR2: 0000000000000040 CR3: 0000000001a05000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper/0 (pid: 1, threadinfo ffff8800d42ca000, task ffff8800d42d0000) Stack: 00000000ffffffef 0000000000000010 ffff8800d42cbbe0 ffffffff8134f157 ffffffff8100a9b2 ffffffff8182ffd1 00000000000000a0 00000000829e7384 0000000000000002 0000000000000010 00000000ffffffff 0000000000000000 Call Trace: [<ffffffff8134f157>] xen_bind_pirq_gsi_to_irq+0x87/0x230 [<ffffffff8100a9b2>] ? check_events+0x12+0x20 [<ffffffff814bab42>] xen_register_pirq+0x82/0xe0 [<ffffffff814bac1a>] xen_register_gsi.part.2+0x4a/0xd0 [<ffffffff814bacc0>] acpi_register_gsi_xen+0x20/0x30 [<ffffffff8103036f>] acpi_register_gsi+0xf/0x20 [<ffffffff8131abdb>] acpi_pci_irq_enable+0x12e/0x202 [<ffffffff814bc849>] pcibios_enable_device+0x39/0x40 [<ffffffff812dc7ab>] do_pci_enable_device+0x4b/0x70 [<ffffffff812dc878>] __pci_enable_device_flags+0xa8/0xf0 [<ffffffff812dc8d3>] pci_enable_device+0x13/0x20 The reason we are dying is b/c the call acpi_get_override_irq() is used, which returns the polarity and trigger for the IRQs. That function calls mp_find_ioapics to get the 'struct ioapic' structure - which along with the mp_irq[x] is used to figure out the default values and the polarity/trigger overrides. Since the mp_find_ioapics now returns -1 [b/c the IOAPIC is filled with 0xffffffff], the acpi_get_override_irq() stops trying to lookup in the mp_irq[x] the proper INT_SRV_OVR and we can't install the SCI interrupt. The proper fix for this is going in v3.5 and adds an x86_io_apic_ops struct so that platforms can override it. But for v3.4 lets carry this work-around. This patch does that by providing a slightly different variant of the fake IOAPIC entries. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: only check xen_platform_pci_unplug if hvmIgor Mammedov2012-04-06
| | | | | | | | | | | | commit b9136d207f08 xen: initialize platform-pci even if xen_emul_unplug=never breaks blkfront/netfront by not loading them because of xen_platform_pci_unplug=0 and it is never set for PV guest. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/acpi: Fix Kconfig dependency on CPU_FREQKonrad Rzeszutek Wilk2012-03-24
| | | | | | | | | | | | | | | The functions: "acpi_processor_*" sound like they depend on CONFIG_ACPI_PROCESSOR but in reality they are exposed when CONFIG_CPU_FREQ=[y|m]. As such update the Kconfig to have this dependency and fix compile issues: ERROR: "acpi_processor_unregister_performance" [drivers/xen/xen-acpi-processor.ko] undefined! ERROR: "acpi_processor_notify_smm" [drivers/xen/xen-acpi-processor.ko] undefined! ERROR: "acpi_processor_register_performance" [drivers/xen/xen-acpi-processor.ko] undefined! ERROR: "acpi_processor_preregister_performance" [drivers/xen/xen-acpi-processor.ko] undefined! Note: We still need the CONFIG_ACPI Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: initialize platform-pci even if xen_emul_unplug=neverIgor Mammedov2012-03-22
| | | | | | | | | | | | | | | When xen_emul_unplug=never is specified on kernel command line reading files from /sys/hypervisor is broken (returns -EBUSY). It is caused by xen_bus dependency on platform-pci and platform-pci isn't initialized when xen_emul_unplug=never is specified. Fix it by allowing platform-pci to ignore xen_emul_unplug=never, and do not intialize xen_[blk|net]front instead. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/smp: Fix bringup bug in AP code.Konrad Rzeszutek Wilk2012-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | The CPU hotplug code has now a callback to help bring up the CPU. Without the call we end up getting: BUG: soft lockup - CPU#0 stuck for 29s! [migration/0:6] Modules linked in: CPU ] Pid: 6, comm: migration/0 Not tainted 3.3.0upstream-01180-ged378a5 #1 Dell Inc. PowerEdge T105 /0RR825 RIP: e030:[<ffffffff810d3b8b>] [<ffffffff810d3b8b>] stop_machine_cpu_stop+0x7b/0xf0 RSP: e02b:ffff8800ceaabdb0 EFLAGS: 00000293 .. snip.. Call Trace: [<ffffffff810d3b10>] ? stop_one_cpu_nowait+0x50/0x50 [<ffffffff810d3841>] cpu_stopper_thread+0xf1/0x1c0 [<ffffffff815a9776>] ? __schedule+0x3c6/0x760 [<ffffffff815aa749>] ? _raw_spin_unlock_irqrestore+0x19/0x30 [<ffffffff810d3750>] ? res_counter_charge+0x150/0x150 [<ffffffff8108dc76>] kthread+0x96/0xa0 [<ffffffff815b27e4>] kernel_thread_helper+0x4/0x10 [<ffffffff815aacbc>] ? retint_restore_ar Thix fixes it. Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/acpi: Remove the WARN's as they just create noise.Konrad Rzeszutek Wilk2012-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When booting the kernel under machines that do not have P-states we would end up with: ------------[ cut here ]------------ WARNING: at drivers/xen/xen-acpi-processor.c:504 xen_acpi_processor_init+0x286/0 x2e0() Hardware name: ProLiant BL460c G6 Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.39-200.0.3.el5uek #1 Call Trace: [<ffffffff8191d056>] ? xen_acpi_processor_init+0x286/0x2e0 [<ffffffff81068300>] warn_slowpath_common+0x90/0xc0 [<ffffffff8191cdd0>] ? check_acpi_ids+0x1e0/0x1e0 [<ffffffff8106834a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8191d056>] xen_acpi_processor_init+0x286/0x2e0 [<ffffffff8191cdd0>] ? check_acpi_ids+0x1e0/0x1e0 [<ffffffff81002168>] do_one_initcall+0xe8/0x130 .. snip.. Which is OK - the machines do not have P-states, so we fail to register to process the _PXX states. But there is no need to WARN the user of it. Oracle BZ# 13871288 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/tmem: cleanupJan Beulich2012-03-20
| | | | | | | | | Use 'bool' for boolean variables. Do proper section placement. Eliminate an unnecessary export. Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: support pirq_eoi_mapStefano Stabellini2012-03-20
| | | | | | | | | | | | | | | | | | | | | | | | The pirq_eoi_map is a bitmap offered by Xen to check which pirqs need to be EOI'd without having to issue an hypercall every time. We use PHYSDEVOP_pirq_eoi_gmfn_v2 to map the bitmap, then if we succeed we use pirq_eoi_map to check whether pirqs need eoi. Changes in v3: - explicitly use PHYSDEVOP_pirq_eoi_gmfn_v2 rather than PHYSDEVOP_pirq_eoi_gmfn; - introduce pirq_check_eoi_map, a function to check if a pirq needs an eoi using the map; -rename pirq_needs_eoi into pirq_needs_eoi_flag; - introduce a function pointer called pirq_needs_eoi that is going to be set to the right implementation depending on the availability of PHYSDEVOP_pirq_eoi_gmfn_v2. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/acpi-processor: Do not depend on CPU frequency scaling drivers.Konrad Rzeszutek Wilk2012-03-20
| | | | | | | | | | | With patch "xen/cpufreq: Disable the cpu frequency scaling drivers from loading." we do not have to worry about said drivers loading themselves before the xen-acpi-processor driver. Hence we can remove the default selection (=y if CPU frequency drivers were built-in, or =m if CPU frequency drivers were built as modules), and just select =m for the default case. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/cpufreq: Disable the cpu frequency scaling drivers from loading.Konrad Rzeszutek Wilk2012-03-20
| | | | | | | | | | | | | | | By using the functionality provided by "[CPUFREQ]: provide disable_cpuidle() function to disable the API." Under the Xen hypervisor we do not want the initial domain to exercise the cpufreq scaling drivers. This is b/c the Xen hypervisor is in charge of doing this as well and we can end up with both the Linux kernel and the hypervisor trying to change the P-states leading to weird performance issues. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v2: Fix compile error spotted by Benjamin Schweikert <b.schweikert@googlemail.com>]
* provide disable_cpufreq() function to disable the API.Konrad Rzeszutek Wilk2012-03-20
| | | | | | | | useful for disabling cpufreq altogether. The cpu frequency scaling drivers and cpu frequency governors will fail to register. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Jones <davej@redhat.com>
* xen kconfig: relax INPUT_XEN_KBDDEV_FRONTEND depsAndrew Jones2012-03-16
| | | | | | | | | | | | PV-on-HVM guests may want to use the xen keyboard/mouse frontend, but they don't use the xen frame buffer frontend. For this case it doesn't make much sense for INPUT_XEN_KBDDEV_FRONTEND to depend on XEN_FBDEV_FRONTEND. The opposite direction always makes more sense, i.e. if you're using xenfb, then you'll want xenkbd. Switch the dependencies. Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.Konrad Rzeszutek Wilk2012-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This driver solves three problems: 1). Parse and upload ACPI0007 (or PROCESSOR_TYPE) information to the hypervisor - aka P-states (cpufreq data). 2). Upload the the Cx state information (cpuidle data). 3). Inhibit CPU frequency scaling drivers from loading. The reason for wanting to solve 1) and 2) is such that the Xen hypervisor is the only one that knows the CPU usage of different guests and can make the proper decision of when to put CPUs and packages in proper states. Unfortunately the hypervisor has no support to parse ACPI DSDT tables, hence it needs help from the initial domain to provide this information. The reason for 3) is that we do not want the initial domain to change P-states while the hypervisor is doing it as well - it causes rather some funny cases of P-states transitions. For this to work, the driver parses the Power Management data and uploads said information to the Xen hypervisor. It also calls acpi_processor_notify_smm() to inhibit the other CPU frequency scaling drivers from being loaded. Everything revolves around the 'struct acpi_processor' structure which gets updated during the bootup cycle in different stages. At the startup, when the ACPI parser starts, the C-state information is processed (processor_idle) and saved in said structure as 'power' element. Later on, the CPU frequency scaling driver (powernow-k8 or acpi_cpufreq), would call the the acpi_processor_* (processor_perflib functions) to parse P-states information and populate in the said structure the 'performance' element. Since we do not want the CPU frequency scaling drivers from loading we have to call the acpi_processor_* functions to parse the P-states and call "acpi_processor_notify_smm" to stop them from loading. There is also one oddity in this driver which is that under Xen, the physical online CPU count can be different from the virtual online CPU count. Meaning that the macros 'for_[online|possible]_cpu' would process only up to virtual online CPU count. We on the other hand want to process the full amount of physical CPUs. For that, the driver checks if the ACPI IDs count is different from the APIC ID count - which can happen if the user choose to use dom0_max_vcpu argument. In such a case a backup of the PM structure is used and uploaded to the hypervisor. [v1-v2: Initial RFC implementations that were posted] [v3: Changed the name to passthru suggested by Pasi Kärkkäinen <pasik@iki.fi>] [v4: Added vCPU != pCPU support - aka dom0_max_vcpus support] [v5: Cleaned up the driver, fix bug under Athlon XP] [v6: Changed the driver to a CPU frequency governor] [v7: Jan Beulich <jbeulich@suse.com> suggestion to make it a cpufreq scaling driver made me rework it as driver that inhibits cpufreq scaling driver] [v8: Per Jan's review comments, fixed up the driver] [v9: Allow to continue even if acpi_processor_preregister_perf.. fails] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: constify all instances of "struct attribute_group"Jan Beulich2012-03-14
| | | | | | | | The functions these get passed to have been taking pointers to const since at least 2.6.16. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/xenbus: ignore console/0Stefano Stabellini2012-03-13
| | | | | | | | | | | | | | Unfortunately xend creates a bogus console/0 frotend/backend entry pair on xenstore that console backends cannot properly cope with. Any guest behavior that is not completely ignoring console/0 is going to either cause problems with xenconsoled or qemu. Returning 0 or -ENODEV from xencons_probe is not enough because it is going to cause the frontend state to become 4 or 6 respectively. The best possible thing we can do here is just ignore the entry from xenbus_probe_frontend. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* hvc_xen: introduce HVC_XEN_FRONTENDStefano Stabellini2012-03-13
| | | | | | | | Introduce a new config option HVC_XEN_FRONTEND to enable/disable the xenbus based pv console frontend. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* hvc_xen: implement multiconsole supportStefano Stabellini2012-03-13
| | | | | | | | | | | | | | | | | This patch implements support for multiple consoles: consoles other than the first one are setup using the traditional xenbus and grant-table based mechanism. We use a list to keep track of the allocated consoles, we don't expect too many of them anyway. Changes in v3: - call hvc_remove before removing the console from xenconsoles; - do not lock xencons_lock twice in the destruction path; - use the DEFINE_XENBUS_DRIVER macro. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* hvc_xen: support PV on HVM consolesStefano Stabellini2012-03-13
| | | | | Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xenbus: don't free other end details too earlyJan Beulich2012-03-13
| | | | | | | | | | | The individual drivers' remove functions could legitimately attempt to access this information (for logging messages if nothing else). Note that I did not in fact observe a problem anywhere, but I came across this while looking into the reasons for what turned out to need the fix at https://lkml.org/lkml/2012/3/5/336 to vsprintf(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/enlighten: Expose MWAIT and MWAIT_LEAF if hypervisor OKs it.Konrad Rzeszutek Wilk2012-03-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the hypervisor to take advantage of the MWAIT support it needs to extract from the ACPI _CST the register address. But the hypervisor does not have the support to parse DSDT so it relies on the initial domain (dom0) to parse the ACPI Power Management information and push it up to the hypervisor. The pushing of the data is done by the processor_harveset_xen module which parses the information that the ACPI parser has graciously exposed in 'struct acpi_processor'. For the ACPI parser to also expose the Cx states for MWAIT, we need to expose the MWAIT capability (leaf 1). Furthermore we also need to expose the MWAIT_LEAF capability (leaf 5) for cstate.c to properly function. The hypervisor could expose these flags when it traps the XEN_EMULATE_PREFIX operations, but it can't do it since it needs to be backwards compatible. Instead we choose to use the native CPUID to figure out if the MWAIT capability exists and use the XEN_SET_PDC query hypercall to figure out if the hypervisor wants us to expose the MWAIT_LEAF capability or not. Note: The XEN_SET_PDC query was implemented in c/s 23783: "ACPI: add _PDC input override mechanism". With this in place, instead of C3 ACPI IOPORT 415 we get now C3:ACPI FFH INTEL MWAIT 0x20 Note: The cpu_idle which would be calling the mwait variants for idling never gets set b/c we set the default pm_idle to be the hypercall variant. Acked-by: Jan Beulich <JBeulich@suse.com> [v2: Fix missing header file include and #ifdef] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/setup/pm/acpi: Remove the call to boot_option_idle_override.Konrad Rzeszutek Wilk2012-03-10
| | | | | | | | | | We needed that call in the past to force the kernel to use default_idle (which called safe_halt, which called xen_safe_halt). But set_pm_idle_to_default() does now that, so there is no need to use this boot option operand. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xenbus: address compiler warningsJan Beulich2012-02-26
| | | | | | | | - casting pointers to integer types of different size is being warned on - an uninitialized variable warning occurred on certain gcc versions Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: use this_cpu_xxx replace percpu_xxx funcsAlex Shi2012-01-24
| | | | | | | | | | | | | | percpu_xxx funcs are duplicated with this_cpu_xxx funcs, so replace them for further code clean up. I don't know much of xen code. But, since the code is in x86 architecture, the percpu_xxx is exactly same as this_cpu_xxx serials functions. So, the change is safe. Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@gentwo.org> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen/pciback: Support pci_reset_function, aka FLR or D3 support.Konrad Rzeszutek Wilk2012-01-12
| | | | | | | | | | We use the __pci_reset_function_locked to perform the action. Also on attaching ("bind") and detaching ("unbind") we save and restore the configuration states. When the device is disconnected from a guest we use the "pci_reset_function" to also reset the device before being passed to another guest. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* pci: Introduce __pci_reset_function_locked to be used when holding device_lock.Konrad Rzeszutek Wilk2012-01-12
| | | | | | | | | | | | | | | | | | | | | | | | | The use case of this is when a driver wants to call FLR when a device is attached to it using the SysFS "bind" or "unbind" functionality. The call chain when a user does "bind" looks as so: echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind and ends up calling: driver_bind: device_lock(dev); <=== TAKES LOCK XXXX_probe: .. pci_enable_device() ...__pci_reset_function(), which calls pci_dev_reset(dev, 0): if (!0) { device_lock(dev) <==== DEADLOCK The __pci_reset_function_locked function allows the the drivers 'probe' function to call the "pci_reset_function" while still holding the driver mutex lock. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* xen: Utilize the restore_msi_irqs hook.Tang Liang2012-01-12
| | | | | | | | to make a hypercall to restore the vectors in the MSI/MSI-X configuration space. Signed-off-by: Tang Liang <liang.tang@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* Merge branch 'linux-next' of ↵Linus Torvalds2012-01-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits) x86/PCI: Expand the x86_msi_ops to have a restore MSIs. PCI: Increase resource array mask bit size in pcim_iomap_regions() PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT) PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB x86/PCI: amd: factor out MMCONFIG discovery PCI: Enable ATS at the device state restore PCI: msi: fix imbalanced refcount of msi irq sysfs objects PCI: kconfig: English typo in pci/pcie/Kconfig PCI/PM/Runtime: make PCI traces quieter PCI: remove pci_create_bus() xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus() x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented() x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources sparc/PCI: convert to pci_create_root_bus() sh/PCI: convert to pci_scan_root_bus() for correct root bus resources powerpc/PCI: convert to pci_create_root_bus() powerpc/PCI: split PHB part out of pcibios_map_io_space() ... Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due to the same patches being applied in other branches.
| * x86/PCI: Expand the x86_msi_ops to have a restore MSIs.Konrad Rzeszutek Wilk2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The MSI restore function will become a function pointer in an x86_msi_ops struct. It defaults to the implementation in the io_apic.c and msi.c. We piggyback on the indirection mechanism introduced by "x86: Introduce x86_msi_ops". Cc: x86@kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: linux-pci@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PCI: Increase resource array mask bit size in pcim_iomap_regions()Yinghai Lu2012-01-06
| | | | | | | | | | | | | | | | | | DEVICE_COUNT_RESOURCE will be bigger than 16 when SRIOV supported is enabled. Let them pass with int just like pci_enable_resources(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCESYinghai Lu2012-01-06
| | | | | | | | | | | | | | Save some bytes for device resource array. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT)Alessandro Rubini2012-01-06
| | | | | | | | | | | | | | | | | | | | | | The chip is an I/O hub used by some Atom boards. Most of those symbols are used in arch/x86/platform/sta2x11/sta2x11.c (to be introduced) and in the specific drivers as well. Signed-off-by: Alessandro Rubini <rubini@gnudd.com> Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com> Cc: Alan Cox <alan@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USBBjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some Dell BIOSes have MCFG tables that don't report the entire MMCONFIG area claimed by the chipset. If we move PCI devices into that claimed-but-unreported area, they don't work. This quirk reads the AMD MMCONFIG MSRs and adds PNP0C01 resources as needed to cover the entire area. Example problem scenario: BIOS-e820: 00000000cfec5400 - 00000000d4000000 (reserved) Fam 10h mmconf [d0000000, dfffffff] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xd0000000-0xd3ffffff] (base 0xd0000000) pnp 00:0c: [mem 0xd0000000-0xd3ffffff] pci 0000:00:12.0: reg 10: [mem 0xffb00000-0xffb00fff] pci 0000:00:12.0: no compatible bridge window for [mem 0xffb00000-0xffb00fff] pci 0000:00:12.0: BAR 0: assigned [mem 0xd4000000-0xd40000ff] Reported-by: Lisa Salimbas <lisa.salimbas@canonical.com> Reported-by: <thuban@singularity.fr> Tested-by: dann frazier <dann.frazier@canonical.com> References: https://bugzilla.kernel.org/show_bug.cgi?id=31602 References: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/647043 References: https://bugzilla.redhat.com/show_bug.cgi?id=770308 Cc: stable@kernel.org # 2.6.34+ Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * x86/PCI: amd: factor out MMCONFIG discoveryBjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This factors out the AMD native MMCONFIG discovery so we can use it outside amd_bus.c. amd_bus.c reads AMD MSRs so it can remove the MMCONFIG area from the PCI resources. We may also need the MMCONFIG information to work around BIOS defects in the ACPI MCFG table. Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: stable@kernel.org # 2.6.34+ Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PCI: Enable ATS at the device state restoreHao, Xudong2012-01-06
| | | | | | | | | | | | | | | | | | | | During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly. This patch enables ATS at the device state restore if PCI device has ATS capability. Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PCI: msi: fix imbalanced refcount of msi irq sysfs objectsNeil Horman2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This warning was recently reported to me: ------------[ cut here ]------------ WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60() Hardware name: VMware Virtual Platform kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is being called. Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10 vmw_pvscsi Pid: 630, comm: modprobe Tainted: G W 3.1.6-1.fc16.x86_64 #1 Call Trace: [<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50 [<ffffffff810da293>] ? free_desc+0x63/0x70 [<ffffffff812a9aa0>] kobject_put+0x50/0x60 [<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120 [<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0 [<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3] [<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3] [<ffffffff812d141c>] local_pci_probe+0x5c/0xd0 [<ffffffff812d2cb9>] pci_device_probe+0x109/0x130 [<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0 [<ffffffff8138bceb>] __driver_attach+0xab/0xb0 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0 [<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90 [<ffffffff8138b63e>] driver_attach+0x1e/0x20 [<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0 [<ffffffffa0188000>] ? 0xffffffffa0187fff [<ffffffff8138c246>] driver_register+0x76/0x140 [<ffffffff815ca414>] ? printk+0x51/0x53 [<ffffffffa0188000>] ? 0xffffffffa0187fff [<ffffffff812d2996>] __pci_register_driver+0x56/0xd0 [<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3] [<ffffffff81002042>] do_one_initcall+0x42/0x180 [<ffffffff810aad71>] sys_init_module+0x91/0x200 [<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b ---[ end trace 44593438a59a9558 ]--- Using INTx interrupt, #Rx queues: 1. It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to be called. Because populate_msi_sysfs fails, we never registered any of the msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put on each of them, which gets flagged in the above stack trace. The fix is pretty straightforward. We can key of the parent pointer in the kobject. It is only set if the kobject_init_and_add succededs in populate_msi_sysfs. If anything fails there, each kobject has its parent reset to NULL Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Bjorn Helgaas <bhelgaas@google.com> CC: Greg Kroah-Hartman <gregkh@suse.de> CC: linux-pci@vger.kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PCI: kconfig: English typo in pci/pcie/KconfigP. Christeas2012-01-06
| | | | | | | | | | | | | | Just fix this help text. Signed-off-by: P. Christeas <xrg@linux.gr> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PCI/PM/Runtime: make PCI traces quieterVincent Palatin2012-01-06
| | | | | | | | | | | | | | | | | | | | | | When the runtime PM is activated on PCI, if a device switches state frequently (e.g. an EHCI controller with autosuspending USB devices connected) the PCI configuration traces might be very verbose in the kernel log. Let's guard those traces with DEBUG condition. Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * PCI: remove pci_create_bus()Bjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | All users of pci_create_bus() have been converted to pci_create_root_bus(), so remove it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resourcesBjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert from pci_scan_bus() to pci_scan_root_bus() and remove root bus fixups. This fixes the problem of "early" and "header" quirks seeing incorrect root bus resources. This arch was unusual because it filled in bus->resource[0..3] in pcibios_init(), then overwrote them, applied io_space.offset and checked for unset resources in pcibios_fixup_bus(). I moved all of that to a new pci_controller_apertures() that we can use before scanning the root bus. CC: Chris Zankel <chris@zankel.net> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus()Bjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | x86 has two kinds of PCI root bus scanning: (1) ACPI-based, using _CRS resources. This used pci_create_bus(), not pci_scan_bus(), because ACPI hotplug needed to split the pci_bus_add_devices() into a separate host bridge .start() method. This patch parses the _CRS resources earlier, so we can build a list of resources and pass it to pci_create_root_bus(). Note that as before, we parse the _CRS even if we aren't going to use it so we can print it for debugging purposes. (2) All other, which used either default resources (ioport_resource and iomem_resource) or information read from the hardware via amd_bus.c or similar. This used pci_scan_bus(). This patch converts x86_pci_root_bus_res_quirks() (previously called from pcibios_fixup_bus()) to x86_pci_root_bus_resources(), which builds a list of resources before we call pci_scan_root_bus(). We also use x86_pci_root_bus_resources() if we have ACPI but are ignoring _CRS. CC: Yinghai Lu <yinghai.lu@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented()Bjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This doesn't change any functionality, but it makes a subsequent patch slightly simpler. pci_scan_bus(NULL, ...) and pci_scan_bus_parented() are identical except that pci_scan_bus() also calls pci_bus_add_devices(): pci_scan_bus_parented pci_create_bus pci_scan_child_bus pci_scan_bus pci_create_bus pci_scan_child_bus pci_bus_add_devices All callers of pcibios_scan_root() call pci_bus_add_devices() explicitly, and we don't pass a parent device, so we might as well use pci_scan_bus(). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * x86/PCI: read Broadcom CNB20LE host bridge info before PCI scanBjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently read the CNB20LE aperture information in a PCI quirk, which happens after we've already created the root bus. This patch changes it to read the apertures earlier so we can create the root bus with the correct resources. I believe the CNB20LE lives at "pci 0000:00:00" based on https://lkml.org/lkml/2010/8/13/220 CC: Ira W. Snyder <iws@ovro.caltech.edu> CC: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resourcesBjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert from pci_scan_bus_parented() to pci_scan_root_bus() and remove root bus resource fixups. This fixes the problem of "early" and "header" quirks seeing incorrect root bus resources. pci_scan_root_bus() also includes the pci_bus_add_devices() so we don't need to do that separately. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * sparc/PCI: convert to pci_create_root_bus()Bjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert from pci_create_bus() to pci_create_root_bus(). This way the root bus resources are correct immediately. This patch doesn't fix a problem because sparc fixed the resources before scanning the bus, but it makes sparc more consistent with other architectures. v2: fix build error (from sfr) Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * sh/PCI: convert to pci_scan_root_bus() for correct root bus resourcesBjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | Convert from pci_scan_bus() to pci_scan_root_bus() and remove root bus resource fixups. This fixes the problem of "early" and "header" quirks seeing incorrect root bus resources. CC: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * powerpc/PCI: convert to pci_create_root_bus()Bjorn Helgaas2012-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert from pci_create_bus() to pci_create_root_bus(). This way the root bus resources are correct immediately. This patch doesn't fix a problem because powerpc fixed the resources before scanning the bus, but it makes powerpc more consistent with other architectures. v2: fix build error with resource pointer passing CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>