aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* xen/Makefile: fix dom-y buildStefano Stabellini2012-10-02
| | | | | | | | We need to add $(dom0-y) to obj-$(CONFIG_XEN_DOM0) after dom0-y is defined otherwise we end up adding nothing. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* Merge branch 'xenarm-for-linus' of ↵Konrad Rzeszutek Wilk2012-09-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://xenbits.xen.org/people/sstabellini/linux-pvhvm into stable/for-linus-3.7 * 'xenarm-for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm: arm: introduce a DTS for Xen unprivileged virtual machines MAINTAINERS: add myself as Xen ARM maintainer xen/arm: compile netback xen/arm: compile blkfront and blkback xen/arm: implement alloc/free_xenballooned_pages with alloc_pages/kfree xen/arm: receive Xen events on ARM xen/arm: initialize grant_table on ARM xen/arm: get privilege status xen/arm: introduce CONFIG_XEN on ARM xen: do not compile manage, balloon, pci, acpi, pcpu and cpu_hotplug on ARM xen/arm: Introduce xen_ulong_t for unsigned long xen/arm: Xen detection and shared_info page mapping docs: Xen ARM DT bindings xen/arm: empty implementation of grant_table arch specific functions xen/arm: sync_bitops xen/arm: page.h definitions xen/arm: hypercalls arm: initial Xen support Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * arm: introduce a DTS for Xen unprivileged virtual machinesStefano Stabellini2012-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that the xenvm machine is based on vexpress but with an extremely limited selection of peripherals (the guest is supposed to use virtual devices instead), add "xen,xenvm" to the list of compatible machines in mach-vexpress. Changes in v3: - add comments to mark fields that are likely to be changed by the hypervisor. Changes in v2: - remove include skeleton; - use #address-cells = <2> and #size-cells = <2>; - remove the debug bootargs; - use memory@80000000 instead of memory; - remove the ranges and interrupt-map from the motherboard node; - set the machine compatible to "xen,xenvm-4.2", "xen,xenvm"; - rename the dts file to xenvm-4.2.dts; - add "xen,xenvm" to the list of compatible DT strings to mach-vexpress. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Pawel Moll <pawel.moll@arm.com> (v2m changes)
| * MAINTAINERS: add myself as Xen ARM maintainerStefano Stabellini2012-09-14
| | | | | | | | | | | | | | Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: Arnd Bergmann <arnd@arndb.de>
| * xen/arm: compile netbackStefano Stabellini2012-08-08
| | | | | | | | | | | | Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: compile blkfront and blkbackStefano Stabellini2012-08-08
| | | | | | | | | | | | Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: implement alloc/free_xenballooned_pages with alloc_pages/kfreeStefano Stabellini2012-08-08
| | | | | | | | | | | | | | | | Only until we get the balloon driver to work. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: receive Xen events on ARMStefano Stabellini2012-09-14
| | | | | | | | | | | | | | | | | | | | Compile events.c on ARM. Parse, map and enable the IRQ to get event notifications from the device tree (node "/xen"). Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: initialize grant_table on ARMStefano Stabellini2012-09-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize the grant table mapping at the address specified at index 0 in the DT under the /xen node. After the grant table is initialized, call xenbus_probe (if not dom0). Changes in v2: - introduce GRANT_TABLE_PHYSADDR; - remove unneeded initialization of boot_max_nr_grant_frames. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| * xen/arm: get privilege statusStefano Stabellini2012-08-08
| | | | | | | | | | | | | | | | | | | | Use Xen features to figure out if we are privileged. XENFEAT_dom0 was introduced by 23735 in xen-unstable.hg. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: introduce CONFIG_XEN on ARMStefano Stabellini2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in v5: - make XEN_DOM0 depend on XEN; - avoid "select XEN_DOM0" in XEN. Changes in v2: - mark Xen guest support on ARM as EXPERIMENTAL. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: Sergei Shtylyov <sshtylyov@mvista.com>
| * xen: do not compile manage, balloon, pci, acpi, pcpu and cpu_hotplug on ARMStefano Stabellini2012-09-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes in v4: - compile pcpu only on x86; - use "+=" instead of ":=" for dom0- targets. Changes in v2: - make pci.o depend on CONFIG_PCI and acpi.o depend on CONFIG_ACPI. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: Introduce xen_ulong_t for unsigned longStefano Stabellini2012-09-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All the original Xen headers have xen_ulong_t as unsigned long type, however when they have been imported in Linux, xen_ulong_t has been replaced with unsigned long. That might work for x86 and ia64 but it does not for arm. Bring back xen_ulong_t and let each architecture define xen_ulong_t as they see fit. Also explicitly size pointers (__DEFINE_GUEST_HANDLE) to 64 bit. Changes in v3: - remove the incorrect changes to multicall_entry; - remove the change to apic_physbase. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: Xen detection and shared_info page mappingStefano Stabellini2012-09-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check for a node in the device tree compatible with "xen,xen", if it is present set xen_domain_type to XEN_HVM_DOMAIN and continue initialization. Map the real shared info page using XENMEM_add_to_physmap with XENMAPSPACE_shared_info. Changes in v4: - simpler parsing of Xen version in the compatible DT node. Changes in v3: - use the "xen,xen" notation rather than "arm,xen"; - add an additional check on the presence of the Xen version. Changes in v2: - replace pr_info with pr_debug. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| * docs: Xen ARM DT bindingsStefano Stabellini2012-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a doc to describe the Xen ARM device tree bindings Changes in v5: - add a comment about the size of the grant table memory region; - add a comment about the required presence of a GIC node; - specify that the described properties are part of a top-level "hypervisor" node; - specify #address-cells and #size-cells for the example. Changes in v4: - "xen,xen" should be last as it is less specific; - update reg property using 2 address-cells and 2 size-cells. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Rob Herring <rob.herring@calxeda.com> CC: devicetree-discuss@lists.ozlabs.org CC: David Vrabel <david.vrabel@citrix.com> CC: Rob Herring <robherring2@gmail.com> CC: Dave Martin <dave.martin@linaro.org>
| * xen/arm: empty implementation of grant_table arch specific functionsStefano Stabellini2012-08-08
| | | | | | | | | | | | | | | | | | | | Changes in v2: - return -ENOSYS rather than -1. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: sync_bitopsStefano Stabellini2012-08-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sync_bitops functions are equivalent to the SMP implementation of the original functions, independently from CONFIG_SMP being defined. We need them because _set_bit etc are not SMP safe if !CONFIG_SMP. But under Xen you might be communicating with a completely external entity who might be on another CPU (e.g. two uniprocessor guests communicating via event channels and grant tables). So we need a variant of the bit ops which are SMP safe even on a UP kernel. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: page.h definitionsStefano Stabellini2012-08-08
| | | | | | | | | | | | | | | | | | | | | | | | | | ARM Xen guests always use paging in hardware, like PV on HVM guests in the X86 world. Changes in v3: - improve comments. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/arm: hypercallsStefano Stabellini2012-09-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use r12 to pass the hypercall number to the hypervisor. We need a register to pass the hypercall number because we might not know it at compile time and HVC only takes an immediate argument. Among the available registers r12 seems to be the best choice because it is defined as "intra-procedure call scratch register". Use the ISS to pass an hypervisor specific tag. Changes in v2: - define an HYPERCALL macro for 5 arguments hypercall wrappers, even if at the moment is unused; - use ldm instead of pop; - fix up comments. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * arm: initial Xen supportStefano Stabellini2012-09-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Basic hypervisor.h and interface.h definitions. - Skeleton enlighten.c, set xen_start_info to an empty struct. - Make xen_initial_domain dependent on the SIF_PRIVILIGED_BIT. The new code only compiles when CONFIG_XEN is set, that is going to be added to arch/arm/Kconfig in patch #11 "xen/arm: introduce CONFIG_XEN on ARM". Changes in v3: - improve comments. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | xen/pciback: Restore the PCI config space after an FLR.Konrad Rzeszutek Wilk2012-09-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we do an FLR, or D0->D3_hot we may lose the BARs as the device has turned itself off (and on). This means the device cannot function unless the pci_restore_state is called - which it is when the PCI device is unbound from the Xen PCI backend driver. For PV guests it ends up calling pci_enable_device / pci_enable_msi[x] which does the proper steps That however is not happening if a HVM guest is run as QEMU deals with PCI configuration space. QEMU also requires that the device be "parked" under the ownership of a pci-stub driver to guarantee that the PCI device is not being used. Hence we follow the same incantation as pci_reset_function does - by doing an FLR, then restoring the PCI configuration space. The result of this patch is that when you run lspci, you get now this: - Region 0: [virtual] Memory at fe8c0000 (32-bit, non-prefetchable) [size=128K] - Region 1: [virtual] Memory at fe800000 (32-bit, non-prefetchable) [size=512K] + Region 0: Memory at fe8c0000 (32-bit, non-prefetchable) [size=128K] + Region 1: Memory at fe800000 (32-bit, non-prefetchable) [size=512K] Region 2: I/O ports at c000 [size=32] - Region 3: [virtual] Memory at fe8e0000 (32-bit, non-prefetchable) [size=16K] + Region 3: Memory at fe8e0000 (32-bit, non-prefetchable) [size=16K] The [virtual] means that lspci read those entries from SysFS but when it read them from the device it got a different value (0xfffffff). CC: stable@vger.kernel.org #only for 3.5, 3.6 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | xen-pciback: properly clean up after calling pcistub_device_find()Jan Beulich2012-09-25
| | | | | | | | | | | | | | | | | | | | | | | | | | As the function calls pcistub_device_get() before returning non-NULL, its callers need to take care of calling pcistub_device_put() on (mostly, but not exclusively) error paths. Otoh, the function already guarantees that the 'dev' member is non-NULL upon successful return, so callers do not need to check for this a second time. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | xen/vga: add the xen EFI video mode supportJan Beulich2012-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to add xen EFI frambebuffer video support, it is required to add xen-efi's new video type (XEN_VGATYPE_EFI_LFB) case and handle it in the function xen_init_vga and set the video type to VIDEO_TYPE_EFI to enable efi video mode. The original patch from which this was broken out from: http://marc.info/?i=4E099AA6020000780004A4C6@nat28.tlf.novell.com Signed-off-by: Jan Beulich <JBeulich@novell.com> Signed-off-by: Tang Liang <liang.tang@oracle.com> [v2: The original author is Jan Beulich and Liang Tang ported it to upstream] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | xen/x86: retrieve keyboard shift status flags from hypervisor.Konrad Rzeszutek Wilk2012-09-24
| | | | | | | | | | | | | | | | | | The xen c/s 25873 allows the hypervisor to retrieve the NUMLOCK flag. With this patch, the Linux kernel can get the state according to the data in the BIOS. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | Merge branch 'stable/late-swiotlb.v3.3' into stable/for-linus-3.7Konrad Rzeszutek Wilk2012-09-22
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stable/late-swiotlb.v3.3: xen/swiotlb: Fix compile warnings when using plain integer instead of NULL pointer. xen/swiotlb: Remove functions not needed anymore. xen/pcifront: Use Xen-SWIOTLB when initting if required. xen/swiotlb: For early initialization, return zero on success. xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used. xen/swiotlb: Move the error strings to its own function. xen/swiotlb: Move the nr_tbl determination in its own function. swiotlb: add the late swiotlb initialization function with iotlb memory xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB. xen/swiotlb: Simplify the logic. Conflicts: arch/x86/xen/pci-swiotlb-xen.c Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/swiotlb: Fix compile warnings when using plain integer instead of NULL ↵Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pointer. arch/x86/xen/pci-swiotlb-xen.c:96:1: warning: Using plain integer as NULL pointer arch/x86/xen/pci-swiotlb-xen.c:96:1: warning: Using plain integer as NULL pointer Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/swiotlb: Remove functions not needed anymore.Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sparse warns us off: drivers/xen/swiotlb-xen.c:506:1: warning: symbol 'xen_swiotlb_map_sg' was not declared. Should it be static? drivers/xen/swiotlb-xen.c:534:1: warning: symbol 'xen_swiotlb_unmap_sg' was not declared. Should it be static? and it looks like we do not need this function at all. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/pcifront: Use Xen-SWIOTLB when initting if required.Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We piggyback on "xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used." functionality to start up the Xen-SWIOTLB if we are hot-plugged. This allows us to bypass the need to supply 'iommu=soft' on the Linux command line (mostly). With this patch, if a user forgot 'iommu=soft' on the command line, and hotplug a PCI device they will get: pcifront pci-0: Installing PCI frontend Warning: only able to allocate 4 MB for software IO TLB software IO TLB [mem 0x2a000000-0x2a3fffff] (4MB) mapped at [ffff88002a000000-ffff88002a3fffff] pcifront pci-0: Creating PCI Frontend Bus 0000:00 pcifront pci-0: PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [io 0x0000-0xffff] pci_bus 0000:00: root bus resource [mem 0x00000000-0xfffffffff] pci 0000:00:00.0: [8086:10d3] type 00 class 0x020000 pci 0000:00:00.0: reg 10: [mem 0xfe5c0000-0xfe5dffff] pci 0000:00:00.0: reg 14: [mem 0xfe500000-0xfe57ffff] pci 0000:00:00.0: reg 18: [io 0xe000-0xe01f] pci 0000:00:00.0: reg 1c: [mem 0xfe5e0000-0xfe5e3fff] pcifront pci-0: claiming resource 0000:00:00.0/0 pcifront pci-0: claiming resource 0000:00:00.0/1 pcifront pci-0: claiming resource 0000:00:00.0/2 pcifront pci-0: claiming resource 0000:00:00.0/3 e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k e1000e: Copyright(c) 1999 - 2012 Intel Corporation. e1000e 0000:00:00.0: Disabling ASPM L0s L1 e1000e 0000:00:00.0: enabling device (0000 -> 0002) e1000e 0000:00:00.0: Xen PCI mapped GSI16 to IRQ34 e1000e 0000:00:00.0: (unregistered net_device): Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode e1000e 0000:00:00.0: eth0: (PCI Express:2.5GT/s:Width x1) 00:1b:21:ab:c6:13 e1000e 0000:00:00.0: eth0: Intel(R) PRO/1000 Network Connection e1000e 0000:00:00.0: eth0: MAC: 3, PHY: 8, PBA No: E46981-005 The "Warning only" will go away if one supplies 'iommu=soft' instead as we have a higher chance of being able to allocate large swaths of memory. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/swiotlb: For early initialization, return zero on success.Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If everything is setup properly we would return -ENOMEM since rc by default is set to that value. Lets not do that and return a proper return code. Note: The reason the early code needs this special treatment is that it SWIOTLB library call does not return anything (and had it failed it would call panic()) - but our function does. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late ↵Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when PV PCI is used. With this patch we provide the functionality to initialize the Xen-SWIOTLB late in the bootup cycle - specifically for Xen PCI-frontend. We still will work if the user had supplied 'iommu=soft' on the Linux command line. Note: We cannot depend on after_bootmem to automatically determine whether this is early or not. This is because when PCI IOMMUs are initialized it is after after_bootmem but before a lot of "other" subsystems are initialized. CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [v1: Fix smatch warnings] [v2: Added check for xen_swiotlb] [v3: Rebased with new xen-swiotlb changes] [v4: squashed xen/swiotlb: Depending on after_bootmem is not correct in] Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/swiotlb: Move the error strings to its own function.Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | That way we can more easily reuse those errors when using the late SWIOTLB init. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/swiotlb: Move the nr_tbl determination in its own function.Konrad Rzeszutek Wilk2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | Moving the function out of the way to prepare for the late SWIOTLB init. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | swiotlb: add the late swiotlb initialization function with iotlb memoryKonrad Rzeszutek Wilk2012-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables the caller to initialize swiotlb with its own iotlb memory late in the bootup. See git commit eb605a5754d050a25a9f00d718fb173f24c486ef "swiotlb: add swiotlb_tbl_map_single library function" which will explain the full details of what it can be used for. CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [v1: Fold in smatch warning] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.Konrad Rzeszutek Wilk2012-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a PV guest is booted the native SWIOTLB should not be turned on. It does not help us (we don't have any PCI devices) and it eats 64MB of good memory. In the case of PV guests with PCI devices we need the Xen-SWIOTLB one. [v1: Rewrite comment per Stefano's suggestion] Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/swiotlb: Simplify the logic.Konrad Rzeszutek Wilk2012-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Its pretty easy: 1). We only check to see if we need Xen SWIOTLB for PV guests. 2). If swiotlb=force or iommu=soft is set, then Xen SWIOTLB will be enabled. 3). If it is an initial domain, then Xen SWIOTLB will be enabled. 4). Native SWIOTLB must be disabled for PV guests. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | xen/gndev: Xen backend support for paged out grant targets V4.Andres Lagar-Cavilla2012-09-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since Xen-4.2, hvm domains may have portions of their memory paged out. When a foreign domain (such as dom0) attempts to map these frames, the map will initially fail. The hypervisor returns a suitable errno, and kicks an asynchronous page-in operation carried out by a helper. The foreign domain is expected to retry the mapping operation until it eventually succeeds. The foreign domain is not put to sleep because itself could be the one running the pager assist (typical scenario for dom0). This patch adds support for this mechanism for backend drivers using grant mapping and copying operations. Specifically, this covers the blkback and gntdev drivers (which map foreign grants), and the netback driver (which copies foreign grants). * Add a retry method for grants that fail with GNTST_eagain (i.e. because the target foreign frame is paged out). * Insert hooks with appropriate wrappers in the aforementioned drivers. The retry loop is only invoked if the grant operation status is GNTST_eagain. It guarantees to leave a new status code different from GNTST_eagain. Any other status code results in identical code execution as before. The retry loop performs 256 attempts with increasing time intervals through a 32 second period. It uses msleep to yield while waiting for the next retry. V2 after feedback from David Vrabel: * Explicit MAX_DELAY instead of wrap-around delay into zero * Abstract GNTST_eagain check into core grant table code for netback module. V3 after feedback from Ian Campbell: * Add placeholder in array of grant table error descriptions for unrelated error code we jump over. * Eliminate single map and retry macro in favor of a generic batch flavor. * Some renaming. * Bury most implementation in grant_table.c, cleaner interface. V4 rebased on top of sync of Xen grant table interface headers. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> [v5: Fixed whitespace issues] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | | xen-pciback: support wild cards in slot specificationsJan Beulich2012-09-18
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | Particularly for hiding sets of SR-IOV devices, specifying them all individually is rather cumbersome. Therefore, allow function and slot numbers to be replaced by a wildcard character ('*'). Unfortunately this gets complicated by the in-kernel sscanf() implementation not being really standard conformant - matching of plain text tails cannot be checked by the caller (a patch to overcome this will be sent shortly, and a follow-up patch for simplifying the code is planned to be sent when that fixed went upstream). Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | xen/arm: compile and run xenbusStefano Stabellini2012-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bind_evtchn_to_irqhandler can legitimately return 0 (irq 0): it is not an error. If Linux is running as an HVM domain and is running as Dom0, use xenstored_local_init to initialize the xenstore page and event channel. Changes in v4: - do not xs_reset_watches on dom0. Changes in v2: - refactor xenbus_init. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> [v5: Fixed case switch indentations] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | xen: resynchronise grant table status codes with upstreamIan Campbell2012-09-14
| | | | | | | | | | | | | | Adds GNTST_address_too_big and GNTST_eagain. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* | Merge branch 'stable/128gb.v5.1' into stable/for-linus-3.7Konrad Rzeszutek Wilk2012-09-12
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stable/128gb.v5.1: xen/mmu: If the revector fails, don't attempt to revector anything else. xen/p2m: When revectoring deal with holes in the P2M array. xen/mmu: Release just the MFN list, not MFN list and part of pagetables. xen/mmu: Remove from __ka space PMD entries for pagetables. xen/mmu: Copy and revector the P2M tree. xen/p2m: Add logic to revector a P2M tree to use __va leafs. xen/mmu: Recycle the Xen provided L4, L3, and L2 pages xen/mmu: For 64-bit do not call xen_map_identity_early xen/mmu: use copy_page instead of memcpy. xen/mmu: Provide comments describing the _ka and _va aliasing issue xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything. Revert "xen/x86: Workaround 64-bit hypervisor and 32-bit initial domain." and "xen/x86: Use memblock_reserve for sensitive areas." xen/x86: Workaround 64-bit hypervisor and 32-bit initial domain. xen/x86: Use memblock_reserve for sensitive areas. xen/p2m: Fix the comment describing the P2M tree. Conflicts: arch/x86/xen/mmu.c The pagetable_init is the old xen_pagetable_setup_done and xen_pagetable_setup_start rolled in one. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: If the revector fails, don't attempt to revector anything else.Konrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the P2M revectoring would fail, we would try to continue on by cleaning the PMD for L1 (PTE) page-tables. The xen_cleanhighmap is greedy and erases the PMD on both boundaries. Since the P2M array can share the PMD, we would wipe out part of the __ka that is still used in the P2M tree to point to P2M leafs. This fixes it by bypassing the revectoring and continuing on. If the revector fails, a nice WARN is printed so we can still troubleshoot this. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/p2m: When revectoring deal with holes in the P2M array.Konrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we free the PFNs and then subsequently populate them back during bootup: Freeing 20000-20200 pfn range: 512 pages freed 1-1 mapping on 20000->20200 Freeing 40000-40200 pfn range: 512 pages freed 1-1 mapping on 40000->40200 Freeing bad80-badf4 pfn range: 116 pages freed 1-1 mapping on bad80->badf4 Freeing badf6-bae7f pfn range: 137 pages freed 1-1 mapping on badf6->bae7f Freeing bb000-100000 pfn range: 282624 pages freed 1-1 mapping on bb000->100000 Released 283999 pages of unused memory Set 283999 page(s) to 1-1 mapping Populating 1acb8a-1f20e9 pfn range: 283999 pages added We end up having the P2M array (that is the one that was grafted on the P2M tree) filled with IDENTITY_FRAME or INVALID_P2M_ENTRY) entries. The patch titled "xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID." recycles said slots and replaces the P2M tree leaf's with &mfn_list[xx] with p2m_identity or p2m_missing. And re-uses the P2M array sections for other P2M tree leaf's. For the above mentioned bootup excerpt, the PFNs at 0x20000->0x20200 are going to be IDENTITY based: P2M[0][256][0] -> P2M[0][257][0] get turned in IDENTITY_FRAME. We can re-use that and replace P2M[0][256] to point to p2m_identity. The "old" page (the grafted P2M array provided by Xen) that was at P2M[0][256] gets put somewhere else. Specifically at P2M[6][358], b/c when we populate back: Populating 1acb8a-1f20e9 pfn range: 283999 pages added we fill P2M[6][358][0] (and P2M[6][358], P2M[6][359], ...) with the new MFNs. That is all OK, except when we revector we assume that the PFN count would be the same in the grafted P2M array and in the newly allocated. Since that is no longer the case, as we have holes in the P2M that point to p2m_missing or p2m_identity we have to take that into account. [v2: Check for overflow] [v3: Move within the __va check] [v4: Fix the computation] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: Release just the MFN list, not MFN list and part of pagetables.Konrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We call memblock_reserve for [start of mfn list] -> [PMD aligned end of mfn list] instead of <start of mfn list> -> <page aligned end of mfn list]. This has the disastrous effect that if at bootup the end of mfn_list is not PMD aligned we end up returning to memblock parts of the region past the mfn_list array. And those parts are the PTE tables with the disastrous effect of seeing this at bootup: Write protecting the kernel read-only data: 10240k Freeing unused kernel memory: 1860k freed Freeing unused kernel memory: 200k freed (XEN) mm.c:2429:d0 Bad type (saw 1400000000000002 != exp 7000000000000000) for mfn 116a80 (pfn 14e26) ... (XEN) mm.c:908:d0 Error getting mfn 116a83 (pfn 14e2a) from L1 entry 8000000116a83067 for l1e_owner=0, pg_owner=0 (XEN) mm.c:908:d0 Error getting mfn 4040 (pfn 5555555555555555) from L1 entry 0000000004040601 for l1e_owner=0, pg_owner=0 .. and so on. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: Remove from __ka space PMD entries for pagetables.Konrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Please first read the description in "xen/mmu: Copy and revector the P2M tree." At this stage, the __ka address space (which is what the old P2M tree was using) is partially disassembled. The cleanup_highmap has removed the PMD entries from 0-16MB and anything past _brk_end up to the max_pfn_mapped (which is the end of the ramdisk). The xen_remove_p2m_tree and code around has ripped out the __ka for the old P2M array. Here we continue on doing it to where the Xen page-tables were. It is safe to do it, as the page-tables are addressed using __va. For good measure we delete anything that is within MODULES_VADDR and up to the end of the PMD. At this point the __ka only contains PMD entries for the start of the kernel up to __brk. [v1: Per Stefano's suggestion wrapped the MODULES_VADDR in debug] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: Copy and revector the P2M tree.Konrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Please first read the description in "xen/p2m: Add logic to revector a P2M tree to use __va leafs" patch. The 'xen_revector_p2m_tree()' function allocates a new P2M tree copies the contents of the old one in it, and returns the new one. At this stage, the __ka address space (which is what the old P2M tree was using) is partially disassembled. The cleanup_highmap has removed the PMD entries from 0-16MB and anything past _brk_end up to the max_pfn_mapped (which is the end of the ramdisk). We have revectored the P2M tree (and the one for save/restore as well) to use new shiny __va address to new MFNs. The xen_start_info has been taken care of already in 'xen_setup_kernel_pagetable()' and xen_start_info->shared_info in 'xen_setup_shared_info()', so we are free to roam and delete PMD entries - which is exactly what we are going to do. We rip out the __ka for the old P2M array. [v1: Fix smatch warnings] [v2: memset was doing 0 instead of 0xff] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/p2m: Add logic to revector a P2M tree to use __va leafs.Konrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During bootup Xen supplies us with a P2M array. It sticks it right after the ramdisk, as can be seen with a 128GB PV guest: (certain parts removed for clarity): xc_dom_build_image: called xc_dom_alloc_segment: kernel : 0xffffffff81000000 -> 0xffffffff81e43000 (pfn 0x1000 + 0xe43 pages) xc_dom_pfn_to_ptr: domU mapping: pfn 0x1000+0xe43 at 0x7f097d8bf000 xc_dom_alloc_segment: ramdisk : 0xffffffff81e43000 -> 0xffffffff925c7000 (pfn 0x1e43 + 0x10784 pages) xc_dom_pfn_to_ptr: domU mapping: pfn 0x1e43+0x10784 at 0x7f0952dd2000 xc_dom_alloc_segment: phys2mach : 0xffffffff925c7000 -> 0xffffffffa25c7000 (pfn 0x125c7 + 0x10000 pages) xc_dom_pfn_to_ptr: domU mapping: pfn 0x125c7+0x10000 at 0x7f0942dd2000 xc_dom_alloc_page : start info : 0xffffffffa25c7000 (pfn 0x225c7) xc_dom_alloc_page : xenstore : 0xffffffffa25c8000 (pfn 0x225c8) xc_dom_alloc_page : console : 0xffffffffa25c9000 (pfn 0x225c9) nr_page_tables: 0x0000ffffffffffff/48: 0xffff000000000000 -> 0xffffffffffffffff, 1 table(s) nr_page_tables: 0x0000007fffffffff/39: 0xffffff8000000000 -> 0xffffffffffffffff, 1 table(s) nr_page_tables: 0x000000003fffffff/30: 0xffffffff80000000 -> 0xffffffffbfffffff, 1 table(s) nr_page_tables: 0x00000000001fffff/21: 0xffffffff80000000 -> 0xffffffffa27fffff, 276 table(s) xc_dom_alloc_segment: page tables : 0xffffffffa25ca000 -> 0xffffffffa26e1000 (pfn 0x225ca + 0x117 pages) xc_dom_pfn_to_ptr: domU mapping: pfn 0x225ca+0x117 at 0x7f097d7a8000 xc_dom_alloc_page : boot stack : 0xffffffffa26e1000 (pfn 0x226e1) xc_dom_build_image : virt_alloc_end : 0xffffffffa26e2000 xc_dom_build_image : virt_pgtab_end : 0xffffffffa2800000 So the physical memory and virtual (using __START_KERNEL_map addresses) layout looks as so: phys __ka /------------\ /-------------------\ | 0 | empty | 0xffffffff80000000| | .. | | .. | | 16MB | <= kernel starts | 0xffffffff81000000| | .. | | | | 30MB | <= kernel ends => | 0xffffffff81e43000| | .. | & ramdisk starts | .. | | 293MB | <= ramdisk ends=> | 0xffffffff925c7000| | .. | & P2M starts | .. | | .. | | .. | | 549MB | <= P2M ends => | 0xffffffffa25c7000| | .. | start_info | 0xffffffffa25c7000| | .. | xenstore | 0xffffffffa25c8000| | .. | cosole | 0xffffffffa25c9000| | 549MB | <= page tables => | 0xffffffffa25ca000| | .. | | | | 550MB | <= PGT end => | 0xffffffffa26e1000| | .. | boot stack | | \------------/ \-------------------/ As can be seen, the ramdisk, P2M and pagetables are taking a bit of __ka addresses space. Which is a problem since the MODULES_VADDR starts at 0xffffffffa0000000 - and P2M sits right in there! This results during bootup with the inability to load modules, with this error: ------------[ cut here ]------------ WARNING: at /home/konrad/ssd/linux/mm/vmalloc.c:106 vmap_page_range_noflush+0x2d9/0x370() Call Trace: [<ffffffff810719fa>] warn_slowpath_common+0x7a/0xb0 [<ffffffff81030279>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff81071a45>] warn_slowpath_null+0x15/0x20 [<ffffffff81130b89>] vmap_page_range_noflush+0x2d9/0x370 [<ffffffff81130c4d>] map_vm_area+0x2d/0x50 [<ffffffff811326d0>] __vmalloc_node_range+0x160/0x250 [<ffffffff810c5369>] ? module_alloc_update_bounds+0x19/0x80 [<ffffffff810c6186>] ? load_module+0x66/0x19c0 [<ffffffff8105cadc>] module_alloc+0x5c/0x60 [<ffffffff810c5369>] ? module_alloc_update_bounds+0x19/0x80 [<ffffffff810c5369>] module_alloc_update_bounds+0x19/0x80 [<ffffffff810c70c3>] load_module+0xfa3/0x19c0 [<ffffffff812491f6>] ? security_file_permission+0x86/0x90 [<ffffffff810c7b3a>] sys_init_module+0x5a/0x220 [<ffffffff815ce339>] system_call_fastpath+0x16/0x1b ---[ end trace fd8f7704fdea0291 ]--- vmalloc: allocation failure, allocated 16384 of 20480 bytes modprobe: page allocation failure: order:0, mode:0xd2 Since the __va and __ka are 1:1 up to MODULES_VADDR and cleanup_highmap rids __ka of the ramdisk mapping, what we want to do is similar - get rid of the P2M in the __ka address space. There are two ways of fixing this: 1) All P2M lookups instead of using the __ka address would use the __va address. This means we can safely erase from __ka space the PMD pointers that point to the PFNs for P2M array and be OK. 2). Allocate a new array, copy the existing P2M into it, revector the P2M tree to use that, and return the old P2M to the memory allocate. This has the advantage that it sets the stage for using XEN_ELF_NOTE_INIT_P2M feature. That feature allows us to set the exact virtual address space we want for the P2M - and allows us to boot as initial domain on large machines. So we pick option 2). This patch only lays the groundwork in the P2M code. The patch that modifies the MMU is called "xen/mmu: Copy and revector the P2M tree." Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: Recycle the Xen provided L4, L3, and L2 pagesKonrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we are not using them. We end up only using the L1 pagetables and grafting those to our page-tables. [v1: Per Stefano's suggestion squashed two commits] [v2: Per Stefano's suggestion simplified loop] [v3: Fix smatch warnings] [v4: Add more comments] Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: For 64-bit do not call xen_map_identity_earlyKonrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | B/c we do not need it. During the startup the Xen provides us with all the initial memory mapped that we need to function. The initial memory mapped is up to the bootstack, which means we can reference using __ka up to 4.f): (from xen/interface/xen.h): 4. This the order of bootstrap elements in the initial virtual region: a. relocated kernel image b. initial ram disk [mod_start, mod_len] c. list of allocated page frames [mfn_list, nr_pages] d. start_info_t structure [register ESI (x86)] e. bootstrap page tables [pt_base, CR3 (x86)] f. bootstrap stack [register ESP (x86)] (initial ram disk may be ommitted). [v1: More comments in git commit] Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: use copy_page instead of memcpy.Konrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | After all, this is what it is there for. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | xen/mmu: Provide comments describing the _ka and _va aliasing issueKonrad Rzeszutek Wilk2012-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | Which is that the level2_kernel_pgt (__ka virtual addresses) and level2_ident_pgt (__va virtual address) contain the same PMD entries. So if you modify a PTE in __ka, it will be reflected in __va (and vice-versa). Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>