aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/xen/setup.c
diff options
context:
space:
mode:
authorZhang, Fengzhe <fengzhe.zhang@intel.com>2011-02-16 09:26:20 -0500
committerKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>2011-02-22 12:48:50 -0500
commit2f14ddc3a7146ea4cd5a3d1ecd993f85f2e4f948 (patch)
tree96c2d5bb6c34551b68938a82860ca2e6bf2b5569 /arch/x86/xen/setup.c
parent85e2efbb1db9a18d218006706d6e4fbeb0216213 (diff)
xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps.
With the hypervisor argument of dom0_mem=X we iterate over the physical (only for the initial domain) E820 and subtract the the size from each E820_RAM region the delta so that the cumulative size of all E820_RAM regions is equal to 'X'. This sometimes ends up with E820_RAM regions with zero size (which are removed by e820_sanitize) and E820_RAM that are smaller than physically. Later on the PCI API looks at the E820 and attempts to set up an resource region for the "PCI mem". The E820 (assume dom0_mem=1GB is set) compared to the physical looks as so: [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] Xen: 0000000000000000 - 0000000000097c00 (usable) [ 0.000000] Xen: 0000000000097c00 - 0000000000100000 (reserved) -[ 0.000000] Xen: 0000000000100000 - 00000000defafe00 (usable) +[ 0.000000] Xen: 0000000000100000 - 0000000040000000 (usable) [ 0.000000] Xen: 00000000defafe00 - 00000000defb1ea0 (ACPI NVS) [ 0.000000] Xen: 00000000defb1ea0 - 00000000e0000000 (reserved) [ 0.000000] Xen: 00000000f4000000 - 00000000f8000000 (reserved) .. And we get [ 0.000000] Allocating PCI resources starting at 40000000 (gap: 40000000:9efafe00) while it should have started at e0000000 (a nice big gap up to f4000000 exists). The "Allocating PCI" is part of the resource API. The users that end up using those PCI I/O regions usually supply their own BARs when calling the resource API (request_resource, or allocate_resource), but there are exceptions which provide an empty 'struct resource' and expect the API to provide the 'struct resource' to be populated with valid values. The one that triggered this bug was the intel AGP driver that requested a region for the flush page (intel_i9xx_setup_flush). Before this patch, when running under Xen hypervisor, the 'struct resource' returned could have (depending on the dom0_mem size) physical ranges of a 'System RAM' instead of 'I/O' regions. This ended up with the Hypervisor failing a request to populate PTE's with those PFNs as the domain did not have access to those 'System RAM' regions (rightly so). After this patch, the left-over E820_RAM region from the truncation, will be labeled as E820_UNUSABLE. The E820 will look as so: [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] Xen: 0000000000000000 - 0000000000097c00 (usable) [ 0.000000] Xen: 0000000000097c00 - 0000000000100000 (reserved) -[ 0.000000] Xen: 0000000000100000 - 00000000defafe00 (usable) +[ 0.000000] Xen: 0000000000100000 - 0000000040000000 (usable) +[ 0.000000] Xen: 0000000040000000 - 00000000defafe00 (unusable) [ 0.000000] Xen: 00000000defafe00 - 00000000defb1ea0 (ACPI NVS) [ 0.000000] Xen: 00000000defb1ea0 - 00000000e0000000 (reserved) [ 0.000000] Xen: 00000000f4000000 - 00000000f8000000 (reserved) For more information: http://mid.gmane.org/1A42CE6F5F474C41B63392A5F80372B2335E978C@shsmsx501.ccr.corp.intel.com BugLink: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1726 Signed-off-by: Fengzhe Zhang <fengzhe.zhang@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Diffstat (limited to 'arch/x86/xen/setup.c')
-rw-r--r--arch/x86/xen/setup.c8
1 files changed, 8 insertions, 0 deletions
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index a8a66a50d446..2a4add9634ce 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -194,6 +194,14 @@ char * __init xen_memory_setup(void)
194 end -= delta; 194 end -= delta;
195 195
196 extra_pages += PFN_DOWN(delta); 196 extra_pages += PFN_DOWN(delta);
197 /*
198 * Set RAM below 4GB that is not for us to be unusable.
199 * This prevents "System RAM" address space from being
200 * used as potential resource for I/O address (happens
201 * when 'allocate_resource' is called).
202 */
203 if (delta && end < 0x100000000UL)
204 e820_add_region(end, delta, E820_UNUSABLE);
197 } 205 }
198 206
199 if (map[i].size > 0 && end > xen_extra_mem_start) 207 if (map[i].size > 0 && end > xen_extra_mem_start)