aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* drm/i915: split irq handling into per-chipset functionsJesse Barnes2011-05-13
| | | | | | | | | | Set the IRQ handling functions in driver load so they'll just be used directly, rather than branching over most of the code in the chipset functions. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: make FDI training a display functionJesse Barnes2011-05-13
| | | | | | | | | Rather than branching in ironlake_pch_enable, add a new train_fdi function to the display function pointer struct and use it instead. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: forcewake debugfs fixBen Widawsky2011-05-13
| | | | | | | | | Forcewake needs to register itself with drm to use the remove function. The file also should be read only. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>
* MAINTAINERS: Switch maintainer for drm/i915 to Keith PackardKeith Packard2011-05-11
| | | | Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: debugfs interface for forcewake reference countBen Widawsky2011-05-10
| | | | | | | | | forcewake is controlled by the open and close of the debugfs file. This assures that buggy applications cannot cause the GT to stay on forever. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: move gen6 rps handling to workqueueBen Widawsky2011-05-10
| | | | | | | | | | | | The render P-state handling code requires reading from a GT register. This means that FORCEWAKE must be written to, a resource which is shared and should be protected by struct_mutex. Hence we can not manipulate that register from within the interrupt handling and so must delegate the task to a workqueue. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: forcewake struct mutex locking fixesBen Widawsky2011-05-10
| | | | | | | | | Found by the new strict checking for the mutex being held whilst manipulating the forcewake status. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: reference counted forcewakeBen Widawsky2011-05-10
| | | | | | | | | | | | Provide a reference count to track the forcewake state of the GPU and give a safe mechanism for userspace to wake the GT. This also potentially saves a UC read if the GT is known to be awake already. The reference count is atomic, but the register access and hardware wake sequence is protected by struct_mutex. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: proper use of forcewakeBen Widawsky2011-05-10
| | | | | | | | | | | | | Moved the macros around to properly do reads and writes for the given GPU. This is to address special requirements for gen6 (SNB) reads and writes. Registers in the range 0-0x40000 on gen6 platforms require special handling. Instead of relying on the callers to pick the registers correctly, move the logic into the read and write functions. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Disable all outputs early, before KMS takeoverChris Wilson2011-05-10
| | | | | | | | | | | | | | | | | | | | | | If the outputs are active and continuing to access the GATT when we teardown the PTEs, then there is a potential for us to hang the GPU. The hang tends to be a PGTBL_ER with either an invalid host access or an invalid display plane fetch. v2: Reorder IRQ initialisation to defer until after GEM is setup. Reported-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch> (855GM) Tested-by: Pekka Enberg <penberg@kernel.org> # note that this doesn't fix the underlying problem of the PGTBL_ER and pipe underruns being reported immediately upon init on his 965GM MacBook Reported-and-tested-by: Rick Bramley <richard.bramley@hp.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35635 Reported-and-tested-by: Zdenek Kabelac <zdenek.kabelac@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36048 Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
* drm/i915: Do not clflush snooped objectsChris Wilson2011-05-10
| | | | | | | | | | | Rely on the GPU snooping into the CPU cache for appropriately bound objects on MI_FLUSH. Or perhaps one day we will have a cache-coherent CPU/GPU package... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Rename agp_type to cache_levelChris Wilson2011-05-10
| | | | | | | | | | | ... to clarify just how we use it inside the driver and remove the confusion of the poorly matching agp_type names. We still need to translate through agp_type for interface into the fake AGP driver. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: debugfs for context informationBen Widawsky2011-05-10
| | | | | | | | | Currently this is only useful for the rc6 stuff. But this would also be useful when I finally get around to the logical context + ppgtt stuff. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: use i915_enable_rc6 on SNB tooJesse Barnes2011-05-10
| | | | | | | | For debug & testing. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: fix rc6 initialization on IronlakeBen Widawsky2011-05-10
| | | | | | | | | | | | | There is a race condition between setting PWRCTXA and executing MI_SET_CONTEXT. PWRCTXA must not be set until a valid context has been written (or else the GPU could possible go into rc6, and return to an invalid context). Reported-and-Tested-by: Gu Rui <chaos.proton@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=28582 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/1915: ringbuffer wait for idle functionBen Widawsky2011-05-10
| | | | | | | | | | | Added a new function which waits for the ringbuffer space to be equal to (total - 8). This is the empty condition of the ringbuffer, and equivalent to head==tail. Also modified two users of this functionality elsewhere in the code. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: fix ilk rc6 teardown lockingBen Widawsky2011-05-10
| | | | | | | | | | | | | In the failure cases during rc6 initialization, both the power context and render context may get !refcount without holding struct_mutex. However, on rc6 disabling, the lock is held by the caller. Rearranged the locking so that it's safe in both cases. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Fold the DPLL limit defines into the structs that use them.Eric Anholt2011-05-10
| | | | | | | | | | | | | They're used in one place, and not providing any descriptive value, with their names just being approximately the conjunction of the struct name and the struct field. This diff was produced with gcc -E, copying the new struct definitions out, moving a couple of the old comments into place in the new structs, and reindenting. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Clean up leftover DPLL and LVDS register choice from pch split.Eric Anholt2011-05-10
| | | | | | | | | We used to have these from the product of (pch, non-pch) * (pipe a, pipe b). Now we can just use the nice per-pipe reg macros in the split out crtc_mode_sets. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Drop remaining pre-Ironlake code from ironlake_crtc_mode_set().Eric Anholt2011-05-10
| | | | | Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Drop non-HAS_PCH_SPLIT() code from ironlake_crtc_mode_set().Eric Anholt2011-05-10
| | | | | | | Ironlake is where the PCH split started. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Drop the remaining bit of Ironlake code from i9xx_crtc_mode_set().Eric Anholt2011-05-10
| | | | | | Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Drop the eDP paths from the pre-Ironlake crtc_mode_set.Eric Anholt2011-05-10
| | | | | | | | While g4x had DP, eDP came with Ironlake, so we don't need that code here. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Remove the PCH paths from the pre-Ironlake crtc_mode_set().Eric Anholt2011-05-10
| | | | | | Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Move the vblank pre/post modeset to the common crtc_mode_set.Eric Anholt2011-05-10
| | | | | | Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Split the crtc_mode_set function along HAS_PCH_SPLIT() lines.Eric Anholt2011-05-10
| | | | | | | | | | | | | | | | This path, which shouldn't be *that* complicated, is now so littered with per-chipset tweaks that it's hard to trace the order of what happens. HAS_PCH_SPLIT() is the most radical change across chipsets, so it seems like a natural split to simplify the code. This first commit just copies the existing code without changing anything. v2: updated to track removal of call to intel_enable_plane from i9xx_crtc_mode_set Signed-off-by: Eric Anholt <eric@anholt.net> Hella-acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* drm/i915: Attach a fb to the load-detect pipeChris Wilson2011-05-10
| | | | | | | | | | | | | | | | | | | | We need to ensure that we feed valid memory into the display plane attached to the pipe when switching the pipe on. Otherwise, the display engine may read through an invalid PTE and so throw an PGTBL_ER exception. As we need to perform load detection before even the first object is allocated for the fbdev, there is no pre-existing object large enough for us to borrow to use as the framebuffer. So we need to create one and cleanup afterwards. At other times, the current fbcon may be large enough for us to borrow it for duration of load detection. Found by assert_fb_bound_for_plane(). Reported-by: Knut Petersen <Knut_Petersen@t-online.de> References: https://bugs.freedesktop.org/show_bug.cgi?id=36246 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Remove dead code from intel_release_load_detect_pipe()Chris Wilson2011-05-10
| | | | | | | | | As we now never attempt to steal a crtc for load detection, we either set a mode on a new pipe, or change the dpms mode on an existing pipe. Never both, so we can simplify the code slightly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Remove dead code from intel_get_load_detect_pipe()Chris Wilson2011-05-10
| | | | | | | | | As we only allow the use of a disabled CRTC, we don't need to handle the case where we are reusing an already enabled pipe. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Pass the saved adjusted_mode when adding to the load-detect crtcChris Wilson2011-05-10
| | | | | | Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Remove unused supported_crtc from intel_load_detect_pipeChris Wilson2011-05-10
| | | | | | | | | ... and the no longer relevant comment. The code ceased stealing a pipe for load detection a long time ago. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Don't store temporary load-detect variables in the generic encoderChris Wilson2011-05-10
| | | | | | | | | | | Keep all the state required for undoing and restoring the previous pipe configuration together in a single struct passed from intel_get_load_detect_pipe() to intel_release_load_detect_pipe() rather than stuffing them inside the common encoder structure. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Propagate failure to set mode for load-detect pipeChris Wilson2011-05-10
| | | | | | | | | | | Check the return value from drm_crtc_set_mode(), report the failure via a debug message and propagate the error back to the caller. This prevents us from blissfully continuing to do the load detection on a disabled pipe. Fortunately actual failure for modesetting is very rare, and reported failures even rarer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Simplify return value from intel_get_load_detect_pipeChris Wilson2011-05-10
| | | | | | | | | | ... and so remove the confusion as to whether to use the returned crtc or intel_encoder->base.crtc with the subsequent load-detection. Even though they were the same, the two instances of load-detection code disagreed over which was the more correct. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Move the irq wait queue initialisation into the ring initChris Wilson2011-05-10
| | | | | | | | | | | | | | Required so that we don't obliterate the queue if initialising the rings after the global IRQ handler is installed. [Jesse, you recently looked at refactoring the IRQ installation routines, does moving the initialisation of ring buffer data structures away from that routine make sense in your grand scheme?] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915/lvds: Only act on lid notify when the device is onAlex Williamson2011-05-09
| | | | | | | | | | | | | | If we're using vga switcheroo, the device may be turned off and poking it can return random state. This provokes an OOPS fixed separately by 8ff887c847 (drm/i915/dp: Be paranoid in case we disable a DP before it is attached). Trying to use and respond to events on a device that has been turned off by the user is in principle a silly thing to do. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: fix intel_crtc_clock_get pipe reads after "cleanup cleanup"Chris Wilson2011-05-09
| | | | | | | | | | | | | | | Despite the fixes in 548f245ba6a31 (drm/i915: fix per-pipe reads after "cleanup"), we missed one neighbouring read that was mistakenly replaced with the reg value in 9db4a9c (drm/i915: cleanup per-pipe reg usage). This was preventing us from correctly determining the mode the BIOS left the panel in for machines that neither have an OpRegion nor access to the VBT, (e.g. the EeePC 700). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: stable@kernel.org Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Only enable the plane after setting the fb base (pre-ILK)Chris Wilson2011-05-09
| | | | | | | | | | | | | | | When enabling the plane, it is helpful to have already pointed that plane to valid memory or else we may incur the wrath of a PGTBL_ER. This code preserved the behaviour from the bad old days for unknown reasons... Found by assert_fb_bound_for_plane(). References: https://bugs.freedesktop.org/show_bug.cgi?id=36246 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915/dp: Be paranoid in case we disable a DP before it is attachedChris Wilson2011-05-04
| | | | | | | | | | | | | | | | Given that the hardware may be left in a random condition by the BIOS, it is conceivable that we then attempt to clear the DP_PIPEB_SELECT bit without us ever enabling/attaching the DP encoder to a pipe. Thus causing a NULL deference when we attempt to wait for a vblank on that crtc. Reported-and-tested-by: Bryan Christ <bryan.christ@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36314 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36456 Reported-and-tested-by: Bo Wang <bo.b.wang@intel.com> Cc: stable@kernel.org Signed-off-by: Keith Packard <keithp@keithp.com>
* drm/i915: Release object along create user fb error pathChris Wilson2011-05-04
| | | | | | | Reported-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Keith Packard <keithp@keithp.com>
* Linux 2.6.39-rc6Linus Torvalds2011-05-03
|
* Merge branch 'drm-fixes' of ↵Linus Torvalds2011-05-03
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/kms: fix gart setup on fusion parts (v2) drm: Send pending vblank events before disabling vblank. drm/radeon: fix regression on atom cards with hardcoded EDID record. drm/radeon/kms: add some new pci ids
| * drm/radeon/kms: fix gart setup on fusion parts (v2)Alex Deucher2011-05-03
| | | | | | | | | | | | | | | | | | | | | | | | Out of the entire GART/VM subsystem, the hw designers changed the location of 3 regs. v2: airlied: add parameter for userspace to work from. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Jerome Glisse <jglisse@redhat.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
| * drm: Send pending vblank events before disabling vblank.Christopher James Halse Rogers2011-05-03
| | | | | | | | | | | | | | | | | | | | | | | | This is the least-bad behaviour. It means that we signal the vblank event before it actually happens, but since we're disabling vblanks there's no guarantee that it will *ever* happen otherwise. This prevents GL applications which use WaitMSC from hanging indefinitely. Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| * drm/radeon: fix regression on atom cards with hardcoded EDID record.Dave Airlie2011-05-03
| | | | | | | | | | | | | | | | | | | | | | | | Since fafcf94e2b5732d1e13b440291c53115d2b172e9 introduced an edid size, it seems to have broken this path. This manifest as oops on T500 Lenovo laptops with dual graphics primarily. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=33812 cc: stable@kernel.org Reviewed-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
| * drm/radeon/kms: add some new pci idsAlex Deucher2011-05-03
| | | | | | | | | | | | Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
* | logfs: initialize superblock entries earlierLinus Torvalds2011-05-03
|/ | | | | | | | | | | | | | In particular, s_freeing_list needs to be initialized early, since it is used on some of the error paths when mounts fail. The mapping inode, for example, would be initialized and then free'd on an error path before s_freeing_list was initialized, but the inode drop operation needs the s_freeing_list to be set up. Normally you'd never see this, because not only is logfs fairly rare, but a successful mount will never have any issues. Reported-by: werner <w.landgraf@ru.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'stable/bug-fixes-for-rc5' of ↵Linus Torvalds2011-05-03
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug-fixes-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_top xen/mmu: Add workaround "x86-64, mm: Put early page table high"
| * xen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_topStefano Stabellini2011-05-02
| | | | | | | | | | | | | | | | | | | | | | mask_rw_pte is currently checking if a pfn is a pagetable page if it falls in the range pgt_buf_start - pgt_buf_end but that is incorrect because pgt_buf_end is a moving target: pgt_buf_top is the real boundary. Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * xen/mmu: Add workaround "x86-64, mm: Put early page table high"Konrad Rzeszutek Wilk2011-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a consequence of the commit: commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e Author: Yinghai Lu <yinghai@kernel.org> Date: Fri Dec 17 16:58:28 2010 -0800 x86-64, mm: Put early page table high it causes the Linux kernel to crash under Xen: mapping kernel into physical memory Xen: setup ISA identity maps about to get started... (XEN) mm.c:2466:d0 Bad type (saw 7400000000000001 != exp 1000000000000000) for mfn b1d89 (pfn bacf7) (XEN) mm.c:3027:d0 Error while pinning mfn b1d89 (XEN) traps.c:481:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000] (XEN) domain_crash_sync called from entry.S (XEN) Domain 0 (vcpu#0) crashed on cpu#0: ... The reason is that at some point init_memory_mapping is going to reach the pagetable pages area and map those pages too (mapping them as normal memory that falls in the range of addresses passed to init_memory_mapping as argument). Some of those pages are already pagetable pages (they are in the range pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and everything is fine. Some of these pages are not pagetable pages yet (they fall in the range pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they are going to be mapped RW. When these pages become pagetable pages and are hooked into the pagetable, xen will find that the guest has already a RW mapping of them somewhere and fail the operation. The reason Xen requires pagetables to be RO is that the hypervisor needs to verify that the pagetables are valid before using them. The validation operations are called "pinning" (more details in arch/x86/xen/mmu.c). In order to fix the issue we mark all the pages in the entire range pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation is completed only the range pgt_buf_start-pgt_buf_end is reserved by init_memory_mapping. Hence the kernel is going to crash as soon as one of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those ranges are RO). For this reason, this function is introduced which is called _after_ the init_memory_mapping has completed (in a perfect world we would call this function from init_memory_mapping, but lets ignore that). Because we are called _after_ init_memory_mapping the pgt_buf_[start, end,top] have all changed to new values (b/c another init_memory_mapping is called). Hence, the first time we enter this function, we save away the pgt_buf_start value and update the pgt_buf_[end,top]. When we detect that the "old" pgt_buf_start through pgt_buf_end PFNs have been reserved (so memblock_x86_reserve_range has been called), we immediately set out to RW the "old" pgt_buf_end through pgt_buf_top. And then we update those "old" pgt_buf_[end|top] with the new ones so that we can redo this on the next pagetable. Acked-by: "H. Peter Anvin" <hpa@zytor.com> Reviewed-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> [v1: Updated with Jeremy's comments] [v2: Added the crash output] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>