| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for setting up a fixed IOMMU mapping on certain
cell machines. For 64-bit devices this avoids the performance overhead of
mapping and unmapping pages at runtime. 32-bit devices are unable to use
the fixed mapping.
The fixed mapping is established at boot, and maps all of physical memory
1:1 into device space at some offset. On machines with < 30 GB of memory
we setup the fixed mapping immediately above the normal IOMMU window.
For example a machine with 4GB of memory would end up with the normal
IOMMU window from 0-2GB and the fixed mapping window from 2GB to 6GB. In
this case a 64-bit device wishing to DMA to 1GB would be told to DMA to
3GB, plus any offset required by firmware. The firmware offset is encoded
in the "dma-ranges" property.
On machines with 30GB or more of memory, we are unable to place the fixed
mapping above the normal IOMMU window as we would run out of address space.
Instead we move the normal IOMMU window to coincide with the hash page
table, this region does not need to be part of the fixed mapping as no
device should ever be DMA'ing to it. We then setup the fixed mapping
from 0 to 32GB.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
|
|
|
|
|
|
|
| |
Split out the ioid fetching and checking logic so we can use it elsewhere
in a subsequent patch.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support to cell_iommu_setup_page_tables() for handling two windows,
the dynamic window and the fixed window. A fixed window size of 0
indicates that there is no fixed window at all.
Currently there are no callers who pass a non-zero fixed window, but the
upcoming fixed IOMMU mapping patch will change that.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
|
|
|
|
|
|
|
|
| |
Split the IOMMU logic out from cell_dma_dev_setup() into a separate
function. If we're not using dma_direct_ops or dma_iommu_ops we don't
know what the hell's going on, so BUG.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
|
|
|
|
|
|
|
|
| |
Split cell_iommu_setup_hardware() into two parts. Split the page table
setup into cell_iommu_setup_page_tables() and the bits that kick the
hardware into cell_iommu_enable_hardware().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
|
|
|
|
|
|
|
|
| |
Split out the logic that allocates a struct iommu into a separate
function. This can fail however the calling code has never cared - so
just return if we can't allocate an iommu.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to support the fixed IOMMU mapping (in a subsequent patch),
we need the hash table to be inside the IOMMUs DMA window. This is
usually 2G, but let's make sure the hash table is under 1G as that
will satisfy the IOMMU requirements and also means the hash table will
be on node 0.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
|
|\ |
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (27 commits)
lguest: use __PAGE_KERNEL instead of _PAGE_KERNEL
lguest: Use explicit includes rateher than indirect
lguest: get rid of lg variable assignments
lguest: change gpte_addr header
lguest: move changed bitmap to lg_cpu
lguest: move last_pages to lg_cpu
lguest: change last_guest to last_cpu
lguest: change spte_addr header
lguest: per-vcpu lguest pgdir management
lguest: make pending notifications per-vcpu
lguest: makes special fields be per-vcpu
lguest: per-vcpu lguest task management
lguest: replace lguest_arch with lg_cpu_arch.
lguest: make registers per-vcpu
lguest: make emulate_insn receive a vcpu struct.
lguest: map_switcher_in_guest() per-vcpu
lguest: per-vcpu interrupt processing.
lguest: per-vcpu lguest timers
lguest: make hypercalls use the vcpu struct
lguest: make write() operation smp aware
...
Manual conflict resolved (maybe even correctly, who knows) in
drivers/lguest/x86/core.c
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Reboot Implemented
(Prevent fd leak, fix style and fix documentation --RR)
Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| |\ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
PPC: Fix powerpc vio_find_name to not use devices_subsys
Driver core: add bus_find_device_by_name function
Module: check to see if we have a built in module with the same name
x86: fix runtime error in arch/x86/kernel/cpu/mcheck/mce_amd_64.c
Driver core: Fix up build when CONFIG_BLOCK=N
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This fixes vio_find_name() in arch/powerpc/kernel/vio.c, which is
currently broken because it tries to use devices_subsys. That is bad
for two reasons: (1) it's doing (or trying to do) a scan of all
devices when it should only be scanning those on the vio bus, and
(2) devices_subsys was an internal symbol of the device system code
which was never meant for external use and has now gone away, and
thus the kernel fails to compile on pSeries.
The new version uses bus_find_device_by_name() on the vio bus
(vio_bus_type).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This problem is due to the kobject rework recently done in this file.
The mce_amd_64.c code uses some wierd forward calls to back out of the
recursive way the code creates kobjects. Because of this, we need to
verify that we have really created a kobject before calling
kobject_uevent().
Many thanks to Yinghai Lu <yhlu.kernel@gmail.com> for reporting the
problem and testing.
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Jacob Shin <jacob.shin@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (249 commits)
KVM: Move apic timer migration away from critical section
KVM: Put kvm_para.h include outside __KERNEL__
KVM: Fix unbounded preemption latency
KVM: Initialize the mmu caches only after verifying cpu support
KVM: MMU: Fix dirty page setting for pages removed from rmap
KVM: Portability: Move kvm_fpu to asm-x86/kvm.h
KVM: x86 emulator: Only allow VMCALL/VMMCALL trapped by #UD
KVM: MMU: Merge shadow level check in FNAME(fetch)
KVM: MMU: Move kvm_free_some_pages() into critical section
KVM: MMU: Switch to mmu spinlock
KVM: MMU: Avoid calling gfn_to_page() in mmu_set_spte()
KVM: Add kvm_read_guest_atomic()
KVM: MMU: Concurrent guest walkers
KVM: Disable vapic support on Intel machines with FlexPriority
KVM: Accelerated apic support
KVM: local APIC TPR access reporting facility
KVM: Print data for unimplemented wrmsr
KVM: MMU: Add cache miss statistic
KVM: MMU: Coalesce remote tlb flushes
KVM: Expose ioapic to ia64 save/restore APIs
...
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Migrating the apic timer in the critical section is not very nice, and is
absolutely horrible with the real-time port. Move migration to the regular
vcpu execution path, triggered by a new bitflag.
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When preparing to enter the guest, if an interrupt comes in while
preemption is disabled but interrupts are still enabled, we miss a
preemption point. Fix by explicitly checking whether we need to
reschedule.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Otherwise we re-initialize the mmu caches, which will fail since the
caches are already registered, which will cause us to deinitialize said caches.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Right now rmap_remove won't set the page as dirty if the shadow pte
pointed to this page had write access and then it became readonly.
This patches fixes that, by setting the page as dirty for spte changes from
write to readonly access.
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When executing a test program called "crashme", we found the KVM guest cannot
survive more than ten seconds, then encounterd kernel panic. The basic concept
of "crashme" is generating random assembly code and trying to execute it.
After some fixes on emulator insn validity judgment, we found it's hard to
get the current emulator handle the invalid instructions correctly, for the
#UD trap for hypercall patching caused troubles. The problem is, if the opcode
itself was OK, but combination of opcode and modrm_reg was invalid, and one
operand of the opcode was memory (SrcMem or DstMem), the emulator will fetch
the memory operand first rather than checking the validity, and may encounter
an error there. For example, ".byte 0xfe, 0x34, 0xcd" has this problem.
In the patch, we simply check that if the invalid opcode wasn't vmcall/vmmcall,
then return from emulate_instruction() and inject a #UD to guest. With the
patch, the guest had been running for more than 12 hours.
Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Remove the redundant level check when fetching
shadow pte for present & non-present spte.
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
If some other cpu steals mmu pages between our check and an attempt to
allocate, we can run out of mmu pages. Fix by moving the check into the
same critical section as the allocation.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Convert the synchronization of the shadow handling to a separate mmu_lock
spinlock.
Also guard fetch() by mmap_sem in read-mode to protect against alias
and memslot changes.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Since gfn_to_page() is a sleeping function, and we want to make the core mmu
spinlocked, we need to pass the page from the walker context (which can sleep)
to the shadow context (which cannot).
[marcelo: avoid recursive locking of mmap_sem]
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
In preparation for a mmu spinlock, add kvm_read_guest_atomic()
and use it in fetch() and prefetch_page().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Do not hold kvm->lock mutex across the entire pagefault code,
only acquire it in places where it is necessary, such as mmu
hash list, active list, rmap and parent pte handling.
Allow concurrent guest walkers by switching walk_addr() to use
mmap_sem in read-mode.
And get rid of the lockless __gfn_to_page.
[avi: move kvm_mmu_pte_write() locking inside the function]
[avi: add locking for real mode]
[avi: fix cmpxchg locking]
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
FlexPriority accelerates the tpr without any patching.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This adds a mechanism for exposing the virtual apic tpr to the guest, and a
protocol for letting the guest update the tpr without causing a vmexit if
conditions allow (e.g. there is no interrupt pending with a higher priority
than the new tpr).
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Add a facility to report on accesses to the local apic tpr even if the
local apic is emulated in the kernel. This is basically a hack that
allows userspace to patch Windows which tends to bang on the tpr a lot.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This can help diagnosing what the guest is trying to do. In many cases
we can get away with partial emulation of msrs.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | | |
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Host side TLB flush can be merged together if multiple
spte need to be write-protected.
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Moving kvm_vcpu_kick() to x86.c. Since it should be
common for all archs, put its declarations in <linux/kvm_host.h>
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Move ioapic code to common, since IA64 also needs it.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This allows reuse of ioapic in ia64.
Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | | |
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This paves the way for multiple architecture support. Note that while
ioapic.c could potentially be shared with ia64, it is also moved.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
includes, not existing. Rather than add a zillion <asm/kvm.h>s, export kvm.h
only if the arch actually supports it.
Signed-off-by: Avi Kivity <avi@qumranet.com>
|
| |/ / /
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
With the sg table code, every SCSI driver is now either chain capable
or broken (or has sg_tablesize set so chaining is never activated), so
there's no need to have a check in the host template.
Also tidy up the code by moving the scatterlist size defines into the
SCSI includes and permit the last entry of the scatterlist pools not
to be a power of two.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
memnode.map is s16 array because of nodeid is 16 bit now.
so need to increase the nodemap_size according to that bits.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
one early crash on one 8 node 256g machine:
Command line: console=uart8250,io,0x3f8,115200n8 initrd=kernel.org/mydisk11_x86_64.gz rw root=/dev/ram0 debug initcall_debug apic=debug acpi.debug_level=0x0000000f pci=routeirq ip=dhcp load_ramdisk=1 ramdisk_size=131072 BOOT_IMAGE=kernel.org/bzImage_2.6.25_k8.1
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000dffe0000 (usable)
BIOS-e820: 00000000dffe0000 - 00000000dffee000 (ACPI data)
BIOS-e820: 00000000dffee000 - 00000000dffff050 (ACPI NVS)
BIOS-e820: 00000000dffff050 - 00000000e0000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000004020000000 (usable)
Early serial console at I/O port 0x3f8 (options '115200n8')
console [uart0] enabled
end_pfn_map = 67239936
Kernel panic - not syncing: Duplicated early reservation d40000-e42000
Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #3
Call Trace:
[<ffffffff80221545>] lapic_get_maxlvt+0x0/0x10
[<ffffffff80221657>] clear_local_APIC+0x5/0xcf
[<ffffffff80221726>] disable_local_APIC+0x5/0x17
[<ffffffff8021fe16>] smp_send_stop+0x46/0x4c
[<ffffffff80235293>] panic+0x94/0x13e
[<ffffffff80bc3b03>] sctp_eps_proc_init+0x12/0x34
[<ffffffff80b9f1c5>] reserve_early+0x30/0x6c
[<ffffffff80803925>] init_memory_mapping+0x2cd/0x2dc
[<ffffffff80b9dc01>] setup_arch+0x21f/0x44e
[<ffffffff80b978be>] start_kernel+0x6f/0x2c7
[<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3
it turns out there is overlap between pgtable and bss...
in System.map we have
ffffffff80d40420 b rsi_table
ffffffff80d40620 B krb5_seq_lock
ffffffff80d40628 b i.20437
ffffffff80d40630 b xprt_rdma_inline_write_padding
ffffffff80d40638 b sunrpc_table_header
ffffffff80d40640 b zero
ffffffff80d40644 b min_memreg
ffffffff80d40648 b rpcrdma_tk_lock_g
ffffffff80d40650 B sctp_assocs_id_lock
ffffffff80d40658 B proc_net_sctp
ffffffff80d40660 B sctp_assocs_id
ffffffff80d40680 B sysctl_sctp_mem
ffffffff80d40690 B sysctl_sctp_rmem
ffffffff80d406a0 B sysctl_sctp_wmem
ffffffff80d406b0 b sctp_ctl_socket
ffffffff80d406b8 b sctp_pf_inet6_specific
ffffffff80d406c0 b sctp_pf_inet_specific
ffffffff80d406c8 b sctp_af_v4_specific
ffffffff80d406d0 b sctp_af_v6_specific
ffffffff80d406d8 b sctp_rand.33270
ffffffff80d406dc b sctp_memory_pressure
ffffffff80d406e0 b sctp_sockets_allocated
ffffffff80d406e4 b sctp_memory_allocated
ffffffff80d406e8 b sctp_sysctl_header
ffffffff80d406f0 b zero
ffffffff80d406f4 A __bss_stop
ffffffff80d406f4 A _end
need to round up table_start to PAGE_SIZE.
also make the panic more informative.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This just adds the PCI IDs of AMD's family 10h and 11h CPU's northbridges to
k8topology discovery.
Signed-off-by: Joachim Deguara <joachim.deguara@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Put appropriate pagetable update hooks in so that paravirt knows
what's going on in there.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Use a standard list threaded through page->lru for maintaining the pgd
list on PAE. This is the same as 64-bit, and seems saner than using a
non-standard list via page->index.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch adds a new configuration option, which adds support for a new
early_param which gets checked in arch/x86/kernel/setup_{32,64}.c:setup_arch()
to decide wether OHCI-1394 FireWire controllers should be initialized and
enabled for physical DMA access to allow remote debugging of early problems
like issues ACPI or other subsystems which are executed very early.
If the config option is not enabled, no code is changed, and if the boot
paramenter is not given, no new code is executed, and independent of that,
all new code is freed after boot, so the config option can be even enabled
in standard, non-debug kernels.
With specialized tools, it is then possible to get debugging information
from machines which have no serial ports (notebooks) such as the printk
buffer contents, or any data which can be referenced from global pointers,
if it is stored below the 4GB limit and even memory dumps of of the physical
RAM region below the 4GB limit can be taken without any cooperation from the
CPU of the host, so the machine can be crashed early, it does not matter.
In the extreme, even kernel debuggers can be accessed in this way. I wrote
a small kgdb module and an accompanying gdb stub for FireWire which allows
to gdb to talk to kgdb using remote remory reads and writes over FireWire.
An version of the gdb stub fore FireWire is able to read all global data
from a system which is running a a normal kernel without any kernel debugger,
without any interruption or support of the system's CPU. That way, e.g. the
task struct and so on can be read and even manipulated when the physical DMA
access is granted.
A HOWTO is included in this patch, in Documentation/debugging-via-ohci1394.txt
and I've put a copy online at
ftp://ftp.suse.de/private/bk/firewire/docs/debugging-via-ohci1394.txt
It also has links to all the tools which are available to make use of it
another copy of it is online at:
ftp://ftp.suse.de/private/bk/firewire/kernel/ohci1394_dma_early-v2.diff
Signed-Off-By: Bernhard Kaindl <bk@suse.de>
Tested-By: Thomas Renninger <trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In x86 PAE mode, stop treating pmds as a special case. Previously
they were always allocated and freed with the pgd. The modifies the
code to be the same as 64-bit mode, where they are allocated on
demand.
This is a step on the way to unifying 32/64-bit pagetable allocation
as much as possible.
There is a complicating wart, however. When you install a new
reference to a pmd in the pgd, the processor isn't guaranteed to see
it unless you reload cr3. Since reloading cr3 also has the
side-effect of flushing the tlb, this is an expense that we want to
avoid whereever possible.
This patch simply avoids reloading cr3 unless the update is to the
current pagetable. Later patches will optimise this further.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: William Irwin <wli@holomorphy.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The change from current to tsk in do_page_fault is safe as
this is set at the very beginning of the function.
Removes a likely() annotation from the 64-bit version, this
could have instead been added to 32-bit.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When changing a kernel page from RO->RW, it's OK to leave stale TLB
entries around, since doing a global flush is expensive and they pose
no security problem. They can, however, generate a spurious fault,
which we should catch and simply return from (which will have the
side-effect of reloading the TLB to the current PTE).
This can occur when running under Xen, because it frequently changes
kernel pages from RW->RO->RW to implement Xen's pagetable semantics.
It could also occur when using CONFIG_DEBUG_PAGEALLOC, since it avoids
doing a global TLB flush after changing page permissions.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
On !PAE 32-bit, _PAGE_NX will be 0, making is_prefetch always
return early. The test is sufficient on PAE as __supported_pte_mask
is updated in the same places as nx_enabled in init_32.c which also
takes disable_nx into account.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Unify includes in moved fault.c.
Modify Makefiles to pick up unified file.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Elimination of these ifdefs can be done in a unified file.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|