aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAge
* Bugfix: avoid link error in Feather-Trace on x86Bjoern B. Brandenburg2010-07-16
| | | | | | | | | | | | | | | | | | | If no events are defined but Feater-Trace support is enabled, then the current implementation generates a link error because the __event_table sections is absent. > arch/x86/built-in.o: In function `ft_disable_all_events': > (.text+0x242af): undefined reference to `__start___event_table' As a simple work around, we force zero-element array to always be "allocated" in the __event_table section. This ensures that we end up with a zero-byte section if no events are enabled, and does not affect the layout of the section if events are present. > bbb@ludwig:~/dev/litmus2010$ nm vmlinux | grep event_table > ffffffff81950cdc D __event_table_dummy > ffffffff81950cdc A __start___event_table > ffffffff81950cdc A __stop___event_table
* Make platform-specific Feather-Trace depend on !CONFIG_DEBUG_RODATABjoern B. Brandenburg2010-06-01
| | | | | | | | | | | Feather-Trace rewrites instructions in the kernel's .text segment. This segment may be write-protected if CONFIG_DEBUG_RODATA is selected. In this case, fall back to the default flag-based Feather-Trace implementation. In the future, we could either adopt the ftrace method of rewriting .text addresses using non-.text mappings or we could consider replacing Feather-Trace with ftrace altogether. For now, this patch avoids unexpected runtime errors.
* Make C-EDF depend on x86 and SYSFSAndrea Bastoni2010-06-01
| | | | | | C-EDF depends on intel_cacheinfo.c (for get_shared_cpu_map()) which is only available on x86 architectures. Furthermore, get_shared_cpu_map() is only available if SYSFS filesystem is present.
* Make smp_send_pull_timers() optional.Bjoern B. Brandenburg2010-06-01
| | | | | There is currently no need to implement this in ARM. So let's make it optional instead.
* Make __ARCH_HAS_FEATHER_TRACE a proper CONFIG_ variable.Bjoern B. Brandenburg2010-05-30
| | | | | | | | | The idea of the Feather-Trace default implementation is that LITMUS^RT should work without a specialized Feather-Trace implementation present. This was actually broken. Changes litmus/feather_trace.h to only include asm/feather_trace.h if actually promised by the architecture.
* Merge branch 'master' into wip-merge-2.6.34Andrea Bastoni2010-05-29
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simple merge between master and 2.6.34 with conflicts resolved. This commit does not compile, the following main problems are still unresolved: - spinlock -> raw_spinlock API changes - kfifo API changes - sched_class API changes Conflicts: Makefile arch/x86/include/asm/hw_irq.h arch/x86/include/asm/unistd_32.h arch/x86/kernel/syscall_table_32.S include/linux/hrtimer.h kernel/sched.c kernel/sched_fair.c
| * Export shared_cpu_mapAndrea Bastoni2010-05-29
| | | | | | | | | | | | The cpumap of CPUs that share the same cache level is not normally available outside intel_cacheinfo.c. This commit allows to export such map.
| * Add Feather-Trace x86_64 architecture dependent codeAndrea Bastoni2010-05-29
| |
| * [ported from 2008.3] Add Feather-Trace x86_32 architecture dependent codeAndrea Bastoni2010-05-29
| | | | | | | | | | - [ported from 2008.3] Add x86_32 architecture dependent code. - Add the infrastructure for x86_32 - x86_64 integration.
| * Add support for x86_64 architectureAndrea Bastoni2010-05-29
| | | | | | | | | | | | | | - Add syscall on x86_64 - Refactor __NR_sleep_next_period -> __NR_complete_job for both x86_32 and x86_64
| * Add pull_timers_interrupt() to x86_64Andrea Bastoni2010-05-29
| | | | | | | | Add apic interrupt vector for pull_timers() in x86_64 arch.
| * [ported from 2008.3] Add hrtimer_start_on() APIAndrea Bastoni2010-05-29
| |
| * [ported from 2008.3] Add send_pull_timers() support for x86_32 archAndrea Bastoni2010-05-29
| |
| * [ported from 2008.3] Add GSN-EDF pluginAndrea Bastoni2010-05-29
| | | | | | | | | | | | - insert arm_release_timer() in add_relese() path - arm_release_timer() uses __hrtimer_start_range_ns() instead of hrtimer_start() to avoid deadlock on rq->lock.
| * [ported from 2008.3] Add LITRMUS^RT syscalls to x86_32Andrea Bastoni2010-05-29
| |
| * [ported from 2008.3] Add tracing support and hook up Litmus KConfig for x86Andrea Bastoni2010-05-29
| | | | | | | | | | | | | | | | - fix requesting more than 2^11 pages (MAX_ORDER) to system allocator Still to be merged: - feather-trace generic implementation
* | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2010-05-15
|\ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mrst: Don't blindly access extended config space
| * | x86, mrst: Don't blindly access extended config spaceH. Peter Anvin2010-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not blindly access extended configuration space unless we actively know we're on a Moorestown platform. The fixed-size BAR capability lives in the extended configuration space, and thus is not applicable if the configuration space isn't appropriately sized. This fixes booting certain VMware configurations with CONFIG_MRST=y. Moorestown will add a fake PCI-X 266 capability to advertise the presence of extended configuration space. Reported-and-tested-by: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Jacob Pan <jacob.jun.pan@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> LKML-Reference: <AANLkTiltKUa3TrKR1M51eGw8FLNoQJSLT0k0_K5X3-OJ@mail.gmail.com>
* | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2010-05-14
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments x86, k8: Fix build error when K8_NB is disabled x86, amd: Check X86_FEATURE_OSVW bit before accessing OSVW MSRs x86: Fix fake apicid to node mapping for numa emulation
| * | x86, cacheinfo: Turn off L3 cache index disable feature in virtualized ↵Frank Arnold2010-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | environments When running a quest kernel on xen we get: BUG: unable to handle kernel NULL pointer dereference at 0000000000000038 IP: [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df PGD 0 Oops: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 0, comm: swapper Tainted: G W 2.6.34-rc3 #1 /HVM domU RIP: 0010:[<ffffffff8142f2fb>] [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x 2ca/0x3df RSP: 0018:ffff880002203e08 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000060 RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000 RBP: ffff880002203ed8 R08: 00000000000017c0 R09: ffff880002203e38 R10: ffff8800023d5d40 R11: ffffffff81a01e28 R12: ffff880187e6f5c0 R13: ffff880002203e34 R14: ffff880002203e58 R15: ffff880002203e68 FS: 0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000038 CR3: 0000000001a3c000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a44020) Stack: ffffffff810d7ecb ffff880002203e20 ffffffff81059140 ffff880002203e30 <0> ffffffff810d7ec9 0000000002203e40 000000000050d140 ffff880002203e70 <0> 0000000002008140 0000000000000086 ffff880040020140 ffffffff81068b8b Call Trace: <IRQ> [<ffffffff810d7ecb>] ? sync_supers_timer_fn+0x0/0x1c [<ffffffff81059140>] ? mod_timer+0x23/0x25 [<ffffffff810d7ec9>] ? arm_supers_timer+0x34/0x36 [<ffffffff81068b8b>] ? hrtimer_get_next_event+0xa7/0xc3 [<ffffffff81058e85>] ? get_next_timer_interrupt+0x19a/0x20d [<ffffffff8142fa23>] get_cpu_leaves+0x5c/0x232 [<ffffffff8106a7b1>] ? sched_clock_local+0x1c/0x82 [<ffffffff8106a9a0>] ? sched_clock_tick+0x75/0x7a [<ffffffff8107748c>] generic_smp_call_function_single_interrupt+0xae/0xd0 [<ffffffff8101f6ef>] smp_call_function_single_interrupt+0x18/0x27 [<ffffffff8100a773>] call_function_single_interrupt+0x13/0x20 <EOI> [<ffffffff8143c468>] ? notifier_call_chain+0x14/0x63 [<ffffffff810295c6>] ? native_safe_halt+0xc/0xd [<ffffffff810114eb>] ? default_idle+0x36/0x53 [<ffffffff81008c22>] cpu_idle+0xaa/0xe4 [<ffffffff81423a9a>] rest_init+0x7e/0x80 [<ffffffff81b10dd2>] start_kernel+0x40e/0x419 [<ffffffff81b102c8>] x86_64_start_reservations+0xb3/0xb7 [<ffffffff81b103c4>] x86_64_start_kernel+0xf8/0x107 Code: 14 d5 40 ff ae 81 8b 14 02 31 c0 3b 15 47 1c 8b 00 7d 0e 48 8b 05 36 1c 8b 00 48 63 d2 48 8b 04 d0 c7 85 5c ff ff ff 00 00 00 00 <8b> 70 38 48 8d 8d 5c ff ff ff 48 8b 78 10 ba c4 01 00 00 e8 eb RIP [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df RSP <ffff880002203e08> CR2: 0000000000000038 ---[ end trace a7919e7f17c0a726 ]--- The L3 cache index disable feature of AMD CPUs has to be disabled if the kernel is running as guest on top of a hypervisor because northbridge devices are not available to the guest. Currently, this fixes a boot crash on top of Xen. In the future this will become an issue on KVM as well. Check if northbridge devices are present and do not enable the feature if there are none. [ hpa: backported to 2.6.34 ] Signed-off-by: Frank Arnold <frank.arnold@amd.com> LKML-Reference: <1271945222-5283-3-git-send-email-bp@amd64.org> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org>
| * | x86, k8: Fix build error when K8_NB is disabledBorislav Petkov2010-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | K8_NB depends on PCI and when the last is disabled (allnoconfig) we fail at the final linking stage due to missing exported num_k8_northbridges. Add a header stub for that. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100503183036.GJ26107@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org>
| * | x86, amd: Check X86_FEATURE_OSVW bit before accessing OSVW MSRsAndreas Herrmann2010-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If host CPU is exposed to a guest the OSVW MSRs are not guaranteed to be present and a GP fault occurs. Thus checking the feature flag is essential. Cc: <stable@kernel.org> # .32.x .33.x Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100427101348.GC4489@alberich.amd.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | x86: Fix fake apicid to node mapping for numa emulationDavid Rientjes2010-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With NUMA emulation, it's possible for a single cpu to be bound to multiple nodes since more than one may have affinity if allocated on a physical node that is local to the cpu. APIC ids must therefore be mapped to the lowest node ids to maintain generic kernel use of functions such as cpu_to_node() that determine device affinity. For example, if a device has proximity to physical node 1, for instance, and a cpu happens to be mapped to a higher emulated node id 8, the proximity may not be correctly determined by comparison in generic code even though the cpu may be truly local and allocated on physical node 1. When this happens, the true topology of the machine isn't accurately represented in the emulated environment; although this isn't critical to the system's uptime, any generic code that is NUMA aware benefits from the physical topology being accurately represented. This can affect any system that maps multiple APIC ids to a single node and is booted with numa=fake=N where N is greater than the number of physical nodes. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <alpine.DEB.2.00.1005060224140.19473@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | KVM: VMX: blocked-by-sti must not defer NMI injectionsJan Kiszka2010-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the processor may not consider GUEST_INTR_STATE_STI as a reason for blocking NMI, it could return immediately with EXIT_REASON_NMI_WINDOW when we asked for it. But as we consider this state as NMI-blocking, we can run into an endless loop. Resolve this by allowing NMI injection if just GUEST_INTR_STATE_STI is active (originally suggested by Gleb). Intel confirmed that this is safe, the processor will never complain about NMI injection in this state. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> KVM-Stable-Tag Acked-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* | | KVM: x86: Call vcpu_load and vcpu_put in cpuid_updateDongxiao Xu2010-05-13
| | | | | | | | | | | | | | | | | | | | | | | | cpuid_update may operate VMCS, so vcpu_load() and vcpu_put() should be called to ensure correctness. Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* | | KVM: SVM: Fix wrong intercept masks on 32 bitJoerg Roedel2010-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes KVM on 32 bit SVM working again by correcting the masks used for iret interception. With the wrong masks the upper 32 bits of the intercepts are masked out which leaves vmrun unintercepted. This is not legal on svm and the vmrun fails. Bug was introduced by commits 95ba827313 and 3cfc3092. Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | | kprobes/x86: Fix removed int3 checking orderMasami Hiramatsu2010-05-11
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix kprobe/x86 to check removed int3 when failing to get kprobe from hlist. Since we have a time window between checking int3 exists on probed address and getting kprobe on that address, we can have following scenario: ------- CPU1 CPU2 hit int3 check int3 exists remove int3 remove kprobe from hlist get kprobe from hlist no kprobe->OOPS! ------- This patch moves int3 checking if there is no kprobe on that address for fixing this problem as follows: ------ CPU1 CPU2 hit int3 remove int3 remove kprobe from hlist get kprobe from hlist no kprobe->check int3 exists ->rollback&retry ------ Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Dave Anderson <anderson@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100427223348.2322.9112.stgit@localhost6.localdomain6> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2010-05-04
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip: powernow-k8: Fix frequency reporting x86: Fix parse_reservetop() build failure on certain configs x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests x86: Fix 'reservetop=' functionality
| * | powernow-k8: Fix frequency reportingMark Langsdorf2010-05-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With F10, model 10, all valid frequencies are in the ACPI _PST table. Cc: <stable@kernel.org> # 33.x 32.x Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> LKML-Reference: <1270065406-1814-6-git-send-email-bp@amd64.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86: Fix parse_reservetop() build failure on certain configsIngo Molnar2010-05-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e67a807 ("x86: Fix 'reservetop=' functionality") added a fixup_early_ioremap() call to parse_reservetop() and declared it in io.h. But asm/io.h was only included indirectly - and on some configs not at all, causing a build failure on those configs. Cc: Liang Li <liang.li@windriver.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Wang Chen <wangchen@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1272621711-8683-1-git-send-email-liang.li@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86: Fix NULL pointer access in irq_force_complete_move() for Xen guestsPrarit Bhargava2010-04-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upstream PV guests fail to boot because of a NULL pointer in irq_force_complete_move(). It is possible that xen guests have irq_desc->chip_data = NULL. Test for NULL chip_data pointer before attempting to complete an irq move. Signed-off-by: Prarit Bhargava <prarit@redhat.com> LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@kernel.org> [2.6.33]
| * | x86: Fix 'reservetop=' functionalityLiang Li2010-04-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When specifying the 'reservetop=0xbadc0de' kernel parameter, the kernel will stop booting due to a early_ioremap bug that relates to commit 8827247ff. The root cause of boot failure problem is the value of 'slot_virt[i]' was initialized in setup_arch->early_ioremap_init(). But later in setup_arch, the function 'parse_early_param' will modify 'FIXADDR_TOP' when 'reservetop=0xbadc0de' being specified. The simplest fix might be use __fix_to_virt(idx0) to get updated value of 'FIXADDR_TOP' in '__early_ioremap' instead of reference old value from slot_virt[slot] directly. Changelog since v0: -v1: When reservetop being handled then FIXADDR_TOP get adjusted, Hence check prev_map then re-initialize slot_virt and PMD based on new FIXADDR_TOP. -v2: place fixup_early_ioremap hence call early_ioremap_init in reserve_top_address to re-initialize slot_virt and corresponding PMD when parse_reservertop -v3: move fixup_early_ioremap out of reserve_top_address to make sure other clients of reserve_top_address like xen/lguest won't broken Signed-off-by: Liang Li <liang.li@windriver.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Wang Chen <wangchen@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1272621711-8683-1-git-send-email-liang.li@windriver.com> [ fixed three small cleanliness details in fixup_early_ioremap() ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Fix the x86_64 implementation of call_rwsem_wait()David Howells2010-05-04
|/ / | | | | | | | | | | | | | | | | The x86_64 call_rwsem_wait() treats the active state counter part of the R/W semaphore state as being 16-bit when it's actually 32-bit (it's half of the 64-bit state). It should do "decl %edx" not "decw %dx". Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2010-04-28
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip: x86: Disable large pages on CPUs with Atom erratum AAE44 x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzero x86, mrst: Conditionally register cpu hotplug notifier for apbt
| * | x86: Disable large pages on CPUs with Atom erratum AAE44H. Peter Anvin2010-04-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Atom erratum AAE44/AAF40/AAG38/AAH41: "If software clears the PS (page size) bit in a present PDE (page directory entry), that will cause linear addresses mapped through this PDE to use 4-KByte pages instead of using a large page after old TLB entries are invalidated. Due to this erratum, if a code fetch uses this PDE before the TLB entry for the large page is invalidated then it may fetch from a different physical address than specified by either the old large page translation or the new 4-KByte page translation. This erratum may also cause speculative code fetches from incorrect addresses." [http://download.intel.com/design/processor/specupdt/319536.pdf] Where as commit 211b3d03c7400f48a781977a50104c9d12f4e229 seems to workaround errata AAH41 (mixed 4K TLBs) it reduces the window of opportunity for the bug to occur and does not totally remove it. This patch disables mixed 4K/4MB page tables totally avoiding the page splitting and not tripping this processor issue. This is based on an original patch by Colin King. Originally-by: Colin Ian King <colin.king@canonical.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com> Cc: <stable@kernel.org>
| * | x86-64: Clear a 64-bit FS/GS base on fork if selector is nonzeroH. Peter Anvin2010-04-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we do a thread switch, we clear the outgoing FS/GS base if the corresponding selector is nonzero. This is taken by __switch_to() as an entry invariant; it does not verify that it is true on entry. However, copy_thread() doesn't enforce this constraint, which can result in inconsistent results after fork(). Make copy_thread() match the behavior of __switch_to(). Reported-and-tested-by: Samuel Thibault <samuel.thibault@inria.fr> Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <4BD1E061.8030605@zytor.com> Cc: <stable@kernel.org>
| * | x86, mrst: Conditionally register cpu hotplug notifier for apbtJacob Pan2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | APB timer is used on Moorestown platforms but not on a standard PC. If APB timer code is compiled in but not initialized at run-time due to lack of FW reported SFI table, kernel would panic when the non-boot CPUs are offlined and notifier is called. https://bugzilla.kernel.org/show_bug.cgi?id=15786 This patch ensures CPU hotplug notifier for APB timer is only registered when the APBT timer block is initialized. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> LKML-Reference: <1271701423-1162-1-git-send-email-jacob.jun.pan@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | | x86/PCI: compute Address Space length rather than using _LENBjorn Helgaas2010-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ACPI _CRS Address Space Descriptors have _MIN, _MAX, and _LEN. Linux has been computing Address Spaces as [_MIN to _MIN + _LEN - 1]. Based on the tests in the bug reports below, Windows apparently uses [_MIN to _MAX]. Per spec (ACPI 4.0, Table 6-40), for _CRS fixed-size, fixed location descriptors, "_LEN must be (_MAX - _MIN + 1)", and when that's true, it doesn't matter which way we compute the end. But of course, there are BIOSes that don't follow this rule, and we're better off if Linux handles those exceptions the same way as Windows. This patch makes Linux use [_MIN to _MAX], as Windows seems to do. This effectively reverts d558b483d5 and 03db42adfe and replaces them with simpler code. https://bugzilla.kernel.org/show_bug.cgi?id=14337 (round) https://bugzilla.kernel.org/show_bug.cgi?id=15480 (truncate) Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* | | x86/PCI: never allocate PCI MMIO resources below BIOS_ENDBjorn Helgaas2010-04-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we move a PCI device or assign resources to a device not configured by the BIOS, we want to avoid the BIOS region below 1MB. Note that if the BIOS places devices below 1MB, we leave them there. See https://bugzilla.kernel.org/show_bug.cgi?id=15744 and https://bugzilla.kernel.org/show_bug.cgi?id=15841 Tested-by: Andy Isaacson <adi@hexapodia.org> Tested-by: Andy Bailey <bailey@akamai.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2010-04-24
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: Ensure we re-enable devices on resume x86/PCI: parse additional host bridge window resource types PCI: revert broken device warning PCI aerdrv: use correct bit defines and add 2ms delay to aer_root_reset x86/PCI: ignore Consumer/Producer bit in ACPI window descriptions
| * | | x86/PCI: parse additional host bridge window resource typesBjorn Helgaas2010-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for Memory24, Memory32, and Memory32Fixed descriptors in PCI host bridge _CRS. I experimentally determined that Windows (2008 R2) accepts these descriptors and treats them as windows that are forwarded to the PCI bus, e.g., if it finds any PCI devices with BARs outside the windows, it moves them into the windows. I don't know whether any machines actually use these descriptors in PCI host bridge _CRS methods, but if any exist and they're new enough that we automatically turn on "pci=use_crs", they will work with Windows but not with Linux. Here are the details: https://bugzilla.kernel.org/show_bug.cgi?id=15817 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | | x86/PCI: ignore Consumer/Producer bit in ACPI window descriptionsBjorn Helgaas2010-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ACPI Address Space Descriptors (used in _CRS) have a Consumer/Producer bit that is supposed to distinguish regions that are consumed directly by a device from those that are forwarded ("produced") by a bridge. But BIOSes have apparently not used this consistently, and Windows seems to ignore it, so I think Linux should ignore it as well. I can't point to any of these supposed broken BIOSes, but since we now rely on _CRS by default, I think it's safer to ignore this bit from the start. Here are details of my experiments with how Windows handles it: https://bugzilla.kernel.org/show_bug.cgi?id=15701 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* | | | VMware Balloon driverDmitry Torokhov2010-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a standalone version of VMware Balloon driver. Ballooning is a technique that allows hypervisor dynamically limit the amount of memory available to the guest (with guest cooperation). In the overcommit scenario, when hypervisor set detects that it needs to shuffle some memory, it instructs the driver to allocate certain number of pages, and the underlying memory gets returned to the hypervisor. Later hypervisor may return memory to the guest by reattaching memory to the pageframes and instructing the driver to "deflate" balloon. We are submitting a standalone driver because KVM maintainer (Avi Kivity) expressed opinion (rightly) that our transport does not fit well into virtqueue paradigm and thus it does not make much sense to integrate with virtio. There were also some concerns whether current ballooning technique is the right thing. If there appears a better framework to achieve this we are prepared to evaluate and switch to using it, but in the meantime we'd like to get this driver upstream. We want to get the driver accepted in distributions so that users do not have to deal with an out-of-tree module and many distributions have "upstream first" requirement. The driver has been shipping for a number of years and users running on VMware platform will have it installed as part of VMware Tools even if it will not come from a distribution, thus there should not be additional risk in pulling the driver into mainline. The driver will only activate if host is VMware so everyone else should not be affected at all. Signed-off-by: Dmitry Torokhov <dtor@vmware.com> Cc: Avi Kivity <avi@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2010-04-21
|\ \ \ \ | |_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'kvm-updates/2.6.34' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix TSS size check for 16-bit tasks KVM: Add missing srcu_read_lock() for kvm_mmu_notifier_release() KVM: Increase NR_IOBUS_DEVS limit to 200 KVM: fix the handling of dirty bitmaps to avoid overflows KVM: MMU: fix kvm_mmu_zap_page() and its calling path KVM: VMX: Save/restore rflags.vm correctly in real mode KVM: allow bit 10 to be cleared in MSR_IA32_MC4_CTL KVM: Don't spam kernel log when injecting exceptions due to bad cr writes KVM: SVM: Fix memory leaks that happen when svm_create_vcpu() fails KVM: take srcu lock before call to complete_pio()
| * | | KVM: x86: Fix TSS size check for 16-bit tasksJan Kiszka2010-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A 16-bit TSS is only 44 bytes long. So make sure to test for the correct size on task switch. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | | KVM: fix the handling of dirty bitmaps to avoid overflowsTakuya Yoshikawa2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Int is not long enough to store the size of a dirty bitmap. This patch fixes this problem with the introduction of a wrapper function to calculate the sizes of dirty bitmaps. Note: in mark_page_dirty(), we have to consider the fact that __set_bit() takes the offset as int, not long. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | KVM: MMU: fix kvm_mmu_zap_page() and its calling pathXiao Guangrong2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fix: - calculate zapped page number properly in mmu_zap_unsync_children() - calculate freeed page number properly kvm_mmu_change_mmu_pages() - if zapped children page it shoud restart hlist walking KVM-Stable-Tag. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | KVM: VMX: Save/restore rflags.vm correctly in real modeAvi Kivity2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we set eflags.vm unconditionally when entering real mode emulation through virtual-8086 mode, and clear it unconditionally when we enter protected mode. The means that the following sequence KVM_SET_REGS (rflags.vm=1) KVM_SET_SREGS (cr0.pe=1) Ends up with rflags.vm clear due to KVM_SET_SREGS triggering enter_pmode(). Fix by shadowing rflags.vm (and rflags.iopl) correctly while in real mode: reads and writes to those bits access a shadow register instead of the actual register. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
| * | | KVM: allow bit 10 to be cleared in MSR_IA32_MC4_CTLAndre Przywara2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a quirk for AMD K8 CPUs in many Linux kernels (see arch/x86/kernel/cpu/mcheck/mce.c:__mcheck_cpu_apply_quirks()) that clears bit 10 in that MCE related MSR. KVM can only cope with all zeros or all ones, so it will inject a #GP into the guest, which will let it panic. So lets add a quirk to the quirk and ignore this single cleared bit. This fixes -cpu kvm64 on all machines and -cpu host on K8 machines with some guest Linux kernels. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
| * | | KVM: Don't spam kernel log when injecting exceptions due to bad cr writesAvi Kivity2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | These are guest-triggerable. Signed-off-by: Avi Kivity <avi@redhat.com>