aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/include/asm
Commit message (Collapse)AuthorAge
* Merge branch 'x86-mce-for-linus' of ↵Linus Torvalds2012-01-06
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: add IRQ context simulation in module mce-inject x86, mce, therm_throt: Don't report power limit and package level thermal throttle events in mcelog x86, MCE: Drain mcelog buffer x86, mce: Add wrappers for registering on the decode chain
| * Merge branch 'mce-inject' of ↵Ingo Molnar2011-12-18
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce
| | * x86: add IRQ context simulation in module mce-injectChen Gong2011-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mce-inject provides a mechanism to simulate errors so that test scripts can check for correct operation of the kernel without requiring any specialized hardware to create rare events. The existing code can simulate events in normal process context and also in NMI context - but not in IRQ context. This patch fills that gap. Link: https://lkml.org/lkml/2011/12/7/537 Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | x86, mce: Add wrappers for registering on the decode chainBorislav Petkov2011-12-14
| |/ | | | | | | | | | | | | No functionality change, this is done so that in a follow-on patch all queued-up MCEs can be decoded after registering on the chain. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | Merge branch 'x86-microcode-for-linus' of ↵Linus Torvalds2012-01-06
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode, AMD: Update copyrights x86, microcode, AMD: Exit early on success x86, microcode, AMD: Simplify ucode verification x86, microcode, AMD: Add a reusable buffer x86, microcode, AMD: Add a vendor-specific exit function
| * \ Merge branch 'ucode-verify-size' of ↵Ingo Molnar2011-12-15
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/microcode
| | * | x86, microcode, AMD: Add a vendor-specific exit functionBorislav Petkov2011-12-14
| | |/ | | | | | | | | | | | | | | | This will be used to do cleanup work before the driver exits. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
* | | Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds2012-01-06
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Use "do { } while(0)" for empty lock_cmos()/unlock_cmos() macros x86: Use "do { } while(0)" for empty flush_tlb_fix_spurious_fault() macro x86, CPU: Drop superfluous get_cpu_cap() prototype arch/x86/mm/pageattr.c: Quiet sparse noise; local functions should be static arch/x86/kernel/ptrace.c: Quiet sparse noise x86: Use kmemdup() in copy_thread(), rather than duplicating its implementation x86: Replace the EVT_TO_HPET_DEV() macro with an inline function
| * | | x86: Use "do { } while(0)" for empty lock_cmos()/unlock_cmos() macrosJesper Juhl2011-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc noticed (when using -Wempty-body) that our use of lock_cmos() and unlock_cmos() in arch/x86/include/asm/mach_traps.h is potentially problematic : arch/x86/include/asm/mach_traps.h:32:15: warning: suggest braces around empty body in an ¡else¢ statement [-Wempty-body] arch/x86/include/asm/mach_traps.h:40:16: warning: suggest braces around empty body in an ¡else¢ statement [-Wempty-body] Let's just use the standard 'do {} while (0)' solution. That shuts up gcc and also prevents future problems if the macros should end up being used in a similar situation elsewhere. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1112180103130.21784@swampdragon.chaosbits.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: Use "do { } while(0)" for empty flush_tlb_fix_spurious_fault() macroJesper Juhl2011-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If one builds the kernel with -Wempty-body one gets this warning: mm/memory.c:3432:46: warning: suggest braces around empty body in an ¡if¢ statement [-Wempty-body] due to the fact that 'flush_tlb_fix_spurious_fault' is a macro that can sometimes be defined to nothing. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: linux-mm@kvack.org Cc: Michel Lespinasse <walken@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Johannes Weiner <hannes@cmpxchg.org> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1112180128070.21784@swampdragon.chaosbits.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds2012-01-06
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86: Fix atomic64_xxx_cx8() functions x86: Fix and improve cmpxchg_double{,_local}() x86_64, asm: Optimise fls(), ffs() and fls64() x86, bitops: Move fls64.h inside __KERNEL__ x86: Fix and improve percpu_cmpxchg{8,16}b_double() x86: Report cpb and eff_freq_ro flags correctly x86/i386: Use less assembly in strlen(), speed things up a bit x86: Use the same node_distance for 32 and 64-bit x86: Fix rflags in FAKE_STACK_FRAME x86: Clean up and extend do_int3() x86: Call do_notify_resume() with interrupts enabled x86/div64: Add a micro-optimization shortcut if base is power of two x86-64: Cleanup some assembly entry points x86-64: Slightly shorten line system call entry and exit paths x86-64: Reduce amount of redundant code generated for invalidate_interruptNN x86-64: Slightly shorten int_ret_from_sys_call x86, efi: Convert efi_phys_get_time() args to physical addresses x86: Default to vsyscall=emulate x86-64: Set siginfo and context on vsyscall emulation faults x86: consolidate xchg and xadd macros ...
| * | | | x86: Fix atomic64_xxx_cx8() functionsEric Dumazet2012-01-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It appears about all functions in arch/x86/lib/atomic64_cx8_32.S are wrong in case cmpxchg8b must be restarted, because LOCK_PREFIX macro defines a label "1" clashing with other local labels : 1: some_instructions LOCK_PREFIX cmpxchg8b (%ebp) jne 1b / jumps to beginning of LOCK_PREFIX ! A possible fix is to use a magic label "672" in LOCK_PREFIX asm definition, similar to the "671" one we defined in LOCK_PREFIX_HERE. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Jan Beulich <JBeulich@suse.com> Cc: Christoph Lameter <cl@linux.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1325608540.2320.103.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86: Fix and improve cmpxchg_double{,_local}()Jan Beulich2012-01-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like the per-CPU ones they had several problems/shortcomings: Only the first memory operand was mentioned in the asm() operands, and the 2x64-bit version didn't have a memory clobber while the 2x32-bit one did. The former allowed the compiler to not recognize the need to re-load the data in case it had it cached in some register, while the latter was overly destructive. The types of the local copies of the old and new values were incorrect (the types of the pointed-to variables should be used here, to make sure the respective old/new variable types are compatible). The __dummy/__junk variables were pointless, given that local copies of the inputs already existed (and can hence be used for discarded outputs). The 32-bit variant of cmpxchg_double_local() referenced cmpxchg16b_local(). At once also: - change the return value type to what it really is: 'bool' - unify 32- and 64-bit variants - abstract out the common part of the 'normal' and 'local' variants Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Christoph Lameter <cl@linux.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/4F01F12A020000780006A19B@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | Merge commit 'v3.2-rc7' into x86/asmIngo Molnar2012-01-04
| |\ \ \ \ | | | |/ / | | |/| | | | | | | | | | | | | | | | | Merge reason: Update from -rc4 to -rc7. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86_64, asm: Optimise fls(), ffs() and fls64()David Howells2011-12-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fls(N), ffs(N) and fls64(N) can be optimised on x86_64. Currently they use a CMOV instruction after the BSR/BSF to set the destination register to -1 if the value to be scanned was 0 (in which case BSR/BSF set the Z flag). Instead, according to the AMD64 specification, we can make use of the fact that BSR/BSF doesn't modify its output register if its input is 0. By preloading the output with -1 and incrementing the result, we achieve the desired result without the need for a conditional check. The Intel x86_64 specification, however, says that the result of BSR/BSF in such a case is undefined. That said, when queried, one of the Intel CPU architects said that the behaviour on all Intel CPUs is that: (1) with BSRQ/BSFQ, the 64-bit destination register is written with its original value if the source is 0, thus, in essence, giving the effect we want. And, (2) with BSRL/BSFL, the lower half of the 64-bit destination register is written with its original value if the source is 0, and the upper half is cleared, thus giving us the effect we want (we return a 4-byte int). Further, it was indicated that they (Intel) are unlikely to get away with changing the behaviour. It might be possible to optimise the 32-bit versions of these functions, but there's a lot more variation, and so the effective non-destructive property of BSRL/BSRF cannot be relied on. [ hpa: specifically, some 486 chips are known to NOT have this property. ] I have benchmarked these functions on my Core2 Duo test machine using the following program: #include <stdlib.h> #include <stdio.h> #ifndef __x86_64__ #error #endif #define PAGE_SHIFT 12 typedef unsigned long long __u64, u64; typedef unsigned int __u32, u32; #define noinline __attribute__((noinline)) static __always_inline int fls64(__u64 x) { long bitpos = -1; asm("bsrq %1,%0" : "+r" (bitpos) : "rm" (x)); return bitpos + 1; } static inline unsigned long __fls(unsigned long word) { asm("bsr %1,%0" : "=r" (word) : "rm" (word)); return word; } static __always_inline int old_fls64(__u64 x) { if (x == 0) return 0; return __fls(x) + 1; } static noinline // __attribute__((const)) int old_get_order(unsigned long size) { int order; size = (size - 1) >> (PAGE_SHIFT - 1); order = -1; do { size >>= 1; order++; } while (size); return order; } static inline __attribute__((const)) int get_order_old_fls64(unsigned long size) { int order; size--; size >>= PAGE_SHIFT; order = old_fls64(size); return order; } static inline __attribute__((const)) int get_order(unsigned long size) { int order; size--; size >>= PAGE_SHIFT; order = fls64(size); return order; } unsigned long prevent_optimise_out; static noinline unsigned long test_old_get_order(void) { unsigned long n, total = 0; long rep, loop; for (rep = 1000000; rep > 0; rep--) { for (loop = 0; loop <= 16384; loop += 4) { n = 1UL << loop; total += old_get_order(n); } } return total; } static noinline unsigned long test_get_order_old_fls64(void) { unsigned long n, total = 0; long rep, loop; for (rep = 1000000; rep > 0; rep--) { for (loop = 0; loop <= 16384; loop += 4) { n = 1UL << loop; total += get_order_old_fls64(n); } } return total; } static noinline unsigned long test_get_order(void) { unsigned long n, total = 0; long rep, loop; for (rep = 1000000; rep > 0; rep--) { for (loop = 0; loop <= 16384; loop += 4) { n = 1UL << loop; total += get_order(n); } } return total; } int main(int argc, char **argv) { unsigned long total; switch (argc) { case 1: total = test_old_get_order(); break; case 2: total = test_get_order_old_fls64(); break; default: total = test_get_order(); break; } prevent_optimise_out = total; return 0; } This allows me to test the use of the old fls64() implementation and the new fls64() implementation and also to contrast these to the out-of-line loop-based implementation of get_order(). The results were: warthog>time ./get_order real 1m37.191s user 1m36.313s sys 0m0.861s warthog>time ./get_order x real 0m16.892s user 0m16.586s sys 0m0.287s warthog>time ./get_order x x real 0m7.731s user 0m7.727s sys 0m0.002s Using the current upstream fls64() as a basis for an inlined get_order() [the second result above] is much faster than using the current out-of-line loop-based get_order() [the first result above]. Using my optimised inline fls64()-based get_order() [the third result above] is even faster still. [ hpa: changed the selection of 32 vs 64 bits to use CONFIG_X86_64 instead of comparing BITS_PER_LONG, updated comments, rebased manually on top of 83d99df7c4bf x86, bitops: Move fls64.h inside __KERNEL__ ] Signed-off-by: David Howells <dhowells@redhat.com> Link: http://lkml.kernel.org/r/20111213145654.14362.39868.stgit@warthog.procyon.org.uk Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | x86, bitops: Move fls64.h inside __KERNEL__H. Peter Anvin2011-12-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We would include <asm-generic/bitops/fls64.h> even without __KERNEL__, but that doesn't make sense, as: 1. That file provides fls64(), but the corresponding function fls() is not exported to user space. 2. The implementation of fls64.h uses kernel-only symbols. 3. fls64.h is not exported to user space. This appears to have been a bug introduced in checkin: d57594c203b1 bitops: use __fls for fls64 on 64-bit archs Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Alexander van Heukelum <heukelum@mailshack.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/4EEA77E1.6050009@zytor.com
| * | | | x86: Fix and improve percpu_cmpxchg{8,16}b_double()Jan Beulich2011-12-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They had several problems/shortcomings: Only the first memory operand was mentioned in the 2x32bit asm() operands, and 2x64-bit version had a memory clobber. The first allowed the compiler to not recognize the need to re-load the data in case it had it cached in some register, and the second was overly destructive. The memory operand in the 2x32-bit asm() was declared to only be an output. The types of the local copies of the old and new values were incorrect (as in other per-CPU ops, the types of the per-CPU variables accessed should be used here, to make sure the respective types are compatible). The __dummy variable was pointless (and needlessly initialized in the 2x32-bit case), given that local copies of the inputs already exist. The 2x64-bit variant forced the address of the first object into %rsi, even though this is needed only for the call to the emulation function. The real cmpxchg16b can operate on an memory. At once also change the return value type to what it really is - 'bool'. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Cc: Christoph Lameter <cl@linux.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4EE86D6502000078000679FE@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86: Use the same node_distance for 32 and 64-bitH Hartley Sweeten2011-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The node_distance function is not x86 64-bit specific. Having the #ifdef around the extern function declaration and the #define causes the default node_distance macro to be used in asm-generic/topology.h. This also causes a sparse warning in arch/x86/mm/numa.c when CONFIG_X86_64 is not set: warning: symbol '__node_distance' was not declared. Should it be static? Remove the #ifdef to fix both issues. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1112061220310.28251@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86: Fix rflags in FAKE_STACK_FRAMESeiichi Ikarashi2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The x86_64 kernel pushes the fake kernel stack in arch/x86/kernel/entry_64.S:FAKE_STACK_FRAME, and rflags register in it does not conform to the specification. Although Intel's manual[1] says bit 1 of it shall be set to 1, this bit is cleared to 0 on pushing the fake stack. [1] Intel(R) 64 and IA-32 Architectures Software Developer's Manual Vol.1 3-21 Figure 3-8. EFLAGS Register If it is not on purpose, it is better to be fixed, because it can lead some tools misunderstanding the stack frame. For example, "crash" utility[2] actually detects it and warns you like below: RIP: ffffffff8005dfa2 RSP: ffff8104ce0c7f58 RFLAGS: 00000200 [...] bt: WARNING: possibly bogus exception frame Signed-off-by: Seiichi Ikarashi <s.ikarashi@jp.fujitsu.com> Tested-by: Masayoshi MIZUMA <m.mizuma@jp.fujitsu.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86/div64: Add a micro-optimization shortcut if base is power of twoSebastian Andrzej Siewior2011-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the target code I have a do_div(x, PAGE_SIZE). The x86-64 version of it was doing a shift and a mask which is clever. The 32bit version of it had a div operation in it which made me think. After digging I noticed that x86 has an optimized version of it. This patch adds this shift and mask optimization if base is constant so we don't have any runtime "checking" overhead since most users use a power of ten. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1322649814-544-1-git-send-email-bigeasy@linutronix.de Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86-64: Slightly shorten line system call entry and exit pathsJan Beulich2011-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GET_THREAD_INFO() involves a memory read immediately followed by an "sub" on the value read, in turn (in several cases) immediately followed by a use of the calculated value as the base address of a memory access. This combination of instructions has a non-negligible potential for stalls. In the system call entry point code, however, the (fixed) offset of the stack pointer from the end of the stack is generally known, and hence we can instead avoid the memory load and subtract, and instead do the memory reference using %rsp as the base register. To do so in a legible fashion, introduce a THREAD_INFO() macro which, provided a register (generally %rsp) and the known offset from the end of the stack, produces a suitable memory access operand. The patch attempts to only touch the fast paths (no auditing and alike), but manages to do so only in the 64-bit entry point case; the compatibility mode entry points have so many interdependencies between their various branch targets that it was necessary to also adjust the slow paths to eliminate the risk of having missed some register dependency during code analysis. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/4ED4CD690200007800064075@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | x86-64: Set siginfo and context on vsyscall emulation faultsAndy Lutomirski2011-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To make this work, we teach the page fault handler how to send signals on failed uaccess. This only works for user addresses (kernel addresses will never hit the page fault handler in the first place), so we need to generate signals for those separately. This gets the tricky case right: if the user buffer spans multiple pages and only the second page is invalid, we set cr2 and si_addr correctly. UML relies on this behavior to "fault in" pages as needed. We steal a bit from thread_info.uaccess_err to enable this. Before this change, uaccess_err was a 32-bit boolean value. This fixes issues with UML when vsyscall=emulate. Reported-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/4c8f91de7ec5cd2ef0f59521a04e1015f11e42b4.1320712291.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | Merge branch 'upstream/ticketlock-cleanup' of ↵Ingo Molnar2011-12-05
| |\ \ \ \ | | |_|/ / | |/| | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen into x86/asm
| | * | | x86: consolidate xchg and xadd macrosJeremy Fitzhardinge2011-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They both have a basic "put new value in location, return old value" pattern, so they can use the same macro easily. Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
| | * | | x86/cmpxchg: add a locked add() helperJeremy Fitzhardinge2011-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mostly to remove some conditional code in spinlock.h. Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
* | | | | Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds2012-01-06
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Skip cpus with apic-ids >= 255 in !x2apic_mode x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present x86, apic: Add probe() for apic_flat x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86' x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat x86: Add per-cpu stat counter for APIC ICR read tries pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86 x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support x86: Add NumaChip support x86: Add x86_init platform override to fix up NUMA core numbering x86: Make flat_init_apic_ldr() available
| * | | | | x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOSYinghai Lu2011-12-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently "nox2apic" boot parameter was not enabling x2apic mode if the cpu, kernel are all capable of enabling x2apic mode and the OS handover happened in xapic mode. However If the bios enabled x2apic prior to OS handover, using "nox2apic" boot parameter had no effect. If the boot cpu's apicid is < 255, enable "nox2apic" boot parameter to disable the x2apic mode setup by the bios. This will enable the kernel to fallback to xapic mode and bringup only the cpu's which has apic-id < 255. -v2: fix patch error and two compiling warning make disable_x2apic to be __init Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/CAE9FiQUeB-3uxJAMiHsz=uPWoFv5Hg1pVepz7aU6YtqOxMC-=Q@mail.gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remappingYinghai Lu2011-12-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On some of the recent Intel SNB platforms, by default bios is pre-enabling x2apic mode in the cpu with out setting up interrupt-remapping. This case was resulting in the kernel to panic as the cpu is already in x2apic mode but the OS was not able to enable interrupt-remapping (which is a pre-req for using x2apic capability). On these platforms all the apic-ids are < 255 and the kernel can fallback to xapic mode if the bios has not enabled interrupt-remapping (which is mostly the case if the bios has not exported interrupt-remapping tables to the OS). Reported-by: Berck E. Nash <flyboy@gmail.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20111222014632.600418637@sbsiddha-desk.sc.intel.com Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'Kevin Winchester2011-12-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several fields in struct cpuinfo_x86 were not defined for the !SMP case, likely to save space. However, those fields still have some meaning for UP, and keeping them allows some #ifdef removal from other files. The additional size of the UP kernel from this change is not significant enough to worry about keeping up the distinction: text data bss dec hex filename 4737168 506459 972040 6215667 5ed7f3 vmlinux.o.before 4737444 506459 972040 6215943 5ed907 vmlinux.o.after for a difference of 276 bytes for an example UP config. If someone wants those 276 bytes back badly then it should be implemented in a cleaner way. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Steffen Persvold <sp@numascale.com> Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86: Convert per-cpu counter icr_read_retry_count into a member of irq_statFernando Luis Vazquez Cao2011-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LAPIC related statistics are grouped inside the per-cpu structure irq_stat, so there is no need for icr_read_retry_count to be a standalone per-cpu variable. This patch moves icr_read_retry_count to where it belongs. Suggested-y: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Cc: Jörn Engel <joern@logfs.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86: Add per-cpu stat counter for APIC ICR read triesFernando Luis Vázquez Cao2011-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the IPI delivery slow path (NMI delivery) we retry the ICR read to check for delivery completion a limited number of times. [ The reason for the limited retries is that some of the places where it is used (cpu boot, kdump, etc) IPI delivery might not succeed (due to a firmware bug or system crash, for example) and in such a case it is better to give up and resume execution of other code. ] This patch adds a new entry to /proc/interrupts, RTR, which tells user space the number of times we retried the ICR read in the IPI delivery slow path. This should give some insight into how well the APIC message delivery hardware is working - if the counts are way too large then we are hitting a (very-) slow path way too often. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Cc: Jörn Engel <joern@logfs.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/n/tip-vzsp20lo2xdzh5f70g0eis2s@git.kernel.org [ extended the changelog ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86: Add NumaChip supportSteffen Persvold2011-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds support for Numascale NumaChip large-SMP systems. It is needed to enable the booting of more than ~168 cores. v2: - [Steffen] enumerate only accessible northbridges - [Daniel] rediffed and validated against 3.1-rc10 v3: - [Daniel] use x86_init core numbering override - [Daniel] cleanups as per feedback v4: - [Daniel] use updated x86_cpuinit override v5: - drop disabling interrupts locally, as ISR write is atomic; drop delay - added read-mostly annotations where appropriate - require CONFIG_SMP, so drop conditional path Workload tested on 96 cores/16 sockets. Signed-off-by: Steffen Persvold <sp@numascale.com> Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Link: http://lkml.kernel.org/r/1323101246-2400-1-git-send-email-daniel@numascale-asia.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86: Add x86_init platform override to fix up NUMA core numberingDaniel J Blueman2011-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an x86_init vector for handling inconsistent core numbering. This is useful for multi-fabric platforms, such as Numascale NumaConnect. v2: - use struct x86_cpuinit_ops - provide default fall-back function to warn Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com> Cc: Steffen Persvold <sp@numascale.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Link: http://lkml.kernel.org/r/1323073238-32686-2-git-send-email-daniel@numascale-asia.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86: Make flat_init_apic_ldr() availableDaniel J Blueman2011-12-05
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow flat_init_apic_ldr() to be used outside the compilation unit for similar APIC implementations. Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com> Cc: Steffen Persvold <sp@numascale.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Link: http://lkml.kernel.org/r/1323073238-32686-1-git-send-email-daniel@numascale-asia.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | Merge branch 'timers-core-for-linus' of ↵Linus Torvalds2012-01-06
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, tsc: Skip TSC synchronization checks for tsc=reliable clocksource: Convert tcb_clksrc to use clocksource_register_hz/khz clocksource: cris: Convert to clocksource_register_khz clocksource: xtensa: Convert to clocksource_register_hz/khz clocksource: um: Convert to clocksource_register_hz/khz clocksource: parisc: Convert to clocksource_register_hz/khz clocksource: m86k: Convert to clocksource_register_hz/khz time: x86: Replace LATCH with PIT_LATCH in i8253 clocksource driver time: x86: Remove CLOCK_TICK_RATE from acpi_pm clocksource driver time: x86: Remove CLOCK_TICK_RATE from mach_timer.h time: x86: Remove CLOCK_TICK_RATE from tsc code time: Fix spelling mistakes in new comments time: fix bogus comment in timekeeping_get_ns_raw
| * \ \ \ \ Merge branch 'fortglx/3.3/tip/timers/core' of ↵Thomas Gleixner2011-12-05
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | git://git.linaro.org/people/jstultz/linux into timers/core
| | * | | | | time: x86: Remove CLOCK_TICK_RATE from mach_timer.hDeepak Saxena2011-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CLOCK_TICK_RATE is defined as PIT_TICK_RATE on x86 so we update mach_timers.h to just use the later as we want to depecrate CLOCK_TICK_RATE. Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org>
| * | | | | | x86, tsc: Skip TSC synchronization checks for tsc=reliableSuresh Siddha2011-12-05
| | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tsc=reliable boot parameter is supposed to skip all the TSC stablility checks during boot time. On a 8-socket system where we want to run an experiment with the "tsc=reliable" boot option, TSC synchronization checks are not getting skipped and marking the TSC as not stable. Check for tsc_clocksource_reliable (which is set via tsc=reliable or for platforms supporting synthetic TSC_RELIABLE feature bit etc) and when set, skip the TSC synchronization tests during boot. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: John Stultz <johnstul@us.ibm.com> Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1320446537.15071.14.camel@sbsiddha-desk.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2012-01-06
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) sched/tracing: Add a new tracepoint for sleeptime sched: Disable scheduler warnings during oopses sched: Fix cgroup movement of waking process sched: Fix cgroup movement of newly created process sched: Fix cgroup movement of forking process sched: Remove cfs bandwidth period check in tg_set_cfs_period() sched: Fix load-balance lock-breaking sched: Replace all_pinned with a generic flags field sched: Only queue remote wakeups when crossing cache boundaries sched: Add missing rcu_dereference() around ->real_parent usage [S390] fix cputime overflow in uptime_proc_show [S390] cputime: add sparse checking and cleanup sched: Mark parent and real_parent as __rcu sched, nohz: Fix missing RCU read lock sched, nohz: Set the NOHZ_BALANCE_KICK flag for idle load balancer sched, nohz: Fix the idle cpu check in nohz_idle_balance sched: Use jump_labels for sched_feat sched/accounting: Fix parameter passing in task_group_account_field sched/accounting: Fix user/system tick double accounting sched/accounting: Re-use scheduler statistics for the root cgroup ... Fix up conflicts in - arch/ia64/include/asm/cputime.h, include/asm-generic/cputime.h usecs_to_cputime64() vs the sparse cleanups - kernel/sched/fair.c, kernel/time/tick-sched.c scheduler changes in multiple branches
| * \ \ \ \ \ Merge branch 'sched/core' of ↵Martin Schwidefsky2011-12-19
| |\ \ \ \ \ \ | | |_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into cputime-tip Conflicts: drivers/cpufreq/cpufreq_conservative.c drivers/cpufreq/cpufreq_ondemand.c drivers/macintosh/rack-meter.c fs/proc/stat.c fs/proc/uptime.c kernel/sched/core.c
| | * | | | | Merge commit 'v3.2-rc5' into sched/coreIngo Molnar2011-12-15
| | |\ \ \ \ \ | | | | |_|_|/ | | | |/| | | | | | | | | | | | | | | | | | | | | | | | Merge reason: Pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | sched/accounting: Change cpustat fields to an arrayGlauber Costa2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes fields in cpustat from a structure, to an u64 array. Math gets easier, and the code is more flexible. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Tuner <pjt@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1322498719-2255-2-git-send-email-glommer@parallels.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2012-01-06
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (106 commits) perf kvm: Fix copy & paste error in description perf script: Kill script_spec__delete perf top: Fix a memory leak perf stat: Introduce get_ratio_color() helper perf session: Remove impossible condition check perf tools: Fix feature-bits rework fallout, remove unused variable perf script: Add generic perl handler to process events perf tools: Use for_each_set_bit() to iterate over feature flags perf tools: Unify handling of features when writing feature section perf report: Accept fifos as input file perf tools: Moving code in some files perf tools: Fix out-of-bound access to struct perf_session perf tools: Continue processing header on unknown features perf tools: Improve macros for struct feature_ops perf: builtin-record: Document and check that mmap_pages must be a power of two. perf: builtin-record: Provide advice if mmap'ing fails with EPERM. perf tools: Fix truncated annotation perf script: look up thread using tid instead of pid perf tools: Look up thread names for system wide profiling perf tools: Fix comm for processes with named threads ...
| * | | | | | | perf events: Enable raw event support for Intel unhalted_reference_cycles eventStephane Eranian2011-12-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the encoding and definitions necessary for the unhalted_reference_cycles event avaialble since Intel Core 2 processors. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1323559734-3488-2-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | Merge commit 'v3.2-rc6' into perf/coreIngo Molnar2011-12-20
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: Update with the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | perf, x86: Expose perf capability to other modulesGleb Natapov2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM needs to know perf capability to decide which PMU it can expose to a guest. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1320929850-10480-8-git-send-email-gleb@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | x86, perf: Disable non available architectural eventsGleb Natapov2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel CPUs report non-available architectural events in cpuid leaf 0AH.EBX. Use it to disable events that are not available according to CPU. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1320929850-10480-7-git-send-email-gleb@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | x86, perf: Add a build-time sanity test to the x86 decoderMasami Hiramatsu2011-11-10
| | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a sanity test of x86 insn decoder against a stream of randomly generated input, at build time. This test is also able to reproduce any bug that might trigger by allowing the passing of random-seed and iteration-number to the test, or by passing input which has invalid byte code. Changes in V2: - Code cleanup. - Show how to reproduce the error by insn_sanity test. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: acme@redhat.com Cc: ming.m.lin@intel.com Cc: robert.richter@amd.com Cc: ravitillo@lbl.gov Cc: yrl.pp-manager.tt@hitachi.com Cc: Andi Kleen <andi@firstfloor.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20111020140109.20938.92572.stgit@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | Merge branch 'memblock-kill-early_node_map' of ↵Ingo Molnar2011-12-20
|\ \ \ \ \ \ \ | |_|/ / / / / |/| | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/memblock
| * | | | | | Merge branch 'master' into x86/memblockTejun Heo2011-11-28
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts & resolutions: * arch/x86/xen/setup.c dc91c728fd "xen: allow extra memory to be in multiple regions" 24aa07882b "memblock, x86: Replace memblock_x86_reserve/free..." conflicted on xen_add_extra_mem() updates. The resolution is trivial as the latter just want to replace memblock_x86_reserve_range() with memblock_reserve(). * drivers/pci/intel-iommu.c 166e9278a3f "x86/ia64: intel-iommu: move to drivers/iommu/" 5dfe8660a3d "bootmem: Replace work_with_active_regions() with..." conflicted as the former moved the file under drivers/iommu/. Resolved by applying the chnages from the latter on the moved file. * mm/Kconfig 6661672053a "memblock: add NO_BOOTMEM config symbol" c378ddd53f9 "memblock, x86: Make ARCH_DISCARD_MEMBLOCK a config option" conflicted trivially. Both added config options. Just letting both add their own options resolves the conflict. * mm/memblock.c d1f0ece6cdc "mm/memblock.c: small function definition fixes" ed7b56a799c "memblock: Remove memblock_memory_can_coalesce()" confliected. The former updates function removed by the latter. Resolution is trivial. Signed-off-by: Tejun Heo <tj@kernel.org>