aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel
Commit message (Collapse)AuthorAge
* Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2010-08-22
|\ | | | | | | | | | | | | * 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PIT: free irq source id in handling error path KVM: destroy workqueue on kvm_create_pit() failures KVM: fix poison overwritten caused by using wrong xstate size
| * KVM: fix poison overwritten caused by using wrong xstate sizeXiaotian Feng2010-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fpu.state is allocated from task_xstate_cachep, the size of task_xstate_cachep is xstate_size. xstate_size is set from cpuid instruction, which is often smaller than sizeof(struct xsave_struct). kvm is using sizeof(struct xsave_struct) to fill in/out fpu.state.xsave, as what we allocated for fpu.state is xstate_size, kernel will write out of memory and caused poison/redzone/padding overwritten warnings. Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Reviewed-by: Sheng Yang <sheng@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Avi Kivity <avi@redhat.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Sheng Yang <sheng@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2010-08-20
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, apic: Fix apic=debug boot crash x86, hotplug: Serialize CPU hotplug to avoid bringup concurrency issues x86-32: Fix dummy trampoline-related inline stubs x86-32: Separate 1:1 pagetables from swapper_pg_dir x86, cpu: Fix regression in AMD errata checking code
| * | x86, apic: Fix apic=debug boot crashDaniel Kiper2010-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a boot crash when apic=debug is used and the APIC is not properly initialized. This issue appears during Xen Dom0 kernel boot but the fix is generic and the crash could occur on real hardware as well. Signed-off-by: Daniel Kiper <dkiper@net-space.pl> Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: jeremy@goop.org Cc: <stable@kernel.org> # .35.x, .34.x, .33.x, .32.x LKML-Reference: <20100819224616.GB9967@router-fw-old.local.net-space.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86, hotplug: Serialize CPU hotplug to avoid bringup concurrency issuesBorislav Petkov2010-08-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When testing cpu hotplug code on 32-bit we kept hitting the "CPU%d: Stuck ??" message due to multiple cores concurrently accessing the cpu_callin_mask, among others. Since these codepaths are not protected from concurrent access due to the fact that there's no sane reason for making an already complex code unnecessarily more complex - we hit the issue only when insanely switching cores off- and online - serialize hotplugging cores on the sysfs level and be done with it. [ v2.1: fix !HOTPLUG_CPU build ] Cc: <stable@kernel.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100819181029.GC17171@aftab> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | x86-32: Separate 1:1 pagetables from swapper_pg_dirJoerg Roedel2010-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes machine crashes which occur when heavily exercising the CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by AMD Erratum 383 and result in a fatal machine check exception. Here's the scenario: 1. On 32-bit, the swapper_pg_dir page table is used as the initial page table for booting a secondary CPU. 2. To make this work, swapper_pg_dir needs a direct mapping of physical memory in it (the low mappings). By adding those low, large page (2M) mappings (PAE kernel), we create the necessary conditions for Erratum 383 to occur. 3. Other CPUs which do not participate in the off- and onlining game may use swapper_pg_dir while the low mappings are present (when leave_mm is called). For all steps below, the CPU referred to is a CPU that is using swapper_pg_dir, and not the CPU which is being onlined. 4. The presence of the low mappings in swapper_pg_dir can result in TLB entries for addresses below __PAGE_OFFSET to be established speculatively. These TLB entries are marked global and large. 5. When the CPU with such TLB entry switches to another page table, this TLB entry remains because it is global. 6. The process then generates an access to an address covered by the above TLB entry but there is a permission mismatch - the TLB entry covers a large global page not accessible to userspace. 7. Due to this permission mismatch a new 4kb, user TLB entry gets established. Further, Erratum 383 provides for a small window of time where both TLB entries are present. This results in an uncorrectable machine check exception signalling a TLB multimatch which panics the machine. There are two ways to fix this issue: 1. Always do a global TLB flush when a new cr3 is loaded and the old page table was swapper_pg_dir. I consider this a hack hard to understand and with performance implications 2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit does. This patch implements solution 2. It introduces a trampoline_pg_dir which has the same layout as swapper_pg_dir with low_mappings. This page table is used as the initial page table of the booting CPU. Later in the bringup process, it switches to swapper_pg_dir and does a global TLB flush. This fixes the crashes in our test cases. -v2: switch to swapper_pg_dir right after entering start_secondary() so that we are able to access percpu data which might not be mapped in the trampoline page table. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> LKML-Reference: <20100816123833.GB28147@aftab> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | x86, cpu: Fix regression in AMD errata checking codeHans Rosenfeld2010-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A bug in the family-model-stepping matching code caused the presence of errata to go undetected when OSVW was not used. This causes hangs on some K8 systems because the E400 workaround is not enabled. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> LKML-Reference: <1282141190-930137-1-git-send-email-hans.rosenfeld@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | | Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2010-08-19
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: kprobes/x86: Fix the return address of multiple kretprobes perf tools: Fix build error on read only source. perf, x86: Fix Intel-nhm PMU programming errata workaround
| * | | kprobes/x86: Fix the return address of multiple kretprobesKUMANO Syuhei2010-08-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the return address of subsequent kretprobes when multiple kretprobes are set on the same function. For example: # cd /sys/kernel/debug/tracing # echo "r:event1 sys_symlink" > kprobe_events # echo "r:event2 sys_symlink" >> kprobe_events # echo 1 > events/kprobes/enable # ln -s /tmp/foo /tmp/bar (without this patch) # cat trace ln-897 [000] 20404.133727: event1: (kretprobe_trampoline+0x0/0x4c <- sys_symlink) ln-897 [000] 20404.133747: event2: (system_call_fastpath+0x16/0x1b <- sys_symlink) (with this patch) # cat trace ln-740 [000] 13799.491076: event1: (system_call_fastpath+0x16/0x1b <- sys_symlink) ln-740 [000] 13799.491096: event2: (system_call_fastpath+0x16/0x1b <- sys_symlink) Signed-off-by: KUMANO Syuhei <kumano.prog@gmail.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> LKML-Reference: <1281853084.3254.11.camel@camp10-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | perf, x86: Fix Intel-nhm PMU programming errata workaroundZhang, Yanmin2010-08-18
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the Errata AAK100/AAP53/BD53 workaround, the officialy documented workaround we implemented in: 11164cd: perf, x86: Add Nehelem PMU programming errata workaround doesn't actually work fully and causes a stuck PMU state under load and non-functioning perf profiling. A functional workaround was found by trial & error. Affects all Nehalem-class Intel PMUs. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1281073148.2125.63.camel@ymzhang.sh.intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <stable@kernel.org> # .35.x Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'for_linus' of ↵Linus Torvalds2010-08-17
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: vt,console,kdb: preserve console_blanked while in kdb vt: fix regression warnings from KMS merge arm,kgdb: fix GDB_MAX_REGS no longer used kgdb: add missing __percpu markup in arch/x86/kernel/kgdb.c kdb: fix compile error without CONFIG_KALLSYMS
| * | | kgdb: add missing __percpu markup in arch/x86/kernel/kgdb.cNamhyung Kim2010-08-16
| |/ / | | | | | | | | | | | | | | | | | | | | | breakinfo->pev is a pointer to percpu pointer but was missing __percpu markup. Add it. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
* / / Make do_execve() take a const filename pointerDavid Howells2010-08-17
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make do_execve() take a const filename pointer so that kernel_execve() compiles correctly on ARM: arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type This also requires the argv and envp arguments to be consted twice, once for the pointer array and once for the strings the array points to. This is because do_execve() passes a pointer to the filename (now const) to copy_strings_kernel(). A simpler alternative would be to cast the filename pointer in do_execve() when it's passed to copy_strings_kernel(). do_execve() may not change any of the strings it is passed as part of the argv or envp lists as they are some of them in .rodata, so marking these strings as const should be fine. Further kernel_execve() and sys_execve() need to be changed to match. This has been test built on x86_64, frv, arm and mips. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'linus' into releaseLen Brown2010-08-15
|\ \ | | | | | | | | | | | | | | | | | | Conflicts: drivers/acpi/debug.c Signed-off-by: Len Brown <len.brown@intel.com>
| * \ Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds2010-08-13
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, UV: Make kdump avoid stack dumps - fix !CONFIG_KEXEC breakage x86, UV: Initialize BAU hub map x86, UV: Make kdump avoid stack dumps
| | * | x86, UV: Initialize BAU hub mapCliff Wickman2010-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix uninitialized uvhub_mask: - An unitialized bit map variable was causing initialization of non-existant hubs (this one causes boot panics). - And the bit map was too small for large machines. This patch makes it dynamic in size. - Fix the case where socket 0 has no enabled cpu's. Don't assume every hub has a socket 0. - uv_init_per_cpu() should be __init. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: <stable@kernel.org> # for .35.x LKML-Reference: <E1Oeuyt-0004XS-0y@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | x86, UV: Make kdump avoid stack dumpsCliff Wickman2010-07-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | UV NMI callback's should not write stack dumps when a kdump is to be written. When invoking the crash kernel to write a dump, kdump_nmi_shootdown_cpus() uses NMI's to get all the cpu's to save their register context and halt. But the NMI interrupt handler runs a callback list. This patch sets a flag to prevent any of those callbacks from interfering with the halt of the cpu. For UV, which currently has the only callback to which this is relevant, the uv_handle_nmi() callback should not do dumping of stacks. The 'in_crash_kexec' flag is defined as an extern in kdebug.h firstly because x2apic_uv_x.c includes it. Secondly because some future callback might need the flag to know that it should not enter the debugger. (Such a scenario was in fact present in the 2.6.32 kernel, SuSE distribution, where a call to kdb needed to be avoided.) Signed-off-by: Cliff Wickman <cpw@sgi.com> LKML-Reference: <E1ObLvt-0005UZ-Va@eag09.americas.sgi.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | Merge branch 'fixes' of ↵Linus Torvalds2010-08-13
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] acpi-cpufreq: add missing __percpu markup
| | * | | [CPUFREQ] acpi-cpufreq: add missing __percpu markupNamhyung Kim2010-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | acpi_perf_data is a percpu pointer but was missing __percpu markup. Add it. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Dave Jones <davej@redhat.com>
| * | | | Mark arguments to certain syscalls as being constDavid Howells2010-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark arguments to certain system calls as being const where they should be but aren't. The list includes: (*) The filename arguments of various stat syscalls, execve(), various utimes syscalls and some mount syscalls. (*) The filename arguments of some syscall helpers relating to the above. (*) The buffer argument of various write syscalls. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2010-08-13
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) perf: Add back list_head data types perf ui hist browser: Fixup key bindings perf ui browser: Add ui_browser__show counterpart: __hide perf annotate: Cycle thru sorted lines with samples perf ui: Make SPACE work as PGDN in all browsers perf annotate: Sort by hottest lines in the TUI perf ui: Complete the breakdown of util/newt.c perf ui: Move hists browser to util/ui/browsers/ perf symbols: Ignore mapping symbols on ARM perf ui: Move map browser to util/ui/browsers/ perf ui: Move annotate browser to util/ui/browsers/ perf ui: Move ui_progress routines to separate file in util/ui/ perf ui: Move ui_helpline routines to separate file in util/ui/ perf ui: Shorten ui_browser member names perf, x86: P4 PMU -- update nmi irq statistics and unmask lvt entry properly perf ui: Start breaking down newt.c into multiple files perf tui: Introduce list_head based generic ui_browser refresh routine perf probe: Fix memory leaks in add_perf_probe_events perf probe: Fix to copy the type for raw parameters perf report: Speed up exit path ...
| | * | | | Merge branch 'linus' into perf/urgentIngo Molnar2010-08-12
| | |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: Fix upstream breakage introduced by: de5d9bf: Move list types from <linux/list.h> to <linux/types.h>. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | perf, x86: P4 PMU -- update nmi irq statistics and unmask lvt entry properlyCyrill Gorcunov2010-08-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case if last active performance counter is not overflowed at moment of NMI being triggered by another counter, the irq statistics may miss an update stage. As a more serious consequence -- apic quirk may not be triggered so apic lvt entry stay masked. Tested-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100805150917.GA6311@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | Merge branch 'x86-uv-for-linus' of ↵Linus Torvalds2010-08-13
| |\ \ \ \ \ | | | |_|/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, UV: Initialize BAU MMRs only on hubs with cpus x86, UV: Modularize BAU send and wait x86, UV: BAU broadcast to the local hub x86, UV: Correct BAU regular message type x86, UV: Remove BAU check for stay-busy x86, UV: Correct BAU discovery of hubs and sockets x86, UV: Correct BAU software acknowledge x86, UV: BAU structure rearranging x86, UV: Shorten access to BAU statistics structure x86, UV: Disable BAU on network congestion x86, UV: BAU tunables into a debugfs file x86, UV: Calculate BAU destination timeout
| | * | | | x86, UV: Initialize BAU MMRs only on hubs with cpusCliff Wickman2010-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the initialization of MMRs UVH_LB_BAU_SB_ACTIVATION_CONTROL and UVH_BAU_DATA_BROADCAST on UV hubs that have no active cpus. Such initialization on hubs with no active cpus would result in a kernel page fault. This is not of real high priority, because we don't have any such systems (with UV hubs that have no active cpus). But they will be coming. Signed-off-by: Cliff Wickman <cpw@sgi.com> LKML-Reference: <E1OZmZN-0006cW-RC@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Modularize BAU send and waitCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Streamline the large uv_flush_send_and_wait() function by use of a couple of helper functions. And remove some excess comments. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004ay-IH@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: BAU broadcast to the local hubCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the Broadcast Assist Unit driver use the BAU for TLB shootdowns of cpu's on the local uvhub. It was previously thought that IPI might be faster to the cpu's on the local hub. But the IPI operation would have to follow the completion of the BAU broadcast anyway. So we broadcast to the local uvhub in all cases except when the current cpu was the only local cpu in the mask. This simplifies uv_flush_send_and_wait() in that it returns either all shootdowns complete, or none. Adjust the statistics to account for shootdowns on the local uvhub. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004aq-G7@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Correct BAU regular message typeCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Broadcast Assist Unit messages have a regular or retry message type. The regular type was not being set, but needs to be, because the lack of a message type is sometimes used to identify an unused entry in the message queue. Also removing some excess comments. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004ak-Dy@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Remove BAU check for stay-busyCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove a faulty assumption that a long running BAU request has encountered a hardware problem and will never finish. Numalink congestion can make a request appear to have encountered such a problem, but it is not safe to cancel the request. If such a cancel is done but a reply is later received we can miss a TLB shootdown. We depend upon the max_bau_concurrent 'throttle' to prevent the stay-busy case from happening. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004ad-BV@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Correct BAU discovery of hubs and socketsCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct the initialization-time assumption of contigous blade numbers and of sockets numbered from zero. There may be hubs present with no cpu's enabled. There may be disabled sockets such that the active socket is not number zero. And assign a 'socket master' by assuming that a socket is a node. (it is not safe to extract socket number from an apicid) Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004aW-9S@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Correct BAU software acknowledgeCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct the acknowledgment and the reset of a BAU software-acknowledged message. A retry message should be testing only for timed-out resources (mask << 8). (And we delete a log message that might cause unnecessary concern) The acknowledge MMR is |--timed-out--|---pending--|, each is 8 bits. The IPI-driven reset of software acknowledge resources frees both timed out and pending resources. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004aP-7O@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: BAU structure rearrangingCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move some structure definitions from the C code to the BAU header file, and change the organization of that header file a little. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004aI-54@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Shorten access to BAU statistics structureCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use a pointer from the per-cpu BAU control structure to the per-cpu BAU statistics structure. We nearly always know the first before needing the second. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004aB-2k@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Disable BAU on network congestionCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The numalink network can become so congested that TLB shootdown using the Broadcast Assist Unit becomes slower than using IPI's. In that case, disable the use of the BAU for a period of time. The period is tunable. When the period expires the use of the BAU is re-enabled. A count of these actions is added to the statistics file. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNy-0004a4-0a@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: BAU tunables into a debugfs fileCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the Broadcast Assist Unit driver's nine tuning values variable by making them accessible through a read/write debugfs file. The file will normally be mounted as /sys/kernel/debug/sgi_uv/bau_tunables. The tunables are kept in each cpu's per-cpu BAU structure. The patch also does a little name improvement, and corrects the reset of two destination timeout counters. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNx-0004Zx-Uo@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Calculate BAU destination timeoutCliff Wickman2010-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calculate the Broadcast Assist Unit's destination timeout period from the values in the relevant MMR's. Store it in each cpu's per-cpu BAU structure so that a destination timeout can be differentiated from a 'plugged' situation in which all software ack resources are already allocated and a timeout is pending. That case returns an immediate destination error. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: gregkh@suse.de LKML-Reference: <E1OJvNx-0004Zq-RK@eag09.americas.sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | Merge branch 'x86/urgent' of ↵Linus Torvalds2010-08-13
| |\ \ \ \ \ | | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, asm: Use a lower case name for the end macro in atomic64_386_32.S x86, asm: Refactor atomic64_386_32.S to support old binutils and be cleaner x86: Document __phys_reloc_hide() usage in __pa_symbol() x86, apic: Map the local apic when parsing the MP table.
| | * | | | x86, apic: Map the local apic when parsing the MP table.Eric W. Biederman2010-08-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a regression in 2.6.35 from 2.6.34, that is present for select models of Intel cpus when people are using an MP table. The commit cf7500c0ea133d66f8449d86392d83f840102632 "x86, ioapic: In mpparse use mp_register_ioapic" started calling mp_register_ioapic from MP_ioapic_info. An extremely simple change that was obviously correct. Unfortunately mp_register_ioapic did just a little more than the previous hand crafted code and so we gained this call path. The problem call path is: MP_ioapic_info() mp_register_ioapic() io_apic_unique_id() io_apic_get_unique_id() get_physical_broadcast() modern_apic() lapic_get_version() apic_read(APIC_LVR) Which turned out to be a problem because the local apic was not mapped, at that point, unlike the similar point in the ACPI parsing code. This problem is fixed by mapping the local apic when parsing the mptable as soon as we reasonably can. Looking at the number of places we setup the fixmap for the local apic, I see some serious simplification opportunities. For the moment except for not duplicating the setting up of the fixmap in init_apic_mappings, I have not acted on them. The regression from 2.6.34 is tracked in bug https://bugzilla.kernel.org/show_bug.cgi?id=16173 Cc: <stable@kernel.org> 2.6.35 Reported-by: David Hill <hilld@binarystorm.net> Reported-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com> Tested-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <m1eiee86jg.fsf_-_@fess.ebiederm.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | Merge branch 'fixes' of ↵Linus Torvalds2010-08-12
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] add missing __percpu markup in pcc-cpufreq.c
| | * | | | | [CPUFREQ] add missing __percpu markup in pcc-cpufreq.cNamhyung Kim2010-08-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pcc_cpu_info is a percpu pointer but was missing __percpu markup. Add it. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Dave Jones <davej@redhat.com>
| * | | | | | x86/hpet: Use the FSEC_PER_SEC constant for femto-second periodsChris Wilson2010-08-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current computation, introduced with f12a15be63, of FSEC_PER_SEC using the multiplication of (FSEC_PER_NSEC * NSEC_PER_SEC) is performed only with 32bit integers on small machines, resulting in an overflow and a *very* short intervals being programmed. An interrupt storm follows. Note that we also have to specify FSEC_PER_SEC as being long long to overcome the same limitations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | Merge branch 'stable/xen-swiotlb-0.8.6' of ↵Linus Torvalds2010-08-12
| |\ \ \ \ \ \ | | |/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/xen-swiotlb-0.8.6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: x86: Detect whether we should use Xen SWIOTLB. pci-swiotlb-xen: Add glue code to setup dma_ops utilizing xen_swiotlb_* functions. swiotlb-xen: SWIOTLB library for Xen PV guest with PCI passthrough. xen/mmu: inhibit vmap aliases rather than trying to clear them out vmap: add flag to allow lazy unmap to be disabled at runtime xen: Add xen_create_contiguous_region xen: Rename the balloon lock xen: Allow unprivileged Xen domains to create iomap pages xen: use _PAGE_IOMAP in ioremap to do machine mappings Fix up trivial conflicts (adding both xen swiotlb and xen pci platform driver setup close to each other) in drivers/xen/{Kconfig,Makefile} and include/xen/xen-ops.h
| | * | | | | x86: Detect whether we should use Xen SWIOTLB.Konrad Rzeszutek Wilk2010-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is paramount that we call pci_xen_swiotlb_detect before pci_swiotlb_detect as both implementations use the 'swiotlb' and 'swiotlb_force' flags. The pci-xen_swiotlb_detect inhibits the swiotlb_force and swiotlb flag so that the native SWIOTLB implementation is not enabled when running under Xen. [since v1 changed two Cc's to Acked-by] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [http://lkml.org/lkml/2010/7/27/374] Cc: Albert Herranz <albert_herranz@yahoo.es> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: "H. Peter Anvin" <hpa@zytor.com> [conditional http://lkml.org/lkml/2010/8/2/324] Cc: x86@kernel.org Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | | | | | x86: fix up system call numbering nitLinus Torvalds2010-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As pointed out by Jiri Slaby: when I resolved the the 32-bit x85 system call entry tables for prlimit (due to the conflict with fanotify), I forgot to add the numbering in comments that we do for every fifth entry. Reported-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | Merge branch 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linuxLinus Torvalds2010-08-10
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux: unistd: add __NR_prlimit64 syscall numbers rlimits: implement prlimit64 syscall rlimits: switch more rlimit syscalls to do_prlimit rlimits: redo do_setrlimit to more generic do_prlimit rlimits: add rlimit64 structure rlimits: do security check under task_lock rlimits: allow setrlimit to non-current tasks rlimits: split sys_setrlimit rlimits: selinux, do rlimits changes under task_lock rlimits: make sure ->rlim_max never grows in sys_setrlimit rlimits: add task_struct to update_rlimit_cpu rlimits: security, add task_struct to setrlimit Fix up various system call number conflicts. We not only added fanotify system calls in the meantime, but asm-generic/unistd.h added a wait4 along with a range of reserved per-architecture system calls.
| | * | | | | | unistd: add __NR_prlimit64 syscall numbersJiri Slaby2010-07-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add __NR_prlimit64 syscall numbers to asm-generic. Add them also to asm-x86, both 32 and 64-bit. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com>
| * | | | | | | Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds2010-08-10
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits) fanotify: use both marks when possible fsnotify: pass both the vfsmount mark and inode mark fsnotify: walk the inode and vfsmount lists simultaneously fsnotify: rework ignored mark flushing fsnotify: remove global fsnotify groups lists fsnotify: remove group->mask fsnotify: remove the global masks fsnotify: cleanup should_send_event fanotify: use the mark in handler functions audit: use the mark in handler functions dnotify: use the mark in handler functions inotify: use the mark in handler functions fsnotify: send fsnotify_mark to groups in event handling functions fsnotify: Exchange list heads instead of moving elements fsnotify: srcu to protect read side of inode and vfsmount locks fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called fsnotify: use _rcu functions for mark list traversal fsnotify: place marks on object in order of group memory address vfs/fsnotify: fsnotify_close can delay the final work in fput fsnotify: store struct file not struct path ... Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.
| | * | | | | | | fanotify: sys_fanotify_mark declartionEric Paris2010-07-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch simply declares the new sys_fanotify_mark syscall int fanotify_mark(int fanotify_fd, unsigned int flags, u64_mask, int dfd const char *pathname) Signed-off-by: Eric Paris <eparis@redhat.com>
| | * | | | | | | fanotify: fanotify_init syscall declarationEric Paris2010-07-28
| | | |_|/ / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch defines a new syscall fanotify_init() of the form: int sys_fanotify_init(unsigned int flags, unsigned int event_f_flags, unsigned int priority) This syscall is used to create and fanotify group. This is very similar to the inotify_init() syscall. Signed-off-by: Eric Paris <eparis@redhat.com>
| * | | | | | | x86, ia64, smp: use workqueues unconditionally during do_boot_cpu()Suresh Siddha2010-08-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Workqueues are now initialized as part of the early_initcall(). So they are available for use during cold boot process aswell. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>