aboutsummaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAge
* Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6Linus Torvalds2007-05-05
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (231 commits) [PATCH] i386: Don't delete cpu_devs data to identify different x86 types in late_initcall [PATCH] i386: type may be unused [PATCH] i386: Some additional chipset register values validation. [PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split. [PATCH] x86-64: Don't exclude asm-offsets.c in Documentation/dontdiff [PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu [PATCH] i386: white space fixes in i387.h [PATCH] i386: Drop noisy e820 debugging printks [PATCH] x86-64: Fix allnoconfig error in genapic_flat.c [PATCH] x86-64: Shut up warnings for vfat compat ioctls on other file systems [PATCH] x86-64: Share identical video.S between i386 and x86-64 [PATCH] x86-64: Remove CONFIG_REORDER [PATCH] x86-64: Print type and size correctly for unknown compat ioctls [PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0) [PATCH] i386: Little cleanups in smpboot.c [PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning [PATCH] x86: Use RDTSCP for synchronous get_cycles if possible [PATCH] i386: Add X86_FEATURE_RDTSCP [PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386 [PATCH] i386: Implement alternative_io for i386 ... Fix up trivial conflict in include/linux/highmem.h manually. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * [PATCH] i386: Don't delete cpu_devs data to identify different x86 types in ↵Thomas Renninger2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | late_initcall In arch/i386/cpu/common.c there is: cpu_devs[X86_VENDOR_INTEL] cpu_devs[X86_VENDOR_CYRIX] cpu_devs[X86_VENDOR_AMD] ... They are all filled with data early. The data (struct) got set to NULL for all, but Intel in different late_initcall (exit_cpu_vendor) calls. I don't see what sense this makes at all, maybe something that got forgotten with the HOTPLUG_CPU extenstions? Please check/review whether initdata, cpuinitdata is still ok and this still works with HOTPLUG_CPU and without, it should... Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Andi Kleen <ak@suse.de> Cc: davej@redhat.com
| * [PATCH] i386: type may be unusedDavid Rientjes2007-05-02
| | | | | | | | | | | | | | | | | | In the case of !CONFIG_PCI_DIRECT && !CONFIG_PCI_MMCONFIG, type is unreferened. Cc: Andi Kleen <ak@suse.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Some additional chipset register values validation.Olivier Galibert2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On i945, a mmconfig range hitting the f0000000-ffffffff zone conflicts with the APIC registers and others. Consider it invalid. On E7520, values 0000 and f000 for the window register are defined invalid in the documentation. I haven't seen a bios use these values, but who trusts biosen these days? Signed-off-by: Olivier Galibert <galibert@pobox.com> Signed-off-by: Andi Kleen <ak@suse.de> arch/i386/pci/mmconfig-shared.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-)
| * [PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split.Bill Irwin2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only 1GB-aligned kernel/user splits are now handled for PAE. The 2GB/2GB split attempts to avoid aliasing vmallocspace with the 1:1 mapping for physical memory by using an actual split of 1.875/2.125 to accommodate 128MB of vmallocspace out of what would otherwise be a full 2GB for userspace. That attempt disturbs the alignment required by PAE for 2GB/2GB splits, and furthermore does not provide a 2GB/2GB split as advertised. This patch resolves the issues here in two manners. The first is by providing a true 2GB/2GB split in addition to the 1.875/2.125 split. The second is by renaming the 1.875/2.125 split to CONFIG_VMSPLIT_2G_OPT analogously to CONFIG_VMSPLIT_3G_OPT, which performs a similar manuever to avoid aliasing vmallocspace with the 1:1 mapping for physical memory around the 3GB boundary. With the 1.875/2.125 split properly-named, its config option is then tagged as depending on !HIGHMEM to express the PAE implementation's current inability to deal with such unaligned splits. This patch is essentially a combination of two patches, one written by Eric Biederman and the other by Eric Dumazet. If they could add their Signed-off-by: to this, I'd be much obliged. Signed-off-by: William Irwin <wli@holomorphy.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: Mark Lord <lkml@rtr.ca> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Drop noisy e820 debugging printksAndi Kleen2007-05-02
| | | | | | | | Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: Fix allnoconfig error in genapic_flat.cAndi Kleen2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix: In file included from include2/asm/apic.h:5, from include2/asm/smp.h:15, from linux/arch/x86_64/kernel/genapic_flat.c:18: linux/include/linux/pm.h: In function ‘call_platform_enable_wakeup’: linux/include/linux/pm.h:331: error: ‘EIO’ undeclared (first use in this function) linux/include/linux/pm.h:331: error: (Each undeclared identifier is reported only once linux/include/linux/pm.h:331: error: for each function it appears in.) Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: Share identical video.S between i386 and x86-64Andi Kleen2007-05-02
| | | | | | | | Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: Remove CONFIG_REORDERAndi Kleen2007-05-02
| | | | | | | | | | | | | | | | | | | | The option never worked well and functionlist wasn't well maintained. Also it made the build very slow on many binutils version. So just remove it. Cc: arjan@linux.intel.com Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0)Andi Kleen2007-05-02
| | | | | | | | | | | | access_ok checks this case anyways, no need to check twice. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Little cleanups in smpboot.cAndi Kleen2007-05-02
| | | | | | | | | | | | | | - Remove #if that is always set - Fix warning Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanningAndi Kleen2007-05-02
| | | | | | | | | | | | | | | | This was supposed to see the full memory on a ASUS A8SX motherboard with 4GB RAM where the northbridge reports less memory, but it didn't help there. But it's a reasonable change so let's include it anyways. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386Andi Kleen2007-05-02
| | | | | | | | | | | | Syncs up with x86-64. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Verify important CPUID bits in real modeAndi Kleen2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | Check some CPUID bits that are needed for compiler generated early in boot. When the system is still in real mode before changing the VESA BIOS mode it is possible to still display an visible error message on the screen. Similar to x86-64. Includes cleanups from Eric Biederman Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Drop -traditional in arch/i386/bootAndi Kleen2007-05-02
| | | | | | | | | | | | Needed for followon patch Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: Drop -traditional for arch/x86_64/bootAndi Kleen2007-05-02
| | | | | | | | | | | | Follows i386 and useful cleanup. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: Use symbolic CPU features in early CPUID checkAndi Kleen2007-05-02
| | | | | | | | | | | | | | | | Dead to magic numbers! Generated code is the same. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: Avoid overflows during apic timer calibrationDavid P. Reed2007-05-02
| | | | | | | | | | | | | | | | | | | | - Use 64bit TSC calculations to avoid handling overflow - Use 32bit unsigned arithmetic for the APIC timer. This way overflows are handled correctly. - Fix exit check of loop to account for apic timer counting down Signed-off-by: dpreed@reed.com Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: Use the 32bit wd_ops for 64bit too.Andi Kleen2007-05-02
| | | | | | | | | | | | | | This mainly removes a lot of code, replacing it with calls into the new 32bit perfctr-watchdog.c Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Clean up NMI watchdog codeAndi Kleen2007-05-02
| | | | | | | | | | | | | | | | | | | | | | - Introduce a wd_ops structure - Convert the various nmi watchdogs over to it - This allows to split the perfctr reservation from the watchdog setup cleanly. - Do perfctr reservation globally as it should have always been - Remove dead code referenced only by unused EXPORT_SYMBOLs Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: set node_possible_map at runtime - try 2Suresh Siddha2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | Set the node_possible_map at runtime on x86_64. On a non NUMA system, num_possible_nodes() will now say '1'. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <clameter@engr.sgi.com>
| * [PATCH] x86-64: Dynamically adjust machine check intervalTim Hockin2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Background: We've found that MCEs (specifically DRAM SBEs) tend to come in bunches, especially when we are trying really hard to stress the system out. The current MCE poller uses a static interval which does not care whether it has or has not found MCEs recently. Description: This patch makes the MCE poller adjust the polling interval dynamically. If we find an MCE, poll 2x faster (down to 10 ms). When we stop finding MCEs, poll 2x slower (up to check_interval seconds). The check_interval tunable becomes the max polling interval. The "Machine check events logged" printk() is rate limited to the check_interval, which should be identical behavior to the old functionality. Result: If you start to take a lot of correctable errors (not exceptions), you log them faster and more accurately (less chance of overflowing the MCA registers). If you don't take a lot of errors, you will see no change. Alternatives: I considered simply reducing the polling interval to 10 ms immediately and keeping it there as long as we continue to find errors. This felt a bit heavy handed, but does perform significantly better for the default check_interval of 5 minutes (we're using a few seconds when testing for DRAM errors). I could be convinced to go with this, if anyone felt it was not too aggressive. Testing: I used an error-injecting DIMM to create lots of correctable DRAM errors and verified that the polling interval accelerates. The printk() only happens once per check_interval seconds. Patch: This patch is against 2.6.21-rc7. Signed-Off-By: Tim Hockin <thockin@google.com> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: fix wrong comment for syscall stack layoutAndi Kleen2007-05-02
| | | | | | | | | | | | | | | | `ret_from_sys_call' label no longer exist and `syscall_exit' label was introduced instead. Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: unexport cpu_llc_idAndrew Morton2007-05-02
| | | | | | | | | | | | | | | | | | | | | | WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.data:cpu_llc_id from __ksymtab between '__ksymtab_cpu_llc_id' (at offset 0x4a0) and '__ksymtab_smp_num_siblings' It is strange to export a __cpuinitdata symbols to modules, and no module appears to use it anyway. Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: convert to the kthread APIEric W. Biederman2007-05-02
| | | | | | | | | | | | | | | | | | | | This patch just trivial converts from calling kernel_thread and daemonize to just calling kthread_run. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * [PATCH] i386: remove xtime_lock'ing around cpufreq notifierDaniel Walker2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The locking of the xtime_lock around the cpu notifier is unessesary now. At one time the tsc was used after a frequency change for timekeeping, but the re-write of timekeeping no longer uses the TSC unless the frequency is constant. The variables that are changed in this section of code had also once been used for timekeeping, but not any longer .. Signed-off-by: Daniel Walker <dwalker@mvista.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * [PATCH] x86-64: Auto compute __NR_syscall_max at compile timeAndi Kleen2007-05-02
| | | | | | | | | | | | No need to maintain it anymore Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: check capabilityJoachim Deguara2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the i386 architecture checks the family for mce capability and this removes that and uses the CPUID information. Tested on a K8 revE and a family10h processor. This eliminates checking of a set AMD procesor family if mce is allowed and relies on the information being in CPUID. Signed-off-by: Joachim Deguara <joachim.deguara@amd.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * [PATCH] i386: clean up flush_tlb_others fnKeshavamurthy, Anil S2007-05-02
| | | | | | | | | | | | | | | | | | Cleanup flush_tlb_others(), no functional change. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * [PATCH] i386: replace spin_lock_irqsave with spin_lockHisashi Hifumi2007-05-02
| | | | | | | | | | | | | | | | | | | | IRQ is already disabled through local_irq_disable(). So spin_lock_irqsave() can be replaced with spin_lock(). Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * [PATCH] i386: avoid checking for cpu gone when CONFIG_HOTPLUG_CPU not definedKeshavamurthy, Anil S2007-05-02
| | | | | | | | | | | | | | | | | | | | | | Avoid checking for cpu gone in mm hot path when CONFIG_HOTPLUG_CPU is not defined. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * [PATCH] x86-64: move __vgetcpu_mode & __jiffies to the vsyscall_2 zoneEric Dumazet2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We apparently hit the 1024 limit of vsyscall_0 zone when some debugging options are set, or if __vsyscall_gtod_data is 64 bytes larger. In order to save 128 bytes from the vsyscall_0 zone, we move __vgetcpu_mode & __jiffies to vsyscall_2 zone where they really belong, since they are used only from vgetcpu() (which is in this vsyscall_2 area). After patch is applied, new layout is : ffffffffff600000 T vgettimeofday ffffffffff60004e t vsysc2 ffffffffff600140 t vread_hpet ffffffffff600150 t vread_tsc ffffffffff600180 D __vsyscall_gtod_data ffffffffff600400 T vtime ffffffffff600413 t vsysc1 ffffffffff600800 T vgetcpu ffffffffff600870 D __vgetcpu_mode ffffffffff600880 D __jiffies ffffffffff600c00 T venosys_1 Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * [PATCH] i386: PARAVIRT: fix startup_ipi_hook config dependencyJeremy Fitzhardinge2007-05-02
| | | | | | | | | | | | | | | | startup_ipi_hook depends on CONFIG_X86_LOCAL_APIC, so move it to the right part of the paravirt_ops initialization. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: fix mtrr sectionsRandy Dunlap2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix section mismatch warnings in mtrr code. Fix line length on one source line. WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.data: from .text.get_mtrr_state after 'get_mtrr_state' (at offset 0x103) WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text: from .text.get_mtrr_state after 'get_mtrr_state' (at offset 0x180) WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text: from .text.get_mtrr_state after 'get_mtrr_state' (at offset 0x199) WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text: from .text.get_mtrr_state after 'get_mtrr_state' (at offset 0x1c1) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * [PATCH] i386: Use safe_apic_wait_icr_idle in safe_apic_wait_icr_idle - i386Fernando Luis [** ISO-8859-1 charset **] VzquezCao2007-05-02
| | | | | | | | | | | | | | | | | | Use safe_apic_wait_icr_idle to check ICR idle bit if the vector is NMI_VECTOR to avoid potential hangups in the event of crash when kdump tries to stop the other CPUs. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: __send_IPI_dest_field - x86_64Fernando Luis [** ISO-8859-1 charset **] VzquezCao2007-05-02
| | | | | | | | | | | | | | | | | | Implement __send_IPI_dest_field which can be used to send IPIs when the "destination shorthand" field of the ICR is set to 00 (destination field). Use it whenever possible. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: __send_IPI_dest_field - i386Fernando Luis [** ISO-8859-1 charset **] VzquezCao2007-05-02
| | | | | | | | | | | | | | | | | | Implement __send_IPI_dest_field which can be used to send IPIs when the "destination shorthand" field of the ICR is set to 00 (destination field). Use it whenever possible. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: use safe_apic_wait_icr_idle in smpboot.c - x86_64Fernando Luis VazquezCao2007-05-02
| | | | | | | | | | | | | | | | | | inquire_remote_apic is used for APIC debugging, so use safe_apic_wait_icr_idle instead of apic_wait_icr_idle to avoid possible lockups when APIC delivery fails. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: use safe_apic_wait_icr_idle in smpboot.cFernando Luis VazquezCao2007-05-02
| | | | | | | | | | | | | | | | | | __inquire_remote_apic is used for APIC debugging, so use safe_apic_wait_icr_idle instead of apic_wait_icr_idle to avoid possible lockups when APIC delivery fails. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: use safe_apic_wait_icr_idle in smpboot.c - x86_64Fernando Luis VazquezCao2007-05-02
| | | | | | | | | | | | | | | | | | The functionality provided by the new safe_apic_wait_icr_idle is being open-coded all over "kernel/smpboot.c". Use safe_apic_wait_icr_idle instead to consolidate code and ease maintenance. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: use safe_apic_wait_icr_idle - i386Fernando Luis VazquezCao2007-05-02
| | | | | | | | | | | | | | | | | | The functionality provided by the new safe_apic_wait_icr_idle is being open-coded all over "kernel/smpboot.c". Use safe_apic_wait_icr_idle instead to consolidate code and ease maintenance. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86-64: safe_apic_wait_icr_idle - x86_64Fernando Luis VazquezCao2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | apic_wait_icr_idle looks like this: static __inline__ void apic_wait_icr_idle(void) { while (apic_read(APIC_ICR) & APIC_ICR_BUSY) cpu_relax(); } The busy loop in this function would not be problematic if the corresponding status bit in the ICR were always updated, but that does not seem to be the case under certain crash scenarios. Kdump uses an IPI to stop the other CPUs in the event of a crash, but when any of the other CPUs are locked-up inside the NMI handler the CPU that sends the IPI will end up looping forever in the ICR check, effectively hard-locking the whole system. Quoting from Intel's "MultiProcessor Specification" (Version 1.4), B-3: "A local APIC unit indicates successful dispatch of an IPI by resetting the Delivery Status bit in the Interrupt Command Register (ICR). The operating system polls the delivery status bit after sending an INIT or STARTUP IPI until the command has been dispatched. A period of 20 microseconds should be sufficient for IPI dispatch to complete under normal operating conditions. If the IPI is not successfully dispatched, the operating system can abort the command. Alternatively, the operating system can retry the IPI by writing the lower 32-bit double word of the ICR. This “time-out” mechanism can be implemented through an external interrupt, if interrupts are enabled on the processor, or through execution of an instruction or time-stamp counter spin loop." Intel's documentation suggests the implementation of a time-out mechanism, which, by the way, is already being open-coded in some parts of the kernel that tinker with ICR. Create a apic_wait_icr_idle replacement that implements the time-out mechanism and that can be used to solve the aforementioned problem. AK: moved both functions out of line AK: Added improved loop from Keith Owens Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: safe_apic_wait_icr_idle - i386Fernando Luis VazquezCao2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | apic_wait_icr_idle looks like this: static __inline__ void apic_wait_icr_idle(void) { while (apic_read(APIC_ICR) & APIC_ICR_BUSY) cpu_relax(); } The busy loop in this function would not be problematic if the corresponding status bit in the ICR were always updated, but that does not seem to be the case under certain crash scenarios. Kdump uses an IPI to stop the other CPUs in the event of a crash, but when any of the other CPUs are locked-up inside the NMI handler the CPU that sends the IPI will end up looping forever in the ICR check, effectively hard-locking the whole system. Quoting from Intel's "MultiProcessor Specification" (Version 1.4), B-3: "A local APIC unit indicates successful dispatch of an IPI by resetting the Delivery Status bit in the Interrupt Command Register (ICR). The operating system polls the delivery status bit after sending an INIT or STARTUP IPI until the command has been dispatched. A period of 20 microseconds should be sufficient for IPI dispatch to complete under normal operating conditions. If the IPI is not successfully dispatched, the operating system can abort the command. Alternatively, the operating system can retry the IPI by writing the lower 32-bit double word of the ICR. This “time-out” mechanism can be implemented through an external interrupt, if interrupts are enabled on the processor, or through execution of an instruction or time-stamp counter spin loop." Intel's documentation suggests the implementation of a time-out mechanism, which, by the way, is already being open-coded in some parts of the kernel that tinker with ICR. Create a apic_wait_icr_idle replacement that implements the time-out mechanism and that can be used to solve the aforementioned problem. AK: moved both functions out of line AK: added improved loop from Keith Owens Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] i386: Enable support for fixed-range IORRs to keep RdMem & WrMem in syncBernhard Kaindl2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If our copy of the MTRRs of the BSP has RdMem or WrMem set, and we are running on an AMD64/K8 system, the boot CPU must have had MtrrFixDramEn and MtrrFixDramModEn set (otherwise our RDMSR would have copied these bits cleared), so we set them on this CPU as well. This allows us to keep the AMD64/K8 RdMem and WrMem bits in sync across the CPUs of SMP systems in order to fullfill the duty of system software to "initialize and maintain MTRR consistency across all processors." as written in the AMD and Intel manuals. If an WRMSR instruction fails because MtrrFixDramModEn is not set, I expect that also the Intel-style MTRR bits are not updated. AK: minor cleanup, moved MSR defines around Signed-off-by: Bernhard Kaindl <bk@suse.de> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Dave Jones <davej@codemonkey.org.uk>
| * [PATCH] x86: Save and restore the fixed-range MTRRs of the BSP when suspendingBernhard Kaindl2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note: This patch didn'nt need an update since it's initial post. Some BIOSes may modify fixed-range MTRRs in SMM, e.g. when they transition the system into ACPI mode, which is entered thru an SMI, triggered by Linux in acpi_enable(). SMIs which cause that Linux is interrupted and BIOS code is executed (which may change e.g. fixed-range MTRRs) in SMM may be raised by an embedded system controller which is often found in notebooks also at other occasions. If we would not update our copy of the fixed-range MTRRs before suspending to RAM or to disk, restore_processor_state() would set the fixed-range MTRRs of the BSP using old backup values which may be outdated and this could cause the system to fail later during resume. This patch ensures that our copy of the fixed-range MTRRs is updated when saving the boot processor state on suspend to disk and suspend to RAM. In combination with other patches this allows to fix s2ram and s2disk on the Acer Ferrari 1000 notebook and at least s2disk on the Acer Ferrari 5000 notebook. Signed-off-by: Bernhard Kaindl <bk@suse.de> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Dave Jones <davej@codemonkey.org.uk>
| * [PATCH] x86: Save the MTRRs of the BSP before booting an APBernhard Kaindl2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Applied fix by Andew Morton: http://lkml.org/lkml/2007/4/8/88 - Fix `make headers_check'. AMD and Intel x86 CPU manuals state that it is the responsibility of system software to initialize and maintain MTRR consistency across all processors in Multi-Processing Environments. Quote from page 188 of the AMD64 System Programming manual (Volume 2): 7.6.5 MTRRs in Multi-Processing Environments "In multi-processing environments, the MTRRs located in all processors must characterize memory in the same way. Generally, this means that identical values are written to the MTRRs used by the processors." (short omission here) "Failure to do so may result in coherency violations or loss of atomicity. Processor implementations do not check the MTRR settings in other processors to ensure consistency. It is the responsibility of system software to initialize and maintain MTRR consistency across all processors." Current Linux MTRR code already implements the above in the case that the BIOS does not properly initialize MTRRs on the secondary processors, but the case where the fixed-range MTRRs of the boot processor are changed after Linux started to boot, before the initialsation of a secondary processor, is not handled yet. In this case, secondary processors are currently initialized by Linux with MTRRs which the boot processor had very early, when mtrr_bp_init() did run, but not with the MTRRs which the boot processor uses at the time when that secondary processors is actually booted, causing differing MTRR contents on the secondary processors. Such situation happens on Acer Ferrari 1000 and 5000 notebooks where the BIOS enables and sets AMD-specific IORR bits in the fixed-range MTRRs of the boot processor when it transitions the system into ACPI mode. The SMI handler of the BIOS does this in SMM, entered while Linux ACPI code runs acpi_enable(). Other occasions where the SMI handler of the BIOS may change bits in the MTRRs could occur as well. To initialize newly booted secodary processors with the fixed-range MTRRs which the boot processor uses at that time, this patch saves the fixed-range MTRRs of the boot processor before new secondary processors are started. When the secondary processors run their Linux initialisation code, their fixed-range MTRRs will be updated with the saved fixed-range MTRRs. If CONFIG_MTRR is not set, we define mtrr_save_state as an empty statement because there is nothing to do. Possible TODOs: *) CPU-hotplugging outside of SMP suspend/resume is not yet tested with this patch. *) If, even in this case, an AP never runs i386/do_boot_cpu or x86_64/cpu_up, then the calls to mtrr_save_state() could be replaced by calls to mtrr_save_fixed_ranges(NULL) and mtrr_save_state() would not be needed. That would need either verification of the CPU-hotplug code or at least a test on a >2 CPU machine. *) The MTRRs of other running processors are not yet checked at this time but it might be interesting to syncronize the MTTRs of all processors before booting. That would be an incremental patch, but of rather low priority since there is no machine known so far which would require this. AK: moved prototypes on x86-64 around to fix warnings Signed-off-by: Bernhard Kaindl <bk@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Dave Jones <davej@codemonkey.org.uk>
| * [PATCH] x86: Adds mtrr_save_fixed_ranges() for use in two later patches.Bernhard Kaindl2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this current implementation which is used in other patches, mtrr_save_fixed_ranges() accepts a dummy void pointer because in the current implementation of one of these patches, this function may be called from smp_call_function_single() which requires that this function takes a void pointer argument. This function calls get_fixed_ranges(), passing mtrr_state.fixed_ranges which is the element of the static struct which stores our current backup of the fixed-range MTRR values which all CPUs shall be using. Because mtrr_save_fixed_ranges calls get_fixed_ranges after kernel initialisation time, __init needs to be removed from the declaration of get_fixed_ranges(). If CONFIG_MTRR is not set, we define mtrr_save_fixed_ranges as an empty statement because there is nothing to do. AK: Moved prototypes for x86-64 around to fix warnings Signed-off-by: Bernhard Kaindl <bk@suse.de> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Dave Jones <davej@codemonkey.org.uk>
| * [PATCH] i386: Clean up ELF note generationJeremy Fitzhardinge2007-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Three cleanups: 1: ELF notes are never mapped, so there's no need to have any access flags in their phdr. 2: When generating them from asm, tell the assembler to use a SHT_NOTE section type. There doesn't seem to be a way to do this from C. 3: Use ANSI rather than traditional cpp behaviour to stringify the macro argument. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Eric W. Biederman <ebiederm@xmission.com>
| * [PATCH] i386: PARAVIRT: Export paravirt_ops for non GPL modules tooAndi Kleen2007-05-02
| | | | | | | | | | | | | | | | | | | | | | Otherwise non GPL modules cannot even do basic operations like disabling interrupts anymore, which would be excessive. Longer term should split the single structure up into internal and external symbols and not export the internal ones at all. Signed-off-by: Andi Kleen <ak@suse.de>
| * [PATCH] x86: PARAVIRT: Jeremy Fitzhardinge <jeremy@goop.org>Jeremy Fitzhardinge2007-05-02
| | | | | | | | | | | | | | | | | | | | | | The other symbols used to delineate the alt-instructions sections have the form __foo/__foo_end. Rename parainstructions to match. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>