aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86
Commit message (Collapse)AuthorAge
* x86: GEODE fix a race condition in the MFGPT timer tickJordan Crouse2008-01-22
| | | | | | | | | | | | | | | | | | | When we set the MFGPT timer tick, there is a chance that we'll immediately assert an event. If for some reason the IRQ routing for this clock has been setup for some other purpose, then we could end up firing an interrupt into the SMM handler or worse. This rearranges the timer tick init function to initalize the handler before we set up the MFGPT clock to make sure that even if we get an event, it will go to the handler. Furthermore, in the handler we need to make sure that we clear the event, even if the timer isn't running. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Tested-by: Arnd Hannemann <hannemann@i4.informatik.rwth-aachen.de>
* Revert "x86: fix NMI watchdog & 'stopped time' problem"Thomas Gleixner2008-01-22
| | | | | | | | | | | | | | | | This reverts commit d4d25deca49ec2527a634557bf5a6cf449f85deb. It tried to fix long standing bugzilla entries, but the solution was reported to break other systems. The reporter of http://bugzilla.kernel.org/show_bug.cgi?id=9791 tracked it down to this commit and confirmed that reverting the patch restores the correct behaviour. It's too late in the release cycle to find a better solution than reverting the commit to avoid regressions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu>
* x86: add support for the latest Intel processors to OprofileArjan van de Ven2008-01-18
| | | | | | | | | | The latest Intel processors (the 45nm ones) have a model number of 23 (old ones had 15); they're otherwise compatible on the oprofile side. This patch adds the new model number to the oprofile code. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* lockdep: more hardirq annotations for notify_die()Peter Zijlstra2008-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Sat, 2007-12-29 at 18:06 +0100, Marcin Slusarz wrote: > Hi > Today I've got this (while i was upgrading my gentoo box): > > WARNING: at kernel/lockdep.c:2658 check_flags() > Pid: 21680, comm: conftest Not tainted 2.6.24-rc6 #63 > > Call Trace: > [<ffffffff80253457>] check_flags+0x1c7/0x1d0 > [<ffffffff80257217>] lock_acquire+0x57/0xc0 > [<ffffffff8024d5c0>] __atomic_notifier_call_chain+0x60/0xd0 > [<ffffffff8024d641>] atomic_notifier_call_chain+0x11/0x20 > [<ffffffff8024d67e>] notify_die+0x2e/0x30 > [<ffffffff8020da0a>] do_divide_error+0x5a/0xa0 > [<ffffffff80522bdd>] trace_hardirqs_on_thunk+0x35/0x3a > [<ffffffff80255b89>] trace_hardirqs_on+0xd9/0x180 > [<ffffffff80522bdd>] trace_hardirqs_on_thunk+0x35/0x3a > [<ffffffff80523c2d>] error_exit+0x0/0xa9 > > possible reason: unannotated irqs-off. > irq event stamp: 4693 > hardirqs last enabled at (4693): [<ffffffff80522bdd>] trace_hardirqs_on_thunk+0x35/0x3a > hardirqs last disabled at (4692): [<ffffffff80522c17>] trace_hardirqs_off_thunk+0x35/0x37 > softirqs last enabled at (3546): [<ffffffff80238343>] __do_softirq+0xb3/0xd0 > softirqs last disabled at (3521): [<ffffffff8020c97c>] call_softirq+0x1c/0x30 more early fixups for notify_die().. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: fix RTC_AIE with CONFIG_HPET_EMULATE_RTCBernhard Walle2008-01-15
| | | | | | | | | | | | | | | | | | | | In the current code, RTC_AIE doesn't work if the RTC relies on CONFIG_HPET_EMULATE_RTC because the code sets the RTC_AIE flag in hpet_set_rtc_irq_bit(). The interrupt handles does accidentally check for RTC_PIE and not RTC_AIE when comparing the time which was set in hpet_set_alarm_time(). I now verified on a test system here that without the patch applied, the attached test program fails on a system that has HPET with 2.6.24-rc7-default. That's not critical since I guess the problem has been there for several kernel releases, but as the fix is quite obvious. Configuration is CONFIG_RTC=y and CONFIG_HPET_EMULATE_RTC=y. Signed-off-by: Bernhard Walle <bwalle@suse.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: fix boot crash on HIGHMEM4G && SPARSEMEMIngo Molnar2008-01-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Denys Fedoryshchenko reported a bootup crash when he upgraded his system from 3GB to 4GB RAM: http://lkml.org/lkml/2008/1/7/9 the bug is due to HIGHMEM4G && SPARSEMEM kernels making pfn_to_page() to return an invalid pointer when the pfn is in a memory hole. The 256 MB PCI aperture at the end of RAM was not mapped by sparsemem, and hence the pfn was not valid. But set_highmem_pages_init() iterated this range without checking the pfn's validity first. this bug was probably present in the sparsemem code ever since sparsemem has been introduced in v2.6.13. It was masked due to HIGHMEM64G using larger memory regions in sparsemem_32.h: #ifdef CONFIG_X86_PAE #define SECTION_SIZE_BITS 30 #define MAX_PHYSADDR_BITS 36 #define MAX_PHYSMEM_BITS 36 #else #define SECTION_SIZE_BITS 26 #define MAX_PHYSADDR_BITS 32 #define MAX_PHYSMEM_BITS 32 #endif which creates 1GB sparsemem regions instead of 64MB sparsemem regions. So in practice we only ever created true sparsemem holes on x86 with HIGHMEM4G - but that was rarely used by distros. ( btw., we could probably save 2MB of mem_map[]s on X86_PAE if we reduced the sparsemem region size to 256 MB. ) Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>
* Kick CPUS that might be sleeping in cpus_idle_waitSteven Rostedt2008-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sometimes cpu_idle_wait gets stuck because it might miss CPUS that are already in idle, have no tasks waiting to run and have no interrupts going to them. This is common on bootup when switching cpu idle governors. This patch gives those CPUS that don't check in an IPI kick. Background: ----------- I notice this while developing the mcount patches, that every once in a while the system would hang. Looking deeper, the hang was always at boot up when registering init_menu of the cpu_idle menu governor. Talking with Thomas Gliexner, we discovered that one of the CPUS had no timer events scheduled for it and it was in idle (running with NO_HZ). So the CPU would not set the cpu_idle_state bit. Hitting sysrq-t a few times would eventually route the interrupt to the stuck CPU and the system would continue. Note, I would have used the PDA isidle but that is set after the cpu_idle_state bit is cleared, and would leave a window open where we may miss being kicked. hmm, looking closer at this, we still have a small race window between clearing the cpu_idle_state and disabling interrupts (hence the RFC). CPU0: CPU 1: --------- --------- cpu_idle_wait(): cpu_idle(): | __cpu_cpu_var(is_idle) = 1; | if (__get_cpu_var(cpu_idle_state)) /* == 0 */ per_cpu(cpu_idle_state, 1) = 1; | if (per_cpu(is_idle, 1)) /* == 1 */ | smp_call_function(1) | | receives ipi and runs do_nothing. wait on map == empty idle(); /* waits forever */ So really we need interrupts off for most of this then. One might think that we could simply clear the cpu_idle_state from do_nothing, but I'm assuming that cpu_idle governors can be removed, and this might cause a race that a governor might be used after the module was removed. Venki said: I think your RFC patch is the right solution here. As I see it, there is no race with your RFC patch. As long as you call a dummy smp_call_function on all CPUs, we should be OK. We can get rid of cpu_idle_state and the current wait forever logic altogether with dummy smp_call_function. And so there wont be any wait forever scenario. The whole point of cpu_idle_wait() is to make all CPUs come out of idle loop atleast once. The caller will use cpu_idle_wait something like this. // Want to change idle handler - Switch global idle handler to always present default_idle - call cpu_idle_wait so that all cpus come out of idle for an instant and stop using old idle pointer and start using default idle - Change the idle handler to a new handler - optional cpu_idle_wait if you want all cpus to start using the new handler immediately. Maybe the below 1s patch is safe bet for .24. But for .25, I would say we just replace all complicated logic by simple dummy smp_call_function and remove cpu_idle_state altogether. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@suse.de> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Pull bugzilla-9194 into release branchLen Brown2008-01-11
|\
| * PM: ACPI and APM must not be enabled at the same timeLen Brown2008-01-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ACPI and APM used "pm_active" to guarantee that they would not be simultaneously active. But pm_active was recently moved under CONFIG_PM_LEGACY, so that without CONFIG_PM_LEGACY, pm_active became a NOP -- allowing ACPI and APM to both be simultaneously enabled. This caused unpredictable results, including boot hangs. Further, the code under CONFIG_PM_LEGACY is scheduled for removal. So replace pm_active with pm_flags. pm_flags depends only on CONFIG_PM, which is present for both CONFIG_APM and CONFIG_ACPI. http://bugzilla.kernel.org/show_bug.cgi?id=9194 Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* | x86: fix do_fork_idle section mismatchThomas Gleixner2008-01-08
| | | | | | | | | | | | | | | | | | | | | | | | | | With CPU_HOTPLUG=n: WARNING: vmlinux.o(.text+0x104f8): Section mismatch: reference to .init.text:fork_idle (between 'do_fork_idle' and 'lapic_timer_broadcast') do_fork_idle() needs to be __cpuinit. It can be static as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | fix lguest rmmod "bad pgd"Rusty Russell2008-01-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After 17d57a9206b4de6ad082ac9f2d2346985abbd2aa ("x86: fix x86-32 early fixmap initialization.") removing lg.ko caused a printk from vunmap: mm/memory.c:115: bad pgd 004b3027. On the second use after module load, the kernel crashes. This fixes the immediate problem (accessed and dirty bits not set as expected in pmd_none_or_clear_bad). I can't see why this would cause a crash, but I haven't been able to reproduce it once this is applied. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Revert "x86: fix show cpuinfo cpu number always zero"Linus Torvalds2007-12-25
| | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit fbdcf18df73758b2e187ab94678b30cd5f6ff9f9. As pointed out by Yanmin Zhang, the problem was already fixed differently (and correctly), and rather than fix anything, it actually causes us to create a sub-optimal sched-domains hierarchy (not setting up the domain belonging to the core) when CONFIG_X86_HT=y. Requested-by: Yanmin Zhang <yanmin_zhang@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: intel_cacheinfo.c: cpu cache info entry for Intel TolapaiJason Gaston2007-12-20
| | | | | | | | | | | | | | | | This patch adds a cpu cache info entry for the Intel Tolapai cpu. Signed-off-by: Jason Gaston <jason.d.gaston@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86: fix die() to not be preemptibleIngo Molnar2007-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrew "Eagle Eye" Morton noticed that we use raw_local_save_flags() instead of raw_local_irq_save(flags) in die(). This allows the preemption of oopsing contexts - which is highly undesirable. It also causes CONFIG_DEBUG_PREEMPT to complain, as reported by Miles Lane. this bug was introduced via: commit 39743c9ef717fd4f2b5583f010115c5f2482b8ae Author: Andi Kleen <ak@suse.de> Date: Fri Oct 19 20:35:03 2007 +0200 x86: use raw locks during oopses - spin_lock_irqsave(&die.lock, flags); + __raw_spin_lock(&die.lock); + raw_local_save_flags(flags); that is not a correct open-coding of spin_lock_irqsave(): both the ordering is wrong (irqs should be disabled _first_), and the wrong flags-saving API was used. Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | x86: fix show cpuinfo cpu number always zeroMike Travis2007-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when called by setup_arch) after smp_store_cpu_info() had set it to the correct value. The error shows up in 'cat /proc/cpuinfo' will all cpus = 0. Signed-off-by: Mike Travis <travis@sgi.com> Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Suresh B Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86_32: disable_pse must be __cpuinitdataAdrian Bunk2007-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_HOTPLUG_CPU=y: WARNING: vmlinux.o(.text+0xfa52): Section mismatch: reference to .init.data:disable_pse (between 'identify_cpu' and 'identify_secondary_cpu') [ akpm@linux-foundation.org: initializer fix. ] Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86_32: select_idle_routine() must be __cpuinitAdrian Bunk2007-12-19
| | | | | | | | | | | | | | | | | | | | | | CONFIG_HOTPLUG_CPU=y: WARNING: vmlinux.o(.text+0x1199a): Section mismatch: reference to .init.text.5:select_idle_routine (between 'init_intel' and 'init_nexgen') Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86 smpboot_32.c section fixesAdrian Bunk2007-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_HOTPLUG_CPU=y: WARNING: vmlinux.o(.text+0x22c60): Section mismatch: reference to .init.data:cpu_idle_tasks (between 'do_boot_cpu' and 'do_warm_boot_cpu') WARNING: vmlinux.o(.text+0x22c99): Section mismatch: reference to .init.data:cpu_idle_tasks (between 'do_boot_cpu' and 'do_warm_boot_cpu') WARNING: vmlinux.o(.text+0x2359b): Section mismatch: reference to .init.data:smp_b_stepping (between 'smp_store_cpu_info' and 'cpu_exit_clear') WARNING: vmlinux.o(.text+0x235a0): Section mismatch: reference to .init.data:smp_b_stepping (between 'smp_store_cpu_info' and 'cpu_exit_clear') Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86 apic_32.c section fixAdrian Bunk2007-12-19
| | | | | | | | | | | | | | | | | | | | | | CONFIG_HOTPLUG_CPU=y: WARNING: vmlinux.o(.text+0x2390d): Section mismatch: reference to .init.text.5:setup_local_APIC (between 'start_secondary' and 'check_tsc_warp') Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"Ingo Molnar2007-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this is the tale of a full day spent debugging an ancient but elusive bug. after booting up thousands of random .config kernels, i finally happened to generate a .config that produced the following rare bootup failure on 32-bit x86: | ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 | ..MP-BIOS bug: 8254 timer not connected to IO-APIC | ...trying to set up timer (IRQ0) through the 8259A ... failed. | ...trying to set up timer as Virtual Wire IRQ... failed. | ...trying to set up timer as ExtINT IRQ... failed :(. | Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug | and send a report. Then try booting with the 'noapic' option this bug has been reported many times during the years, but it was never reproduced nor fixed. the bug that i hit was extremely sensitive to .config details. First i did a .config-bisection - suspecting some .config detail. That led to CONFIG_X86_MCE: enabling X86_MCE magically made the bug disappear and the system would boot up just fine. Debugging my way through the MCE code ended up identifying two unlikely candidates: the thing that made a real difference to the hang was that X86_MCE did two printks: Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. Adding the same printks to a !CONFIG_X86_MCE kernel made the bug go away! this left timing as the main suspect: i experimented with adding various udelay()s to the arch/x86/kernel/io_apic_32.c:check_timer() function, and the race window turned out to be narrower than 30 microseconds (!). That made debugging especially funny, debugging without having printk ability before the bug hits is ... interesting ;-) eventually i started suspecting IRQ activities - those are pretty much the only thing that happen this early during bootup and have the timescale of a few dozen microseconds. Also, check_timer() changes the IRQ hardware in various creative ways, so the main candidate became IRQ0 interaction. i've added a counter to track timer irqs (on which core they arrived, at what exact time, etc.) and found that no timer IRQ would arrive after the bug condition hits - even if we re-enable IRQ0 and re-initialize the i8259A, but that we'd get a small number of timer irqs right around the time when we call the check_timer() function. Eventually i got the following backtrace triggered from debug code in the timer interrupt: ...trying to set up timer as Virtual Wire IRQ... failed. ...trying to set up timer as ExtINT IRQ... Pid: 1, comm: swapper Not tainted (2.6.24-rc5 #57) EIP: 0060:[<c044d57e>] EFLAGS: 00000246 CPU: 0 EIP is at _spin_unlock_irqrestore+0x5/0x1c EAX: c0634178 EBX: 00000000 ECX: c4947d63 EDX: 00000246 ESI: 00000002 EDI: 00010031 EBP: c04e0f2e ESP: f7c41df4 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 8005003b CR2: ffe04000 CR3: 00630000 CR4: 000006d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 [<c05f5784>] setup_IO_APIC+0x9c3/0xc5c the spin_unlock() was called from init_8259A(). Wait ... we have an IRQ0 entry while we are in the middle of setting up the local APIC, the i8259A and the PIT?? That is certainly not how it's supposed to work! check_timer() was supposed to be called with irqs turned off - but this eroded away sometime in the past. This code would still work most of the time because this code runs very quickly, but just the right timing conditions are present and IRQ0 hits in this small, ~30 usecs window, timer irqs stop and the system does not boot up. Also, given how early this is during bootup, the hang is very deterministic - but it would only occur on certain machines (and certain configs). The fix was quite simple: disable/restore interrupts properly in this function. With that in place the test-system now boots up just fine. (64-bit x86 io_apic_64.c had the same bug.) Phew! One down, only 1500 other kernel bugs are left ;-) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86: kprobes bugfixMasami Hiramatsu2007-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Kprobes for x86-64 may cause a kernel crash if it inserted on "iret" instruction. "call absolute" is invalid on x86-64, so we don't need treat it. - Change the processing order as same as x86-32. - Add "iret"(0xcf) case. - Remove next_rip local variable. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86: jprobe bugfixMasami Hiramatsu2007-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | jprobe for x86-64 may cause kernel page fault when the jprobe_return() is called from incorrect function. - Use jprobe_saved_regs instead getting it from stack. (Especially on x86-64, it may get incorrect data, because pt_regs can not be get by using container_of(rsp)) - Change the type of stack pointer to unsigned long *. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | oprofile: op_model_athlon.c support for AMD family 10h barcelona performance ↵Barry Kasindorf2007-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | counters This patch is for controlling the upper 32bits of the event ctrl msrs. This includes the upper 4 bits of the event select and the Guest Only and Host Only bits This patch is necessary to make Event Based Profiling work reliably on a Family 10h processor [akpm@linux-foundation.org: checkpatch.pl fixes] Signed-off-by: Barry Kasindorf <barry.kasindorf@amd.com> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | revert "Hibernation: Use temporary page tables for kernel text mapping on ↵Andrew Morton2007-12-17
|/ | | | | | | | | | | | | | | | | x86_64" Revert commit efa4d2fb047b25a6be67fe92178a2a78da6b3f6a ("Hibernation: Use temporary page tables for kernel text mapping on x86_64") because it causes my t61p to reboot right at the end of resume-from-disk. For reasons unknown at this time. Cc: Pavel Machek <pavel@ucw.cz> Cc: Andi Kleen <ak@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* xen: relax signature checkJeremy Fitzhardinge2007-12-10
| | | | | | | | Some versions of Xen 3.x set their magic number to "xen-3.[12]", so relax the test to match them. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Pull suspend-2.6.24 into release branchLen Brown2007-12-06
|\
| * ACPI: suspend: old debugging hacks sneaked backPavel Machek2007-12-06
| | | | | | | | | | | | | | | | Old debugging hack sneaked back during x86 merge, this removes it. Signed-off-by: Pavel Machek <pavel@suse.cz> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
* | Tiny clean-up of OPROFILE/KPROBES configurationLinus Torvalds2007-12-06
| | | | | | | | | | | | | | Make the Kconfig.instrumentation file a bit easier on the eyes, and use the new ARCH_SUPPORTS_OPROFILE for x86[-64]. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: arch_register_cpu() section fixAndrew Morton2007-12-04
| | | | | | | | | | | | | | | | | | | | fix this on i386 allnoconfig: WARNING: vmlinux.o(.text+0x6f2e): Section mismatch: reference to .init.text:register_cpu (between 'arch_register_cpu' and 'text_poke') Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86: free_cache_attributes() section fixAdrian Bunk2007-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | free_cache_attributes() must be __cpuinit since it calls the __cpuinit cache_remove_shared_cpu_map(). This patch fixes the following section mismatch reported by Chris Clayton: ... WARNING: vmlinux.o(.text+0x90b6): Section mismatch: reference to .init.text:cache_remove_shared_cpu_map (between 'free_cache_attributes' and 'show_level') ... Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86: add the word 'WARNING' in check_nmi_watchdog() outputDon Zickus2007-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Our automated test suite looks for keywords like error, fail, warning in the boot log. In the case when the nmi watchdog is determined to be stuck in check_nmi_watchdog(), none of those keywords are displayed. This patch adds a keyword, "WARNING:", so it makes it easier to notice when the nmi watchdog isn't working correctly. Also add a proper KERN_WARNING mark to this printout. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | x86: revert CONFIG_X86_HT semantics changeAdrian Bunk2007-12-04
|/ | | | | | | | | | | | | | | | | | | The recent Kconfig changes in x86 resulted in CONFIG_X86_HT no longer being set if (X86_32 && MK8). After grep'ing through the tree I think the problem is that different places have different assumptions about the semantics of CONFIG_X86_HT, either: - hyperthreading or - multicore This should be sorted out properly, but until then we should keep the 2.6.23 status quo. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* x86: fix x86-32 early fixmap initialization.Eric W. Biederman2007-12-03
| | | | | | | | | | | | | | | | | | | | pageexec@freemail.hu writes: > i've just noticed that the chunk in i386/kernel/head.S ended up in a > weird place, namely, it's not going to be executed as it's just after > a 'jmp 3f' and before startup_32_smp, probably not what you intended. > on a sidenote, the whole thing can be done in a single insn, like: > > movl $(swapper_pg_pmd - __PAGE_OFFSET + 0x067), (swapper_pg_dir - > __PAGE_OFFSET+ 4092) Thanks for the reminder I thought we had fixed this problem a while ago. Needed to get fixed virtual address for USB debug and earlycon with mmio. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: disable hpet legacy replacement for kdumpOGAWA Hirofumi2007-12-03
| | | | | | | | we should also add hpet_disable() for kdump. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* x86: disable hpet on shutdownOGAWA Hirofumi2007-12-03
| | | | | | | | | | | | | | | | If HPET was enabled by pci quirks, we use i8253 as initial clockevent because pci quirks doesn't run until pci is initialized. The above means the kernel (or something) is assuming HPET legacy replacement is disabled and can use i8253 at boot. If we used kexec, it isn't true. So, this patch disables HPET legacy replacement for kexec in machine_shutdown(). Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* Merge branch 'for-linus' of ↵Linus Torvalds2007-11-29
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup: x86 setup: don't recalculate ss:esp unless really necessary
| * x86 setup: don't recalculate ss:esp unless really necessaryJens Rottmann2007-11-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to work around old LILO versions providing an invalid ss register, the current setup code always sets up a new stack, immediately following .bss and the heap. But this breaks LOADLIN. This rewrite of the workaround checks for an invalid stack (ss!=ds) first, and leaves ss:sp alone otherwise (apart from aligning esp). [hpa note: LOADLIN has a number of arbitrary hard-coded limits that are being pushed up against. Without some major revision of LOADLIN itself it will not be sustainable keeping it alive. This gives it another brief lease on life, however. This patch also helps the cmdline truncation problem with old versions of SYSLINUX.] Signed-off-by: Jens Rottmann <JRottmann at LiPPERT-AT. de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | x86/paravirt: revert exports to restore old behaviourJeremy Fitzhardinge2007-11-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Subdividing the paravirt_ops structure caused a regression in certain non-GPL modules which try to use mmu_ops and cpu_ops. This restores the old behaviour, and makes it consistent with the non-CONFIG_PARAVIRT case. Takashi Iwai <tiwai@suse.de> adds: > I took at this problem (as I have an nvidia card on one of my > workstations), and found out that the following suffer from > EXPORT_SYMBOL_GPL changes: > > * local_disable_irq(), local_irq_save*(), etc. > * MSR-related macros like rdmsr(), wrmsr(), read_cr0(), etc. > wbinvd(), too. > * pmd_val(), pgd_val(), etc are all involved with pv_mm_ops. > pmd_large() and pmd_bad() is also indirectly involved. > __flush_tlb() and friends suffer, too. Christoph Hellwig objects to this patch on the grounds that modules shouldn't be using these operations anyway. I don't think this is a particularly good reason to reject the patch, for several reasons: 1. These operations are still available to modules when not using CONFIG_PARAVIRT, since they are implicitly exported as inline functions via the kernel headers. Exporting the same functionality as GPL-only symbols just adds a gratuitious difference between CONFIG_PARAVIRT and non-CONFIG_PARAVIRT configurations. If we really think these operations are not for module use (or non-GPL module use), then we should solve the problem in a general way. 2. It's a regression from previous kernels, which would work these modules even with CONFIG_PARAVIRT enabled. 3. The operations in question seem pretty reasonable for modules to use. The control registers/MSRs can be accessed directly anyway, so there's no benefit in preventing modules from using standard interfaces. And it seems reasonable to allow a graphics driver to create its own mappings if it wants. Therefore, I think this patch should go in for 2.6.24. If people really think that these operations should not be available to modules, then we can address that separately. Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Cc: Tobias Powalowski <t.powa@gmx.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: Takashi Iwai <tiwai@suse.de> Cc: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | lguest: prevent VISWS or VOYAGER randconfigsRandy Dunlap2007-11-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Keep lguest from being enabled on VISWS or VOYAGER configs, just as is already done for VMI and XEN. Otherwise randconfigs with VISWS and LGUEST have this problem: In file included from arch/x86/kernel/setup_32.c:61: include/asm-x86/mach-visws/setup_arch.h:8:1: warning: "ARCH_SETUP" redefined In file included from include/asm/msr.h:80, from include/asm/processor_32.h:17, from include/asm/processor.h:2, from include/asm/thread_info_32.h:16, from include/asm/thread_info.h:2, from include/linux/thread_info.h:21, from include/linux/preempt.h:9, from include/linux/spinlock.h:49, from include/linux/seqlock.h:29, from include/linux/time.h:8, from include/linux/timex.h:57, from include/linux/sched.h:53, from arch/x86/kernel/setup_32.c:24: include/asm/paravirt.h:458:1: warning: this is the location of the previous definition (and of course, this happens because kconfig does not follow dependencies when [evil] select is used...) Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | memory hotplug x86_64: fix section mismatch in init_memory_mapping()KAMEZAWA Hiroyuki2007-11-29
| | | | | | | | | | | | | | | | | | | | | | | | | | Changes __meminit to __init_refok. WARNING: vmlinux.o(.text+0x1d07c): Section mismatch: reference to .init.text:find_e820_area (between 'init_memory_mapping' and 'arch_add_memory') Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | xen: mask _PAGE_PCD from ptesJeremy Fitzhardinge2007-11-29
|/ | | | | | | | | | | | | | | _PAGE_PCD maps a page with caching disabled, which is typically used for mapping harware registers. Xen never allows it to be set on a mapping, and unprivileged guests never need it since they can't see the real underlying hardware. However, some uncached mappings are made early when probing the (non-existent) APIC, and its OK to mask off the PCD flag in these cases. This became necessary because Xen started checking for this bit, rather than silently masking it off. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86Linus Torvalds2007-11-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix APIC related bootup crash on Athlon XP CPUs time: add ADJ_OFFSET_SS_READ x86: export the symbol empty_zero_page on the 32-bit x86 architecture x86: fix kprobes_64.c inlining borkage pci: use pci=bfsort for HP DL385 G2, DL585 G2 x86: correctly set UTS_MACHINE for "make ARCH=x86" lockdep: annotate do_debug() trap handler x86: turn off iommu merge by default x86: fix ACPI compile for LOCAL_APIC=n x86: printk kernel version in WARN_ON and other dump_stack users ACPI: Set max_cstate to 1 for early Opterons. x86: fix NMI watchdog & 'stopped time' problem
| * x86: fix APIC related bootup crash on Athlon XP CPUsIngo Molnar2007-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | warmbloodedcreature@gmail.com reported that an APIC-enabled Asus a7v8x-x with an Athlon XP reboots early in the bootup: http://bugzilla.kernel.org/show_bug.cgi?id=8723 after a long marathon of spontaneous-reboot debugging, it turns out to be caused by sync_Arb_ids(). AMD CPUs never really needed this sequence anyway, so just return early if we meet an AMD CPU. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86: export the symbol empty_zero_page on the 32-bit x86 architectureTheodore Ts'o2007-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | The latest KVM driver wants to use the empty_zero_page symbol, and it's not exported in 32-bit x86 (although it is exported by x86_64, s390, and uml architectures). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: tglx@linutronix.de Cc: linux-kernel@vger.kernel.com Cc: kvm-devel@lists.sourceforge.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: fix kprobes_64.c inlining borkageAndrew Morton2007-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fix: arch/x86/kernel/kprobes_64.c: In function 'set_current_kprobe': arch/x86/kernel/kprobes_64.c:152: sorry, unimplemented: inlining failed in call to 'is_IF_modifier': recursive inlining arch/x86/kernel/kprobes_64.c:166: sorry, unimplemented: called from here Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: mingo@elte.hu Cc: akpm@linux-foundation.org Cc: tglx@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * pci: use pci=bfsort for HP DL385 G2, DL585 G2Michal Schmidt2007-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HP ProLiant systems DL385 G2 and DL585 G2 need pci=bfsort to enumerate PCI devices in the expected order. Matt sayeth: biosdevname is a userspace app I wrote to help solve this so we don't need to patch the kernel for future systems. It's not integrated into any distributions properly yet, but is included in openSUSE 10.3 and Fedora 8 for people who want to download and install it there. It acts as a udev helper. For the time being, patching the kernel is necessary. I really hope biosdevname eliminates that need in future distributions. http://linux.dell.com/biosdevname/ Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Cc: mingo@elte.hu Cc: andy@greyhouse.net Cc: john.cagle@hp.com Cc: Matt Domsch <Matt_Domsch@dell.com> Cc: Greg KH <greg@kroah.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: correctly set UTS_MACHINE for "make ARCH=x86"Andreas Herrmann2007-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | x86: correctly set UTS_MACHINE for "make ARCH=x86" For a kernel built with "make ARCH=x86" the following system information is displayed when running the new kernel $ uname -m x86 On some i386 systems (e.g. K7) we even have the following information $ uname -m x66 This is weird. The usual information for "uname -m" should be "x86_64" on 64-bit and "i386" or "i686" on 32-bit. This patch fixes the issue by setting UTS_MACHINE to "i386" for 32-bit kernel builds and to "x86_64" for 64-bit kernel builds. I.e., "x86" won't be used for UTS_MACHINE anymore. Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andreas Herrmann <aherrman@arcor.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@redhat.com>
| * lockdep: annotate do_debug() trap handlerPeter Zijlstra2007-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure the hardirq state is consistent before using locks. Use the rare trace_hardirqs_fixup() because the trap can happen in any context. resolves this rare lockdep warning: WARNING: at kernel/lockdep.c:2658 check_flags() [<c013571e>] check_flags+0x90/0x140 [<c0138a69>] lock_release+0x4b/0x1d0 [<c0507fea>] notifier_call_chain+0x2a/0x47 [<c050806b>] __atomic_notifier_call_chain+0x64/0x6d [<c0508007>] __atomic_notifier_call_chain+0x0/0x6d [<c050808b>] atomic_notifier_call_chain+0x17/0x1a [<c0131802>] notify_die+0x30/0x34 [<c0506b09>] do_debug+0x3e/0xd4 [<c050658f>] debug_stack_correct+0x27/0x2c [<c04be389>] tcp_rcv_established+0x1/0x620 [<c04c38c2>] tcp_v4_do_rcv+0x2b/0x313 [<c04c56b6>] tcp_v4_rcv+0x467/0x85d [<c0505ff2>] _spin_lock_nested+0x27/0x32 [<c04c5a4d>] tcp_v4_rcv+0x7fe/0x85d [<c04c560e>] tcp_v4_rcv+0x3bf/0x85d [<c04adbb5>] ip_local_deliver_finish+0x11b/0x1b0 [<c04adac8>] ip_local_deliver_finish+0x2e/0x1b0 [<c04ada7b>] ip_rcv_finish+0x27b/0x29a [<c04961e5>] netif_receive_skb+0xfb/0x2a6 [<c04add0f>] ip_rcv+0x0/0x1fb [<c0496354>] netif_receive_skb+0x26a/0x2a6 [<c04961e5>] netif_receive_skb+0xfb/0x2a6 [<c049872e>] process_backlog+0x7f/0xc6 [<c04983ba>] net_rx_action+0xb9/0x1ac [<c0498348>] net_rx_action+0x47/0x1ac [<c01376cb>] trace_hardirqs_on+0x118/0x16b [<c01225e2>] __do_softirq+0x49/0xa2 [<c010595f>] do_softirq+0x60/0xdd [<c0506300>] _spin_unlock_irq+0x20/0x2c [<c0103e4f>] restore_nocheck+0x12/0x15 [<c01440e1>] handle_fasteoi_irq+0x0/0x9b [<c0105a70>] do_IRQ+0x94/0xaa [<c0506300>] _spin_unlock_irq+0x20/0x2c [<c0104832>] common_interrupt+0x2e/0x34 [<c0114703>] native_safe_halt+0x2/0x3 [<c0102c01>] default_idle+0x44/0x65 [<c010257f>] cpu_idle+0x42/0x50 [<c076ea09>] start_kernel+0x26b/0x270 [<c076e317>] unknown_bootoption+0x0/0x196 ======================= irq event stamp: 559190 hardirqs last enabled at (559190): [<c0507316>] kprobe_exceptions_notify+0x299/0x305 hardirqs last disabled at (559189): [<c05067bf>] do_int3+0x1d/0x95 softirqs last enabled at (559172): [<c010595f>] do_softirq+0x60/0xdd softirqs last disabled at (559181): [<c010595f>] do_softirq+0x60/0xdd Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86: turn off iommu merge by defaultIngo Molnar2007-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | revert this commit for now: commit 948062683004d13ca21c8c05ac052d387978a449 Author: Andi Kleen <ak@suse.de> Date: Fri Oct 19 20:35:03 2007 +0200 x86: enable iommu_merge by default it's causing regressions: http://bugzilla.kernel.org/show_bug.cgi?id=9412 Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86: printk kernel version in WARN_ON and other dump_stack usersArjan van de Ven2007-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | today, all oopses contain a version number of the kernel, which is nice because the people who actually do bother to read the oops get this vital bit of information always without having to ask the reporter in another round trip. However, WARN_ON() and many other dump_stack() users right now lack this information; the patch below adds this. This information is essential for getting people to use their time effectively when looking at these things; in addition, it's essential for tools that try to collect statistics about defects. Please consider, since its so simple and important for long term kernel quality processes. The code is identical between 32/64 bit; a lot of this code should be unified over time, the patch keeps the identical-ness intact. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>