aboutsummaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAge
...
| | * | | | [SPARC64]: Adjust kernel PC validation test in fault handler.David S. Miller2008-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because of the new futex validation init handler, we have to accept faults in init section text as well as the normal kernel text. Thanks to Tom Callaway for the bug report. Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | [SPARC64]: Loosen checks in exception table handling.David S. Miller2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some parts of the kernel now do things like do *_user() accesses while set_fs(KERNEL_DS) that fault on purpose. See, for example, the code added by changeset a0c1e9073ef7428a14309cba010633a6cd6719ea ("futex: runtime enable pi and robust functionality"). That trips up the ASI sanity checking we make in do_kernel_fault(). Just remove it for now. Maybe we can add it back later with an added conditional which looks at the current get_fs() value. Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | [SPARC64]: Fix section mismatch from kernel_map_rangeSam Ravnborg2008-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix following warnings: WARNING: vmlinux.o(.text+0x4f980): Section mismatch in reference from the function kernel_map_range() to the function .init.text:__alloc_bootmem() WARNING: vmlinux.o(.text+0x4f9cc): Section mismatch in reference from the function kernel_map_range() to the function .init.text:__alloc_bootmem() alloc_bootmem() is only used during early init and for any subsequent call to kernel_map_range() the program logic avoid the call. So annotate kernel_map_range() with __ref to tell modpost to ignore the reference to a __init function. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | [SPARC64]: Fix section mismatchs from dr_cpu_dataSam Ravnborg2008-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix following warnings: WARNING: vmlinux.o(.text+0x4b258): Section mismatch in reference from the function dr_cpu_data() to the function .devinit.text:mdesc_fill_in_cpu_data() WARNING: vmlinux.o(.text+0x4b290): Section mismatch in reference from the function dr_cpu_data() to the function .cpuinit.text:cpu_up() mdesc_fill_in_cpu_data() is only used during early init and for cpu hotplug so the __cpuinit annotation is the correct choice. We have the call chain: dr_cpu_data() => dr_cpu_configure() => mdesc_fill_in_cpu_data() dr_cpu_data() is used only during early init and for cpu hotplug. So annotating them all __cpuinit solves the section mismatch and should be correct. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | [SPARC]: Fix build in arch/sparc/kernel/led.cDavid S. Miller2008-02-24
| | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CC [M] arch/sparc/kernel/led.o arch/sparc/kernel/led.c: In function 'led_blink': arch/sparc/kernel/led.c:35: error: invalid use of undefined type 'struct timer_list' arch/sparc/kernel/led.c:35: error: 'jiffies' undeclared (first use in this function) arch/sparc/kernel/led.c:35: error: (Each undeclared identifier is reported only once arch/sparc/kernel/led.c:35: error: for each function it appears in.) arch/sparc/kernel/led.c:36: error: 'avenrun' undeclared (first use in this function) arch/sparc/kernel/led.c:36: error: 'FSHIFT' undeclared (first use in this function) arch/sparc/kernel/led.c:36: error: 'HZ' undeclared (first use in this function) arch/sparc/kernel/led.c:37: error: invalid use of undefined type 'struct timer_list' arch/sparc/kernel/led.c:39: error: invalid use of undefined type 'struct timer_list' arch/sparc/kernel/led.c:40: error: invalid use of undefined type 'struct timer_list' arch/sparc/kernel/led.c:42: error: implicit declaration of function 'add_timer' arch/sparc/kernel/led.c: In function 'led_write_proc': arch/sparc/kernel/led.c:70: error: implicit declaration of function 'copy_from_user' arch/sparc/kernel/led.c:84: error: implicit declaration of function 'del_timer_sync' arch/sparc/kernel/led.c: In function 'led_init': arch/sparc/kernel/led.c:109: error: implicit declaration of function 'init_timer' arch/sparc/kernel/led.c:110: error: invalid use of undefined type 'struct timer_list' make[1]: *** [arch/sparc/kernel/led.o] Error 1 Based upon original patch by Robert Reif. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86Linus Torvalds2008-02-26
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: (24 commits) x86: no robust/pi futex for real i386 CPUs x86: fix boot failure on 486 due to TSC breakage x86: fix build on non-C locales. x86: make c_idle.work have a static address. x86: don't save unreliable stack trace entries x86: don't make swapper_pg_pmd global x86: don't print a warning when MTRR are blank and running in KVM x86: fix execve with -fstack-protect x86: fix vsyscall wreckage x86: rename KERNEL_TEXT_SIZE => KERNEL_IMAGE_SIZE x86: fix spontaneous reboot with allyesconfig bzImage x86: remove double-checking empty zero pages debug x86: notsc is ignored on common configurations x86/mtrr: fix kernel-doc missing notation x86: handle BIOSes which terminate e820 with CF=1 and no SMAP x86: add comments for NOPs x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERIC x86: require family >= 6 if we are using P6 NOPs x86: do not promote TM3x00/TM5x00 to i686-class x86: hpet fix docbook comment ...
| | * | | | x86: fix boot failure on 486 due to TSC breakageMikael Pettersson2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | > Diffing dmesg between git7 and git8 doesn't sched any light since > git8 also removed the printouts of the x86 caps as they were being > initialised and updated. I'm currently adding those printouts back > in the hope of seeing where and when the caps get broken. That turned out to be very illuminating: --- dmesg-2.6.24-git7 2008-02-24 18:01:25.295851000 +0100 +++ dmesg-2.6.24-git8 2008-02-24 18:01:25.530358000 +0100 ... CPU: After generic identify, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000 CPU: After all inits, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000 +CPU: After applying cleared_cpu_caps, caps: 00000013 00000000 00000000 00000000 00000000 00000000 00000000 00000000 Notice how the TSC cap bit goes from Off to On. (The first two lines are printout loops from -git7 forward-ported to -git8, the third line is the same printout loop added just after the xor-with-cleared_cpu_caps[] loop.) Here's how the breakage occurs: 1. arch/x86/kernel/tsc_32.c:tsc_init() sees !cpu_has_tsc, so bails and calls setup_clear_cpu_cap(X86_FEATURE_TSC). 2. include/asm-x86/cpufeature.h:setup_clear_cpu_cap(bit) clears the bit in boot_cpu_data and sets it in cleared_cpu_caps 3. arch/x86/kernel/cpu/common.c:identify_cpu() XORs all caps in with cleared_cpu_caps HOWEVER, at this point c->x86_capability correctly has TSC Off, cleared_cpu_caps has TSC On, so the XOR incorrectly sets TSC to On in c->x86_capability, with disastrous results. The real bug is that clearing bits with XOR only works if the bits are known to be 1 prior to the XOR, and that's not true here. A simple fix is to convert the XOR to AND-NOT instead. The following patch does that, and allows my 486 to boot 2.6.25-rc kernels again. [ mingo@elte.hu: fixed a similar bug in setup_64.c as well. ] The breakage was introduced via commit 7d851c8d3db0. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: fix build on non-C locales.Priit Laes2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For some locales regex range [a-zA-Z] does not work as it is supposed to. so we have to use [:alnum:] and [:xdigit:] to make it work as intended. [1] http://en.wikipedia.org/wiki/Estonian_alphabet Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: make c_idle.work have a static address.Glauber Costa2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, c_idle is declared in the stack, and thus, have no static address. Peter Zijlstra points out this simple solution, in which c_idle.work is initializated separatedly. Note that the INIT_WORK macro has a static declaration of a key inside. Signed-off-by: Glauber Costa <gcosta@redhat.com> Acked-by: Peter Zijlstra <pzijlstr@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: don't save unreliable stack trace entriesVegard Nossum2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, there is no way for print_stack_trace() to determine whether a given stack trace entry was deemed reliable or not, simply because save_stack_trace() does not record this information. (Perhaps needless to say, this makes the saved stack traces A LOT harder to read, and probably with no other benefits, since debugging features that use save_stack_trace() most likely also require frame pointers, etc.) This patch reverts to the old behaviour of only recording the reliable trace entries for saved stack traces. Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: don't make swapper_pg_pmd globalAdrian Bunk2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There doesn't seem to be any reason for swapper_pg_pmd being global. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: don't print a warning when MTRR are blank and running in KVMJoerg Roedel2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inside a KVM virtual machine the MTRRs are usually blank. This confuses Linux and causes a warning message at boot. This patch removes that warning message when running Linux as a KVM guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: fix execve with -fstack-protectIngo Molnar2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pointed out by pageexec@freemail.hu: > what happens here is that gcc treats the argument area as owned by the > callee, not the caller and is allowed to do certain tricks. for ssp it > will make a copy of the struct passed by value into the local variable > area and pass *its* address down, and it won't copy it back into the > original instance stored in the argument area. > > so once sys_execve returns, the pt_regs passed by value hasn't at all > changed and its default content will cause a nice double fault (FWIW, > this part took me the longest to debug, being down with cold didn't > help it either ;). To fix this we pass in pt_regs by pointer. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | x86: fix vsyscall wreckageThomas Gleixner2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | based on a report from Arne Georg Gleditsch about user-space apps misbehaving after toggling /proc/sys/kernel/vsyscall64, a review of the code revealed that the "NOP patching" done there is fundamentally unsafe for a number of reasons: 1) the patching code runs without synchronizing other CPUs 2) it inserts NOPs even if there is no clock source which provides vread 3) when the clock source changes to one without vread we run in exactly the same problem as in #2 4) if nobody toggles the proc entry from 1 to 0 and to 1 again, then the syscall is not patched out as a result it is possible to break user-space via this patching. The only safe thing for now is to remove the patching. This code was broken since v2.6.21. Reported-by: Arne Georg Gleditsch <arne.gleditsch@dolphinics.no> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: rename KERNEL_TEXT_SIZE => KERNEL_IMAGE_SIZEIngo Molnar2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The KERNEL_TEXT_SIZE constant was mis-named, as we not only map the kernel text but data, bss and init sections as well. That name led me on the wrong path with the KERNEL_TEXT_SIZE regression, because i knew how big of _text_ my images have and i knew about the 40 MB "text" limit so i wrongly thought to be on the safe side of the 40 MB limit with my 29 MB of text, while the total image size was slightly above 40 MB. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: fix spontaneous reboot with allyesconfig bzImageIngo Molnar2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | recently the 64-bit allyesconfig bzImage kernel started spontaneously rebooting during early bootup. after a few fun hours spent with early init debugging, it turns out that we've got this rather annoying limit on the size of the kernel image: #define KERNEL_TEXT_SIZE (40*1024*1024) which limit my vmlinux just happened to pass: text data bss dec hex filename 29703744 4222751 8646224 42572719 2899baf vmlinux 40 MB is 42572719 bytes, so my vmlinux was just 1.5% above this limit :-/ So it happily crashed right in head_64.S, which - as we all know - is the most debuggable code in the whole architecture ;-) So increase the limit to allow an up to 128MB kernel image to be mapped. (should anyone be that crazy or lazy) We have a full 4K of pagetable (level2_kernel_pgt) allocated for these mappings already, so there's no RAM overhead and the limit was rather pointless and arbitrary. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: remove double-checking empty zero pages debugYinghai Lu2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | so far no one complained about that. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: notsc is ignored on common configurationsPavel Machek2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | notsc is ignored in 32-bit kernels if CONFIG_X86_TSC is on.. which is bad, fix it. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86/mtrr: fix kernel-doc missing notationRandy Dunlap2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix mtrr kernel-doc warning: Warning(linux-2.6.24-git12//arch/x86/kernel/cpu/mtrr/main.c:677): No description found for parameter 'end_pfn' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: handle BIOSes which terminate e820 with CF=1 and no SMAPH. Peter Anvin2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The proper way to terminate the e820 chain is with %ebx == 0 on the last legitimate memory block. However, several BIOSes don't do that and instead return error (CF = 1) when trying to read off the end of the list. For this error return, %eax doesn't necessarily return the SMAP signature -- correctly so, since %ah should contain an error code in this case. To deal with some particularly broken BIOSes, we clear the entire e820 chain if the SMAP signature is missing in the middle, indicating a plain insane e820 implementation. However, we need to make the test for CF = 1 before the SMAP check. This fixes at least one HP laptop (nc6400) for which none of the memory-probing methods (e820, e801, 88) functioned fully according to spec. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERICH. Peter Anvin2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | P6_NOPs are definitely not supported on some VIA CPUs, and possibly (unverified) on AMD K7s. It is also the only thing that prevents a 686 kernel from running on Transmeta TM3x00/5x00 (Crusoe) series. The performance benefit over generic NOPs is very small, so when building for generic consumption, avoid using them. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | x86: require family >= 6 if we are using P6 NOPsH. Peter Anvin2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The P6 family of NOPs are only available on family >= 6 or above, so enforce that in the boot code. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | x86: do not promote TM3x00/TM5x00 to i686-classH. Peter Anvin2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have been promoting Transmeta TM3x00/TM5x00 chips to i686-class based on the notion that they contain all the user-space visible features of an i686-class chip. However, this is not actually true: they lack the EA-taking long NOPs (0F 1F /0). Since this is a userspace-visible incompatibility, downgrade these CPUs to the manufacturer-defined i586 level. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | x86: hpet fix docbook commentPavel Machek2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Pavel Machek <Pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | x86: make DEBUG_PAGEALLOC and CPA more robustIngo Molnar2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use PF_MEMALLOC to prevent recursive calls in the DBEUG_PAGEALLOC case. This makes the code simpler and more robust against allocation failures. This fixes the following fallback to non-mmconfig: http://lkml.org/lkml/2008/2/20/551 http://bugzilla.kernel.org/show_bug.cgi?id=10083 Also, for DEBUG_PAGEALLOC=n reduce the pool size to one page. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | x86/lguest: fix pgdir pmd index calculationAhmed S. Darwish2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hi all, Beginning from commits close to v2.6.25-rc2, running lguest always oopses the host kernel. Oops is at [1]. Bisection led to the following commit: commit 37cc8d7f963ba2deec29c9b68716944516a3244f x86/early_ioremap: don't assume we're using swapper_pg_dir At the early stages of boot, before the kernel pagetable has been fully initialized, a Xen kernel will still be running off the Xen-provided pagetables rather than swapper_pg_dir[]. Therefore, readback cr3 to determine the base of the pagetable rather than assuming swapper_pg_dir[]. static inline pmd_t * __init early_ioremap_pmd(unsigned long addr) { - pgd_t *pgd = &swapper_pg_dir[pgd_index(addr)]; + /* Don't assume we're using swapper_pg_dir at this point */ + pgd_t *base = __va(read_cr3()); + pgd_t *pgd = &base[pgd_index(addr)]; pud_t *pud = pud_offset(pgd, addr); pmd_t *pmd = pmd_offset(pud, addr); Trying to analyze the problem, it seems on the guest side of lguest, %cr3 has a different value from &swapper_pg-dir (which is AFAIK fine on a pravirt guest): Putting some debugging messages in early_ioremap_pmd: /* Appears 3 times */ [ 0.000000] *************************** [ 0.000000] __va(%cr3) = c0000000, &swapper_pg_dir = c02cc000 [ 0.000000] *************************** After 8 hours of debugging and staring on lguest code, I noticed something strange in paravirt_ops->set_pmd hypercall invocation: static void lguest_set_pmd(pmd_t *pmdp, pmd_t pmdval) { *pmdp = pmdval; lazy_hcall(LHCALL_SET_PMD, __pa(pmdp)&PAGE_MASK, (__pa(pmdp)&(PAGE_SIZE-1))/4, 0); } The first hcall parameter is global pgdir which looks fine. The second parameter is the pmd index in the pgdir which is suspectful. AFAIK, calculating the index of pmd does not need a divisoin over four. Removing the division made lguest work fine again . Patch is at [2]. I am not sure why the division over four existed in the first place. It seems bogus, maybe the Xen patch just made the problem appear ? [2]: The patch: [PATCH] lguest: fix pgdir pmd index cacluation Remove an error in index calculation which leads to removing a not existing shadow page table (leading to a Null dereference). Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | lguest: fix build breakageTony Breeds2008-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ mingo@elte.hu: merged to Rusty's patch ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | lguest: include function prototypesHarvey Harrison2008-02-26
| | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added a declaration to asm-x86/lguest.h and moved the extern arrays there as well. As an alternative to including asm/lguest.h directly, an include could be put in linux/lguest.h Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: "rusty@rustcorp.com.au" <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * / / / sched: add declaration of sched_tail to sched.hHarvey Harrison2008-02-25
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoids sparse warnings: kernel/sched.c:2170:17: warning: symbol 'schedule_tail' was not declared. Should it be static? Avoids the need for an external declaration in arch/um/process.c Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | / ARM: OMAP: Release i2c_adapter after use (Siemens SX1)Jean Delvare2008-02-24
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | Each call to i2c_get_adapter() must be followed by a call to i2c_put_adapter() to release the grabbed reference. Otherwise the reference count grows forever and the adapter can never be unregistered. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Vladimir Ananiev <vovan888@gmail.com> Acked-by: Tony Lindgren <tony@atomide.com>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds2008-02-24
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: make IOMMU code respect the segment boundary limits [SPARC64]: Fix cpu trampoline et al. mismatch warnings. [SPARC64]: More sparse warning fixes in process.c [SPARC64]: Fix sparse warning wrt. fault_in_user_windows. [SPARC64]: Kill show_regs32(). [SPARC64]: Fix sparse warnings wrt. __show_regs(). [SPARC64]: Kill show_stackframe{,32}(). [SPARC64]: Fix sparse warnings wrt. machine_alt_power_off().
| | * | [SPARC64]: make IOMMU code respect the segment boundary limitsFUJITA Tomonori2008-02-21
| | | | | | | | | | | | | | | | | | | | Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | [SPARC64]: Fix cpu trampoline et al. mismatch warnings.Sam Ravnborg2008-02-21
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | [SPARC64]: More sparse warning fixes in process.cDavid S. Miller2008-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch/sparc64/kernel/process.c:504:17: warning: symbol 'sparc_do_fork' was not declared. Should it be static? arch/sparc64/kernel/process.c:655:5: warning: symbol 'dump_fpu' was not declared. Should it be static? arch/sparc64/kernel/process.c:708:16: warning: symbol 'sparc_execve' was not declared. Should it be static? Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | [SPARC64]: Kill show_regs32().David S. Miller2008-02-20
| | | | | | | | | | | | | | | | | | | | | | | | Unused, noticed via sparse. Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | [SPARC64]: Fix sparse warnings wrt. __show_regs().David S. Miller2008-02-19
| | | | | | | | | | | | | | | | | | | | | | | | arch/sparc64/kernel/process.c:219:6: warning: symbol '__show_regs' was not declared. Should it be static? Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | [SPARC64]: Kill show_stackframe{,32}().David S. Miller2008-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Noticed via sparse: arch/sparc64/kernel/process.c:215:6: warning: symbol 'show_stackframe' was not declared. Should it be static? arch/sparc64/kernel/process.c:243:6: warning: symbol 'show_stackframe32' was not declared. Should it be static? It is totally unused. Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | [SPARC64]: Fix sparse warnings wrt. machine_alt_power_off().David S. Miller2008-02-19
| | | | | | | | | | | | | | | | | | | | | | | | arch/sparc64/kernel/process.c:123:6: warning: symbol 'machine_alt_power_off' was not declared. Should it be static? Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'master' of ↵Paul Mackerras2008-03-03
|\ \ \ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jk/spufs into merge
| * | | | [POWERPC] spufs: fix use time accounting on SPE-overcommitAndre Detsch2008-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The spu_runcntl_RW register is restored within spu_restore function. So, at the end of spu_bind_context, the SPU context is not just loaded, but running. This change corrects the state switch to account the time as USER. Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
| * | | | [POWERPC] spufs: serialize SLB invalidation against SLB loadingArnd Bergmann2008-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a potential race between flushes of the entire SLB in the MFC and the point where new entries are being established. The problem is that we might put a ESID entry into the MFC SLB when the VSID entry has just been cleared by the global flush. This can be circumvented by holding the register_lock throughout both the flushing and the creation of SLB entries. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
| * | | | [POWERPC] spufs: invalidate SLB translation before adding a new entryArnd Bergmann2008-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we replace an SLB entry in the MFC after using up all the available entries, there is a short window in which an incorrect entry is marked as valid. The problem is that the 'valid' bit is stored in the ESID, which is always written after the VSID. Overwriting the VSID first will make the original ESID entry point to the new VSID, which means that any concurrent DMA accessing the old ESID ends up being redirected to the new virtual address. A few cycles later, we write the new ESID and everything is fine again. That race can be closed by writing a zero entry to the ESID first, which makes sure that the VSID is not accessed until we write the new ESID. Note that we don't actually need to invalidate the SLB entry using the invalidation register, which would also flush any ERAT entries for that segment, because the segment translation does not become invalid but is only removed from the SLB cache. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
| * | | | [POWERPC] spufs: synchronize IRQ when disablingArnd Bergmann2008-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a small race between the context save procedure and the SPU interrupt handling, where we expect all interrupt processing to have finished after disabling them, while an interrupt is still being processed on another CPU. The obvious fix is to call synchronize_irq() after disabling the interrupts at the start of the context save procedure to make sure we never access the SPU any more during an ongoing save or even after that. Thanks to Benjamin Herrenschmidt for pointing this out. Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
| * | | | [POWERPC] spufs: fix order of sputrace thread IDsJeremy Kerr2008-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we get the following output from sputrace: [5.097935954] 1606: spufs_ps_nopfn__enter (thread = 1605, spu = -1) [5.097958164] 1606: spufs_ps_nopfn__insert (thread = 1605, spu = 15) [5.097973529] 1607: spufs_ps_nopfn__enter (thread = 1605, spu = -1) [5.097989174] 1607: spufs_ps_nopfn__insert (thread = 1605, spu = 14) Which leads me to believe that 160[67] is the current thread ID, and 1605 is the context backing the psmap. However, the 'current' and 'owner' tids are reversed - the 'current' tid is on the right. This change puts the current thread ID in the left-hand column instead, and renames the right to 'ctxthread'. Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
| * | | | [POWERPC] spufs: fix invalid scheduling of forgotten contextsJeremy Kerr2008-02-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At present, we have a situation where a context with no owner is re-scheduled by spu_forget: Thread 1: reading regs file Thread 2: context owner spu_forget() - ctx->owner = NULL - set SPU_SCHED_WAS_ACTIVE spu_acquire_saved() - context is in saved state spu_release_saved() - SPU_SCHED_WAS_ACTIVE is set, so spu_activate() the context, which now has no owner In spu_forget(), we shouldn't be requesting a re-schedule by setting SPU_SCHED_WAS_ACTIVE. This change removes the set_bit in spu_forget(), so that spu_release_saved() doesn't reinsert this destroyed context on to the run queue. Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
| * | | | [POWERPC] spufs: fix context destruction during psmap faultJeremy Kerr2008-02-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a small window where a spu context may be destroyed while we're servicing a page fault (from another thread) to the context's problem state mapping. After we up_read() the mmap_sem, it's possible that the context is destroyed by its owning thread, and so the later references to ctx are invalid. This can maifest as a deadlock on the (now free()-ed) context state mutex. This change adds a reference to the context before we release the mmap_sem, so that the context cannot be destroyed. Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
* | | | | Merge branch 'for-2.6.25' of ↵Paul Mackerras2008-03-03
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx into merge
| * | | | | [POWERPC] 4xx: Use correct board info structure in cuboot wrappersJosh Boyer2008-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct the remaining 44x cuboot wrappers to define TARGET_4xx as well. This creates the correct structure to use, including things like the second MAC address. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
| * | | | | [POWERPC] 44x: add missing define TARGET_4xx and TARGET_440GX to cuboot-taishanValentine Barshak2008-02-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to get the proper boad info (bd_info) structure defined in ppcboot.h both TARGET_4xx and TARGET_44x should be defined for all PowerPC 440 boards. The 440GX boards also need TARGET_440GX defined since they have 4 EMACs and there are 4 MAC addesses in bd_info passed by u-boot. Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
| * | | | | [POWERPC] 4xx: Fix L1 cache size in katmai DTSStefan Roese2008-02-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the katmai (440SPe) L1 cache size to 32k. Some whitespace issues are cleaned up too. Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>