litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	powerpc: Move common module code into its own file	Kumar Gala	2008-06-30
\| \| \| \| \| \| \| \|	Refactor common code between ppc32 and ppc64 module handling into a shared filed. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: hash_huge_page() should get the WIMG bits from the lpte	Dave Kleikamp	2008-06-30
\| \| \| \| \| \| \| \|	Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Adam Litke <agl@us.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: Tell firmware we support architecture V2.06	Joel Schopp	2008-06-30
\| \| \| \| \| \| \| \|	Add the bits to the architecture-vec so that ibm,client-architecture lets the firmware know we support the 2.06 architecture. Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: Add cputable entry for Power7 architected mode	Joel Schopp	2008-06-30
\| \| \| \| \| \| \| \|	Add an entry for Power7 architected mode and add "(raw)" to Power7 raw mode to distinguish it more clearly. Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: Only demote individual slices rather than whole process	Paul Mackerras	2008-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At present, if we have a kernel with a 64kB page size, and some process maps something that has to be mapped with 4kB pages (such as a cache-inhibited mapping on POWER5+, or the eHCA infiniband queue-pair pages), we change the process to use 4kB pages everywhere. This hurts the performance of HPC programs that access eHCA from userspace. With this patch, the kernel will only demote the slice(s) containing the eHCA or cache-inhibited mappings, leaving the remaining slices able to use 64kB hardware pages. This also changes the slice_get_unmapped_area code so that it is willing to place a 64k-page mapping into (or across) a 4k-page slice if there is no better alternative, i.e. if the program specified MAP_FIXED or if there is not sufficient space available in slices that are either empty or already have 64k-page mappings in them. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Add cputable entry for POWER7	Michael Neuling	2008-06-30
\| \| \| \| \| \| \| \| \| \|	Add a cputable entry for the POWER7 processor. Also tell firmware that we know about POWER7. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: Get rid of bitfields in ppc_bat struct	Becky Bruce	2008-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While working on the 36-bit physical support, I noticed that there was exactly one line of code that actually referenced the bitfields. So I got rid of them and redefined ppc_bat as a struct of 2 u32's: batu and batl. I also got rid of the previous union that held the bitfield structs and a word representation of the batu/l values. This seems like a nicer solution than adding in a bunch of new bitfields to support extended bat addressing that would never get used, and just leaving the struct as-is would have been incomplete in the face of large physical addressing. Signed-off-by: Becky Bruce <becky.bruce@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: Change BAT code to use phys_addr_t	Becky Bruce	2008-06-30
\| \| \| \| \| \| \| \| \| \|	Currently, the physical address is an unsigned long, but it should be phys_addr_t in set_bat, [v/p]_mapped_by_bat. Also, create a macro that can convert a large physical address into the correct format for programming the BAT registers. Signed-off-by: Becky Bruce <becky.bruce@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: Increase CRASH_HANDLER_MAX	Arnd Bergmann	2008-06-30
\| \| \| \| \| \| \| \|	There are now two potential callers of machine_crash_shutdown, so increase the limit accordingly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc/cell: Disable ptcal in case of crash kdump	Arnd Bergmann	2008-06-30
\| \| \| \| \| \| \| \| \|	We need to disable ptcal before starting a new kernel after a crash, in order to avoid overwriting data in the kdump kernel. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc/pseries: Call pseries_kexec_setup only on pseries	Arnd Bergmann	2008-06-30
\| \| \| \| \| \| \| \| \|	The pseries_kexec_setup function overwrites some ppc_md pointers, so make sure it only gets called when running on the right architecture. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: Free a PTE bit on ppc64 with 64K pages	Benjamin Herrenschmidt	2008-06-30
\| \| \| \| \| \| \| \| \| \|	This frees a PTE bit when using 64K pages on ppc64. This is done by getting rid of the separate _PAGE_HASHPTE bit. Instead, we just test if any of the 16 sub-page bits is set. For non-combo pages (ie. real 64K pages), we set SUB0 and the location encoding in that field. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	spufs: Convert nopfn to fault	Nick Piggin	2008-06-30
\| \| \| \| \| \| \| \|	Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	powerpc: Get rid of CROSS32{AS,LD,OBJCOPY}	Segher Boessenkool	2008-06-30
\| \| \| \| \| \| \| \| \| \| \| \| \|	CROSS32AS and CROSS32LD are never used (instead, CROSS32CC is used with proper command line options). CROSS32OBJCOPY isn't used anymore either, since the "wrapper" stuff was added. Remove these unused variables. Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
*	Merge branch 'linux-2.6'	Paul Mackerras	2008-06-29
\|\
\| *	Blackfin arch: fix up section mismatch warning	Bryan Wu	2008-06-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-- WARNING: vmlinux.o(.text+0x721a): Section mismatch in reference from the function ___fill_code_cplbtab() to the function .init.text:_fill_cplbtab() The function ___fill_code_cplbtab() references the function __init _fill_cplbtab(). This is often because ___fill_code_cplbtab lacks a __init annotation or the annotation of _fill_cplbtab is wrong. WARNING: vmlinux.o(.text+0x7238): Section mismatch in reference from the function ___fill_code_cplbtab() to the function .init.text:_fill_cplbtab() The function ___fill_code_cplbtab() references the function __init _fill_cplbtab(). This is often because ___fill_code_cplbtab lacks a __init annotation or the annotation of _fill_cplbtab is wrong. WARNING: vmlinux.o(.text+0x7250): Section mismatch in reference from the function ___fill_code_cplbtab() to the function .init.text:_fill_cplbtab() The function ___fill_code_cplbtab() references the function __init _fill_cplbtab(). This is often because ___fill_code_cplbtab lacks a __init annotation or the annotation of _fill_cplbtab is wrong. WARNING: vmlinux.o(.text+0x7264): Section mismatch in reference from the function ___fill_code_cplbtab() to the function .init.text:_fill_cplbtab() The function ___fill_code_cplbtab() references the function __init _fill_cplbtab(). This is often because ___fill_code_cplbtab lacks a __init annotation or the annotation of _fill_cplbtab is wrong. WARNING: vmlinux.o(.text+0x72a2): Section mismatch in reference from the function ___fill_data_cplbtab() to the function .init.text:_fill_cplbtab() The function ___fill_data_cplbtab() references the function __init _fill_cplbtab(). This is often because ___fill_data_cplbtab lacks a __init annotation or the annotation of _fill_cplbtab is wrong. WARNING: vmlinux.o(.text+0x72bc): Section mismatch in reference from the function ___fill_data_cplbtab() to the function .init.text:_fill_cplbtab() The function ___fill_data_cplbtab() references the function __init _fill_cplbtab(). This is often because ___fill_data_cplbtab lacks a __init annotation or the annotation of _fill_cplbtab is wrong. WARNING: vmlinux.o(.text+0x72d4): Section mismatch in reference from the function ___fill_data_cplbtab() to the function .init.text:_fill_cplbtab() The function ___fill_data_cplbtab() references the function __init _fill_cplbtab(). This is often because ___fill_data_cplbtab lacks a __init annotation or the annotation of _fill_cplbtab is wrong. WARNING: vmlinux.o(.text+0x72e8): Section mismatch in reference from the function ___fill_data_cplbtab() to the function .init.text:_fill_cplbtab() The function ___fill_data_cplbtab() references the function __init _fill_cplbtab(). This is often because ___fill_data_cplbtab lacks a __init annotation or the annotation of _fill_cplbtab is wrong. -- Signed-off-by: Bryan Wu <cooloney@kernel.org>
\| *	Blackfin arch: fix bug - kernel boot fails when Spinlock and rw-lock ↵	Sonic Zhang	2008-06-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	debugging enabled Initialize the lock of bad_irq_desc properly. The content of irq_desc array is replaced by bad_irq_desc in blackfin arch irqchip init code. So, do it properly as common irq init code. Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
\| *	Merge branch 'release' of ↵	Linus Torvalds	2008-06-24
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Eliminate NULL test after alloc_bootmem in iosapic_alloc_rte() [IA64] Handle count==0 in sn2_ptc_proc_write() [IA64] Fix boot failure on ia64/sn2
\| \| *	[IA64] Eliminate NULL test after alloc_bootmem in iosapic_alloc_rte()	Julia Lawall	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As noted by Akinobu Mita alloc_bootmem and related functions never return NULL and always return a zeroed region of memory. Thus a NULL test or memset after calls to these functions is unnecessary. Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| \| *	[IA64] Handle count==0 in sn2_ptc_proc_write()	Cliff Wickman	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The fix applied in e0c6d97c65e0784aade7e97b9411f245a6c543e7 "security hole in sn2_ptc_proc_write" didn't take into account the case where count==0 (which results in a buffer underrun when adding the trailing '\0'). Thanks to Andi Kleen for pointing this out. Signed-off-by: Cliff Wickman <cpw@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| \| *	[IA64] Fix boot failure on ia64/sn2	Jes Sorensen	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Call check_sal_cache_flush() after platform_setup() as check_sal_cache_flush() now relies on being able to call platform vector code. Problem was introduced by: 3463a93def55c309f3c0d0a8aaf216be3be42d64 "Update check_sal_cache_flush to use platform_send_ipi()" Signed-off-by: Jes Sorensen <jes@sgi.com> Tested-by: Alex Chiang: <achiang@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| * \|	Merge branch 'kvm-updates-2.6.26' of ↵	Linus Torvalds	2008-06-24
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: KVM: Remove now unused structs from kvm_para.h x86: KVM guest: Use the paravirt clocksource structs and functions KVM: Make kvm host use the paravirt clocksource structs x86: Make xen use the paravirt clocksource structs and functions x86: Add structs and functions for paravirt clocksource KVM: VMX: Fix host msr corruption with preemption enabled KVM: ioapic: fix lost interrupt when changing a device's irq KVM: MMU: Fix oops on guest userspace access to guest pagetable KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend) KVM: MMU: Fix rmap_write_protect() hugepage iteration bug KVM: close timer injection race window in __vcpu_run KVM: Fix race between timer migration and vcpu migration
\| \| * \|	x86: KVM guest: Use the paravirt clocksource structs and functions	Gerd Hoffmann	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch updates the kvm host code to use the pvclock structs and functions, thereby making it compatible with Xen. The patch also fixes an initialization bug: on SMP systems the per-cpu has two different locations early at boot and after CPU bringup. kvmclock must take that in account when registering the physical address within the host. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	KVM: Make kvm host use the paravirt clocksource structs	Gerd Hoffmann	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch updates the kvm host code to use the pvclock structs. It also makes the paravirt clock compatible with Xen. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	x86: Make xen use the paravirt clocksource structs and functions	Gerd Hoffmann	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch updates the xen guest to use the pvclock structs and helper functions. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	x86: Add structs and functions for paravirt clocksource	Gerd Hoffmann	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds structs for the paravirt clocksource ABI used by both xen and kvm (pvclock-abi.h). It also adds some helper functions to read system time and wall clock time from a paravirtual clocksource (pvclock.[ch]). They are based on the xen code. They are enabled using CONFIG_PARAVIRT_CLOCK. Subsequent patches of this series will put the code in use. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	KVM: VMX: Fix host msr corruption with preemption enabled	Avi Kivity	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Switching msrs can occur either synchronously as a result of calls to the msr management functions (usually in response to the guest touching virtualized msrs), or asynchronously when preempting a kvm thread that has guest state loaded. If we're unlucky enough to have the two at the same time, host msrs are corrupted and the machine goes kaput on the next syscall. Most easily triggered by Windows Server 2008, as it does a lot of msr switching during bootup. Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	KVM: MMU: Fix oops on guest userspace access to guest pagetable	Avi Kivity	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	KVM has a heuristic to unshadow guest pagetables when userspace accesses them, on the assumption that most guests do not allow userspace to access pagetables directly. Unfortunately, in addition to unshadowing the pagetables, it also oopses. This never triggers on ordinary guests since sane OSes will clear the pagetables before assigning them to userspace, which will trigger the flood heuristic, unshadowing the pagetables before the first userspace access. One particular guest, though (Xenner) will run the kernel in userspace, triggering the oops. Since the heuristic is incorrect in this case, we can simply remove it. Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend)	Marcelo Tosatti	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	kvm_mmu_pte_write() does not handle 32-bit non-PAE large page backed guests properly. It will instantiate two 2MB sptes pointing to the same physical 2MB page when a guest large pte update is trapped. Instead of duplicating code to handle this, disallow directory level updates to happen through kvm_mmu_pte_write(), so the two 2MB sptes emulating one guest 4MB pte can be correctly created by the page fault handling path. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	KVM: MMU: Fix rmap_write_protect() hugepage iteration bug	Marcelo Tosatti	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rmap_next() does not work correctly after rmap_remove(), as it expects the rmap chains not to change during iteration. Fix (for now) by restarting iteration from the beginning. Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	KVM: close timer injection race window in __vcpu_run	Marcelo Tosatti	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a timer fires after kvm_inject_pending_timer_irqs() but before local_irq_disable() the code will enter guest mode and only inject such timer interrupt the next time an unrelated event causes an exit. It would be simpler if the timer->pending irq conversion could be done with IRQ's disabled, so that the above problem cannot happen. For now introduce a new vcpu requests bit to cancel guest entry. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
\| \| * \|	KVM: Fix race between timer migration and vcpu migration	Marcelo Tosatti	2008-06-24
\| \| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A guest vcpu instance can be scheduled to a different physical CPU between the test for KVM_REQ_MIGRATE_TIMER and local_irq_disable(). If that happens, the timer will only be migrated to the current pCPU on the next exit, meaning that guest LAPIC timer event can be delayed until a host interrupt is triggered. Fix it by cancelling guest entry if any vcpu request is pending. This has the side effect of nicely consolidating vcpu->requests checks. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
\| * \|	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds	2008-06-24
\| \|\ \ \| \| \|/ \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: xen: remove support for non-PAE 32-bit
\| \| *	xen: remove support for non-PAE 32-bit	Jeremy Fitzhardinge	2008-06-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Non-PAE operation has been deprecated in Xen for a while, and is rarely tested or used. xen-unstable has now officially dropped non-PAE support. Since Xen/pvops' non-PAE support has also been broken for a while, we may as well completely drop it altogether. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| * \|	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds	2008-06-23
\| \|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: xen: don't drop NX bit xen: mask unwanted pte bits in __supported_pte_mask xen: Use wmb instead of rmb in xen_evtchn_do_upcall(). x86: fix NULL pointer deref in __switch_to
\| \| *	xen: don't drop NX bit	Jeremy Fitzhardinge	2008-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because NX is now enforced properly, we must put the hypercall page into the .text segment so that it is executable. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| \| *	xen: mask unwanted pte bits in __supported_pte_mask	Jeremy Fitzhardinge	2008-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Stable: this isn't a bugfix in itself, but it's a pre-requiste for "xen: don't drop NX bit" ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: the arch/x86 maintainers <x86@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| * \|	Merge branch 'release' of ↵	Linus Torvalds	2008-06-20
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] SN2: security hole in sn2_ptc_proc_write
\| \| * \|	[IA64] SN2: security hole in sn2_ptc_proc_write	Cliff Wickman	2008-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Security hole in sn2_ptc_proc_write It is possible to overrun a buffer with a write to this /proc file. Signed-off-by: Cliff Wickman <cpw@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
\| * \| \|	alpha: resurrect Cypress IDE quirk	Ivan Kokshaysky	2008-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Which was removed in the hope that generic legacy IDE quirk in drivers/pci/probe.c is sufficient for Cypress IDE. It isn't, as this controller has non-standard BAR layout: secondary channel registers are in the BAR0-1 of the second PCI function - not in the BAR2-3 of the same function, as the generic quirk routine assumes. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| * \| \|	alpha: fix compile failures with gcc-4.3 (bug #10438)	Ivan Kokshaysky	2008-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Vast majority of these build failures are gcc-4.3 warnings about static functions and objects being referenced from non-static (read: "extern inline") functions, in conjunction with our -Werror. We cannot just convert "extern inline" to "static inline", as people keep suggesting all the time, because "extern inline" logic is crucial for generic kernel build. So - just make sure that all callees of critical "extern inline" functions are also "extern inline"; - use "static inline", wherever it's possible. traps.c: work around gcc-4.3 being too smart about array bounds-checking. TODO: add "gnu_inline" attribute to all our "extern inline" functions to ensure desired behaviour with future compilers. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| * \| \|	alpha: link failure fix	Ivan Kokshaysky	2008-06-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With built-in scsi disk driver, the final link fails with a following error: `.exit.text' referenced in section `.rodata' of drivers/built-in.o: defined in discarded section `.exit.text' of drivers/built-in.o This happens with -Os (CONFIG_CC_OPTIMIZE_FOR_SIZE=y) with all gcc-4 versions, and also with -O2 and gcc-4.3. The problem is in sd.c:sd_major() being inlined into __exit function exit_sd(), and the compiler generating a jump table in .rodata section for the 'switch' statement in sd_major(). So we have references to discarded section. Fixed with a big hammer in the form of -fno-jump-tables. Note that jump tables vs. discarded sections is a generic problem, other architectures are just lucky not to suffer from it. But with a slightly more complex switch/case statement it can be reproduced on x86 as well. So maybe at some point we should consider -fno-jump-tables as a generic compile option... Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| * \| \|	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds	2008-06-20
\| \|\ \ \ \| \| \| \|/ \| \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, geode: add a VSA2 ID for General Software x86: use BOOTMEM_EXCLUSIVE on 32-bit x86, 32-bit: fix boot failure on TSC-less processors x86: fix NULL pointer deref in __switch_to x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits.
\| \| * \|	x86, geode: add a VSA2 ID for General Software	Jordan Crouse	2008-06-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	General Software writes their own VSA2 module for their version of the Geode BIOS, which returns a different ID then the standard VSA2. This was causing the framebuffer driver to break for most GSW boards. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Cc: tglx@linutronix.de Cc: linux-geode@lists.infradead.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| \| * \|	x86: use BOOTMEM_EXCLUSIVE on 32-bit	Bernhard Walle	2008-06-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch uses the BOOTMEM_EXCLUSIVE for crashkernel reservation also for i386 and prints a error message on failure. The patch is still for 2.6.26 since it is only bug fixing. The unification of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org>
\| \| * \|	x86, 32-bit: fix boot failure on TSC-less processors	Mikael Pettersson	2008-06-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6" (invalid opcode) and a kernel halt immediately after the kernel has been uncompressed. The BUG shows EIP pointing to an rdtsc instruction in native_read_tsc(), invoked from native_sched_clock(). (This error occurs so early that not even the serial console can capture it.) A bisection showed that this bug first occurs in 2.6.26-rc3-git7, via commit 9ccc906c97e34fd91dc6aaf5b69b52d824386910: >x86: distangle user disabled TSC from unstable > >tsc_enabled is set to 0 from the command line switch "notsc" and from >the mark_tsc_unstable code. Seperate those functionalities and replace >tsc_enable with tsc_disable. This makes also the native_sched_clock() >decision when to use TSC understandable. > >Preparatory patch to solve the sched_clock() issue on 32 bit. > >Signed-off-by: Thomas Gleixner <tglx@linutronix.de> The core reason for this bug is that native_sched_clock() gets called before tsc_init(). Before the commit above, tsc_32.c used a "tsc_enabled" variable which defaulted to 0 == disabled, and which only got enabled late in tsc_init(). Thus early calls to native_sched_clock() would skip the TSC and use jiffies instead. After the commit above, tsc_32.c uses a "tsc_disabled" variable which defaults to 0, meaning that the TSC is Ok to use. Early calls to native_sched_clock() now erroneously try to use the TSC on !cpu_has_tsc processors, leading to invalid opcode exceptions. My proposed fix is to initialise tsc_disabled to a "soft disabled" state distinct from the hard disabled state set up by the "notsc" kernel option. This fixes the native_sched_clock() problem. It also allows tsc_init() to be simplified: instead of setting tsc_disabled = 1 on every error return, we just set tsc_disabled = 0 once when all checks have succeeded. I've verified that this lets my 486 boot again. I've also verified that a Core2 machine still uses the TSC as clocksource after the patch. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| \| * \|	x86: fix NULL pointer deref in __switch_to	Suresh Siddha	2008-06-19
\| \| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patrick McHardy reported a crash: > > I get this oops once a day, its apparently triggered by something > > run by cron, but the process is a different one each time. > > > > Kernel is -git from yesterday shortly before the -rc6 release > > (last commit is the usb-2.6 merge, the x86 patches are missing), > > .config is attached. > > > > I'll retry with current -git, but the patches that have gone in > > since I last updated don't look related. > > > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at > > 000001ff > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118 > > [62060.043009] pde = 00000000 > > [62060.043009] Oops: 0002 [#1] PREEMPT Vegard Nossum analyzed it: > This decodes to > > 0: 0f ae 00 fxsave (%eax) > > so it's related to the floating-point context. This is the exact > location of the crash: > > $ addr2line -e arch/x86/kernel/process_32.o -i ab0 > include/asm/i387.h:232 > include/asm/i387.h:262 > arch/x86/kernel/process_32.c:595 > > ...so it looks like prev_task->thread.xstate->fxsave has become NULL. > Or maybe it never had any other value. Somehow (as described below) TS_USEDFPU is set but the fpu is not allocated or freed. Another possible FPU pre-emption issue with the sleazy FPU optimization which was benign before but not so anymore, with the dynamic FPU allocation patch. New task is getting exec'd and it is prempted at the below point. flush_thread() { ... / * Forget coprocessor state.. */ clear_fpu(tsk); <----- Preemption point clear_used_math(); ... } Now when it context switches in again, as the used_math() is still set and fpu_counter can be > 5, we will do a math_state_restore() which sets the task's TS_USEDFPU. After it continues from the above preemption point it does clear_used_math() and much later free_thread_xstate(). Now, at the next context switch, it is quite possible that xstate is null, used_math() is not set and TS_USEDFPU is still set. This will trigger unlazy_fpu() causing kernel oops. Fix this by clearing tsk's fpu_counter before clearing task's fpu. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| * /	Reinstate ZERO_PAGE optimization in 'get_user_pages()' and fix XIP	Linus Torvalds	2008-06-20
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	KAMEZAWA Hiroyuki and Oleg Nesterov point out that since the commit 557ed1fa2620dc119adb86b34c614e152a629a80 ("remove ZERO_PAGE") removed the ZERO_PAGE from the VM mappings, any users of get_user_pages() will generally now populate the VM with real empty pages needlessly. We used to get the ZERO_PAGE when we did the "handle_mm_fault()", but since fault handling no longer uses ZERO_PAGE for new anonymous pages, we now need to handle that special case in follow_page() instead. In particular, the removal of ZERO_PAGE effectively removed the core file writing optimization where we would skip writing pages that had not been populated at all, and increased memory pressure a lot by allocating all those useless newly zeroed pages. This reinstates the optimization by making the unmapped PTE case the same as for a non-existent page table, which already did this correctly. While at it, this also fixes the XIP case for follow_page(), where the caller could not differentiate between the case of a page that simply could not be used (because it had no "struct page" associated with it) and a page that just wasn't mapped. We do that by simply returning an error pointer for pages that could not be turned into a "struct page *". The error is arbitrarily picked to be EFAULT, since that was what get_user_pages() already used for the equivalent IO-mapped page case. [ Also removed an impossible test for pte_offset_map_lock() failing: that's not how that function works ] Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Nick Piggin <npiggin@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| *	[POWERPC] Clear sub-page HPTE present bits when demoting page size	Paul Mackerras	2008-06-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we demote a slice from 64k to 4k, and we are about to insert an HPTE for a 4k subpage and we notice that there is an existing 64k HPTE, we first invalidate that HPTE before inserting the new 4k subpage HPTE. Since the bits that encode which hash bucket the old HPTE was in overlap with the bits that encode which of the 16 subpages have HPTEs, we need to clear out the subpage HPTE-present bits before starting to insert HPTEs for the 4k subpages. If we don't do that, we can erroneously think that a subpage already has an HPTE when it doesn't. That in itself wouldn't be such a problem except that when we go to update the HPTE that we think is present on machines with a hypervisor, the hypervisor can tell us that the HPTE we think is there is actually there even though it isn't, which can lead to a process getting stuck in a loop, continually faulting. The reason for the confusion is that the AVPN (abbreviated virtual page number) we are looking for in the HPTE for a 4k subpage can actually match the AVPN in a stale HPTE for another 64k page. For example, the HPTE for the 4k subpage at 0x84000f000 will be in the same hash bucket and have the same AVPN as the HPTE for the 64k page at 0x8400f0000. This fixes the code to clear out the subpage HPTE-present bits. Signed-off-by: Paul Mackerras <paulus@samba.org>
\| *	[POWERPC] 4xx: Clear new TLB cache attribute bits in Data Storage vector	Josh Boyer	2008-06-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A recent commit added support for the new 440x6 and 464 cores that have the added WL1, IL1I, IL1D, IL2I, and ILD2 bits for the caching attributes in the TLBs. The new bits were cleared in the finish_tlb_load function, however a similar bit of code was missed in the DataStorage interrupt vector. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>