aboutsummaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAge
...
| * powerpc: Add "memory" attribute for mfmsr()Tiejun Chen2012-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1034988 commit b416c9a10baae6a177b4f9ee858b8d309542fbef upstream. Add "memory" attribute in inline assembly language as a compiler barrier to make sure 4.6.x GCC don't reorder mfmsr(). Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * powerpc/ftrace: Fix assembly trampoline register usageroger blofeld2012-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1034988 commit fd5a42980e1cf327b7240adf5e7b51ea41c23437 upstream. Just like the module loader, ftrace needs to be updated to use r12 instead of r11 with newer gcc's. Signed-off-by: Roger Blofeld <blofeldus@yahoo.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * MIPS: Properly align the .data..init_task section.David Daney2012-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1031926 commit 7b1c0d26a8e272787f0f9fcc5f3e8531df3b3409 upstream. Improper alignment can lead to unbootable systems and/or random crashes. [ralf@linux-mips.org: This is a lond standing bug since 6eb10bc9e2deab06630261cd05c4cb1e9a60e980 (kernel.org) rsp. c422a10917f75fd19fa7fe070aaaa23e384dae6f (lmo) [MIPS: Clean up linker script using new linker script macros.] so dates back to 2.6.32.] Signed-off-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/3881/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ARM: SAMSUNG: fix race in s3c_adc_start for ADCTodd Poynor2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1026850 commit 8265981bb439f3ecc5356fb877a6c2a6636ac88a upstream. Checking for adc->ts_pend already claimed should be done with the lock held. Signed-off-by: Todd Poynor <toddpoynor@google.com> Acked-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * x86, mce, therm_throt: Don't report power limit and package level thermal ↵Fenghua Yu2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | throttle events in mcelog BugLink: http://bugs.launchpad.net/bugs/930288 Thermal throttle and power limit events are not defined as MCE errors in x86 architecture and should not generate MCE errors in mcelog. Current kernel generates fake software defined MCE errors for these events. This may confuse users because they may think the machine has real MCE errors while actually only thermal throttle or power limit events happen. To make it worse, buggy firmware on some platforms may falsely generate the events. Therefore, kernel reports MCE errors which users think as real hardware errors. Although the firmware bugs should be fixed, on the other hand, kernel should not report MCE errors either. So mcelog is not a good mechanism to report these events. To report the events, we count them in respective counters (core_power_limit_count, package_power_limit_count, core_throttle_count, and package_throttle_count) in /sys/devices/system/cpu/cpu#/thermal_throttle/. Users can check the counters for each event on each CPU. Please note that all CPU's on one package report duplicate counters. It's user application's responsibity to retrieve a package level counter for one package. This patch doesn't report package level power limit, core level power limit, and package level thermal throttle events in mcelog. When the events happen, only report them in respective counters in sysfs. Since core level thermal throttle has been legacy code in kernel for a while and users accepted it as MCE error in mcelog, core level thermal throttle is still reported in mcelog. In the mean time, the event is counted in a counter in sysfs as well. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Borislav Petkov <bp@amd64.org> Acked-by: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/20111215001945.GA21009@linux-os.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> (cherry picked from commit 29e9bf1841e4f9df13b4992a716fece7087dd237) Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * ACPI: Remove one board specific WARN when ignoring timer overridingFeng Tang2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1025406 commit 7f68b4c2e158019c2ec494b5cfbd9c83b4e5b253 upstream. Current WARN msg is only for the ati_ixp4x0 board, while this function is used by mulitple platforms. So this one board specific warning is not appropriate any more. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ACPI: Make acpi_skip_timer_override cover all source_irq==0 casesFeng Tang2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1025406 commit ae10ccdc3093486f8c2369d227583f9d79f628e5 upstream. Currently when acpi_skip_timer_override is set, it only cover the (source_irq == 0 && global_irq == 2) cases. While there is also platform which need use this option and its global_irq is not 2. This patch will extend acpi_skip_timer_override to cover all timer overriding cases as long as the source irq is 0. This is the first part of a fix to kernel bug bugzilla 40002: "IRQ 0 assigned to VGA" https://bugzilla.kernel.org/show_bug.cgi?id=40002 Reported-and-tested-by: Szymon Kowalczyk <fazerxlo@o2.pl> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * x86, cpufeature: Rename X86_FEATURE_DTS to X86_FEATURE_DTHERMH. Peter Anvin2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1025406 commit 4ad33411308596f2f918603509729922a1ec4411 upstream. It makes sense to label "Digital Thermal Sensor" as "DTS", but unfortunately the string "dts" was already used for "Debug Store", and /proc/cpuinfo is a user space ABI. Therefore, rename this to "dtherm". This conflict went into mainline via the hwmon tree without any x86 maintainer ack, and without any kind of hint in the subject. a4659053 x86/hwmon: fix initialization of coretemp Reported-by: Jean Delvare <khali@linux-fr.org> Link: http://lkml.kernel.org/r/4FE34BCB.5050305@linux.intel.com Cc: Jan Beulich <JBeulich@suse.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> [bwh: Backported to 3.2: drop the coretemp device table change] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ACPI, x86: fix Dell M6600 ACPI reboot regression via DMIZhang Rui2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1025406 commit 76eb9a30db4bc8fd172f9155247264b5f2686d7b upstream. Dell Precision M6600 is known to require PCI reboot, so add it to the reboot blacklist in pci_reboot_dmi_table[]. https://bugzilla.kernel.org/show_bug.cgi?id=42749 cc: x86@kernel.org Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ACPI: Add a quirk for "AMILO PRO V2030" to ignore the timer overridingFeng Tang2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1025406 commit f6b54f083cc66cf9b11d2120d8df3c2ad4e0836d upstream. This is the 2nd part of fix for kernel bugzilla 40002: "IRQ 0 assigned to VGA" https://bugzilla.kernel.org/show_bug.cgi?id=40002 The root cause is the buggy FW, whose ACPI tables assign the GSI 16 to 2 irqs 0 and 16(VGA), and the VGA is the right owner of GSI 16. So add a quirk to ignore the irq0 overriding GSI 16 for the FUJITSU SIEMENS AMILO PRO V2030 platform will solve this issue. Reported-and-tested-by: Szymon Kowalczyk <fazerxlo@o2.pl> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ARM: fix rcu stalls on SMP platformsRussell King2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1025406 commit 7deabca0acfe02b8e18f59a4c95676012f49a304 upstream. We can stall RCU processing on SMP platforms if a CPU sits in its idle loop for a long time. This happens because we don't call irq_enter() and irq_exit() around generic_smp_call_function_interrupt() and friends. Add the necessary calls, and remove the one from within ipi_timer(), so that they're all in a common place. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> [add irq_enter()/irq_exit() in do_local_timer] Signed-off-by: UCHINO Satoshi <satoshi.uchino@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * powerpc/xmon: Use cpumask iterator to avoid warningAnton Blanchard2012-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1025406 commit bc1d7702910c7c7e88eb60b58429dbfe293683ce upstream. We have a bug report where the kernel hits a warning in the cpumask code: WARNING: at include/linux/cpumask.h:107 Which is: WARN_ON_ONCE(cpu >= nr_cpumask_bits); The backtrace is: cpu_cmd cmds xmon_core xmon die xmon is iterating through 0 to NR_CPUS. I'm not sure why we are still open coding this but iterating above nr_cpu_ids is definitely a bug. This patch iterates through all possible cpus, in case we issue a system reset and CPUs in an offline state call in. Perhaps the old code was trying to handle CPUs that were in the partition but were never started (eg kexec into a kernel with an nr_cpus= boot option). They are going to die way before we get into xmon since we haven't set any kernel state up for them. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * thp: avoid atomic64_read in pmd_read_atomic for 32bit PAEAndrea Arcangeli2012-07-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1026690 commit e4eed03fd06578571c01d4f1478c874bb432c815 upstream. In the x86 32bit PAE CONFIG_TRANSPARENT_HUGEPAGE=y case while holding the mmap_sem for reading, cmpxchg8b cannot be used to read pmd contents under Xen. So instead of dealing only with "consistent" pmdvals in pmd_none_or_trans_huge_or_clear_bad() (which would be conceptually simpler) we let pmd_none_or_trans_huge_or_clear_bad() deal with pmdvals where the low 32bit and high 32bit could be inconsistent (to avoid having to use cmpxchg8b). The only guarantee we get from pmd_read_atomic is that if the low part of the pmd was found null, the high part will be null too (so the pmd will be considered unstable). And if the low part of the pmd is found "stable" later, then it means the whole pmd was read atomically (because after a pmd is stable, neither MADV_DONTNEED nor page faults can alter it anymore, and we read the high part after the low part). In the 32bit PAE x86 case, it is enough to read the low part of the pmdval atomically to declare the pmd as "stable" and that's true for THP and no THP, furthermore in the THP case we also have a barrier() that will prevent any inconsistent pmdvals to be cached by a later re-read of the *pmd. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Jonathan Nieder <jrnieder@gmail.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Tested-by: Andrew Jones <drjones@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit e4eed03fd06578571c01d4f1478c874bb432c815) Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
| * xen/setup: filter APERFMPERF cpuid feature outAndre Przywara2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1016720 commit 5e626254206a709c6e937f3dda69bf26c7344f6f upstream. Xen PV kernels allow access to the APERF/MPERF registers to read the effective frequency. Access to the MSRs is however redirected to the currently scheduled physical CPU, making consecutive read and compares unreliable. In addition each rdmsr traps into the hypervisor. So to avoid bogus readouts and expensive traps, disable the kernel internal feature flag for APERF/MPERF if running under Xen. This will a) remove the aperfmperf flag from /proc/cpuinfo b) not mislead the power scheduler (arch/x86/kernel/cpu/sched.c) to use the feature to improve scheduling (by default disabled) c) not mislead the cpufreq driver to use the MSRs This does not cover userland programs which access the MSRs via the device file interface, but this will be addressed separately. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ARM i.MX imx21ads: Fix overlapping static i/o mappingsJaccon Bastiaansen2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1016720 commit 350ab15bb2ffe7103bc6bf6c634f3c5b286eaf2a upstream. The statically defined I/O memory regions for the i.MX21 on chip peripherals and the on board I/O peripherals of the i.MX21ADS board overlap. This results in a kernel crash during startup. This is fixed by reducing the memory range for the on board I/O peripherals to the actually required range. Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * x86, MCE, AMD: Make APIC LVT thresholding interrupt optionalBorislav Petkov2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1014712 commit f227d4306cf30e1d5b6f231e8ef9006c34f3d186 upstream. Currently, the APIC LVT interrupt for error thresholding is implicitly enabled. However, there are models in the F15h range which do not enable it. Make the code machinery which sets up the APIC interrupt support an optional setting and add an ->interrupt_capable member to the bank representation mirroring that capability and enable the interrupt offset programming only if it is true. Simplify code and fixup comment style while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * crypto: aesni-intel - fix unaligned cbc decrypt for x86-32Mathias Krause2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1014712 commit 7c8d51848a88aafdb68f42b6b650c83485ea2f84 upstream. The 32 bit variant of cbc(aes) decrypt is using instructions requiring 128 bit aligned memory locations but fails to ensure this constraint in the code. Fix this by loading the data into intermediate registers with load unaligned instructions. This fixes reported general protection faults related to aesni. References: https://bugzilla.kernel.org/show_bug.cgi?id=43223 Reported-by: Daniel <garkein@mailueberfall.de> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * powerpc: Fix kernel panic during kernel module loadSteffen Rumler2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1014712 commit 3c75296562f43e6fbc6cddd3de948a7b3e4e9bcf upstream. This fixes a problem which can causes kernel oopses while loading a kernel module. According to the PowerPC EABI specification, GPR r11 is assigned the dedicated function to point to the previous stack frame. In the powerpc-specific kernel module loader, do_plt_call() (in arch/powerpc/kernel/module_32.c), GPR r11 is also used to generate trampoline code. This combination crashes the kernel, in the case where the compiler chooses to use a helper function for saving GPRs on entry, and the module loader has placed the .init.text section far away from the .text section, meaning that it has to generate a trampoline for functions in the .init.text section to call the GPR save helper. Because the trampoline trashes r11, references to the stack frame using r11 can cause an oops. The fix just uses GPR r12 instead of GPR r11 for generating the trampoline code. According to the statements from Freescale, this is safe from an EABI perspective. I've tested the fix for kernel 2.6.33 on MPC8541. Signed-off-by: Steffen Rumler <steffen.rumler.ext@nsn.com> [paulus@samba.org: reworded the description] Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * PARISC: fix TLB fault path on PA2.0 narrow systemsJames Bottomley2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1013748 commit 2f649c1f6f0fef445ce79a19b79e5ce8fe9d7f19 upstream. commit 5e185581d7c46ddd33cd9c01106d1fc86efb9376 Author: James Bottomley <JBottomley@Parallels.com> [PARISC] fix PA1.1 oops on boot Didn't quite fix the crash on boot. It moved it from PA1.1 processors to PA2.0 narrow kernels. The final fix is to make sure the [id]tlb_miss_20 paths also work. Even on narrow systems, these paths require using the wide instructions becuase the tlb insertion format is wide. Fix this by conditioning the dep[wd],z on whether we're being called from _11 or _20[w] paths. Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * PARISC: fix boot failure on 32-bit systems caused by branch stubs placed ↵John David Anglin2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | before .text BugLink: http://bugs.launchpad.net/bugs/1013748 commit ed5fb2471b7060767957fb964eb1aaec71533ab1 upstream. In certain configurations, the resulting kernel becomes too large to boot because the linker places the long branch stubs for the merged .text section at the very start of the image. As a result, the initial transfer of control jumps to an unexpected location. Fix this by placing the head text in a separate section so the stubs for .text are not at the start of the image. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race conditionAndrea Arcangeli2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1013748 commit 26c191788f18129af0eb32a358cdaea0c7479626 upstream. When holding the mmap_sem for reading, pmd_offset_map_lock should only run on a pmd_t that has been read atomically from the pmdp pointer, otherwise we may read only half of it leading to this crash. PID: 11679 TASK: f06e8000 CPU: 3 COMMAND: "do_race_2_panic" #0 [f06a9dd8] crash_kexec at c049b5ec #1 [f06a9e2c] oops_end at c083d1c2 #2 [f06a9e40] no_context at c0433ded #3 [f06a9e64] bad_area_nosemaphore at c043401a #4 [f06a9e6c] __do_page_fault at c0434493 #5 [f06a9eec] do_page_fault at c083eb45 #6 [f06a9f04] error_code (via page_fault) at c083c5d5 EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP: 00000000 DS: 007b ESI: 9e201000 ES: 007b EDI: 01fb4700 GS: 00e0 CS: 0060 EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246 #7 [f06a9f38] _spin_lock at c083bc14 #8 [f06a9f44] sys_mincore at c0507b7d #9 [f06a9fb0] system_call at c083becd start len EAX: ffffffda EBX: 9e200000 ECX: 00001000 EDX: 6228537f DS: 007b ESI: 00000000 ES: 007b EDI: 003d0f00 SS: 007b ESP: 62285354 EBP: 62285388 GS: 0033 CS: 0073 EIP: 00291416 ERR: 000000da EFLAGS: 00000286 This should be a longstanding bug affecting x86 32bit PAE without THP. Only archs with 64bit large pmd_t and 32bit unsigned long should be affected. With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad() would partly hide the bug when the pmd transition from none to stable, by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is enabled a new set of problem arises by the fact could then transition freely in any of the none, pmd_trans_huge or pmd_trans_stable states. So making the barrier in pmd_none_or_trans_huge_or_clear_bad() unconditional isn't good idea and it would be a flakey solution. This should be fully fixed by introducing a pmd_read_atomic that reads the pmd in order with THP disabled, or by reading the pmd atomically with cmpxchg8b with THP enabled. Luckily this new race condition only triggers in the places that must already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix is localized there but this bug is not related to THP. NOTE: this can trigger on x86 32bit systems with PAE enabled with more than 4G of ram, otherwise the high part of the pmd will never risk to be truncated because it would be zero at all times, in turn so hiding the SMP race. This bug was discovered and fully debugged by Ulrich, quote: ---- [..] pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and eax. 496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd) 497 { 498 /* depend on compiler for an atomic pmd read */ 499 pmd_t pmdval = *pmd; // edi = pmd pointer 0xc0507a74 <sys_mincore+548>: mov 0x8(%esp),%edi ... // edx = PTE page table high address 0xc0507a84 <sys_mincore+564>: mov 0x4(%edi),%edx ... // eax = PTE page table low address 0xc0507a8e <sys_mincore+574>: mov (%edi),%eax [..] Please note that the PMD is not read atomically. These are two "mov" instructions where the high order bits of the PMD entry are fetched first. Hence, the above machine code is prone to the following race. - The PMD entry {high|low} is 0x0000000000000000. The "mov" at 0xc0507a84 loads 0x00000000 into edx. - A page fault (on another CPU) sneaks in between the two "mov" instructions and instantiates the PMD. - The PMD entry {high|low} is now 0x00000003fda38067. The "mov" at 0xc0507a8e loads 0xfda38067 into eax. ---- Reported-by: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * x86/amd: Re-enable CPU topology extensions in case BIOS has disabled itAndreas Herrmann2012-07-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1009087 BIOS will switch off the corresponding feature flag on family 15h models 10h-1fh non-desktop CPUs. The topology extension CPUID leafs are required to detect which cores belong to the same compute unit. (thread siblings mask is set accordingly and also correct information about L1i and L2 cache sharing depends on this). W/o this patch we wouldn't see which cores belong to the same compute unit and also cache sharing information for L1i and L2 would be incorrect on such systems. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit f7f286a910221ae18b21c18d9d0f4cd88965829f) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Herton Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
| * UBUNTU: SAUCE: fix get_gate_vma call in i386 NX emulation codeHerton Ronaldo Krzesinski2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 31db58b3 ("mm: arch: make get_gate_vma take an mm_struct instead of a task_struct"), that went in linux 2.6.39, get_gate_vma is expected to take an struct mm_struct pointer as its parameter. But get_gate_vma in i386 NX emulation code patch is still using the old way, fix it. BugLink: http://bugs.launchpad.net/bugs/1009200 Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Leann Ogasawara <leann.ogasawara@canonical.com> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ARM: 7409/1: Do not call flush_cache_user_range with mmap_sem heldDima Zavin2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 435a7ef52db7d86e67a009b36cac1457f8972391 upstream. We can't be holding the mmap_sem while calling flush_cache_user_range because the flush can fault. If we fault on a user address, the page fault handler will try to take mmap_sem again. Since both places acquire the read lock, most of the time it succeeds. However, if another thread tries to acquire the write lock on the mmap_sem (e.g. mmap) in between the call to flush_cache_user_range and the fault, the down_read in do_page_fault will deadlock. [will: removed drop of vma parameter as already queued by rmk (7365/1)] Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Dima Zavin <dima@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ARM: 7365/1: drop unused parameter from flush_cache_user_rangeDima Zavin2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 4542b6a0fa6b48d9ae6b41c1efeb618b7a221b2a upstream. vma isn't used and flush_cache_user_range isn't a standard macro that is used on several archs with the same prototype. In fact only unicore32 has a macro with the same name (with an identical implementation and no in-tree users). This is a part of a patch proposed by Dima Zavin (with Message-id: 1272439931-12795-1-git-send-email-dima@android.com) that didn't get accepted. Cc: Dima Zavin <dima@android.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * tile: fix bug where fls(0) was not returning 0Chris Metcalf2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 9f1d62bed7f015d11b9164078b7fea433b474114 upstream. This is because __builtin_clz(0) returns 64 for the "undefined" case of 0, since the builtin just does a right-shift 32 and "clz" instruction. So, use the alpha approach of casting to u32 and using __builtin_clzll(). Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * x86/mce: Fix check for processor context when machine check was taken.Tony Luck2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 875e26648cf9b6db9d8dc07b7959d7c61fb3f49c upstream. Linus pointed out that there was no value is checking whether m->ip was zero - because zero is a legimate value. If we have a reliable (or faked in the VM86 case) "m->cs" we can use it to tell whether we were in user mode or kernelwhen the machine check hit. Reported-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * um: Implement a custom pte_same() functionRichard Weinberger2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit f15b9000eb1d09bbaa4b0a6b2089d7e1f64e84b3 upstream. UML uses the _PAGE_NEWPAGE flag to mark pages which are not jet installed on the host side using mmap(). pte_same() has to ignore this flag, otherwise unuse_pte_range() is unable to unuse the page because two identical page tables entries with different _PAGE_NEWPAGE flags would not match and swapoff() would never return. Analyzed-by: Hugh Dickins <hughd@google.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * um: Fix __swp_type()Richard Weinberger2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 2b76ebaa728f8a3967c52aa189261c72fe56a6f1 upstream. The current __swp_type() function uses a too small bitshift. Using more than one swap files causes bad pages because the type bits clash with other page flags. Analyzed-by: Hugh Dickins <hughd@google.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * perf/x86: Update event scheduling constraints for AMD family 15h modelsRobert Richter2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 5bcdf5e4fee3c45e1281c25e4941f2163cb28c65 upstream. This update is for newer family 15h cpu models from 0x02 to 0x1f. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1337337642-1621-1-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compatDavid Howells2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 45de6767dc51358a188f75dc4ad9dfddb7fb9480 upstream. Use the 32-bit compat keyctl() syscall wrapper on Sparc64 for Sparc32 binary compatibility. Without this, keyctl(KEYCTL_INSTANTIATE_IOV) is liable to malfunction as it uses an iovec array read from userspace - though the kernel should survive this as it checks pointers and sizes anyway. I think all the other keyctl() function should just work, provided (a) the top 32-bits of each 64-bit argument register are cleared prior to invoking the syscall routine, and the 32-bit address space is right at the 0-end of the 64-bit address space. Most of the arguments are 32-bit anyway, and so for those clearing is not required. Signed-off-by: David Howells <dhowells@redhat.com cc: "David S. Miller" <davem@davemloft.net> cc: sparclinux@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * s390/pfault: fix task state raceHeiko Carstens2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit d5e50a51ccbda36b379aba9d1131a852eb908dda upstream. When setting the current task state to TASK_UNINTERRUPTIBLE this can race with a different cpu. The other cpu could set the task state after it inspected it (while it was still TASK_RUNNING) to TASK_RUNNING which would change the state from TASK_UNINTERRUPTIBLE to TASK_RUNNING again. This race was always present in the pfault interrupt code but didn't cause anything harmful before commit f2db2e6c "[S390] pfault: cpu hotplug vs missing completion interrupts" which relied on the fact that after setting the task state to TASK_UNINTERRUPTIBLE the task would really sleep. Since this is not necessarily the case the result may be a list corruption of the pfault_list or, as observed, a use-after-free bug while trying to access the task_struct of a task which terminated itself already. To fix this, we need to get a reference of the affected task when receiving the initial pfault interrupt and add special handling if we receive yet another initial pfault interrupt when the task is already enqueued in the pfault list. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * PARISC: fix panic on prefetch(NULL) on PA7300LCJames Bottomley2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit b3cb8674811d1851bbf1486a73d62b90c119b994 upstream. Due to an errata, the PA7300LC generates a TLB miss interruption even on the prefetch instruction. This means that prefetch(NULL), which is supposed to be a nop on linux actually generates a NULL deref fault. Fix this by testing the address of prefetch against NULL before doing the prefetch. Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * PARISC: fix crash in flush_icache_page_asm on PA1.1John David Anglin2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 207f583d7179f707f402c36a7bda5ca1fd03ad5b upstream. As pointed out by serveral people, PA1.1 only has a type 26 instruction meaning that the space register must be explicitly encoded. Not giving an explicit space means that the compiler uses the type 24 version which is PA2.0 only resulting in an illegal instruction crash. This regression was caused by commit f311847c2fcebd81912e2f0caf8a461dec28db41 Author: James Bottomley <James.Bottomley@HansenPartnership.com> Date: Wed Dec 22 10:22:11 2010 -0600 parisc: flush pages through tmpalias space Reported-by: Helge Deller <deller@gmx.de> Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * PARISC: fix PA1.1 oops on bootJames Bottomley2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit 5e185581d7c46ddd33cd9c01106d1fc86efb9376 upstream. All PA1.1 systems have been oopsing on boot since commit f311847c2fcebd81912e2f0caf8a461dec28db41 Author: James Bottomley <James.Bottomley@HansenPartnership.com> Date: Wed Dec 22 10:22:11 2010 -0600 parisc: flush pages through tmpalias space because a PA2.0 instruction was accidentally introduced into the PA1.1 TLB insertion interruption path when it was consolidated with the do_alias macro. Fix the do_alias macro only to use PA2.0 instructions if compiled for 64 bit. Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * tilegx: enable SYSCALL_WRAPPERS supportChris Metcalf2012-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1008697 commit e6d9668e119af44ae5bcd5f1197174531458afe3 upstream. Some discussion with the glibc mailing lists revealed that this was necessary for 64-bit platforms with MIPS-like sign-extension rules for 32-bit values. The original symptom was that passing (uid_t)-1 to setreuid() was failing in programs linked -pthread because of the "setxid" mechanism for passing setxid-type function arguments to the syscall code. SYSCALL_WRAPPERS handles ensuring that all syscall arguments end up with proper sign-extension and is thus the appropriate fix for this problem. On other platforms (s390, powerpc, sparc64, and mips) this was fixed in 2.6.28.6. The general issue is tracked as CVE-2009-0029. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
| * ia64: Add accept4() syscallÉmeric Maschino2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit 65cc21b4523e94d5640542a818748cd3be8cd6b4 upstream. While debugging udev > 170 failure on Debian Wheezy (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=648325), it appears that the issue was in fact due to missing accept4() in ia64. This patch simply adds accept4() to ia64. Signed-off-by: Émeric Maschino <emeric.maschino@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * arch/tile: apply commit 74fca9da0 to the compat signal handling as wellChris Metcalf2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit a134d228298c6aa9007205c6b81cae0cac0acb5d upstream. This passes siginfo and mcontext to tilegx32 signal handlers that don't have SA_SIGINFO set just as we have been doing for tilegx64. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * ARM: prevent VM_GROWSDOWN mmaps extending below FIRST_USER_ADDRESSRussell King2012-05-25
| | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit 9b61a4d1b2064dbd0c9e61754305ac852170509f upstream. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * sparc64: Do not clobber %g2 in xcall_fetch_glob_regs().David S. Miller2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 [ Upstream commit a5a737e090e25981e99d69f01400e3a80356581c ] %g2 is meant to hold the CPUID number throughout this routine, since at the very beginning, and at the very end, we use %g2 to calculate indexes into per-cpu arrays. However we erroneously clobber it in order to hold the %cwp register value mid-stream. Fix this code to use %g3 for the %cwp read and related calulcations instead. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * ARM: orion5x: Fix GPIO enable bits for MPP9Ben Hutchings2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit 48d99f47a81a66bdd61a348c7fe8df5a7afdf5f3 upstream. Commit 554cdaefd1cf7bb54b209c4e68c7cec87ce442a9 ('ARM: orion5x: Refactor mpp code to use common orion platform mpp.') seems to have accidentally inverted the GPIO valid bits for MPP9 (only). For the mv2120 platform which uses MPP9 as a GPIO LED device, this results in the error: [ 12.711476] leds-gpio: probe of leds-gpio failed with error -22 Reported-by: Henry von Tresckow <hvontres@gmail.com> References: http://bugs.debian.org/667446 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Tested-by: Hans Henry von Tresckow <hvontres@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * ARM: 7414/1: SMP: prevent use of the console when using idmap_pgdColin Cross2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit fde165b2a29673aabf18ceff14dea1f1cfb0daad upstream. Commit 4e8ee7de227e3ab9a72040b448ad728c5428a042 (ARM: SMP: use idmap_pgd for mapping MMU enable during secondary booting) switched secondary boot to use idmap_pgd, which is initialized during early_initcall, instead of a page table initialized during __cpu_up. This causes idmap_pgd to contain the static mappings but be missing all dynamic mappings. If a console is registered that creates a dynamic mapping, the printk in secondary_start_kernel will trigger a data abort on the missing mapping before the exception handlers have been initialized, leading to a hang. Initial boot is not affected because no consoles have been registered, and resume is usually not affected because the offending console is suspended. Onlining a cpu with hotplug triggers the problem. A workaround is to the printk in secondary_start_kernel until after the page tables have been switched back to init_mm. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * ARM: 7410/1: Add extra clobber registers for assembly in kernel_execveTim Bird2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit e787ec1376e862fcea1bfd523feb7c5fb43ecdb9 upstream. The inline assembly in kernel_execve() uses r8 and r9. Since this code sequence does not return, it usually doesn't matter if the register clobber list is accurate. However, I saw a case where a particular version of gcc used r8 as an intermediate for the value eventually passed to r9. Because r8 is used in the inline assembly, and not mentioned in the clobber list, r9 was set to an incorrect value. This resulted in a kernel panic on execution of the first user-space program in the system. r9 is used in ret_to_user as the thread_info pointer, and if it's wrong, bad things happen. Signed-off-by: Tim Bird <tim.bird@am.sony.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bitTejun Heo2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit d5e28005a1d2e67833852f4c9ea8ec206ea3ff85 upstream. With the embed percpu first chunk allocator, x86 uses either PAGE_SIZE or PMD_SIZE for atom_size. PMD_SIZE is used when CPU supports PSE so that percpu areas are aligned to PMD mappings and possibly allow using PMD mappings in vmalloc areas in the future. Using larger atom_size doesn't waste actual memory; however, it does require larger vmalloc space allocation later on for !first chunks. With reasonably sized vmalloc area, PMD_SIZE shouldn't be a problem but x86_32 at this point is anything but reasonable in terms of address space and using larger atom_size reportedly leads to frequent percpu allocation failures on certain setups. As there is no reason to not use PMD_SIZE on x86_64 as vmalloc space is aplenty and most x86_64 configurations support PSE, fix the issue by always using PMD_SIZE on x86_64 and PAGE_SIZE on x86_32. v2: drop cpu_has_pse test and make x86_64 always use PMD_SIZE and x86_32 PAGE_SIZE as suggested by hpa. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Yanmin Zhang <yanmin.zhang@intel.com> Reported-by: ShuoX Liu <shuox.liu@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <4F97BA98.6010001@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * xen/pci: don't use PCI BIOS service for configuration space accessesDavid Vrabel2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit 76a8df7b49168509df02461f83fab117a4a86e08 upstream. The accessing PCI configuration space with the PCI BIOS32 service does not work in PV guests. On systems without MMCONFIG or where the BIOS hasn't marked the MMCONFIG region as reserved in the e820 map, the BIOS service is probed (even though direct access is preferred) and this hangs. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> [v1: Fixed compile error when CONFIG_PCI is not set] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * xen/pte: Fix crashes when trying to see non-existent PGD/PMD/PUD/PTEsKonrad Rzeszutek Wilk2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/1002880 commit b7e5ffe5d83fa40d702976d77452004abbe35791 upstream. If I try to do "cat /sys/kernel/debug/kernel_page_tables" I end up with: BUG: unable to handle kernel paging request at ffffc7fffffff000 IP: [<ffffffff8106aa51>] ptdump_show+0x221/0x480 PGD 0 Oops: 0000 [#1] SMP CPU 0 .. snip.. RAX: 0000000000000000 RBX: ffffc00000000fff RCX: 0000000000000000 RDX: 0000800000000000 RSI: 0000000000000000 RDI: ffffc7fffffff000 which is due to the fact we are trying to access a PFN that is not accessible to us. The reason (at least in this case) was that PGD[256] is set to __HYPERVISOR_VIRT_START which was setup (by the hypervisor) to point to a read-only linear map of the MFN->PFN array. During our parsing we would get the MFN (a valid one), try to look it up in the MFN->PFN tree and find it invalid and return ~0 as PFN. Then pte_mfn_to_pfn would happilly feed that in, attach the flags and return it back to the caller. 'ptdump_show' bitshifts it and gets and invalid value that it tries to dereference. Instead of doing all of that, we detect the ~0 case and just return !_PAGE_PRESENT. This bug has been in existence .. at least until 2.6.37 (yikes!) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
| * ARM: 7403/1: tls: remove covert channel via TPIDRURWWill Deacon2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/996109 commit 6a1c53124aa161eb624ce7b1e40ade728186d34c upstream. TPIDRURW is a user read/write register forming part of the group of thread registers in more recent versions of the ARM architecture (~v6+). Currently, the kernel does not touch this register, which allows tasks to communicate covertly by reading and writing to the register without context-switching affecting its contents. This patch clears TPIDRURW when TPIDRURO is updated via the set_tls macro, which is called directly from __switch_to. Since the current behaviour makes the register useless to userspace as far as thread pointers are concerned, simply clearing the register (rather than saving and restoring it) will not cause any problems to userspace. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * xen/smp: Fix crash when booting with ACPI hotplug CPUs.Konrad Rzeszutek Wilk2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/996109 commit cf405ae612b0f7e2358db7ff594c0e94846137aa upstream. When we boot on a machine that can hotplug CPUs and we are using 'dom0_max_vcpus=X' on the Xen hypervisor line to clip the amount of CPUs available to the initial domain, we get this: (XEN) Command line: com1=115200,8n1 dom0_mem=8G noreboot dom0_max_vcpus=8 sync_console mce_verbosity=verbose console=com1,vga loglvl=all guest_loglvl=all .. snip.. DMI: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x032.072520111118 07/25/2011 .. snip. SMP: Allowing 64 CPUs, 32 hotplug CPUs installing Xen timer for CPU 7 cpu 7 spinlock event irq 361 NMI watchdog: disabled (cpu7): hardware events not enabled Brought up 8 CPUs .. snip.. [acpi processor finds the CPUs are not initialized and starts calling arch_register_cpu, which creates /sys/devices/system/cpu/cpu8/online] CPU 8 got hotplugged CPU 9 got hotplugged CPU 10 got hotplugged .. snip.. initcall 1_acpi_battery_init_async+0x0/0x1b returned 0 after 406 usecs calling erst_init+0x0/0x2bb @ 1 [and the scheduler sticks newly started tasks on the new CPUs, but said CPUs cannot be initialized b/c the hypervisor has limited the amount of vCPUS to 8 - as per the dom0_max_vcpus=8 flag. The spinlock tries to kick the other CPU, but the structure for that is not initialized and we crash.] BUG: unable to handle kernel paging request at fffffffffffffed8 IP: [<ffffffff81035289>] xen_spin_lock+0x29/0x60 PGD 180d067 PUD 180e067 PMD 0 Oops: 0002 [#1] SMP CPU 7 Modules linked in: Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc2upstream-00001-gf5154e8 #1 Intel Corporation S2600CP/S2600CP RIP: e030:[<ffffffff81035289>] [<ffffffff81035289>] xen_spin_lock+0x29/0x60 RSP: e02b:ffff8801fb9b3a70 EFLAGS: 00010282 With this patch, we cap the amount of vCPUS that the initial domain can run, to exactly what dom0_max_vcpus=X has specified. In the future, if there is a hypercall that will allow a running domain to expand past its initial set of vCPUS, this patch should be re-evaluated. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * xen: correctly check for pending events when restoring irq flagsDavid Vrabel2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/996109 commit 7eb7ce4d2e8991aff4ecb71a81949a907ca755ac upstream. In xen_restore_fl_direct(), xen_force_evtchn_callback() was being called even if no events were pending. This resulted in (depending on workload) about a 100 times as many xen_version hypercalls as necessary. Fix this by correcting the sense of the conditional jump. This seems to give a significant performance benefit for some workloads. There is some subtle tricksy "..since the check here is trying to check both pending and masked in a single cmpw, but I think this is correct. It will call check_events now only when the combined mask+pending word is 0x0001 (aka unmasked, pending)." (Ian) Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * x86, apic: APIC code touches invalid MSR on P5 class machinesBryan O'Donoghue2012-05-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BugLink: http://bugs.launchpad.net/bugs/996109 commit cbf2829b61c136edcba302a5e1b6b40e97d32c00 upstream. Current APIC code assumes MSR_IA32_APICBASE is present for all systems. Pentium Classic P5 and friends didn't have this MSR. MSR_IA32_APICBASE was introduced as an architectural MSR by Intel @ P6. Code paths that can touch this MSR invalidly are when vendor == Intel && cpu-family == 5 and APIC bit is set in CPUID - or when you simply pass lapic on the kernel command line, on a P5. The below patch stops Linux incorrectly interfering with the MSR_IA32_APICBASE for P5 class machines. Other code paths exist that touch the MSR - however those paths are not currently reachable for a conformant P5. Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linux.intel.com> Link: http://lkml.kernel.org/r/4F8EEDD3.1080404@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>