path: root/arch/x86
Commit message | Author | Age
...
| * | | | | x86, intr-remap: Set redirection hint in the IRTE  (Suresh Siddha, 2010-09-15)
    Currently the redirection hint in the interrupt-remapping table entry is set to 0, which means the remapped interrupt is directed to the processors listed in the destination. So in logical flat mode, in the presence of intr-remapping, this results in a single interrupt being multicast to multiple CPUs as specified by the destination bit mask. But what we really want is to send that interrupt to one of the CPUs based on the lowest-priority delivery mode.

    Set the redirection hint in the IRTE to '1' to indicate that we want the remapped interrupt to be directed to only one of the processors listed in the destination. This fixes the issue of the same interrupt getting delivered to multiple CPUs in logical flat mode in the presence of interrupt-remapping. While no functional issue was observed with this behavior, it impacts the performance of such configurations (<=8 CPUs using logical flat mode in the presence of interrupt-remapping).

    Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
    LKML-Reference: <20100827181049.013051492@sbsiddha-MOBL3.sc.intel.com>
    Cc: Weidong Han <weidong.han@intel.com>
    Cc: <stable@kernel.org> # [v2.6.32+]
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
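    For illustration, a minimal sketch of what setting the hint looks like when an IRTE is composed; the struct irte field names below follow the interrupt-remapping code but are shown as assumptions here, not a quote of the patch:

        /* Sketch only: field names (present, redir_hint, dest_id, ...) are
         * assumed to mirror the intr-remapping code; IRTE_DEST() packs the
         * APIC destination into the IRTE format. */
        static void prepare_irte_sketch(struct irte *irte, u8 vector, u32 dest)
        {
                memset(irte, 0, sizeof(*irte));
                irte->present    = 1;
                irte->dst_mode   = apic->irq_dest_mode;
                irte->redir_hint = 1;   /* deliver to only one CPU of the destination mask */
                irte->dlvry_mode = apic->irq_delivery_mode;
                irte->vector     = vector;
                irte->dest_id    = IRTE_DEST(dest);
        }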
* | | | | | Merge branch 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds, 2010-10-21)
    * 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86, paravirt: Remove alloc_pmd_clone hook, only used by VMI
      x86, vmware: Remove deprecated VMI kernel support

    Fix up trivial #include conflict in arch/x86/kernel/smpboot.c
| * | | | | | x86, paravirt: Remove alloc_pmd_clone hook, only used by VMI  (Alok Kataria, 2010-08-23)
    VMI was the only user of the alloc_pmd_clone hook; given that VMI is now removed, we can also remove this hook.

    Signed-off-by: Alok N Kataria <akataria@vmware.com>
    LKML-Reference: <1282608357.19396.36.camel@ank32.eng.vmware.com>
    Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | x86, vmware: Remove deprecated VMI kernel support  (Alok Kataria, 2010-08-23)
    With the recent innovations in CPU hardware acceleration technologies from Intel and AMD, VMware ran a few experiments to compare these techniques to the guest paravirtualization technique on VMware's platform. These hardware-assisted virtualization techniques have outperformed the performance benefits provided by VMI in most of the workloads. VMware expects that these hardware features will be ubiquitous in a couple of years, and as a result VMware has started a phased retirement of this feature from the hypervisor.

    Please note that VMI has always been an optimization, and non-VMI kernels still work fine on VMware's platform. The latest versions of VMware's products which support VMI are Workstation 7.0 and vSphere 4.0 on the ESX side; future maintenance releases for these products will continue supporting VMI.

    For more details about the VMI retirement take a look at:
    http://blogs.vmware.com/guestosguide/2009/09/vmi-retirement.html

    This feature removal was scheduled for 2.6.37 back in September 2009.

    Signed-off-by: Alok N Kataria <akataria@vmware.com>
    LKML-Reference: <1282600151.19396.22.camel@ank32.eng.vmware.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | | Merge branch 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds, 2010-10-21)
    * 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86, olpc: XO-1 uses/depends on PCI
      x86, olpc: Register XO-1 platform devices
      x86, olpc: Add XO-1 poweroff support
      x86, olpc: Don't retry EC commands forever
      x86, olpc: Rework BIOS signature check
      x86, olpc: Only enable PCI configuration type override on XO-1
| * | | | | | | x86, olpc: XO-1 uses/depends on PCI  (Randy Dunlap, 2010-10-15)
    olpc-xo1 uses pci_*() interfaces so it should depend on PCI. Otherwise we get build failures like:
      arch/x86/kernel/olpc-xo1.c:65: error: implicit declaration of function 'pci_enable_device_io'
      arch/x86/kernel/olpc-xo1.c:71: error: implicit declaration of function 'pci_request_region'
      arch/x86/kernel/olpc-xo1.c:80: error: implicit declaration of function 'pci_release_region'

    Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
    Acked-by: Daniel Drake <dsd@laptop.org>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    LKML-Reference: <20101014101313.adf7eb2a.randy.dunlap@oracle.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | x86, olpc: Register XO-1 platform devices  (Daniel Drake, 2010-10-13)
    The upcoming XO-1 rfkill driver (for drivers/platform/x86) will register itself with the name "xo1-rfkill", and the already-merged XO-1 poweroff code uses the name "olpc-xo1". Add the necessary mechanics so that these devices are properly initialized on XO-1 laptops.

    Signed-off-by: Daniel Drake <dsd@laptop.org>
    LKML-Reference: <20101013181042.90C8F9D401B@zog.reactivated.net>
    Cc: Matthew Garrett <mjg@redhat.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86, olpc: Add XO-1 poweroff support  (Daniel Drake, 2010-10-12)
    Add a pm_power_off handler for the OLPC XO-1 laptop. The driver can be built modular and follows the behaviour of the APM driver, setting pm_power_off to NULL on unload. However, the ability to unload the module will probably be removed (with a simple __module_get(THIS_MODULE)) if/when XO-1 suspend/resume support is added to this file at a later date.

    Signed-off-by: Daniel Drake <dsd@laptop.org>
    LKML-Reference: <20101010094032.9AE669D401B@zog.reactivated.net>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | x86, olpc: Don't retry EC commands forever  (Paul Fox, 2010-10-08)
    Avoids a potential infinite loop. It was observed once, during an EC hacking/debugging session - not in regular operation.

    Signed-off-by: Daniel Drake <dsd@laptop.org>
    Cc: dilinger@queued.net
    Cc: <stable@kernel.org>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | x86, olpc: Rework BIOS signature check  (Daniel Drake, 2010-09-23)
    The XO-1.5 laptop is not currently detected as an OLPC machine because it fails this XO-1-centric check. Now that we have OLPC OFW support in the kernel, a more sensible check is to see if we found OFW during boot and check the architecture property. Also remove a now-meaningless codepath, as we're always going to have OFW support with OLPC.

    Signed-off-by: Daniel Drake <dsd@laptop.org>
    LKML-Reference: <20100923162846.D8D409D401B@zog.reactivated.net>
    Cc: Andres Salomon <dilinger@queued.net>
    Cc: Grant Likely <grant.likely@secretlab.ca>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | x86, olpc: Only enable PCI configuration type override on XO-1  (Daniel Drake, 2010-09-23)
    This configuration type override is for XO-1 only and must not happen on XO-1.5.

    Signed-off-by: Daniel Drake <dsd@laptop.org>
    LKML-Reference: <20100923162805.0F6549D401B@zog.reactivated.net>
    Cc: Andres Solomon <dilinger@queued.net>
    Cc: Grant Likely <grant.likely@secretlab.ca>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | | Merge branch 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds, 2010-10-21)
    * 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86, mtrr: Support mtrr lookup for range spanning across MTRR range
      x86, mtrr: Refactor MTRR type overlap check code
| * | | | | | | x86, mtrr: Support mtrr lookup for range spanning across MTRR range  (Venkatesh Pallipadi, 2010-09-10)
    mtrr_type_lookup [start:end] looked up the resultant MTRR type for that range, based on the fixed and all variable MTRR ranges. It did check for multiple MTRR var ranges overlapping [start:end] and returned the net type. However, if the [start:end] range spanned across any var MTRR range, mtrr_type_lookup would return an error value of 0xFE. This was based on the typical usage of mtrr_type_lookup in PAT mapping, where the region being mapped would not normally span across MTRR ranges, and on trying to keep the code simple.

    Mark recently reported the problem with this limitation. When there are two contiguous MTRRs of type "writeback" and there is a memory mapping over a region starting in one MTRR range and ending in another MTRR range, such a mapping will fall back to "uncached" due to the above limitation.

    The change below adds support for such lookups spanning multiple MTRR ranges. We now have a wrapper mtrr_type_lookup that dynamically splits such a region into smaller chunks that fit within one MTRR range, does a __mtrr_type_lookup on each chunk and combines the results.

    Reported-by: Mark Langsdorf <mark.langsdorf@amd.com>
    Signed-off-by: Venkatesh Pallipadi <venki@google.com>
    LKML-Reference: <1284159350-19841-3-git-send-email-venki@google.com>
    Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
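    A rough sketch of the wrapper shape described above, assuming a __mtrr_type_lookup() that reports via 'repeat'/'partial_end' when the queried range crosses a variable-MTRR boundary; the names and the conservative combining rule here are illustrative, not the exact patch:

        u8 mtrr_type_lookup_sketch(u64 start, u64 end)
        {
                u8 type, prev_type = 0;
                int repeat, first = 1;
                u64 partial_end;

                while (start < end) {
                        repeat = 0;
                        type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
                        if (!repeat)
                                partial_end = end;      /* chunk reached the end of the range */
                        if (!first && type != prev_type)
                                return MTRR_TYPE_UNCACHABLE;    /* conservative combine */
                        prev_type = type;
                        first = 0;
                        start = partial_end;            /* continue with the next chunk */
                }
                return prev_type;
        }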
| * | | | | | | x86, mtrr: Refactor MTRR type overlap check code  (Venkatesh Pallipadi, 2010-09-10)
    Move the MTRR type overlap check into a new function. No functional change in this patch. Just making it easier to add the multiple region overlap check in the following patch.

    Signed-off-by: Venkatesh Pallipadi <venki@google.com>
    LKML-Reference: <1284159350-19841-2-git-send-email-venki@google.com>
    Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | | | Merge branch 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds, 2010-10-21)
    * 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86: sfi: Make local functions static
      x86, earlyprintk: Add hsu early console for Intel Medfield platform
      x86, earlyprintk: Add earlyprintk for Intel Moorestown platform
      x86: Add two helper macros for fixed address mapping
      x86, mrst: A function in a header file needs to be marked "inline"
| * | | | | | | | x86: sfi: Make local functions static  (Thomas Gleixner, 2010-10-15)
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Len Brown <lenb@kernel.org>
| * | | | | | | | x86, earlyprintk: Add hsu early console for Intel Medfield platform  (Feng Tang, 2010-10-08)
    The Intel Medfield platform has a high-speed UART device, which can act as an early console. To enable early printk on the HSU console, simply add "earlyprintk=hsu" to the kernel command line. Currently we put the code in early_printk_mrst.c, as it is also for Intel MID platforms, like the mrst early console.

    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Acked-by: Alan Cox <alan@linux.intel.com>
    Cc: greg@kroah.com
    LKML-Reference: <1284361736-23011-5-git-send-email-feng.tang@intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | x86, earlyprintk: Add earlyprintk for Intel Moorestown platform  (Feng Tang, 2010-10-08)
    The Intel Moorestown platform has a spi-uart device (Maxim3110), which connects to a Designware spi core controller. This patch adds an early console function based on it.

    As it will be used long before the Linux spi subsystem gets initialised, we simply manipulate the spi controller's registers directly to achieve the early console functionality. This is safe as it will be disabled when the device subsystem gets initialised.

    To use it, users need to enable CONFIG_X86_MRST_EARLY_PRINTK in the kernel config and add "earlyprintk=mrst" to the kernel command line.

    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Acked-by: Alan Cox <alan@linux.intel.com>
    Cc: greg@kroah.com
    LKML-Reference: <1284361736-23011-4-git-send-email-feng.tang@intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | x86: Add two helper macros for fixed address mapping  (Feng Tang, 2010-10-08)
    Sometimes a fixmap will be used to map a physical address which is not page aligned, so to use it we need to first map it and then add the address offset to the mapped fixed address. These two new helpers were suggested by Ingo Molnar to make the process simpler.

    For a physical address "phys", a directly usable virtual address can be obtained by
      virt = (void *)set_fixmap_offset(fixed_idx, phys);
    or
      virt = (void *)set_fixmap_offset_nocache(fixed_idx, phys);
    (depending on whether the physical address is cacheable or not).

    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Cc: alan@linux.intel.com
    Cc: greg@kroah.com
    Cc: x86@kernel.org
    LKML-Reference: <1284361736-23011-3-git-send-email-feng.tang@intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
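    A small usage sketch; only the helper names come from the description above, while FIX_EXAMPLE_IDX is a hypothetical fixmap index standing in for whatever slot the caller owns:

        #include <asm/fixmap.h>

        /* Map possibly non-page-aligned device registers at 'phys' and get back
         * a virtual address that already includes the in-page offset. */
        static void __iomem *map_regs_sketch(phys_addr_t phys)
        {
                /* FIX_EXAMPLE_IDX: hypothetical, substitute a real fixmap slot. */
                unsigned long va = set_fixmap_offset_nocache(FIX_EXAMPLE_IDX, phys);

                return (void __iomem *)va;
        }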
| * | | | | | | | x86, mrst: A function in a header file needs to be marked "inline"  (H. Peter Anvin, 2010-10-07)
    A function in a header file needs to be explicitly marked "inline", or gcc will complain if it is not used.

    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
    Cc: <stable@kernel.org> v2.6.36
    LKML-Reference: <1274295685-6774-3-git-send-email-jacob.jun.pan@linux.intel.com>
* | | | | | | | | Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds, 2010-10-21)
    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86-32, percpu: Correct the ordering of the percpu readmostly section
      x86, mm: Enable ARCH_DMA_ADDR_T_64BIT with X86_64 || HIGHMEM64G
      x86: Spread tlb flush vector between nodes
      percpu: Introduce a read-mostly percpu API
      x86, mm: Fix incorrect data type in vmalloc_sync_all()
      x86, mm: Hold mm->page_table_lock while doing vmalloc_sync
      x86, mm: Fix bogus whitespace in sync_global_pgds()
      x86-32: Fix sparse warning for the __PHYSICAL_MASK calculation
      x86, mm: Add RESERVE_BRK_ARRAY() helper
      mm, x86: Saving vmcore with non-lazy freeing of vmas
      x86, kdump: Change copy_oldmem_page() to use cached addressing
      x86, mm: fix uninitialized addr in kernel_physical_mapping_init()
      x86, kmemcheck: Remove double test
      x86, mm: Make spurious_fault check explicitly check the PRESENT bit
      x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping changes
      x86, mm: Separate x86_64 vmalloc_sync_all() into separate functions
      x86, mm: Avoid unnecessary TLB flush
| * | | | | | | | | x86, mm: Enable ARCH_DMA_ADDR_T_64BIT with X86_64 || HIGHMEM64G  (FUJITA Tomonori, 2010-10-20)
    Set CONFIG_ARCH_DMA_ADDR_T_64BIT when we set dma_addr_t to 64 bits in <asm/types.h>; this allows Kconfig decisions based on this property.

    Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
    LKML-Reference: <201010202255.o9KMtZXu009370@imap1.linux-foundation.org>
    Acked-by: "H. Peter Anvin" <hpa@zytor.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | | x86: Spread tlb flush vector between nodes  (Shaohua Li, 2010-10-20)
    Currently TLB flush vector allocation is based on the equation below:
      sender = smp_processor_id() % 8
    This isn't optimal: CPUs from different nodes can have the same vector, which causes a lot of lock contention. Instead, we can assign the same vectors to CPUs from the same node, while different nodes get different vectors. This has the following advantages:
      a. If there is lock contention, the contention is between CPUs from one node. This should be much cheaper than contention between nodes.
      b. It completely avoids lock contention between nodes. This especially benefits kswapd, which is the biggest user of TLB flush, since kswapd sets its affinity to a specific node.

    In my test, this could reduce > 20% CPU overhead in the extreme case. The test machine has 4 nodes and each node has 16 CPUs. I then bind each node's kswapd to the first CPU of the node. I run a workload with 4 sequential mmap file read threads. The files are empty sparse files. This workload triggers a lot of page reclaim and TLB flushing. The kswapd binding is to easily trigger the extreme TLB flush lock contention, because otherwise kswapd keeps migrating between CPUs of a node and I can't get a stable result. Sure, in a real workload we won't always see such big TLB flush lock contention, but it's possible.

    [ hpa: folded in fix from Eric Dumazet to use this_cpu_read() ]

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    LKML-Reference: <1287544023.4571.8.camel@sli10-conroe.sh.intel.com>
    Cc: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
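    A sketch of the node-aware assignment under the assumption of 8 invalidate vectors; the helper name and per-cpu variable are illustrative:

        static DEFINE_PER_CPU_READ_MOSTLY(int, tlb_vector_offset);

        static void calculate_tlb_offset_sketch(void)
        {
                int cpu, node, nr_node_vecs;

                /* Split the 8 flush vectors evenly among online nodes, so CPUs
                 * on different nodes never share a vector (or its lock). */
                if (nr_online_nodes > 8)
                        nr_node_vecs = 1;
                else
                        nr_node_vecs = 8 / nr_online_nodes;

                for_each_online_node(node) {
                        int cpu_offset = 0;

                        for_each_cpu(cpu, cpumask_of_node(node)) {
                                per_cpu(tlb_vector_offset, cpu) =
                                        node * nr_node_vecs + cpu_offset % nr_node_vecs;
                                cpu_offset++;
                        }
                }
        }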
| * | | | | | | | | x86, mm: Fix incorrect data type in vmalloc_sync_all()  (Borislav Petkov, 2010-10-20)
      arch/x86/mm/fault.c: In function 'vmalloc_sync_all':
      arch/x86/mm/fault.c:238: warning: assignment makes integer from pointer without a cast

    introduced by 617d34d9e5d8326ec8f188c616aa06ac59d083fe.

    Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
    LKML-Reference: <20101020103642.GA3135@kryptos.osrc.amd.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | | x86, mm: Hold mm->page_table_lock while doing vmalloc_sync  (Jeremy Fitzhardinge, 2010-10-19)
    Take mm->page_table_lock while syncing the vmalloc region. This prevents a race with the Xen pagetable pin/unpin code, which expects that the page_table_lock is already held. If this race occurs, then Xen can see an inconsistent page type (a page can either be read/write or a pagetable page, and pin/unpin converts it between them), which will cause either the pin or the set_p[gm]d to fail; either will crash the kernel.

    vmalloc_sync_all() should be called rarely, so this extra use of page_table_lock should not interfere with its normal users. The mm pointer is stashed in the pgd page's index field, as that won't be otherwise used for pgds.

    Reported-by: Ian Campbell <ian.cambell@eu.citrix.com>
    Originally-by: Jan Beulich <jbeulich@novell.com>
    LKML-Reference: <4CB88A4C.1080305@goop.org>
    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | | x86, mm: Fix bogus whitespace in sync_global_pgds()  (Jeremy Fitzhardinge, 2010-10-19)
    Whitespace cleanup only.

    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | | x86-32: Fix sparse warning for the __PHYSICAL_MASK calculation  (Namhyung Kim, 2010-10-07)
    On a 32-bit non-PAE system, the cast to 'phys_addr_t' truncates the value before the subtraction. Subtracting before the cast produces the same result but removes the following warnings from sparse:
      arch/x86/include/asm/pgtable_types.h:255:38: warning: cast truncates bits from constant value (100000000 becomes 0)
      arch/x86/include/asm/pgtable_types.h:270:38: warning: cast truncates bits from constant value (100000000 becomes 0)
      arch/x86/include/asm/pgtable.h:127:32: warning: cast truncates bits from constant value (100000000 becomes 0)
      arch/x86/include/asm/pgtable.h:132:32: warning: cast truncates bits from constant value (100000000 becomes 0)
      arch/x86/include/asm/pgtable.h:344:31: warning: cast truncates bits from constant value (100000000 becomes 0)
    64-bit or PAE machines will not be affected by this change.

    Signed-off-by: Namhyung Kim <namhyung@gmail.com>
    LKML-Reference: <1285770588-14065-1-git-send-email-namhyung@gmail.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
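    The before/after shape of the macro, reconstructed from the warning text above; the actual definition lives in arch/x86/include/asm/page_types.h, so treat this as a sketch:

        /* Before: the cast to phys_addr_t (32-bit on non-PAE) truncates
         * 1ULL << 32 to 0 before the subtraction; the resulting value is the
         * same, but sparse warns about the truncating cast. */
        #define __PHYSICAL_MASK_BEFORE ((phys_addr_t)(1ULL << __PHYSICAL_MASK_SHIFT) - 1)

        /* After: subtract in the 64-bit type first, then cast the result. */
        #define __PHYSICAL_MASK_AFTER  ((phys_addr_t)((1ULL << __PHYSICAL_MASK_SHIFT) - 1))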
| * | | | | | | | | x86, mm: Add RESERVE_BRK_ARRAY() helper  (Jeremy Fitzhardinge, 2010-10-06)
    This is useful when converting static arrays into boot-time brk-allocated objects.

    Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
    LKML-Reference: <4C805EEA.1080205@goop.org>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | | | mm, x86: Saving vmcore with non-lazy freeing of vmas  (Cliff Wickman, 2010-09-17)
    During the reading of /proc/vmcore the kernel is doing ioremap()/iounmap() repeatedly, and the buildup of un-flushed vm_area_struct's is causing a great deal of overhead (rb_next() is chewing up most of that time).

    The solution is to provide a function set_iounmap_nonlazy(). It causes a subsequent call to iounmap() to immediately purge the vma area (with try_purge_vmap_area_lazy()).

    With this patch we have seen the time for writing a 250MB compressed dump drop from 71 seconds to 44 seconds.

    Signed-off-by: Cliff Wickman <cpw@sgi.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: kexec@lists.infradead.org
    Cc: <stable@kernel.org>
    LKML-Reference: <E1OwHZ4-0005WK-Tw@eag09.americas.sgi.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | x86, kdump: Change copy_oldmem_page() to use cached addressing  (Cliff Wickman, 2010-09-09)
    The copy of /proc/vmcore to a user buffer proceeds much faster if the kernel addresses memory as cached. With this patch we have seen an increase in transfer rate from less than 15MB/s to 80-460MB/s, depending on size of the transfer. This makes a big difference in the time needed to save a system dump.

    Signed-off-by: Cliff Wickman <cpw@sgi.com>
    Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: kexec@lists.infradead.org
    Cc: <stable@kernel.org> # as far back as it would apply
    LKML-Reference: <E1OtMLz-0001yp-Ia@eag09.americas.sgi.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | x86, mm: fix uninitialized addr in kernel_physical_mapping_init()  (Wu Fengguang, 2010-09-03)
    This re-adds the lost chunk in commit 9b861528a80.

    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Haicheng Li <haicheng.li@linux.intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    LKML-Reference: <20100903090407.GA19771@localhost>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | x86, kmemcheck: Remove double test  (Julia Lawall, 2010-08-30)
    The opcodes 0x2e and 0x3e are tested for in the first Group 2 line as well.

    The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/)
      // <smpl>
      @expression@
      expression E;
      @@
      (
      * E || ... || E
      |
      * E && ... && E
      )
      // </smpl>

    Signed-off-by: Julia Lawall <julia@diku.dk>
    Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
    Cc: Vegard Nossum <vegardno@ifi.uio.no>
    LKML-Reference: <1283010066-20935-5-git-send-email-julia@diku.dk>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | x86, mm: Make spurious_fault check explicitly check the PRESENT bit  (Shaohua Li, 2010-08-26)
    pte_present() returns true even when the present bit isn't set but the _PAGE_PROTNONE (global) bit is set. With CONFIG_DEBUG_PAGEALLOC, free pages have the global bit set but the present bit clear. This patch makes it possible to catch accesses to free pages with CONFIG_DEBUG_PAGEALLOC enabled.

    [ hpa: added a comment in the code as a warning to janitors ]

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    LKML-Reference: <1280217988.32400.75.camel@sli10-desk.sh.intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
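    A sketch of the explicit check, assuming the fault-path helper has roughly this shape (the flag and accessor names come from common x86 mm code, but the body here is illustrative):

        static int spurious_fault_check_sketch(unsigned long error_code, pte_t *pte)
        {
                /* Check the PRESENT bit explicitly rather than via pte_present(),
                 * which also returns true when only _PAGE_PROTNONE (the global
                 * bit) is set -- the state DEBUG_PAGEALLOC leaves free pages in. */
                if (!(pte_flags(*pte) & _PAGE_PRESENT))
                        return 0;
                if ((error_code & PF_WRITE) && !pte_write(*pte))
                        return 0;
                return 1;
        }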
| * | | | | | | | | x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping changes  (Haicheng Li, 2010-08-26)
    When memory hotplug-adding happens for a large enough area that a new PGD entry is needed for the direct mapping, the PGDs of other processes would not get updated. This leads to some CPUs oopsing like below when they have to access the unmapped areas.

      [ 1139.243192] BUG: soft lockup - CPU#0 stuck for 61s! [bash:6534]
      [ 1139.243195] Modules linked in: ipv6 autofs4 rfcomm l2cap crc16 bluetooth rfkill binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc fan battery ac parport_pc lp parport joydev usbhid processor thermal thermal_sys container button rtc_cmos rtc_core rtc_lib i2c_i801 i2c_core pcspkr uhci_hcd ohci_hcd ehci_hcd usbcore
      [ 1139.243229] irq event stamp: 8538759
      [ 1139.243230] hardirqs last enabled at (8538759): [<ffffffff8100c3fc>] restore_args+0x0/0x30
      [ 1139.243236] hardirqs last disabled at (8538757): [<ffffffff810422df>] __do_softirq+0x106/0x146
      [ 1139.243240] softirqs last enabled at (8538758): [<ffffffff81042310>] __do_softirq+0x137/0x146
      [ 1139.243245] softirqs last disabled at (8538743): [<ffffffff8100cb5c>] call_softirq+0x1c/0x34
      [ 1139.243249] CPU 0:
      [ 1139.243250] Modules linked in: ipv6 autofs4 rfcomm l2cap crc16 bluetooth rfkill binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc fan battery ac parport_pc lp parport joydev usbhid processor thermal thermal_sys container button rtc_cmos rtc_core rtc_lib i2c_i801 i2c_core pcspkr uhci_hcd ohci_hcd ehci_hcd usbcore
      [ 1139.243284] Pid: 6534, comm: bash Tainted: G M 2.6.32-haicheng-cpuhp #7 QSSC-S4R
      [ 1139.243287] RIP: 0010:[<ffffffff810ace35>] [<ffffffff810ace35>] alloc_arraycache+0x35/0x69
      [ 1139.243292] RSP: 0018:ffff8802799f9d78 EFLAGS: 00010286
      [ 1139.243295] RAX: ffff8884ffc00000 RBX: ffff8802799f9d98 RCX: 0000000000000000
      [ 1139.243297] RDX: 0000000000190018 RSI: 0000000000000001 RDI: ffff8884ffc00010
      [ 1139.243300] RBP: ffffffff8100c34e R08: 0000000000000002 R09: 0000000000000000
      [ 1139.243303] R10: ffffffff8246dda0 R11: 000000d08246dda0 R12: ffff8802599bfff0
      [ 1139.243305] R13: ffff88027904c040 R14: ffff8802799f8000 R15: 0000000000000001
      [ 1139.243308] FS: 00007fe81bfe86e0(0000) GS:ffff88000d800000(0000) knlGS:0000000000000000
      [ 1139.243311] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1139.243313] CR2: ffff8884ffc00000 CR3: 000000026cf2d000 CR4: 00000000000006f0
      [ 1139.243316] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1139.243318] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [ 1139.243321] Call Trace:
      [ 1139.243324] [<ffffffff810ace29>] ? alloc_arraycache+0x29/0x69
      [ 1139.243328] [<ffffffff8135004e>] ? cpuup_callback+0x1b0/0x32a
      [ 1139.243333] [<ffffffff8105385d>] ? notifier_call_chain+0x33/0x5b
      [ 1139.243337] [<ffffffff810538a4>] ? __raw_notifier_call_chain+0x9/0xb
      [ 1139.243340] [<ffffffff8134ecfc>] ? cpu_up+0xb3/0x152
      [ 1139.243344] [<ffffffff813388ce>] ? store_online+0x4d/0x75
      [ 1139.243348] [<ffffffff811e53f3>] ? sysdev_store+0x1b/0x1d
      [ 1139.243351] [<ffffffff8110589f>] ? sysfs_write_file+0xe5/0x121
      [ 1139.243355] [<ffffffff810b539d>] ? vfs_write+0xae/0x14a
      [ 1139.243358] [<ffffffff810b587f>] ? sys_write+0x47/0x6f
      [ 1139.243362] [<ffffffff8100b9ab>] ? system_call_fastpath+0x16/0x1b

    This patch makes sure to always replicate new direct mapping PGD entries to the PGDs of all processes, as well as ensuring the corresponding vmemmap mapping gets synced.

    V1: initial code by Andi Kleen.
    V2: fix several issues found in testing.
    V3: as suggested by Wu Fengguang, reuse common code of vmalloc_sync_all().

    [ hpa: changed pgd_change from int to bool ]

    Originally-by: Andi Kleen <ak@linux.intel.com>
    Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
    LKML-Reference: <4C6E4FD8.6080100@linux.intel.com>
    Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
    Reviewed-by: Andi Kleen <ak@linux.intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
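    A condensed sketch of the replication described above: for each kernel PGD entry covering the new range, copy it into every process PGD found on pgd_list (locking simplified here; the real code also cross-checks entries that already exist):

        void sync_global_pgds_sketch(unsigned long start, unsigned long end)
        {
                unsigned long addr;

                for (addr = start; addr <= end; addr += PGDIR_SIZE) {
                        const pgd_t *pgd_ref = pgd_offset_k(addr);
                        struct page *page;

                        if (pgd_none(*pgd_ref))
                                continue;

                        spin_lock(&pgd_lock);
                        list_for_each_entry(page, &pgd_list, lru) {
                                pgd_t *pgd = (pgd_t *)page_address(page) + pgd_index(addr);

                                if (pgd_none(*pgd))
                                        set_pgd(pgd, *pgd_ref);
                        }
                        spin_unlock(&pgd_lock);
                }
        }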
| * | | | | | | | | x86, mm: Separate x86_64 vmalloc_sync_all() into separate functions  (Haicheng Li, 2010-08-26)
    No behavior change. Move some of the vmalloc_sync_all() code into a new function sync_global_pgds() that will be useful for memory hotplug.

    Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
    LKML-Reference: <4C6E4ECD.1090607@linux.intel.com>
    Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
    Reviewed-by: Andi Kleen <ak@linux.intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | | x86, mm: Avoid unnecessary TLB flush  (Shaohua Li, 2010-08-23)
    In x86, access and dirty bits are set automatically by the CPU when the CPU accesses memory. When we reach the code path of flush_tlb_fix_spurious_fault() below, we have already set the dirty bit for the pte and don't need to flush the TLB. This might mean the TLB entry in some CPUs doesn't have the dirty bit set, but that doesn't matter: when the CPUs do a page write, they will automatically check the bit with no software involved.

    On the other hand, a TLB flush at this point is harmful. A test creates as many threads as there are CPUs; each thread writes to the same but random address in the same vma range, and we measure the total time. On a 4-socket system, the original time is 1.96s, while with the patch the time is 0.8s. On a 2-socket system there is a 20% time cut too. perf shows a lot of time is spent sending and handling IPIs for the TLB flush.

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    LKML-Reference: <20100816011655.GA362@sli10-desk.sh.intel.com>
    Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Cc: Andrea Archangeli <aarcange@redhat.com>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | | | | | | | | Merge branch 'x86-mem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds, 2010-10-21)
    * 'x86-mem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86, mem: Optimize memmove for small size and unaligned cases
      x86, mem: Optimize memcpy by avoiding memory false dependence
      x86, mem: Don't implement forward memmove() as memcpy()
| * | | | | | | | | x86, mem: Optimize memmove for small size and unaligned cases  (Ma Ling, 2010-09-24)
    The movs instruction combines data to accelerate moving data; however, we need to be concerned with two cases:
    1. The movs instruction needs a long latency to start up, so here we use general mov instructions to copy the data.
    2. The movs instruction is not good for the unaligned case; even if the source offset is 0x10 and the destination offset is 0x0, we avoid it and handle that case with general mov instructions.

    Signed-off-by: Ma Ling <ling.ma@intel.com>
    LKML-Reference: <1284664360-6138-1-git-send-email-ling.ma@intel.com>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | | | x86, mem: Optimize memcpy by avoiding memory false dependence  (Ma Ling, 2010-08-23)
    All read operations after the allocation stage can run speculatively; all write operations will run in program order, and if the addresses are different a read may run before an older write operation, otherwise it waits until the write commits. However, the CPU doesn't check every address bit, so a read could fail to recognize a different address even when the two are in different pages.

    For example, if rsi is 0xf004 and rdi is 0xe008, the following operations generate a big performance latency:
      1. movq (%rsi), %rax
      2. movq %rax, (%rdi)
      3. movq 8(%rsi), %rax
      4. movq %rax, 8(%rdi)
    If %rsi and %rdi really were in the same memory page, there would be a true read-after-write dependence, because instruction 2 writes 0x008 and instruction 3 reads 0x00c; the two addresses partially overlap. They are actually in different pages and there is no issue, but without checking every address bit the CPU could think they are in the same page, and instruction 3 has to wait for instruction 2 to write its data from the write buffer into the cache and then load the data from the cache; the extra time the read spends is equal to an mfence instruction. We can avoid it by tuning the operation sequence as follows:
      1. movq 8(%rsi), %rax
      2. movq %rax, 8(%rdi)
      3. movq (%rsi), %rax
      4. movq %rax, (%rdi)
    Instruction 3 reads 0x004 and instruction 2 writes address 0x010: no dependence. In the end, on Core2 we gain a 1.83x speedup compared with the original instruction sequence.

    In this patch we first handle small sizes (less than 20 bytes), then jump to a different copy mode. Based on our micro-benchmark, for small sizes from 1 to 127 bytes we got up to a 2X improvement, and up to a 1.5X improvement for 1024 bytes on Core i7. (We use our micro-benchmark, and will do further tests according to your requirements.)

    Signed-off-by: Ma Ling <ling.ma@intel.com>
    LKML-Reference: <1277753065-18610-1-git-send-email-ling.ma@intel.com>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | | | x86, mem: Don't implement forward memmove() as memcpy()  (Ma, Ling, 2010-08-23)
    memmove() allows the source and destination addresses to overlap, but there is no such limitation for memcpy(). Therefore, explicitly implement memmove() in both the forward and backward directions, to give us the ability to optimize memcpy().

    Signed-off-by: Ma Ling <ling.ma@intel.com>
    LKML-Reference: <C10D3FB0CD45994C8A51FEC1227CE22F0E483AD86A@shsmsx502.ccr.corp.intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* | | | | | | | | Merge branch 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds, 2010-10-21)
    * 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86, hotplug: In the MWAIT case of play_dead, CLFLUSH the cache line
      x86, hotplug: Move WBINVD back outside the play_dead loop
      x86, hotplug: Use mwait to offline a processor, fix the legacy case
      x86, mwait: Move mwait constants to a common header file
| * | | | | | | | | x86, hotplug: In the MWAIT case of play_dead, CLFLUSH the cache line  (H. Peter Anvin, 2010-09-20)
    When we're using MWAIT for play_dead, explicitly CLFLUSH the cache line before executing MONITOR. This is a potential workaround for the Xeon 7400 erratum AAI65 after having a spurious wakeup and returning around the loop. "Potential" here because it is not certain that that erratum could actually trigger; however, the CLFLUSH should be harmless.

    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    Acked-by: Venkatesh Pallipadi <venki@google.com>
    Cc: Asit Mallick <asit.k.mallick@intel.com>
    Cc: Arjan van de Ven <arjan@linux.kernel.org>
    Cc: Len Brown <lenb@kernel.org>
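    The loop shape this implies, as a sketch (the real play_dead code picks the MWAIT hint from CPUID; here eax_hint is just a parameter):

        static inline void mwait_play_dead_sketch(void *monitor_addr, unsigned long eax_hint)
        {
                while (1) {
                        /* Erratum AAI65 workaround: flush the monitored line
                         * before re-arming MONITOR after a (spurious) wakeup. */
                        clflush(monitor_addr);
                        mb();
                        __monitor(monitor_addr, 0, 0);
                        mb();
                        __mwait(eax_hint, 0);
                }
        }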
| * | | | | | | | | x86, hotplug: Move WBINVD back outside the play_dead loop  (H. Peter Anvin, 2010-09-17)
    On processors with hyperthreading, when only one thread is offlined the other thread can cause a spurious wakeup on the idled thread. We do not want to re-WBINVD when that happens.

    Ideally, we should simply skip WBINVD unless we're the last thread on a particular core to shut down, but there might be similar issues elsewhere in the system. Thus, revert to the previous behavior of only doing WBINVD outside the loop.

    Partly as a result, remove the mb()'s around it: they are not necessary since wbinvd() is a serializing instruction, but they were intended to make sure the compiler didn't do any funny loop optimizations.

    Reported-by: Asit Mallick <asit.k.mallick@intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    Cc: Arjan van de Ven <arjan@linux.kernel.org>
    Cc: Len Brown <lenb@kernel.org>
    Cc: Venkatesh Pallipadi <venki@google.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.hl>
    LKML-Reference: <tip-ea53069231f9317062910d6e772cca4ce93de8c8@git.kernel.org>
| * | | | | | | | | x86, hotplug: Use mwait to offline a processor, fix the legacy case  (H. Peter Anvin, 2010-09-17)
    The code in native_play_dead() has a number of problems:
    1. We should use MWAIT when available, to put ourselves into a deeper sleep state.
    2. We use the existence of CLFLUSH to determine if WBINVD is safe, but that is totally bogus -- WBINVD is 486+, whereas CLFLUSH is a much later addition.
    3. We should do WBINVD inside the loop, just in case of something like setting an A bit on page tables. Pointed out by Arjan van de Ven.

    This code is based in part on a previous patch by Venki Pallipadi, but unlike that patch this one keeps all the detection code local instead of pre-caching a bunch of information. We're shutting down the CPU; there is absolutely no hurry.

    This patch moves all the code to C and deletes the global wbinvd_halt(), which is broken anyway.

    Originally-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
    Cc: Len Brown <lenb@kernel.org>
    Cc: Venkatesh Pallipadi <venki@google.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.hl>
    LKML-Reference: <20090522232230.162239000@intel.com>
| * | | | | | | | | x86, mwait: Move mwait constants to a common header file  (H. Peter Anvin, 2010-09-17)
    We have MWAIT constants spread across three different .c files, for no good reason. Move them all into a common header file.

    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
    Cc: Len Brown <lenb@kernel.org>
    LKML-Reference: <tip-*@git.kernel.org>
* | | | | | | | | Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip  (Linus Torvalds, 2010-10-21)
    * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
      x86, fpu: Merge fpu_save_init()
      x86-32, fpu: Rewrite fpu_save_init()
      x86, fpu: Remove PSHUFB_XMM5_* macros
      x86, fpu: Remove unnecessary ifdefs from i387 code.
      x86-32, fpu: Remove math_emulate stub
      x86-64, fpu: Simplify constraints for fxsave/fxtstor
      x86-64, fpu: Fix %cs value in convert_from_fxsr()
      x86-64, fpu: Disable preemption when using TS_USEDFPU
      x86, fpu: Merge __save_init_fpu()
      x86, fpu: Merge tolerant_fwait()
      x86, fpu: Merge fpu_init()
      x86: Use correct type for %cr4
      x86, xsave: Disable xsave in i387 emulation mode

    Fixed up fxsaveq-induced conflict in arch/x86/include/asm/i387.h
| * | | | | | | | | x86, fpu: Merge fpu_save_init()  (Brian Gerst, 2010-09-09)
    Make 64-bit use the 32-bit version of fpu_save_init(). Remove unused clear_fpu_state().

    Signed-off-by: Brian Gerst <brgerst@gmail.com>
    Acked-by: Pekka Enberg <penberg@kernel.org>
    Cc: Suresh Siddha <suresh.b.siddha@intel.com>
    LKML-Reference: <1283563039-3466-13-git-send-email-brgerst@gmail.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | | x86-32, fpu: Rewrite fpu_save_init()  (Brian Gerst, 2010-09-09)
    Rewrite fpu_save_init() to prepare for merging with 64-bit.

    Signed-off-by: Brian Gerst <brgerst@gmail.com>
    Acked-by: Pekka Enberg <penberg@kernel.org>
    Cc: Suresh Siddha <suresh.b.siddha@intel.com>
    LKML-Reference: <1283563039-3466-12-git-send-email-brgerst@gmail.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | | x86, fpu: Remove PSHUFB_XMM5_* macros  (Brian Gerst, 2010-09-09)
    The PSHUFB_XMM5_* macros are no longer used.

    Signed-off-by: Brian Gerst <brgerst@gmail.com>
    Acked-by: Pekka Enberg <penberg@kernel.org>
    Cc: Suresh Siddha <suresh.b.siddha@intel.com>
    LKML-Reference: <1283563039-3466-11-git-send-email-brgerst@gmail.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
| * | | | | | | | | x86, fpu: Remove unnecessary ifdefs from i387 code.  (Brian Gerst, 2010-09-09)
    Remove ifdefs for code that the compiler can optimize away on 64-bit.

    Signed-off-by: Brian Gerst <brgerst@gmail.com>
    Acked-by: Pekka Enberg <penberg@kernel.org>
    Cc: Suresh Siddha <suresh.b.siddha@intel.com>
    LKML-Reference: <1283563039-3466-10-git-send-email-brgerst@gmail.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>