aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge branch 'x86-vdso-for-linus' of ↵Linus Torvalds2014-12-10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 vdso updates from Ingo Molnar: "Various vDSO updates from Andy Lutomirski, mostly cleanups and reorganization to improve maintainability, but also some micro-optimizations and robustization changes" * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86_64/vsyscall: Restore orig_ax after vsyscall seccomp x86_64: Add a comment explaining the TASK_SIZE_MAX guard page x86_64,vsyscall: Make vsyscall emulation configurable x86_64, vsyscall: Rewrite comment and clean up headers in vsyscall code x86_64, vsyscall: Turn vsyscalls all the way off when vsyscall==none x86,vdso: Use LSL unconditionally for vgetcpu x86: vdso: Fix build with older gcc x86_64/vdso: Clean up vgetcpu init and merge the vdso initcalls x86_64/vdso: Remove jiffies from the vvar page x86/vdso: Make the PER_CPU segment 32 bits x86/vdso: Make the PER_CPU segment start out accessed x86/vdso: Change the PER_CPU segment to use struct desc_struct x86_64/vdso: Move getcpu code from vsyscall_64.c to vdso/vma.c x86_64/vsyscall: Move all of the gate_area code to vsyscall_64.c
| * x86_64/vsyscall: Restore orig_ax after vsyscall seccompAndy Lutomirski2014-11-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vsyscall emulation code sets orig_ax for seccomp's benefit, but it forgot to set it back. I'm not sure that this is observable at all, but it could cause confusion to various /proc or ptrace users, and it's possible that it could cause minor artifacts if a signal were to be delivered on return from vsyscall emulation. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/cdc6a564517a4df09235572ee5f530ccdcf933f7.1415144089.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86_64: Add a comment explaining the TASK_SIZE_MAX guard pageAndy Lutomirski2014-11-10
| | | | | | | | | | | | | | | | | | That guard page is absolutely necessary; explain why for posterity. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/23320cb5017c2da8475ec20fcde8089d82aa2699.1415144745.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86_64,vsyscall: Make vsyscall emulation configurableAndy Lutomirski2014-11-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds CONFIG_X86_VSYSCALL_EMULATION, guarded by CONFIG_EXPERT. Turning it off completely disables vsyscall emulation, saving ~3.5k for vsyscall_64.c, 4k for vsyscall_emu_64.S (the fake vsyscall page), some tiny amount of core mm code that supports a gate area, and possibly 4k for a wasted pagetable. The latter is because the vsyscall addresses are misaligned and fit poorly in the fixmap. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/406db88b8dd5f0cbbf38216d11be34bbb43c7eae.1414618407.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86_64, vsyscall: Rewrite comment and clean up headers in vsyscall codeAndy Lutomirski2014-11-03
| | | | | | | | | | | | | | | | | | | | | | | | vsyscall_64.c is just vsyscall emulation. Tidy it up accordingly. [ tglx: Preserved the original copyright notices ] Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/9c448d5643d0fdb618f8cde9a54c21d2bcd486ce.1414618407.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86_64, vsyscall: Turn vsyscalls all the way off when vsyscall==noneAndy Lutomirski2014-11-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I see no point in having an unusable read-only page sitting at 0xffffffffff600000 when vsyscall=none. Instead, skip mapping it and remove it from /proc/PID/maps. I kept the ratelimited warning when programs try to use a vsyscall in this mode, since it may help admins avoid confusion. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/0dddbadc1d4e3bfbaf887938ff42afc97a7cc1f2.1414618407.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86,vdso: Use LSL unconditionally for vgetcpuAndy Lutomirski2014-11-03
| | | | | | | | | | | | | | | | | | | | LSL is faster than RDTSCP and works everywhere; there's no need to switch between them depending on CPU. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Andi Kleen <andi@firstfloor.org> Link: http://lkml.kernel.org/r/72f73d5ec4514e02bba345b9759177ef03742efb.1414706021.git.luto@amacapital.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86: vdso: Fix build with older gccAndrew Morton2014-11-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc-4.4.4: arch/x86/vdso/vma.c: In function 'vgetcpu_cpu_init': arch/x86/vdso/vma.c:247: error: unknown field 'limit0' specified in initializer arch/x86/vdso/vma.c:247: warning: missing braces around initializer arch/x86/vdso/vma.c:247: warning: (near initialization for '(anonymous).<anonymous>') arch/x86/vdso/vma.c:248: error: unknown field 'limit' specified in initializer arch/x86/vdso/vma.c:248: warning: excess elements in struct initializer arch/x86/vdso/vma.c:248: warning: (near initialization for '(anonymous)') .... I couldn't find any way of tricking it into accepting an initializer format :( Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Fixes: 258801563b ("x86/vdso: Change the PER_CPU segment to use struct desc_struct") Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * x86_64/vdso: Clean up vgetcpu init and merge the vdso initcallsAndy Lutomirski2014-10-28
| | | | | | | | | | | | | | | | | | Now vdso/vma.c has a single initcall and no references to "vsyscall". Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/945c463e2804fedd8b08d63a040cbe85d55195aa.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86_64/vdso: Remove jiffies from the vvar pageAndy Lutomirski2014-10-28
| | | | | | | | | | | | | | | | | | | | I think that the jiffies vvar was once used for the vgetcpu cache. That code is long gone, so let's just make jiffies be a normal variable. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/fcfee6f8749af14d96373a9e2656354ad0b95499.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/vdso: Make the PER_CPU segment 32 bitsAndy Lutomirski2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | | IMO users ought not to be able to use 16-bit segments without using modify_ldt. Fortunately, it's impossible to break espfix64 by loading the PER_CPU segment into SS because it's PER_CPU is marked read-only and SS cannot contain an RO segment, but marking PER_CPU as 32-bit is less fragile. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/179f490d659307873eefd09206bebd417e2ab5ad.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/vdso: Make the PER_CPU segment start out accessedAndy Lutomirski2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The first userspace attempt to read or write the PER_CPU segment will write the accessed bit to the GDT. This is visible to userspace using the LAR instruction, and it also pointlessly dirties a cache line. Set the segment's accessed bit at boot to prevent userspace access to segments from having side effects. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/ac63814ca4c637a08ec2fd0360d67ca67560a9ee.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86/vdso: Change the PER_CPU segment to use struct desc_structAndy Lutomirski2014-10-28
| | | | | | | | | | | | | | | | | | This makes it easier to see what's going on. It produces exactly the same segment descriptor as the old code. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/d492f7b55136cbc60f016adae79160707b2e03b7.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86_64/vdso: Move getcpu code from vsyscall_64.c to vdso/vma.cAndy Lutomirski2014-10-28
| | | | | | | | | | | | | | | | | | | | This is pure cut-and-paste. At this point, vsyscall_64.c contains only code needed for vsyscall emulation, but some of the comments and function names are still confused. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/a244daf7d3cbe71afc08ad09fdfe1866ca1f1978.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * x86_64/vsyscall: Move all of the gate_area code to vsyscall_64.cAndy Lutomirski2014-10-28
| | | | | | | | | | | | | | | | | | | | This code exists for the sole purpose of making the vsyscall page look sort of like real userspace memory. Move it so that it lives with the rest of the vsyscall code. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/a7ee266773671a05f00b7175ca65a0dd812d2e4b.1411494540.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | Merge branch 'x86-ras-for-linus' of ↵Linus Torvalds2014-12-10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS update from Ingo Molnar: "The biggest change in this cycle is better support for UCNA (UnCorrected No Action) events: "Handle all uncorrected error reports in the same way (soft offline the page). We used to only do that for SRAO (software recoverable action optional) machine checks, but it makes sense to also do it for UCNA (UnCorrected No Action) logs found by CMCI or polling." plus various x86 MCE handling updates and fixes" * 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Spell "panicked" correctly x86, mce: Support memory error recovery for both UCNA and Deferred error in machine_check_poll x86, mce, severity: Extend the the mce_severity mechanism to handle UCNA/DEFERRED error x86, MCE, AMD: Assign interrupt handler only when bank supports it x86, MCE, AMD: Drop software-defined bank in error thresholding x86, MCE, AMD: Move invariant code out from loop body x86, MCE, AMD: Correct thresholding error logging x86, MCE, AMD: Use macros to compute bank MSRs RAS, HWPOISON: Fix wrong error recovery status GHES: Make ghes_estatus_caches static APEI, GHES: Cleanup unnecessary function for lockless list
| * | x86/mce: Spell "panicked" correctlyBorislav Petkov2014-12-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need the additional "k" to make it a hard-c: https://en.wiktionary.org/wiki/panicked Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1417642605-15730-1-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | Merge tag 'please-pull-ucna' of ↵Thomas Gleixner2014-11-21
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras Merge RAS updates from Tony Luck: "Handle all uncorrected error reports in the same way (soft offline the page). We used to only do that for SRAO (software recoverable action optional) machine checks, but it makes sense to also do it for UCNA (UnCorrected No Action) logs found by CMCI or polling."
| | * | x86, mce: Support memory error recovery for both UCNA and Deferred error in ↵Chen Yucong2014-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | machine_check_poll Uncorrected no action required (UCNA) - is a uncorrected recoverable machine check error that is not signaled via a machine check exception and, instead, is reported to system software as a corrected machine check error. UCNA errors indicate that some data in the system is corrupted, but the data has not been consumed and the processor state is valid and you may continue execution on this processor. UCNA errors require no action from system software to continue execution. Note that UCNA errors are supported by the processor only when IA32_MCG_CAP[24] (MCG_SER_P) is set. -- Intel SDM Volume 3B Deferred errors are errors that cannot be corrected by hardware, but do not cause an immediate interruption in program flow, loss of data integrity, or corruption of processor state. These errors indicate that data has been corrupted but not consumed. Hardware writes information to the status and address registers in the corresponding bank that identifies the source of the error if deferred errors are enabled for logging. Deferred errors are not reported via machine check exceptions; they can be seen by polling the MCi_STATUS registers. -- AMD64 APM Volume 2 Above two items, both UCNA and Deferred errors belong to detected errors, but they can't be corrected by hardware, and this is very similar to Software Recoverable Action Optional (SRAO) errors. Therefore, we can take some actions that have been used for handling SRAO errors to handle UCNA and Deferred errors. Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Chen Yucong <slaoub@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
| | * | x86, mce, severity: Extend the the mce_severity mechanism to handle ↵Chen Yucong2014-11-19
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | UCNA/DEFERRED error Until now, the mce_severity mechanism can only identify the severity of UCNA error as MCE_KEEP_SEVERITY. Meanwhile, it is not able to filter out DEFERRED error for AMD platform. This patch extends the mce_severity mechanism for handling UCNA/DEFERRED error. In order to do this, the patch introduces a new severity level - MCE_UCNA/DEFERRED_SEVERITY. In addition, mce_severity is specific to machine check exception, and it will check MCIP/EIPV/RIPV bits. In order to use mce_severity mechanism in non-exception context, the patch also introduces a new argument (is_excp) for mce_severity. `is_excp' is used to explicitly specify the calling context of mce_severity. Reviewed-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Signed-off-by: Chen Yucong <slaoub@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | x86, MCE, AMD: Assign interrupt handler only when bank supports itChen Yucong2014-11-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some AMD CPU models which have thresholding banks but which cannot generate a thresholding interrupt. This is denoted by the bit MCi_MISC[IntP]. Make sure to check that bit before assigning the thresholding interrupt handler. Signed-off-by: Chen Yucong <slaoub@gmail.com> [ Boris: save an indentation level and rewrite commit message. ] Link: http://lkml.kernel.org/r/1412662128.28440.18.camel@debian Signed-off-by: Borislav Petkov <bp@suse.de>
| * | x86, MCE, AMD: Drop software-defined bank in error thresholdingBorislav Petkov2014-10-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aravind had the good question about why we're assigning a software-defined bank when reporting error thresholding errors instead of simply using the bank which reports the last error causing the overflow. Digging through git history, it pointed to 95268664390b ("[PATCH] x86_64: mce_amd support for family 0x10 processors") which added that functionality. The problem with this, however, is that tools don't know about software-defined banks and get puzzled. So drop that K8_MCE_THRESHOLD_BASE and simply use the hw bank reporting the thresholding interrupt. Save us a couple of MSR reads while at it. Reported-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: https://lkml.kernel.org/r/5435B206.60402@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
| * | x86, MCE, AMD: Move invariant code out from loop bodyChen Yucong2014-10-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Assigning to mce_threshold_vector is loop-invariant code in mce_amd_feature_init(). So do it only once, out of loop body. Signed-off-by: Chen Yucong <slaoub@gmail.com> Link: http://lkml.kernel.org/r/1412263212.8085.6.camel@debian [ Boris: commit message corrections. ] Signed-off-by: Borislav Petkov <bp@suse.de>
| * | x86, MCE, AMD: Correct thresholding error loggingChen Yucong2014-10-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mce_setup() does not gather the content of IA32_MCG_STATUS, so it should be read explicitly. Moreover, we need to clear IA32_MCx_STATUS to avoid that mce_log() logs the processed threshold event again at next time. But we do the logging ourselves and machine_check_poll() is completely useless there. So kill it. Signed-off-by: Chen Yucong <slaoub@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de>
| * | x86, MCE, AMD: Use macros to compute bank MSRsChen Yucong2014-10-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid open coded calculations for bank MSRs by hiding the index of higher bank MSRs in well-defined macros. No semantic changes. Signed-off-by: Chen Yucong <slaoub@gmail.com> Link: http://lkml.kernel.org/r/1411438561-24319-1-git-send-email-slaoub@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
| * | RAS, HWPOISON: Fix wrong error recovery statusChen, Gong2014-10-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When Uncorrected error happens, if the poisoned page is referenced by more than one user after error recovery, the recovery is not successful. But currently the display result is wrong. Before this patch: MCE 0x44e336: dirty mlocked LRU page recovery: Recovered MCE 0x44e336: dirty mlocked LRU page still referenced by 1 users mce: Memory error not recovered After this patch: MCE 0x44e336: dirty mlocked LRU page recovery: Failed MCE 0x44e336: dirty mlocked LRU page still referenced by 1 users mce: Memory error not recovered Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1406530260-26078-3-git-send-email-gong.chen@linux.intel.com Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
| * | GHES: Make ghes_estatus_caches staticBorislav Petkov2014-10-21
| | | | | | | | | | | | | | | | | | It is used only in ghes.c. Signed-off-by: Borislav Petkov <bp@suse.de>
| * | APEI, GHES: Cleanup unnecessary function for lockless listChen, Gong2014-10-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a generic function to reverse a lockless list, kill homegrown copy. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1406530260-26078-2-git-send-email-gong.chen@linux.intel.com Acked-by: Tony Luck <tony.luck@intel.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> [ Boris: correct commit msg ] Signed-off-by: Borislav Petkov <bp@suse.de>
* | | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2014-12-10
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm tree changes from Ingo Molnar: "The biggest change is full PAT support from Jürgen Gross: The x86 architecture offers via the PAT (Page Attribute Table) a way to specify different caching modes in page table entries. The PAT MSR contains 8 entries each specifying one of 6 possible cache modes. A pte references one of those entries via 3 bits: _PAGE_PAT, _PAGE_PWT and _PAGE_PCD. The Linux kernel currently supports only 4 different cache modes. The PAT MSR is set up in a way that the setting of _PAGE_PAT in a pte doesn't matter: the top 4 entries in the PAT MSR are the same as the 4 lower entries. This results in the kernel not supporting e.g. write-through mode. Especially this cache mode would speed up drivers of video cards which now have to use uncached accesses. OTOH some old processors (Pentium) don't support PAT correctly and the Xen hypervisor has been using a different PAT MSR configuration for some time now and can't change that as this setting is part of the ABI. This patch set abstracts the cache mode from the pte and introduces tables to translate between cache mode and pte bits (the default cache mode "write back" is hard-wired to PAT entry 0). The tables are statically initialized with values being compatible to old processors and current usage. As soon as the PAT MSR is changed (or - in case of Xen - is read at boot time) the tables are changed accordingly. Requests of mappings with special cache modes are always possible now, in case they are not supported there will be a fallback to a compatible but slower mode. Summing it up, this patch set adds the following features: - capability to support WT and WP cache modes on processors with full PAT support - processors with no or uncorrect PAT support are still working as today, even if WT or WP cache mode are selected by drivers for some pages - reduction of Xen special handling regarding cache mode Another change is a boot speedup on ridiculously large RAM systems, plus other smaller fixes" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) x86: mm: Move PAT only functions to mm/pat.c xen: Support Xen pv-domains using PAT x86: Enable PAT to use cache mode translation tables x86: Respect PAT bit when copying pte values between large and normal pages x86: Support PAT bit in pagetable dump for lower levels x86: Clean up pgtable_types.h x86: Use new cache mode type in memtype related functions x86: Use new cache mode type in mm/ioremap.c x86: Use new cache mode type in setting page attributes x86: Remove looking for setting of _PAGE_PAT_LARGE in pageattr.c x86: Use new cache mode type in track_pfn_remap() and track_pfn_insert() x86: Use new cache mode type in mm/iomap_32.c x86: Use new cache mode type in asm/pgtable.h x86: Use new cache mode type in arch/x86/mm/init_64.c x86: Use new cache mode type in arch/x86/pci x86: Use new cache mode type in drivers/video/fbdev/vermilion x86: Use new cache mode type in drivers/video/fbdev/gbefb.c x86: Use new cache mode type in include/asm/fb.h x86: Make page cache mode a real type x86: mm: Use 2GB memory block size on large-memory x86-64 systems ...
| * | | x86: mm: Move PAT only functions to mm/pat.cThomas Gleixner2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e00c8cc93c1a "x86: Use new cache mode type in memtype related functions" broke the ARCH=um build. arch/x86/include/asm/cacheflush.h:67:36: error: return type is an incomplete type static inline enum page_cache_mode get_page_memtype(struct page *pg) The reason is simple. get_page_memtype() and set_page_memtype() require enum page_cache_mode now, which is defined in asm/pgtable_types.h. UM does not include that file for obvious reasons. The simple solution is to move that functions to arch/x86/mm/pat.c where the only callsites of this are located. They should have been there in the first place. Fixes: e00c8cc93c1a "x86: Use new cache mode type in memtype related functions" Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Juergen Gross <jgross@suse.com> Cc: Richard Weinberger <richard@nod.at>
| * | | xen: Support Xen pv-domains using PATJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the dynamical mapping between cache modes and pgprot values it is now possible to use all cache modes via the Xen hypervisor PAT settings in a pv domain. All to be done is to read the PAT configuration MSR and set up the translation tables accordingly. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: ville.syrjala@linux.intel.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-19-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Enable PAT to use cache mode translation tablesJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the translation tables from cache mode to pgprot values according to the PAT settings. This enables changing the cache attributes of a PAT index in just one place without having to change at the users side. With this change it is possible to use the same kernel with different PAT configurations, e.g. supporting Xen. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Toshi Kani <toshi.kani@hp.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-18-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Respect PAT bit when copying pte values between large and normal pagesJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PAT bit in the ptes is not moved to the correct position when copying page protection attributes between entries of different sized pages. Translate the ptes according to their page size. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-17-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Support PAT bit in pagetable dump for lower levelsJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dumping page table protection bits is not correct for entries on levels 2 and 3 regarding the PAT bit, which is at a different position as on level 4. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-16-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Clean up pgtable_types.hJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove no longer used defines from pgtable_types.h as they are not used any longer. Switch __PAGE_KERNEL_NOCACHE to use cache mode type instead of pte bits. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-15-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in memtype related functionsJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-14-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in mm/ioremap.cJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-13-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in setting page attributesJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type in the functions for modifying page attributes. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-12-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Remove looking for setting of _PAGE_PAT_LARGE in pageattr.cJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When modifying page attributes via change_page_attr_set_clr() don't test for setting _PAGE_PAT_LARGE, as this is - never done - PAT support for large pages is not included in the kernel up to now Signed-off-by: Juergen Gross <jgross@suse.com> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-11-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in track_pfn_remap() and track_pfn_insert()Juergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. As those are the main callers of lookup_memtype(), change this as well. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-10-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in mm/iomap_32.cJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. This requires to change io_reserve_memtype() as well. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-9-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in asm/pgtable.hJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. This requires changing some callers of is_new_memtype_allowed() to be changed as well. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-8-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in arch/x86/mm/init_64.cJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-7-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in arch/x86/pciJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-6-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in drivers/video/fbdev/vermilionJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-5-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in drivers/video/fbdev/gbefb.cJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using the cache mode bits in the pte switch to using the cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-4-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Use new cache mode type in include/asm/fb.hJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of directly using cache mode bits in the pte switch to usage of the new cache mode type. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-3-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: Make page cache mode a real typeJuergen Gross2014-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment there are a lot of places that handle setting or getting the page cache mode by treating the pgprot bits equal to the cache mode. This is only true because there are a lot of assumptions about the setup of the PAT MSR. Otherwise the cache type needs to get translated into pgprot bits and vice versa. This patch tries to prepare for that by introducing a separate type for the cache mode and adding functions to translate between those and pgprot values. To avoid too much performance penalty the translation between cache mode and pgprot values is done via tables which contain the relevant information. Write-back cache mode is hard-wired to be 0, all other modes are configurable via those tables. For large pages there are translation functions as the PAT bit is located at different positions in the ptes of 4k and large pages. Based-on-patch-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stefan.bader@canonical.com Cc: xen-devel@lists.xensource.com Cc: konrad.wilk@oracle.com Cc: ville.syrjala@linux.intel.com Cc: david.vrabel@citrix.com Cc: jbeulich@suse.com Cc: toshi.kani@hp.com Cc: plagnioj@jcrosoft.com Cc: tomi.valkeinen@ti.com Cc: bhelgaas@google.com Link: http://lkml.kernel.org/r/1415019724-4317-2-git-send-email-jgross@suse.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: mm: Use 2GB memory block size on large-memory x86-64 systemsDaniel J Blueman2014-11-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On large-memory x86-64 systems of 64GB or more with memory hot-plug enabled, use a 2GB memory block size. Eg with 64GB memory, this reduces the number of directories in /sys/devices/system/memory from 512 to 32, making it more manageable, and reducing the creation time accordingly. This caveat is that the memory can't be offlined (for hotplug or otherwise) with the finer default 128MB granularity, but this is unimportant due to the high memory densities generally used with such large-memory systems, where eg a single DIMM is the order of 16GB. Signed-off-by: Daniel J Blueman <daniel@numascale.com> Cc: Steffen Persvold <sp@numascale.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Link: http://lkml.kernel.org/r/1415089784-28779-4-git-send-email-daniel@numascale.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | x86: mm: Re-use the early_ioremap fixed areaMinfei Huang2014-11-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The temp fixed area is only used during boot for early_ioremap(), and it is unused when ioremap() is functional. vmalloc/pkmap area become available after early boot so the temp fixed area is available for re-use. The virtual address is more precious on i386, especially turning on high memory. So we can re-use the virtual address space. Remove the now unused defines FIXADDR_BOOT_START and FIXADDR_BOOT_SIZE. Signed-off-by: Minfei Huang <mnfhuang@gmail.com> Cc: pbonzini@redhat.com Cc: bp@suse.de Link: http://lkml.kernel.org/r/1414582717-32729-1-git-send-email-mnfhuang@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>