-rw-r--r--  Documentation/x86/x86_64/5level-paging.txt  64
-rw-r--r--  arch/x86/Kconfig                            19
-rw-r--r--  arch/x86/xen/Kconfig                         5
3 files changed, 88 insertions, 0 deletions
diff --git a/Documentation/x86/x86_64/5level-paging.txt b/Documentation/x86/x86_64/5level-paging.txt
new file mode 100644
index 000000000000..087251a0d99c
--- /dev/null
+++ b/Documentation/x86/x86_64/5level-paging.txt
@@ -0,0 +1,64 @@
+== Overview ==
+
+Original x86-64 was limited by 4-level paging to 256 TiB of virtual address
+space and 64 TiB of physical address space. We are already bumping into
+this limit: some vendors offer servers with 64 TiB of memory today.
+
+To overcome the limitation, upcoming hardware will introduce support for
+5-level paging. It is a straightforward extension of the current page
+table structure, adding one more layer of translation.
+
+It bumps the limits to 128 PiB of virtual address space and 4 PiB of
+physical address space. This "ought to be enough for anybody" ©.
+
+QEMU 2.9 and later support 5-level paging.
+
+Virtual memory layout for 5-level paging is described in
+Documentation/x86/x86_64/mm.txt
+
+== Enabling 5-level paging ==
+
+CONFIG_X86_5LEVEL=y enables the feature.
+
+So far, a kernel compiled with the option enabled will be able to boot
+only on machines that support the feature -- look for the 'la57' flag in
+/proc/cpuinfo.
+
+The plan is to implement boot-time switching between 4- and 5-level paging
+in the future.
+
+== User-space and large virtual address space ==
+
+On x86, 5-level paging enables a 56-bit userspace virtual address space.
+Not all user space is ready to handle wide addresses. It's known that
+at least some JIT compilers use the higher bits in pointers to encode
+their own information. These bits collide with valid pointers under
+5-level paging, which leads to crashes.
+
+To mitigate this, we are not going to allocate virtual address space
+above 47-bit by default.
+
+But userspace can ask for allocation from the full address space by
+specifying a hint address (with or without MAP_FIXED) above 47-bit.
+
+If the hint address is set above 47-bit, but MAP_FIXED is not specified,
+we try to look for an unmapped area at the specified address. If it's
+already occupied, we look for an unmapped area in the *full* address
+space, rather than in the 47-bit window.
+
+A high hint address only affects the allocation in question, not any
+future mmap()s.
+
+Specifying a high hint address on an older kernel or on a machine without
+5-level paging support is safe. The hint will be ignored and the kernel
+will fall back to allocation from the 47-bit address space.
+
+This approach makes it easy to make an application's memory allocator
+aware of the large address space without manually tracking allocated
+virtual address space.
+
+One important case we need to handle here is the interaction with MPX.
+MPX (without the MAWA extension) cannot handle addresses above 47-bit, so
+we need to make sure that MPX cannot be enabled if we already have a VMA
+above the boundary, and forbid creating such VMAs once MPX is enabled.
+
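
The hint-address opt-in described in the "User-space and large virtual
address space" section can be sketched from the application side roughly
as follows. This is a minimal illustration, not code from the patch: the
names HIGH_HINT, is_high_addr() and map_anywhere() are hypothetical, the
choice of 1 << 48 as the hint is just one address above the 47-bit
boundary, and a 64-bit Linux build is assumed.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical hint: any address above the 47-bit window would do. */
#define HIGH_HINT ((void *)(1ULL << 48))

/* Returns 1 if addr lies above the 47-bit window, 0 otherwise. */
static int is_high_addr(const void *addr)
{
	return ((uintptr_t)addr >> 47) != 0;
}

/*
 * Map len bytes of anonymous memory, passing a hint above 47-bit to ask
 * for allocation from the full address space. MAP_FIXED is deliberately
 * not used, so the kernel is free to place the mapping elsewhere.
 * Returns MAP_FAILED on error, like mmap() itself.
 */
static void *map_anywhere(size_t len)
{
	return mmap(HIGH_HINT, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

A caller would check the result against MAP_FAILED and could then use
is_high_addr() on the returned pointer to see whether the mapping really
landed above the boundary. On a kernel or machine without 5-level paging
the call is still expected to succeed, with the mapping placed inside the
47-bit window.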
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8328bcb9ce8b..ff637dedfafa 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -326,6 +326,7 @@ config FIX_EARLYCON_MEM
 
 config PGTABLE_LEVELS
 	int
+	default 5 if X86_5LEVEL
 	default 4 if X86_64
 	default 3 if X86_PAE
 	default 2
@@ -1398,6 +1399,24 @@ config X86_PAE
 	  has the cost of more pagetable lookup overhead, and also
 	  consumes more pagetable space per process.
 
+config X86_5LEVEL
+	bool "Enable 5-level page tables support"
+	depends on X86_64
+	---help---
+	  5-level paging enables access to a larger address space:
+	  up to 128 PiB of virtual address space and 4 PiB of
+	  physical address space.
+
+	  It will be supported by future Intel CPUs.
+
+	  Note: a kernel with this option enabled can only be booted
+	  on machines that support the feature.
+
+	  See Documentation/x86/x86_64/5level-paging.txt for more
+	  information.
+
+	  Say N if unsure.
+
 config ARCH_PHYS_ADDR_T_64BIT
 	def_bool y
 	depends on X86_64 || X86_PAE
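
The address-space limits quoted in the help text follow from the number
of page table levels: on x86-64 each level translates 9 bits (512
entries per table) on top of a 12-bit page offset. The sketch below
works through that arithmetic. The helper names va_bits() and size_tib()
are made up for illustration and are not kernel functions. The formula
only applies to 4- and 5-level 64-bit paging, not to the 2- and 3-level
32-bit configurations, and the physical widths of 46 and 52 bits are
taken here as the widths that correspond to 64 TiB and 4 PiB.

```c
#include <stdint.h>

#define PAGE_SHIFT	12	/* 4 KiB pages: 12 offset bits */
#define BITS_PER_LEVEL	9	/* 512 entries per page table */

/* Virtual address width for a given number of 64-bit paging levels. */
static unsigned int va_bits(unsigned int levels)
{
	return PAGE_SHIFT + BITS_PER_LEVEL * levels;
}

/* Size in TiB of an address space that is `bits` wide (bits >= 40). */
static uint64_t size_tib(unsigned int bits)
{
	return 1ULL << (bits - 40);
}
```

With these helpers, va_bits(4) gives 48 and size_tib(48) gives 256 TiB,
while va_bits(5) gives 57 and size_tib(57) gives 131072 TiB, that is,
128 PiB. Likewise size_tib(46) is 64 TiB and size_tib(52) is 4096 TiB,
that is, 4 PiB.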
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 027987638e98..1ecd419811a2 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -17,6 +17,9 @@ config XEN_PV
 	bool "Xen PV guest support"
 	default y
 	depends on XEN
+	# XEN_PV is not ready to work with 5-level paging.
+	# Changes to hypervisor are also required.
+	depends on !X86_5LEVEL
 	select XEN_HAVE_PVMMU
 	select XEN_HAVE_VPMU
 	help
@@ -75,4 +78,6 @@ config XEN_DEBUG_FS
 config XEN_PVH
 	bool "Support for running as a PVH guest"
 	depends on XEN && XEN_PVHVM && ACPI
+	# Pre-built page tables are not ready to handle 5-level paging.
+	depends on !X86_5LEVEL
 	def_bool n
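
Since a kernel built with CONFIG_X86_5LEVEL=y only boots on machines
that support the feature, it can be useful to check for the 'la57' flag
in /proc/cpuinfo before installing one. The following is a rough sketch
of such a check from user space. The function names has_token() and
cpu_has_la57() are hypothetical, and the code assumes the usual
/proc/cpuinfo layout, where each CPU has a line beginning with "flags"
followed by space-separated feature names.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if tok appears in line as a whole whitespace-separated word. */
static int has_token(const char *line, const char *tok)
{
	size_t n = strlen(tok);
	const char *p = line;

	if (n == 0)
		return 0;

	while ((p = strstr(p, tok)) != NULL) {
		int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		int ends = p[n] == '\0' || p[n] == ' ' ||
			   p[n] == '\t' || p[n] == '\n';

		if (starts && ends)
			return 1;
		p += n;	/* skip this partial match and keep looking */
	}
	return 0;
}

/*
 * Scan a cpuinfo-style file for 'la57' on a "flags" line.
 * Returns 1 if found, 0 if not found, -1 if the file cannot be opened.
 */
static int cpu_has_la57(const char *path)
{
	char line[8192];
	int found = 0;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;

	while (!found && fgets(line, sizeof(line), f)) {
		if (strncmp(line, "flags", 5) == 0)
			found = has_token(line, "la57");
	}
	fclose(f);
	return found;
}
```

Calling cpu_has_la57("/proc/cpuinfo") on the target machine would then
report whether the CPU advertises 5-level paging. Taking the path as a
parameter keeps the parser testable against a saved copy of the file.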