Diffstat (limited to 'Documentation/sysctl/vm.txt')
-rw-r--r-- Documentation/sysctl/vm.txt 45
1 files changed, 45 insertions, 0 deletions
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 8cfca173d4bc..df3ff2095f9d 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -32,6 +32,7 @@ Currently, these files are in /proc/sys/vm:
- min_slab_ratio
- panic_on_oom
- mmap_min_address
- numa_zonelist_order

==============================================================

@@ -231,3 +232,47 @@ security module. Setting this value to something like 64k will allow the
vast majority of applications to work correctly and provide defense in depth
against future potential kernel bugs.

==============================================================

numa_zonelist_order

This sysctl is only for NUMA.
Zonelists control where memory is allocated from.
(This documentation ignores ZONE_HIGHMEM/ZONE_DMA32 to keep the explanation
 simple. You may be able to read ZONE_DMA as ZONE_DMA32...)

In the non-NUMA case, the zonelist for GFP_KERNEL is ordered as follows:
ZONE_NORMAL -> ZONE_DMA
This means that a GFP_KERNEL memory allocation request will get memory
from ZONE_DMA only when ZONE_NORMAL is not available.

In the NUMA case, there are two possible orders.
Assume a 2-node NUMA system; below are the candidate zonelists for
Node(0)'s GFP_KERNEL allocations.

(A) Node(0) ZONE_NORMAL -> Node(0) ZONE_DMA -> Node(1) ZONE_NORMAL
(B) Node(0) ZONE_NORMAL -> Node(1) ZONE_NORMAL -> Node(0) ZONE_DMA

Type (A) offers the best locality for processes on Node(0), but ZONE_DMA
will be used before the ZONE_NORMAL memory of other nodes is exhausted.
This increases the possibility of out-of-memory (OOM) in ZONE_DMA because
ZONE_DMA tends to be small.

Type (B) cannot offer the best locality but is more robust against OOM of
the DMA zone.

Type (A) is called "Node" order. Type (B) is called "Zone" order.
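
The trade-off between the two types can be illustrated with a toy model
that walks a zonelist and takes pages from the first zone that still has
free memory. This is only a sketch: the allocate() helper and the page
counts are assumptions made up for illustration, not kernel code.

```python
# Toy model only: the helper, zone labels and page counts are hypothetical.

def allocate(zonelist, free_pages, count):
    """Take `count` pages one at a time, walking the zonelist in order.

    Returns a dict mapping each zone to the number of pages taken from it.
    """
    taken = {zone: 0 for zone in zonelist}
    for _ in range(count):
        for zone in zonelist:
            if free_pages[zone] > 0:
                free_pages[zone] -= 1
                taken[zone] += 1
                break
        else:
            # No zone in the list had a free page.
            raise MemoryError("all zones in the zonelist are exhausted")
    return taken


# Zonelists for Node(0), copied from types (A) and (B) in the text.
NODE_ORDER = ["Node(0) ZONE_NORMAL", "Node(0) ZONE_DMA", "Node(1) ZONE_NORMAL"]
ZONE_ORDER = ["Node(0) ZONE_NORMAL", "Node(1) ZONE_NORMAL", "Node(0) ZONE_DMA"]


def fresh_free_pages():
    # Assumed sizes: ZONE_DMA is deliberately much smaller than ZONE_NORMAL.
    return {
        "Node(0) ZONE_NORMAL": 100,
        "Node(0) ZONE_DMA": 4,
        "Node(1) ZONE_NORMAL": 100,
    }


# Request slightly more than Node(0) ZONE_NORMAL can hold.
taken_a = allocate(NODE_ORDER, fresh_free_pages(), 110)
taken_b = allocate(ZONE_ORDER, fresh_free_pages(), 110)
print("type (A):", taken_a)
print("type (B):", taken_b)
```

Under these assumed sizes, type (A) drains ZONE_DMA completely while
Node(1) still has free ZONE_NORMAL pages, whereas type (B) leaves ZONE_DMA
untouched.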

"Node order" orders the zonelists by node, then by zone within each node.
Specify "[Nn]ode" for node order.

"Zone order" orders the zonelists by zone type, then by node within each
zone. Specify "[Zz]one" for zone order.
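
The two orderings amount to sorting the same (node, zone) pairs with the
sort keys swapped. A minimal sketch, assuming that nodes are ranked by the
absolute difference of their ids (a placeholder for a real distance
metric) and using a hypothetical build_zonelist() helper:

```python
# Sketch only: build_zonelist() and the distance placeholder are assumptions.

# Zone types from most to least preferred for GFP_KERNEL, as in the text.
ZONE_PRIORITY = ["ZONE_NORMAL", "ZONE_DMA"]


def build_zonelist(local_node, zones_by_node, order):
    """Return (node, zone) pairs in fallback order for `local_node`.

    zones_by_node maps a node id to the set of zone types on that node.
    order is "node" or "zone".
    """
    entries = [
        (node, zone)
        for node, zones in zones_by_node.items()
        for zone in zones
    ]

    def node_rank(node):
        # Placeholder distance; the node id itself breaks ties.
        return (abs(node - local_node), node)

    def zone_rank(zone):
        return ZONE_PRIORITY.index(zone)

    if order == "node":
        # By node first, then by zone within each node.
        return sorted(entries, key=lambda e: (node_rank(e[0]), zone_rank(e[1])))
    if order == "zone":
        # By zone type first, then by node within each zone.
        return sorted(entries, key=lambda e: (zone_rank(e[1]), node_rank(e[0])))
    raise ValueError("order must be 'node' or 'zone'")


# The 2-node example from the text: only Node(0) has a DMA zone.
topology = {0: {"ZONE_NORMAL", "ZONE_DMA"}, 1: {"ZONE_NORMAL"}}
node_order = build_zonelist(0, topology, "node")
zone_order = build_zonelist(0, topology, "zone")
print(node_order)
print(zone_order)
```

For the 2-node example this reproduces lists (A) and (B) respectively.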

Specify "[Dd]efault" to request automatic configuration. Autoconfiguration
will select "node" order in the following cases:
(1) if the DMA zone does not exist, or
(2) if the DMA zone comprises greater than 50% of the available memory, or
(3) if any node's DMA zone comprises greater than 60% of its local memory and
    the amount of local memory is big enough.

Otherwise, "zone" order will be selected. The default order is recommended
unless it is causing problems for your system/application.
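
The autoconfiguration rules can be read as a small decision function. The
sketch below is an interpretation of the three cases, not the kernel's
code. It assumes that "available memory" means the sum over all nodes, and
it takes the "big enough" threshold as a caller-supplied parameter
(min_local_pages, a hypothetical name) because the text does not state a
value.

```python
# Interpretation of rules (1)-(3); names and the threshold are assumptions.

def default_zonelist_order(nodes, min_local_pages):
    """Pick "node" or "zone" order.

    nodes: list of dicts with per-node page counts under "dma" and "total".
    min_local_pages: assumed meaning of "big enough" for rule (3).
    """
    total_dma = sum(n["dma"] for n in nodes)
    total = sum(n["total"] for n in nodes)

    # (1) The DMA zone does not exist.
    if total_dma == 0:
        return "node"

    # (2) The DMA zone is more than 50% of the available memory.
    if total_dma * 2 > total:
        return "node"

    # (3) Some node's DMA zone is more than 60% of its local memory,
    #     and that node's local memory is big enough.
    for n in nodes:
        if n["total"] >= min_local_pages and n["dma"] * 10 > n["total"] * 6:
            return "node"

    return "zone"


# Small DMA zone on an otherwise large system: zone order protects it.
small_dma = [{"dma": 16, "total": 1000}, {"dma": 0, "total": 1000}]
print(default_zonelist_order(small_dma, min_local_pages=100))

# No DMA zone at all: rule (1) selects node order.
no_dma = [{"dma": 0, "total": 1000}, {"dma": 0, "total": 1000}]
print(default_zonelist_order(no_dma, min_local_pages=100))
```

Integer comparisons such as dma * 10 > total * 6 are used instead of
floating-point percentages so that the "greater than" boundaries are exact.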