-rw-r--r-- Documentation/filesystems/proc.txt | 80
1 files changed, 63 insertions, 17 deletions
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 07231afca72d..e2799b5fafea 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1336,7 +1336,7 @@ legacy_va_layout
 If non-zero, this sysctl disables the new 32-bit mmap mmap layout - the kernel
 will use the legacy (2.4) layout for all processes.
 
-lower_zone_protection
+lowmem_reserve_ratio
 ---------------------
 
 For some specialised workloads on highmem machines it is dangerous for
@@ -1356,25 +1356,71 @@ captured into pinned user memory.
 mechanism will also defend that region from allocations which could use
 highmem or lowmem).
 
-The `lower_zone_protection' tunable determines how aggressive the kernel is
-in defending these lower zones. The default value is zero - no
-protection at all.
+The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is
+in defending these lower zones.
 
 If you have a machine which uses highmem or ISA DMA and your
 applications are using mlock(), or if you are running with no swap then
-you probably should increase the lower_zone_protection setting.
+you probably should change the lowmem_reserve_ratio setting.
 
-The units of this tunable are fairly vague. It is approximately equal
-to "megabytes," so setting lower_zone_protection=100 will protect around 100
-megabytes of the lowmem zone from user allocations. It will also make
-those 100 megabytes unavailable for use by applications and by
-pagecache, so there is a cost.
-
-The effects of this tunable may be observed by monitoring
-/proc/meminfo:LowFree. Write a single huge file and observe the point
-at which LowFree ceases to fall.
-
-A reasonable value for lower_zone_protection is 100.
+The lowmem_reserve_ratio is an array. You can see them by reading this file.
+-
+% cat /proc/sys/vm/lowmem_reserve_ratio
+256 256 32
+-
+Note: # of this elements is one fewer than number of zones. Because the highest
+ zone's value is not necessary for following calculation.
+
+But, these values are not used directly. The kernel calculates # of protection
+pages for each zones from them. These are shown as array of protection pages
+in /proc/zoneinfo like followings. (This is an example of x86-64 box).
+Each zone has an array of protection pages like this.
+
+-
+Node 0, zone DMA
+ pages free 1355
+ min 3
+ low 3
+ high 4
+ :
+ :
+ numa_other 0
+ protection: (0, 2004, 2004, 2004)
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ pagesets
+ cpu: 0 pcp: 0
+ :
+-
+These protections are added to score to judge whether this zone should be used
+for page allocation or should be reclaimed.
+
+In this example, if normal pages (index=2) are required to this DMA zone and
+pages_high is used for watermark, the kernel judges this zone should not be
+used because pages_free(1355) is smaller than watermark + protection[2]
+(4 + 2004 = 2008). If this protection value is 0, this zone would be used for
+normal page requirement. If requirement is DMA zone(index=0), protection[0]
+(=0) is used.
+
+zone[i]'s protection[j] is calculated by following exprssion.
+
+(i < j):
+ zone[i]->protection[j]
+ = (total sums of present_pages from zone[i+1] to zone[j] on the node)
+ / lowmem_reserve_ratio[i];
+(i = j):
+ (should not be protected. = 0;
+(i > j):
+ (not necessary, but looks 0)
+
+The default values of lowmem_reserve_ratio[i] are
+ 256 (if zone[i] means DMA or DMA32 zone)
+ 32 (others).
+As above expression, they are reciprocal number of ratio.
+256 means 1/256. # of protection pages becomes about "0.39%" of total present
+pages of higher zones on the node.
+
+If you would like to protect more pages, smaller values are effective.
+The minimum value is 1 (1/1 -> 100%).
 
 page-cluster
 ------------
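
The protection expression and the worked DMA-zone example in the added text can be modelled with a short Python sketch. This is an illustration only, not kernel code: the function names and the present_pages values are hypothetical (the page counts are chosen so that the DMA row reproduces the "protection: (0, 2004, 2004, 2004)" line shown in the patch), and integer (floor) division is assumed.

```python
def protection_table(present_pages, ratios):
    """Model of zone[i]->protection[j] for the zones of one node.

    present_pages: pages per zone, lowest zone first (hypothetical input).
    ratios: lowmem_reserve_ratio array, one entry fewer than the zones.
    """
    n = len(present_pages)
    table = [[0] * n for _ in range(n)]  # i >= j entries stay 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Sum of present_pages from zone[i+1] to zone[j], inclusive.
            higher = sum(present_pages[i + 1 : j + 1])
            table[i][j] = higher // ratios[i]  # floor division (assumed)
    return table


def zone_usable(pages_free, watermark, protection, request_index):
    """Per the text: the zone is skipped when pages_free is smaller
    than watermark + protection[request_index]."""
    return pages_free >= watermark + protection[request_index]


# Made-up page counts for four zones, lowest first.
present_pages = [4096, 513024, 0, 0]
ratios = [256, 256, 32]  # values shown by the 'cat' example in the patch

table = protection_table(present_pages, ratios)
dma_protection = table[0]

# Request for normal pages (index=2) against the DMA zone, pages_high = 4.
normal_ok = zone_usable(1355, 4, dma_protection, 2)
# Request for DMA pages (index=0) against the DMA zone.
dma_ok = zone_usable(1355, 4, dma_protection, 0)

print(dma_protection, normal_ok, dma_ok)
```

With these inputs the DMA zone is rejected for the index=2 request (1355 < 4 + 2004) and accepted for the index=0 request, which matches the reasoning in the added documentation. Lowering ratios[0] raises the DMA row's protection values accordingly.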