Diffstat (limited to 'Documentation/filesystems/proc.txt')
-rw-r--r-- Documentation/filesystems/proc.txt 95
1 files changed, 78 insertions, 17 deletions
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 0b1b0c008613..e2799b5fafea 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1315,13 +1315,28 @@ for writeout by the pdflush daemons. It is expressed in 100'ths of a second.
 Data which has been dirty in-memory for longer than this interval will be
 written out next time a pdflush daemon wakes up.
 
+highmem_is_dirtyable
+--------------------
+
+Only present if CONFIG_HIGHMEM is set.
+
+This defaults to 0 (false), meaning that the ratios set above are calculated
+as a percentage of lowmem only. This protects against excessive scanning
+in page reclaim, swapping and general VM distress.
+
+Setting this to 1 can be useful on 32 bit machines where you want to make
+random changes within an MMAPed file that is larger than your available
+lowmem without causing large quantities of random IO. It is safe if the
+behavior of all programs running on the machine is known and memory will
+not be otherwise stressed.
+
 legacy_va_layout
 ----------------
 
 If non-zero, this sysctl disables the new 32-bit mmap mmap layout - the kernel
 will use the legacy (2.4) layout for all processes.
 
-lower_zone_protection
+lowmem_reserve_ratio
 ---------------------
 
 For some specialised workloads on highmem machines it is dangerous for
@@ -1341,25 +1356,71 @@ captured into pinned user memory.
 mechanism will also defend that region from allocations which could use
 highmem or lowmem).
 
-The `lower_zone_protection' tunable determines how aggressive the kernel is
-in defending these lower zones. The default value is zero - no
-protection at all.
+The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is
+in defending these lower zones.
 
 If you have a machine which uses highmem or ISA DMA and your
 applications are using mlock(), or if you are running with no swap then
-you probably should increase the lower_zone_protection setting.
+you probably should change the lowmem_reserve_ratio setting.
 
-The units of this tunable are fairly vague. It is approximately equal
-to "megabytes," so setting lower_zone_protection=100 will protect around 100
-megabytes of the lowmem zone from user allocations. It will also make
-those 100 megabytes unavailable for use by applications and by
-pagecache, so there is a cost.
-
-The effects of this tunable may be observed by monitoring
-/proc/meminfo:LowFree. Write a single huge file and observe the point
-at which LowFree ceases to fall.
-
-A reasonable value for lower_zone_protection is 100.
+The lowmem_reserve_ratio is an array. You can see it by reading this file.
+-
+% cat /proc/sys/vm/lowmem_reserve_ratio
+256 256 32
+-
+Note: the number of elements is one fewer than the number of zones, because the
+      highest zone's value is not needed for the following calculation.
+
+However, these values are not used directly. The kernel calculates the number
+of protection pages for each zone from them. These are shown as an array of
+protection pages in /proc/zoneinfo, as below (an example from an x86-64 box).
+Each zone has an array of protection pages like this.
+
+-
+Node 0, zone      DMA
+  pages free     1355
+        min      3
+        low      3
+        high     4
+        :
+        :
+    numa_other   0
+        protection: (0, 2004, 2004, 2004)
+        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  pagesets
+    cpu: 0 pcp: 0
+        :
+-
+These protections are added to the score used to judge whether this zone
+should be used for page allocation or should be reclaimed.
+
+In this example, if normal pages (index=2) are requested from this DMA zone
+and pages_high is used as the watermark, the kernel judges that this zone
+should not be used because pages_free (1355) is smaller than watermark +
+protection[2] (4 + 2004 = 2008). If this protection value were 0, this zone
+would be used for the normal page request. If the request is for the DMA zone
+(index=0), protection[0] (=0) is used.
+
+zone[i]'s protection[j] is calculated by the following expression.
+
+(i < j):
+  zone[i]->protection[j]
+  = (total sums of present_pages from zone[i+1] to zone[j] on the node)
+    / lowmem_reserve_ratio[i];
+(i = j):
+   (should not be protected) = 0;
+(i > j):
+   (not necessary, but appears as 0)
+
+The default values of lowmem_reserve_ratio[i] are
+    256 (if zone[i] means DMA or DMA32 zone)
+    32  (others).
+As the above expression shows, they are the reciprocals of the ratios.
+256 means 1/256: the number of protection pages becomes about 0.39% of the
+total present pages of higher zones on the node.
+
+If you would like to protect more pages, smaller values are effective.
+The minimum value is 1 (1/1 -> 100%).
 
 page-cluster
 ------------
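
A minimal sketch, in Python, of the protection[] calculation described in the
lowmem_reserve_ratio text added by the hunk above. It is an illustration, not
the kernel's code. The function names and the present_pages figures are
hypothetical: the present_pages values were picked only so that the DMA row
reproduces the (0, 2004, 2004, 2004) array from the /proc/zoneinfo example,
and integer division is an assumption, since the text does not say how the
quotient is rounded.

```python
def zone_protection(present_pages, reserve_ratio):
    """Return prot[i][j] for every zone i on one node.

    present_pages: present_pages per zone, lowest zone first (hypothetical input).
    reserve_ratio: lowmem_reserve_ratio values, one fewer than the zone count.
    """
    n = len(present_pages)
    prot = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # (i < j): sum of present_pages from zone[i+1] to zone[j],
            # divided by lowmem_reserve_ratio[i] (integer division assumed).
            higher = sum(present_pages[i + 1:j + 1])
            prot[i][j] = higher // reserve_ratio[i]
        # (i >= j) entries stay 0, as in the documented expression.
    return prot


def zone_usable(pages_free, watermark, protection):
    """Follow the text's wording: the zone is skipped when pages_free is
    smaller than watermark + protection."""
    return pages_free >= watermark + protection


# Hypothetical four-zone node; ratios are the documented defaults "256 256 32".
present = [4096, 513024, 0, 0]
ratios = [256, 256, 32]
prot = zone_protection(present, ratios)

dma_row = prot[0]                                  # [0, 2004, 2004, 2004]
normal_from_dma = zone_usable(1355, 4, dma_row[2])  # False: 1355 < 4 + 2004
dma_from_dma = zone_usable(1355, 4, dma_row[0])     # True: protection[0] is 0
```

With a smaller ratio for the DMA zone (for example 32 instead of 256) the same
call yields a larger protection value, which matches the statement that
smaller values protect more pages.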