diff options
author | Ingo Molnar <mingo@elte.hu> | 2009-01-21 04:39:51 -0500 |
---|---|---|
committer | Ingo Molnar <mingo@elte.hu> | 2009-01-21 04:39:51 -0500 |
commit | 198030782cedf25391e67e7c88b04f87a5eb6563 (patch) | |
tree | 5b7368c6bf052bcb4bb273497a57900720d36f51 /Documentation | |
parent | 4ec71fa2d2c3f1040348f2604f4b8ccc833d1c2e (diff) | |
parent | 92181f190b649f7ef2b79cbf5c00f26ccc66da2a (diff) |
Merge branch 'x86/mm' into core/percpu
Conflicts:
arch/x86/mm/fault.c
Diffstat (limited to 'Documentation')
18 files changed, 605 insertions, 503 deletions
diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index b462bb149543..52441694fe03 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt | |||
@@ -170,16 +170,15 @@ Returns: 0 if successful and a negative error if not. | |||
170 | u64 | 170 | u64 |
171 | dma_get_required_mask(struct device *dev) | 171 | dma_get_required_mask(struct device *dev) |
172 | 172 | ||
173 | After setting the mask with dma_set_mask(), this API returns the | 173 | This API returns the mask that the platform requires to |
174 | actual mask (within that already set) that the platform actually | 174 | operate efficiently. Usually this means the returned mask |
175 | requires to operate efficiently. Usually this means the returned mask | ||
176 | is the minimum required to cover all of memory. Examining the | 175 | is the minimum required to cover all of memory. Examining the |
177 | required mask gives drivers with variable descriptor sizes the | 176 | required mask gives drivers with variable descriptor sizes the |
178 | opportunity to use smaller descriptors as necessary. | 177 | opportunity to use smaller descriptors as necessary. |
179 | 178 | ||
180 | Requesting the required mask does not alter the current mask. If you | 179 | Requesting the required mask does not alter the current mask. If you |
181 | wish to take advantage of it, you should issue another dma_set_mask() | 180 | wish to take advantage of it, you should issue a dma_set_mask() |
182 | call to lower the mask again. | 181 | call to set the mask to the value returned. |
183 | 182 | ||
184 | 183 | ||
185 | Part Id - Streaming DMA mappings | 184 | Part Id - Streaming DMA mappings |
diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c index cc49400b4af8..7ea231172c85 100644 --- a/Documentation/accounting/getdelays.c +++ b/Documentation/accounting/getdelays.c | |||
@@ -392,6 +392,10 @@ int main(int argc, char *argv[]) | |||
392 | goto err; | 392 | goto err; |
393 | } | 393 | } |
394 | } | 394 | } |
395 | if (!maskset && !tid && !containerset) { | ||
396 | usage(); | ||
397 | goto err; | ||
398 | } | ||
395 | 399 | ||
396 | do { | 400 | do { |
397 | int i; | 401 | int i; |
diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt index e33ee74eee77..d9e5d6f41b92 100644 --- a/Documentation/cgroups/cgroups.txt +++ b/Documentation/cgroups/cgroups.txt | |||
@@ -1,7 +1,8 @@ | |||
1 | CGROUPS | 1 | CGROUPS |
2 | ------- | 2 | ------- |
3 | 3 | ||
4 | Written by Paul Menage <menage@google.com> based on Documentation/cpusets.txt | 4 | Written by Paul Menage <menage@google.com> based on |
5 | Documentation/cgroups/cpusets.txt | ||
5 | 6 | ||
6 | Original copyright statements from cpusets.txt: | 7 | Original copyright statements from cpusets.txt: |
7 | Portions Copyright (C) 2004 BULL SA. | 8 | Portions Copyright (C) 2004 BULL SA. |
@@ -68,7 +69,7 @@ On their own, the only use for cgroups is for simple job | |||
68 | tracking. The intention is that other subsystems hook into the generic | 69 | tracking. The intention is that other subsystems hook into the generic |
69 | cgroup support to provide new attributes for cgroups, such as | 70 | cgroup support to provide new attributes for cgroups, such as |
70 | accounting/limiting the resources which processes in a cgroup can | 71 | accounting/limiting the resources which processes in a cgroup can |
71 | access. For example, cpusets (see Documentation/cpusets.txt) allows | 72 | access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allows |
72 | you to associate a set of CPUs and a set of memory nodes with the | 73 | you to associate a set of CPUs and a set of memory nodes with the |
73 | tasks in each cgroup. | 74 | tasks in each cgroup. |
74 | 75 | ||
diff --git a/Documentation/controllers/cpuacct.txt b/Documentation/cgroups/cpuacct.txt index bb775fbe43d7..bb775fbe43d7 100644 --- a/Documentation/controllers/cpuacct.txt +++ b/Documentation/cgroups/cpuacct.txt | |||
diff --git a/Documentation/cpusets.txt b/Documentation/cgroups/cpusets.txt index 5c86c258c791..5c86c258c791 100644 --- a/Documentation/cpusets.txt +++ b/Documentation/cgroups/cpusets.txt | |||
diff --git a/Documentation/controllers/devices.txt b/Documentation/cgroups/devices.txt index 7cc6e6a60672..7cc6e6a60672 100644 --- a/Documentation/controllers/devices.txt +++ b/Documentation/cgroups/devices.txt | |||
diff --git a/Documentation/controllers/memcg_test.txt b/Documentation/cgroups/memcg_test.txt index 08d4d3ea0d79..19533f93b7a2 100644 --- a/Documentation/controllers/memcg_test.txt +++ b/Documentation/cgroups/memcg_test.txt | |||
@@ -6,7 +6,7 @@ Because VM is getting complex (one of reasons is memcg...), memcg's behavior | |||
6 | is complex. This is a document for memcg's internal behavior. | 6 | is complex. This is a document for memcg's internal behavior. |
7 | Please note that implementation details can be changed. | 7 | Please note that implementation details can be changed. |
8 | 8 | ||
9 | (*) Topics on API should be in Documentation/controllers/memory.txt) | 9 | (*) Topics on API should be in Documentation/cgroups/memory.txt) |
10 | 10 | ||
11 | 0. How to record usage ? | 11 | 0. How to record usage ? |
12 | 2 objects are used. | 12 | 2 objects are used. |
diff --git a/Documentation/controllers/memory.txt b/Documentation/cgroups/memory.txt index e1501964df1e..e1501964df1e 100644 --- a/Documentation/controllers/memory.txt +++ b/Documentation/cgroups/memory.txt | |||
diff --git a/Documentation/controllers/resource_counter.txt b/Documentation/cgroups/resource_counter.txt index f196ac1d7d25..f196ac1d7d25 100644 --- a/Documentation/controllers/resource_counter.txt +++ b/Documentation/cgroups/resource_counter.txt | |||
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index d105eb45282a..bbebc3a43ac0 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt | |||
@@ -1371,292 +1371,8 @@ auto_msgmni default value is 1. | |||
1371 | 2.4 /proc/sys/vm - The virtual memory subsystem | 1371 | 2.4 /proc/sys/vm - The virtual memory subsystem |
1372 | ----------------------------------------------- | 1372 | ----------------------------------------------- |
1373 | 1373 | ||
1374 | The files in this directory can be used to tune the operation of the virtual | 1374 | Please see: Documentation/sysctls/vm.txt for a description of these |
1375 | memory (VM) subsystem of the Linux kernel. | 1375 | entries. |
1376 | |||
1377 | vfs_cache_pressure | ||
1378 | ------------------ | ||
1379 | |||
1380 | Controls the tendency of the kernel to reclaim the memory which is used for | ||
1381 | caching of directory and inode objects. | ||
1382 | |||
1383 | At the default value of vfs_cache_pressure=100 the kernel will attempt to | ||
1384 | reclaim dentries and inodes at a "fair" rate with respect to pagecache and | ||
1385 | swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer | ||
1386 | to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100 | ||
1387 | causes the kernel to prefer to reclaim dentries and inodes. | ||
1388 | |||
1389 | dirty_background_bytes | ||
1390 | ---------------------- | ||
1391 | |||
1392 | Contains the amount of dirty memory at which the pdflush background writeback | ||
1393 | daemon will start writeback. | ||
1394 | |||
1395 | If dirty_background_bytes is written, dirty_background_ratio becomes a function | ||
1396 | of its value (dirty_background_bytes / the amount of dirtyable system memory). | ||
1397 | |||
1398 | dirty_background_ratio | ||
1399 | ---------------------- | ||
1400 | |||
1401 | Contains, as a percentage of the dirtyable system memory (free pages + mapped | ||
1402 | pages + file cache, not including locked pages and HugePages), the number of | ||
1403 | pages at which the pdflush background writeback daemon will start writing out | ||
1404 | dirty data. | ||
1405 | |||
1406 | If dirty_background_ratio is written, dirty_background_bytes becomes a function | ||
1407 | of its value (dirty_background_ratio * the amount of dirtyable system memory). | ||
1408 | |||
1409 | dirty_bytes | ||
1410 | ----------- | ||
1411 | |||
1412 | Contains the amount of dirty memory at which a process generating disk writes | ||
1413 | will itself start writeback. | ||
1414 | |||
1415 | If dirty_bytes is written, dirty_ratio becomes a function of its value | ||
1416 | (dirty_bytes / the amount of dirtyable system memory). | ||
1417 | |||
1418 | dirty_ratio | ||
1419 | ----------- | ||
1420 | |||
1421 | Contains, as a percentage of the dirtyable system memory (free pages + mapped | ||
1422 | pages + file cache, not including locked pages and HugePages), the number of | ||
1423 | pages at which a process which is generating disk writes will itself start | ||
1424 | writing out dirty data. | ||
1425 | |||
1426 | If dirty_ratio is written, dirty_bytes becomes a function of its value | ||
1427 | (dirty_ratio * the amount of dirtyable system memory). | ||
1428 | |||
1429 | dirty_writeback_centisecs | ||
1430 | ------------------------- | ||
1431 | |||
1432 | The pdflush writeback daemons will periodically wake up and write `old' data | ||
1433 | out to disk. This tunable expresses the interval between those wakeups, in | ||
1434 | 100'ths of a second. | ||
1435 | |||
1436 | Setting this to zero disables periodic writeback altogether. | ||
1437 | |||
1438 | dirty_expire_centisecs | ||
1439 | ---------------------- | ||
1440 | |||
1441 | This tunable is used to define when dirty data is old enough to be eligible | ||
1442 | for writeout by the pdflush daemons. It is expressed in 100'ths of a second. | ||
1443 | Data which has been dirty in-memory for longer than this interval will be | ||
1444 | written out next time a pdflush daemon wakes up. | ||
1445 | |||
1446 | highmem_is_dirtyable | ||
1447 | -------------------- | ||
1448 | |||
1449 | Only present if CONFIG_HIGHMEM is set. | ||
1450 | |||
1451 | This defaults to 0 (false), meaning that the ratios set above are calculated | ||
1452 | as a percentage of lowmem only. This protects against excessive scanning | ||
1453 | in page reclaim, swapping and general VM distress. | ||
1454 | |||
1455 | Setting this to 1 can be useful on 32 bit machines where you want to make | ||
1456 | random changes within an MMAPed file that is larger than your available | ||
1457 | lowmem without causing large quantities of random IO. Is is safe if the | ||
1458 | behavior of all programs running on the machine is known and memory will | ||
1459 | not be otherwise stressed. | ||
1460 | |||
1461 | legacy_va_layout | ||
1462 | ---------------- | ||
1463 | |||
1464 | If non-zero, this sysctl disables the new 32-bit mmap mmap layout - the kernel | ||
1465 | will use the legacy (2.4) layout for all processes. | ||
1466 | |||
1467 | lowmem_reserve_ratio | ||
1468 | --------------------- | ||
1469 | |||
1470 | For some specialised workloads on highmem machines it is dangerous for | ||
1471 | the kernel to allow process memory to be allocated from the "lowmem" | ||
1472 | zone. This is because that memory could then be pinned via the mlock() | ||
1473 | system call, or by unavailability of swapspace. | ||
1474 | |||
1475 | And on large highmem machines this lack of reclaimable lowmem memory | ||
1476 | can be fatal. | ||
1477 | |||
1478 | So the Linux page allocator has a mechanism which prevents allocations | ||
1479 | which _could_ use highmem from using too much lowmem. This means that | ||
1480 | a certain amount of lowmem is defended from the possibility of being | ||
1481 | captured into pinned user memory. | ||
1482 | |||
1483 | (The same argument applies to the old 16 megabyte ISA DMA region. This | ||
1484 | mechanism will also defend that region from allocations which could use | ||
1485 | highmem or lowmem). | ||
1486 | |||
1487 | The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is | ||
1488 | in defending these lower zones. | ||
1489 | |||
1490 | If you have a machine which uses highmem or ISA DMA and your | ||
1491 | applications are using mlock(), or if you are running with no swap then | ||
1492 | you probably should change the lowmem_reserve_ratio setting. | ||
1493 | |||
1494 | The lowmem_reserve_ratio is an array. You can see them by reading this file. | ||
1495 | - | ||
1496 | % cat /proc/sys/vm/lowmem_reserve_ratio | ||
1497 | 256 256 32 | ||
1498 | - | ||
1499 | Note: # of this elements is one fewer than number of zones. Because the highest | ||
1500 | zone's value is not necessary for following calculation. | ||
1501 | |||
1502 | But, these values are not used directly. The kernel calculates # of protection | ||
1503 | pages for each zones from them. These are shown as array of protection pages | ||
1504 | in /proc/zoneinfo like followings. (This is an example of x86-64 box). | ||
1505 | Each zone has an array of protection pages like this. | ||
1506 | |||
1507 | - | ||
1508 | Node 0, zone DMA | ||
1509 | pages free 1355 | ||
1510 | min 3 | ||
1511 | low 3 | ||
1512 | high 4 | ||
1513 | : | ||
1514 | : | ||
1515 | numa_other 0 | ||
1516 | protection: (0, 2004, 2004, 2004) | ||
1517 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
1518 | pagesets | ||
1519 | cpu: 0 pcp: 0 | ||
1520 | : | ||
1521 | - | ||
1522 | These protections are added to score to judge whether this zone should be used | ||
1523 | for page allocation or should be reclaimed. | ||
1524 | |||
1525 | In this example, if normal pages (index=2) are required to this DMA zone and | ||
1526 | pages_high is used for watermark, the kernel judges this zone should not be | ||
1527 | used because pages_free(1355) is smaller than watermark + protection[2] | ||
1528 | (4 + 2004 = 2008). If this protection value is 0, this zone would be used for | ||
1529 | normal page requirement. If requirement is DMA zone(index=0), protection[0] | ||
1530 | (=0) is used. | ||
1531 | |||
1532 | zone[i]'s protection[j] is calculated by following expression. | ||
1533 | |||
1534 | (i < j): | ||
1535 | zone[i]->protection[j] | ||
1536 | = (total sums of present_pages from zone[i+1] to zone[j] on the node) | ||
1537 | / lowmem_reserve_ratio[i]; | ||
1538 | (i = j): | ||
1539 | (should not be protected. = 0; | ||
1540 | (i > j): | ||
1541 | (not necessary, but looks 0) | ||
1542 | |||
1543 | The default values of lowmem_reserve_ratio[i] are | ||
1544 | 256 (if zone[i] means DMA or DMA32 zone) | ||
1545 | 32 (others). | ||
1546 | As above expression, they are reciprocal number of ratio. | ||
1547 | 256 means 1/256. # of protection pages becomes about "0.39%" of total present | ||
1548 | pages of higher zones on the node. | ||
1549 | |||
1550 | If you would like to protect more pages, smaller values are effective. | ||
1551 | The minimum value is 1 (1/1 -> 100%). | ||
1552 | |||
1553 | page-cluster | ||
1554 | ------------ | ||
1555 | |||
1556 | page-cluster controls the number of pages which are written to swap in | ||
1557 | a single attempt. The swap I/O size. | ||
1558 | |||
1559 | It is a logarithmic value - setting it to zero means "1 page", setting | ||
1560 | it to 1 means "2 pages", setting it to 2 means "4 pages", etc. | ||
1561 | |||
1562 | The default value is three (eight pages at a time). There may be some | ||
1563 | small benefits in tuning this to a different value if your workload is | ||
1564 | swap-intensive. | ||
1565 | |||
1566 | overcommit_memory | ||
1567 | ----------------- | ||
1568 | |||
1569 | Controls overcommit of system memory, possibly allowing processes | ||
1570 | to allocate (but not use) more memory than is actually available. | ||
1571 | |||
1572 | |||
1573 | 0 - Heuristic overcommit handling. Obvious overcommits of | ||
1574 | address space are refused. Used for a typical system. It | ||
1575 | ensures a seriously wild allocation fails while allowing | ||
1576 | overcommit to reduce swap usage. root is allowed to | ||
1577 | allocate slightly more memory in this mode. This is the | ||
1578 | default. | ||
1579 | |||
1580 | 1 - Always overcommit. Appropriate for some scientific | ||
1581 | applications. | ||
1582 | |||
1583 | 2 - Don't overcommit. The total address space commit | ||
1584 | for the system is not permitted to exceed swap plus a | ||
1585 | configurable percentage (default is 50) of physical RAM. | ||
1586 | Depending on the percentage you use, in most situations | ||
1587 | this means a process will not be killed while attempting | ||
1588 | to use already-allocated memory but will receive errors | ||
1589 | on memory allocation as appropriate. | ||
1590 | |||
1591 | overcommit_ratio | ||
1592 | ---------------- | ||
1593 | |||
1594 | Percentage of physical memory size to include in overcommit calculations | ||
1595 | (see above.) | ||
1596 | |||
1597 | Memory allocation limit = swapspace + physmem * (overcommit_ratio / 100) | ||
1598 | |||
1599 | swapspace = total size of all swap areas | ||
1600 | physmem = size of physical memory in system | ||
1601 | |||
1602 | nr_hugepages and hugetlb_shm_group | ||
1603 | ---------------------------------- | ||
1604 | |||
1605 | nr_hugepages configures number of hugetlb page reserved for the system. | ||
1606 | |||
1607 | hugetlb_shm_group contains group id that is allowed to create SysV shared | ||
1608 | memory segment using hugetlb page. | ||
1609 | |||
1610 | hugepages_treat_as_movable | ||
1611 | -------------------------- | ||
1612 | |||
1613 | This parameter is only useful when kernelcore= is specified at boot time to | ||
1614 | create ZONE_MOVABLE for pages that may be reclaimed or migrated. Huge pages | ||
1615 | are not movable so are not normally allocated from ZONE_MOVABLE. A non-zero | ||
1616 | value written to hugepages_treat_as_movable allows huge pages to be allocated | ||
1617 | from ZONE_MOVABLE. | ||
1618 | |||
1619 | Once enabled, the ZONE_MOVABLE is treated as an area of memory the huge | ||
1620 | pages pool can easily grow or shrink within. Assuming that applications are | ||
1621 | not running that mlock() a lot of memory, it is likely the huge pages pool | ||
1622 | can grow to the size of ZONE_MOVABLE by repeatedly entering the desired value | ||
1623 | into nr_hugepages and triggering page reclaim. | ||
1624 | |||
1625 | laptop_mode | ||
1626 | ----------- | ||
1627 | |||
1628 | laptop_mode is a knob that controls "laptop mode". All the things that are | ||
1629 | controlled by this knob are discussed in Documentation/laptops/laptop-mode.txt. | ||
1630 | |||
1631 | block_dump | ||
1632 | ---------- | ||
1633 | |||
1634 | block_dump enables block I/O debugging when set to a nonzero value. More | ||
1635 | information on block I/O debugging is in Documentation/laptops/laptop-mode.txt. | ||
1636 | |||
1637 | swap_token_timeout | ||
1638 | ------------------ | ||
1639 | |||
1640 | This file contains valid hold time of swap out protection token. The Linux | ||
1641 | VM has token based thrashing control mechanism and uses the token to prevent | ||
1642 | unnecessary page faults in thrashing situation. The unit of the value is | ||
1643 | second. The value would be useful to tune thrashing behavior. | ||
1644 | |||
1645 | drop_caches | ||
1646 | ----------- | ||
1647 | |||
1648 | Writing to this will cause the kernel to drop clean caches, dentries and | ||
1649 | inodes from memory, causing that memory to become free. | ||
1650 | |||
1651 | To free pagecache: | ||
1652 | echo 1 > /proc/sys/vm/drop_caches | ||
1653 | To free dentries and inodes: | ||
1654 | echo 2 > /proc/sys/vm/drop_caches | ||
1655 | To free pagecache, dentries and inodes: | ||
1656 | echo 3 > /proc/sys/vm/drop_caches | ||
1657 | |||
1658 | As this is a non-destructive operation and dirty objects are not freeable, the | ||
1659 | user should run `sync' first. | ||
1660 | 1376 | ||
1661 | 1377 | ||
1662 | 2.5 /proc/sys/dev - Device specific parameters | 1378 | 2.5 /proc/sys/dev - Device specific parameters |
diff --git a/Documentation/hwmon/adt7475 b/Documentation/hwmon/adt7475 new file mode 100644 index 000000000000..a2b1abec850e --- /dev/null +++ b/Documentation/hwmon/adt7475 | |||
@@ -0,0 +1,87 @@ | |||
1 | This describes the interface for the ADT7475 driver: | ||
2 | |||
3 | (there are 4 fans, numbered fan1 to fan4): | ||
4 | |||
5 | fanX_input Read the current speed of the fan (in RPMs) | ||
6 | fanX_min Read/write the minimum speed of the fan. Dropping | ||
7 | below this sets an alarm. | ||
8 | |||
9 | (there are three PWMs, numbered pwm1 to pwm3): | ||
10 | |||
11 | pwmX Read/write the current duty cycle of the PWM. Writes | ||
12 | only have effect when auto mode is turned off (see | ||
13 | below). Range is 0 - 255. | ||
14 | |||
15 | pwmX_enable Fan speed control method: | ||
16 | |||
17 | 0 - No control (fan at full speed) | ||
18 | 1 - Manual fan speed control (using pwm[1-*]) | ||
19 | 2 - Automatic fan speed control | ||
20 | |||
21 | pwmX_auto_channels_temp Select which channels affect this PWM | ||
22 | |||
23 | 1 - TEMP1 controls PWM | ||
24 | 2 - TEMP2 controls PWM | ||
25 | 4 - TEMP3 controls PWM | ||
26 | 6 - TEMP2 and TEMP3 control PWM | ||
27 | 7 - All three inputs control PWM | ||
28 | |||
29 | pwmX_freq Read/write the PWM frequency in Hz. The number | ||
30 | should be one of the following: | ||
31 | |||
32 | 11 Hz | ||
33 | 14 Hz | ||
34 | 22 Hz | ||
35 | 29 Hz | ||
36 | 35 Hz | ||
37 | 44 Hz | ||
38 | 58 Hz | ||
39 | 88 Hz | ||
40 | |||
41 | pwmX_auto_point1_pwm Read/write the minimum PWM duty cycle in automatic mode | ||
42 | |||
43 | pwmX_auto_point2_pwm Read/write the maximum PWM duty cycle in automatic mode | ||
44 | |||
45 | (there are three temperature settings numbered temp1 to temp3): | ||
46 | |||
47 | tempX_input Read the current temperature. The value is in milli | ||
48 | degrees of Celsius. | ||
49 | |||
50 | tempX_max Read/write the upper temperature limit - exceeding this | ||
51 | will cause an alarm. | ||
52 | |||
53 | tempX_min Read/write the lower temperature limit - exceeding this | ||
54 | will cause an alarm. | ||
55 | |||
56 | tempX_offset Read/write the temperature adjustment offset | ||
57 | |||
58 | tempX_crit Read/write the THERM limit for remote1. | ||
59 | |||
60 | tempX_crit_hyst Set the temperature value below crit where the | ||
61 | fans will stay on - this helps drive the temperature | ||
62 | low enough so it doesn't stay near the edge and | ||
63 | cause THERM to keep tripping. | ||
64 | |||
65 | tempX_auto_point1_temp Read/write the minimum temperature where the fans will | ||
66 | turn on in automatic mode. | ||
67 | |||
68 | tempX_auto_point2_temp Read/write the maximum temperature over which the fans | ||
69 | will run in automatic mode. tempX_auto_point1_temp | ||
70 | and tempX_auto_point2_temp together define the | ||
71 | range of automatic control. | ||
72 | |||
73 | tempX_alarm Read a 1 if the max/min alarm is set | ||
74 | tempX_fault Read a 1 if either temp1 or temp3 diode has a fault | ||
75 | |||
76 | (There are two voltage settings, in1 and in2): | ||
77 | |||
78 | inX_input Read the current voltage on VCC. Value is in | ||
79 | millivolts. | ||
80 | |||
81 | inX_min read/write the minimum voltage limit. | ||
82 | Dropping below this causes an alarm. | ||
83 | |||
84 | inX_max read/write the maximum voltage limit. | ||
85 | Exceeding this causes an alarm. | ||
86 | |||
87 | inX_alarm Read a 1 if the max/min alarm is set. | ||
diff --git a/Documentation/hwmon/lis3lv02d b/Documentation/hwmon/lis3lv02d index 65dfb0c0fd67..0fcfc4a7ccdc 100644 --- a/Documentation/hwmon/lis3lv02d +++ b/Documentation/hwmon/lis3lv02d | |||
@@ -13,18 +13,21 @@ Author: | |||
13 | Description | 13 | Description |
14 | ----------- | 14 | ----------- |
15 | 15 | ||
16 | This driver provides support for the accelerometer found in various HP laptops | 16 | This driver provides support for the accelerometer found in various HP |
17 | sporting the feature officially called "HP Mobile Data Protection System 3D" or | 17 | laptops sporting the feature officially called "HP Mobile Data |
18 | "HP 3D DriveGuard". It detect automatically laptops with this sensor. Known models | 18 | Protection System 3D" or "HP 3D DriveGuard". It detect automatically |
19 | (for now the HP 2133, nc6420, nc2510, nc8510, nc84x0, nw9440 and nx9420) will | 19 | laptops with this sensor. Known models (for now the HP 2133, nc6420, |
20 | have their axis automatically oriented on standard way (eg: you can directly | 20 | nc2510, nc8510, nc84x0, nw9440 and nx9420) will have their axis |
21 | play neverball). The accelerometer data is readable via | 21 | automatically oriented on standard way (eg: you can directly play |
22 | neverball). The accelerometer data is readable via | ||
22 | /sys/devices/platform/lis3lv02d. | 23 | /sys/devices/platform/lis3lv02d. |
23 | 24 | ||
24 | Sysfs attributes under /sys/devices/platform/lis3lv02d/: | 25 | Sysfs attributes under /sys/devices/platform/lis3lv02d/: |
25 | position - 3D position that the accelerometer reports. Format: "(x,y,z)" | 26 | position - 3D position that the accelerometer reports. Format: "(x,y,z)" |
26 | calibrate - read: values (x, y, z) that are used as the base for input class device operation. | 27 | calibrate - read: values (x, y, z) that are used as the base for input |
27 | write: forces the base to be recalibrated with the current position. | 28 | class device operation. |
29 | write: forces the base to be recalibrated with the current | ||
30 | position. | ||
28 | rate - reports the sampling rate of the accelerometer device in HZ | 31 | rate - reports the sampling rate of the accelerometer device in HZ |
29 | 32 | ||
30 | This driver also provides an absolute input class device, allowing | 33 | This driver also provides an absolute input class device, allowing |
@@ -39,11 +42,12 @@ the accelerometer are converted into a "standard" organisation of the axes | |||
39 | * When the laptop is horizontal the position reported is about 0 for X and Y | 42 | * When the laptop is horizontal the position reported is about 0 for X and Y |
40 | and a positive value for Z | 43 | and a positive value for Z |
41 | * If the left side is elevated, X increases (becomes positive) | 44 | * If the left side is elevated, X increases (becomes positive) |
42 | * If the front side (where the touchpad is) is elevated, Y decreases (becomes negative) | 45 | * If the front side (where the touchpad is) is elevated, Y decreases |
46 | (becomes negative) | ||
43 | * If the laptop is put upside-down, Z becomes negative | 47 | * If the laptop is put upside-down, Z becomes negative |
44 | 48 | ||
45 | If your laptop model is not recognized (cf "dmesg"), you can send an email to the | 49 | If your laptop model is not recognized (cf "dmesg"), you can send an |
46 | authors to add it to the database. When reporting a new laptop, please include | 50 | email to the authors to add it to the database. When reporting a new |
47 | the output of "dmidecode" plus the value of /sys/devices/platform/lis3lv02d/position | 51 | laptop, please include the output of "dmidecode" plus the value of |
48 | in these four cases. | 52 | /sys/devices/platform/lis3lv02d/position in these four cases. |
49 | 53 | ||
diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index 898b4987bb80..41bc99fa1884 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt | |||
@@ -1,7 +1,7 @@ | |||
1 | ThinkPad ACPI Extras Driver | 1 | ThinkPad ACPI Extras Driver |
2 | 2 | ||
3 | Version 0.21 | 3 | Version 0.22 |
4 | May 29th, 2008 | 4 | November 23rd, 2008 |
5 | 5 | ||
6 | Borislav Deianov <borislav@users.sf.net> | 6 | Borislav Deianov <borislav@users.sf.net> |
7 | Henrique de Moraes Holschuh <hmh@hmh.eng.br> | 7 | Henrique de Moraes Holschuh <hmh@hmh.eng.br> |
@@ -16,7 +16,8 @@ supported by the generic Linux ACPI drivers. | |||
16 | This driver used to be named ibm-acpi until kernel 2.6.21 and release | 16 | This driver used to be named ibm-acpi until kernel 2.6.21 and release |
17 | 0.13-20070314. It used to be in the drivers/acpi tree, but it was | 17 | 0.13-20070314. It used to be in the drivers/acpi tree, but it was |
18 | moved to the drivers/misc tree and renamed to thinkpad-acpi for kernel | 18 | moved to the drivers/misc tree and renamed to thinkpad-acpi for kernel |
19 | 2.6.22, and release 0.14. | 19 | 2.6.22, and release 0.14. It was moved to drivers/platform/x86 for |
20 | kernel 2.6.29 and release 0.22. | ||
20 | 21 | ||
21 | The driver is named "thinkpad-acpi". In some places, like module | 22 | The driver is named "thinkpad-acpi". In some places, like module |
22 | names, "thinkpad_acpi" is used because of userspace issues. | 23 | names, "thinkpad_acpi" is used because of userspace issues. |
@@ -1412,6 +1413,24 @@ Sysfs notes: | |||
1412 | rfkill controller switch "tpacpi_wwan_sw": refer to | 1413 | rfkill controller switch "tpacpi_wwan_sw": refer to |
1413 | Documentation/rfkill.txt for details. | 1414 | Documentation/rfkill.txt for details. |
1414 | 1415 | ||
1416 | EXPERIMENTAL: UWB | ||
1417 | ----------------- | ||
1418 | |||
1419 | This feature is marked EXPERIMENTAL because it has not been extensively | ||
1420 | tested and validated in various ThinkPad models yet. The feature may not | ||
1421 | work as expected. USE WITH CAUTION! To use this feature, you need to supply | ||
1422 | the experimental=1 parameter when loading the module. | ||
1423 | |||
1424 | sysfs rfkill class: switch "tpacpi_uwb_sw" | ||
1425 | |||
1426 | This feature exports an rfkill controller for the UWB device, if one is | ||
1427 | present and enabled in the BIOS. | ||
1428 | |||
1429 | Sysfs notes: | ||
1430 | |||
1431 | rfkill controller switch "tpacpi_uwb_sw": refer to | ||
1432 | Documentation/rfkill.txt for details. | ||
1433 | |||
1415 | Multiple Commands, Module Parameters | 1434 | Multiple Commands, Module Parameters |
1416 | ------------------------------------ | 1435 | ------------------------------------ |
1417 | 1436 | ||
diff --git a/Documentation/mips/AU1xxx_IDE.README b/Documentation/mips/AU1xxx_IDE.README index f54962aea84d..8ace35ebdcd5 100644 --- a/Documentation/mips/AU1xxx_IDE.README +++ b/Documentation/mips/AU1xxx_IDE.README | |||
@@ -52,14 +52,12 @@ Two files are introduced: | |||
52 | b) 'drivers/ide/mips/au1xxx-ide.c' | 52 | b) 'drivers/ide/mips/au1xxx-ide.c' |
53 | contains the functionality of the AU1XXX IDE driver | 53 | contains the functionality of the AU1XXX IDE driver |
54 | 54 | ||
55 | Four configs variables are introduced: | 55 | Following extra configs variables are introduced: |
56 | 56 | ||
57 | CONFIG_BLK_DEV_IDE_AU1XXX_PIO_DBDMA - enable the PIO+DBDMA mode | 57 | CONFIG_BLK_DEV_IDE_AU1XXX_PIO_DBDMA - enable the PIO+DBDMA mode |
58 | CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA - enable the MWDMA mode | 58 | CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA - enable the MWDMA mode |
59 | CONFIG_BLK_DEV_IDE_AU1XXX_BURSTABLE_ON - set Burstable FIFO in DBDMA | 59 | CONFIG_BLK_DEV_IDE_AU1XXX_BURSTABLE_ON - set Burstable FIFO in DBDMA |
60 | controller | 60 | controller |
61 | CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ - maximum transfer size | ||
62 | per descriptor | ||
63 | 61 | ||
64 | 62 | ||
65 | SUPPORTED IDE MODES | 63 | SUPPORTED IDE MODES |
@@ -87,7 +85,6 @@ CONFIG_BLK_DEV_IDEDMA_PCI=y | |||
87 | CONFIG_IDEDMA_PCI_AUTO=y | 85 | CONFIG_IDEDMA_PCI_AUTO=y |
88 | CONFIG_BLK_DEV_IDE_AU1XXX=y | 86 | CONFIG_BLK_DEV_IDE_AU1XXX=y |
89 | CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA=y | 87 | CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA=y |
90 | CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ=128 | ||
91 | CONFIG_BLK_DEV_IDEDMA=y | 88 | CONFIG_BLK_DEV_IDEDMA=y |
92 | CONFIG_IDEDMA_AUTO=y | 89 | CONFIG_IDEDMA_AUTO=y |
93 | 90 | ||
@@ -105,7 +102,6 @@ CONFIG_BLK_DEV_IDEDMA_PCI=y | |||
105 | CONFIG_IDEDMA_PCI_AUTO=y | 102 | CONFIG_IDEDMA_PCI_AUTO=y |
106 | CONFIG_BLK_DEV_IDE_AU1XXX=y | 103 | CONFIG_BLK_DEV_IDE_AU1XXX=y |
107 | CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA=y | 104 | CONFIG_BLK_DEV_IDE_AU1XXX_MDMA2_DBDMA=y |
108 | CONFIG_BLK_DEV_IDE_AU1XXX_SEQTS_PER_RQ=128 | ||
109 | CONFIG_BLK_DEV_IDEDMA=y | 105 | CONFIG_BLK_DEV_IDEDMA=y |
110 | CONFIG_IDEDMA_AUTO=y | 106 | CONFIG_IDEDMA_AUTO=y |
111 | 107 | ||
diff --git a/Documentation/scheduler/sched-design-CFS.txt b/Documentation/scheduler/sched-design-CFS.txt index 8398ca4ff4ed..6f33593e59e2 100644 --- a/Documentation/scheduler/sched-design-CFS.txt +++ b/Documentation/scheduler/sched-design-CFS.txt | |||
@@ -231,7 +231,7 @@ CPU bandwidth control purposes: | |||
231 | 231 | ||
232 | This options needs CONFIG_CGROUPS to be defined, and lets the administrator | 232 | This options needs CONFIG_CGROUPS to be defined, and lets the administrator |
233 | create arbitrary groups of tasks, using the "cgroup" pseudo filesystem. See | 233 | create arbitrary groups of tasks, using the "cgroup" pseudo filesystem. See |
234 | Documentation/cgroups.txt for more information about this filesystem. | 234 | Documentation/cgroups/cgroups.txt for more information about this filesystem. |
235 | 235 | ||
236 | Only one of these options to group tasks can be chosen and not both. | 236 | Only one of these options to group tasks can be chosen and not both. |
237 | 237 | ||
diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 4b7ac21ea9eb..64eb1100eec1 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt | |||
@@ -275,7 +275,8 @@ STAC9200 | |||
275 | dell-m25 Dell Inspiron E1505n | 275 | dell-m25 Dell Inspiron E1505n |
276 | dell-m26 Dell Inspiron 1501 | 276 | dell-m26 Dell Inspiron 1501 |
277 | dell-m27 Dell Inspiron E1705/9400 | 277 | dell-m27 Dell Inspiron E1705/9400 |
278 | gateway Gateway laptops with EAPD control | 278 | gateway-m4 Gateway laptops with EAPD control |
279 | gateway-m4-2 Gateway laptops with EAPD control | ||
279 | panasonic Panasonic CF-74 | 280 | panasonic Panasonic CF-74 |
280 | 281 | ||
281 | STAC9205/9254 | 282 | STAC9205/9254 |
@@ -302,6 +303,7 @@ STAC9220/9221 | |||
302 | macbook-pro Intel Mac Book Pro 2nd generation (eq. type 3) | 303 | macbook-pro Intel Mac Book Pro 2nd generation (eq. type 3) |
303 | imac-intel Intel iMac (eq. type 2) | 304 | imac-intel Intel iMac (eq. type 2) |
304 | imac-intel-20 Intel iMac (newer version) (eq. type 3) | 305 | imac-intel-20 Intel iMac (newer version) (eq. type 3) |
306 | ecs202 ECS/PC chips | ||
305 | dell-d81 Dell (unknown) | 307 | dell-d81 Dell (unknown) |
306 | dell-d82 Dell (unknown) | 308 | dell-d82 Dell (unknown) |
307 | dell-m81 Dell (unknown) | 309 | dell-m81 Dell (unknown) |
@@ -310,9 +312,13 @@ STAC9220/9221 | |||
310 | STAC9202/9250/9251 | 312 | STAC9202/9250/9251 |
311 | ================== | 313 | ================== |
312 | ref Reference board, base config | 314 | ref Reference board, base config |
315 | m1 Some Gateway MX series laptops (NX560XL) | ||
316 | m1-2 Some Gateway MX series laptops (MX6453) | ||
317 | m2 Some Gateway MX series laptops (M255) | ||
313 | m2-2 Some Gateway MX series laptops | 318 | m2-2 Some Gateway MX series laptops |
319 | m3 Some Gateway MX series laptops | ||
320 | m5 Some Gateway MX series laptops (MP6954) | ||
314 | m6 Some Gateway NX series laptops | 321 | m6 Some Gateway NX series laptops |
315 | pa6 Gateway NX860 series | ||
316 | 322 | ||
317 | STAC9227/9228/9229/927x | 323 | STAC9227/9228/9229/927x |
318 | ======================= | 324 | ======================= |
@@ -329,6 +335,7 @@ STAC92HD71B* | |||
329 | dell-m4-1 Dell desktops | 335 | dell-m4-1 Dell desktops |
330 | dell-m4-2 Dell desktops | 336 | dell-m4-2 Dell desktops |
331 | dell-m4-3 Dell desktops | 337 | dell-m4-3 Dell desktops |
338 | hp-m4 HP dv laptops | ||
332 | 339 | ||
333 | STAC92HD73* | 340 | STAC92HD73* |
334 | =========== | 341 | =========== |
@@ -337,6 +344,7 @@ STAC92HD73* | |||
337 | dell-m6-amic Dell desktops/laptops with analog mics | 344 | dell-m6-amic Dell desktops/laptops with analog mics |
338 | dell-m6-dmic Dell desktops/laptops with digital mics | 345 | dell-m6-dmic Dell desktops/laptops with digital mics |
339 | dell-m6 Dell desktops/laptops with both type of mics | 346 | dell-m6 Dell desktops/laptops with both type of mics |
347 | dell-eq Dell desktops/laptops | ||
340 | 348 | ||
341 | STAC92HD83* | 349 | STAC92HD83* |
342 | =========== | 350 | =========== |
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index a3415070bcac..3197fc83bc51 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt | |||
@@ -1,12 +1,13 @@ | |||
1 | Documentation for /proc/sys/vm/* kernel version 2.2.10 | 1 | Documentation for /proc/sys/vm/* kernel version 2.6.29 |
2 | (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> | 2 | (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> |
3 | (c) 2008 Peter W. Morreale <pmorreale@novell.com> | ||
3 | 4 | ||
4 | For general info and legal blurb, please look in README. | 5 | For general info and legal blurb, please look in README. |
5 | 6 | ||
6 | ============================================================== | 7 | ============================================================== |
7 | 8 | ||
8 | This file contains the documentation for the sysctl files in | 9 | This file contains the documentation for the sysctl files in |
9 | /proc/sys/vm and is valid for Linux kernel version 2.2. | 10 | /proc/sys/vm and is valid for Linux kernel version 2.6.29. |
10 | 11 | ||
11 | The files in this directory can be used to tune the operation | 12 | The files in this directory can be used to tune the operation |
12 | of the virtual memory (VM) subsystem of the Linux kernel and | 13 | of the virtual memory (VM) subsystem of the Linux kernel and |
@@ -16,180 +17,274 @@ Default values and initialization routines for most of these | |||
16 | files can be found in mm/swap.c. | 17 | files can be found in mm/swap.c. |
17 | 18 | ||
18 | Currently, these files are in /proc/sys/vm: | 19 | Currently, these files are in /proc/sys/vm: |
19 | - overcommit_memory | 20 | |
20 | - page-cluster | 21 | - block_dump |
21 | - dirty_ratio | 22 | - dirty_background_bytes |
22 | - dirty_background_ratio | 23 | - dirty_background_ratio |
24 | - dirty_bytes | ||
23 | - dirty_expire_centisecs | 25 | - dirty_expire_centisecs |
26 | - dirty_ratio | ||
24 | - dirty_writeback_centisecs | 27 | - dirty_writeback_centisecs |
25 | - highmem_is_dirtyable (only if CONFIG_HIGHMEM set) | 28 | - drop_caches |
29 | - hugepages_treat_as_movable | ||
30 | - hugetlb_shm_group | ||
31 | - laptop_mode | ||
32 | - legacy_va_layout | ||
33 | - lowmem_reserve_ratio | ||
26 | - max_map_count | 34 | - max_map_count |
27 | - min_free_kbytes | 35 | - min_free_kbytes |
28 | - laptop_mode | ||
29 | - block_dump | ||
30 | - drop-caches | ||
31 | - zone_reclaim_mode | ||
32 | - min_unmapped_ratio | ||
33 | - min_slab_ratio | 36 | - min_slab_ratio |
34 | - panic_on_oom | 37 | - min_unmapped_ratio |
35 | - oom_dump_tasks | 38 | - mmap_min_addr |
36 | - oom_kill_allocating_task | ||
37 | - mmap_min_address | ||
38 | - numa_zonelist_order | ||
39 | - nr_hugepages | 39 | - nr_hugepages |
40 | - nr_overcommit_hugepages | 40 | - nr_overcommit_hugepages |
41 | - nr_trim_pages (only if CONFIG_MMU=n) | 41 | - nr_pdflush_threads |
42 | - nr_trim_pages (only if CONFIG_MMU=n) | ||
43 | - numa_zonelist_order | ||
44 | - oom_dump_tasks | ||
45 | - oom_kill_allocating_task | ||
46 | - overcommit_memory | ||
47 | - overcommit_ratio | ||
48 | - page-cluster | ||
49 | - panic_on_oom | ||
50 | - percpu_pagelist_fraction | ||
51 | - stat_interval | ||
52 | - swappiness | ||
53 | - vfs_cache_pressure | ||
54 | - zone_reclaim_mode | ||
55 | |||
42 | 56 | ||
43 | ============================================================== | 57 | ============================================================== |
44 | 58 | ||
45 | dirty_bytes, dirty_ratio, dirty_background_bytes, | 59 | block_dump |
46 | dirty_background_ratio, dirty_expire_centisecs, | ||
47 | dirty_writeback_centisecs, highmem_is_dirtyable, | ||
48 | vfs_cache_pressure, laptop_mode, block_dump, swap_token_timeout, | ||
49 | drop-caches, hugepages_treat_as_movable: | ||
50 | 60 | ||
51 | See Documentation/filesystems/proc.txt | 61 | block_dump enables block I/O debugging when set to a nonzero value. More |
62 | information on block I/O debugging is in Documentation/laptops/laptop-mode.txt. | ||
52 | 63 | ||
53 | ============================================================== | 64 | ============================================================== |
54 | 65 | ||
55 | overcommit_memory: | 66 | dirty_background_bytes |
56 | 67 | ||
57 | This value contains a flag that enables memory overcommitment. | 68 | Contains the amount of dirty memory at which the pdflush background writeback |
69 | daemon will start writeback. | ||
58 | 70 | ||
59 | When this flag is 0, the kernel attempts to estimate the amount | 71 | If dirty_background_bytes is written, dirty_background_ratio becomes a function |
60 | of free memory left when userspace requests more memory. | 72 | of its value (dirty_background_bytes / the amount of dirtyable system memory). |
61 | 73 | ||
62 | When this flag is 1, the kernel pretends there is always enough | 74 | ============================================================== |
63 | memory until it actually runs out. | ||
64 | 75 | ||
65 | When this flag is 2, the kernel uses a "never overcommit" | 76 | dirty_background_ratio |
66 | policy that attempts to prevent any overcommit of memory. | ||
67 | 77 | ||
68 | This feature can be very useful because there are a lot of | 78 | Contains, as a percentage of total system memory, the number of pages at which |
69 | programs that malloc() huge amounts of memory "just-in-case" | 79 | the pdflush background writeback daemon will start writing out dirty data. |
70 | and don't use much of it. | ||
71 | 80 | ||
72 | The default value is 0. | 81 | ============================================================== |
73 | 82 | ||
74 | See Documentation/vm/overcommit-accounting and | 83 | dirty_bytes |
75 | security/commoncap.c::cap_vm_enough_memory() for more information. | 84 | |
85 | Contains the amount of dirty memory at which a process generating disk writes | ||
86 | will itself start writeback. | ||
87 | |||
88 | If dirty_bytes is written, dirty_ratio becomes a function of its value | ||
89 | (dirty_bytes / the amount of dirtyable system memory). | ||
76 | 90 | ||
77 | ============================================================== | 91 | ============================================================== |
78 | 92 | ||
79 | overcommit_ratio: | 93 | dirty_expire_centisecs |
80 | 94 | ||
81 | When overcommit_memory is set to 2, the committed address | 95 | This tunable is used to define when dirty data is old enough to be eligible |
82 | space is not permitted to exceed swap plus this percentage | 96 | for writeout by the pdflush daemons. It is expressed in 100'ths of a second. |
83 | of physical RAM. See above. | 97 | Data which has been dirty in-memory for longer than this interval will be |
98 | written out next time a pdflush daemon wakes up. | ||
99 | |||
100 | ============================================================== | ||
101 | |||
102 | dirty_ratio | ||
103 | |||
104 | Contains, as a percentage of total system memory, the number of pages at which | ||
105 | a process which is generating disk writes will itself start writing out dirty | ||
106 | data. | ||
84 | 107 | ||
85 | ============================================================== | 108 | ============================================================== |
86 | 109 | ||
87 | page-cluster: | 110 | dirty_writeback_centisecs |
88 | 111 | ||
89 | The Linux VM subsystem avoids excessive disk seeks by reading | 112 | The pdflush writeback daemons will periodically wake up and write `old' data |
90 | multiple pages on a page fault. The number of pages it reads | 113 | out to disk. This tunable expresses the interval between those wakeups, in |
91 | is dependent on the amount of memory in your machine. | 114 | 100'ths of a second. |
92 | 115 | ||
93 | The number of pages the kernel reads in at once is equal to | 116 | Setting this to zero disables periodic writeback altogether. |
94 | 2 ^ page-cluster. Values above 2 ^ 5 don't make much sense | ||
95 | for swap because we only cluster swap data in 32-page groups. | ||
96 | 117 | ||
97 | ============================================================== | 118 | ============================================================== |
98 | 119 | ||
99 | max_map_count: | 120 | drop_caches |
100 | 121 | ||
101 | This file contains the maximum number of memory map areas a process | 122 | Writing to this will cause the kernel to drop clean caches, dentries and |
102 | may have. Memory map areas are used as a side-effect of calling | 123 | inodes from memory, causing that memory to become free. |
103 | malloc, directly by mmap and mprotect, and also when loading shared | ||
104 | libraries. | ||
105 | 124 | ||
106 | While most applications need less than a thousand maps, certain | 125 | To free pagecache: |
107 | programs, particularly malloc debuggers, may consume lots of them, | 126 | echo 1 > /proc/sys/vm/drop_caches |
108 | e.g., up to one or two maps per allocation. | 127 | To free dentries and inodes: |
128 | echo 2 > /proc/sys/vm/drop_caches | ||
129 | To free pagecache, dentries and inodes: | ||
130 | echo 3 > /proc/sys/vm/drop_caches | ||
109 | 131 | ||
110 | The default value is 65536. | 132 | As this is a non-destructive operation and dirty objects are not freeable, the |
133 | user should run `sync' first. | ||
111 | 134 | ||
112 | ============================================================== | 135 | ============================================================== |
113 | 136 | ||
114 | min_free_kbytes: | 137 | hugepages_treat_as_movable |
115 | 138 | ||
116 | This is used to force the Linux VM to keep a minimum number | 139 | This parameter is only useful when kernelcore= is specified at boot time to |
117 | of kilobytes free. The VM uses this number to compute a pages_min | 140 | create ZONE_MOVABLE for pages that may be reclaimed or migrated. Huge pages |
118 | value for each lowmem zone in the system. Each lowmem zone gets | 141 | are not movable so are not normally allocated from ZONE_MOVABLE. A non-zero |
119 | a number of reserved free pages based proportionally on its size. | 142 | value written to hugepages_treat_as_movable allows huge pages to be allocated |
143 | from ZONE_MOVABLE. | ||
120 | 144 | ||
121 | Some minimal amount of memory is needed to satisfy PF_MEMALLOC | 145 | Once enabled, the ZONE_MOVABLE is treated as an area of memory the huge |
122 | allocations; if you set this to lower than 1024KB, your system will | 146 | pages pool can easily grow or shrink within. Assuming that applications are |
123 | become subtly broken, and prone to deadlock under high loads. | 147 | not running that mlock() a lot of memory, it is likely the huge pages pool |
124 | 148 | can grow to the size of ZONE_MOVABLE by repeatedly entering the desired value | |
125 | Setting this too high will OOM your machine instantly. | 149 | into nr_hugepages and triggering page reclaim. |
126 | 150 | ||
127 | ============================================================== | 151 | ============================================================== |
128 | 152 | ||
129 | percpu_pagelist_fraction | 153 | hugetlb_shm_group |
130 | 154 | ||
131 | This is the fraction of pages at most (high mark pcp->high) in each zone that | 155 | hugetlb_shm_group contains group id that is allowed to create SysV |
132 | are allocated for each per cpu page list. The min value for this is 8. It | 156 | shared memory segment using hugetlb page. |
133 | means that we don't allow more than 1/8th of pages in each zone to be | ||
134 | allocated in any single per_cpu_pagelist. This entry only changes the value | ||
135 | of hot per cpu pagelists. User can specify a number like 100 to allocate | ||
136 | 1/100th of each zone to each per cpu page list. | ||
137 | 157 | ||
138 | The batch value of each per cpu pagelist is also updated as a result. It is | 158 | ============================================================== |
139 | set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8) | ||
140 | 159 | ||
141 | The initial value is zero. Kernel does not use this value at boot time to set | 160 | laptop_mode |
142 | the high water marks for each per cpu page list. | ||
143 | 161 | ||
144 | =============================================================== | 162 | laptop_mode is a knob that controls "laptop mode". All the things that are |
163 | controlled by this knob are discussed in Documentation/laptops/laptop-mode.txt. | ||
145 | 164 | ||
146 | zone_reclaim_mode: | 165 | ============================================================== |
147 | 166 | ||
148 | Zone_reclaim_mode allows someone to set more or less aggressive approaches to | 167 | legacy_va_layout |
149 | reclaim memory when a zone runs out of memory. If it is set to zero then no | ||
150 | zone reclaim occurs. Allocations will be satisfied from other zones / nodes | ||
151 | in the system. | ||
152 | 168 | ||
153 | This is value ORed together of | 169 | If non-zero, this sysctl disables the new 32-bit mmap mmap layout - the kernel |
170 | will use the legacy (2.4) layout for all processes. | ||
154 | 171 | ||
155 | 1 = Zone reclaim on | 172 | ============================================================== |
156 | 2 = Zone reclaim writes dirty pages out | ||
157 | 4 = Zone reclaim swaps pages | ||
158 | 173 | ||
159 | zone_reclaim_mode is set during bootup to 1 if it is determined that pages | 174 | lowmem_reserve_ratio |
160 | from remote zones will cause a measurable performance reduction. The | 175 | |
161 | page allocator will then reclaim easily reusable pages (those page | 176 | For some specialised workloads on highmem machines it is dangerous for |
162 | cache pages that are currently not used) before allocating off node pages. | 177 | the kernel to allow process memory to be allocated from the "lowmem" |
178 | zone. This is because that memory could then be pinned via the mlock() | ||
179 | system call, or by unavailability of swapspace. | ||
180 | |||
181 | And on large highmem machines this lack of reclaimable lowmem memory | ||
182 | can be fatal. | ||
183 | |||
184 | So the Linux page allocator has a mechanism which prevents allocations | ||
185 | which _could_ use highmem from using too much lowmem. This means that | ||
186 | a certain amount of lowmem is defended from the possibility of being | ||
187 | captured into pinned user memory. | ||
188 | |||
189 | (The same argument applies to the old 16 megabyte ISA DMA region. This | ||
190 | mechanism will also defend that region from allocations which could use | ||
191 | highmem or lowmem). | ||
192 | |||
193 | The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is | ||
194 | in defending these lower zones. | ||
195 | |||
196 | If you have a machine which uses highmem or ISA DMA and your | ||
197 | applications are using mlock(), or if you are running with no swap then | ||
198 | you probably should change the lowmem_reserve_ratio setting. | ||
199 | |||
200 | The lowmem_reserve_ratio is an array. You can see them by reading this file. | ||
201 | - | ||
202 | % cat /proc/sys/vm/lowmem_reserve_ratio | ||
203 | 256 256 32 | ||
204 | - | ||
205 | Note: # of this elements is one fewer than number of zones. Because the highest | ||
206 | zone's value is not necessary for following calculation. | ||
207 | |||
208 | But, these values are not used directly. The kernel calculates # of protection | ||
209 | pages for each zones from them. These are shown as array of protection pages | ||
210 | in /proc/zoneinfo like followings. (This is an example of x86-64 box). | ||
211 | Each zone has an array of protection pages like this. | ||
212 | |||
213 | - | ||
214 | Node 0, zone DMA | ||
215 | pages free 1355 | ||
216 | min 3 | ||
217 | low 3 | ||
218 | high 4 | ||
219 | : | ||
220 | : | ||
221 | numa_other 0 | ||
222 | protection: (0, 2004, 2004, 2004) | ||
223 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
224 | pagesets | ||
225 | cpu: 0 pcp: 0 | ||
226 | : | ||
227 | - | ||
228 | These protections are added to score to judge whether this zone should be used | ||
229 | for page allocation or should be reclaimed. | ||
230 | |||
231 | In this example, if normal pages (index=2) are required to this DMA zone and | ||
232 | pages_high is used for watermark, the kernel judges this zone should not be | ||
233 | used because pages_free(1355) is smaller than watermark + protection[2] | ||
234 | (4 + 2004 = 2008). If this protection value is 0, this zone would be used for | ||
235 | normal page requirement. If requirement is DMA zone(index=0), protection[0] | ||
236 | (=0) is used. | ||
237 | |||
238 | zone[i]'s protection[j] is calculated by following expression. | ||
239 | |||
240 | (i < j): | ||
241 | zone[i]->protection[j] | ||
242 | = (total sums of present_pages from zone[i+1] to zone[j] on the node) | ||
243 | / lowmem_reserve_ratio[i]; | ||
244 | (i = j): | ||
245 | (should not be protected. = 0; | ||
246 | (i > j): | ||
247 | (not necessary, but looks 0) | ||
248 | |||
249 | The default values of lowmem_reserve_ratio[i] are | ||
250 | 256 (if zone[i] means DMA or DMA32 zone) | ||
251 | 32 (others). | ||
252 | As above expression, they are reciprocal number of ratio. | ||
253 | 256 means 1/256. # of protection pages becomes about "0.39%" of total present | ||
254 | pages of higher zones on the node. | ||
255 | |||
256 | If you would like to protect more pages, smaller values are effective. | ||
257 | The minimum value is 1 (1/1 -> 100%). | ||
163 | 258 | ||
164 | It may be beneficial to switch off zone reclaim if the system is | 259 | ============================================================== |
165 | used for a file server and all of memory should be used for caching files | ||
166 | from disk. In that case the caching effect is more important than | ||
167 | data locality. | ||
168 | 260 | ||
169 | Allowing zone reclaim to write out pages stops processes that are | 261 | max_map_count: |
170 | writing large amounts of data from dirtying pages on other nodes. Zone | ||
171 | reclaim will write out dirty pages if a zone fills up and so effectively | ||
172 | throttle the process. This may decrease the performance of a single process | ||
173 | since it cannot use all of system memory to buffer the outgoing writes | ||
174 | anymore but it preserve the memory on other nodes so that the performance | ||
175 | of other processes running on other nodes will not be affected. | ||
176 | 262 | ||
177 | Allowing regular swap effectively restricts allocations to the local | 263 | This file contains the maximum number of memory map areas a process |
178 | node unless explicitly overridden by memory policies or cpuset | 264 | may have. Memory map areas are used as a side-effect of calling |
179 | configurations. | 265 | malloc, directly by mmap and mprotect, and also when loading shared |
266 | libraries. | ||
180 | 267 | ||
181 | ============================================================= | 268 | While most applications need less than a thousand maps, certain |
269 | programs, particularly malloc debuggers, may consume lots of them, | ||
270 | e.g., up to one or two maps per allocation. | ||
182 | 271 | ||
183 | min_unmapped_ratio: | 272 | The default value is 65536. |
184 | 273 | ||
185 | This is available only on NUMA kernels. | 274 | ============================================================== |
186 | 275 | ||
187 | A percentage of the total pages in each zone. Zone reclaim will only | 276 | min_free_kbytes: |
188 | occur if more than this percentage of pages are file backed and unmapped. | ||
189 | This is to insure that a minimal amount of local pages is still available for | ||
190 | file I/O even if the node is overallocated. | ||
191 | 277 | ||
192 | The default is 1 percent. | 278 | This is used to force the Linux VM to keep a minimum number |
279 | of kilobytes free. The VM uses this number to compute a pages_min | ||
280 | value for each lowmem zone in the system. Each lowmem zone gets | ||
281 | a number of reserved free pages based proportionally on its size. | ||
282 | |||
283 | Some minimal amount of memory is needed to satisfy PF_MEMALLOC | ||
284 | allocations; if you set this to lower than 1024KB, your system will | ||
285 | become subtly broken, and prone to deadlock under high loads. | ||
286 | |||
287 | Setting this too high will OOM your machine instantly. | ||
193 | 288 | ||
194 | ============================================================= | 289 | ============================================================= |
195 | 290 | ||
@@ -211,82 +306,73 @@ and may not be fast. | |||
211 | 306 | ||
212 | ============================================================= | 307 | ============================================================= |
213 | 308 | ||
214 | panic_on_oom | 309 | min_unmapped_ratio: |
215 | 310 | ||
216 | This enables or disables panic on out-of-memory feature. | 311 | This is available only on NUMA kernels. |
217 | 312 | ||
218 | If this is set to 0, the kernel will kill some rogue process, | 313 | A percentage of the total pages in each zone. Zone reclaim will only |
219 | called oom_killer. Usually, oom_killer can kill rogue processes and | 314 | occur if more than this percentage of pages are file backed and unmapped. |
220 | system will survive. | 315 | This is to insure that a minimal amount of local pages is still available for |
316 | file I/O even if the node is overallocated. | ||
221 | 317 | ||
222 | If this is set to 1, the kernel panics when out-of-memory happens. | 318 | The default is 1 percent. |
223 | However, if a process limits using nodes by mempolicy/cpusets, | ||
224 | and those nodes become memory exhaustion status, one process | ||
225 | may be killed by oom-killer. No panic occurs in this case. | ||
226 | Because other nodes' memory may be free. This means system total status | ||
227 | may be not fatal yet. | ||
228 | 319 | ||
229 | If this is set to 2, the kernel panics compulsorily even on the | 320 | ============================================================== |
230 | above-mentioned. | ||
231 | 321 | ||
232 | The default value is 0. | 322 | mmap_min_addr |
233 | 1 and 2 are for failover of clustering. Please select either | ||
234 | according to your policy of failover. | ||
235 | 323 | ||
236 | ============================================================= | 324 | This file indicates the amount of address space which a user process will |
325 | be restricted from mmaping. Since kernel null dereference bugs could | ||
326 | accidentally operate based on the information in the first couple of pages | ||
327 | of memory userspace processes should not be allowed to write to them. By | ||
328 | default this value is set to 0 and no protections will be enforced by the | ||
329 | security module. Setting this value to something like 64k will allow the | ||
330 | vast majority of applications to work correctly and provide defense in depth | ||
331 | against future potential kernel bugs. | ||
237 | 332 | ||
238 | oom_dump_tasks | 333 | ============================================================== |
239 | 334 | ||
240 | Enables a system-wide task dump (excluding kernel threads) to be | 335 | nr_hugepages |
241 | produced when the kernel performs an OOM-killing and includes such | ||
242 | information as pid, uid, tgid, vm size, rss, cpu, oom_adj score, and | ||
243 | name. This is helpful to determine why the OOM killer was invoked | ||
244 | and to identify the rogue task that caused it. | ||
245 | 336 | ||
246 | If this is set to zero, this information is suppressed. On very | 337 | Change the minimum size of the hugepage pool. |
247 | large systems with thousands of tasks it may not be feasible to dump | ||
248 | the memory state information for each one. Such systems should not | ||
249 | be forced to incur a performance penalty in OOM conditions when the | ||
250 | information may not be desired. | ||
251 | 338 | ||
252 | If this is set to non-zero, this information is shown whenever the | 339 | See Documentation/vm/hugetlbpage.txt |
253 | OOM killer actually kills a memory-hogging task. | ||
254 | 340 | ||
255 | The default value is 0. | 341 | ============================================================== |
256 | 342 | ||
257 | ============================================================= | 343 | nr_overcommit_hugepages |
258 | 344 | ||
259 | oom_kill_allocating_task | 345 | Change the maximum size of the hugepage pool. The maximum is |
346 | nr_hugepages + nr_overcommit_hugepages. | ||
260 | 347 | ||
261 | This enables or disables killing the OOM-triggering task in | 348 | See Documentation/vm/hugetlbpage.txt |
262 | out-of-memory situations. | ||
263 | 349 | ||
264 | If this is set to zero, the OOM killer will scan through the entire | 350 | ============================================================== |
265 | tasklist and select a task based on heuristics to kill. This normally | ||
266 | selects a rogue memory-hogging task that frees up a large amount of | ||
267 | memory when killed. | ||
268 | 351 | ||
269 | If this is set to non-zero, the OOM killer simply kills the task that | 352 | nr_pdflush_threads |
270 | triggered the out-of-memory condition. This avoids the expensive | ||
271 | tasklist scan. | ||
272 | 353 | ||
273 | If panic_on_oom is selected, it takes precedence over whatever value | 354 | The current number of pdflush threads. This value is read-only. |
274 | is used in oom_kill_allocating_task. | 355 | The value changes according to the number of dirty pages in the system. |
275 | 356 | ||
276 | The default value is 0. | 357 | When neccessary, additional pdflush threads are created, one per second, up to |
358 | nr_pdflush_threads_max. | ||
277 | 359 | ||
278 | ============================================================== | 360 | ============================================================== |
279 | 361 | ||
280 | mmap_min_addr | 362 | nr_trim_pages |
281 | 363 | ||
282 | This file indicates the amount of address space which a user process will | 364 | This is available only on NOMMU kernels. |
283 | be restricted from mmaping. Since kernel null dereference bugs could | 365 | |
284 | accidentally operate based on the information in the first couple of pages | 366 | This value adjusts the excess page trimming behaviour of power-of-2 aligned |
285 | of memory userspace processes should not be allowed to write to them. By | 367 | NOMMU mmap allocations. |
286 | default this value is set to 0 and no protections will be enforced by the | 368 | |
287 | security module. Setting this value to something like 64k will allow the | 369 | A value of 0 disables trimming of allocations entirely, while a value of 1 |
288 | vast majority of applications to work correctly and provide defense in depth | 370 | trims excess pages aggressively. Any value >= 1 acts as the watermark where |
289 | against future potential kernel bugs. | 371 | trimming of allocations is initiated. |
372 | |||
373 | The default value is 1. | ||
374 | |||
375 | See Documentation/nommu-mmap.txt for more information. | ||
290 | 376 | ||
291 | ============================================================== | 377 | ============================================================== |
292 | 378 | ||
@@ -335,34 +421,199 @@ this is causing problems for your system/application. | |||
335 | 421 | ||
336 | ============================================================== | 422 | ============================================================== |
337 | 423 | ||
338 | nr_hugepages | 424 | oom_dump_tasks |
339 | 425 | ||
340 | Change the minimum size of the hugepage pool. | 426 | Enables a system-wide task dump (excluding kernel threads) to be |
427 | produced when the kernel performs an OOM-killing and includes such | ||
428 | information as pid, uid, tgid, vm size, rss, cpu, oom_adj score, and | ||
429 | name. This is helpful to determine why the OOM killer was invoked | ||
430 | and to identify the rogue task that caused it. | ||
341 | 431 | ||
342 | See Documentation/vm/hugetlbpage.txt | 432 | If this is set to zero, this information is suppressed. On very |
433 | large systems with thousands of tasks it may not be feasible to dump | ||
434 | the memory state information for each one. Such systems should not | ||
435 | be forced to incur a performance penalty in OOM conditions when the | ||
436 | information may not be desired. | ||
437 | |||
438 | If this is set to non-zero, this information is shown whenever the | ||
439 | OOM killer actually kills a memory-hogging task. | ||
440 | |||
441 | The default value is 0. | ||
343 | 442 | ||
344 | ============================================================== | 443 | ============================================================== |
345 | 444 | ||
346 | nr_overcommit_hugepages | 445 | oom_kill_allocating_task |
347 | 446 | ||
348 | Change the maximum size of the hugepage pool. The maximum is | 447 | This enables or disables killing the OOM-triggering task in |
349 | nr_hugepages + nr_overcommit_hugepages. | 448 | out-of-memory situations. |
350 | 449 | ||
351 | See Documentation/vm/hugetlbpage.txt | 450 | If this is set to zero, the OOM killer will scan through the entire |
451 | tasklist and select a task based on heuristics to kill. This normally | ||
452 | selects a rogue memory-hogging task that frees up a large amount of | ||
453 | memory when killed. | ||
454 | |||
455 | If this is set to non-zero, the OOM killer simply kills the task that | ||
456 | triggered the out-of-memory condition. This avoids the expensive | ||
457 | tasklist scan. | ||
458 | |||
459 | If panic_on_oom is selected, it takes precedence over whatever value | ||
460 | is used in oom_kill_allocating_task. | ||
461 | |||
462 | The default value is 0. | ||
352 | 463 | ||
353 | ============================================================== | 464 | ============================================================== |
354 | 465 | ||
355 | nr_trim_pages | 466 | overcommit_memory: |
356 | 467 | ||
357 | This is available only on NOMMU kernels. | 468 | This value contains a flag that enables memory overcommitment. |
358 | 469 | ||
359 | This value adjusts the excess page trimming behaviour of power-of-2 aligned | 470 | When this flag is 0, the kernel attempts to estimate the amount |
360 | NOMMU mmap allocations. | 471 | of free memory left when userspace requests more memory. |
361 | 472 | ||
362 | A value of 0 disables trimming of allocations entirely, while a value of 1 | 473 | When this flag is 1, the kernel pretends there is always enough |
363 | trims excess pages aggressively. Any value >= 1 acts as the watermark where | 474 | memory until it actually runs out. |
364 | trimming of allocations is initiated. | ||
365 | 475 | ||
366 | The default value is 1. | 476 | When this flag is 2, the kernel uses a "never overcommit" |
477 | policy that attempts to prevent any overcommit of memory. | ||
367 | 478 | ||
368 | See Documentation/nommu-mmap.txt for more information. | 479 | This feature can be very useful because there are a lot of |
480 | programs that malloc() huge amounts of memory "just-in-case" | ||
481 | and don't use much of it. | ||
482 | |||
483 | The default value is 0. | ||
484 | |||
485 | See Documentation/vm/overcommit-accounting and | ||
486 | security/commoncap.c::cap_vm_enough_memory() for more information. | ||
487 | |||
488 | ============================================================== | ||
489 | |||
490 | overcommit_ratio: | ||
491 | |||
492 | When overcommit_memory is set to 2, the committed address | ||
493 | space is not permitted to exceed swap plus this percentage | ||
494 | of physical RAM. See above. | ||
495 | |||
496 | ============================================================== | ||
497 | |||
498 | page-cluster | ||
499 | |||
500 | page-cluster controls the number of pages which are written to swap in | ||
501 | a single attempt. The swap I/O size. | ||
502 | |||
503 | It is a logarithmic value - setting it to zero means "1 page", setting | ||
504 | it to 1 means "2 pages", setting it to 2 means "4 pages", etc. | ||
505 | |||
506 | The default value is three (eight pages at a time). There may be some | ||
507 | small benefits in tuning this to a different value if your workload is | ||
508 | swap-intensive. | ||
509 | |||
510 | ============================================================= | ||
511 | |||
512 | panic_on_oom | ||
513 | |||
514 | This enables or disables panic on out-of-memory feature. | ||
515 | |||
516 | If this is set to 0, the kernel will kill some rogue process, | ||
517 | called oom_killer. Usually, oom_killer can kill rogue processes and | ||
518 | system will survive. | ||
519 | |||
520 | If this is set to 1, the kernel panics when out-of-memory happens. | ||
521 | However, if a process limits using nodes by mempolicy/cpusets, | ||
522 | and those nodes become memory exhaustion status, one process | ||
523 | may be killed by oom-killer. No panic occurs in this case. | ||
524 | Because other nodes' memory may be free. This means system total status | ||
525 | may be not fatal yet. | ||
526 | |||
527 | If this is set to 2, the kernel panics compulsorily even on the | ||
528 | above-mentioned. | ||
529 | |||
530 | The default value is 0. | ||
531 | 1 and 2 are for failover of clustering. Please select either | ||
532 | according to your policy of failover. | ||
533 | |||
534 | ============================================================= | ||
535 | |||
536 | percpu_pagelist_fraction | ||
537 | |||
538 | This is the fraction of pages at most (high mark pcp->high) in each zone that | ||
539 | are allocated for each per cpu page list. The min value for this is 8. It | ||
540 | means that we don't allow more than 1/8th of pages in each zone to be | ||
541 | allocated in any single per_cpu_pagelist. This entry only changes the value | ||
542 | of hot per cpu pagelists. User can specify a number like 100 to allocate | ||
543 | 1/100th of each zone to each per cpu page list. | ||
544 | |||
545 | The batch value of each per cpu pagelist is also updated as a result. It is | ||
546 | set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8) | ||
547 | |||
548 | The initial value is zero. Kernel does not use this value at boot time to set | ||
549 | the high water marks for each per cpu page list. | ||
550 | |||
551 | ============================================================== | ||
552 | |||
553 | stat_interval | ||
554 | |||
555 | The time interval between which vm statistics are updated. The default | ||
556 | is 1 second. | ||
557 | |||
558 | ============================================================== | ||
559 | |||
560 | swappiness | ||
561 | |||
562 | This control is used to define how aggressive the kernel will swap | ||
563 | memory pages. Higher values will increase agressiveness, lower values | ||
564 | descrease the amount of swap. | ||
565 | |||
566 | The default value is 60. | ||
567 | |||
568 | ============================================================== | ||
569 | |||
570 | vfs_cache_pressure | ||
571 | ------------------ | ||
572 | |||
573 | Controls the tendency of the kernel to reclaim the memory which is used for | ||
574 | caching of directory and inode objects. | ||
575 | |||
576 | At the default value of vfs_cache_pressure=100 the kernel will attempt to | ||
577 | reclaim dentries and inodes at a "fair" rate with respect to pagecache and | ||
578 | swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer | ||
579 | to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100 | ||
580 | causes the kernel to prefer to reclaim dentries and inodes. | ||
581 | |||
582 | ============================================================== | ||
583 | |||
584 | zone_reclaim_mode: | ||
585 | |||
586 | Zone_reclaim_mode allows someone to set more or less aggressive approaches to | ||
587 | reclaim memory when a zone runs out of memory. If it is set to zero then no | ||
588 | zone reclaim occurs. Allocations will be satisfied from other zones / nodes | ||
589 | in the system. | ||
590 | |||
591 | This is value ORed together of | ||
592 | |||
593 | 1 = Zone reclaim on | ||
594 | 2 = Zone reclaim writes dirty pages out | ||
595 | 4 = Zone reclaim swaps pages | ||
596 | |||
597 | zone_reclaim_mode is set during bootup to 1 if it is determined that pages | ||
598 | from remote zones will cause a measurable performance reduction. The | ||
599 | page allocator will then reclaim easily reusable pages (those page | ||
600 | cache pages that are currently not used) before allocating off node pages. | ||
601 | |||
602 | It may be beneficial to switch off zone reclaim if the system is | ||
603 | used for a file server and all of memory should be used for caching files | ||
604 | from disk. In that case the caching effect is more important than | ||
605 | data locality. | ||
606 | |||
607 | Allowing zone reclaim to write out pages stops processes that are | ||
608 | writing large amounts of data from dirtying pages on other nodes. Zone | ||
609 | reclaim will write out dirty pages if a zone fills up and so effectively | ||
610 | throttle the process. This may decrease the performance of a single process | ||
611 | since it cannot use all of system memory to buffer the outgoing writes | ||
612 | anymore but it preserve the memory on other nodes so that the performance | ||
613 | of other processes running on other nodes will not be affected. | ||
614 | |||
615 | Allowing regular swap effectively restricts allocations to the local | ||
616 | node unless explicitly overridden by memory policies or cpuset | ||
617 | configurations. | ||
618 | |||
619 | ============ End of Document ================================= | ||
diff --git a/Documentation/sysrq.txt b/Documentation/sysrq.txt index 10a0263ebb3f..9e592c718afb 100644 --- a/Documentation/sysrq.txt +++ b/Documentation/sysrq.txt | |||
@@ -1,6 +1,5 @@ | |||
1 | Linux Magic System Request Key Hacks | 1 | Linux Magic System Request Key Hacks |
2 | Documentation for sysrq.c | 2 | Documentation for sysrq.c |
3 | Last update: 2007-AUG-04 | ||
4 | 3 | ||
5 | * What is the magic SysRq key? | 4 | * What is the magic SysRq key? |
6 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 5 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
@@ -211,6 +210,24 @@ within a function called by handle_sysrq, you must be aware that you are in | |||
211 | a lock (you are also in an interrupt handler, which means don't sleep!), so | 210 | a lock (you are also in an interrupt handler, which means don't sleep!), so |
212 | you must call __handle_sysrq_nolock instead. | 211 | you must call __handle_sysrq_nolock instead. |
213 | 212 | ||
213 | * When I hit a SysRq key combination only the header appears on the console? | ||
214 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
215 | Sysrq output is subject to the same console loglevel control as all | ||
216 | other console output. This means that if the kernel was booted 'quiet' | ||
217 | as is common on distro kernels the output may not appear on the actual | ||
218 | console, even though it will appear in the dmesg buffer, and be accessible | ||
219 | via the dmesg command and to the consumers of /proc/kmsg. As a specific | ||
220 | exception the header line from the sysrq command is passed to all console | ||
221 | consumers as if the current loglevel was maximum. If only the header | ||
222 | is emitted it is almost certain that the kernel loglevel is too low. | ||
223 | Should you require the output on the console channel then you will need | ||
224 | to temporarily up the console loglevel using alt-sysrq-8 or: | ||
225 | |||
226 | echo 8 > /proc/sysrq-trigger | ||
227 | |||
228 | Remember to return the loglevel to normal after triggering the sysrq | ||
229 | command you are interested in. | ||
230 | |||
214 | * I have more questions, who can I ask? | 231 | * I have more questions, who can I ask? |
215 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 232 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
216 | And I'll answer any questions about the registration system you got, also | 233 | And I'll answer any questions about the registration system you got, also |