Diffstat (limited to 'Documentation/filesystems/proc.txt')
-rw-r--r--  Documentation/filesystems/proc.txt | 155
1 files changed, 126 insertions, 29 deletions
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index dec99455321f..5681e2fa1496 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -216,6 +216,7 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
   priority      priority level
   nice          nice level
   num_threads   number of threads
+  it_real_value (obsolete, always 0)
   start_time    time the process started after system boot
   vsize         virtual memory size
   rss           resident set memory size
@@ -857,6 +858,45 @@ CPUs.
 The "procs_blocked" line gives the number of processes currently blocked,
 waiting for I/O to complete.
 
+1.9 Ext4 file system parameters
+-------------------------------
+
+Ext4 file systems have one directory per partition under /proc/fs/ext4/
+# ls /proc/fs/ext4/hdc/
+group_prealloc  max_to_scan  mb_groups  mb_history  min_to_scan  order2_req
+stats  stream_req
+
+mb_groups:
+This file gives the details of the multiblock allocator's buddy cache of
+free blocks.
+
+mb_history:
+Multiblock allocation history.
+
+stats:
+This file indicates whether the multiblock allocator should start collecting
+statistics. The statistics are shown during unmount.
+
+group_prealloc:
+The multiblock allocator normalizes the block allocation request to
+group_prealloc filesystem blocks if we don't have a stripe value set.
+The stripe value can be specified at mount time or during mke2fs.
+
+max_to_scan:
+The maximum number of found extents the multiblock allocator will examine
+when searching for the best extent.
+
+min_to_scan:
+The minimum number of extents the multiblock allocator must examine before
+picking the best extent.
+
+order2_req:
+The multiblock allocator uses a 2^N buddy search only for requests greater
+than or equal to order2_req. The request size is specified in file system
+blocks. A value of 2 means the buddy search is used only for requests of
+4 or more blocks.
+
+stream_req:
+Files smaller than stream_req are served by the stream allocator, whose
+purpose is to pack requests as close to each other as possible to
+produce smooth I/O traffic. A value of 16 means that files smaller than 16
+filesystem blocks will use group-based preallocation.
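As a rough illustration, the per-partition tunables above can be inspected from userspace. This is a minimal sketch, assuming the /proc/fs/ext4/<device>/ layout described in this section; the directory and file names are taken from the listing above and will not exist on kernels with a different layout.

```python
# Sketch: read the ext4 multiblock-allocator tunables for each partition.
# Assumes the /proc/fs/ext4/<device>/ layout described in the text above.
import os

EXT4_PROC = "/proc/fs/ext4"

TUNABLES = ("group_prealloc", "max_to_scan", "min_to_scan",
            "order2_req", "stats", "stream_req")

def read_tunables(device):
    """Return {tunable_name: raw_contents} for one partition directory.
    Missing files (e.g. on other kernel versions) are simply skipped."""
    base = os.path.join(EXT4_PROC, device)
    out = {}
    for name in TUNABLES:
        path = os.path.join(base, name)
        if os.path.exists(path):
            with open(path) as f:
                out[name] = f.read().strip()
    return out

if os.path.isdir(EXT4_PROC):
    for dev in os.listdir(EXT4_PROC):
        print(dev, read_tunables(dev))
```

Writing a tunable works the same way in reverse, e.g. echoing a new value into stream_req at the shell.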
 
 ------------------------------------------------------------------------------
 Summary
@@ -989,6 +1029,14 @@ nr_inodes
 Denotes the number of inodes the system has allocated. This number will
 grow and shrink dynamically.
 
+nr_open
+-------
+
+Denotes the maximum number of file-handles a process can
+allocate. The default value is 1024*1024 (1048576), which should be
+enough for most machines. The actual limit also depends on the
+RLIMIT_NOFILE resource limit.
+
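The interplay between nr_open and RLIMIT_NOFILE can be observed from userspace; a minimal sketch (the values printed will differ per machine, and the nr_open file only exists on kernels that carry this patch):

```python
# Sketch: show the per-process fd limit next to the fs.nr_open ceiling.
# The RLIMIT_NOFILE hard limit cannot be raised above fs.nr_open.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("RLIMIT_NOFILE soft limit:", soft)
print("RLIMIT_NOFILE hard limit:", hard)

try:
    with open("/proc/sys/fs/nr_open") as f:
        print("fs.nr_open:", int(f.read()))
except FileNotFoundError:
    pass  # kernels without this sysctl
```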
 nr_free_inodes
 --------------
 
@@ -1095,13 +1143,6 @@ check the amount of free space (value is in seconds). Default settings are: 4,
 resume it if we have a value of 3 or more percent; consider information about
 the amount of free space valid for 30 seconds
 
-audit_argv_kb
--------------
-
-The file contains a single value denoting the limit on the argv array size
-for execve (in KiB). This limit is only applied when system call auditing for
-execve is enabled, otherwise the value is ignored.
-
 ctrl-alt-del
 ------------
 
@@ -1282,13 +1323,28 @@ for writeout by the pdflush daemons. It is expressed in 100'ths of a second.
 Data which has been dirty in-memory for longer than this interval will be
 written out next time a pdflush daemon wakes up.
 
+highmem_is_dirtyable
+--------------------
+
+Only present if CONFIG_HIGHMEM is set.
+
+This defaults to 0 (false), meaning that the ratios set above are calculated
+as a percentage of lowmem only. This protects against excessive scanning
+in page reclaim, swapping and general VM distress.
+
+Setting this to 1 can be useful on 32 bit machines where you want to make
+random changes within an MMAPed file that is larger than your available
+lowmem without causing large quantities of random IO. It is safe if the
+behavior of all programs running on the machine is known and memory will
+not be otherwise stressed.
+
 legacy_va_layout
 ----------------
 
 If non-zero, this sysctl disables the new 32-bit mmap layout - the kernel
 will use the legacy (2.4) layout for all processes.
 
-lower_zone_protection
----------------------
+lowmem_reserve_ratio
+--------------------
 
@@ -1308,25 +1364,71 @@ captured into pinned user memory.
 mechanism will also defend that region from allocations which could use
 highmem or lowmem).
 
-The `lower_zone_protection' tunable determines how aggressive the kernel is
-in defending these lower zones. The default value is zero - no
-protection at all.
+The `lowmem_reserve_ratio' tunable determines how aggressive the kernel is
+in defending these lower zones.
 
 If you have a machine which uses highmem or ISA DMA and your
 applications are using mlock(), or if you are running with no swap then
-you probably should increase the lower_zone_protection setting.
+you probably should change the lowmem_reserve_ratio setting.
 
-The units of this tunable are fairly vague. It is approximately equal
-to "megabytes," so setting lower_zone_protection=100 will protect around 100
-megabytes of the lowmem zone from user allocations. It will also make
-those 100 megabytes unavailable for use by applications and by
-pagecache, so there is a cost.
-
-The effects of this tunable may be observed by monitoring
-/proc/meminfo:LowFree. Write a single huge file and observe the point
-at which LowFree ceases to fall.
-
-A reasonable value for lower_zone_protection is 100.
+lowmem_reserve_ratio is an array. You can see it by reading this file:
+-
+% cat /proc/sys/vm/lowmem_reserve_ratio
+256     256     32
+-
+Note: the number of elements is one fewer than the number of zones, because
+the highest zone's value is not needed for the calculation below.
+
+These values are not used directly. The kernel calculates the number of
+protection pages for each zone from them. They are shown as the array of
+protection pages in /proc/zoneinfo, as in the following example from an
+x86-64 box. Each zone has an array of protection pages like this:
+
+-
+Node 0, zone      DMA
+  pages free     1355
+        min      3
+        low      3
+        high     4
+	:
+	:
+    numa_other   0
+        protection: (0, 2004, 2004, 2004)
+	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+  pagesets
+    cpu: 0 pcp: 0
+	:
+-
+These protection values are added to the watermark to judge whether a zone
+should be used for page allocation or should be reclaimed.
+
+In this example, if normal pages (index=2) are requested from this DMA zone
+and pages_high is used as the watermark, the kernel judges that this zone
+should not be used because pages_free (1355) is smaller than
+watermark + protection[2] (4 + 2004 = 2008). If the protection value were 0,
+this zone could serve the normal page request. For a DMA request (index=0),
+protection[0] (=0) is used instead.
+
+zone[i]'s protection[j] is calculated by the following expression:
+
+(i < j):
+  zone[i]->protection[j]
+  = (total sum of present_pages from zone[i+1] to zone[j] on the node)
+    / lowmem_reserve_ratio[i];
+(i = j):
+  (should not be protected. = 0;)
+(i > j):
+  (not necessary, but looks like 0)
+
+The default values of lowmem_reserve_ratio[i] are
+    256 (if zone[i] means DMA or DMA32 zone)
+    32  (others).
+As the expression above shows, they are reciprocals of the ratio:
+256 means 1/256, so the number of protection pages becomes about 0.39% of
+the total present pages of the higher zones on the node.
+
+If you would like to protect more pages, smaller values are effective.
+The minimum value is 1 (1/1 -> 100%).
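The expression above can be sketched in a few lines. The present_pages values below are invented for illustration; only the formula itself comes from the text.

```python
# Sketch of the protection calculation described above. Zone sizes are
# made-up numbers; the kernel uses integer division, mirrored here with //.

def lowmem_protection(present_pages, reserve_ratio):
    """protection[i][j] = sum(present_pages[i+1..j]) // reserve_ratio[i]
    for i < j, else 0, per the expression in the text."""
    n = len(present_pages)
    prot = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prot[i][j] = sum(present_pages[i + 1:j + 1]) // reserve_ratio[i]
    return prot

# Three zones (DMA, DMA32, Normal) with invented sizes, default ratios.
# The last ratio entry is never read, matching the "one fewer" note above.
present = [4000, 500000, 1500000]
ratio = [256, 256, 32]
prot = lowmem_protection(present, ratio)
print(prot[0])  # DMA zone's protection against DMA32 and Normal allocations
```

With these numbers the DMA zone protects sum(present[1:3]) // 256 pages against Normal-zone requests, so a larger ratio value means fewer protected pages, as the reciprocal relationship in the text describes.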
 
 page-cluster
 ------------
@@ -1880,11 +1982,6 @@ max_size
 Maximum size of the routing cache. Old entries will be purged once the cache
 has reached this size.
 
-max_delay, min_delay
---------------------
-
-Delays for flushing the routing cache.
-
 redirect_load, redirect_number
 ------------------------------
 