author | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
commit | 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (patch) | |
tree | 0bba044c4ce775e45a88a51686b5d9f90697ea9d /Documentation/filesystems/proc.txt |
Linux-2.6.12-rc2 (tag: v2.6.12-rc2)
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
Diffstat (limited to 'Documentation/filesystems/proc.txt')
-rw-r--r-- | Documentation/filesystems/proc.txt | 1940 |
1 files changed, 1940 insertions, 0 deletions
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt new file mode 100644 index 000000000000..cbe85c17176b --- /dev/null +++ b/Documentation/filesystems/proc.txt | |||
@@ -0,0 +1,1940 @@ | |||
1 | ------------------------------------------------------------------------------ | ||
2 | T H E /proc F I L E S Y S T E M | ||
3 | ------------------------------------------------------------------------------ | ||
4 | /proc/sys Terrehon Bowden <terrehon@pacbell.net> October 7 1999 | ||
5 | Bodo Bauer <bb@ricochet.net> | ||
6 | |||
7 | 2.4.x update Jorge Nerin <comandante@zaralinux.com> November 14 2000 | ||
8 | ------------------------------------------------------------------------------ | ||
9 | Version 1.3 Kernel version 2.2.12 | ||
10 | Kernel version 2.4.0-test11-pre4 | ||
11 | ------------------------------------------------------------------------------ | ||
12 | |||
13 | Table of Contents | ||
14 | ----------------- | ||
15 | |||
16 | 0 Preface | ||
17 | 0.1 Introduction/Credits | ||
18 | 0.2 Legal Stuff | ||
19 | |||
20 | 1 Collecting System Information | ||
21 | 1.1 Process-Specific Subdirectories | ||
22 | 1.2 Kernel data | ||
23 | 1.3 IDE devices in /proc/ide | ||
24 | 1.4 Networking info in /proc/net | ||
25 | 1.5 SCSI info | ||
26 | 1.6 Parallel port info in /proc/parport | ||
27 | 1.7 TTY info in /proc/tty | ||
28 | 1.8 Miscellaneous kernel statistics in /proc/stat | ||
29 | |||
30 | 2 Modifying System Parameters | ||
31 | 2.1 /proc/sys/fs - File system data | ||
32 | 2.2 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats | ||
33 | 2.3 /proc/sys/kernel - general kernel parameters | ||
34 | 2.4 /proc/sys/vm - The virtual memory subsystem | ||
35 | 2.5 /proc/sys/dev - Device specific parameters | ||
36 | 2.6 /proc/sys/sunrpc - Remote procedure calls | ||
37 | 2.7 /proc/sys/net - Networking stuff | ||
38 | 2.8 /proc/sys/net/ipv4 - IPV4 settings | ||
39 | 2.9 Appletalk | ||
40 | 2.10 IPX | ||
41 | 2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem | ||
42 | |||
43 | ------------------------------------------------------------------------------ | ||
44 | Preface | ||
45 | ------------------------------------------------------------------------------ | ||
46 | |||
47 | 0.1 Introduction/Credits | ||
48 | ------------------------ | ||
49 | |||
50 | This documentation is part of a soon (or so we hope) to be released book on | ||
51 | the SuSE Linux distribution. As there is no complete documentation for the | ||
52 | /proc file system and we've used many freely available sources to write these | ||
53 | chapters, it seems only fair to give the work back to the Linux community. | ||
54 | This work is based on the 2.2.* kernel version and the upcoming 2.4.*. I'm | ||
55 | afraid it's still far from complete, but we hope it will be useful. As far as | ||
56 | we know, it is the first 'all-in-one' document about the /proc file system. It | ||
57 | is focused on the Intel x86 hardware, so if you are looking for PPC, ARM, | ||
58 | SPARC, AXP, etc., features, you probably won't find what you are looking for. | ||
59 | It also only covers IPv4 networking, not IPv6 nor other protocols - sorry. But | ||
60 | additions and patches are welcome and will be added to this document if you | ||
61 | mail them to Bodo. | ||
62 | |||
63 | We'd like to thank Alan Cox, Rik van Riel, and Alexey Kuznetsov and a lot of | ||
64 | other people for help compiling this documentation. We'd also like to extend a | ||
65 | special thank you to Andi Kleen for documentation, which we relied on heavily | ||
66 | to create this document, as well as the additional information he provided. | ||
67 | Thanks to everybody else who contributed source or docs to the Linux kernel | ||
68 | and helped create a great piece of software... :) | ||
69 | |||
70 | If you have any comments, corrections or additions, please don't hesitate to | ||
71 | contact Bodo Bauer at bb@ricochet.net. We'll be happy to add them to this | ||
72 | document. | ||
73 | |||
74 | The latest version of this document is available online at | ||
75 | http://skaro.nightcrawler.com/~bb/Docs/Proc as HTML version. | ||
76 | |||
77 | If the above address does not work for you, you could try the kernel | ||
78 | mailing list at linux-kernel@vger.kernel.org and/or try to reach me at | ||
79 | comandante@zaralinux.com. | ||
80 | |||
81 | 0.2 Legal Stuff | ||
82 | --------------- | ||
83 | |||
84 | We don't guarantee the correctness of this document, and if you come to us | ||
85 | complaining about how you screwed up your system because of incorrect | ||
86 | documentation, we won't feel responsible... | ||
87 | |||
88 | ------------------------------------------------------------------------------ | ||
89 | CHAPTER 1: COLLECTING SYSTEM INFORMATION | ||
90 | ------------------------------------------------------------------------------ | ||
91 | |||
92 | ------------------------------------------------------------------------------ | ||
93 | In This Chapter | ||
94 | ------------------------------------------------------------------------------ | ||
95 | * Investigating the properties of the pseudo file system /proc and its | ||
96 | ability to provide information on the running Linux system | ||
97 | * Examining /proc's structure | ||
98 | * Uncovering various information about the kernel and the processes running | ||
99 | on the system | ||
100 | ------------------------------------------------------------------------------ | ||
101 | |||
102 | |||
103 | The proc file system acts as an interface to internal data structures in the | ||
104 | kernel. It can be used to obtain information about the system and to change | ||
105 | certain kernel parameters at runtime (sysctl). | ||
106 | |||
107 | First, we'll take a look at the read-only parts of /proc. In Chapter 2, we | ||
108 | show you how you can use /proc/sys to change settings. | ||
109 | |||
110 | 1.1 Process-Specific Subdirectories | ||
111 | ----------------------------------- | ||
112 | |||
113 | The directory /proc contains (among other things) one subdirectory for each | ||
114 | process running on the system, which is named after the process ID (PID). | ||
115 | |||
116 | The link self points to the process reading the file system. Each process | ||
117 | subdirectory has the entries listed in Table 1-1. | ||
118 | |||
119 | |||
120 | Table 1-1: Process specific entries in /proc | ||
121 | .............................................................................. | ||
122 | File Content | ||
123 | cmdline Command line arguments | ||
124 | cpu Current and last cpu in which it was executed (2.4)(smp) | ||
125 | cwd Link to the current working directory | ||
126 | environ Values of environment variables | ||
127 | exe Link to the executable of this process | ||
128 | fd Directory, which contains all file descriptors | ||
129 | maps Memory maps to executables and library files (2.4) | ||
130 | mem Memory held by this process | ||
131 | root Link to the root directory of this process | ||
132 | stat Process status | ||
133 | statm Process memory status information | ||
134 | status Process status in human readable form | ||
135 | wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan | ||
136 | .............................................................................. | ||
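As a small illustration of reading one of these entries, the sketch below splits the contents of cmdline into separate arguments. It assumes the arguments are stored as NUL-separated strings; the function names split_cmdline and read_cmdline are made up for this example.

```python
from pathlib import Path


def split_cmdline(raw: bytes) -> list[str]:
    """Split the NUL-separated contents of /proc/PID/cmdline into arguments."""
    parts = raw.split(b"\0")
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece left by the trailing NUL
    return [p.decode(errors="replace") for p in parts]


def read_cmdline(pid: str = "self") -> list[str]:
    """Read the command line of a process; "self" means the calling process."""
    return split_cmdline(Path("/proc", pid, "cmdline").read_bytes())
```

On a Linux system, read_cmdline() should return the arguments of the Python interpreter itself. A process without a command line (a kernel thread, for example) yields an empty list.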
137 | |||
138 | For example, to get the status information of a process, all you have to do is | ||
139 | read the file /proc/PID/status: | ||
140 | |||
141 | > cat /proc/self/status | ||
142 | Name: cat | ||
143 | State: R (running) | ||
144 | Pid: 5452 | ||
145 | PPid: 743 | ||
146 | TracerPid: 0 (2.4) | ||
147 | Uid: 501 501 501 501 | ||
148 | Gid: 100 100 100 100 | ||
149 | Groups: 100 14 16 | ||
150 | VmSize: 1112 kB | ||
151 | VmLck: 0 kB | ||
152 | VmRSS: 348 kB | ||
153 | VmData: 24 kB | ||
154 | VmStk: 12 kB | ||
155 | VmExe: 8 kB | ||
156 | VmLib: 1044 kB | ||
157 | SigPnd: 0000000000000000 | ||
158 | SigBlk: 0000000000000000 | ||
159 | SigIgn: 0000000000000000 | ||
160 | SigCgt: 0000000000000000 | ||
161 | CapInh: 00000000fffffeff | ||
162 | CapPrm: 0000000000000000 | ||
163 | CapEff: 0000000000000000 | ||
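If you want to use these values in a program rather than read them by eye, the "Key: value" layout is easy to split. The following is only a sketch; parse_status is a hypothetical helper name, and every value is kept as an uninterpreted string.

```python
def parse_status(text: str) -> dict[str, str]:
    """Turn the "Key: value" lines of /proc/PID/status into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip anything that is not a "Key: value" line
            fields[key.strip()] = value.strip()
    return fields


def read_status(pid: str = "self") -> dict[str, str]:
    """Read and parse the status file of one process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_status(f.read())
```

For example, read_status()["VmRSS"] would give a string such as "348 kB" for the process shown above.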
164 | |||
165 | |||
166 | This shows you nearly the same information you would get if you viewed it with | ||
167 | the ps command. In fact, ps uses the proc file system to obtain its | ||
168 | information. The statm file contains more detailed information about the | ||
169 | process memory usage. Its seven fields are explained in Table 1-2. | ||
170 | |||
171 | |||
172 | Table 1-2: Contents of the statm files (as of 2.6.8-rc3) | ||
173 | .............................................................................. | ||
174 | Field Content | ||
175 | size total program size (pages) (same as VmSize in status) | ||
176 | resident size of memory portions (pages) (same as VmRSS in status) | ||
177 | shared number of pages that are shared (i.e. backed by a file) | ||
178 | trs number of pages that are 'code' (not including libs; broken, | ||
179 | includes data segment) | ||
180 | lrs number of pages of library (always 0 on 2.6) | ||
181 | drs number of pages of data/stack (including libs; broken, | ||
182 | includes library text) | ||
183 | dt number of dirty pages (always 0 on 2.6) | ||
184 | .............................................................................. | ||
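Because statm reports pages, not kB, a conversion is needed before its values can be compared with the status file. The sketch below assumes the seven fields appear in the order of Table 1-2; parse_statm and read_statm are names invented for this example.

```python
import os

# Field order as listed in Table 1-2.
STATM_FIELDS = ("size", "resident", "shared", "trs", "lrs", "drs", "dt")


def parse_statm(text: str, page_size: int) -> dict[str, int]:
    """Return the statm fields converted from pages to kB."""
    values = [int(v) for v in text.split()]
    return {name: pages * page_size // 1024
            for name, pages in zip(STATM_FIELDS, values)}


def read_statm(pid: str = "self") -> dict[str, int]:
    """Read statm for one process, using the page size of this system."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as f:
        return parse_statm(f.read(), page_size)
```

With 4096-byte pages, a (hypothetical) statm line starting with "278 87" corresponds to a size of 1112 kB and a resident size of 348 kB, the same figures as VmSize and VmRSS in the status example.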
185 | |||
186 | 1.2 Kernel data | ||
187 | --------------- | ||
188 | |||
189 | Similar to the process entries, the kernel data files give information about | ||
190 | the running kernel. The files used to obtain this information are contained in | ||
191 | /proc and are listed in Table 1-3. Not all of these will be present in your | ||
192 | system; which files exist depends on the kernel configuration and the loaded | ||
193 | modules. | ||
194 | |||
195 | Table 1-3: Kernel info in /proc | ||
196 | .............................................................................. | ||
197 | File Content | ||
198 | apm Advanced power management info | ||
199 | buddyinfo Kernel memory allocator information (see text) (2.5) | ||
200 | bus Directory containing bus specific information | ||
201 | cmdline Kernel command line | ||
202 | cpuinfo Info about the CPU | ||
203 | devices Available devices (block and character) | ||
204 | dma Used DMA channels | ||
205 | filesystems Supported filesystems | ||
206 | driver Various drivers grouped here, currently rtc (2.4) | ||
207 | execdomains Execdomains, related to security (2.4) | ||
208 | fb Frame Buffer devices (2.4) | ||
209 | fs File system parameters, currently nfs/exports (2.4) | ||
210 | ide Directory containing info about the IDE subsystem | ||
211 | interrupts Interrupt usage | ||
212 | iomem Memory map (2.4) | ||
213 | ioports I/O port usage | ||
214 | irq Masks for irq to cpu affinity (2.4)(smp?) | ||
215 | isapnp ISA PnP (Plug&Play) Info (2.4) | ||
216 | kcore Kernel core image (can be ELF or A.OUT(deprecated in 2.4)) | ||
217 | kmsg Kernel messages | ||
218 | ksyms Kernel symbol table | ||
219 | loadavg Load average of last 1, 5 & 15 minutes | ||
220 | locks Kernel locks | ||
221 | meminfo Memory info | ||
222 | misc Miscellaneous | ||
223 | modules List of loaded modules | ||
224 | mounts Mounted filesystems | ||
225 | net Networking info (see text) | ||
226 | partitions Table of partitions known to the system | ||
227 | pci Deprecated info of PCI bus (new way -> /proc/bus/pci/, | ||
228 | decoupled by lspci) (2.4) | ||
229 | rtc Real time clock | ||
230 | scsi SCSI info (see text) | ||
231 | slabinfo Slab pool info | ||
232 | stat Overall statistics | ||
233 | swaps Swap space utilization | ||
234 | sys See chapter 2 | ||
235 | sysvipc Info of SysVIPC Resources (msg, sem, shm) (2.4) | ||
236 | tty Info of tty drivers | ||
237 | uptime System uptime | ||
238 | version Kernel version | ||
239 | video bttv info of video resources (2.4) | ||
240 | .............................................................................. | ||
241 | |||
242 | You can, for example, check which interrupts are currently in use and what | ||
243 | they are used for by looking in the file /proc/interrupts: | ||
244 | |||
245 | > cat /proc/interrupts | ||
246 | CPU0 | ||
247 | 0: 8728810 XT-PIC timer | ||
248 | 1: 895 XT-PIC keyboard | ||
249 | 2: 0 XT-PIC cascade | ||
250 | 3: 531695 XT-PIC aha152x | ||
251 | 4: 2014133 XT-PIC serial | ||
252 | 5: 44401 XT-PIC pcnet_cs | ||
253 | 8: 2 XT-PIC rtc | ||
254 | 11: 8 XT-PIC i82365 | ||
255 | 12: 182918 XT-PIC PS/2 Mouse | ||
256 | 13: 1 XT-PIC fpu | ||
257 | 14: 1232265 XT-PIC ide0 | ||
258 | 15: 7 XT-PIC ide1 | ||
259 | NMI: 0 | ||
260 | |||
261 | In 2.4.* a couple of lines were added to this file, LOC and ERR (this time it | ||
262 | is the output of an SMP machine): | ||
263 | |||
264 | > cat /proc/interrupts | ||
265 | |||
266 | CPU0 CPU1 | ||
267 | 0: 1243498 1214548 IO-APIC-edge timer | ||
268 | 1: 8949 8958 IO-APIC-edge keyboard | ||
269 | 2: 0 0 XT-PIC cascade | ||
270 | 5: 11286 10161 IO-APIC-edge soundblaster | ||
271 | 8: 1 0 IO-APIC-edge rtc | ||
272 | 9: 27422 27407 IO-APIC-edge 3c503 | ||
273 | 12: 113645 113873 IO-APIC-edge PS/2 Mouse | ||
274 | 13: 0 0 XT-PIC fpu | ||
275 | 14: 22491 24012 IO-APIC-edge ide0 | ||
276 | 15: 2183 2415 IO-APIC-edge ide1 | ||
277 | 17: 30564 30414 IO-APIC-level eth0 | ||
278 | 18: 177 164 IO-APIC-level bttv | ||
279 | NMI: 2457961 2457959 | ||
280 | LOC: 2457882 2457881 | ||
281 | ERR: 2155 | ||
282 | |||
283 | NMI is incremented in this case because every timer interrupt generates a NMI | ||
284 | (Non Maskable Interrupt) which is used by the NMI Watchdog to detect lockups. | ||
285 | |||
286 | LOC is the local interrupt counter of the internal APIC of every CPU. | ||
287 | |||
288 | ERR is incremented in the case of errors in the IO-APIC bus (the bus that | ||
289 | connects the CPUs in an SMP system). This means that an error has been | ||
290 | detected; the IO-APIC automatically retries the transmission, so it should | ||
291 | not be a big problem, but you should read the SMP-FAQ. | ||
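To total the counters over all CPUs, the file can be parsed with a few lines of code. This is a sketch under the assumption that the first line names the CPUs and that every following line has the form "label: count ... description"; parse_interrupts is a made-up name.

```python
def parse_interrupts(text: str) -> dict[str, int]:
    """Return the total count per interrupt line, summed over all CPUs."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header line: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # blank or unexpected line
        counts = []
        for token in rest.split()[:ncpus]:
            if not token.isdigit():
                break  # reached the controller or device name
            counts.append(int(token))
        totals[label.strip()] = sum(counts)
    return totals
```

Lines such as ERR, which carry a single counter instead of one per CPU, are handled as well because only the leading numeric columns are summed.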
292 | |||
293 | In this context it could be interesting to note the new irq directory in 2.4. | ||
294 | It can be used to set IRQ to CPU affinity. This means that you can "hook" an | ||
295 | IRQ to only one CPU, or exclude a CPU from handling IRQs. The irq directory | ||
296 | contains one subdirectory for each IRQ and one file, prof_cpu_mask. | ||
297 | |||
298 | For example | ||
299 | > ls /proc/irq/ | ||
300 | 0 10 12 14 16 18 2 4 6 8 prof_cpu_mask | ||
301 | 1 11 13 15 17 19 3 5 7 9 | ||
302 | > ls /proc/irq/0/ | ||
303 | smp_affinity | ||
304 | |||
305 | The contents of the prof_cpu_mask file and of each IRQ's smp_affinity file | ||
306 | are the same by default: | ||
307 | |||
308 | > cat /proc/irq/0/smp_affinity | ||
309 | ffffffff | ||
310 | |||
311 | It's a bitmask in which you can specify which CPUs can handle the IRQ. You can | ||
312 | set it by doing: | ||
313 | |||
314 | > echo 1 > /proc/irq/0/smp_affinity | ||
315 | |||
316 | This means that only the first CPU will handle the IRQ, but you can also echo 5, | ||
317 | which means that only the first and third CPU can handle the IRQ. | ||
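The mask is a hexadecimal number in which bit N stands for CPU N, counting from zero. The sketch below converts in both directions; cpus_in_mask and mask_for_cpus are hypothetical helper names, and the comma handling is an assumption for systems that print long masks in comma-separated groups.

```python
def cpus_in_mask(mask: str) -> list[int]:
    """Decode a hexadecimal affinity mask into a list of CPU numbers."""
    value = int(mask.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def mask_for_cpus(cpus) -> str:
    """Build the hexadecimal mask that selects the given CPU numbers."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    return format(value, "x")
```

A mask of 1 selects CPU 0 only, while 5 (binary 101) selects CPU 0 and CPU 2. The default ffffffff selects CPUs 0 through 31.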
318 | |||
319 | The way IRQs are routed is handled by the IO-APIC, and it's Round Robin | ||
320 | between all the CPUs which are allowed to handle it. As usual the kernel has | ||
321 | more info than you and does a better job than you, so the defaults are the | ||
322 | best choice for almost everyone. | ||
323 | |||
324 | There are three more important subdirectories in /proc: net, scsi, and sys. | ||
325 | The general rule is that the contents, or even the existence of these | ||
326 | directories, depend on your kernel configuration. If SCSI is not enabled, the | ||
327 | directory scsi may not exist. The same is true of the net directory, which is there | ||
328 | only when networking support is present in the running kernel. | ||
329 | |||
330 | The slabinfo file gives information about memory usage at the slab level. | ||
331 | Linux uses slab pools for memory management above page level in version 2.2. | ||
332 | Commonly used objects have their own slab pool (such as network buffers, | ||
333 | directory cache, and so on). | ||
334 | |||
335 | .............................................................................. | ||
336 | |||
337 | > cat /proc/buddyinfo | ||
338 | |||
339 | Node 0, zone DMA 0 4 5 4 4 3 ... | ||
340 | Node 0, zone Normal 1 0 0 1 101 8 ... | ||
341 | Node 0, zone HighMem 2 0 0 1 1 0 ... | ||
342 | |||
343 | Memory fragmentation is a problem under some workloads, and buddyinfo is a | ||
344 | useful tool for helping diagnose these problems. Buddyinfo will give you a | ||
345 | clue as to how big an area you can safely allocate, or why a previous | ||
346 | allocation failed. | ||
347 | |||
348 | Each column represents the number of pages of a certain order which are | ||
349 | available. In this case, there are 0 chunks of 2^0*PAGE_SIZE available in | ||
350 | ZONE_DMA, 4 chunks of 2^1*PAGE_SIZE in ZONE_DMA, 101 chunks of 2^4*PAGE_SIZE | ||
351 | available in ZONE_NORMAL, etc... | ||
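The arithmetic can be sketched in code: column i counts free chunks of 2^i pages, so the number of free pages in a zone is the sum of count * 2^i over all columns. The helper names below are invented for this example, and the sample line in the usage note contains only the six columns visible above because the remaining ones are elided.

```python
def parse_buddyinfo_line(line: str):
    """Return (node, zone, counts) for one line of /proc/buddyinfo."""
    fields = line.replace(",", "").split()
    # fields: ["Node", "<node>", "zone", "<zone name>", count0, count1, ...]
    node = int(fields[1])
    zone = fields[3]
    counts = [int(n) for n in fields[4:]]
    return node, zone, counts


def free_pages(counts) -> int:
    """Total free pages: column i holds chunks of 2**i contiguous pages."""
    return sum(n << order for order, n in enumerate(counts))


def largest_free_order(counts) -> int:
    """Highest order with at least one free chunk, or -1 if none."""
    orders = [order for order, n in enumerate(counts) if n > 0]
    return max(orders) if orders else -1
```

For the line "Node 0, zone DMA 0 4 5 4 4 3", free_pages gives 0*1 + 4*2 + 5*4 + 4*8 + 4*16 + 3*32 = 220 pages, counting only those six columns.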
352 | |||
353 | .............................................................................. | ||
354 | |||
355 | meminfo: | ||
356 | |||
357 | Provides information about distribution and utilization of memory. This | ||
358 | varies by architecture and compile options. The following is from a | ||
359 | 16GB PIII, which has highmem enabled. You may not have all of these fields. | ||
360 | |||
361 | > cat /proc/meminfo | ||
362 | |||
363 | |||
364 | MemTotal: 16344972 kB | ||
365 | MemFree: 13634064 kB | ||
366 | Buffers: 3656 kB | ||
367 | Cached: 1195708 kB | ||
368 | SwapCached: 0 kB | ||
369 | Active: 891636 kB | ||
370 | Inactive: 1077224 kB | ||
371 | HighTotal: 15597528 kB | ||
372 | HighFree: 13629632 kB | ||
373 | LowTotal: 747444 kB | ||
374 | LowFree: 4432 kB | ||
375 | SwapTotal: 0 kB | ||
376 | SwapFree: 0 kB | ||
377 | Dirty: 968 kB | ||
378 | Writeback: 0 kB | ||
379 | Mapped: 280372 kB | ||
380 | Slab: 684068 kB | ||
381 | CommitLimit: 7669796 kB | ||
382 | Committed_AS: 100056 kB | ||
383 | PageTables: 24448 kB | ||
384 | VmallocTotal: 112216 kB | ||
385 | VmallocUsed: 428 kB | ||
386 | VmallocChunk: 111088 kB | ||
387 | |||
388 | MemTotal: Total usable ram (i.e. physical ram minus a few reserved | ||
389 | bits and the kernel binary code) | ||
390 | MemFree: The sum of LowFree+HighFree | ||
391 | Buffers: Relatively temporary storage for raw disk blocks; it | ||
392 | shouldn't get tremendously large (20MB or so) | ||
393 | Cached: in-memory cache for files read from the disk (the | ||
394 | pagecache). Doesn't include SwapCached | ||
395 | SwapCached: Memory that once was swapped out, is swapped back in but | ||
396 | still also is in the swapfile (if memory is needed it | ||
397 | doesn't need to be swapped out AGAIN because it is already | ||
398 | in the swapfile. This saves I/O) | ||
399 | Active: Memory that has been used more recently and usually not | ||
400 | reclaimed unless absolutely necessary. | ||
401 | Inactive: Memory which has been less recently used. It is more | ||
402 | eligible to be reclaimed for other purposes | ||
403 | HighTotal: | ||
404 | HighFree: Highmem is all memory above ~860MB of physical memory | ||
405 | Highmem areas are for use by userspace programs, or | ||
406 | for the pagecache. The kernel must use tricks to access | ||
407 | this memory, making it slower to access than lowmem. | ||
408 | LowTotal: | ||
409 | LowFree: Lowmem is memory which can be used for everything that | ||
410 | highmem can be used for, but it is also available for the | ||
411 | kernel's use for its own data structures. Among many | ||
412 | other things, it is where everything from the Slab is | ||
413 | allocated. Bad things happen when you're out of lowmem. | ||
414 | SwapTotal: total amount of swap space available | ||
415 | SwapFree: Amount of swap space that is currently unused, i.e. not | ||
416 | occupied by memory evicted from RAM | ||
417 | Dirty: Memory which is waiting to get written back to the disk | ||
418 | Writeback: Memory which is actively being written back to the disk | ||
419 | Mapped: files which have been mmaped, such as libraries | ||
420 | Slab: in-kernel data structures cache | ||
421 | CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'), | ||
422 | this is the total amount of memory currently available to | ||
423 | be allocated on the system. This limit is only adhered to | ||
424 | if strict overcommit accounting is enabled (mode 2 in | ||
425 | 'vm.overcommit_memory'). | ||
426 | The CommitLimit is calculated with the following formula: | ||
427 | CommitLimit = ('vm.overcommit_ratio' / 100 * Physical RAM) + Swap | ||
428 | For example, on a system with 1G of physical RAM and 7G | ||
429 | of swap with a 'vm.overcommit_ratio' of 30 it would | ||
430 | yield a CommitLimit of 7.3G. | ||
431 | For more details, see the memory overcommit documentation | ||
432 | in vm/overcommit-accounting. | ||
433 | Committed_AS: The amount of memory presently allocated on the system. | ||
434 | The committed memory is a sum of all of the memory which | ||
435 | has been allocated by processes, even if it has not been | ||
436 | "used" by them as of yet. A process which malloc()'s 1G | ||
437 | of memory, but only touches 300M of it will only show up | ||
438 | as using 300M of memory even if it has the address space | ||
439 | allocated for the entire 1G. This 1G is memory which has | ||
440 | been "committed" to by the VM and can be used at any time | ||
441 | by the allocating application. With strict overcommit | ||
442 | enabled on the system (mode 2 in 'vm.overcommit_memory'), | ||
443 | allocations which would exceed the CommitLimit (detailed | ||
444 | above) will not be permitted. This is useful if one needs | ||
445 | to guarantee that processes will not fail due to lack of | ||
446 | memory once that memory has been successfully allocated. | ||
447 | PageTables: amount of memory dedicated to the lowest level of page | ||
448 | tables. | ||
449 | VmallocTotal: total size of vmalloc memory area | ||
450 | VmallocUsed: amount of vmalloc area which is used | ||
451 | VmallocChunk: largest contiguous block of vmalloc area which is free | ||
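Two of the relations described above can be checked with a short script: MemFree as the sum of LowFree and HighFree, and the CommitLimit calculation. This is a sketch only; parse_meminfo and commit_limit are made-up names, and commit_limit assumes that vm.overcommit_ratio is a percentage of physical RAM, as in the 1G/7G example.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Map each field of /proc/meminfo to its number (kB for the fields above)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            info[key.strip()] = int(rest.split()[0])
    return info


def commit_limit(ram: float, swap: float, overcommit_ratio: int) -> float:
    """CommitLimit in the same unit as ram and swap.

    overcommit_ratio is treated as a percentage of physical RAM.
    """
    return ram * overcommit_ratio / 100 + swap
```

With the figures from the example, commit_limit(1.0, 7.0, 30) gives approximately 7.3 (in G). On the sample output above, MemFree (13634064 kB) equals LowFree (4432 kB) plus HighFree (13629632 kB).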
452 | |||
453 | |||
454 | 1.3 IDE devices in /proc/ide | ||
455 | ---------------------------- | ||
456 | |||
457 | The subdirectory /proc/ide contains information about all IDE devices of which | ||
458 | the kernel is aware. There is one subdirectory for each IDE controller, the | ||
459 | file drivers and a link for each IDE device, pointing to the device directory | ||
460 | in the controller specific subtree. | ||
461 | |||
462 | The file drivers contains general information about the drivers used for the | ||
463 | IDE devices: | ||
464 | |||
465 | > cat /proc/ide/drivers | ||
466 | ide-cdrom version 4.53 | ||
467 | ide-disk version 1.08 | ||
468 | |||
469 | More detailed information can be found in the controller specific | ||
470 | subdirectories. These are named ide0, ide1 and so on. Each of these | ||
471 | directories contains the files shown in table 1-4. | ||
472 | |||
473 | |||
474 | Table 1-4: IDE controller info in /proc/ide/ide? | ||
475 | .............................................................................. | ||
476 | File Content | ||
477 | channel IDE channel (0 or 1) | ||
478 | config Configuration (only for PCI/IDE bridge) | ||
479 | mate Mate name | ||
480 | model Type/Chipset of IDE controller | ||
481 | .............................................................................. | ||
482 | |||
483 | Each device connected to a controller has a separate subdirectory in the | ||
484 | controller's directory. The files listed in table 1-5 are contained in these | ||
485 | directories. | ||
486 | |||
487 | |||
488 | Table 1-5: IDE device information | ||
489 | .............................................................................. | ||
490 | File Content | ||
491 | cache The cache | ||
492 | capacity Capacity of the medium (in 512Byte blocks) | ||
493 | driver driver and version | ||
494 | geometry physical and logical geometry | ||
495 | identify device identify block | ||
496 | media media type | ||
497 | model device identifier | ||
498 | settings device setup | ||
499 | smart_thresholds IDE disk management thresholds | ||
500 | smart_values IDE disk management values | ||
501 | .............................................................................. | ||
502 | |||
503 | The most interesting file is settings. This file contains a nice overview of | ||
504 | the drive parameters: | ||
505 | |||
506 | # cat /proc/ide/ide0/hda/settings | ||
507 | name value min max mode | ||
508 | ---- ----- --- --- ---- | ||
509 | bios_cyl 526 0 65535 rw | ||
510 | bios_head 255 0 255 rw | ||
511 | bios_sect 63 0 63 rw | ||
512 | breada_readahead 4 0 127 rw | ||
513 | bswap 0 0 1 r | ||
514 | file_readahead 72 0 2097151 rw | ||
515 | io_32bit 0 0 3 rw | ||
516 | keepsettings 0 0 1 rw | ||
517 | max_kb_per_request 122 1 127 rw | ||
518 | multcount 0 0 8 rw | ||
519 | nice1 1 0 1 rw | ||
520 | nowerr 0 0 1 rw | ||
521 | pio_mode write-only 0 255 w | ||
522 | slow 0 0 1 rw | ||
523 | unmaskirq 0 0 1 rw | ||
524 | using_dma 0 0 1 rw | ||
525 | |||
526 | |||
527 | 1.4 Networking info in /proc/net | ||
528 | -------------------------------- | ||
529 | |||
530 | The subdirectory /proc/net follows the usual pattern. Table 1-6 shows the | ||
531 | additional values you get for IP version 6 if you configure the kernel to | ||
532 | support this. Table 1-7 lists the files and their meaning. | ||
533 | |||
534 | |||
535 | Table 1-6: IPv6 info in /proc/net | ||
536 | .............................................................................. | ||
537 | File Content | ||
538 | udp6 UDP sockets (IPv6) | ||
539 | tcp6 TCP sockets (IPv6) | ||
540 | raw6 Raw device statistics (IPv6) | ||
541 | igmp6 IP multicast addresses, which this host joined (IPv6) | ||
542 | if_inet6 List of IPv6 interface addresses | ||
543 | ipv6_route Kernel routing table for IPv6 | ||
544 | rt6_stats Global IPv6 routing tables statistics | ||
545 | sockstat6 Socket statistics (IPv6) | ||
546 | snmp6 Snmp data (IPv6) | ||
547 | .............................................................................. | ||
548 | |||
549 | |||
550 | Table 1-7: Network info in /proc/net | ||
551 | .............................................................................. | ||
552 | File Content | ||
553 | arp Kernel ARP table | ||
554 | dev network devices with statistics | ||
555 | dev_mcast the Layer2 multicast groups a device is listening to | ||
556 | (interface index, label, number of references, number of bound | ||
557 | addresses). | ||
558 | dev_stat network device status | ||
559 | ip_fwchains Firewall chain linkage | ||
560 | ip_fwnames Firewall chain names | ||
561 | ip_masq Directory containing the masquerading tables | ||
562 | ip_masquerade Major masquerading table | ||
563 | netstat Network statistics | ||
564 | raw raw device statistics | ||
565 | route Kernel routing table | ||
566 | rpc Directory containing rpc info | ||
567 | rt_cache Routing cache | ||
568 | snmp SNMP data | ||
569 | sockstat Socket statistics | ||
570 | tcp TCP sockets | ||
571 | tr_rif Token ring RIF routing table | ||
572 | udp UDP sockets | ||
573 | unix UNIX domain sockets | ||
574 | wireless Wireless interface data (Wavelan etc) | ||
575 | igmp IP multicast addresses, which this host joined | ||
576 | psched Global packet scheduler parameters. | ||
577 | netlink List of PF_NETLINK sockets | ||
578 | ip_mr_vifs List of multicast virtual interfaces | ||
579 | ip_mr_cache List of multicast routing cache | ||
580 | .............................................................................. | ||
581 | |||
582 | You can use this information to see which network devices are available in | ||
583 | your system and how much traffic was routed over those devices: | ||
584 | |||
585 | > cat /proc/net/dev | ||
586 | Inter-|Receive |[... | ||
587 | face |bytes packets errs drop fifo frame compressed multicast|[... | ||
588 | lo: 908188 5596 0 0 0 0 0 0 [... | ||
589 | ppp0:15475140 20721 410 0 0 410 0 0 [... | ||
590 | eth0: 614530 7085 0 0 0 0 0 1 [... | ||
591 | |||
592 | ...] Transmit | ||
593 | ...] bytes packets errs drop fifo colls carrier compressed | ||
594 | ...] 908188 5596 0 0 0 0 0 0 | ||
595 | ...] 1375103 17405 0 0 0 0 0 0 | ||
596 | ...] 1703981 5535 0 0 0 3 0 0 | ||
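The listing above is split into two halves to fit the page; in the file itself each interface occupies a single line. The sketch below extracts byte and packet counters per interface, assuming two header lines followed by eight receive and eight transmit columns as shown here. parse_net_dev is a hypothetical name.

```python
def parse_net_dev(text: str) -> dict[str, dict[str, int]]:
    """Return receive/transmit byte and packet counters per interface."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        values = [int(v) for v in rest.split()]
        # Eight receive columns come first, then eight transmit columns.
        stats[name.strip()] = {
            "rx_bytes": values[0],
            "rx_packets": values[1],
            "tx_bytes": values[8],
            "tx_packets": values[9],
        }
    return stats
```

Splitting on the colon instead of on whitespace matters here: a large counter can follow the interface name without a space in between, as in the ppp0 line.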
597 | |||
598 | In addition, each Channel Bond interface has its own directory. For | ||
599 | example, the bond0 device will have a directory called /proc/net/bond0/. | ||
600 | It will contain information that is specific to that bond, such as the | ||
601 | current slaves of the bond, the link status of the slaves, and how | ||
602 | many times each slave's link has failed. | ||
603 | |||
604 | 1.5 SCSI info | ||
605 | ------------- | ||
606 | |||
607 | If you have a SCSI host adapter in your system, you'll find a subdirectory | ||
608 | named after the driver for this adapter in /proc/scsi. You'll also see a list | ||
609 | of all recognized SCSI devices in /proc/scsi/scsi: | ||
610 | |||
611 | > cat /proc/scsi/scsi | ||
612 | Attached devices: | ||
613 | Host: scsi0 Channel: 00 Id: 00 Lun: 00 | ||
614 | Vendor: IBM Model: DGHS09U Rev: 03E0 | ||
615 | Type: Direct-Access ANSI SCSI revision: 03 | ||
616 | Host: scsi0 Channel: 00 Id: 06 Lun: 00 | ||
617 | Vendor: PIONEER Model: CD-ROM DR-U06S Rev: 1.04 | ||
618 | Type: CD-ROM ANSI SCSI revision: 02 | ||
619 | |||
620 | |||
621 | The directory named after the driver has one file for each adapter found in | ||
622 | the system. These files contain information about the controller, including | ||
623 | the IRQ used and the IO address range. The amount of information shown is | ||
624 | dependent on the adapter you use. The example shows the output for an Adaptec | ||
625 | AHA-2940 SCSI adapter: | ||
626 | |||
627 | > cat /proc/scsi/aic7xxx/0 | ||
628 | |||
629 | Adaptec AIC7xxx driver version: 5.1.19/3.2.4 | ||
630 | Compile Options: | ||
631 | TCQ Enabled By Default : Disabled | ||
632 | AIC7XXX_PROC_STATS : Disabled | ||
633 | AIC7XXX_RESET_DELAY : 5 | ||
634 | Adapter Configuration: | ||
635 | SCSI Adapter: Adaptec AHA-294X Ultra SCSI host adapter | ||
636 | Ultra Wide Controller | ||
637 | PCI MMAPed I/O Base: 0xeb001000 | ||
638 | Adapter SEEPROM Config: SEEPROM found and used. | ||
639 | Adaptec SCSI BIOS: Enabled | ||
640 | IRQ: 10 | ||
641 | SCBs: Active 0, Max Active 2, | ||
642 | Allocated 15, HW 16, Page 255 | ||
643 | Interrupts: 160328 | ||
644 | BIOS Control Word: 0x18b6 | ||
645 | Adapter Control Word: 0x005b | ||
646 | Extended Translation: Enabled | ||
647 | Disconnect Enable Flags: 0xffff | ||
648 | Ultra Enable Flags: 0x0001 | ||
649 | Tag Queue Enable Flags: 0x0000 | ||
650 | Ordered Queue Tag Flags: 0x0000 | ||
651 | Default Tag Queue Depth: 8 | ||
652 | Tagged Queue By Device array for aic7xxx host instance 0: | ||
653 | {255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255} | ||
654 | Actual queue depth per device for aic7xxx host instance 0: | ||
655 | {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1} | ||
656 | Statistics: | ||
657 | (scsi0:0:0:0) | ||
658 | Device using Wide/Sync transfers at 40.0 MByte/sec, offset 8 | ||
659 | Transinfo settings: current(12/8/1/0), goal(12/8/1/0), user(12/15/1/0) | ||
660 | Total transfers 160151 (74577 reads and 85574 writes) | ||
661 | (scsi0:0:6:0) | ||
662 | Device using Narrow/Sync transfers at 5.0 MByte/sec, offset 15 | ||
663 | Transinfo settings: current(50/15/0/0), goal(50/15/0/0), user(50/15/0/0) | ||
664 | Total transfers 0 (0 reads and 0 writes) | ||
665 | |||
666 | |||
667 | 1.6 Parallel port info in /proc/parport | ||
668 | --------------------------------------- | ||
669 | |||
670 | The directory /proc/parport contains information about the parallel ports of | ||
671 | your system. It has one subdirectory for each port, named after the port | ||
672 | number (0,1,2,...). | ||
673 | |||
674 | These directories contain the four files shown in Table 1-8. | ||
675 | |||
676 | |||
677 | Table 1-8: Files in /proc/parport | ||
678 | .............................................................................. | ||
679 | File       Content | ||
680 | autoprobe  Any IEEE-1284 device ID information that has been acquired. | ||
681 | devices    List of the device drivers using that port. A + will appear by the | ||
682 |            name of the device currently using the port (it might not appear | ||
683 |            against any). | ||
684 | hardware   Parallel port's base address, IRQ line and DMA channel. | ||
685 | irq        IRQ that parport is using for that port. This is in a separate | ||
686 |            file to allow you to alter it by writing a new value in (IRQ | ||
687 |            number or none). | ||
688 | .............................................................................. | ||
689 | |||
690 | 1.7 TTY info in /proc/tty | ||
691 | ------------------------- | ||
692 | |||
693 | Information about the available and actually used tty's can be found in the | ||
694 | directory /proc/tty. You'll find entries for drivers and line disciplines in | ||
695 | this directory, as shown in Table 1-9. | ||
696 | |||
697 | |||
698 | Table 1-9: Files in /proc/tty | ||
699 | .............................................................................. | ||
700 | File           Content | ||
701 | drivers        List of drivers and their usage | ||
702 | ldiscs         Registered line disciplines | ||
703 | driver/serial  Usage statistics and status of single tty lines | ||
704 | .............................................................................. | ||
705 | |||
706 | To see which tty's are currently in use, you can simply look into the file | ||
707 | /proc/tty/drivers: | ||
708 | |||
709 | > cat /proc/tty/drivers | ||
710 | pty_slave /dev/pts 136 0-255 pty:slave | ||
711 | pty_master /dev/ptm 128 0-255 pty:master | ||
712 | pty_slave /dev/ttyp 3 0-255 pty:slave | ||
713 | pty_master /dev/pty 2 0-255 pty:master | ||
714 | serial /dev/cua 5 64-67 serial:callout | ||
715 | serial /dev/ttyS 4 64-67 serial | ||
716 | /dev/tty0 /dev/tty0 4 0 system:vtmaster | ||
717 | /dev/ptmx /dev/ptmx 5 2 system | ||
718 | /dev/console /dev/console 5 1 system:console | ||
719 | /dev/tty /dev/tty 5 0 system:/dev/tty | ||
720 | unknown /dev/tty 4 1-63 console | ||
721 | |||
722 | |||
723 | 1.8 Miscellaneous kernel statistics in /proc/stat | ||
724 | ------------------------------------------------- | ||
725 | |||
726 | Various pieces of information about kernel activity are available in the | ||
727 | /proc/stat file. All of the numbers reported in this file are aggregates | ||
728 | since the system first booted. For a quick look, simply cat the file: | ||
729 | |||
730 | > cat /proc/stat | ||
731 | cpu 2255 34 2290 22625563 6290 127 456 | ||
732 | cpu0 1132 34 1441 11311718 3675 127 438 | ||
733 | cpu1 1123 0 849 11313845 2614 0 18 | ||
734 | intr 114930548 113199788 3 0 5 263 0 4 [... lots more numbers ...] | ||
735 | ctxt 1990473 | ||
736 | btime 1062191376 | ||
737 | processes 2915 | ||
738 | procs_running 1 | ||
739 | procs_blocked 0 | ||
740 | |||
741 | The very first "cpu" line aggregates the numbers in all of the other "cpuN" | ||
742 | lines. These numbers identify the amount of time the CPU has spent performing | ||
743 | different kinds of work. Time units are in USER_HZ (typically hundredths of a | ||
744 | second). The meanings of the columns are as follows, from left to right: | ||
745 | |||
746 | - user: normal processes executing in user mode | ||
747 | - nice: niced processes executing in user mode | ||
748 | - system: processes executing in kernel mode | ||
749 | - idle: twiddling thumbs | ||
750 | - iowait: waiting for I/O to complete | ||
751 | - irq: servicing interrupts | ||
752 | - softirq: servicing softirqs | ||
753 | |||
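As a rough illustration of how the columns above can be combined, the sketch
below sums all counters on the aggregate "cpu" line and reports the share that
was not idle. The function name cpu_busy_pct is made up for this example, and
counting iowait as busy time is an assumption of the sketch. Because the
counters are totals since boot, the result is an average over the whole uptime.

```shell
# Print the percentage of time the CPUs were not idle since boot, taken
# from the aggregate "cpu" line of /proc/stat-formatted text on stdin.
# cpu_busy_pct is a hypothetical helper name, not an existing tool.
cpu_busy_pct() {
    awk '$1 == "cpu" {
        total = 0
        for (i = 2; i <= NF; i++) { total += $i }   # sum every time counter
        idle = $5                                   # fourth counter: idle
        if (total > 0) { printf "%.2f\n", (total - idle) * 100 / total }
        exit
    }'
}

# Usage on a live system:
#   cpu_busy_pct < /proc/stat
```

To get the load over a shorter interval, one would read the file twice and
apply the same arithmetic to the differences between the two samples.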
754 | The "intr" line gives counts of interrupts serviced since boot time, for each | ||
755 | of the possible system interrupts. The first column is the total of all | ||
756 | interrupts serviced; each subsequent column is the total for that particular | ||
757 | interrupt. | ||
758 | |||
759 | The "ctxt" line gives the total number of context switches across all CPUs. | ||
760 | |||
761 | The "btime" line gives the time at which the system booted, in seconds since | ||
762 | the Unix epoch. | ||
763 | |||
764 | The "processes" line gives the number of processes and threads created, which | ||
765 | includes (but is not limited to) those created by calls to the fork() and | ||
766 | clone() system calls. | ||
767 | |||
768 | The "procs_running" line gives the number of processes currently running on | ||
769 | CPUs. | ||
770 | |||
771 | The "procs_blocked" line gives the number of processes currently blocked, | ||
772 | waiting for I/O to complete. | ||
773 | |||
774 | |||
775 | ------------------------------------------------------------------------------ | ||
776 | Summary | ||
777 | ------------------------------------------------------------------------------ | ||
778 | The /proc file system serves information about the running system. It not only | ||
779 | allows access to process data but also allows you to request the kernel status | ||
780 | by reading files in the hierarchy. | ||
781 | |||
782 | The directory structure of /proc reflects the types of information and makes | ||
783 | it easy, if not obvious, to find where to look for specific data. | ||
784 | ------------------------------------------------------------------------------ | ||
785 | |||
786 | ------------------------------------------------------------------------------ | ||
787 | CHAPTER 2: MODIFYING SYSTEM PARAMETERS | ||
788 | ------------------------------------------------------------------------------ | ||
789 | |||
790 | ------------------------------------------------------------------------------ | ||
791 | In This Chapter | ||
792 | ------------------------------------------------------------------------------ | ||
793 | * Modifying kernel parameters by writing into files found in /proc/sys | ||
794 | * Exploring the files which modify certain parameters | ||
795 | * Review of the /proc/sys file tree | ||
796 | ------------------------------------------------------------------------------ | ||
797 | |||
798 | |||
799 | A very interesting part of /proc is the directory /proc/sys. This is not only | ||
800 | a source of information, it also allows you to change parameters within the | ||
801 | kernel. Be very careful when attempting this. You can optimize your system, | ||
802 | but you can also cause it to crash. Never alter kernel parameters on a | ||
803 | production system. Set up a development machine and test to make sure that | ||
804 | everything works the way you want it to. You may have no alternative but to | ||
805 | reboot the machine once an error has been made. | ||
806 | |||
807 | To change a value, simply echo the new value into the file. An example is | ||
808 | given below in the section on the file system data. You need to be root to do | ||
809 | this. You can create your own boot script to perform this every time your | ||
810 | system boots. | ||
811 | |||
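A boot script of the kind mentioned above might wrap the echo in a small
helper that refuses to write to a missing or read-only file and checks the
result. This is only a sketch; set_param is a hypothetical name, and some
files report their contents in a different form than what was written, in
which case the read-back check would need adjusting.

```shell
# set_param FILE VALUE
# Write VALUE into FILE and check that FILE reads back the same value.
# set_param is a hypothetical helper name used only for this sketch.
set_param() {
    file=$1
    value=$2
    if [ ! -w "$file" ]; then
        echo "set_param: cannot write $file" >&2
        return 1
    fi
    echo "$value" > "$file" || return 1
    [ "$(cat "$file")" = "$value" ]
}

# Possible use in a boot script (run as root):
#   set_param /proc/sys/fs/file-max 8192
```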
812 | The files in /proc/sys can be used to fine tune and monitor miscellaneous and | ||
813 | general things in the operation of the Linux kernel. Since some of the files | ||
814 | can inadvertently disrupt your system, it is advisable to read both | ||
815 | documentation and source before actually making adjustments. In any case, be | ||
816 | very careful when writing to any of these files. The entries in /proc may | ||
817 | change slightly between the 2.1.* and the 2.2 kernel, so if there is any doubt | ||
818 | review the kernel documentation in the directory /usr/src/linux/Documentation. | ||
819 | This chapter is heavily based on the documentation included in the pre 2.2 | ||
820 | kernels, and became part of it in version 2.2.1 of the Linux kernel. | ||
821 | |||
822 | 2.1 /proc/sys/fs - File system data | ||
823 | ----------------------------------- | ||
824 | |||
825 | This subdirectory contains specific file system, file handle, inode, dentry | ||
826 | and quota information. | ||
827 | |||
828 | Currently, these files are in /proc/sys/fs: | ||
829 | |||
830 | dentry-state | ||
831 | ------------ | ||
832 | |||
833 | Status of the directory cache. Since directory entries are dynamically | ||
834 | allocated and deallocated, this file indicates the current status. It holds | ||
835 | six values, of which the last two are not used and are always zero. The others | ||
836 | are listed in table 2-1. | ||
837 | |||
838 | |||
839 | Table 2-1: Status files of the directory cache | ||
840 | .............................................................................. | ||
841 | Value      Content | ||
842 | nr_dentry  Almost always zero | ||
843 | nr_unused  Number of unused cache entries | ||
844 | age_limit  Age in seconds after which the entry may be reclaimed, when memory | ||
845 |            is short | ||
846 | want_pages Used internally | ||
847 | .............................................................................. | ||
848 | |||
849 | dquot-nr and dquot-max | ||
850 | ---------------------- | ||
851 | |||
852 | The file dquot-max shows the maximum number of cached disk quota entries. | ||
853 | |||
854 | The file dquot-nr shows the number of allocated disk quota entries and the | ||
855 | number of free disk quota entries. | ||
856 | |||
857 | If the number of available cached disk quotas is very low and you have a large | ||
858 | number of simultaneous system users, you might want to raise the limit. | ||
859 | |||
860 | file-nr and file-max | ||
861 | -------------------- | ||
862 | |||
863 | The kernel allocates file handles dynamically, but doesn't free them again at | ||
864 | this time. | ||
865 | |||
866 | The value in file-max denotes the maximum number of file handles that the | ||
867 | Linux kernel will allocate. When you get a lot of error messages about running | ||
868 | out of file handles, you might want to raise this limit. The default value is | ||
869 | 10% of RAM in kilobytes. To change it, just write the new number into the | ||
870 | file: | ||
871 | |||
872 | # cat /proc/sys/fs/file-max | ||
873 | 4096 | ||
874 | # echo 8192 > /proc/sys/fs/file-max | ||
875 | # cat /proc/sys/fs/file-max | ||
876 | 8192 | ||
877 | |||
878 | |||
879 | This method of revision is useful for all customizable parameters of the | ||
880 | kernel - simply echo the new value to the corresponding file. | ||
881 | |||
882 | Historically, the three values in file-nr denoted the number of allocated file | ||
883 | handles, the number of allocated but unused file handles, and the maximum | ||
884 | number of file handles. Linux 2.6 always reports 0 as the number of free file | ||
885 | handles -- this is not an error, it just means that the number of allocated | ||
886 | file handles exactly matches the number of used file handles. | ||
887 | |||
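To judge how close the system is to the file-max limit, the three file-nr
values can be combined as sketched below. The helper name file_nr_usage and
the output layout (handles in use, maximum, percentage in use) are choices
made for this example only.

```shell
# Read the three file-nr fields (allocated, free, maximum) from stdin and
# print: handles in use, the maximum, and the percentage of the maximum
# that is in use. file_nr_usage is a hypothetical helper name.
file_nr_usage() {
    awk 'NF >= 3 && $3 > 0 {
        used = $1 - $2
        printf "%d %d %.1f\n", used, $3, used * 100 / $3
        exit
    }'
}

# Usage on a live system:
#   file_nr_usage < /proc/sys/fs/file-nr
```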
888 | Attempts to allocate more file descriptors than file-max are reported with | ||
889 | printk; look for "VFS: file-max limit <number> reached" in the kernel log. | ||
890 | |||
891 | inode-state and inode-nr | ||
892 | ------------------------ | ||
893 | |||
894 | The file inode-nr contains the first two items from inode-state, so we'll skip | ||
895 | to that file... | ||
896 | |||
897 | inode-state contains two actual numbers and five dummy values. The numbers | ||
898 | are nr_inodes and nr_free_inodes (in order of appearance). | ||
899 | |||
900 | nr_inodes | ||
901 | ~~~~~~~~~ | ||
902 | |||
903 | Denotes the number of inodes the system has allocated. This number will | ||
904 | grow and shrink dynamically. | ||
905 | |||
906 | nr_free_inodes | ||
907 | ~~~~~~~~~~~~~~ | ||
908 | |||
909 | Represents the number of free inodes, i.e. the number of in-use inodes is | ||
910 | (nr_inodes - nr_free_inodes). | ||
911 | |||
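The subtraction above can be done directly on the contents of inode-state.
The following is a minimal sketch; inodes_in_use is a hypothetical helper
name.

```shell
# Read inode-state (nr_inodes, nr_free_inodes, then dummy values) from
# stdin and print the number of inodes in use.
# inodes_in_use is a hypothetical helper name.
inodes_in_use() {
    awk 'NF >= 2 { print $1 - $2; exit }'
}

# Usage on a live system:
#   inodes_in_use < /proc/sys/fs/inode-state
```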
912 | super-nr and super-max | ||
913 | ---------------------- | ||
914 | |||
915 | Again, super block structures are allocated by the kernel, but not freed. The | ||
916 | file super-max contains the maximum number of super block handlers, while | ||
917 | super-nr shows the number of currently allocated ones. | ||
918 | |||
919 | Every mounted file system needs a super block, so if you plan to mount lots of | ||
920 | file systems, you may want to increase these numbers. | ||
921 | |||
922 | aio-nr and aio-max-nr | ||
923 | --------------------- | ||
924 | |||
925 | aio-nr is the running total of the number of events specified on the | ||
926 | io_setup system call for all currently active aio contexts. If aio-nr | ||
927 | reaches aio-max-nr then io_setup will fail with EAGAIN. Note that | ||
928 | raising aio-max-nr does not result in the pre-allocation or re-sizing | ||
929 | of any kernel data structures. | ||
930 | |||
931 | 2.2 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats | ||
932 | ----------------------------------------------------------- | ||
933 | |||
934 | Besides these files, there is the subdirectory /proc/sys/fs/binfmt_misc. This | ||
935 | handles the kernel support for miscellaneous binary formats. | ||
936 | |||
937 | Binfmt_misc provides the ability to register additional binary formats with the | ||
938 | kernel without compiling an additional module/kernel. Therefore, binfmt_misc | ||
939 | needs to know magic numbers at the beginning or the filename extension of the | ||
940 | binary. | ||
941 | |||
942 | It works by maintaining a linked list of structs that contain a description of | ||
943 | a binary format, including a magic with size (or the filename extension), | ||
944 | offset and mask, and the interpreter name. On request it invokes the given | ||
945 | interpreter with the original program as argument, as binfmt_java and | ||
946 | binfmt_em86 and binfmt_mz do. Since binfmt_misc does not define any default | ||
947 | binary-formats, you have to register an additional binary-format. | ||
948 | |||
949 | There are two general files in binfmt_misc and one file per registered format. | ||
950 | The two general files are register and status. | ||
951 | |||
952 | Registering a new binary format | ||
953 | ------------------------------- | ||
954 | |||
955 | To register a new binary format you have to issue the command | ||
956 | |||
957 | echo :name:type:offset:magic:mask:interpreter: > /proc/sys/fs/binfmt_misc/register | ||
958 | |||
959 | |||
960 | |||
961 | with appropriate name (the name for the /proc-dir entry), offset (defaults to | ||
962 | 0, if omitted), magic, mask (which can be omitted, defaults to all 0xff) and | ||
963 | last but not least, the interpreter that is to be invoked (for example, | ||
964 | /bin/echo for testing). Type can be M for usual magic matching or E for filename | ||
965 | extension matching (give extension in place of magic). | ||
966 | |||
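Since the registration string is easy to mistype, it can help to build it from
its separate fields. The sketch below only assembles and prints the string; it
does not write to the register file. The function name binfmt_entry is made up
for this example, and empty arguments stand for omitted fields.

```shell
# binfmt_entry NAME TYPE OFFSET MAGIC MASK INTERPRETER
# Print a registration string in the :name:type:offset:magic:mask:interpreter:
# layout. Pass '' for a field that should be omitted.
# binfmt_entry is a hypothetical helper name.
binfmt_entry() {
    # %s keeps backslash sequences such as \xca literal, so the kernel
    # (not the shell) gets to interpret them.
    printf ':%s:%s:%s:%s:%s:%s:\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Possible use (run as root):
#   binfmt_entry HTML E '' html '' /usr/local/java/bin/appletviewer \
#       > /proc/sys/fs/binfmt_misc/register
```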
967 | Check or reset the status of the binary format handler | ||
968 | ------------------------------------------------------ | ||
969 | |||
970 | If you do a cat on the file /proc/sys/fs/binfmt_misc/status, you will get the | ||
971 | current status (enabled/disabled) of binfmt_misc. Change the status by echoing | ||
972 | 0 (disables) or 1 (enables) or -1 (caution: this clears all previously | ||
973 | registered binary formats) to status. For example echo 0 > status to disable | ||
974 | binfmt_misc (temporarily). | ||
975 | |||
976 | Status of a single handler | ||
977 | -------------------------- | ||
978 | |||
979 | Each registered handler has an entry in /proc/sys/fs/binfmt_misc. These files | ||
980 | perform the same function as status, but their scope is limited to the actual | ||
981 | binary format. By cating this file, you also receive all related information | ||
982 | about the interpreter/magic of the binfmt. | ||
983 | |||
984 | Example usage of binfmt_misc (emulate binfmt_java) | ||
985 | -------------------------------------------------- | ||
986 | |||
987 | cd /proc/sys/fs/binfmt_misc | ||
988 | echo ':Java:M::\xca\xfe\xba\xbe::/usr/local/java/bin/javawrapper:' > register | ||
989 | echo ':HTML:E::html::/usr/local/java/bin/appletviewer:' > register | ||
990 | echo ':Applet:M::<!--applet::/usr/local/java/bin/appletviewer:' > register | ||
991 | echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register | ||
992 | |||
993 | |||
994 | These four lines add support for Java executables and Java applets (like | ||
995 | binfmt_java, additionally recognizing the .html extension with no need to put | ||
996 | <!--applet> in every applet file). You have to install the JDK and the | ||
997 | shell-script /usr/local/java/bin/javawrapper too. It works around the | ||
998 | brokenness of the Java filename handling. To add a Java binary, just create a | ||
999 | link to the class-file somewhere in the path. | ||
1000 | |||
1001 | 2.3 /proc/sys/kernel - general kernel parameters | ||
1002 | ------------------------------------------------ | ||
1003 | |||
1004 | This directory reflects general kernel behaviors. As I've said before, the | ||
1005 | contents depend on your configuration. Here you'll find the most important | ||
1006 | files, along with descriptions of what they mean and how to use them. | ||
1007 | |||
1008 | acct | ||
1009 | ---- | ||
1010 | |||
1011 | The file contains three values: highwater, lowwater, and frequency. | ||
1012 | |||
1013 | It exists only when BSD-style process accounting is enabled. These values | ||
1014 | control its behavior. If the free space on the file system where the log lives | ||
1015 | goes below lowwater percentage, accounting suspends. If it goes above | ||
1016 | highwater percentage, accounting resumes. Frequency determines how often you | ||
1017 | check the amount of free space (value is in seconds). Default settings are: 4, | ||
1018 | 2, and 30. That is, suspend accounting if there is less than 2 percent free; | ||
1019 | resume it if we have a value of 4 or more percent; consider information about | ||
1020 | the amount of free space valid for 30 seconds. | ||
1021 | |||
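The two thresholds form a hysteresis: between lowwater and highwater the
accounting state does not change. The sketch below models that decision for
illustration only; it is not the kernel's code. The name acct_next_state is
hypothetical, and the exact comparisons (suspend below lowwater, resume at
highwater or above) are assumptions that may differ from the kernel at the
boundary values.

```shell
# acct_next_state STATE FREE_PCT [HIGHWATER LOWWATER]
# Print the accounting state ("active" or "suspended") that follows from the
# current STATE and the percentage of free space FREE_PCT.
# Defaults follow the documented settings: highwater 4, lowwater 2.
# acct_next_state is a hypothetical helper name.
acct_next_state() {
    state=$1
    free=$2
    high=${3:-4}
    low=${4:-2}
    if [ "$state" = active ] && [ "$free" -lt "$low" ]; then
        echo suspended
    elif [ "$state" = suspended ] && [ "$free" -ge "$high" ]; then
        echo active
    else
        echo "$state"            # between the marks: no change
    fi
}

# Example: acct_next_state suspended 3   (still prints "suspended")
```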
1022 | ctrl-alt-del | ||
1023 | ------------ | ||
1024 | |||
1025 | When the value in this file is 0, ctrl-alt-del is trapped and sent to the init | ||
1026 | program to handle a graceful restart. However, when the value is greater than | ||
1027 | zero, Linux's reaction to this key combination will be an immediate reboot, | ||
1028 | without syncing its dirty buffers. | ||
1029 | |||
1030 | [NOTE] | ||
1031 | When a program (like dosemu) has the keyboard in raw mode, the | ||
1032 | ctrl-alt-del is intercepted by the program before it ever reaches the | ||
1033 | kernel tty layer, and it is up to the program to decide what to do with | ||
1034 | it. | ||
1035 | |||
1036 | domainname and hostname | ||
1037 | ----------------------- | ||
1038 | |||
1039 | These files can be used to set the NIS domainname and hostname of your | ||
1040 | box. For the classic darkstar.frop.org a simple: | ||
1041 | |||
1042 | # echo "darkstar" > /proc/sys/kernel/hostname | ||
1043 | # echo "frop.org" > /proc/sys/kernel/domainname | ||
1044 | |||
1045 | |||
1046 | would suffice to set your hostname and NIS domainname. | ||
1047 | |||
1048 | osrelease, ostype and version | ||
1049 | ----------------------------- | ||
1050 | |||
1051 | The names make it pretty obvious what these fields contain: | ||
1052 | |||
1053 | > cat /proc/sys/kernel/osrelease | ||
1054 | 2.2.12 | ||
1055 | |||
1056 | > cat /proc/sys/kernel/ostype | ||
1057 | Linux | ||
1058 | |||
1059 | > cat /proc/sys/kernel/version | ||
1060 | #4 Fri Oct 1 12:41:14 PDT 1999 | ||
1061 | |||
1062 | |||
1063 | The files osrelease and ostype should be clear enough. Version needs a little | ||
1064 | more clarification. The #4 means that this is the 4th kernel built from this | ||
1065 | source base and the date after it indicates the time the kernel was built. The | ||
1066 | only way to tune these values is to rebuild the kernel. | ||
1067 | |||
1068 | panic | ||
1069 | ----- | ||
1070 | |||
1071 | The value in this file represents the number of seconds the kernel waits | ||
1072 | before rebooting on a panic. When you use the software watchdog, the | ||
1073 | recommended setting is 60. If set to 0, the auto reboot after a kernel panic | ||
1074 | is disabled, which is the default setting. | ||
1075 | |||
1076 | printk | ||
1077 | ------ | ||
1078 | |||
1079 | The four values in printk denote | ||
1080 | * console_loglevel, | ||
1081 | * default_message_loglevel, | ||
1082 | * minimum_console_loglevel and | ||
1083 | * default_console_loglevel | ||
1084 | respectively. | ||
1085 | |||
1086 | These values influence printk() behavior when printing or logging error | ||
1087 | messages, which come from inside the kernel. See syslog(2) for more | ||
1088 | information on the different log levels. | ||
1089 | |||
1090 | console_loglevel | ||
1091 | ---------------- | ||
1092 | |||
1093 | Messages with a higher priority than this will be printed to the console. | ||
1094 | |||
1095 | default_message_loglevel | ||
1096 | ------------------------ | ||
1097 | |||
1098 | Messages without an explicit priority will be printed with this priority. | ||
1099 | |||
1100 | minimum_console_loglevel | ||
1101 | ------------------------ | ||
1102 | |||
1103 | Minimum (highest-priority) value to which the console_loglevel can be set. | ||
1104 | |||
1105 | default_console_loglevel | ||
1106 | ------------------------ | ||
1107 | |||
1108 | Default value for console_loglevel. | ||
1109 | |||
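Because a lower number means a higher priority, a message reaches the console
only when its level is numerically smaller than console_loglevel. The sketch
below applies that comparison to the first value of the printk file; the
function name console_prints is made up for this example.

```shell
# console_prints LEVEL
# Read the four printk values from stdin and succeed (exit status 0) if a
# message with numeric priority LEVEL would be printed to the console,
# i.e. if LEVEL is lower than console_loglevel (the first value).
# console_prints is a hypothetical helper name.
console_prints() {
    read -r console_loglevel _rest || return 2
    [ "$1" -lt "$console_loglevel" ]
}

# Usage on a live system:
#   console_prints 3 < /proc/sys/kernel/printk && echo "level 3 is shown"
```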
1110 | sg-big-buff | ||
1111 | ----------- | ||
1112 | |||
1113 | This file shows the size of the generic SCSI (sg) buffer. At this point, you | ||
1114 | can't tune it yet, but you can change it at compile time by editing | ||
1115 | include/scsi/sg.h and changing the value of SG_BIG_BUFF. | ||
1116 | |||
1117 | If you use a scanner with SANE (Scanner Access Now Easy) you might want to set | ||
1118 | this to a higher value. Refer to the SANE documentation on this issue. | ||
1119 | |||
1120 | modprobe | ||
1121 | -------- | ||
1122 | |||
1123 | The path to the modprobe binary. The kernel uses this | ||
1124 | program to load modules on demand. | ||
1125 | |||
1126 | unknown_nmi_panic | ||
1127 | ----------------- | ||
1128 | |||
1129 | The value in this file affects how NMIs are handled. When the value is | ||
1130 | non-zero, an unknown NMI is trapped and the kernel panics. At that time, kernel | ||
1131 | debugging information is displayed on the console. | ||
1132 | |||
1133 | For example, the NMI switch found on most IA32 servers raises an unknown NMI. | ||
1134 | If a system hangs, try pressing the NMI switch. | ||
1135 | |||
1136 | [NOTE] | ||
1137 | This function and oprofile share an NMI callback. Therefore this function | ||
1138 | cannot be enabled when oprofile is activated. | ||
1139 | The NMI watchdog is also disabled when the value in this file is set to | ||
1140 | non-zero. | ||
1141 | |||
1142 | |||
1143 | 2.4 /proc/sys/vm - The virtual memory subsystem | ||
1144 | ----------------------------------------------- | ||
1145 | |||
1146 | The files in this directory can be used to tune the operation of the virtual | ||
1147 | memory (VM) subsystem of the Linux kernel. | ||
1148 | |||
1149 | vfs_cache_pressure | ||
1150 | ------------------ | ||
1151 | |||
1152 | Controls the tendency of the kernel to reclaim the memory which is used for | ||
1153 | caching of directory and inode objects. | ||
1154 | |||
1155 | At the default value of vfs_cache_pressure=100 the kernel will attempt to | ||
1156 | reclaim dentries and inodes at a "fair" rate with respect to pagecache and | ||
1157 | swapcache reclaim. Decreasing vfs_cache_pressure causes the kernel to prefer | ||
1158 | to retain dentry and inode caches. Increasing vfs_cache_pressure beyond 100 | ||
1159 | causes the kernel to prefer to reclaim dentries and inodes. | ||
1160 | |||
1161 | dirty_background_ratio | ||
1162 | ---------------------- | ||
1163 | |||
1164 | Contains, as a percentage of total system memory, the number of pages at which | ||
1165 | the pdflush background writeback daemon will start writing out dirty data. | ||
1166 | |||
1167 | dirty_ratio | ||
1168 | ----------- | ||
1169 | |||
1170 | Contains, as a percentage of total system memory, the number of pages at which | ||
1171 | a process which is generating disk writes will itself start writing out dirty | ||
1172 | data. | ||
1173 | |||
1174 | dirty_writeback_centisecs | ||
1175 | ------------------------- | ||
1176 | |||
1177 | The pdflush writeback daemons will periodically wake up and write `old' data | ||
1178 | out to disk. This tunable expresses the interval between those wakeups, in | ||
1179 | hundredths of a second. | ||
1180 | |||
1181 | Setting this to zero disables periodic writeback altogether. | ||
1182 | |||
1183 | dirty_expire_centisecs | ||
1184 | ---------------------- | ||
1185 | |||
1186 | This tunable is used to define when dirty data is old enough to be eligible | ||
1187 | for writeout by the pdflush daemons. It is expressed in hundredths of a second. | ||
1188 | Data which has been dirty in-memory for longer than this interval will be | ||
1189 | written out next time a pdflush daemon wakes up. | ||
1190 | |||
1191 | legacy_va_layout | ||
1192 | ---------------- | ||
1193 | |||
1194 | If non-zero, this sysctl disables the new 32-bit mmap layout - the kernel | ||
1195 | will use the legacy (2.4) layout for all processes. | ||
1196 | |||
1197 | lower_zone_protection | ||
1198 | --------------------- | ||
1199 | |||
1200 | For some specialised workloads on highmem machines it is dangerous for | ||
1201 | the kernel to allow process memory to be allocated from the "lowmem" | ||
1202 | zone. This is because that memory could then be pinned via the mlock() | ||
1203 | system call, or by unavailability of swapspace. | ||
1204 | |||
1205 | And on large highmem machines this lack of reclaimable lowmem memory | ||
1206 | can be fatal. | ||
1207 | |||
1208 | So the Linux page allocator has a mechanism which prevents allocations | ||
1209 | which _could_ use highmem from using too much lowmem. This means that | ||
1210 | a certain amount of lowmem is defended from the possibility of being | ||
1211 | captured into pinned user memory. | ||
1212 | |||
1213 | (The same argument applies to the old 16 megabyte ISA DMA region. This | ||
1214 | mechanism will also defend that region from allocations which could use | ||
1215 | highmem or lowmem). | ||
1216 | |||
1217 | The `lower_zone_protection' tunable determines how aggressive the kernel is | ||
1218 | in defending these lower zones. The default value is zero - no | ||
1219 | protection at all. | ||
1220 | |||
1221 | If you have a machine which uses highmem or ISA DMA and your | ||
1222 | applications are using mlock(), or if you are running with no swap then | ||
1223 | you probably should increase the lower_zone_protection setting. | ||
1224 | |||
1225 | The units of this tunable are fairly vague. It is approximately equal | ||
1226 | to "megabytes". So setting lower_zone_protection=100 will protect around 100 | ||
1227 | megabytes of the lowmem zone from user allocations. It will also make | ||
1228 | those 100 megabytes unavailable for use by applications and by | ||
1229 | pagecache, so there is a cost. | ||
1230 | |||
1231 | The effects of this tunable may be observed by monitoring | ||
1232 | /proc/meminfo:LowFree. Write a single huge file and observe the point | ||
1233 | at which LowFree ceases to fall. | ||
1234 | |||
1235 | A reasonable value for lower_zone_protection is 100. | ||
1236 | |||
1237 | page-cluster | ||
1238 | ------------ | ||
1239 | |||
1240 | page-cluster controls the number of pages which are written to swap in | ||
1241 | a single attempt, that is, the swap I/O size. | ||
1242 | |||
1243 | It is a logarithmic value - setting it to zero means "1 page", setting | ||
1244 | it to 1 means "2 pages", setting it to 2 means "4 pages", etc. | ||
1245 | |||
1246 | The default value is three (eight pages at a time). There may be some | ||
1247 | small benefits in tuning this to a different value if your workload is | ||
1248 | swap-intensive. | ||
1249 | |||
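The logarithmic scale means the number of pages is two raised to the
page-cluster value. A one-line sketch of that conversion follows;
page_cluster_pages is a hypothetical helper name.

```shell
# page_cluster_pages VALUE
# Print the number of pages that corresponds to a page-cluster setting,
# i.e. 2 to the power of VALUE.
# page_cluster_pages is a hypothetical helper name.
page_cluster_pages() {
    echo $((1 << $1))
}

# Usage on a live system:
#   page_cluster_pages "$(cat /proc/sys/vm/page-cluster)"
```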
1250 | overcommit_memory | ||
1251 | ----------------- | ||
1252 | |||
1253 | This file contains one value. The following algorithm is used to decide if | ||
1254 | there's enough memory: if the value of overcommit_memory is positive, then | ||
1255 | there's always enough memory. This is a useful feature, since programs often | ||
1256 | malloc() huge amounts of memory 'just in case', while they only use a small | ||
1257 | part of it. Leaving this value at 0 will lead to the failure of such a huge | ||
1258 | malloc(), when in fact the system has enough memory for the program to run. | ||
1259 | |||
1260 | On the other hand, enabling this feature can cause you to run out of memory | ||
1261 | and thrash the system to death, so large and/or important servers will want to | ||
1262 | set this value to 0. | ||
1263 | |||
1264 | nr_hugepages and hugetlb_shm_group | ||
1265 | ---------------------------------- | ||
1266 | |||
1267 | nr_hugepages configures the number of hugetlb pages reserved for the system. | ||
1268 | |||
1269 | hugetlb_shm_group contains the group id that is allowed to create SysV shared | ||
1270 | memory segments using hugetlb pages. | ||
1271 | |||
1272 | laptop_mode | ||
1273 | ----------- | ||
1274 | |||
1275 | laptop_mode is a knob that controls "laptop mode". All the things that are | ||
1276 | controlled by this knob are discussed in Documentation/laptop-mode.txt. | ||
1277 | |||
1278 | block_dump | ||
1279 | ---------- | ||
1280 | |||
1281 | block_dump enables block I/O debugging when set to a nonzero value. More | ||
1282 | information on block I/O debugging is in Documentation/laptop-mode.txt. | ||
1283 | |||
1284 | swap_token_timeout | ||
1285 | ------------------ | ||
1286 | |||
1287 | This file contains the valid hold time of the swap out protection token. The | ||
1288 | Linux VM has a token based thrashing control mechanism and uses the token to | ||
1289 | prevent unnecessary page faults in a thrashing situation. The value is in | ||
1290 | seconds and can be used to tune thrashing behavior. | ||
1291 | |||
1292 | 2.5 /proc/sys/dev - Device specific parameters | ||
1293 | ---------------------------------------------- | ||
1294 | |||
1295 | Currently there is only support for CD-ROM drives, and for those, there is only | ||
1296 | one read-only file containing information about the CD-ROM drives attached to | ||
1297 | the system: | ||
1298 | |||
1299 | > cat /proc/sys/dev/cdrom/info | ||
1300 | CD-ROM information, Id: cdrom.c 2.55 1999/04/25 | ||
1301 | |||
1302 | drive name: sr0 hdb | ||
1303 | drive speed: 32 40 | ||
1304 | drive # of slots: 1 0 | ||
1305 | Can close tray: 1 1 | ||
1306 | Can open tray: 1 1 | ||
1307 | Can lock tray: 1 1 | ||
1308 | Can change speed: 1 1 | ||
1309 | Can select disk: 0 1 | ||
1310 | Can read multisession: 1 1 | ||
1311 | Can read MCN: 1 1 | ||
1312 | Reports media changed: 1 1 | ||
1313 | Can play audio: 1 1 | ||
1314 | |||
1315 | |||
1316 | You see two drives, sr0 and hdb, along with a list of their features. | ||
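
The listing above has one column per drive, so a script has to pair each value
with the drive name at the same position. The following is a minimal sketch of
that pairing, assuming the layout shown in the example; the function name
parse_cdrom_info and the SAMPLE constant are hypothetical and not part of any
kernel interface.

```python
# Sample text copied from the listing in this section.
SAMPLE = """\
CD-ROM information, Id: cdrom.c 2.55 1999/04/25

drive name:             sr0 hdb
drive speed:            32 40
drive # of slots:       1 0
Can close tray:         1 1
Can open tray:          1 1
Can lock tray:          1 1
Can change speed:       1 1
Can select disk:        0 1
Can read multisession:  1 1
Can read MCN:           1 1
Reports media changed:  1 1
Can play audio:         1 1
"""


def parse_cdrom_info(text):
    """Return {drive_name: {feature: int}} for text in the layout shown above."""
    drives = []
    features = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank lines carry no data
        key, _, rest = line.partition(":")
        key = key.strip()
        values = rest.split()
        if key == "drive name":
            # This row defines the column order for every following row.
            drives = values
            features = {name: {} for name in drives}
        elif (drives and len(values) == len(drives)
              and all(value.isdigit() for value in values)):
            for name, value in zip(drives, values):
                features[name][key] = int(value)
    return features
```

Calling parse_cdrom_info(SAMPLE)["hdb"]["Can select disk"] yields the value in
the hdb column of that row. On a live system the text would come from reading
/proc/sys/dev/cdrom/info instead of SAMPLE.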
1317 | |||
1318 | 2.6 /proc/sys/sunrpc - Remote procedure calls | ||
1319 | --------------------------------------------- | ||
1320 | |||
1321 | This directory contains four files, which enable or disable debugging for the | ||
1322 | RPC functions NFS, NFS-daemon, RPC and NLM. The default value of each file is | ||
1323 | 0. Set a file to one to turn debugging on for that function. | ||
1324 | |||
1325 | 2.7 /proc/sys/net - Networking stuff | ||
1326 | ------------------------------------ | ||
1327 | |||
1328 | The interface to the networking parts of the kernel is located in | ||
1329 | /proc/sys/net. Table 2-3 shows all possible subdirectories. You may see only | ||
1330 | some of them, depending on your kernel's configuration. | ||
1331 | |||
1332 | |||
1333 | Table 2-3: Subdirectories in /proc/sys/net | ||
1334 | .............................................................................. | ||
1335 | Directory Content Directory Content | ||
1336 | core General parameter appletalk Appletalk protocol | ||
1337 | unix Unix domain sockets netrom NET/ROM | ||
1338 | 802 E802 protocol ax25 AX25 | ||
1339 | ethernet Ethernet protocol rose X.25 PLP layer | ||
1340 | ipv4 IP version 4 x25 X.25 protocol | ||
1341 | ipx IPX token-ring IBM token ring | ||
1342 | bridge Bridging decnet DEC net | ||
1343 | ipv6 IP version 6 | ||
1344 | .............................................................................. | ||
1345 | |||
1346 | We will concentrate on IP networking here. Since AX25, X.25, and DEC Net are | ||
1347 | only minor players in the Linux world, we'll skip them in this chapter. You'll | ||
1348 | find some short info on Appletalk and IPX further on in this chapter. Review | ||
1349 | the online documentation and the kernel source to get a detailed view of the | ||
1350 | parameters for those protocols. In this section we'll discuss the | ||
1351 | subdirectories printed in bold letters in the table above. As default values | ||
1352 | are suitable for most needs, there is no need to change these values. | ||
1353 | |||
1354 | /proc/sys/net/core - Network core options | ||
1355 | ----------------------------------------- | ||
1356 | |||
1357 | rmem_default | ||
1358 | ------------ | ||
1359 | |||
1360 | The default setting of the socket receive buffer in bytes. | ||
1361 | |||
1362 | rmem_max | ||
1363 | -------- | ||
1364 | |||
1365 | The maximum receive socket buffer size in bytes. | ||
1366 | |||
1367 | wmem_default | ||
1368 | ------------ | ||
1369 | |||
1370 | The default setting (in bytes) of the socket send buffer. | ||
1371 | |||
1372 | wmem_max | ||
1373 | -------- | ||
1374 | |||
1375 | The maximum send socket buffer size in bytes. | ||
1376 | |||
1377 | message_burst and message_cost | ||
1378 | ------------------------------ | ||
1379 | |||
1380 | These parameters are used to limit the warning messages written to the kernel | ||
1381 | log from the networking code. They enforce a rate limit to make a | ||
1382 | denial-of-service attack impossible. A higher message_cost factor results in | ||
1383 | fewer messages being written. Message_burst controls when messages will | ||
1384 | be dropped. The default settings limit warning messages to one every five | ||
1385 | seconds. | ||
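
One way to picture how a cost and a burst value combine into a rate limit is a
token bucket: credit builds up over time, every message spends a fixed cost,
and the burst value caps how much credit can be saved. The sketch below is an
illustrative model only, not the kernel's code; the class name RateLimiter and
the choice of seconds as the unit are assumptions.

```python
class RateLimiter:
    """Illustrative token bucket; not taken from the kernel sources."""

    def __init__(self, cost, burst):
        self.cost = cost      # credit (assumed: seconds) spent per message
        self.burst = burst    # upper bound on the credit that can be saved
        self.credit = burst   # start with a full bucket
        self.last = None      # time of the previous call

    def allow(self, now):
        """Return True if a message may be logged at time `now`."""
        if self.last is not None:
            # Credit grows with elapsed time but never beyond the burst cap.
            self.credit = min(self.burst, self.credit + (now - self.last))
        self.last = now
        if self.credit >= self.cost:
            self.credit -= self.cost
            return True
        return False  # not enough credit: the message is dropped
```

With cost=5 and burst=10 this model lets two messages through back to back and
then one more for every five units of time that pass. Raising the cost spaces
messages further apart; raising the burst allows a longer initial run.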
1386 | |||
1387 | netdev_max_backlog | ||
1388 | ------------------ | ||
1389 | |||
1390 | Maximum number of packets queued on the INPUT side when the interface | ||
1391 | receives packets faster than the kernel can process them. | ||
1392 | |||
1393 | optmem_max | ||
1394 | ---------- | ||
1395 | |||
1396 | Maximum ancillary buffer size allowed per socket. Ancillary data is a sequence | ||
1397 | of struct cmsghdr structures with appended data. | ||
1398 | |||
1399 | /proc/sys/net/unix - Parameters for Unix domain sockets | ||
1400 | ------------------------------------------------------- | ||
1401 | |||
1402 | There are only two files in this subdirectory. They control the delays for | ||
1403 | deleting and destroying socket descriptors. | ||
1404 | |||
1405 | 2.8 /proc/sys/net/ipv4 - IPV4 settings | ||
1406 | -------------------------------------- | ||
1407 | |||
1408 | IP version 4 is still the most used protocol in Unix networking. It will be | ||
1409 | replaced by IP version 6 in the next couple of years, but for the moment it's | ||
1410 | the de facto standard for the internet and is used in most networking | ||
1411 | environments around the world. Because of the importance of this protocol, | ||
1412 | we'll have a deeper look into the subtree controlling the behavior of the IPv4 | ||
1413 | subsystem of the Linux kernel. | ||
1414 | |||
1415 | Let's start with the entries in /proc/sys/net/ipv4. | ||
1416 | |||
1417 | ICMP settings | ||
1418 | ------------- | ||
1419 | |||
1420 | icmp_echo_ignore_all and icmp_echo_ignore_broadcasts | ||
1421 | ---------------------------------------------------- | ||
1422 | |||
1423 | Set to 1 to make the kernel ignore all ICMP ECHO requests, or just those sent | ||
1424 | to broadcast and multicast addresses; set to 0 to have them answered. | ||
1425 | |||
1426 | Please note that if you accept ICMP echo requests with a broadcast/multicast | ||
1427 | destination address your network may be used as an exploder for denial of | ||
1428 | service packet flooding attacks to other hosts. | ||
1429 | |||
1430 | icmp_destunreach_rate, icmp_echoreply_rate, icmp_paramprob_rate and icmp_timeexceed_rate | ||
1431 | ---------------------------------------------------------------------------------------- | ||
1432 | |||
1433 | Sets limits for sending ICMP packets to specific targets. A value of zero | ||
1434 | disables all limiting. Any positive value sets the maximum packet rate in | ||
1435 | hundredths of a second (on Intel systems). | ||
1436 | |||
1437 | IP settings | ||
1438 | ----------- | ||
1439 | |||
1440 | ip_autoconfig | ||
1441 | ------------- | ||
1442 | |||
1443 | This file contains the number one if the host received its IP configuration by | ||
1444 | RARP, BOOTP, DHCP or a similar mechanism. Otherwise it is zero. | ||
1445 | |||
1446 | ip_default_ttl | ||
1447 | -------------- | ||
1448 | |||
1449 | TTL (Time To Live) for IPv4 interfaces. This is simply the maximum number of | ||
1450 | hops a packet may travel. | ||
1451 | |||
1452 | ip_dynaddr | ||
1453 | ---------- | ||
1454 | |||
1455 | Enable dynamic socket address rewriting on interface address change. This is | ||
1456 | useful for dialup interfaces with changing IP addresses. | ||
1457 | |||
1458 | ip_forward | ||
1459 | ---------- | ||
1460 | |||
1461 | Enable or disable forwarding of IP packets between interfaces. Changing this | ||
1462 | value resets all other parameters to their default values, which differ | ||
1463 | depending on whether the kernel is configured as a host or a router. | ||
1464 | |||
1465 | ip_local_port_range | ||
1466 | ------------------- | ||
1467 | |||
1468 | Range of ports used by TCP and UDP to choose the local port. Contains two | ||
1469 | numbers: the first number is the lowest port, the second number the highest | ||
1470 | local port. Default is 1024-4999. Should be changed to 32768-61000 for | ||
1471 | high-usage systems. | ||
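
To see what the two ranges mentioned above mean in practice, it helps to count
the ports each one provides. The following sketch parses the two numbers as
they would be read from the file and counts the ports in the range; the
function names parse_port_range and port_count are hypothetical.

```python
def parse_port_range(text):
    """Parse the two whitespace-separated numbers of ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range: {low}-{high}")
    return low, high


def port_count(low, high):
    """Number of local ports in the inclusive range low..high."""
    return high - low + 1
```

port_count(1024, 4999) gives 3976 ports, while port_count(32768, 61000) gives
28233, which is why the larger range suits systems that open many outgoing
connections at once.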
1472 | |||
1473 | ip_no_pmtu_disc | ||
1474 | --------------- | ||
1475 | |||
1476 | Global switch to turn path MTU discovery off. It can also be set on a per | ||
1477 | socket basis by the applications or on a per route basis. | ||
1478 | |||
1479 | ip_masq_debug | ||
1480 | ------------- | ||
1481 | |||
1482 | Enable/disable debugging of IP masquerading. | ||
1483 | |||
1484 | IP fragmentation settings | ||
1485 | ------------------------- | ||
1486 | |||
1487 | ipfrag_high_thresh and ipfrag_low_thresh | ||
1488 | ---------------------------------------- | ||
1489 | |||
1490 | Maximum memory used to reassemble IP fragments. When ipfrag_high_thresh bytes | ||
1491 | of memory is allocated for this purpose, the fragment handler will toss | ||
1492 | packets until ipfrag_low_thresh is reached. | ||
1493 | |||
1494 | ipfrag_time | ||
1495 | ----------- | ||
1496 | |||
1497 | Time in seconds to keep an IP fragment in memory. | ||
1498 | |||
1499 | TCP settings | ||
1500 | ------------ | ||
1501 | |||
1502 | tcp_ecn | ||
1503 | ------- | ||
1504 | |||
1505 | This file controls the use of the ECN bit in the IPv4 headers. ECN (Explicit | ||
1506 | Congestion Notification) is a new feature, but some routers and firewalls | ||
1507 | block traffic that has this bit set, so it may be necessary to echo 0 to | ||
1508 | /proc/sys/net/ipv4/tcp_ecn if you want to talk to these sites. For more info | ||
1509 | you could read RFC2481. | ||
1510 | |||
1511 | tcp_retrans_collapse | ||
1512 | -------------------- | ||
1513 | |||
1514 | Bug-to-bug compatibility with some broken printers. On retransmit, try to send | ||
1515 | larger packets to work around bugs in certain TCP stacks. Can be turned off by | ||
1516 | setting it to zero. | ||
1517 | |||
1518 | tcp_keepalive_probes | ||
1519 | -------------------- | ||
1520 | |||
1521 | Number of keep alive probes TCP sends out, until it decides that the | ||
1522 | connection is broken. | ||
1523 | |||
1524 | tcp_keepalive_time | ||
1525 | ------------------ | ||
1526 | |||
1527 | How often TCP sends out keep alive messages, when keep alive is enabled. The | ||
1528 | default is 2 hours. | ||
1529 | |||
1530 | tcp_syn_retries | ||
1531 | --------------- | ||
1532 | |||
1533 | Number of times initial SYNs for a TCP connection attempt will be | ||
1534 | retransmitted. Should not be higher than 255. This is only the timeout for | ||
1535 | outgoing connections; for incoming connections the number of retransmits is | ||
1536 | defined by tcp_retries1. | ||
1537 | |||
1538 | tcp_sack | ||
1539 | -------- | ||
1540 | |||
1541 | Enable selective acknowledgments as defined in RFC2018. | ||
1542 | |||
1543 | tcp_timestamps | ||
1544 | -------------- | ||
1545 | |||
1546 | Enable timestamps as defined in RFC1323. | ||
1547 | |||
1548 | tcp_stdurg | ||
1549 | ---------- | ||
1550 | |||
1551 | Enable the strict RFC793 interpretation of the TCP urgent pointer field. The | ||
1552 | default is to use the BSD compatible interpretation of the urgent pointer | ||
1553 | pointing to the first byte after the urgent data. The RFC793 interpretation is | ||
1554 | to have it point to the last byte of urgent data. Enabling this option may | ||
1555 | lead to interoperability problems. Disabled by default. | ||
1556 | |||
1557 | tcp_syncookies | ||
1558 | -------------- | ||
1559 | |||
1560 | Only valid when the kernel was compiled with CONFIG_SYNCOOKIES. Send out | ||
1561 | syncookies when the syn backlog queue of a socket overflows. This is to ward | ||
1562 | off the common 'syn flood attack'. Disabled by default. | ||
1563 | |||
1564 | Note that the concept of a socket backlog is abandoned. This means the peer | ||
1565 | may not receive reliable error messages from an overloaded server with | ||
1566 | syncookies enabled. | ||
1567 | |||
1568 | tcp_window_scaling | ||
1569 | ------------------ | ||
1570 | |||
1571 | Enable window scaling as defined in RFC1323. | ||
1572 | |||
1573 | tcp_fin_timeout | ||
1574 | --------------- | ||
1575 | |||
1576 | The length of time in seconds it takes to receive a final FIN before the | ||
1577 | socket is always closed. This is strictly a violation of the TCP | ||
1578 | specification, but required to prevent denial-of-service attacks. | ||
1579 | |||
1580 | tcp_max_ka_probes | ||
1581 | ----------------- | ||
1582 | |||
1583 | Indicates how many keep alive probes are sent per slow timer run. Should not | ||
1584 | be set too high to prevent bursts. | ||
1585 | |||
1586 | tcp_max_syn_backlog | ||
1587 | ------------------- | ||
1588 | |||
1589 | Length of the per socket backlog queue. Since Linux 2.2 the backlog specified | ||
1590 | in listen(2) only specifies the length of the backlog queue of already | ||
1591 | established sockets. When more connection requests arrive Linux starts to drop | ||
1592 | packets. When syncookies are enabled the packets are still answered and the | ||
1593 | maximum queue is effectively ignored. | ||
1594 | |||
1595 | tcp_retries1 | ||
1596 | ------------ | ||
1597 | |||
1598 | Defines how often an answer to a TCP connection request is retransmitted | ||
1599 | before giving up. | ||
1600 | |||
1601 | tcp_retries2 | ||
1602 | ------------ | ||
1603 | |||
1604 | Defines how often a TCP packet is retransmitted before giving up. | ||
1605 | |||
1606 | Interface specific settings | ||
1607 | --------------------------- | ||
1608 | |||
1609 | In the directory /proc/sys/net/ipv4/conf you'll find one subdirectory for each | ||
1610 | interface the system knows about and one directory called all. Changes in the | ||
1611 | all subdirectory affect all interfaces, whereas changes in the other | ||
1612 | subdirectories affect only one interface. All directories have the same | ||
1613 | entries: | ||
1614 | |||
1615 | accept_redirects | ||
1616 | ---------------- | ||
1617 | |||
1618 | This switch decides if the kernel accepts ICMP redirect messages or not. The | ||
1619 | default is 'yes' if the kernel is configured for a regular host and 'no' for a | ||
1620 | router configuration. | ||
1621 | |||
1622 | accept_source_route | ||
1623 | ------------------- | ||
1624 | |||
1625 | Whether source routed packets should be accepted or declined. The default is | ||
1626 | dependent on the kernel configuration. It's 'yes' for routers and 'no' for | ||
1627 | hosts. | ||
1628 | |||
1629 | bootp_relay | ||
1630 | ----------- | ||
1631 | |||
1632 | Accept packets with source address 0.b.c.d with destinations not to this host | ||
1633 | as local ones. It is supposed that a BOOTP relay daemon will catch and forward | ||
1634 | such packets. | ||
1635 | |||
1636 | The default is 0, since this feature is not implemented yet (kernel version | ||
1637 | 2.2.12). | ||
1638 | |||
1639 | forwarding | ||
1640 | ---------- | ||
1641 | |||
1642 | Enable or disable IP forwarding on this interface. | ||
1643 | |||
1644 | log_martians | ||
1645 | ------------ | ||
1646 | |||
1647 | Log packets with source addresses with no known route to the kernel log. | ||
1648 | |||
1649 | mc_forwarding | ||
1650 | ------------- | ||
1651 | |||
1652 | Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE and a | ||
1653 | multicast routing daemon is required. | ||
1654 | |||
1655 | proxy_arp | ||
1656 | --------- | ||
1657 | |||
1658 | Does (1) or does not (0) perform proxy ARP. | ||
1659 | |||
1660 | rp_filter | ||
1661 | --------- | ||
1662 | |||
1663 | Integer value determining if source validation should be made. 1 means yes, 0 | ||
1664 | means no. Disabled by default, but protection against local/broadcast address | ||
1665 | spoofing is always on. | ||
1666 | |||
1667 | If you set this to 1 on a router that is the only connection for a network to | ||
1668 | the net, it will prevent spoofing attacks against your internal networks | ||
1669 | (external addresses can still be spoofed), without the need for additional | ||
1670 | firewall rules. | ||
1671 | |||
1672 | secure_redirects | ||
1673 | ---------------- | ||
1674 | |||
1675 | Accept ICMP redirect messages only for gateways listed in the default gateway | ||
1676 | list. Enabled by default. | ||
1677 | |||
1678 | shared_media | ||
1679 | ------------ | ||
1680 | |||
1681 | If it is not set the kernel does not assume that different subnets on this | ||
1682 | device can communicate directly. Default setting is 'yes'. | ||
1683 | |||
1684 | send_redirects | ||
1685 | -------------- | ||
1686 | |||
1687 | Determines whether to send ICMP redirects to other hosts. | ||
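
Because every interface directory holds the same entries, comparing one
setting across all interfaces is a matter of walking the conf directory. The
sketch below does that; the function name read_interface_settings is
hypothetical, and the conf_root parameter exists only so the function can be
pointed at a different directory when experimenting.

```python
import os


def read_interface_settings(name, conf_root="/proc/sys/net/ipv4/conf"):
    """Return {interface: value} for one entry such as rp_filter."""
    result = {}
    for iface in sorted(os.listdir(conf_root)):
        path = os.path.join(conf_root, iface, name)
        try:
            with open(path) as handle:
                result[iface] = handle.read().strip()
        except OSError:
            continue  # entry missing or unreadable for this interface
    return result
```

For example, read_interface_settings("rp_filter") returns one value per
subdirectory, including the all directory, so differences between interfaces
are easy to spot.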
1688 | |||
1689 | Routing settings | ||
1690 | ---------------- | ||
1691 | |||
1692 | The directory /proc/sys/net/ipv4/route contains several files to control | ||
1693 | routing issues. | ||
1694 | |||
1695 | error_burst and error_cost | ||
1696 | -------------------------- | ||
1697 | |||
1698 | These parameters limit how many ICMP destination unreachable messages are sent | ||
1699 | from the host in question. ICMP destination unreachable messages are sent | ||
1700 | when we can not reach the next hop, while trying to transmit a packet. | ||
1701 | It will also print some error messages to kernel logs if someone is ignoring | ||
1702 | our ICMP redirects. The higher the error_cost factor is, the fewer | ||
1703 | destination unreachable and error messages will be let through. Error_burst | ||
1704 | controls when destination unreachable messages and error messages will be | ||
1705 | dropped. The default settings limit warning messages to five every second. | ||
1706 | |||
1707 | flush | ||
1708 | ----- | ||
1709 | |||
1710 | Writing to this file results in a flush of the routing cache. | ||
1711 | |||
1712 | gc_elasticity, gc_interval, gc_min_interval_ms, gc_timeout, gc_thresh | ||
1713 | --------------------------------------------------------------------- | ||
1714 | |||
1715 | Values to control the frequency and behavior of the garbage collection | ||
1716 | algorithm for the routing cache. gc_min_interval is deprecated and replaced | ||
1717 | by gc_min_interval_ms. | ||
1718 | |||
1719 | |||
1720 | max_size | ||
1721 | -------- | ||
1722 | |||
1723 | Maximum size of the routing cache. Old entries will be purged once the cache | ||
1724 | has reached this size. | ||
1725 | |||
1726 | max_delay, min_delay | ||
1727 | -------------------- | ||
1728 | |||
1729 | Delays for flushing the routing cache. | ||
1730 | |||
1731 | redirect_load, redirect_number | ||
1732 | ------------------------------ | ||
1733 | |||
1734 | Factors which determine if more ICMP redirects should be sent to a specific | ||
1735 | host. No redirects will be sent once the load limit or the maximum number of | ||
1736 | redirects has been reached. | ||
1737 | |||
1738 | redirect_silence | ||
1739 | ---------------- | ||
1740 | |||
1741 | Timeout for redirects. After this period redirects will be sent again, even if | ||
1742 | this has been stopped, because the load or number limit has been reached. | ||
1743 | |||
1744 | Network Neighbor handling | ||
1745 | ------------------------- | ||
1746 | |||
1747 | Settings about how to handle connections with direct neighbors (nodes attached | ||
1748 | to the same link) can be found in the directory /proc/sys/net/ipv4/neigh. | ||
1749 | |||
1750 | As we saw in the conf directory, there is a default subdirectory which | ||
1751 | holds the default values, and one directory for each interface. The contents | ||
1752 | of the directories are identical, with the single exception that the default | ||
1753 | settings contain additional options to set garbage collection parameters. | ||
1754 | |||
1755 | In the interface directories you'll find the following entries: | ||
1756 | |||
1757 | base_reachable_time, base_reachable_time_ms | ||
1758 | ------------------------------------------- | ||
1759 | |||
1760 | A base value used for computing the random reachable time value as specified | ||
1761 | in RFC2461. | ||
1762 | |||
1763 | Expression of base_reachable_time, which is deprecated, is in seconds. | ||
1764 | Expression of base_reachable_time_ms is in milliseconds. | ||
1765 | |||
1766 | retrans_time, retrans_time_ms | ||
1767 | ----------------------------- | ||
1768 | |||
1769 | The time between retransmitted Neighbor Solicitation messages. | ||
1770 | Used for address resolution and to determine if a neighbor is | ||
1771 | unreachable. | ||
1772 | |||
1773 | Expression of retrans_time, which is deprecated, is in 1/100 seconds (for | ||
1774 | IPv4) or in jiffies (for IPv6). | ||
1775 | Expression of retrans_time_ms is in milliseconds. | ||
1776 | |||
1777 | unres_qlen | ||
1778 | ---------- | ||
1779 | |||
1780 | Maximum queue length for a pending arp request - the number of packets which | ||
1781 | are accepted from other layers while the ARP address is still being resolved. | ||
1782 | |||
1783 | anycast_delay | ||
1784 | ------------- | ||
1785 | |||
1786 | Maximum for random delay of answers to neighbor solicitation messages in | ||
1787 | jiffies (1/100 sec). Not yet implemented (Linux does not have anycast support | ||
1788 | yet). | ||
1789 | |||
1790 | ucast_solicit | ||
1791 | ------------- | ||
1792 | |||
1793 | Maximum number of retries for unicast solicitation. | ||
1794 | |||
1795 | mcast_solicit | ||
1796 | ------------- | ||
1797 | |||
1798 | Maximum number of retries for multicast solicitation. | ||
1799 | |||
1800 | delay_first_probe_time | ||
1801 | ---------------------- | ||
1802 | |||
1803 | Delay for the first time probe if the neighbor is reachable. (see | ||
1804 | gc_stale_time) | ||
1805 | |||
1806 | locktime | ||
1807 | -------- | ||
1808 | |||
1809 | An ARP/neighbor entry is only replaced with a new one if the old is at least | ||
1810 | locktime old. This prevents ARP cache thrashing. | ||
1811 | |||
1812 | proxy_delay | ||
1813 | ----------- | ||
1814 | |||
1815 | Maximum time (real time is random [0..proxytime]) before answering to an ARP | ||
1816 | request for which we have a proxy ARP entry. In some cases, this is used to | ||
1817 | prevent network flooding. | ||
1818 | |||
1819 | proxy_qlen | ||
1820 | ---------- | ||
1821 | |||
1822 | Maximum queue length of the delayed proxy arp timer. (see proxy_delay). | ||
1823 | |||
1824 | app_solicit | ||
1825 | ----------- | ||
1826 | |||
1827 | Determines the number of requests to send to the user level ARP daemon. Use 0 | ||
1828 | to turn off. | ||
1829 | |||
1830 | gc_stale_time | ||
1831 | ------------- | ||
1832 | |||
1833 | Determines how often to check for stale ARP entries. After an ARP entry is | ||
1834 | stale it will be resolved again (which is useful when an IP address migrates | ||
1835 | to another machine). When ucast_solicit is greater than 0 it first tries to | ||
1836 | send an ARP packet directly to the known host. When that fails and | ||
1837 | mcast_solicit is greater than 0, an ARP request is broadcast. | ||
1838 | |||
1839 | 2.9 Appletalk | ||
1840 | ------------- | ||
1841 | |||
1842 | The /proc/sys/net/appletalk directory holds the Appletalk configuration data | ||
1843 | when Appletalk is loaded. The configurable parameters are: | ||
1844 | |||
1845 | aarp-expiry-time | ||
1846 | ---------------- | ||
1847 | |||
1848 | The amount of time we keep an ARP entry before expiring it. Used to age out | ||
1849 | old hosts. | ||
1850 | |||
1851 | aarp-resolve-time | ||
1852 | ----------------- | ||
1853 | |||
1854 | The amount of time we will spend trying to resolve an Appletalk address. | ||
1855 | |||
1856 | aarp-retransmit-limit | ||
1857 | --------------------- | ||
1858 | |||
1859 | The number of times we will retransmit a query before giving up. | ||
1860 | |||
1861 | aarp-tick-time | ||
1862 | -------------- | ||
1863 | |||
1864 | Controls the rate at which expires are checked. | ||
1865 | |||
1866 | The directory /proc/net/appletalk holds the list of active Appletalk sockets | ||
1867 | on a machine. | ||
1868 | |||
1869 | The fields indicate the DDP type, the local address (in network:node format), | ||
1870 | the remote address, the size of the transmit pending queue, the size of the | ||
1871 | received queue (bytes waiting for applications to read), the state and the uid | ||
1872 | owning the socket. | ||
1873 | |||
1874 | /proc/net/atalk_iface lists all the interfaces configured for Appletalk. It | ||
1875 | shows the name of the interface, its Appletalk address, the network range on | ||
1876 | that address (or network number for phase 1 networks), and the status of the | ||
1877 | interface. | ||
1878 | |||
1879 | /proc/net/atalk_route lists each known network route. It lists the target | ||
1880 | (network) that the route leads to, the router (may be directly connected), the | ||
1881 | route flags, and the device the route is using. | ||
1882 | |||
1883 | 2.10 IPX | ||
1884 | -------- | ||
1885 | |||
1886 | The IPX protocol has no tunable values in /proc/sys/net. | ||
1887 | |||
1888 | The IPX protocol does, however, provide /proc/net/ipx. This lists each IPX | ||
1889 | socket giving the local and remote addresses in Novell format (that is | ||
1890 | network:node:port). In accordance with the strange Novell tradition, | ||
1891 | everything but the port is in hex. Not_Connected is displayed for sockets that | ||
1892 | are not tied to a specific remote address. The Tx and Rx queue sizes indicate | ||
1893 | the number of bytes pending for transmission and reception. The state | ||
1894 | indicates the state the socket is in and the uid is the owning uid of the | ||
1895 | socket. | ||
1896 | |||
1897 | The /proc/net/ipx_interface file lists all IPX interfaces. For each interface | ||
1898 | it gives the network number, the node number, and indicates if the network is | ||
1899 | the primary network. It also indicates which device it is bound to (or | ||
1900 | Internal for internal networks) and the Frame Type if appropriate. Linux | ||
1901 | supports 802.3, 802.2, 802.2 SNAP and DIX (Blue Book) ethernet framing for | ||
1902 | IPX. | ||
1903 | |||
1904 | The /proc/net/ipx_route table holds a list of IPX routes. For each route it | ||
1905 | gives the destination network, the router node (or Directly) and the network | ||
1906 | address of the router (or Connected) for internal networks. | ||
1907 | |||
1908 | 2.11 /proc/sys/fs/mqueue - POSIX message queues filesystem | ||
1909 | ---------------------------------------------------------- | ||
1910 | |||
1911 | The "mqueue" filesystem provides the necessary kernel features to enable the | ||
1912 | creation of a user space library that implements the POSIX message queues | ||
1913 | API (as noted by the MSG tag in the POSIX 1003.1-2001 version of the System | ||
1914 | Interfaces specification.) | ||
1915 | |||
1916 | The "mqueue" filesystem contains values for determining/setting the amount of | ||
1917 | resources used by the file system. | ||
1918 | |||
1919 | /proc/sys/fs/mqueue/queues_max is a read/write file for setting/getting the | ||
1920 | maximum number of message queues allowed on the system. | ||
1921 | |||
1922 | /proc/sys/fs/mqueue/msg_max is a read/write file for setting/getting the | ||
1923 | maximum number of messages in a queue value. In fact it is the limiting value | ||
1924 | for another (user) limit which is set in the mq_open invocation. This attribute | ||
1925 | of a queue must be less than or equal to msg_max. | ||
1926 | |||
1927 | /proc/sys/fs/mqueue/msgsize_max is a read/write file for setting/getting the | ||
1928 | maximum message size value (it is every message queue's attribute set during | ||
1929 | its creation). | ||
1930 | |||
1931 | |||
1932 | ------------------------------------------------------------------------------ | ||
1933 | Summary | ||
1934 | ------------------------------------------------------------------------------ | ||
1935 | Certain aspects of kernel behavior can be modified at runtime, without the | ||
1936 | need to recompile the kernel, or even to reboot the system. The files in the | ||
1937 | /proc/sys tree can not only be read, but also modified. You can use the echo | ||
1938 | command to write values into these files, thereby changing the default settings | ||
1939 | of the kernel. | ||
1940 | ------------------------------------------------------------------------------ | ||
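
Reading and writing these files from a script works the same way as using cat
and echo. The following is a minimal sketch, assuming the dotted naming used
by the sysctl(8) tool (net.ipv4.ip_forward for /proc/sys/net/ipv4/ip_forward);
the helper names are hypothetical, and the root parameter exists only so the
helpers can be tried against a scratch directory. Names that themselves
contain dots, such as some interface names, are not handled.

```python
import os


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted name such as net.ipv4.ip_forward to its file path."""
    return os.path.join(root, *name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of a setting as a stripped string."""
    with open(sysctl_path(name, root)) as handle:
        return handle.read().strip()


def write_sysctl(name, value, root="/proc/sys"):
    """Write a new value; on a real system this normally requires root."""
    with open(sysctl_path(name, root), "w") as handle:
        handle.write(f"{value}\n")
```

A change made this way takes effect immediately but, like a value written with
echo, is not preserved across a reboot.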