diff options
author | Jeff Garzik <jgarzik@pobox.com> | 2005-08-29 16:40:27 -0400 |
---|---|---|
committer | Jeff Garzik <jgarzik@pobox.com> | 2005-08-29 16:40:27 -0400 |
commit | c1b054d03f5b31c33eaa0b267c629b118eaf3790 (patch) | |
tree | 9333907ca767be24fcb3667877242976c3e3c8dd /Documentation | |
parent | 559fb51ba7e66fe298b8355fabde1275b7def35f (diff) | |
parent | bf4e70e54cf31dcca48d279c7f7e71328eebe749 (diff) |
Merge /spare/repo/linux-2.6/
Diffstat (limited to 'Documentation')
109 files changed, 4589 insertions, 2149 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index 8de8a01a2474..f28a24e0279b 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX | |||
@@ -138,6 +138,8 @@ java.txt | |||
138 | - info on the in-kernel binary support for Java(tm). | 138 | - info on the in-kernel binary support for Java(tm). |
139 | kbuild/ | 139 | kbuild/ |
140 | - directory with info about the kernel build process. | 140 | - directory with info about the kernel build process. |
141 | kdumpt.txt | ||
142 | - mini HowTo on getting the crash dump code to work. | ||
141 | kernel-doc-nano-HOWTO.txt | 143 | kernel-doc-nano-HOWTO.txt |
142 | - mini HowTo on generation and location of kernel documentation files. | 144 | - mini HowTo on generation and location of kernel documentation files. |
143 | kernel-docs.txt | 145 | kernel-docs.txt |
diff --git a/Documentation/Changes b/Documentation/Changes index 57542bc25edd..5eaab0441d76 100644 --- a/Documentation/Changes +++ b/Documentation/Changes | |||
@@ -44,9 +44,9 @@ running, the suggested command should tell you. | |||
44 | 44 | ||
45 | Again, keep in mind that this list assumes you are already | 45 | Again, keep in mind that this list assumes you are already |
46 | functionally running a Linux 2.4 kernel. Also, not all tools are | 46 | functionally running a Linux 2.4 kernel. Also, not all tools are |
47 | necessary on all systems; obviously, if you don't have any PCMCIA (PC | 47 | necessary on all systems; obviously, if you don't have any ISDN |
48 | Card) hardware, for example, you probably needn't concern yourself | 48 | hardware, for example, you probably needn't concern yourself with |
49 | with pcmcia-cs. | 49 | isdn4k-utils. |
50 | 50 | ||
51 | o Gnu C 2.95.3 # gcc --version | 51 | o Gnu C 2.95.3 # gcc --version |
52 | o Gnu make 3.79.1 # make --version | 52 | o Gnu make 3.79.1 # make --version |
@@ -57,13 +57,15 @@ o e2fsprogs 1.29 # tune2fs | |||
57 | o jfsutils 1.1.3 # fsck.jfs -V | 57 | o jfsutils 1.1.3 # fsck.jfs -V |
58 | o reiserfsprogs 3.6.3 # reiserfsck -V 2>&1|grep reiserfsprogs | 58 | o reiserfsprogs 3.6.3 # reiserfsck -V 2>&1|grep reiserfsprogs |
59 | o xfsprogs 2.6.0 # xfs_db -V | 59 | o xfsprogs 2.6.0 # xfs_db -V |
60 | o pcmciautils 004 | ||
60 | o pcmcia-cs 3.1.21 # cardmgr -V | 61 | o pcmcia-cs 3.1.21 # cardmgr -V |
61 | o quota-tools 3.09 # quota -V | 62 | o quota-tools 3.09 # quota -V |
62 | o PPP 2.4.0 # pppd --version | 63 | o PPP 2.4.0 # pppd --version |
63 | o isdn4k-utils 3.1pre1 # isdnctrl 2>&1|grep version | 64 | o isdn4k-utils 3.1pre1 # isdnctrl 2>&1|grep version |
64 | o nfs-utils 1.0.5 # showmount --version | 65 | o nfs-utils 1.0.5 # showmount --version |
65 | o procps 3.2.0 # ps --version | 66 | o procps 3.2.0 # ps --version |
66 | o oprofile 0.5.3 # oprofiled --version | 67 | o oprofile 0.9 # oprofiled --version |
68 | o udev 058 # udevinfo -V | ||
67 | 69 | ||
68 | Kernel compilation | 70 | Kernel compilation |
69 | ================== | 71 | ================== |
@@ -186,13 +188,20 @@ architecture independent and any version from 2.0.0 onward should | |||
186 | work correctly with this version of the XFS kernel code (2.6.0 or | 188 | work correctly with this version of the XFS kernel code (2.6.0 or |
187 | later is recommended, due to some significant improvements). | 189 | later is recommended, due to some significant improvements). |
188 | 190 | ||
191 | PCMCIAutils | ||
192 | ----------- | ||
193 | |||
194 | PCMCIAutils replaces pcmcia-cs (see below). It properly sets up | ||
195 | PCMCIA sockets at system startup and loads the appropriate modules | ||
196 | for 16-bit PCMCIA devices if the kernel is modularized and the hotplug | ||
197 | subsystem is used. | ||
189 | 198 | ||
190 | Pcmcia-cs | 199 | Pcmcia-cs |
191 | --------- | 200 | --------- |
192 | 201 | ||
193 | PCMCIA (PC Card) support is now partially implemented in the main | 202 | PCMCIA (PC Card) support is now partially implemented in the main |
194 | kernel source. Pay attention when you recompile your kernel ;-). | 203 | kernel source. The "pcmciautils" package (see above) replaces pcmcia-cs |
195 | Also, be sure to upgrade to the latest pcmcia-cs release. | 204 | for newest kernels. |
196 | 205 | ||
197 | Quota-tools | 206 | Quota-tools |
198 | ----------- | 207 | ----------- |
@@ -349,9 +358,13 @@ Xfsprogs | |||
349 | -------- | 358 | -------- |
350 | o <ftp://oss.sgi.com/projects/xfs/download/> | 359 | o <ftp://oss.sgi.com/projects/xfs/download/> |
351 | 360 | ||
361 | Pcmciautils | ||
362 | ----------- | ||
363 | o <ftp://ftp.kernel.org/pub/linux/utils/kernel/pcmcia/> | ||
364 | |||
352 | Pcmcia-cs | 365 | Pcmcia-cs |
353 | --------- | 366 | --------- |
354 | o <ftp://pcmcia-cs.sourceforge.net/pub/pcmcia-cs/pcmcia-cs-3.1.21.tar.gz> | 367 | o <http://pcmcia-cs.sourceforge.net/> |
355 | 368 | ||
356 | Quota-tools | 369 | Quota-tools |
357 | ---------- | 370 | ---------- |
diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile index 87da3478fada..fa3e29ad8a46 100644 --- a/Documentation/DocBook/Makefile +++ b/Documentation/DocBook/Makefile | |||
@@ -49,7 +49,7 @@ installmandocs: mandocs | |||
49 | KERNELDOC = scripts/kernel-doc | 49 | KERNELDOC = scripts/kernel-doc |
50 | DOCPROC = scripts/basic/docproc | 50 | DOCPROC = scripts/basic/docproc |
51 | 51 | ||
52 | XMLTOFLAGS = -m Documentation/DocBook/stylesheet.xsl | 52 | XMLTOFLAGS = -m $(srctree)/Documentation/DocBook/stylesheet.xsl |
53 | #XMLTOFLAGS += --skip-validation | 53 | #XMLTOFLAGS += --skip-validation |
54 | 54 | ||
55 | ### | 55 | ### |
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index bb6a0106be11..d650ce36485f 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl | |||
@@ -266,7 +266,7 @@ X!Ekernel/module.c | |||
266 | <chapter id="hardware"> | 266 | <chapter id="hardware"> |
267 | <title>Hardware Interfaces</title> | 267 | <title>Hardware Interfaces</title> |
268 | <sect1><title>Interrupt Handling</title> | 268 | <sect1><title>Interrupt Handling</title> |
269 | !Iarch/i386/kernel/irq.c | 269 | !Ikernel/irq/manage.c |
270 | </sect1> | 270 | </sect1> |
271 | 271 | ||
272 | <sect1><title>Resources Management</title> | 272 | <sect1><title>Resources Management</title> |
diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl index 6df1dfd18b65..375ae760dc1e 100644 --- a/Documentation/DocBook/libata.tmpl +++ b/Documentation/DocBook/libata.tmpl | |||
@@ -84,6 +84,14 @@ void (*port_disable) (struct ata_port *); | |||
84 | Called from ata_bus_probe() and ata_bus_reset() error paths, | 84 | Called from ata_bus_probe() and ata_bus_reset() error paths, |
85 | as well as when unregistering from the SCSI module (rmmod, hot | 85 | as well as when unregistering from the SCSI module (rmmod, hot |
86 | unplug). | 86 | unplug). |
87 | This function should do whatever needs to be done to take the | ||
88 | port out of use. In most cases, ata_port_disable() can be used | ||
89 | as this hook. | ||
90 | </para> | ||
91 | <para> | ||
92 | Called from ata_bus_probe() on a failed probe. | ||
93 | Called from ata_bus_reset() on a failed bus reset. | ||
94 | Called from ata_scsi_release(). | ||
87 | </para> | 95 | </para> |
88 | 96 | ||
89 | </sect2> | 97 | </sect2> |
@@ -98,6 +106,13 @@ void (*dev_config) (struct ata_port *, struct ata_device *); | |||
98 | found. Typically used to apply device-specific fixups prior to | 106 | found. Typically used to apply device-specific fixups prior to |
99 | issue of SET FEATURES - XFER MODE, and prior to operation. | 107 | issue of SET FEATURES - XFER MODE, and prior to operation. |
100 | </para> | 108 | </para> |
109 | <para> | ||
110 | Called by ata_device_add() after ata_dev_identify() determines | ||
111 | a device is present. | ||
112 | </para> | ||
113 | <para> | ||
114 | This entry may be specified as NULL in ata_port_operations. | ||
115 | </para> | ||
101 | 116 | ||
102 | </sect2> | 117 | </sect2> |
103 | 118 | ||
@@ -135,6 +150,8 @@ void (*tf_read) (struct ata_port *ap, struct ata_taskfile *tf); | |||
135 | registers / DMA buffers. ->tf_read() is called to read the | 150 | registers / DMA buffers. ->tf_read() is called to read the |
136 | hardware registers / DMA buffers, to obtain the current set of | 151 | hardware registers / DMA buffers, to obtain the current set of |
137 | taskfile register values. | 152 | taskfile register values. |
153 | Most drivers for taskfile-based hardware (PIO or MMIO) use | ||
154 | ata_tf_load() and ata_tf_read() for these hooks. | ||
138 | </para> | 155 | </para> |
139 | 156 | ||
140 | </sect2> | 157 | </sect2> |
@@ -147,6 +164,8 @@ void (*exec_command)(struct ata_port *ap, struct ata_taskfile *tf); | |||
147 | <para> | 164 | <para> |
148 | causes an ATA command, previously loaded with | 165 | causes an ATA command, previously loaded with |
149 | ->tf_load(), to be initiated in hardware. | 166 | ->tf_load(), to be initiated in hardware. |
167 | Most drivers for taskfile-based hardware use ata_exec_command() | ||
168 | for this hook. | ||
150 | </para> | 169 | </para> |
151 | 170 | ||
152 | </sect2> | 171 | </sect2> |
@@ -161,6 +180,10 @@ Allow low-level driver to filter ATA PACKET commands, returning a status | |||
161 | indicating whether or not it is OK to use DMA for the supplied PACKET | 180 | indicating whether or not it is OK to use DMA for the supplied PACKET |
162 | command. | 181 | command. |
163 | </para> | 182 | </para> |
183 | <para> | ||
184 | This hook may be specified as NULL, in which case libata will | ||
185 | assume that atapi dma can be supported. | ||
186 | </para> | ||
164 | 187 | ||
165 | </sect2> | 188 | </sect2> |
166 | 189 | ||
@@ -175,6 +198,14 @@ u8 (*check_err)(struct ata_port *ap); | |||
175 | Reads the Status/AltStatus/Error ATA shadow register from | 198 | Reads the Status/AltStatus/Error ATA shadow register from |
176 | hardware. On some hardware, reading the Status register has | 199 | hardware. On some hardware, reading the Status register has |
177 | the side effect of clearing the interrupt condition. | 200 | the side effect of clearing the interrupt condition. |
201 | Most drivers for taskfile-based hardware use | ||
202 | ata_check_status() for this hook. | ||
203 | </para> | ||
204 | <para> | ||
205 | Note that because this is called from ata_device_add(), at | ||
206 | least a dummy function that clears device interrupts must be | ||
207 | provided for all drivers, even if the controller doesn't | ||
208 | actually have a taskfile status register. | ||
178 | </para> | 209 | </para> |
179 | 210 | ||
180 | </sect2> | 211 | </sect2> |
@@ -188,7 +219,13 @@ void (*dev_select)(struct ata_port *ap, unsigned int device); | |||
188 | Issues the low-level hardware command(s) that causes one of N | 219 | Issues the low-level hardware command(s) that causes one of N |
189 | hardware devices to be considered 'selected' (active and | 220 | hardware devices to be considered 'selected' (active and |
190 | available for use) on the ATA bus. This generally has no | 221 | available for use) on the ATA bus. This generally has no |
191 | meaning on FIS-based devices. | 222 | meaning on FIS-based devices. |
223 | </para> | ||
224 | <para> | ||
225 | Most drivers for taskfile-based hardware use | ||
226 | ata_std_dev_select() for this hook. Controllers which do not | ||
227 | support second drives on a port (such as SATA contollers) will | ||
228 | use ata_noop_dev_select(). | ||
192 | </para> | 229 | </para> |
193 | 230 | ||
194 | </sect2> | 231 | </sect2> |
@@ -204,6 +241,8 @@ void (*phy_reset) (struct ata_port *ap); | |||
204 | for device presence (PATA and SATA), typically a soft reset | 241 | for device presence (PATA and SATA), typically a soft reset |
205 | (SRST) will be performed. Drivers typically use the helper | 242 | (SRST) will be performed. Drivers typically use the helper |
206 | functions ata_bus_reset() or sata_phy_reset() for this hook. | 243 | functions ata_bus_reset() or sata_phy_reset() for this hook. |
244 | Many SATA drivers use sata_phy_reset() or call it from within | ||
245 | their own phy_reset() functions. | ||
207 | </para> | 246 | </para> |
208 | 247 | ||
209 | </sect2> | 248 | </sect2> |
@@ -227,6 +266,25 @@ PCI IDE DMA Status register. | |||
227 | These hooks are typically either no-ops, or simply not implemented, in | 266 | These hooks are typically either no-ops, or simply not implemented, in |
228 | FIS-based drivers. | 267 | FIS-based drivers. |
229 | </para> | 268 | </para> |
269 | <para> | ||
270 | Most legacy IDE drivers use ata_bmdma_setup() for the bmdma_setup() | ||
271 | hook. ata_bmdma_setup() will write the pointer to the PRD table to | ||
272 | the IDE PRD Table Address register, enable DMA in the DMA Command | ||
273 | register, and call exec_command() to begin the transfer. | ||
274 | </para> | ||
275 | <para> | ||
276 | Most legacy IDE drivers use ata_bmdma_start() for the bmdma_start() | ||
277 | hook. ata_bmdma_start() will write the ATA_DMA_START flag to the DMA | ||
278 | Command register. | ||
279 | </para> | ||
280 | <para> | ||
281 | Many legacy IDE drivers use ata_bmdma_stop() for the bmdma_stop() | ||
282 | hook. ata_bmdma_stop() clears the ATA_DMA_START flag in the DMA | ||
283 | command register. | ||
284 | </para> | ||
285 | <para> | ||
286 | Many legacy IDE drivers use ata_bmdma_status() as the bmdma_status() hook. | ||
287 | </para> | ||
230 | 288 | ||
231 | </sect2> | 289 | </sect2> |
232 | 290 | ||
@@ -250,6 +308,10 @@ int (*qc_issue) (struct ata_queued_cmd *qc); | |||
250 | helper function ata_qc_issue_prot() for taskfile protocol-based | 308 | helper function ata_qc_issue_prot() for taskfile protocol-based |
251 | dispatch. More advanced drivers implement their own ->qc_issue. | 309 | dispatch. More advanced drivers implement their own ->qc_issue. |
252 | </para> | 310 | </para> |
311 | <para> | ||
312 | ata_qc_issue_prot() calls ->tf_load(), ->bmdma_setup(), and | ||
313 | ->bmdma_start() as necessary to initiate a transfer. | ||
314 | </para> | ||
253 | 315 | ||
254 | </sect2> | 316 | </sect2> |
255 | 317 | ||
@@ -279,6 +341,21 @@ void (*irq_clear) (struct ata_port *); | |||
279 | before the interrupt handler is registered, to be sure hardware | 341 | before the interrupt handler is registered, to be sure hardware |
280 | is quiet. | 342 | is quiet. |
281 | </para> | 343 | </para> |
344 | <para> | ||
345 | The second argument, dev_instance, should be cast to a pointer | ||
346 | to struct ata_host_set. | ||
347 | </para> | ||
348 | <para> | ||
349 | Most legacy IDE drivers use ata_interrupt() for the | ||
350 | irq_handler hook, which scans all ports in the host_set, | ||
351 | determines which queued command was active (if any), and calls | ||
352 | ata_host_intr(ap,qc). | ||
353 | </para> | ||
354 | <para> | ||
355 | Most legacy IDE drivers use ata_bmdma_irq_clear() for the | ||
356 | irq_clear() hook, which simply clears the interrupt and error | ||
357 | flags in the DMA status register. | ||
358 | </para> | ||
282 | 359 | ||
283 | </sect2> | 360 | </sect2> |
284 | 361 | ||
@@ -292,6 +369,7 @@ void (*scr_write) (struct ata_port *ap, unsigned int sc_reg, | |||
292 | <para> | 369 | <para> |
293 | Read and write standard SATA phy registers. Currently only used | 370 | Read and write standard SATA phy registers. Currently only used |
294 | if ->phy_reset hook called the sata_phy_reset() helper function. | 371 | if ->phy_reset hook called the sata_phy_reset() helper function. |
372 | sc_reg is one of SCR_STATUS, SCR_CONTROL, SCR_ERROR, or SCR_ACTIVE. | ||
295 | </para> | 373 | </para> |
296 | 374 | ||
297 | </sect2> | 375 | </sect2> |
@@ -307,17 +385,29 @@ void (*host_stop) (struct ata_host_set *host_set); | |||
307 | ->port_start() is called just after the data structures for each | 385 | ->port_start() is called just after the data structures for each |
308 | port are initialized. Typically this is used to alloc per-port | 386 | port are initialized. Typically this is used to alloc per-port |
309 | DMA buffers / tables / rings, enable DMA engines, and similar | 387 | DMA buffers / tables / rings, enable DMA engines, and similar |
310 | tasks. | 388 | tasks. Some drivers also use this entry point as a chance to |
389 | allocate driver-private memory for ap->private_data. | ||
390 | </para> | ||
391 | <para> | ||
392 | Many drivers use ata_port_start() as this hook or call | ||
393 | it from their own port_start() hooks. ata_port_start() | ||
394 | allocates space for a legacy IDE PRD table and returns. | ||
311 | </para> | 395 | </para> |
312 | <para> | 396 | <para> |
313 | ->port_stop() is called after ->host_stop(). It's sole function | 397 | ->port_stop() is called after ->host_stop(). It's sole function |
314 | is to release DMA/memory resources, now that they are no longer | 398 | is to release DMA/memory resources, now that they are no longer |
315 | actively being used. | 399 | actively being used. Many drivers also free driver-private |
400 | data from port at this time. | ||
401 | </para> | ||
402 | <para> | ||
403 | Many drivers use ata_port_stop() as this hook, which frees the | ||
404 | PRD table. | ||
316 | </para> | 405 | </para> |
317 | <para> | 406 | <para> |
318 | ->host_stop() is called after all ->port_stop() calls | 407 | ->host_stop() is called after all ->port_stop() calls |
319 | have completed. The hook must finalize hardware shutdown, release DMA | 408 | have completed. The hook must finalize hardware shutdown, release DMA |
320 | and other resources, etc. | 409 | and other resources, etc. |
410 | This hook may be specified as NULL, in which case it is not called. | ||
321 | </para> | 411 | </para> |
322 | 412 | ||
323 | </sect2> | 413 | </sect2> |
diff --git a/Documentation/DocBook/stylesheet.xsl b/Documentation/DocBook/stylesheet.xsl index e14c21dda403..64be9f7ee3bb 100644 --- a/Documentation/DocBook/stylesheet.xsl +++ b/Documentation/DocBook/stylesheet.xsl | |||
@@ -2,4 +2,5 @@ | |||
2 | <stylesheet xmlns="http://www.w3.org/1999/XSL/Transform" version="1.0"> | 2 | <stylesheet xmlns="http://www.w3.org/1999/XSL/Transform" version="1.0"> |
3 | <param name="chunk.quietly">1</param> | 3 | <param name="chunk.quietly">1</param> |
4 | <param name="funcsynopsis.style">ansi</param> | 4 | <param name="funcsynopsis.style">ansi</param> |
5 | <param name="funcsynopsis.tabular.threshold">80</param> | ||
5 | </stylesheet> | 6 | </stylesheet> |
diff --git a/Documentation/IPMI.txt b/Documentation/IPMI.txt index 90d10e708ca3..84d3d4d10c17 100644 --- a/Documentation/IPMI.txt +++ b/Documentation/IPMI.txt | |||
@@ -25,9 +25,10 @@ subject and I can't cover it all here! | |||
25 | Configuration | 25 | Configuration |
26 | ------------- | 26 | ------------- |
27 | 27 | ||
28 | The LinuxIPMI driver is modular, which means you have to pick several | 28 | The Linux IPMI driver is modular, which means you have to pick several |
29 | things to have it work right depending on your hardware. Most of | 29 | things to have it work right depending on your hardware. Most of |
30 | these are available in the 'Character Devices' menu. | 30 | these are available in the 'Character Devices' menu then the IPMI |
31 | menu. | ||
31 | 32 | ||
32 | No matter what, you must pick 'IPMI top-level message handler' to use | 33 | No matter what, you must pick 'IPMI top-level message handler' to use |
33 | IPMI. What you do beyond that depends on your needs and hardware. | 34 | IPMI. What you do beyond that depends on your needs and hardware. |
@@ -35,33 +36,30 @@ IPMI. What you do beyond that depends on your needs and hardware. | |||
35 | The message handler does not provide any user-level interfaces. | 36 | The message handler does not provide any user-level interfaces. |
36 | Kernel code (like the watchdog) can still use it. If you need access | 37 | Kernel code (like the watchdog) can still use it. If you need access |
37 | from userland, you need to select 'Device interface for IPMI' if you | 38 | from userland, you need to select 'Device interface for IPMI' if you |
38 | want access through a device driver. Another interface is also | 39 | want access through a device driver. |
39 | available, you may select 'IPMI sockets' in the 'Networking Support' | 40 | |
40 | main menu. This provides a socket interface to IPMI. You may select | 41 | The driver interface depends on your hardware. If your system |
41 | both of these at the same time, they will both work together. | 42 | properly provides the SMBIOS info for IPMI, the driver will detect it |
42 | 43 | and just work. If you have a board with a standard interface (These | |
43 | The driver interface depends on your hardware. If you have a board | 44 | will generally be either "KCS", "SMIC", or "BT", consult your hardware |
44 | with a standard interface (These will generally be either "KCS", | 45 | manual), choose the 'IPMI SI handler' option. A driver also exists |
45 | "SMIC", or "BT", consult your hardware manual), choose the 'IPMI SI | 46 | for direct I2C access to the IPMI management controller. Some boards |
46 | handler' option. A driver also exists for direct I2C access to the | 47 | support this, but it is unknown if it will work on every board. For |
47 | IPMI management controller. Some boards support this, but it is | 48 | this, choose 'IPMI SMBus handler', but be ready to try to do some |
48 | unknown if it will work on every board. For this, choose 'IPMI SMBus | 49 | figuring to see if it will work on your system if the SMBIOS/APCI |
49 | handler', but be ready to try to do some figuring to see if it will | 50 | information is wrong or not present. It is fairly safe to have both |
50 | work. | 51 | these enabled and let the drivers auto-detect what is present. |
51 | |||
52 | There is also a KCS-only driver interface supplied, but it is | ||
53 | depracated in favor of the SI interface. | ||
54 | 52 | ||
55 | You should generally enable ACPI on your system, as systems with IPMI | 53 | You should generally enable ACPI on your system, as systems with IPMI |
56 | should have ACPI tables describing them. | 54 | can have ACPI tables describing them. |
57 | 55 | ||
58 | If you have a standard interface and the board manufacturer has done | 56 | If you have a standard interface and the board manufacturer has done |
59 | their job correctly, the IPMI controller should be automatically | 57 | their job correctly, the IPMI controller should be automatically |
60 | detect (via ACPI or SMBIOS tables) and should just work. Sadly, many | 58 | detected (via ACPI or SMBIOS tables) and should just work. Sadly, |
61 | boards do not have this information. The driver attempts standard | 59 | many boards do not have this information. The driver attempts |
62 | defaults, but they may not work. If you fall into this situation, you | 60 | standard defaults, but they may not work. If you fall into this |
63 | need to read the section below named 'The SI Driver' on how to | 61 | situation, you need to read the section below named 'The SI Driver' or |
64 | hand-configure your system. | 62 | "The SMBus Driver" on how to hand-configure your system. |
65 | 63 | ||
66 | IPMI defines a standard watchdog timer. You can enable this with the | 64 | IPMI defines a standard watchdog timer. You can enable this with the |
67 | 'IPMI Watchdog Timer' config option. If you compile the driver into | 65 | 'IPMI Watchdog Timer' config option. If you compile the driver into |
@@ -73,6 +71,18 @@ closed (by default it is disabled on close). Go into the 'Watchdog | |||
73 | Cards' menu, enable 'Watchdog Timer Support', and enable the option | 71 | Cards' menu, enable 'Watchdog Timer Support', and enable the option |
74 | 'Disable watchdog shutdown on close'. | 72 | 'Disable watchdog shutdown on close'. |
75 | 73 | ||
74 | IPMI systems can often be powered off using IPMI commands. Select | ||
75 | 'IPMI Poweroff' to do this. The driver will auto-detect if the system | ||
76 | can be powered off by IPMI. It is safe to enable this even if your | ||
77 | system doesn't support this option. This works on ATCA systems, the | ||
78 | Radisys CPI1 card, and any IPMI system that supports standard chassis | ||
79 | management commands. | ||
80 | |||
81 | If you want the driver to put an event into the event log on a panic, | ||
82 | enable the 'Generate a panic event to all BMCs on a panic' option. If | ||
83 | you want the whole panic string put into the event log using OEM | ||
84 | events, enable the 'Generate OEM events containing the panic string' | ||
85 | option. | ||
76 | 86 | ||
77 | Basic Design | 87 | Basic Design |
78 | ------------ | 88 | ------------ |
@@ -80,7 +90,7 @@ Basic Design | |||
80 | The Linux IPMI driver is designed to be very modular and flexible, you | 90 | The Linux IPMI driver is designed to be very modular and flexible, you |
81 | only need to take the pieces you need and you can use it in many | 91 | only need to take the pieces you need and you can use it in many |
82 | different ways. Because of that, it's broken into many chunks of | 92 | different ways. Because of that, it's broken into many chunks of |
83 | code. These chunks are: | 93 | code. These chunks (by module name) are: |
84 | 94 | ||
85 | ipmi_msghandler - This is the central piece of software for the IPMI | 95 | ipmi_msghandler - This is the central piece of software for the IPMI |
86 | system. It handles all messages, message timing, and responses. The | 96 | system. It handles all messages, message timing, and responses. The |
@@ -93,18 +103,26 @@ ipmi_devintf - This provides a userland IOCTL interface for the IPMI | |||
93 | driver, each open file for this device ties in to the message handler | 103 | driver, each open file for this device ties in to the message handler |
94 | as an IPMI user. | 104 | as an IPMI user. |
95 | 105 | ||
96 | ipmi_si - A driver for various system interfaces. This supports | 106 | ipmi_si - A driver for various system interfaces. This supports KCS, |
97 | KCS, SMIC, and may support BT in the future. Unless you have your own | 107 | SMIC, and BT interfaces. Unless you have an SMBus interface or your |
98 | custom interface, you probably need to use this. | 108 | own custom interface, you probably need to use this. |
99 | 109 | ||
100 | ipmi_smb - A driver for accessing BMCs on the SMBus. It uses the | 110 | ipmi_smb - A driver for accessing BMCs on the SMBus. It uses the |
101 | I2C kernel driver's SMBus interfaces to send and receive IPMI messages | 111 | I2C kernel driver's SMBus interfaces to send and receive IPMI messages |
102 | over the SMBus. | 112 | over the SMBus. |
103 | 113 | ||
104 | af_ipmi - A network socket interface to IPMI. This doesn't take up | 114 | ipmi_watchdog - IPMI requires systems to have a very capable watchdog |
105 | a character device in your system. | 115 | timer. This driver implements the standard Linux watchdog timer |
116 | interface on top of the IPMI message handler. | ||
117 | |||
118 | ipmi_poweroff - Some systems support the ability to be turned off via | ||
119 | IPMI commands. | ||
106 | 120 | ||
107 | Note that the KCS-only interface ahs been removed. | 121 | These are all individually selectable via configuration options. |
122 | |||
123 | Note that the KCS-only interface has been removed. The af_ipmi driver | ||
124 | is no longer supported and has been removed because it was impossible | ||
125 | to do 32 bit emulation on 64-bit kernels with it. | ||
108 | 126 | ||
109 | Much documentation for the interface is in the include files. The | 127 | Much documentation for the interface is in the include files. The |
110 | IPMI include files are: | 128 | IPMI include files are: |
@@ -424,7 +442,7 @@ at module load time (for a module) with: | |||
424 | modprobe ipmi_smb.o | 442 | modprobe ipmi_smb.o |
425 | addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]] | 443 | addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]] |
426 | dbg=<flags1>,<flags2>... | 444 | dbg=<flags1>,<flags2>... |
427 | [defaultprobe=0] [dbg_probe=1] | 445 | [defaultprobe=1] [dbg_probe=1] |
428 | 446 | ||
429 | The addresses are specified in pairs, the first is the adapter ID and the | 447 | The addresses are specified in pairs, the first is the adapter ID and the |
430 | second is the I2C address on that adapter. | 448 | second is the I2C address on that adapter. |
@@ -532,3 +550,67 @@ Once you open the watchdog timer, you must write a 'V' character to the | |||
532 | device to close it, or the timer will not stop. This is a new semantic | 550 | device to close it, or the timer will not stop. This is a new semantic |
533 | for the driver, but makes it consistent with the rest of the watchdog | 551 | for the driver, but makes it consistent with the rest of the watchdog |
534 | drivers in Linux. | 552 | drivers in Linux. |
553 | |||
554 | |||
555 | Panic Timeouts | ||
556 | -------------- | ||
557 | |||
558 | The OpenIPMI driver supports the ability to put semi-custom and custom | ||
559 | events in the system event log if a panic occurs. if you enable the | ||
560 | 'Generate a panic event to all BMCs on a panic' option, you will get | ||
561 | one event on a panic in a standard IPMI event format. If you enable | ||
562 | the 'Generate OEM events containing the panic string' option, you will | ||
563 | also get a bunch of OEM events holding the panic string. | ||
564 | |||
565 | |||
566 | The field settings of the events are: | ||
567 | * Generator ID: 0x21 (kernel) | ||
568 | * EvM Rev: 0x03 (this event is formatting in IPMI 1.0 format) | ||
569 | * Sensor Type: 0x20 (OS critical stop sensor) | ||
570 | * Sensor #: The first byte of the panic string (0 if no panic string) | ||
571 | * Event Dir | Event Type: 0x6f (Assertion, sensor-specific event info) | ||
572 | * Event Data 1: 0xa1 (Runtime stop in OEM bytes 2 and 3) | ||
573 | * Event data 2: second byte of panic string | ||
574 | * Event data 3: third byte of panic string | ||
575 | See the IPMI spec for the details of the event layout. This event is | ||
576 | always sent to the local management controller. It will handle routing | ||
577 | the message to the right place | ||
578 | |||
579 | Other OEM events have the following format: | ||
580 | Record ID (bytes 0-1): Set by the SEL. | ||
581 | Record type (byte 2): 0xf0 (OEM non-timestamped) | ||
582 | byte 3: The slave address of the card saving the panic | ||
583 | byte 4: A sequence number (starting at zero) | ||
584 | The rest of the bytes (11 bytes) are the panic string. If the panic string | ||
585 | is longer than 11 bytes, multiple messages will be sent with increasing | ||
586 | sequence numbers. | ||
587 | |||
588 | Because you cannot send OEM events using the standard interface, this | ||
589 | function will attempt to find an SEL and add the events there. It | ||
590 | will first query the capabilities of the local management controller. | ||
591 | If it has an SEL, then they will be stored in the SEL of the local | ||
592 | management controller. If not, and the local management controller is | ||
593 | an event generator, the event receiver from the local management | ||
594 | controller will be queried and the events sent to the SEL on that | ||
595 | device. Otherwise, the events go nowhere since there is nowhere to | ||
596 | send them. | ||
597 | |||
598 | |||
599 | Poweroff | ||
600 | -------- | ||
601 | |||
602 | If the poweroff capability is selected, the IPMI driver will install | ||
603 | a shutdown function into the standard poweroff function pointer. This | ||
604 | is in the ipmi_poweroff module. When the system requests a powerdown, | ||
605 | it will send the proper IPMI commands to do this. This is supported on | ||
606 | several platforms. | ||
607 | |||
608 | There is a module parameter named "poweroff_control" that may either be zero | ||
609 | (do a power down) or 2 (do a power cycle, power the system off, then power | ||
610 | it on in a few seconds). Setting ipmi_poweroff.poweroff_control=x will do | ||
611 | the same thing on the kernel command line. The parameter is also available | ||
612 | via the proc filesystem in /proc/ipmi/poweroff_control. Note that if the | ||
613 | system does not support power cycling, it will always to the power off. | ||
614 | |||
615 | Note that if you have ACPI enabled, the system will prefer using ACPI to | ||
616 | power off. | ||
diff --git a/Documentation/SubmittingDrivers b/Documentation/SubmittingDrivers index de3b252e717d..c3cca924e94b 100644 --- a/Documentation/SubmittingDrivers +++ b/Documentation/SubmittingDrivers | |||
@@ -13,13 +13,14 @@ Allocating Device Numbers | |||
13 | ------------------------- | 13 | ------------------------- |
14 | 14 | ||
15 | Major and minor numbers for block and character devices are allocated | 15 | Major and minor numbers for block and character devices are allocated |
16 | by the Linux assigned name and number authority (currently better | 16 | by the Linux assigned name and number authority (currently this is |
17 | known as H Peter Anvin). The site is http://www.lanana.org/. This | 17 | Torben Mathiasen). The site is http://www.lanana.org/. This |
18 | also deals with allocating numbers for devices that are not going to | 18 | also deals with allocating numbers for devices that are not going to |
19 | be submitted to the mainstream kernel. | 19 | be submitted to the mainstream kernel. |
20 | See Documentation/devices.txt for more information on this. | ||
20 | 21 | ||
21 | If you don't use assigned numbers then when you device is submitted it will | 22 | If you don't use assigned numbers then when your device is submitted it will |
22 | get given an assigned number even if that is different from values you may | 23 | be given an assigned number even if that is different from values you may |
23 | have shipped to customers before. | 24 | have shipped to customers before. |
24 | 25 | ||
25 | Who To Submit Drivers To | 26 | Who To Submit Drivers To |
@@ -32,7 +33,8 @@ Linux 2.2: | |||
32 | If the code area has a general maintainer then please submit it to | 33 | If the code area has a general maintainer then please submit it to |
33 | the maintainer listed in MAINTAINERS in the kernel file. If the | 34 | the maintainer listed in MAINTAINERS in the kernel file. If the |
34 | maintainer does not respond or you cannot find the appropriate | 35 | maintainer does not respond or you cannot find the appropriate |
35 | maintainer then please contact Alan Cox <alan@lxorguk.ukuu.org.uk> | 36 | maintainer then please contact the 2.2 kernel maintainer: |
37 | Marc-Christian Petersen <m.c.p@wolk-project.de>. | ||
36 | 38 | ||
37 | Linux 2.4: | 39 | Linux 2.4: |
38 | The same rules apply as 2.2. The final contact point for Linux 2.4 | 40 | The same rules apply as 2.2. The final contact point for Linux 2.4 |
@@ -48,7 +50,7 @@ What Criteria Determine Acceptance | |||
48 | 50 | ||
49 | Licensing: The code must be released to us under the | 51 | Licensing: The code must be released to us under the |
50 | GNU General Public License. We don't insist on any kind | 52 | GNU General Public License. We don't insist on any kind |
51 | of exclusively GPL licensing, and if you wish the driver | 53 | of exclusive GPL licensing, and if you wish the driver |
52 | to be useful to other communities such as BSD you may well | 54 | to be useful to other communities such as BSD you may well |
53 | wish to release under multiple licenses. | 55 | wish to release under multiple licenses. |
54 | 56 | ||
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 4d35562b1cf9..7f43b040311e 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches | |||
@@ -35,7 +35,7 @@ not in any lower subdirectory. | |||
35 | 35 | ||
36 | To create a patch for a single file, it is often sufficient to do: | 36 | To create a patch for a single file, it is often sufficient to do: |
37 | 37 | ||
38 | SRCTREE= linux-2.4 | 38 | SRCTREE= linux-2.6 |
39 | MYFILE= drivers/net/mydriver.c | 39 | MYFILE= drivers/net/mydriver.c |
40 | 40 | ||
41 | cd $SRCTREE | 41 | cd $SRCTREE |
@@ -48,17 +48,18 @@ To create a patch for multiple files, you should unpack a "vanilla", | |||
48 | or unmodified kernel source tree, and generate a diff against your | 48 | or unmodified kernel source tree, and generate a diff against your |
49 | own source tree. For example: | 49 | own source tree. For example: |
50 | 50 | ||
51 | MYSRC= /devel/linux-2.4 | 51 | MYSRC= /devel/linux-2.6 |
52 | 52 | ||
53 | tar xvfz linux-2.4.0-test11.tar.gz | 53 | tar xvfz linux-2.6.12.tar.gz |
54 | mv linux linux-vanilla | 54 | mv linux-2.6.12 linux-2.6.12-vanilla |
55 | wget http://www.moses.uklinux.net/patches/dontdiff | 55 | diff -uprN -X linux-2.6.12-vanilla/Documentation/dontdiff \ |
56 | diff -uprN -X dontdiff linux-vanilla $MYSRC > /tmp/patch | 56 | linux-2.6.12-vanilla $MYSRC > /tmp/patch |
57 | rm -f dontdiff | ||
58 | 57 | ||
59 | "dontdiff" is a list of files which are generated by the kernel during | 58 | "dontdiff" is a list of files which are generated by the kernel during |
60 | the build process, and should be ignored in any diff(1)-generated | 59 | the build process, and should be ignored in any diff(1)-generated |
61 | patch. dontdiff is maintained by Tigran Aivazian <tigran@veritas.com> | 60 | patch. The "dontdiff" file is included in the kernel tree in |
61 | 2.6.12 and later. For earlier kernel versions, you can get it | ||
62 | from <http://www.xenotime.net/linux/doc/dontdiff>. | ||
62 | 63 | ||
63 | Make sure your patch does not include any extra files which do not | 64 | Make sure your patch does not include any extra files which do not |
64 | belong in a patch submission. Make sure to review your patch -after- | 65 | belong in a patch submission. Make sure to review your patch -after- |
@@ -66,18 +67,20 @@ generated it with diff(1), to ensure accuracy. | |||
66 | 67 | ||
67 | If your changes produce a lot of deltas, you may want to look into | 68 | If your changes produce a lot of deltas, you may want to look into |
68 | splitting them into individual patches which modify things in | 69 | splitting them into individual patches which modify things in |
69 | logical stages, this will facilitate easier reviewing by other | 70 | logical stages. This will facilitate easier reviewing by other |
70 | kernel developers, very important if you want your patch accepted. | 71 | kernel developers, very important if you want your patch accepted. |
71 | There are a number of scripts which can aid in this; | 72 | There are a number of scripts which can aid in this: |
72 | 73 | ||
73 | Quilt: | 74 | Quilt: |
74 | http://savannah.nongnu.org/projects/quilt | 75 | http://savannah.nongnu.org/projects/quilt |
75 | 76 | ||
76 | Randy Dunlap's patch scripts: | 77 | Randy Dunlap's patch scripts: |
77 | http://developer.osdl.org/rddunlap/scripts/patching-scripts.tgz | 78 | http://www.xenotime.net/linux/scripts/patching-scripts-002.tar.gz |
78 | 79 | ||
79 | Andrew Morton's patch scripts: | 80 | Andrew Morton's patch scripts: |
80 | http://www.zip.com.au/~akpm/linux/patches/patch-scripts-0.16 | 81 | http://www.zip.com.au/~akpm/linux/patches/patch-scripts-0.20 |
82 | |||
83 | |||
81 | 84 | ||
82 | 2) Describe your changes. | 85 | 2) Describe your changes. |
83 | 86 | ||
@@ -132,21 +135,6 @@ which require discussion or do not have a clear advantage should | |||
132 | usually be sent first to linux-kernel. Only after the patch is | 135 | usually be sent first to linux-kernel. Only after the patch is |
133 | discussed should the patch then be submitted to Linus. | 136 | discussed should the patch then be submitted to Linus. |
134 | 137 | ||
135 | For small patches you may want to CC the Trivial Patch Monkey | ||
136 | trivial@rustcorp.com.au set up by Rusty Russell; which collects "trivial" | ||
137 | patches. Trivial patches must qualify for one of the following rules: | ||
138 | Spelling fixes in documentation | ||
139 | Spelling fixes which could break grep(1). | ||
140 | Warning fixes (cluttering with useless warnings is bad) | ||
141 | Compilation fixes (only if they are actually correct) | ||
142 | Runtime fixes (only if they actually fix things) | ||
143 | Removing use of deprecated functions/macros (eg. check_region). | ||
144 | Contact detail and documentation fixes | ||
145 | Non-portable code replaced by portable code (even in arch-specific, | ||
146 | since people copy, as long as it's trivial) | ||
147 | Any fix by the author/maintainer of the file. (ie. patch monkey | ||
148 | in re-transmission mode) | ||
149 | |||
150 | 138 | ||
151 | 139 | ||
152 | 5) Select your CC (e-mail carbon copy) list. | 140 | 5) Select your CC (e-mail carbon copy) list. |
@@ -161,6 +149,11 @@ USB, framebuffer devices, the VFS, the SCSI subsystem, etc. See the | |||
161 | MAINTAINERS file for a mailing list that relates specifically to | 149 | MAINTAINERS file for a mailing list that relates specifically to |
162 | your change. | 150 | your change. |
163 | 151 | ||
152 | If changes affect userland-kernel interfaces, please send | ||
153 | the MAN-PAGES maintainer (as listed in the MAINTAINERS file) | ||
154 | a man-pages patch, or at least a notification of the change, | ||
155 | so that some information makes its way into the manual pages. | ||
156 | |||
164 | Even if the maintainer did not respond in step #4, make sure to ALWAYS | 157 | Even if the maintainer did not respond in step #4, make sure to ALWAYS |
165 | copy the maintainer when you change their code. | 158 | copy the maintainer when you change their code. |
166 | 159 | ||
@@ -178,6 +171,8 @@ patches. Trivial patches must qualify for one of the following rules: | |||
178 | since people copy, as long as it's trivial) | 171 | since people copy, as long as it's trivial) |
179 | Any fix by the author/maintainer of the file. (ie. patch monkey | 172 | Any fix by the author/maintainer of the file. (ie. patch monkey |
180 | in re-transmission mode) | 173 | in re-transmission mode) |
174 | URL: <http://www.kernel.org/pub/linux/kernel/people/rusty/trivial/> | ||
175 | |||
181 | 176 | ||
182 | 177 | ||
183 | 178 | ||
@@ -299,13 +294,24 @@ can certify the below: | |||
299 | 294 | ||
300 | then you just add a line saying | 295 | then you just add a line saying |
301 | 296 | ||
302 | Signed-off-by: Random J Developer <random@developer.org> | 297 | Signed-off-by: Random J Developer <random@developer.example.org> |
303 | 298 | ||
304 | Some people also put extra tags at the end. They'll just be ignored for | 299 | Some people also put extra tags at the end. They'll just be ignored for |
305 | now, but you can do this to mark internal company procedures or just | 300 | now, but you can do this to mark internal company procedures or just |
306 | point out some special detail about the sign-off. | 301 | point out some special detail about the sign-off. |
307 | 302 | ||
308 | 303 | ||
304 | |||
305 | 12) More references for submitting patches | ||
306 | |||
307 | Andrew Morton, "The perfect patch" (tpp). | ||
308 | <http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt> | ||
309 | |||
310 | Jeff Garzik, "Linux kernel patch submission format." | ||
311 | <http://linux.yyz.us/patch-format.html> | ||
312 | |||
313 | |||
314 | |||
309 | ----------------------------------- | 315 | ----------------------------------- |
310 | SECTION 2 - HINTS, TIPS, AND TRICKS | 316 | SECTION 2 - HINTS, TIPS, AND TRICKS |
311 | ----------------------------------- | 317 | ----------------------------------- |
@@ -374,7 +380,5 @@ and 'extern __inline__'. | |||
374 | 4) Don't over-design. | 380 | 4) Don't over-design. |
375 | 381 | ||
376 | Don't try to anticipate nebulous future cases which may or may not | 382 | Don't try to anticipate nebulous future cases which may or may not |
377 | be useful: "Make it as simple as you can, and no simpler" | 383 | be useful: "Make it as simple as you can, and no simpler." |
378 | |||
379 | |||
380 | 384 | ||
diff --git a/Documentation/acpi-hotkey.txt b/Documentation/acpi-hotkey.txt new file mode 100644 index 000000000000..0acdc80c30c2 --- /dev/null +++ b/Documentation/acpi-hotkey.txt | |||
@@ -0,0 +1,38 @@ | |||
1 | driver/acpi/hotkey.c implement: | ||
2 | 1. /proc/acpi/hotkey/event_config | ||
3 | (event based hotkey or event config interface): | ||
4 | a. add a event based hotkey(event) : | ||
5 | echo "0:bus::action:method:num:num" > event_config | ||
6 | |||
7 | b. delete a event based hotkey(event): | ||
8 | echo "1:::::num:num" > event_config | ||
9 | |||
10 | c. modify a event based hotkey(event): | ||
11 | echo "2:bus::action:method:num:num" > event_config | ||
12 | |||
13 | 2. /proc/acpi/hotkey/poll_config | ||
14 | (polling based hotkey or event config interface): | ||
15 | a.add a polling based hotkey(event) : | ||
16 | echo "0:bus:method:action:method:num" > poll_config | ||
17 | this adding command will create a proc file | ||
18 | /proc/acpi/hotkey/method, which is used to get | ||
19 | result of polling. | ||
20 | |||
21 | b.delete a polling based hotkey(event): | ||
22 | echo "1:::::num" > event_config | ||
23 | |||
24 | c.modify a polling based hotkey(event): | ||
25 | echo "2:bus:method:action:method:num" > poll_config | ||
26 | |||
27 | 3./proc/acpi/hotkey/action | ||
28 | (interface to call aml method associated with a | ||
29 | specific hotkey(event)) | ||
30 | echo "event_num:event_type:event_argument" > | ||
31 | /proc/acpi/hotkey/action. | ||
32 | The result of the execution of this aml method is | ||
33 | attached to /proc/acpi/hotkey/poll_method, which is dnyamically | ||
34 | created. Please use command "cat /proc/acpi/hotkey/polling_method" | ||
35 | to retrieve it. | ||
36 | |||
37 | Note: Use cmdline "acpi_generic_hotkey" to over-ride | ||
38 | loading any platform specific drivers. | ||
diff --git a/Documentation/arm/Samsung-S3C24XX/USB-Host.txt b/Documentation/arm/Samsung-S3C24XX/USB-Host.txt new file mode 100644 index 000000000000..b93b68e2b143 --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/USB-Host.txt | |||
@@ -0,0 +1,93 @@ | |||
1 | S3C24XX USB Host support | ||
2 | ======================== | ||
3 | |||
4 | |||
5 | |||
6 | Introduction | ||
7 | ------------ | ||
8 | |||
9 | This document details the S3C2410/S3C2440 in-built OHCI USB host support. | ||
10 | |||
11 | Configuration | ||
12 | ------------- | ||
13 | |||
14 | Enable at least the following kernel options: | ||
15 | |||
16 | menuconfig: | ||
17 | |||
18 | Device Drivers ---> | ||
19 | USB support ---> | ||
20 | <*> Support for Host-side USB | ||
21 | <*> OHCI HCD support | ||
22 | |||
23 | |||
24 | .config: | ||
25 | CONFIG_USB | ||
26 | CONFIG_USB_OHCI_HCD | ||
27 | |||
28 | |||
29 | Once these options are configured, the standard set of USB device | ||
30 | drivers can be configured and used. | ||
31 | |||
32 | |||
33 | Board Support | ||
34 | ------------- | ||
35 | |||
36 | The driver attaches to a platform device, which will need to be | ||
37 | added by the board specific support file in linux/arch/arm/mach-s3c2410, | ||
38 | such as mach-bast.c or mach-smdk2410.c | ||
39 | |||
40 | The platform device's platform_data field is only needed if the | ||
41 | board implements extra power control or over-current monitoring. | ||
42 | |||
43 | The OHCI driver does not ensure the state of the S3C2410's MISCCTRL | ||
44 | register, so if both ports are to be used for the host, then it is | ||
45 | the board support file's responsibility to ensure that the second | ||
46 | port is configured to be connected to the OHCI core. | ||
47 | |||
48 | |||
49 | Platform Data | ||
50 | ------------- | ||
51 | |||
52 | See linux/include/asm-arm/arch-s3c2410/usb-control.h for the | ||
53 | descriptions of the platform device data. An implementation | ||
54 | can be found in linux/arch/arm/mach-s3c2410/usb-simtec.c . | ||
55 | |||
56 | The `struct s3c2410_hcd_info` contains a pair of functions | ||
57 | that get called to enable over-current detection, and to | ||
58 | control the port power status. | ||
59 | |||
60 | The ports are numbered 0 and 1. | ||
61 | |||
62 | power_control: | ||
63 | |||
64 | Called to enable or disable the power on the port. | ||
65 | |||
66 | enable_oc: | ||
67 | |||
68 | Called to enable or disable the over-current monitoring. | ||
69 | This should claim or release the resources being used to | ||
70 | check the power condition on the port, such as an IRQ. | ||
71 | |||
72 | report_oc: | ||
73 | |||
74 | The OHCI driver fills this field in for the over-current code | ||
75 | to call when there is a change to the over-current state on | ||
76 | an port. The ports argument is a bitmask of 1 bit per port, | ||
77 | with bit X being 1 for an over-current on port X. | ||
78 | |||
79 | The function s3c2410_usb_report_oc() has been provided to | ||
80 | ensure this is called correctly. | ||
81 | |||
82 | port[x]: | ||
83 | |||
84 | This is struct describes each port, 0 or 1. The platform driver | ||
85 | should set the flags field of each port to S3C_HCDFLG_USED if | ||
86 | the port is enabled. | ||
87 | |||
88 | |||
89 | |||
90 | Document Author | ||
91 | --------------- | ||
92 | |||
93 | Ben Dooks, (c) 2005 Simtec Electronics | ||
diff --git a/Documentation/basic_profiling.txt b/Documentation/basic_profiling.txt index 65e3dc2d4437..8764e9f70821 100644 --- a/Documentation/basic_profiling.txt +++ b/Documentation/basic_profiling.txt | |||
@@ -27,9 +27,13 @@ dump output readprofile -m /boot/System.map > captured_profile | |||
27 | 27 | ||
28 | Oprofile | 28 | Oprofile |
29 | -------- | 29 | -------- |
30 | Get the source (I use 0.8) from http://oprofile.sourceforge.net/ | 30 | |
31 | and add "idle=poll" to the kernel command line | 31 | Get the source (see Changes for required version) from |
32 | http://oprofile.sourceforge.net/ and add "idle=poll" to the kernel command | ||
33 | line. | ||
34 | |||
32 | Configure with CONFIG_PROFILING=y and CONFIG_OPROFILE=y & reboot on new kernel | 35 | Configure with CONFIG_PROFILING=y and CONFIG_OPROFILE=y & reboot on new kernel |
36 | |||
33 | ./configure --with-kernel-support | 37 | ./configure --with-kernel-support |
34 | make install | 38 | make install |
35 | 39 | ||
@@ -46,7 +50,7 @@ start opcontrol --start | |||
46 | stop opcontrol --stop | 50 | stop opcontrol --stop |
47 | dump output opreport > output_file | 51 | dump output opreport > output_file |
48 | 52 | ||
49 | To only report on the kernel, run opreport /boot/vmlinux > output_file | 53 | To only report on the kernel, run opreport -l /boot/vmlinux > output_file |
50 | 54 | ||
51 | A reset is needed to clear old statistics, which survive a reboot. | 55 | A reset is needed to clear old statistics, which survive a reboot. |
52 | 56 | ||
diff --git a/Documentation/block/ioprio.txt b/Documentation/block/ioprio.txt new file mode 100644 index 000000000000..96ccf681075e --- /dev/null +++ b/Documentation/block/ioprio.txt | |||
@@ -0,0 +1,176 @@ | |||
1 | Block io priorities | ||
2 | =================== | ||
3 | |||
4 | |||
5 | Intro | ||
6 | ----- | ||
7 | |||
8 | With the introduction of cfq v3 (aka cfq-ts or time sliced cfq), basic io | ||
9 | priorities is supported for reads on files. This enables users to io nice | ||
10 | processes or process groups, similar to what has been possible to cpu | ||
11 | scheduling for ages. This document mainly details the current possibilites | ||
12 | with cfq, other io schedulers do not support io priorities so far. | ||
13 | |||
14 | Scheduling classes | ||
15 | ------------------ | ||
16 | |||
17 | CFQ implements three generic scheduling classes that determine how io is | ||
18 | served for a process. | ||
19 | |||
20 | IOPRIO_CLASS_RT: This is the realtime io class. This scheduling class is given | ||
21 | higher priority than any other in the system, processes from this class are | ||
22 | given first access to the disk every time. Thus it needs to be used with some | ||
23 | care, one io RT process can starve the entire system. Within the RT class, | ||
24 | there are 8 levels of class data that determine exactly how much time this | ||
25 | process needs the disk for on each service. In the future this might change | ||
26 | to be more directly mappable to performance, by passing in a wanted data | ||
27 | rate instead. | ||
28 | |||
29 | IOPRIO_CLASS_BE: This is the best-effort scheduling class, which is the default | ||
30 | for any process that hasn't set a specific io priority. The class data | ||
31 | determines how much io bandwidth the process will get, it's directly mappable | ||
32 | to the cpu nice levels just more coarsely implemented. 0 is the highest | ||
33 | BE prio level, 7 is the lowest. The mapping between cpu nice level and io | ||
34 | nice level is determined as: io_nice = (cpu_nice + 20) / 5. | ||
35 | |||
36 | IOPRIO_CLASS_IDLE: This is the idle scheduling class, processes running at this | ||
37 | level only get io time when no one else needs the disk. The idle class has no | ||
38 | class data, since it doesn't really apply here. | ||
39 | |||
40 | Tools | ||
41 | ----- | ||
42 | |||
43 | See below for a sample ionice tool. Usage: | ||
44 | |||
45 | # ionice -c<class> -n<level> -p<pid> | ||
46 | |||
47 | If pid isn't given, the current process is assumed. IO priority settings | ||
48 | are inherited on fork, so you can use ionice to start the process at a given | ||
49 | level: | ||
50 | |||
51 | # ionice -c2 -n0 /bin/ls | ||
52 | |||
53 | will run ls at the best-effort scheduling class at the highest priority. | ||
54 | For a running process, you can give the pid instead: | ||
55 | |||
56 | # ionice -c1 -n2 -p100 | ||
57 | |||
58 | will change pid 100 to run at the realtime scheduling class, at priority 2. | ||
59 | |||
60 | ---> snip ionice.c tool <--- | ||
61 | |||
62 | #include <stdio.h> | ||
63 | #include <stdlib.h> | ||
64 | #include <errno.h> | ||
65 | #include <getopt.h> | ||
66 | #include <unistd.h> | ||
67 | #include <sys/ptrace.h> | ||
68 | #include <asm/unistd.h> | ||
69 | |||
70 | extern int sys_ioprio_set(int, int, int); | ||
71 | extern int sys_ioprio_get(int, int); | ||
72 | |||
73 | #if defined(__i386__) | ||
74 | #define __NR_ioprio_set 289 | ||
75 | #define __NR_ioprio_get 290 | ||
76 | #elif defined(__ppc__) | ||
77 | #define __NR_ioprio_set 273 | ||
78 | #define __NR_ioprio_get 274 | ||
79 | #elif defined(__x86_64__) | ||
80 | #define __NR_ioprio_set 251 | ||
81 | #define __NR_ioprio_get 252 | ||
82 | #elif defined(__ia64__) | ||
83 | #define __NR_ioprio_set 1274 | ||
84 | #define __NR_ioprio_get 1275 | ||
85 | #else | ||
86 | #error "Unsupported arch" | ||
87 | #endif | ||
88 | |||
89 | _syscall3(int, ioprio_set, int, which, int, who, int, ioprio); | ||
90 | _syscall2(int, ioprio_get, int, which, int, who); | ||
91 | |||
92 | enum { | ||
93 | IOPRIO_CLASS_NONE, | ||
94 | IOPRIO_CLASS_RT, | ||
95 | IOPRIO_CLASS_BE, | ||
96 | IOPRIO_CLASS_IDLE, | ||
97 | }; | ||
98 | |||
99 | enum { | ||
100 | IOPRIO_WHO_PROCESS = 1, | ||
101 | IOPRIO_WHO_PGRP, | ||
102 | IOPRIO_WHO_USER, | ||
103 | }; | ||
104 | |||
105 | #define IOPRIO_CLASS_SHIFT 13 | ||
106 | |||
107 | const char *to_prio[] = { "none", "realtime", "best-effort", "idle", }; | ||
108 | |||
109 | int main(int argc, char *argv[]) | ||
110 | { | ||
111 | int ioprio = 4, set = 0, ioprio_class = IOPRIO_CLASS_BE; | ||
112 | int c, pid = 0; | ||
113 | |||
114 | while ((c = getopt(argc, argv, "+n:c:p:")) != EOF) { | ||
115 | switch (c) { | ||
116 | case 'n': | ||
117 | ioprio = strtol(optarg, NULL, 10); | ||
118 | set = 1; | ||
119 | break; | ||
120 | case 'c': | ||
121 | ioprio_class = strtol(optarg, NULL, 10); | ||
122 | set = 1; | ||
123 | break; | ||
124 | case 'p': | ||
125 | pid = strtol(optarg, NULL, 10); | ||
126 | break; | ||
127 | } | ||
128 | } | ||
129 | |||
130 | switch (ioprio_class) { | ||
131 | case IOPRIO_CLASS_NONE: | ||
132 | ioprio_class = IOPRIO_CLASS_BE; | ||
133 | break; | ||
134 | case IOPRIO_CLASS_RT: | ||
135 | case IOPRIO_CLASS_BE: | ||
136 | break; | ||
137 | case IOPRIO_CLASS_IDLE: | ||
138 | ioprio = 7; | ||
139 | break; | ||
140 | default: | ||
141 | printf("bad prio class %d\n", ioprio_class); | ||
142 | return 1; | ||
143 | } | ||
144 | |||
145 | if (!set) { | ||
146 | if (!pid && argv[optind]) | ||
147 | pid = strtol(argv[optind], NULL, 10); | ||
148 | |||
149 | ioprio = ioprio_get(IOPRIO_WHO_PROCESS, pid); | ||
150 | |||
151 | printf("pid=%d, %d\n", pid, ioprio); | ||
152 | |||
153 | if (ioprio == -1) | ||
154 | perror("ioprio_get"); | ||
155 | else { | ||
156 | ioprio_class = ioprio >> IOPRIO_CLASS_SHIFT; | ||
157 | ioprio = ioprio & 0xff; | ||
158 | printf("%s: prio %d\n", to_prio[ioprio_class], ioprio); | ||
159 | } | ||
160 | } else { | ||
161 | if (ioprio_set(IOPRIO_WHO_PROCESS, pid, ioprio | ioprio_class << IOPRIO_CLASS_SHIFT) == -1) { | ||
162 | perror("ioprio_set"); | ||
163 | return 1; | ||
164 | } | ||
165 | |||
166 | if (argv[optind]) | ||
167 | execvp(argv[optind], &argv[optind]); | ||
168 | } | ||
169 | |||
170 | return 0; | ||
171 | } | ||
172 | |||
173 | ---> snip ionice.c tool <--- | ||
174 | |||
175 | |||
176 | March 11 2005, Jens Axboe <axboe@suse.de> | ||
diff --git a/Documentation/cciss.txt b/Documentation/cciss.txt index d599beb9df8a..c8f9a73111da 100644 --- a/Documentation/cciss.txt +++ b/Documentation/cciss.txt | |||
@@ -17,6 +17,7 @@ This driver is known to work with the following cards: | |||
17 | * SA P600 | 17 | * SA P600 |
18 | * SA P800 | 18 | * SA P800 |
19 | * SA E400 | 19 | * SA E400 |
20 | * SA E300 | ||
20 | 21 | ||
21 | If nodes are not already created in the /dev/cciss directory, run as root: | 22 | If nodes are not already created in the /dev/cciss directory, run as root: |
22 | 23 | ||
diff --git a/Documentation/cdrom/sbpcd b/Documentation/cdrom/sbpcd index d1825dffca34..b3ba63f4ce3e 100644 --- a/Documentation/cdrom/sbpcd +++ b/Documentation/cdrom/sbpcd | |||
@@ -419,6 +419,7 @@ into the file "track01": | |||
419 | */ | 419 | */ |
420 | #include <stdio.h> | 420 | #include <stdio.h> |
421 | #include <sys/ioctl.h> | 421 | #include <sys/ioctl.h> |
422 | #include <sys/types.h> | ||
422 | #include <linux/cdrom.h> | 423 | #include <linux/cdrom.h> |
423 | 424 | ||
424 | static struct cdrom_tochdr hdr; | 425 | static struct cdrom_tochdr hdr; |
@@ -429,7 +430,7 @@ static int datafile, drive; | |||
429 | static int i, j, limit, track, err; | 430 | static int i, j, limit, track, err; |
430 | static char filename[32]; | 431 | static char filename[32]; |
431 | 432 | ||
432 | main(int argc, char *argv[]) | 433 | int main(int argc, char *argv[]) |
433 | { | 434 | { |
434 | /* | 435 | /* |
435 | * open /dev/cdrom | 436 | * open /dev/cdrom |
@@ -516,6 +517,7 @@ entry[track+1].cdte_addr.lba=entry[track].cdte_addr.lba+300; | |||
516 | } | 517 | } |
517 | arg.addr.lba++; | 518 | arg.addr.lba++; |
518 | } | 519 | } |
520 | return 0; | ||
519 | } | 521 | } |
520 | /*===================== end program ========================================*/ | 522 | /*===================== end program ========================================*/ |
521 | 523 | ||
@@ -564,15 +566,16 @@ Appendix -- the "cdtester" utility: | |||
564 | #include <stdio.h> | 566 | #include <stdio.h> |
565 | #include <malloc.h> | 567 | #include <malloc.h> |
566 | #include <sys/ioctl.h> | 568 | #include <sys/ioctl.h> |
569 | #include <sys/types.h> | ||
567 | #include <linux/cdrom.h> | 570 | #include <linux/cdrom.h> |
568 | 571 | ||
569 | #ifdef AZT_PRIVATE_IOCTLS | 572 | #ifdef AZT_PRIVATE_IOCTLS |
570 | #include <linux/../../drivers/cdrom/aztcd.h> | 573 | #include <linux/../../drivers/cdrom/aztcd.h> |
571 | #endif AZT_PRIVATE_IOCTLS | 574 | #endif /* AZT_PRIVATE_IOCTLS */ |
572 | #ifdef SBP_PRIVATE_IOCTLS | 575 | #ifdef SBP_PRIVATE_IOCTLS |
573 | #include <linux/../../drivers/cdrom/sbpcd.h> | 576 | #include <linux/../../drivers/cdrom/sbpcd.h> |
574 | #include <linux/fs.h> | 577 | #include <linux/fs.h> |
575 | #endif SBP_PRIVATE_IOCTLS | 578 | #endif /* SBP_PRIVATE_IOCTLS */ |
576 | 579 | ||
577 | struct cdrom_tochdr hdr; | 580 | struct cdrom_tochdr hdr; |
578 | struct cdrom_tochdr tocHdr; | 581 | struct cdrom_tochdr tocHdr; |
@@ -590,7 +593,7 @@ union | |||
590 | struct cdrom_msf msf; | 593 | struct cdrom_msf msf; |
591 | unsigned char buf[CD_FRAMESIZE_RAW]; | 594 | unsigned char buf[CD_FRAMESIZE_RAW]; |
592 | } azt; | 595 | } azt; |
593 | #endif AZT_PRIVATE_IOCTLS | 596 | #endif /* AZT_PRIVATE_IOCTLS */ |
594 | int i, i1, i2, i3, j, k; | 597 | int i, i1, i2, i3, j, k; |
595 | unsigned char sequence=0; | 598 | unsigned char sequence=0; |
596 | unsigned char command[80]; | 599 | unsigned char command[80]; |
@@ -738,7 +741,7 @@ void display(int size,unsigned char *buffer) | |||
738 | } | 741 | } |
739 | } | 742 | } |
740 | 743 | ||
741 | main(int argc, char *argv[]) | 744 | int main(int argc, char *argv[]) |
742 | { | 745 | { |
743 | printf("\nTesting tool for a CDROM driver's audio functions V0.1\n"); | 746 | printf("\nTesting tool for a CDROM driver's audio functions V0.1\n"); |
744 | printf("(C) 1995 Eberhard Moenkeberg <emoenke@gwdg.de>\n"); | 747 | printf("(C) 1995 Eberhard Moenkeberg <emoenke@gwdg.de>\n"); |
@@ -1046,12 +1049,13 @@ main(int argc, char *argv[]) | |||
1046 | rc=ioctl(drive,CDROMAUDIOBUFSIZ,j); | 1049 | rc=ioctl(drive,CDROMAUDIOBUFSIZ,j); |
1047 | printf("%d frames granted.\n",rc); | 1050 | printf("%d frames granted.\n",rc); |
1048 | break; | 1051 | break; |
1049 | #endif SBP_PRIVATE_IOCTLS | 1052 | #endif /* SBP_PRIVATE_IOCTLS */ |
1050 | default: | 1053 | default: |
1051 | printf("unknown command: \"%s\".\n",command); | 1054 | printf("unknown command: \"%s\".\n",command); |
1052 | break; | 1055 | break; |
1053 | } | 1056 | } |
1054 | } | 1057 | } |
1058 | return 0; | ||
1055 | } | 1059 | } |
1056 | /*==========================================================================*/ | 1060 | /*==========================================================================*/ |
1057 | 1061 | ||
diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt index b85481acd0ca..933fae74c337 100644 --- a/Documentation/cpu-freq/governors.txt +++ b/Documentation/cpu-freq/governors.txt | |||
@@ -9,6 +9,7 @@ | |||
9 | 9 | ||
10 | 10 | ||
11 | Dominik Brodowski <linux@brodo.de> | 11 | Dominik Brodowski <linux@brodo.de> |
12 | some additions and corrections by Nico Golde <nico@ngolde.de> | ||
12 | 13 | ||
13 | 14 | ||
14 | 15 | ||
@@ -25,6 +26,7 @@ Contents: | |||
25 | 2.1 Performance | 26 | 2.1 Performance |
26 | 2.2 Powersave | 27 | 2.2 Powersave |
27 | 2.3 Userspace | 28 | 2.3 Userspace |
29 | 2.4 Ondemand | ||
28 | 30 | ||
29 | 3. The Governor Interface in the CPUfreq Core | 31 | 3. The Governor Interface in the CPUfreq Core |
30 | 32 | ||
@@ -86,7 +88,7 @@ highest frequency within the borders of scaling_min_freq and | |||
86 | scaling_max_freq. | 88 | scaling_max_freq. |
87 | 89 | ||
88 | 90 | ||
89 | 2.1 Powersave | 91 | 2.2 Powersave |
90 | ------------- | 92 | ------------- |
91 | 93 | ||
92 | The CPUfreq governor "powersave" sets the CPU statically to the | 94 | The CPUfreq governor "powersave" sets the CPU statically to the |
@@ -94,7 +96,7 @@ lowest frequency within the borders of scaling_min_freq and | |||
94 | scaling_max_freq. | 96 | scaling_max_freq. |
95 | 97 | ||
96 | 98 | ||
97 | 2.2 Userspace | 99 | 2.3 Userspace |
98 | ------------- | 100 | ------------- |
99 | 101 | ||
100 | The CPUfreq governor "userspace" allows the user, or any userspace | 102 | The CPUfreq governor "userspace" allows the user, or any userspace |
@@ -103,6 +105,14 @@ by making a sysfs file "scaling_setspeed" available in the CPU-device | |||
103 | directory. | 105 | directory. |
104 | 106 | ||
105 | 107 | ||
108 | 2.4 Ondemand | ||
109 | ------------ | ||
110 | |||
111 | The CPUfreq govenor "ondemand" sets the CPU depending on the | ||
112 | current usage. To do this the CPU must have the capability to | ||
113 | switch the frequency very fast. | ||
114 | |||
115 | |||
106 | 116 | ||
107 | 3. The Governor Interface in the CPUfreq Core | 117 | 3. The Governor Interface in the CPUfreq Core |
108 | ============================================= | 118 | ============================================= |
diff --git a/Documentation/cpusets.txt b/Documentation/cpusets.txt index 2f8f24eaefd9..ad944c060312 100644 --- a/Documentation/cpusets.txt +++ b/Documentation/cpusets.txt | |||
@@ -51,6 +51,14 @@ mems_allowed vector. | |||
51 | 51 | ||
52 | If a cpuset is cpu or mem exclusive, no other cpuset, other than a direct | 52 | If a cpuset is cpu or mem exclusive, no other cpuset, other than a direct |
53 | ancestor or descendent, may share any of the same CPUs or Memory Nodes. | 53 | ancestor or descendent, may share any of the same CPUs or Memory Nodes. |
54 | A cpuset that is cpu exclusive has a sched domain associated with it. | ||
55 | The sched domain consists of all cpus in the current cpuset that are not | ||
56 | part of any exclusive child cpusets. | ||
57 | This ensures that the scheduler load balacing code only balances | ||
58 | against the cpus that are in the sched domain as defined above and not | ||
59 | all of the cpus in the system. This removes any overhead due to | ||
60 | load balancing code trying to pull tasks outside of the cpu exclusive | ||
61 | cpuset only to be prevented by the tasks' cpus_allowed mask. | ||
54 | 62 | ||
55 | User level code may create and destroy cpusets by name in the cpuset | 63 | User level code may create and destroy cpusets by name in the cpuset |
56 | virtual file system, manage the attributes and permissions of these | 64 | virtual file system, manage the attributes and permissions of these |
@@ -84,6 +92,9 @@ This can be especially valuable on: | |||
84 | and a database), or | 92 | and a database), or |
85 | * NUMA systems running large HPC applications with demanding | 93 | * NUMA systems running large HPC applications with demanding |
86 | performance characteristics. | 94 | performance characteristics. |
95 | * Also cpu_exclusive cpusets are useful for servers running orthogonal | ||
96 | workloads such as RT applications requiring low latency and HPC | ||
97 | applications that are throughput sensitive | ||
87 | 98 | ||
88 | These subsets, or "soft partitions" must be able to be dynamically | 99 | These subsets, or "soft partitions" must be able to be dynamically |
89 | adjusted, as the job mix changes, without impacting other concurrently | 100 | adjusted, as the job mix changes, without impacting other concurrently |
@@ -125,6 +136,8 @@ Cpusets extends these two mechanisms as follows: | |||
125 | - A cpuset may be marked exclusive, which ensures that no other | 136 | - A cpuset may be marked exclusive, which ensures that no other |
126 | cpuset (except direct ancestors and descendents) may contain | 137 | cpuset (except direct ancestors and descendents) may contain |
127 | any overlapping CPUs or Memory Nodes. | 138 | any overlapping CPUs or Memory Nodes. |
139 | Also a cpu_exclusive cpuset would be associated with a sched | ||
140 | domain. | ||
128 | - You can list all the tasks (by pid) attached to any cpuset. | 141 | - You can list all the tasks (by pid) attached to any cpuset. |
129 | 142 | ||
130 | The implementation of cpusets requires a few, simple hooks | 143 | The implementation of cpusets requires a few, simple hooks |
@@ -136,6 +149,9 @@ into the rest of the kernel, none in performance critical paths: | |||
136 | allowed in that tasks cpuset. | 149 | allowed in that tasks cpuset. |
137 | - in sched.c migrate_all_tasks(), to keep migrating tasks within | 150 | - in sched.c migrate_all_tasks(), to keep migrating tasks within |
138 | the CPUs allowed by their cpuset, if possible. | 151 | the CPUs allowed by their cpuset, if possible. |
152 | - in sched.c, a new API partition_sched_domains for handling | ||
153 | sched domain changes associated with cpu_exclusive cpusets | ||
154 | and related changes in both sched.c and arch/ia64/kernel/domain.c | ||
139 | - in the mbind and set_mempolicy system calls, to mask the requested | 155 | - in the mbind and set_mempolicy system calls, to mask the requested |
140 | Memory Nodes by what's allowed in that tasks cpuset. | 156 | Memory Nodes by what's allowed in that tasks cpuset. |
141 | - in page_alloc, to restrict memory to allowed nodes. | 157 | - in page_alloc, to restrict memory to allowed nodes. |
diff --git a/Documentation/devices.txt b/Documentation/devices.txt index bb67cf25010e..0f515175c72a 100644 --- a/Documentation/devices.txt +++ b/Documentation/devices.txt | |||
@@ -94,6 +94,7 @@ Your cooperation is appreciated. | |||
94 | 9 = /dev/urandom Faster, less secure random number gen. | 94 | 9 = /dev/urandom Faster, less secure random number gen. |
95 | 10 = /dev/aio Asyncronous I/O notification interface | 95 | 10 = /dev/aio Asyncronous I/O notification interface |
96 | 11 = /dev/kmsg Writes to this come out as printk's | 96 | 11 = /dev/kmsg Writes to this come out as printk's |
97 | 12 = /dev/oldmem Access to crash dump from kexec kernel | ||
97 | 1 block RAM disk | 98 | 1 block RAM disk |
98 | 0 = /dev/ram0 First RAM disk | 99 | 0 = /dev/ram0 First RAM disk |
99 | 1 = /dev/ram1 Second RAM disk | 100 | 1 = /dev/ram1 Second RAM disk |
diff --git a/Documentation/dontdiff b/Documentation/dontdiff index 9a33bb94f74f..96bea278bbf6 100644 --- a/Documentation/dontdiff +++ b/Documentation/dontdiff | |||
@@ -41,6 +41,7 @@ COPYING | |||
41 | CREDITS | 41 | CREDITS |
42 | CVS | 42 | CVS |
43 | ChangeSet | 43 | ChangeSet |
44 | Image | ||
44 | Kerntypes | 45 | Kerntypes |
45 | MODS.txt | 46 | MODS.txt |
46 | Module.symvers | 47 | Module.symvers |
@@ -103,6 +104,8 @@ logo_*.c | |||
103 | logo_*_clut224.c | 104 | logo_*_clut224.c |
104 | logo_*_mono.c | 105 | logo_*_mono.c |
105 | lxdialog | 106 | lxdialog |
107 | mach-types | ||
108 | mach-types.h | ||
106 | make_times_h | 109 | make_times_h |
107 | map | 110 | map |
108 | maui_boot.h | 111 | maui_boot.h |
@@ -111,6 +114,7 @@ mkdep | |||
111 | mktables | 114 | mktables |
112 | modpost | 115 | modpost |
113 | modversions.h* | 116 | modversions.h* |
117 | offset.h | ||
114 | offsets.h | 118 | offsets.h |
115 | oui.c* | 119 | oui.c* |
116 | parse.c* | 120 | parse.c* |
diff --git a/Documentation/dvb/README.dibusb b/Documentation/dvb/README.dibusb deleted file mode 100644 index 7a9e958513f3..000000000000 --- a/Documentation/dvb/README.dibusb +++ /dev/null | |||
@@ -1,285 +0,0 @@ | |||
1 | Documentation for dib3000* frontend drivers and dibusb device driver | ||
2 | ==================================================================== | ||
3 | |||
4 | Copyright (C) 2004-5 Patrick Boettcher (patrick.boettcher@desy.de), | ||
5 | |||
6 | dibusb and dib3000mb/mc drivers based on GPL code, which has | ||
7 | |||
8 | Copyright (C) 2004 Amaury Demol for DiBcom (ademol@dibcom.fr) | ||
9 | |||
10 | This program is free software; you can redistribute it and/or | ||
11 | modify it under the terms of the GNU General Public License as | ||
12 | published by the Free Software Foundation, version 2. | ||
13 | |||
14 | |||
15 | Supported devices USB1.1 | ||
16 | ======================== | ||
17 | |||
18 | Produced and reselled by Twinhan: | ||
19 | --------------------------------- | ||
20 | - TwinhanDTV USB-Ter DVB-T Device (VP7041) | ||
21 | http://www.twinhan.com/product_terrestrial_3.asp | ||
22 | |||
23 | - TwinhanDTV Magic Box (VP7041e) | ||
24 | http://www.twinhan.com/product_terrestrial_4.asp | ||
25 | |||
26 | - HAMA DVB-T USB device | ||
27 | http://www.hama.de/portal/articleId*110620/action*2598 | ||
28 | |||
29 | - CTS Portable (Chinese Television System) (2) | ||
30 | http://www.2cts.tv/ctsportable/ | ||
31 | |||
32 | - Unknown USB DVB-T device with vendor ID Hyper-Paltek | ||
33 | |||
34 | |||
35 | Produced and reselled by KWorld: | ||
36 | -------------------------------- | ||
37 | - KWorld V-Stream XPERT DTV DVB-T USB | ||
38 | http://www.kworld.com.tw/en/product/DVBT-USB/DVBT-USB.html | ||
39 | |||
40 | - JetWay DTV DVB-T USB | ||
41 | http://www.jetway.com.tw/evisn/product/lcd-tv/DVT-USB/dtv-usb.htm | ||
42 | |||
43 | - ADSTech Instant TV DVB-T USB | ||
44 | http://www.adstech.com/products/PTV-333/intro/PTV-333_intro.asp?pid=PTV-333 | ||
45 | |||
46 | |||
47 | Others: | ||
48 | ------- | ||
49 | - Ultima Electronic/Artec T1 USB TVBOX (AN2135, AN2235, AN2235 with Panasonic Tuner) | ||
50 | http://82.161.246.249/products-tvbox.html | ||
51 | |||
52 | - Compro Videomate DVB-U2000 - DVB-T USB (2) | ||
53 | http://www.comprousa.com/products/vmu2000.htm | ||
54 | |||
55 | - Grandtec USB DVB-T | ||
56 | http://www.grand.com.tw/ | ||
57 | |||
58 | - Avermedia AverTV DVBT USB (2) | ||
59 | http://www.avermedia.com/ | ||
60 | |||
61 | - DiBcom USB DVB-T reference device (non-public) | ||
62 | |||
63 | |||
64 | Supported devices USB2.0 | ||
65 | ======================== | ||
66 | - Twinhan MagicBox II (2) | ||
67 | http://www.twinhan.com/product_terrestrial_7.asp | ||
68 | |||
69 | - Hanftek UMT-010 (1) | ||
70 | http://www.globalsources.com/si/6008819757082/ProductDetail/Digital-TV/product_id-100046529 | ||
71 | |||
72 | - Typhoon/Yakumo/HAMA DVB-T mobile USB2.0 (1) | ||
73 | http://www.yakumo.de/produkte/index.php?pid=1&ag=DVB-T | ||
74 | |||
75 | - Artec T1 USB TVBOX (FX2) (2) | ||
76 | |||
77 | - Hauppauge WinTV NOVA-T USB2 | ||
78 | http://www.hauppauge.com/ | ||
79 | |||
80 | - KWorld/ADSTech Instant DVB-T USB2.0 (DiB3000M-B) | ||
81 | |||
82 | - DiBcom USB2.0 DVB-T reference device (non-public) | ||
83 | |||
84 | 1) It is working almost. | ||
85 | 2) No test reports received yet. | ||
86 | |||
87 | |||
88 | 0. NEWS: | ||
89 | 2005-02-11 - added support for the KWorld/ADSTech Instant DVB-T USB2.0. Thanks a lot to Joachim von Caron | ||
90 | 2005-02-02 - added support for the Hauppauge Win-TV Nova-T USB2 | ||
91 | 2005-01-31 - distorted streaming is finally gone for USB1.1 devices | ||
92 | 2005-01-13 - moved the mirrored pid_filter_table back to dvb-dibusb | ||
93 | - first almost working version for HanfTek UMT-010 | ||
94 | - found out, that Yakumo/HAMA/Typhoon are predessors of the HanfTek UMT-010 | ||
95 | 2005-01-10 - refactoring completed, now everything is very delightful | ||
96 | - tuner quirks for some weird devices (Artec T1 AN2235 device has sometimes a | ||
97 | Panasonic Tuner assembled). Tunerprobing implemented. Thanks a lot to Gunnar Wittich. | ||
98 | 2004-12-29 - after several days of struggling around bug of no returning URBs fixed. | ||
99 | 2004-12-26 - refactored the dibusb-driver, splitted into separate files | ||
100 | - i2c-probing enabled | ||
101 | 2004-12-06 - possibility for demod i2c-address probing | ||
102 | - new usb IDs (Compro,Artec) | ||
103 | 2004-11-23 - merged changes from DiB3000MC_ver2.1 | ||
104 | - revised the debugging | ||
105 | - possibility to deliver the complete TS for USB2.0 | ||
106 | 2004-11-21 - first working version of the dib3000mc/p frontend driver. | ||
107 | 2004-11-12 - added additional remote control keys. Thanks to Uwe Hanke. | ||
108 | 2004-11-07 - added remote control support. Thanks to David Matthews. | ||
109 | 2004-11-05 - added support for a new devices (Grandtec/Avermedia/Artec) | ||
110 | - merged my changes (for dib3000mb/dibusb) to the FE_REFACTORING, because it became HEAD | ||
111 | - moved transfer control (pid filter, fifo control) from usb driver to frontend, it seems | ||
112 | better settled there (added xfer_ops-struct) | ||
113 | - created a common files for frontends (mc/p/mb) | ||
114 | 2004-09-28 - added support for a new device (Unkown, vendor ID is Hyper-Paltek) | ||
115 | 2004-09-20 - added support for a new device (Compro DVB-U2000), thanks | ||
116 | to Amaury Demol for reporting | ||
117 | - changed usb TS transfer method (several urbs, stopping transfer | ||
118 | before setting a new pid) | ||
119 | 2004-09-13 - added support for a new device (Artec T1 USB TVBOX), thanks | ||
120 | to Christian Motschke for reporting | ||
121 | 2004-09-05 - released the dibusb device and dib3000mb-frontend driver | ||
122 | |||
123 | (old news for vp7041.c) | ||
124 | 2004-07-15 - found out, by accident, that the device has a TUA6010XS for | ||
125 | PLL | ||
126 | 2004-07-12 - figured out, that the driver should also work with the | ||
127 | CTS Portable (Chinese Television System) | ||
128 | 2004-07-08 - firmware-extraction-2.422-problem solved, driver is now working | ||
129 | properly with firmware extracted from 2.422 | ||
130 | - #if for 2.6.4 (dvb), compile issue | ||
131 | - changed firmware handling, see vp7041.txt sec 1.1 | ||
132 | 2004-07-02 - some tuner modifications, v0.1, cleanups, first public | ||
133 | 2004-06-28 - now using the dvb_dmx_swfilter_packets, everything | ||
134 | runs fine now | ||
135 | 2004-06-27 - able to watch and switching channels (pre-alpha) | ||
136 | - no section filtering yet | ||
137 | 2004-06-06 - first TS received, but kernel oops :/ | ||
138 | 2004-05-14 - firmware loader is working | ||
139 | 2004-05-11 - start writing the driver | ||
140 | |||
141 | 1. How to use? | ||
142 | NOTE: This driver was developed using Linux 2.6.6., | ||
143 | it is working with 2.6.7 and above. | ||
144 | |||
145 | Linux 2.4.x support is not planned, but patches are very welcome. | ||
146 | |||
147 | NOTE: I'm using Debian testing, so the following explaination (especially | ||
148 | the hotplug-path) needn't match your system, but probably it will :). | ||
149 | |||
150 | The driver is included in the kernel since Linux 2.6.10. | ||
151 | |||
152 | 1.1. Firmware | ||
153 | |||
154 | The USB driver needs to download a firmware to start working. | ||
155 | |||
156 | You can either use "get_dvb_firmware dibusb" to download the firmware or you | ||
157 | can get it directly via | ||
158 | |||
159 | for USB1.1 (AN2135) | ||
160 | http://www.linuxtv.org/downloads/firmware/dvb-dibusb-5.0.0.11.fw | ||
161 | |||
162 | for USB1.1 (AN2235) (a few Artec T1 devices) | ||
163 | http://www.linuxtv.org/downloads/firmware/dvb-dibusb-an2235-1.fw | ||
164 | |||
165 | for USB2.0 (FX2) Hauppauge, DiBcom | ||
166 | http://www.linuxtv.org/downloads/firmware/dvb-dibusb-6.0.0.5.fw | ||
167 | |||
168 | for USB2.0 ADSTech/Kworld USB2.0 | ||
169 | http://www.linuxtv.org/downloads/firmware/dvb-dibusb-adstech-usb2-1.fw | ||
170 | |||
171 | for USB2.0 HanfTek | ||
172 | http://www.linuxtv.org/downloads/firmware/dvb-dibusb-an2235-1.fw | ||
173 | |||
174 | |||
175 | 1.2. Compiling | ||
176 | |||
177 | Since the driver is in the linux kernel, activating the driver in | ||
178 | your favorite config-environment should sufficient. I recommend | ||
179 | to compile the driver as module. Hotplug does the rest. | ||
180 | |||
181 | 1.3. Loading the drivers | ||
182 | |||
183 | Hotplug is able to load the driver, when it is needed (because you plugged | ||
184 | in the device). | ||
185 | |||
186 | If you want to enable debug output, you have to load the driver manually and | ||
187 | from withing the dvb-kernel cvs repository. | ||
188 | |||
189 | first have a look, which debug level are available: | ||
190 | |||
191 | modinfo dib3000mb | ||
192 | modinfo dib3000-common | ||
193 | modinfo dib3000mc | ||
194 | modinfo dvb-dibusb | ||
195 | |||
196 | modprobe dib3000-common debug=<level> | ||
197 | modprobe dib3000mb debug=<level> | ||
198 | modprobe dib3000mc debug=<level> | ||
199 | modprobe dvb-dibusb debug=<level> | ||
200 | |||
201 | should do the trick. | ||
202 | |||
203 | When the driver is loaded successfully, the firmware file was in | ||
204 | the right place and the device is connected, the "Power"-LED should be | ||
205 | turned on. | ||
206 | |||
207 | At this point you should be able to start a dvb-capable application. For myself | ||
208 | I used mplayer, dvbscan, tzap and kaxtv, they are working. Using the device | ||
209 | in vdr is working now also. | ||
210 | |||
211 | 2. Known problems and bugs | ||
212 | |||
213 | - Don't remove the USB device while running an DVB application, your system will die. | ||
214 | |||
215 | 2.1. Adding support for devices | ||
216 | |||
217 | It is not possible to determine the range of devices based on the DiBcom | ||
218 | reference designs. This is because the reference design of DiBcom can be sold | ||
219 | to thirds, without telling DiBcom (so done with the Twinhan VP7041 and | ||
220 | the HAMA device). | ||
221 | |||
222 | When you think you have a device like this and the driver does not recognizes it, | ||
223 | please send the ****load*.inf and the ****cap*.inf of the Windows driver to me. | ||
224 | |||
225 | Sometimes the Vendor or Product ID is identical to the ones of Twinhan, even | ||
226 | though it is not a Twinhan device (e.g. HAMA), then please send me the name | ||
227 | of the device. I will add it to this list in order to make this clear to | ||
228 | others. | ||
229 | |||
230 | If you are familar with C you can also add the VID and PID of the device to | ||
231 | the dvb-dibusb-core.c-file and create a patch and send it over to me or to | ||
232 | the linux-dvb mailing list, _after_ you have tried compiling and modprobing | ||
233 | it. | ||
234 | |||
235 | 2.2. USB1.1 Bandwidth limitation | ||
236 | |||
237 | Most of the currently supported devices are USB1.1 and thus they have a | ||
238 | maximum bandwidth of about 5-6 MBit/s when connected to a USB2.0 hub. | ||
239 | This is not enough for receiving the complete transport stream of a | ||
240 | DVB-T channel (which can be about 16 MBit/s). Normally this is not a | ||
241 | problem, if you only want to watch TV (this does not apply for HDTV), | ||
242 | but watching a channel while recording another channel on the same | ||
243 | frequency simply does not work very well. This applies to all USB1.1 | ||
244 | DVB-T devices, not just dibusb) | ||
245 | |||
246 | Update: For the USB1.1 and VDR some work has been done (patches and comments | ||
247 | are still very welcome). Maybe the problem is solved in the meantime because I | ||
248 | now use the dmx_sw_filter function instead of dmx_sw_filter_packet. I hope the | ||
249 | linux-dvb software filter is able to get the best of the garbled TS. | ||
250 | |||
251 | The bug, where the TS is distorted by a heavy usage of the device is gone | ||
252 | definitely. All dibusb-devices I was using (Twinhan, Kworld, DiBcom) are | ||
253 | working like charm now with VDR. Sometimes I even was able to record a channel | ||
254 | and watch another one. | ||
255 | |||
256 | 2.3. Comments | ||
257 | |||
258 | Patches, comments and suggestions are very very welcome. | ||
259 | |||
260 | 3. Acknowledgements | ||
261 | Amaury Demol (ademol@dibcom.fr) and Francois Kanounnikoff from DiBcom for | ||
262 | providing specs, code and help, on which the dvb-dibusb, dib3000mb and | ||
263 | dib3000mc are based. | ||
264 | |||
265 | David Matthews for identifying a new device type (Artec T1 with AN2235) | ||
266 | and for extending dibusb with remote control event handling. Thank you. | ||
267 | |||
268 | Alex Woods for frequently answering question about usb and dvb | ||
269 | stuff, a big thank you. | ||
270 | |||
271 | Bernd Wagner for helping with huge bug reports and discussions. | ||
272 | |||
273 | Gunnar Wittich and Joachim von Caron for their trust for giving me | ||
274 | root-shells on their machines to implement support for new devices. | ||
275 | |||
276 | Some guys on the linux-dvb mailing list for encouraging me | ||
277 | |||
278 | Peter Schildmann >peter.schildmann-nospam-at-web.de< for his | ||
279 | user-level firmware loader, which saves a lot of time | ||
280 | (when writing the vp7041 driver) | ||
281 | |||
282 | Ulf Hermenau for helping me out with traditional chinese. | ||
283 | |||
284 | André Smoktun and Christian Frömmel for supporting me with | ||
285 | hardware and listening to my problems very patient | ||
diff --git a/Documentation/dvb/README.dvb-usb b/Documentation/dvb/README.dvb-usb new file mode 100644 index 000000000000..ac0797ea646c --- /dev/null +++ b/Documentation/dvb/README.dvb-usb | |||
@@ -0,0 +1,232 @@ | |||
1 | Documentation for dvb-usb-framework module and its devices | ||
2 | |||
3 | Idea behind the dvb-usb-framework | ||
4 | ================================= | ||
5 | |||
6 | In March 2005 I got the new Twinhan USB2.0 DVB-T device. They provided specs and a firmware. | ||
7 | |||
8 | Quite keen I wanted to put the driver (with some quirks of course) into dibusb. | ||
9 | After reading some specs and doing some USB snooping, it realized, that the | ||
10 | dibusb-driver would be a complete mess afterwards. So I decided to do it in a | ||
11 | different way: With the help of a dvb-usb-framework. | ||
12 | |||
13 | The framework provides generic functions (mostly kernel API calls), such as: | ||
14 | |||
15 | - Transport Stream URB handling in conjunction with dvb-demux-feed-control | ||
16 | (bulk and isoc are supported) | ||
17 | - registering the device for the DVB-API | ||
18 | - registering an I2C-adapter if applicable | ||
19 | - remote-control/input-device handling | ||
20 | - firmware requesting and loading (currently just for the Cypress USB | ||
21 | controllers) | ||
22 | - other functions/methods which can be shared by several drivers (such as | ||
23 | functions for bulk-control-commands) | ||
24 | - TODO: a I2C-chunker. It creates device-specific chunks of register-accesses | ||
25 | depending on length of a register and the number of values that can be | ||
26 | multi-written and multi-read. | ||
27 | |||
28 | The source code of the particular DVB USB devices does just the communication | ||
29 | with the device via the bus. The connection between the DVB-API-functionality | ||
30 | is done via callbacks, assigned in a static device-description (struct | ||
31 | dvb_usb_device) each device-driver has to have. | ||
32 | |||
33 | For an example have a look in drivers/media/dvb/dvb-usb/vp7045*. | ||
34 | |||
35 | Objective is to migrate all the usb-devices (dibusb, cinergyT2, maybe the | ||
36 | ttusb; flexcop-usb already benefits from the generic flexcop-device) to use | ||
37 | the dvb-usb-lib. | ||
38 | |||
39 | TODO: dynamic enabling and disabling of the pid-filter in regard to number of | ||
40 | feeds requested. | ||
41 | |||
42 | Supported devices | ||
43 | ======================== | ||
44 | |||
45 | See the LinuxTV DVB Wiki at www.linuxtv.org for a complete list of | ||
46 | cards/drivers/firmwares: | ||
47 | |||
48 | http://www.linuxtv.org/wiki/index.php/DVB_USB | ||
49 | |||
50 | 0. History & News: | ||
51 | 2005-06-30 - added support for WideView WT-220U (Thanks to Steve Chang) | ||
52 | 2005-05-30 - added basic isochronous support to the dvb-usb-framework | ||
53 | added support for Conexant Hybrid reference design and Nebula DigiTV USB | ||
54 | 2005-04-17 - all dibusb devices ported to make use of the dvb-usb-framework | ||
55 | 2005-04-02 - re-enabled and improved remote control code. | ||
56 | 2005-03-31 - ported the Yakumo/Hama/Typhoon DVB-T USB2.0 device to dvb-usb. | ||
57 | 2005-03-30 - first commit of the dvb-usb-module based on the dibusb-source. First device is a new driver for the | ||
58 | TwinhanDTV Alpha / MagicBox II USB2.0-only DVB-T device. | ||
59 | |||
60 | (change from dvb-dibusb to dvb-usb) | ||
61 | 2005-03-28 - added support for the AVerMedia AverTV DVB-T USB2.0 device (Thanks to Glen Harris and Jiun-Kuei Jung, AVerMedia) | ||
62 | 2005-03-14 - added support for the Typhoon/Yakumo/HAMA DVB-T mobile USB2.0 | ||
63 | 2005-02-11 - added support for the KWorld/ADSTech Instant DVB-T USB2.0. Thanks a lot to Joachim von Caron | ||
64 | 2005-02-02 - added support for the Hauppauge Win-TV Nova-T USB2 | ||
65 | 2005-01-31 - distorted streaming is gone for USB1.1 devices | ||
66 | 2005-01-13 - moved the mirrored pid_filter_table back to dvb-dibusb | ||
67 | - first almost working version for HanfTek UMT-010 | ||
68 | - found out, that Yakumo/HAMA/Typhoon are predecessors of the HanfTek UMT-010 | ||
69 | 2005-01-10 - refactoring completed, now everything is very delightful | ||
70 | - tuner quirks for some weird devices (Artec T1 AN2235 device has sometimes a | ||
71 | Panasonic Tuner assembled). Tunerprobing implemented. Thanks a lot to Gunnar Wittich. | ||
72 | 2004-12-29 - after several days of struggling around bug of no returning URBs fixed. | ||
73 | 2004-12-26 - refactored the dibusb-driver, splitted into separate files | ||
74 | - i2c-probing enabled | ||
75 | 2004-12-06 - possibility for demod i2c-address probing | ||
76 | - new usb IDs (Compro, Artec) | ||
77 | 2004-11-23 - merged changes from DiB3000MC_ver2.1 | ||
78 | - revised the debugging | ||
79 | - possibility to deliver the complete TS for USB2.0 | ||
80 | 2004-11-21 - first working version of the dib3000mc/p frontend driver. | ||
81 | 2004-11-12 - added additional remote control keys. Thanks to Uwe Hanke. | ||
82 | 2004-11-07 - added remote control support. Thanks to David Matthews. | ||
83 | 2004-11-05 - added support for a new devices (Grandtec/Avermedia/Artec) | ||
84 | - merged my changes (for dib3000mb/dibusb) to the FE_REFACTORING, because it became HEAD | ||
85 | - moved transfer control (pid filter, fifo control) from usb driver to frontend, it seems | ||
86 | better settled there (added xfer_ops-struct) | ||
87 | - created a common files for frontends (mc/p/mb) | ||
88 | 2004-09-28 - added support for a new device (Unkown, vendor ID is Hyper-Paltek) | ||
89 | 2004-09-20 - added support for a new device (Compro DVB-U2000), thanks | ||
90 | to Amaury Demol for reporting | ||
91 | - changed usb TS transfer method (several urbs, stopping transfer | ||
92 | before setting a new pid) | ||
93 | 2004-09-13 - added support for a new device (Artec T1 USB TVBOX), thanks | ||
94 | to Christian Motschke for reporting | ||
95 | 2004-09-05 - released the dibusb device and dib3000mb-frontend driver | ||
96 | |||
97 | (old news for vp7041.c) | ||
98 | 2004-07-15 - found out, by accident, that the device has a TUA6010XS for | ||
99 | PLL | ||
100 | 2004-07-12 - figured out, that the driver should also work with the | ||
101 | CTS Portable (Chinese Television System) | ||
102 | 2004-07-08 - firmware-extraction-2.422-problem solved, driver is now working | ||
103 | properly with firmware extracted from 2.422 | ||
104 | - #if for 2.6.4 (dvb), compile issue | ||
105 | - changed firmware handling, see vp7041.txt sec 1.1 | ||
106 | 2004-07-02 - some tuner modifications, v0.1, cleanups, first public | ||
107 | 2004-06-28 - now using the dvb_dmx_swfilter_packets, everything | ||
108 | runs fine now | ||
109 | 2004-06-27 - able to watch and switching channels (pre-alpha) | ||
110 | - no section filtering yet | ||
111 | 2004-06-06 - first TS received, but kernel oops :/ | ||
112 | 2004-05-14 - firmware loader is working | ||
113 | 2004-05-11 - start writing the driver | ||
114 | |||
115 | 1. How to use? | ||
116 | 1.1. Firmware | ||
117 | |||
118 | Most of the USB drivers need to download a firmware to the device before start | ||
119 | working. | ||
120 | |||
121 | Have a look at the Wikipage for the DVB-USB-drivers to find out, which firmware | ||
122 | you need for your device: | ||
123 | |||
124 | http://www.linuxtv.org/wiki/index.php/DVB_USB | ||
125 | |||
126 | 1.2. Compiling | ||
127 | |||
128 | Since the driver is in the linux kernel, activating the driver in | ||
129 | your favorite config-environment should sufficient. I recommend | ||
130 | to compile the driver as module. Hotplug does the rest. | ||
131 | |||
132 | If you use dvb-kernel enter the build-2.6 directory run 'make' and 'insmod.sh | ||
133 | load' afterwards. | ||
134 | |||
135 | 1.3. Loading the drivers | ||
136 | |||
137 | Hotplug is able to load the driver, when it is needed (because you plugged | ||
138 | in the device). | ||
139 | |||
140 | If you want to enable debug output, you have to load the driver manually and | ||
141 | from withing the dvb-kernel cvs repository. | ||
142 | |||
143 | first have a look, which debug level are available: | ||
144 | |||
145 | modinfo dvb-usb | ||
146 | modinfo dvb-usb-vp7045 | ||
147 | etc. | ||
148 | |||
149 | modprobe dvb-usb debug=<level> | ||
150 | modprobe dvb-usb-vp7045 debug=<level> | ||
151 | etc. | ||
152 | |||
153 | should do the trick. | ||
154 | |||
155 | When the driver is loaded successfully, the firmware file was in | ||
156 | the right place and the device is connected, the "Power"-LED should be | ||
157 | turned on. | ||
158 | |||
159 | At this point you should be able to start a dvb-capable application. I'm use | ||
160 | (t|s)zap, mplayer and dvbscan to test the basics. VDR-xine provides the | ||
161 | long-term test scenario. | ||
162 | |||
163 | 2. Known problems and bugs | ||
164 | |||
165 | - Don't remove the USB device while running an DVB application, your system | ||
166 | will go crazy or die most likely. | ||
167 | |||
168 | 2.1. Adding support for devices | ||
169 | |||
170 | TODO | ||
171 | |||
172 | 2.2. USB1.1 Bandwidth limitation | ||
173 | |||
174 | A lot of the currently supported devices are USB1.1 and thus they have a | ||
175 | maximum bandwidth of about 5-6 MBit/s when connected to a USB2.0 hub. | ||
176 | This is not enough for receiving the complete transport stream of a | ||
177 | DVB-T channel (which is about 16 MBit/s). Normally this is not a | ||
178 | problem, if you only want to watch TV (this does not apply for HDTV), | ||
179 | but watching a channel while recording another channel on the same | ||
180 | frequency simply does not work very well. This applies to all USB1.1 | ||
181 | DVB-T devices, not just the dvb-usb-devices) | ||
182 | |||
183 | The bug, where the TS is distorted by a heavy usage of the device is gone | ||
184 | definitely. All dvb-usb-devices I was using (Twinhan, Kworld, DiBcom) are | ||
185 | working like charm now with VDR. Sometimes I even was able to record a channel | ||
186 | and watch another one. | ||
187 | |||
188 | 2.3. Comments | ||
189 | |||
190 | Patches, comments and suggestions are very very welcome. | ||
191 | |||
192 | 3. Acknowledgements | ||
193 | Amaury Demol (ademol@dibcom.fr) and Francois Kanounnikoff from DiBcom for | ||
194 | providing specs, code and help, on which the dvb-dibusb, dib3000mb and | ||
195 | dib3000mc are based. | ||
196 | |||
197 | David Matthews for identifying a new device type (Artec T1 with AN2235) | ||
198 | and for extending dibusb with remote control event handling. Thank you. | ||
199 | |||
200 | Alex Woods for frequently answering question about usb and dvb | ||
201 | stuff, a big thank you. | ||
202 | |||
203 | Bernd Wagner for helping with huge bug reports and discussions. | ||
204 | |||
205 | Gunnar Wittich and Joachim von Caron for their trust for providing | ||
206 | root-shells on their machines to implement support for new devices. | ||
207 | |||
208 | Allan Third and Michael Hutchinson for their help to write the Nebula | ||
209 | digitv-driver. | ||
210 | |||
211 | Glen Harris for bringing up, that there is a new dibusb-device and Jiun-Kuei | ||
212 | Jung from AVerMedia who kindly provided a special firmware to get the device | ||
213 | up and running in Linux. | ||
214 | |||
215 | Jennifer Chen, Jeff and Jack from Twinhan for kindly supporting by | ||
216 | writing the vp7045-driver. | ||
217 | |||
218 | Steve Chang from WideView for providing information for new devices and | ||
219 | firmware files. | ||
220 | |||
221 | Michael Paxton for submitting remote control keymaps. | ||
222 | |||
223 | Some guys on the linux-dvb mailing list for encouraging me. | ||
224 | |||
225 | Peter Schildmann >peter.schildmann-nospam-at-web.de< for his | ||
226 | user-level firmware loader, which saves a lot of time | ||
227 | (when writing the vp7041 driver) | ||
228 | |||
229 | Ulf Hermenau for helping me out with traditional chinese. | ||
230 | |||
231 | André Smoktun and Christian Frömmel for supporting me with | ||
232 | hardware and listening to my problems very patiently. | ||
diff --git a/Documentation/dvb/bt8xx.txt b/Documentation/dvb/bt8xx.txt index d64430bf4bb6..e6b8d05bc08d 100644 --- a/Documentation/dvb/bt8xx.txt +++ b/Documentation/dvb/bt8xx.txt | |||
@@ -1,69 +1,55 @@ | |||
1 | How to get the Nebula, PCTV and Twinhan DST cards working | 1 | How to get the Nebula Electronics DigiTV, Pinnacle PCTV Sat, Twinhan DST + clones working |
2 | ========================================================= | 2 | ========================================================================================= |
3 | 3 | ||
4 | This class of cards has a bt878a as the PCI interface, and | 4 | 1) General information |
5 | require the bttv driver. | 5 | ====================== |
6 | 6 | ||
7 | Please pay close attention to the warning about the bttv module | 7 | This class of cards has a bt878a chip as the PCI interface. |
8 | options below for the DST card. | 8 | The different card drivers require the bttv driver to provide the means |
9 | to access the i2c bus and the gpio pins of the bt8xx chipset. | ||
9 | 10 | ||
10 | 1) General informations | 11 | 2) Compilation rules for Kernel >= 2.6.12 |
11 | ======================= | 12 | ========================================= |
12 | 13 | ||
13 | These drivers require the bttv driver to provide the means to access | 14 | Enable the following options: |
14 | the i2c bus and the gpio pins of the bt8xx chipset. | ||
15 | 15 | ||
16 | Because of this, you need to enable | ||
17 | "Device drivers" => "Multimedia devices" | 16 | "Device drivers" => "Multimedia devices" |
18 | => "Video For Linux" => "BT848 Video For Linux" | 17 | => "Video For Linux" => "BT848 Video For Linux" |
19 | |||
20 | Furthermore you need to enable | ||
21 | "Device drivers" => "Multimedia devices" => "Digital Video Broadcasting Devices" | 18 | "Device drivers" => "Multimedia devices" => "Digital Video Broadcasting Devices" |
22 | => "DVB for Linux" "DVB Core Support" "Nebula/Pinnacle PCTV/TwinHan PCI Cards" | 19 | => "DVB for Linux" "DVB Core Support" "Nebula/Pinnacle PCTV/TwinHan PCI Cards" |
23 | 20 | ||
24 | 2) Loading Modules | 21 | 3) Loading Modules, described by two approaches |
25 | ================== | 22 | =============================================== |
26 | 23 | ||
27 | In general you need to load the bttv driver, which will handle the gpio and | 24 | In general you need to load the bttv driver, which will handle the gpio and |
28 | i2c communication for us, plus the common dvb-bt8xx device driver. | 25 | i2c communication for us, plus the common dvb-bt8xx device driver, |
29 | The frontends for Nebula (nxt6000), Pinnacle PCTV (cx24110) and | 26 | which is called the backend. |
30 | TwinHan (dst) are loaded automatically by the dvb-bt8xx device driver. | 27 | The frontends for Nebula DigiTV (nxt6000), Pinnacle PCTV Sat (cx24110), |
28 | TwinHan DST + clones (dst and dst-ca) are loaded automatically by the backend. | ||
29 | For further details about TwinHan DST + clones see /Documentation/dvb/ci.txt. | ||
31 | 30 | ||
32 | 3a) Nebula / Pinnacle PCTV | 31 | 3a) The manual approach |
33 | -------------------------- | 32 | ----------------------- |
34 | 33 | ||
35 | $ modprobe bttv (normally bttv is being loaded automatically by kmod) | 34 | Loading modules: |
36 | $ modprobe dvb-bt8xx (or just place dvb-bt8xx in /etc/modules for automatic loading) | 35 | modprobe bttv |
36 | modprobe dvb-bt8xx | ||
37 | 37 | ||
38 | Unloading modules: | ||
39 | modprobe -r dvb-bt8xx | ||
40 | modprobe -r bttv | ||
38 | 41 | ||
39 | 3b) TwinHan and Clones | 42 | 3b) The automatic approach |
40 | -------------------------- | 43 | -------------------------- |
41 | 44 | ||
42 | $ modprobe bttv i2c_hw=1 card=0x71 | 45 | If not already done by installation, place a line either in |
43 | $ modprobe dvb-bt8xx | 46 | /etc/modules.conf or in /etc/modprobe.conf containing this text: |
44 | $ modprobe dst | 47 | alias char-major-81 bttv |
45 | |||
46 | The value 0x71 will override the PCI type detection for dvb-bt8xx, | ||
47 | which is necessary for TwinHan cards. | ||
48 | |||
49 | If you're having an older card (blue color circuit) and card=0x71 locks | ||
50 | your machine, try using 0x68, too. If that does not work, ask on the | ||
51 | mailing list. | ||
52 | |||
53 | The DST module takes a couple of useful parameters. | ||
54 | |||
55 | verbose takes values 0 to 5. These values control the verbosity level. | ||
56 | |||
57 | debug takes values 0 and 1. You can either disable or enable debugging. | ||
58 | |||
59 | dst_addons takes values 0 and 0x20. A value of 0 means it is a FTA card. | ||
60 | 0x20 means it has a Conditional Access slot. | ||
61 | |||
62 | The autodected values are determined bythe cards 'response | ||
63 | string' which you can see in your logs e.g. | ||
64 | 48 | ||
65 | dst_get_device_id: Recognise [DSTMCI] | 49 | Then place a line in /etc/modules containing this text: |
50 | dvb-bt8xx | ||
66 | 51 | ||
52 | Reboot your system and have fun! | ||
67 | 53 | ||
68 | -- | 54 | -- |
69 | Authors: Richard Walker, Jamie Honan, Michael Hunold, Manu Abraham | 55 | Authors: Richard Walker, Jamie Honan, Michael Hunold, Manu Abraham, Uwe Bugla |
diff --git a/Documentation/fb/vesafb.txt b/Documentation/fb/vesafb.txt index 814e2f56a6ad..62db6758d1c1 100644 --- a/Documentation/fb/vesafb.txt +++ b/Documentation/fb/vesafb.txt | |||
@@ -144,7 +144,21 @@ vgapal Use the standard vga registers for palette changes. | |||
144 | This is the default. | 144 | This is the default. |
145 | pmipal Use the protected mode interface for palette changes. | 145 | pmipal Use the protected mode interface for palette changes. |
146 | 146 | ||
147 | mtrr setup memory type range registers for the vesafb framebuffer. | 147 | mtrr:n setup memory type range registers for the vesafb framebuffer |
148 | where n: | ||
149 | 0 - disabled (equivalent to nomtrr) | ||
150 | 1 - uncachable | ||
151 | 2 - write-back | ||
152 | 3 - write-combining (default) | ||
153 | 4 - write-through | ||
154 | |||
155 | If you see the following in dmesg, choose the type that matches the | ||
156 | old one. In this example, use "mtrr:2". | ||
157 | ... | ||
158 | mtrr: type mismatch for e0000000,8000000 old: write-back new: write-combining | ||
159 | ... | ||
160 | |||
161 | nomtrr disable mtrr | ||
148 | 162 | ||
149 | vremap:n | 163 | vremap:n |
150 | remap 'n' MiB of video RAM. If 0 or not specified, remap memory | 164 | remap 'n' MiB of video RAM. If 0 or not specified, remap memory |
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 26414bc87c65..8b1430b46655 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt | |||
@@ -43,6 +43,14 @@ Who: Randy Dunlap <rddunlap@osdl.org> | |||
43 | 43 | ||
44 | --------------------------- | 44 | --------------------------- |
45 | 45 | ||
46 | What: RAW driver (CONFIG_RAW_DRIVER) | ||
47 | When: December 2005 | ||
48 | Why: declared obsolete since kernel 2.6.3 | ||
49 | O_DIRECT can be used instead | ||
50 | Who: Adrian Bunk <bunk@stusta.de> | ||
51 | |||
52 | --------------------------- | ||
53 | |||
46 | What: register_ioctl32_conversion() / unregister_ioctl32_conversion() | 54 | What: register_ioctl32_conversion() / unregister_ioctl32_conversion() |
47 | When: April 2005 | 55 | When: April 2005 |
48 | Why: Replaced by ->compat_ioctl in file_operations and other method | 56 | Why: Replaced by ->compat_ioctl in file_operations and other method |
@@ -66,6 +74,14 @@ Who: Paul E. McKenney <paulmck@us.ibm.com> | |||
66 | 74 | ||
67 | --------------------------- | 75 | --------------------------- |
68 | 76 | ||
77 | What: remove verify_area() | ||
78 | When: July 2006 | ||
79 | Files: Various uaccess.h headers. | ||
80 | Why: Deprecated and redundant. access_ok() should be used instead. | ||
81 | Who: Jesper Juhl <juhl-lkml@dif.dk> | ||
82 | |||
83 | --------------------------- | ||
84 | |||
69 | What: IEEE1394 Audio and Music Data Transmission Protocol driver, | 85 | What: IEEE1394 Audio and Music Data Transmission Protocol driver, |
70 | Connection Management Procedures driver | 86 | Connection Management Procedures driver |
71 | When: November 2005 | 87 | When: November 2005 |
@@ -86,6 +102,16 @@ Who: Jody McIntyre <scjody@steamballoon.com> | |||
86 | 102 | ||
87 | --------------------------- | 103 | --------------------------- |
88 | 104 | ||
105 | What: register_serial/unregister_serial | ||
106 | When: September 2005 | ||
107 | Why: This interface does not allow serial ports to be registered against | ||
108 | a struct device, and as such does not allow correct power management | ||
109 | of such ports. 8250-based ports should use serial8250_register_port | ||
110 | and serial8250_unregister_port, or platform devices instead. | ||
111 | Who: Russell King <rmk@arm.linux.org.uk> | ||
112 | |||
113 | --------------------------- | ||
114 | |||
89 | What: i2c sysfs name change: in1_ref, vid deprecated in favour of cpu0_vid | 115 | What: i2c sysfs name change: in1_ref, vid deprecated in favour of cpu0_vid |
90 | When: November 2005 | 116 | When: November 2005 |
91 | Files: drivers/i2c/chips/adm1025.c, drivers/i2c/chips/adm1026.c | 117 | Files: drivers/i2c/chips/adm1025.c, drivers/i2c/chips/adm1026.c |
@@ -93,3 +119,19 @@ Why: Match the other drivers' name for the same function, duplicate names | |||
93 | will be available until removal of old names. | 119 | will be available until removal of old names. |
94 | Who: Grant Coady <gcoady@gmail.com> | 120 | Who: Grant Coady <gcoady@gmail.com> |
95 | 121 | ||
122 | --------------------------- | ||
123 | |||
124 | What: PCMCIA control ioctl (needed for pcmcia-cs [cardmgr, cardctl]) | ||
125 | When: November 2005 | ||
126 | Files: drivers/pcmcia/: pcmcia_ioctl.c | ||
127 | Why: With the 16-bit PCMCIA subsystem now behaving (almost) like a | ||
128 | normal hotpluggable bus, and with it using the default kernel | ||
129 | infrastructure (hotplug, driver core, sysfs) keeping the PCMCIA | ||
130 | control ioctl needed by cardmgr and cardctl from pcmcia-cs is | ||
131 | unnecessary, and makes further cleanups and integration of the | ||
132 | PCMCIA subsystem into the Linux kernel device driver model more | ||
133 | difficult. The features provided by cardmgr and cardctl are either | ||
134 | handled by the kernel itself now or are available in the new | ||
135 | pcmciautils package available at | ||
136 | http://kernel.org/pub/linux/utils/kernel/pcmcia/ | ||
137 | Who: Dominik Brodowski <linux@brodo.de> | ||
diff --git a/Documentation/filesystems/ext2.txt b/Documentation/filesystems/ext2.txt index b5cb9110cc6b..d16334ec48ba 100644 --- a/Documentation/filesystems/ext2.txt +++ b/Documentation/filesystems/ext2.txt | |||
@@ -58,6 +58,8 @@ noacl Don't support POSIX ACLs. | |||
58 | 58 | ||
59 | nobh Do not attach buffer_heads to file pagecache. | 59 | nobh Do not attach buffer_heads to file pagecache. |
60 | 60 | ||
61 | xip Use execute in place (no caching) if possible | ||
62 | |||
61 | grpquota,noquota,quota,usrquota Quota options are silently ignored by ext2. | 63 | grpquota,noquota,quota,usrquota Quota options are silently ignored by ext2. |
62 | 64 | ||
63 | 65 | ||
diff --git a/Documentation/filesystems/inotify.txt b/Documentation/filesystems/inotify.txt new file mode 100644 index 000000000000..6d501903f68e --- /dev/null +++ b/Documentation/filesystems/inotify.txt | |||
@@ -0,0 +1,151 @@ | |||
1 | inotify | ||
2 | a powerful yet simple file change notification system | ||
3 | |||
4 | |||
5 | |||
6 | Document started 15 Mar 2005 by Robert Love <rml@novell.com> | ||
7 | |||
8 | |||
9 | (i) User Interface | ||
10 | |||
11 | Inotify is controlled by a set of three system calls and normal file I/O on a | ||
12 | returned file descriptor. | ||
13 | |||
14 | First step in using inotify is to initialise an inotify instance: | ||
15 | |||
16 | int fd = inotify_init (); | ||
17 | |||
18 | Each instance is associated with a unique, ordered queue. | ||
19 | |||
20 | Change events are managed by "watches". A watch is an (object,mask) pair where | ||
21 | the object is a file or directory and the mask is a bit mask of one or more | ||
22 | inotify events that the application wishes to receive. See <linux/inotify.h> | ||
23 | for valid events. A watch is referenced by a watch descriptor, or wd. | ||
24 | |||
25 | Watches are added via a path to the file. | ||
26 | |||
27 | Watches on a directory will return events on any files inside of the directory. | ||
28 | |||
29 | Adding a watch is simple: | ||
30 | |||
31 | int wd = inotify_add_watch (fd, path, mask); | ||
32 | |||
33 | Where "fd" is the return value from inotify_init(), path is the path to the | ||
34 | object to watch, and mask is the watch mask (see <linux/inotify.h>). | ||
35 | |||
36 | You can update an existing watch in the same manner, by passing in a new mask. | ||
37 | |||
38 | An existing watch is removed via | ||
39 | |||
40 | int ret = inotify_rm_watch (fd, wd); | ||
41 | |||
42 | Events are provided in the form of an inotify_event structure that is read(2) | ||
43 | from a given inotify instance. The filename is of dynamic length and follows | ||
44 | the struct. It is of size len. The filename is padded with null bytes to | ||
45 | ensure proper alignment. This padding is reflected in len. | ||
46 | |||
47 | You can slurp multiple events by passing a large buffer, for example | ||
48 | |||
49 | size_t len = read (fd, buf, BUF_LEN); | ||
50 | |||
51 | Where "buf" is a pointer to an array of "inotify_event" structures at least | ||
52 | BUF_LEN bytes in size. The above example will return as many events as are | ||
53 | available and fit in BUF_LEN. | ||
54 | |||
55 | Each inotify instance fd is also select()- and poll()-able. | ||
56 | |||
57 | You can find the size of the current event queue via the standard FIONREAD | ||
58 | ioctl on the fd returned by inotify_init(). | ||
59 | |||
60 | All watches are destroyed and cleaned up on close. | ||
61 | |||
62 | |||
63 | (ii) | ||
64 | |||
65 | Prototypes: | ||
66 | |||
67 | int inotify_init (void); | ||
68 | int inotify_add_watch (int fd, const char *path, __u32 mask); | ||
69 | int inotify_rm_watch (int fd, __u32 mask); | ||
70 | |||
71 | |||
72 | (iii) Internal Kernel Implementation | ||
73 | |||
74 | Each inotify instance is associated with an inotify_device structure. | ||
75 | |||
76 | Each watch is associated with an inotify_watch structure. Watches are chained | ||
77 | off of each associated device and each associated inode. | ||
78 | |||
79 | See fs/inotify.c for the locking and lifetime rules. | ||
80 | |||
81 | |||
82 | (iv) Rationale | ||
83 | |||
84 | Q: What is the design decision behind not tying the watch to the open fd of | ||
85 | the watched object? | ||
86 | |||
87 | A: Watches are associated with an open inotify device, not an open file. | ||
88 | This solves the primary problem with dnotify: keeping the file open pins | ||
89 | the file and thus, worse, pins the mount. Dnotify is therefore infeasible | ||
90 | for use on a desktop system with removable media as the media cannot be | ||
91 | unmounted. Watching a file should not require that it be open. | ||
92 | |||
93 | Q: What is the design decision behind using an-fd-per-instance as opposed to | ||
94 | an fd-per-watch? | ||
95 | |||
96 | A: An fd-per-watch quickly consumes more file descriptors than are allowed, | ||
97 | more fd's than are feasible to manage, and more fd's than are optimally | ||
98 | select()-able. Yes, root can bump the per-process fd limit and yes, users | ||
99 | can use epoll, but requiring both is a silly and extraneous requirement. | ||
100 | A watch consumes less memory than an open file, separating the number | ||
101 | spaces is thus sensible. The current design is what user-space developers | ||
102 | want: Users initialize inotify, once, and add n watches, requiring but one | ||
103 | fd and no twiddling with fd limits. Initializing an inotify instance two | ||
104 | thousand times is silly. If we can implement user-space's preferences | ||
105 | cleanly--and we can, the idr layer makes stuff like this trivial--then we | ||
106 | should. | ||
107 | |||
108 | There are other good arguments. With a single fd, there is a single | ||
109 | item to block on, which is mapped to a single queue of events. The single | ||
110 | fd returns all watch events and also any potential out-of-band data. If | ||
111 | every fd was a separate watch, | ||
112 | |||
113 | - There would be no way to get event ordering. Events on file foo and | ||
114 | file bar would pop poll() on both fd's, but there would be no way to tell | ||
115 | which happened first. A single queue trivially gives you ordering. Such | ||
116 | ordering is crucial to existing applications such as Beagle. Imagine | ||
117 | "mv a b ; mv b a" events without ordering. | ||
118 | |||
119 | - We'd have to maintain n fd's and n internal queues with state, | ||
120 | versus just one. It is a lot messier in the kernel. A single, linear | ||
121 | queue is the data structure that makes sense. | ||
122 | |||
123 | - User-space developers prefer the current API. The Beagle guys, for | ||
124 | example, love it. Trust me, I asked. It is not a surprise: Who'd want | ||
125 | to manage and block on 1000 fd's via select? | ||
126 | |||
127 | - No way to get out of band data. | ||
128 | |||
129 | - 1024 is still too low. ;-) | ||
130 | |||
131 | When you talk about designing a file change notification system that | ||
132 | scales to 1000s of directories, juggling 1000s of fd's just does not seem | ||
133 | the right interface. It is too heavy. | ||
134 | |||
135 | Additionally, it _is_ possible to more than one instance and | ||
136 | juggle more than one queue and thus more than one associated fd. There | ||
137 | need not be a one-fd-per-process mapping; it is one-fd-per-queue and a | ||
138 | process can easily want more than one queue. | ||
139 | |||
140 | Q: Why the system call approach? | ||
141 | |||
142 | A: The poor user-space interface is the second biggest problem with dnotify. | ||
143 | Signals are a terrible, terrible interface for file notification. Or for | ||
144 | anything, for that matter. The ideal solution, from all perspectives, is a | ||
145 | file descriptor-based one that allows basic file I/O and poll/select. | ||
146 | Obtaining the fd and managing the watches could have been done either via a | ||
147 | device file or a family of new system calls. We decided to implement a | ||
148 | family of system calls because that is the preffered approach for new kernel | ||
149 | interfaces. The only real difference was whether we wanted to use open(2) | ||
150 | and ioctl(2) or a couple of new system calls. System calls beat ioctls. | ||
151 | |||
diff --git a/Documentation/filesystems/ntfs.txt b/Documentation/filesystems/ntfs.txt index f89b440fad1d..eef4aca0c753 100644 --- a/Documentation/filesystems/ntfs.txt +++ b/Documentation/filesystems/ntfs.txt | |||
@@ -21,7 +21,7 @@ Overview | |||
21 | ======== | 21 | ======== |
22 | 22 | ||
23 | Linux-NTFS comes with a number of user-space programs known as ntfsprogs. | 23 | Linux-NTFS comes with a number of user-space programs known as ntfsprogs. |
24 | These include mkntfs, a full-featured ntfs file system format utility, | 24 | These include mkntfs, a full-featured ntfs filesystem format utility, |
25 | ntfsundelete used for recovering files that were unintentionally deleted | 25 | ntfsundelete used for recovering files that were unintentionally deleted |
26 | from an NTFS volume and ntfsresize which is used to resize an NTFS partition. | 26 | from an NTFS volume and ntfsresize which is used to resize an NTFS partition. |
27 | See the web site for more information. | 27 | See the web site for more information. |
@@ -149,7 +149,14 @@ case_sensitive=<BOOL> If case_sensitive is specified, treat all file names as | |||
149 | name, if it exists. If case_sensitive, you will need | 149 | name, if it exists. If case_sensitive, you will need |
150 | to provide the correct case of the short file name. | 150 | to provide the correct case of the short file name. |
151 | 151 | ||
152 | errors=opt What to do when critical file system errors are found. | 152 | disable_sparse=<BOOL> If disable_sparse is specified, creation of sparse |
153 | regions, i.e. holes, inside files is disabled for the | ||
154 | volume (for the duration of this mount only). By | ||
155 | default, creation of sparse regions is enabled, which | ||
156 | is consistent with the behaviour of traditional Unix | ||
157 | filesystems. | ||
158 | |||
159 | errors=opt What to do when critical filesystem errors are found. | ||
153 | Following values can be used for "opt": | 160 | Following values can be used for "opt": |
154 | continue: DEFAULT, try to clean-up as much as | 161 | continue: DEFAULT, try to clean-up as much as |
155 | possible, e.g. marking a corrupt inode as | 162 | possible, e.g. marking a corrupt inode as |
@@ -432,6 +439,24 @@ ChangeLog | |||
432 | 439 | ||
433 | Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog. | 440 | Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog. |
434 | 441 | ||
442 | 2.1.23: | ||
443 | - Stamp the user space journal, aka transaction log, aka $UsnJrnl, if | ||
444 | it is present and active thus telling Windows and applications using | ||
445 | the transaction log that changes can have happened on the volume | ||
446 | which are not recorded in $UsnJrnl. | ||
447 | - Detect the case when Windows has been hibernated (suspended to disk) | ||
448 | and if this is the case do not allow (re)mounting read-write to | ||
449 | prevent data corruption when you boot back into the suspended | ||
450 | Windows session. | ||
451 | - Implement extension of resident files using the normal file write | ||
452 | code paths, i.e. most very small files can be extended to be a little | ||
453 | bit bigger but not by much. | ||
454 | - Add new mount option "disable_sparse". (See list of mount options | ||
455 | above for details.) | ||
456 | - Improve handling of ntfs volumes with errors and strange boot sectors | ||
457 | in particular. | ||
458 | - Fix various bugs including a nasty deadlock that appeared in recent | ||
459 | kernels (around 2.6.11-2.6.12 timeframe). | ||
435 | 2.1.22: | 460 | 2.1.22: |
436 | - Improve handling of ntfs volumes with errors. | 461 | - Improve handling of ntfs volumes with errors. |
437 | - Fix various bugs and race conditions. | 462 | - Fix various bugs and race conditions. |
diff --git a/Documentation/filesystems/xip.txt b/Documentation/filesystems/xip.txt new file mode 100644 index 000000000000..6c0cef10eb4d --- /dev/null +++ b/Documentation/filesystems/xip.txt | |||
@@ -0,0 +1,67 @@ | |||
1 | Execute-in-place for file mappings | ||
2 | ---------------------------------- | ||
3 | |||
4 | Motivation | ||
5 | ---------- | ||
6 | File mappings are performed by mapping page cache pages to userspace. In | ||
7 | addition, read&write type file operations also transfer data from/to the page | ||
8 | cache. | ||
9 | |||
10 | For memory backed storage devices that use the block device interface, the page | ||
11 | cache pages are in fact copies of the original storage. Various approaches | ||
12 | exist to work around the need for an extra copy. The ramdisk driver for example | ||
13 | does read the data into the page cache, keeps a reference, and discards the | ||
14 | original data behind later on. | ||
15 | |||
16 | Execute-in-place solves this issue the other way around: instead of keeping | ||
17 | data in the page cache, the need to have a page cache copy is eliminated | ||
18 | completely. With execute-in-place, read&write type operations are performed | ||
19 | directly from/to the memory backed storage device. For file mappings, the | ||
20 | storage device itself is mapped directly into userspace. | ||
21 | |||
22 | This implementation was initialy written for shared memory segments between | ||
23 | different virtual machines on s390 hardware to allow multiple machines to | ||
24 | share the same binaries and libraries. | ||
25 | |||
26 | Implementation | ||
27 | -------------- | ||
28 | Execute-in-place is implemented in three steps: block device operation, | ||
29 | address space operation, and file operations. | ||
30 | |||
31 | A block device operation named direct_access is used to retrieve a | ||
32 | reference (pointer) to a block on-disk. The reference is supposed to be | ||
33 | cpu-addressable, physical address and remain valid until the release operation | ||
34 | is performed. A struct block_device reference is used to address the device, | ||
35 | and a sector_t argument is used to identify the individual block. As an | ||
36 | alternative, memory technology devices can be used for this. | ||
37 | |||
38 | The block device operation is optional, these block devices support it as of | ||
39 | today: | ||
40 | - dcssblk: s390 dcss block device driver | ||
41 | |||
42 | An address space operation named get_xip_page is used to retrieve reference | ||
43 | to a struct page. To address the target page, a reference to an address_space, | ||
44 | and a sector number is provided. A 3rd argument indicates whether the | ||
45 | function should allocate blocks if needed. | ||
46 | |||
47 | This address space operation is mutually exclusive with readpage&writepage that | ||
48 | do page cache read/write operations. | ||
49 | The following filesystems support it as of today: | ||
50 | - ext2: the second extended filesystem, see Documentation/filesystems/ext2.txt | ||
51 | |||
52 | A set of file operations that do utilize get_xip_page can be found in | ||
53 | mm/filemap_xip.c . The following file operation implementations are provided: | ||
54 | - aio_read/aio_write | ||
55 | - readv/writev | ||
56 | - sendfile | ||
57 | |||
58 | The generic file operations do_sync_read/do_sync_write can be used to implement | ||
59 | classic synchronous IO calls. | ||
60 | |||
61 | Shortcomings | ||
62 | ------------ | ||
63 | This implementation is limited to storage devices that are cpu addressable at | ||
64 | all times (no highmem or such). It works well on rom/ram, but enhancements are | ||
65 | needed to make it work with flash in read+write mode. | ||
66 | Putting the Linux kernel and/or its modules on a xip filesystem does not mean | ||
67 | they are not copied. | ||
diff --git a/Documentation/i2c/chips/adm1021 b/Documentation/hwmon/adm1021 index 03d02bfb3df1..03d02bfb3df1 100644 --- a/Documentation/i2c/chips/adm1021 +++ b/Documentation/hwmon/adm1021 | |||
diff --git a/Documentation/i2c/chips/adm1025 b/Documentation/hwmon/adm1025 index 39d2b781b5d6..39d2b781b5d6 100644 --- a/Documentation/i2c/chips/adm1025 +++ b/Documentation/hwmon/adm1025 | |||
diff --git a/Documentation/i2c/chips/adm1026 b/Documentation/hwmon/adm1026 index 473c689d7924..473c689d7924 100644 --- a/Documentation/i2c/chips/adm1026 +++ b/Documentation/hwmon/adm1026 | |||
diff --git a/Documentation/i2c/chips/adm1031 b/Documentation/hwmon/adm1031 index 130a38382b98..130a38382b98 100644 --- a/Documentation/i2c/chips/adm1031 +++ b/Documentation/hwmon/adm1031 | |||
diff --git a/Documentation/i2c/chips/adm9240 b/Documentation/hwmon/adm9240 index 35f618f32896..35f618f32896 100644 --- a/Documentation/i2c/chips/adm9240 +++ b/Documentation/hwmon/adm9240 | |||
diff --git a/Documentation/i2c/chips/asb100 b/Documentation/hwmon/asb100 index ab7365e139be..ab7365e139be 100644 --- a/Documentation/i2c/chips/asb100 +++ b/Documentation/hwmon/asb100 | |||
diff --git a/Documentation/i2c/chips/ds1621 b/Documentation/hwmon/ds1621 index 1fee6f1e6bc5..1fee6f1e6bc5 100644 --- a/Documentation/i2c/chips/ds1621 +++ b/Documentation/hwmon/ds1621 | |||
diff --git a/Documentation/i2c/chips/fscher b/Documentation/hwmon/fscher index 64031659aff3..64031659aff3 100644 --- a/Documentation/i2c/chips/fscher +++ b/Documentation/hwmon/fscher | |||
diff --git a/Documentation/i2c/chips/gl518sm b/Documentation/hwmon/gl518sm index ce0881883bca..ce0881883bca 100644 --- a/Documentation/i2c/chips/gl518sm +++ b/Documentation/hwmon/gl518sm | |||
diff --git a/Documentation/i2c/chips/it87 b/Documentation/hwmon/it87 index 0d0195040d88..0d0195040d88 100644 --- a/Documentation/i2c/chips/it87 +++ b/Documentation/hwmon/it87 | |||
diff --git a/Documentation/i2c/chips/lm63 b/Documentation/hwmon/lm63 index 31660bf97979..31660bf97979 100644 --- a/Documentation/i2c/chips/lm63 +++ b/Documentation/hwmon/lm63 | |||
diff --git a/Documentation/i2c/chips/lm75 b/Documentation/hwmon/lm75 index 8e6356fe05d7..8e6356fe05d7 100644 --- a/Documentation/i2c/chips/lm75 +++ b/Documentation/hwmon/lm75 | |||
diff --git a/Documentation/i2c/chips/lm77 b/Documentation/hwmon/lm77 index 57c3a46d6370..57c3a46d6370 100644 --- a/Documentation/i2c/chips/lm77 +++ b/Documentation/hwmon/lm77 | |||
diff --git a/Documentation/i2c/chips/lm78 b/Documentation/hwmon/lm78 index 357086ed7f64..357086ed7f64 100644 --- a/Documentation/i2c/chips/lm78 +++ b/Documentation/hwmon/lm78 | |||
diff --git a/Documentation/i2c/chips/lm80 b/Documentation/hwmon/lm80 index cb5b407ba3e6..cb5b407ba3e6 100644 --- a/Documentation/i2c/chips/lm80 +++ b/Documentation/hwmon/lm80 | |||
diff --git a/Documentation/i2c/chips/lm83 b/Documentation/hwmon/lm83 index 061d9ed8ff43..061d9ed8ff43 100644 --- a/Documentation/i2c/chips/lm83 +++ b/Documentation/hwmon/lm83 | |||
diff --git a/Documentation/i2c/chips/lm85 b/Documentation/hwmon/lm85 index 9549237530cf..9549237530cf 100644 --- a/Documentation/i2c/chips/lm85 +++ b/Documentation/hwmon/lm85 | |||
diff --git a/Documentation/i2c/chips/lm87 b/Documentation/hwmon/lm87 index c952c57f0e11..c952c57f0e11 100644 --- a/Documentation/i2c/chips/lm87 +++ b/Documentation/hwmon/lm87 | |||
diff --git a/Documentation/i2c/chips/lm90 b/Documentation/hwmon/lm90 index 2c4cf39471f4..2c4cf39471f4 100644 --- a/Documentation/i2c/chips/lm90 +++ b/Documentation/hwmon/lm90 | |||
diff --git a/Documentation/i2c/chips/lm92 b/Documentation/hwmon/lm92 index 7705bfaa0708..7705bfaa0708 100644 --- a/Documentation/i2c/chips/lm92 +++ b/Documentation/hwmon/lm92 | |||
diff --git a/Documentation/i2c/chips/max1619 b/Documentation/hwmon/max1619 index d6f8d9cd7d7f..d6f8d9cd7d7f 100644 --- a/Documentation/i2c/chips/max1619 +++ b/Documentation/hwmon/max1619 | |||
diff --git a/Documentation/i2c/chips/pc87360 b/Documentation/hwmon/pc87360 index 89a8fcfa78df..89a8fcfa78df 100644 --- a/Documentation/i2c/chips/pc87360 +++ b/Documentation/hwmon/pc87360 | |||
diff --git a/Documentation/i2c/chips/sis5595 b/Documentation/hwmon/sis5595 index b7ae36b8cdf5..b7ae36b8cdf5 100644 --- a/Documentation/i2c/chips/sis5595 +++ b/Documentation/hwmon/sis5595 | |||
diff --git a/Documentation/i2c/chips/smsc47b397 b/Documentation/hwmon/smsc47b397 index da9d80c96432..da9d80c96432 100644 --- a/Documentation/i2c/chips/smsc47b397 +++ b/Documentation/hwmon/smsc47b397 | |||
diff --git a/Documentation/i2c/chips/smsc47m1 b/Documentation/hwmon/smsc47m1 index 34e6478c1425..34e6478c1425 100644 --- a/Documentation/i2c/chips/smsc47m1 +++ b/Documentation/hwmon/smsc47m1 | |||
diff --git a/Documentation/i2c/sysfs-interface b/Documentation/hwmon/sysfs-interface index 346400519d0d..346400519d0d 100644 --- a/Documentation/i2c/sysfs-interface +++ b/Documentation/hwmon/sysfs-interface | |||
diff --git a/Documentation/i2c/userspace-tools b/Documentation/hwmon/userspace-tools index 2622aac65422..2622aac65422 100644 --- a/Documentation/i2c/userspace-tools +++ b/Documentation/hwmon/userspace-tools | |||
diff --git a/Documentation/i2c/chips/via686a b/Documentation/hwmon/via686a index b82014cb7c53..b82014cb7c53 100644 --- a/Documentation/i2c/chips/via686a +++ b/Documentation/hwmon/via686a | |||
diff --git a/Documentation/i2c/chips/w83627hf b/Documentation/hwmon/w83627hf index 78f37c2d602e..78f37c2d602e 100644 --- a/Documentation/i2c/chips/w83627hf +++ b/Documentation/hwmon/w83627hf | |||
diff --git a/Documentation/i2c/chips/w83781d b/Documentation/hwmon/w83781d index e5459333ba68..e5459333ba68 100644 --- a/Documentation/i2c/chips/w83781d +++ b/Documentation/hwmon/w83781d | |||
diff --git a/Documentation/i2c/chips/w83l785ts b/Documentation/hwmon/w83l785ts index 1841cedc25b2..1841cedc25b2 100644 --- a/Documentation/i2c/chips/w83l785ts +++ b/Documentation/hwmon/w83l785ts | |||
diff --git a/Documentation/i2c/chips/max6875 b/Documentation/i2c/chips/max6875 index b4fb49b41813..b02002898a09 100644 --- a/Documentation/i2c/chips/max6875 +++ b/Documentation/i2c/chips/max6875 | |||
@@ -2,10 +2,10 @@ Kernel driver max6875 | |||
2 | ===================== | 2 | ===================== |
3 | 3 | ||
4 | Supported chips: | 4 | Supported chips: |
5 | * Maxim max6874, max6875 | 5 | * Maxim MAX6874, MAX6875 |
6 | Prefixes: 'max6875' | 6 | Prefix: 'max6875' |
7 | Addresses scanned: 0x50, 0x52 | 7 | Addresses scanned: 0x50, 0x52 |
8 | Datasheets: | 8 | Datasheet: |
9 | http://pdfserv.maxim-ic.com/en/ds/MAX6874-MAX6875.pdf | 9 | http://pdfserv.maxim-ic.com/en/ds/MAX6874-MAX6875.pdf |
10 | 10 | ||
11 | Author: Ben Gardner <bgardner@wabtec.com> | 11 | Author: Ben Gardner <bgardner@wabtec.com> |
@@ -23,14 +23,26 @@ Module Parameters | |||
23 | Description | 23 | Description |
24 | ----------- | 24 | ----------- |
25 | 25 | ||
26 | The MAXIM max6875 is a EEPROM-programmable power-supply sequencer/supervisor. | 26 | The Maxim MAX6875 is an EEPROM-programmable power-supply sequencer/supervisor. |
27 | It provides timed outputs that can be used as a watchdog, if properly wired. | 27 | It provides timed outputs that can be used as a watchdog, if properly wired. |
28 | It also provides 512 bytes of user EEPROM. | 28 | It also provides 512 bytes of user EEPROM. |
29 | 29 | ||
30 | At reset, the max6875 reads the configuration eeprom into its configuration | 30 | At reset, the MAX6875 reads the configuration EEPROM into its configuration |
31 | registers. The chip then begins to operate according to the values in the | 31 | registers. The chip then begins to operate according to the values in the |
32 | registers. | 32 | registers. |
33 | 33 | ||
34 | The Maxim MAX6874 is a similar, mostly compatible device, with more intputs | ||
35 | and outputs: | ||
36 | |||
37 | vin gpi vout | ||
38 | MAX6874 6 4 8 | ||
39 | MAX6875 4 3 5 | ||
40 | |||
41 | MAX6874 chips can have four different addresses (as opposed to only two for | ||
42 | the MAX6875). The additional addresses (0x54 and 0x56) are not probed by | ||
43 | this driver by default, but the probe module parameter can be used if | ||
44 | needed. | ||
45 | |||
34 | See the datasheet for details on how to program the EEPROM. | 46 | See the datasheet for details on how to program the EEPROM. |
35 | 47 | ||
36 | 48 | ||
diff --git a/Documentation/i2c/dev-interface b/Documentation/i2c/dev-interface index 09d6cda2a1fb..b849ad636583 100644 --- a/Documentation/i2c/dev-interface +++ b/Documentation/i2c/dev-interface | |||
@@ -14,9 +14,12 @@ C example | |||
14 | ========= | 14 | ========= |
15 | 15 | ||
16 | So let's say you want to access an i2c adapter from a C program. The | 16 | So let's say you want to access an i2c adapter from a C program. The |
17 | first thing to do is `#include <linux/i2c.h>" and "#include <linux/i2c-dev.h>. | 17 | first thing to do is "#include <linux/i2c-dev.h>". Please note that |
18 | Yes, I know, you should never include kernel header files, but until glibc | 18 | there are two files named "i2c-dev.h" out there, one is distributed |
19 | knows about i2c, there is not much choice. | 19 | with the Linux kernel and is meant to be included from kernel |
20 | driver code, the other one is distributed with lm_sensors and is | ||
21 | meant to be included from user-space programs. You obviously want | ||
22 | the second one here. | ||
20 | 23 | ||
21 | Now, you have to decide which adapter you want to access. You should | 24 | Now, you have to decide which adapter you want to access. You should |
22 | inspect /sys/class/i2c-dev/ to decide this. Adapter numbers are assigned | 25 | inspect /sys/class/i2c-dev/ to decide this. Adapter numbers are assigned |
@@ -78,7 +81,7 @@ Full interface description | |||
78 | ========================== | 81 | ========================== |
79 | 82 | ||
80 | The following IOCTLs are defined and fully supported | 83 | The following IOCTLs are defined and fully supported |
81 | (see also i2c-dev.h and i2c.h): | 84 | (see also i2c-dev.h): |
82 | 85 | ||
83 | ioctl(file,I2C_SLAVE,long addr) | 86 | ioctl(file,I2C_SLAVE,long addr) |
84 | Change slave address. The address is passed in the 7 lower bits of the | 87 | Change slave address. The address is passed in the 7 lower bits of the |
@@ -97,10 +100,10 @@ ioctl(file,I2C_PEC,long select) | |||
97 | ioctl(file,I2C_FUNCS,unsigned long *funcs) | 100 | ioctl(file,I2C_FUNCS,unsigned long *funcs) |
98 | Gets the adapter functionality and puts it in *funcs. | 101 | Gets the adapter functionality and puts it in *funcs. |
99 | 102 | ||
100 | ioctl(file,I2C_RDWR,struct i2c_ioctl_rdwr_data *msgset) | 103 | ioctl(file,I2C_RDWR,struct i2c_rdwr_ioctl_data *msgset) |
101 | 104 | ||
102 | Do combined read/write transaction without stop in between. | 105 | Do combined read/write transaction without stop in between. |
103 | The argument is a pointer to a struct i2c_ioctl_rdwr_data { | 106 | The argument is a pointer to a struct i2c_rdwr_ioctl_data { |
104 | 107 | ||
105 | struct i2c_msg *msgs; /* ptr to array of simple messages */ | 108 | struct i2c_msg *msgs; /* ptr to array of simple messages */ |
106 | int nmsgs; /* number of messages to exchange */ | 109 | int nmsgs; /* number of messages to exchange */ |
diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index f482dae81de3..91664be91ffc 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients | |||
@@ -27,7 +27,6 @@ address. | |||
27 | static struct i2c_driver foo_driver = { | 27 | static struct i2c_driver foo_driver = { |
28 | .owner = THIS_MODULE, | 28 | .owner = THIS_MODULE, |
29 | .name = "Foo version 2.3 driver", | 29 | .name = "Foo version 2.3 driver", |
30 | .id = I2C_DRIVERID_FOO, /* from i2c-id.h, optional */ | ||
31 | .flags = I2C_DF_NOTIFY, | 30 | .flags = I2C_DF_NOTIFY, |
32 | .attach_adapter = &foo_attach_adapter, | 31 | .attach_adapter = &foo_attach_adapter, |
33 | .detach_client = &foo_detach_client, | 32 | .detach_client = &foo_detach_client, |
@@ -37,12 +36,6 @@ static struct i2c_driver foo_driver = { | |||
37 | The name can be chosen freely, and may be upto 40 characters long. Please | 36 | The name can be chosen freely, and may be upto 40 characters long. Please |
38 | use something descriptive here. | 37 | use something descriptive here. |
39 | 38 | ||
40 | If used, the id should be a unique ID. The range 0xf000 to 0xffff is | ||
41 | reserved for local use, and you can use one of those until you start | ||
42 | distributing the driver, at which time you should contact the i2c authors | ||
43 | to get your own ID(s). Note that most of the time you don't need an ID | ||
44 | at all so you can just omit it. | ||
45 | |||
46 | Don't worry about the flags field; just put I2C_DF_NOTIFY into it. This | 39 | Don't worry about the flags field; just put I2C_DF_NOTIFY into it. This |
47 | means that your driver will be notified when new adapters are found. | 40 | means that your driver will be notified when new adapters are found. |
48 | This is almost always what you want. | 41 | This is almost always what you want. |
diff --git a/Documentation/infiniband/core_locking.txt b/Documentation/infiniband/core_locking.txt new file mode 100644 index 000000000000..e1678542279a --- /dev/null +++ b/Documentation/infiniband/core_locking.txt | |||
@@ -0,0 +1,114 @@ | |||
1 | INFINIBAND MIDLAYER LOCKING | ||
2 | |||
3 | This guide is an attempt to make explicit the locking assumptions | ||
4 | made by the InfiniBand midlayer. It describes the requirements on | ||
5 | both low-level drivers that sit below the midlayer and upper level | ||
6 | protocols that use the midlayer. | ||
7 | |||
8 | Sleeping and interrupt context | ||
9 | |||
10 | With the following exceptions, a low-level driver implementation of | ||
11 | all of the methods in struct ib_device may sleep. The exceptions | ||
12 | are any methods from the list: | ||
13 | |||
14 | create_ah | ||
15 | modify_ah | ||
16 | query_ah | ||
17 | destroy_ah | ||
18 | bind_mw | ||
19 | post_send | ||
20 | post_recv | ||
21 | poll_cq | ||
22 | req_notify_cq | ||
23 | map_phys_fmr | ||
24 | |||
25 | which may not sleep and must be callable from any context. | ||
26 | |||
27 | The corresponding functions exported to upper level protocol | ||
28 | consumers: | ||
29 | |||
30 | ib_create_ah | ||
31 | ib_modify_ah | ||
32 | ib_query_ah | ||
33 | ib_destroy_ah | ||
34 | ib_bind_mw | ||
35 | ib_post_send | ||
36 | ib_post_recv | ||
37 | ib_req_notify_cq | ||
38 | ib_map_phys_fmr | ||
39 | |||
40 | are therefore safe to call from any context. | ||
41 | |||
42 | In addition, the function | ||
43 | |||
44 | ib_dispatch_event | ||
45 | |||
46 | used by low-level drivers to dispatch asynchronous events through | ||
47 | the midlayer is also safe to call from any context. | ||
48 | |||
49 | Reentrancy | ||
50 | |||
51 | All of the methods in struct ib_device exported by a low-level | ||
52 | driver must be fully reentrant. The low-level driver is required to | ||
53 | perform all synchronization necessary to maintain consistency, even | ||
54 | if multiple function calls using the same object are run | ||
55 | simultaneously. | ||
56 | |||
57 | The IB midlayer does not perform any serialization of function calls. | ||
58 | |||
59 | Because low-level drivers are reentrant, upper level protocol | ||
60 | consumers are not required to perform any serialization. However, | ||
61 | some serialization may be required to get sensible results. For | ||
62 | example, a consumer may safely call ib_poll_cq() on multiple CPUs | ||
63 | simultaneously. However, the ordering of the work completion | ||
64 | information between different calls of ib_poll_cq() is not defined. | ||
65 | |||
66 | Callbacks | ||
67 | |||
68 | A low-level driver must not perform a callback directly from the | ||
69 | same callchain as an ib_device method call. For example, it is not | ||
70 | allowed for a low-level driver to call a consumer's completion event | ||
71 | handler directly from its post_send method. Instead, the low-level | ||
72 | driver should defer this callback by, for example, scheduling a | ||
73 | tasklet to perform the callback. | ||
74 | |||
75 | The low-level driver is responsible for ensuring that multiple | ||
76 | completion event handlers for the same CQ are not called | ||
77 | simultaneously. The driver must guarantee that only one CQ event | ||
78 | handler for a given CQ is running at a time. In other words, the | ||
79 | following situation is not allowed: | ||
80 | |||
81 | CPU1 CPU2 | ||
82 | |||
83 | low-level driver -> | ||
84 | consumer CQ event callback: | ||
85 | /* ... */ | ||
86 | ib_req_notify_cq(cq, ...); | ||
87 | low-level driver -> | ||
88 | /* ... */ consumer CQ event callback: | ||
89 | /* ... */ | ||
90 | return from CQ event handler | ||
91 | |||
92 | The context in which completion event and asynchronous event | ||
93 | callbacks run is not defined. Depending on the low-level driver, it | ||
94 | may be process context, softirq context, or interrupt context. | ||
95 | Upper level protocol consumers may not sleep in a callback. | ||
96 | |||
97 | Hot-plug | ||
98 | |||
99 | A low-level driver announces that a device is ready for use by | ||
100 | consumers when it calls ib_register_device(), all initialization | ||
101 | must be complete before this call. The device must remain usable | ||
102 | until the driver's call to ib_unregister_device() has returned. | ||
103 | |||
104 | A low-level driver must call ib_register_device() and | ||
105 | ib_unregister_device() from process context. It must not hold any | ||
106 | semaphores that could cause deadlock if a consumer calls back into | ||
107 | the driver across these calls. | ||
108 | |||
109 | An upper level protocol consumer may begin using an IB device as | ||
110 | soon as the add method of its struct ib_client is called for that | ||
111 | device. A consumer must finish all cleanup and free all resources | ||
112 | relating to a device before returning from the remove method. | ||
113 | |||
114 | A consumer is permitted to sleep in its add and remove methods. | ||
diff --git a/Documentation/infiniband/user_mad.txt b/Documentation/infiniband/user_mad.txt index cae0c83f1ee9..750fe5e80ebc 100644 --- a/Documentation/infiniband/user_mad.txt +++ b/Documentation/infiniband/user_mad.txt | |||
@@ -28,13 +28,37 @@ Creating MAD agents | |||
28 | 28 | ||
29 | Receiving MADs | 29 | Receiving MADs |
30 | 30 | ||
31 | MADs are received using read(). The buffer passed to read() must be | 31 | MADs are received using read(). The receive side now supports |
32 | large enough to hold at least one struct ib_user_mad. For example: | 32 | RMPP. The buffer passed to read() must be at least one |
33 | 33 | struct ib_user_mad + 256 bytes. For example: | |
34 | struct ib_user_mad mad; | 34 | |
35 | ret = read(fd, &mad, sizeof mad); | 35 | If the buffer passed is not large enough to hold the received |
36 | if (ret != sizeof mad) | 36 | MAD (RMPP), the errno is set to ENOSPC and the length of the |
37 | buffer needed is set in mad.length. | ||
38 | |||
39 | Example for normal MAD (non RMPP) reads: | ||
40 | struct ib_user_mad *mad; | ||
41 | mad = malloc(sizeof *mad + 256); | ||
42 | ret = read(fd, mad, sizeof *mad + 256); | ||
43 | if (ret != sizeof mad + 256) { | ||
44 | perror("read"); | ||
45 | free(mad); | ||
46 | } | ||
47 | |||
48 | Example for RMPP reads: | ||
49 | struct ib_user_mad *mad; | ||
50 | mad = malloc(sizeof *mad + 256); | ||
51 | ret = read(fd, mad, sizeof *mad + 256); | ||
52 | if (ret == -ENOSPC)) { | ||
53 | length = mad.length; | ||
54 | free(mad); | ||
55 | mad = malloc(sizeof *mad + length); | ||
56 | ret = read(fd, mad, sizeof *mad + length); | ||
57 | } | ||
58 | if (ret < 0) { | ||
37 | perror("read"); | 59 | perror("read"); |
60 | free(mad); | ||
61 | } | ||
38 | 62 | ||
39 | In addition to the actual MAD contents, the other struct ib_user_mad | 63 | In addition to the actual MAD contents, the other struct ib_user_mad |
40 | fields will be filled in with information on the received MAD. For | 64 | fields will be filled in with information on the received MAD. For |
@@ -50,18 +74,21 @@ Sending MADs | |||
50 | 74 | ||
51 | MADs are sent using write(). The agent ID for sending should be | 75 | MADs are sent using write(). The agent ID for sending should be |
52 | filled into the id field of the MAD, the destination LID should be | 76 | filled into the id field of the MAD, the destination LID should be |
53 | filled into the lid field, and so on. For example: | 77 | filled into the lid field, and so on. The send side does support |
78 | RMPP so arbitrary length MAD can be sent. For example: | ||
79 | |||
80 | struct ib_user_mad *mad; | ||
54 | 81 | ||
55 | struct ib_user_mad mad; | 82 | mad = malloc(sizeof *mad + mad_length); |
56 | 83 | ||
57 | /* fill in mad.data */ | 84 | /* fill in mad->data */ |
58 | 85 | ||
59 | mad.id = my_agent; /* req.id from agent registration */ | 86 | mad->hdr.id = my_agent; /* req.id from agent registration */ |
60 | mad.lid = my_dest; /* in network byte order... */ | 87 | mad->hdr.lid = my_dest; /* in network byte order... */ |
61 | /* etc. */ | 88 | /* etc. */ |
62 | 89 | ||
63 | ret = write(fd, &mad, sizeof mad); | 90 | ret = write(fd, &mad, sizeof *mad + mad_length); |
64 | if (ret != sizeof mad) | 91 | if (ret != sizeof *mad + mad_length) |
65 | perror("write"); | 92 | perror("write"); |
66 | 93 | ||
67 | Setting IsSM Capability Bit | 94 | Setting IsSM Capability Bit |
diff --git a/Documentation/infiniband/user_verbs.txt b/Documentation/infiniband/user_verbs.txt new file mode 100644 index 000000000000..f847501e50b5 --- /dev/null +++ b/Documentation/infiniband/user_verbs.txt | |||
@@ -0,0 +1,69 @@ | |||
1 | USERSPACE VERBS ACCESS | ||
2 | |||
3 | The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS, | ||
4 | enables direct userspace access to IB hardware via "verbs," as | ||
5 | described in chapter 11 of the InfiniBand Architecture Specification. | ||
6 | |||
7 | To use the verbs, the libibverbs library, available from | ||
8 | <http://openib.org/>, is required. libibverbs contains a | ||
9 | device-independent API for using the ib_uverbs interface. | ||
10 | libibverbs also requires appropriate device-dependent kernel and | ||
11 | userspace driver for your InfiniBand hardware. For example, to use | ||
12 | a Mellanox HCA, you will need the ib_mthca kernel module and the | ||
13 | libmthca userspace driver be installed. | ||
14 | |||
15 | User-kernel communication | ||
16 | |||
17 | Userspace communicates with the kernel for slow path, resource | ||
18 | management operations via the /dev/infiniband/uverbsN character | ||
19 | devices. Fast path operations are typically performed by writing | ||
20 | directly to hardware registers mmap()ed into userspace, with no | ||
21 | system call or context switch into the kernel. | ||
22 | |||
23 | Commands are sent to the kernel via write()s on these device files. | ||
24 | The ABI is defined in drivers/infiniband/include/ib_user_verbs.h. | ||
25 | The structs for commands that require a response from the kernel | ||
26 | contain a 64-bit field used to pass a pointer to an output buffer. | ||
27 | Status is returned to userspace as the return value of the write() | ||
28 | system call. | ||
29 | |||
30 | Resource management | ||
31 | |||
32 | Since creation and destruction of all IB resources is done by | ||
33 | commands passed through a file descriptor, the kernel can keep track | ||
34 | of which resources are attached to a given userspace context. The | ||
35 | ib_uverbs module maintains idr tables that are used to translate | ||
36 | between kernel pointers and opaque userspace handles, so that kernel | ||
37 | pointers are never exposed to userspace and userspace cannot trick | ||
38 | the kernel into following a bogus pointer. | ||
39 | |||
40 | This also allows the kernel to clean up when a process exits and | ||
41 | prevent one process from touching another process's resources. | ||
42 | |||
43 | Memory pinning | ||
44 | |||
45 | Direct userspace I/O requires that memory regions that are potential | ||
46 | I/O targets be kept resident at the same physical address. The | ||
47 | ib_uverbs module manages pinning and unpinning memory regions via | ||
48 | get_user_pages() and put_page() calls. It also accounts for the | ||
49 | amount of memory pinned in the process's locked_vm, and checks that | ||
50 | unprivileged processes do not exceed their RLIMIT_MEMLOCK limit. | ||
51 | |||
52 | Pages that are pinned multiple times are counted each time they are | ||
53 | pinned, so the value of locked_vm may be an overestimate of the | ||
54 | number of pages pinned by a process. | ||
55 | |||
56 | /dev files | ||
57 | |||
58 | To create the appropriate character device files automatically with | ||
59 | udev, a rule like | ||
60 | |||
61 | KERNEL="uverbs*", NAME="infiniband/%k" | ||
62 | |||
63 | can be used. This will create device nodes named | ||
64 | |||
65 | /dev/infiniband/uverbs0 | ||
66 | |||
67 | and so on. Since the InfiniBand userspace verbs should be safe for | ||
68 | use by non-privileged processes, it may be useful to add an | ||
69 | appropriate MODE or GROUP to the udev rule. | ||
diff --git a/Documentation/kdump/gdbmacros.txt b/Documentation/kdump/gdbmacros.txt new file mode 100644 index 000000000000..bc1b9eb92ae1 --- /dev/null +++ b/Documentation/kdump/gdbmacros.txt | |||
@@ -0,0 +1,179 @@ | |||
1 | # | ||
2 | # This file contains a few gdb macros (user defined commands) to extract | ||
3 | # useful information from kernel crashdump (kdump) like stack traces of | ||
4 | # all the processes or a particular process and trapinfo. | ||
5 | # | ||
6 | # These macros can be used by copying this file in .gdbinit (put in home | ||
7 | # directory or current directory) or by invoking gdb command with | ||
8 | # --command=<command-file-name> option | ||
9 | # | ||
10 | # Credits: | ||
11 | # Alexander Nyberg <alexn@telia.com> | ||
12 | # V Srivatsa <vatsa@in.ibm.com> | ||
13 | # Maneesh Soni <maneesh@in.ibm.com> | ||
14 | # | ||
15 | |||
16 | define bttnobp | ||
17 | set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) | ||
18 | set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) | ||
19 | set $init_t=&init_task | ||
20 | set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) | ||
21 | while ($next_t != $init_t) | ||
22 | set $next_t=(struct task_struct *)$next_t | ||
23 | printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm | ||
24 | printf "===================\n" | ||
25 | set var $stackp = $next_t.thread.esp | ||
26 | set var $stack_top = ($stackp & ~4095) + 4096 | ||
27 | |||
28 | while ($stackp < $stack_top) | ||
29 | if (*($stackp) > _stext && *($stackp) < _sinittext) | ||
30 | info symbol *($stackp) | ||
31 | end | ||
32 | set $stackp += 4 | ||
33 | end | ||
34 | set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) | ||
35 | while ($next_th != $next_t) | ||
36 | set $next_th=(struct task_struct *)$next_th | ||
37 | printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm | ||
38 | printf "===================\n" | ||
39 | set var $stackp = $next_t.thread.esp | ||
40 | set var $stack_top = ($stackp & ~4095) + 4096 | ||
41 | |||
42 | while ($stackp < $stack_top) | ||
43 | if (*($stackp) > _stext && *($stackp) < _sinittext) | ||
44 | info symbol *($stackp) | ||
45 | end | ||
46 | set $stackp += 4 | ||
47 | end | ||
48 | set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) | ||
49 | end | ||
50 | set $next_t=(char *)($next_t->tasks.next) - $tasks_off | ||
51 | end | ||
52 | end | ||
53 | document bttnobp | ||
54 | dump all thread stack traces on a kernel compiled with !CONFIG_FRAME_POINTER | ||
55 | end | ||
56 | |||
57 | define btt | ||
58 | set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) | ||
59 | set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) | ||
60 | set $init_t=&init_task | ||
61 | set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) | ||
62 | while ($next_t != $init_t) | ||
63 | set $next_t=(struct task_struct *)$next_t | ||
64 | printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm | ||
65 | printf "===================\n" | ||
66 | set var $stackp = $next_t.thread.esp | ||
67 | set var $stack_top = ($stackp & ~4095) + 4096 | ||
68 | set var $stack_bot = ($stackp & ~4095) | ||
69 | |||
70 | set $stackp = *($stackp) | ||
71 | while (($stackp < $stack_top) && ($stackp > $stack_bot)) | ||
72 | set var $addr = *($stackp + 4) | ||
73 | info symbol $addr | ||
74 | set $stackp = *($stackp) | ||
75 | end | ||
76 | |||
77 | set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) | ||
78 | while ($next_th != $next_t) | ||
79 | set $next_th=(struct task_struct *)$next_th | ||
80 | printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm | ||
81 | printf "===================\n" | ||
82 | set var $stackp = $next_t.thread.esp | ||
83 | set var $stack_top = ($stackp & ~4095) + 4096 | ||
84 | set var $stack_bot = ($stackp & ~4095) | ||
85 | |||
86 | set $stackp = *($stackp) | ||
87 | while (($stackp < $stack_top) && ($stackp > $stack_bot)) | ||
88 | set var $addr = *($stackp + 4) | ||
89 | info symbol $addr | ||
90 | set $stackp = *($stackp) | ||
91 | end | ||
92 | set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) | ||
93 | end | ||
94 | set $next_t=(char *)($next_t->tasks.next) - $tasks_off | ||
95 | end | ||
96 | end | ||
97 | document btt | ||
98 | dump all thread stack traces on a kernel compiled with CONFIG_FRAME_POINTER | ||
99 | end | ||
100 | |||
101 | define btpid | ||
102 | set var $pid = $arg0 | ||
103 | set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) | ||
104 | set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) | ||
105 | set $init_t=&init_task | ||
106 | set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) | ||
107 | set var $pid_task = 0 | ||
108 | |||
109 | while ($next_t != $init_t) | ||
110 | set $next_t=(struct task_struct *)$next_t | ||
111 | |||
112 | if ($next_t.pid == $pid) | ||
113 | set $pid_task = $next_t | ||
114 | end | ||
115 | |||
116 | set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) | ||
117 | while ($next_th != $next_t) | ||
118 | set $next_th=(struct task_struct *)$next_th | ||
119 | if ($next_th.pid == $pid) | ||
120 | set $pid_task = $next_th | ||
121 | end | ||
122 | set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) | ||
123 | end | ||
124 | set $next_t=(char *)($next_t->tasks.next) - $tasks_off | ||
125 | end | ||
126 | |||
127 | printf "\npid %d; comm %s:\n", $pid_task.pid, $pid_task.comm | ||
128 | printf "===================\n" | ||
129 | set var $stackp = $pid_task.thread.esp | ||
130 | set var $stack_top = ($stackp & ~4095) + 4096 | ||
131 | set var $stack_bot = ($stackp & ~4095) | ||
132 | |||
133 | set $stackp = *($stackp) | ||
134 | while (($stackp < $stack_top) && ($stackp > $stack_bot)) | ||
135 | set var $addr = *($stackp + 4) | ||
136 | info symbol $addr | ||
137 | set $stackp = *($stackp) | ||
138 | end | ||
139 | end | ||
140 | document btpid | ||
141 | backtrace of pid | ||
142 | end | ||
143 | |||
144 | |||
145 | define trapinfo | ||
146 | set var $pid = $arg0 | ||
147 | set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) | ||
148 | set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) | ||
149 | set $init_t=&init_task | ||
150 | set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) | ||
151 | set var $pid_task = 0 | ||
152 | |||
153 | while ($next_t != $init_t) | ||
154 | set $next_t=(struct task_struct *)$next_t | ||
155 | |||
156 | if ($next_t.pid == $pid) | ||
157 | set $pid_task = $next_t | ||
158 | end | ||
159 | |||
160 | set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) | ||
161 | while ($next_th != $next_t) | ||
162 | set $next_th=(struct task_struct *)$next_th | ||
163 | if ($next_th.pid == $pid) | ||
164 | set $pid_task = $next_th | ||
165 | end | ||
166 | set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) | ||
167 | end | ||
168 | set $next_t=(char *)($next_t->tasks.next) - $tasks_off | ||
169 | end | ||
170 | |||
171 | printf "Trapno %ld, cr2 0x%lx, error_code %ld\n", $pid_task.thread.trap_no, \ | ||
172 | $pid_task.thread.cr2, $pid_task.thread.error_code | ||
173 | |||
174 | end | ||
175 | document trapinfo | ||
176 | Run info threads and lookup pid of thread #1 | ||
177 | 'trapinfo <pid>' will tell you by which trap & possibly | ||
178 | addresthe kernel paniced. | ||
179 | end | ||
diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt new file mode 100644 index 000000000000..7ff213f4becd --- /dev/null +++ b/Documentation/kdump/kdump.txt | |||
@@ -0,0 +1,141 @@ | |||
1 | Documentation for kdump - the kexec-based crash dumping solution | ||
2 | ================================================================ | ||
3 | |||
4 | DESIGN | ||
5 | ====== | ||
6 | |||
7 | Kdump uses kexec to reboot to a second kernel whenever a dump needs to be taken. | ||
8 | This second kernel is booted with very little memory. The first kernel reserves | ||
9 | the section of memory that the second kernel uses. This ensures that on-going | ||
10 | DMA from the first kernel does not corrupt the second kernel. | ||
11 | |||
12 | All the necessary information about Core image is encoded in ELF format and | ||
13 | stored in reserved area of memory before crash. Physical address of start of | ||
14 | ELF header is passed to new kernel through command line parameter elfcorehdr=. | ||
15 | |||
16 | On i386, the first 640 KB of physical memory is needed to boot, irrespective | ||
17 | of where the kernel loads. Hence, this region is backed up by kexec just before | ||
18 | rebooting into the new kernel. | ||
19 | |||
20 | In the second kernel, "old memory" can be accessed in two ways. | ||
21 | |||
22 | - The first one is through a /dev/oldmem device interface. A capture utility | ||
23 | can read the device file and write out the memory in raw format. This is raw | ||
24 | dump of memory and analysis/capture tool should be intelligent enough to | ||
25 | determine where to look for the right information. ELF headers (elfcorehdr=) | ||
26 | can become handy here. | ||
27 | |||
28 | - The second interface is through /proc/vmcore. This exports the dump as an ELF | ||
29 | format file which can be written out using any file copy command | ||
30 | (cp, scp, etc). Further, gdb can be used to perform limited debugging on | ||
31 | the dump file. This method ensures methods ensure that there is correct | ||
32 | ordering of the dump pages (corresponding to the first 640 KB that has been | ||
33 | relocated). | ||
34 | |||
35 | SETUP | ||
36 | ===== | ||
37 | |||
38 | 1) Download http://www.xmission.com/~ebiederm/files/kexec/kexec-tools-1.101.tar.gz | ||
39 | and apply http://lse.sourceforge.net/kdump/patches/kexec-tools-1.101-kdump.patch | ||
40 | and after that build the source. | ||
41 | |||
42 | 2) Download and build the appropriate (latest) kexec/kdump (-mm) kernel | ||
43 | patchset and apply it to the vanilla kernel tree. | ||
44 | |||
45 | Two kernels need to be built in order to get this feature working. | ||
46 | |||
47 | A) First kernel: | ||
48 | a) Enable "kexec system call" feature (in Processor type and features). | ||
49 | CONFIG_KEXEC=y | ||
50 | b) This kernel's physical load address should be the default value of | ||
51 | 0x100000 (0x100000, 1 MB) (in Processor type and features). | ||
52 | CONFIG_PHYSICAL_START=0x100000 | ||
53 | c) Enable "sysfs file system support" (in Pseudo filesystems). | ||
54 | CONFIG_SYSFS=y | ||
55 | d) Boot into first kernel with the command line parameter "crashkernel=Y@X". | ||
56 | Use appropriate values for X and Y. Y denotes how much memory to reserve | ||
57 | for the second kernel, and X denotes at what physical address the reserved | ||
58 | memory section starts. For example: "crashkernel=64M@16M". | ||
59 | |||
60 | B) Second kernel: | ||
61 | a) Enable "kernel crash dumps" feature (in Processor type and features). | ||
62 | CONFIG_CRASH_DUMP=y | ||
63 | b) Specify a suitable value for "Physical address where the kernel is | ||
64 | loaded" (in Processor type and features). Typically this value | ||
65 | should be same as X (See option d) above, e.g., 16 MB or 0x1000000. | ||
66 | CONFIG_PHYSICAL_START=0x1000000 | ||
67 | c) Enable "/proc/vmcore support" (Optional, in Pseudo filesystems). | ||
68 | CONFIG_PROC_VMCORE=y | ||
69 | d) Disable SMP support and build a UP kernel (Until it is fixed). | ||
70 | CONFIG_SMP=n | ||
71 | e) Enable "Local APIC support on uniprocessors". | ||
72 | CONFIG_X86_UP_APIC=y | ||
73 | f) Enable "IO-APIC support on uniprocessors" | ||
74 | CONFIG_X86_UP_IOAPIC=y | ||
75 | |||
76 | Note: i) Options a) and b) depend upon "Configure standard kernel features | ||
77 | (for small systems)" (under General setup). | ||
78 | ii) Option a) also depends on CONFIG_HIGHMEM (under Processor | ||
79 | type and features). | ||
80 | iii) Both option a) and b) are under "Processor type and features". | ||
81 | |||
82 | 3) Boot into the first kernel. You are now ready to try out kexec-based crash | ||
83 | dumps. | ||
84 | |||
85 | 4) Load the second kernel to be booted using: | ||
86 | |||
87 | kexec -p <second-kernel> --crash-dump --args-linux --append="root=<root-dev> | ||
88 | init 1 irqpoll" | ||
89 | |||
90 | Note: i) <second-kernel> has to be a vmlinux image. bzImage will not work, | ||
91 | as of now. | ||
92 | ii) By default ELF headers are stored in ELF32 format (for i386). This | ||
93 | is sufficient to represent the physical memory up to 4GB. To store | ||
94 | headers in ELF64 format, specifiy "--elf64-core-headers" on the | ||
95 | kexec command line additionally. | ||
96 | iii) Specify "irqpoll" as command line parameter. This reduces driver | ||
97 | initialization failures in second kernel due to shared interrupts. | ||
98 | |||
99 | 5) System reboots into the second kernel when a panic occurs. A module can be | ||
100 | written to force the panic or "ALT-SysRq-c" can be used initiate a crash | ||
101 | dump for testing purposes. | ||
102 | |||
103 | 6) Write out the dump file using | ||
104 | |||
105 | cp /proc/vmcore <dump-file> | ||
106 | |||
107 | Dump memory can also be accessed as a /dev/oldmem device for a linear/raw | ||
108 | view. To create the device, type: | ||
109 | |||
110 | mknod /dev/oldmem c 1 12 | ||
111 | |||
112 | Use "dd" with suitable options for count, bs and skip to access specific | ||
113 | portions of the dump. | ||
114 | |||
115 | Entire memory: dd if=/dev/oldmem of=oldmem.001 | ||
116 | |||
117 | ANALYSIS | ||
118 | ======== | ||
119 | |||
120 | Limited analysis can be done using gdb on the dump file copied out of | ||
121 | /proc/vmcore. Use vmlinux built with -g and run | ||
122 | |||
123 | gdb vmlinux <dump-file> | ||
124 | |||
125 | Stack trace for the task on processor 0, register display, memory display | ||
126 | work fine. | ||
127 | |||
128 | Note: gdb cannot analyse core files generated in ELF64 format for i386. | ||
129 | |||
130 | TODO | ||
131 | ==== | ||
132 | |||
133 | 1) Provide a kernel pages filtering mechanism so that core file size is not | ||
134 | insane on systems having huge memory banks. | ||
135 | 2) Modify "crash" tool to make it recognize this dump. | ||
136 | |||
137 | CONTACT | ||
138 | ======= | ||
139 | |||
140 | Vivek Goyal (vgoyal@in.ibm.com) | ||
141 | Maneesh Soni (maneesh@in.ibm.com) | ||
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 4924d387a657..3d5cd7a09b2f 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt | |||
@@ -37,7 +37,7 @@ restrictions referred to are that the relevant option is valid if: | |||
37 | IA-32 IA-32 aka i386 architecture is enabled. | 37 | IA-32 IA-32 aka i386 architecture is enabled. |
38 | IA-64 IA-64 architecture is enabled. | 38 | IA-64 IA-64 architecture is enabled. |
39 | IOSCHED More than one I/O scheduler is enabled. | 39 | IOSCHED More than one I/O scheduler is enabled. |
40 | IP_PNP IP DCHP, BOOTP, or RARP is enabled. | 40 | IP_PNP IP DHCP, BOOTP, or RARP is enabled. |
41 | ISAPNP ISA PnP code is enabled. | 41 | ISAPNP ISA PnP code is enabled. |
42 | ISDN Appropriate ISDN support is enabled. | 42 | ISDN Appropriate ISDN support is enabled. |
43 | JOY Appropriate joystick support is enabled. | 43 | JOY Appropriate joystick support is enabled. |
@@ -159,6 +159,11 @@ running once the system is up. | |||
159 | 159 | ||
160 | acpi_fake_ecdt [HW,ACPI] Workaround failure due to BIOS lacking ECDT | 160 | acpi_fake_ecdt [HW,ACPI] Workaround failure due to BIOS lacking ECDT |
161 | 161 | ||
162 | acpi_generic_hotkey [HW,ACPI] | ||
163 | Allow consolidated generic hotkey driver to | ||
164 | over-ride platform specific driver. | ||
165 | See also Documentation/acpi-hotkey.txt. | ||
166 | |||
162 | ad1816= [HW,OSS] | 167 | ad1816= [HW,OSS] |
163 | Format: <io>,<irq>,<dma>,<dma2> | 168 | Format: <io>,<irq>,<dma>,<dma2> |
164 | See also Documentation/sound/oss/AD1816. | 169 | See also Documentation/sound/oss/AD1816. |
@@ -358,6 +363,10 @@ running once the system is up. | |||
358 | cpia_pp= [HW,PPT] | 363 | cpia_pp= [HW,PPT] |
359 | Format: { parport<nr> | auto | none } | 364 | Format: { parport<nr> | auto | none } |
360 | 365 | ||
366 | crashkernel=nn[KMG]@ss[KMG] | ||
367 | [KNL] Reserve a chunk of physical memory to | ||
368 | hold a kernel to switch to with kexec on panic. | ||
369 | |||
361 | cs4232= [HW,OSS] | 370 | cs4232= [HW,OSS] |
362 | Format: <io>,<irq>,<dma>,<dma2>,<mpuio>,<mpuirq> | 371 | Format: <io>,<irq>,<dma>,<dma2>,<mpuio>,<mpuirq> |
363 | 372 | ||
@@ -447,6 +456,10 @@ running once the system is up. | |||
447 | Format: {"as"|"cfq"|"deadline"|"noop"} | 456 | Format: {"as"|"cfq"|"deadline"|"noop"} |
448 | See Documentation/block/as-iosched.txt | 457 | See Documentation/block/as-iosched.txt |
449 | and Documentation/block/deadline-iosched.txt for details. | 458 | and Documentation/block/deadline-iosched.txt for details. |
459 | elfcorehdr= [IA-32] | ||
460 | Specifies physical address of start of kernel core image | ||
461 | elf header. | ||
462 | See Documentation/kdump.txt for details. | ||
450 | 463 | ||
451 | enforcing [SELINUX] Set initial enforcing status. | 464 | enforcing [SELINUX] Set initial enforcing status. |
452 | Format: {"0" | "1"} | 465 | Format: {"0" | "1"} |
@@ -548,6 +561,9 @@ running once the system is up. | |||
548 | 561 | ||
549 | i810= [HW,DRM] | 562 | i810= [HW,DRM] |
550 | 563 | ||
564 | i8k.ignore_dmi [HW] Continue probing hardware even if DMI data | ||
565 | indicates that the driver is running on unsupported | ||
566 | hardware. | ||
551 | i8k.force [HW] Activate i8k driver even if SMM BIOS signature | 567 | i8k.force [HW] Activate i8k driver even if SMM BIOS signature |
552 | does not match list of supported models. | 568 | does not match list of supported models. |
553 | i8k.power_status | 569 | i8k.power_status |
@@ -611,6 +627,17 @@ running once the system is up. | |||
611 | ips= [HW,SCSI] Adaptec / IBM ServeRAID controller | 627 | ips= [HW,SCSI] Adaptec / IBM ServeRAID controller |
612 | See header of drivers/scsi/ips.c. | 628 | See header of drivers/scsi/ips.c. |
613 | 629 | ||
630 | irqfixup [HW] | ||
631 | When an interrupt is not handled search all handlers | ||
632 | for it. Intended to get systems with badly broken | ||
633 | firmware running. | ||
634 | |||
635 | irqpoll [HW] | ||
636 | When an interrupt is not handled search all handlers | ||
637 | for it. Also check all handlers each timer | ||
638 | interrupt. Intended to get systems with badly broken | ||
639 | firmware running. | ||
640 | |||
614 | isapnp= [ISAPNP] | 641 | isapnp= [ISAPNP] |
615 | Format: <RDP>, <reset>, <pci_scan>, <verbosity> | 642 | Format: <RDP>, <reset>, <pci_scan>, <verbosity> |
616 | 643 | ||
@@ -736,6 +763,9 @@ running once the system is up. | |||
736 | maxcpus= [SMP] Maximum number of processors that an SMP kernel | 763 | maxcpus= [SMP] Maximum number of processors that an SMP kernel |
737 | should make use of | 764 | should make use of |
738 | 765 | ||
766 | max_addr=[KMG] [KNL,BOOT,ia64] All physical memory greater than or | ||
767 | equal to this physical address is ignored. | ||
768 | |||
739 | max_luns= [SCSI] Maximum number of LUNs to probe | 769 | max_luns= [SCSI] Maximum number of LUNs to probe |
740 | Should be between 1 and 2^32-1. | 770 | Should be between 1 and 2^32-1. |
741 | 771 | ||
@@ -1019,6 +1049,10 @@ running once the system is up. | |||
1019 | irqmask=0xMMMM [IA-32] Set a bit mask of IRQs allowed to be assigned | 1049 | irqmask=0xMMMM [IA-32] Set a bit mask of IRQs allowed to be assigned |
1020 | automatically to PCI devices. You can make the kernel | 1050 | automatically to PCI devices. You can make the kernel |
1021 | exclude IRQs of your ISA cards this way. | 1051 | exclude IRQs of your ISA cards this way. |
1052 | pirqaddr=0xAAAAA [IA-32] Specify the physical address | ||
1053 | of the PIRQ table (normally generated | ||
1054 | by the BIOS) if it is outside the | ||
1055 | F0000h-100000h range. | ||
1022 | lastbus=N [IA-32] Scan all buses till bus #N. Can be useful | 1056 | lastbus=N [IA-32] Scan all buses till bus #N. Can be useful |
1023 | if the kernel is unable to find your secondary buses | 1057 | if the kernel is unable to find your secondary buses |
1024 | and you want to tell it explicitly which ones they are. | 1058 | and you want to tell it explicitly which ones they are. |
@@ -1104,7 +1138,7 @@ running once the system is up. | |||
1104 | See Documentation/ramdisk.txt. | 1138 | See Documentation/ramdisk.txt. |
1105 | 1139 | ||
1106 | psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to | 1140 | psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to |
1107 | probe for (bare|imps|exps). | 1141 | probe for (bare|imps|exps|lifebook|any). |
1108 | psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports | 1142 | psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports |
1109 | per second. | 1143 | per second. |
1110 | psmouse.resetafter= | 1144 | psmouse.resetafter= |
diff --git a/Documentation/keys.txt b/Documentation/keys.txt index 36d80aeeaf28..0321ded4b9ae 100644 --- a/Documentation/keys.txt +++ b/Documentation/keys.txt | |||
@@ -22,6 +22,7 @@ This document has the following sections: | |||
22 | - New procfs files | 22 | - New procfs files |
23 | - Userspace system call interface | 23 | - Userspace system call interface |
24 | - Kernel services | 24 | - Kernel services |
25 | - Notes on accessing payload contents | ||
25 | - Defining a key type | 26 | - Defining a key type |
26 | - Request-key callback service | 27 | - Request-key callback service |
27 | - Key access filesystem | 28 | - Key access filesystem |
@@ -45,27 +46,26 @@ Each key has a number of attributes: | |||
45 | - State. | 46 | - State. |
46 | 47 | ||
47 | 48 | ||
48 | (*) Each key is issued a serial number of type key_serial_t that is unique | 49 | (*) Each key is issued a serial number of type key_serial_t that is unique for |
49 | for the lifetime of that key. All serial numbers are positive non-zero | 50 | the lifetime of that key. All serial numbers are positive non-zero 32-bit |
50 | 32-bit integers. | 51 | integers. |
51 | 52 | ||
52 | Userspace programs can use a key's serial numbers as a way to gain access | 53 | Userspace programs can use a key's serial numbers as a way to gain access |
53 | to it, subject to permission checking. | 54 | to it, subject to permission checking. |
54 | 55 | ||
55 | (*) Each key is of a defined "type". Types must be registered inside the | 56 | (*) Each key is of a defined "type". Types must be registered inside the |
56 | kernel by a kernel service (such as a filesystem) before keys of that | 57 | kernel by a kernel service (such as a filesystem) before keys of that type |
57 | type can be added or used. Userspace programs cannot define new types | 58 | can be added or used. Userspace programs cannot define new types directly. |
58 | directly. | ||
59 | 59 | ||
60 | Key types are represented in the kernel by struct key_type. This defines | 60 | Key types are represented in the kernel by struct key_type. This defines a |
61 | a number of operations that can be performed on a key of that type. | 61 | number of operations that can be performed on a key of that type. |
62 | 62 | ||
63 | Should a type be removed from the system, all the keys of that type will | 63 | Should a type be removed from the system, all the keys of that type will |
64 | be invalidated. | 64 | be invalidated. |
65 | 65 | ||
66 | (*) Each key has a description. This should be a printable string. The key | 66 | (*) Each key has a description. This should be a printable string. The key |
67 | type provides an operation to perform a match between the description on | 67 | type provides an operation to perform a match between the description on a |
68 | a key and a criterion string. | 68 | key and a criterion string. |
69 | 69 | ||
70 | (*) Each key has an owner user ID, a group ID and a permissions mask. These | 70 | (*) Each key has an owner user ID, a group ID and a permissions mask. These |
71 | are used to control what a process may do to a key from userspace, and | 71 | are used to control what a process may do to a key from userspace, and |
@@ -74,10 +74,10 @@ Each key has a number of attributes: | |||
74 | (*) Each key can be set to expire at a specific time by the key type's | 74 | (*) Each key can be set to expire at a specific time by the key type's |
75 | instantiation function. Keys can also be immortal. | 75 | instantiation function. Keys can also be immortal. |
76 | 76 | ||
77 | (*) Each key can have a payload. This is a quantity of data that represent | 77 | (*) Each key can have a payload. This is a quantity of data that represent the |
78 | the actual "key". In the case of a keyring, this is a list of keys to | 78 | actual "key". In the case of a keyring, this is a list of keys to which |
79 | which the keyring links; in the case of a user-defined key, it's an | 79 | the keyring links; in the case of a user-defined key, it's an arbitrary |
80 | arbitrary blob of data. | 80 | blob of data. |
81 | 81 | ||
82 | Having a payload is not required; and the payload can, in fact, just be a | 82 | Having a payload is not required; and the payload can, in fact, just be a |
83 | value stored in the struct key itself. | 83 | value stored in the struct key itself. |
@@ -92,8 +92,8 @@ Each key has a number of attributes: | |||
92 | 92 | ||
93 | (*) Each key can be in one of a number of basic states: | 93 | (*) Each key can be in one of a number of basic states: |
94 | 94 | ||
95 | (*) Uninstantiated. The key exists, but does not have any data | 95 | (*) Uninstantiated. The key exists, but does not have any data attached. |
96 | attached. Keys being requested from userspace will be in this state. | 96 | Keys being requested from userspace will be in this state. |
97 | 97 | ||
98 | (*) Instantiated. This is the normal state. The key is fully formed, and | 98 | (*) Instantiated. This is the normal state. The key is fully formed, and |
99 | has data attached. | 99 | has data attached. |
@@ -140,10 +140,10 @@ The key service provides a number of features besides keys: | |||
140 | clone, fork, vfork or execve occurs. A new keyring is created only when | 140 | clone, fork, vfork or execve occurs. A new keyring is created only when |
141 | required. | 141 | required. |
142 | 142 | ||
143 | The process-specific keyring is replaced with an empty one in the child | 143 | The process-specific keyring is replaced with an empty one in the child on |
144 | on clone, fork, vfork unless CLONE_THREAD is supplied, in which case it | 144 | clone, fork, vfork unless CLONE_THREAD is supplied, in which case it is |
145 | is shared. execve also discards the process's process keyring and creates | 145 | shared. execve also discards the process's process keyring and creates a |
146 | a new one. | 146 | new one. |
147 | 147 | ||
148 | The session-specific keyring is persistent across clone, fork, vfork and | 148 | The session-specific keyring is persistent across clone, fork, vfork and |
149 | execve, even when the latter executes a set-UID or set-GID binary. A | 149 | execve, even when the latter executes a set-UID or set-GID binary. A |
@@ -177,11 +177,11 @@ The key service provides a number of features besides keys: | |||
177 | If a system call that modifies a key or keyring in some way would put the | 177 | If a system call that modifies a key or keyring in some way would put the |
178 | user over quota, the operation is refused and error EDQUOT is returned. | 178 | user over quota, the operation is refused and error EDQUOT is returned. |
179 | 179 | ||
180 | (*) There's a system call interface by which userspace programs can create | 180 | (*) There's a system call interface by which userspace programs can create and |
181 | and manipulate keys and keyrings. | 181 | manipulate keys and keyrings. |
182 | 182 | ||
183 | (*) There's a kernel interface by which services can register types and | 183 | (*) There's a kernel interface by which services can register types and search |
184 | search for keys. | 184 | for keys. |
185 | 185 | ||
186 | (*) There's a way for the a search done from the kernel to call back to | 186 | (*) There's a way for the a search done from the kernel to call back to |
187 | userspace to request a key that can't be found in a process's keyrings. | 187 | userspace to request a key that can't be found in a process's keyrings. |
@@ -194,9 +194,9 @@ The key service provides a number of features besides keys: | |||
194 | KEY ACCESS PERMISSIONS | 194 | KEY ACCESS PERMISSIONS |
195 | ====================== | 195 | ====================== |
196 | 196 | ||
197 | Keys have an owner user ID, a group access ID, and a permissions mask. The | 197 | Keys have an owner user ID, a group access ID, and a permissions mask. The mask |
198 | mask has up to eight bits each for user, group and other access. Only five of | 198 | has up to eight bits each for user, group and other access. Only five of each |
199 | each set of eight bits are defined. These permissions granted are: | 199 | set of eight bits are defined. These permissions granted are: |
200 | 200 | ||
201 | (*) View | 201 | (*) View |
202 | 202 | ||
@@ -210,8 +210,8 @@ each set of eight bits are defined. These permissions granted are: | |||
210 | 210 | ||
211 | (*) Write | 211 | (*) Write |
212 | 212 | ||
213 | This permits a key's payload to be instantiated or updated, or it allows | 213 | This permits a key's payload to be instantiated or updated, or it allows a |
214 | a link to be added to or removed from a keyring. | 214 | link to be added to or removed from a keyring. |
215 | 215 | ||
216 | (*) Search | 216 | (*) Search |
217 | 217 | ||
@@ -238,8 +238,8 @@ about the status of the key service: | |||
238 | (*) /proc/keys | 238 | (*) /proc/keys |
239 | 239 | ||
240 | This lists all the keys on the system, giving information about their | 240 | This lists all the keys on the system, giving information about their |
241 | type, description and permissions. The payload of the key is not | 241 | type, description and permissions. The payload of the key is not available |
242 | available this way: | 242 | this way: |
243 | 243 | ||
244 | SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY | 244 | SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY |
245 | 00000001 I----- 39 perm 1f0000 0 0 keyring _uid_ses.0: 1/4 | 245 | 00000001 I----- 39 perm 1f0000 0 0 keyring _uid_ses.0: 1/4 |
@@ -318,21 +318,21 @@ The main syscalls are: | |||
318 | If a key of the same type and description as that proposed already exists | 318 | If a key of the same type and description as that proposed already exists |
319 | in the keyring, this will try to update it with the given payload, or it | 319 | in the keyring, this will try to update it with the given payload, or it |
320 | will return error EEXIST if that function is not supported by the key | 320 | will return error EEXIST if that function is not supported by the key |
321 | type. The process must also have permission to write to the key to be | 321 | type. The process must also have permission to write to the key to be able |
322 | able to update it. The new key will have all user permissions granted and | 322 | to update it. The new key will have all user permissions granted and no |
323 | no group or third party permissions. | 323 | group or third party permissions. |
324 | 324 | ||
325 | Otherwise, this will attempt to create a new key of the specified type | 325 | Otherwise, this will attempt to create a new key of the specified type and |
326 | and description, and to instantiate it with the supplied payload and | 326 | description, and to instantiate it with the supplied payload and attach it |
327 | attach it to the keyring. In this case, an error will be generated if the | 327 | to the keyring. In this case, an error will be generated if the process |
328 | process does not have permission to write to the keyring. | 328 | does not have permission to write to the keyring. |
329 | 329 | ||
330 | The payload is optional, and the pointer can be NULL if not required by | 330 | The payload is optional, and the pointer can be NULL if not required by |
331 | the type. The payload is plen in size, and plen can be zero for an empty | 331 | the type. The payload is plen in size, and plen can be zero for an empty |
332 | payload. | 332 | payload. |
333 | 333 | ||
334 | A new keyring can be generated by setting type "keyring", the keyring | 334 | A new keyring can be generated by setting type "keyring", the keyring name |
335 | name as the description (or NULL) and setting the payload to NULL. | 335 | as the description (or NULL) and setting the payload to NULL. |
336 | 336 | ||
337 | User defined keys can be created by specifying type "user". It is | 337 | User defined keys can be created by specifying type "user". It is |
338 | recommended that a user defined key's description by prefixed with a type | 338 | recommended that a user defined key's description by prefixed with a type |
@@ -369,9 +369,9 @@ The keyctl syscall functions are: | |||
369 | key_serial_t keyctl(KEYCTL_GET_KEYRING_ID, key_serial_t id, | 369 | key_serial_t keyctl(KEYCTL_GET_KEYRING_ID, key_serial_t id, |
370 | int create); | 370 | int create); |
371 | 371 | ||
372 | The special key specified by "id" is looked up (with the key being | 372 | The special key specified by "id" is looked up (with the key being created |
373 | created if necessary) and the ID of the key or keyring thus found is | 373 | if necessary) and the ID of the key or keyring thus found is returned if |
374 | returned if it exists. | 374 | it exists. |
375 | 375 | ||
376 | If the key does not yet exist, the key will be created if "create" is | 376 | If the key does not yet exist, the key will be created if "create" is |
377 | non-zero; and the error ENOKEY will be returned if "create" is zero. | 377 | non-zero; and the error ENOKEY will be returned if "create" is zero. |
@@ -402,8 +402,8 @@ The keyctl syscall functions are: | |||
402 | 402 | ||
403 | This will try to update the specified key with the given payload, or it | 403 | This will try to update the specified key with the given payload, or it |
404 | will return error EOPNOTSUPP if that function is not supported by the key | 404 | will return error EOPNOTSUPP if that function is not supported by the key |
405 | type. The process must also have permission to write to the key to be | 405 | type. The process must also have permission to write to the key to be able |
406 | able to update it. | 406 | to update it. |
407 | 407 | ||
408 | The payload is of length plen, and may be absent or empty as for | 408 | The payload is of length plen, and may be absent or empty as for |
409 | add_key(). | 409 | add_key(). |
@@ -422,8 +422,8 @@ The keyctl syscall functions are: | |||
422 | 422 | ||
423 | long keyctl(KEYCTL_CHOWN, key_serial_t key, uid_t uid, gid_t gid); | 423 | long keyctl(KEYCTL_CHOWN, key_serial_t key, uid_t uid, gid_t gid); |
424 | 424 | ||
425 | This function permits a key's owner and group ID to be changed. Either | 425 | This function permits a key's owner and group ID to be changed. Either one |
426 | one of uid or gid can be set to -1 to suppress that change. | 426 | of uid or gid can be set to -1 to suppress that change. |
427 | 427 | ||
428 | Only the superuser can change a key's owner to something other than the | 428 | Only the superuser can change a key's owner to something other than the |
429 | key's current owner. Similarly, only the superuser can change a key's | 429 | key's current owner. Similarly, only the superuser can change a key's |
@@ -484,12 +484,12 @@ The keyctl syscall functions are: | |||
484 | 484 | ||
485 | long keyctl(KEYCTL_LINK, key_serial_t keyring, key_serial_t key); | 485 | long keyctl(KEYCTL_LINK, key_serial_t keyring, key_serial_t key); |
486 | 486 | ||
487 | This function creates a link from the keyring to the key. The process | 487 | This function creates a link from the keyring to the key. The process must |
488 | must have write permission on the keyring and must have link permission | 488 | have write permission on the keyring and must have link permission on the |
489 | on the key. | 489 | key. |
490 | 490 | ||
491 | Should the keyring not be a keyring, error ENOTDIR will result; and if | 491 | Should the keyring not be a keyring, error ENOTDIR will result; and if the |
492 | the keyring is full, error ENFILE will result. | 492 | keyring is full, error ENFILE will result. |
493 | 493 | ||
494 | The link procedure checks the nesting of the keyrings, returning ELOOP if | 494 | The link procedure checks the nesting of the keyrings, returning ELOOP if |
495 | it appears to deep or EDEADLK if the link would introduce a cycle. | 495 | it appears to deep or EDEADLK if the link would introduce a cycle. |
@@ -503,8 +503,8 @@ The keyctl syscall functions are: | |||
503 | specified key, and removes it if found. Subsequent links to that key are | 503 | specified key, and removes it if found. Subsequent links to that key are |
504 | ignored. The process must have write permission on the keyring. | 504 | ignored. The process must have write permission on the keyring. |
505 | 505 | ||
506 | If the keyring is not a keyring, error ENOTDIR will result; and if the | 506 | If the keyring is not a keyring, error ENOTDIR will result; and if the key |
507 | key is not present, error ENOENT will be the result. | 507 | is not present, error ENOENT will be the result. |
508 | 508 | ||
509 | 509 | ||
510 | (*) Search a keyring tree for a key: | 510 | (*) Search a keyring tree for a key: |
@@ -513,9 +513,9 @@ The keyctl syscall functions are: | |||
513 | const char *type, const char *description, | 513 | const char *type, const char *description, |
514 | key_serial_t dest_keyring); | 514 | key_serial_t dest_keyring); |
515 | 515 | ||
516 | This searches the keyring tree headed by the specified keyring until a | 516 | This searches the keyring tree headed by the specified keyring until a key |
517 | key is found that matches the type and description criteria. Each keyring | 517 | is found that matches the type and description criteria. Each keyring is |
518 | is checked for keys before recursion into its children occurs. | 518 | checked for keys before recursion into its children occurs. |
519 | 519 | ||
520 | The process must have search permission on the top level keyring, or else | 520 | The process must have search permission on the top level keyring, or else |
521 | error EACCES will result. Only keyrings that the process has search | 521 | error EACCES will result. Only keyrings that the process has search |
@@ -549,8 +549,8 @@ The keyctl syscall functions are: | |||
549 | As much of the data as can be fitted into the buffer will be copied to | 549 | As much of the data as can be fitted into the buffer will be copied to |
550 | userspace if the buffer pointer is not NULL. | 550 | userspace if the buffer pointer is not NULL. |
551 | 551 | ||
552 | On a successful return, the function will always return the amount of | 552 | On a successful return, the function will always return the amount of data |
553 | data available rather than the amount copied. | 553 | available rather than the amount copied. |
554 | 554 | ||
555 | 555 | ||
556 | (*) Instantiate a partially constructed key. | 556 | (*) Instantiate a partially constructed key. |
@@ -568,8 +568,8 @@ The keyctl syscall functions are: | |||
568 | it, and the key must be uninstantiated. | 568 | it, and the key must be uninstantiated. |
569 | 569 | ||
570 | If a keyring is specified (non-zero), the key will also be linked into | 570 | If a keyring is specified (non-zero), the key will also be linked into |
571 | that keyring, however all the constraints applying in KEYCTL_LINK apply | 571 | that keyring, however all the constraints applying in KEYCTL_LINK apply in |
572 | in this case too. | 572 | this case too. |
573 | 573 | ||
574 | The payload and plen arguments describe the payload data as for add_key(). | 574 | The payload and plen arguments describe the payload data as for add_key(). |
575 | 575 | ||
@@ -587,8 +587,39 @@ The keyctl syscall functions are: | |||
587 | it, and the key must be uninstantiated. | 587 | it, and the key must be uninstantiated. |
588 | 588 | ||
589 | If a keyring is specified (non-zero), the key will also be linked into | 589 | If a keyring is specified (non-zero), the key will also be linked into |
590 | that keyring, however all the constraints applying in KEYCTL_LINK apply | 590 | that keyring, however all the constraints applying in KEYCTL_LINK apply in |
591 | in this case too. | 591 | this case too. |
592 | |||
593 | |||
594 | (*) Set the default request-key destination keyring. | ||
595 | |||
596 | long keyctl(KEYCTL_SET_REQKEY_KEYRING, int reqkey_defl); | ||
597 | |||
598 | This sets the default keyring to which implicitly requested keys will be | ||
599 | attached for this thread. reqkey_defl should be one of these constants: | ||
600 | |||
601 | CONSTANT VALUE NEW DEFAULT KEYRING | ||
602 | ====================================== ====== ======================= | ||
603 | KEY_REQKEY_DEFL_NO_CHANGE -1 No change | ||
604 | KEY_REQKEY_DEFL_DEFAULT 0 Default[1] | ||
605 | KEY_REQKEY_DEFL_THREAD_KEYRING 1 Thread keyring | ||
606 | KEY_REQKEY_DEFL_PROCESS_KEYRING 2 Process keyring | ||
607 | KEY_REQKEY_DEFL_SESSION_KEYRING 3 Session keyring | ||
608 | KEY_REQKEY_DEFL_USER_KEYRING 4 User keyring | ||
609 | KEY_REQKEY_DEFL_USER_SESSION_KEYRING 5 User session keyring | ||
610 | KEY_REQKEY_DEFL_GROUP_KEYRING 6 Group keyring | ||
611 | |||
612 | The old default will be returned if successful and error EINVAL will be | ||
613 | returned if reqkey_defl is not one of the above values. | ||
614 | |||
615 | The default keyring can be overridden by the keyring indicated to the | ||
616 | request_key() system call. | ||
617 | |||
618 | Note that this setting is inherited across fork/exec. | ||
619 | |||
620 | [1] The default default is: the thread keyring if there is one, otherwise | ||
621 | the process keyring if there is one, otherwise the session keyring if | ||
622 | there is one, otherwise the user default session keyring. | ||
592 | 623 | ||
593 | 624 | ||
594 | =============== | 625 | =============== |
@@ -601,17 +632,14 @@ be broken down into two areas: keys and key types. | |||
601 | Dealing with keys is fairly straightforward. Firstly, the kernel service | 632 | Dealing with keys is fairly straightforward. Firstly, the kernel service |
602 | registers its type, then it searches for a key of that type. It should retain | 633 | registers its type, then it searches for a key of that type. It should retain |
603 | the key as long as it has need of it, and then it should release it. For a | 634 | the key as long as it has need of it, and then it should release it. For a |
604 | filesystem or device file, a search would probably be performed during the | 635 | filesystem or device file, a search would probably be performed during the open |
605 | open call, and the key released upon close. How to deal with conflicting keys | 636 | call, and the key released upon close. How to deal with conflicting keys due to |
606 | due to two different users opening the same file is left to the filesystem | 637 | two different users opening the same file is left to the filesystem author to |
607 | author to solve. | 638 | solve. |
608 | 639 | ||
609 | When accessing a key's payload data, key->lock should be at least read locked, | 640 | When accessing a key's payload contents, certain precautions must be taken to |
610 | or else the data may be changed by an update being performed from userspace | 641 | prevent access vs modification races. See the section "Notes on accessing |
611 | whilst the driver or filesystem is trying to access it. If no update method is | 642 | payload contents" for more information. |
612 | supplied, then the key's payload may be accessed without holding a lock as | ||
613 | there is no way to change it, provided it can be guaranteed that the key's | ||
614 | type definition won't go away. | ||
615 | 643 | ||
616 | (*) To search for a key, call: | 644 | (*) To search for a key, call: |
617 | 645 | ||
@@ -629,6 +657,9 @@ type definition won't go away. | |||
629 | Should the function fail error ENOKEY, EKEYEXPIRED or EKEYREVOKED will be | 657 | Should the function fail error ENOKEY, EKEYEXPIRED or EKEYREVOKED will be |
630 | returned. | 658 | returned. |
631 | 659 | ||
660 | If successful, the key will have been attached to the default keyring for | ||
661 | implicitly obtained request-key keys, as set by KEYCTL_SET_REQKEY_KEYRING. | ||
662 | |||
632 | 663 | ||
633 | (*) When it is no longer required, the key should be released using: | 664 | (*) When it is no longer required, the key should be released using: |
634 | 665 | ||
@@ -690,6 +721,54 @@ type definition won't go away. | |||
690 | void unregister_key_type(struct key_type *type); | 721 | void unregister_key_type(struct key_type *type); |
691 | 722 | ||
692 | 723 | ||
724 | =================================== | ||
725 | NOTES ON ACCESSING PAYLOAD CONTENTS | ||
726 | =================================== | ||
727 | |||
728 | The simplest payload is just a number in key->payload.value. In this case, | ||
729 | there's no need to indulge in RCU or locking when accessing the payload. | ||
730 | |||
731 | More complex payload contents must be allocated and a pointer to them set in | ||
732 | key->payload.data. One of the following ways must be selected to access the | ||
733 | data: | ||
734 | |||
735 | (1) Unmodifyable key type. | ||
736 | |||
737 | If the key type does not have a modify method, then the key's payload can | ||
738 | be accessed without any form of locking, provided that it's known to be | ||
739 | instantiated (uninstantiated keys cannot be "found"). | ||
740 | |||
741 | (2) The key's semaphore. | ||
742 | |||
743 | The semaphore could be used to govern access to the payload and to control | ||
744 | the payload pointer. It must be write-locked for modifications and would | ||
745 | have to be read-locked for general access. The disadvantage of doing this | ||
746 | is that the accessor may be required to sleep. | ||
747 | |||
748 | (3) RCU. | ||
749 | |||
750 | RCU must be used when the semaphore isn't already held; if the semaphore | ||
751 | is held then the contents can't change under you unexpectedly as the | ||
752 | semaphore must still be used to serialise modifications to the key. The | ||
753 | key management code takes care of this for the key type. | ||
754 | |||
755 | However, this means using: | ||
756 | |||
757 | rcu_read_lock() ... rcu_dereference() ... rcu_read_unlock() | ||
758 | |||
759 | to read the pointer, and: | ||
760 | |||
761 | rcu_dereference() ... rcu_assign_pointer() ... call_rcu() | ||
762 | |||
763 | to set the pointer and dispose of the old contents after a grace period. | ||
764 | Note that only the key type should ever modify a key's payload. | ||
765 | |||
766 | Furthermore, an RCU controlled payload must hold a struct rcu_head for the | ||
767 | use of call_rcu() and, if the payload is of variable size, the length of | ||
768 | the payload. key->datalen cannot be relied upon to be consistent with the | ||
769 | payload just dereferenced if the key's semaphore is not held. | ||
770 | |||
771 | |||
693 | =================== | 772 | =================== |
694 | DEFINING A KEY TYPE | 773 | DEFINING A KEY TYPE |
695 | =================== | 774 | =================== |
@@ -717,15 +796,15 @@ The structure has a number of fields, some of which are mandatory: | |||
717 | 796 | ||
718 | int key_payload_reserve(struct key *key, size_t datalen); | 797 | int key_payload_reserve(struct key *key, size_t datalen); |
719 | 798 | ||
720 | With the revised data length. Error EDQUOT will be returned if this is | 799 | With the revised data length. Error EDQUOT will be returned if this is not |
721 | not viable. | 800 | viable. |
722 | 801 | ||
723 | 802 | ||
724 | (*) int (*instantiate)(struct key *key, const void *data, size_t datalen); | 803 | (*) int (*instantiate)(struct key *key, const void *data, size_t datalen); |
725 | 804 | ||
726 | This method is called to attach a payload to a key during construction. | 805 | This method is called to attach a payload to a key during construction. |
727 | The payload attached need not bear any relation to the data passed to | 806 | The payload attached need not bear any relation to the data passed to this |
728 | this function. | 807 | function. |
729 | 808 | ||
730 | If the amount of data attached to the key differs from the size in | 809 | If the amount of data attached to the key differs from the size in |
731 | keytype->def_datalen, then key_payload_reserve() should be called. | 810 | keytype->def_datalen, then key_payload_reserve() should be called. |
@@ -734,38 +813,47 @@ The structure has a number of fields, some of which are mandatory: | |||
734 | The fact that KEY_FLAG_INSTANTIATED is not set in key->flags prevents | 813 | The fact that KEY_FLAG_INSTANTIATED is not set in key->flags prevents |
735 | anything else from gaining access to the key. | 814 | anything else from gaining access to the key. |
736 | 815 | ||
737 | This method may sleep if it wishes. | 816 | It is safe to sleep in this method. |
738 | 817 | ||
739 | 818 | ||
740 | (*) int (*duplicate)(struct key *key, const struct key *source); | 819 | (*) int (*duplicate)(struct key *key, const struct key *source); |
741 | 820 | ||
742 | If this type of key can be duplicated, then this method should be | 821 | If this type of key can be duplicated, then this method should be |
743 | provided. It is called to copy the payload attached to the source into | 822 | provided. It is called to copy the payload attached to the source into the |
744 | the new key. The data length on the new key will have been updated and | 823 | new key. The data length on the new key will have been updated and the |
745 | the quota adjusted already. | 824 | quota adjusted already. |
746 | 825 | ||
747 | This method will be called with the source key's semaphore read-locked to | 826 | This method will be called with the source key's semaphore read-locked to |
748 | prevent its payload from being changed. It is safe to sleep here. | 827 | prevent its payload from being changed, thus RCU constraints need not be |
828 | applied to the source key. | ||
829 | |||
830 | This method does not have to lock the destination key in order to attach a | ||
831 | payload. The fact that KEY_FLAG_INSTANTIATED is not set in key->flags | ||
832 | prevents anything else from gaining access to the key. | ||
833 | |||
834 | It is safe to sleep in this method. | ||
749 | 835 | ||
750 | 836 | ||
751 | (*) int (*update)(struct key *key, const void *data, size_t datalen); | 837 | (*) int (*update)(struct key *key, const void *data, size_t datalen); |
752 | 838 | ||
753 | If this type of key can be updated, then this method should be | 839 | If this type of key can be updated, then this method should be provided. |
754 | provided. It is called to update a key's payload from the blob of data | 840 | It is called to update a key's payload from the blob of data provided. |
755 | provided. | ||
756 | 841 | ||
757 | key_payload_reserve() should be called if the data length might change | 842 | key_payload_reserve() should be called if the data length might change |
758 | before any changes are actually made. Note that if this succeeds, the | 843 | before any changes are actually made. Note that if this succeeds, the type |
759 | type is committed to changing the key because it's already been altered, | 844 | is committed to changing the key because it's already been altered, so all |
760 | so all memory allocation must be done first. | 845 | memory allocation must be done first. |
846 | |||
847 | The key will have its semaphore write-locked before this method is called, | ||
848 | but this only deters other writers; any changes to the key's payload must | ||
849 | be made under RCU conditions, and call_rcu() must be used to dispose of | ||
850 | the old payload. | ||
761 | 851 | ||
762 | key_payload_reserve() should be called with the key->lock write locked, | 852 | key_payload_reserve() should be called before the changes are made, but |
763 | and the changes to the key's attached payload should be made before the | 853 | after all allocations and other potentially failing function calls are |
764 | key is locked. | 854 | made. |
765 | 855 | ||
766 | The key will have its semaphore write-locked before this method is | 856 | It is safe to sleep in this method. |
767 | called. Any changes to the key should be made with the key's rwlock | ||
768 | write-locked also. It is safe to sleep here. | ||
769 | 857 | ||
770 | 858 | ||
771 | (*) int (*match)(const struct key *key, const void *desc); | 859 | (*) int (*match)(const struct key *key, const void *desc); |
@@ -782,12 +870,12 @@ The structure has a number of fields, some of which are mandatory: | |||
782 | 870 | ||
783 | (*) void (*destroy)(struct key *key); | 871 | (*) void (*destroy)(struct key *key); |
784 | 872 | ||
785 | This method is optional. It is called to discard the payload data on a | 873 | This method is optional. It is called to discard the payload data on a key |
786 | key when it is being destroyed. | 874 | when it is being destroyed. |
787 | 875 | ||
788 | This method does not need to lock the key; it can consider the key as | 876 | This method does not need to lock the key to access the payload; it can |
789 | being inaccessible. Note that the key's type may have changed before this | 877 | consider the key as being inaccessible at this time. Note that the key's |
790 | function is called. | 878 | type may have been changed before this function is called. |
791 | 879 | ||
792 | It is not safe to sleep in this method; the caller may hold spinlocks. | 880 | It is not safe to sleep in this method; the caller may hold spinlocks. |
793 | 881 | ||
@@ -797,26 +885,31 @@ The structure has a number of fields, some of which are mandatory: | |||
797 | This method is optional. It is called during /proc/keys reading to | 885 | This method is optional. It is called during /proc/keys reading to |
798 | summarise a key's description and payload in text form. | 886 | summarise a key's description and payload in text form. |
799 | 887 | ||
800 | This method will be called with the key's rwlock read-locked. This will | 888 | This method will be called with the RCU read lock held. rcu_dereference() |
801 | prevent the key's payload and state changing; also the description should | 889 | should be used to read the payload pointer if the payload is to be |
802 | not change. This also means it is not safe to sleep in this method. | 890 | accessed. key->datalen cannot be trusted to stay consistent with the |
891 | contents of the payload. | ||
892 | |||
893 | The description will not change, though the key's state may. | ||
894 | |||
895 | It is not safe to sleep in this method; the RCU read lock is held by the | ||
896 | caller. | ||
803 | 897 | ||
804 | 898 | ||
805 | (*) long (*read)(const struct key *key, char __user *buffer, size_t buflen); | 899 | (*) long (*read)(const struct key *key, char __user *buffer, size_t buflen); |
806 | 900 | ||
807 | This method is optional. It is called by KEYCTL_READ to translate the | 901 | This method is optional. It is called by KEYCTL_READ to translate the |
808 | key's payload into something a blob of data for userspace to deal | 902 | key's payload into something a blob of data for userspace to deal with. |
809 | with. Ideally, the blob should be in the same format as that passed in to | 903 | Ideally, the blob should be in the same format as that passed in to the |
810 | the instantiate and update methods. | 904 | instantiate and update methods. |
811 | 905 | ||
812 | If successful, the blob size that could be produced should be returned | 906 | If successful, the blob size that could be produced should be returned |
813 | rather than the size copied. | 907 | rather than the size copied. |
814 | 908 | ||
815 | This method will be called with the key's semaphore read-locked. This | 909 | This method will be called with the key's semaphore read-locked. This will |
816 | will prevent the key's payload changing. It is not necessary to also | 910 | prevent the key's payload changing. It is not necessary to use RCU locking |
817 | read-lock key->lock when accessing the key's payload. It is safe to sleep | 911 | when accessing the key's payload. It is safe to sleep in this method, such |
818 | in this method, such as might happen when the userspace buffer is | 912 | as might happen when the userspace buffer is accessed. |
819 | accessed. | ||
820 | 913 | ||
821 | 914 | ||
822 | ============================ | 915 | ============================ |
@@ -853,8 +946,8 @@ If it returns with the key remaining in the unconstructed state, the key will | |||
853 | be marked as being negative, it will be added to the session keyring, and an | 946 | be marked as being negative, it will be added to the session keyring, and an |
854 | error will be returned to the key requestor. | 947 | error will be returned to the key requestor. |
855 | 948 | ||
856 | Supplementary information may be provided from whoever or whatever invoked | 949 | Supplementary information may be provided from whoever or whatever invoked this |
857 | this service. This will be passed as the <callout_info> parameter. If no such | 950 | service. This will be passed as the <callout_info> parameter. If no such |
858 | information was made available, then "-" will be passed as this parameter | 951 | information was made available, then "-" will be passed as this parameter |
859 | instead. | 952 | instead. |
860 | 953 | ||
diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt new file mode 100644 index 000000000000..0541fe1de704 --- /dev/null +++ b/Documentation/kprobes.txt | |||
@@ -0,0 +1,588 @@ | |||
1 | Title : Kernel Probes (Kprobes) | ||
2 | Authors : Jim Keniston <jkenisto@us.ibm.com> | ||
3 | : Prasanna S Panchamukhi <prasanna@in.ibm.com> | ||
4 | |||
5 | CONTENTS | ||
6 | |||
7 | 1. Concepts: Kprobes, Jprobes, Return Probes | ||
8 | 2. Architectures Supported | ||
9 | 3. Configuring Kprobes | ||
10 | 4. API Reference | ||
11 | 5. Kprobes Features and Limitations | ||
12 | 6. Probe Overhead | ||
13 | 7. TODO | ||
14 | 8. Kprobes Example | ||
15 | 9. Jprobes Example | ||
16 | 10. Kretprobes Example | ||
17 | |||
18 | 1. Concepts: Kprobes, Jprobes, Return Probes | ||
19 | |||
20 | Kprobes enables you to dynamically break into any kernel routine and | ||
21 | collect debugging and performance information non-disruptively. You | ||
22 | can trap at almost any kernel code address, specifying a handler | ||
23 | routine to be invoked when the breakpoint is hit. | ||
24 | |||
25 | There are currently three types of probes: kprobes, jprobes, and | ||
26 | kretprobes (also called return probes). A kprobe can be inserted | ||
27 | on virtually any instruction in the kernel. A jprobe is inserted at | ||
28 | the entry to a kernel function, and provides convenient access to the | ||
29 | function's arguments. A return probe fires when a specified function | ||
30 | returns. | ||
31 | |||
32 | In the typical case, Kprobes-based instrumentation is packaged as | ||
33 | a kernel module. The module's init function installs ("registers") | ||
34 | one or more probes, and the exit function unregisters them. A | ||
35 | registration function such as register_kprobe() specifies where | ||
36 | the probe is to be inserted and what handler is to be called when | ||
37 | the probe is hit. | ||
38 | |||
39 | The next three subsections explain how the different types of | ||
40 | probes work. They explain certain things that you'll need to | ||
41 | know in order to make the best use of Kprobes -- e.g., the | ||
42 | difference between a pre_handler and a post_handler, and how | ||
43 | to use the maxactive and nmissed fields of a kretprobe. But | ||
44 | if you're in a hurry to start using Kprobes, you can skip ahead | ||
45 | to section 2. | ||
46 | |||
47 | 1.1 How Does a Kprobe Work? | ||
48 | |||
49 | When a kprobe is registered, Kprobes makes a copy of the probed | ||
50 | instruction and replaces the first byte(s) of the probed instruction | ||
51 | with a breakpoint instruction (e.g., int3 on i386 and x86_64). | ||
52 | |||
53 | When a CPU hits the breakpoint instruction, a trap occurs, the CPU's | ||
54 | registers are saved, and control passes to Kprobes via the | ||
55 | notifier_call_chain mechanism. Kprobes executes the "pre_handler" | ||
56 | associated with the kprobe, passing the handler the addresses of the | ||
57 | kprobe struct and the saved registers. | ||
58 | |||
59 | Next, Kprobes single-steps its copy of the probed instruction. | ||
60 | (It would be simpler to single-step the actual instruction in place, | ||
61 | but then Kprobes would have to temporarily remove the breakpoint | ||
62 | instruction. This would open a small time window when another CPU | ||
63 | could sail right past the probepoint.) | ||
64 | |||
65 | After the instruction is single-stepped, Kprobes executes the | ||
66 | "post_handler," if any, that is associated with the kprobe. | ||
67 | Execution then continues with the instruction following the probepoint. | ||
68 | |||
69 | 1.2 How Does a Jprobe Work? | ||
70 | |||
71 | A jprobe is implemented using a kprobe that is placed on a function's | ||
72 | entry point. It employs a simple mirroring principle to allow | ||
73 | seamless access to the probed function's arguments. The jprobe | ||
74 | handler routine should have the same signature (arg list and return | ||
75 | type) as the function being probed, and must always end by calling | ||
76 | the Kprobes function jprobe_return(). | ||
77 | |||
78 | Here's how it works. When the probe is hit, Kprobes makes a copy of | ||
79 | the saved registers and a generous portion of the stack (see below). | ||
80 | Kprobes then points the saved instruction pointer at the jprobe's | ||
81 | handler routine, and returns from the trap. As a result, control | ||
82 | passes to the handler, which is presented with the same register and | ||
83 | stack contents as the probed function. When it is done, the handler | ||
84 | calls jprobe_return(), which traps again to restore the original stack | ||
85 | contents and processor state and switch to the probed function. | ||
86 | |||
87 | By convention, the callee owns its arguments, so gcc may produce code | ||
88 | that unexpectedly modifies that portion of the stack. This is why | ||
89 | Kprobes saves a copy of the stack and restores it after the jprobe | ||
90 | handler has run. Up to MAX_STACK_SIZE bytes are copied -- e.g., | ||
91 | 64 bytes on i386. | ||
92 | |||
93 | Note that the probed function's args may be passed on the stack | ||
94 | or in registers (e.g., for x86_64 or for an i386 fastcall function). | ||
95 | The jprobe will work in either case, so long as the handler's | ||
96 | prototype matches that of the probed function. | ||
97 | |||
98 | 1.3 How Does a Return Probe Work? | ||
99 | |||
100 | When you call register_kretprobe(), Kprobes establishes a kprobe at | ||
101 | the entry to the function. When the probed function is called and this | ||
102 | probe is hit, Kprobes saves a copy of the return address, and replaces | ||
103 | the return address with the address of a "trampoline." The trampoline | ||
104 | is an arbitrary piece of code -- typically just a nop instruction. | ||
105 | At boot time, Kprobes registers a kprobe at the trampoline. | ||
106 | |||
107 | When the probed function executes its return instruction, control | ||
108 | passes to the trampoline and that probe is hit. Kprobes' trampoline | ||
109 | handler calls the user-specified handler associated with the kretprobe, | ||
110 | then sets the saved instruction pointer to the saved return address, | ||
111 | and that's where execution resumes upon return from the trap. | ||
112 | |||
113 | While the probed function is executing, its return address is | ||
114 | stored in an object of type kretprobe_instance. Before calling | ||
115 | register_kretprobe(), the user sets the maxactive field of the | ||
116 | kretprobe struct to specify how many instances of the specified | ||
117 | function can be probed simultaneously. register_kretprobe() | ||
118 | pre-allocates the indicated number of kretprobe_instance objects. | ||
119 | |||
120 | For example, if the function is non-recursive and is called with a | ||
121 | spinlock held, maxactive = 1 should be enough. If the function is | ||
122 | non-recursive and can never relinquish the CPU (e.g., via a semaphore | ||
123 | or preemption), NR_CPUS should be enough. If maxactive <= 0, it is | ||
124 | set to a default value. If CONFIG_PREEMPT is enabled, the default | ||
125 | is max(10, 2*NR_CPUS). Otherwise, the default is NR_CPUS. | ||
126 | |||
127 | It's not a disaster if you set maxactive too low; you'll just miss | ||
128 | some probes. In the kretprobe struct, the nmissed field is set to | ||
129 | zero when the return probe is registered, and is incremented every | ||
130 | time the probed function is entered but there is no kretprobe_instance | ||
131 | object available for establishing the return probe. | ||
132 | |||
133 | 2. Architectures Supported | ||
134 | |||
135 | Kprobes, jprobes, and return probes are implemented on the following | ||
136 | architectures: | ||
137 | |||
138 | - i386 | ||
139 | - x86_64 (AMD-64, E64MT) | ||
140 | - ppc64 | ||
141 | - ia64 (Support for probes on certain instruction types is still in progress.) | ||
142 | - sparc64 (Return probes not yet implemented.) | ||
143 | |||
144 | 3. Configuring Kprobes | ||
145 | |||
146 | When configuring the kernel using make menuconfig/xconfig/oldconfig, | ||
147 | ensure that CONFIG_KPROBES is set to "y". Under "Kernel hacking", | ||
148 | look for "Kprobes". You may have to enable "Kernel debugging" | ||
149 | (CONFIG_DEBUG_KERNEL) before you can enable Kprobes. | ||
150 | |||
151 | You may also want to ensure that CONFIG_KALLSYMS and perhaps even | ||
152 | CONFIG_KALLSYMS_ALL are set to "y", since kallsyms_lookup_name() | ||
153 | is a handy, version-independent way to find a function's address. | ||
154 | |||
155 | If you need to insert a probe in the middle of a function, you may find | ||
156 | it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO), | ||
157 | so you can use "objdump -d -l vmlinux" to see the source-to-object | ||
158 | code mapping. | ||
159 | |||
160 | 4. API Reference | ||
161 | |||
162 | The Kprobes API includes a "register" function and an "unregister" | ||
163 | function for each type of probe. Here are terse, mini-man-page | ||
164 | specifications for these functions and the associated probe handlers | ||
165 | that you'll write. See the latter half of this document for examples. | ||
166 | |||
167 | 4.1 register_kprobe | ||
168 | |||
169 | #include <linux/kprobes.h> | ||
170 | int register_kprobe(struct kprobe *kp); | ||
171 | |||
172 | Sets a breakpoint at the address kp->addr. When the breakpoint is | ||
173 | hit, Kprobes calls kp->pre_handler. After the probed instruction | ||
174 | is single-stepped, Kprobe calls kp->post_handler. If a fault | ||
175 | occurs during execution of kp->pre_handler or kp->post_handler, | ||
176 | or during single-stepping of the probed instruction, Kprobes calls | ||
177 | kp->fault_handler. Any or all handlers can be NULL. | ||
178 | |||
179 | register_kprobe() returns 0 on success, or a negative errno otherwise. | ||
180 | |||
181 | User's pre-handler (kp->pre_handler): | ||
182 | #include <linux/kprobes.h> | ||
183 | #include <linux/ptrace.h> | ||
184 | int pre_handler(struct kprobe *p, struct pt_regs *regs); | ||
185 | |||
186 | Called with p pointing to the kprobe associated with the breakpoint, | ||
187 | and regs pointing to the struct containing the registers saved when | ||
188 | the breakpoint was hit. Return 0 here unless you're a Kprobes geek. | ||
189 | |||
190 | User's post-handler (kp->post_handler): | ||
191 | #include <linux/kprobes.h> | ||
192 | #include <linux/ptrace.h> | ||
193 | void post_handler(struct kprobe *p, struct pt_regs *regs, | ||
194 | unsigned long flags); | ||
195 | |||
196 | p and regs are as described for the pre_handler. flags always seems | ||
197 | to be zero. | ||
198 | |||
199 | User's fault-handler (kp->fault_handler): | ||
200 | #include <linux/kprobes.h> | ||
201 | #include <linux/ptrace.h> | ||
202 | int fault_handler(struct kprobe *p, struct pt_regs *regs, int trapnr); | ||
203 | |||
204 | p and regs are as described for the pre_handler. trapnr is the | ||
205 | architecture-specific trap number associated with the fault (e.g., | ||
206 | on i386, 13 for a general protection fault or 14 for a page fault). | ||
207 | Returns 1 if it successfully handled the exception. | ||
208 | |||
209 | 4.2 register_jprobe | ||
210 | |||
211 | #include <linux/kprobes.h> | ||
212 | int register_jprobe(struct jprobe *jp) | ||
213 | |||
214 | Sets a breakpoint at the address jp->kp.addr, which must be the address | ||
215 | of the first instruction of a function. When the breakpoint is hit, | ||
216 | Kprobes runs the handler whose address is jp->entry. | ||
217 | |||
218 | The handler should have the same arg list and return type as the probed | ||
219 | function; and just before it returns, it must call jprobe_return(). | ||
220 | (The handler never actually returns, since jprobe_return() returns | ||
221 | control to Kprobes.) If the probed function is declared asmlinkage, | ||
222 | fastcall, or anything else that affects how args are passed, the | ||
223 | handler's declaration must match. | ||
224 | |||
225 | register_jprobe() returns 0 on success, or a negative errno otherwise. | ||
226 | |||
227 | 4.3 register_kretprobe | ||
228 | |||
229 | #include <linux/kprobes.h> | ||
230 | int register_kretprobe(struct kretprobe *rp); | ||
231 | |||
232 | Establishes a return probe for the function whose address is | ||
233 | rp->kp.addr. When that function returns, Kprobes calls rp->handler. | ||
234 | You must set rp->maxactive appropriately before you call | ||
235 | register_kretprobe(); see "How Does a Return Probe Work?" for details. | ||
236 | |||
237 | register_kretprobe() returns 0 on success, or a negative errno | ||
238 | otherwise. | ||
239 | |||
240 | User's return-probe handler (rp->handler): | ||
241 | #include <linux/kprobes.h> | ||
242 | #include <linux/ptrace.h> | ||
243 | int kretprobe_handler(struct kretprobe_instance *ri, struct pt_regs *regs); | ||
244 | |||
245 | regs is as described for kprobe.pre_handler. ri points to the | ||
246 | kretprobe_instance object, of which the following fields may be | ||
247 | of interest: | ||
248 | - ret_addr: the return address | ||
249 | - rp: points to the corresponding kretprobe object | ||
250 | - task: points to the corresponding task struct | ||
251 | The handler's return value is currently ignored. | ||
252 | |||
253 | 4.4 unregister_*probe | ||
254 | |||
255 | #include <linux/kprobes.h> | ||
256 | void unregister_kprobe(struct kprobe *kp); | ||
257 | void unregister_jprobe(struct jprobe *jp); | ||
258 | void unregister_kretprobe(struct kretprobe *rp); | ||
259 | |||
260 | Removes the specified probe. The unregister function can be called | ||
261 | at any time after the probe has been registered. | ||
262 | |||
263 | 5. Kprobes Features and Limitations | ||
264 | |||
265 | As of Linux v2.6.12, Kprobes allows multiple probes at the same | ||
266 | address. Currently, however, there cannot be multiple jprobes on | ||
267 | the same function at the same time. | ||
268 | |||
269 | In general, you can install a probe anywhere in the kernel. | ||
270 | In particular, you can probe interrupt handlers. Known exceptions | ||
271 | are discussed in this section. | ||
272 | |||
273 | For obvious reasons, it's a bad idea to install a probe in | ||
274 | the code that implements Kprobes (mostly kernel/kprobes.c and | ||
275 | arch/*/kernel/kprobes.c). A patch in the v2.6.13 timeframe instructs | ||
276 | Kprobes to reject such requests. | ||
277 | |||
278 | If you install a probe in an inline-able function, Kprobes makes | ||
279 | no attempt to chase down all inline instances of the function and | ||
280 | install probes there. gcc may inline a function without being asked, | ||
281 | so keep this in mind if you're not seeing the probe hits you expect. | ||
282 | |||
283 | A probe handler can modify the environment of the probed function | ||
284 | -- e.g., by modifying kernel data structures, or by modifying the | ||
285 | contents of the pt_regs struct (which are restored to the registers | ||
286 | upon return from the breakpoint). So Kprobes can be used, for example, | ||
287 | to install a bug fix or to inject faults for testing. Kprobes, of | ||
288 | course, has no way to distinguish the deliberately injected faults | ||
289 | from the accidental ones. Don't drink and probe. | ||
290 | |||
291 | Kprobes makes no attempt to prevent probe handlers from stepping on | ||
292 | each other -- e.g., probing printk() and then calling printk() from a | ||
293 | probe handler. As of Linux v2.6.12, if a probe handler hits a probe, | ||
294 | that second probe's handlers won't be run in that instance. | ||
295 | |||
296 | In Linux v2.6.12 and previous versions, Kprobes' data structures are | ||
297 | protected by a single lock that is held during probe registration and | ||
298 | unregistration and while handlers are run. Thus, no two handlers | ||
299 | can run simultaneously. To improve scalability on SMP systems, | ||
300 | this restriction will probably be removed soon, in which case | ||
301 | multiple handlers (or multiple instances of the same handler) may | ||
302 | run concurrently on different CPUs. Code your handlers accordingly. | ||
303 | |||
304 | Kprobes does not use semaphores or allocate memory except during | ||
305 | registration and unregistration. | ||
306 | |||
307 | Probe handlers are run with preemption disabled. Depending on the | ||
308 | architecture, handlers may also run with interrupts disabled. In any | ||
309 | case, your handler should not yield the CPU (e.g., by attempting to | ||
310 | acquire a semaphore). | ||
311 | |||
312 | Since a return probe is implemented by replacing the return | ||
313 | address with the trampoline's address, stack backtraces and calls | ||
314 | to __builtin_return_address() will typically yield the trampoline's | ||
315 | address instead of the real return address for kretprobed functions. | ||
316 | (As far as we can tell, __builtin_return_address() is used only | ||
317 | for instrumentation and error reporting.) | ||
318 | |||
319 | If the number of times a function is called does not match the | ||
320 | number of times it returns, registering a return probe on that | ||
321 | function may produce undesirable results. We have the do_exit() | ||
322 | and do_execve() cases covered. do_fork() is not an issue. We're | ||
323 | unaware of other specific cases where this could be a problem. | ||
324 | |||
325 | 6. Probe Overhead | ||
326 | |||
327 | On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0 | ||
328 | microseconds to process. Specifically, a benchmark that hits the same | ||
329 | probepoint repeatedly, firing a simple handler each time, reports 1-2 | ||
330 | million hits per second, depending on the architecture. A jprobe or | ||
331 | return-probe hit typically takes 50-75% longer than a kprobe hit. | ||
332 | When you have a return probe set on a function, adding a kprobe at | ||
333 | the entry to that function adds essentially no overhead. | ||
334 | |||
335 | Here are sample overhead figures (in usec) for different architectures. | ||
336 | k = kprobe; j = jprobe; r = return probe; kr = kprobe + return probe | ||
337 | on same function; jr = jprobe + return probe on same function | ||
338 | |||
339 | i386: Intel Pentium M, 1495 MHz, 2957.31 bogomips | ||
340 | k = 0.57 usec; j = 1.00; r = 0.92; kr = 0.99; jr = 1.40 | ||
341 | |||
342 | x86_64: AMD Opteron 246, 1994 MHz, 3971.48 bogomips | ||
343 | k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07 | ||
344 | |||
345 | ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU) | ||
346 | k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99 | ||
347 | |||
348 | 7. TODO | ||
349 | |||
350 | a. SystemTap (http://sourceware.org/systemtap): Work in progress | ||
351 | to provide a simplified programming interface for probe-based | ||
352 | instrumentation. | ||
353 | b. Improved SMP scalability: Currently, work is in progress to handle | ||
354 | multiple kprobes in parallel. | ||
355 | c. Kernel return probes for sparc64. | ||
356 | d. Support for other architectures. | ||
357 | e. User-space probes. | ||
358 | |||
359 | 8. Kprobes Example | ||
360 | |||
361 | Here's a sample kernel module showing the use of kprobes to dump a | ||
362 | stack trace and selected i386 registers when do_fork() is called. | ||
363 | ----- cut here ----- | ||
364 | /*kprobe_example.c*/ | ||
365 | #include <linux/kernel.h> | ||
366 | #include <linux/module.h> | ||
367 | #include <linux/kprobes.h> | ||
368 | #include <linux/kallsyms.h> | ||
369 | #include <linux/sched.h> | ||
370 | |||
371 | /*For each probe you need to allocate a kprobe structure*/ | ||
372 | static struct kprobe kp; | ||
373 | |||
374 | /*kprobe pre_handler: called just before the probed instruction is executed*/ | ||
375 | int handler_pre(struct kprobe *p, struct pt_regs *regs) | ||
376 | { | ||
377 | printk("pre_handler: p->addr=0x%p, eip=%lx, eflags=0x%lx\n", | ||
378 | p->addr, regs->eip, regs->eflags); | ||
379 | dump_stack(); | ||
380 | return 0; | ||
381 | } | ||
382 | |||
383 | /*kprobe post_handler: called after the probed instruction is executed*/ | ||
384 | void handler_post(struct kprobe *p, struct pt_regs *regs, unsigned long flags) | ||
385 | { | ||
386 | printk("post_handler: p->addr=0x%p, eflags=0x%lx\n", | ||
387 | p->addr, regs->eflags); | ||
388 | } | ||
389 | |||
390 | /* fault_handler: this is called if an exception is generated for any | ||
391 | * instruction within the pre- or post-handler, or when Kprobes | ||
392 | * single-steps the probed instruction. | ||
393 | */ | ||
394 | int handler_fault(struct kprobe *p, struct pt_regs *regs, int trapnr) | ||
395 | { | ||
396 | printk("fault_handler: p->addr=0x%p, trap #%dn", | ||
397 | p->addr, trapnr); | ||
398 | /* Return 0 because we don't handle the fault. */ | ||
399 | return 0; | ||
400 | } | ||
401 | |||
402 | int init_module(void) | ||
403 | { | ||
404 | int ret; | ||
405 | kp.pre_handler = handler_pre; | ||
406 | kp.post_handler = handler_post; | ||
407 | kp.fault_handler = handler_fault; | ||
408 | kp.addr = (kprobe_opcode_t*) kallsyms_lookup_name("do_fork"); | ||
409 | /* register the kprobe now */ | ||
410 | if (!kp.addr) { | ||
411 | printk("Couldn't find %s to plant kprobe\n", "do_fork"); | ||
412 | return -1; | ||
413 | } | ||
414 | if ((ret = register_kprobe(&kp) < 0)) { | ||
415 | printk("register_kprobe failed, returned %d\n", ret); | ||
416 | return -1; | ||
417 | } | ||
418 | printk("kprobe registered\n"); | ||
419 | return 0; | ||
420 | } | ||
421 | |||
422 | void cleanup_module(void) | ||
423 | { | ||
424 | unregister_kprobe(&kp); | ||
425 | printk("kprobe unregistered\n"); | ||
426 | } | ||
427 | |||
428 | MODULE_LICENSE("GPL"); | ||
429 | ----- cut here ----- | ||
430 | |||
431 | You can build the kernel module, kprobe-example.ko, using the following | ||
432 | Makefile: | ||
433 | ----- cut here ----- | ||
434 | obj-m := kprobe-example.o | ||
435 | KDIR := /lib/modules/$(shell uname -r)/build | ||
436 | PWD := $(shell pwd) | ||
437 | default: | ||
438 | $(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules | ||
439 | clean: | ||
440 | rm -f *.mod.c *.ko *.o | ||
441 | ----- cut here ----- | ||
442 | |||
443 | $ make | ||
444 | $ su - | ||
445 | ... | ||
446 | # insmod kprobe-example.ko | ||
447 | |||
448 | You will see the trace data in /var/log/messages and on the console | ||
449 | whenever do_fork() is invoked to create a new process. | ||
450 | |||
451 | 9. Jprobes Example | ||
452 | |||
453 | Here's a sample kernel module showing the use of jprobes to dump | ||
454 | the arguments of do_fork(). | ||
455 | ----- cut here ----- | ||
456 | /*jprobe-example.c */ | ||
457 | #include <linux/kernel.h> | ||
458 | #include <linux/module.h> | ||
459 | #include <linux/fs.h> | ||
460 | #include <linux/uio.h> | ||
461 | #include <linux/kprobes.h> | ||
462 | #include <linux/kallsyms.h> | ||
463 | |||
464 | /* | ||
465 | * Jumper probe for do_fork. | ||
466 | * Mirror principle enables access to arguments of the probed routine | ||
467 | * from the probe handler. | ||
468 | */ | ||
469 | |||
470 | /* Proxy routine having the same arguments as actual do_fork() routine */ | ||
471 | long jdo_fork(unsigned long clone_flags, unsigned long stack_start, | ||
472 | struct pt_regs *regs, unsigned long stack_size, | ||
473 | int __user * parent_tidptr, int __user * child_tidptr) | ||
474 | { | ||
475 | printk("jprobe: clone_flags=0x%lx, stack_size=0x%lx, regs=0x%p\n", | ||
476 | clone_flags, stack_size, regs); | ||
477 | /* Always end with a call to jprobe_return(). */ | ||
478 | jprobe_return(); | ||
479 | /*NOTREACHED*/ | ||
480 | return 0; | ||
481 | } | ||
482 | |||
483 | static struct jprobe my_jprobe = { | ||
484 | .entry = (kprobe_opcode_t *) jdo_fork | ||
485 | }; | ||
486 | |||
487 | int init_module(void) | ||
488 | { | ||
489 | int ret; | ||
490 | my_jprobe.kp.addr = (kprobe_opcode_t *) kallsyms_lookup_name("do_fork"); | ||
491 | if (!my_jprobe.kp.addr) { | ||
492 | printk("Couldn't find %s to plant jprobe\n", "do_fork"); | ||
493 | return -1; | ||
494 | } | ||
495 | |||
496 | if ((ret = register_jprobe(&my_jprobe)) <0) { | ||
497 | printk("register_jprobe failed, returned %d\n", ret); | ||
498 | return -1; | ||
499 | } | ||
500 | printk("Planted jprobe at %p, handler addr %p\n", | ||
501 | my_jprobe.kp.addr, my_jprobe.entry); | ||
502 | return 0; | ||
503 | } | ||
504 | |||
505 | void cleanup_module(void) | ||
506 | { | ||
507 | unregister_jprobe(&my_jprobe); | ||
508 | printk("jprobe unregistered\n"); | ||
509 | } | ||
510 | |||
511 | MODULE_LICENSE("GPL"); | ||
512 | ----- cut here ----- | ||
513 | |||
514 | Build and insert the kernel module as shown in the above kprobe | ||
515 | example. You will see the trace data in /var/log/messages and on | ||
516 | the console whenever do_fork() is invoked to create a new process. | ||
517 | (Some messages may be suppressed if syslogd is configured to | ||
518 | eliminate duplicate messages.) | ||
519 | |||
520 | 10. Kretprobes Example | ||
521 | |||
522 | Here's a sample kernel module showing the use of return probes to | ||
523 | report failed calls to sys_open(). | ||
524 | ----- cut here ----- | ||
525 | /*kretprobe-example.c*/ | ||
526 | #include <linux/kernel.h> | ||
527 | #include <linux/module.h> | ||
528 | #include <linux/kprobes.h> | ||
529 | #include <linux/kallsyms.h> | ||
530 | |||
531 | static const char *probed_func = "sys_open"; | ||
532 | |||
533 | /* Return-probe handler: If the probed function fails, log the return value. */ | ||
534 | static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs) | ||
535 | { | ||
536 | // Substitute the appropriate register name for your architecture -- | ||
537 | // e.g., regs->rax for x86_64, regs->gpr[3] for ppc64. | ||
538 | int retval = (int) regs->eax; | ||
539 | if (retval < 0) { | ||
540 | printk("%s returns %d\n", probed_func, retval); | ||
541 | } | ||
542 | return 0; | ||
543 | } | ||
544 | |||
545 | static struct kretprobe my_kretprobe = { | ||
546 | .handler = ret_handler, | ||
547 | /* Probe up to 20 instances concurrently. */ | ||
548 | .maxactive = 20 | ||
549 | }; | ||
550 | |||
551 | int init_module(void) | ||
552 | { | ||
553 | int ret; | ||
554 | my_kretprobe.kp.addr = | ||
555 | (kprobe_opcode_t *) kallsyms_lookup_name(probed_func); | ||
556 | if (!my_kretprobe.kp.addr) { | ||
557 | printk("Couldn't find %s to plant return probe\n", probed_func); | ||
558 | return -1; | ||
559 | } | ||
560 | if ((ret = register_kretprobe(&my_kretprobe)) < 0) { | ||
561 | printk("register_kretprobe failed, returned %d\n", ret); | ||
562 | return -1; | ||
563 | } | ||
564 | printk("Planted return probe at %p\n", my_kretprobe.kp.addr); | ||
565 | return 0; | ||
566 | } | ||
567 | |||
568 | void cleanup_module(void) | ||
569 | { | ||
570 | unregister_kretprobe(&my_kretprobe); | ||
571 | printk("kretprobe unregistered\n"); | ||
572 | /* nmissed > 0 suggests that maxactive was set too low. */ | ||
573 | printk("Missed probing %d instances of %s\n", | ||
574 | my_kretprobe.nmissed, probed_func); | ||
575 | } | ||
576 | |||
577 | MODULE_LICENSE("GPL"); | ||
578 | ----- cut here ----- | ||
579 | |||
580 | Build and insert the kernel module as shown in the above kprobe | ||
581 | example. You will see the trace data in /var/log/messages and on the | ||
582 | console whenever sys_open() returns a negative value. (Some messages | ||
583 | may be suppressed if syslogd is configured to eliminate duplicate | ||
584 | messages.) | ||
585 | |||
586 | For additional information on Kprobes, refer to the following URLs: | ||
587 | http://www-106.ibm.com/developerworks/library/l-kprobes.html?ca=dgr-lnxw42Kprobe | ||
588 | http://www.redhat.com/magazine/005mar05/features/kprobes/ | ||
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index 834993d26730..5b01d5cc4e95 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX | |||
@@ -114,9 +114,7 @@ tuntap.txt | |||
114 | vortex.txt | 114 | vortex.txt |
115 | - info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards. | 115 | - info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards. |
116 | wan-router.txt | 116 | wan-router.txt |
117 | - Wan router documentation | 117 | - WAN router documentation |
118 | wanpipe.txt | ||
119 | - WANPIPE(tm) Multiprotocol WAN Driver for Linux WAN Router | ||
120 | wavelan.txt | 118 | wavelan.txt |
121 | - AT&T GIS (nee NCR) WaveLAN card: An Ethernet-like radio transceiver | 119 | - AT&T GIS (nee NCR) WaveLAN card: An Ethernet-like radio transceiver |
122 | x25.txt | 120 | x25.txt |
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 0bc2ed136a38..24d029455baa 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt | |||
@@ -1,5 +1,7 @@ | |||
1 | 1 | ||
2 | Linux Ethernet Bonding Driver HOWTO | 2 | Linux Ethernet Bonding Driver HOWTO |
3 | |||
4 | Latest update: 21 June 2005 | ||
3 | 5 | ||
4 | Initial release : Thomas Davis <tadavis at lbl.gov> | 6 | Initial release : Thomas Davis <tadavis at lbl.gov> |
5 | Corrections, HA extensions : 2000/10/03-15 : | 7 | Corrections, HA extensions : 2000/10/03-15 : |
@@ -11,15 +13,22 @@ Corrections, HA extensions : 2000/10/03-15 : | |||
11 | 13 | ||
12 | Reorganized and updated Feb 2005 by Jay Vosburgh | 14 | Reorganized and updated Feb 2005 by Jay Vosburgh |
13 | 15 | ||
14 | Note : | 16 | Introduction |
15 | ------ | 17 | ============ |
18 | |||
19 | The Linux bonding driver provides a method for aggregating | ||
20 | multiple network interfaces into a single logical "bonded" interface. | ||
21 | The behavior of the bonded interfaces depends upon the mode; generally | ||
22 | speaking, modes provide either hot standby or load balancing services. | ||
23 | Additionally, link integrity monitoring may be performed. | ||
16 | 24 | ||
17 | The bonding driver originally came from Donald Becker's beowulf patches for | 25 | The bonding driver originally came from Donald Becker's |
18 | kernel 2.0. It has changed quite a bit since, and the original tools from | 26 | beowulf patches for kernel 2.0. It has changed quite a bit since, and |
19 | extreme-linux and beowulf sites will not work with this version of the driver. | 27 | the original tools from extreme-linux and beowulf sites will not work |
28 | with this version of the driver. | ||
20 | 29 | ||
21 | For new versions of the driver, patches for older kernels and the updated | 30 | For new versions of the driver, updated userspace tools, and |
22 | userspace tools, please follow the links at the end of this file. | 31 | who to ask for help, please follow the links at the end of this file. |
23 | 32 | ||
24 | Table of Contents | 33 | Table of Contents |
25 | ================= | 34 | ================= |
@@ -30,9 +39,13 @@ Table of Contents | |||
30 | 39 | ||
31 | 3. Configuring Bonding Devices | 40 | 3. Configuring Bonding Devices |
32 | 3.1 Configuration with sysconfig support | 41 | 3.1 Configuration with sysconfig support |
42 | 3.1.1 Using DHCP with sysconfig | ||
43 | 3.1.2 Configuring Multiple Bonds with sysconfig | ||
33 | 3.2 Configuration with initscripts support | 44 | 3.2 Configuration with initscripts support |
45 | 3.2.1 Using DHCP with initscripts | ||
46 | 3.2.2 Configuring Multiple Bonds with initscripts | ||
34 | 3.3 Configuring Bonding Manually | 47 | 3.3 Configuring Bonding Manually |
35 | 3.4 Configuring Multiple Bonds | 48 | 3.3.1 Configuring Multiple Bonds Manually |
36 | 49 | ||
37 | 5. Querying Bonding Configuration | 50 | 5. Querying Bonding Configuration |
38 | 5.1 Bonding Configuration | 51 | 5.1 Bonding Configuration |
@@ -56,21 +69,30 @@ Table of Contents | |||
56 | 69 | ||
57 | 11. Promiscuous mode | 70 | 11. Promiscuous mode |
58 | 71 | ||
59 | 12. High Availability Information | 72 | 12. Configuring Bonding for High Availability |
60 | 12.1 High Availability in a Single Switch Topology | 73 | 12.1 High Availability in a Single Switch Topology |
61 | 12.1.1 Bonding Mode Selection for Single Switch Topology | ||
62 | 12.1.2 Link Monitoring for Single Switch Topology | ||
63 | 12.2 High Availability in a Multiple Switch Topology | 74 | 12.2 High Availability in a Multiple Switch Topology |
64 | 12.2.1 Bonding Mode Selection for Multiple Switch Topology | 75 | 12.2.1 HA Bonding Mode Selection for Multiple Switch Topology |
65 | 12.2.2 Link Monitoring for Multiple Switch Topology | 76 | 12.2.2 HA Link Monitoring for Multiple Switch Topology |
66 | 12.3 Switch Behavior Issues for High Availability | 77 | |
78 | 13. Configuring Bonding for Maximum Throughput | ||
79 | 13.1 Maximum Throughput in a Single Switch Topology | ||
80 | 13.1.1 MT Bonding Mode Selection for Single Switch Topology | ||
81 | 13.1.2 MT Link Monitoring for Single Switch Topology | ||
82 | 13.2 Maximum Throughput in a Multiple Switch Topology | ||
83 | 13.2.1 MT Bonding Mode Selection for Multiple Switch Topology | ||
84 | 13.2.2 MT Link Monitoring for Multiple Switch Topology | ||
67 | 85 | ||
68 | 13. Hardware Specific Considerations | 86 | 14. Switch Behavior Issues |
69 | 13.1 IBM BladeCenter | 87 | 14.1 Link Establishment and Failover Delays |
88 | 14.2 Duplicated Incoming Packets | ||
70 | 89 | ||
71 | 14. Frequently Asked Questions | 90 | 15. Hardware Specific Considerations |
91 | 15.1 IBM BladeCenter | ||
72 | 92 | ||
73 | 15. Resources and Links | 93 | 16. Frequently Asked Questions |
94 | |||
95 | 17. Resources and Links | ||
74 | 96 | ||
75 | 97 | ||
76 | 1. Bonding Driver Installation | 98 | 1. Bonding Driver Installation |
@@ -86,16 +108,10 @@ the following steps: | |||
86 | 1.1 Configure and build the kernel with bonding | 108 | 1.1 Configure and build the kernel with bonding |
87 | ----------------------------------------------- | 109 | ----------------------------------------------- |
88 | 110 | ||
89 | The latest version of the bonding driver is available in the | 111 | The current version of the bonding driver is available in the |
90 | drivers/net/bonding subdirectory of the most recent kernel source | 112 | drivers/net/bonding subdirectory of the most recent kernel source |
91 | (which is available on http://kernel.org). | 113 | (which is available on http://kernel.org). Most users "rolling their |
92 | 114 | own" will want to use the most recent kernel from kernel.org. | |
93 | Prior to the 2.4.11 kernel, the bonding driver was maintained | ||
94 | largely outside the kernel tree; patches for some earlier kernels are | ||
95 | available on the bonding sourceforge site, although those patches are | ||
96 | still several years out of date. Most users will want to use either | ||
97 | the most recent kernel from kernel.org or whatever kernel came with | ||
98 | their distro. | ||
99 | 115 | ||
100 | Configure kernel with "make menuconfig" (or "make xconfig" or | 116 | Configure kernel with "make menuconfig" (or "make xconfig" or |
101 | "make config"), then select "Bonding driver support" in the "Network | 117 | "make config"), then select "Bonding driver support" in the "Network |
@@ -103,8 +119,8 @@ device support" section. It is recommended that you configure the | |||
103 | driver as module since it is currently the only way to pass parameters | 119 | driver as module since it is currently the only way to pass parameters |
104 | to the driver or configure more than one bonding device. | 120 | to the driver or configure more than one bonding device. |
105 | 121 | ||
106 | Build and install the new kernel and modules, then proceed to | 122 | Build and install the new kernel and modules, then continue |
107 | step 2. | 123 | below to install ifenslave. |
108 | 124 | ||
109 | 1.2 Install ifenslave Control Utility | 125 | 1.2 Install ifenslave Control Utility |
110 | ------------------------------------- | 126 | ------------------------------------- |
@@ -147,9 +163,9 @@ default kernel source include directory. | |||
147 | Options for the bonding driver are supplied as parameters to | 163 | Options for the bonding driver are supplied as parameters to |
148 | the bonding module at load time. They may be given as command line | 164 | the bonding module at load time. They may be given as command line |
149 | arguments to the insmod or modprobe command, but are usually specified | 165 | arguments to the insmod or modprobe command, but are usually specified |
150 | in either the /etc/modprobe.conf configuration file, or in a | 166 | in either the /etc/modules.conf or /etc/modprobe.conf configuration |
151 | distro-specific configuration file (some of which are detailed in the | 167 | file, or in a distro-specific configuration file (some of which are |
152 | next section). | 168 | detailed in the next section). |
153 | 169 | ||
154 | The available bonding driver parameters are listed below. If a | 170 | The available bonding driver parameters are listed below. If a |
155 | parameter is not specified the default value is used. When initially | 171 | parameter is not specified the default value is used. When initially |
@@ -162,34 +178,34 @@ degradation will occur during link failures. Very few devices do not | |||
162 | support at least miimon, so there is really no reason not to use it. | 178 | support at least miimon, so there is really no reason not to use it. |
163 | 179 | ||
164 | Options with textual values will accept either the text name | 180 | Options with textual values will accept either the text name |
165 | or, for backwards compatibility, the option value. E.g., | 181 | or, for backwards compatibility, the option value. E.g., |
166 | "mode=802.3ad" and "mode=4" set the same mode. | 182 | "mode=802.3ad" and "mode=4" set the same mode. |
167 | 183 | ||
168 | The parameters are as follows: | 184 | The parameters are as follows: |
169 | 185 | ||
170 | arp_interval | 186 | arp_interval |
171 | 187 | ||
172 | Specifies the ARP monitoring frequency in milli-seconds. If | 188 | Specifies the ARP link monitoring frequency in milliseconds. |
173 | ARP monitoring is used in a load-balancing mode (mode 0 or 2), | 189 | If ARP monitoring is used in an etherchannel compatible mode |
174 | the switch should be configured in a mode that evenly | 190 | (modes 0 and 2), the switch should be configured in a mode |
175 | distributes packets across all links - such as round-robin. If | 191 | that evenly distributes packets across all links. If the |
176 | the switch is configured to distribute the packets in an XOR | 192 | switch is configured to distribute the packets in an XOR |
177 | fashion, all replies from the ARP targets will be received on | 193 | fashion, all replies from the ARP targets will be received on |
178 | the same link which could cause the other team members to | 194 | the same link which could cause the other team members to |
179 | fail. ARP monitoring should not be used in conjunction with | 195 | fail. ARP monitoring should not be used in conjunction with |
180 | miimon. A value of 0 disables ARP monitoring. The default | 196 | miimon. A value of 0 disables ARP monitoring. The default |
181 | value is 0. | 197 | value is 0. |
182 | 198 | ||
183 | arp_ip_target | 199 | arp_ip_target |
184 | 200 | ||
185 | Specifies the ip addresses to use when arp_interval is > 0. | 201 | Specifies the IP addresses to use as ARP monitoring peers when |
186 | These are the targets of the ARP request sent to determine the | 202 | arp_interval is > 0. These are the targets of the ARP request |
187 | health of the link to the targets. Specify these values in | 203 | sent to determine the health of the link to the targets. |
188 | ddd.ddd.ddd.ddd format. Multiple ip adresses must be | 204 | Specify these values in ddd.ddd.ddd.ddd format. Multiple IP |
189 | seperated by a comma. At least one IP address must be given | 205 | addresses must be separated by a comma. At least one IP |
190 | for ARP monitoring to function. The maximum number of targets | 206 | address must be given for ARP monitoring to function. The |
191 | that can be specified is 16. The default value is no IP | 207 | maximum number of targets that can be specified is 16. The |
192 | addresses. | 208 | default value is no IP addresses. |
193 | 209 | ||
194 | downdelay | 210 | downdelay |
195 | 211 | ||
@@ -207,11 +223,13 @@ lacp_rate | |||
207 | are: | 223 | are: |
208 | 224 | ||
209 | slow or 0 | 225 | slow or 0 |
210 | Request partner to transmit LACPDUs every 30 seconds (default) | 226 | Request partner to transmit LACPDUs every 30 seconds |
211 | 227 | ||
212 | fast or 1 | 228 | fast or 1 |
213 | Request partner to transmit LACPDUs every 1 second | 229 | Request partner to transmit LACPDUs every 1 second |
214 | 230 | ||
231 | The default is slow. | ||
232 | |||
215 | max_bonds | 233 | max_bonds |
216 | 234 | ||
217 | Specifies the number of bonding devices to create for this | 235 | Specifies the number of bonding devices to create for this |
@@ -221,10 +239,11 @@ max_bonds | |||
221 | 239 | ||
222 | miimon | 240 | miimon |
223 | 241 | ||
224 | Specifies the frequency in milli-seconds that MII link | 242 | Specifies the MII link monitoring frequency in milliseconds. |
225 | monitoring will occur. A value of zero disables MII link | 243 | This determines how often the link state of each slave is |
226 | monitoring. A value of 100 is a good starting point. The | 244 | inspected for link failures. A value of zero disables MII |
227 | use_carrier option, below, affects how the link state is | 245 | link monitoring. A value of 100 is a good starting point. |
246 | The use_carrier option, below, affects how the link state is | ||
228 | determined. See the High Availability section for additional | 247 | determined. See the High Availability section for additional |
229 | information. The default value is 0. | 248 | information. The default value is 0. |
230 | 249 | ||
@@ -246,17 +265,31 @@ mode | |||
246 | active. A different slave becomes active if, and only | 265 | active. A different slave becomes active if, and only |
247 | if, the active slave fails. The bond's MAC address is | 266 | if, the active slave fails. The bond's MAC address is |
248 | externally visible on only one port (network adapter) | 267 | externally visible on only one port (network adapter) |
249 | to avoid confusing the switch. This mode provides | 268 | to avoid confusing the switch. |
250 | fault tolerance. The primary option affects the | 269 | |
251 | behavior of this mode. | 270 | In bonding version 2.6.2 or later, when a failover |
271 | occurs in active-backup mode, bonding will issue one | ||
272 | or more gratuitous ARPs on the newly active slave. | ||
273 | One gratutious ARP is issued for the bonding master | ||
274 | interface and each VLAN interfaces configured above | ||
275 | it, provided that the interface has at least one IP | ||
276 | address configured. Gratuitous ARPs issued for VLAN | ||
277 | interfaces are tagged with the appropriate VLAN id. | ||
278 | |||
279 | This mode provides fault tolerance. The primary | ||
280 | option, documented below, affects the behavior of this | ||
281 | mode. | ||
252 | 282 | ||
253 | balance-xor or 2 | 283 | balance-xor or 2 |
254 | 284 | ||
255 | XOR policy: Transmit based on [(source MAC address | 285 | XOR policy: Transmit based on the selected transmit |
256 | XOR'd with destination MAC address) modulo slave | 286 | hash policy. The default policy is a simple [(source |
257 | count]. This selects the same slave for each | 287 | MAC address XOR'd with destination MAC address) modulo |
258 | destination MAC address. This mode provides load | 288 | slave count]. Alternate transmit policies may be |
259 | balancing and fault tolerance. | 289 | selected via the xmit_hash_policy option, described |
290 | below. | ||
291 | |||
292 | This mode provides load balancing and fault tolerance. | ||
260 | 293 | ||
261 | broadcast or 3 | 294 | broadcast or 3 |
262 | 295 | ||
@@ -270,7 +303,17 @@ mode | |||
270 | duplex settings. Utilizes all slaves in the active | 303 | duplex settings. Utilizes all slaves in the active |
271 | aggregator according to the 802.3ad specification. | 304 | aggregator according to the 802.3ad specification. |
272 | 305 | ||
273 | Pre-requisites: | 306 | Slave selection for outgoing traffic is done according |
307 | to the transmit hash policy, which may be changed from | ||
308 | the default simple XOR policy via the xmit_hash_policy | ||
309 | option, documented below. Note that not all transmit | ||
310 | policies may be 802.3ad compliant, particularly in | ||
311 | regards to the packet mis-ordering requirements of | ||
312 | section 43.2.4 of the 802.3ad standard. Differing | ||
313 | peer implementations will have varying tolerances for | ||
314 | noncompliance. | ||
315 | |||
316 | Prerequisites: | ||
274 | 317 | ||
275 | 1. Ethtool support in the base drivers for retrieving | 318 | 1. Ethtool support in the base drivers for retrieving |
276 | the speed and duplex of each slave. | 319 | the speed and duplex of each slave. |
@@ -333,7 +376,7 @@ mode | |||
333 | 376 | ||
334 | When a link is reconnected or a new slave joins the | 377 | When a link is reconnected or a new slave joins the |
335 | bond the receive traffic is redistributed among all | 378 | bond the receive traffic is redistributed among all |
336 | active slaves in the bond by intiating ARP Replies | 379 | active slaves in the bond by initiating ARP Replies |
337 | with the selected mac address to each of the | 380 | with the selected mac address to each of the |
338 | clients. The updelay parameter (detailed below) must | 381 | clients. The updelay parameter (detailed below) must |
339 | be set to a value equal or greater than the switch's | 382 | be set to a value equal or greater than the switch's |
@@ -396,6 +439,60 @@ use_carrier | |||
396 | 0 will use the deprecated MII / ETHTOOL ioctls. The default | 439 | 0 will use the deprecated MII / ETHTOOL ioctls. The default |
397 | value is 1. | 440 | value is 1. |
398 | 441 | ||
442 | xmit_hash_policy | ||
443 | |||
444 | Selects the transmit hash policy to use for slave selection in | ||
445 | balance-xor and 802.3ad modes. Possible values are: | ||
446 | |||
447 | layer2 | ||
448 | |||
449 | Uses XOR of hardware MAC addresses to generate the | ||
450 | hash. The formula is | ||
451 | |||
452 | (source MAC XOR destination MAC) modulo slave count | ||
453 | |||
454 | This algorithm will place all traffic to a particular | ||
455 | network peer on the same slave. | ||
456 | |||
457 | This algorithm is 802.3ad compliant. | ||
458 | |||
459 | layer3+4 | ||
460 | |||
461 | This policy uses upper layer protocol information, | ||
462 | when available, to generate the hash. This allows for | ||
463 | traffic to a particular network peer to span multiple | ||
464 | slaves, although a single connection will not span | ||
465 | multiple slaves. | ||
466 | |||
467 | The formula for unfragmented TCP and UDP packets is | ||
468 | |||
469 | ((source port XOR dest port) XOR | ||
470 | ((source IP XOR dest IP) AND 0xffff) | ||
471 | modulo slave count | ||
472 | |||
473 | For fragmented TCP or UDP packets and all other IP | ||
474 | protocol traffic, the source and destination port | ||
475 | information is omitted. For non-IP traffic, the | ||
476 | formula is the same as for the layer2 transmit hash | ||
477 | policy. | ||
478 | |||
479 | This policy is intended to mimic the behavior of | ||
480 | certain switches, notably Cisco switches with PFC2 as | ||
481 | well as some Foundry and IBM products. | ||
482 | |||
483 | This algorithm is not fully 802.3ad compliant. A | ||
484 | single TCP or UDP conversation containing both | ||
485 | fragmented and unfragmented packets will see packets | ||
486 | striped across two interfaces. This may result in out | ||
487 | of order delivery. Most traffic types will not meet | ||
488 | this criteria, as TCP rarely fragments traffic, and | ||
489 | most UDP traffic is not involved in extended | ||
490 | conversations. Other implementations of 802.3ad may | ||
491 | or may not tolerate this noncompliance. | ||
492 | |||
493 | The default value is layer2. This option was added in bonding | ||
494 | version 2.6.3. In earlier versions of bonding, this parameter does | ||
495 | not exist, and the layer2 policy is the only policy. | ||
399 | 496 | ||
400 | 497 | ||
401 | 3. Configuring Bonding Devices | 498 | 3. Configuring Bonding Devices |
@@ -448,8 +545,9 @@ Bonding devices can be managed by hand, however, as follows. | |||
448 | slave devices. On SLES 9, this is most easily done by running the | 545 | slave devices. On SLES 9, this is most easily done by running the |
449 | yast2 sysconfig configuration utility. The goal is for to create an | 546 | yast2 sysconfig configuration utility. The goal is for to create an |
450 | ifcfg-id file for each slave device. The simplest way to accomplish | 547 | ifcfg-id file for each slave device. The simplest way to accomplish |
451 | this is to configure the devices for DHCP. The name of the | 548 | this is to configure the devices for DHCP (this is only to get the |
452 | configuration file for each device will be of the form: | 549 | file ifcfg-id file created; see below for some issues with DHCP). The |
550 | name of the configuration file for each device will be of the form: | ||
453 | 551 | ||
454 | ifcfg-id-xx:xx:xx:xx:xx:xx | 552 | ifcfg-id-xx:xx:xx:xx:xx:xx |
455 | 553 | ||
@@ -459,7 +557,7 @@ the device's permanent MAC address. | |||
459 | Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been | 557 | Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been |
460 | created, it is necessary to edit the configuration files for the slave | 558 | created, it is necessary to edit the configuration files for the slave |
461 | devices (the MAC addresses correspond to those of the slave devices). | 559 | devices (the MAC addresses correspond to those of the slave devices). |
462 | Before editing, the file will contain muliple lines, and will look | 560 | Before editing, the file will contain multiple lines, and will look |
463 | something like this: | 561 | something like this: |
464 | 562 | ||
465 | BOOTPROTO='dhcp' | 563 | BOOTPROTO='dhcp' |
@@ -496,16 +594,11 @@ STARTMODE="onboot" | |||
496 | BONDING_MASTER="yes" | 594 | BONDING_MASTER="yes" |
497 | BONDING_MODULE_OPTS="mode=active-backup miimon=100" | 595 | BONDING_MODULE_OPTS="mode=active-backup miimon=100" |
498 | BONDING_SLAVE0="eth0" | 596 | BONDING_SLAVE0="eth0" |
499 | BONDING_SLAVE1="eth1" | 597 | BONDING_SLAVE1="bus-pci-0000:06:08.1" |
500 | 598 | ||
501 | Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK | 599 | Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK |
502 | values with the appropriate values for your network. | 600 | values with the appropriate values for your network. |
503 | 601 | ||
504 | Note that configuring the bonding device with BOOTPROTO='dhcp' | ||
505 | does not work; the scripts attempt to obtain the device address from | ||
506 | DHCP prior to adding any of the slave devices. Without active slaves, | ||
507 | the DHCP requests are not sent to the network. | ||
508 | |||
509 | The STARTMODE specifies when the device is brought online. | 602 | The STARTMODE specifies when the device is brought online. |
510 | The possible values are: | 603 | The possible values are: |
511 | 604 | ||
@@ -531,9 +624,17 @@ for the bonding mode, link monitoring, and so on here. Do not include | |||
531 | the max_bonds bonding parameter; this will confuse the configuration | 624 | the max_bonds bonding parameter; this will confuse the configuration |
532 | system if you have multiple bonding devices. | 625 | system if you have multiple bonding devices. |
533 | 626 | ||
534 | Finally, supply one BONDING_SLAVEn="ethX" for each slave, | 627 | Finally, supply one BONDING_SLAVEn="slave device" for each |
535 | where "n" is an increasing value, one for each slave, and "ethX" is | 628 | slave. where "n" is an increasing value, one for each slave. The |
536 | the name of the slave device (eth0, eth1, etc). | 629 | "slave device" is either an interface name, e.g., "eth0", or a device |
630 | specifier for the network device. The interface name is easier to | ||
631 | find, but the ethN names are subject to change at boot time if, e.g., | ||
632 | a device early in the sequence has failed. The device specifiers | ||
633 | (bus-pci-0000:06:08.1 in the example above) specify the physical | ||
634 | network device, and will not change unless the device's bus location | ||
635 | changes (for example, it is moved from one PCI slot to another). The | ||
636 | example above uses one of each type for demonstration purposes; most | ||
637 | configurations will choose one or the other for all slave devices. | ||
537 | 638 | ||
538 | When all configuration files have been modified or created, | 639 | When all configuration files have been modified or created, |
539 | networking must be restarted for the configuration changes to take | 640 | networking must be restarted for the configuration changes to take |
@@ -544,7 +645,7 @@ effect. This can be accomplished via the following: | |||
544 | Note that the network control script (/sbin/ifdown) will | 645 | Note that the network control script (/sbin/ifdown) will |
545 | remove the bonding module as part of the network shutdown processing, | 646 | remove the bonding module as part of the network shutdown processing, |
546 | so it is not necessary to remove the module by hand if, e.g., the | 647 | so it is not necessary to remove the module by hand if, e.g., the |
547 | module paramters have changed. | 648 | module parameters have changed. |
548 | 649 | ||
549 | Also, at this writing, YaST/YaST2 will not manage bonding | 650 | Also, at this writing, YaST/YaST2 will not manage bonding |
550 | devices (they do not show bonding interfaces on its list of network | 651 | devices (they do not show bonding interfaces on its list of network |
@@ -559,12 +660,37 @@ format can be found in an example ifcfg template file: | |||
559 | Note that the template does not document the various BONDING_ | 660 | Note that the template does not document the various BONDING_ |
560 | settings described above, but does describe many of the other options. | 661 | settings described above, but does describe many of the other options. |
561 | 662 | ||
663 | 3.1.1 Using DHCP with sysconfig | ||
664 | ------------------------------- | ||
665 | |||
666 | Under sysconfig, configuring a device with BOOTPROTO='dhcp' | ||
667 | will cause it to query DHCP for its IP address information. At this | ||
668 | writing, this does not function for bonding devices; the scripts | ||
669 | attempt to obtain the device address from DHCP prior to adding any of | ||
670 | the slave devices. Without active slaves, the DHCP requests are not | ||
671 | sent to the network. | ||
672 | |||
673 | 3.1.2 Configuring Multiple Bonds with sysconfig | ||
674 | ----------------------------------------------- | ||
675 | |||
676 | The sysconfig network initialization system is capable of | ||
677 | handling multiple bonding devices. All that is necessary is for each | ||
678 | bonding instance to have an appropriately configured ifcfg-bondX file | ||
679 | (as described above). Do not specify the "max_bonds" parameter to any | ||
680 | instance of bonding, as this will confuse sysconfig. If you require | ||
681 | multiple bonding devices with identical parameters, create multiple | ||
682 | ifcfg-bondX files. | ||
683 | |||
684 | Because the sysconfig scripts supply the bonding module | ||
685 | options in the ifcfg-bondX file, it is not necessary to add them to | ||
686 | the system /etc/modules.conf or /etc/modprobe.conf configuration file. | ||
687 | |||
562 | 3.2 Configuration with initscripts support | 688 | 3.2 Configuration with initscripts support |
563 | ------------------------------------------ | 689 | ------------------------------------------ |
564 | 690 | ||
565 | This section applies to distros using a version of initscripts | 691 | This section applies to distros using a version of initscripts |
566 | with bonding support, for example, Red Hat Linux 9 or Red Hat | 692 | with bonding support, for example, Red Hat Linux 9 or Red Hat |
567 | Enterprise Linux version 3. On these systems, the network | 693 | Enterprise Linux version 3 or 4. On these systems, the network |
568 | initialization scripts have some knowledge of bonding, and can be | 694 | initialization scripts have some knowledge of bonding, and can be |
569 | configured to control bonding devices. | 695 | configured to control bonding devices. |
570 | 696 | ||
@@ -614,10 +740,11 @@ USERCTL=no | |||
614 | Be sure to change the networking specific lines (IPADDR, | 740 | Be sure to change the networking specific lines (IPADDR, |
615 | NETMASK, NETWORK and BROADCAST) to match your network configuration. | 741 | NETMASK, NETWORK and BROADCAST) to match your network configuration. |
616 | 742 | ||
617 | Finally, it is necessary to edit /etc/modules.conf to load the | 743 | Finally, it is necessary to edit /etc/modules.conf (or |
618 | bonding module when the bond0 interface is brought up. The following | 744 | /etc/modprobe.conf, depending upon your distro) to load the bonding |
619 | sample lines in /etc/modules.conf will load the bonding module, and | 745 | module with your desired options when the bond0 interface is brought |
620 | select its options: | 746 | up. The following lines in /etc/modules.conf (or modprobe.conf) will |
747 | load the bonding module, and select its options: | ||
621 | 748 | ||
622 | alias bond0 bonding | 749 | alias bond0 bonding |
623 | options bond0 mode=balance-alb miimon=100 | 750 | options bond0 mode=balance-alb miimon=100 |
@@ -629,6 +756,33 @@ options for your configuration. | |||
629 | will restart the networking subsystem and your bond link should be now | 756 | will restart the networking subsystem and your bond link should be now |
630 | up and running. | 757 | up and running. |
631 | 758 | ||
759 | 3.2.1 Using DHCP with initscripts | ||
760 | --------------------------------- | ||
761 | |||
762 | Recent versions of initscripts (the version supplied with | ||
763 | Fedora Core 3 and Red Hat Enterprise Linux 4 is reported to work) do | ||
764 | have support for assigning IP information to bonding devices via DHCP. | ||
765 | |||
766 | To configure bonding for DHCP, configure it as described | ||
767 | above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp" | ||
768 | and add a line consisting of "TYPE=Bonding". Note that the TYPE value | ||
769 | is case sensitive. | ||
770 | |||
771 | 3.2.2 Configuring Multiple Bonds with initscripts | ||
772 | ------------------------------------------------- | ||
773 | |||
774 | At this writing, the initscripts package does not directly | ||
775 | support loading the bonding driver multiple times, so the process for | ||
776 | doing so is the same as described in the "Configuring Multiple Bonds | ||
777 | Manually" section, below. | ||
778 | |||
779 | NOTE: It has been observed that some Red Hat supplied kernels | ||
780 | are apparently unable to rename modules at load time (the "-obonding1" | ||
781 | part). Attempts to pass that option to modprobe will produce an | ||
782 | "Operation not permitted" error. This has been reported on some | ||
783 | Fedora Core kernels, and has been seen on RHEL 4 as well. On kernels | ||
784 | exhibiting this problem, it will be impossible to configure multiple | ||
785 | bonds with differing parameters. | ||
632 | 786 | ||
633 | 3.3 Configuring Bonding Manually | 787 | 3.3 Configuring Bonding Manually |
634 | -------------------------------- | 788 | -------------------------------- |
@@ -638,10 +792,11 @@ scripts (the sysconfig or initscripts package) do not have specific | |||
638 | knowledge of bonding. One such distro is SuSE Linux Enterprise Server | 792 | knowledge of bonding. One such distro is SuSE Linux Enterprise Server |
639 | version 8. | 793 | version 8. |
640 | 794 | ||
641 | The general methodology for these systems is to place the | 795 | The general method for these systems is to place the bonding |
642 | bonding module parameters into /etc/modprobe.conf, then add modprobe | 796 | module parameters into /etc/modules.conf or /etc/modprobe.conf (as |
643 | and/or ifenslave commands to the system's global init script. The | 797 | appropriate for the installed distro), then add modprobe and/or |
644 | name of the global init script differs; for sysconfig, it is | 798 | ifenslave commands to the system's global init script. The name of |
799 | the global init script differs; for sysconfig, it is | ||
645 | /etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local. | 800 | /etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local. |
646 | 801 | ||
647 | For example, if you wanted to make a simple bond of two e100 | 802 | For example, if you wanted to make a simple bond of two e100 |
@@ -649,7 +804,7 @@ devices (presumed to be eth0 and eth1), and have it persist across | |||
649 | reboots, edit the appropriate file (/etc/init.d/boot.local or | 804 | reboots, edit the appropriate file (/etc/init.d/boot.local or |
650 | /etc/rc.d/rc.local), and add the following: | 805 | /etc/rc.d/rc.local), and add the following: |
651 | 806 | ||
652 | modprobe bonding -obond0 mode=balance-alb miimon=100 | 807 | modprobe bonding mode=balance-alb miimon=100 |
653 | modprobe e100 | 808 | modprobe e100 |
654 | ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up | 809 | ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up |
655 | ifenslave bond0 eth0 | 810 | ifenslave bond0 eth0 |
@@ -657,11 +812,7 @@ ifenslave bond0 eth1 | |||
657 | 812 | ||
658 | Replace the example bonding module parameters and bond0 | 813 | Replace the example bonding module parameters and bond0 |
659 | network configuration (IP address, netmask, etc) with the appropriate | 814 | network configuration (IP address, netmask, etc) with the appropriate |
660 | values for your configuration. The above example loads the bonding | 815 | values for your configuration. |
661 | module with the name "bond0," this simplifies the naming if multiple | ||
662 | bonding modules are loaded (each successive instance of the module is | ||
663 | given a different name, and the module instance names match the | ||
664 | bonding interface names). | ||
665 | 816 | ||
666 | Unfortunately, this method will not provide support for the | 817 | Unfortunately, this method will not provide support for the |
667 | ifup and ifdown scripts on the bond devices. To reload the bonding | 818 | ifup and ifdown scripts on the bond devices. To reload the bonding |
@@ -684,20 +835,23 @@ appropriate device driver modules. For our example above, you can do | |||
684 | the following: | 835 | the following: |
685 | 836 | ||
686 | # ifconfig bond0 down | 837 | # ifconfig bond0 down |
687 | # rmmod bond0 | 838 | # rmmod bonding |
688 | # rmmod e100 | 839 | # rmmod e100 |
689 | 840 | ||
690 | Again, for convenience, it may be desirable to create a script | 841 | Again, for convenience, it may be desirable to create a script |
691 | with these commands. | 842 | with these commands. |
692 | 843 | ||
693 | 844 | ||
694 | 3.4 Configuring Multiple Bonds | 845 | 3.3.1 Configuring Multiple Bonds Manually |
695 | ------------------------------ | 846 | ----------------------------------------- |
696 | 847 | ||
697 | This section contains information on configuring multiple | 848 | This section contains information on configuring multiple |
698 | bonding devices with differing options. If you require multiple | 849 | bonding devices with differing options for those systems whose network |
699 | bonding devices, but all with the same options, see the "max_bonds" | 850 | initialization scripts lack support for configuring multiple bonds. |
700 | module paramter, documented above. | 851 | |
852 | If you require multiple bonding devices, but all with the same | ||
853 | options, you may wish to use the "max_bonds" module parameter, | ||
854 | documented above. | ||
701 | 855 | ||
702 | To create multiple bonding devices with differing options, it | 856 | To create multiple bonding devices with differing options, it |
703 | is necessary to load the bonding driver multiple times. Note that | 857 | is necessary to load the bonding driver multiple times. Note that |
@@ -724,11 +878,16 @@ named "bond0" and creates the bond0 device in balance-rr mode with an | |||
724 | miimon of 100. The second instance is named "bond1" and creates the | 878 | miimon of 100. The second instance is named "bond1" and creates the |
725 | bond1 device in balance-alb mode with an miimon of 50. | 879 | bond1 device in balance-alb mode with an miimon of 50. |
726 | 880 | ||
881 | In some circumstances (typically with older distributions), | ||
882 | the above does not work, and the second bonding instance never sees | ||
883 | its options. In that case, the second options line can be substituted | ||
884 | as follows: | ||
885 | |||
886 | install bonding1 /sbin/modprobe bonding -obond1 mode=balance-alb miimon=50 | ||
887 | |||
727 | This may be repeated any number of times, specifying a new and | 888 | This may be repeated any number of times, specifying a new and |
728 | unique name in place of bond0 or bond1 for each instance. | 889 | unique name in place of bond1 for each subsequent instance. |
729 | 890 | ||
730 | When the appropriate module paramters are in place, then | ||
731 | configure bonding according to the instructions for your distro. | ||
732 | 891 | ||
733 | 5. Querying Bonding Configuration | 892 | 5. Querying Bonding Configuration |
734 | ================================= | 893 | ================================= |
@@ -846,8 +1005,8 @@ tagged internally by bonding itself. As a result, bonding must | |||
846 | self generated packets. | 1005 | self generated packets. |
847 | 1006 | ||
848 | For reasons of simplicity, and to support the use of adapters | 1007 | For reasons of simplicity, and to support the use of adapters |
849 | that can do VLAN hardware acceleration offloding, the bonding | 1008 | that can do VLAN hardware acceleration offloading, the bonding |
850 | interface declares itself as fully hardware offloaing capable, it gets | 1009 | interface declares itself as fully hardware offloading capable, it gets |
851 | the add_vid/kill_vid notifications to gather the necessary | 1010 | the add_vid/kill_vid notifications to gather the necessary |
852 | information, and it propagates those actions to the slaves. In case | 1011 | information, and it propagates those actions to the slaves. In case |
853 | of mixed adapter types, hardware accelerated tagged packets that | 1012 | of mixed adapter types, hardware accelerated tagged packets that |
@@ -880,7 +1039,7 @@ bond interface: | |||
880 | matches the hardware address of the VLAN interfaces. | 1039 | matches the hardware address of the VLAN interfaces. |
881 | 1040 | ||
882 | Note that changing a VLAN interface's HW address would set the | 1041 | Note that changing a VLAN interface's HW address would set the |
883 | underlying device -- i.e. the bonding interface -- to promiscouos | 1042 | underlying device -- i.e. the bonding interface -- to promiscuous |
884 | mode, which might not be what you want. | 1043 | mode, which might not be what you want. |
885 | 1044 | ||
886 | 1045 | ||
@@ -923,7 +1082,7 @@ down or have a problem making it unresponsive to ARP requests. Having | |||
923 | an additional target (or several) increases the reliability of the ARP | 1082 | an additional target (or several) increases the reliability of the ARP |
924 | monitoring. | 1083 | monitoring. |
925 | 1084 | ||
926 | Multiple ARP targets must be seperated by commas as follows: | 1085 | Multiple ARP targets must be separated by commas as follows: |
927 | 1086 | ||
928 | # example options for ARP monitoring with three targets | 1087 | # example options for ARP monitoring with three targets |
929 | alias bond0 bonding | 1088 | alias bond0 bonding |
@@ -1045,7 +1204,7 @@ install bonding /sbin/modprobe tg3; /sbin/modprobe e1000; | |||
1045 | This will, when loading the bonding module, rather than | 1204 | This will, when loading the bonding module, rather than |
1046 | performing the normal action, instead execute the provided command. | 1205 | performing the normal action, instead execute the provided command. |
1047 | This command loads the device drivers in the order needed, then calls | 1206 | This command loads the device drivers in the order needed, then calls |
1048 | modprobe with --ingore-install to cause the normal action to then take | 1207 | modprobe with --ignore-install to cause the normal action to then take |
1049 | place. Full documentation on this can be found in the modprobe.conf | 1208 | place. Full documentation on this can be found in the modprobe.conf |
1050 | and modprobe manual pages. | 1209 | and modprobe manual pages. |
1051 | 1210 | ||
@@ -1130,14 +1289,14 @@ association. | |||
1130 | common to enable promiscuous mode on the device, so that all traffic | 1289 | common to enable promiscuous mode on the device, so that all traffic |
1131 | is seen (instead of seeing only traffic destined for the local host). | 1290 | is seen (instead of seeing only traffic destined for the local host). |
1132 | The bonding driver handles promiscuous mode changes to the bonding | 1291 | The bonding driver handles promiscuous mode changes to the bonding |
1133 | master device (e.g., bond0), and propogates the setting to the slave | 1292 | master device (e.g., bond0), and propagates the setting to the slave |
1134 | devices. | 1293 | devices. |
1135 | 1294 | ||
1136 | For the balance-rr, balance-xor, broadcast, and 802.3ad modes, | 1295 | For the balance-rr, balance-xor, broadcast, and 802.3ad modes, |
1137 | the promiscuous mode setting is propogated to all slaves. | 1296 | the promiscuous mode setting is propagated to all slaves. |
1138 | 1297 | ||
1139 | For the active-backup, balance-tlb and balance-alb modes, the | 1298 | For the active-backup, balance-tlb and balance-alb modes, the |
1140 | promiscuous mode setting is propogated only to the active slave. | 1299 | promiscuous mode setting is propagated only to the active slave. |
1141 | 1300 | ||
1142 | For balance-tlb mode, the active slave is the slave currently | 1301 | For balance-tlb mode, the active slave is the slave currently |
1143 | receiving inbound traffic. | 1302 | receiving inbound traffic. |
@@ -1148,46 +1307,182 @@ sending to peers that are unassigned or if the load is unbalanced. | |||
1148 | 1307 | ||
1149 | For the active-backup, balance-tlb and balance-alb modes, when | 1308 | For the active-backup, balance-tlb and balance-alb modes, when |
1150 | the active slave changes (e.g., due to a link failure), the | 1309 | the active slave changes (e.g., due to a link failure), the |
1151 | promiscuous setting will be propogated to the new active slave. | 1310 | promiscuous setting will be propagated to the new active slave. |
1152 | 1311 | ||
1153 | 12. High Availability Information | 1312 | 12. Configuring Bonding for High Availability |
1154 | ================================= | 1313 | ============================================= |
1155 | 1314 | ||
1156 | High Availability refers to configurations that provide | 1315 | High Availability refers to configurations that provide |
1157 | maximum network availability by having redundant or backup devices, | 1316 | maximum network availability by having redundant or backup devices, |
1158 | links and switches between the host and the rest of the world. | 1317 | links or switches between the host and the rest of the world. The |
1159 | 1318 | goal is to provide the maximum availability of network connectivity | |
1160 | There are currently two basic methods for configuring to | 1319 | (i.e., the network always works), even though other configurations |
1161 | maximize availability. They are dependent on the network topology and | 1320 | could provide higher throughput. |
1162 | the primary goal of the configuration, but in general, a configuration | ||
1163 | can be optimized for maximum available bandwidth, or for maximum | ||
1164 | network availability. | ||
1165 | 1321 | ||
1166 | 12.1 High Availability in a Single Switch Topology | 1322 | 12.1 High Availability in a Single Switch Topology |
1167 | -------------------------------------------------- | 1323 | -------------------------------------------------- |
1168 | 1324 | ||
1169 | If two hosts (or a host and a switch) are directly connected | 1325 | If two hosts (or a host and a single switch) are directly |
1170 | via multiple physical links, then there is no network availability | 1326 | connected via multiple physical links, then there is no availability |
1171 | penalty for optimizing for maximum bandwidth: there is only one switch | 1327 | penalty to optimizing for maximum bandwidth. In this case, there is |
1172 | (or peer), so if it fails, you have no alternative access to fail over | 1328 | only one switch (or peer), so if it fails, there is no alternative |
1173 | to. | 1329 | access to fail over to. Additionally, the bonding load balance modes |
1330 | support link monitoring of their members, so if individual links fail, | ||
1331 | the load will be rebalanced across the remaining devices. | ||
1332 | |||
1333 | See Section 13, "Configuring Bonding for Maximum Throughput" | ||
1334 | for information on configuring bonding with one peer device. | ||
1335 | |||
1336 | 12.2 High Availability in a Multiple Switch Topology | ||
1337 | ---------------------------------------------------- | ||
1338 | |||
1339 | With multiple switches, the configuration of bonding and the | ||
1340 | network changes dramatically. In multiple switch topologies, there is | ||
1341 | a trade off between network availability and usable bandwidth. | ||
1342 | |||
1343 | Below is a sample network, configured to maximize the | ||
1344 | availability of the network: | ||
1174 | 1345 | ||
1175 | Example 1 : host to switch (or other host) | 1346 | | | |
1347 | |port3 port3| | ||
1348 | +-----+----+ +-----+----+ | ||
1349 | | |port2 ISL port2| | | ||
1350 | | switch A +--------------------------+ switch B | | ||
1351 | | | | | | ||
1352 | +-----+----+ +-----++---+ | ||
1353 | |port1 port1| | ||
1354 | | +-------+ | | ||
1355 | +-------------+ host1 +---------------+ | ||
1356 | eth0 +-------+ eth1 | ||
1176 | 1357 | ||
1177 | +----------+ +----------+ | 1358 | In this configuration, there is a link between the two |
1178 | | |eth0 eth0| switch | | 1359 | switches (ISL, or inter switch link), and multiple ports connecting to |
1179 | | Host A +--------------------------+ or | | 1360 | the outside world ("port3" on each switch). There is no technical |
1180 | | +--------------------------+ other | | 1361 | reason that this could not be extended to a third switch. |
1181 | | |eth1 eth1| host | | ||
1182 | +----------+ +----------+ | ||
1183 | 1362 | ||
1363 | 12.2.1 HA Bonding Mode Selection for Multiple Switch Topology | ||
1364 | ------------------------------------------------------------- | ||
1184 | 1365 | ||
1185 | 12.1.1 Bonding Mode Selection for single switch topology | 1366 | In a topology such as the example above, the active-backup and |
1186 | -------------------------------------------------------- | 1367 | broadcast modes are the only useful bonding modes when optimizing for |
1368 | availability; the other modes require all links to terminate on the | ||
1369 | same peer for them to behave rationally. | ||
1370 | |||
1371 | active-backup: This is generally the preferred mode, particularly if | ||
1372 | the switches have an ISL and play together well. If the | ||
1373 | network configuration is such that one switch is specifically | ||
1374 | a backup switch (e.g., has lower capacity, higher cost, etc), | ||
1375 | then the primary option can be used to insure that the | ||
1376 | preferred link is always used when it is available. | ||
1377 | |||
1378 | broadcast: This mode is really a special purpose mode, and is suitable | ||
1379 | only for very specific needs. For example, if the two | ||
1380 | switches are not connected (no ISL), and the networks beyond | ||
1381 | them are totally independent. In this case, if it is | ||
1382 | necessary for some specific one-way traffic to reach both | ||
1383 | independent networks, then the broadcast mode may be suitable. | ||
1384 | |||
1385 | 12.2.2 HA Link Monitoring Selection for Multiple Switch Topology | ||
1386 | ---------------------------------------------------------------- | ||
1387 | |||
1388 | The choice of link monitoring ultimately depends upon your | ||
1389 | switch. If the switch can reliably fail ports in response to other | ||
1390 | failures, then either the MII or ARP monitors should work. For | ||
1391 | example, in the above example, if the "port3" link fails at the remote | ||
1392 | end, the MII monitor has no direct means to detect this. The ARP | ||
1393 | monitor could be configured with a target at the remote end of port3, | ||
1394 | thus detecting that failure without switch support. | ||
1395 | |||
1396 | In general, however, in a multiple switch topology, the ARP | ||
1397 | monitor can provide a higher level of reliability in detecting end to | ||
1398 | end connectivity failures (which may be caused by the failure of any | ||
1399 | individual component to pass traffic for any reason). Additionally, | ||
1400 | the ARP monitor should be configured with multiple targets (at least | ||
1401 | one for each switch in the network). This will insure that, | ||
1402 | regardless of which switch is active, the ARP monitor has a suitable | ||
1403 | target to query. | ||
1404 | |||
1405 | |||
1406 | 13. Configuring Bonding for Maximum Throughput | ||
1407 | ============================================== | ||
1408 | |||
1409 | 13.1 Maximizing Throughput in a Single Switch Topology | ||
1410 | ------------------------------------------------------ | ||
1411 | |||
1412 | In a single switch configuration, the best method to maximize | ||
1413 | throughput depends upon the application and network environment. The | ||
1414 | various load balancing modes each have strengths and weaknesses in | ||
1415 | different environments, as detailed below. | ||
1416 | |||
1417 | For this discussion, we will break down the topologies into | ||
1418 | two categories. Depending upon the destination of most traffic, we | ||
1419 | categorize them into either "gatewayed" or "local" configurations. | ||
1420 | |||
1421 | In a gatewayed configuration, the "switch" is acting primarily | ||
1422 | as a router, and the majority of traffic passes through this router to | ||
1423 | other networks. An example would be the following: | ||
1424 | |||
1425 | |||
1426 | +----------+ +----------+ | ||
1427 | | |eth0 port1| | to other networks | ||
1428 | | Host A +---------------------+ router +-------------------> | ||
1429 | | +---------------------+ | Hosts B and C are out | ||
1430 | | |eth1 port2| | here somewhere | ||
1431 | +----------+ +----------+ | ||
1432 | |||
1433 | The router may be a dedicated router device, or another host | ||
1434 | acting as a gateway. For our discussion, the important point is that | ||
1435 | the majority of traffic from Host A will pass through the router to | ||
1436 | some other network before reaching its final destination. | ||
1437 | |||
1438 | In a gatewayed network configuration, although Host A may | ||
1439 | communicate with many other systems, all of its traffic will be sent | ||
1440 | and received via one other peer on the local network, the router. | ||
1441 | |||
1442 | Note that the case of two systems connected directly via | ||
1443 | multiple physical links is, for purposes of configuring bonding, the | ||
1444 | same as a gatewayed configuration. In that case, it happens that all | ||
1445 | traffic is destined for the "gateway" itself, not some other network | ||
1446 | beyond the gateway. | ||
1447 | |||
1448 | In a local configuration, the "switch" is acting primarily as | ||
1449 | a switch, and the majority of traffic passes through this switch to | ||
1450 | reach other stations on the same network. An example would be the | ||
1451 | following: | ||
1452 | |||
1453 | +----------+ +----------+ +--------+ | ||
1454 | | |eth0 port1| +-------+ Host B | | ||
1455 | | Host A +------------+ switch |port3 +--------+ | ||
1456 | | +------------+ | +--------+ | ||
1457 | | |eth1 port2| +------------------+ Host C | | ||
1458 | +----------+ +----------+port4 +--------+ | ||
1459 | |||
1460 | |||
1461 | Again, the switch may be a dedicated switch device, or another | ||
1462 | host acting as a gateway. For our discussion, the important point is | ||
1463 | that the majority of traffic from Host A is destined for other hosts | ||
1464 | on the same local network (Hosts B and C in the above example). | ||
1465 | |||
1466 | In summary, in a gatewayed configuration, traffic to and from | ||
1467 | the bonded device will be to the same MAC level peer on the network | ||
1468 | (the gateway itself, i.e., the router), regardless of its final | ||
1469 | destination. In a local configuration, traffic flows directly to and | ||
1470 | from the final destinations, thus, each destination (Host B, Host C) | ||
1471 | will be addressed directly by their individual MAC addresses. | ||
1472 | |||
1473 | This distinction between a gatewayed and a local network | ||
1474 | configuration is important because many of the load balancing modes | ||
1475 | available use the MAC addresses of the local network source and | ||
1476 | destination to make load balancing decisions. The behavior of each | ||
1477 | mode is described below. | ||
1478 | |||
1479 | |||
1480 | 13.1.1 MT Bonding Mode Selection for Single Switch Topology | ||
1481 | ----------------------------------------------------------- | ||
1187 | 1482 | ||
1188 | This configuration is the easiest to set up and to understand, | 1483 | This configuration is the easiest to set up and to understand, |
1189 | although you will have to decide which bonding mode best suits your | 1484 | although you will have to decide which bonding mode best suits your |
1190 | needs. The tradeoffs for each mode are detailed below: | 1485 | needs. The trade offs for each mode are detailed below: |
1191 | 1486 | ||
1192 | balance-rr: This mode is the only mode that will permit a single | 1487 | balance-rr: This mode is the only mode that will permit a single |
1193 | TCP/IP connection to stripe traffic across multiple | 1488 | TCP/IP connection to stripe traffic across multiple |
@@ -1206,6 +1501,23 @@ balance-rr: This mode is the only mode that will permit a single | |||
1206 | interface's worth of throughput, even after adjusting | 1501 | interface's worth of throughput, even after adjusting |
1207 | tcp_reordering. | 1502 | tcp_reordering. |
1208 | 1503 | ||
1504 | Note that this out of order delivery occurs when both the | ||
1505 | sending and receiving systems are utilizing a multiple | ||
1506 | interface bond. Consider a configuration in which a | ||
1507 | balance-rr bond feeds into a single higher capacity network | ||
1508 | channel (e.g., multiple 100Mb/sec ethernets feeding a single | ||
1509 | gigabit ethernet via an etherchannel capable switch). In this | ||
1510 | configuration, traffic sent from the multiple 100Mb devices to | ||
1511 | a destination connected to the gigabit device will not see | ||
1512 | packets out of order. However, traffic sent from the gigabit | ||
1513 | device to the multiple 100Mb devices may or may not see | ||
1514 | traffic out of order, depending upon the balance policy of the | ||
1515 | switch. Many switches do not support any modes that stripe | ||
1516 | traffic (instead choosing a port based upon IP or MAC level | ||
1517 | addresses); for those devices, traffic flowing from the | ||
1518 | gigabit device to the many 100Mb devices will only utilize one | ||
1519 | interface. | ||
1520 | |||
1209 | If you are utilizing protocols other than TCP/IP, UDP for | 1521 | If you are utilizing protocols other than TCP/IP, UDP for |
1210 | example, and your application can tolerate out of order | 1522 | example, and your application can tolerate out of order |
1211 | delivery, then this mode can allow for single stream datagram | 1523 | delivery, then this mode can allow for single stream datagram |
@@ -1220,16 +1532,21 @@ active-backup: There is not much advantage in this network topology to | |||
1220 | connected to the same peer as the primary. In this case, a | 1532 | connected to the same peer as the primary. In this case, a |
1221 | load balancing mode (with link monitoring) will provide the | 1533 | load balancing mode (with link monitoring) will provide the |
1222 | same level of network availability, but with increased | 1534 | same level of network availability, but with increased |
1223 | available bandwidth. On the plus side, it does not require | 1535 | available bandwidth. On the plus side, active-backup mode |
1224 | any configuration of the switch. | 1536 | does not require any configuration of the switch, so it may |
1537 | have value if the hardware available does not support any of | ||
1538 | the load balance modes. | ||
1225 | 1539 | ||
1226 | balance-xor: This mode will limit traffic such that packets destined | 1540 | balance-xor: This mode will limit traffic such that packets destined |
1227 | for specific peers will always be sent over the same | 1541 | for specific peers will always be sent over the same |
1228 | interface. Since the destination is determined by the MAC | 1542 | interface. Since the destination is determined by the MAC |
1229 | addresses involved, this may be desirable if you have a large | 1543 | addresses involved, this mode works best in a "local" network |
1230 | network with many hosts. It is likely to be suboptimal if all | 1544 | configuration (as described above), with destinations all on |
1231 | your traffic is passed through a single router, however. As | 1545 | the same local network. This mode is likely to be suboptimal |
1232 | with balance-rr, the switch ports need to be configured for | 1546 | if all your traffic is passed through a single router (i.e., a |
1547 | "gatewayed" network configuration, as described above). | ||
1548 | |||
1549 | As with balance-rr, the switch ports need to be configured for | ||
1233 | "etherchannel" or "trunking." | 1550 | "etherchannel" or "trunking." |
1234 | 1551 | ||
1235 | broadcast: Like active-backup, there is not much advantage to this | 1552 | broadcast: Like active-backup, there is not much advantage to this |
@@ -1241,122 +1558,131 @@ broadcast: Like active-backup, there is not much advantage to this | |||
1241 | protocol includes automatic configuration of the aggregates, | 1558 | protocol includes automatic configuration of the aggregates, |
1242 | so minimal manual configuration of the switch is needed | 1559 | so minimal manual configuration of the switch is needed |
1243 | (typically only to designate that some set of devices is | 1560 | (typically only to designate that some set of devices is |
1244 | usable for 802.3ad). The 802.3ad standard also mandates that | 1561 | available for 802.3ad). The 802.3ad standard also mandates |
1245 | frames be delivered in order (within certain limits), so in | 1562 | that frames be delivered in order (within certain limits), so |
1246 | general single connections will not see misordering of | 1563 | in general single connections will not see misordering of |
1247 | packets. The 802.3ad mode does have some drawbacks: the | 1564 | packets. The 802.3ad mode does have some drawbacks: the |
1248 | standard mandates that all devices in the aggregate operate at | 1565 | standard mandates that all devices in the aggregate operate at |
1249 | the same speed and duplex. Also, as with all bonding load | 1566 | the same speed and duplex. Also, as with all bonding load |
1250 | balance modes other than balance-rr, no single connection will | 1567 | balance modes other than balance-rr, no single connection will |
1251 | be able to utilize more than a single interface's worth of | 1568 | be able to utilize more than a single interface's worth of |
1252 | bandwidth. Additionally, the linux bonding 802.3ad | 1569 | bandwidth. |
1253 | implementation distributes traffic by peer (using an XOR of | 1570 | |
1254 | MAC addresses), so in general all traffic to a particular | 1571 | Additionally, the linux bonding 802.3ad implementation |
1255 | destination will use the same interface. Finally, the 802.3ad | 1572 | distributes traffic by peer (using an XOR of MAC addresses), |
1256 | mode mandates the use of the MII monitor, therefore, the ARP | 1573 | so in a "gatewayed" configuration, all outgoing traffic will |
1257 | monitor is not available in this mode. | 1574 | generally use the same device. Incoming traffic may also end |
1258 | 1575 | up on a single device, but that is dependent upon the | |
1259 | balance-tlb: This mode is also a good choice for this type of | 1576 | balancing policy of the peer's 8023.ad implementation. In a |
1260 | topology. It has no special switch configuration | 1577 | "local" configuration, traffic will be distributed across the |
1261 | requirements, and balances outgoing traffic by peer, in a | 1578 | devices in the bond. |
1262 | vaguely intelligent manner (not a simple XOR as in balance-xor | 1579 | |
1263 | or 802.3ad mode), so that unlucky MAC addresses will not all | 1580 | Finally, the 802.3ad mode mandates the use of the MII monitor, |
1264 | "bunch up" on a single interface. Interfaces may be of | 1581 | therefore, the ARP monitor is not available in this mode. |
1265 | differing speeds. On the down side, in this mode all incoming | 1582 | |
1266 | traffic arrives over a single interface, this mode requires | 1583 | balance-tlb: The balance-tlb mode balances outgoing traffic by peer. |
1267 | certain ethtool support in the network device driver of the | 1584 | Since the balancing is done according to MAC address, in a |
1268 | slave interfaces, and the ARP monitor is not available. | 1585 | "gatewayed" configuration (as described above), this mode will |
1269 | 1586 | send all traffic across a single device. However, in a | |
1270 | balance-alb: This mode is everything that balance-tlb is, and more. It | 1587 | "local" network configuration, this mode balances multiple |
1271 | has all of the features (and restrictions) of balance-tlb, and | 1588 | local network peers across devices in a vaguely intelligent |
1272 | will also balance incoming traffic from peers (as described in | 1589 | manner (not a simple XOR as in balance-xor or 802.3ad mode), |
1273 | the Bonding Module Options section, above). The only extra | 1590 | so that mathematically unlucky MAC addresses (i.e., ones that |
1274 | down side to this mode is that the network device driver must | 1591 | XOR to the same value) will not all "bunch up" on a single |
1275 | support changing the hardware address while the device is | 1592 | interface. |
1276 | open. | 1593 | |
1277 | 1594 | Unlike 802.3ad, interfaces may be of differing speeds, and no | |
1278 | 12.1.2 Link Monitoring for Single Switch Topology | 1595 | special switch configuration is required. On the down side, |
1279 | ------------------------------------------------- | 1596 | in this mode all incoming traffic arrives over a single |
1597 | interface, this mode requires certain ethtool support in the | ||
1598 | network device driver of the slave interfaces, and the ARP | ||
1599 | monitor is not available. | ||
1600 | |||
1601 | balance-alb: This mode is everything that balance-tlb is, and more. | ||
1602 | It has all of the features (and restrictions) of balance-tlb, | ||
1603 | and will also balance incoming traffic from local network | ||
1604 | peers (as described in the Bonding Module Options section, | ||
1605 | above). | ||
1606 | |||
1607 | The only additional down side to this mode is that the network | ||
1608 | device driver must support changing the hardware address while | ||
1609 | the device is open. | ||
1610 | |||
1611 | 13.1.2 MT Link Monitoring for Single Switch Topology | ||
1612 | ---------------------------------------------------- | ||
1280 | 1613 | ||
1281 | The choice of link monitoring may largely depend upon which | 1614 | The choice of link monitoring may largely depend upon which |
1282 | mode you choose to use. The more advanced load balancing modes do not | 1615 | mode you choose to use. The more advanced load balancing modes do not |
1283 | support the use of the ARP monitor, and are thus restricted to using | 1616 | support the use of the ARP monitor, and are thus restricted to using |
1284 | the MII monitor (which does not provide as high a level of assurance | 1617 | the MII monitor (which does not provide as high a level of end to end |
1285 | as the ARP monitor). | 1618 | assurance as the ARP monitor). |
1286 | 1619 | ||
1287 | 1620 | 13.2 Maximum Throughput in a Multiple Switch Topology | |
1288 | 12.2 High Availability in a Multiple Switch Topology | 1621 | ----------------------------------------------------- |
1289 | ---------------------------------------------------- | 1622 | |
1290 | 1623 | Multiple switches may be utilized to optimize for throughput | |
1291 | With multiple switches, the configuration of bonding and the | 1624 | when they are configured in parallel as part of an isolated network |
1292 | network changes dramatically. In multiple switch topologies, there is | 1625 | between two or more systems, for example: |
1293 | a tradeoff between network availability and usable bandwidth. | 1626 | |
1294 | 1627 | +-----------+ | |
1295 | Below is a sample network, configured to maximize the | 1628 | | Host A | |
1296 | availability of the network: | 1629 | +-+---+---+-+ |
1297 | 1630 | | | | | |
1298 | | | | 1631 | +--------+ | +---------+ |
1299 | |port3 port3| | 1632 | | | | |
1300 | +-----+----+ +-----+----+ | 1633 | +------+---+ +-----+----+ +-----+----+ |
1301 | | |port2 ISL port2| | | 1634 | | Switch A | | Switch B | | Switch C | |
1302 | | switch A +--------------------------+ switch B | | 1635 | +------+---+ +-----+----+ +-----+----+ |
1303 | | | | | | 1636 | | | | |
1304 | +-----+----+ +-----++---+ | 1637 | +--------+ | +---------+ |
1305 | |port1 port1| | 1638 | | | | |
1306 | | +-------+ | | 1639 | +-+---+---+-+ |
1307 | +-------------+ host1 +---------------+ | 1640 | | Host B | |
1308 | eth0 +-------+ eth1 | 1641 | +-----------+ |
1309 | 1642 | ||
1310 | In this configuration, there is a link between the two | 1643 | In this configuration, the switches are isolated from one |
1311 | switches (ISL, or inter switch link), and multiple ports connecting to | 1644 | another. One reason to employ a topology such as this is for an |
1312 | the outside world ("port3" on each switch). There is no technical | 1645 | isolated network with many hosts (a cluster configured for high |
1313 | reason that this could not be extended to a third switch. | 1646 | performance, for example), using multiple smaller switches can be more |
1314 | 1647 | cost effective than a single larger switch, e.g., on a network with 24 | |
1315 | 12.2.1 Bonding Mode Selection for Multiple Switch Topology | 1648 | hosts, three 24 port switches can be significantly less expensive than |
1316 | ---------------------------------------------------------- | 1649 | a single 72 port switch. |
1317 | 1650 | ||
1318 | In a topology such as this, the active-backup and broadcast | 1651 | If access beyond the network is required, an individual host |
1319 | modes are the only useful bonding modes; the other modes require all | 1652 | can be equipped with an additional network device connected to an |
1320 | links to terminate on the same peer for them to behave rationally. | 1653 | external network; this host then additionally acts as a gateway. |
1321 | 1654 | ||
1322 | active-backup: This is generally the preferred mode, particularly if | 1655 | 13.2.1 MT Bonding Mode Selection for Multiple Switch Topology |
1323 | the switches have an ISL and play together well. If the | ||
1324 | network configuration is such that one switch is specifically | ||
1325 | a backup switch (e.g., has lower capacity, higher cost, etc), | ||
1326 | then the primary option can be used to insure that the | ||
1327 | preferred link is always used when it is available. | ||
1328 | |||
1329 | broadcast: This mode is really a special purpose mode, and is suitable | ||
1330 | only for very specific needs. For example, if the two | ||
1331 | switches are not connected (no ISL), and the networks beyond | ||
1332 | them are totally independant. In this case, if it is | ||
1333 | necessary for some specific one-way traffic to reach both | ||
1334 | independent networks, then the broadcast mode may be suitable. | ||
1335 | |||
1336 | 12.2.2 Link Monitoring Selection for Multiple Switch Topology | ||
1337 | ------------------------------------------------------------- | 1656 | ------------------------------------------------------------- |
1338 | 1657 | ||
1339 | The choice of link monitoring ultimately depends upon your | 1658 | In actual practice, the bonding mode typically employed in |
1340 | switch. If the switch can reliably fail ports in response to other | 1659 | configurations of this type is balance-rr. Historically, in this |
1341 | failures, then either the MII or ARP monitors should work. For | 1660 | network configuration, the usual caveats about out of order packet |
1342 | example, in the above example, if the "port3" link fails at the remote | 1661 | delivery are mitigated by the use of network adapters that do not do |
1343 | end, the MII monitor has no direct means to detect this. The ARP | 1662 | any kind of packet coalescing (via the use of NAPI, or because the |
1344 | monitor could be configured with a target at the remote end of port3, | 1663 | device itself does not generate interrupts until some number of |
1345 | thus detecting that failure without switch support. | 1664 | packets has arrived). When employed in this fashion, the balance-rr |
1665 | mode allows individual connections between two hosts to effectively | ||
1666 | utilize greater than one interface's bandwidth. | ||
1346 | 1667 | ||
1347 | In general, however, in a multiple switch topology, the ARP | 1668 | 13.2.2 MT Link Monitoring for Multiple Switch Topology |
1348 | monitor can provide a higher level of reliability in detecting link | 1669 | ------------------------------------------------------ |
1349 | failures. Additionally, it should be configured with multiple targets | ||
1350 | (at least one for each switch in the network). This will insure that, | ||
1351 | regardless of which switch is active, the ARP monitor has a suitable | ||
1352 | target to query. | ||
1353 | 1670 | ||
1671 | Again, in actual practice, the MII monitor is most often used | ||
1672 | in this configuration, as performance is given preference over | ||
1673 | availability. The ARP monitor will function in this topology, but its | ||
1674 | advantages over the MII monitor are mitigated by the volume of probes | ||
1675 | needed as the number of systems involved grows (remember that each | ||
1676 | host in the network is configured with bonding). | ||
1354 | 1677 | ||
1355 | 12.3 Switch Behavior Issues for High Availability | 1678 | 14. Switch Behavior Issues |
1356 | ------------------------------------------------- | 1679 | ========================== |
1357 | 1680 | ||
1358 | You may encounter issues with the timing of link up and down | 1681 | 14.1 Link Establishment and Failover Delays |
1359 | reporting by the switch. | 1682 | ------------------------------------------- |
1683 | |||
1684 | Some switches exhibit undesirable behavior with regard to the | ||
1685 | timing of link up and down reporting by the switch. | ||
1360 | 1686 | ||
1361 | First, when a link comes up, some switches may indicate that | 1687 | First, when a link comes up, some switches may indicate that |
1362 | the link is up (carrier available), but not pass traffic over the | 1688 | the link is up (carrier available), but not pass traffic over the |
@@ -1370,30 +1696,70 @@ relevant interface(s). | |||
1370 | Second, some switches may "bounce" the link state one or more | 1696 | Second, some switches may "bounce" the link state one or more |
1371 | times while a link is changing state. This occurs most commonly while | 1697 | times while a link is changing state. This occurs most commonly while |
1372 | the switch is initializing. Again, an appropriate updelay value may | 1698 | the switch is initializing. Again, an appropriate updelay value may |
1373 | help, but note that if all links are down, then updelay is ignored | 1699 | help. |
1374 | when any link becomes active (the slave closest to completing its | ||
1375 | updelay is chosen). | ||
1376 | 1700 | ||
1377 | Note that when a bonding interface has no active links, the | 1701 | Note that when a bonding interface has no active links, the |
1378 | driver will immediately reuse the first link that goes up, even if | 1702 | driver will immediately reuse the first link that goes up, even if the |
1379 | updelay parameter was specified. If there are slave interfaces | 1703 | updelay parameter has been specified (the updelay is ignored in this |
1380 | waiting for the updelay timeout to expire, the interface that first | 1704 | case). If there are slave interfaces waiting for the updelay timeout |
1381 | went into that state will be immediately reused. This reduces down | 1705 | to expire, the interface that first went into that state will be |
1382 | time of the network if the value of updelay has been overestimated. | 1706 | immediately reused. This reduces down time of the network if the |
1707 | value of updelay has been overestimated, and since this occurs only in | ||
1708 | cases with no connectivity, there is no additional penalty for | ||
1709 | ignoring the updelay. | ||
1383 | 1710 | ||
1384 | In addition to the concerns about switch timings, if your | 1711 | In addition to the concerns about switch timings, if your |
1385 | switches take a long time to go into backup mode, it may be desirable | 1712 | switches take a long time to go into backup mode, it may be desirable |
1386 | to not activate a backup interface immediately after a link goes down. | 1713 | to not activate a backup interface immediately after a link goes down. |
1387 | Failover may be delayed via the downdelay bonding module option. | 1714 | Failover may be delayed via the downdelay bonding module option. |
1388 | 1715 | ||
1389 | 13. Hardware Specific Considerations | 1716 | 14.2 Duplicated Incoming Packets |
1717 | -------------------------------- | ||
1718 | |||
1719 | It is not uncommon to observe a short burst of duplicated | ||
1720 | traffic when the bonding device is first used, or after it has been | ||
1721 | idle for some period of time. This is most easily observed by issuing | ||
1722 | a "ping" to some other host on the network, and noticing that the | ||
1723 | output from ping flags duplicates (typically one per slave). | ||
1724 | |||
1725 | For example, on a bond in active-backup mode with five slaves | ||
1726 | all connected to one switch, the output may appear as follows: | ||
1727 | |||
1728 | # ping -n 10.0.4.2 | ||
1729 | PING 10.0.4.2 (10.0.4.2) from 10.0.3.10 : 56(84) bytes of data. | ||
1730 | 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.7 ms | ||
1731 | 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) | ||
1732 | 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) | ||
1733 | 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) | ||
1734 | 64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!) | ||
1735 | 64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.216 ms | ||
1736 | 64 bytes from 10.0.4.2: icmp_seq=3 ttl=64 time=0.267 ms | ||
1737 | 64 bytes from 10.0.4.2: icmp_seq=4 ttl=64 time=0.222 ms | ||
1738 | |||
1739 | This is not due to an error in the bonding driver, rather, it | ||
1740 | is a side effect of how many switches update their MAC forwarding | ||
1741 | tables. Initially, the switch does not associate the MAC address in | ||
1742 | the packet with a particular switch port, and so it may send the | ||
1743 | traffic to all ports until its MAC forwarding table is updated. Since | ||
1744 | the interfaces attached to the bond may occupy multiple ports on a | ||
1745 | single switch, when the switch (temporarily) floods the traffic to all | ||
1746 | ports, the bond device receives multiple copies of the same packet | ||
1747 | (one per slave device). | ||
1748 | |||
1749 | The duplicated packet behavior is switch dependent, some | ||
1750 | switches exhibit this, and some do not. On switches that display this | ||
1751 | behavior, it can be induced by clearing the MAC forwarding table (on | ||
1752 | most Cisco switches, the privileged command "clear mac address-table | ||
1753 | dynamic" will accomplish this). | ||
1754 | |||
1755 | 15. Hardware Specific Considerations | ||
1390 | ==================================== | 1756 | ==================================== |
1391 | 1757 | ||
1392 | This section contains additional information for configuring | 1758 | This section contains additional information for configuring |
1393 | bonding on specific hardware platforms, or for interfacing bonding | 1759 | bonding on specific hardware platforms, or for interfacing bonding |
1394 | with particular switches or other devices. | 1760 | with particular switches or other devices. |
1395 | 1761 | ||
1396 | 13.1 IBM BladeCenter | 1762 | 15.1 IBM BladeCenter |
1397 | -------------------- | 1763 | -------------------- |
1398 | 1764 | ||
1399 | This applies to the JS20 and similar systems. | 1765 | This applies to the JS20 and similar systems. |
@@ -1407,12 +1773,12 @@ JS20 network adapter information | |||
1407 | -------------------------------- | 1773 | -------------------------------- |
1408 | 1774 | ||
1409 | All JS20s come with two Broadcom Gigabit Ethernet ports | 1775 | All JS20s come with two Broadcom Gigabit Ethernet ports |
1410 | integrated on the planar. In the BladeCenter chassis, the eth0 port | 1776 | integrated on the planar (that's "motherboard" in IBM-speak). In the |
1411 | of all JS20 blades is hard wired to I/O Module #1; similarly, all eth1 | 1777 | BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to |
1412 | ports are wired to I/O Module #2. An add-on Broadcom daughter card | 1778 | I/O Module #1; similarly, all eth1 ports are wired to I/O Module #2. |
1413 | can be installed on a JS20 to provide two more Gigabit Ethernet ports. | 1779 | An add-on Broadcom daughter card can be installed on a JS20 to provide |
1414 | These ports, eth2 and eth3, are wired to I/O Modules 3 and 4, | 1780 | two more Gigabit Ethernet ports. These ports, eth2 and eth3, are |
1415 | respectively. | 1781 | wired to I/O Modules 3 and 4, respectively. |
1416 | 1782 | ||
1417 | Each I/O Module may contain either a switch or a passthrough | 1783 | Each I/O Module may contain either a switch or a passthrough |
1418 | module (which allows ports to be directly connected to an external | 1784 | module (which allows ports to be directly connected to an external |
@@ -1432,29 +1798,30 @@ BladeCenter networking configuration | |||
1432 | of ways, this discussion will be confined to describing basic | 1798 | of ways, this discussion will be confined to describing basic |
1433 | configurations. | 1799 | configurations. |
1434 | 1800 | ||
1435 | Normally, Ethernet Switch Modules (ESM) are used in I/O | 1801 | Normally, Ethernet Switch Modules (ESMs) are used in I/O |
1436 | modules 1 and 2. In this configuration, the eth0 and eth1 ports of a | 1802 | modules 1 and 2. In this configuration, the eth0 and eth1 ports of a |
1437 | JS20 will be connected to different internal switches (in the | 1803 | JS20 will be connected to different internal switches (in the |
1438 | respective I/O modules). | 1804 | respective I/O modules). |
1439 | 1805 | ||
1440 | An optical passthru module (OPM) connects the I/O module | 1806 | A passthrough module (OPM or CPM, optical or copper, |
1441 | directly to an external switch. By using OPMs in I/O module #1 and | 1807 | passthrough module) connects the I/O module directly to an external |
1442 | #2, the eth0 and eth1 interfaces of a JS20 can be redirected to the | 1808 | switch. By using PMs in I/O module #1 and #2, the eth0 and eth1 |
1443 | outside world and connected to a common external switch. | 1809 | interfaces of a JS20 can be redirected to the outside world and |
1444 | 1810 | connected to a common external switch. | |
1445 | Depending upon the mix of ESM and OPM modules, the network | 1811 | |
1446 | will appear to bonding as either a single switch topology (all OPM | 1812 | Depending upon the mix of ESMs and PMs, the network will |
1447 | modules) or as a multiple switch topology (one or more ESM modules, | 1813 | appear to bonding as either a single switch topology (all PMs) or as a |
1448 | zero or more OPM modules). It is also possible to connect ESM modules | 1814 | multiple switch topology (one or more ESMs, zero or more PMs). It is |
1449 | together, resulting in a configuration much like the example in "High | 1815 | also possible to connect ESMs together, resulting in a configuration |
1450 | Availability in a multiple switch topology." | 1816 | much like the example in "High Availability in a Multiple Switch |
1451 | 1817 | Topology," above. | |
1452 | Requirements for specifc modes | 1818 | |
1453 | ------------------------------ | 1819 | Requirements for specific modes |
1454 | 1820 | ------------------------------- | |
1455 | The balance-rr mode requires the use of OPM modules for | 1821 | |
1456 | devices in the bond, all connected to an common external switch. That | 1822 | The balance-rr mode requires the use of passthrough modules |
1457 | switch must be configured for "etherchannel" or "trunking" on the | 1823 | for devices in the bond, all connected to an common external switch. |
1824 | That switch must be configured for "etherchannel" or "trunking" on the | ||
1458 | appropriate ports, as is usual for balance-rr. | 1825 | appropriate ports, as is usual for balance-rr. |
1459 | 1826 | ||
1460 | The balance-alb and balance-tlb modes will function with | 1827 | The balance-alb and balance-tlb modes will function with |
@@ -1484,17 +1851,18 @@ connected to the JS20 system. | |||
1484 | Other concerns | 1851 | Other concerns |
1485 | -------------- | 1852 | -------------- |
1486 | 1853 | ||
1487 | The Serial Over LAN link is established over the primary | 1854 | The Serial Over LAN (SoL) link is established over the primary |
1488 | ethernet (eth0) only, therefore, any loss of link to eth0 will result | 1855 | ethernet (eth0) only, therefore, any loss of link to eth0 will result |
1489 | in losing your SoL connection. It will not fail over with other | 1856 | in losing your SoL connection. It will not fail over with other |
1490 | network traffic. | 1857 | network traffic, as the SoL system is beyond the control of the |
1858 | bonding driver. | ||
1491 | 1859 | ||
1492 | It may be desirable to disable spanning tree on the switch | 1860 | It may be desirable to disable spanning tree on the switch |
1493 | (either the internal Ethernet Switch Module, or an external switch) to | 1861 | (either the internal Ethernet Switch Module, or an external switch) to |
1494 | avoid fail-over delays issues when using bonding. | 1862 | avoid fail-over delay issues when using bonding. |
1495 | 1863 | ||
1496 | 1864 | ||
1497 | 14. Frequently Asked Questions | 1865 | 16. Frequently Asked Questions |
1498 | ============================== | 1866 | ============================== |
1499 | 1867 | ||
1500 | 1. Is it SMP safe? | 1868 | 1. Is it SMP safe? |
@@ -1505,8 +1873,8 @@ The new driver was designed to be SMP safe from the start. | |||
1505 | 2. What type of cards will work with it? | 1873 | 2. What type of cards will work with it? |
1506 | 1874 | ||
1507 | Any Ethernet type cards (you can even mix cards - a Intel | 1875 | Any Ethernet type cards (you can even mix cards - a Intel |
1508 | EtherExpress PRO/100 and a 3com 3c905b, for example). They need not | 1876 | EtherExpress PRO/100 and a 3com 3c905b, for example). For most modes, |
1509 | be of the same speed. | 1877 | devices need not be of the same speed. |
1510 | 1878 | ||
1511 | 3. How many bonding devices can I have? | 1879 | 3. How many bonding devices can I have? |
1512 | 1880 | ||
@@ -1524,11 +1892,12 @@ system. | |||
1524 | disabled. The active-backup mode will fail over to a backup link, and | 1892 | disabled. The active-backup mode will fail over to a backup link, and |
1525 | other modes will ignore the failed link. The link will continue to be | 1893 | other modes will ignore the failed link. The link will continue to be |
1526 | monitored, and should it recover, it will rejoin the bond (in whatever | 1894 | monitored, and should it recover, it will rejoin the bond (in whatever |
1527 | manner is appropriate for the mode). See the section on High | 1895 | manner is appropriate for the mode). See the sections on High |
1528 | Availability for additional information. | 1896 | Availability and the documentation for each mode for additional |
1897 | information. | ||
1529 | 1898 | ||
1530 | Link monitoring can be enabled via either the miimon or | 1899 | Link monitoring can be enabled via either the miimon or |
1531 | arp_interval paramters (described in the module paramters section, | 1900 | arp_interval parameters (described in the module parameters section, |
1532 | above). In general, miimon monitors the carrier state as sensed by | 1901 | above). In general, miimon monitors the carrier state as sensed by |
1533 | the underlying network device, and the arp monitor (arp_interval) | 1902 | the underlying network device, and the arp monitor (arp_interval) |
1534 | monitors connectivity to another host on the local network. | 1903 | monitors connectivity to another host on the local network. |
@@ -1536,7 +1905,7 @@ monitors connectivity to another host on the local network. | |||
1536 | If no link monitoring is configured, the bonding driver will | 1905 | If no link monitoring is configured, the bonding driver will |
1537 | be unable to detect link failures, and will assume that all links are | 1906 | be unable to detect link failures, and will assume that all links are |
1538 | always available. This will likely result in lost packets, and a | 1907 | always available. This will likely result in lost packets, and a |
1539 | resulting degredation of performance. The precise performance loss | 1908 | resulting degradation of performance. The precise performance loss |
1540 | depends upon the bonding mode and network configuration. | 1909 | depends upon the bonding mode and network configuration. |
1541 | 1910 | ||
1542 | 6. Can bonding be used for High Availability? | 1911 | 6. Can bonding be used for High Availability? |
@@ -1550,12 +1919,12 @@ depends upon the bonding mode and network configuration. | |||
1550 | In the basic balance modes (balance-rr and balance-xor), it | 1919 | In the basic balance modes (balance-rr and balance-xor), it |
1551 | works with any system that supports etherchannel (also called | 1920 | works with any system that supports etherchannel (also called |
1552 | trunking). Most managed switches currently available have such | 1921 | trunking). Most managed switches currently available have such |
1553 | support, and many unmananged switches as well. | 1922 | support, and many unmanaged switches as well. |
1554 | 1923 | ||
1555 | The advanced balance modes (balance-tlb and balance-alb) do | 1924 | The advanced balance modes (balance-tlb and balance-alb) do |
1556 | not have special switch requirements, but do need device drivers that | 1925 | not have special switch requirements, but do need device drivers that |
1557 | support specific features (described in the appropriate section under | 1926 | support specific features (described in the appropriate section under |
1558 | module paramters, above). | 1927 | module parameters, above). |
1559 | 1928 | ||
1560 | In 802.3ad mode, it works with with systems that support IEEE | 1929 | In 802.3ad mode, it works with with systems that support IEEE |
1561 | 802.3ad Dynamic Link Aggregation. Most managed and many unmanaged | 1930 | 802.3ad Dynamic Link Aggregation. Most managed and many unmanaged |
@@ -1565,17 +1934,19 @@ switches currently available support 802.3ad. | |||
1565 | 1934 | ||
1566 | 8. Where does a bonding device get its MAC address from? | 1935 | 8. Where does a bonding device get its MAC address from? |
1567 | 1936 | ||
1568 | If not explicitly configured with ifconfig, the MAC address of | 1937 | If not explicitly configured (with ifconfig or ip link), the |
1569 | the bonding device is taken from its first slave device. This MAC | 1938 | MAC address of the bonding device is taken from its first slave |
1570 | address is then passed to all following slaves and remains persistent | 1939 | device. This MAC address is then passed to all following slaves and |
1571 | (even if the the first slave is removed) until the bonding device is | 1940 | remains persistent (even if the the first slave is removed) until the |
1572 | brought down or reconfigured. | 1941 | bonding device is brought down or reconfigured. |
1573 | 1942 | ||
1574 | If you wish to change the MAC address, you can set it with | 1943 | If you wish to change the MAC address, you can set it with |
1575 | ifconfig: | 1944 | ifconfig or ip link: |
1576 | 1945 | ||
1577 | # ifconfig bond0 hw ether 00:11:22:33:44:55 | 1946 | # ifconfig bond0 hw ether 00:11:22:33:44:55 |
1578 | 1947 | ||
1948 | # ip link set bond0 address 66:77:88:99:aa:bb | ||
1949 | |||
1579 | The MAC address can be also changed by bringing down/up the | 1950 | The MAC address can be also changed by bringing down/up the |
1580 | device and then changing its slaves (or their order): | 1951 | device and then changing its slaves (or their order): |
1581 | 1952 | ||
@@ -1591,23 +1962,28 @@ from the bond (`ifenslave -d bond0 eth0'). The bonding driver will | |||
1591 | then restore the MAC addresses that the slaves had before they were | 1962 | then restore the MAC addresses that the slaves had before they were |
1592 | enslaved. | 1963 | enslaved. |
1593 | 1964 | ||
1594 | 15. Resources and Links | 1965 | 16. Resources and Links |
1595 | ======================= | 1966 | ======================= |
1596 | 1967 | ||
1597 | The latest version of the bonding driver can be found in the latest | 1968 | The latest version of the bonding driver can be found in the latest |
1598 | version of the linux kernel, found on http://kernel.org | 1969 | version of the linux kernel, found on http://kernel.org |
1599 | 1970 | ||
1971 | The latest version of this document can be found in either the latest | ||
1972 | kernel source (named Documentation/networking/bonding.txt), or on the | ||
1973 | bonding sourceforge site: | ||
1974 | |||
1975 | http://www.sourceforge.net/projects/bonding | ||
1976 | |||
1600 | Discussions regarding the bonding driver take place primarily on the | 1977 | Discussions regarding the bonding driver take place primarily on the |
1601 | bonding-devel mailing list, hosted at sourceforge.net. If you have | 1978 | bonding-devel mailing list, hosted at sourceforge.net. If you have |
1602 | questions or problems, post them to the list. | 1979 | questions or problems, post them to the list. The list address is: |
1603 | 1980 | ||
1604 | bonding-devel@lists.sourceforge.net | 1981 | bonding-devel@lists.sourceforge.net |
1605 | 1982 | ||
1606 | https://lists.sourceforge.net/lists/listinfo/bonding-devel | 1983 | The administrative interface (to subscribe or unsubscribe) can |
1607 | 1984 | be found at: | |
1608 | There is also a project site on sourceforge. | ||
1609 | 1985 | ||
1610 | http://www.sourceforge.net/projects/bonding | 1986 | https://lists.sourceforge.net/lists/listinfo/bonding-devel |
1611 | 1987 | ||
1612 | Donald Becker's Ethernet Drivers and diag programs may be found at : | 1988 | Donald Becker's Ethernet Drivers and diag programs may be found at : |
1613 | - http://www.scyld.com/network/ | 1989 | - http://www.scyld.com/network/ |
diff --git a/Documentation/networking/dmfe.txt b/Documentation/networking/dmfe.txt index c0e8398674ef..046363552d09 100644 --- a/Documentation/networking/dmfe.txt +++ b/Documentation/networking/dmfe.txt | |||
@@ -1,59 +1,65 @@ | |||
1 | dmfe.c: Version 1.28 01/18/2000 | 1 | Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux. |
2 | 2 | ||
3 | A Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux. | 3 | This program is free software; you can redistribute it and/or |
4 | Copyright (C) 1997 Sten Wang | 4 | modify it under the terms of the GNU General Public License |
5 | as published by the Free Software Foundation; either version 2 | ||
6 | of the License, or (at your option) any later version. | ||
5 | 7 | ||
6 | This program is free software; you can redistribute it and/or | 8 | This program is distributed in the hope that it will be useful, |
7 | modify it under the terms of the GNU General Public License | 9 | but WITHOUT ANY WARRANTY; without even the implied warranty of |
8 | as published by the Free Software Foundation; either version 2 | 10 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
9 | of the License, or (at your option) any later version. | 11 | GNU General Public License for more details. |
10 | 12 | ||
11 | This program is distributed in the hope that it will be useful, | ||
12 | but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
13 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
14 | GNU General Public License for more details. | ||
15 | 13 | ||
14 | This driver provides kernel support for Davicom DM9102(A)/DM9132/DM9801 ethernet cards ( CNET | ||
15 | 10/100 ethernet cards uses Davicom chipset too, so this driver supports CNET cards too ).If you | ||
16 | didn't compile this driver as a module, it will automatically load itself on boot and print a | ||
17 | line similar to : | ||
16 | 18 | ||
17 | A. Compiler command: | 19 | dmfe: Davicom DM9xxx net driver, version 1.36.4 (2002-01-17) |
18 | 20 | ||
19 | A-1: For normal single or multiple processor kernel | 21 | If you compiled this driver as a module, you have to load it on boot.You can load it with command : |
20 | "gcc -DMODULE -D__KERNEL__ -I/usr/src/linux/net/inet -Wall | ||
21 | -Wstrict-prototypes -O6 -c dmfe.c" | ||
22 | 22 | ||
23 | A-2: For single or multiple processor with kernel module version function | 23 | insmod dmfe |
24 | "gcc -DMODULE -DMODVERSIONS -D__KERNEL__ -I/usr/src/linux/net/inet | ||
25 | -Wall -Wstrict-prototypes -O6 -c dmfe.c" | ||
26 | 24 | ||
25 | This way it will autodetect the device mode.This is the suggested way to load the module.Or you can pass | ||
26 | a mode= setting to module while loading, like : | ||
27 | 27 | ||
28 | B. The following steps teach you how to activate a DM9102 board: | 28 | insmod dmfe mode=0 # Force 10M Half Duplex |
29 | insmod dmfe mode=1 # Force 100M Half Duplex | ||
30 | insmod dmfe mode=4 # Force 10M Full Duplex | ||
31 | insmod dmfe mode=5 # Force 100M Full Duplex | ||
29 | 32 | ||
30 | 1. Used the upper compiler command to compile dmfe.c | 33 | Next you should configure your network interface with a command similar to : |
31 | 34 | ||
32 | 2. Insert dmfe module into kernel | 35 | ifconfig eth0 172.22.3.18 |
33 | "insmod dmfe" ;;Auto Detection Mode (Suggest) | 36 | ^^^^^^^^^^^ |
34 | "insmod dmfe mode=0" ;;Force 10M Half Duplex | 37 | Your IP Adress |
35 | "insmod dmfe mode=1" ;;Force 100M Half Duplex | ||
36 | "insmod dmfe mode=4" ;;Force 10M Full Duplex | ||
37 | "insmod dmfe mode=5" ;;Force 100M Full Duplex | ||
38 | 38 | ||
39 | 3. Config a dm9102 network interface | 39 | Then you may have to modify the default routing table with command : |
40 | "ifconfig eth0 172.22.3.18" | ||
41 | ^^^^^^^^^^^ Your IP address | ||
42 | 40 | ||
43 | 4. Activate the IP routing table. For some distributions, it is not | 41 | route add default eth0 |
44 | necessary. You can type "route" to check. | ||
45 | 42 | ||
46 | "route add default eth0" | ||
47 | 43 | ||
44 | Now your ethernet card should be up and running. | ||
48 | 45 | ||
49 | 5. Well done. Your DM9102 adapter is now activated. | ||
50 | 46 | ||
47 | TODO: | ||
51 | 48 | ||
52 | C. Object files description: | 49 | Implement pci_driver::suspend() and pci_driver::resume() power management methods. |
53 | 1. dmfe_rh61.o: For Redhat 6.1 | 50 | Check on 64 bit boxes. |
51 | Check and fix on big endian boxes. | ||
52 | Test and make sure PCI latency is now correct for all cases. | ||
54 | 53 | ||
55 | If you can make sure your kernel version, you can rename | ||
56 | to dmfe.o and directly use it without re-compiling. | ||
57 | 54 | ||
55 | Authors: | ||
58 | 56 | ||
59 | Author: Sten Wang, 886-3-5798797-8517, E-mail: sten_wang@davicom.com.tw | 57 | Sten Wang <sten_wang@davicom.com.tw > : Original Author |
58 | Tobias Ringstrom <tori@unhappy.mine.nu> : Current Maintainer | ||
59 | |||
60 | Contributors: | ||
61 | |||
62 | Marcelo Tosatti <marcelo@conectiva.com.br> | ||
63 | Alan Cox <alan@redhat.com> | ||
64 | Jeff Garzik <jgarzik@pobox.com> | ||
65 | Vojtech Pavlik <vojtech@suse.cz> | ||
diff --git a/Documentation/networking/fib_trie.txt b/Documentation/networking/fib_trie.txt new file mode 100644 index 000000000000..f50d0c673c57 --- /dev/null +++ b/Documentation/networking/fib_trie.txt | |||
@@ -0,0 +1,145 @@ | |||
1 | LC-trie implementation notes. | ||
2 | |||
3 | Node types | ||
4 | ---------- | ||
5 | leaf | ||
6 | An end node with data. This has a copy of the relevant key, along | ||
7 | with 'hlist' with routing table entries sorted by prefix length. | ||
8 | See struct leaf and struct leaf_info. | ||
9 | |||
10 | trie node or tnode | ||
11 | An internal node, holding an array of child (leaf or tnode) pointers, | ||
12 | indexed through a subset of the key. See Level Compression. | ||
13 | |||
14 | A few concepts explained | ||
15 | ------------------------ | ||
16 | Bits (tnode) | ||
17 | The number of bits in the key segment used for indexing into the | ||
18 | child array - the "child index". See Level Compression. | ||
19 | |||
20 | Pos (tnode) | ||
21 | The position (in the key) of the key segment used for indexing into | ||
22 | the child array. See Path Compression. | ||
23 | |||
24 | Path Compression / skipped bits | ||
25 | Any given tnode is linked to from the child array of its parent, using | ||
26 | a segment of the key specified by the parent's "pos" and "bits" | ||
27 | In certain cases, this tnode's own "pos" will not be immediately | ||
28 | adjacent to the parent (pos+bits), but there will be some bits | ||
29 | in the key skipped over because they represent a single path with no | ||
30 | deviations. These "skipped bits" constitute Path Compression. | ||
31 | Note that the search algorithm will simply skip over these bits when | ||
32 | searching, making it necessary to save the keys in the leaves to | ||
33 | verify that they actually do match the key we are searching for. | ||
34 | |||
35 | Level Compression / child arrays | ||
36 | the trie is kept level balanced moving, under certain conditions, the | ||
37 | children of a full child (see "full_children") up one level, so that | ||
38 | instead of a pure binary tree, each internal node ("tnode") may | ||
39 | contain an arbitrarily large array of links to several children. | ||
40 | Conversely, a tnode with a mostly empty child array (see empty_children) | ||
41 | may be "halved", having some of its children moved downwards one level, | ||
42 | in order to avoid ever-increasing child arrays. | ||
43 | |||
44 | empty_children | ||
45 | the number of positions in the child array of a given tnode that are | ||
46 | NULL. | ||
47 | |||
48 | full_children | ||
49 | the number of children of a given tnode that aren't path compressed. | ||
50 | (in other words, they aren't NULL or leaves and their "pos" is equal | ||
51 | to this tnode's "pos"+"bits"). | ||
52 | |||
53 | (The word "full" here is used more in the sense of "complete" than | ||
54 | as the opposite of "empty", which might be a tad confusing.) | ||
55 | |||
56 | Comments | ||
57 | --------- | ||
58 | |||
59 | We have tried to keep the structure of the code as close to fib_hash as | ||
60 | possible to allow verification and help up reviewing. | ||
61 | |||
62 | fib_find_node() | ||
63 | A good start for understanding this code. This function implements a | ||
64 | straightforward trie lookup. | ||
65 | |||
66 | fib_insert_node() | ||
67 | Inserts a new leaf node in the trie. This is bit more complicated than | ||
68 | fib_find_node(). Inserting a new node means we might have to run the | ||
69 | level compression algorithm on part of the trie. | ||
70 | |||
71 | trie_leaf_remove() | ||
72 | Looks up a key, deletes it and runs the level compression algorithm. | ||
73 | |||
74 | trie_rebalance() | ||
75 | The key function for the dynamic trie after any change in the trie | ||
76 | it is run to optimize and reorganize. Tt will walk the trie upwards | ||
77 | towards the root from a given tnode, doing a resize() at each step | ||
78 | to implement level compression. | ||
79 | |||
80 | resize() | ||
81 | Analyzes a tnode and optimizes the child array size by either inflating | ||
82 | or shrinking it repeatedly until it fullfills the criteria for optimal | ||
83 | level compression. This part follows the original paper pretty closely | ||
84 | and there may be some room for experimentation here. | ||
85 | |||
86 | inflate() | ||
87 | Doubles the size of the child array within a tnode. Used by resize(). | ||
88 | |||
89 | halve() | ||
90 | Halves the size of the child array within a tnode - the inverse of | ||
91 | inflate(). Used by resize(); | ||
92 | |||
93 | fn_trie_insert(), fn_trie_delete(), fn_trie_select_default() | ||
94 | The route manipulation functions. Should conform pretty closely to the | ||
95 | corresponding functions in fib_hash. | ||
96 | |||
97 | fn_trie_flush() | ||
98 | This walks the full trie (using nextleaf()) and searches for empty | ||
99 | leaves which have to be removed. | ||
100 | |||
101 | fn_trie_dump() | ||
102 | Dumps the routing table ordered by prefix length. This is somewhat | ||
103 | slower than the corresponding fib_hash function, as we have to walk the | ||
104 | entire trie for each prefix length. In comparison, fib_hash is organized | ||
105 | as one "zone"/hash per prefix length. | ||
106 | |||
107 | Locking | ||
108 | ------- | ||
109 | |||
110 | fib_lock is used for an RW-lock in the same way that this is done in fib_hash. | ||
111 | However, the functions are somewhat separated for other possible locking | ||
112 | scenarios. It might conceivably be possible to run trie_rebalance via RCU | ||
113 | to avoid read_lock in the fn_trie_lookup() function. | ||
114 | |||
115 | Main lookup mechanism | ||
116 | --------------------- | ||
117 | fn_trie_lookup() is the main lookup function. | ||
118 | |||
119 | The lookup is in its simplest form just like fib_find_node(). We descend the | ||
120 | trie, key segment by key segment, until we find a leaf. check_leaf() does | ||
121 | the fib_semantic_match in the leaf's sorted prefix hlist. | ||
122 | |||
123 | If we find a match, we are done. | ||
124 | |||
125 | If we don't find a match, we enter prefix matching mode. The prefix length, | ||
126 | starting out at the same as the key length, is reduced one step at a time, | ||
127 | and we backtrack upwards through the trie trying to find a longest matching | ||
128 | prefix. The goal is always to reach a leaf and get a positive result from the | ||
129 | fib_semantic_match mechanism. | ||
130 | |||
131 | Inside each tnode, the search for longest matching prefix consists of searching | ||
132 | through the child array, chopping off (zeroing) the least significant "1" of | ||
133 | the child index until we find a match or the child index consists of nothing but | ||
134 | zeros. | ||
135 | |||
136 | At this point we backtrack (t->stats.backtrack++) up the trie, continuing to | ||
137 | chop off part of the key in order to find the longest matching prefix. | ||
138 | |||
139 | At this point we will repeatedly descend subtries to look for a match, and there | ||
140 | are some optimizations available that can provide us with "shortcuts" to avoid | ||
141 | descending into dead ends. Look for "HL_OPTIMIZE" sections in the code. | ||
142 | |||
143 | To alleviate any doubts about the correctness of the route selection process, | ||
144 | a new netlink operation has been added. Look for NETLINK_FIB_LOOKUP, which | ||
145 | gives userland access to fib_lookup(). | ||
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index a2c893a7475d..ab65714d95fc 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt | |||
@@ -304,57 +304,6 @@ tcp_low_latency - BOOLEAN | |||
304 | changed would be a Beowulf compute cluster. | 304 | changed would be a Beowulf compute cluster. |
305 | Default: 0 | 305 | Default: 0 |
306 | 306 | ||
307 | tcp_westwood - BOOLEAN | ||
308 | Enable TCP Westwood+ congestion control algorithm. | ||
309 | TCP Westwood+ is a sender-side only modification of the TCP Reno | ||
310 | protocol stack that optimizes the performance of TCP congestion | ||
311 | control. It is based on end-to-end bandwidth estimation to set | ||
312 | congestion window and slow start threshold after a congestion | ||
313 | episode. Using this estimation, TCP Westwood+ adaptively sets a | ||
314 | slow start threshold and a congestion window which takes into | ||
315 | account the bandwidth used at the time congestion is experienced. | ||
316 | TCP Westwood+ significantly increases fairness wrt TCP Reno in | ||
317 | wired networks and throughput over wireless links. | ||
318 | Default: 0 | ||
319 | |||
320 | tcp_vegas_cong_avoid - BOOLEAN | ||
321 | Enable TCP Vegas congestion avoidance algorithm. | ||
322 | TCP Vegas is a sender-side only change to TCP that anticipates | ||
323 | the onset of congestion by estimating the bandwidth. TCP Vegas | ||
324 | adjusts the sending rate by modifying the congestion | ||
325 | window. TCP Vegas should provide less packet loss, but it is | ||
326 | not as aggressive as TCP Reno. | ||
327 | Default:0 | ||
328 | |||
329 | tcp_bic - BOOLEAN | ||
330 | Enable BIC TCP congestion control algorithm. | ||
331 | BIC-TCP is a sender-side only change that ensures a linear RTT | ||
332 | fairness under large windows while offering both scalability and | ||
333 | bounded TCP-friendliness. The protocol combines two schemes | ||
334 | called additive increase and binary search increase. When the | ||
335 | congestion window is large, additive increase with a large | ||
336 | increment ensures linear RTT fairness as well as good | ||
337 | scalability. Under small congestion windows, binary search | ||
338 | increase provides TCP friendliness. | ||
339 | Default: 0 | ||
340 | |||
341 | tcp_bic_low_window - INTEGER | ||
342 | Sets the threshold window (in packets) where BIC TCP starts to | ||
343 | adjust the congestion window. Below this threshold BIC TCP behaves | ||
344 | the same as the default TCP Reno. | ||
345 | Default: 14 | ||
346 | |||
347 | tcp_bic_fast_convergence - BOOLEAN | ||
348 | Forces BIC TCP to more quickly respond to changes in congestion | ||
349 | window. Allows two flows sharing the same connection to converge | ||
350 | more rapidly. | ||
351 | Default: 1 | ||
352 | |||
353 | tcp_default_win_scale - INTEGER | ||
354 | Sets the minimum window scale TCP will negotiate for on all | ||
355 | conections. | ||
356 | Default: 7 | ||
357 | |||
358 | tcp_tso_win_divisor - INTEGER | 307 | tcp_tso_win_divisor - INTEGER |
359 | This allows control over what percentage of the congestion window | 308 | This allows control over what percentage of the congestion window |
360 | can be consumed by a single TSO frame. | 309 | can be consumed by a single TSO frame. |
@@ -368,6 +317,11 @@ tcp_frto - BOOLEAN | |||
368 | where packet loss is typically due to random radio interference | 317 | where packet loss is typically due to random radio interference |
369 | rather than intermediate router congestion. | 318 | rather than intermediate router congestion. |
370 | 319 | ||
320 | tcp_congestion_control - STRING | ||
321 | Set the congestion control algorithm to be used for new | ||
322 | connections. The algorithm "reno" is always available, but | ||
323 | additional choices may be available based on kernel configuration. | ||
324 | |||
371 | somaxconn - INTEGER | 325 | somaxconn - INTEGER |
372 | Limit of socket listen() backlog, known in userspace as SOMAXCONN. | 326 | Limit of socket listen() backlog, known in userspace as SOMAXCONN. |
373 | Defaults to 128. See also tcp_max_syn_backlog for additional tuning | 327 | Defaults to 128. See also tcp_max_syn_backlog for additional tuning |
diff --git a/Documentation/networking/phy.txt b/Documentation/networking/phy.txt new file mode 100644 index 000000000000..29ccae409031 --- /dev/null +++ b/Documentation/networking/phy.txt | |||
@@ -0,0 +1,288 @@ | |||
1 | |||
2 | ------- | ||
3 | PHY Abstraction Layer | ||
4 | (Updated 2005-07-21) | ||
5 | |||
6 | Purpose | ||
7 | |||
8 | Most network devices consist of set of registers which provide an interface | ||
9 | to a MAC layer, which communicates with the physical connection through a | ||
10 | PHY. The PHY concerns itself with negotiating link parameters with the link | ||
11 | partner on the other side of the network connection (typically, an ethernet | ||
12 | cable), and provides a register interface to allow drivers to determine what | ||
13 | settings were chosen, and to configure what settings are allowed. | ||
14 | |||
15 | While these devices are distinct from the network devices, and conform to a | ||
16 | standard layout for the registers, it has been common practice to integrate | ||
17 | the PHY management code with the network driver. This has resulted in large | ||
18 | amounts of redundant code. Also, on embedded systems with multiple (and | ||
19 | sometimes quite different) ethernet controllers connected to the same | ||
20 | management bus, it is difficult to ensure safe use of the bus. | ||
21 | |||
22 | Since the PHYs are devices, and the management busses through which they are | ||
23 | accessed are, in fact, busses, the PHY Abstraction Layer treats them as such. | ||
24 | In doing so, it has these goals: | ||
25 | |||
26 | 1) Increase code-reuse | ||
27 | 2) Increase overall code-maintainability | ||
28 | 3) Speed development time for new network drivers, and for new systems | ||
29 | |||
30 | Basically, this layer is meant to provide an interface to PHY devices which | ||
31 | allows network driver writers to write as little code as possible, while | ||
32 | still providing a full feature set. | ||
33 | |||
34 | The MDIO bus | ||
35 | |||
36 | Most network devices are connected to a PHY by means of a management bus. | ||
37 | Different devices use different busses (though some share common interfaces). | ||
38 | In order to take advantage of the PAL, each bus interface needs to be | ||
39 | registered as a distinct device. | ||
40 | |||
41 | 1) read and write functions must be implemented. Their prototypes are: | ||
42 | |||
43 | int write(struct mii_bus *bus, int mii_id, int regnum, u16 value); | ||
44 | int read(struct mii_bus *bus, int mii_id, int regnum); | ||
45 | |||
46 | mii_id is the address on the bus for the PHY, and regnum is the register | ||
47 | number. These functions are guaranteed not to be called from interrupt | ||
48 | time, so it is safe for them to block, waiting for an interrupt to signal | ||
49 | the operation is complete | ||
50 | |||
51 | 2) A reset function is necessary. This is used to return the bus to an | ||
52 | initialized state. | ||
53 | |||
54 | 3) A probe function is needed. This function should set up anything the bus | ||
55 | driver needs, setup the mii_bus structure, and register with the PAL using | ||
56 | mdiobus_register. Similarly, there's a remove function to undo all of | ||
57 | that (use mdiobus_unregister). | ||
58 | |||
59 | 4) Like any driver, the device_driver structure must be configured, and init | ||
60 | exit functions are used to register the driver. | ||
61 | |||
62 | 5) The bus must also be declared somewhere as a device, and registered. | ||
63 | |||
64 | As an example for how one driver implemented an mdio bus driver, see | ||
65 | drivers/net/gianfar_mii.c and arch/ppc/syslib/mpc85xx_devices.c | ||
66 | |||
67 | Connecting to a PHY | ||
68 | |||
69 | Sometime during startup, the network driver needs to establish a connection | ||
70 | between the PHY device, and the network device. At this time, the PHY's bus | ||
71 | and drivers need to all have been loaded, so it is ready for the connection. | ||
72 | At this point, there are several ways to connect to the PHY: | ||
73 | |||
74 | 1) The PAL handles everything, and only calls the network driver when | ||
75 | the link state changes, so it can react. | ||
76 | |||
77 | 2) The PAL handles everything except interrupts (usually because the | ||
78 | controller has the interrupt registers). | ||
79 | |||
80 | 3) The PAL handles everything, but checks in with the driver every second, | ||
81 | allowing the network driver to react first to any changes before the PAL | ||
82 | does. | ||
83 | |||
84 | 4) The PAL serves only as a library of functions, with the network device | ||
85 | manually calling functions to update status, and configure the PHY | ||
86 | |||
87 | |||
88 | Letting the PHY Abstraction Layer do Everything | ||
89 | |||
90 | If you choose option 1 (The hope is that every driver can, but to still be | ||
91 | useful to drivers that can't), connecting to the PHY is simple: | ||
92 | |||
93 | First, you need a function to react to changes in the link state. This | ||
94 | function follows this protocol: | ||
95 | |||
96 | static void adjust_link(struct net_device *dev); | ||
97 | |||
98 | Next, you need to know the device name of the PHY connected to this device. | ||
99 | The name will look something like, "phy0:0", where the first number is the | ||
100 | bus id, and the second is the PHY's address on that bus. | ||
101 | |||
102 | Now, to connect, just call this function: | ||
103 | |||
104 | phydev = phy_connect(dev, phy_name, &adjust_link, flags); | ||
105 | |||
106 | phydev is a pointer to the phy_device structure which represents the PHY. If | ||
107 | phy_connect is successful, it will return the pointer. dev, here, is the | ||
108 | pointer to your net_device. Once done, this function will have started the | ||
109 | PHY's software state machine, and registered for the PHY's interrupt, if it | ||
110 | has one. The phydev structure will be populated with information about the | ||
111 | current state, though the PHY will not yet be truly operational at this | ||
112 | point. | ||
113 | |||
114 | flags is a u32 which can optionally contain phy-specific flags. | ||
115 | This is useful if the system has put hardware restrictions on | ||
116 | the PHY/controller, of which the PHY needs to be aware. | ||
117 | |||
118 | Now just make sure that phydev->supported and phydev->advertising have any | ||
119 | values pruned from them which don't make sense for your controller (a 10/100 | ||
120 | controller may be connected to a gigabit capable PHY, so you would need to | ||
121 | mask off SUPPORTED_1000baseT*). See include/linux/ethtool.h for definitions | ||
122 | for these bitfields. Note that you should not SET any bits, or the PHY may | ||
123 | get put into an unsupported state. | ||
124 | |||
125 | Lastly, once the controller is ready to handle network traffic, you call | ||
126 | phy_start(phydev). This tells the PAL that you are ready, and configures the | ||
127 | PHY to connect to the network. If you want to handle your own interrupts, | ||
128 | just set phydev->irq to PHY_IGNORE_INTERRUPT before you call phy_start. | ||
129 | Similarly, if you don't want to use interrupts, set phydev->irq to PHY_POLL. | ||
130 | |||
131 | When you want to disconnect from the network (even if just briefly), you call | ||
132 | phy_stop(phydev). | ||
133 | |||
134 | Keeping Close Tabs on the PAL | ||
135 | |||
136 | It is possible that the PAL's built-in state machine needs a little help to | ||
137 | keep your network device and the PHY properly in sync. If so, you can | ||
138 | register a helper function when connecting to the PHY, which will be called | ||
139 | every second before the state machine reacts to any changes. To do this, you | ||
140 | need to manually call phy_attach() and phy_prepare_link(), and then call | ||
141 | phy_start_machine() with the second argument set to point to your special | ||
142 | handler. | ||
143 | |||
144 | Currently there are no examples of how to use this functionality, and testing | ||
145 | on it has been limited because the author does not have any drivers which use | ||
146 | it (they all use option 1). So Caveat Emptor. | ||
147 | |||
148 | Doing it all yourself | ||
149 | |||
150 | There's a remote chance that the PAL's built-in state machine cannot track | ||
151 | the complex interactions between the PHY and your network device. If this is | ||
152 | so, you can simply call phy_attach(), and not call phy_start_machine or | ||
153 | phy_prepare_link(). This will mean that phydev->state is entirely yours to | ||
154 | handle (phy_start and phy_stop toggle between some of the states, so you | ||
155 | might need to avoid them). | ||
156 | |||
157 | An effort has been made to make sure that useful functionality can be | ||
158 | accessed without the state-machine running, and most of these functions are | ||
159 | descended from functions which did not interact with a complex state-machine. | ||
160 | However, again, no effort has been made so far to test running without the | ||
161 | state machine, so tryer beware. | ||
162 | |||
163 | Here is a brief rundown of the functions: | ||
164 | |||
165 | int phy_read(struct phy_device *phydev, u16 regnum); | ||
166 | int phy_write(struct phy_device *phydev, u16 regnum, u16 val); | ||
167 | |||
168 | Simple read/write primitives. They invoke the bus's read/write function | ||
169 | pointers. | ||
170 | |||
171 | void phy_print_status(struct phy_device *phydev); | ||
172 | |||
173 | A convenience function to print out the PHY status neatly. | ||
174 | |||
175 | int phy_clear_interrupt(struct phy_device *phydev); | ||
176 | int phy_config_interrupt(struct phy_device *phydev, u32 interrupts); | ||
177 | |||
178 | Clear the PHY's interrupt, and configure which ones are allowed, | ||
179 | respectively. Currently only supports all on, or all off. | ||
180 | |||
181 | int phy_enable_interrupts(struct phy_device *phydev); | ||
182 | int phy_disable_interrupts(struct phy_device *phydev); | ||
183 | |||
184 | Functions which enable/disable PHY interrupts, clearing them | ||
185 | before and after, respectively. | ||
186 | |||
187 | int phy_start_interrupts(struct phy_device *phydev); | ||
188 | int phy_stop_interrupts(struct phy_device *phydev); | ||
189 | |||
190 | Requests the IRQ for the PHY interrupts, then enables them for | ||
191 | start, or disables then frees them for stop. | ||
192 | |||
193 | struct phy_device * phy_attach(struct net_device *dev, const char *phy_id, | ||
194 | u32 flags); | ||
195 | |||
196 | Attaches a network device to a particular PHY, binding the PHY to a generic | ||
197 | driver if none was found during bus initialization. Passes in | ||
198 | any phy-specific flags as needed. | ||
199 | |||
200 | int phy_start_aneg(struct phy_device *phydev); | ||
201 | |||
202 | Using variables inside the phydev structure, either configures advertising | ||
203 | and resets autonegotiation, or disables autonegotiation, and configures | ||
204 | forced settings. | ||
205 | |||
206 | static inline int phy_read_status(struct phy_device *phydev); | ||
207 | |||
208 | Fills the phydev structure with up-to-date information about the current | ||
209 | settings in the PHY. | ||
210 | |||
211 | void phy_sanitize_settings(struct phy_device *phydev) | ||
212 | |||
213 | Resolves differences between currently desired settings, and | ||
214 | supported settings for the given PHY device. Does not make | ||
215 | the changes in the hardware, though. | ||
216 | |||
217 | int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd); | ||
218 | int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd); | ||
219 | |||
220 | Ethtool convenience functions. | ||
221 | |||
222 | int phy_mii_ioctl(struct phy_device *phydev, | ||
223 | struct mii_ioctl_data *mii_data, int cmd); | ||
224 | |||
225 | The MII ioctl. Note that this function will completely screw up the state | ||
226 | machine if you write registers like BMCR, BMSR, ADVERTISE, etc. Best to | ||
227 | use this only to write registers which are not standard, and don't set off | ||
228 | a renegotiation. | ||
229 | |||
230 | |||
231 | PHY Device Drivers | ||
232 | |||
233 | With the PHY Abstraction Layer, adding support for new PHYs is | ||
234 | quite easy. In some cases, no work is required at all! However, | ||
235 | many PHYs require a little hand-holding to get up-and-running. | ||
236 | |||
237 | Generic PHY driver | ||
238 | |||
239 | If the desired PHY doesn't have any errata, quirks, or special | ||
240 | features you want to support, then it may be best to not add | ||
241 | support, and let the PHY Abstraction Layer's Generic PHY Driver | ||
242 | do all of the work. | ||
243 | |||
244 | Writing a PHY driver | ||
245 | |||
246 | If you do need to write a PHY driver, the first thing to do is | ||
247 | make sure it can be matched with an appropriate PHY device. | ||
248 | This is done during bus initialization by reading the device's | ||
249 | UID (stored in registers 2 and 3), then comparing it to each | ||
250 | driver's phy_id field by ANDing it with each driver's | ||
251 | phy_id_mask field. Also, it needs a name. Here's an example: | ||
252 | |||
253 | static struct phy_driver dm9161_driver = { | ||
254 | .phy_id = 0x0181b880, | ||
255 | .name = "Davicom DM9161E", | ||
256 | .phy_id_mask = 0x0ffffff0, | ||
257 | ... | ||
258 | } | ||
259 | |||
260 | Next, you need to specify what features (speed, duplex, autoneg, | ||
261 | etc) your PHY device and driver support. Most PHYs support | ||
262 | PHY_BASIC_FEATURES, but you can look in include/mii.h for other | ||
263 | features. | ||
264 | |||
265 | Each driver consists of a number of function pointers: | ||
266 | |||
267 | config_init: configures PHY into a sane state after a reset. | ||
268 | For instance, a Davicom PHY requires descrambling disabled. | ||
269 | probe: Does any setup needed by the driver | ||
270 | suspend/resume: power management | ||
271 | config_aneg: Changes the speed/duplex/negotiation settings | ||
272 | read_status: Reads the current speed/duplex/negotiation settings | ||
273 | ack_interrupt: Clear a pending interrupt | ||
274 | config_intr: Enable or disable interrupts | ||
275 | remove: Does any driver take-down | ||
276 | |||
277 | Of these, only config_aneg and read_status are required to be | ||
278 | assigned by the driver code. The rest are optional. Also, it is | ||
279 | preferred to use the generic phy driver's versions of these two | ||
280 | functions if at all possible: genphy_read_status and | ||
281 | genphy_config_aneg. If this is not possible, it is likely that | ||
282 | you only need to perform some actions before and after invoking | ||
283 | these functions, and so your functions will wrap the generic | ||
284 | ones. | ||
285 | |||
286 | Feel free to look at the Marvell, Cicada, and Davicom drivers in | ||
287 | drivers/net/phy/ for examples (the lxt and qsemi drivers have | ||
288 | not been tested as of this writing) | ||
diff --git a/Documentation/networking/tcp.txt b/Documentation/networking/tcp.txt index 71749007091e..0fa300425575 100644 --- a/Documentation/networking/tcp.txt +++ b/Documentation/networking/tcp.txt | |||
@@ -1,5 +1,72 @@ | |||
1 | How the new TCP output machine [nyi] works. | 1 | TCP protocol |
2 | ============ | ||
3 | |||
4 | Last updated: 21 June 2005 | ||
5 | |||
6 | Contents | ||
7 | ======== | ||
8 | |||
9 | - Congestion control | ||
10 | - How the new TCP output machine [nyi] works | ||
11 | |||
12 | Congestion control | ||
13 | ================== | ||
14 | |||
15 | The following variables are used in the tcp_sock for congestion control: | ||
16 | snd_cwnd The size of the congestion window | ||
17 | snd_ssthresh Slow start threshold. We are in slow start if | ||
18 | snd_cwnd is less than this. | ||
19 | snd_cwnd_cnt A counter used to slow down the rate of increase | ||
20 | once we exceed slow start threshold. | ||
21 | snd_cwnd_clamp This is the maximum size that snd_cwnd can grow to. | ||
22 | snd_cwnd_stamp Timestamp for when congestion window last validated. | ||
23 | snd_cwnd_used Used as a highwater mark for how much of the | ||
24 | congestion window is in use. It is used to adjust | ||
25 | snd_cwnd down when the link is limited by the | ||
26 | application rather than the network. | ||
27 | |||
28 | As of 2.6.13, Linux supports pluggable congestion control algorithms. | ||
29 | A congestion control mechanism can be registered through functions in | ||
30 | tcp_cong.c. The functions used by the congestion control mechanism are | ||
31 | registered via passing a tcp_congestion_ops struct to | ||
32 | tcp_register_congestion_control. As a minimum name, ssthresh, | ||
33 | cong_avoid, min_cwnd must be valid. | ||
2 | 34 | ||
35 | Private data for a congestion control mechanism is stored in tp->ca_priv. | ||
36 | tcp_ca(tp) returns a pointer to this space. This is preallocated space - it | ||
37 | is important to check the size of your private data will fit this space, or | ||
38 | alternatively space could be allocated elsewhere and a pointer to it could | ||
39 | be stored here. | ||
40 | |||
41 | There are three kinds of congestion control algorithms currently: The | ||
42 | simplest ones are derived from TCP reno (highspeed, scalable) and just | ||
43 | provide an alternative the congestion window calculation. More complex | ||
44 | ones like BIC try to look at other events to provide better | ||
45 | heuristics. There are also round trip time based algorithms like | ||
46 | Vegas and Westwood+. | ||
47 | |||
48 | Good TCP congestion control is a complex problem because the algorithm | ||
49 | needs to maintain fairness and performance. Please review current | ||
50 | research and RFC's before developing new modules. | ||
51 | |||
52 | The method that is used to determine which congestion control mechanism is | ||
53 | determined by the setting of the sysctl net.ipv4.tcp_congestion_control. | ||
54 | The default congestion control will be the last one registered (LIFO); | ||
55 | so if you built everything as modules. the default will be reno. If you | ||
56 | build with the default's from Kconfig, then BIC will be builtin (not a module) | ||
57 | and it will end up the default. | ||
58 | |||
59 | If you really want a particular default value then you will need | ||
60 | to set it with the sysctl. If you use a sysctl, the module will be autoloaded | ||
61 | if needed and you will get the expected protocol. If you ask for an | ||
62 | unknown congestion method, then the sysctl attempt will fail. | ||
63 | |||
64 | If you remove a tcp congestion control module, then you will get the next | ||
65 | available one. Since reno can not be built as a module, and can not be | ||
66 | deleted, it will always be available. | ||
67 | |||
68 | How the new TCP output machine [nyi] works. | ||
69 | =========================================== | ||
3 | 70 | ||
4 | Data is kept on a single queue. The skb->users flag tells us if the frame is | 71 | Data is kept on a single queue. The skb->users flag tells us if the frame is |
5 | one that has been queued already. To add a frame we throw it on the end. Ack | 72 | one that has been queued already. To add a frame we throw it on the end. Ack |
diff --git a/Documentation/networking/wanpipe.txt b/Documentation/networking/wanpipe.txt deleted file mode 100644 index aea20cd2a56e..000000000000 --- a/Documentation/networking/wanpipe.txt +++ /dev/null | |||
@@ -1,622 +0,0 @@ | |||
1 | ------------------------------------------------------------------------------ | ||
2 | Linux WAN Router Utilities Package | ||
3 | ------------------------------------------------------------------------------ | ||
4 | Version 2.2.1 | ||
5 | Mar 28, 2001 | ||
6 | Author: Nenad Corbic <ncorbic@sangoma.com> | ||
7 | Copyright (c) 1995-2001 Sangoma Technologies Inc. | ||
8 | ------------------------------------------------------------------------------ | ||
9 | |||
10 | INTRODUCTION | ||
11 | |||
12 | Wide Area Networks (WANs) are used to interconnect Local Area Networks (LANs) | ||
13 | and/or stand-alone hosts over vast distances with data transfer rates | ||
14 | significantly higher than those achievable with commonly used dial-up | ||
15 | connections. | ||
16 | |||
17 | Usually an external device called `WAN router' sitting on your local network | ||
18 | or connected to your machine's serial port provides physical connection to | ||
19 | WAN. Although router's job may be as simple as taking your local network | ||
20 | traffic, converting it to WAN format and piping it through the WAN link, these | ||
21 | devices are notoriously expensive, with prices as much as 2 - 5 times higher | ||
22 | then the price of a typical PC box. | ||
23 | |||
24 | Alternatively, considering robustness and multitasking capabilities of Linux, | ||
25 | an internal router can be built (most routers use some sort of stripped down | ||
26 | Unix-like operating system anyway). With a number of relatively inexpensive WAN | ||
27 | interface cards available on the market, a perfectly usable router can be | ||
28 | built for less than half a price of an external router. Yet a Linux box | ||
29 | acting as a router can still be used for other purposes, such as fire-walling, | ||
30 | running FTP, WWW or DNS server, etc. | ||
31 | |||
32 | This kernel module introduces the notion of a WAN Link Driver (WLD) to Linux | ||
33 | operating system and provides generic hardware-independent services for such | ||
34 | drivers. Why can existing Linux network device interface not be used for | ||
35 | this purpose? Well, it can. However, there are a few key differences between | ||
36 | a typical network interface (e.g. Ethernet) and a WAN link. | ||
37 | |||
38 | Many WAN protocols, such as X.25 and frame relay, allow for multiple logical | ||
39 | connections (known as `virtual circuits' in X.25 terminology) over a single | ||
40 | physical link. Each such virtual circuit may (and almost always does) lead | ||
41 | to a different geographical location and, therefore, different network. As a | ||
42 | result, it is the virtual circuit, not the physical link, that represents a | ||
43 | route and, therefore, a network interface in Linux terms. | ||
44 | |||
45 | To further complicate things, virtual circuits are usually volatile in nature | ||
46 | (excluding so called `permanent' virtual circuits or PVCs). With almost no | ||
47 | time required to set up and tear down a virtual circuit, it is highly desirable | ||
48 | to implement on-demand connections in order to minimize network charges. So | ||
49 | unlike a typical network driver, the WAN driver must be able to handle multiple | ||
50 | network interfaces and cope as multiple virtual circuits come into existence | ||
51 | and go away dynamically. | ||
52 | |||
53 | Last, but not least, WAN configuration is much more complex than that of say | ||
54 | Ethernet and may well amount to several dozens of parameters. Some of them | ||
55 | are "link-wide" while others are virtual circuit-specific. The same holds | ||
56 | true for WAN statistics which is by far more extensive and extremely useful | ||
57 | when troubleshooting WAN connections. Extending the ifconfig utility to suit | ||
58 | these needs may be possible, but does not seem quite reasonable. Therefore, a | ||
59 | WAN configuration utility and corresponding application programmer's interface | ||
60 | is needed for this purpose. | ||
61 | |||
62 | Most of these problems are taken care of by this module. Its goal is to | ||
63 | provide a user with more-or-less standard look and feel for all WAN devices and | ||
64 | assist a WAN device driver writer by providing common services, such as: | ||
65 | |||
66 | o User-level interface via /proc file system | ||
67 | o Centralized configuration | ||
68 | o Device management (setup, shutdown, etc.) | ||
69 | o Network interface management (dynamic creation/destruction) | ||
70 | o Protocol encapsulation/decapsulation | ||
71 | |||
72 | To ba able to use the Linux WAN Router you will also need a WAN Tools package | ||
73 | available from | ||
74 | |||
75 | ftp.sangoma.com/pub/linux/current_wanpipe/wanpipe-X.Y.Z.tgz | ||
76 | |||
77 | where vX.Y.Z represent the wanpipe version number. | ||
78 | |||
79 | For technical questions and/or comments please e-mail to ncorbic@sangoma.com. | ||
80 | For general inquiries please contact Sangoma Technologies Inc. by | ||
81 | |||
82 | Hotline: 1-800-388-2475 (USA and Canada, toll free) | ||
83 | Phone: (905) 474-1990 ext: 106 | ||
84 | Fax: (905) 474-9223 | ||
85 | E-mail: dm@sangoma.com (David Mandelstam) | ||
86 | WWW: http://www.sangoma.com | ||
87 | |||
88 | |||
89 | INSTALLATION | ||
90 | |||
91 | Please read the WanpipeForLinux.pdf manual on how to | ||
92 | install the WANPIPE tools and drivers properly. | ||
93 | |||
94 | |||
95 | After installing wanpipe package: /usr/local/wanrouter/doc. | ||
96 | On the ftp.sangoma.com : /linux/current_wanpipe/doc | ||
97 | |||
98 | |||
99 | COPYRIGHT AND LICENSING INFORMATION | ||
100 | |||
101 | This program is free software; you can redistribute it and/or modify it under | ||
102 | the terms of the GNU General Public License as published by the Free Software | ||
103 | Foundation; either version 2, or (at your option) any later version. | ||
104 | |||
105 | This program is distributed in the hope that it will be useful, but WITHOUT | ||
106 | ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS | ||
107 | FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. | ||
108 | |||
109 | You should have received a copy of the GNU General Public License along with | ||
110 | this program; if not, write to the Free Software Foundation, Inc., 675 Mass | ||
111 | Ave, Cambridge, MA 02139, USA. | ||
112 | |||
113 | |||
114 | |||
115 | ACKNOWLEDGEMENTS | ||
116 | |||
117 | This product is based on the WANPIPE(tm) Multiprotocol WAN Router developed | ||
118 | by Sangoma Technologies Inc. for Linux 2.0.x and 2.2.x. Success of the WANPIPE | ||
119 | together with the next major release of Linux kernel in summer 1996 commanded | ||
120 | adequate changes to the WANPIPE code to take full advantage of new Linux | ||
121 | features. | ||
122 | |||
123 | Instead of continuing developing proprietary interface tied to Sangoma WAN | ||
124 | cards, we decided to separate all hardware-independent code into a separate | ||
125 | module and defined two levels of interfaces - one for user-level applications | ||
126 | and another for kernel-level WAN drivers. WANPIPE is now implemented as a | ||
127 | WAN driver compliant with the WAN Link Driver interface. Also a general | ||
128 | purpose WAN configuration utility and a set of shell scripts was developed to | ||
129 | support WAN router at the user level. | ||
130 | |||
131 | Many useful ideas concerning hardware-independent interface implementation | ||
132 | were given by Mike McLagan <mike.mclagan@linux.org> and his implementation | ||
133 | of the Frame Relay router and drivers for Sangoma cards (dlci/sdla). | ||
134 | |||
135 | With the new implementation of the APIs being incorporated into the WANPIPE, | ||
136 | a special thank goes to Alan Cox in providing insight into BSD sockets. | ||
137 | |||
138 | Special thanks to all the WANPIPE users who performed field-testing, reported | ||
139 | bugs and made valuable comments and suggestions that help us to improve this | ||
140 | product. | ||
141 | |||
142 | |||
143 | |||
144 | NEW IN THIS RELEASE | ||
145 | |||
146 | o Updated the WANCFG utility | ||
147 | Calls the pppconfig to configure the PPPD | ||
148 | for async connections. | ||
149 | |||
150 | o Added the PPPCONFIG utility | ||
151 | Used to configure the PPPD dameon for the | ||
152 | WANPIPE Async PPP and standard serial port. | ||
153 | The wancfg calls the pppconfig to configure | ||
154 | the pppd. | ||
155 | |||
156 | o Fixed the PCI autodetect feature. | ||
157 | The SLOT 0 was used as an autodetect option | ||
158 | however, some high end PC's slot numbers start | ||
159 | from 0. | ||
160 | |||
161 | o This release has been tested with the new backupd | ||
162 | daemon release. | ||
163 | |||
164 | |||
165 | PRODUCT COMPONENTS AND RELATED FILES | ||
166 | |||
167 | /etc: (or user defined) | ||
168 | wanpipe1.conf default router configuration file | ||
169 | |||
170 | /lib/modules/X.Y.Z/misc: | ||
171 | wanrouter.o router kernel loadable module | ||
172 | af_wanpipe.o wanpipe api socket module | ||
173 | |||
174 | /lib/modules/X.Y.Z/net: | ||
175 | sdladrv.o Sangoma SDLA support module | ||
176 | wanpipe.o Sangoma WANPIPE(tm) driver module | ||
177 | |||
178 | /proc/net/wanrouter | ||
179 | Config reads current router configuration | ||
180 | Status reads current router status | ||
181 | {name} reads WAN driver statistics | ||
182 | |||
183 | /usr/sbin: | ||
184 | wanrouter wanrouter start-up script | ||
185 | wanconfig wanrouter configuration utility | ||
186 | sdladump WANPIPE adapter memory dump utility | ||
187 | fpipemon Monitor for Frame Relay | ||
188 | cpipemon Monitor for Cisco HDLC | ||
189 | ppipemon Monitor for PPP | ||
190 | xpipemon Monitor for X25 | ||
191 | wpkbdmon WANPIPE keyboard led monitor/debugger | ||
192 | |||
193 | /usr/local/wanrouter: | ||
194 | README this file | ||
195 | COPYING GNU General Public License | ||
196 | Setup installation script | ||
197 | Filelist distribution definition file | ||
198 | wanrouter.rc meta-configuration file | ||
199 | (used by the Setup and wanrouter script) | ||
200 | |||
201 | /usr/local/wanrouter/doc: | ||
202 | wanpipeForLinux.pdf WAN Router User's Manual | ||
203 | |||
204 | /usr/local/wanrouter/patches: | ||
205 | wanrouter-v2213.gz patch for Linux kernels 2.2.11 up to 2.2.13. | ||
206 | wanrouter-v2214.gz patch for Linux kernel 2.2.14. | ||
207 | wanrouter-v2215.gz patch for Linux kernels 2.2.15 to 2.2.17. | ||
208 | wanrouter-v2218.gz patch for Linux kernels 2.2.18 and up. | ||
209 | wanrouter-v240.gz patch for Linux kernel 2.4.0. | ||
210 | wanrouter-v242.gz patch for Linux kernel 2.4.2 and up. | ||
211 | wanrouter-v2034.gz patch for Linux kernel 2.0.34 | ||
212 | wanrouter-v2036.gz patch for Linux kernel 2.0.36 and up. | ||
213 | |||
214 | /usr/local/wanrouter/patches/kdrivers: | ||
215 | Sources of the latest WANPIPE device drivers. | ||
216 | These are used to UPGRADE the linux kernel to the newest | ||
217 | version if the kernel source has already been pathced with | ||
218 | WANPIPE drivers. | ||
219 | |||
220 | /usr/local/wanrouter/samples: | ||
221 | interface sample interface configuration file | ||
222 | wanpipe1.cpri CHDLC primary port | ||
223 | wanpipe2.csec CHDLC secondary port | ||
224 | wanpipe1.fr Frame Relay protocol | ||
225 | wanpipe1.ppp PPP protocol ) | ||
226 | wanpipe1.asy CHDLC ASYNC protocol | ||
227 | wanpipe1.x25 X25 protocol | ||
228 | wanpipe1.stty Sync TTY driver (Used by Kernel PPPD daemon) | ||
229 | wanpipe1.atty Async TTY driver (Used by Kernel PPPD daemon) | ||
230 | wanrouter.rc sample meta-configuration file | ||
231 | |||
232 | /usr/local/wanrouter/util: | ||
233 | * wan-tools utilities source code | ||
234 | |||
235 | /usr/local/wanrouter/api/x25: | ||
236 | * x25 api sample programs. | ||
237 | /usr/local/wanrouter/api/chdlc: | ||
238 | * chdlc api sample programs. | ||
239 | /usr/local/wanrouter/api/fr: | ||
240 | * fr api sample programs. | ||
241 | /usr/local/wanrouter/config/wancfg: | ||
242 | wancfg WANPIPE GUI configuration program. | ||
243 | Creates wanpipe#.conf files. | ||
244 | /usr/local/wanrouter/config/cfgft1: | ||
245 | cfgft1 GUI CSU/DSU configuration program. | ||
246 | |||
247 | /usr/include/linux: | ||
248 | wanrouter.h router API definitions | ||
249 | wanpipe.h WANPIPE API definitions | ||
250 | sdladrv.h SDLA support module API definitions | ||
251 | sdlasfm.h SDLA firmware module definitions | ||
252 | if_wanpipe.h WANPIPE Socket definitions | ||
253 | if_wanpipe_common.h WANPIPE Socket/Driver common definitions. | ||
254 | sdlapci.h WANPIPE PCI definitions | ||
255 | |||
256 | |||
257 | /usr/src/linux/net/wanrouter: | ||
258 | * wanrouter source code | ||
259 | |||
260 | /var/log: | ||
261 | wanrouter wanrouter start-up log (created by the Setup script) | ||
262 | |||
263 | /var/lock: (or /var/lock/subsys for RedHat) | ||
264 | wanrouter wanrouter lock file (created by the Setup script) | ||
265 | |||
266 | /usr/local/wanrouter/firmware: | ||
267 | fr514.sfm Frame relay firmware for Sangoma S508/S514 card | ||
268 | cdual514.sfm Dual Port Cisco HDLC firmware for Sangoma S508/S514 card | ||
269 | ppp514.sfm PPP Firmware for Sangoma S508 and S514 cards | ||
270 | x25_508.sfm X25 Firmware for Sangoma S508 card. | ||
271 | |||
272 | |||
273 | REVISION HISTORY | ||
274 | |||
275 | 1.0.0 December 31, 1996 Initial version | ||
276 | |||
277 | 1.0.1 January 30, 1997 Status and statistics can be read via /proc | ||
278 | filesystem entries. | ||
279 | |||
280 | 1.0.2 April 30, 1997 Added UDP management via monitors. | ||
281 | |||
282 | 1.0.3 June 3, 1997 UDP management for multiple boards using Frame | ||
283 | Relay and PPP | ||
284 | Enabled continuous transmission of Configure | ||
285 | Request Packet for PPP (for 508 only) | ||
286 | Connection Timeout for PPP changed from 900 to 0 | ||
287 | Flow Control Problem fixed for Frame Relay | ||
288 | |||
289 | 1.0.4 July 10, 1997 S508/FT1 monitoring capability in fpipemon and | ||
290 | ppipemon utilities. | ||
291 | Configurable TTL for UDP packets. | ||
292 | Multicast and Broadcast IP source addresses are | ||
293 | silently discarded. | ||
294 | |||
295 | 1.0.5 July 28, 1997 Configurable T391,T392,N391,N392,N393 for Frame | ||
296 | Relay in router.conf. | ||
297 | Configurable Memory Address through router.conf | ||
298 | for Frame Relay, PPP and X.25. (commenting this | ||
299 | out enables auto-detection). | ||
300 | Fixed freeing up received buffers using kfree() | ||
301 | for Frame Relay and X.25. | ||
302 | Protect sdla_peek() by calling save_flags(), | ||
303 | cli() and restore_flags(). | ||
304 | Changed number of Trace elements from 32 to 20 | ||
305 | Added DLCI specific data monitoring in FPIPEMON. | ||
306 | 2.0.0 Nov 07, 1997 Implemented protection of RACE conditions by | ||
307 | critical flags for FRAME RELAY and PPP. | ||
308 | DLCI List interrupt mode implemented. | ||
309 | IPX support in FRAME RELAY and PPP. | ||
310 | IPX Server Support (MARS) | ||
311 | More driver specific stats included in FPIPEMON | ||
312 | and PIPEMON. | ||
313 | |||
314 | 2.0.1 Nov 28, 1997 Bug Fixes for version 2.0.0. | ||
315 | Protection of "enable_irq()" while | ||
316 | "disable_irq()" has been enabled from any other | ||
317 | routine (for Frame Relay, PPP and X25). | ||
318 | Added additional Stats for Fpipemon and Ppipemon | ||
319 | Improved Load Sharing for multiple boards | ||
320 | |||
321 | 2.0.2 Dec 09, 1997 Support for PAP and CHAP for ppp has been | ||
322 | implemented. | ||
323 | |||
324 | 2.0.3 Aug 15, 1998 New release supporting Cisco HDLC, CIR for Frame | ||
325 | relay, Dynamic IP assignment for PPP and Inverse | ||
326 | Arp support for Frame-relay. Man Pages are | ||
327 | included for better support and a new utility | ||
328 | for configuring FT1 cards. | ||
329 | |||
330 | 2.0.4 Dec 09, 1998 Dual Port support for Cisco HDLC. | ||
331 | Support for HDLC (LAPB) API. | ||
332 | Supports BiSync Streaming code for S502E | ||
333 | and S503 cards. | ||
334 | Support for Streaming HDLC API. | ||
335 | Provides a BSD socket interface for | ||
336 | creating applications using BiSync | ||
337 | streaming. | ||
338 | |||
339 | 2.0.5 Aug 04, 1999 CHDLC initializatin bug fix. | ||
340 | PPP interrupt driven driver: | ||
341 | Fix to the PPP line hangup problem. | ||
342 | New PPP firmware | ||
343 | Added comments to the startup SYSTEM ERROR messages | ||
344 | Xpipemon debugging application for the X25 protocol | ||
345 | New USER_MANUAL.txt | ||
346 | Fixed the odd boundary 4byte writes to the board. | ||
347 | BiSync Streaming code has been taken out. | ||
348 | Available as a patch. | ||
349 | Streaming HDLC API has been taken out. | ||
350 | Available as a patch. | ||
351 | |||
352 | 2.0.6 Aug 17, 1999 Increased debugging in statup scripts | ||
353 | Fixed insallation bugs from 2.0.5 | ||
354 | Kernel patch works for both 2.2.10 and 2.2.11 kernels. | ||
355 | There is no functional difference between the two packages | ||
356 | |||
357 | 2.0.7 Aug 26, 1999 o Merged X25API code into WANPIPE. | ||
358 | o Fixed a memeory leak for X25API | ||
359 | o Updated the X25API code for 2.2.X kernels. | ||
360 | o Improved NEM handling. | ||
361 | |||
362 | 2.1.0 Oct 25, 1999 o New code for S514 PCI Card | ||
363 | o New CHDLC and Frame Relay drivers | ||
364 | o PPP and X25 are not supported in this release | ||
365 | |||
366 | 2.1.1 Nov 30, 1999 o PPP support for S514 PCI Cards | ||
367 | |||
368 | 2.1.3 Apr 06, 2000 o Socket based x25api | ||
369 | o Socket based chdlc api | ||
370 | o Socket based fr api | ||
371 | o Dual Port Receive only CHDLC support. | ||
372 | o Asynchronous CHDLC support (Secondary Port) | ||
373 | o cfgft1 GUI csu/dsu configurator | ||
374 | o wancfg GUI configuration file | ||
375 | configurator. | ||
376 | o Architectual directory changes. | ||
377 | |||
378 | beta-2.1.4 Jul 2000 o Dynamic interface configuration: | ||
379 | Network interfaces reflect the state | ||
380 | of protocol layer. If the protocol becomes | ||
381 | disconnected, driver will bring down | ||
382 | the interface. Once the protocol reconnects | ||
383 | the interface will be brought up. | ||
384 | |||
385 | Note: This option is turned off by default. | ||
386 | |||
387 | o Dynamic wanrouter setup using 'wanconfig': | ||
388 | wanconfig utility can be used to | ||
389 | shutdown,restart,start or reconfigure | ||
390 | a virtual circuit dynamically. | ||
391 | |||
392 | Frame Relay: Each DLCI can be: | ||
393 | created,stopped,restarted and reconfigured | ||
394 | dynamically using wanconfig. | ||
395 | |||
396 | ex: wanconfig card wanpipe1 dev wp1_fr16 up | ||
397 | |||
398 | o Wanrouter startup via command line arguments: | ||
399 | wanconfig also supports wanrouter startup via command line | ||
400 | arguments. Thus, there is no need to create a wanpipe#.conf | ||
401 | configuration file. | ||
402 | |||
403 | o Socket based x25api update/bug fixes. | ||
404 | Added support for LCN numbers greater than 255. | ||
405 | Option to pass up modem messages. | ||
406 | Provided a PCI IRQ check, so a single S514 | ||
407 | card is guaranteed to have a non-sharing interrupt. | ||
408 | |||
409 | o Fixes to the wancfg utility. | ||
410 | o New FT1 debugging support via *pipemon utilities. | ||
411 | o Frame Relay ARP support Enabled. | ||
412 | |||
413 | beta3-2.1.4 Jul 2000 o X25 M_BIT Problem fix. | ||
414 | o Added the Multi-Port PPP | ||
415 | Updated utilites for the Multi-Port PPP. | ||
416 | |||
417 | 2.1.4 Aut 2000 | ||
418 | o In X25API: | ||
419 | Maximum packet an application can send | ||
420 | to the driver has been extended to 4096 bytes. | ||
421 | |||
422 | Fixed the x25 startup bug. Enable | ||
423 | communications only after all interfaces | ||
424 | come up. HIGH SVC/PVC is used to calculate | ||
425 | the number of channels. | ||
426 | Enable protocol only after all interfaces | ||
427 | are enabled. | ||
428 | |||
429 | o Added an extra state to the FT1 config, kernel module. | ||
430 | o Updated the pipemon debuggers. | ||
431 | |||
432 | o Blocked the Multi-Port PPP from running on kernels | ||
433 | 2.2.16 or greater, due to syncppp kernel module | ||
434 | change. | ||
435 | |||
436 | beta1-2.1.5 Nov 15 2000 | ||
437 | o Fixed the MulitPort PPP Support for kernels 2.2.16 and above. | ||
438 | 2.2.X kernels only | ||
439 | |||
440 | o Secured the driver UDP debugging calls | ||
441 | - All illegal netowrk debugging calls are reported to | ||
442 | the log. | ||
443 | - Defined a set of allowed commands, all other denied. | ||
444 | |||
445 | o Cpipemon | ||
446 | - Added set FT1 commands to the cpipemon. Thus CSU/DSU | ||
447 | configuraiton can be performed using cpipemon. | ||
448 | All systems that cannot run cfgft1 GUI utility should | ||
449 | use cpipemon to configure the on board CSU/DSU. | ||
450 | |||
451 | |||
452 | o Keyboard Led Monitor/Debugger | ||
453 | - A new utilty /usr/sbin/wpkbdmon uses keyboard leds | ||
454 | to convey operatinal statistic information of the | ||
455 | Sangoma WANPIPE cards. | ||
456 | NUM_LOCK = Line State (On=connected, Off=disconnected) | ||
457 | CAPS_LOCK = Tx data (On=transmitting, Off=no tx data) | ||
458 | SCROLL_LOCK = Rx data (On=receiving, Off=no rx data | ||
459 | |||
460 | o Hardware probe on module load and dynamic device allocation | ||
461 | - During WANPIPE module load, all Sangoma cards are probed | ||
462 | and found information is printed in the /var/log/messages. | ||
463 | - If no cards are found, the module load fails. | ||
464 | - Appropriate number of devices are dynamically loaded | ||
465 | based on the number of Sangoma cards found. | ||
466 | |||
467 | Note: The kernel configuraiton option | ||
468 | CONFIG_WANPIPE_CARDS has been taken out. | ||
469 | |||
470 | o Fixed the Frame Relay and Chdlc network interfaces so they are | ||
471 | compatible with libpcap libraries. Meaning, tcpdump, snort, | ||
472 | ethereal, and all other packet sniffers and debuggers work on | ||
473 | all WANPIPE netowrk interfaces. | ||
474 | - Set the network interface encoding type to ARPHRD_PPP. | ||
475 | This tell the sniffers that data obtained from the | ||
476 | network interface is in pure IP format. | ||
477 | Fix for 2.2.X kernels only. | ||
478 | |||
479 | o True interface encoding option for Frame Relay and CHDLC | ||
480 | - The above fix sets the network interface encoding | ||
481 | type to ARPHRD_PPP, however some customers use | ||
482 | the encoding interface type to determine the | ||
483 | protocol running. Therefore, the TURE ENCODING | ||
484 | option will set the interface type back to the | ||
485 | original value. | ||
486 | |||
487 | NOTE: If this option is used with Frame Relay and CHDLC | ||
488 | libpcap library support will be broken. | ||
489 | i.e. tcpdump will not work. | ||
490 | Fix for 2.2.x Kernels only. | ||
491 | |||
492 | o Ethernet Bridgind over Frame Relay | ||
493 | - The Frame Relay bridging has been developed by | ||
494 | Kristian Hoffmann and Mark Wells. | ||
495 | - The Linux kernel bridge is used to send ethernet | ||
496 | data over the frame relay links. | ||
497 | For 2.2.X Kernels only. | ||
498 | |||
499 | o Added extensive 2.0.X support. Most new features of | ||
500 | 2.1.5 for protocols Frame Relay, PPP and CHDLC are | ||
501 | supported under 2.0.X kernels. | ||
502 | |||
503 | beta1-2.2.0 Dec 30 2000 | ||
504 | o Updated drivers for 2.4.X kernels. | ||
505 | o Updated drivers for SMP support. | ||
506 | o X25API is now able to share PCI interrupts. | ||
507 | o Took out a general polling routine that was used | ||
508 | only by X25API. | ||
509 | o Added appropriate locks to the dynamic reconfiguration | ||
510 | code. | ||
511 | o Fixed a bug in the keyboard debug monitor. | ||
512 | |||
513 | beta2-2.2.0 Jan 8 2001 | ||
514 | o Patches for 2.4.0 kernel | ||
515 | o Patches for 2.2.18 kernel | ||
516 | o Minor updates to PPP and CHLDC drivers. | ||
517 | Note: No functinal difference. | ||
518 | |||
519 | beta3-2.2.9 Jan 10 2001 | ||
520 | o I missed the 2.2.18 kernel patches in beta2-2.2.0 | ||
521 | release. They are included in this release. | ||
522 | |||
523 | Stable Release | ||
524 | 2.2.0 Feb 01 2001 | ||
525 | o Bug fix in wancfg GUI configurator. | ||
526 | The edit function didn't work properly. | ||
527 | |||
528 | |||
529 | bata1-2.2.1 Feb 09 2001 | ||
530 | o WANPIPE TTY Driver emulation. | ||
531 | Two modes of operation Sync and Async. | ||
532 | Sync: Using the PPPD daemon, kernel SyncPPP layer | ||
533 | and the Wanpipe sync TTY driver: a PPP protocol | ||
534 | connection can be established via Sangoma adapter, over | ||
535 | a T1 leased line. | ||
536 | |||
537 | The 2.4.0 kernel PPP layer supports MULTILINK | ||
538 | protocol, that can be used to bundle any number of Sangoma | ||
539 | adapters (T1 lines) into one, under a single IP address. | ||
540 | Thus, efficiently obtaining multiple T1 throughput. | ||
541 | |||
542 | NOTE: The remote side must also implement MULTILINK PPP | ||
543 | protocol. | ||
544 | |||
545 | Async:Using the PPPD daemon, kernel AsyncPPP layer | ||
546 | and the WANPIPE async TTY driver: a PPP protocol | ||
547 | connection can be established via Sangoma adapter and | ||
548 | a modem, over a telephone line. | ||
549 | |||
550 | Thus, the WANPIPE async TTY driver simulates a serial | ||
551 | TTY driver that would normally be used to interface the | ||
552 | MODEM to the linux kernel. | ||
553 | |||
554 | o WANPIPE PPP Backup Utility | ||
555 | This utility will monitor the state of the PPP T1 line. | ||
556 | In case of failure, a dial up connection will be established | ||
557 | via pppd daemon, ether via a serial tty driver (serial port), | ||
558 | or a WANPIPE async TTY driver (in case serial port is unavailable). | ||
559 | |||
560 | Furthermore, while in dial up mode, the primary PPP T1 link | ||
561 | will be monitored for signs of life. | ||
562 | |||
563 | If the PPP T1 link comes back to life, the dial up connection | ||
564 | will be shutdown and T1 line re-established. | ||
565 | |||
566 | |||
567 | o New Setup installation script. | ||
568 | Option to UPGRADE device drivers if the kernel source has | ||
569 | already been patched with WANPIPE. | ||
570 | |||
571 | Option to COMPILE WANPIPE modules against the currently | ||
572 | running kernel, thus no need for manual kernel and module | ||
573 | re-compilatin. | ||
574 | |||
575 | o Updates and Bug Fixes to wancfg utility. | ||
576 | |||
577 | bata2-2.2.1 Feb 20 2001 | ||
578 | |||
579 | o Bug fixes to the CHDLC device drivers. | ||
580 | The driver had compilation problems under kernels | ||
581 | 2.2.14 or lower. | ||
582 | |||
583 | o Bug fixes to the Setup installation script. | ||
584 | The device drivers compilation options didn't work | ||
585 | properly. | ||
586 | |||
587 | o Update to the wpbackupd daemon. | ||
588 | Optimized the cross-over times, between the primary | ||
589 | link and the backup dialup. | ||
590 | |||
591 | beta3-2.2.1 Mar 02 2001 | ||
592 | o Patches for 2.4.2 kernel. | ||
593 | |||
594 | o Bug fixes to util/ make files. | ||
595 | o Bug fixes to the Setup installation script. | ||
596 | |||
597 | o Took out the backupd support and made it into | ||
598 | as separate package. | ||
599 | |||
600 | beta4-2.2.1 Mar 12 2001 | ||
601 | |||
602 | o Fix to the Frame Relay Device driver. | ||
603 | IPSAC sends a packet of zero length | ||
604 | header to the frame relay driver. The | ||
605 | driver tries to push its own 2 byte header | ||
606 | into the packet, which causes the driver to | ||
607 | crash. | ||
608 | |||
609 | o Fix the WANPIPE re-configuration code. | ||
610 | Bug was found by trying to run the cfgft1 while the | ||
611 | interface was already running. | ||
612 | |||
613 | o Updates to cfgft1. | ||
614 | Writes a wanpipe#.cfgft1 configuration file | ||
615 | once the CSU/DSU is configured. This file can | ||
616 | holds the current CSU/DSU configuration. | ||
617 | |||
618 | |||
619 | |||
620 | >>>>>> END OF README <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< | ||
621 | |||
622 | |||
diff --git a/Documentation/pci.txt b/Documentation/pci.txt index 62b1dc5d97e2..76d28d033657 100644 --- a/Documentation/pci.txt +++ b/Documentation/pci.txt | |||
@@ -266,20 +266,6 @@ port an old driver to the new PCI interface. They are no longer present | |||
266 | in the kernel as they aren't compatible with hotplug or PCI domains or | 266 | in the kernel as they aren't compatible with hotplug or PCI domains or |
267 | having sane locking. | 267 | having sane locking. |
268 | 268 | ||
269 | pcibios_present() and Since ages, you don't need to test presence | ||
270 | pci_present() of PCI subsystem when trying to talk to it. | ||
271 | If it's not there, the list of PCI devices | ||
272 | is empty and all functions for searching for | ||
273 | devices just return NULL. | ||
274 | pcibios_(read|write)_* Superseded by their pci_(read|write)_* | ||
275 | counterparts. | ||
276 | pcibios_find_* Superseded by their pci_get_* counterparts. | ||
277 | pci_for_each_dev() Superseded by pci_get_device() | ||
278 | pci_for_each_dev_reverse() Superseded by pci_find_device_reverse() | ||
279 | pci_for_each_bus() Superseded by pci_find_next_bus() | ||
280 | pci_find_device() Superseded by pci_get_device() | 269 | pci_find_device() Superseded by pci_get_device() |
281 | pci_find_subsys() Superseded by pci_get_subsys() | 270 | pci_find_subsys() Superseded by pci_get_subsys() |
282 | pci_find_slot() Superseded by pci_get_slot() | 271 | pci_find_slot() Superseded by pci_get_slot() |
283 | pcibios_find_class() Superseded by pci_get_class() | ||
284 | pci_find_class() Superseded by pci_get_class() | ||
285 | pci_(read|write)_*_nodev() Superseded by pci_bus_(read|write)_*() | ||
diff --git a/Documentation/pcmcia/devicetable.txt b/Documentation/pcmcia/devicetable.txt new file mode 100644 index 000000000000..3351c0355143 --- /dev/null +++ b/Documentation/pcmcia/devicetable.txt | |||
@@ -0,0 +1,63 @@ | |||
1 | Matching of PCMCIA devices to drivers is done using one or more of the | ||
2 | following criteria: | ||
3 | |||
4 | - manufactor ID | ||
5 | - card ID | ||
6 | - product ID strings _and_ hashes of these strings | ||
7 | - function ID | ||
8 | - device function (actual and pseudo) | ||
9 | |||
10 | You should use the helpers in include/pcmcia/device_id.h for generating the | ||
11 | struct pcmcia_device_id[] entries which match devices to drivers. | ||
12 | |||
13 | If you want to match product ID strings, you also need to pass the crc32 | ||
14 | hashes of the string to the macro, e.g. if you want to match the product ID | ||
15 | string 1, you need to use | ||
16 | |||
17 | PCMCIA_DEVICE_PROD_ID1("some_string", 0x(hash_of_some_string)), | ||
18 | |||
19 | If the hash is incorrect, the kernel will inform you about this in "dmesg" | ||
20 | upon module initialization, and tell you of the correct hash. | ||
21 | |||
22 | You can determine the hash of the product ID strings by catting the file | ||
23 | "modalias" in the sysfs directory of the PCMCIA device. It generates a string | ||
24 | in the following form: | ||
25 | pcmcia:m0149cC1ABf06pfn00fn00pa725B842DpbF1EFEE84pc0877B627pd00000000 | ||
26 | |||
27 | The hex value after "pa" is the hash of product ID string 1, after "pb" for | ||
28 | string 2 and so on. | ||
29 | |||
30 | Alternatively, you can use this small tool to determine the crc32 hash. | ||
31 | simply pass the string you want to evaluate as argument to this program, | ||
32 | e.g. | ||
33 | $ ./crc32hash "Dual Speed" | ||
34 | |||
35 | ------------------------------------------------------------------------- | ||
36 | /* crc32hash.c - derived from linux/lib/crc32.c, GNU GPL v2 */ | ||
37 | #include <string.h> | ||
38 | #include <stdio.h> | ||
39 | #include <ctype.h> | ||
40 | #include <stdlib.h> | ||
41 | |||
42 | unsigned int crc32(unsigned char const *p, unsigned int len) | ||
43 | { | ||
44 | int i; | ||
45 | unsigned int crc = 0; | ||
46 | while (len--) { | ||
47 | crc ^= *p++; | ||
48 | for (i = 0; i < 8; i++) | ||
49 | crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0); | ||
50 | } | ||
51 | return crc; | ||
52 | } | ||
53 | |||
54 | int main(int argc, char **argv) { | ||
55 | unsigned int result; | ||
56 | if (argc != 2) { | ||
57 | printf("no string passed as argument\n"); | ||
58 | return -1; | ||
59 | } | ||
60 | result = crc32(argv[1], strlen(argv[1])); | ||
61 | printf("0x%x\n", result); | ||
62 | return 0; | ||
63 | } | ||
diff --git a/Documentation/pcmcia/driver-changes.txt b/Documentation/pcmcia/driver-changes.txt new file mode 100644 index 000000000000..403e7b4dcdd4 --- /dev/null +++ b/Documentation/pcmcia/driver-changes.txt | |||
@@ -0,0 +1,67 @@ | |||
1 | This file details changes in 2.6 which affect PCMCIA card driver authors: | ||
2 | |||
3 | * event handler initialization in struct pcmcia_driver (as of 2.6.13) | ||
4 | The event handler is notified of all events, and must be initialized | ||
5 | as the event() callback in the driver's struct pcmcia_driver. | ||
6 | |||
7 | * pcmcia/version.h should not be used (as of 2.6.13) | ||
8 | This file will be removed eventually. | ||
9 | |||
10 | * in-kernel device<->driver matching (as of 2.6.13) | ||
11 | PCMCIA devices and their correct drivers can now be matched in | ||
12 | kernelspace. See 'devicetable.txt' for details. | ||
13 | |||
14 | * Device model integration (as of 2.6.11) | ||
15 | A struct pcmcia_device is registered with the device model core, | ||
16 | and can be used (e.g. for SET_NETDEV_DEV) by using | ||
17 | handle_to_dev(client_handle_t * handle). | ||
18 | |||
19 | * Convert internal I/O port addresses to unsigned long (as of 2.6.11) | ||
20 | ioaddr_t should be replaced by kio_addr_t in PCMCIA card drivers. | ||
21 | |||
22 | * irq_mask and irq_list parameters (as of 2.6.11) | ||
23 | The irq_mask and irq_list parameters should no longer be used in | ||
24 | PCMCIA card drivers. Instead, it is the job of the PCMCIA core to | ||
25 | determine which IRQ should be used. Therefore, link->irq.IRQInfo2 | ||
26 | is ignored. | ||
27 | |||
28 | * client->PendingEvents is gone (as of 2.6.11) | ||
29 | client->PendingEvents is no longer available. | ||
30 | |||
31 | * client->Attributes are gone (as of 2.6.11) | ||
32 | client->Attributes is unused, therefore it is removed from all | ||
33 | PCMCIA card drivers | ||
34 | |||
35 | * core functions no longer available (as of 2.6.11) | ||
36 | The following functions have been removed from the kernel source | ||
37 | because they are unused by all in-kernel drivers, and no external | ||
38 | driver was reported to rely on them: | ||
39 | pcmcia_get_first_region() | ||
40 | pcmcia_get_next_region() | ||
41 | pcmcia_modify_window() | ||
42 | pcmcia_set_event_mask() | ||
43 | pcmcia_get_first_window() | ||
44 | pcmcia_get_next_window() | ||
45 | |||
46 | * device list iteration upon module removal (as of 2.6.10) | ||
47 | It is no longer necessary to iterate on the driver's internal | ||
48 | client list and call the ->detach() function upon module removal. | ||
49 | |||
50 | * Resource management. (as of 2.6.8) | ||
51 | Although the PCMCIA subsystem will allocate resources for cards, | ||
52 | it no longer marks these resources busy. This means that driver | ||
53 | authors are now responsible for claiming your resources as per | ||
54 | other drivers in Linux. You should use request_region() to mark | ||
55 | your IO regions in-use, and request_mem_region() to mark your | ||
56 | memory regions in-use. The name argument should be a pointer to | ||
57 | your driver name. Eg, for pcnet_cs, name should point to the | ||
58 | string "pcnet_cs". | ||
59 | |||
60 | * CardServices is gone | ||
61 | CardServices() in 2.4 is just a big switch statement to call various | ||
62 | services. In 2.6, all of those entry points are exported and called | ||
63 | directly (except for pcmcia_report_error(), just use cs_error() instead). | ||
64 | |||
65 | * struct pcmcia_driver | ||
66 | You need to use struct pcmcia_driver and pcmcia_{un,}register_driver | ||
67 | instead of {un,}register_pccard_driver | ||
diff --git a/Documentation/power/kernel_threads.txt b/Documentation/power/kernel_threads.txt index 60b548105edf..fb57784986b1 100644 --- a/Documentation/power/kernel_threads.txt +++ b/Documentation/power/kernel_threads.txt | |||
@@ -12,8 +12,7 @@ refrigerator. Code to do this looks like this: | |||
12 | do { | 12 | do { |
13 | hub_events(); | 13 | hub_events(); |
14 | wait_event_interruptible(khubd_wait, !list_empty(&hub_event_list)); | 14 | wait_event_interruptible(khubd_wait, !list_empty(&hub_event_list)); |
15 | if (current->flags & PF_FREEZE) | 15 | try_to_freeze(); |
16 | refrigerator(PF_FREEZE); | ||
17 | } while (!signal_pending(current)); | 16 | } while (!signal_pending(current)); |
18 | 17 | ||
19 | from drivers/usb/core/hub.c::hub_thread() | 18 | from drivers/usb/core/hub.c::hub_thread() |
diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt index 35b1a7dae342..6fc9d511fc39 100644 --- a/Documentation/power/pci.txt +++ b/Documentation/power/pci.txt | |||
@@ -291,6 +291,44 @@ a request to enable wake events from D3, two calls should be made to | |||
291 | pci_enable_wake (one for both D3hot and D3cold). | 291 | pci_enable_wake (one for both D3hot and D3cold). |
292 | 292 | ||
293 | 293 | ||
294 | A reference implementation | ||
295 | ------------------------- | ||
296 | .suspend() | ||
297 | { | ||
298 | /* driver specific operations */ | ||
299 | |||
300 | /* Disable IRQ */ | ||
301 | free_irq(); | ||
302 | /* If using MSI */ | ||
303 | pci_disable_msi(); | ||
304 | |||
305 | pci_save_state(); | ||
306 | pci_enable_wake(); | ||
307 | /* Disable IO/bus master/irq router */ | ||
308 | pci_disable_device(); | ||
309 | pci_set_power_state(pci_choose_state()); | ||
310 | } | ||
311 | |||
312 | .resume() | ||
313 | { | ||
314 | pci_set_power_state(PCI_D0); | ||
315 | pci_restore_state(); | ||
316 | /* device's irq possibly is changed, driver should take care */ | ||
317 | pci_enable_device(); | ||
318 | pci_set_master(); | ||
319 | |||
320 | /* if using MSI, device's vector possibly is changed */ | ||
321 | pci_enable_msi(); | ||
322 | |||
323 | request_irq(); | ||
324 | /* driver specific operations; */ | ||
325 | } | ||
326 | |||
327 | This is a typical implementation. Drivers can slightly change the order | ||
328 | of the operations in the implementation, ignore some operations or add | ||
329 | more deriver specific operations in it, but drivers should do something like | ||
330 | this on the whole. | ||
331 | |||
294 | 5. Resources | 332 | 5. Resources |
295 | ~~~~~~~~~~~~ | 333 | ~~~~~~~~~~~~ |
296 | 334 | ||
diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt index c7c3459fde43..7a6b78966459 100644 --- a/Documentation/power/swsusp.txt +++ b/Documentation/power/swsusp.txt | |||
@@ -164,11 +164,11 @@ place where the thread is safe to be frozen (no kernel semaphores | |||
164 | should be held at that point and it must be safe to sleep there), and | 164 | should be held at that point and it must be safe to sleep there), and |
165 | add: | 165 | add: |
166 | 166 | ||
167 | if (current->flags & PF_FREEZE) | 167 | try_to_freeze(); |
168 | refrigerator(PF_FREEZE); | ||
169 | 168 | ||
170 | If the thread is needed for writing the image to storage, you should | 169 | If the thread is needed for writing the image to storage, you should |
171 | instead set the PF_NOFREEZE process flag when creating the thread. | 170 | instead set the PF_NOFREEZE process flag when creating the thread (and |
171 | be very carefull). | ||
172 | 172 | ||
173 | 173 | ||
174 | Q: What is the difference between between "platform", "shutdown" and | 174 | Q: What is the difference between between "platform", "shutdown" and |
@@ -233,3 +233,81 @@ A: Try running | |||
233 | cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null | 233 | cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null |
234 | 234 | ||
235 | after resume. swapoff -a; swapon -a may also be usefull. | 235 | after resume. swapoff -a; swapon -a may also be usefull. |
236 | |||
237 | Q: What happens to devices during swsusp? They seem to be resumed | ||
238 | during system suspend? | ||
239 | |||
240 | A: That's correct. We need to resume them if we want to write image to | ||
241 | disk. Whole sequence goes like | ||
242 | |||
243 | Suspend part | ||
244 | ~~~~~~~~~~~~ | ||
245 | running system, user asks for suspend-to-disk | ||
246 | |||
247 | user processes are stopped | ||
248 | |||
249 | suspend(PMSG_FREEZE): devices are frozen so that they don't interfere | ||
250 | with state snapshot | ||
251 | |||
252 | state snapshot: copy of whole used memory is taken with interrupts disabled | ||
253 | |||
254 | resume(): devices are woken up so that we can write image to swap | ||
255 | |||
256 | write image to swap | ||
257 | |||
258 | suspend(PMSG_SUSPEND): suspend devices so that we can power off | ||
259 | |||
260 | turn the power off | ||
261 | |||
262 | Resume part | ||
263 | ~~~~~~~~~~~ | ||
264 | (is actually pretty similar) | ||
265 | |||
266 | running system, user asks for suspend-to-disk | ||
267 | |||
268 | user processes are stopped (in common case there are none, but with resume-from-initrd, noone knows) | ||
269 | |||
270 | read image from disk | ||
271 | |||
272 | suspend(PMSG_FREEZE): devices are frozen so that they don't interfere | ||
273 | with image restoration | ||
274 | |||
275 | image restoration: rewrite memory with image | ||
276 | |||
277 | resume(): devices are woken up so that system can continue | ||
278 | |||
279 | thaw all user processes | ||
280 | |||
281 | Q: What is this 'Encrypt suspend image' for? | ||
282 | |||
283 | A: First of all: it is not a replacement for dm-crypt encrypted swap. | ||
284 | It cannot protect your computer while it is suspended. Instead it does | ||
285 | protect from leaking sensitive data after resume from suspend. | ||
286 | |||
287 | Think of the following: you suspend while an application is running | ||
288 | that keeps sensitive data in memory. The application itself prevents | ||
289 | the data from being swapped out. Suspend, however, must write these | ||
290 | data to swap to be able to resume later on. Without suspend encryption | ||
291 | your sensitive data are then stored in plaintext on disk. This means | ||
292 | that after resume your sensitive data are accessible to all | ||
293 | applications having direct access to the swap device which was used | ||
294 | for suspend. If you don't need swap after resume these data can remain | ||
295 | on disk virtually forever. Thus it can happen that your system gets | ||
296 | broken in weeks later and sensitive data which you thought were | ||
297 | encrypted and protected are retrieved and stolen from the swap device. | ||
298 | To prevent this situation you should use 'Encrypt suspend image'. | ||
299 | |||
300 | During suspend a temporary key is created and this key is used to | ||
301 | encrypt the data written to disk. When, during resume, the data was | ||
302 | read back into memory the temporary key is destroyed which simply | ||
303 | means that all data written to disk during suspend are then | ||
304 | inaccessible so they can't be stolen later on. The only thing that | ||
305 | you must then take care of is that you call 'mkswap' for the swap | ||
306 | partition used for suspend as early as possible during regular | ||
307 | boot. This asserts that any temporary key from an oopsed suspend or | ||
308 | from a failed or aborted resume is erased from the swap device. | ||
309 | |||
310 | As a rule of thumb use encrypted swap to protect your data while your | ||
311 | system is shut down or suspended. Additionally use the encrypted | ||
312 | suspend image to prevent sensitive data from being stolen after | ||
313 | resume. | ||
diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt index 68734355d7cf..7a4a5036d123 100644 --- a/Documentation/power/video.txt +++ b/Documentation/power/video.txt | |||
@@ -83,8 +83,10 @@ Compaq Armada E500 - P3-700 none (1) (S1 also works OK) | |||
83 | Compaq Evo N620c vga=normal, s3_bios (2) | 83 | Compaq Evo N620c vga=normal, s3_bios (2) |
84 | Dell 600m, ATI R250 Lf none (1), but needs xorg-x11-6.8.1.902-1 | 84 | Dell 600m, ATI R250 Lf none (1), but needs xorg-x11-6.8.1.902-1 |
85 | Dell D600, ATI RV250 vga=normal and X, or try vbestate (6) | 85 | Dell D600, ATI RV250 vga=normal and X, or try vbestate (6) |
86 | Dell D610 vga=normal and X (possibly vbestate (6) too, but not tested) | ||
86 | Dell Inspiron 4000 ??? (*) | 87 | Dell Inspiron 4000 ??? (*) |
87 | Dell Inspiron 500m ??? (*) | 88 | Dell Inspiron 500m ??? (*) |
89 | Dell Inspiron 510m ??? | ||
88 | Dell Inspiron 600m ??? (*) | 90 | Dell Inspiron 600m ??? (*) |
89 | Dell Inspiron 8200 ??? (*) | 91 | Dell Inspiron 8200 ??? (*) |
90 | Dell Inspiron 8500 ??? (*) | 92 | Dell Inspiron 8500 ??? (*) |
@@ -115,6 +117,7 @@ IBM Thinkpad X40 Type 2371-7JG s3_bios,s3_mode (4) | |||
115 | Medion MD4220 ??? (*) | 117 | Medion MD4220 ??? (*) |
116 | Samsung P35 vbetool needed (6) | 118 | Samsung P35 vbetool needed (6) |
117 | Sharp PC-AR10 (ATI rage) none (1) | 119 | Sharp PC-AR10 (ATI rage) none (1) |
120 | Sony Vaio PCG-C1VRX/K s3_bios (2) | ||
118 | Sony Vaio PCG-F403 ??? (*) | 121 | Sony Vaio PCG-F403 ??? (*) |
119 | Sony Vaio PCG-N505SN ??? (*) | 122 | Sony Vaio PCG-N505SN ??? (*) |
120 | Sony Vaio vgn-s260 X or boot-radeon can init it (5) | 123 | Sony Vaio vgn-s260 X or boot-radeon can init it (5) |
@@ -123,6 +126,7 @@ Toshiba Satellite 4030CDT s3_mode (3) | |||
123 | Toshiba Satellite 4080XCDT s3_mode (3) | 126 | Toshiba Satellite 4080XCDT s3_mode (3) |
124 | Toshiba Satellite 4090XCDT ??? (*) | 127 | Toshiba Satellite 4090XCDT ??? (*) |
125 | Toshiba Satellite P10-554 s3_bios,s3_mode (4)(****) | 128 | Toshiba Satellite P10-554 s3_bios,s3_mode (4)(****) |
129 | Toshiba M30 (2) xor X with nvidia driver using internal AGP | ||
126 | Uniwill 244IIO ??? (*) | 130 | Uniwill 244IIO ??? (*) |
127 | 131 | ||
128 | 132 | ||
diff --git a/Documentation/power/video_extension.txt b/Documentation/power/video_extension.txt index 8e33d7c82c49..b2f9b1598ac2 100644 --- a/Documentation/power/video_extension.txt +++ b/Documentation/power/video_extension.txt | |||
@@ -1,13 +1,16 @@ | |||
1 | This driver implement the ACPI Extensions For Display Adapters | 1 | ACPI video extensions |
2 | for integrated graphics devices on motherboard, as specified in | 2 | ~~~~~~~~~~~~~~~~~~~~~ |
3 | ACPI 2.0 Specification, Appendix B, allowing to perform some basic | 3 | |
4 | control like defining the video POST device, retrieving EDID information | 4 | This driver implement the ACPI Extensions For Display Adapters for |
5 | or to setup a video output, etc. Note that this is an ref. implementation only. | 5 | integrated graphics devices on motherboard, as specified in ACPI 2.0 |
6 | It may or may not work for your integrated video device. | 6 | Specification, Appendix B, allowing to perform some basic control like |
7 | defining the video POST device, retrieving EDID information or to | ||
8 | setup a video output, etc. Note that this is an ref. implementation | ||
9 | only. It may or may not work for your integrated video device. | ||
7 | 10 | ||
8 | Interfaces exposed to userland through /proc/acpi/video: | 11 | Interfaces exposed to userland through /proc/acpi/video: |
9 | 12 | ||
10 | VGA/info : display the supported video bus device capability like ,Video ROM, CRT/LCD/TV. | 13 | VGA/info : display the supported video bus device capability like Video ROM, CRT/LCD/TV. |
11 | VGA/ROM : Used to get a copy of the display devices' ROM data (up to 4k). | 14 | VGA/ROM : Used to get a copy of the display devices' ROM data (up to 4k). |
12 | VGA/POST_info : Used to determine what options are implemented. | 15 | VGA/POST_info : Used to determine what options are implemented. |
13 | VGA/POST : Used to get/set POST device. | 16 | VGA/POST : Used to get/set POST device. |
@@ -15,7 +18,7 @@ VGA/DOS : Used to get/set ownership of output switching: | |||
15 | Please refer ACPI spec B.4.1 _DOS | 18 | Please refer ACPI spec B.4.1 _DOS |
16 | VGA/CRT : CRT output | 19 | VGA/CRT : CRT output |
17 | VGA/LCD : LCD output | 20 | VGA/LCD : LCD output |
18 | VGA/TV : TV output | 21 | VGA/TVO : TV output |
19 | VGA/*/brightness : Used to get/set brightness of output device | 22 | VGA/*/brightness : Used to get/set brightness of output device |
20 | 23 | ||
21 | Notify event through /proc/acpi/event: | 24 | Notify event through /proc/acpi/event: |
diff --git a/Documentation/s390/s390dbf.txt b/Documentation/s390/s390dbf.txt index 2d1cd939b4df..e24fdeada970 100644 --- a/Documentation/s390/s390dbf.txt +++ b/Documentation/s390/s390dbf.txt | |||
@@ -12,8 +12,8 @@ where log records can be stored efficiently in memory, where each component | |||
12 | One purpose of this is to inspect the debug logs after a production system crash | 12 | One purpose of this is to inspect the debug logs after a production system crash |
13 | in order to analyze the reason for the crash. | 13 | in order to analyze the reason for the crash. |
14 | If the system still runs but only a subcomponent which uses dbf failes, | 14 | If the system still runs but only a subcomponent which uses dbf failes, |
15 | it is possible to look at the debug logs on a live system via the Linux proc | 15 | it is possible to look at the debug logs on a live system via the Linux |
16 | filesystem. | 16 | debugfs filesystem. |
17 | The debug feature may also very useful for kernel and driver development. | 17 | The debug feature may also very useful for kernel and driver development. |
18 | 18 | ||
19 | Design: | 19 | Design: |
@@ -52,16 +52,18 @@ Each debug entry contains the following data: | |||
52 | - Flag, if entry is an exception or not | 52 | - Flag, if entry is an exception or not |
53 | 53 | ||
54 | The debug logs can be inspected in a live system through entries in | 54 | The debug logs can be inspected in a live system through entries in |
55 | the proc-filesystem. Under the path /proc/s390dbf there is | 55 | the debugfs-filesystem. Under the toplevel directory "s390dbf" there is |
56 | a directory for each registered component, which is named like the | 56 | a directory for each registered component, which is named like the |
57 | corresponding component. | 57 | corresponding component. The debugfs normally should be mounted to |
58 | /sys/kernel/debug therefore the debug feature can be accessed unter | ||
59 | /sys/kernel/debug/s390dbf. | ||
58 | 60 | ||
59 | The content of the directories are files which represent different views | 61 | The content of the directories are files which represent different views |
60 | to the debug log. Each component can decide which views should be | 62 | to the debug log. Each component can decide which views should be |
61 | used through registering them with the function debug_register_view(). | 63 | used through registering them with the function debug_register_view(). |
62 | Predefined views for hex/ascii, sprintf and raw binary data are provided. | 64 | Predefined views for hex/ascii, sprintf and raw binary data are provided. |
63 | It is also possible to define other views. The content of | 65 | It is also possible to define other views. The content of |
64 | a view can be inspected simply by reading the corresponding proc file. | 66 | a view can be inspected simply by reading the corresponding debugfs file. |
65 | 67 | ||
66 | All debug logs have an an actual debug level (range from 0 to 6). | 68 | All debug logs have an an actual debug level (range from 0 to 6). |
67 | The default level is 3. Event and Exception functions have a 'level' | 69 | The default level is 3. Event and Exception functions have a 'level' |
@@ -69,14 +71,14 @@ parameter. Only debug entries with a level that is lower or equal | |||
69 | than the actual level are written to the log. This means, when | 71 | than the actual level are written to the log. This means, when |
70 | writing events, high priority log entries should have a low level | 72 | writing events, high priority log entries should have a low level |
71 | value whereas low priority entries should have a high one. | 73 | value whereas low priority entries should have a high one. |
72 | The actual debug level can be changed with the help of the proc-filesystem | 74 | The actual debug level can be changed with the help of the debugfs-filesystem |
73 | through writing a number string "x" to the 'level' proc file which is | 75 | through writing a number string "x" to the 'level' debugfs file which is |
74 | provided for every debug log. Debugging can be switched off completely | 76 | provided for every debug log. Debugging can be switched off completely |
75 | by using "-" on the 'level' proc file. | 77 | by using "-" on the 'level' debugfs file. |
76 | 78 | ||
77 | Example: | 79 | Example: |
78 | 80 | ||
79 | > echo "-" > /proc/s390dbf/dasd/level | 81 | > echo "-" > /sys/kernel/debug/s390dbf/dasd/level |
80 | 82 | ||
81 | It is also possible to deactivate the debug feature globally for every | 83 | It is also possible to deactivate the debug feature globally for every |
82 | debug log. You can change the behavior using 2 sysctl parameters in | 84 | debug log. You can change the behavior using 2 sysctl parameters in |
@@ -99,11 +101,11 @@ Kernel Interfaces: | |||
99 | ------------------ | 101 | ------------------ |
100 | 102 | ||
101 | ---------------------------------------------------------------------------- | 103 | ---------------------------------------------------------------------------- |
102 | debug_info_t *debug_register(char *name, int pages_index, int nr_areas, | 104 | debug_info_t *debug_register(char *name, int pages, int nr_areas, |
103 | int buf_size); | 105 | int buf_size); |
104 | 106 | ||
105 | Parameter: name: Name of debug log (e.g. used for proc entry) | 107 | Parameter: name: Name of debug log (e.g. used for debugfs entry) |
106 | pages_index: 2^pages_index pages will be allocated per area | 108 | pages: number of pages, which will be allocated per area |
107 | nr_areas: number of debug areas | 109 | nr_areas: number of debug areas |
108 | buf_size: size of data area in each debug entry | 110 | buf_size: size of data area in each debug entry |
109 | 111 | ||
@@ -134,7 +136,7 @@ Return Value: none | |||
134 | Description: Sets new actual debug level if new_level is valid. | 136 | Description: Sets new actual debug level if new_level is valid. |
135 | 137 | ||
136 | --------------------------------------------------------------------------- | 138 | --------------------------------------------------------------------------- |
137 | +void debug_stop_all(void); | 139 | void debug_stop_all(void); |
138 | 140 | ||
139 | Parameter: none | 141 | Parameter: none |
140 | 142 | ||
@@ -270,7 +272,7 @@ Parameter: id: handle for debug log | |||
270 | Return Value: 0 : ok | 272 | Return Value: 0 : ok |
271 | < 0: Error | 273 | < 0: Error |
272 | 274 | ||
273 | Description: registers new debug view and creates proc dir entry | 275 | Description: registers new debug view and creates debugfs dir entry |
274 | 276 | ||
275 | --------------------------------------------------------------------------- | 277 | --------------------------------------------------------------------------- |
276 | int debug_unregister_view (debug_info_t * id, struct debug_view *view); | 278 | int debug_unregister_view (debug_info_t * id, struct debug_view *view); |
@@ -281,7 +283,7 @@ Parameter: id: handle for debug log | |||
281 | Return Value: 0 : ok | 283 | Return Value: 0 : ok |
282 | < 0: Error | 284 | < 0: Error |
283 | 285 | ||
284 | Description: unregisters debug view and removes proc dir entry | 286 | Description: unregisters debug view and removes debugfs dir entry |
285 | 287 | ||
286 | 288 | ||
287 | 289 | ||
@@ -308,7 +310,7 @@ static int init(void) | |||
308 | { | 310 | { |
309 | /* register 4 debug areas with one page each and 4 byte data field */ | 311 | /* register 4 debug areas with one page each and 4 byte data field */ |
310 | 312 | ||
311 | debug_info = debug_register ("test", 0, 4, 4 ); | 313 | debug_info = debug_register ("test", 1, 4, 4 ); |
312 | debug_register_view(debug_info,&debug_hex_ascii_view); | 314 | debug_register_view(debug_info,&debug_hex_ascii_view); |
313 | debug_register_view(debug_info,&debug_raw_view); | 315 | debug_register_view(debug_info,&debug_raw_view); |
314 | 316 | ||
@@ -343,7 +345,7 @@ static int init(void) | |||
343 | /* register 4 debug areas with one page each and data field for */ | 345 | /* register 4 debug areas with one page each and data field for */ |
344 | /* format string pointer + 2 varargs (= 3 * sizeof(long)) */ | 346 | /* format string pointer + 2 varargs (= 3 * sizeof(long)) */ |
345 | 347 | ||
346 | debug_info = debug_register ("test", 0, 4, sizeof(long) * 3); | 348 | debug_info = debug_register ("test", 1, 4, sizeof(long) * 3); |
347 | debug_register_view(debug_info,&debug_sprintf_view); | 349 | debug_register_view(debug_info,&debug_sprintf_view); |
348 | 350 | ||
349 | debug_sprintf_event(debug_info, 2 , "first event in %s:%i\n",__FILE__,__LINE__); | 351 | debug_sprintf_event(debug_info, 2 , "first event in %s:%i\n",__FILE__,__LINE__); |
@@ -362,16 +364,16 @@ module_exit(cleanup); | |||
362 | 364 | ||
363 | 365 | ||
364 | 366 | ||
365 | ProcFS Interface | 367 | Debugfs Interface |
366 | ---------------- | 368 | ---------------- |
367 | Views to the debug logs can be investigated through reading the corresponding | 369 | Views to the debug logs can be investigated through reading the corresponding |
368 | proc-files: | 370 | debugfs-files: |
369 | 371 | ||
370 | Example: | 372 | Example: |
371 | 373 | ||
372 | > ls /proc/s390dbf/dasd | 374 | > ls /sys/kernel/debug/s390dbf/dasd |
373 | flush hex_ascii level raw | 375 | flush hex_ascii level pages raw |
374 | > cat /proc/s390dbf/dasd/hex_ascii | sort +1 | 376 | > cat /sys/kernel/debug/s390dbf/dasd/hex_ascii | sort +1 |
375 | 00 00974733272:680099 2 - 02 0006ad7e 07 ea 4a 90 | .... | 377 | 00 00974733272:680099 2 - 02 0006ad7e 07 ea 4a 90 | .... |
376 | 00 00974733272:682210 2 - 02 0006ade6 46 52 45 45 | FREE | 378 | 00 00974733272:682210 2 - 02 0006ade6 46 52 45 45 | FREE |
377 | 00 00974733272:682213 2 - 02 0006adf6 07 ea 4a 90 | .... | 379 | 00 00974733272:682213 2 - 02 0006adf6 07 ea 4a 90 | .... |
@@ -391,25 +393,36 @@ Changing the debug level | |||
391 | Example: | 393 | Example: |
392 | 394 | ||
393 | 395 | ||
394 | > cat /proc/s390dbf/dasd/level | 396 | > cat /sys/kernel/debug/s390dbf/dasd/level |
395 | 3 | 397 | 3 |
396 | > echo "5" > /proc/s390dbf/dasd/level | 398 | > echo "5" > /sys/kernel/debug/s390dbf/dasd/level |
397 | > cat /proc/s390dbf/dasd/level | 399 | > cat /sys/kernel/debug/s390dbf/dasd/level |
398 | 5 | 400 | 5 |
399 | 401 | ||
400 | Flushing debug areas | 402 | Flushing debug areas |
401 | -------------------- | 403 | -------------------- |
402 | Debug areas can be flushed with piping the number of the desired | 404 | Debug areas can be flushed with piping the number of the desired |
403 | area (0...n) to the proc file "flush". When using "-" all debug areas | 405 | area (0...n) to the debugfs file "flush". When using "-" all debug areas |
404 | are flushed. | 406 | are flushed. |
405 | 407 | ||
406 | Examples: | 408 | Examples: |
407 | 409 | ||
408 | 1. Flush debug area 0: | 410 | 1. Flush debug area 0: |
409 | > echo "0" > /proc/s390dbf/dasd/flush | 411 | > echo "0" > /sys/kernel/debug/s390dbf/dasd/flush |
410 | 412 | ||
411 | 2. Flush all debug areas: | 413 | 2. Flush all debug areas: |
412 | > echo "-" > /proc/s390dbf/dasd/flush | 414 | > echo "-" > /sys/kernel/debug/s390dbf/dasd/flush |
415 | |||
416 | Changing the size of debug areas | ||
417 | ------------------------------------ | ||
418 | It is possible the change the size of debug areas through piping | ||
419 | the number of pages to the debugfs file "pages". The resize request will | ||
420 | also flush the debug areas. | ||
421 | |||
422 | Example: | ||
423 | |||
424 | Define 4 pages for the debug areas of debug feature "dasd": | ||
425 | > echo "4" > /sys/kernel/debug/s390dbf/dasd/pages | ||
413 | 426 | ||
414 | Stooping the debug feature | 427 | Stooping the debug feature |
415 | -------------------------- | 428 | -------------------------- |
@@ -491,7 +504,7 @@ Defining views | |||
491 | -------------- | 504 | -------------- |
492 | 505 | ||
493 | Views are specified with the 'debug_view' structure. There are defined | 506 | Views are specified with the 'debug_view' structure. There are defined |
494 | callback functions which are used for reading and writing the proc files: | 507 | callback functions which are used for reading and writing the debugfs files: |
495 | 508 | ||
496 | struct debug_view { | 509 | struct debug_view { |
497 | char name[DEBUG_MAX_PROCF_LEN]; | 510 | char name[DEBUG_MAX_PROCF_LEN]; |
@@ -525,7 +538,7 @@ typedef int (debug_input_proc_t) (debug_info_t* id, | |||
525 | The "private_data" member can be used as pointer to view specific data. | 538 | The "private_data" member can be used as pointer to view specific data. |
526 | It is not used by the debug feature itself. | 539 | It is not used by the debug feature itself. |
527 | 540 | ||
528 | The output when reading a debug-proc file is structured like this: | 541 | The output when reading a debugfs file is structured like this: |
529 | 542 | ||
530 | "prolog_proc output" | 543 | "prolog_proc output" |
531 | 544 | ||
@@ -534,13 +547,13 @@ The output when reading a debug-proc file is structured like this: | |||
534 | "header_proc output 3" "format_proc output 3" | 547 | "header_proc output 3" "format_proc output 3" |
535 | ... | 548 | ... |
536 | 549 | ||
537 | When a view is read from the proc fs, the Debug Feature calls the | 550 | When a view is read from the debugfs, the Debug Feature calls the |
538 | 'prolog_proc' once for writing the prolog. | 551 | 'prolog_proc' once for writing the prolog. |
539 | Then 'header_proc' and 'format_proc' are called for each | 552 | Then 'header_proc' and 'format_proc' are called for each |
540 | existing debug entry. | 553 | existing debug entry. |
541 | 554 | ||
542 | The input_proc can be used to implement functionality when it is written to | 555 | The input_proc can be used to implement functionality when it is written to |
543 | the view (e.g. like with 'echo "0" > /proc/s390dbf/dasd/level). | 556 | the view (e.g. like with 'echo "0" > /sys/kernel/debug/s390dbf/dasd/level). |
544 | 557 | ||
545 | For header_proc there can be used the default function | 558 | For header_proc there can be used the default function |
546 | debug_dflt_header_fn() which is defined in in debug.h. | 559 | debug_dflt_header_fn() which is defined in in debug.h. |
@@ -602,7 +615,7 @@ debug_info = debug_register ("test", 0, 4, 4 )); | |||
602 | debug_register_view(debug_info, &debug_test_view); | 615 | debug_register_view(debug_info, &debug_test_view); |
603 | for(i = 0; i < 10; i ++) debug_int_event(debug_info, 1, i); | 616 | for(i = 0; i < 10; i ++) debug_int_event(debug_info, 1, i); |
604 | 617 | ||
605 | > cat /proc/s390dbf/test/myview | 618 | > cat /sys/kernel/debug/s390dbf/test/myview |
606 | 00 00964419734:611402 1 - 00 88042ca This error........... | 619 | 00 00964419734:611402 1 - 00 88042ca This error........... |
607 | 00 00964419734:611405 1 - 00 88042ca That error........... | 620 | 00 00964419734:611405 1 - 00 88042ca That error........... |
608 | 00 00964419734:611408 1 - 00 88042ca Problem.............. | 621 | 00 00964419734:611408 1 - 00 88042ca Problem.............. |
diff --git a/Documentation/scsi/scsi_mid_low_api.txt b/Documentation/scsi/scsi_mid_low_api.txt index da176c95d0fb..7536823c0cb1 100644 --- a/Documentation/scsi/scsi_mid_low_api.txt +++ b/Documentation/scsi/scsi_mid_low_api.txt | |||
@@ -388,7 +388,6 @@ Summary: | |||
388 | scsi_remove_device - detach and remove a SCSI device | 388 | scsi_remove_device - detach and remove a SCSI device |
389 | scsi_remove_host - detach and remove all SCSI devices owned by host | 389 | scsi_remove_host - detach and remove all SCSI devices owned by host |
390 | scsi_report_bus_reset - report scsi _bus_ reset observed | 390 | scsi_report_bus_reset - report scsi _bus_ reset observed |
391 | scsi_set_device - place device reference in host structure | ||
392 | scsi_track_queue_full - track successive QUEUE_FULL events | 391 | scsi_track_queue_full - track successive QUEUE_FULL events |
393 | scsi_unblock_requests - allow further commands to be queued to given host | 392 | scsi_unblock_requests - allow further commands to be queued to given host |
394 | scsi_unregister - [calls scsi_host_put()] | 393 | scsi_unregister - [calls scsi_host_put()] |
@@ -741,20 +740,6 @@ void scsi_report_bus_reset(struct Scsi_Host * shost, int channel) | |||
741 | 740 | ||
742 | 741 | ||
743 | /** | 742 | /** |
744 | * scsi_set_device - place device reference in host structure | ||
745 | * @shost: a pointer to a scsi host instance | ||
746 | * @pdev: pointer to device instance to assign | ||
747 | * | ||
748 | * Returns nothing | ||
749 | * | ||
750 | * Might block: no | ||
751 | * | ||
752 | * Defined in: include/scsi/scsi_host.h . | ||
753 | **/ | ||
754 | void scsi_set_device(struct Scsi_Host * shost, struct device * dev) | ||
755 | |||
756 | |||
757 | /** | ||
758 | * scsi_track_queue_full - track successive QUEUE_FULL events on given | 743 | * scsi_track_queue_full - track successive QUEUE_FULL events on given |
759 | * device to determine if and when there is a need | 744 | * device to determine if and when there is a need |
760 | * to adjust the queue depth on the device. | 745 | * to adjust the queue depth on the device. |
diff --git a/Documentation/serial/driver b/Documentation/serial/driver index e9c0178cd202..ac7eabbf662a 100644 --- a/Documentation/serial/driver +++ b/Documentation/serial/driver | |||
@@ -107,8 +107,8 @@ hardware. | |||
107 | indicate that the signal is permanently active. If RI is | 107 | indicate that the signal is permanently active. If RI is |
108 | not available, the signal should not be indicated as active. | 108 | not available, the signal should not be indicated as active. |
109 | 109 | ||
110 | Locking: none. | 110 | Locking: port->lock taken. |
111 | Interrupts: caller dependent. | 111 | Interrupts: locally disabled. |
112 | This call must not sleep | 112 | This call must not sleep |
113 | 113 | ||
114 | stop_tx(port,tty_stop) | 114 | stop_tx(port,tty_stop) |
diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 104a994b8289..a18ecb92b356 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt | |||
@@ -636,11 +636,16 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. | |||
636 | 3stack-digout 3-jack in back, a HP out and a SPDIF out | 636 | 3stack-digout 3-jack in back, a HP out and a SPDIF out |
637 | 5stack 5-jack in back, 2-jack in front | 637 | 5stack 5-jack in back, 2-jack in front |
638 | 5stack-digout 5-jack in back, 2-jack in front, a SPDIF out | 638 | 5stack-digout 5-jack in back, 2-jack in front, a SPDIF out |
639 | 6stack 6-jack in back, 2-jack in front | ||
640 | 6stack-digout 6-jack with a SPDIF out | ||
639 | w810 3-jack | 641 | w810 3-jack |
640 | z71v 3-jack (HP shared SPDIF) | 642 | z71v 3-jack (HP shared SPDIF) |
641 | asus 3-jack | 643 | asus 3-jack |
642 | uniwill 3-jack | 644 | uniwill 3-jack |
643 | F1734 2-jack | 645 | F1734 2-jack |
646 | test for testing/debugging purpose, almost all controls can be | ||
647 | adjusted. Appearing only when compiled with | ||
648 | $CONFIG_SND_DEBUG=y | ||
644 | 649 | ||
645 | CMI9880 | 650 | CMI9880 |
646 | minimal 3-jack in back | 651 | minimal 3-jack in back |
@@ -1054,6 +1059,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. | |||
1054 | 1059 | ||
1055 | The power-management is supported. | 1060 | The power-management is supported. |
1056 | 1061 | ||
1062 | Module snd-pxa2xx-ac97 (on arm only) | ||
1063 | ------------------------------------ | ||
1064 | |||
1065 | Module for AC97 driver for the Intel PXA2xx chip | ||
1066 | |||
1067 | For ARM architecture only. | ||
1068 | |||
1057 | Module snd-rme32 | 1069 | Module snd-rme32 |
1058 | ---------------- | 1070 | ---------------- |
1059 | 1071 | ||
@@ -1173,6 +1185,13 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. | |||
1173 | 1185 | ||
1174 | Module supports up to 8 cards. | 1186 | Module supports up to 8 cards. |
1175 | 1187 | ||
1188 | Module snd-sun-dbri (on sparc only) | ||
1189 | ----------------------------------- | ||
1190 | |||
1191 | Module for DBRI sound chips found on Sparcs. | ||
1192 | |||
1193 | Module supports up to 8 cards. | ||
1194 | |||
1176 | Module snd-wavefront | 1195 | Module snd-wavefront |
1177 | -------------------- | 1196 | -------------------- |
1178 | 1197 | ||
@@ -1371,7 +1390,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. | |||
1371 | Module snd-vxpocket | 1390 | Module snd-vxpocket |
1372 | ------------------- | 1391 | ------------------- |
1373 | 1392 | ||
1374 | Module for Digigram VX-Pocket VX2 PCMCIA card. | 1393 | Module for Digigram VX-Pocket VX2 and 440 PCMCIA cards. |
1375 | 1394 | ||
1376 | ibl - Capture IBL size. (default = 0, minimum size) | 1395 | ibl - Capture IBL size. (default = 0, minimum size) |
1377 | 1396 | ||
@@ -1391,29 +1410,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. | |||
1391 | 1410 | ||
1392 | Note: the driver is build only when CONFIG_ISA is set. | 1411 | Note: the driver is build only when CONFIG_ISA is set. |
1393 | 1412 | ||
1394 | Module snd-vxp440 | ||
1395 | ----------------- | ||
1396 | |||
1397 | Module for Digigram VX-Pocket 440 PCMCIA card. | ||
1398 | |||
1399 | ibl - Capture IBL size. (default = 0, minimum size) | ||
1400 | |||
1401 | Module supports up to 8 cards. The module is compiled only when | ||
1402 | PCMCIA is supported on kernel. | ||
1403 | |||
1404 | To activate the driver via the card manager, you'll need to set | ||
1405 | up /etc/pcmcia/vxp440.conf. See the sound/pcmcia/vx/vxp440.c. | ||
1406 | |||
1407 | When the driver is compiled as a module and the hotplug firmware | ||
1408 | is supported, the firmware data is loaded via hotplug automatically. | ||
1409 | Install the necessary firmware files in alsa-firmware package. | ||
1410 | When no hotplug fw loader is available, you need to load the | ||
1411 | firmware via vxloader utility in alsa-tools package. | ||
1412 | |||
1413 | About capture IBL, see the description of snd-vx222 module. | ||
1414 | |||
1415 | Note: the driver is build only when CONFIG_ISA is set. | ||
1416 | |||
1417 | Module snd-ymfpci | 1413 | Module snd-ymfpci |
1418 | ----------------- | 1414 | ----------------- |
1419 | 1415 | ||
diff --git a/Documentation/stable_api_nonsense.txt b/Documentation/stable_api_nonsense.txt index 3cea13875277..f39c9d714db3 100644 --- a/Documentation/stable_api_nonsense.txt +++ b/Documentation/stable_api_nonsense.txt | |||
@@ -132,7 +132,7 @@ to extra work for the USB developers. Since all Linux USB developers do | |||
132 | their work on their own time, asking programmers to do extra work for no | 132 | their work on their own time, asking programmers to do extra work for no |
133 | gain, for free, is not a possibility. | 133 | gain, for free, is not a possibility. |
134 | 134 | ||
135 | Security issues are also a very important for Linux. When a | 135 | Security issues are also very important for Linux. When a |
136 | security issue is found, it is fixed in a very short amount of time. A | 136 | security issue is found, it is fixed in a very short amount of time. A |
137 | number of times this has caused internal kernel interfaces to be | 137 | number of times this has caused internal kernel interfaces to be |
138 | reworked to prevent the security problem from occurring. When this | 138 | reworked to prevent the security problem from occurring. When this |
diff --git a/Documentation/stable_kernel_rules.txt b/Documentation/stable_kernel_rules.txt new file mode 100644 index 000000000000..2c81305090df --- /dev/null +++ b/Documentation/stable_kernel_rules.txt | |||
@@ -0,0 +1,58 @@ | |||
1 | Everything you ever wanted to know about Linux 2.6 -stable releases. | ||
2 | |||
3 | Rules on what kind of patches are accepted, and what ones are not, into | ||
4 | the "-stable" tree: | ||
5 | |||
6 | - It must be obviously correct and tested. | ||
7 | - It can not bigger than 100 lines, with context. | ||
8 | - It must fix only one thing. | ||
9 | - It must fix a real bug that bothers people (not a, "This could be a | ||
10 | problem..." type thing.) | ||
11 | - It must fix a problem that causes a build error (but not for things | ||
12 | marked CONFIG_BROKEN), an oops, a hang, data corruption, a real | ||
13 | security issue, or some "oh, that's not good" issue. In short, | ||
14 | something critical. | ||
15 | - No "theoretical race condition" issues, unless an explanation of how | ||
16 | the race can be exploited. | ||
17 | - It can not contain any "trivial" fixes in it (spelling changes, | ||
18 | whitespace cleanups, etc.) | ||
19 | - It must be accepted by the relevant subsystem maintainer. | ||
20 | - It must follow Documentation/SubmittingPatches rules. | ||
21 | |||
22 | |||
23 | Procedure for submitting patches to the -stable tree: | ||
24 | |||
25 | - Send the patch, after verifying that it follows the above rules, to | ||
26 | stable@kernel.org. | ||
27 | - The sender will receive an ack when the patch has been accepted into | ||
28 | the queue, or a nak if the patch is rejected. This response might | ||
29 | take a few days, according to the developer's schedules. | ||
30 | - If accepted, the patch will be added to the -stable queue, for review | ||
31 | by other developers. | ||
32 | - Security patches should not be sent to this alias, but instead to the | ||
33 | documented security@kernel.org. | ||
34 | |||
35 | |||
36 | Review cycle: | ||
37 | |||
38 | - When the -stable maintainers decide for a review cycle, the patches | ||
39 | will be sent to the review committee, and the maintainer of the | ||
40 | affected area of the patch (unless the submitter is the maintainer of | ||
41 | the area) and CC: to the linux-kernel mailing list. | ||
42 | - The review committee has 48 hours in which to ack or nak the patch. | ||
43 | - If the patch is rejected by a member of the committee, or linux-kernel | ||
44 | members object to the patch, bringing up issues that the maintainers | ||
45 | and members did not realize, the patch will be dropped from the | ||
46 | queue. | ||
47 | - At the end of the review cycle, the acked patches will be added to | ||
48 | the latest -stable release, and a new -stable release will happen. | ||
49 | - Security patches will be accepted into the -stable tree directly from | ||
50 | the security kernel team, and not go through the normal review cycle. | ||
51 | Contact the kernel security team for more details on this procedure. | ||
52 | |||
53 | |||
54 | Review committe: | ||
55 | |||
56 | - This will be made up of a number of kernel developers who have | ||
57 | volunteered for this task, and a few that haven't. | ||
58 | |||
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 35159176997b..9f11d36a8c10 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt | |||
@@ -49,6 +49,7 @@ show up in /proc/sys/kernel: | |||
49 | - shmmax [ sysv ipc ] | 49 | - shmmax [ sysv ipc ] |
50 | - shmmni | 50 | - shmmni |
51 | - stop-a [ SPARC only ] | 51 | - stop-a [ SPARC only ] |
52 | - suid_dumpable | ||
52 | - sysrq ==> Documentation/sysrq.txt | 53 | - sysrq ==> Documentation/sysrq.txt |
53 | - tainted | 54 | - tainted |
54 | - threads-max | 55 | - threads-max |
@@ -300,6 +301,25 @@ kernel. This value defaults to SHMMAX. | |||
300 | 301 | ||
301 | ============================================================== | 302 | ============================================================== |
302 | 303 | ||
304 | suid_dumpable: | ||
305 | |||
306 | This value can be used to query and set the core dump mode for setuid | ||
307 | or otherwise protected/tainted binaries. The modes are | ||
308 | |||
309 | 0 - (default) - traditional behaviour. Any process which has changed | ||
310 | privilege levels or is execute only will not be dumped | ||
311 | 1 - (debug) - all processes dump core when possible. The core dump is | ||
312 | owned by the current user and no security is applied. This is | ||
313 | intended for system debugging situations only. Ptrace is unchecked. | ||
314 | 2 - (suidsafe) - any binary which normally would not be dumped is dumped | ||
315 | readable by root only. This allows the end user to remove | ||
316 | such a dump but not access it directly. For security reasons | ||
317 | core dumps in this mode will not overwrite one another or | ||
318 | other files. This mode is appropriate when adminstrators are | ||
319 | attempting to debug problems in a normal environment. | ||
320 | |||
321 | ============================================================== | ||
322 | |||
303 | tainted: | 323 | tainted: |
304 | 324 | ||
305 | Non-zero if the kernel has been tainted. Numeric values, which | 325 | Non-zero if the kernel has been tainted. Numeric values, which |
diff --git a/Documentation/sysrq.txt b/Documentation/sysrq.txt index f98c2e31c143..136d817c01ba 100644 --- a/Documentation/sysrq.txt +++ b/Documentation/sysrq.txt | |||
@@ -72,6 +72,8 @@ On all - write a character to /proc/sysrq-trigger. eg: | |||
72 | 'b' - Will immediately reboot the system without syncing or unmounting | 72 | 'b' - Will immediately reboot the system without syncing or unmounting |
73 | your disks. | 73 | your disks. |
74 | 74 | ||
75 | 'c' - Will perform a kexec reboot in order to take a crashdump. | ||
76 | |||
75 | 'o' - Will shut your system off (if configured and supported). | 77 | 'o' - Will shut your system off (if configured and supported). |
76 | 78 | ||
77 | 's' - Will attempt to sync all mounted filesystems. | 79 | 's' - Will attempt to sync all mounted filesystems. |
@@ -122,6 +124,9 @@ useful when you want to exit a program that will not let you switch consoles. | |||
122 | re'B'oot is good when you're unable to shut down. But you should also 'S'ync | 124 | re'B'oot is good when you're unable to shut down. But you should also 'S'ync |
123 | and 'U'mount first. | 125 | and 'U'mount first. |
124 | 126 | ||
127 | 'C'rashdump can be used to manually trigger a crashdump when the system is hung. | ||
128 | The kernel needs to have been built with CONFIG_KEXEC enabled. | ||
129 | |||
125 | 'S'ync is great when your system is locked up, it allows you to sync your | 130 | 'S'ync is great when your system is locked up, it allows you to sync your |
126 | disks and will certainly lessen the chance of data loss and fscking. Note | 131 | disks and will certainly lessen the chance of data loss and fscking. Note |
127 | that the sync hasn't taken place until you see the "OK" and "Done" appear | 132 | that the sync hasn't taken place until you see the "OK" and "Done" appear |
diff --git a/Documentation/tty.txt b/Documentation/tty.txt index 3958cf746dde..8ff7bc2a0811 100644 --- a/Documentation/tty.txt +++ b/Documentation/tty.txt | |||
@@ -22,7 +22,7 @@ copy of the structure. You must not re-register over the top of the line | |||
22 | discipline even with the same data or your computer again will be eaten by | 22 | discipline even with the same data or your computer again will be eaten by |
23 | demons. | 23 | demons. |
24 | 24 | ||
25 | In order to remove a line discipline call tty_register_ldisc passing NULL. | 25 | In order to remove a line discipline call tty_unregister_ldisc(). |
26 | In ancient times this always worked. In modern times the function will | 26 | In ancient times this always worked. In modern times the function will |
27 | return -EBUSY if the ldisc is currently in use. Since the ldisc referencing | 27 | return -EBUSY if the ldisc is currently in use. Since the ldisc referencing |
28 | code manages the module counts this should not usually be a concern. | 28 | code manages the module counts this should not usually be a concern. |
diff --git a/Documentation/usb/sn9c102.txt b/Documentation/usb/sn9c102.txt index cf9a1187edce..3f8a119db31b 100644 --- a/Documentation/usb/sn9c102.txt +++ b/Documentation/usb/sn9c102.txt | |||
@@ -297,6 +297,7 @@ Vendor ID Product ID | |||
297 | 0x0c45 0x602a | 297 | 0x0c45 0x602a |
298 | 0x0c45 0x602b | 298 | 0x0c45 0x602b |
299 | 0x0c45 0x602c | 299 | 0x0c45 0x602c |
300 | 0x0c45 0x602d | ||
300 | 0x0c45 0x6030 | 301 | 0x0c45 0x6030 |
301 | 0x0c45 0x6080 | 302 | 0x0c45 0x6080 |
302 | 0x0c45 0x6082 | 303 | 0x0c45 0x6082 |
@@ -333,6 +334,7 @@ Model Manufacturer | |||
333 | ----- ------------ | 334 | ----- ------------ |
334 | HV7131D Hynix Semiconductor, Inc. | 335 | HV7131D Hynix Semiconductor, Inc. |
335 | MI-0343 Micron Technology, Inc. | 336 | MI-0343 Micron Technology, Inc. |
337 | OV7630 OmniVision Technologies, Inc. | ||
336 | PAS106B PixArt Imaging, Inc. | 338 | PAS106B PixArt Imaging, Inc. |
337 | PAS202BCB PixArt Imaging, Inc. | 339 | PAS202BCB PixArt Imaging, Inc. |
338 | TAS5110C1B Taiwan Advanced Sensor Corporation | 340 | TAS5110C1B Taiwan Advanced Sensor Corporation |
@@ -470,9 +472,11 @@ order): | |||
470 | - Luca Capello for the donation of a webcam; | 472 | - Luca Capello for the donation of a webcam; |
471 | - Joao Rodrigo Fuzaro, Joao Limirio, Claudio Filho and Caio Begotti for the | 473 | - Joao Rodrigo Fuzaro, Joao Limirio, Claudio Filho and Caio Begotti for the |
472 | donation of a webcam; | 474 | donation of a webcam; |
475 | - Jon Hollstrom for the donation of a webcam; | ||
473 | - Carlos Eduardo Medaglia Dyonisio, who added the support for the PAS202BCB | 476 | - Carlos Eduardo Medaglia Dyonisio, who added the support for the PAS202BCB |
474 | image sensor; | 477 | image sensor; |
475 | - Stefano Mozzi, who donated 45 EU; | 478 | - Stefano Mozzi, who donated 45 EU; |
479 | - Andrew Pearce for the donation of a webcam; | ||
476 | - Bertrik Sikken, who reverse-engineered and documented the Huffman compression | 480 | - Bertrik Sikken, who reverse-engineered and documented the Huffman compression |
477 | algorithm used in the SN9C10x controllers and implemented the first decoder; | 481 | algorithm used in the SN9C10x controllers and implemented the first decoder; |
478 | - Mizuno Takafumi for the donation of a webcam; | 482 | - Mizuno Takafumi for the donation of a webcam; |
diff --git a/Documentation/usb/usbmon.txt b/Documentation/usb/usbmon.txt index 2f8431f92b77..63cb7edd177e 100644 --- a/Documentation/usb/usbmon.txt +++ b/Documentation/usb/usbmon.txt | |||
@@ -101,6 +101,13 @@ Here is the list of words, from left to right: | |||
101 | or 3 and 2 positions, correspondingly. | 101 | or 3 and 2 positions, correspondingly. |
102 | - URB Status. This field makes no sense for submissions, but is present | 102 | - URB Status. This field makes no sense for submissions, but is present |
103 | to help scripts with parsing. In error case, it contains the error code. | 103 | to help scripts with parsing. In error case, it contains the error code. |
104 | In case of a setup packet, it contains a Setup Tag. If scripts read a number | ||
105 | in this field, they proceed to read Data Length. Otherwise, they read | ||
106 | the setup packet before reading the Data Length. | ||
107 | - Setup packet, if present, consists of 5 words: one of each for bmRequestType, | ||
108 | bRequest, wValue, wIndex, wLength, as specified by the USB Specification 2.0. | ||
109 | These words are safe to decode if Setup Tag was 's'. Otherwise, the setup | ||
110 | packet was present, but not captured, and the fields contain filler. | ||
104 | - Data Length. This is the actual length in the URB. | 111 | - Data Length. This is the actual length in the URB. |
105 | - Data tag. The usbmon may not always capture data, even if length is nonzero. | 112 | - Data tag. The usbmon may not always capture data, even if length is nonzero. |
106 | Only if tag is '=', the data words are present. | 113 | Only if tag is '=', the data words are present. |
@@ -125,25 +132,31 @@ class ParsedLine { | |||
125 | String data_str = st.nextToken(); | 132 | String data_str = st.nextToken(); |
126 | int len = data_str.length() / 2; | 133 | int len = data_str.length() / 2; |
127 | int i; | 134 | int i; |
135 | int b; // byte is signed, apparently?! XXX | ||
128 | for (i = 0; i < len; i++) { | 136 | for (i = 0; i < len; i++) { |
129 | data[data_len] = Byte.parseByte( | 137 | // data[data_len] = Byte.parseByte( |
130 | data_str.substring(i*2, i*2 + 2), | 138 | // data_str.substring(i*2, i*2 + 2), |
131 | 16); | 139 | // 16); |
140 | b = Integer.parseInt( | ||
141 | data_str.substring(i*2, i*2 + 2), | ||
142 | 16); | ||
143 | if (b >= 128) | ||
144 | b *= -1; | ||
145 | data[data_len] = (byte) b; | ||
132 | data_len++; | 146 | data_len++; |
133 | } | 147 | } |
134 | } | 148 | } |
135 | } | 149 | } |
136 | } | 150 | } |
137 | 151 | ||
138 | This format is obviously deficient. For example, the setup packet for control | 152 | This format may be changed in the future. |
139 | transfers is not delivered. This will change in the future. | ||
140 | 153 | ||
141 | Examples: | 154 | Examples: |
142 | 155 | ||
143 | An input control transfer to get a port status: | 156 | An input control transfer to get a port status. |
144 | 157 | ||
145 | d74ff9a0 2640288196 S Ci:001:00 -115 4 < | 158 | d5ea89a0 3575914555 S Ci:001:00 s a3 00 0000 0003 0004 4 < |
146 | d74ff9a0 2640288202 C Ci:001:00 0 4 = 01010100 | 159 | d5ea89a0 3575914560 C Ci:001:00 0 4 = 01050000 |
147 | 160 | ||
148 | An output bulk transfer to send a SCSI command 0x5E in a 31-byte Bulk wrapper | 161 | An output bulk transfer to send a SCSI command 0x5E in a 31-byte Bulk wrapper |
149 | to a storage device at address 5: | 162 | to a storage device at address 5: |
diff --git a/Documentation/video4linux/API.html b/Documentation/video4linux/API.html index 4b3d8f640a4a..441407b12a9f 100644 --- a/Documentation/video4linux/API.html +++ b/Documentation/video4linux/API.html | |||
@@ -1,399 +1,16 @@ | |||
1 | <HTML><HEAD> | 1 | <TITLE>V4L API</TITLE> |
2 | <TITLE>Video4Linux Kernel API Reference v0.1:19990430</TITLE> | 2 | <H1>Video For Linux APIs</H1> |
3 | </HEAD> | 3 | <table border=0> |
4 | <! Revision History: > | 4 | <tr> |
5 | <! 4/30/1999 - Fred Gleason (fredg@wava.com)> | 5 | <td> |
6 | <! Documented extensions for the Radio Data System (RDS) extensions > | 6 | <A HREF=http://www.linuxtv.org/downloads/video4linux/API/V4L1_API.html> |
7 | <BODY bgcolor="#ffffff"> | 7 | V4L original API</a> |
8 | <H3>Devices</H3> | 8 | </td><td> |
9 | Video4Linux provides the following sets of device files. These live on the | 9 | Obsoleted by V4L2 API |
10 | character device formerly known as "/dev/bttv". /dev/bttv should be a | 10 | </td></tr><tr><td> |
11 | symlink to /dev/video0 for most people. | 11 | <A HREF=http://www.linuxtv.org/downloads/video4linux/API/V4L2_API.html> |
12 | <P> | 12 | V4L2 API</a> |
13 | <TABLE> | 13 | </td><td> |
14 | <TR><TH>Device Name</TH><TH>Minor Range</TH><TH>Function</TH> | 14 | Should be used for new projects |
15 | <TR><TD>/dev/video</TD><TD>0-63</TD><TD>Video Capture Interface</TD> | 15 | </td></tr> |
16 | <TR><TD>/dev/radio</TD><TD>64-127</TD><TD>AM/FM Radio Devices</TD> | 16 | </table> |
17 | <TR><TD>/dev/vtx</TD><TD>192-223</TD><TD>Teletext Interface Chips</TD> | ||
18 | <TR><TD>/dev/vbi</TD><TD>224-239</TD><TD>Raw VBI Data (Intercast/teletext)</TD> | ||
19 | </TABLE> | ||
20 | <P> | ||
21 | Video4Linux programs open and scan the devices to find what they are looking | ||
22 | for. Capability queries define what each interface supports. The | ||
23 | described API is only defined for video capture cards. The relevant subset | ||
24 | applies to radio cards. Teletext interfaces talk the existing VTX API. | ||
25 | <P> | ||
26 | <H3>Capability Query Ioctl</H3> | ||
27 | The <B>VIDIOCGCAP</B> ioctl call is used to obtain the capability | ||
28 | information for a video device. The <b>struct video_capability</b> object | ||
29 | passed to the ioctl is completed and returned. It contains the following | ||
30 | information | ||
31 | <P> | ||
32 | <TABLE> | ||
33 | <TR><TD><b>name[32]</b><TD>Canonical name for this interface</TD> | ||
34 | <TR><TD><b>type</b><TD>Type of interface</TD> | ||
35 | <TR><TD><b>channels</b><TD>Number of radio/tv channels if appropriate</TD> | ||
36 | <TR><TD><b>audios</b><TD>Number of audio devices if appropriate</TD> | ||
37 | <TR><TD><b>maxwidth</b><TD>Maximum capture width in pixels</TD> | ||
38 | <TR><TD><b>maxheight</b><TD>Maximum capture height in pixels</TD> | ||
39 | <TR><TD><b>minwidth</b><TD>Minimum capture width in pixels</TD> | ||
40 | <TR><TD><b>minheight</b><TD>Minimum capture height in pixels</TD> | ||
41 | </TABLE> | ||
42 | <P> | ||
43 | The type field lists the capability flags for the device. These are | ||
44 | as follows | ||
45 | <P> | ||
46 | <TABLE> | ||
47 | <TR><TH>Name</TH><TH>Description</TH> | ||
48 | <TR><TD><b>VID_TYPE_CAPTURE</b><TD>Can capture to memory</TD> | ||
49 | <TR><TD><b>VID_TYPE_TUNER</b><TD>Has a tuner of some form</TD> | ||
50 | <TR><TD><b>VID_TYPE_TELETEXT</b><TD>Has teletext capability</TD> | ||
51 | <TR><TD><b>VID_TYPE_OVERLAY</b><TD>Can overlay its image onto the frame buffer</TD> | ||
52 | <TR><TD><b>VID_TYPE_CHROMAKEY</b><TD>Overlay is Chromakeyed</TD> | ||
53 | <TR><TD><b>VID_TYPE_CLIPPING</b><TD>Overlay clipping is supported</TD> | ||
54 | <TR><TD><b>VID_TYPE_FRAMERAM</b><TD>Overlay overwrites frame buffer memory</TD> | ||
55 | <TR><TD><b>VID_TYPE_SCALES</b><TD>The hardware supports image scaling</TD> | ||
56 | <TR><TD><b>VID_TYPE_MONOCHROME</b><TD>Image capture is grey scale only</TD> | ||
57 | <TR><TD><b>VID_TYPE_SUBCAPTURE</b><TD>Capture can be of only part of the image</TD> | ||
58 | </TABLE> | ||
59 | <P> | ||
60 | The minimum and maximum sizes listed for a capture device do not imply all | ||
61 | that all height/width ratios or sizes within the range are possible. A | ||
62 | request to set a size will be honoured by the largest available capture | ||
63 | size whose capture is no large than the requested rectangle in either | ||
64 | direction. For example the quickcam has 3 fixed settings. | ||
65 | <P> | ||
66 | <H3>Frame Buffer</H3> | ||
67 | Capture cards that drop data directly onto the frame buffer must be told the | ||
68 | base address of the frame buffer, its size and organisation. This is a | ||
69 | privileged ioctl and one that eventually X itself should set. | ||
70 | <P> | ||
71 | The <b>VIDIOCSFBUF</b> ioctl sets the frame buffer parameters for a capture | ||
72 | card. If the card does not do direct writes to the frame buffer then this | ||
73 | ioctl will be unsupported. The <b>VIDIOCGFBUF</b> ioctl returns the | ||
74 | currently used parameters. The structure used in both cases is a | ||
75 | <b>struct video_buffer</b>. | ||
76 | <P> | ||
77 | <TABLE> | ||
78 | <TR><TD><b>void *base</b></TD><TD>Base physical address of the buffer</TD> | ||
79 | <TR><TD><b>int height</b></TD><TD>Height of the frame buffer</TD> | ||
80 | <TR><TD><b>int width</b></TD><TD>Width of the frame buffer</TD> | ||
81 | <TR><TD><b>int depth</b></TD><TD>Depth of the frame buffer</TD> | ||
82 | <TR><TD><b>int bytesperline</b></TD><TD>Number of bytes of memory between the start of two adjacent lines</TD> | ||
83 | </TABLE> | ||
84 | <P> | ||
85 | Note that these values reflect the physical layout of the frame buffer. | ||
86 | The visible area may be smaller. In fact under XFree86 this is commonly the | ||
87 | case. XFree86 DGA can provide the parameters required to set up this ioctl. | ||
88 | Setting the base address to NULL indicates there is no physical frame buffer | ||
89 | access. | ||
90 | <P> | ||
91 | <H3>Capture Windows</H3> | ||
92 | The capture area is described by a <b>struct video_window</b>. This defines | ||
93 | a capture area and the clipping information if relevant. The | ||
94 | <b>VIDIOCGWIN</b> ioctl recovers the current settings and the | ||
95 | <b>VIDIOCSWIN</b> sets new values. A successful call to <b>VIDIOCSWIN</b> | ||
96 | indicates that a suitable set of parameters have been chosen. They do not | ||
97 | indicate that exactly what was requested was granted. The program should | ||
98 | call <b>VIDIOCGWIN</b> to check if the nearest match was suitable. The | ||
99 | <b>struct video_window</b> contains the following fields. | ||
100 | <P> | ||
101 | <TABLE> | ||
102 | <TR><TD><b>x</b><TD>The X co-ordinate specified in X windows format.</TD> | ||
103 | <TR><TD><b>y</b><TD>The Y co-ordinate specified in X windows format.</TD> | ||
104 | <TR><TD><b>width</b><TD>The width of the image capture.</TD> | ||
105 | <TR><TD><b>height</b><TD>The height of the image capture.</TD> | ||
106 | <TR><TD><b>chromakey</b><TD>A host order RGB32 value for the chroma key.</TD> | ||
107 | <TR><TD><b>flags</b><TD>Additional capture flags.</TD> | ||
108 | <TR><TD><b>clips</b><TD>A list of clipping rectangles. <em>(Set only)</em></TD> | ||
109 | <TR><TD><b>clipcount</b><TD>The number of clipping rectangles. <em>(Set only)</em></TD> | ||
110 | </TABLE> | ||
111 | <P> | ||
112 | Clipping rectangles are passed as an array. Each clip consists of the following | ||
113 | fields available to the user. | ||
114 | <P> | ||
115 | <TABLE> | ||
116 | <TR><TD><b>x</b></TD><TD>X co-ordinate of rectangle to skip</TD> | ||
117 | <TR><TD><b>y</b></TD><TD>Y co-ordinate of rectangle to skip</TD> | ||
118 | <TR><TD><b>width</b></TD><TD>Width of rectangle to skip</TD> | ||
119 | <TR><TD><b>height</b></TD><TD>Height of rectangle to skip</TD> | ||
120 | </TABLE> | ||
121 | <P> | ||
122 | Merely setting the window does not enable capturing. Overlay capturing | ||
123 | (i.e. PCI-PCI transfer to the frame buffer of the video card) | ||
124 | is activated by passing the <b>VIDIOCCAPTURE</b> ioctl a value of 1, and | ||
125 | disabled by passing it a value of 0. | ||
126 | <P> | ||
127 | Some capture devices can capture a subfield of the image they actually see. | ||
128 | This is indicated when VIDEO_TYPE_SUBCAPTURE is defined. | ||
129 | The video_capture describes the time and special subfields to capture. | ||
130 | The video_capture structure contains the following fields. | ||
131 | <P> | ||
132 | <TABLE> | ||
133 | <TR><TD><b>x</b></TD><TD>X co-ordinate of source rectangle to grab</TD> | ||
134 | <TR><TD><b>y</b></TD><TD>Y co-ordinate of source rectangle to grab</TD> | ||
135 | <TR><TD><b>width</b></TD><TD>Width of source rectangle to grab</TD> | ||
136 | <TR><TD><b>height</b></TD><TD>Height of source rectangle to grab</TD> | ||
137 | <TR><TD><b>decimation</b></TD><TD>Decimation to apply</TD> | ||
138 | <TR><TD><b>flags</b></TD><TD>Flag settings for grabbing</TD> | ||
139 | </TABLE> | ||
140 | The available flags are | ||
141 | <P> | ||
142 | <TABLE> | ||
143 | <TR><TH>Name</TH><TH>Description</TH> | ||
144 | <TR><TD><b>VIDEO_CAPTURE_ODD</b><TD>Capture only odd frames</TD> | ||
145 | <TR><TD><b>VIDEO_CAPTURE_EVEN</b><TD>Capture only even frames</TD> | ||
146 | </TABLE> | ||
147 | <P> | ||
148 | <H3>Video Sources</H3> | ||
149 | Each video4linux video or audio device captures from one or more | ||
150 | source <b>channels</b>. Each channel can be queries with the | ||
151 | <b>VDIOCGCHAN</b> ioctl call. Before invoking this function the caller | ||
152 | must set the channel field to the channel that is being queried. On return | ||
153 | the <b>struct video_channel</b> is filled in with information about the | ||
154 | nature of the channel itself. | ||
155 | <P> | ||
156 | The <b>VIDIOCSCHAN</b> ioctl takes an integer argument and switches the | ||
157 | capture to this input. It is not defined whether parameters such as colour | ||
158 | settings or tuning are maintained across a channel switch. The caller should | ||
159 | maintain settings as desired for each channel. (This is reasonable as | ||
160 | different video inputs may have different properties). | ||
161 | <P> | ||
162 | The <b>struct video_channel</b> consists of the following | ||
163 | <P> | ||
164 | <TABLE> | ||
165 | <TR><TD><b>channel</b></TD><TD>The channel number</TD> | ||
166 | <TR><TD><b>name</b></TD><TD>The input name - preferably reflecting the label | ||
167 | on the card input itself</TD> | ||
168 | <TR><TD><b>tuners</b></TD><TD>Number of tuners for this input</TD> | ||
169 | <TR><TD><b>flags</b></TD><TD>Properties the tuner has</TD> | ||
170 | <TR><TD><b>type</b></TD><TD>Input type (if known)</TD> | ||
171 | <TR><TD><b>norm</b><TD>The norm for this channel</TD> | ||
172 | </TABLE> | ||
173 | <P> | ||
174 | The flags defined are | ||
175 | <P> | ||
176 | <TABLE> | ||
177 | <TR><TD><b>VIDEO_VC_TUNER</b><TD>Channel has tuners.</TD> | ||
178 | <TR><TD><b>VIDEO_VC_AUDIO</b><TD>Channel has audio.</TD> | ||
179 | <TR><TD><b>VIDEO_VC_NORM</b><TD>Channel has norm setting.</TD> | ||
180 | </TABLE> | ||
181 | <P> | ||
182 | The types defined are | ||
183 | <P> | ||
184 | <TABLE> | ||
185 | <TR><TD><b>VIDEO_TYPE_TV</b><TD>The input is a TV input.</TD> | ||
186 | <TR><TD><b>VIDEO_TYPE_CAMERA</b><TD>The input is a camera.</TD> | ||
187 | </TABLE> | ||
188 | <P> | ||
189 | <H3>Image Properties</H3> | ||
190 | The image properties of the picture can be queried with the <b>VIDIOCGPICT</b> | ||
191 | ioctl which fills in a <b>struct video_picture</b>. The <b>VIDIOCSPICT</b> | ||
192 | ioctl allows values to be changed. All values except for the palette type | ||
193 | are scaled between 0-65535. | ||
194 | <P> | ||
195 | The <b>struct video_picture</b> consists of the following fields | ||
196 | <P> | ||
197 | <TABLE> | ||
198 | <TR><TD><b>brightness</b><TD>Picture brightness</TD> | ||
199 | <TR><TD><b>hue</b><TD>Picture hue (colour only)</TD> | ||
200 | <TR><TD><b>colour</b><TD>Picture colour (colour only)</TD> | ||
201 | <TR><TD><b>contrast</b><TD>Picture contrast</TD> | ||
202 | <TR><TD><b>whiteness</b><TD>The whiteness (greyscale only)</TD> | ||
203 | <TR><TD><b>depth</b><TD>The capture depth (may need to match the frame buffer depth)</TD> | ||
204 | <TR><TD><b>palette</b><TD>Reports the palette that should be used for this image</TD> | ||
205 | </TABLE> | ||
206 | <P> | ||
207 | The following palettes are defined | ||
208 | <P> | ||
209 | <TABLE> | ||
210 | <TR><TD><b>VIDEO_PALETTE_GREY</b><TD>Linear intensity grey scale (255 is brightest).</TD> | ||
211 | <TR><TD><b>VIDEO_PALETTE_HI240</b><TD>The BT848 8bit colour cube.</TD> | ||
212 | <TR><TD><b>VIDEO_PALETTE_RGB565</b><TD>RGB565 packed into 16 bit words.</TD> | ||
213 | <TR><TD><b>VIDEO_PALETTE_RGB555</b><TD>RGV555 packed into 16 bit words, top bit undefined.</TD> | ||
214 | <TR><TD><b>VIDEO_PALETTE_RGB24</b><TD>RGB888 packed into 24bit words.</TD> | ||
215 | <TR><TD><b>VIDEO_PALETTE_RGB32</b><TD>RGB888 packed into the low 3 bytes of 32bit words. The top 8bits are undefined.</TD> | ||
216 | <TR><TD><b>VIDEO_PALETTE_YUV422</b><TD>Video style YUV422 - 8bits packed 4bits Y 2bits U 2bits V</TD> | ||
217 | <TR><TD><b>VIDEO_PALETTE_YUYV</b><TD>Describe me</TD> | ||
218 | <TR><TD><b>VIDEO_PALETTE_UYVY</b><TD>Describe me</TD> | ||
219 | <TR><TD><b>VIDEO_PALETTE_YUV420</b><TD>YUV420 capture</TD> | ||
220 | <TR><TD><b>VIDEO_PALETTE_YUV411</b><TD>YUV411 capture</TD> | ||
221 | <TR><TD><b>VIDEO_PALETTE_RAW</b><TD>RAW capture (BT848)</TD> | ||
222 | <TR><TD><b>VIDEO_PALETTE_YUV422P</b><TD>YUV 4:2:2 Planar</TD> | ||
223 | <TR><TD><b>VIDEO_PALETTE_YUV411P</b><TD>YUV 4:1:1 Planar</TD> | ||
224 | </TABLE> | ||
225 | <P> | ||
226 | <H3>Tuning</H3> | ||
227 | Each video input channel can have one or more tuners associated with it. Many | ||
228 | devices will not have tuners. TV cards and radio cards will have one or more | ||
229 | tuners attached. | ||
230 | <P> | ||
231 | Tuners are described by a <b>struct video_tuner</b> which can be obtained by | ||
232 | the <b>VIDIOCGTUNER</b> ioctl. Fill in the tuner number in the structure | ||
233 | then pass the structure to the ioctl to have the data filled in. The | ||
234 | tuner can be switched using <b>VIDIOCSTUNER</b> which takes an integer argument | ||
235 | giving the tuner to use. A struct tuner has the following fields | ||
236 | <P> | ||
237 | <TABLE> | ||
238 | <TR><TD><b>tuner</b><TD>Number of the tuner</TD> | ||
239 | <TR><TD><b>name</b><TD>Canonical name for this tuner (eg FM/AM/TV)</TD> | ||
240 | <TR><TD><b>rangelow</b><TD>Lowest tunable frequency</TD> | ||
241 | <TR><TD><b>rangehigh</b><TD>Highest tunable frequency</TD> | ||
242 | <TR><TD><b>flags</b><TD>Flags describing the tuner</TD> | ||
243 | <TR><TD><b>mode</b><TD>The video signal mode if relevant</TD> | ||
244 | <TR><TD><b>signal</b><TD>Signal strength if known - between 0-65535</TD> | ||
245 | </TABLE> | ||
246 | <P> | ||
247 | The following flags exist | ||
248 | <P> | ||
249 | <TABLE> | ||
250 | <TR><TD><b>VIDEO_TUNER_PAL</b><TD>PAL tuning is supported</TD> | ||
251 | <TR><TD><b>VIDEO_TUNER_NTSC</b><TD>NTSC tuning is supported</TD> | ||
252 | <TR><TD><b>VIDEO_TUNER_SECAM</b><TD>SECAM tuning is supported</TD> | ||
253 | <TR><TD><b>VIDEO_TUNER_LOW</b><TD>Frequency is in a lower range</TD> | ||
254 | <TR><TD><b>VIDEO_TUNER_NORM</b><TD>The norm for this tuner is settable</TD> | ||
255 | <TR><TD><b>VIDEO_TUNER_STEREO_ON</b><TD>The tuner is seeing stereo audio</TD> | ||
256 | <TR><TD><b>VIDEO_TUNER_RDS_ON</b><TD>The tuner is seeing a RDS datastream</TD> | ||
257 | <TR><TD><b>VIDEO_TUNER_MBS_ON</b><TD>The tuner is seeing a MBS datastream</TD> | ||
258 | </TABLE> | ||
259 | <P> | ||
260 | The following modes are defined | ||
261 | <P> | ||
262 | <TABLE> | ||
263 | <TR><TD><b>VIDEO_MODE_PAL</b><TD>The tuner is in PAL mode</TD> | ||
264 | <TR><TD><b>VIDEO_MODE_NTSC</b><TD>The tuner is in NTSC mode</TD> | ||
265 | <TR><TD><b>VIDEO_MODE_SECAM</b><TD>The tuner is in SECAM mode</TD> | ||
266 | <TR><TD><b>VIDEO_MODE_AUTO</b><TD>The tuner auto switches, or mode does not apply</TD> | ||
267 | </TABLE> | ||
268 | <P> | ||
269 | Tuning frequencies are an unsigned 32bit value in 1/16th MHz or if the | ||
270 | <b>VIDEO_TUNER_LOW</b> flag is set they are in 1/16th KHz. The current | ||
271 | frequency is obtained as an unsigned long via the <b>VIDIOCGFREQ</b> ioctl and | ||
272 | set by the <b>VIDIOCSFREQ</b> ioctl. | ||
273 | <P> | ||
274 | <H3>Audio</H3> | ||
275 | TV and Radio devices have one or more audio inputs that may be selected. | ||
276 | The audio properties are queried by passing a <b>struct video_audio</b> to <b>VIDIOCGAUDIO</b> ioctl. The | ||
277 | <b>VIDIOCSAUDIO</b> ioctl sets audio properties. | ||
278 | <P> | ||
279 | The structure contains the following fields | ||
280 | <P> | ||
281 | <TABLE> | ||
282 | <TR><TD><b>audio</b><TD>The channel number</TD> | ||
283 | <TR><TD><b>volume</b><TD>The volume level</TD> | ||
284 | <TR><TD><b>bass</b><TD>The bass level</TD> | ||
285 | <TR><TD><b>treble</b><TD>The treble level</TD> | ||
286 | <TR><TD><b>flags</b><TD>Flags describing the audio channel</TD> | ||
287 | <TR><TD><b>name</b><TD>Canonical name for the audio input</TD> | ||
288 | <TR><TD><b>mode</b><TD>The mode the audio input is in</TD> | ||
289 | <TR><TD><b>balance</b><TD>The left/right balance</TD> | ||
290 | <TR><TD><b>step</b><TD>Actual step used by the hardware</TD> | ||
291 | </TABLE> | ||
292 | <P> | ||
293 | The following flags are defined | ||
294 | <P> | ||
295 | <TABLE> | ||
296 | <TR><TD><b>VIDEO_AUDIO_MUTE</b><TD>The audio is muted</TD> | ||
297 | <TR><TD><b>VIDEO_AUDIO_MUTABLE</b><TD>Audio muting is supported</TD> | ||
298 | <TR><TD><b>VIDEO_AUDIO_VOLUME</b><TD>The volume is controllable</TD> | ||
299 | <TR><TD><b>VIDEO_AUDIO_BASS</b><TD>The bass is controllable</TD> | ||
300 | <TR><TD><b>VIDEO_AUDIO_TREBLE</b><TD>The treble is controllable</TD> | ||
301 | <TR><TD><b>VIDEO_AUDIO_BALANCE</b><TD>The balance is controllable</TD> | ||
302 | </TABLE> | ||
303 | <P> | ||
304 | The following decoding modes are defined | ||
305 | <P> | ||
306 | <TABLE> | ||
307 | <TR><TD><b>VIDEO_SOUND_MONO</b><TD>Mono signal</TD> | ||
308 | <TR><TD><b>VIDEO_SOUND_STEREO</b><TD>Stereo signal (NICAM for TV)</TD> | ||
309 | <TR><TD><b>VIDEO_SOUND_LANG1</b><TD>European TV alternate language 1</TD> | ||
310 | <TR><TD><b>VIDEO_SOUND_LANG2</b><TD>European TV alternate language 2</TD> | ||
311 | </TABLE> | ||
312 | <P> | ||
313 | <H3>Reading Images</H3> | ||
314 | Each call to the <b>read</b> syscall returns the next available image | ||
315 | from the device. It is up to the caller to set format and size (using | ||
316 | the VIDIOCSPICT and VIDIOCSWIN ioctls) and then to pass a suitable | ||
317 | size buffer and length to the function. Not all devices will support | ||
318 | read operations. | ||
319 | <P> | ||
320 | A second way to handle image capture is via the mmap interface if supported. | ||
321 | To use the mmap interface a user first sets the desired image size and depth | ||
322 | properties. Next the VIDIOCGMBUF ioctl is issued. This reports the size | ||
323 | of buffer to mmap and the offset within the buffer for each frame. The | ||
324 | number of frames supported is device dependent and may only be one. | ||
325 | <P> | ||
326 | The video_mbuf structure contains the following fields | ||
327 | <P> | ||
328 | <TABLE> | ||
329 | <TR><TD><b>size</b><TD>The number of bytes to map</TD> | ||
330 | <TR><TD><b>frames</b><TD>The number of frames</TD> | ||
331 | <TR><TD><b>offsets</b><TD>The offset of each frame</TD> | ||
332 | </TABLE> | ||
333 | <P> | ||
334 | Once the mmap has been made the VIDIOCMCAPTURE ioctl starts the | ||
335 | capture to a frame using the format and image size specified in the | ||
336 | video_mmap (which should match or be below the initial query size). | ||
337 | When the VIDIOCMCAPTURE ioctl returns the frame is <em>not</em> | ||
338 | captured yet, the driver just instructed the hardware to start the | ||
339 | capture. The application has to use the VIDIOCSYNC ioctl to wait | ||
340 | until the capture of a frame is finished. VIDIOCSYNC takes the frame | ||
341 | number you want to wait for as argument. | ||
342 | <p> | ||
343 | It is allowed to call VIDIOCMCAPTURE multiple times (with different | ||
344 | frame numbers in video_mmap->frame of course) and thus have multiple | ||
345 | outstanding capture requests. A simple way do to double-buffering | ||
346 | using this feature looks like this: | ||
347 | <pre> | ||
348 | /* setup everything */ | ||
349 | VIDIOCMCAPTURE(0) | ||
350 | while (whatever) { | ||
351 | VIDIOCMCAPTURE(1) | ||
352 | VIDIOCSYNC(0) | ||
353 | /* process frame 0 while the hardware captures frame 1 */ | ||
354 | VIDIOCMCAPTURE(0) | ||
355 | VIDIOCSYNC(1) | ||
356 | /* process frame 1 while the hardware captures frame 0 */ | ||
357 | } | ||
358 | </pre> | ||
359 | Note that you are <em>not</em> limited to only two frames. The API | ||
360 | allows up to 32 frames, the VIDIOCGMBUF ioctl returns the number of | ||
361 | frames the driver granted. Thus it is possible to build deeper queues | ||
362 | to avoid loosing frames on load peaks. | ||
363 | <p> | ||
364 | While capturing to memory the driver will make a "best effort" attempt | ||
365 | to capture to screen as well if requested. This normally means all | ||
366 | frames that "miss" memory mapped capture will go to the display. | ||
367 | <P> | ||
368 | A final ioctl exists to allow a device to obtain related devices if a | ||
369 | driver has multiple components (for example video0 may not be associated | ||
370 | with vbi0 which would cause an intercast display program to make a bad | ||
371 | mistake). The VIDIOCGUNIT ioctl reports the unit numbers of the associated | ||
372 | devices if any exist. The video_unit structure has the following fields. | ||
373 | <P> | ||
374 | <TABLE> | ||
375 | <TR><TD><b>video</b><TD>Video capture device</TD> | ||
376 | <TR><TD><b>vbi</b><TD>VBI capture device</TD> | ||
377 | <TR><TD><b>radio</b><TD>Radio device</TD> | ||
378 | <TR><TD><b>audio</b><TD>Audio mixer</TD> | ||
379 | <TR><TD><b>teletext</b><TD>Teletext device</TD> | ||
380 | </TABLE> | ||
381 | <P> | ||
382 | <H3>RDS Datastreams</H3> | ||
383 | For radio devices that support it, it is possible to receive Radio Data | ||
384 | System (RDS) data by means of a read() on the device. The data is packed in | ||
385 | groups of three, as follows: | ||
386 | <TABLE> | ||
387 | <TR><TD>First Octet</TD><TD>Least Significant Byte of RDS Block</TD></TR> | ||
388 | <TR><TD>Second Octet</TD><TD>Most Significant Byte of RDS Block | ||
389 | <TR><TD>Third Octet</TD><TD>Bit 7:</TD><TD>Error bit. Indicates that | ||
390 | an uncorrectable error occurred during reception of this block.</TD></TR> | ||
391 | <TR><TD> </TD><TD>Bit 6:</TD><TD>Corrected bit. Indicates that | ||
392 | an error was corrected for this data block.</TD></TR> | ||
393 | <TR><TD> </TD><TD>Bits 5-3:</TD><TD>Received Offset. Indicates the | ||
394 | offset received by the sync system.</TD></TR> | ||
395 | <TR><TD> </TD><TD>Bits 2-0:</TD><TD>Offset Name. Indicates the | ||
396 | offset applied to this data.</TD></TR> | ||
397 | </TABLE> | ||
398 | </BODY> | ||
399 | </HTML> | ||
diff --git a/Documentation/video4linux/CARDLIST.bttv b/Documentation/video4linux/CARDLIST.bttv index e46761c39e3f..62a12a08e2ac 100644 --- a/Documentation/video4linux/CARDLIST.bttv +++ b/Documentation/video4linux/CARDLIST.bttv | |||
@@ -1,4 +1,4 @@ | |||
1 | card=0 - *** UNKNOWN/GENERIC *** | 1 | card=0 - *** UNKNOWN/GENERIC *** |
2 | card=1 - MIRO PCTV | 2 | card=1 - MIRO PCTV |
3 | card=2 - Hauppauge (bt848) | 3 | card=2 - Hauppauge (bt848) |
4 | card=3 - STB, Gateway P/N 6000699 (bt848) | 4 | card=3 - STB, Gateway P/N 6000699 (bt848) |
@@ -119,3 +119,17 @@ card=117 - NGS NGSTV+ | |||
119 | card=118 - LMLBT4 | 119 | card=118 - LMLBT4 |
120 | card=119 - Tekram M205 PRO | 120 | card=119 - Tekram M205 PRO |
121 | card=120 - Conceptronic CONTVFMi | 121 | card=120 - Conceptronic CONTVFMi |
122 | card=121 - Euresys Picolo Tetra | ||
123 | card=122 - Spirit TV Tuner | ||
124 | card=123 - AVerMedia AVerTV DVB-T 771 | ||
125 | card=124 - AverMedia AverTV DVB-T 761 | ||
126 | card=125 - MATRIX Vision Sigma-SQ | ||
127 | card=126 - MATRIX Vision Sigma-SLC | ||
128 | card=127 - APAC Viewcomp 878(AMAX) | ||
129 | card=128 - DVICO FusionHDTV DVB-T Lite | ||
130 | card=129 - V-Gear MyVCD | ||
131 | card=130 - Super TV Tuner | ||
132 | card=131 - Tibet Systems 'Progress DVR' CS16 | ||
133 | card=132 - Kodicom 4400R (master) | ||
134 | card=133 - Kodicom 4400R (slave) | ||
135 | card=134 - Adlink RTV24 | ||
diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 new file mode 100644 index 000000000000..03deb0726aa4 --- /dev/null +++ b/Documentation/video4linux/CARDLIST.cx88 | |||
@@ -0,0 +1,32 @@ | |||
1 | card=0 - UNKNOWN/GENERIC | ||
2 | card=1 - Hauppauge WinTV 34xxx models | ||
3 | card=2 - GDI Black Gold | ||
4 | card=3 - PixelView | ||
5 | card=4 - ATI TV Wonder Pro | ||
6 | card=5 - Leadtek Winfast 2000XP Expert | ||
7 | card=6 - AverTV Studio 303 (M126) | ||
8 | card=7 - MSI TV-@nywhere Master | ||
9 | card=8 - Leadtek Winfast DV2000 | ||
10 | card=9 - Leadtek PVR 2000 | ||
11 | card=10 - IODATA GV-VCP3/PCI | ||
12 | card=11 - Prolink PlayTV PVR | ||
13 | card=12 - ASUS PVR-416 | ||
14 | card=13 - MSI TV-@nywhere | ||
15 | card=14 - KWorld/VStream XPert DVB-T | ||
16 | card=15 - DViCO FusionHDTV DVB-T1 | ||
17 | card=16 - KWorld LTV883RF | ||
18 | card=17 - DViCO FusionHDTV 3 Gold-Q | ||
19 | card=18 - Hauppauge Nova-T DVB-T | ||
20 | card=19 - Conexant DVB-T reference design | ||
21 | card=20 - Provideo PV259 | ||
22 | card=21 - DViCO FusionHDTV DVB-T Plus | ||
23 | card=22 - digitalnow DNTV Live! DVB-T | ||
24 | card=23 - pcHDTV HD3000 HDTV | ||
25 | card=24 - Hauppauge WinTV 28xxx (Roslyn) models | ||
26 | card=25 - Digital-Logic MICROSPACE Entertainment Center (MEC) | ||
27 | card=26 - IODATA GV/BCTV7E | ||
28 | card=27 - PixelView PlayTV Ultra Pro (Stereo) | ||
29 | card=28 - DViCO FusionHDTV 3 Gold-T | ||
30 | card=29 - ADS Tech Instant TV DVB-T PCI | ||
31 | card=30 - TerraTec Cinergy 1400 DVB-T | ||
32 | card=31 - DViCO FusionHDTV 5 Gold | ||
diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index a6c82fa4de02..1b5a3a9ffbe2 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 | |||
@@ -1,10 +1,10 @@ | |||
1 | 0 -> UNKNOWN/GENERIC | 1 | 0 -> UNKNOWN/GENERIC |
2 | 1 -> Proteus Pro [philips reference design] [1131:2001,1131:2001] | 2 | 1 -> Proteus Pro [philips reference design] [1131:2001,1131:2001] |
3 | 2 -> LifeView FlyVIDEO3000 [5168:0138,4e42:0138] | 3 | 2 -> LifeView FlyVIDEO3000 [5168:0138,4e42:0138] |
4 | 3 -> LifeView FlyVIDEO2000 [5168:0138] | 4 | 3 -> LifeView FlyVIDEO2000 [5168:0138] |
5 | 4 -> EMPRESS [1131:6752] | 5 | 4 -> EMPRESS [1131:6752] |
6 | 5 -> SKNet Monster TV [1131:4e85] | 6 | 5 -> SKNet Monster TV [1131:4e85] |
7 | 6 -> Tevion MD 9717 | 7 | 6 -> Tevion MD 9717 |
8 | 7 -> KNC One TV-Station RDS / Typhoon TV Tuner RDS [1131:fe01,1894:fe01] | 8 | 7 -> KNC One TV-Station RDS / Typhoon TV Tuner RDS [1131:fe01,1894:fe01] |
9 | 8 -> Terratec Cinergy 400 TV [153B:1142] | 9 | 8 -> Terratec Cinergy 400 TV [153B:1142] |
10 | 9 -> Medion 5044 | 10 | 9 -> Medion 5044 |
@@ -20,16 +20,45 @@ | |||
20 | 19 -> Compro VideoMate TV [185b:c100] | 20 | 19 -> Compro VideoMate TV [185b:c100] |
21 | 20 -> Matrox CronosPlus [102B:48d0] | 21 | 20 -> Matrox CronosPlus [102B:48d0] |
22 | 21 -> 10MOONS PCI TV CAPTURE CARD [1131:2001] | 22 | 21 -> 10MOONS PCI TV CAPTURE CARD [1131:2001] |
23 | 22 -> Medion 2819/ AverMedia M156 [1461:a70b,1461:2115] | 23 | 22 -> AverMedia M156 / Medion 2819 [1461:a70b] |
24 | 23 -> BMK MPEX Tuner | 24 | 23 -> BMK MPEX Tuner |
25 | 24 -> KNC One TV-Station DVR [1894:a006] | 25 | 24 -> KNC One TV-Station DVR [1894:a006] |
26 | 25 -> ASUS TV-FM 7133 [1043:4843] | 26 | 25 -> ASUS TV-FM 7133 [1043:4843] |
27 | 26 -> Pinnacle PCTV Stereo (saa7134) [11bd:002b] | 27 | 26 -> Pinnacle PCTV Stereo (saa7134) [11bd:002b] |
28 | 27 -> Manli MuchTV M-TV002 | 28 | 27 -> Manli MuchTV M-TV002/Behold TV 403 FM |
29 | 28 -> Manli MuchTV M-TV001 | 29 | 28 -> Manli MuchTV M-TV001/Behold TV 401 |
30 | 29 -> Nagase Sangyo TransGear 3000TV [1461:050c] | 30 | 29 -> Nagase Sangyo TransGear 3000TV [1461:050c] |
31 | 30 -> Elitegroup ECS TVP3XP FM1216 Tuner Card(PAL-BG,FM) [1019:4cb4] | 31 | 30 -> Elitegroup ECS TVP3XP FM1216 Tuner Card(PAL-BG,FM) [1019:4cb4] |
32 | 31 -> Elitegroup ECS TVP3XP FM1236 Tuner Card (NTSC,FM) [1019:4cb5] | 32 | 31 -> Elitegroup ECS TVP3XP FM1236 Tuner Card (NTSC,FM) [1019:4cb5] |
33 | 32 -> AVACS SmartTV | 33 | 32 -> AVACS SmartTV |
34 | 33 -> AVerMedia DVD EZMaker [1461:10ff] | 34 | 33 -> AVerMedia DVD EZMaker [1461:10ff] |
35 | 34 -> LifeView FlyTV Platinum33 mini [5168:0212] | 35 | 34 -> Noval Prime TV 7133 |
36 | 35 -> AverMedia AverTV Studio 305 [1461:2115] | ||
37 | 36 -> UPMOST PURPLE TV [12ab:0800] | ||
38 | 37 -> Items MuchTV Plus / IT-005 | ||
39 | 38 -> Terratec Cinergy 200 TV [153B:1152] | ||
40 | 39 -> LifeView FlyTV Platinum Mini [5168:0212] | ||
41 | 40 -> Compro VideoMate TV PVR/FM [185b:c100] | ||
42 | 41 -> Compro VideoMate TV Gold+ [185b:c100] | ||
43 | 42 -> Sabrent SBT-TVFM (saa7130) | ||
44 | 43 -> :Zolid Xpert TV7134 | ||
45 | 44 -> Empire PCI TV-Radio LE | ||
46 | 45 -> Avermedia AVerTV Studio 307 [1461:9715] | ||
47 | 46 -> AVerMedia Cardbus TV/Radio (E500) [1461:d6ee] | ||
48 | 47 -> Terratec Cinergy 400 mobile [153b:1162] | ||
49 | 48 -> Terratec Cinergy 600 TV MK3 [153B:1158] | ||
50 | 49 -> Compro VideoMate Gold+ Pal [185b:c200] | ||
51 | 50 -> Pinnacle PCTV 300i DVB-T + PAL [11bd:002d] | ||
52 | 51 -> ProVideo PV952 [1540:9524] | ||
53 | 52 -> AverMedia AverTV/305 [1461:2108] | ||
54 | 53 -> ASUS TV-FM 7135 [1043:4845] | ||
55 | 54 -> LifeView FlyTV Platinum FM [5168:0214,1489:0214] | ||
56 | 55 -> LifeView FlyDVB-T DUO [5168:0502,5168:0306] | ||
57 | 56 -> Avermedia AVerTV 307 [1461:a70a] | ||
58 | 57 -> Avermedia AVerTV GO 007 FM [1461:f31f] | ||
59 | 58 -> ADS Tech Instant TV (saa7135) [1421:0350,1421:0370] | ||
60 | 59 -> Kworld/Tevion V-Stream Xpert TV PVR7134 | ||
61 | 60 -> Typhoon DVB-T Duo Digital/Analog Cardbus [4e42:0502] | ||
62 | 61 -> Philips TOUGH DVB-T reference design [1131:2004] | ||
63 | 62 -> Compro VideoMate TV Gold+II | ||
64 | 63 -> Kworld Xpert TV PVR7134 | ||
diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index f7bafe862ba0..f3302e1b1b9c 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner | |||
@@ -44,3 +44,23 @@ tuner=42 - Philips 1236D ATSC/NTSC daul in | |||
44 | tuner=43 - Philips NTSC MK3 (FM1236MK3 or FM1236/F) | 44 | tuner=43 - Philips NTSC MK3 (FM1236MK3 or FM1236/F) |
45 | tuner=44 - Philips 4 in 1 (ATI TV Wonder Pro/Conexant) | 45 | tuner=44 - Philips 4 in 1 (ATI TV Wonder Pro/Conexant) |
46 | tuner=45 - Microtune 4049 FM5 | 46 | tuner=45 - Microtune 4049 FM5 |
47 | tuner=46 - Panasonic VP27s/ENGE4324D | ||
48 | tuner=47 - LG NTSC (TAPE series) | ||
49 | tuner=48 - Tenna TNF 8831 BGFF) | ||
50 | tuner=49 - Microtune 4042 FI5 ATSC/NTSC dual in | ||
51 | tuner=50 - TCL 2002N | ||
52 | tuner=51 - Philips PAL/SECAM_D (FM 1256 I-H3) | ||
53 | tuner=52 - Thomson DDT 7610 (ATSC/NTSC) | ||
54 | tuner=53 - Philips FQ1286 | ||
55 | tuner=54 - tda8290+75 | ||
56 | tuner=55 - LG PAL (TAPE series) | ||
57 | tuner=56 - Philips PAL/SECAM multi (FQ1216AME MK4) | ||
58 | tuner=57 - Philips FQ1236A MK4 | ||
59 | tuner=58 - Ymec TVision TVF-8531MF/8831MF/8731MF | ||
60 | tuner=59 - Ymec TVision TVF-5533MF | ||
61 | tuner=60 - Thomson DDT 7611 (ATSC/NTSC) | ||
62 | tuner=61 - Tena TNF9533-D/IF/TNF9533-B/DF | ||
63 | tuner=62 - Philips TEA5767HN FM Radio | ||
64 | tuner=63 - Philips FMD1216ME MK3 Hybrid Tuner | ||
65 | tuner=64 - LG TDVS-H062F/TUA6034 | ||
66 | tuner=65 - Ymec TVF66T5-B/DFF | ||
diff --git a/Documentation/video4linux/README.saa7134 b/Documentation/video4linux/README.saa7134 index 1a446c65365e..1f788e498eff 100644 --- a/Documentation/video4linux/README.saa7134 +++ b/Documentation/video4linux/README.saa7134 | |||
@@ -57,6 +57,15 @@ Cards can use either of these two crystals (xtal): | |||
57 | - 24.576MHz -> .audio_clock=0x200000 | 57 | - 24.576MHz -> .audio_clock=0x200000 |
58 | (xtal * .audio_clock = 51539600) | 58 | (xtal * .audio_clock = 51539600) |
59 | 59 | ||
60 | Some details about 30/34/35: | ||
61 | |||
62 | - saa7130 - low-price chip, doesn't have mute, that is why all those | ||
63 | cards should have .mute field defined in their tuner structure. | ||
64 | |||
65 | - saa7134 - usual chip | ||
66 | |||
67 | - saa7133/35 - saa7135 is probably a marketing decision, since all those | ||
68 | chips identifies itself as 33 on pci. | ||
60 | 69 | ||
61 | Credits | 70 | Credits |
62 | ======= | 71 | ======= |
diff --git a/Documentation/video4linux/bttv/Cards b/Documentation/video4linux/bttv/Cards index 7f8c7eb70ab2..8f1941ede4da 100644 --- a/Documentation/video4linux/bttv/Cards +++ b/Documentation/video4linux/bttv/Cards | |||
@@ -20,7 +20,7 @@ All other cards only differ by additional components as tuners, sound | |||
20 | decoders, EEPROMs, teletext decoders ... | 20 | decoders, EEPROMs, teletext decoders ... |
21 | 21 | ||
22 | 22 | ||
23 | Unsupported Cards: | 23 | Unsupported Cards: |
24 | ------------------ | 24 | ------------------ |
25 | 25 | ||
26 | Cards with Zoran (ZR) or Philips (SAA) or ISA are not supported by | 26 | Cards with Zoran (ZR) or Philips (SAA) or ISA are not supported by |
@@ -50,11 +50,11 @@ Bt848a/Bt849 single crytal operation support possible!!! | |||
50 | Miro/Pinnacle PCTV | 50 | Miro/Pinnacle PCTV |
51 | ------------------ | 51 | ------------------ |
52 | 52 | ||
53 | - Bt848 | 53 | - Bt848 |
54 | some (all??) come with 2 crystals for PAL/SECAM and NTSC | 54 | some (all??) come with 2 crystals for PAL/SECAM and NTSC |
55 | - PAL, SECAM or NTSC TV tuner (Philips or TEMIC) | 55 | - PAL, SECAM or NTSC TV tuner (Philips or TEMIC) |
56 | - MSP34xx sound decoder on add on board | 56 | - MSP34xx sound decoder on add on board |
57 | decoder is supported but AFAIK does not yet work | 57 | decoder is supported but AFAIK does not yet work |
58 | (other sound MUX setting in GPIO port needed??? somebody who fixed this???) | 58 | (other sound MUX setting in GPIO port needed??? somebody who fixed this???) |
59 | - 1 tuner, 1 composite and 1 S-VHS input | 59 | - 1 tuner, 1 composite and 1 S-VHS input |
60 | - tuner type is autodetected | 60 | - tuner type is autodetected |
@@ -70,7 +70,7 @@ in 1997! | |||
70 | Hauppauge Win/TV pci | 70 | Hauppauge Win/TV pci |
71 | -------------------- | 71 | -------------------- |
72 | 72 | ||
73 | There are many different versions of the Hauppauge cards with different | 73 | There are many different versions of the Hauppauge cards with different |
74 | tuners (TV+Radio ...), teletext decoders. | 74 | tuners (TV+Radio ...), teletext decoders. |
75 | Note that even cards with same model numbers have (depending on the revision) | 75 | Note that even cards with same model numbers have (depending on the revision) |
76 | different chips on it. | 76 | different chips on it. |
@@ -80,22 +80,22 @@ different chips on it. | |||
80 | - PAL, SECAM, NTSC or tuner with or without Radio support | 80 | - PAL, SECAM, NTSC or tuner with or without Radio support |
81 | 81 | ||
82 | e.g.: | 82 | e.g.: |
83 | PAL: | 83 | PAL: |
84 | TDA5737: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners | 84 | TDA5737: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners |
85 | TSA5522: 1.4 GHz I2C-bus controlled synthesizer, I2C 0xc2-0xc3 | 85 | TSA5522: 1.4 GHz I2C-bus controlled synthesizer, I2C 0xc2-0xc3 |
86 | 86 | ||
87 | NTSC: | 87 | NTSC: |
88 | TDA5731: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners | 88 | TDA5731: VHF, hyperband and UHF mixer/oscillator for TV and VCR 3-band tuners |
89 | TSA5518: no datasheet available on Philips site | 89 | TSA5518: no datasheet available on Philips site |
90 | - Philips SAA5246 or SAA5284 ( or no) Teletext decoder chip | 90 | - Philips SAA5246 or SAA5284 ( or no) Teletext decoder chip |
91 | with buffer RAM (e.g. Winbond W24257AS-35: 32Kx8 CMOS static RAM) | 91 | with buffer RAM (e.g. Winbond W24257AS-35: 32Kx8 CMOS static RAM) |
92 | SAA5246 (I2C 0x22) is supported | 92 | SAA5246 (I2C 0x22) is supported |
93 | - 256 bytes EEPROM: Microchip 24LC02B or Philips 8582E2Y | 93 | - 256 bytes EEPROM: Microchip 24LC02B or Philips 8582E2Y |
94 | with configuration information | 94 | with configuration information |
95 | I2C address 0xa0 (24LC02B also responds to 0xa2-0xaf) | 95 | I2C address 0xa0 (24LC02B also responds to 0xa2-0xaf) |
96 | - 1 tuner, 1 composite and (depending on model) 1 S-VHS input | 96 | - 1 tuner, 1 composite and (depending on model) 1 S-VHS input |
97 | - 14052B: mux for selection of sound source | 97 | - 14052B: mux for selection of sound source |
98 | - sound decoder: TDA9800, MSP34xx (stereo cards) | 98 | - sound decoder: TDA9800, MSP34xx (stereo cards) |
99 | 99 | ||
100 | 100 | ||
101 | Askey CPH-Series | 101 | Askey CPH-Series |
@@ -108,17 +108,17 @@ Developed by TelSignal(?), OEMed by many vendors (Typhoon, Anubis, Dynalink) | |||
108 | CPH05x: BT878 with FM | 108 | CPH05x: BT878 with FM |
109 | CPH06x: BT878 (w/o FM) | 109 | CPH06x: BT878 (w/o FM) |
110 | CPH07x: BT878 capture only | 110 | CPH07x: BT878 capture only |
111 | 111 | ||
112 | TV standards: | 112 | TV standards: |
113 | CPH0x0: NTSC-M/M | 113 | CPH0x0: NTSC-M/M |
114 | CPH0x1: PAL-B/G | 114 | CPH0x1: PAL-B/G |
115 | CPH0x2: PAL-I/I | 115 | CPH0x2: PAL-I/I |
116 | CPH0x3: PAL-D/K | 116 | CPH0x3: PAL-D/K |
117 | CPH0x4: SECAM-L/L | 117 | CPH0x4: SECAM-L/L |
118 | CPH0x5: SECAM-B/G | 118 | CPH0x5: SECAM-B/G |
119 | CPH0x6: SECAM-D/K | 119 | CPH0x6: SECAM-D/K |
120 | CPH0x7: PAL-N/N | 120 | CPH0x7: PAL-N/N |
121 | CPH0x8: PAL-B/H | 121 | CPH0x8: PAL-B/H |
122 | CPH0x9: PAL-M/M | 122 | CPH0x9: PAL-M/M |
123 | 123 | ||
124 | CPH03x was often sold as "TV capturer". | 124 | CPH03x was often sold as "TV capturer". |
@@ -174,7 +174,7 @@ Lifeview Flyvideo Series: | |||
174 | "The FlyVideo2000 and FlyVideo2000s product name have renamed to FlyVideo98." | 174 | "The FlyVideo2000 and FlyVideo2000s product name have renamed to FlyVideo98." |
175 | Their Bt8x8 cards are listed as discontinued. | 175 | Their Bt8x8 cards are listed as discontinued. |
176 | Flyvideo 2000S was probably sold as Flyvideo 3000 in some contries(Europe?). | 176 | Flyvideo 2000S was probably sold as Flyvideo 3000 in some contries(Europe?). |
177 | The new Flyvideo 2000/3000 are SAA7130/SAA7134 based. | 177 | The new Flyvideo 2000/3000 are SAA7130/SAA7134 based. |
178 | 178 | ||
179 | "Flyvideo II" had been the name for the 848 cards, nowadays (in Germany) | 179 | "Flyvideo II" had been the name for the 848 cards, nowadays (in Germany) |
180 | this name is re-used for LR50 Rev.W. | 180 | this name is re-used for LR50 Rev.W. |
@@ -235,12 +235,12 @@ Prolink | |||
235 | Multimedia TV packages (card + software pack): | 235 | Multimedia TV packages (card + software pack): |
236 | PixelView Play TV Theater - (Model: PV-M4200) = PixelView Play TV pro + Software | 236 | PixelView Play TV Theater - (Model: PV-M4200) = PixelView Play TV pro + Software |
237 | PixelView Play TV PAK - (Model: PV-BT878P+ REV 4E) | 237 | PixelView Play TV PAK - (Model: PV-BT878P+ REV 4E) |
238 | PixelView Play TV/VCR - (Model: PV-M3200 REV 4C / 8D / 10A ) | 238 | PixelView Play TV/VCR - (Model: PV-M3200 REV 4C / 8D / 10A ) |
239 | PixelView Studio PAK - (Model: M2200 REV 4C / 8D / 10A ) | 239 | PixelView Studio PAK - (Model: M2200 REV 4C / 8D / 10A ) |
240 | PixelView PowerStudio PAK - (Model: PV-M3600 REV 4E) | 240 | PixelView PowerStudio PAK - (Model: PV-M3600 REV 4E) |
241 | PixelView DigitalVCR PAK - (Model: PV-M2400 REV 4C / 8D / 10A ) | 241 | PixelView DigitalVCR PAK - (Model: PV-M2400 REV 4C / 8D / 10A ) |
242 | 242 | ||
243 | PixelView PlayTV PAK II (TV/FM card + usb camera) PV-M3800 | 243 | PixelView PlayTV PAK II (TV/FM card + usb camera) PV-M3800 |
244 | PixelView PlayTV XP PV-M4700,PV-M4700(w/FM) | 244 | PixelView PlayTV XP PV-M4700,PV-M4700(w/FM) |
245 | PixelView PlayTV DVR PV-M4600 package contents:PixelView PlayTV pro, windvr & videoMail s/w | 245 | PixelView PlayTV DVR PV-M4600 package contents:PixelView PlayTV pro, windvr & videoMail s/w |
246 | 246 | ||
@@ -254,7 +254,7 @@ Prolink | |||
254 | 254 | ||
255 | DTV3000 PV-DTV3000P+ DVB-S CI = Twinhan VP-1030 | 255 | DTV3000 PV-DTV3000P+ DVB-S CI = Twinhan VP-1030 |
256 | DTV2000 DVB-S = Twinhan VP-1020 | 256 | DTV2000 DVB-S = Twinhan VP-1020 |
257 | 257 | ||
258 | Video Conferencing: | 258 | Video Conferencing: |
259 | PixelView Meeting PAK - (Model: PV-BT878P) | 259 | PixelView Meeting PAK - (Model: PV-BT878P) |
260 | PixelView Meeting PAK Lite - (Model: PV-BT878P) | 260 | PixelView Meeting PAK Lite - (Model: PV-BT878P) |
@@ -308,7 +308,7 @@ KNC One | |||
308 | 308 | ||
309 | newer Cards have saa7134, but model name stayed the same? | 309 | newer Cards have saa7134, but model name stayed the same? |
310 | 310 | ||
311 | Provideo | 311 | Provideo |
312 | -------- | 312 | -------- |
313 | PV951 or PV-951 (also are sold as: | 313 | PV951 or PV-951 (also are sold as: |
314 | Boeder TV-FM Video Capture Card | 314 | Boeder TV-FM Video Capture Card |
@@ -353,7 +353,7 @@ AVerMedia | |||
353 | AVerTV | 353 | AVerTV |
354 | AVerTV Stereo | 354 | AVerTV Stereo |
355 | AVerTV Studio (w/FM) | 355 | AVerTV Studio (w/FM) |
356 | AVerMedia TV98 with Remote | 356 | AVerMedia TV98 with Remote |
357 | AVerMedia TV/FM98 Stereo | 357 | AVerMedia TV/FM98 Stereo |
358 | AVerMedia TVCAM98 | 358 | AVerMedia TVCAM98 |
359 | TVCapture (Bt848) | 359 | TVCapture (Bt848) |
@@ -373,7 +373,7 @@ AVerMedia | |||
373 | (1) Daughterboard MB68-A with TDA9820T and TDA9840T | 373 | (1) Daughterboard MB68-A with TDA9820T and TDA9840T |
374 | (2) Sony NE41S soldered (stereo sound?) | 374 | (2) Sony NE41S soldered (stereo sound?) |
375 | (3) Daughterboard M118-A w/ pic 16c54 and 4 MHz quartz | 375 | (3) Daughterboard M118-A w/ pic 16c54 and 4 MHz quartz |
376 | 376 | ||
377 | US site has different drivers for (as of 09/2002): | 377 | US site has different drivers for (as of 09/2002): |
378 | EZ Capture/InterCam PCI (BT-848 chip) | 378 | EZ Capture/InterCam PCI (BT-848 chip) |
379 | EZ Capture/InterCam PCI (BT-878 chip) | 379 | EZ Capture/InterCam PCI (BT-878 chip) |
@@ -437,7 +437,7 @@ Terratec | |||
437 | Terra TValueRadio, "LR102 Rev.C" printed on the PCB | 437 | Terra TValueRadio, "LR102 Rev.C" printed on the PCB |
438 | Terra TV/Radio+ Version 1.0, "80-CP2830100-0" TTTV3 printed on the PCB, | 438 | Terra TV/Radio+ Version 1.0, "80-CP2830100-0" TTTV3 printed on the PCB, |
439 | "CPH010-E83" on the back, SAA6588T, TDA9873H | 439 | "CPH010-E83" on the back, SAA6588T, TDA9873H |
440 | Terra TValue Version BT878, "80-CP2830110-0 TTTV4" printed on the PCB, | 440 | Terra TValue Version BT878, "80-CP2830110-0 TTTV4" printed on the PCB, |
441 | "CPH011-D83" on back | 441 | "CPH011-D83" on back |
442 | Terra TValue Version 1.0 "ceb105.PCB" (really identical to Terra TV+ Version 1.0) | 442 | Terra TValue Version 1.0 "ceb105.PCB" (really identical to Terra TV+ Version 1.0) |
443 | Terra TValue New Revision "LR102 Rec.C" | 443 | Terra TValue New Revision "LR102 Rec.C" |
@@ -528,7 +528,7 @@ Koutech | |||
528 | KW-606RSF | 528 | KW-606RSF |
529 | KW-607A (capture only) | 529 | KW-607A (capture only) |
530 | KW-608 (Zoran capture only) | 530 | KW-608 (Zoran capture only) |
531 | 531 | ||
532 | IODATA (jp) | 532 | IODATA (jp) |
533 | ------ | 533 | ------ |
534 | GV-BCTV/PCI | 534 | GV-BCTV/PCI |
@@ -542,15 +542,15 @@ Canopus (jp) | |||
542 | ------- | 542 | ------- |
543 | WinDVR = Kworld "KW-TVL878RF" | 543 | WinDVR = Kworld "KW-TVL878RF" |
544 | 544 | ||
545 | www.sigmacom.co.kr | 545 | www.sigmacom.co.kr |
546 | ------------------ | 546 | ------------------ |
547 | Sigma Cyber TV II | 547 | Sigma Cyber TV II |
548 | 548 | ||
549 | www.sasem.co.kr | 549 | www.sasem.co.kr |
550 | --------------- | 550 | --------------- |
551 | Litte OnAir TV | 551 | Litte OnAir TV |
552 | 552 | ||
553 | hama | 553 | hama |
554 | ---- | 554 | ---- |
555 | TV/Radio-Tuner Card, PCI (Model 44677) = CPH051 | 555 | TV/Radio-Tuner Card, PCI (Model 44677) = CPH051 |
556 | 556 | ||
@@ -638,7 +638,7 @@ Media-Surfer (esc-kathrein.de) | |||
638 | 638 | ||
639 | Jetway (www.jetway.com.tw) | 639 | Jetway (www.jetway.com.tw) |
640 | -------------------------- | 640 | -------------------------- |
641 | JW-TV 878M | 641 | JW-TV 878M |
642 | JW-TV 878 = KWorld KW-TV878RF | 642 | JW-TV 878 = KWorld KW-TV878RF |
643 | 643 | ||
644 | Galaxis | 644 | Galaxis |
@@ -715,7 +715,7 @@ Hauppauge | |||
715 | 809 MyVideo | 715 | 809 MyVideo |
716 | 872 MyTV2Go FM | 716 | 872 MyTV2Go FM |
717 | 717 | ||
718 | 718 | ||
719 | 546 WinTV Nova-S CI | 719 | 546 WinTV Nova-S CI |
720 | 543 WinTV Nova | 720 | 543 WinTV Nova |
721 | 907 Nova-S USB | 721 | 907 Nova-S USB |
@@ -739,7 +739,7 @@ Hauppauge | |||
739 | 832 MyTV2Go | 739 | 832 MyTV2Go |
740 | 869 MyTV2Go-FM | 740 | 869 MyTV2Go-FM |
741 | 805 MyVideo (USB) | 741 | 805 MyVideo (USB) |
742 | 742 | ||
743 | 743 | ||
744 | Matrix-Vision | 744 | Matrix-Vision |
745 | ------------- | 745 | ------------- |
@@ -764,7 +764,7 @@ Gallant (www.gallantcom.com) www.minton.com.tw | |||
764 | Intervision IV-550 (bt8x8) | 764 | Intervision IV-550 (bt8x8) |
765 | Intervision IV-100 (zoran) | 765 | Intervision IV-100 (zoran) |
766 | Intervision IV-1000 (bt8x8) | 766 | Intervision IV-1000 (bt8x8) |
767 | 767 | ||
768 | Asonic (www.asonic.com.cn) (website down) | 768 | Asonic (www.asonic.com.cn) (website down) |
769 | ----------------------------------------- | 769 | ----------------------------------------- |
770 | SkyEye tv 878 | 770 | SkyEye tv 878 |
@@ -804,11 +804,11 @@ Kworld (www.kworld.com.tw) | |||
804 | 804 | ||
805 | JTT/ Justy Corp.http://www.justy.co.jp/ (www.jtt.com.jp website down) | 805 | JTT/ Justy Corp.http://www.justy.co.jp/ (www.jtt.com.jp website down) |
806 | --------------------------------------------------------------------- | 806 | --------------------------------------------------------------------- |
807 | JTT-02 (JTT TV) "TV watchmate pro" (bt848) | 807 | JTT-02 (JTT TV) "TV watchmate pro" (bt848) |
808 | 808 | ||
809 | ADS www.adstech.com | 809 | ADS www.adstech.com |
810 | ------------------- | 810 | ------------------- |
811 | Channel Surfer TV ( CHX-950 ) | 811 | Channel Surfer TV ( CHX-950 ) |
812 | Channel Surfer TV+FM ( CHX-960FM ) | 812 | Channel Surfer TV+FM ( CHX-960FM ) |
813 | 813 | ||
814 | AVEC www.prochips.com | 814 | AVEC www.prochips.com |
@@ -874,7 +874,7 @@ www.ids-imaging.de | |||
874 | ------------------ | 874 | ------------------ |
875 | Falcon Series (capture only) | 875 | Falcon Series (capture only) |
876 | In USA: http://www.theimagingsource.com/ | 876 | In USA: http://www.theimagingsource.com/ |
877 | DFG/LC1 | 877 | DFG/LC1 |
878 | 878 | ||
879 | www.sknet-web.co.jp | 879 | www.sknet-web.co.jp |
880 | ------------------- | 880 | ------------------- |
@@ -890,7 +890,7 @@ Cybertainment | |||
890 | CyberMail Xtreme | 890 | CyberMail Xtreme |
891 | These are Flyvideo | 891 | These are Flyvideo |
892 | 892 | ||
893 | VCR (http://www.vcrinc.com/) | 893 | VCR (http://www.vcrinc.com/) |
894 | --- | 894 | --- |
895 | Video Catcher 16 | 895 | Video Catcher 16 |
896 | 896 | ||
@@ -920,7 +920,7 @@ Sdisilk www.sdisilk.com/ | |||
920 | SDI Silk 200 SDI Input Card | 920 | SDI Silk 200 SDI Input Card |
921 | 921 | ||
922 | www.euresys.com | 922 | www.euresys.com |
923 | PICOLO series | 923 | PICOLO series |
924 | 924 | ||
925 | PMC/Pace | 925 | PMC/Pace |
926 | www.pacecom.co.uk website closed | 926 | www.pacecom.co.uk website closed |
diff --git a/Documentation/video4linux/bttv/Insmod-options b/Documentation/video4linux/bttv/Insmod-options index 7bb5a50b0779..fc94ff235ffa 100644 --- a/Documentation/video4linux/bttv/Insmod-options +++ b/Documentation/video4linux/bttv/Insmod-options | |||
@@ -44,6 +44,9 @@ bttv.o | |||
44 | push used by bttv. bttv will disable overlay | 44 | push used by bttv. bttv will disable overlay |
45 | by default on this hardware to avoid crashes. | 45 | by default on this hardware to avoid crashes. |
46 | With this insmod option you can override this. | 46 | With this insmod option you can override this. |
47 | no_overlay=1 Disable overlay. It should be used by broken | ||
48 | hardware that doesn't support PCI2PCI direct | ||
49 | transfers. | ||
47 | automute=0/1 Automatically mutes the sound if there is | 50 | automute=0/1 Automatically mutes the sound if there is |
48 | no TV signal, on by default. You might try | 51 | no TV signal, on by default. You might try |
49 | to disable this if you have bad input signal | 52 | to disable this if you have bad input signal |
diff --git a/Documentation/video4linux/hauppauge-wintv-cx88-ir.txt b/Documentation/video4linux/hauppauge-wintv-cx88-ir.txt new file mode 100644 index 000000000000..93fec32a1188 --- /dev/null +++ b/Documentation/video4linux/hauppauge-wintv-cx88-ir.txt | |||
@@ -0,0 +1,54 @@ | |||
1 | The controls for the mux are GPIO [0,1] for source, and GPIO 2 for muting. | ||
2 | |||
3 | GPIO0 GPIO1 | ||
4 | 0 0 TV Audio | ||
5 | 1 0 FM radio | ||
6 | 0 1 Line-In | ||
7 | 1 1 Mono tuner bypass or CD passthru (tuner specific) | ||
8 | |||
9 | GPIO 16(i believe) is tied to the IR port (if present). | ||
10 | |||
11 | ------------------------------------------------------------------------------------ | ||
12 | |||
13 | >From the data sheet: | ||
14 | Register 24'h20004 PCI Interrupt Status | ||
15 | bit [18] IR_SMP_INT Set when 32 input samples have been collected over | ||
16 | gpio[16] pin into GP_SAMPLE register. | ||
17 | |||
18 | What's missing from the data sheet: | ||
19 | |||
20 | Setup 4KHz sampling rate (roughly 2x oversampled; good enough for our RC5 | ||
21 | compat remote) | ||
22 | set register 0x35C050 to 0xa80a80 | ||
23 | |||
24 | enable sampling | ||
25 | set register 0x35C054 to 0x5 | ||
26 | |||
27 | Of course, enable the IRQ bit 18 in the interrupt mask register .(and | ||
28 | provide for a handler) | ||
29 | |||
30 | GP_SAMPLE register is at 0x35C058 | ||
31 | |||
32 | Bits are then right shifted into the GP_SAMPLE register at the specified | ||
33 | rate; you get an interrupt when a full DWORD is recieved. | ||
34 | You need to recover the actual RC5 bits out of the (oversampled) IR sensor | ||
35 | bits. (Hint: look for the 0/1and 1/0 crossings of the RC5 bi-phase data) An | ||
36 | actual raw RC5 code will span 2-3 DWORDS, depending on the actual alignment. | ||
37 | |||
38 | I'm pretty sure when no IR signal is present the receiver is always in a | ||
39 | marking state(1); but stray light, etc can cause intermittent noise values | ||
40 | as well. Remember, this is a free running sample of the IR receiver state | ||
41 | over time, so don't assume any sample starts at any particular place. | ||
42 | |||
43 | http://www.atmel.com/dyn/resources/prod_documents/doc2817.pdf | ||
44 | This data sheet (google search) seems to have a lovely description of the | ||
45 | RC5 basics | ||
46 | |||
47 | http://users.pandora.be/nenya/electronics/rc5/ and more data | ||
48 | |||
49 | http://www.ee.washington.edu/circuit_archive/text/ir_decode.txt | ||
50 | and even a reference to how to decode a bi-phase data stream. | ||
51 | |||
52 | http://www.xs4all.nl/~sbp/knowledge/ir/rc5.htm | ||
53 | still more info | ||
54 | |||
diff --git a/Documentation/video4linux/lifeview.txt b/Documentation/video4linux/lifeview.txt new file mode 100644 index 000000000000..b07ea79c2b7e --- /dev/null +++ b/Documentation/video4linux/lifeview.txt | |||
@@ -0,0 +1,42 @@ | |||
1 | collecting data about the lifeview models and the config coding on | ||
2 | gpio pins 0-9 ... | ||
3 | ================================================================== | ||
4 | |||
5 | bt878: | ||
6 | LR50 rev. Q ("PARTS: 7031505116), Tuner wurde als Nr. 5 erkannt, Eingänge | ||
7 | SVideo, TV, Composite, Audio, Remote. CP9..1=100001001 (1: 0-Ohm-Widerstand | ||
8 | gegen GND unbestückt; 0: bestückt) | ||
9 | |||
10 | ------------------------------------------------------------------------------ | ||
11 | |||
12 | saa7134: | ||
13 | /* LifeView FlyTV Platinum FM (LR214WF) */ | ||
14 | /* "Peter Missel <peter.missel@onlinehome.de> */ | ||
15 | .name = "LifeView FlyTV Platinum FM", | ||
16 | /* GP27 MDT2005 PB4 pin 10 */ | ||
17 | /* GP26 MDT2005 PB3 pin 9 */ | ||
18 | /* GP25 MDT2005 PB2 pin 8 */ | ||
19 | /* GP23 MDT2005 PB1 pin 7 */ | ||
20 | /* GP22 MDT2005 PB0 pin 6 */ | ||
21 | /* GP21 MDT2005 PB5 pin 11 */ | ||
22 | /* GP20 MDT2005 PB6 pin 12 */ | ||
23 | /* GP19 MDT2005 PB7 pin 13 */ | ||
24 | /* nc MDT2005 PA3 pin 2 */ | ||
25 | /* Remote MDT2005 PA2 pin 1 */ | ||
26 | /* GP18 MDT2005 PA1 pin 18 */ | ||
27 | /* nc MDT2005 PA0 pin 17 strap low */ | ||
28 | |||
29 | /* GP17 Strap "GP7"=High */ | ||
30 | /* GP16 Strap "GP6"=High | ||
31 | 0=Radio 1=TV | ||
32 | Drives SA630D ENCH1 and HEF4052 A1 pins | ||
33 | to do FM radio through SIF input */ | ||
34 | /* GP15 nc */ | ||
35 | /* GP14 nc */ | ||
36 | /* GP13 nc */ | ||
37 | /* GP12 Strap "GP5" = High */ | ||
38 | /* GP11 Strap "GP4" = High */ | ||
39 | /* GP10 Strap "GP3" = High */ | ||
40 | /* GP09 Strap "GP2" = Low */ | ||
41 | /* GP08 Strap "GP1" = Low */ | ||
42 | /* GP07.00 nc */ | ||
diff --git a/Documentation/video4linux/not-in-cx2388x-datasheet.txt b/Documentation/video4linux/not-in-cx2388x-datasheet.txt new file mode 100644 index 000000000000..edbfe744d21d --- /dev/null +++ b/Documentation/video4linux/not-in-cx2388x-datasheet.txt | |||
@@ -0,0 +1,41 @@ | |||
1 | ================================================================================= | ||
2 | MO_OUTPUT_FORMAT (0x310164) | ||
3 | |||
4 | Previous default from DScaler: 0x1c1f0008 | ||
5 | Digit 8: 31-28 | ||
6 | 28: PREVREMOD = 1 | ||
7 | |||
8 | Digit 7: 27-24 (0xc = 12 = b1100 ) | ||
9 | 27: COMBALT = 1 | ||
10 | 26: PAL_INV_PHASE | ||
11 | (DScaler apparently set this to 1, resulted in sucky picture) | ||
12 | |||
13 | Digits 6,5: 23-16 | ||
14 | 25-16: COMB_RANGE = 0x1f [default] (9 bits -> max 512) | ||
15 | |||
16 | Digit 4: 15-12 | ||
17 | 15: DISIFX = 0 | ||
18 | 14: INVCBF = 0 | ||
19 | 13: DISADAPT = 0 | ||
20 | 12: NARROWADAPT = 0 | ||
21 | |||
22 | Digit 3: 11-8 | ||
23 | 11: FORCE2H | ||
24 | 10: FORCEREMD | ||
25 | 9: NCHROMAEN | ||
26 | 8: NREMODEN | ||
27 | |||
28 | Digit 2: 7-4 | ||
29 | 7-6: YCORE | ||
30 | 5-4: CCORE | ||
31 | |||
32 | Digit 1: 3-0 | ||
33 | 3: RANGE = 1 | ||
34 | 2: HACTEXT | ||
35 | 1: HSFMT | ||
36 | |||
37 | 0x47 is the sync byte for MPEG-2 transport stream packets. | ||
38 | Datasheet incorrectly states to use 47 decimal. 188 is the length. | ||
39 | All DVB compliant frontends output packets with this start code. | ||
40 | |||
41 | ================================================================================= | ||
diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt index b9e6be00cadf..678e8f192db2 100644 --- a/Documentation/x86_64/boot-options.txt +++ b/Documentation/x86_64/boot-options.txt | |||
@@ -6,6 +6,11 @@ only the AMD64 specific ones are listed here. | |||
6 | Machine check | 6 | Machine check |
7 | 7 | ||
8 | mce=off disable machine check | 8 | mce=off disable machine check |
9 | mce=bootlog Enable logging of machine checks left over from booting. | ||
10 | Disabled by default because some BIOS leave bogus ones. | ||
11 | If your BIOS doesn't do that it's a good idea to enable though | ||
12 | to make sure you log even machine check events that result | ||
13 | in a reboot. | ||
9 | 14 | ||
10 | nomce (for compatibility with i386): same as mce=off | 15 | nomce (for compatibility with i386): same as mce=off |
11 | 16 | ||
@@ -47,7 +52,7 @@ Timing | |||
47 | notsc | 52 | notsc |
48 | Don't use the CPU time stamp counter to read the wall time. | 53 | Don't use the CPU time stamp counter to read the wall time. |
49 | This can be used to work around timing problems on multiprocessor systems | 54 | This can be used to work around timing problems on multiprocessor systems |
50 | with not properly synchronized CPUs. Only useful with a SMP kernel | 55 | with not properly synchronized CPUs. |
51 | 56 | ||
52 | report_lost_ticks | 57 | report_lost_ticks |
53 | Report when timer interrupts are lost because some code turned off | 58 | Report when timer interrupts are lost because some code turned off |
@@ -74,6 +79,9 @@ Idle loop | |||
74 | event. This will make the CPUs eat a lot more power, but may be useful | 79 | event. This will make the CPUs eat a lot more power, but may be useful |
75 | to get slightly better performance in multiprocessor benchmarks. It also | 80 | to get slightly better performance in multiprocessor benchmarks. It also |
76 | makes some profiling using performance counters more accurate. | 81 | makes some profiling using performance counters more accurate. |
82 | Please note that on systems with MONITOR/MWAIT support (like Intel EM64T | ||
83 | CPUs) this option has no performance advantage over the normal idle loop. | ||
84 | It may also interact badly with hyperthreading. | ||
77 | 85 | ||
78 | Rebooting | 86 | Rebooting |
79 | 87 | ||
@@ -178,6 +186,5 @@ Debugging | |||
178 | Misc | 186 | Misc |
179 | 187 | ||
180 | noreplacement Don't replace instructions with more appropiate ones | 188 | noreplacement Don't replace instructions with more appropiate ones |
181 | for the CPU. This may be useful on asymmetric MP systems | 189 | for the CPU. This may be useful on asymmetric MP systems |
182 | where some CPU have less capabilities than the others. | 190 | where some CPU have less capabilities than the others. |
183 | |||