aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/00-INDEX2
-rw-r--r--Documentation/ABI/removed/o2cb2
-rw-r--r--Documentation/ABI/removed/raw13942
-rw-r--r--Documentation/ABI/testing/debugfs-ideapad19
-rw-r--r--Documentation/ABI/testing/evm23
-rw-r--r--Documentation/ABI/testing/sysfs-block13
-rw-r--r--Documentation/ABI/testing/sysfs-bus-bcma8
-rw-r--r--Documentation/ABI/testing/sysfs-bus-pci-drivers-ehci_hcd46
-rw-r--r--Documentation/ABI/testing/sysfs-bus-usb15
-rw-r--r--Documentation/ABI/testing/sysfs-class-backlight-driver-adp887016
-rw-r--r--Documentation/ABI/testing/sysfs-class-devfreq52
-rw-r--r--Documentation/ABI/testing/sysfs-class-net-mesh8
-rw-r--r--Documentation/ABI/testing/sysfs-class-scsi_host13
-rw-r--r--Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff7
-rw-r--r--Documentation/ABI/testing/sysfs-driver-wacom72
-rw-r--r--Documentation/ABI/testing/sysfs-platform-ideapad-laptop15
-rw-r--r--Documentation/ABI/testing/sysfs-wacom10
-rw-r--r--Documentation/DMA-API.txt7
-rw-r--r--Documentation/DocBook/80211.tmpl11
-rw-r--r--Documentation/DocBook/media/dvb/dvbproperty.xml24
-rw-r--r--Documentation/DocBook/media/dvb/intro.xml2
-rw-r--r--Documentation/DocBook/media/v4l/compat.xml8
-rw-r--r--Documentation/DocBook/media/v4l/controls.xml38
-rw-r--r--Documentation/DocBook/media/v4l/dev-subdev.xml2
-rw-r--r--Documentation/DocBook/media/v4l/v4l2.xml9
-rw-r--r--Documentation/DocBook/media/v4l/vidioc-dqevent.xml129
-rw-r--r--Documentation/DocBook/media/v4l/vidioc-queryctrl.xml9
-rw-r--r--Documentation/DocBook/media/v4l/vidioc-subscribe-event.xml123
-rw-r--r--Documentation/DocBook/uio-howto.tmpl2
-rw-r--r--Documentation/PCI/MSI-HOWTO.txt89
-rw-r--r--Documentation/PCI/pci.txt2
-rw-r--r--Documentation/RCU/NMI-RCU.txt2
-rw-r--r--Documentation/RCU/lockdep-splat.txt110
-rw-r--r--Documentation/RCU/lockdep.txt34
-rw-r--r--Documentation/RCU/torture.txt137
-rw-r--r--Documentation/RCU/trace.txt38
-rw-r--r--Documentation/SubmittingDrivers2
-rw-r--r--Documentation/SubmittingPatches2
-rw-r--r--Documentation/blackfin/bfin-gpio-notes.txt2
-rw-r--r--Documentation/block/biodoc.txt2
-rw-r--r--Documentation/block/cfq-iosched.txt71
-rw-r--r--Documentation/bus-virt-phys-mapping.txt2
-rw-r--r--Documentation/cdrom/packet-writing.txt2
-rw-r--r--Documentation/cgroups/memory.txt86
-rw-r--r--Documentation/cpu-freq/governors.txt2
-rw-r--r--Documentation/development-process/4.Coding2
-rw-r--r--Documentation/device-mapper/dm-log.txt2
-rw-r--r--Documentation/device-mapper/persistent-data.txt84
-rw-r--r--Documentation/device-mapper/thin-provisioning.txt285
-rw-r--r--Documentation/devicetree/bindings/arm/calxeda.txt8
-rw-r--r--Documentation/devicetree/bindings/arm/fsl.txt26
-rw-r--r--Documentation/devicetree/bindings/arm/gic.txt55
-rw-r--r--Documentation/devicetree/bindings/arm/l2cc.txt44
-rw-r--r--Documentation/devicetree/bindings/arm/omap/dsp.txt14
-rw-r--r--Documentation/devicetree/bindings/arm/omap/iva.txt19
-rw-r--r--Documentation/devicetree/bindings/arm/omap/l3-noc.txt19
-rw-r--r--Documentation/devicetree/bindings/arm/omap/mpu.txt27
-rw-r--r--Documentation/devicetree/bindings/arm/omap/omap.txt43
-rw-r--r--Documentation/devicetree/bindings/arm/picoxcell.txt24
-rw-r--r--Documentation/devicetree/bindings/arm/primecell.txt4
-rw-r--r--Documentation/devicetree/bindings/ata/calxeda-sata.txt17
-rw-r--r--Documentation/devicetree/bindings/crypto/picochip-spacc.txt23
-rw-r--r--Documentation/devicetree/bindings/gpio/led.txt2
-rw-r--r--Documentation/devicetree/bindings/gpio/pl061-gpio.txt10
-rw-r--r--Documentation/devicetree/bindings/i2c/fsl-imx-i2c.txt25
-rw-r--r--Documentation/devicetree/bindings/i2c/samsung-i2c.txt39
-rw-r--r--Documentation/devicetree/bindings/mmc/nvidia-sdhci.txt27
-rw-r--r--Documentation/devicetree/bindings/net/can/fsl-flexcan.txt63
-rw-r--r--Documentation/devicetree/bindings/net/smsc911x.txt38
-rw-r--r--Documentation/devicetree/bindings/pinmux/pinmux_nvidia.txt5
-rw-r--r--Documentation/devicetree/bindings/serial/rs485.txt31
-rw-r--r--Documentation/devicetree/bindings/spi/spi_pl022.txt12
-rw-r--r--Documentation/devicetree/bindings/tty/serial/atmel-usart.txt27
-rw-r--r--Documentation/devicetree/bindings/tty/serial/msm_serial.txt27
-rw-r--r--Documentation/devicetree/bindings/tty/serial/snps-dw-apb-uart.txt25
-rw-r--r--Documentation/devicetree/bindings/vendor-prefixes.txt40
-rw-r--r--Documentation/devicetree/bindings/virtio/mmio.txt17
-rw-r--r--Documentation/driver-model/binding.txt4
-rw-r--r--Documentation/driver-model/device.txt65
-rwxr-xr-xDocumentation/dvb/get_dvb_firmware51
-rw-r--r--Documentation/dvb/it9137.txt9
-rw-r--r--Documentation/email-clients.txt12
-rw-r--r--Documentation/fault-injection/fault-injection.txt8
-rw-r--r--Documentation/fb/udlfb.txt39
-rw-r--r--Documentation/feature-removal-schedule.txt75
-rw-r--r--Documentation/filesystems/9p.txt2
-rw-r--r--Documentation/filesystems/Locking1
-rw-r--r--Documentation/filesystems/befs.txt2
-rw-r--r--Documentation/filesystems/caching/object.txt6
-rw-r--r--Documentation/filesystems/ext3.txt8
-rw-r--r--Documentation/filesystems/ext4.txt41
-rw-r--r--Documentation/filesystems/locks.txt11
-rw-r--r--Documentation/filesystems/nfs/idmapper.txt2
-rw-r--r--Documentation/filesystems/pohmelfs/design_notes.txt5
-rw-r--r--Documentation/filesystems/proc.txt2
-rw-r--r--Documentation/filesystems/sysfs.txt10
-rw-r--r--Documentation/filesystems/vfs.txt3
-rw-r--r--Documentation/frv/booting.txt6
-rw-r--r--Documentation/hwmon/ad731425
-rw-r--r--Documentation/hwmon/adm127540
-rw-r--r--Documentation/hwmon/coretemp14
-rw-r--r--Documentation/hwmon/exynos4_tmu81
-rw-r--r--Documentation/hwmon/lm7561
-rw-r--r--Documentation/hwmon/ltc2978103
-rw-r--r--Documentation/hwmon/max160657
-rw-r--r--Documentation/hwmon/pmbus13
-rw-r--r--Documentation/hwmon/pmbus-core283
-rw-r--r--Documentation/hwmon/zl6100125
-rw-r--r--Documentation/hwspinlock.txt74
-rw-r--r--Documentation/i2c/smbus-protocol8
-rw-r--r--Documentation/input/elantech.txt295
-rw-r--r--Documentation/input/input.txt2
-rw-r--r--Documentation/input/multi-touch-protocol.txt14
-rw-r--r--Documentation/ioctl/ioctl-number.txt2
-rw-r--r--Documentation/kernel-docs.txt15
-rw-r--r--Documentation/kernel-parameters.txt154
-rw-r--r--Documentation/laptops/thinkpad-acpi.txt4
-rw-r--r--Documentation/media-framework.txt4
-rw-r--r--Documentation/memory-barriers.txt2
-rw-r--r--Documentation/networking/00-INDEX116
-rw-r--r--Documentation/networking/LICENSE.qlcnic51
-rw-r--r--Documentation/networking/batman-adv.txt8
-rw-r--r--Documentation/networking/bonding.txt29
-rw-r--r--Documentation/networking/dmfe.txt3
-rw-r--r--Documentation/networking/ip-sysctl.txt23
-rw-r--r--Documentation/networking/ipvs-sysctl.txt62
-rw-r--r--Documentation/networking/mac80211-injection.txt4
-rw-r--r--Documentation/networking/netdevices.txt4
-rw-r--r--Documentation/networking/scaling.txt378
-rw-r--r--Documentation/networking/stmmac.txt44
-rw-r--r--Documentation/pinctrl.txt950
-rw-r--r--Documentation/power/00-INDEX2
-rw-r--r--Documentation/power/basic-pm-debugging.txt26
-rw-r--r--Documentation/power/devices.txt8
-rw-r--r--Documentation/power/pm_qos_interface.txt92
-rw-r--r--Documentation/power/regulator/machine.txt19
-rw-r--r--Documentation/power/runtime_pm.txt24
-rw-r--r--Documentation/power/suspend-and-cpuhotplug.txt275
-rw-r--r--Documentation/power/userland-swsusp.txt3
-rw-r--r--Documentation/ramoops.txt76
-rw-r--r--Documentation/rapidio/rapidio.txt2
-rw-r--r--Documentation/rapidio/tsi721.txt49
-rw-r--r--Documentation/rfkill.txt3
-rw-r--r--Documentation/scheduler/sched-bwc.txt122
-rw-r--r--Documentation/scsi/00-INDEX2
-rw-r--r--Documentation/scsi/ChangeLog.megaraid_sas15
-rw-r--r--Documentation/scsi/LICENSE.qla4xxx310
-rw-r--r--Documentation/scsi/aic7xxx_old.txt2
-rw-r--r--Documentation/scsi/bnx2fc.txt75
-rw-r--r--Documentation/scsi/scsi_mid_low_api.txt5
-rw-r--r--Documentation/security/keys-trusted-encrypted.txt3
-rw-r--r--Documentation/serial/serial-rs485.txt8
-rw-r--r--Documentation/sound/oss/PAS163
-rw-r--r--Documentation/spi/pxa2xx4
-rw-r--r--Documentation/stable_kernel_rules.txt14
-rw-r--r--Documentation/sysctl/kernel.txt8
-rw-r--r--Documentation/timers/highres.txt2
-rw-r--r--Documentation/trace/postprocess/trace-vmscan-postprocess.pl8
-rw-r--r--Documentation/usb/dma.txt6
-rw-r--r--Documentation/usb/dwc3.txt45
-rw-r--r--Documentation/usb/power-management.txt34
-rw-r--r--Documentation/video4linux/CARDLIST.tm600016
-rw-r--r--Documentation/video4linux/gspca.txt4
-rw-r--r--Documentation/video4linux/omap3isp.txt9
-rw-r--r--Documentation/video4linux/v4l2-controls.txt43
-rw-r--r--Documentation/virtual/00-INDEX3
-rw-r--r--Documentation/virtual/kvm/api.txt71
-rw-r--r--Documentation/virtual/lguest/lguest.c5
-rw-r--r--Documentation/virtual/uml/UserModeLinux-HOWTO.txt532
-rw-r--r--Documentation/virtual/virtio-spec.txt2200
-rw-r--r--Documentation/vm/00-INDEX2
-rw-r--r--Documentation/vm/numa4
-rw-r--r--Documentation/vm/slub.txt2
-rw-r--r--Documentation/vm/transhuge.txt7
-rw-r--r--Documentation/x86/entry_64.txt3
-rw-r--r--Documentation/zh_CN/SubmitChecklist109
176 files changed, 8642 insertions, 1327 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 1f89424c36a6..65bbd2622396 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -272,6 +272,8 @@ printk-formats.txt
272 - how to get printk format specifiers right 272 - how to get printk format specifiers right
273prio_tree.txt 273prio_tree.txt
274 - info on radix-priority-search-tree use for indexing vmas. 274 - info on radix-priority-search-tree use for indexing vmas.
275ramoops.txt
276 - documentation of the ramoops oops/panic logging module.
275rbtree.txt 277rbtree.txt
276 - info on what red-black trees are and what they are for. 278 - info on what red-black trees are and what they are for.
277robust-futex-ABI.txt 279robust-futex-ABI.txt
diff --git a/Documentation/ABI/removed/o2cb b/Documentation/ABI/removed/o2cb
index 7f5daa465093..20c91adca6d4 100644
--- a/Documentation/ABI/removed/o2cb
+++ b/Documentation/ABI/removed/o2cb
@@ -1,6 +1,6 @@
1What: /sys/o2cb symlink 1What: /sys/o2cb symlink
2Date: May 2011 2Date: May 2011
3KernelVersion: 2.6.40 3KernelVersion: 3.0
4Contact: ocfs2-devel@oss.oracle.com 4Contact: ocfs2-devel@oss.oracle.com
5Description: This is a symlink: /sys/o2cb to /sys/fs/o2cb. The symlink is 5Description: This is a symlink: /sys/o2cb to /sys/fs/o2cb. The symlink is
6 removed when new versions of ocfs2-tools which know to look 6 removed when new versions of ocfs2-tools which know to look
diff --git a/Documentation/ABI/removed/raw1394 b/Documentation/ABI/removed/raw1394
index 490aa1efc4ae..ec333e676322 100644
--- a/Documentation/ABI/removed/raw1394
+++ b/Documentation/ABI/removed/raw1394
@@ -5,7 +5,7 @@ Description:
5 /dev/raw1394 was a character device file that allowed low-level 5 /dev/raw1394 was a character device file that allowed low-level
6 access to FireWire buses. Its major drawbacks were its inability 6 access to FireWire buses. Its major drawbacks were its inability
7 to implement sensible device security policies, and its low level 7 to implement sensible device security policies, and its low level
8 of abstraction that required userspace clients do duplicate much 8 of abstraction that required userspace clients to duplicate much
9 of the kernel's ieee1394 core functionality. 9 of the kernel's ieee1394 core functionality.
10 Replaced by /dev/fw*, i.e. the <linux/firewire-cdev.h> ABI of 10 Replaced by /dev/fw*, i.e. the <linux/firewire-cdev.h> ABI of
11 firewire-core. 11 firewire-core.
diff --git a/Documentation/ABI/testing/debugfs-ideapad b/Documentation/ABI/testing/debugfs-ideapad
new file mode 100644
index 000000000000..7079c0b21030
--- /dev/null
+++ b/Documentation/ABI/testing/debugfs-ideapad
@@ -0,0 +1,19 @@
1What: /sys/kernel/debug/ideapad/cfg
2Date: Sep 2011
3KernelVersion: 3.2
4Contact: Ike Panhc <ike.pan@canonical.com>
5Description:
6
7cfg shows the return value of _CFG method in VPC2004 device. It tells machine
8capability and what graphic component within the machine.
9
10
11What: /sys/kernel/debug/ideapad/status
12Date: Sep 2011
13KernelVersion: 3.2
14Contact: Ike Panhc <ike.pan@canonical.com>
15Description:
16
17status shows infos we can read and tells its meaning and value.
18
19
diff --git a/Documentation/ABI/testing/evm b/Documentation/ABI/testing/evm
new file mode 100644
index 000000000000..8374d4557e5d
--- /dev/null
+++ b/Documentation/ABI/testing/evm
@@ -0,0 +1,23 @@
1What: security/evm
2Date: March 2011
3Contact: Mimi Zohar <zohar@us.ibm.com>
4Description:
5 EVM protects a file's security extended attributes(xattrs)
6 against integrity attacks. The initial method maintains an
7 HMAC-sha1 value across the extended attributes, storing the
8 value as the extended attribute 'security.evm'.
9
10 EVM depends on the Kernel Key Retention System to provide it
11 with a trusted/encrypted key for the HMAC-sha1 operation.
12 The key is loaded onto the root's keyring using keyctl. Until
13 EVM receives notification that the key has been successfully
14 loaded onto the keyring (echo 1 > <securityfs>/evm), EVM
15 can not create or validate the 'security.evm' xattr, but
16 returns INTEGRITY_UNKNOWN. Loading the key and signaling EVM
17 should be done as early as possible. Normally this is done
18 in the initramfs, which has already been measured as part
19 of the trusted boot. For more information on creating and
20 loading existing trusted/encrypted keys, refer to:
21 Documentation/keys-trusted-encrypted.txt. (A sample dracut
22 patch, which loads the trusted/encrypted key and enables
23 EVM, is available from http://linux-ima.sourceforge.net/#EVM.)
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index c1eb41cb9876..2b5d56127fce 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -206,3 +206,16 @@ Description:
206 when a discarded area is read the discard_zeroes_data 206 when a discarded area is read the discard_zeroes_data
207 parameter will be set to one. Otherwise it will be 0 and 207 parameter will be set to one. Otherwise it will be 0 and
208 the result of reading a discarded area is undefined. 208 the result of reading a discarded area is undefined.
209What: /sys/block/<disk>/alias
210Date: Aug 2011
211Contact: Nao Nishijima <nao.nishijima.xt@hitachi.com>
212Description:
213 A raw device name of a disk does not always point a same disk
214 each boot-up time. Therefore, users have to use persistent
215 device names, which udev creates when the kernel finds a disk,
216 instead of raw device name. However, kernel doesn't show those
217 persistent names on its messages (e.g. dmesg).
218 This file can store an alias of the disk and it would be
219 appeared in kernel messages if it is set. A disk can have an
220 alias which length is up to 255bytes. Users can use alphabets,
221 numbers, "-" and "_" in alias name. This file is writeonce.
diff --git a/Documentation/ABI/testing/sysfs-bus-bcma b/Documentation/ABI/testing/sysfs-bus-bcma
index 06b62badddd1..721b4aea3020 100644
--- a/Documentation/ABI/testing/sysfs-bus-bcma
+++ b/Documentation/ABI/testing/sysfs-bus-bcma
@@ -1,6 +1,6 @@
1What: /sys/bus/bcma/devices/.../manuf 1What: /sys/bus/bcma/devices/.../manuf
2Date: May 2011 2Date: May 2011
3KernelVersion: 2.6.40 3KernelVersion: 3.0
4Contact: Rafa艂 Mi艂ecki <zajec5@gmail.com> 4Contact: Rafa艂 Mi艂ecki <zajec5@gmail.com>
5Description: 5Description:
6 Each BCMA core has it's manufacturer id. See 6 Each BCMA core has it's manufacturer id. See
@@ -8,7 +8,7 @@ Description:
8 8
9What: /sys/bus/bcma/devices/.../id 9What: /sys/bus/bcma/devices/.../id
10Date: May 2011 10Date: May 2011
11KernelVersion: 2.6.40 11KernelVersion: 3.0
12Contact: Rafa艂 Mi艂ecki <zajec5@gmail.com> 12Contact: Rafa艂 Mi艂ecki <zajec5@gmail.com>
13Description: 13Description:
14 There are a few types of BCMA cores, they can be identified by 14 There are a few types of BCMA cores, they can be identified by
@@ -16,7 +16,7 @@ Description:
16 16
17What: /sys/bus/bcma/devices/.../rev 17What: /sys/bus/bcma/devices/.../rev
18Date: May 2011 18Date: May 2011
19KernelVersion: 2.6.40 19KernelVersion: 3.0
20Contact: Rafa艂 Mi艂ecki <zajec5@gmail.com> 20Contact: Rafa艂 Mi艂ecki <zajec5@gmail.com>
21Description: 21Description:
22 BCMA cores of the same type can still slightly differ depending 22 BCMA cores of the same type can still slightly differ depending
@@ -24,7 +24,7 @@ Description:
24 24
25What: /sys/bus/bcma/devices/.../class 25What: /sys/bus/bcma/devices/.../class
26Date: May 2011 26Date: May 2011
27KernelVersion: 2.6.40 27KernelVersion: 3.0
28Contact: Rafa艂 Mi艂ecki <zajec5@gmail.com> 28Contact: Rafa艂 Mi艂ecki <zajec5@gmail.com>
29Description: 29Description:
30 Each BCMA core is identified by few fields, including class it 30 Each BCMA core is identified by few fields, including class it
diff --git a/Documentation/ABI/testing/sysfs-bus-pci-drivers-ehci_hcd b/Documentation/ABI/testing/sysfs-bus-pci-drivers-ehci_hcd
new file mode 100644
index 000000000000..60c60fa624b2
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-pci-drivers-ehci_hcd
@@ -0,0 +1,46 @@
1What: /sys/bus/pci/drivers/ehci_hcd/.../companion
2 /sys/bus/usb/devices/usbN/../companion
3Date: January 2007
4KernelVersion: 2.6.21
5Contact: Alan Stern <stern@rowland.harvard.edu>
6Description:
7 PCI-based EHCI USB controllers (i.e., high-speed USB-2.0
8 controllers) are often implemented along with a set of
9 "companion" full/low-speed USB-1.1 controllers. When a
10 high-speed device is plugged in, the connection is routed
11 to the EHCI controller; when a full- or low-speed device
12 is plugged in, the connection is routed to the companion
13 controller.
14
15 Sometimes you want to force a high-speed device to connect
16 at full speed, which can be accomplished by forcing the
17 connection to be routed to the companion controller.
18 That's what this file does. Writing a port number to the
19 file causes connections on that port to be routed to the
20 companion controller, and writing the negative of a port
21 number returns the port to normal operation.
22
23 For example: To force the high-speed device attached to
24 port 4 on bus 2 to run at full speed:
25
26 echo 4 >/sys/bus/usb/devices/usb2/../companion
27
28 To return the port to high-speed operation:
29
30 echo -4 >/sys/bus/usb/devices/usb2/../companion
31
32 Reading the file gives the list of ports currently forced
33 to the companion controller.
34
35 Note: Some EHCI controllers do not have companions; they
36 may contain an internal "transaction translator" or they
37 may be attached directly to a "rate-matching hub". This
38 mechanism will not work with such controllers. Also, it
39 cannot be used to force a port on a high-speed hub to
40 connect at full speed.
41
42 Note: When this file was first added, it appeared in a
43 different sysfs directory. The location given above is
44 correct for 2.6.35 (and probably several earlier kernel
45 versions as well).
46
diff --git a/Documentation/ABI/testing/sysfs-bus-usb b/Documentation/ABI/testing/sysfs-bus-usb
index 294aa864a60a..e647378e9e88 100644
--- a/Documentation/ABI/testing/sysfs-bus-usb
+++ b/Documentation/ABI/testing/sysfs-bus-usb
@@ -142,3 +142,18 @@ Description:
142 such devices. 142 such devices.
143Users: 143Users:
144 usb_modeswitch 144 usb_modeswitch
145
146What: /sys/bus/usb/devices/.../power/usb2_hardware_lpm
147Date: September 2011
148Contact: Andiry Xu <andiry.xu@amd.com>
149Description:
150 If CONFIG_USB_SUSPEND is set and a USB 2.0 lpm-capable device
151 is plugged in to a xHCI host which support link PM, it will
152 perform a LPM test; if the test is passed and host supports
153 USB2 hardware LPM (xHCI 1.0 feature), USB2 hardware LPM will
154 be enabled for the device and the USB device directory will
155 contain a file named power/usb2_hardware_lpm. The file holds
156 a string value (enable or disable) indicating whether or not
157 USB2 hardware LPM is enabled for the device. Developer can
158 write y/Y/1 or n/N/0 to the file to enable/disable the
159 feature.
diff --git a/Documentation/ABI/testing/sysfs-class-backlight-driver-adp8870 b/Documentation/ABI/testing/sysfs-class-backlight-driver-adp8870
index aa11dbdd794b..4a9c545bda4b 100644
--- a/Documentation/ABI/testing/sysfs-class-backlight-driver-adp8870
+++ b/Documentation/ABI/testing/sysfs-class-backlight-driver-adp8870
@@ -4,8 +4,8 @@ What: /sys/class/backlight/<backlight>/l2_bright_max
4What: /sys/class/backlight/<backlight>/l3_office_max 4What: /sys/class/backlight/<backlight>/l3_office_max
5What: /sys/class/backlight/<backlight>/l4_indoor_max 5What: /sys/class/backlight/<backlight>/l4_indoor_max
6What: /sys/class/backlight/<backlight>/l5_dark_max 6What: /sys/class/backlight/<backlight>/l5_dark_max
7Date: Mai 2011 7Date: May 2011
8KernelVersion: 2.6.40 8KernelVersion: 3.0
9Contact: device-drivers-devel@blackfin.uclinux.org 9Contact: device-drivers-devel@blackfin.uclinux.org
10Description: 10Description:
11 Control the maximum brightness for <ambient light zone> 11 Control the maximum brightness for <ambient light zone>
@@ -18,8 +18,8 @@ What: /sys/class/backlight/<backlight>/l2_bright_dim
18What: /sys/class/backlight/<backlight>/l3_office_dim 18What: /sys/class/backlight/<backlight>/l3_office_dim
19What: /sys/class/backlight/<backlight>/l4_indoor_dim 19What: /sys/class/backlight/<backlight>/l4_indoor_dim
20What: /sys/class/backlight/<backlight>/l5_dark_dim 20What: /sys/class/backlight/<backlight>/l5_dark_dim
21Date: Mai 2011 21Date: May 2011
22KernelVersion: 2.6.40 22KernelVersion: 3.0
23Contact: device-drivers-devel@blackfin.uclinux.org 23Contact: device-drivers-devel@blackfin.uclinux.org
24Description: 24Description:
25 Control the dim brightness for <ambient light zone> 25 Control the dim brightness for <ambient light zone>
@@ -29,8 +29,8 @@ Description:
29 this <ambient light zone>. 29 this <ambient light zone>.
30 30
31What: /sys/class/backlight/<backlight>/ambient_light_level 31What: /sys/class/backlight/<backlight>/ambient_light_level
32Date: Mai 2011 32Date: May 2011
33KernelVersion: 2.6.40 33KernelVersion: 3.0
34Contact: device-drivers-devel@blackfin.uclinux.org 34Contact: device-drivers-devel@blackfin.uclinux.org
35Description: 35Description:
36 Get conversion value of the light sensor. 36 Get conversion value of the light sensor.
@@ -39,8 +39,8 @@ Description:
39 8000 (max ambient brightness) 39 8000 (max ambient brightness)
40 40
41What: /sys/class/backlight/<backlight>/ambient_light_zone 41What: /sys/class/backlight/<backlight>/ambient_light_zone
42Date: Mai 2011 42Date: May 2011
43KernelVersion: 2.6.40 43KernelVersion: 3.0
44Contact: device-drivers-devel@blackfin.uclinux.org 44Contact: device-drivers-devel@blackfin.uclinux.org
45Description: 45Description:
46 Get/Set current ambient light zone. Reading returns 46 Get/Set current ambient light zone. Reading returns
diff --git a/Documentation/ABI/testing/sysfs-class-devfreq b/Documentation/ABI/testing/sysfs-class-devfreq
new file mode 100644
index 000000000000..23d78b5aab11
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-devfreq
@@ -0,0 +1,52 @@
1What: /sys/class/devfreq/.../
2Date: September 2011
3Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
4Description:
5 Provide a place in sysfs for the devfreq objects.
6 This allows accessing various devfreq specific variables.
7 The name of devfreq object denoted as ... is same as the
8 name of device using devfreq.
9
10What: /sys/class/devfreq/.../governor
11Date: September 2011
12Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
13Description:
14 The /sys/class/devfreq/.../governor shows the name of the
15 governor used by the corresponding devfreq object.
16
17What: /sys/class/devfreq/.../cur_freq
18Date: September 2011
19Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
20Description:
21 The /sys/class/devfreq/.../cur_freq shows the current
22 frequency of the corresponding devfreq object.
23
24What: /sys/class/devfreq/.../central_polling
25Date: September 2011
26Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
27Description:
28 The /sys/class/devfreq/.../central_polling shows whether
29 the devfreq ojbect is using devfreq-provided central
30 polling mechanism or not.
31
32What: /sys/class/devfreq/.../polling_interval
33Date: September 2011
34Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
35Description:
36 The /sys/class/devfreq/.../polling_interval shows and sets
37 the requested polling interval of the corresponding devfreq
38 object. The values are represented in ms. If the value is
39 less than 1 jiffy, it is considered to be 0, which means
40 no polling. This value is meaningless if the governor is
41 not polling; thus. If the governor is not using
42 devfreq-provided central polling
43 (/sys/class/devfreq/.../central_polling is 0), this value
44 may be useless.
45
46What: /sys/class/devfreq/.../userspace/set_freq
47Date: September 2011
48Contact: MyungJoo Ham <myungjoo.ham@samsung.com>
49Description:
50 The /sys/class/devfreq/.../userspace/set_freq shows and
51 sets the requested frequency for the devfreq object if
52 userspace governor is in effect.
diff --git a/Documentation/ABI/testing/sysfs-class-net-mesh b/Documentation/ABI/testing/sysfs-class-net-mesh
index 748fe1701d25..b02001488eef 100644
--- a/Documentation/ABI/testing/sysfs-class-net-mesh
+++ b/Documentation/ABI/testing/sysfs-class-net-mesh
@@ -22,6 +22,14 @@ Description:
22 mesh will be fragmented or silently discarded if the 22 mesh will be fragmented or silently discarded if the
23 packet size exceeds the outgoing interface MTU. 23 packet size exceeds the outgoing interface MTU.
24 24
25What: /sys/class/net/<mesh_iface>/mesh/ap_isolation
26Date: May 2011
27Contact: Antonio Quartulli <ordex@autistici.org>
28Description:
29 Indicates whether the data traffic going from a
30 wireless client to another wireless client will be
31 silently dropped.
32
25What: /sys/class/net/<mesh_iface>/mesh/gw_bandwidth 33What: /sys/class/net/<mesh_iface>/mesh/gw_bandwidth
26Date: October 2010 34Date: October 2010
27Contact: Marek Lindner <lindner_marek@yahoo.de> 35Contact: Marek Lindner <lindner_marek@yahoo.de>
diff --git a/Documentation/ABI/testing/sysfs-class-scsi_host b/Documentation/ABI/testing/sysfs-class-scsi_host
new file mode 100644
index 000000000000..29a4f892e433
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-scsi_host
@@ -0,0 +1,13 @@
1What: /sys/class/scsi_host/hostX/isci_id
2Date: June 2011
3Contact: Dave Jiang <dave.jiang@intel.com>
4Description:
5 This file contains the enumerated host ID for the Intel
6 SCU controller. The Intel(R) C600 Series Chipset SATA/SAS
7 Storage Control Unit embeds up to two 4-port controllers in
8 a single PCI device. The controllers are enumerated in order
9 which usually means the lowest number scsi_host corresponds
10 with the first controller, but this association is not
11 guaranteed. The 'isci_id' attribute unambiguously identifies
12 the controller index: '0' for the first controller,
13 '1' for the second.
diff --git a/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff b/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff
new file mode 100644
index 000000000000..9aec8ef228b0
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-hid-logitech-lg4ff
@@ -0,0 +1,7 @@
1What: /sys/module/hid_logitech/drivers/hid:logitech/<dev>/range.
2Date: July 2011
3KernelVersion: 3.2
4Contact: Michal Mal <madcatxster@gmail.com>
5Description: Display minimum, maximum and current range of the steering
6 wheel. Writing a value within min and max boundaries sets the
7 range of the wheel.
diff --git a/Documentation/ABI/testing/sysfs-driver-wacom b/Documentation/ABI/testing/sysfs-driver-wacom
new file mode 100644
index 000000000000..82d4df136444
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-wacom
@@ -0,0 +1,72 @@
1What: /sys/class/hidraw/hidraw*/device/speed
2Date: April 2010
3Kernel Version: 2.6.35
4Contact: linux-bluetooth@vger.kernel.org
5Description:
6 The /sys/class/hidraw/hidraw*/device/speed file controls
7 reporting speed of Wacom bluetooth tablet. Reading from
8 this file returns 1 if tablet reports in high speed mode
9 or 0 otherwise. Writing to this file one of these values
10 switches reporting speed.
11
12What: /sys/bus/usb/devices/<busnum>-<devnum>:<cfg>.<intf>/wacom_led/led
13Date: August 2011
14Contact: linux-input@vger.kernel.org
15Description:
16 Attribute group for control of the status LEDs and the OLEDs.
17 This attribute group is only available for Intuos 4 M, L,
18 and XL (with LEDs and OLEDs) and Cintiq 21UX2 (LEDs only).
19 Therefore its presence implicitly signifies the presence of
20 said LEDs and OLEDs on the tablet device.
21
22What: /sys/bus/usb/devices/<busnum>-<devnum>:<cfg>.<intf>/wacom_led/status0_luminance
23Date: August 2011
24Contact: linux-input@vger.kernel.org
25Description:
26 Writing to this file sets the status LED luminance (1..127)
27 when the stylus does not touch the tablet surface, and no
28 button is pressed on the stylus. This luminance level is
29 normally lower than the level when a button is pressed.
30
31What: /sys/bus/usb/devices/<busnum>-<devnum>:<cfg>.<intf>/wacom_led/status1_luminance
32Date: August 2011
33Contact: linux-input@vger.kernel.org
34Description:
35 Writing to this file sets the status LED luminance (1..127)
36 when the stylus touches the tablet surface, or any button is
37 pressed on the stylus.
38
39What: /sys/bus/usb/devices/<busnum>-<devnum>:<cfg>.<intf>/wacom_led/status_led0_select
40Date: August 2011
41Contact: linux-input@vger.kernel.org
42Description:
43 Writing to this file sets which one of the four (for Intuos 4)
44 or of the right four (for Cintiq 21UX2) status LEDs is active (0..3).
45 The other three LEDs on the same side are always inactive.
46
47What: /sys/bus/usb/devices/<busnum>-<devnum>:<cfg>.<intf>/wacom_led/status_led1_select
48Date: September 2011
49Contact: linux-input@vger.kernel.org
50Description:
51 Writing to this file sets which one of the left four (for Cintiq 21UX2)
52 status LEDs is active (0..3). The other three LEDs on the left are always
53 inactive.
54
55What: /sys/bus/usb/devices/<busnum>-<devnum>:<cfg>.<intf>/wacom_led/buttons_luminance
56Date: August 2011
57Contact: linux-input@vger.kernel.org
58Description:
59 Writing to this file sets the overall luminance level (0..15)
60 of all eight button OLED displays.
61
62What: /sys/bus/usb/devices/<busnum>-<devnum>:<cfg>.<intf>/wacom_led/button<n>_rawimg
63Date: August 2011
64Contact: linux-input@vger.kernel.org
65Description:
66 When writing a 1024 byte raw image in Wacom Intuos 4
67 interleaving format to the file, the image shows up on Button N
68 of the device. The image is a 64x32 pixel 4-bit gray image. The
69 1024 byte binary is split up into 16x 64 byte chunks. Each 64
70 byte chunk encodes the image data for two consecutive lines on
71 the display. The low nibble of each byte contains the first
72 line, and the high nibble contains the second line.
diff --git a/Documentation/ABI/testing/sysfs-platform-ideapad-laptop b/Documentation/ABI/testing/sysfs-platform-ideapad-laptop
index ff53183c3848..814b01354c41 100644
--- a/Documentation/ABI/testing/sysfs-platform-ideapad-laptop
+++ b/Documentation/ABI/testing/sysfs-platform-ideapad-laptop
@@ -5,19 +5,4 @@ Contact: "Ike Panhc <ike.pan@canonical.com>"
5Description: 5Description:
6 Control the power of camera module. 1 means on, 0 means off. 6 Control the power of camera module. 1 means on, 0 means off.
7 7
8What: /sys/devices/platform/ideapad/cfg
9Date: Jun 2011
10KernelVersion: 3.1
11Contact: "Ike Panhc <ike.pan@canonical.com>"
12Description:
13 Ideapad capability bits.
14 Bit 8-10: 1 - Intel graphic only
15 2 - ATI graphic only
16 3 - Nvidia graphic only
17 4 - Intel and ATI graphic
18 5 - Intel and Nvidia graphic
19 Bit 16: Bluetooth exist (1 for exist)
20 Bit 17: 3G exist (1 for exist)
21 Bit 18: Wifi exist (1 for exist)
22 Bit 19: Camera exist (1 for exist)
23 8
diff --git a/Documentation/ABI/testing/sysfs-wacom b/Documentation/ABI/testing/sysfs-wacom
deleted file mode 100644
index 1517976e25c4..000000000000
--- a/Documentation/ABI/testing/sysfs-wacom
+++ /dev/null
@@ -1,10 +0,0 @@
1What: /sys/class/hidraw/hidraw*/device/speed
2Date: April 2010
3Kernel Version: 2.6.35
4Contact: linux-bluetooth@vger.kernel.org
5Description:
6 The /sys/class/hidraw/hidraw*/device/speed file controls
7 reporting speed of wacom bluetooth tablet. Reading from
8 this file returns 1 if tablet reports in high speed mode
9 or 0 otherwise. Writing to this file one of these values
10 switches reporting speed.
diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index fe2326906610..66bd97a95f10 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -50,6 +50,13 @@ specify the GFP_ flags (see kmalloc) for the allocation (the
50implementation may choose to ignore flags that affect the location of 50implementation may choose to ignore flags that affect the location of
51the returned memory, like GFP_DMA). 51the returned memory, like GFP_DMA).
52 52
53void *
54dma_zalloc_coherent(struct device *dev, size_t size,
55 dma_addr_t *dma_handle, gfp_t flag)
56
57Wraps dma_alloc_coherent() and also zeroes the returned memory if the
58allocation attempt succeeded.
59
53void 60void
54dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, 61dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
55 dma_addr_t dma_handle) 62 dma_addr_t dma_handle)
diff --git a/Documentation/DocBook/80211.tmpl b/Documentation/DocBook/80211.tmpl
index 445289cd0e65..2014155c899d 100644
--- a/Documentation/DocBook/80211.tmpl
+++ b/Documentation/DocBook/80211.tmpl
@@ -433,8 +433,18 @@
433 Insert notes about VLAN interfaces with hw crypto here or 433 Insert notes about VLAN interfaces with hw crypto here or
434 in the hw crypto chapter. 434 in the hw crypto chapter.
435 </para> 435 </para>
436 <section id="ps-client">
437 <title>support for powersaving clients</title>
438!Pinclude/net/mac80211.h AP support for powersaving clients
439 </section>
436!Finclude/net/mac80211.h ieee80211_get_buffered_bc 440!Finclude/net/mac80211.h ieee80211_get_buffered_bc
437!Finclude/net/mac80211.h ieee80211_beacon_get 441!Finclude/net/mac80211.h ieee80211_beacon_get
442!Finclude/net/mac80211.h ieee80211_sta_eosp_irqsafe
443!Finclude/net/mac80211.h ieee80211_frame_release_type
444!Finclude/net/mac80211.h ieee80211_sta_ps_transition
445!Finclude/net/mac80211.h ieee80211_sta_ps_transition_ni
446!Finclude/net/mac80211.h ieee80211_sta_set_buffered
447!Finclude/net/mac80211.h ieee80211_sta_block_awake
438 </chapter> 448 </chapter>
439 449
440 <chapter id="multi-iface"> 450 <chapter id="multi-iface">
@@ -460,7 +470,6 @@
460!Finclude/net/mac80211.h sta_notify_cmd 470!Finclude/net/mac80211.h sta_notify_cmd
461!Finclude/net/mac80211.h ieee80211_find_sta 471!Finclude/net/mac80211.h ieee80211_find_sta
462!Finclude/net/mac80211.h ieee80211_find_sta_by_ifaddr 472!Finclude/net/mac80211.h ieee80211_find_sta_by_ifaddr
463!Finclude/net/mac80211.h ieee80211_sta_block_awake
464 </chapter> 473 </chapter>
465 474
466 <chapter id="hardware-scan-offload"> 475 <chapter id="hardware-scan-offload">
diff --git a/Documentation/DocBook/media/dvb/dvbproperty.xml b/Documentation/DocBook/media/dvb/dvbproperty.xml
index 207e1a5bf8f0..3bc8a61efe30 100644
--- a/Documentation/DocBook/media/dvb/dvbproperty.xml
+++ b/Documentation/DocBook/media/dvb/dvbproperty.xml
@@ -352,6 +352,7 @@ typedef enum fe_delivery_system {
352 SYS_CMMB, 352 SYS_CMMB,
353 SYS_DAB, 353 SYS_DAB,
354 SYS_DVBT2, 354 SYS_DVBT2,
355 SYS_TURBO,
355} fe_delivery_system_t; 356} fe_delivery_system_t;
356</programlisting> 357</programlisting>
357 </section> 358 </section>
@@ -809,6 +810,8 @@ typedef enum fe_hierarchy {
809 <listitem><para><link linkend="DTV-INVERSION"><constant>DTV_INVERSION</constant></link></para></listitem> 810 <listitem><para><link linkend="DTV-INVERSION"><constant>DTV_INVERSION</constant></link></para></listitem>
810 <listitem><para><link linkend="DTV-SYMBOL-RATE"><constant>DTV_SYMBOL_RATE</constant></link></para></listitem> 811 <listitem><para><link linkend="DTV-SYMBOL-RATE"><constant>DTV_SYMBOL_RATE</constant></link></para></listitem>
811 <listitem><para><link linkend="DTV-INNER-FEC"><constant>DTV_INNER_FEC</constant></link></para></listitem> 812 <listitem><para><link linkend="DTV-INNER-FEC"><constant>DTV_INNER_FEC</constant></link></para></listitem>
813 <listitem><para><link linkend="DTV-VOLTAGE"><constant>DTV_VOLTAGE</constant></link></para></listitem>
814 <listitem><para><link linkend="DTV-TONE"><constant>DTV_TONE</constant></link></para></listitem>
812 </itemizedlist> 815 </itemizedlist>
813 <para>Future implementations might add those two missing parameters:</para> 816 <para>Future implementations might add those two missing parameters:</para>
814 <itemizedlist mark='opencircle'> 817 <itemizedlist mark='opencircle'>
@@ -818,25 +821,18 @@ typedef enum fe_hierarchy {
818 </section> 821 </section>
819 <section id="dvbs2-params"> 822 <section id="dvbs2-params">
820 <title>DVB-S2 delivery system</title> 823 <title>DVB-S2 delivery system</title>
821 <para>The following parameters are valid for DVB-S2:</para> 824 <para>In addition to all parameters valid for DVB-S, DVB-S2 supports the following parameters:</para>
822 <itemizedlist mark='opencircle'> 825 <itemizedlist mark='opencircle'>
823 <listitem><para><link linkend="DTV-API-VERSION"><constant>DTV_API_VERSION</constant></link></para></listitem> 826 <listitem><para><link linkend="DTV-MODULATION"><constant>DTV_MODULATION</constant></link></para></listitem>
824 <listitem><para><link linkend="DTV-DELIVERY-SYSTEM"><constant>DTV_DELIVERY_SYSTEM</constant></link></para></listitem>
825 <listitem><para><link linkend="DTV-TUNE"><constant>DTV_TUNE</constant></link></para></listitem>
826 <listitem><para><link linkend="DTV-CLEAR"><constant>DTV_CLEAR</constant></link></para></listitem>
827 <listitem><para><link linkend="DTV-FREQUENCY"><constant>DTV_FREQUENCY</constant></link></para></listitem>
828 <listitem><para><link linkend="DTV-INVERSION"><constant>DTV_INVERSION</constant></link></para></listitem>
829 <listitem><para><link linkend="DTV-SYMBOL-RATE"><constant>DTV_SYMBOL_RATE</constant></link></para></listitem>
830 <listitem><para><link linkend="DTV-INNER-FEC"><constant>DTV_INNER_FEC</constant></link></para></listitem>
831 <listitem><para><link linkend="DTV-VOLTAGE"><constant>DTV_VOLTAGE</constant></link></para></listitem>
832 <listitem><para><link linkend="DTV-TONE"><constant>DTV_TONE</constant></link></para></listitem>
833 <listitem><para><link linkend="DTV-PILOT"><constant>DTV_PILOT</constant></link></para></listitem> 827 <listitem><para><link linkend="DTV-PILOT"><constant>DTV_PILOT</constant></link></para></listitem>
834 <listitem><para><link linkend="DTV-ROLLOFF"><constant>DTV_ROLLOFF</constant></link></para></listitem> 828 <listitem><para><link linkend="DTV-ROLLOFF"><constant>DTV_ROLLOFF</constant></link></para></listitem>
835 </itemizedlist> 829 </itemizedlist>
836 <para>Future implementations might add those two missing parameters:</para> 830 </section>
831 <section id="turbo-params">
832 <title>Turbo code delivery system</title>
833 <para>In addition to all parameters valid for DVB-S, turbo code supports the following parameters:</para>
837 <itemizedlist mark='opencircle'> 834 <itemizedlist mark='opencircle'>
838 <listitem><para><link linkend="DTV-DISEQC-MASTER"><constant>DTV_DISEQC_MASTER</constant></link></para></listitem> 835 <listitem><para><link linkend="DTV-MODULATION"><constant>DTV_MODULATION</constant></link></para></listitem>
839 <listitem><para><link linkend="DTV-DISEQC-SLAVE-REPLY"><constant>DTV_DISEQC_SLAVE_REPLY</constant></link></para></listitem>
840 </itemizedlist> 836 </itemizedlist>
841 </section> 837 </section>
842 <section id="isdbs-params"> 838 <section id="isdbs-params">
diff --git a/Documentation/DocBook/media/dvb/intro.xml b/Documentation/DocBook/media/dvb/intro.xml
index c75dc7cc3e9b..170064a3dc8f 100644
--- a/Documentation/DocBook/media/dvb/intro.xml
+++ b/Documentation/DocBook/media/dvb/intro.xml
@@ -205,7 +205,7 @@ a partial path like:</para>
205additional include file <emphasis 205additional include file <emphasis
206role="tt">linux/dvb/version.h</emphasis> exists, which defines the 206role="tt">linux/dvb/version.h</emphasis> exists, which defines the
207constant <emphasis role="tt">DVB_API_VERSION</emphasis>. This document 207constant <emphasis role="tt">DVB_API_VERSION</emphasis>. This document
208describes <emphasis role="tt">DVB_API_VERSION&#x00A0;3</emphasis>. 208describes <emphasis role="tt">DVB_API_VERSION 5.4</emphasis>.
209</para> 209</para>
210 210
211</section> 211</section>
diff --git a/Documentation/DocBook/media/v4l/compat.xml b/Documentation/DocBook/media/v4l/compat.xml
index ce1004a7da52..91410b6e7e08 100644
--- a/Documentation/DocBook/media/v4l/compat.xml
+++ b/Documentation/DocBook/media/v4l/compat.xml
@@ -2370,6 +2370,14 @@ that used it. It was originally scheduled for removal in 2.6.35.
2370 </listitem> 2370 </listitem>
2371 </orderedlist> 2371 </orderedlist>
2372 </section> 2372 </section>
2373 <section>
2374 <title>V4L2 in Linux 3.2</title>
2375 <orderedlist>
2376 <listitem>
2377 <para>V4L2_CTRL_FLAG_VOLATILE was added to signal volatile controls to userspace.</para>
2378 </listitem>
2379 </orderedlist>
2380 </section>
2373 2381
2374 <section id="other"> 2382 <section id="other">
2375 <title>Relation of V4L2 to other Linux multimedia APIs</title> 2383 <title>Relation of V4L2 to other Linux multimedia APIs</title>
diff --git a/Documentation/DocBook/media/v4l/controls.xml b/Documentation/DocBook/media/v4l/controls.xml
index 85164016ed26..23fdf79f8cf3 100644
--- a/Documentation/DocBook/media/v4l/controls.xml
+++ b/Documentation/DocBook/media/v4l/controls.xml
@@ -1455,7 +1455,7 @@ Applicable to the H264 encoder.</entry>
1455 </row> 1455 </row>
1456 1456
1457 <row><entry></entry></row> 1457 <row><entry></entry></row>
1458 <row> 1458 <row id="v4l2-mpeg-video-h264-vui-sar-idc">
1459 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC</constant>&nbsp;</entry> 1459 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC</constant>&nbsp;</entry>
1460 <entry>enum&nbsp;v4l2_mpeg_video_h264_vui_sar_idc</entry> 1460 <entry>enum&nbsp;v4l2_mpeg_video_h264_vui_sar_idc</entry>
1461 </row> 1461 </row>
@@ -1561,7 +1561,7 @@ Applicable to the H264 encoder.</entry>
1561 </row> 1561 </row>
1562 1562
1563 <row><entry></entry></row> 1563 <row><entry></entry></row>
1564 <row> 1564 <row id="v4l2-mpeg-video-h264-level">
1565 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_LEVEL</constant>&nbsp;</entry> 1565 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_LEVEL</constant>&nbsp;</entry>
1566 <entry>enum&nbsp;v4l2_mpeg_video_h264_level</entry> 1566 <entry>enum&nbsp;v4l2_mpeg_video_h264_level</entry>
1567 </row> 1567 </row>
@@ -1641,7 +1641,7 @@ Possible values are:</entry>
1641 </row> 1641 </row>
1642 1642
1643 <row><entry></entry></row> 1643 <row><entry></entry></row>
1644 <row> 1644 <row id="v4l2-mpeg-video-mpeg4-level">
1645 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL</constant>&nbsp;</entry> 1645 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL</constant>&nbsp;</entry>
1646 <entry>enum&nbsp;v4l2_mpeg_video_mpeg4_level</entry> 1646 <entry>enum&nbsp;v4l2_mpeg_video_mpeg4_level</entry>
1647 </row> 1647 </row>
@@ -1689,9 +1689,9 @@ Possible values are:</entry>
1689 </row> 1689 </row>
1690 1690
1691 <row><entry></entry></row> 1691 <row><entry></entry></row>
1692 <row> 1692 <row id="v4l2-mpeg-video-h264-profile">
1693 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_PROFILE</constant>&nbsp;</entry> 1693 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_PROFILE</constant>&nbsp;</entry>
1694 <entry>enum&nbsp;v4l2_mpeg_h264_profile</entry> 1694 <entry>enum&nbsp;v4l2_mpeg_video_h264_profile</entry>
1695 </row> 1695 </row>
1696 <row><entry spanname="descr">The profile information for H264. 1696 <row><entry spanname="descr">The profile information for H264.
1697Applicable to the H264 encoder. 1697Applicable to the H264 encoder.
@@ -1774,9 +1774,9 @@ Possible values are:</entry>
1774 </row> 1774 </row>
1775 1775
1776 <row><entry></entry></row> 1776 <row><entry></entry></row>
1777 <row> 1777 <row id="v4l2-mpeg-video-mpeg4-profile">
1778 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_MPEG4_PROFILE</constant>&nbsp;</entry> 1778 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_MPEG4_PROFILE</constant>&nbsp;</entry>
1779 <entry>enum&nbsp;v4l2_mpeg_mpeg4_profile</entry> 1779 <entry>enum&nbsp;v4l2_mpeg_video_mpeg4_profile</entry>
1780 </row> 1780 </row>
1781 <row><entry spanname="descr">The profile information for MPEG4. 1781 <row><entry spanname="descr">The profile information for MPEG4.
1782Applicable to the MPEG4 encoder. 1782Applicable to the MPEG4 encoder.
@@ -1820,9 +1820,9 @@ Applicable to the encoder.
1820 </row> 1820 </row>
1821 1821
1822 <row><entry></entry></row> 1822 <row><entry></entry></row>
1823 <row> 1823 <row id="v4l2-mpeg-video-multi-slice-mode">
1824 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_MULTI_SLICE_MODE</constant>&nbsp;</entry> 1824 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_MULTI_SLICE_MODE</constant>&nbsp;</entry>
1825 <entry>enum&nbsp;v4l2_mpeg_multi_slice_mode</entry> 1825 <entry>enum&nbsp;v4l2_mpeg_video_multi_slice_mode</entry>
1826 </row> 1826 </row>
1827 <row><entry spanname="descr">Determines how the encoder should handle division of frame into slices. 1827 <row><entry spanname="descr">Determines how the encoder should handle division of frame into slices.
1828Applicable to the encoder. 1828Applicable to the encoder.
@@ -1868,9 +1868,9 @@ Applicable to the encoder.</entry>
1868 </row> 1868 </row>
1869 1869
1870 <row><entry></entry></row> 1870 <row><entry></entry></row>
1871 <row> 1871 <row id="v4l2-mpeg-video-h264-loop-filter-mode">
1872 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_LOOP_FILTER_MODE</constant>&nbsp;</entry> 1872 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_LOOP_FILTER_MODE</constant>&nbsp;</entry>
1873 <entry>enum&nbsp;v4l2_mpeg_h264_loop_filter_mode</entry> 1873 <entry>enum&nbsp;v4l2_mpeg_video_h264_loop_filter_mode</entry>
1874 </row> 1874 </row>
1875 <row><entry spanname="descr">Loop filter mode for H264 encoder. 1875 <row><entry spanname="descr">Loop filter mode for H264 encoder.
1876Possible values are:</entry> 1876Possible values are:</entry>
@@ -1913,9 +1913,9 @@ Applicable to the H264 encoder.</entry>
1913 </row> 1913 </row>
1914 1914
1915 <row><entry></entry></row> 1915 <row><entry></entry></row>
1916 <row> 1916 <row id="v4l2-mpeg-video-h264-entropy-mode">
1917 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_ENTROPY_MODE</constant>&nbsp;</entry> 1917 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_H264_ENTROPY_MODE</constant>&nbsp;</entry>
1918 <entry>enum&nbsp;v4l2_mpeg_h264_symbol_mode</entry> 1918 <entry>enum&nbsp;v4l2_mpeg_video_h264_entropy_mode</entry>
1919 </row> 1919 </row>
1920 <row><entry spanname="descr">Entropy coding mode for H264 - CABAC/CAVALC. 1920 <row><entry spanname="descr">Entropy coding mode for H264 - CABAC/CAVALC.
1921Applicable to the H264 encoder. 1921Applicable to the H264 encoder.
@@ -2140,9 +2140,9 @@ previous frames. Applicable to the H264 encoder.</entry>
2140 </row> 2140 </row>
2141 2141
2142 <row><entry></entry></row> 2142 <row><entry></entry></row>
2143 <row> 2143 <row id="v4l2-mpeg-video-header-mode">
2144 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_HEADER_MODE</constant>&nbsp;</entry> 2144 <entry spanname="id"><constant>V4L2_CID_MPEG_VIDEO_HEADER_MODE</constant>&nbsp;</entry>
2145 <entry>enum&nbsp;v4l2_mpeg_header_mode</entry> 2145 <entry>enum&nbsp;v4l2_mpeg_video_header_mode</entry>
2146 </row> 2146 </row>
2147 <row><entry spanname="descr">Determines whether the header is returned as the first buffer or is 2147 <row><entry spanname="descr">Determines whether the header is returned as the first buffer or is
2148it returned together with the first frame. Applicable to encoders. 2148it returned together with the first frame. Applicable to encoders.
@@ -2320,9 +2320,9 @@ Valid only when H.264 and macroblock level RC is enabled (<constant>V4L2_CID_MPE
2320Applicable to the H264 encoder.</entry> 2320Applicable to the H264 encoder.</entry>
2321 </row> 2321 </row>
2322 <row><entry></entry></row> 2322 <row><entry></entry></row>
2323 <row> 2323 <row id="v4l2-mpeg-mfc51-video-frame-skip-mode">
2324 <entry spanname="id"><constant>V4L2_CID_MPEG_MFC51_VIDEO_FRAME_SKIP_MODE</constant>&nbsp;</entry> 2324 <entry spanname="id"><constant>V4L2_CID_MPEG_MFC51_VIDEO_FRAME_SKIP_MODE</constant>&nbsp;</entry>
2325 <entry>enum&nbsp;v4l2_mpeg_mfc51_frame_skip_mode</entry> 2325 <entry>enum&nbsp;v4l2_mpeg_mfc51_video_frame_skip_mode</entry>
2326 </row> 2326 </row>
2327 <row><entry spanname="descr"> 2327 <row><entry spanname="descr">
2328Indicates in what conditions the encoder should skip frames. If encoding a frame would cause the encoded stream to be larger then 2328Indicates in what conditions the encoder should skip frames. If encoding a frame would cause the encoded stream to be larger then
@@ -2361,9 +2361,9 @@ the stream will meet tight bandwidth contraints. Applicable to encoders.
2361</entry> 2361</entry>
2362 </row> 2362 </row>
2363 <row><entry></entry></row> 2363 <row><entry></entry></row>
2364 <row> 2364 <row id="v4l2-mpeg-mfc51-video-force-frame-type">
2365 <entry spanname="id"><constant>V4L2_CID_MPEG_MFC51_VIDEO_FORCE_FRAME_TYPE</constant>&nbsp;</entry> 2365 <entry spanname="id"><constant>V4L2_CID_MPEG_MFC51_VIDEO_FORCE_FRAME_TYPE</constant>&nbsp;</entry>
2366 <entry>enum&nbsp;v4l2_mpeg_mfc51_force_frame_type</entry> 2366 <entry>enum&nbsp;v4l2_mpeg_mfc51_video_force_frame_type</entry>
2367 </row> 2367 </row>
2368 <row><entry spanname="descr">Force a frame type for the next queued buffer. Applicable to encoders. 2368 <row><entry spanname="descr">Force a frame type for the next queued buffer. Applicable to encoders.
2369Possible values are:</entry> 2369Possible values are:</entry>
diff --git a/Documentation/DocBook/media/v4l/dev-subdev.xml b/Documentation/DocBook/media/v4l/dev-subdev.xml
index 05c8fefcbcbe..0916a7343a16 100644
--- a/Documentation/DocBook/media/v4l/dev-subdev.xml
+++ b/Documentation/DocBook/media/v4l/dev-subdev.xml
@@ -266,7 +266,7 @@
266 266
267 <para>When satisfied with the try results, applications can set the active 267 <para>When satisfied with the try results, applications can set the active
268 formats by setting the <structfield>which</structfield> argument to 268 formats by setting the <structfield>which</structfield> argument to
269 <constant>V4L2_SUBDEV_FORMAT_TRY</constant>. Active formats are changed 269 <constant>V4L2_SUBDEV_FORMAT_ACTIVE</constant>. Active formats are changed
270 exactly as try formats by drivers. To avoid modifying the hardware state 270 exactly as try formats by drivers. To avoid modifying the hardware state
271 during format negotiation, applications should negotiate try formats first 271 during format negotiation, applications should negotiate try formats first
272 and then modify the active settings using the try formats returned during 272 and then modify the active settings using the try formats returned during
diff --git a/Documentation/DocBook/media/v4l/v4l2.xml b/Documentation/DocBook/media/v4l/v4l2.xml
index 0d05e8747c12..40132c277647 100644
--- a/Documentation/DocBook/media/v4l/v4l2.xml
+++ b/Documentation/DocBook/media/v4l/v4l2.xml
@@ -128,6 +128,13 @@ structs, ioctls) must be noted in more detail in the history chapter
128applications. --> 128applications. -->
129 129
130 <revision> 130 <revision>
131 <revnumber>3.2</revnumber>
132 <date>2011-08-26</date>
133 <authorinitials>hv</authorinitials>
134 <revremark>Added V4L2_CTRL_FLAG_VOLATILE.</revremark>
135 </revision>
136
137 <revision>
131 <revnumber>3.1</revnumber> 138 <revnumber>3.1</revnumber>
132 <date>2011-06-27</date> 139 <date>2011-06-27</date>
133 <authorinitials>mcc, po, hv</authorinitials> 140 <authorinitials>mcc, po, hv</authorinitials>
@@ -410,7 +417,7 @@ and discussions on the V4L mailing list.</revremark>
410</partinfo> 417</partinfo>
411 418
412<title>Video for Linux Two API Specification</title> 419<title>Video for Linux Two API Specification</title>
413 <subtitle>Revision 3.1</subtitle> 420 <subtitle>Revision 3.2</subtitle>
414 421
415 <chapter id="common"> 422 <chapter id="common">
416 &sub-common; 423 &sub-common;
diff --git a/Documentation/DocBook/media/v4l/vidioc-dqevent.xml b/Documentation/DocBook/media/v4l/vidioc-dqevent.xml
index 7769642ee431..e8714aa16433 100644
--- a/Documentation/DocBook/media/v4l/vidioc-dqevent.xml
+++ b/Documentation/DocBook/media/v4l/vidioc-dqevent.xml
@@ -88,6 +88,12 @@
88 </row> 88 </row>
89 <row> 89 <row>
90 <entry></entry> 90 <entry></entry>
91 <entry>&v4l2-event-frame-sync;</entry>
92 <entry><structfield>frame</structfield></entry>
93 <entry>Event data for event V4L2_EVENT_FRAME_SYNC.</entry>
94 </row>
95 <row>
96 <entry></entry>
91 <entry>__u8</entry> 97 <entry>__u8</entry>
92 <entry><structfield>data</structfield>[64]</entry> 98 <entry><structfield>data</structfield>[64]</entry>
93 <entry>Event data. Defined by the event type. The union 99 <entry>Event data. Defined by the event type. The union
@@ -135,6 +141,129 @@
135 </tgroup> 141 </tgroup>
136 </table> 142 </table>
137 143
144 <table frame="none" pgwide="1" id="v4l2-event-vsync">
145 <title>struct <structname>v4l2_event_vsync</structname></title>
146 <tgroup cols="3">
147 &cs-str;
148 <tbody valign="top">
149 <row>
150 <entry>__u8</entry>
151 <entry><structfield>field</structfield></entry>
152 <entry>The upcoming field. See &v4l2-field;.</entry>
153 </row>
154 </tbody>
155 </tgroup>
156 </table>
157
158 <table frame="none" pgwide="1" id="v4l2-event-ctrl">
159 <title>struct <structname>v4l2_event_ctrl</structname></title>
160 <tgroup cols="4">
161 &cs-str;
162 <tbody valign="top">
163 <row>
164 <entry>__u32</entry>
165 <entry><structfield>changes</structfield></entry>
166 <entry></entry>
167 <entry>A bitmask that tells what has changed. See <xref linkend="changes-flags" />.</entry>
168 </row>
169 <row>
170 <entry>__u32</entry>
171 <entry><structfield>type</structfield></entry>
172 <entry></entry>
173 <entry>The type of the control. See &v4l2-ctrl-type;.</entry>
174 </row>
175 <row>
176 <entry>union (anonymous)</entry>
177 <entry></entry>
178 <entry></entry>
179 <entry></entry>
180 </row>
181 <row>
182 <entry></entry>
183 <entry>__s32</entry>
184 <entry><structfield>value</structfield></entry>
185 <entry>The 32-bit value of the control for 32-bit control types.
186 This is 0 for string controls since the value of a string
187 cannot be passed using &VIDIOC-DQEVENT;.</entry>
188 </row>
189 <row>
190 <entry></entry>
191 <entry>__s64</entry>
192 <entry><structfield>value64</structfield></entry>
193 <entry>The 64-bit value of the control for 64-bit control types.</entry>
194 </row>
195 <row>
196 <entry>__u32</entry>
197 <entry><structfield>flags</structfield></entry>
198 <entry></entry>
199 <entry>The control flags. See <xref linkend="control-flags" />.</entry>
200 </row>
201 <row>
202 <entry>__s32</entry>
203 <entry><structfield>minimum</structfield></entry>
204 <entry></entry>
205 <entry>The minimum value of the control. See &v4l2-queryctrl;.</entry>
206 </row>
207 <row>
208 <entry>__s32</entry>
209 <entry><structfield>maximum</structfield></entry>
210 <entry></entry>
211 <entry>The maximum value of the control. See &v4l2-queryctrl;.</entry>
212 </row>
213 <row>
214 <entry>__s32</entry>
215 <entry><structfield>step</structfield></entry>
216 <entry></entry>
217 <entry>The step value of the control. See &v4l2-queryctrl;.</entry>
218 </row>
219 <row>
220 <entry>__s32</entry>
221 <entry><structfield>default_value</structfield></entry>
222 <entry></entry>
223 <entry>The default value value of the control. See &v4l2-queryctrl;.</entry>
224 </row>
225 </tbody>
226 </tgroup>
227 </table>
228
229 <table frame="none" pgwide="1" id="v4l2-event-frame-sync">
230 <title>struct <structname>v4l2_event_frame_sync</structname></title>
231 <tgroup cols="3">
232 &cs-str;
233 <tbody valign="top">
234 <row>
235 <entry>__u32</entry>
236 <entry><structfield>frame_sequence</structfield></entry>
237 <entry>
238 The sequence number of the frame being received.
239 </entry>
240 </row>
241 </tbody>
242 </tgroup>
243 </table>
244
245 <table pgwide="1" frame="none" id="changes-flags">
246 <title>Changes</title>
247 <tgroup cols="3">
248 &cs-def;
249 <tbody valign="top">
250 <row>
251 <entry><constant>V4L2_EVENT_CTRL_CH_VALUE</constant></entry>
252 <entry>0x0001</entry>
253 <entry>This control event was triggered because the value of the control
254 changed. Special case: if a button control is pressed, then this
255 event is sent as well, even though there is not explicit value
256 associated with a button control.</entry>
257 </row>
258 <row>
259 <entry><constant>V4L2_EVENT_CTRL_CH_FLAGS</constant></entry>
260 <entry>0x0002</entry>
261 <entry>This control event was triggered because the control flags
262 changed.</entry>
263 </row>
264 </tbody>
265 </tgroup>
266 </table>
138 </refsect1> 267 </refsect1>
139 <refsect1> 268 <refsect1>
140 &return-value; 269 &return-value;
diff --git a/Documentation/DocBook/media/v4l/vidioc-queryctrl.xml b/Documentation/DocBook/media/v4l/vidioc-queryctrl.xml
index 677ea646c29f..0ac0057a51c4 100644
--- a/Documentation/DocBook/media/v4l/vidioc-queryctrl.xml
+++ b/Documentation/DocBook/media/v4l/vidioc-queryctrl.xml
@@ -406,6 +406,15 @@ flag is typically present for relative controls or action controls where
406writing a value will cause the device to carry out a given action 406writing a value will cause the device to carry out a given action
407(&eg; motor control) but no meaningful value can be returned.</entry> 407(&eg; motor control) but no meaningful value can be returned.</entry>
408 </row> 408 </row>
409 <row>
410 <entry><constant>V4L2_CTRL_FLAG_VOLATILE</constant></entry>
411 <entry>0x0080</entry>
412 <entry>This control is volatile, which means that the value of the control
413changes continuously. A typical example would be the current gain value if the device
414is in auto-gain mode. In such a case the hardware calculates the gain value based on
415the lighting conditions which can change over time. Note that setting a new value for
416a volatile control will have no effect. The new value will just be ignored.</entry>
417 </row>
409 </tbody> 418 </tbody>
410 </tgroup> 419 </tgroup>
411 </table> 420 </table>
diff --git a/Documentation/DocBook/media/v4l/vidioc-subscribe-event.xml b/Documentation/DocBook/media/v4l/vidioc-subscribe-event.xml
index 69c0d8a2a3d2..5c70b616d818 100644
--- a/Documentation/DocBook/media/v4l/vidioc-subscribe-event.xml
+++ b/Documentation/DocBook/media/v4l/vidioc-subscribe-event.xml
@@ -139,6 +139,22 @@
139 </entry> 139 </entry>
140 </row> 140 </row>
141 <row> 141 <row>
142 <entry><constant>V4L2_EVENT_FRAME_SYNC</constant></entry>
143 <entry>4</entry>
144 <entry>
145 <para>Triggered immediately when the reception of a
146 frame has begun. This event has a
147 &v4l2-event-frame-sync; associated with it.</para>
148
149 <para>If the hardware needs to be stopped in the case of a
150 buffer underrun it might not be able to generate this event.
151 In such cases the <structfield>frame_sequence</structfield>
152 field in &v4l2-event-frame-sync; will not be incremented. This
153 causes two consecutive frame sequence numbers to have n times
154 frame interval in between them.</para>
155 </entry>
156 </row>
157 <row>
142 <entry><constant>V4L2_EVENT_PRIVATE_START</constant></entry> 158 <entry><constant>V4L2_EVENT_PRIVATE_START</constant></entry>
143 <entry>0x08000000</entry> 159 <entry>0x08000000</entry>
144 <entry>Base event number for driver-private events.</entry> 160 <entry>Base event number for driver-private events.</entry>
@@ -183,113 +199,6 @@
183 </tgroup> 199 </tgroup>
184 </table> 200 </table>
185 201
186 <table frame="none" pgwide="1" id="v4l2-event-vsync">
187 <title>struct <structname>v4l2_event_vsync</structname></title>
188 <tgroup cols="3">
189 &cs-str;
190 <tbody valign="top">
191 <row>
192 <entry>__u8</entry>
193 <entry><structfield>field</structfield></entry>
194 <entry>The upcoming field. See &v4l2-field;.</entry>
195 </row>
196 </tbody>
197 </tgroup>
198 </table>
199
200 <table frame="none" pgwide="1" id="v4l2-event-ctrl">
201 <title>struct <structname>v4l2_event_ctrl</structname></title>
202 <tgroup cols="4">
203 &cs-str;
204 <tbody valign="top">
205 <row>
206 <entry>__u32</entry>
207 <entry><structfield>changes</structfield></entry>
208 <entry></entry>
209 <entry>A bitmask that tells what has changed. See <xref linkend="changes-flags" />.</entry>
210 </row>
211 <row>
212 <entry>__u32</entry>
213 <entry><structfield>type</structfield></entry>
214 <entry></entry>
215 <entry>The type of the control. See &v4l2-ctrl-type;.</entry>
216 </row>
217 <row>
218 <entry>union (anonymous)</entry>
219 <entry></entry>
220 <entry></entry>
221 <entry></entry>
222 </row>
223 <row>
224 <entry></entry>
225 <entry>__s32</entry>
226 <entry><structfield>value</structfield></entry>
227 <entry>The 32-bit value of the control for 32-bit control types.
228 This is 0 for string controls since the value of a string
229 cannot be passed using &VIDIOC-DQEVENT;.</entry>
230 </row>
231 <row>
232 <entry></entry>
233 <entry>__s64</entry>
234 <entry><structfield>value64</structfield></entry>
235 <entry>The 64-bit value of the control for 64-bit control types.</entry>
236 </row>
237 <row>
238 <entry>__u32</entry>
239 <entry><structfield>flags</structfield></entry>
240 <entry></entry>
241 <entry>The control flags. See <xref linkend="control-flags" />.</entry>
242 </row>
243 <row>
244 <entry>__s32</entry>
245 <entry><structfield>minimum</structfield></entry>
246 <entry></entry>
247 <entry>The minimum value of the control. See &v4l2-queryctrl;.</entry>
248 </row>
249 <row>
250 <entry>__s32</entry>
251 <entry><structfield>maximum</structfield></entry>
252 <entry></entry>
253 <entry>The maximum value of the control. See &v4l2-queryctrl;.</entry>
254 </row>
255 <row>
256 <entry>__s32</entry>
257 <entry><structfield>step</structfield></entry>
258 <entry></entry>
259 <entry>The step value of the control. See &v4l2-queryctrl;.</entry>
260 </row>
261 <row>
262 <entry>__s32</entry>
263 <entry><structfield>default_value</structfield></entry>
264 <entry></entry>
265 <entry>The default value value of the control. See &v4l2-queryctrl;.</entry>
266 </row>
267 </tbody>
268 </tgroup>
269 </table>
270
271 <table pgwide="1" frame="none" id="changes-flags">
272 <title>Changes</title>
273 <tgroup cols="3">
274 &cs-def;
275 <tbody valign="top">
276 <row>
277 <entry><constant>V4L2_EVENT_CTRL_CH_VALUE</constant></entry>
278 <entry>0x0001</entry>
279 <entry>This control event was triggered because the value of the control
280 changed. Special case: if a button control is pressed, then this
281 event is sent as well, even though there is not explicit value
282 associated with a button control.</entry>
283 </row>
284 <row>
285 <entry><constant>V4L2_EVENT_CTRL_CH_FLAGS</constant></entry>
286 <entry>0x0002</entry>
287 <entry>This control event was triggered because the control flags
288 changed.</entry>
289 </row>
290 </tbody>
291 </tgroup>
292 </table>
293 </refsect1> 202 </refsect1>
294 <refsect1> 203 <refsect1>
295 &return-value; 204 &return-value;
diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl
index 7c4b514d62b1..54883de5d5f9 100644
--- a/Documentation/DocBook/uio-howto.tmpl
+++ b/Documentation/DocBook/uio-howto.tmpl
@@ -529,7 +529,7 @@ memory (e.g. allocated with <function>kmalloc()</function>). There's also
529</para></listitem> 529</para></listitem>
530 530
531<listitem><para> 531<listitem><para>
532<varname>unsigned long addr</varname>: Required if the mapping is used. 532<varname>phys_addr_t addr</varname>: Required if the mapping is used.
533Fill in the address of your memory block. This address is the one that 533Fill in the address of your memory block. This address is the one that
534appears in sysfs. 534appears in sysfs.
535</para></listitem> 535</para></listitem>
diff --git a/Documentation/PCI/MSI-HOWTO.txt b/Documentation/PCI/MSI-HOWTO.txt
index 3f5e0b09bed5..53e6fca146d7 100644
--- a/Documentation/PCI/MSI-HOWTO.txt
+++ b/Documentation/PCI/MSI-HOWTO.txt
@@ -45,7 +45,7 @@ arrived in memory (this becomes more likely with devices behind PCI-PCI
45bridges). In order to ensure that all the data has arrived in memory, 45bridges). In order to ensure that all the data has arrived in memory,
46the interrupt handler must read a register on the device which raised 46the interrupt handler must read a register on the device which raised
47the interrupt. PCI transaction ordering rules require that all the data 47the interrupt. PCI transaction ordering rules require that all the data
48arrives in memory before the value can be returned from the register. 48arrive in memory before the value may be returned from the register.
49Using MSIs avoids this problem as the interrupt-generating write cannot 49Using MSIs avoids this problem as the interrupt-generating write cannot
50pass the data writes, so by the time the interrupt is raised, the driver 50pass the data writes, so by the time the interrupt is raised, the driver
51knows that all the data has arrived in memory. 51knows that all the data has arrived in memory.
@@ -86,13 +86,13 @@ device.
86 86
87int pci_enable_msi(struct pci_dev *dev) 87int pci_enable_msi(struct pci_dev *dev)
88 88
89A successful call will allocate ONE interrupt to the device, regardless 89A successful call allocates ONE interrupt to the device, regardless
90of how many MSIs the device supports. The device will be switched from 90of how many MSIs the device supports. The device is switched from
91pin-based interrupt mode to MSI mode. The dev->irq number is changed 91pin-based interrupt mode to MSI mode. The dev->irq number is changed
92to a new number which represents the message signaled interrupt. 92to a new number which represents the message signaled interrupt;
93This function should be called before the driver calls request_irq() 93consequently, this function should be called before the driver calls
94since enabling MSIs disables the pin-based IRQ and the driver will not 94request_irq(), because an MSI is delivered via a vector that is
95receive interrupts on the old interrupt. 95different from the vector of a pin-based interrupt.
96 96
974.2.2 pci_enable_msi_block 974.2.2 pci_enable_msi_block
98 98
@@ -111,20 +111,20 @@ the device are in the range dev->irq to dev->irq + count - 1.
111 111
112If this function returns a negative number, it indicates an error and 112If this function returns a negative number, it indicates an error and
113the driver should not attempt to request any more MSI interrupts for 113the driver should not attempt to request any more MSI interrupts for
114this device. If this function returns a positive number, it will be 114this device. If this function returns a positive number, it is
115less than 'count' and indicate the number of interrupts that could have 115less than 'count' and indicates the number of interrupts that could have
116been allocated. In neither case will the irq value have been 116been allocated. In neither case is the irq value updated or the device
117updated, nor will the device have been switched into MSI mode. 117switched into MSI mode.
118 118
119The device driver must decide what action to take if 119The device driver must decide what action to take if
120pci_enable_msi_block() returns a value less than the number asked for. 120pci_enable_msi_block() returns a value less than the number requested.
121Some devices can make use of fewer interrupts than the maximum they 121For instance, the driver could still make use of fewer interrupts;
122request; in this case the driver should call pci_enable_msi_block() 122in this case the driver should call pci_enable_msi_block()
123again. Note that it is not guaranteed to succeed, even when the 123again. Note that it is not guaranteed to succeed, even when the
124'count' has been reduced to the value returned from a previous call to 124'count' has been reduced to the value returned from a previous call to
125pci_enable_msi_block(). This is because there are multiple constraints 125pci_enable_msi_block(). This is because there are multiple constraints
126on the number of vectors that can be allocated; pci_enable_msi_block() 126on the number of vectors that can be allocated; pci_enable_msi_block()
127will return as soon as it finds any constraint that doesn't allow the 127returns as soon as it finds any constraint that doesn't allow the
128call to succeed. 128call to succeed.
129 129
1304.2.3 pci_disable_msi 1304.2.3 pci_disable_msi
@@ -137,10 +137,10 @@ interrupt number and frees the previously allocated message signaled
137interrupt(s). The interrupt may subsequently be assigned to another 137interrupt(s). The interrupt may subsequently be assigned to another
138device, so drivers should not cache the value of dev->irq. 138device, so drivers should not cache the value of dev->irq.
139 139
140A device driver must always call free_irq() on the interrupt(s) 140Before calling this function, a device driver must always call free_irq()
141for which it has called request_irq() before calling this function. 141on any interrupt for which it previously called request_irq().
142Failure to do so will result in a BUG_ON(), the device will be left with 142Failure to do so results in a BUG_ON(), leaving the device with
143MSI enabled and will leak its vector. 143MSI enabled and thus leaking its vector.
144 144
1454.3 Using MSI-X 1454.3 Using MSI-X
146 146
@@ -155,10 +155,10 @@ struct msix_entry {
155}; 155};
156 156
157This allows for the device to use these interrupts in a sparse fashion; 157This allows for the device to use these interrupts in a sparse fashion;
158for example it could use interrupts 3 and 1027 and allocate only a 158for example, it could use interrupts 3 and 1027 and yet allocate only a
159two-element array. The driver is expected to fill in the 'entry' value 159two-element array. The driver is expected to fill in the 'entry' value
160in each element of the array to indicate which entries it wants the kernel 160in each element of the array to indicate for which entries the kernel
161to assign interrupts for. It is invalid to fill in two entries with the 161should assign interrupts; it is invalid to fill in two entries with the
162same number. 162same number.
163 163
1644.3.1 pci_enable_msix 1644.3.1 pci_enable_msix
@@ -168,10 +168,11 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
168Calling this function asks the PCI subsystem to allocate 'nvec' MSIs. 168Calling this function asks the PCI subsystem to allocate 'nvec' MSIs.
169The 'entries' argument is a pointer to an array of msix_entry structs 169The 'entries' argument is a pointer to an array of msix_entry structs
170which should be at least 'nvec' entries in size. On success, the 170which should be at least 'nvec' entries in size. On success, the
171function will return 0 and the device will have been switched into 171device is switched into MSI-X mode and the function returns 0.
172MSI-X interrupt mode. The 'vector' elements in each entry will have 172The 'vector' member in each entry is populated with the interrupt number;
173been filled in with the interrupt number. The driver should then call 173the driver should then call request_irq() for each 'vector' that it
174request_irq() for each 'vector' that it decides to use. 174decides to use. The device driver is responsible for keeping track of the
175interrupts assigned to the MSI-X vectors so it can free them again later.
175 176
176If this function returns a negative number, it indicates an error and 177If this function returns a negative number, it indicates an error and
177the driver should not attempt to allocate any more MSI-X interrupts for 178the driver should not attempt to allocate any more MSI-X interrupts for
@@ -181,16 +182,14 @@ below.
181 182
182This function, in contrast with pci_enable_msi(), does not adjust 183This function, in contrast with pci_enable_msi(), does not adjust
183dev->irq. The device will not generate interrupts for this interrupt 184dev->irq. The device will not generate interrupts for this interrupt
184number once MSI-X is enabled. The device driver is responsible for 185number once MSI-X is enabled.
185keeping track of the interrupts assigned to the MSI-X vectors so it can
186free them again later.
187 186
188Device drivers should normally call this function once per device 187Device drivers should normally call this function once per device
189during the initialization phase. 188during the initialization phase.
190 189
191It is ideal if drivers can cope with a variable number of MSI-X interrupts, 190It is ideal if drivers can cope with a variable number of MSI-X interrupts;
192there are many reasons why the platform may not be able to provide the 191there are many reasons why the platform may not be able to provide the
193exact number a driver asks for. 192exact number that a driver asks for.
194 193
195A request loop to achieve that might look like: 194A request loop to achieve that might look like:
196 195
@@ -212,15 +211,15 @@ static int foo_driver_enable_msix(struct foo_adapter *adapter, int nvec)
212 211
213void pci_disable_msix(struct pci_dev *dev) 212void pci_disable_msix(struct pci_dev *dev)
214 213
215This API should be used to undo the effect of pci_enable_msix(). It frees 214This function should be used to undo the effect of pci_enable_msix(). It frees
216the previously allocated message signaled interrupts. The interrupts may 215the previously allocated message signaled interrupts. The interrupts may
217subsequently be assigned to another device, so drivers should not cache 216subsequently be assigned to another device, so drivers should not cache
218the value of the 'vector' elements over a call to pci_disable_msix(). 217the value of the 'vector' elements over a call to pci_disable_msix().
219 218
220A device driver must always call free_irq() on the interrupt(s) 219Before calling this function, a device driver must always call free_irq()
221for which it has called request_irq() before calling this function. 220on any interrupt for which it previously called request_irq().
222Failure to do so will result in a BUG_ON(), the device will be left with 221Failure to do so results in a BUG_ON(), leaving the device with
223MSI enabled and will leak its vector. 222MSI-X enabled and thus leaking its vector.
224 223
2254.3.3 The MSI-X Table 2244.3.3 The MSI-X Table
226 225
@@ -232,10 +231,10 @@ mask or unmask an interrupt, it should call disable_irq() / enable_irq().
2324.4 Handling devices implementing both MSI and MSI-X capabilities 2314.4 Handling devices implementing both MSI and MSI-X capabilities
233 232
234If a device implements both MSI and MSI-X capabilities, it can 233If a device implements both MSI and MSI-X capabilities, it can
235run in either MSI mode or MSI-X mode but not both simultaneously. 234run in either MSI mode or MSI-X mode, but not both simultaneously.
236This is a requirement of the PCI spec, and it is enforced by the 235This is a requirement of the PCI spec, and it is enforced by the
237PCI layer. Calling pci_enable_msi() when MSI-X is already enabled or 236PCI layer. Calling pci_enable_msi() when MSI-X is already enabled or
238pci_enable_msix() when MSI is already enabled will result in an error. 237pci_enable_msix() when MSI is already enabled results in an error.
239If a device driver wishes to switch between MSI and MSI-X at runtime, 238If a device driver wishes to switch between MSI and MSI-X at runtime,
240it must first quiesce the device, then switch it back to pin-interrupt 239it must first quiesce the device, then switch it back to pin-interrupt
241mode, before calling pci_enable_msi() or pci_enable_msix() and resuming 240mode, before calling pci_enable_msi() or pci_enable_msix() and resuming
@@ -251,7 +250,7 @@ the MSI-X facilities in preference to the MSI facilities. As mentioned
251above, MSI-X supports any number of interrupts between 1 and 2048. 250above, MSI-X supports any number of interrupts between 1 and 2048.
252In constrast, MSI is restricted to a maximum of 32 interrupts (and 251In constrast, MSI is restricted to a maximum of 32 interrupts (and
253must be a power of two). In addition, the MSI interrupt vectors must 252must be a power of two). In addition, the MSI interrupt vectors must
254be allocated consecutively, so the system may not be able to allocate 253be allocated consecutively, so the system might not be able to allocate
255as many vectors for MSI as it could for MSI-X. On some platforms, MSI 254as many vectors for MSI as it could for MSI-X. On some platforms, MSI
256interrupts must all be targeted at the same set of CPUs whereas MSI-X 255interrupts must all be targeted at the same set of CPUs whereas MSI-X
257interrupts can all be targeted at different CPUs. 256interrupts can all be targeted at different CPUs.
@@ -281,7 +280,7 @@ disabled to enabled and back again.
281 280
282Using 'lspci -v' (as root) may show some devices with "MSI", "Message 281Using 'lspci -v' (as root) may show some devices with "MSI", "Message
283Signalled Interrupts" or "MSI-X" capabilities. Each of these capabilities 282Signalled Interrupts" or "MSI-X" capabilities. Each of these capabilities
284has an 'Enable' flag which will be followed with either "+" (enabled) 283has an 'Enable' flag which is followed with either "+" (enabled)
285or "-" (disabled). 284or "-" (disabled).
286 285
287 286
@@ -298,7 +297,7 @@ The PCI stack provides three ways to disable MSIs:
298 297
299Some host chipsets simply don't support MSIs properly. If we're 298Some host chipsets simply don't support MSIs properly. If we're
300lucky, the manufacturer knows this and has indicated it in the ACPI 299lucky, the manufacturer knows this and has indicated it in the ACPI
301FADT table. In this case, Linux will automatically disable MSIs. 300FADT table. In this case, Linux automatically disables MSIs.
302Some boards don't include this information in the table and so we have 301Some boards don't include this information in the table and so we have
303to detect them ourselves. The complete list of these is found near the 302to detect them ourselves. The complete list of these is found near the
304quirk_disable_all_msi() function in drivers/pci/quirks.c. 303quirk_disable_all_msi() function in drivers/pci/quirks.c.
@@ -317,7 +316,7 @@ Some bridges allow you to enable MSIs by changing some bits in their
317PCI configuration space (especially the Hypertransport chipsets such 316PCI configuration space (especially the Hypertransport chipsets such
318as the nVidia nForce and Serverworks HT2000). As with host chipsets, 317as the nVidia nForce and Serverworks HT2000). As with host chipsets,
319Linux mostly knows about them and automatically enables MSIs if it can. 318Linux mostly knows about them and automatically enables MSIs if it can.
320If you have a bridge which Linux doesn't yet know about, you can enable 319If you have a bridge unknown to Linux, you can enable
321MSIs in configuration space using whatever method you know works, then 320MSIs in configuration space using whatever method you know works, then
322enable MSIs on that bridge by doing: 321enable MSIs on that bridge by doing:
323 322
@@ -327,7 +326,7 @@ where $bridge is the PCI address of the bridge you've enabled (eg
3270000:00:0e.0). 3260000:00:0e.0).
328 327
329To disable MSIs, echo 0 instead of 1. Changing this value should be 328To disable MSIs, echo 0 instead of 1. Changing this value should be
330done with caution as it can break interrupt handling for all devices 329done with caution as it could break interrupt handling for all devices
331below this bridge. 330below this bridge.
332 331
333Again, please notify linux-pci@vger.kernel.org of any bridges that need 332Again, please notify linux-pci@vger.kernel.org of any bridges that need
@@ -336,7 +335,7 @@ special handling.
3365.3. Disabling MSIs on a single device 3355.3. Disabling MSIs on a single device
337 336
338Some devices are known to have faulty MSI implementations. Usually this 337Some devices are known to have faulty MSI implementations. Usually this
339is handled in the individual device driver but occasionally it's necessary 338is handled in the individual device driver, but occasionally it's necessary
340to handle this with a quirk. Some drivers have an option to disable use 339to handle this with a quirk. Some drivers have an option to disable use
341of MSI. While this is a convenient workaround for the driver author, 340of MSI. While this is a convenient workaround for the driver author,
342it is not good practise, and should not be emulated. 341it is not good practise, and should not be emulated.
@@ -350,7 +349,7 @@ for your machine. You should also check your .config to be sure you
350have enabled CONFIG_PCI_MSI. 349have enabled CONFIG_PCI_MSI.
351 350
352Then, 'lspci -t' gives the list of bridges above a device. Reading 351Then, 'lspci -t' gives the list of bridges above a device. Reading
353/sys/bus/pci/devices/*/msi_bus will tell you whether MSI are enabled (1) 352/sys/bus/pci/devices/*/msi_bus will tell you whether MSIs are enabled (1)
354or disabled (0). If 0 is found in any of the msi_bus files belonging 353or disabled (0). If 0 is found in any of the msi_bus files belonging
355to bridges between the PCI root and the device, MSIs are disabled. 354to bridges between the PCI root and the device, MSIs are disabled.
356 355
diff --git a/Documentation/PCI/pci.txt b/Documentation/PCI/pci.txt
index 6148d4080f88..aa09e5476bba 100644
--- a/Documentation/PCI/pci.txt
+++ b/Documentation/PCI/pci.txt
@@ -314,7 +314,7 @@ from the PCI device config space. Use the values in the pci_dev structure
314as the PCI "bus address" might have been remapped to a "host physical" 314as the PCI "bus address" might have been remapped to a "host physical"
315address by the arch/chip-set specific kernel support. 315address by the arch/chip-set specific kernel support.
316 316
317See Documentation/IO-mapping.txt for how to access device registers 317See Documentation/io-mapping.txt for how to access device registers
318or device memory. 318or device memory.
319 319
320The device driver needs to call pci_request_region() to verify 320The device driver needs to call pci_request_region() to verify
diff --git a/Documentation/RCU/NMI-RCU.txt b/Documentation/RCU/NMI-RCU.txt
index bf82851a0e57..687777f83b23 100644
--- a/Documentation/RCU/NMI-RCU.txt
+++ b/Documentation/RCU/NMI-RCU.txt
@@ -95,7 +95,7 @@ not to return until all ongoing NMI handlers exit. It is therefore safe
95to free up the handler's data as soon as synchronize_sched() returns. 95to free up the handler's data as soon as synchronize_sched() returns.
96 96
97Important note: for this to work, the architecture in question must 97Important note: for this to work, the architecture in question must
98invoke irq_enter() and irq_exit() on NMI entry and exit, respectively. 98invoke nmi_enter() and nmi_exit() on NMI entry and exit, respectively.
99 99
100 100
101Answer to Quick Quiz 101Answer to Quick Quiz
diff --git a/Documentation/RCU/lockdep-splat.txt b/Documentation/RCU/lockdep-splat.txt
new file mode 100644
index 000000000000..bf9061142827
--- /dev/null
+++ b/Documentation/RCU/lockdep-splat.txt
@@ -0,0 +1,110 @@
1Lockdep-RCU was added to the Linux kernel in early 2010
2(http://lwn.net/Articles/371986/). This facility checks for some common
3misuses of the RCU API, most notably using one of the rcu_dereference()
4family to access an RCU-protected pointer without the proper protection.
5When such misuse is detected, an lockdep-RCU splat is emitted.
6
7The usual cause of a lockdep-RCU slat is someone accessing an
8RCU-protected data structure without either (1) being in the right kind of
9RCU read-side critical section or (2) holding the right update-side lock.
10This problem can therefore be serious: it might result in random memory
11overwriting or worse. There can of course be false positives, this
12being the real world and all that.
13
14So let's look at an example RCU lockdep splat from 3.0-rc5, one that
15has long since been fixed:
16
17===============================
18[ INFO: suspicious RCU usage. ]
19-------------------------------
20block/cfq-iosched.c:2776 suspicious rcu_dereference_protected() usage!
21
22other info that might help us debug this:
23
24
25rcu_scheduler_active = 1, debug_locks = 0
263 locks held by scsi_scan_6/1552:
27 #0: (&shost->scan_mutex){+.+.+.}, at: [<ffffffff8145efca>]
28scsi_scan_host_selected+0x5a/0x150
29 #1: (&eq->sysfs_lock){+.+...}, at: [<ffffffff812a5032>]
30elevator_exit+0x22/0x60
31 #2: (&(&q->__queue_lock)->rlock){-.-...}, at: [<ffffffff812b6233>]
32cfq_exit_queue+0x43/0x190
33
34stack backtrace:
35Pid: 1552, comm: scsi_scan_6 Not tainted 3.0.0-rc5 #17
36Call Trace:
37 [<ffffffff810abb9b>] lockdep_rcu_dereference+0xbb/0xc0
38 [<ffffffff812b6139>] __cfq_exit_single_io_context+0xe9/0x120
39 [<ffffffff812b626c>] cfq_exit_queue+0x7c/0x190
40 [<ffffffff812a5046>] elevator_exit+0x36/0x60
41 [<ffffffff812a802a>] blk_cleanup_queue+0x4a/0x60
42 [<ffffffff8145cc09>] scsi_free_queue+0x9/0x10
43 [<ffffffff81460944>] __scsi_remove_device+0x84/0xd0
44 [<ffffffff8145dca3>] scsi_probe_and_add_lun+0x353/0xb10
45 [<ffffffff817da069>] ? error_exit+0x29/0xb0
46 [<ffffffff817d98ed>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
47 [<ffffffff8145e722>] __scsi_scan_target+0x112/0x680
48 [<ffffffff812c690d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
49 [<ffffffff817da069>] ? error_exit+0x29/0xb0
50 [<ffffffff812bcc60>] ? kobject_del+0x40/0x40
51 [<ffffffff8145ed16>] scsi_scan_channel+0x86/0xb0
52 [<ffffffff8145f0b0>] scsi_scan_host_selected+0x140/0x150
53 [<ffffffff8145f149>] do_scsi_scan_host+0x89/0x90
54 [<ffffffff8145f170>] do_scan_async+0x20/0x160
55 [<ffffffff8145f150>] ? do_scsi_scan_host+0x90/0x90
56 [<ffffffff810975b6>] kthread+0xa6/0xb0
57 [<ffffffff817db154>] kernel_thread_helper+0x4/0x10
58 [<ffffffff81066430>] ? finish_task_switch+0x80/0x110
59 [<ffffffff817d9c04>] ? retint_restore_args+0xe/0xe
60 [<ffffffff81097510>] ? __init_kthread_worker+0x70/0x70
61 [<ffffffff817db150>] ? gs_change+0xb/0xb
62
63Line 2776 of block/cfq-iosched.c in v3.0-rc5 is as follows:
64
65 if (rcu_dereference(ioc->ioc_data) == cic) {
66
67This form says that it must be in a plain vanilla RCU read-side critical
68section, but the "other info" list above shows that this is not the
69case. Instead, we hold three locks, one of which might be RCU related.
70And maybe that lock really does protect this reference. If so, the fix
71is to inform RCU, perhaps by changing __cfq_exit_single_io_context() to
72take the struct request_queue "q" from cfq_exit_queue() as an argument,
73which would permit us to invoke rcu_dereference_protected as follows:
74
75 if (rcu_dereference_protected(ioc->ioc_data,
76 lockdep_is_held(&q->queue_lock)) == cic) {
77
78With this change, there would be no lockdep-RCU splat emitted if this
79code was invoked either from within an RCU read-side critical section
80or with the ->queue_lock held. In particular, this would have suppressed
81the above lockdep-RCU splat because ->queue_lock is held (see #2 in the
82list above).
83
84On the other hand, perhaps we really do need an RCU read-side critical
85section. In this case, the critical section must span the use of the
86return value from rcu_dereference(), or at least until there is some
87reference count incremented or some such. One way to handle this is to
88add rcu_read_lock() and rcu_read_unlock() as follows:
89
90 rcu_read_lock();
91 if (rcu_dereference(ioc->ioc_data) == cic) {
92 spin_lock(&ioc->lock);
93 rcu_assign_pointer(ioc->ioc_data, NULL);
94 spin_unlock(&ioc->lock);
95 }
96 rcu_read_unlock();
97
98With this change, the rcu_dereference() is always within an RCU
99read-side critical section, which again would have suppressed the
100above lockdep-RCU splat.
101
102But in this particular case, we don't actually deference the pointer
103returned from rcu_dereference(). Instead, that pointer is just compared
104to the cic pointer, which means that the rcu_dereference() can be replaced
105by rcu_access_pointer() as follows:
106
107 if (rcu_access_pointer(ioc->ioc_data) == cic) {
108
109Because it is legal to invoke rcu_access_pointer() without protection,
110this change would also suppress the above lockdep-RCU splat.
diff --git a/Documentation/RCU/lockdep.txt b/Documentation/RCU/lockdep.txt
index d7a49b2f6994..a102d4b3724b 100644
--- a/Documentation/RCU/lockdep.txt
+++ b/Documentation/RCU/lockdep.txt
@@ -32,9 +32,27 @@ checking of rcu_dereference() primitives:
32 srcu_dereference(p, sp): 32 srcu_dereference(p, sp):
33 Check for SRCU read-side critical section. 33 Check for SRCU read-side critical section.
34 rcu_dereference_check(p, c): 34 rcu_dereference_check(p, c):
35 Use explicit check expression "c". This is useful in 35 Use explicit check expression "c" along with
36 code that is invoked by both readers and updaters. 36 rcu_read_lock_held(). This is useful in code that is
37 rcu_dereference_raw(p) 37 invoked by both RCU readers and updaters.
38 rcu_dereference_bh_check(p, c):
39 Use explicit check expression "c" along with
40 rcu_read_lock_bh_held(). This is useful in code that
41 is invoked by both RCU-bh readers and updaters.
42 rcu_dereference_sched_check(p, c):
43 Use explicit check expression "c" along with
44 rcu_read_lock_sched_held(). This is useful in code that
45 is invoked by both RCU-sched readers and updaters.
46 srcu_dereference_check(p, c):
47 Use explicit check expression "c" along with
48 srcu_read_lock_held()(). This is useful in code that
49 is invoked by both SRCU readers and updaters.
50 rcu_dereference_index_check(p, c):
51 Use explicit check expression "c", but the caller
52 must supply one of the rcu_read_lock_held() functions.
53 This is useful in code that uses RCU-protected arrays
54 that is invoked by both RCU readers and updaters.
55 rcu_dereference_raw(p):
38 Don't check. (Use sparingly, if at all.) 56 Don't check. (Use sparingly, if at all.)
39 rcu_dereference_protected(p, c): 57 rcu_dereference_protected(p, c):
40 Use explicit check expression "c", and omit all barriers 58 Use explicit check expression "c", and omit all barriers
@@ -48,13 +66,11 @@ checking of rcu_dereference() primitives:
48 value of the pointer itself, for example, against NULL. 66 value of the pointer itself, for example, against NULL.
49 67
50The rcu_dereference_check() check expression can be any boolean 68The rcu_dereference_check() check expression can be any boolean
51expression, but would normally include one of the rcu_read_lock_held() 69expression, but would normally include a lockdep expression. However,
52family of functions and a lockdep expression. However, any boolean 70any boolean expression can be used. For a moderately ornate example,
53expression can be used. For a moderately ornate example, consider 71consider the following:
54the following:
55 72
56 file = rcu_dereference_check(fdt->fd[fd], 73 file = rcu_dereference_check(fdt->fd[fd],
57 rcu_read_lock_held() ||
58 lockdep_is_held(&files->file_lock) || 74 lockdep_is_held(&files->file_lock) ||
59 atomic_read(&files->count) == 1); 75 atomic_read(&files->count) == 1);
60 76
@@ -62,7 +78,7 @@ This expression picks up the pointer "fdt->fd[fd]" in an RCU-safe manner,
62and, if CONFIG_PROVE_RCU is configured, verifies that this expression 78and, if CONFIG_PROVE_RCU is configured, verifies that this expression
63is used in: 79is used in:
64 80
651. An RCU read-side critical section, or 811. An RCU read-side critical section (implicit), or
662. with files->file_lock held, or 822. with files->file_lock held, or
673. on an unshared files_struct. 833. on an unshared files_struct.
68 84
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 5d9016795fd8..783d6c134d3f 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -42,7 +42,7 @@ fqs_holdoff Holdoff time (in microseconds) between consecutive calls
42fqs_stutter Wait time (in seconds) between consecutive bursts 42fqs_stutter Wait time (in seconds) between consecutive bursts
43 of calls to force_quiescent_state(). 43 of calls to force_quiescent_state().
44 44
45irqreaders Says to invoke RCU readers from irq level. This is currently 45irqreader Says to invoke RCU readers from irq level. This is currently
46 done via timers. Defaults to "1" for variants of RCU that 46 done via timers. Defaults to "1" for variants of RCU that
47 permit this. (Or, more accurately, variants of RCU that do 47 permit this. (Or, more accurately, variants of RCU that do
48 -not- permit this know to ignore this variable.) 48 -not- permit this know to ignore this variable.)
@@ -79,19 +79,68 @@ stutter The length of time to run the test before pausing for this
79 Specifying "stutter=0" causes the test to run continuously 79 Specifying "stutter=0" causes the test to run continuously
80 without pausing, which is the old default behavior. 80 without pausing, which is the old default behavior.
81 81
82test_boost Whether or not to test the ability of RCU to do priority
83 boosting. Defaults to "test_boost=1", which performs
84 RCU priority-inversion testing only if the selected
85 RCU implementation supports priority boosting. Specifying
86 "test_boost=0" never performs RCU priority-inversion
87 testing. Specifying "test_boost=2" performs RCU
88 priority-inversion testing even if the selected RCU
89 implementation does not support RCU priority boosting,
90 which can be used to test rcutorture's ability to
91 carry out RCU priority-inversion testing.
92
93test_boost_interval
94 The number of seconds in an RCU priority-inversion test
95 cycle. Defaults to "test_boost_interval=7". It is
96 usually wise for this value to be relatively prime to
97 the value selected for "stutter".
98
99test_boost_duration
100 The number of seconds to do RCU priority-inversion testing
101 within any given "test_boost_interval". Defaults to
102 "test_boost_duration=4".
103
82test_no_idle_hz Whether or not to test the ability of RCU to operate in 104test_no_idle_hz Whether or not to test the ability of RCU to operate in
83 a kernel that disables the scheduling-clock interrupt to 105 a kernel that disables the scheduling-clock interrupt to
84 idle CPUs. Boolean parameter, "1" to test, "0" otherwise. 106 idle CPUs. Boolean parameter, "1" to test, "0" otherwise.
85 Defaults to omitting this test. 107 Defaults to omitting this test.
86 108
87torture_type The type of RCU to test: "rcu" for the rcu_read_lock() API, 109torture_type The type of RCU to test, with string values as follows:
88 "rcu_sync" for rcu_read_lock() with synchronous reclamation, 110
89 "rcu_bh" for the rcu_read_lock_bh() API, "rcu_bh_sync" for 111 "rcu": rcu_read_lock(), rcu_read_unlock() and call_rcu().
90 rcu_read_lock_bh() with synchronous reclamation, "srcu" for 112
91 the "srcu_read_lock()" API, "sched" for the use of 113 "rcu_sync": rcu_read_lock(), rcu_read_unlock(), and
92 preempt_disable() together with synchronize_sched(), 114 synchronize_rcu().
93 and "sched_expedited" for the use of preempt_disable() 115
94 with synchronize_sched_expedited(). 116 "rcu_expedited": rcu_read_lock(), rcu_read_unlock(), and
117 synchronize_rcu_expedited().
118
119 "rcu_bh": rcu_read_lock_bh(), rcu_read_unlock_bh(), and
120 call_rcu_bh().
121
122 "rcu_bh_sync": rcu_read_lock_bh(), rcu_read_unlock_bh(),
123 and synchronize_rcu_bh().
124
125 "rcu_bh_expedited": rcu_read_lock_bh(), rcu_read_unlock_bh(),
126 and synchronize_rcu_bh_expedited().
127
128 "srcu": srcu_read_lock(), srcu_read_unlock() and
129 synchronize_srcu().
130
131 "srcu_expedited": srcu_read_lock(), srcu_read_unlock() and
132 synchronize_srcu_expedited().
133
134 "sched": preempt_disable(), preempt_enable(), and
135 call_rcu_sched().
136
137 "sched_sync": preempt_disable(), preempt_enable(), and
138 synchronize_sched().
139
140 "sched_expedited": preempt_disable(), preempt_enable(), and
141 synchronize_sched_expedited().
142
143 Defaults to "rcu".
95 144
96verbose Enable debug printk()s. Default is disabled. 145verbose Enable debug printk()s. Default is disabled.
97 146
@@ -100,12 +149,12 @@ OUTPUT
100 149
101The statistics output is as follows: 150The statistics output is as follows:
102 151
103 rcu-torture: --- Start of test: nreaders=16 stat_interval=0 verbose=0 152 rcu-torture:--- Start of test: nreaders=16 nfakewriters=4 stat_interval=30 verbose=0 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4
104 rcu-torture: rtc: 0000000000000000 ver: 1916 tfle: 0 rta: 1916 rtaf: 0 rtf: 1915 153 rcu-torture: rtc: (null) ver: 155441 tfle: 0 rta: 155441 rtaf: 8884 rtf: 155440 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 3055767
105 rcu-torture: Reader Pipe: 1466408 9747 0 0 0 0 0 0 0 0 0 154 rcu-torture: Reader Pipe: 727860534 34213 0 0 0 0 0 0 0 0 0
106 rcu-torture: Reader Batch: 1464477 11678 0 0 0 0 0 0 0 0 155 rcu-torture: Reader Batch: 727877838 17003 0 0 0 0 0 0 0 0 0
107 rcu-torture: Free-Block Circulation: 1915 1915 1915 1915 1915 1915 1915 1915 1915 1915 0 156 rcu-torture: Free-Block Circulation: 155440 155440 155440 155440 155440 155440 155440 155440 155440 155440 0
108 rcu-torture: --- End of test 157 rcu-torture:--- End of test: SUCCESS: nreaders=16 nfakewriters=4 stat_interval=30 verbose=0 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4
109 158
110The command "dmesg | grep torture:" will extract this information on 159The command "dmesg | grep torture:" will extract this information on
111most systems. On more esoteric configurations, it may be necessary to 160most systems. On more esoteric configurations, it may be necessary to
@@ -113,26 +162,55 @@ use other commands to access the output of the printk()s used by
113the RCU torture test. The printk()s use KERN_ALERT, so they should 162the RCU torture test. The printk()s use KERN_ALERT, so they should
114be evident. ;-) 163be evident. ;-)
115 164
165The first and last lines show the rcutorture module parameters, and the
166last line shows either "SUCCESS" or "FAILURE", based on rcutorture's
167automatic determination as to whether RCU operated correctly.
168
116The entries are as follows: 169The entries are as follows:
117 170
118o "rtc": The hexadecimal address of the structure currently visible 171o "rtc": The hexadecimal address of the structure currently visible
119 to readers. 172 to readers.
120 173
121o "ver": The number of times since boot that the rcutw writer task 174o "ver": The number of times since boot that the RCU writer task
122 has changed the structure visible to readers. 175 has changed the structure visible to readers.
123 176
124o "tfle": If non-zero, indicates that the "torture freelist" 177o "tfle": If non-zero, indicates that the "torture freelist"
125 containing structure to be placed into the "rtc" area is empty. 178 containing structures to be placed into the "rtc" area is empty.
126 This condition is important, since it can fool you into thinking 179 This condition is important, since it can fool you into thinking
127 that RCU is working when it is not. :-/ 180 that RCU is working when it is not. :-/
128 181
129o "rta": Number of structures allocated from the torture freelist. 182o "rta": Number of structures allocated from the torture freelist.
130 183
131o "rtaf": Number of allocations from the torture freelist that have 184o "rtaf": Number of allocations from the torture freelist that have
132 failed due to the list being empty. 185 failed due to the list being empty. It is not unusual for this
186 to be non-zero, but it is bad for it to be a large fraction of
187 the value indicated by "rta".
133 188
134o "rtf": Number of frees into the torture freelist. 189o "rtf": Number of frees into the torture freelist.
135 190
191o "rtmbe": A non-zero value indicates that rcutorture believes that
192 rcu_assign_pointer() and rcu_dereference() are not working
193 correctly. This value should be zero.
194
195o "rtbke": rcutorture was unable to create the real-time kthreads
196 used to force RCU priority inversion. This value should be zero.
197
198o "rtbre": Although rcutorture successfully created the kthreads
199 used to force RCU priority inversion, it was unable to set them
200 to the real-time priority level of 1. This value should be zero.
201
202o "rtbf": The number of times that RCU priority boosting failed
203 to resolve RCU priority inversion.
204
205o "rtb": The number of times that rcutorture attempted to force
206 an RCU priority inversion condition. If you are testing RCU
207 priority boosting via the "test_boost" module parameter, this
208 value should be non-zero.
209
210o "nt": The number of times rcutorture ran RCU read-side code from
211 within a timer handler. This value should be non-zero only
212 if you specified the "irqreader" module parameter.
213
136o "Reader Pipe": Histogram of "ages" of structures seen by readers. 214o "Reader Pipe": Histogram of "ages" of structures seen by readers.
137 If any entries past the first two are non-zero, RCU is broken. 215 If any entries past the first two are non-zero, RCU is broken.
138 And rcutorture prints the error flag string "!!!" to make sure 216 And rcutorture prints the error flag string "!!!" to make sure
@@ -162,26 +240,15 @@ o "Free-Block Circulation": Shows the number of torture structures
162 somehow gets incremented farther than it should. 240 somehow gets incremented farther than it should.
163 241
164Different implementations of RCU can provide implementation-specific 242Different implementations of RCU can provide implementation-specific
165additional information. For example, SRCU provides the following: 243additional information. For example, SRCU provides the following
244additional line:
166 245
167 srcu-torture: rtc: f8cf46a8 ver: 355 tfle: 0 rta: 356 rtaf: 0 rtf: 346 rtmbe: 0
168 srcu-torture: Reader Pipe: 559738 939 0 0 0 0 0 0 0 0 0
169 srcu-torture: Reader Batch: 560434 243 0 0 0 0 0 0 0 0
170 srcu-torture: Free-Block Circulation: 355 354 353 352 351 350 349 348 347 346 0
171 srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1) 246 srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1)
172 247
173The first four lines are similar to those for RCU. The last line shows 248This line shows the per-CPU counter state. The numbers in parentheses are
174the per-CPU counter state. The numbers in parentheses are the values 249the values of the "old" and "current" counters for the corresponding CPU.
175of the "old" and "current" counters for the corresponding CPU. The 250The "idx" value maps the "old" and "current" values to the underlying
176"idx" value maps the "old" and "current" values to the underlying array, 251array, and is useful for debugging.
177and is useful for debugging.
178
179Similarly, sched_expedited RCU provides the following:
180
181 sched_expedited-torture: rtc: d0000000016c1880 ver: 1090796 tfle: 0 rta: 1090796 rtaf: 0 rtf: 1090787 rtmbe: 0 nt: 27713319
182 sched_expedited-torture: Reader Pipe: 12660320201 95875 0 0 0 0 0 0 0 0 0
183 sched_expedited-torture: Reader Batch: 12660424885 0 0 0 0 0 0 0 0 0 0
184 sched_expedited-torture: Free-Block Circulation: 1090795 1090795 1090794 1090793 1090792 1090791 1090790 1090789 1090788 1090787 0
185 252
186 253
187USAGE 254USAGE
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index 8173cec473aa..aaf65f6c6cd7 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -33,23 +33,23 @@ rcu/rcuboost:
33The output of "cat rcu/rcudata" looks as follows: 33The output of "cat rcu/rcudata" looks as follows:
34 34
35rcu_sched: 35rcu_sched:
36 0 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=545/1/0 df=50 of=0 ri=0 ql=163 qs=NRW. kt=0/W/0 ktl=ebc3 b=10 ci=153737 co=0 ca=0 36 0 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=545/1/0 df=50 of=0 ri=0 ql=163 qs=NRW. kt=0/W/0 ktl=ebc3 b=10 ci=153737 co=0 ca=0
37 1 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=967/1/0 df=58 of=0 ri=0 ql=634 qs=NRW. kt=0/W/1 ktl=58c b=10 ci=191037 co=0 ca=0 37 1 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=967/1/0 df=58 of=0 ri=0 ql=634 qs=NRW. kt=0/W/1 ktl=58c b=10 ci=191037 co=0 ca=0
38 2 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=1081/1/0 df=175 of=0 ri=0 ql=74 qs=N.W. kt=0/W/2 ktl=da94 b=10 ci=75991 co=0 ca=0 38 2 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1081/1/0 df=175 of=0 ri=0 ql=74 qs=N.W. kt=0/W/2 ktl=da94 b=10 ci=75991 co=0 ca=0
39 3 c=20942 g=20943 pq=1 pqc=20942 qp=1 dt=1846/0/0 df=404 of=0 ri=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=72261 co=0 ca=0 39 3 c=20942 g=20943 pq=1 pgp=20942 qp=1 dt=1846/0/0 df=404 of=0 ri=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=72261 co=0 ca=0
40 4 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=369/1/0 df=83 of=0 ri=0 ql=48 qs=N.W. kt=0/W/4 ktl=e0e7 b=10 ci=128365 co=0 ca=0 40 4 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=369/1/0 df=83 of=0 ri=0 ql=48 qs=N.W. kt=0/W/4 ktl=e0e7 b=10 ci=128365 co=0 ca=0
41 5 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=381/1/0 df=64 of=0 ri=0 ql=169 qs=NRW. kt=0/W/5 ktl=fb2f b=10 ci=164360 co=0 ca=0 41 5 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=381/1/0 df=64 of=0 ri=0 ql=169 qs=NRW. kt=0/W/5 ktl=fb2f b=10 ci=164360 co=0 ca=0
42 6 c=20972 g=20973 pq=1 pqc=20972 qp=0 dt=1037/1/0 df=183 of=0 ri=0 ql=62 qs=N.W. kt=0/W/6 ktl=d2ad b=10 ci=65663 co=0 ca=0 42 6 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1037/1/0 df=183 of=0 ri=0 ql=62 qs=N.W. kt=0/W/6 ktl=d2ad b=10 ci=65663 co=0 ca=0
43 7 c=20897 g=20897 pq=1 pqc=20896 qp=0 dt=1572/0/0 df=382 of=0 ri=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=75006 co=0 ca=0 43 7 c=20897 g=20897 pq=1 pgp=20896 qp=0 dt=1572/0/0 df=382 of=0 ri=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=75006 co=0 ca=0
44rcu_bh: 44rcu_bh:
45 0 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=545/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/0 ktl=ebc3 b=10 ci=0 co=0 ca=0 45 0 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=545/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/0 ktl=ebc3 b=10 ci=0 co=0 ca=0
46 1 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=967/1/0 df=3 of=0 ri=1 ql=0 qs=.... kt=0/W/1 ktl=58c b=10 ci=151 co=0 ca=0 46 1 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=967/1/0 df=3 of=0 ri=1 ql=0 qs=.... kt=0/W/1 ktl=58c b=10 ci=151 co=0 ca=0
47 2 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=1081/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/2 ktl=da94 b=10 ci=0 co=0 ca=0 47 2 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1081/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/2 ktl=da94 b=10 ci=0 co=0 ca=0
48 3 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=1846/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=0 co=0 ca=0 48 3 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1846/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=0 co=0 ca=0
49 4 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=369/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/4 ktl=e0e7 b=10 ci=0 co=0 ca=0 49 4 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=369/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/4 ktl=e0e7 b=10 ci=0 co=0 ca=0
50 5 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=381/1/0 df=4 of=0 ri=1 ql=0 qs=.... kt=0/W/5 ktl=fb2f b=10 ci=0 co=0 ca=0 50 5 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=381/1/0 df=4 of=0 ri=1 ql=0 qs=.... kt=0/W/5 ktl=fb2f b=10 ci=0 co=0 ca=0
51 6 c=1480 g=1480 pq=1 pqc=1479 qp=0 dt=1037/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/6 ktl=d2ad b=10 ci=0 co=0 ca=0 51 6 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1037/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/6 ktl=d2ad b=10 ci=0 co=0 ca=0
52 7 c=1474 g=1474 pq=1 pqc=1473 qp=0 dt=1572/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=0 co=0 ca=0 52 7 c=1474 g=1474 pq=1 pgp=1473 qp=0 dt=1572/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=0 co=0 ca=0
53 53
54The first section lists the rcu_data structures for rcu_sched, the second 54The first section lists the rcu_data structures for rcu_sched, the second
55for rcu_bh. Note that CONFIG_TREE_PREEMPT_RCU kernels will have an 55for rcu_bh. Note that CONFIG_TREE_PREEMPT_RCU kernels will have an
@@ -84,7 +84,7 @@ o "pq" indicates that this CPU has passed through a quiescent state
84 CPU has not yet reported that fact, (2) some other CPU has not 84 CPU has not yet reported that fact, (2) some other CPU has not
85 yet reported for this grace period, or (3) both. 85 yet reported for this grace period, or (3) both.
86 86
87o "pqc" indicates which grace period the last-observed quiescent 87o "pgp" indicates which grace period the last-observed quiescent
88 state for this CPU corresponds to. This is important for handling 88 state for this CPU corresponds to. This is important for handling
89 the race between CPU 0 reporting an extended dynticks-idle 89 the race between CPU 0 reporting an extended dynticks-idle
90 quiescent state for CPU 1 and CPU 1 suddenly waking up and 90 quiescent state for CPU 1 and CPU 1 suddenly waking up and
@@ -184,10 +184,14 @@ o "kt" is the per-CPU kernel-thread state. The digit preceding
184 The number after the final slash is the CPU that the kthread 184 The number after the final slash is the CPU that the kthread
185 is actually running on. 185 is actually running on.
186 186
187 This field is displayed only for CONFIG_RCU_BOOST kernels.
188
187o "ktl" is the low-order 16 bits (in hexadecimal) of the count of 189o "ktl" is the low-order 16 bits (in hexadecimal) of the count of
188 the number of times that this CPU's per-CPU kthread has gone 190 the number of times that this CPU's per-CPU kthread has gone
189 through its loop servicing invoke_rcu_cpu_kthread() requests. 191 through its loop servicing invoke_rcu_cpu_kthread() requests.
190 192
193 This field is displayed only for CONFIG_RCU_BOOST kernels.
194
191o "b" is the batch limit for this CPU. If more than this number 195o "b" is the batch limit for this CPU. If more than this number
192 of RCU callbacks is ready to invoke, then the remainder will 196 of RCU callbacks is ready to invoke, then the remainder will
193 be deferred. 197 be deferred.
diff --git a/Documentation/SubmittingDrivers b/Documentation/SubmittingDrivers
index 319baa8b60dd..36d16bbf72c6 100644
--- a/Documentation/SubmittingDrivers
+++ b/Documentation/SubmittingDrivers
@@ -130,7 +130,7 @@ Linux kernel master tree:
130 ftp.??.kernel.org:/pub/linux/kernel/... 130 ftp.??.kernel.org:/pub/linux/kernel/...
131 ?? == your country code, such as "us", "uk", "fr", etc. 131 ?? == your country code, such as "us", "uk", "fr", etc.
132 132
133 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git 133 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git
134 134
135Linux kernel mailing list: 135Linux kernel mailing list:
136 linux-kernel@vger.kernel.org 136 linux-kernel@vger.kernel.org
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 569f3532e138..4468ce24427c 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -303,7 +303,7 @@ patches that are being emailed around.
303 303
304The sign-off is a simple line at the end of the explanation for the 304The sign-off is a simple line at the end of the explanation for the
305patch, which certifies that you wrote it or otherwise have the right to 305patch, which certifies that you wrote it or otherwise have the right to
306pass it on as a open-source patch. The rules are pretty simple: if you 306pass it on as an open-source patch. The rules are pretty simple: if you
307can certify the below: 307can certify the below:
308 308
309 Developer's Certificate of Origin 1.1 309 Developer's Certificate of Origin 1.1
diff --git a/Documentation/blackfin/bfin-gpio-notes.txt b/Documentation/blackfin/bfin-gpio-notes.txt
index f731c1e56475..d36b01f778b9 100644
--- a/Documentation/blackfin/bfin-gpio-notes.txt
+++ b/Documentation/blackfin/bfin-gpio-notes.txt
@@ -1,5 +1,5 @@
1/* 1/*
2 * File: Documentation/blackfin/bfin-gpio-note.txt 2 * File: Documentation/blackfin/bfin-gpio-notes.txt
3 * Based on: 3 * Based on:
4 * Author: 4 * Author:
5 * 5 *
diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt
index c6d84cfd2f56..e418dc0a7086 100644
--- a/Documentation/block/biodoc.txt
+++ b/Documentation/block/biodoc.txt
@@ -186,7 +186,7 @@ a virtual address mapping (unlike the earlier scheme of virtual address
186do not have a corresponding kernel virtual address space mapping) and 186do not have a corresponding kernel virtual address space mapping) and
187low-memory pages. 187low-memory pages.
188 188
189Note: Please refer to Documentation/PCI/PCI-DMA-mapping.txt for a discussion 189Note: Please refer to Documentation/DMA-API-HOWTO.txt for a discussion
190on PCI high mem DMA aspects and mapping of scatter gather lists, and support 190on PCI high mem DMA aspects and mapping of scatter gather lists, and support
191for 64 bit PCI. 191for 64 bit PCI.
192 192
diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt
index e578feed6d81..6d670f570451 100644
--- a/Documentation/block/cfq-iosched.txt
+++ b/Documentation/block/cfq-iosched.txt
@@ -43,3 +43,74 @@ If one sets slice_idle=0 and if storage supports NCQ, CFQ internally switches
43to IOPS mode and starts providing fairness in terms of number of requests 43to IOPS mode and starts providing fairness in terms of number of requests
44dispatched. Note that this mode switching takes effect only for group 44dispatched. Note that this mode switching takes effect only for group
45scheduling. For non-cgroup users nothing should change. 45scheduling. For non-cgroup users nothing should change.
46
47CFQ IO scheduler Idling Theory
48===============================
49Idling on a queue is primarily about waiting for the next request to come
50on same queue after completion of a request. In this process CFQ will not
51dispatch requests from other cfq queues even if requests are pending there.
52
53The rationale behind idling is that it can cut down on number of seeks
54on rotational media. For example, if a process is doing dependent
55sequential reads (next read will come on only after completion of previous
56one), then not dispatching request from other queue should help as we
57did not move the disk head and kept on dispatching sequential IO from
58one queue.
59
60CFQ has following service trees and various queues are put on these trees.
61
62 sync-idle sync-noidle async
63
64All cfq queues doing synchronous sequential IO go on to sync-idle tree.
65On this tree we idle on each queue individually.
66
67All synchronous non-sequential queues go on sync-noidle tree. Also any
68request which are marked with REQ_NOIDLE go on this service tree. On this
69tree we do not idle on individual queues instead idle on the whole group
70of queues or the tree. So if there are 4 queues waiting for IO to dispatch
71we will idle only once last queue has dispatched the IO and there is
72no more IO on this service tree.
73
74All async writes go on async service tree. There is no idling on async
75queues.
76
77CFQ has some optimizations for SSDs and if it detects a non-rotational
78media which can support higher queue depth (multiple requests at in
79flight at a time), then it cuts down on idling of individual queues and
80all the queues move to sync-noidle tree and only tree idle remains. This
81tree idling provides isolation with buffered write queues on async tree.
82
83FAQ
84===
85Q1. Why to idle at all on queues marked with REQ_NOIDLE.
86
87A1. We only do tree idle (all queues on sync-noidle tree) on queues marked
88 with REQ_NOIDLE. This helps in providing isolation with all the sync-idle
89 queues. Otherwise in presence of many sequential readers, other
90 synchronous IO might not get fair share of disk.
91
92 For example, if there are 10 sequential readers doing IO and they get
93 100ms each. If a REQ_NOIDLE request comes in, it will be scheduled
94 roughly after 1 second. If after completion of REQ_NOIDLE request we
95 do not idle, and after a couple of milli seconds a another REQ_NOIDLE
96 request comes in, again it will be scheduled after 1second. Repeat it
97 and notice how a workload can lose its disk share and suffer due to
98 multiple sequential readers.
99
100 fsync can generate dependent IO where bunch of data is written in the
101 context of fsync, and later some journaling data is written. Journaling
102 data comes in only after fsync has finished its IO (atleast for ext4
103 that seemed to be the case). Now if one decides not to idle on fsync
104 thread due to REQ_NOIDLE, then next journaling write will not get
105 scheduled for another second. A process doing small fsync, will suffer
106 badly in presence of multiple sequential readers.
107
108 Hence doing tree idling on threads using REQ_NOIDLE flag on requests
109 provides isolation from multiple sequential readers and at the same
110 time we do not idle on individual threads.
111
112Q2. When to specify REQ_NOIDLE
113A2. I would think whenever one is doing synchronous write and not expecting
114 more writes to be dispatched from same context soon, should be able
115 to specify REQ_NOIDLE on writes and that probably should work well for
116 most of the cases.
diff --git a/Documentation/bus-virt-phys-mapping.txt b/Documentation/bus-virt-phys-mapping.txt
index 1b5aa10df845..2bc55ff3b4d1 100644
--- a/Documentation/bus-virt-phys-mapping.txt
+++ b/Documentation/bus-virt-phys-mapping.txt
@@ -1,6 +1,6 @@
1[ NOTE: The virt_to_bus() and bus_to_virt() functions have been 1[ NOTE: The virt_to_bus() and bus_to_virt() functions have been
2 superseded by the functionality provided by the PCI DMA interface 2 superseded by the functionality provided by the PCI DMA interface
3 (see Documentation/PCI/PCI-DMA-mapping.txt). They continue 3 (see Documentation/DMA-API-HOWTO.txt). They continue
4 to be documented below for historical purposes, but new code 4 to be documented below for historical purposes, but new code
5 must not use them. --davidm 00/12/12 ] 5 must not use them. --davidm 00/12/12 ]
6 6
diff --git a/Documentation/cdrom/packet-writing.txt b/Documentation/cdrom/packet-writing.txt
index 13c251d5add6..2834170d821e 100644
--- a/Documentation/cdrom/packet-writing.txt
+++ b/Documentation/cdrom/packet-writing.txt
@@ -109,7 +109,7 @@ this interface. (see http://tom.ist-im-web.de/download/pktcdvd )
109 109
110For a description of the sysfs interface look into the file: 110For a description of the sysfs interface look into the file:
111 111
112 Documentation/ABI/testing/sysfs-block-pktcdvd 112 Documentation/ABI/testing/sysfs-class-pktcdvd
113 113
114 114
115Using the pktcdvd debugfs interface 115Using the pktcdvd debugfs interface
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 6f3c598971fc..cc0ebc5241b3 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -380,7 +380,7 @@ will be charged as a new owner of it.
380 380
3815.2 stat file 3815.2 stat file
382 382
3835.2.1 memory.stat file includes following statistics 383memory.stat file includes following statistics
384 384
385# per-memory cgroup local status 385# per-memory cgroup local status
386cache - # of bytes of page cache memory. 386cache - # of bytes of page cache memory.
@@ -418,7 +418,6 @@ total_unevictable - sum of all children's "unevictable"
418 418
419# The following additional stats are dependent on CONFIG_DEBUG_VM. 419# The following additional stats are dependent on CONFIG_DEBUG_VM.
420 420
421inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
422recent_rotated_anon - VM internal parameter. (see mm/vmscan.c) 421recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
423recent_rotated_file - VM internal parameter. (see mm/vmscan.c) 422recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
424recent_scanned_anon - VM internal parameter. (see mm/vmscan.c) 423recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
@@ -438,89 +437,6 @@ Note:
438 file_mapped is accounted only when the memory cgroup is owner of page 437 file_mapped is accounted only when the memory cgroup is owner of page
439 cache.) 438 cache.)
440 439
4415.2.2 memory.vmscan_stat
442
443memory.vmscan_stat includes statistics information for memory scanning and
444freeing, reclaiming. The statistics shows memory scanning information since
445memory cgroup creation and can be reset to 0 by writing 0 as
446
447 #echo 0 > ../memory.vmscan_stat
448
449This file contains following statistics.
450
451[param]_[file_or_anon]_pages_by_[reason]_[under_heararchy]
452[param]_elapsed_ns_by_[reason]_[under_hierarchy]
453
454For example,
455
456 scanned_file_pages_by_limit indicates the number of scanned
457 file pages at vmscan.
458
459Now, 3 parameters are supported
460
461 scanned - the number of pages scanned by vmscan
462 rotated - the number of pages activated at vmscan
463 freed - the number of pages freed by vmscan
464
465If "rotated" is high against scanned/freed, the memcg seems busy.
466
467Now, 2 reason are supported
468
469 limit - the memory cgroup's limit
470 system - global memory pressure + softlimit
471 (global memory pressure not under softlimit is not handled now)
472
473When under_hierarchy is added in the tail, the number indicates the
474total memcg scan of its children and itself.
475
476elapsed_ns is a elapsed time in nanosecond. This may include sleep time
477and not indicates CPU usage. So, please take this as just showing
478latency.
479
480Here is an example.
481
482# cat /cgroup/memory/A/memory.vmscan_stat
483scanned_pages_by_limit 9471864
484scanned_anon_pages_by_limit 6640629
485scanned_file_pages_by_limit 2831235
486rotated_pages_by_limit 4243974
487rotated_anon_pages_by_limit 3971968
488rotated_file_pages_by_limit 272006
489freed_pages_by_limit 2318492
490freed_anon_pages_by_limit 962052
491freed_file_pages_by_limit 1356440
492elapsed_ns_by_limit 351386416101
493scanned_pages_by_system 0
494scanned_anon_pages_by_system 0
495scanned_file_pages_by_system 0
496rotated_pages_by_system 0
497rotated_anon_pages_by_system 0
498rotated_file_pages_by_system 0
499freed_pages_by_system 0
500freed_anon_pages_by_system 0
501freed_file_pages_by_system 0
502elapsed_ns_by_system 0
503scanned_pages_by_limit_under_hierarchy 9471864
504scanned_anon_pages_by_limit_under_hierarchy 6640629
505scanned_file_pages_by_limit_under_hierarchy 2831235
506rotated_pages_by_limit_under_hierarchy 4243974
507rotated_anon_pages_by_limit_under_hierarchy 3971968
508rotated_file_pages_by_limit_under_hierarchy 272006
509freed_pages_by_limit_under_hierarchy 2318492
510freed_anon_pages_by_limit_under_hierarchy 962052
511freed_file_pages_by_limit_under_hierarchy 1356440
512elapsed_ns_by_limit_under_hierarchy 351386416101
513scanned_pages_by_system_under_hierarchy 0
514scanned_anon_pages_by_system_under_hierarchy 0
515scanned_file_pages_by_system_under_hierarchy 0
516rotated_pages_by_system_under_hierarchy 0
517rotated_anon_pages_by_system_under_hierarchy 0
518rotated_file_pages_by_system_under_hierarchy 0
519freed_pages_by_system_under_hierarchy 0
520freed_anon_pages_by_system_under_hierarchy 0
521freed_file_pages_by_system_under_hierarchy 0
522elapsed_ns_by_system_under_hierarchy 0
523
5245.3 swappiness 4405.3 swappiness
525 441
526Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only. 442Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only.
diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt
index e74d0a2eb1cf..d221781dabaa 100644
--- a/Documentation/cpu-freq/governors.txt
+++ b/Documentation/cpu-freq/governors.txt
@@ -132,7 +132,7 @@ The sampling rate is limited by the HW transition latency:
132transition_latency * 100 132transition_latency * 100
133Or by kernel restrictions: 133Or by kernel restrictions:
134If CONFIG_NO_HZ is set, the limit is 10ms fixed. 134If CONFIG_NO_HZ is set, the limit is 10ms fixed.
135If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the 135If CONFIG_NO_HZ is not set or nohz=off boot parameter is used, the
136limits depend on the CONFIG_HZ option: 136limits depend on the CONFIG_HZ option:
137HZ=1000: min=20000us (20ms) 137HZ=1000: min=20000us (20ms)
138HZ=250: min=80000us (80ms) 138HZ=250: min=80000us (80ms)
diff --git a/Documentation/development-process/4.Coding b/Documentation/development-process/4.Coding
index 83f5f5b365a3..e3cb6a56653a 100644
--- a/Documentation/development-process/4.Coding
+++ b/Documentation/development-process/4.Coding
@@ -278,7 +278,7 @@ enabled, a configurable percentage of memory allocations will be made to
278fail; these failures can be restricted to a specific range of code. 278fail; these failures can be restricted to a specific range of code.
279Running with fault injection enabled allows the programmer to see how the 279Running with fault injection enabled allows the programmer to see how the
280code responds when things go badly. See 280code responds when things go badly. See
281Documentation/fault-injection/fault-injection.text for more information on 281Documentation/fault-injection/fault-injection.txt for more information on
282how to use this facility. 282how to use this facility.
283 283
284Other kinds of errors can be found with the "sparse" static analysis tool. 284Other kinds of errors can be found with the "sparse" static analysis tool.
diff --git a/Documentation/device-mapper/dm-log.txt b/Documentation/device-mapper/dm-log.txt
index 994dd75475a6..c155ac569c44 100644
--- a/Documentation/device-mapper/dm-log.txt
+++ b/Documentation/device-mapper/dm-log.txt
@@ -48,7 +48,7 @@ kernel and userspace, 'connector' is used as the interface for
48communication. 48communication.
49 49
50There are currently two userspace log implementations that leverage this 50There are currently two userspace log implementations that leverage this
51framework - "clustered_disk" and "clustered_core". These implementations 51framework - "clustered-disk" and "clustered-core". These implementations
52provide a cluster-coherent log for shared-storage. Device-mapper mirroring 52provide a cluster-coherent log for shared-storage. Device-mapper mirroring
53can be used in a shared-storage environment when the cluster log implementations 53can be used in a shared-storage environment when the cluster log implementations
54are employed. 54are employed.
diff --git a/Documentation/device-mapper/persistent-data.txt b/Documentation/device-mapper/persistent-data.txt
new file mode 100644
index 000000000000..0e5df9b04ad2
--- /dev/null
+++ b/Documentation/device-mapper/persistent-data.txt
@@ -0,0 +1,84 @@
1Introduction
2============
3
4The more-sophisticated device-mapper targets require complex metadata
5that is managed in kernel. In late 2010 we were seeing that various
6different targets were rolling their own data strutures, for example:
7
8- Mikulas Patocka's multisnap implementation
9- Heinz Mauelshagen's thin provisioning target
10- Another btree-based caching target posted to dm-devel
11- Another multi-snapshot target based on a design of Daniel Phillips
12
13Maintaining these data structures takes a lot of work, so if possible
14we'd like to reduce the number.
15
16The persistent-data library is an attempt to provide a re-usable
17framework for people who want to store metadata in device-mapper
18targets. It's currently used by the thin-provisioning target and an
19upcoming hierarchical storage target.
20
21Overview
22========
23
24The main documentation is in the header files which can all be found
25under drivers/md/persistent-data.
26
27The block manager
28-----------------
29
30dm-block-manager.[hc]
31
32This provides access to the data on disk in fixed sized-blocks. There
33is a read/write locking interface to prevent concurrent accesses, and
34keep data that is being used in the cache.
35
36Clients of persistent-data are unlikely to use this directly.
37
38The transaction manager
39-----------------------
40
41dm-transaction-manager.[hc]
42
43This restricts access to blocks and enforces copy-on-write semantics.
44The only way you can get hold of a writable block through the
45transaction manager is by shadowing an existing block (ie. doing
46copy-on-write) or allocating a fresh one. Shadowing is elided within
47the same transaction so performance is reasonable. The commit method
48ensures that all data is flushed before it writes the superblock.
49On power failure your metadata will be as it was when last committed.
50
51The Space Maps
52--------------
53
54dm-space-map.h
55dm-space-map-metadata.[hc]
56dm-space-map-disk.[hc]
57
58On-disk data structures that keep track of reference counts of blocks.
59Also acts as the allocator of new blocks. Currently two
60implementations: a simpler one for managing blocks on a different
61device (eg. thinly-provisioned data blocks); and one for managing
62the metadata space. The latter is complicated by the need to store
63its own data within the space it's managing.
64
65The data structures
66-------------------
67
68dm-btree.[hc]
69dm-btree-remove.c
70dm-btree-spine.c
71dm-btree-internal.h
72
73Currently there is only one data structure, a hierarchical btree.
74There are plans to add more. For example, something with an
75array-like interface would see a lot of use.
76
77The btree is 'hierarchical' in that you can define it to be composed
78of nested btrees, and take multiple keys. For example, the
79thin-provisioning target uses a btree with two levels of nesting.
80The first maps a device id to a mapping tree, and that in turn maps a
81virtual block to a physical block.
82
83Values stored in the btrees can have arbitrary size. Keys are always
8464bits, although nesting allows you to use multiple keys.
diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt
new file mode 100644
index 000000000000..801d9d1cf82b
--- /dev/null
+++ b/Documentation/device-mapper/thin-provisioning.txt
@@ -0,0 +1,285 @@
1Introduction
2============
3
4This document descibes a collection of device-mapper targets that
5between them implement thin-provisioning and snapshots.
6
7The main highlight of this implementation, compared to the previous
8implementation of snapshots, is that it allows many virtual devices to
9be stored on the same data volume. This simplifies administration and
10allows the sharing of data between volumes, thus reducing disk usage.
11
12Another significant feature is support for an arbitrary depth of
13recursive snapshots (snapshots of snapshots of snapshots ...). The
14previous implementation of snapshots did this by chaining together
15lookup tables, and so performance was O(depth). This new
16implementation uses a single data structure to avoid this degradation
17with depth. Fragmentation may still be an issue, however, in some
18scenarios.
19
20Metadata is stored on a separate device from data, giving the
21administrator some freedom, for example to:
22
23- Improve metadata resilience by storing metadata on a mirrored volume
24 but data on a non-mirrored one.
25
26- Improve performance by storing the metadata on SSD.
27
28Status
29======
30
31These targets are very much still in the EXPERIMENTAL state. Please
32do not yet rely on them in production. But do experiment and offer us
33feedback. Different use cases will have different performance
34characteristics, for example due to fragmentation of the data volume.
35
36If you find this software is not performing as expected please mail
37dm-devel@redhat.com with details and we'll try our best to improve
38things for you.
39
40Userspace tools for checking and repairing the metadata are under
41development.
42
43Cookbook
44========
45
46This section describes some quick recipes for using thin provisioning.
47They use the dmsetup program to control the device-mapper driver
48directly. End users will be advised to use a higher-level volume
49manager such as LVM2 once support has been added.
50
51Pool device
52-----------
53
54The pool device ties together the metadata volume and the data volume.
55It maps I/O linearly to the data volume and updates the metadata via
56two mechanisms:
57
58- Function calls from the thin targets
59
60- Device-mapper 'messages' from userspace which control the creation of new
61 virtual devices amongst other things.
62
63Setting up a fresh pool device
64------------------------------
65
66Setting up a pool device requires a valid metadata device, and a
67data device. If you do not have an existing metadata device you can
68make one by zeroing the first 4k to indicate empty metadata.
69
70 dd if=/dev/zero of=$metadata_dev bs=4096 count=1
71
72The amount of metadata you need will vary according to how many blocks
73are shared between thin devices (i.e. through snapshots). If you have
74less sharing than average you'll need a larger-than-average metadata device.
75
76As a guide, we suggest you calculate the number of bytes to use in the
77metadata device as 48 * $data_dev_size / $data_block_size but round it up
78to 2MB if the answer is smaller. The largest size supported is 16GB.
79
80If you're creating large numbers of snapshots which are recording large
81amounts of change, you may need find you need to increase this.
82
83Reloading a pool table
84----------------------
85
86You may reload a pool's table, indeed this is how the pool is resized
87if it runs out of space. (N.B. While specifying a different metadata
88device when reloading is not forbidden at the moment, things will go
89wrong if it does not route I/O to exactly the same on-disk location as
90previously.)
91
92Using an existing pool device
93-----------------------------
94
95 dmsetup create pool \
96 --table "0 20971520 thin-pool $metadata_dev $data_dev \
97 $data_block_size $low_water_mark"
98
99$data_block_size gives the smallest unit of disk space that can be
100allocated at a time expressed in units of 512-byte sectors. People
101primarily interested in thin provisioning may want to use a value such
102as 1024 (512KB). People doing lots of snapshotting may want a smaller value
103such as 128 (64KB). If you are not zeroing newly-allocated data,
104a larger $data_block_size in the region of 256000 (128MB) is suggested.
105$data_block_size must be the same for the lifetime of the
106metadata device.
107
108$low_water_mark is expressed in blocks of size $data_block_size. If
109free space on the data device drops below this level then a dm event
110will be triggered which a userspace daemon should catch allowing it to
111extend the pool device. Only one such event will be sent.
112Resuming a device with a new table itself triggers an event so the
113userspace daemon can use this to detect a situation where a new table
114already exceeds the threshold.
115
116Thin provisioning
117-----------------
118
119i) Creating a new thinly-provisioned volume.
120
121 To create a new thinly- provisioned volume you must send a message to an
122 active pool device, /dev/mapper/pool in this example.
123
124 dmsetup message /dev/mapper/pool 0 "create_thin 0"
125
126 Here '0' is an identifier for the volume, a 24-bit number. It's up
127 to the caller to allocate and manage these identifiers. If the
128 identifier is already in use, the message will fail with -EEXIST.
129
130ii) Using a thinly-provisioned volume.
131
132 Thinly-provisioned volumes are activated using the 'thin' target:
133
134 dmsetup create thin --table "0 2097152 thin /dev/mapper/pool 0"
135
136 The last parameter is the identifier for the thinp device.
137
138Internal snapshots
139------------------
140
141i) Creating an internal snapshot.
142
143 Snapshots are created with another message to the pool.
144
145 N.B. If the origin device that you wish to snapshot is active, you
146 must suspend it before creating the snapshot to avoid corruption.
147 This is NOT enforced at the moment, so please be careful!
148
149 dmsetup suspend /dev/mapper/thin
150 dmsetup message /dev/mapper/pool 0 "create_snap 1 0"
151 dmsetup resume /dev/mapper/thin
152
153 Here '1' is the identifier for the volume, a 24-bit number. '0' is the
154 identifier for the origin device.
155
156ii) Using an internal snapshot.
157
158 Once created, the user doesn't have to worry about any connection
159 between the origin and the snapshot. Indeed the snapshot is no
160 different from any other thinly-provisioned device and can be
161 snapshotted itself via the same method. It's perfectly legal to
162 have only one of them active, and there's no ordering requirement on
163 activating or removing them both. (This differs from conventional
164 device-mapper snapshots.)
165
166 Activate it exactly the same way as any other thinly-provisioned volume:
167
168 dmsetup create snap --table "0 2097152 thin /dev/mapper/pool 1"
169
170Deactivation
171------------
172
173All devices using a pool must be deactivated before the pool itself
174can be.
175
176 dmsetup remove thin
177 dmsetup remove snap
178 dmsetup remove pool
179
180Reference
181=========
182
183'thin-pool' target
184------------------
185
186i) Constructor
187
188 thin-pool <metadata dev> <data dev> <data block size (sectors)> \
189 <low water mark (blocks)> [<number of feature args> [<arg>]*]
190
191 Optional feature arguments:
192 - 'skip_block_zeroing': skips the zeroing of newly-provisioned blocks.
193
194 Data block size must be between 64KB (128 sectors) and 1GB
195 (2097152 sectors) inclusive.
196
197
198ii) Status
199
200 <transaction id> <used metadata blocks>/<total metadata blocks>
201 <used data blocks>/<total data blocks> <held metadata root>
202
203
204 transaction id:
205 A 64-bit number used by userspace to help synchronise with metadata
206 from volume managers.
207
208 used data blocks / total data blocks
209 If the number of free blocks drops below the pool's low water mark a
210 dm event will be sent to userspace. This event is edge-triggered and
211 it will occur only once after each resume so volume manager writers
212 should register for the event and then check the target's status.
213
214 held metadata root:
215 The location, in sectors, of the metadata root that has been
216 'held' for userspace read access. '-' indicates there is no
217 held root. This feature is not yet implemented so '-' is
218 always returned.
219
220iii) Messages
221
222 create_thin <dev id>
223
224 Create a new thinly-provisioned device.
225 <dev id> is an arbitrary unique 24-bit identifier chosen by
226 the caller.
227
228 create_snap <dev id> <origin id>
229
230 Create a new snapshot of another thinly-provisioned device.
231 <dev id> is an arbitrary unique 24-bit identifier chosen by
232 the caller.
233 <origin id> is the identifier of the thinly-provisioned device
234 of which the new device will be a snapshot.
235
236 delete <dev id>
237
238 Deletes a thin device. Irreversible.
239
240 trim <dev id> <new size in sectors>
241
242 Delete mappings from the end of a thin device. Irreversible.
243 You might want to use this if you're reducing the size of
244 your thinly-provisioned device. In many cases, due to the
245 sharing of blocks between devices, it is not possible to
246 determine in advance how much space 'trim' will release. (In
247 future a userspace tool might be able to perform this
248 calculation.)
249
250 set_transaction_id <current id> <new id>
251
252 Userland volume managers, such as LVM, need a way to
253 synchronise their external metadata with the internal metadata of the
254 pool target. The thin-pool target offers to store an
255 arbitrary 64-bit transaction id and return it on the target's
256 status line. To avoid races you must provide what you think
257 the current transaction id is when you change it with this
258 compare-and-swap message.
259
260'thin' target
261-------------
262
263i) Constructor
264
265 thin <pool dev> <dev id>
266
267 pool dev:
268 the thin-pool device, e.g. /dev/mapper/my_pool or 253:0
269
270 dev id:
271 the internal device identifier of the device to be
272 activated.
273
274The pool doesn't store any size against the thin devices. If you
275load a thin target that is smaller than you've been using previously,
276then you'll have no access to blocks mapped beyond the end. If you
277load a target that is bigger than before, then extra blocks will be
278provisioned as and when needed.
279
280If you wish to reduce the size of your thin device and potentially
281regain some space then send the 'trim' message to the pool.
282
283ii) Status
284
285 <nr mapped sectors> <highest mapped sector>
diff --git a/Documentation/devicetree/bindings/arm/calxeda.txt b/Documentation/devicetree/bindings/arm/calxeda.txt
new file mode 100644
index 000000000000..4755caaccba6
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/calxeda.txt
@@ -0,0 +1,8 @@
1Calxeda Highbank Platforms Device Tree Bindings
2-----------------------------------------------
3
4Boards with Calxeda Cortex-A9 based Highbank SOC shall have the following
5properties.
6
7Required root node properties:
8 - compatible = "calxeda,highbank";
diff --git a/Documentation/devicetree/bindings/arm/fsl.txt b/Documentation/devicetree/bindings/arm/fsl.txt
new file mode 100644
index 000000000000..c9848ad0e2e3
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/fsl.txt
@@ -0,0 +1,26 @@
1Freescale i.MX Platforms Device Tree Bindings
2-----------------------------------------------
3
4i.MX51 Babbage Board
5Required root node properties:
6 - compatible = "fsl,imx51-babbage", "fsl,imx51";
7
8i.MX53 Automotive Reference Design Board
9Required root node properties:
10 - compatible = "fsl,imx53-ard", "fsl,imx53";
11
12i.MX53 Evaluation Kit
13Required root node properties:
14 - compatible = "fsl,imx53-evk", "fsl,imx53";
15
16i.MX53 Quick Start Board
17Required root node properties:
18 - compatible = "fsl,imx53-qsb", "fsl,imx53";
19
20i.MX53 Smart Mobile Reference Design Board
21Required root node properties:
22 - compatible = "fsl,imx53-smd", "fsl,imx53";
23
24i.MX6 Quad SABRE Automotive Board
25Required root node properties:
26 - compatible = "fsl,imx6q-sabreauto", "fsl,imx6q";
diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
new file mode 100644
index 000000000000..52916b4aa1fe
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/gic.txt
@@ -0,0 +1,55 @@
1* ARM Generic Interrupt Controller
2
3ARM SMP cores are often associated with a GIC, providing per processor
4interrupts (PPI), shared processor interrupts (SPI) and software
5generated interrupts (SGI).
6
7Primary GIC is attached directly to the CPU and typically has PPIs and SGIs.
8Secondary GICs are cascaded into the upward interrupt controller and do not
9have PPIs or SGIs.
10
11Main node required properties:
12
13- compatible : should be one of:
14 "arm,cortex-a9-gic"
15 "arm,arm11mp-gic"
16- interrupt-controller : Identifies the node as an interrupt controller
17- #interrupt-cells : Specifies the number of cells needed to encode an
18 interrupt source. The type shall be a <u32> and the value shall be 3.
19
20 The 1st cell is the interrupt type; 0 for SPI interrupts, 1 for PPI
21 interrupts.
22
23 The 2nd cell contains the interrupt number for the interrupt type.
24 SPI interrupts are in the range [0-987]. PPI interrupts are in the
25 range [0-15].
26
27 The 3rd cell is the flags, encoded as follows:
28 bits[3:0] trigger type and level flags.
29 1 = low-to-high edge triggered
30 2 = high-to-low edge triggered
31 4 = active high level-sensitive
32 8 = active low level-sensitive
33 bits[15:8] PPI interrupt cpu mask. Each bit corresponds to each of
34 the 8 possible cpus attached to the GIC. A bit set to '1' indicated
35 the interrupt is wired to that CPU. Only valid for PPI interrupts.
36
37- reg : Specifies base physical address(s) and size of the GIC registers. The
38 first region is the GIC distributor register base and size. The 2nd region is
39 the GIC cpu interface register base and size.
40
41Optional
42- interrupts : Interrupt source of the parent interrupt controller. Only
43 present on secondary GICs.
44
45Example:
46
47 intc: interrupt-controller@fff11000 {
48 compatible = "arm,cortex-a9-gic";
49 #interrupt-cells = <3>;
50 #address-cells = <1>;
51 interrupt-controller;
52 reg = <0xfff11000 0x1000>,
53 <0xfff10100 0x100>;
54 };
55
diff --git a/Documentation/devicetree/bindings/arm/l2cc.txt b/Documentation/devicetree/bindings/arm/l2cc.txt
new file mode 100644
index 000000000000..7ca52161e7ab
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/l2cc.txt
@@ -0,0 +1,44 @@
1* ARM L2 Cache Controller
2
3ARM cores often have a separate level 2 cache controller. There are various
4implementations of the L2 cache controller with compatible programming models.
5The ARM L2 cache representation in the device tree should be done as follows:
6
7Required properties:
8
9- compatible : should be one of:
10 "arm,pl310-cache"
11 "arm,l220-cache"
12 "arm,l210-cache"
13- cache-unified : Specifies the cache is a unified cache.
14- cache-level : Should be set to 2 for a level 2 cache.
15- reg : Physical base address and size of cache controller's memory mapped
16 registers.
17
18Optional properties:
19
20- arm,data-latency : Cycles of latency for Data RAM accesses. Specifies 3 cells of
21 read, write and setup latencies. Minimum valid values are 1. Controllers
22 without setup latency control should use a value of 0.
23- arm,tag-latency : Cycles of latency for Tag RAM accesses. Specifies 3 cells of
24 read, write and setup latencies. Controllers without setup latency control
25 should use 0. Controllers without separate read and write Tag RAM latency
26 values should only use the first cell.
27- arm,dirty-latency : Cycles of latency for Dirty RAMs. This is a single cell.
28- arm,filter-ranges : <start length> Starting address and length of window to
29 filter. Addresses in the filter window are directed to the M1 port. Other
30 addresses will go to the M0 port.
31- interrupts : 1 combined interrupt.
32
33Example:
34
35L2: cache-controller {
36 compatible = "arm,pl310-cache";
37 reg = <0xfff12000 0x1000>;
38 arm,data-latency = <1 1 1>;
39 arm,tag-latency = <2 2 2>;
40 arm,filter-latency = <0x80000000 0x8000000>;
41 cache-unified;
42 cache-level = <2>;
43 interrupts = <45>;
44};
diff --git a/Documentation/devicetree/bindings/arm/omap/dsp.txt b/Documentation/devicetree/bindings/arm/omap/dsp.txt
new file mode 100644
index 000000000000..d3830a32ce08
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/omap/dsp.txt
@@ -0,0 +1,14 @@
1* TI - DSP (Digital Signal Processor)
2
3TI DSP included in OMAP SoC
4
5Required properties:
6- compatible : Should be "ti,omap3-c64" for OMAP3 & 4
7- ti,hwmods: "dsp"
8
9Examples:
10
11dsp {
12 compatible = "ti,omap3-c64";
13 ti,hwmods = "dsp";
14};
diff --git a/Documentation/devicetree/bindings/arm/omap/iva.txt b/Documentation/devicetree/bindings/arm/omap/iva.txt
new file mode 100644
index 000000000000..6d6295171358
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/omap/iva.txt
@@ -0,0 +1,19 @@
1* TI - IVA (Imaging and Video Accelerator) subsystem
2
3The IVA contain various audio, video or imaging HW accelerator
4depending of the version.
5
6Required properties:
7- compatible : Should be:
8 - "ti,ivahd" for OMAP4
9 - "ti,iva2.2" for OMAP3
10 - "ti,iva2.1" for OMAP2430
11 - "ti,iva1" for OMAP2420
12- ti,hwmods: "iva"
13
14Examples:
15
16iva {
17 compatible = "ti,ivahd", "ti,iva";
18 ti,hwmods = "iva";
19};
diff --git a/Documentation/devicetree/bindings/arm/omap/l3-noc.txt b/Documentation/devicetree/bindings/arm/omap/l3-noc.txt
new file mode 100644
index 000000000000..6888a5efc860
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/omap/l3-noc.txt
@@ -0,0 +1,19 @@
1* TI - L3 Network On Chip (NoC)
2
3This version is an implementation of the generic NoC IP
4provided by Arteris.
5
6Required properties:
7- compatible : Should be "ti,omap3-l3-smx" for OMAP3 family
8 Should be "ti,omap4-l3-noc" for OMAP4 family
9- ti,hwmods: "l3_main_1", ... One hwmod for each noc domain.
10
11Examples:
12
13ocp {
14 compatible = "ti,omap4-l3-noc", "simple-bus";
15 #address-cells = <1>;
16 #size-cells = <1>;
17 ranges;
18 ti,hwmods = "l3_main_1", "l3_main_2", "l3_main_3";
19};
diff --git a/Documentation/devicetree/bindings/arm/omap/mpu.txt b/Documentation/devicetree/bindings/arm/omap/mpu.txt
new file mode 100644
index 000000000000..1a5a42ce21bb
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/omap/mpu.txt
@@ -0,0 +1,27 @@
1* TI - MPU (Main Processor Unit) subsystem
2
3The MPU subsystem contain one or several ARM cores
4depending of the version.
5The MPU contain CPUs, GIC, L2 cache and a local PRCM.
6
7Required properties:
8- compatible : Should be "ti,omap3-mpu" for OMAP3
9 Should be "ti,omap4-mpu" for OMAP4
10- ti,hwmods: "mpu"
11
12Examples:
13
14- For an OMAP4 SMP system:
15
16mpu {
17 compatible = "ti,omap4-mpu";
18 ti,hwmods = "mpu";
19};
20
21
22- For an OMAP3 monocore system:
23
24mpu {
25 compatible = "ti,omap3-mpu";
26 ti,hwmods = "mpu";
27};
diff --git a/Documentation/devicetree/bindings/arm/omap/omap.txt b/Documentation/devicetree/bindings/arm/omap/omap.txt
new file mode 100644
index 000000000000..dbdab40ed3a6
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/omap/omap.txt
@@ -0,0 +1,43 @@
1* Texas Instruments OMAP
2
3OMAP is currently using a static file per SoC family to describe the
4IPs present in the SoC.
5On top of that an omap_device is created to extend the platform_device
6capabilities and to allow binding with one or several hwmods.
7The hwmods will contain all the information to build the device:
8adresse range, irq lines, dma lines, interconnect, PRCM register,
9clock domain, input clocks.
10For the moment just point to the existing hwmod, the next step will be
11to move data from hwmod to device-tree representation.
12
13
14Required properties:
15- compatible: Every devices present in OMAP SoC should be in the
16 form: "ti,XXX"
17- ti,hwmods: list of hwmod names (ascii strings), that comes from the OMAP
18 HW documentation, attached to a device. Must contain at least
19 one hwmod.
20
21Optional properties:
22- ti,no_idle_on_suspend: When present, it prevents the PM to idle the module
23 during suspend.
24
25
26Example:
27
28spinlock@1 {
29 compatible = "ti,omap4-spinlock";
30 ti,hwmods = "spinlock";
31};
32
33
34Boards:
35
36- OMAP3 BeagleBoard : Low cost community board
37 compatible = "ti,omap3-beagle", "ti,omap3"
38
39- OMAP4 SDP : Software Developement Board
40 compatible = "ti,omap4-sdp", "ti,omap4430"
41
42- OMAP4 PandaBoard : Low cost community board
43 compatible = "ti,omap4-panda", "ti,omap4430"
diff --git a/Documentation/devicetree/bindings/arm/picoxcell.txt b/Documentation/devicetree/bindings/arm/picoxcell.txt
new file mode 100644
index 000000000000..e75c0ef51e69
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/picoxcell.txt
@@ -0,0 +1,24 @@
1Picochip picoXcell device tree bindings.
2========================================
3
4Required root node properties:
5 - compatible:
6 - "picochip,pc7302-pc3x3" : PC7302 development board with PC3X3 device.
7 - "picochip,pc7302-pc3x2" : PC7302 development board with PC3X2 device.
8 - "picochip,pc3x3" : picoXcell PC3X3 device based board.
9 - "picochip,pc3x2" : picoXcell PC3X2 device based board.
10
11Timers required properties:
12 - compatible = "picochip,pc3x2-timer"
13 - interrupts : The single IRQ line for the timer.
14 - clock-freq : The frequency in HZ of the timer.
15 - reg : The register bank for the timer.
16
17Note: two timers are required - one for the scheduler clock and one for the
18event tick/NOHZ.
19
20VIC required properties:
21 - compatible = "arm,pl192-vic".
22 - interrupt-controller.
23 - reg : The register bank for the device.
24 - #interrupt-cells : Must be 1.
diff --git a/Documentation/devicetree/bindings/arm/primecell.txt b/Documentation/devicetree/bindings/arm/primecell.txt
index 1d5d7a870ec7..951ca46789d4 100644
--- a/Documentation/devicetree/bindings/arm/primecell.txt
+++ b/Documentation/devicetree/bindings/arm/primecell.txt
@@ -6,7 +6,9 @@ driver matching.
6 6
7Required properties: 7Required properties:
8 8
9- compatible : should be a specific value for peripheral and "arm,primecell" 9- compatible : should be a specific name for the peripheral and
10 "arm,primecell". The specific name will match the ARM
11 engineering name for the logic block in the form: "arm,pl???"
10 12
11Optional properties: 13Optional properties:
12 14
diff --git a/Documentation/devicetree/bindings/ata/calxeda-sata.txt b/Documentation/devicetree/bindings/ata/calxeda-sata.txt
new file mode 100644
index 000000000000..79caa5651f53
--- /dev/null
+++ b/Documentation/devicetree/bindings/ata/calxeda-sata.txt
@@ -0,0 +1,17 @@
1* Calxeda SATA Controller
2
3SATA nodes are defined to describe on-chip Serial ATA controllers.
4Each SATA controller should have its own node.
5
6Required properties:
7- compatible : compatible list, contains "calxeda,hb-ahci"
8- interrupts : <interrupt mapping for SATA IRQ>
9- reg : <registers mapping>
10
11Example:
12 sata@ffe08000 {
13 compatible = "calxeda,hb-ahci";
14 reg = <0xffe08000 0x1000>;
15 interrupts = <115>;
16 };
17
diff --git a/Documentation/devicetree/bindings/crypto/picochip-spacc.txt b/Documentation/devicetree/bindings/crypto/picochip-spacc.txt
new file mode 100644
index 000000000000..d8609ece1f4c
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/picochip-spacc.txt
@@ -0,0 +1,23 @@
1Picochip picoXcell SPAcc (Security Protocol Accelerator) bindings
2
3Picochip picoXcell devices contain crypto offload engines that may be used for
4IPSEC and femtocell layer 2 ciphering.
5
6Required properties:
7 - compatible : "picochip,spacc-ipsec" for the IPSEC offload engine
8 "picochip,spacc-l2" for the femtocell layer 2 ciphering engine.
9 - reg : Offset and length of the register set for this device
10 - interrupt-parent : The interrupt controller that controls the SPAcc
11 interrupt.
12 - interrupts : The interrupt line from the SPAcc.
13 - ref-clock : The input clock that drives the SPAcc.
14
15Example SPAcc node:
16
17spacc@10000 {
18 compatible = "picochip,spacc-ipsec";
19 reg = <0x100000 0x10000>;
20 interrupt-parent = <&vic0>;
21 interrupts = <24>;
22 ref-clock = <&ipsec_clk>, "ref";
23};
diff --git a/Documentation/devicetree/bindings/gpio/led.txt b/Documentation/devicetree/bindings/gpio/led.txt
index 064db928c3c1..141087cf3107 100644
--- a/Documentation/devicetree/bindings/gpio/led.txt
+++ b/Documentation/devicetree/bindings/gpio/led.txt
@@ -8,7 +8,7 @@ node's name represents the name of the corresponding LED.
8 8
9LED sub-node properties: 9LED sub-node properties:
10- gpios : Should specify the LED's GPIO, see "Specifying GPIO information 10- gpios : Should specify the LED's GPIO, see "Specifying GPIO information
11 for devices" in Documentation/powerpc/booting-without-of.txt. Active 11 for devices" in Documentation/devicetree/booting-without-of.txt. Active
12 low LEDs should be indicated using flags in the GPIO specifier. 12 low LEDs should be indicated using flags in the GPIO specifier.
13- label : (optional) The label for this LED. If omitted, the label is 13- label : (optional) The label for this LED. If omitted, the label is
14 taken from the node name (excluding the unit address). 14 taken from the node name (excluding the unit address).
diff --git a/Documentation/devicetree/bindings/gpio/pl061-gpio.txt b/Documentation/devicetree/bindings/gpio/pl061-gpio.txt
new file mode 100644
index 000000000000..a2c416bcbccc
--- /dev/null
+++ b/Documentation/devicetree/bindings/gpio/pl061-gpio.txt
@@ -0,0 +1,10 @@
1ARM PL061 GPIO controller
2
3Required properties:
4- compatible : "arm,pl061", "arm,primecell"
5- #gpio-cells : Should be two. The first cell is the pin number and the
6 second cell is used to specify optional parameters:
7 - bit 0 specifies polarity (0 for normal, 1 for inverted)
8- gpio-controller : Marks the device node as a GPIO controller.
9- interrupts : Interrupt mapping for GPIO IRQ.
10
diff --git a/Documentation/devicetree/bindings/i2c/fsl-imx-i2c.txt b/Documentation/devicetree/bindings/i2c/fsl-imx-i2c.txt
new file mode 100644
index 000000000000..f3cf43b66f7e
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/fsl-imx-i2c.txt
@@ -0,0 +1,25 @@
1* Freescale Inter IC (I2C) and High Speed Inter IC (HS-I2C) for i.MX
2
3Required properties:
4- compatible : Should be "fsl,<chip>-i2c"
5- reg : Should contain I2C/HS-I2C registers location and length
6- interrupts : Should contain I2C/HS-I2C interrupt
7
8Optional properties:
9- clock-frequency : Constains desired I2C/HS-I2C bus clock frequency in Hz.
10 The absence of the propoerty indicates the default frequency 100 kHz.
11
12Examples:
13
14i2c@83fc4000 { /* I2C2 on i.MX51 */
15 compatible = "fsl,imx51-i2c", "fsl,imx1-i2c";
16 reg = <0x83fc4000 0x4000>;
17 interrupts = <63>;
18};
19
20i2c@70038000 { /* HS-I2C on i.MX51 */
21 compatible = "fsl,imx51-i2c", "fsl,imx1-i2c";
22 reg = <0x70038000 0x4000>;
23 interrupts = <64>;
24 clock-frequency = <400000>;
25};
diff --git a/Documentation/devicetree/bindings/i2c/samsung-i2c.txt b/Documentation/devicetree/bindings/i2c/samsung-i2c.txt
new file mode 100644
index 000000000000..38832c712919
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/samsung-i2c.txt
@@ -0,0 +1,39 @@
1* Samsung's I2C controller
2
3The Samsung's I2C controller is used to interface with I2C devices.
4
5Required properties:
6 - compatible: value should be either of the following.
7 (a) "samsung, s3c2410-i2c", for i2c compatible with s3c2410 i2c.
8 (b) "samsung, s3c2440-i2c", for i2c compatible with s3c2440 i2c.
9 - reg: physical base address of the controller and length of memory mapped
10 region.
11 - interrupts: interrupt number to the cpu.
12 - samsung,i2c-sda-delay: Delay (in ns) applied to data line (SDA) edges.
13 - gpios: The order of the gpios should be the following: <SDA, SCL>.
14 The gpio specifier depends on the gpio controller.
15
16Optional properties:
17 - samsung,i2c-slave-addr: Slave address in multi-master enviroment. If not
18 specified, default value is 0.
19 - samsung,i2c-max-bus-freq: Desired frequency in Hz of the bus. If not
20 specified, the default value in Hz is 100000.
21
22Example:
23
24 i2c@13870000 {
25 compatible = "samsung,s3c2440-i2c";
26 reg = <0x13870000 0x100>;
27 interrupts = <345>;
28 samsung,i2c-sda-delay = <100>;
29 samsung,i2c-max-bus-freq = <100000>;
30 gpios = <&gpd1 2 0 /* SDA */
31 &gpd1 3 0 /* SCL */>;
32 #address-cells = <1>;
33 #size-cells = <0>;
34
35 wm8994@1a {
36 compatible = "wlf,wm8994";
37 reg = <0x1a>;
38 };
39 };
diff --git a/Documentation/devicetree/bindings/mmc/nvidia-sdhci.txt b/Documentation/devicetree/bindings/mmc/nvidia-sdhci.txt
new file mode 100644
index 000000000000..7e51154679a6
--- /dev/null
+++ b/Documentation/devicetree/bindings/mmc/nvidia-sdhci.txt
@@ -0,0 +1,27 @@
1* NVIDIA Tegra Secure Digital Host Controller
2
3This controller on Tegra family SoCs provides an interface for MMC, SD,
4and SDIO types of memory cards.
5
6Required properties:
7- compatible : Should be "nvidia,<chip>-sdhci"
8- reg : Should contain SD/MMC registers location and length
9- interrupts : Should contain SD/MMC interrupt
10
11Optional properties:
12- cd-gpios : Specify GPIOs for card detection
13- wp-gpios : Specify GPIOs for write protection
14- power-gpios : Specify GPIOs for power control
15- support-8bit : Boolean, indicates if 8-bit mode should be used.
16
17Example:
18
19sdhci@c8000200 {
20 compatible = "nvidia,tegra20-sdhci";
21 reg = <0xc8000200 0x200>;
22 interrupts = <47>;
23 cd-gpios = <&gpio 69 0>; /* gpio PI5 */
24 wp-gpios = <&gpio 57 0>; /* gpio PH1 */
25 power-gpios = <&gpio 155 0>; /* gpio PT3 */
26 support-8bit;
27};
diff --git a/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt b/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt
index 1a729f089866..1ad80d5865a9 100644
--- a/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt
+++ b/Documentation/devicetree/bindings/net/can/fsl-flexcan.txt
@@ -1,61 +1,24 @@
1CAN Device Tree Bindings 1Flexcan CAN contoller on Freescale's ARM and PowerPC system-on-a-chip (SOC).
2------------------------
32011 Freescale Semiconductor, Inc.
4 2
5fsl,flexcan-v1.0 nodes 3Required properties:
6-----------------------
7In addition to the required compatible-, reg- and interrupt-properties, you can
8also specify which clock source shall be used for the controller.
9 4
10CPI Clock- Can Protocol Interface Clock 5- compatible : Should be "fsl,<processor>-flexcan"
11 This CLK_SRC bit of CTRL(control register) selects the clock source to
12 the CAN Protocol Interface(CPI) to be either the peripheral clock
13 (driven by the PLL) or the crystal oscillator clock. The selected clock
14 is the one fed to the prescaler to generate the Serial Clock (Sclock).
15 The PRESDIV field of CTRL(control register) controls a prescaler that
16 generates the Serial Clock (Sclock), whose period defines the
17 time quantum used to compose the CAN waveform.
18 6
19Can Engine Clock Source 7 An implementation should also claim any of the following compatibles
20 There are two sources for CAN clock 8 that it is fully backwards compatible with:
21 - Platform Clock It represents the bus clock
22 - Oscillator Clock
23 9
24 Peripheral Clock (PLL) 10 - fsl,p1010-flexcan
25 --------------
26 |
27 --------- -------------
28 | |CPI Clock | Prescaler | Sclock
29 | |---------------->| (1.. 256) |------------>
30 --------- -------------
31 | |
32 -------------- ---------------------CLK_SRC
33 Oscillator Clock
34 11
35- fsl,flexcan-clock-source : CAN Engine Clock Source.This property selects 12- reg : Offset and length of the register set for this device
36 the peripheral clock. PLL clock is fed to the 13- interrupts : Interrupt tuple for this device
37 prescaler to generate the Serial Clock (Sclock). 14- clock-frequency : The oscillator frequency driving the flexcan device
38 Valid values are "oscillator" and "platform"
39 "oscillator": CAN engine clock source is oscillator clock.
40 "platform" The CAN engine clock source is the bus clock
41 (platform clock).
42 15
43- fsl,flexcan-clock-divider : for the reference and system clock, an additional 16Example:
44 clock divider can be specified.
45- clock-frequency: frequency required to calculate the bitrate for FlexCAN.
46 17
47Note: 18 can@1c000 {
48 - v1.0 of flexcan-v1.0 represent the IP block version for P1010 SOC. 19 compatible = "fsl,p1010-flexcan";
49 - P1010 does not have oscillator as the Clock Source.So the default
50 Clock Source is platform clock.
51Examples:
52
53 can0@1c000 {
54 compatible = "fsl,flexcan-v1.0";
55 reg = <0x1c000 0x1000>; 20 reg = <0x1c000 0x1000>;
56 interrupts = <48 0x2>; 21 interrupts = <48 0x2>;
57 interrupt-parent = <&mpic>; 22 interrupt-parent = <&mpic>;
58 fsl,flexcan-clock-source = "platform"; 23 clock-frequency = <200000000>; // filled in by bootloader
59 fsl,flexcan-clock-divider = <2>;
60 clock-frequency = <fixed by u-boot>;
61 }; 24 };
diff --git a/Documentation/devicetree/bindings/net/smsc911x.txt b/Documentation/devicetree/bindings/net/smsc911x.txt
new file mode 100644
index 000000000000..adb5b5744ecd
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/smsc911x.txt
@@ -0,0 +1,38 @@
1* Smart Mixed-Signal Connectivity (SMSC) LAN911x/912x Controller
2
3Required properties:
4- compatible : Should be "smsc,lan<model>", "smsc,lan9115"
5- reg : Address and length of the io space for SMSC LAN
6- interrupts : Should contain SMSC LAN interrupt line
7- interrupt-parent : Should be the phandle for the interrupt controller
8 that services interrupts for this device
9- phy-mode : String, operation mode of the PHY interface.
10 Supported values are: "mii", "gmii", "sgmii", "tbi", "rmii",
11 "rgmii", "rgmii-id", "rgmii-rxid", "rgmii-txid", "rtbi", "smii".
12
13Optional properties:
14- reg-shift : Specify the quantity to shift the register offsets by
15- reg-io-width : Specify the size (in bytes) of the IO accesses that
16 should be performed on the device. Valid value for SMSC LAN is
17 2 or 4. If it's omitted or invalid, the size would be 2.
18- smsc,irq-active-high : Indicates the IRQ polarity is active-high
19- smsc,irq-push-pull : Indicates the IRQ type is push-pull
20- smsc,force-internal-phy : Forces SMSC LAN controller to use
21 internal PHY
22- smsc,force-external-phy : Forces SMSC LAN controller to use
23 external PHY
24- smsc,save-mac-address : Indicates that mac address needs to be saved
25 before resetting the controller
26- local-mac-address : 6 bytes, mac address
27
28Examples:
29
30lan9220@f4000000 {
31 compatible = "smsc,lan9220", "smsc,lan9115";
32 reg = <0xf4000000 0x2000000>;
33 phy-mode = "mii";
34 interrupt-parent = <&gpio1>;
35 interrupts = <31>;
36 reg-io-width = <4>;
37 smsc,irq-push-pull;
38};
diff --git a/Documentation/devicetree/bindings/pinmux/pinmux_nvidia.txt b/Documentation/devicetree/bindings/pinmux/pinmux_nvidia.txt
new file mode 100644
index 000000000000..36f82dbdd14d
--- /dev/null
+++ b/Documentation/devicetree/bindings/pinmux/pinmux_nvidia.txt
@@ -0,0 +1,5 @@
1NVIDIA Tegra 2 pinmux controller
2
3Required properties:
4- compatible : "nvidia,tegra20-pinmux"
5
diff --git a/Documentation/devicetree/bindings/serial/rs485.txt b/Documentation/devicetree/bindings/serial/rs485.txt
new file mode 100644
index 000000000000..1e753c69fc83
--- /dev/null
+++ b/Documentation/devicetree/bindings/serial/rs485.txt
@@ -0,0 +1,31 @@
1* RS485 serial communications
2
3The RTS signal is capable of automatically controlling line direction for
4the built-in half-duplex mode.
5The properties described hereafter shall be given to a half-duplex capable
6UART node.
7
8Required properties:
9- rs485-rts-delay: prop-encoded-array <a b> where:
10 * a is the delay beteween rts signal and beginning of data sent in milliseconds.
11 it corresponds to the delay before sending data.
12 * b is the delay between end of data sent and rts signal in milliseconds
13 it corresponds to the delay after sending data and actual release of the line.
14
15Optional properties:
16- linux,rs485-enabled-at-boot-time: empty property telling to enable the rs485
17 feature at boot time. It can be disabled later with proper ioctl.
18- rs485-rx-during-tx: empty property that enables the receiving of data even
19 whilst sending data.
20
21RS485 example for Atmel USART:
22 usart0: serial@fff8c000 {
23 compatible = "atmel,at91sam9260-usart";
24 reg = <0xfff8c000 0x4000>;
25 interrupts = <7>;
26 atmel,use-dma-rx;
27 atmel,use-dma-tx;
28 linux,rs485-enabled-at-boot-time;
29 rs485-rts-delay = <0 200>; // in milliseconds
30 };
31
diff --git a/Documentation/devicetree/bindings/spi/spi_pl022.txt b/Documentation/devicetree/bindings/spi/spi_pl022.txt
new file mode 100644
index 000000000000..306ec3ff3c0e
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/spi_pl022.txt
@@ -0,0 +1,12 @@
1ARM PL022 SPI controller
2
3Required properties:
4- compatible : "arm,pl022", "arm,primecell"
5- reg : Offset and length of the register set for the device
6- interrupts : Should contain SPI controller interrupt
7
8Optional properties:
9- cs-gpios : should specify GPIOs used for chipselects.
10 The gpios will be referred to as reg = <index> in the SPI child nodes.
11 If unspecified, a single SPI device without a chip select can be used.
12
diff --git a/Documentation/devicetree/bindings/tty/serial/atmel-usart.txt b/Documentation/devicetree/bindings/tty/serial/atmel-usart.txt
new file mode 100644
index 000000000000..a49d9a1d4ccf
--- /dev/null
+++ b/Documentation/devicetree/bindings/tty/serial/atmel-usart.txt
@@ -0,0 +1,27 @@
1* Atmel Universal Synchronous Asynchronous Receiver/Transmitter (USART)
2
3Required properties:
4- compatible: Should be "atmel,<chip>-usart"
5 The compatible <chip> indicated will be the first SoC to support an
6 additional mode or an USART new feature.
7- reg: Should contain registers location and length
8- interrupts: Should contain interrupt
9
10Optional properties:
11- atmel,use-dma-rx: use of PDC or DMA for receiving data
12- atmel,use-dma-tx: use of PDC or DMA for transmitting data
13
14<chip> compatible description:
15- at91rm9200: legacy USART support
16- at91sam9260: generic USART implementation for SAM9 SoCs
17
18Example:
19
20 usart0: serial@fff8c000 {
21 compatible = "atmel,at91sam9260-usart";
22 reg = <0xfff8c000 0x4000>;
23 interrupts = <7>;
24 atmel,use-dma-rx;
25 atmel,use-dma-tx;
26 };
27
diff --git a/Documentation/devicetree/bindings/tty/serial/msm_serial.txt b/Documentation/devicetree/bindings/tty/serial/msm_serial.txt
new file mode 100644
index 000000000000..aef383eb8876
--- /dev/null
+++ b/Documentation/devicetree/bindings/tty/serial/msm_serial.txt
@@ -0,0 +1,27 @@
1* Qualcomm MSM UART
2
3Required properties:
4- compatible :
5 - "qcom,msm-uart", and one of "qcom,msm-hsuart" or
6 "qcom,msm-lsuart".
7- reg : offset and length of the register set for the device
8 for the hsuart operating in compatible mode, there should be a
9 second pair describing the gsbi registers.
10- interrupts : should contain the uart interrupt.
11
12There are two different UART blocks used in MSM devices,
13"qcom,msm-hsuart" and "qcom,msm-lsuart". The msm-serial driver is
14able to handle both of these, and matches against the "qcom,msm-uart"
15as the compatibility.
16
17The registers for the "qcom,msm-hsuart" device need to specify both
18register blocks, even for the common driver.
19
20Example:
21
22 uart@19c400000 {
23 compatible = "qcom,msm-hsuart", "qcom,msm-uart";
24 reg = <0x19c40000 0x1000>,
25 <0x19c00000 0x1000>;
26 interrupts = <195>;
27 };
diff --git a/Documentation/devicetree/bindings/tty/serial/snps-dw-apb-uart.txt b/Documentation/devicetree/bindings/tty/serial/snps-dw-apb-uart.txt
new file mode 100644
index 000000000000..f13f1c5be91c
--- /dev/null
+++ b/Documentation/devicetree/bindings/tty/serial/snps-dw-apb-uart.txt
@@ -0,0 +1,25 @@
1* Synopsys DesignWare ABP UART
2
3Required properties:
4- compatible : "snps,dw-apb-uart"
5- reg : offset and length of the register set for the device.
6- interrupts : should contain uart interrupt.
7- clock-frequency : the input clock frequency for the UART.
8
9Optional properties:
10- reg-shift : quantity to shift the register offsets by. If this property is
11 not present then the register offsets are not shifted.
12- reg-io-width : the size (in bytes) of the IO accesses that should be
13 performed on the device. If this property is not present then single byte
14 accesses are used.
15
16Example:
17
18 uart@80230000 {
19 compatible = "snps,dw-apb-uart";
20 reg = <0x80230000 0x100>;
21 clock-frequency = <3686400>;
22 interrupts = <10>;
23 reg-shift = <2>;
24 reg-io-width = <4>;
25 };
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
new file mode 100644
index 000000000000..e8552782b440
--- /dev/null
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -0,0 +1,40 @@
1Device tree binding vendor prefix registry. Keep list in alphabetical order.
2
3This isn't an exhaustive list, but you should add new prefixes to it before
4using them to avoid name-space collisions.
5
6adi Analog Devices, Inc.
7amcc Applied Micro Circuits Corporation (APM, formally AMCC)
8apm Applied Micro Circuits Corporation (APM)
9arm ARM Ltd.
10atmel Atmel Corporation
11chrp Common Hardware Reference Platform
12dallas Maxim Integrated Products (formerly Dallas Semiconductor)
13denx Denx Software Engineering
14epson Seiko Epson Corp.
15est ESTeem Wireless Modems
16fsl Freescale Semiconductor
17GEFanuc GE Fanuc Intelligent Platforms Embedded Systems, Inc.
18gef GE Fanuc Intelligent Platforms Embedded Systems, Inc.
19hp Hewlett Packard
20ibm International Business Machines (IBM)
21idt Integrated Device Technologies, Inc.
22intercontrol Inter Control Group
23linux Linux-specific binding
24marvell Marvell Technology Group Ltd.
25maxim Maxim Integrated Products
26mosaixtech Mosaix Technologies, Inc.
27national National Semiconductor
28nintendo Nintendo
29nvidia NVIDIA
30nxp NXP Semiconductors
31powervr Imagination Technologies
32qcom Qualcomm, Inc.
33ramtron Ramtron International
34samsung Samsung Semiconductor
35schindler Schindler
36simtek
37sirf SiRF Technology, Inc.
38stericsson ST-Ericsson
39ti Texas Instruments
40xlnx Xilinx
diff --git a/Documentation/devicetree/bindings/virtio/mmio.txt b/Documentation/devicetree/bindings/virtio/mmio.txt
new file mode 100644
index 000000000000..5069c1b8e193
--- /dev/null
+++ b/Documentation/devicetree/bindings/virtio/mmio.txt
@@ -0,0 +1,17 @@
1* virtio memory mapped device
2
3See http://ozlabs.org/~rusty/virtio-spec/ for more details.
4
5Required properties:
6
7- compatible: "virtio,mmio" compatibility string
8- reg: control registers base address and size including configuration space
9- interrupts: interrupt generated by the device
10
11Example:
12
13 virtio_block@3000 {
14 compatible = "virtio,mmio";
15 reg = <0x3000 0x100>;
16 interrupts = <41>;
17 }
diff --git a/Documentation/driver-model/binding.txt b/Documentation/driver-model/binding.txt
index f7ec9d625bfc..abfc8e290d53 100644
--- a/Documentation/driver-model/binding.txt
+++ b/Documentation/driver-model/binding.txt
@@ -48,10 +48,6 @@ devclass_add_device is called to enumerate the device within the class
48and actually register it with the class, which happens with the 48and actually register it with the class, which happens with the
49class's register_dev callback. 49class's register_dev callback.
50 50
51NOTE: The device class structures and core routines to manipulate them
52are not in the mainline kernel, so the discussion is still a bit
53speculative.
54
55 51
56Driver 52Driver
57~~~~~~ 53~~~~~~
diff --git a/Documentation/driver-model/device.txt b/Documentation/driver-model/device.txt
index bdefe728a737..1e70220d20f4 100644
--- a/Documentation/driver-model/device.txt
+++ b/Documentation/driver-model/device.txt
@@ -45,33 +45,52 @@ struct device_attribute {
45 const char *buf, size_t count); 45 const char *buf, size_t count);
46}; 46};
47 47
48Attributes of devices can be exported via drivers using a simple 48Attributes of devices can be exported by a device driver through sysfs.
49procfs-like interface.
50 49
51Please see Documentation/filesystems/sysfs.txt for more information 50Please see Documentation/filesystems/sysfs.txt for more information
52on how sysfs works. 51on how sysfs works.
53 52
53As explained in Documentation/kobject.txt, device attributes must be be
54created before the KOBJ_ADD uevent is generated. The only way to realize
55that is by defining an attribute group.
56
54Attributes are declared using a macro called DEVICE_ATTR: 57Attributes are declared using a macro called DEVICE_ATTR:
55 58
56#define DEVICE_ATTR(name,mode,show,store) 59#define DEVICE_ATTR(name,mode,show,store)
57 60
58Example: 61Example:
59 62
60DEVICE_ATTR(power,0644,show_power,store_power); 63static DEVICE_ATTR(type, 0444, show_type, NULL);
64static DEVICE_ATTR(power, 0644, show_power, store_power);
61 65
62This declares a structure of type struct device_attribute named 66This declares two structures of type struct device_attribute with respective
63'dev_attr_power'. This can then be added and removed to the device's 67names 'dev_attr_type' and 'dev_attr_power'. These two attributes can be
64directory using: 68organized as follows into a group:
65 69
66int device_create_file(struct device *device, struct device_attribute * entry); 70static struct attribute *dev_attrs[] = {
67void device_remove_file(struct device * dev, struct device_attribute * attr); 71 &dev_attr_type.attr,
72 &dev_attr_power.attr,
73 NULL,
74};
68 75
69Example: 76static struct attribute_group dev_attr_group = {
77 .attrs = dev_attrs,
78};
79
80static const struct attribute_group *dev_attr_groups[] = {
81 &dev_attr_group,
82 NULL,
83};
84
85This array of groups can then be associated with a device by setting the
86group pointer in struct device before device_register() is invoked:
70 87
71device_create_file(dev,&dev_attr_power); 88 dev->groups = dev_attr_groups;
72device_remove_file(dev,&dev_attr_power); 89 device_register(dev);
73 90
74The file name will be 'power' with a mode of 0644 (-rw-r--r--). 91The device_register() function will use the 'groups' pointer to create the
92device attributes and the device_unregister() function will use this pointer
93to remove the device attributes.
75 94
76Word of warning: While the kernel allows device_create_file() and 95Word of warning: While the kernel allows device_create_file() and
77device_remove_file() to be called on a device at any time, userspace has 96device_remove_file() to be called on a device at any time, userspace has
@@ -84,24 +103,4 @@ not know about the new attributes.
84This is important for device driver that need to publish additional 103This is important for device driver that need to publish additional
85attributes for a device at driver probe time. If the device driver simply 104attributes for a device at driver probe time. If the device driver simply
86calls device_create_file() on the device structure passed to it, then 105calls device_create_file() on the device structure passed to it, then
87userspace will never be notified of the new attributes. Instead, it should 106userspace will never be notified of the new attributes.
88probably use class_create() and class->dev_attrs to set up a list of
89desired attributes in the modules_init function, and then in the .probe()
90hook, and then use device_create() to create a new device as a child
91of the probed device. The new device will generate a new uevent and
92properly advertise the new attributes to userspace.
93
94For example, if a driver wanted to add the following attributes:
95struct device_attribute mydriver_attribs[] = {
96 __ATTR(port_count, 0444, port_count_show),
97 __ATTR(serial_number, 0444, serial_number_show),
98 NULL
99};
100
101Then in the module init function is would do:
102 mydriver_class = class_create(THIS_MODULE, "my_attrs");
103 mydriver_class.dev_attr = mydriver_attribs;
104
105And assuming 'dev' is the struct device passed into the probe hook, the driver
106probe function would do something like:
107 device_create(&mydriver_class, dev, chrdev, &private_data, "my_name");
diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware
index c466f5831f15..e67be7afc78b 100755
--- a/Documentation/dvb/get_dvb_firmware
+++ b/Documentation/dvb/get_dvb_firmware
@@ -27,7 +27,8 @@ use IO::Handle;
27 "or51211", "or51132_qam", "or51132_vsb", "bluebird", 27 "or51211", "or51132_qam", "or51132_vsb", "bluebird",
28 "opera1", "cx231xx", "cx18", "cx23885", "pvrusb2", "mpc718", 28 "opera1", "cx231xx", "cx18", "cx23885", "pvrusb2", "mpc718",
29 "af9015", "ngene", "az6027", "lme2510_lg", "lme2510c_s7395", 29 "af9015", "ngene", "az6027", "lme2510_lg", "lme2510c_s7395",
30 "lme2510c_s7395_old", "drxk", "drxk_terratec_h5"); 30 "lme2510c_s7395_old", "drxk", "drxk_terratec_h5", "tda10071",
31 "it9135" );
31 32
32# Check args 33# Check args
33syntax() if (scalar(@ARGV) != 1); 34syntax() if (scalar(@ARGV) != 1);
@@ -575,19 +576,10 @@ sub ngene {
575} 576}
576 577
577sub az6027{ 578sub az6027{
578 my $file = "AZ6027_Linux_Driver.tar.gz";
579 my $url = "http://linux.terratec.de/files/$file";
580 my $firmware = "dvb-usb-az6027-03.fw"; 579 my $firmware = "dvb-usb-az6027-03.fw";
580 my $url = "http://linux.terratec.de/files/TERRATEC_S7/$firmware";
581 581
582 wgetfile($file, $url); 582 wgetfile($firmware, $url);
583
584 #untar
585 if( system("tar xzvf $file $firmware")){
586 die "failed to untar firmware";
587 }
588 if( system("rm $file")){
589 die ("unable to remove unnecessary files");
590 }
591 583
592 $firmware; 584 $firmware;
593} 585}
@@ -665,6 +657,41 @@ sub drxk_terratec_h5 {
665 "$fwfile" 657 "$fwfile"
666} 658}
667 659
660sub it9135 {
661 my $url = "http://kworld.server261.com/kworld/CD/ITE_TiVme/V1.00/";
662 my $zipfile = "Driver_V10.323.1.0412.100412.zip";
663 my $hash = "79b597dc648698ed6820845c0c9d0d37";
664 my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 0);
665 my $drvfile = "Driver_V10.323.1.0412.100412/Data/x86/IT9135BDA.sys";
666 my $fwfile = "dvb-usb-it9137-01.fw";
667
668 checkstandard();
669
670 wgetfile($zipfile, $url . $zipfile);
671 verify($zipfile, $hash);
672 unzip($zipfile, $tmpdir);
673 extract("$tmpdir/$drvfile", 69632, 5731, "$fwfile");
674
675 "$fwfile"
676}
677
678sub tda10071 {
679 my $sourcefile = "PCTV_460e_reference.zip";
680 my $url = "ftp://ftp.pctvsystems.com/TV/driver/PCTV%2070e%2080e%20100e%20320e%20330e%20800e/";
681 my $hash = "4403de903bf2593464c8d74bbc200a57";
682 my $fwfile = "dvb-fe-tda10071.fw";
683 my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
684
685 checkstandard();
686
687 wgetfile($sourcefile, $url . $sourcefile);
688 verify($sourcefile, $hash);
689 unzip($sourcefile, $tmpdir);
690 extract("$tmpdir/PCTV\ 70e\ 80e\ 100e\ 320e\ 330e\ 800e/32\ bit/emOEM.sys", 0x67d38, 40504, $fwfile);
691
692 "$fwfile";
693}
694
668# --------------------------------------------------------------- 695# ---------------------------------------------------------------
669# Utilities 696# Utilities
670 697
diff --git a/Documentation/dvb/it9137.txt b/Documentation/dvb/it9137.txt
new file mode 100644
index 000000000000..9e6726eead90
--- /dev/null
+++ b/Documentation/dvb/it9137.txt
@@ -0,0 +1,9 @@
1To extract firmware for Kworld UB499-2T (id 1b80:e409) you need to copy the
2following file(s) to this directory.
3
4IT9135BDA.sys Dated Mon 22 Mar 2010 02:20:08 GMT
5
6extract using dd
7dd if=IT9135BDA.sys ibs=1 skip=69632 count=5731 of=dvb-usb-it9137-01.fw
8
9copy to default firmware location.
diff --git a/Documentation/email-clients.txt b/Documentation/email-clients.txt
index a0b58e29f911..860c29a472ad 100644
--- a/Documentation/email-clients.txt
+++ b/Documentation/email-clients.txt
@@ -199,18 +199,16 @@ to coerce it into behaving.
199 199
200To beat some sense out of the internal editor, do this: 200To beat some sense out of the internal editor, do this:
201 201
202- Under account settings, composition and addressing, uncheck "Compose
203 messages in HTML format".
204
205- Edit your Thunderbird config settings so that it won't use format=flowed. 202- Edit your Thunderbird config settings so that it won't use format=flowed.
206 Go to "edit->preferences->advanced->config editor" to bring up the 203 Go to "edit->preferences->advanced->config editor" to bring up the
207 thunderbird's registry editor, and set "mailnews.send_plaintext_flowed" to 204 thunderbird's registry editor, and set "mailnews.send_plaintext_flowed" to
208 "false". 205 "false".
209 206
210- Enable "preformat" mode: Shft-click on the Write icon to bring up the HTML 207- Disable HTML Format: Set "mail.identity.id1.compose_html" to "false".
211 composer, select "Preformat" from the drop-down box just under the subject 208
212 line, then close the message without saving. (This setting also applies to 209- Enable "preformat" mode: Set "editor.quotesPreformatted" to "true".
213 the text composer, but the only control for it is in the HTML composer.) 210
211- Enable UTF8: Set "prefs.converted-to-utf8" to "true".
214 212
215- Install the "toggle wordwrap" extension. Download the file from: 213- Install the "toggle wordwrap" extension. Download the file from:
216 https://addons.mozilla.org/thunderbird/addon/2351/ 214 https://addons.mozilla.org/thunderbird/addon/2351/
diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
index 82a5d250d75e..ba4be8b77093 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.txt
@@ -21,6 +21,11 @@ o fail_make_request
21 /sys/block/<device>/make-it-fail or 21 /sys/block/<device>/make-it-fail or
22 /sys/block/<device>/<partition>/make-it-fail. (generic_make_request()) 22 /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())
23 23
24o fail_mmc_request
25
26 injects MMC data errors on devices permitted by setting
27 debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
28
24Configure fault-injection capabilities behavior 29Configure fault-injection capabilities behavior
25----------------------------------------------- 30-----------------------------------------------
26 31
@@ -115,7 +120,8 @@ use the boot option:
115 120
116 failslab= 121 failslab=
117 fail_page_alloc= 122 fail_page_alloc=
118 fail_make_request=<interval>,<probability>,<space>,<times> 123 fail_make_request=
124 mmc_core.fail_request=<interval>,<probability>,<space>,<times>
119 125
120How to add new fault injection capability 126How to add new fault injection capability
121----------------------------------------- 127-----------------------------------------
diff --git a/Documentation/fb/udlfb.txt b/Documentation/fb/udlfb.txt
index 7fdde2a02a27..57d2f2908b12 100644
--- a/Documentation/fb/udlfb.txt
+++ b/Documentation/fb/udlfb.txt
@@ -87,23 +87,38 @@ Special configuration for udlfb is usually unnecessary. There are a few
87options, however. 87options, however.
88 88
89From the command line, pass options to modprobe 89From the command line, pass options to modprobe
90modprobe udlfb defio=1 console=1 90modprobe udlfb fb_defio=0 console=1 shadow=1
91 91
92Or for permanent option, create file like /etc/modprobe.d/options with text 92Or modify options on the fly at /sys/module/udlfb/parameters directory via
93options udlfb defio=1 console=1 93sudo nano fb_defio
94change the parameter in place, and save the file.
94 95
95Accepted options: 96Unplug/replug USB device to apply with new settings
97
98Or for permanent option, create file like /etc/modprobe.d/udlfb.conf with text
99options udlfb fb_defio=0 console=1 shadow=1
100
101Accepted boolean options:
96 102
97fb_defio Make use of the fb_defio (CONFIG_FB_DEFERRED_IO) kernel 103fb_defio Make use of the fb_defio (CONFIG_FB_DEFERRED_IO) kernel
98 module to track changed areas of the framebuffer by page faults. 104 module to track changed areas of the framebuffer by page faults.
99 Standard fbdev applications that use mmap but that do not 105 Standard fbdev applications that use mmap but that do not
100 report damage, may be able to work with this enabled. 106 report damage, should be able to work with this enabled.
101 Disabled by default because of overhead and other issues. 107 Disable when running with X server that supports reporting
102 108 changed regions via ioctl, as this method is simpler,
103console Allow fbcon to attach to udlfb provided framebuffers. This 109 more stable, and higher performance.
104 is disabled by default because fbcon will aggressively consume 110 default: fb_defio=1
105 the first framebuffer it finds, which isn't usually what the 111
106 user wants in the case of USB displays. 112console Allow fbcon to attach to udlfb provided framebuffers.
113 Can be disabled if fbcon and other clients
114 (e.g. X with --shared-vt) are in conflict.
115 default: console=1
116
117shadow Allocate a 2nd framebuffer to shadow what's currently across
118 the USB bus in device memory. If any pixels are unchanged,
119 do not transmit. Spends host memory to save USB transfers.
120 Enabled by default. Only disable on very low memory systems.
121 default: shadow=1
107 122
108Sysfs Attributes 123Sysfs Attributes
109================ 124================
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index c4a6e148732a..3d849122b5b1 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -133,41 +133,6 @@ Who: Pavel Machek <pavel@ucw.cz>
133 133
134--------------------------- 134---------------------------
135 135
136What: sys_sysctl
137When: September 2010
138Option: CONFIG_SYSCTL_SYSCALL
139Why: The same information is available in a more convenient from
140 /proc/sys, and none of the sysctl variables appear to be
141 important performance wise.
142
143 Binary sysctls are a long standing source of subtle kernel
144 bugs and security issues.
145
146 When I looked several months ago all I could find after
147 searching several distributions were 5 user space programs and
148 glibc (which falls back to /proc/sys) using this syscall.
149
150 The man page for sysctl(2) documents it as unusable for user
151 space programs.
152
153 sysctl(2) is not generally ABI compatible to a 32bit user
154 space application on a 64bit and a 32bit kernel.
155
156 For the last several months the policy has been no new binary
157 sysctls and no one has put forward an argument to use them.
158
159 Binary sysctls issues seem to keep happening appearing so
160 properly deprecating them (with a warning to user space) and a
161 2 year grace warning period will mean eventually we can kill
162 them and end the pain.
163
164 In the mean time individual binary sysctls can be dealt with
165 in a piecewise fashion.
166
167Who: Eric Biederman <ebiederm@xmission.com>
168
169---------------------------
170
171What: /proc/<pid>/oom_adj 136What: /proc/<pid>/oom_adj
172When: August 2012 137When: August 2012
173Why: /proc/<pid>/oom_adj allows userspace to influence the oom killer's 138Why: /proc/<pid>/oom_adj allows userspace to influence the oom killer's
@@ -495,29 +460,6 @@ Who: Jean Delvare <khali@linux-fr.org>
495 460
496---------------------------- 461----------------------------
497 462
498What: Support for UVCIOC_CTRL_ADD in the uvcvideo driver
499When: 3.2
500Why: The information passed to the driver by this ioctl is now queried
501 dynamically from the device.
502Who: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
503
504----------------------------
505
506What: Support for UVCIOC_CTRL_MAP_OLD in the uvcvideo driver
507When: 3.2
508Why: Used only by applications compiled against older driver versions.
509 Superseded by UVCIOC_CTRL_MAP which supports V4L2 menu controls.
510Who: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
511
512----------------------------
513
514What: Support for UVCIOC_CTRL_GET and UVCIOC_CTRL_SET in the uvcvideo driver
515When: 3.2
516Why: Superseded by the UVCIOC_CTRL_QUERY ioctl.
517Who: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
518
519----------------------------
520
521What: Support for driver specific ioctls in the pwc driver (everything 463What: Support for driver specific ioctls in the pwc driver (everything
522 defined in media/pwc-ioctl.h) 464 defined in media/pwc-ioctl.h)
523When: 3.3 465When: 3.3
@@ -592,3 +534,20 @@ Why: In 3.0, we can now autodetect internal 3G device and already have
592 interface that was used by acer-wmi driver. It will replaced by 534 interface that was used by acer-wmi driver. It will replaced by
593 information log when acer-wmi initial. 535 information log when acer-wmi initial.
594Who: Lee, Chun-Yi <jlee@novell.com> 536Who: Lee, Chun-Yi <jlee@novell.com>
537
538----------------------------
539
540What: The XFS nodelaylog mount option
541When: 3.3
542Why: The delaylog mode that has been the default since 2.6.39 has proven
543 stable, and the old code is in the way of additional improvements in
544 the log code.
545Who: Christoph Hellwig <hch@lst.de>
546
547----------------------------
548
549What: iwlagn alias support
550When: 3.5
551Why: The iwlagn module has been renamed iwlwifi. The alias will be around
552 for backward compatibility for several cycles and then dropped.
553Who: Don Fry <donald.h.fry@intel.com>
diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index 13de64c7f0ab..2c0321442845 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -92,7 +92,7 @@ OPTIONS
92 92
93 wfdno=n the file descriptor for writing with trans=fd 93 wfdno=n the file descriptor for writing with trans=fd
94 94
95 maxdata=n the number of bytes to use for 9p packet payload (msize) 95 msize=n the number of bytes to use for 9p packet payload
96 96
97 port=n port to connect to on the remote server 97 port=n port to connect to on the remote server
98 98
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 653380793a6c..d819ba16a0c7 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -29,6 +29,7 @@ d_hash no no no maybe
29d_compare: yes no no maybe 29d_compare: yes no no maybe
30d_delete: no yes no no 30d_delete: no yes no no
31d_release: no no yes no 31d_release: no no yes no
32d_prune: no yes no no
32d_iput: no no yes no 33d_iput: no no yes no
33d_dname: no no no no 34d_dname: no no no no
34d_automount: no no yes no 35d_automount: no no yes no
diff --git a/Documentation/filesystems/befs.txt b/Documentation/filesystems/befs.txt
index 6e49c363938e..da45e6c842b8 100644
--- a/Documentation/filesystems/befs.txt
+++ b/Documentation/filesystems/befs.txt
@@ -27,7 +27,7 @@ His original code can still be found at:
27Does anyone know of a more current email address for Makoto? He doesn't 27Does anyone know of a more current email address for Makoto? He doesn't
28respond to the address given above... 28respond to the address given above...
29 29
30Current maintainer: Sergey S. Kostyliov <rathamahata@php4.ru> 30This filesystem doesn't have a maintainer.
31 31
32WHAT IS THIS DRIVER? 32WHAT IS THIS DRIVER?
33================== 33==================
diff --git a/Documentation/filesystems/caching/object.txt b/Documentation/filesystems/caching/object.txt
index e8b0a35d8fe5..58313348da87 100644
--- a/Documentation/filesystems/caching/object.txt
+++ b/Documentation/filesystems/caching/object.txt
@@ -127,9 +127,9 @@ fscache_enqueue_object()).
127PROVISION OF CPU TIME 127PROVISION OF CPU TIME
128--------------------- 128---------------------
129 129
130The work to be done by the various states is given CPU time by the threads of 130The work to be done by the various states was given CPU time by the threads of
131the slow work facility (see Documentation/slow-work.txt). This is used in 131the slow work facility. This was used in preference to the workqueue facility
132preference to the workqueue facility because: 132because:
133 133
134 (1) Threads may be completely occupied for very long periods of time by a 134 (1) Threads may be completely occupied for very long periods of time by a
135 particular work item. These state actions may be doing sequences of 135 particular work item. These state actions may be doing sequences of
diff --git a/Documentation/filesystems/ext3.txt b/Documentation/filesystems/ext3.txt
index 22f3a0eda1d2..b100adc38adb 100644
--- a/Documentation/filesystems/ext3.txt
+++ b/Documentation/filesystems/ext3.txt
@@ -73,14 +73,6 @@ nobarrier (*) This also requires an IO stack which can support
73 also be used to enable or disable barriers, for 73 also be used to enable or disable barriers, for
74 consistency with other ext3 mount options. 74 consistency with other ext3 mount options.
75 75
76orlov (*) This enables the new Orlov block allocator. It is
77 enabled by default.
78
79oldalloc This disables the Orlov block allocator and enables
80 the old block allocator. Orlov should have better
81 performance - we'd like to get some feedback if it's
82 the contrary for you.
83
84user_xattr Enables Extended User Attributes. Additionally, you 76user_xattr Enables Extended User Attributes. Additionally, you
85 need to have extended attribute support enabled in the 77 need to have extended attribute support enabled in the
86 kernel configuration (CONFIG_EXT3_FS_XATTR). See the 78 kernel configuration (CONFIG_EXT3_FS_XATTR). See the
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 232a575a0c48..4917cf24a5e0 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -160,7 +160,9 @@ noload if the filesystem was not unmounted cleanly,
160 lead to any number of problems. 160 lead to any number of problems.
161 161
162data=journal All data are committed into the journal prior to being 162data=journal All data are committed into the journal prior to being
163 written into the main file system. 163 written into the main file system. Enabling
164 this mode will disable delayed allocation and
165 O_DIRECT support.
164 166
165data=ordered (*) All data are forced directly out to the main file 167data=ordered (*) All data are forced directly out to the main file
166 system prior to its metadata being committed to the 168 system prior to its metadata being committed to the
@@ -201,30 +203,19 @@ inode_readahead_blks=n This tuning parameter controls the maximum
201 table readahead algorithm will pre-read into 203 table readahead algorithm will pre-read into
202 the buffer cache. The default value is 32 blocks. 204 the buffer cache. The default value is 32 blocks.
203 205
204orlov (*) This enables the new Orlov block allocator. It is 206nouser_xattr Disables Extended User Attributes. If you have extended
205 enabled by default. 207 attribute support enabled in the kernel configuration
206 208 (CONFIG_EXT4_FS_XATTR), extended attribute support
207oldalloc This disables the Orlov block allocator and enables 209 is enabled by default on mount. See the attr(5) manual
208 the old block allocator. Orlov should have better 210 page and http://acl.bestbits.at/ for more information
209 performance - we'd like to get some feedback if it's 211 about extended attributes.
210 the contrary for you.
211
212user_xattr Enables Extended User Attributes. Additionally, you
213 need to have extended attribute support enabled in the
214 kernel configuration (CONFIG_EXT4_FS_XATTR). See the
215 attr(5) manual page and http://acl.bestbits.at/ to
216 learn more about extended attributes.
217
218nouser_xattr Disables Extended User Attributes.
219
220acl Enables POSIX Access Control Lists support.
221 Additionally, you need to have ACL support enabled in
222 the kernel configuration (CONFIG_EXT4_FS_POSIX_ACL).
223 See the acl(5) manual page and http://acl.bestbits.at/
224 for more information.
225 212
226noacl This option disables POSIX Access Control List 213noacl This option disables POSIX Access Control List
227 support. 214 support. If ACL support is enabled in the kernel
215 configuration (CONFIG_EXT4_FS_POSIX_ACL), ACL is
216 enabled by default on mount. See the acl(5) manual
217 page and http://acl.bestbits.at/ for more information
218 about acl.
228 219
229bsddf (*) Make 'df' act like BSD. 220bsddf (*) Make 'df' act like BSD.
230minixdf Make 'df' act like Minix. 221minixdf Make 'df' act like Minix.
@@ -419,8 +410,8 @@ written to the journal first, and then to its final location.
419In the event of a crash, the journal can be replayed, bringing both data and 410In the event of a crash, the journal can be replayed, bringing both data and
420metadata into a consistent state. This mode is the slowest except when data 411metadata into a consistent state. This mode is the slowest except when data
421needs to be read from and written to disk at the same time where it 412needs to be read from and written to disk at the same time where it
422outperforms all others modes. Currently ext4 does not have delayed 413outperforms all others modes. Enabling this mode will disable delayed
423allocation support if this data journalling mode is selected. 414allocation and O_DIRECT support.
424 415
425/proc entries 416/proc entries
426============= 417=============
diff --git a/Documentation/filesystems/locks.txt b/Documentation/filesystems/locks.txt
index fab857accbd6..2cf81082581d 100644
--- a/Documentation/filesystems/locks.txt
+++ b/Documentation/filesystems/locks.txt
@@ -53,11 +53,12 @@ fcntl(), with all the problems that implies.
531.3 Mandatory Locking As A Mount Option 531.3 Mandatory Locking As A Mount Option
54--------------------------------------- 54---------------------------------------
55 55
56Mandatory locking, as described in 'Documentation/filesystems/mandatory.txt' 56Mandatory locking, as described in
57was prior to this release a general configuration option that was valid for 57'Documentation/filesystems/mandatory-locking.txt' was prior to this release a
58all mounted filesystems. This had a number of inherent dangers, not the 58general configuration option that was valid for all mounted filesystems. This
59least of which was the ability to freeze an NFS server by asking it to read 59had a number of inherent dangers, not the least of which was the ability to
60a file for which a mandatory lock existed. 60freeze an NFS server by asking it to read a file for which a mandatory lock
61existed.
61 62
62From this release of the kernel, mandatory locking can be turned on and off 63From this release of the kernel, mandatory locking can be turned on and off
63on a per-filesystem basis, using the mount options 'mand' and 'nomand'. 64on a per-filesystem basis, using the mount options 'mand' and 'nomand'.
diff --git a/Documentation/filesystems/nfs/idmapper.txt b/Documentation/filesystems/nfs/idmapper.txt
index 9c8fd6148656..120fd3cf7fd9 100644
--- a/Documentation/filesystems/nfs/idmapper.txt
+++ b/Documentation/filesystems/nfs/idmapper.txt
@@ -47,7 +47,7 @@ request-key will find the first matching line and corresponding program. In
47this case, /some/other/program will handle all uid lookups and 47this case, /some/other/program will handle all uid lookups and
48/usr/sbin/nfs.idmap will handle gid, user, and group lookups. 48/usr/sbin/nfs.idmap will handle gid, user, and group lookups.
49 49
50See <file:Documentation/security/keys-request-keys.txt> for more information 50See <file:Documentation/security/keys-request-key.txt> for more information
51about the request-key function. 51about the request-key function.
52 52
53 53
diff --git a/Documentation/filesystems/pohmelfs/design_notes.txt b/Documentation/filesystems/pohmelfs/design_notes.txt
index dcf833587162..8aef91335701 100644
--- a/Documentation/filesystems/pohmelfs/design_notes.txt
+++ b/Documentation/filesystems/pohmelfs/design_notes.txt
@@ -58,8 +58,9 @@ data transfers.
58POHMELFS clients operate with a working set of servers and are capable of balancing read-only 58POHMELFS clients operate with a working set of servers and are capable of balancing read-only
59operations (like lookups or directory listings) between them according to IO priorities. 59operations (like lookups or directory listings) between them according to IO priorities.
60Administrators can add or remove servers from the set at run-time via special commands (described 60Administrators can add or remove servers from the set at run-time via special commands (described
61in Documentation/pohmelfs/info.txt file). Writes are replicated to all servers, which are connected 61in Documentation/filesystems/pohmelfs/info.txt file). Writes are replicated to all servers, which
62with write permission turned on. IO priority and permissions can be changed in run-time. 62are connected with write permission turned on. IO priority and permissions can be changed in
63run-time.
63 64
64POHMELFS is capable of full data channel encryption and/or strong crypto hashing. 65POHMELFS is capable of full data channel encryption and/or strong crypto hashing.
65One can select any kernel supported cipher, encryption mode, hash type and operation mode 66One can select any kernel supported cipher, encryption mode, hash type and operation mode
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index db3b1aba32a3..0ec91f03422e 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1263,7 +1263,7 @@ review the kernel documentation in the directory /usr/src/linux/Documentation.
1263This chapter is heavily based on the documentation included in the pre 2.2 1263This chapter is heavily based on the documentation included in the pre 2.2
1264kernels, and became part of it in version 2.2.1 of the Linux kernel. 1264kernels, and became part of it in version 2.2.1 of the Linux kernel.
1265 1265
1266Please see: Documentation/sysctls/ directory for descriptions of these 1266Please see: Documentation/sysctl/ directory for descriptions of these
1267entries. 1267entries.
1268 1268
1269------------------------------------------------------------------------------ 1269------------------------------------------------------------------------------
diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt
index 597f728e7b4e..07235caec22c 100644
--- a/Documentation/filesystems/sysfs.txt
+++ b/Documentation/filesystems/sysfs.txt
@@ -4,7 +4,7 @@ sysfs - _The_ filesystem for exporting kernel objects.
4Patrick Mochel <mochel@osdl.org> 4Patrick Mochel <mochel@osdl.org>
5Mike Murphy <mamurph@cs.clemson.edu> 5Mike Murphy <mamurph@cs.clemson.edu>
6 6
7Revised: 15 July 2010 7Revised: 16 August 2011
8Original: 10 January 2003 8Original: 10 January 2003
9 9
10 10
@@ -370,3 +370,11 @@ int driver_create_file(struct device_driver *, const struct driver_attribute *);
370void driver_remove_file(struct device_driver *, const struct driver_attribute *); 370void driver_remove_file(struct device_driver *, const struct driver_attribute *);
371 371
372 372
373Documentation
374~~~~~~~~~~~~~
375
376The sysfs directory structure and the attributes in each directory define an
377ABI between the kernel and user space. As for any ABI, it is important that
378this ABI is stable and properly documented. All new sysfs attributes must be
379documented in Documentation/ABI. See also Documentation/ABI/README for more
380information.
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 52d8fb81cfff..43cbd0821721 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -1053,9 +1053,6 @@ manipulate dentries:
1053 and the dentry is returned. The caller must use dput() 1053 and the dentry is returned. The caller must use dput()
1054 to free the dentry when it finishes using it. 1054 to free the dentry when it finishes using it.
1055 1055
1056For further information on dentry locking, please refer to the document
1057Documentation/filesystems/dentry-locking.txt.
1058
1059Mount Options 1056Mount Options
1060============= 1057=============
1061 1058
diff --git a/Documentation/frv/booting.txt b/Documentation/frv/booting.txt
index 37c4d84a0e57..9bdf4b46e741 100644
--- a/Documentation/frv/booting.txt
+++ b/Documentation/frv/booting.txt
@@ -180,9 +180,3 @@ separated by spaces:
180 180
181 This tells the kernel what program to run initially. By default this is 181 This tells the kernel what program to run initially. By default this is
182 /sbin/init, but /sbin/sash or /bin/sh are common alternatives. 182 /sbin/init, but /sbin/sash or /bin/sh are common alternatives.
183
184 (*) vdc=...
185
186 This option configures the MB93493 companion chip visual display
187 driver. Please see Documentation/frv/mb93493/vdc.txt for more
188 information.
diff --git a/Documentation/hwmon/ad7314 b/Documentation/hwmon/ad7314
new file mode 100644
index 000000000000..1912549c7467
--- /dev/null
+++ b/Documentation/hwmon/ad7314
@@ -0,0 +1,25 @@
1Kernel driver ad7314
2====================
3
4Supported chips:
5 * Analog Devices AD7314
6 Prefix: 'ad7314'
7 Datasheet: Publicly available at Analog Devices website.
8 * Analog Devices ADT7301
9 Prefix: 'adt7301'
10 Datasheet: Publicly available at Analog Devices website.
11 * Analog Devices ADT7302
12 Prefix: 'adt7302'
13 Datasheet: Publicly available at Analog Devices website.
14
15Description
16-----------
17
18Driver supports the above parts. The ad7314 has a 10 bit
19sensor with 1lsb = 0.25 degrees centigrade. The adt7301 and
20adt7302 have 14 bit sensors with 1lsb = 0.03125 degrees centigrade.
21
22Notes
23-----
24
25Currently power down mode is not supported.
diff --git a/Documentation/hwmon/adm1275 b/Documentation/hwmon/adm1275
index 097b3ccc4be7..ab70d96d2dfd 100644
--- a/Documentation/hwmon/adm1275
+++ b/Documentation/hwmon/adm1275
@@ -6,6 +6,10 @@ Supported chips:
6 Prefix: 'adm1275' 6 Prefix: 'adm1275'
7 Addresses scanned: - 7 Addresses scanned: -
8 Datasheet: www.analog.com/static/imported-files/data_sheets/ADM1275.pdf 8 Datasheet: www.analog.com/static/imported-files/data_sheets/ADM1275.pdf
9 * Analog Devices ADM1276
10 Prefix: 'adm1276'
11 Addresses scanned: -
12 Datasheet: www.analog.com/static/imported-files/data_sheets/ADM1276.pdf
9 13
10Author: Guenter Roeck <guenter.roeck@ericsson.com> 14Author: Guenter Roeck <guenter.roeck@ericsson.com>
11 15
@@ -13,13 +17,13 @@ Author: Guenter Roeck <guenter.roeck@ericsson.com>
13Description 17Description
14----------- 18-----------
15 19
16This driver supports hardware montoring for Analog Devices ADM1275 Hot-Swap 20This driver supports hardware montoring for Analog Devices ADM1275 and ADM1276
17Controller and Digital Power Monitor. 21Hot-Swap Controller and Digital Power Monitor.
18 22
19The ADM1275 is a hot-swap controller that allows a circuit board to be removed 23ADM1275 and ADM1276 are hot-swap controllers that allow a circuit board to be
20from or inserted into a live backplane. It also features current and voltage 24removed from or inserted into a live backplane. They also feature current and
21readback via an integrated 12-bit analog-to-digital converter (ADC), accessed 25voltage readback via an integrated 12-bit analog-to-digital converter (ADC),
22using a PMBus. interface. 26accessed using a PMBus interface.
23 27
24The driver is a client driver to the core PMBus driver. Please see 28The driver is a client driver to the core PMBus driver. Please see
25Documentation/hwmon/pmbus for details on PMBus client drivers. 29Documentation/hwmon/pmbus for details on PMBus client drivers.
@@ -48,17 +52,25 @@ attributes are write-only, all other attributes are read-only.
48 52
49in1_label "vin1" or "vout1" depending on chip variant and 53in1_label "vin1" or "vout1" depending on chip variant and
50 configuration. 54 configuration.
51in1_input Measured voltage. From READ_VOUT register. 55in1_input Measured voltage.
52in1_min Minumum Voltage. From VOUT_UV_WARN_LIMIT register. 56in1_min Minumum Voltage.
53in1_max Maximum voltage. From VOUT_OV_WARN_LIMIT register. 57in1_max Maximum voltage.
54in1_min_alarm Voltage low alarm. From VOLTAGE_UV_WARNING status. 58in1_min_alarm Voltage low alarm.
55in1_max_alarm Voltage high alarm. From VOLTAGE_OV_WARNING status. 59in1_max_alarm Voltage high alarm.
56in1_highest Historical maximum voltage. 60in1_highest Historical maximum voltage.
57in1_reset_history Write any value to reset history. 61in1_reset_history Write any value to reset history.
58 62
59curr1_label "iout1" 63curr1_label "iout1"
60curr1_input Measured current. From READ_IOUT register. 64curr1_input Measured current.
61curr1_max Maximum current. From IOUT_OC_WARN_LIMIT register. 65curr1_max Maximum current.
62curr1_max_alarm Current high alarm. From IOUT_OC_WARN_LIMIT register. 66curr1_max_alarm Current high alarm.
67curr1_lcrit Critical minimum current. Depending on the chip
68 configuration, either curr1_lcrit or curr1_crit is
69 supported, but not both.
70curr1_lcrit_alarm Critical current low alarm.
71curr1_crit Critical maximum current. Depending on the chip
72 configuration, either curr1_lcrit or curr1_crit is
73 supported, but not both.
74curr1_crit_alarm Critical current high alarm.
63curr1_highest Historical maximum current. 75curr1_highest Historical maximum current.
64curr1_reset_history Write any value to reset history. 76curr1_reset_history Write any value to reset history.
diff --git a/Documentation/hwmon/coretemp b/Documentation/hwmon/coretemp
index fa8776ab9b18..84d46c0c71a3 100644
--- a/Documentation/hwmon/coretemp
+++ b/Documentation/hwmon/coretemp
@@ -35,13 +35,6 @@ the Out-Of-Spec bit. Following table summarizes the exported sysfs files:
35All Sysfs entries are named with their core_id (represented here by 'X'). 35All Sysfs entries are named with their core_id (represented here by 'X').
36tempX_input - Core temperature (in millidegrees Celsius). 36tempX_input - Core temperature (in millidegrees Celsius).
37tempX_max - All cooling devices should be turned on (on Core2). 37tempX_max - All cooling devices should be turned on (on Core2).
38 Initialized with IA32_THERM_INTERRUPT. When the CPU
39 temperature reaches this temperature, an interrupt is
40 generated and tempX_max_alarm is set.
41tempX_max_hyst - If the CPU temperature falls below than temperature,
42 an interrupt is generated and tempX_max_alarm is reset.
43tempX_max_alarm - Set if the temperature reaches or exceeds tempX_max.
44 Reset if the temperature drops to or below tempX_max_hyst.
45tempX_crit - Maximum junction temperature (in millidegrees Celsius). 38tempX_crit - Maximum junction temperature (in millidegrees Celsius).
46tempX_crit_alarm - Set when Out-of-spec bit is set, never clears. 39tempX_crit_alarm - Set when Out-of-spec bit is set, never clears.
47 Correct CPU operation is no longer guaranteed. 40 Correct CPU operation is no longer guaranteed.
@@ -49,9 +42,10 @@ tempX_label - Contains string "Core X", where X is processor
49 number. For Package temp, this will be "Physical id Y", 42 number. For Package temp, this will be "Physical id Y",
50 where Y is the package number. 43 where Y is the package number.
51 44
52The TjMax temperature is set to 85 degrees C if undocumented model specific 45On CPU models which support it, TjMax is read from a model-specific register.
53register (UMSR) 0xee has bit 30 set. If not the TjMax is 100 degrees C as 46On other models, it is set to an arbitrary value based on weak heuristics.
54(sometimes) documented in processor datasheet. 47If these heuristics don't work for you, you can pass the correct TjMax value
48as a module parameter (tjmax).
55 49
56Appendix A. Known TjMax lists (TBD): 50Appendix A. Known TjMax lists (TBD):
57Some information comes from ark.intel.com 51Some information comes from ark.intel.com
diff --git a/Documentation/hwmon/exynos4_tmu b/Documentation/hwmon/exynos4_tmu
new file mode 100644
index 000000000000..c3c6b41db607
--- /dev/null
+++ b/Documentation/hwmon/exynos4_tmu
@@ -0,0 +1,81 @@
1Kernel driver exynos4_tmu
2=================
3
4Supported chips:
5* ARM SAMSUNG EXYNOS4 series of SoC
6 Prefix: 'exynos4-tmu'
7 Datasheet: Not publicly available
8
9Authors: Donggeun Kim <dg77.kim@samsung.com>
10
11Description
12-----------
13
14This driver allows to read temperature inside SAMSUNG EXYNOS4 series of SoC.
15
16The chip only exposes the measured 8-bit temperature code value
17through a register.
18Temperature can be taken from the temperature code.
19There are three equations converting from temperature to temperature code.
20
21The three equations are:
22 1. Two point trimming
23 Tc = (T - 25) * (TI2 - TI1) / (85 - 25) + TI1
24
25 2. One point trimming
26 Tc = T + TI1 - 25
27
28 3. No trimming
29 Tc = T + 50
30
31 Tc: Temperature code, T: Temperature,
32 TI1: Trimming info for 25 degree Celsius (stored at TRIMINFO register)
33 Temperature code measured at 25 degree Celsius which is unchanged
34 TI2: Trimming info for 85 degree Celsius (stored at TRIMINFO register)
35 Temperature code measured at 85 degree Celsius which is unchanged
36
37TMU(Thermal Management Unit) in EXYNOS4 generates interrupt
38when temperature exceeds pre-defined levels.
39The maximum number of configurable threshold is four.
40The threshold levels are defined as follows:
41 Level_0: current temperature > trigger_level_0 + threshold
42 Level_1: current temperature > trigger_level_1 + threshold
43 Level_2: current temperature > trigger_level_2 + threshold
44 Level_3: current temperature > trigger_level_3 + threshold
45
46 The threshold and each trigger_level are set
47 through the corresponding registers.
48
49When an interrupt occurs, this driver notify user space of
50one of four threshold levels for the interrupt
51through kobject_uevent_env and sysfs_notify functions.
52Although an interrupt condition for level_0 can be set,
53it is not notified to user space through sysfs_notify function.
54
55Sysfs Interface
56---------------
57name name of the temperature sensor
58 RO
59
60temp1_input temperature
61 RO
62
63temp1_max temperature for level_1 interrupt
64 RO
65
66temp1_crit temperature for level_2 interrupt
67 RO
68
69temp1_emergency temperature for level_3 interrupt
70 RO
71
72temp1_max_alarm alarm for level_1 interrupt
73 RO
74
75temp1_crit_alarm
76 alarm for level_2 interrupt
77 RO
78
79temp1_emergency_alarm
80 alarm for level_3 interrupt
81 RO
diff --git a/Documentation/hwmon/lm75 b/Documentation/hwmon/lm75
index a1790401fdde..c91a1d15fa28 100644
--- a/Documentation/hwmon/lm75
+++ b/Documentation/hwmon/lm75
@@ -12,26 +12,46 @@ Supported chips:
12 Addresses scanned: I2C 0x48 - 0x4f 12 Addresses scanned: I2C 0x48 - 0x4f
13 Datasheet: Publicly available at the National Semiconductor website 13 Datasheet: Publicly available at the National Semiconductor website
14 http://www.national.com/ 14 http://www.national.com/
15 * Dallas Semiconductor DS75 15 * Dallas Semiconductor DS75, DS1775
16 Prefix: 'lm75' 16 Prefixes: 'ds75', 'ds1775'
17 Addresses scanned: I2C 0x48 - 0x4f 17 Addresses scanned: none
18 Datasheet: Publicly available at the Dallas Semiconductor website
19 http://www.maxim-ic.com/
20 * Dallas Semiconductor DS1775
21 Prefix: 'lm75'
22 Addresses scanned: I2C 0x48 - 0x4f
23 Datasheet: Publicly available at the Dallas Semiconductor website 18 Datasheet: Publicly available at the Dallas Semiconductor website
24 http://www.maxim-ic.com/ 19 http://www.maxim-ic.com/
25 * Maxim MAX6625, MAX6626 20 * Maxim MAX6625, MAX6626
26 Prefix: 'lm75' 21 Prefixes: 'max6625', 'max6626'
27 Addresses scanned: I2C 0x48 - 0x4b 22 Addresses scanned: none
28 Datasheet: Publicly available at the Maxim website 23 Datasheet: Publicly available at the Maxim website
29 http://www.maxim-ic.com/ 24 http://www.maxim-ic.com/
30 * Microchip (TelCom) TCN75 25 * Microchip (TelCom) TCN75
31 Prefix: 'lm75' 26 Prefix: 'lm75'
32 Addresses scanned: I2C 0x48 - 0x4f 27 Addresses scanned: none
28 Datasheet: Publicly available at the Microchip website
29 http://www.microchip.com/
30 * Microchip MCP9800, MCP9801, MCP9802, MCP9803
31 Prefix: 'mcp980x'
32 Addresses scanned: none
33 Datasheet: Publicly available at the Microchip website 33 Datasheet: Publicly available at the Microchip website
34 http://www.microchip.com/ 34 http://www.microchip.com/
35 * Analog Devices ADT75
36 Prefix: 'adt75'
37 Addresses scanned: none
38 Datasheet: Publicly available at the Analog Devices website
39 http://www.analog.com/adt75
40 * ST Microelectronics STDS75
41 Prefix: 'stds75'
42 Addresses scanned: none
43 Datasheet: Publicly available at the ST website
44 http://www.st.com/internet/analog/product/121769.jsp
45 * Texas Instruments TMP100, TMP101, TMP105, TMP75, TMP175, TMP275
46 Prefixes: 'tmp100', 'tmp101', 'tmp105', 'tmp175', 'tmp75', 'tmp275'
47 Addresses scanned: none
48 Datasheet: Publicly available at the Texas Instruments website
49 http://www.ti.com/product/tmp100
50 http://www.ti.com/product/tmp101
51 http://www.ti.com/product/tmp105
52 http://www.ti.com/product/tmp75
53 http://www.ti.com/product/tmp175
54 http://www.ti.com/product/tmp275
35 55
36Author: Frodo Looijaard <frodol@dds.nl> 56Author: Frodo Looijaard <frodol@dds.nl>
37 57
@@ -50,21 +70,16 @@ range of -55 to +125 degrees.
50The LM75 only updates its values each 1.5 seconds; reading it more often 70The LM75 only updates its values each 1.5 seconds; reading it more often
51will do no harm, but will return 'old' values. 71will do no harm, but will return 'old' values.
52 72
53The LM75 is usually used in combination with LM78-like chips, to measure 73The original LM75 was typically used in combination with LM78-like chips
54the temperature of the processor(s). 74on PC motherboards, to measure the temperature of the processor(s). Clones
55 75are now used in various embedded designs.
56The DS75, DS1775, MAX6625, and MAX6626 are supported as well.
57They are not distinguished from an LM75. While most of these chips
58have three additional bits of accuracy (12 vs. 9 for the LM75),
59the additional bits are not supported. Not only that, but these chips will
60not be detected if not in 9-bit precision mode (use the force parameter if
61needed).
62
63The TCN75 is supported as well, and is not distinguished from an LM75.
64 76
65The LM75 is essentially an industry standard; there may be other 77The LM75 is essentially an industry standard; there may be other
66LM75 clones not listed here, with or without various enhancements, 78LM75 clones not listed here, with or without various enhancements,
67that are supported. 79that are supported. The clones are not detected by the driver, unless
80they reproduce the exact register tricks of the original LM75, and must
81therefore be instantiated explicitly. The specific enhancements (such as
82higher resolution) are not currently supported by the driver.
68 83
69The LM77 is not supported, contrary to what we pretended for a long time. 84The LM77 is not supported, contrary to what we pretended for a long time.
70Both chips are simply not compatible, value encoding differs. 85Both chips are simply not compatible, value encoding differs.
diff --git a/Documentation/hwmon/ltc2978 b/Documentation/hwmon/ltc2978
new file mode 100644
index 000000000000..c365f9beb5dd
--- /dev/null
+++ b/Documentation/hwmon/ltc2978
@@ -0,0 +1,103 @@
1Kernel driver ltc2978
2=====================
3
4Supported chips:
5 * Linear Technology LTC2978
6 Prefix: 'ltc2978'
7 Addresses scanned: -
8 Datasheet: http://cds.linear.com/docs/Datasheet/2978fa.pdf
9 * Linear Technology LTC3880
10 Prefix: 'ltc3880'
11 Addresses scanned: -
12 Datasheet: http://cds.linear.com/docs/Datasheet/3880f.pdf
13
14Author: Guenter Roeck <guenter.roeck@ericsson.com>
15
16
17Description
18-----------
19
20The LTC2978 is an octal power supply monitor, supervisor, sequencer and
21margin controller. The LTC3880 is a dual, PolyPhase DC/DC synchronous
22step-down switching regulator controller.
23
24
25Usage Notes
26-----------
27
28This driver does not probe for PMBus devices. You will have to instantiate
29devices explicitly.
30
31Example: the following commands will load the driver for an LTC2978 at address
320x60 on I2C bus #1:
33
34# modprobe ltc2978
35# echo ltc2978 0x60 > /sys/bus/i2c/devices/i2c-1/new_device
36
37
38Sysfs attributes
39----------------
40
41in1_label "vin"
42in1_input Measured input voltage.
43in1_min Minimum input voltage.
44in1_max Maximum input voltage.
45in1_lcrit Critical minimum input voltage.
46in1_crit Critical maximum input voltage.
47in1_min_alarm Input voltage low alarm.
48in1_max_alarm Input voltage high alarm.
49in1_lcrit_alarm Input voltage critical low alarm.
50in1_crit_alarm Input voltage critical high alarm.
51in1_lowest Lowest input voltage. LTC2978 only.
52in1_highest Highest input voltage.
53in1_reset_history Reset history. Writing into this attribute will reset
54 history for all attributes.
55
56in[2-9]_label "vout[1-8]". Channels 3 to 9 on LTC2978 only.
57in[2-9]_input Measured output voltage.
58in[2-9]_min Minimum output voltage.
59in[2-9]_max Maximum output voltage.
60in[2-9]_lcrit Critical minimum output voltage.
61in[2-9]_crit Critical maximum output voltage.
62in[2-9]_min_alarm Output voltage low alarm.
63in[2-9]_max_alarm Output voltage high alarm.
64in[2-9]_lcrit_alarm Output voltage critical low alarm.
65in[2-9]_crit_alarm Output voltage critical high alarm.
66in[2-9]_lowest Lowest output voltage. LTC2978 only.
67in[2-9]_highest Lowest output voltage.
68in[2-9]_reset_history Reset history. Writing into this attribute will reset
69 history for all attributes.
70
71temp[1-3]_input Measured temperature.
72 On LTC2978, only one temperature measurement is
73 supported and reflects the internal temperature.
74 On LTC3880, temp1 and temp2 report external
75 temperatures, and temp3 reports the internal
76 temperature.
77temp[1-3]_min Mimimum temperature.
78temp[1-3]_max Maximum temperature.
79temp[1-3]_lcrit Critical low temperature.
80temp[1-3]_crit Critical high temperature.
81temp[1-3]_min_alarm Chip temperature low alarm.
82temp[1-3]_max_alarm Chip temperature high alarm.
83temp[1-3]_lcrit_alarm Chip temperature critical low alarm.
84temp[1-3]_crit_alarm Chip temperature critical high alarm.
85temp[1-3]_lowest Lowest measured temperature. LTC2978 only.
86temp[1-3]_highest Highest measured temperature.
87temp[1-3]_reset_history Reset history. Writing into this attribute will reset
88 history for all attributes.
89
90power[1-2]_label "pout[1-2]". LTC3880 only.
91power[1-2]_input Measured power.
92
93curr1_label "iin". LTC3880 only.
94curr1_input Measured input current.
95curr1_max Maximum input current.
96curr1_max_alarm Input current high alarm.
97
98curr[2-3]_label "iout[1-2]". LTC3880 only.
99curr[2-3]_input Measured input current.
100curr[2-3]_max Maximum input current.
101curr[2-3]_crit Critical input current.
102curr[2-3]_max_alarm Input current high alarm.
103curr[2-3]_crit_alarm Input current critical high alarm.
diff --git a/Documentation/hwmon/max16065 b/Documentation/hwmon/max16065
index 44b4f61e04f9..c11f64a1f2ad 100644
--- a/Documentation/hwmon/max16065
+++ b/Documentation/hwmon/max16065
@@ -62,6 +62,13 @@ can be safely used to identify the chip. You will have to instantiate
62the devices explicitly. Please see Documentation/i2c/instantiating-devices for 62the devices explicitly. Please see Documentation/i2c/instantiating-devices for
63details. 63details.
64 64
65WARNING: Do not access chip registers using the i2cdump command, and do not use
66any of the i2ctools commands on a command register (0xa5 to 0xac). The chips
67supported by this driver interpret any access to a command register (including
68read commands) as request to execute the command in question. This may result in
69power loss, board resets, and/or Flash corruption. Worst case, your board may
70turn into a brick.
71
65 72
66Sysfs entries 73Sysfs entries
67------------- 74-------------
diff --git a/Documentation/hwmon/pmbus b/Documentation/hwmon/pmbus
index c36c1c1a62bb..15ac911ce51b 100644
--- a/Documentation/hwmon/pmbus
+++ b/Documentation/hwmon/pmbus
@@ -8,11 +8,6 @@ Supported chips:
8 Addresses scanned: - 8 Addresses scanned: -
9 Datasheet: 9 Datasheet:
10 http://archive.ericsson.net/service/internet/picov/get?DocNo=28701-EN/LZT146395 10 http://archive.ericsson.net/service/internet/picov/get?DocNo=28701-EN/LZT146395
11 * Linear Technology LTC2978
12 Octal PMBus Power Supply Monitor and Controller
13 Prefix: 'ltc2978'
14 Addresses scanned: -
15 Datasheet: http://cds.linear.com/docs/Datasheet/2978fa.pdf
16 * ON Semiconductor ADP4000, NCP4200, NCP4208 11 * ON Semiconductor ADP4000, NCP4200, NCP4208
17 Prefixes: 'adp4000', 'ncp4200', 'ncp4208' 12 Prefixes: 'adp4000', 'ncp4200', 'ncp4208'
18 Addresses scanned: - 13 Addresses scanned: -
@@ -20,6 +15,14 @@ Supported chips:
20 http://www.onsemi.com/pub_link/Collateral/ADP4000-D.PDF 15 http://www.onsemi.com/pub_link/Collateral/ADP4000-D.PDF
21 http://www.onsemi.com/pub_link/Collateral/NCP4200-D.PDF 16 http://www.onsemi.com/pub_link/Collateral/NCP4200-D.PDF
22 http://www.onsemi.com/pub_link/Collateral/JUNE%202009-%20REV.%200.PDF 17 http://www.onsemi.com/pub_link/Collateral/JUNE%202009-%20REV.%200.PDF
18 * Lineage Power
19 Prefixes: 'pdt003', 'pdt006', 'pdt012', 'udt020'
20 Addresses scanned: -
21 Datasheets:
22 http://www.lineagepower.com/oem/pdf/PDT003A0X.pdf
23 http://www.lineagepower.com/oem/pdf/PDT006A0X.pdf
24 http://www.lineagepower.com/oem/pdf/PDT012A0X.pdf
25 http://www.lineagepower.com/oem/pdf/UDT020A0X.pdf
23 * Generic PMBus devices 26 * Generic PMBus devices
24 Prefix: 'pmbus' 27 Prefix: 'pmbus'
25 Addresses scanned: - 28 Addresses scanned: -
diff --git a/Documentation/hwmon/pmbus-core b/Documentation/hwmon/pmbus-core
new file mode 100644
index 000000000000..31e4720fed18
--- /dev/null
+++ b/Documentation/hwmon/pmbus-core
@@ -0,0 +1,283 @@
1PMBus core driver and internal API
2==================================
3
4Introduction
5============
6
7[from pmbus.org] The Power Management Bus (PMBus) is an open standard
8power-management protocol with a fully defined command language that facilitates
9communication with power converters and other devices in a power system. The
10protocol is implemented over the industry-standard SMBus serial interface and
11enables programming, control, and real-time monitoring of compliant power
12conversion products. This flexible and highly versatile standard allows for
13communication between devices based on both analog and digital technologies, and
14provides true interoperability which will reduce design complexity and shorten
15time to market for power system designers. Pioneered by leading power supply and
16semiconductor companies, this open power system standard is maintained and
17promoted by the PMBus Implementers Forum (PMBus-IF), comprising 30+ adopters
18with the objective to provide support to, and facilitate adoption among, users.
19
20Unfortunately, while PMBus commands are standardized, there are no mandatory
21commands, and manufacturers can add as many non-standard commands as they like.
22Also, different PMBUs devices act differently if non-supported commands are
23executed. Some devices return an error, some devices return 0xff or 0xffff and
24set a status error flag, and some devices may simply hang up.
25
26Despite all those difficulties, a generic PMBus device driver is still useful
27and supported since kernel version 2.6.39. However, it was necessary to support
28device specific extensions in addition to the core PMBus driver, since it is
29simply unknown what new device specific functionality PMBus device developers
30come up with next.
31
32To make device specific extensions as scalable as possible, and to avoid having
33to modify the core PMBus driver repeatedly for new devices, the PMBus driver was
34split into core, generic, and device specific code. The core code (in
35pmbus_core.c) provides generic functionality. The generic code (in pmbus.c)
36provides support for generic PMBus devices. Device specific code is responsible
37for device specific initialization and, if needed, maps device specific
38functionality into generic functionality. This is to some degree comparable
39to PCI code, where generic code is augmented as needed with quirks for all kinds
40of devices.
41
42PMBus device capabilities auto-detection
43========================================
44
45For generic PMBus devices, code in pmbus.c attempts to auto-detect all supported
46PMBus commands. Auto-detection is somewhat limited, since there are simply too
47many variables to consider. For example, it is almost impossible to autodetect
48which PMBus commands are paged and which commands are replicated across all
49pages (see the PMBus specification for details on multi-page PMBus devices).
50
51For this reason, it often makes sense to provide a device specific driver if not
52all commands can be auto-detected. The data structures in this driver can be
53used to inform the core driver about functionality supported by individual
54chips.
55
56Some commands are always auto-detected. This applies to all limit commands
57(lcrit, min, max, and crit attributes) as well as associated alarm attributes.
58Limits and alarm attributes are auto-detected because there are simply too many
59possible combinations to provide a manual configuration interface.
60
61PMBus internal API
62==================
63
64The API between core and device specific PMBus code is defined in
65drivers/hwmon/pmbus/pmbus.h. In addition to the internal API, pmbus.h defines
66standard PMBus commands and virtual PMBus commands.
67
68Standard PMBus commands
69-----------------------
70
71Standard PMBus commands (commands values 0x00 to 0xff) are defined in the PMBUs
72specification.
73
74Virtual PMBus commands
75----------------------
76
77Virtual PMBus commands are provided to enable support for non-standard
78functionality which has been implemented by several chip vendors and is thus
79desirable to support.
80
81Virtual PMBus commands start with command value 0x100 and can thus easily be
82distinguished from standard PMBus commands (which can not have values larger
83than 0xff). Support for virtual PMBus commands is device specific and thus has
84to be implemented in device specific code.
85
86Virtual commands are named PMBUS_VIRT_xxx and start with PMBUS_VIRT_BASE. All
87virtual commands are word sized.
88
89There are currently two types of virtual commands.
90
91- READ commands are read-only; writes are either ignored or return an error.
92- RESET commands are read/write. Reading reset registers returns zero
93 (used for detection), writing any value causes the associated history to be
94 reset.
95
96Virtual commands have to be handled in device specific driver code. Chip driver
97code returns non-negative values if a virtual command is supported, or a
98negative error code if not. The chip driver may return -ENODATA or any other
99Linux error code in this case, though an error code other than -ENODATA is
100handled more efficiently and thus preferred. Either case, the calling PMBus
101core code will abort if the chip driver returns an error code when reading
102or writing virtual registers (in other words, the PMBus core code will never
103send a virtual command to a chip).
104
105PMBus driver information
106------------------------
107
108PMBus driver information, defined in struct pmbus_driver_info, is the main means
109for device specific drivers to pass information to the core PMBus driver.
110Specifically, it provides the following information.
111
112- For devices supporting its data in Direct Data Format, it provides coefficients
113 for converting register values into normalized data. This data is usually
114 provided by chip manufacturers in device datasheets.
115- Supported chip functionality can be provided to the core driver. This may be
116 necessary for chips which react badly if non-supported commands are executed,
117 and/or to speed up device detection and initialization.
118- Several function entry points are provided to support overriding and/or
119 augmenting generic command execution. This functionality can be used to map
120 non-standard PMBus commands to standard commands, or to augment standard
121 command return values with device specific information.
122
123 API functions
124 -------------
125
126 Functions provided by chip driver
127 ---------------------------------
128
129 All functions return the command return value (read) or zero (write) if
130 successful. A return value of -ENODATA indicates that there is no manufacturer
131 specific command, but that a standard PMBus command may exist. Any other
132 negative return value indicates that the commands does not exist for this
133 chip, and that no attempt should be made to read or write the standard
134 command.
135
136 As mentioned above, an exception to this rule applies to virtual commands,
137 which _must_ be handled in driver specific code. See "Virtual PMBus Commands"
138 above for more details.
139
140 Command execution in the core PMBus driver code is as follows.
141
142 if (chip_access_function) {
143 status = chip_access_function();
144 if (status != -ENODATA)
145 return status;
146 }
147 if (command >= PMBUS_VIRT_BASE) /* For word commands/registers only */
148 return -EINVAL;
149 return generic_access();
150
151 Chip drivers may provide pointers to the following functions in struct
152 pmbus_driver_info. All functions are optional.
153
154 int (*read_byte_data)(struct i2c_client *client, int page, int reg);
155
156 Read byte from page <page>, register <reg>.
157 <page> may be -1, which means "current page".
158
159 int (*read_word_data)(struct i2c_client *client, int page, int reg);
160
161 Read word from page <page>, register <reg>.
162
163 int (*write_word_data)(struct i2c_client *client, int page, int reg,
164 u16 word);
165
166 Write word to page <page>, register <reg>.
167
168 int (*write_byte)(struct i2c_client *client, int page, u8 value);
169
170 Write byte to page <page>, register <reg>.
171 <page> may be -1, which means "current page".
172
173 int (*identify)(struct i2c_client *client, struct pmbus_driver_info *info);
174
175 Determine supported PMBus functionality. This function is only necessary
176 if a chip driver supports multiple chips, and the chip functionality is not
177 pre-determined. It is currently only used by the generic pmbus driver
178 (pmbus.c).
179
180 Functions exported by core driver
181 ---------------------------------
182
183 Chip drivers are expected to use the following functions to read or write
184 PMBus registers. Chip drivers may also use direct I2C commands. If direct I2C
185 commands are used, the chip driver code must not directly modify the current
186 page, since the selected page is cached in the core driver and the core driver
187 will assume that it is selected. Using pmbus_set_page() to select a new page
188 is mandatory.
189
190 int pmbus_set_page(struct i2c_client *client, u8 page);
191
192 Set PMBus page register to <page> for subsequent commands.
193
194 int pmbus_read_word_data(struct i2c_client *client, u8 page, u8 reg);
195
196 Read word data from <page>, <reg>. Similar to i2c_smbus_read_word_data(), but
197 selects page first.
198
199 int pmbus_write_word_data(struct i2c_client *client, u8 page, u8 reg,
200 u16 word);
201
202 Write word data to <page>, <reg>. Similar to i2c_smbus_write_word_data(), but
203 selects page first.
204
205 int pmbus_read_byte_data(struct i2c_client *client, int page, u8 reg);
206
207 Read byte data from <page>, <reg>. Similar to i2c_smbus_read_byte_data(), but
208 selects page first. <page> may be -1, which means "current page".
209
210 int pmbus_write_byte(struct i2c_client *client, int page, u8 value);
211
212 Write byte data to <page>, <reg>. Similar to i2c_smbus_write_byte(), but
213 selects page first. <page> may be -1, which means "current page".
214
215 void pmbus_clear_faults(struct i2c_client *client);
216
217 Execute PMBus "Clear Fault" command on all chip pages.
218 This function calls the device specific write_byte function if defined.
219 Therefore, it must _not_ be called from that function.
220
221 bool pmbus_check_byte_register(struct i2c_client *client, int page, int reg);
222
223 Check if byte register exists. Return true if the register exists, false
224 otherwise.
225 This function calls the device specific write_byte function if defined to
226 obtain the chip status. Therefore, it must _not_ be called from that function.
227
228 bool pmbus_check_word_register(struct i2c_client *client, int page, int reg);
229
230 Check if word register exists. Return true if the register exists, false
231 otherwise.
232 This function calls the device specific write_byte function if defined to
233 obtain the chip status. Therefore, it must _not_ be called from that function.
234
235 int pmbus_do_probe(struct i2c_client *client, const struct i2c_device_id *id,
236 struct pmbus_driver_info *info);
237
238 Execute probe function. Similar to standard probe function for other drivers,
239 with the pointer to struct pmbus_driver_info as additional argument. Calls
240 identify function if supported. Must only be called from device probe
241 function.
242
243 void pmbus_do_remove(struct i2c_client *client);
244
245 Execute driver remove function. Similar to standard driver remove function.
246
247 const struct pmbus_driver_info
248 *pmbus_get_driver_info(struct i2c_client *client);
249
250 Return pointer to struct pmbus_driver_info as passed to pmbus_do_probe().
251
252
253PMBus driver platform data
254==========================
255
256PMBus platform data is defined in include/linux/i2c/pmbus.h. Platform data
257currently only provides a flag field with a single bit used.
258
259#define PMBUS_SKIP_STATUS_CHECK (1 << 0)
260
261struct pmbus_platform_data {
262 u32 flags; /* Device specific flags */
263};
264
265
266Flags
267-----
268
269PMBUS_SKIP_STATUS_CHECK
270
271During register detection, skip checking the status register for
272communication or command errors.
273
274Some PMBus chips respond with valid data when trying to read an unsupported
275register. For such chips, checking the status register is mandatory when
276trying to determine if a chip register exists or not.
277Other PMBus chips don't support the STATUS_CML register, or report
278communication errors for no explicable reason. For such chips, checking the
279status register must be disabled.
280
281Some i2c controllers do not support single-byte commands (write commands with
282no data, i2c_smbus_write_byte()). With such controllers, clearing the status
283register is impossible, and the PMBUS_SKIP_STATUS_CHECK flag must be set.
diff --git a/Documentation/hwmon/zl6100 b/Documentation/hwmon/zl6100
new file mode 100644
index 000000000000..7617798b5c97
--- /dev/null
+++ b/Documentation/hwmon/zl6100
@@ -0,0 +1,125 @@
1Kernel driver zl6100
2====================
3
4Supported chips:
5 * Intersil / Zilker Labs ZL2004
6 Prefix: 'zl2004'
7 Addresses scanned: -
8 Datasheet: http://www.intersil.com/data/fn/fn6847.pdf
9 * Intersil / Zilker Labs ZL2006
10 Prefix: 'zl2006'
11 Addresses scanned: -
12 Datasheet: http://www.intersil.com/data/fn/fn6850.pdf
13 * Intersil / Zilker Labs ZL2008
14 Prefix: 'zl2008'
15 Addresses scanned: -
16 Datasheet: http://www.intersil.com/data/fn/fn6859.pdf
17 * Intersil / Zilker Labs ZL2105
18 Prefix: 'zl2105'
19 Addresses scanned: -
20 Datasheet: http://www.intersil.com/data/fn/fn6851.pdf
21 * Intersil / Zilker Labs ZL2106
22 Prefix: 'zl2106'
23 Addresses scanned: -
24 Datasheet: http://www.intersil.com/data/fn/fn6852.pdf
25 * Intersil / Zilker Labs ZL6100
26 Prefix: 'zl6100'
27 Addresses scanned: -
28 Datasheet: http://www.intersil.com/data/fn/fn6876.pdf
29 * Intersil / Zilker Labs ZL6105
30 Prefix: 'zl6105'
31 Addresses scanned: -
32 Datasheet: http://www.intersil.com/data/fn/fn6906.pdf
33
34Author: Guenter Roeck <guenter.roeck@ericsson.com>
35
36
37Description
38-----------
39
40This driver supports hardware montoring for Intersil / Zilker Labs ZL6100 and
41compatible digital DC-DC controllers.
42
43The driver is a client driver to the core PMBus driver. Please see
44Documentation/hwmon/pmbus and Documentation.hwmon/pmbus-core for details
45on PMBus client drivers.
46
47
48Usage Notes
49-----------
50
51This driver does not auto-detect devices. You will have to instantiate the
52devices explicitly. Please see Documentation/i2c/instantiating-devices for
53details.
54
55WARNING: Do not access chip registers using the i2cdump command, and do not use
56any of the i2ctools commands on a command register used to save and restore
57configuration data (0x11, 0x12, 0x15, 0x16, and 0xf4). The chips supported by
58this driver interpret any access to those command registers (including read
59commands) as request to execute the command in question. Unless write accesses
60to those registers are protected, this may result in power loss, board resets,
61and/or Flash corruption. Worst case, your board may turn into a brick.
62
63
64Platform data support
65---------------------
66
67The driver supports standard PMBus driver platform data.
68
69
70Module parameters
71-----------------
72
73delay
74-----
75
76Some Intersil/Zilker Labs DC-DC controllers require a minimum interval between
77I2C bus accesses. According to Intersil, the minimum interval is 2 ms, though
781 ms appears to be sufficient and has not caused any problems in testing.
79The problem is known to affect ZL6100, ZL2105, and ZL2008. It is known not to
80affect ZL2004 and ZL6105. The driver automatically sets the interval to 1 ms
81except for ZL2004 and ZL6105. To enable manual override, the driver provides a
82writeable module parameter, 'delay', which can be used to set the interval to
83a value between 0 and 65,535 microseconds.
84
85
86Sysfs entries
87-------------
88
89The following attributes are supported. Limits are read-write; all other
90attributes are read-only.
91
92in1_label "vin"
93in1_input Measured input voltage.
94in1_min Minimum input voltage.
95in1_max Maximum input voltage.
96in1_lcrit Critical minumum input voltage.
97in1_crit Critical maximum input voltage.
98in1_min_alarm Input voltage low alarm.
99in1_max_alarm Input voltage high alarm.
100in1_lcrit_alarm Input voltage critical low alarm.
101in1_crit_alarm Input voltage critical high alarm.
102
103in2_label "vout1"
104in2_input Measured output voltage.
105in2_lcrit Critical minumum output Voltage.
106in2_crit Critical maximum output voltage.
107in2_lcrit_alarm Critical output voltage critical low alarm.
108in2_crit_alarm Critical output voltage critical high alarm.
109
110curr1_label "iout1"
111curr1_input Measured output current.
112curr1_lcrit Critical minimum output current.
113curr1_crit Critical maximum output current.
114curr1_lcrit_alarm Output current critical low alarm.
115curr1_crit_alarm Output current critical high alarm.
116
117temp[12]_input Measured temperature.
118temp[12]_min Minimum temperature.
119temp[12]_max Maximum temperature.
120temp[12]_lcrit Critical low temperature.
121temp[12]_crit Critical high temperature.
122temp[12]_min_alarm Chip temperature low alarm.
123temp[12]_max_alarm Chip temperature high alarm.
124temp[12]_lcrit_alarm Chip temperature critical low alarm.
125temp[12]_crit_alarm Chip temperature critical high alarm.
diff --git a/Documentation/hwspinlock.txt b/Documentation/hwspinlock.txt
index 7dcd1a4e726c..a903ee5e9776 100644
--- a/Documentation/hwspinlock.txt
+++ b/Documentation/hwspinlock.txt
@@ -39,23 +39,20 @@ independent, drivers.
39 in case an unused hwspinlock isn't available. Users of this 39 in case an unused hwspinlock isn't available. Users of this
40 API will usually want to communicate the lock's id to the remote core 40 API will usually want to communicate the lock's id to the remote core
41 before it can be used to achieve synchronization. 41 before it can be used to achieve synchronization.
42 Can be called from an atomic context (this function will not sleep) but 42 Should be called from a process context (might sleep).
43 not from within interrupt context.
44 43
45 struct hwspinlock *hwspin_lock_request_specific(unsigned int id); 44 struct hwspinlock *hwspin_lock_request_specific(unsigned int id);
46 - assign a specific hwspinlock id and return its address, or NULL 45 - assign a specific hwspinlock id and return its address, or NULL
47 if that hwspinlock is already in use. Usually board code will 46 if that hwspinlock is already in use. Usually board code will
48 be calling this function in order to reserve specific hwspinlock 47 be calling this function in order to reserve specific hwspinlock
49 ids for predefined purposes. 48 ids for predefined purposes.
50 Can be called from an atomic context (this function will not sleep) but 49 Should be called from a process context (might sleep).
51 not from within interrupt context.
52 50
53 int hwspin_lock_free(struct hwspinlock *hwlock); 51 int hwspin_lock_free(struct hwspinlock *hwlock);
54 - free a previously-assigned hwspinlock; returns 0 on success, or an 52 - free a previously-assigned hwspinlock; returns 0 on success, or an
55 appropriate error code on failure (e.g. -EINVAL if the hwspinlock 53 appropriate error code on failure (e.g. -EINVAL if the hwspinlock
56 is already free). 54 is already free).
57 Can be called from an atomic context (this function will not sleep) but 55 Should be called from a process context (might sleep).
58 not from within interrupt context.
59 56
60 int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned int timeout); 57 int hwspin_lock_timeout(struct hwspinlock *hwlock, unsigned int timeout);
61 - lock a previously-assigned hwspinlock with a timeout limit (specified in 58 - lock a previously-assigned hwspinlock with a timeout limit (specified in
@@ -230,45 +227,62 @@ int hwspinlock_example2(void)
230 227
2314. API for implementors 2284. API for implementors
232 229
233 int hwspin_lock_register(struct hwspinlock *hwlock); 230 int hwspin_lock_register(struct hwspinlock_device *bank, struct device *dev,
231 const struct hwspinlock_ops *ops, int base_id, int num_locks);
234 - to be called from the underlying platform-specific implementation, in 232 - to be called from the underlying platform-specific implementation, in
235 order to register a new hwspinlock instance. Can be called from an atomic 233 order to register a new hwspinlock device (which is usually a bank of
236 context (this function will not sleep) but not from within interrupt 234 numerous locks). Should be called from a process context (this function
237 context. Returns 0 on success, or appropriate error code on failure. 235 might sleep).
236 Returns 0 on success, or appropriate error code on failure.
238 237
239 struct hwspinlock *hwspin_lock_unregister(unsigned int id); 238 int hwspin_lock_unregister(struct hwspinlock_device *bank);
240 - to be called from the underlying vendor-specific implementation, in order 239 - to be called from the underlying vendor-specific implementation, in order
241 to unregister an existing (and unused) hwspinlock instance. 240 to unregister an hwspinlock device (which is usually a bank of numerous
242 Can be called from an atomic context (will not sleep) but not from 241 locks).
243 within interrupt context. 242 Should be called from a process context (this function might sleep).
244 Returns the address of hwspinlock on success, or NULL on error (e.g. 243 Returns the address of hwspinlock on success, or NULL on error (e.g.
245 if the hwspinlock is sill in use). 244 if the hwspinlock is sill in use).
246 245
2475. struct hwspinlock 2465. Important structs
248 247
249This struct represents an hwspinlock instance. It is registered by the 248struct hwspinlock_device is a device which usually contains a bank
250underlying hwspinlock implementation using the hwspin_lock_register() API. 249of hardware locks. It is registered by the underlying hwspinlock
250implementation using the hwspin_lock_register() API.
251 251
252/** 252/**
253 * struct hwspinlock - vendor-specific hwspinlock implementation 253 * struct hwspinlock_device - a device which usually spans numerous hwspinlocks
254 * 254 * @dev: underlying device, will be used to invoke runtime PM api
255 * @dev: underlying device, will be used with runtime PM api 255 * @ops: platform-specific hwspinlock handlers
256 * @ops: vendor-specific hwspinlock handlers 256 * @base_id: id index of the first lock in this device
257 * @id: a global, unique, system-wide, index of the lock. 257 * @num_locks: number of locks in this device
258 * @lock: initialized and used by hwspinlock core 258 * @lock: dynamically allocated array of 'struct hwspinlock'
259 * @owner: underlying implementation module, used to maintain module ref count
260 */ 259 */
261struct hwspinlock { 260struct hwspinlock_device {
262 struct device *dev; 261 struct device *dev;
263 const struct hwspinlock_ops *ops; 262 const struct hwspinlock_ops *ops;
264 int id; 263 int base_id;
264 int num_locks;
265 struct hwspinlock lock[0];
266};
267
268struct hwspinlock_device contains an array of hwspinlock structs, each
269of which represents a single hardware lock:
270
271/**
272 * struct hwspinlock - this struct represents a single hwspinlock instance
273 * @bank: the hwspinlock_device structure which owns this lock
274 * @lock: initialized and used by hwspinlock core
275 * @priv: private data, owned by the underlying platform-specific hwspinlock drv
276 */
277struct hwspinlock {
278 struct hwspinlock_device *bank;
265 spinlock_t lock; 279 spinlock_t lock;
266 struct module *owner; 280 void *priv;
267}; 281};
268 282
269The underlying implementation is responsible to assign the dev, ops, id and 283When registering a bank of locks, the hwspinlock driver only needs to
270owner members. The lock member, OTOH, is initialized and used by the hwspinlock 284set the priv members of the locks. The rest of the members are set and
271core. 285initialized by the hwspinlock core itself.
272 286
2736. Implementation callbacks 2876. Implementation callbacks
274 288
diff --git a/Documentation/i2c/smbus-protocol b/Documentation/i2c/smbus-protocol
index 7c19d1a2bea0..49f5b680809d 100644
--- a/Documentation/i2c/smbus-protocol
+++ b/Documentation/i2c/smbus-protocol
@@ -88,6 +88,10 @@ byte. But this time, the data is a complete word (16 bits).
88 88
89S Addr Wr [A] Comm [A] S Addr Rd [A] [DataLow] A [DataHigh] NA P 89S Addr Wr [A] Comm [A] S Addr Rd [A] [DataLow] A [DataHigh] NA P
90 90
91Note the convenience function i2c_smbus_read_word_swapped is
92available for reads where the two data bytes are the other way
93around (not SMBus compliant, but very popular.)
94
91 95
92SMBus Write Byte: i2c_smbus_write_byte_data() 96SMBus Write Byte: i2c_smbus_write_byte_data()
93============================================== 97==============================================
@@ -108,6 +112,10 @@ specified through the Comm byte.
108 112
109S Addr Wr [A] Comm [A] DataLow [A] DataHigh [A] P 113S Addr Wr [A] Comm [A] DataLow [A] DataHigh [A] P
110 114
115Note the convenience function i2c_smbus_write_word_swapped is
116available for writes where the two data bytes are the other way
117around (not SMBus compliant, but very popular.)
118
111 119
112SMBus Process Call: i2c_smbus_process_call() 120SMBus Process Call: i2c_smbus_process_call()
113============================================= 121=============================================
diff --git a/Documentation/input/elantech.txt b/Documentation/input/elantech.txt
index db798af5ef98..5602eb71ad5d 100644
--- a/Documentation/input/elantech.txt
+++ b/Documentation/input/elantech.txt
@@ -16,15 +16,28 @@ Contents
16 16
17 1. Introduction 17 1. Introduction
18 2. Extra knobs 18 2. Extra knobs
19 3. Hardware version 1 19 3. Differentiating hardware versions
20 3.1 Registers 20 4. Hardware version 1
21 3.2 Native relative mode 4 byte packet format
22 3.3 Native absolute mode 4 byte packet format
23 4. Hardware version 2
24 4.1 Registers 21 4.1 Registers
25 4.2 Native absolute mode 6 byte packet format 22 4.2 Native relative mode 4 byte packet format
26 4.2.1 One finger touch 23 4.3 Native absolute mode 4 byte packet format
27 4.2.2 Two finger touch 24 5. Hardware version 2
25 5.1 Registers
26 5.2 Native absolute mode 6 byte packet format
27 5.2.1 Parity checking and packet re-synchronization
28 5.2.2 One/Three finger touch
29 5.2.3 Two finger touch
30 6. Hardware version 3
31 6.1 Registers
32 6.2 Native absolute mode 6 byte packet format
33 6.2.1 One/Three finger touch
34 6.2.2 Two finger touch
35 7. Hardware version 4
36 7.1 Registers
37 7.2 Native absolute mode 6 byte packet format
38 7.2.1 Status packet
39 7.2.2 Head packet
40 7.2.3 Motion packet
28 41
29 42
30 43
@@ -375,7 +388,7 @@ For all the other ones, there are just a few constant bits:
375 388
376In case an error is detected, all the packets are shifted by one (and packet[0] is discarded). 389In case an error is detected, all the packets are shifted by one (and packet[0] is discarded).
377 390
3785.2.1 One/Three finger touch 3915.2.2 One/Three finger touch
379 ~~~~~~~~~~~~~~~~ 392 ~~~~~~~~~~~~~~~~
380 393
381byte 0: 394byte 0:
@@ -384,19 +397,19 @@ byte 0:
384 n1 n0 w3 w2 . . R L 397 n1 n0 w3 w2 . . R L
385 398
386 L, R = 1 when Left, Right mouse button pressed 399 L, R = 1 when Left, Right mouse button pressed
387 n1..n0 = numbers of fingers on touchpad 400 n1..n0 = number of fingers on touchpad
388 401
389byte 1: 402byte 1:
390 403
391 bit 7 6 5 4 3 2 1 0 404 bit 7 6 5 4 3 2 1 0
392 p7 p6 p5 p4 . x10 x9 x8 405 p7 p6 p5 p4 x11 x10 x9 x8
393 406
394byte 2: 407byte 2:
395 408
396 bit 7 6 5 4 3 2 1 0 409 bit 7 6 5 4 3 2 1 0
397 x7 x6 x5 x4 x3 x2 x1 x0 410 x7 x6 x5 x4 x3 x2 x1 x0
398 411
399 x10..x0 = absolute x value (horizontal) 412 x11..x0 = absolute x value (horizontal)
400 413
401byte 3: 414byte 3:
402 415
@@ -420,7 +433,7 @@ byte 3:
420byte 4: 433byte 4:
421 434
422 bit 7 6 5 4 3 2 1 0 435 bit 7 6 5 4 3 2 1 0
423 p3 p1 p2 p0 . . y9 y8 436 p3 p1 p2 p0 y11 y10 y9 y8
424 437
425 p7..p0 = pressure (not EF113) 438 p7..p0 = pressure (not EF113)
426 439
@@ -429,10 +442,10 @@ byte 5:
429 bit 7 6 5 4 3 2 1 0 442 bit 7 6 5 4 3 2 1 0
430 y7 y6 y5 y4 y3 y2 y1 y0 443 y7 y6 y5 y4 y3 y2 y1 y0
431 444
432 y9..y0 = absolute y value (vertical) 445 y11..y0 = absolute y value (vertical)
433 446
434 447
4354.2.2 Two finger touch 4485.2.3 Two finger touch
436 ~~~~~~~~~~~~~~~~ 449 ~~~~~~~~~~~~~~~~
437 450
438Note that the two pairs of coordinates are not exactly the coordinates of the 451Note that the two pairs of coordinates are not exactly the coordinates of the
@@ -446,7 +459,7 @@ byte 0:
446 n1 n0 ay8 ax8 . . R L 459 n1 n0 ay8 ax8 . . R L
447 460
448 L, R = 1 when Left, Right mouse button pressed 461 L, R = 1 when Left, Right mouse button pressed
449 n1..n0 = numbers of fingers on touchpad 462 n1..n0 = number of fingers on touchpad
450 463
451byte 1: 464byte 1:
452 465
@@ -480,3 +493,253 @@ byte 5:
480 by7 by8 by5 by4 by3 by2 by1 by0 493 by7 by8 by5 by4 by3 by2 by1 by0
481 494
482 by8..by0 = upper-right finger absolute y value 495 by8..by0 = upper-right finger absolute y value
496
497/////////////////////////////////////////////////////////////////////////////
498
4996. Hardware version 3
500 ==================
501
5026.1 Registers
503 ~~~~~~~~~
504* reg_10
505
506 bit 7 6 5 4 3 2 1 0
507 0 0 0 0 0 0 0 A
508
509 A: 1 = enable absolute tracking
510
5116.2 Native absolute mode 6 byte packet format
512 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5131 and 3 finger touch shares the same 6-byte packet format, except that
5143 finger touch only reports the position of the center of all three fingers.
515
516Firmware would send 12 bytes of data for 2 finger touch.
517
518Note on debounce:
519In case the box has unstable power supply or other electricity issues, or
520when number of finger changes, F/W would send "debounce packet" to inform
521driver that the hardware is in debounce status.
522The debouce packet has the following signature:
523 byte 0: 0xc4
524 byte 1: 0xff
525 byte 2: 0xff
526 byte 3: 0x02
527 byte 4: 0xff
528 byte 5: 0xff
529When we encounter this kind of packet, we just ignore it.
530
5316.2.1 One/Three finger touch
532 ~~~~~~~~~~~~~~~~~~~~~~
533
534byte 0:
535
536 bit 7 6 5 4 3 2 1 0
537 n1 n0 w3 w2 0 1 R L
538
539 L, R = 1 when Left, Right mouse button pressed
540 n1..n0 = number of fingers on touchpad
541
542byte 1:
543
544 bit 7 6 5 4 3 2 1 0
545 p7 p6 p5 p4 x11 x10 x9 x8
546
547byte 2:
548
549 bit 7 6 5 4 3 2 1 0
550 x7 x6 x5 x4 x3 x2 x1 x0
551
552 x11..x0 = absolute x value (horizontal)
553
554byte 3:
555
556 bit 7 6 5 4 3 2 1 0
557 0 0 w1 w0 0 0 1 0
558
559 w3..w0 = width of the finger touch
560
561byte 4:
562
563 bit 7 6 5 4 3 2 1 0
564 p3 p1 p2 p0 y11 y10 y9 y8
565
566 p7..p0 = pressure
567
568byte 5:
569
570 bit 7 6 5 4 3 2 1 0
571 y7 y6 y5 y4 y3 y2 y1 y0
572
573 y11..y0 = absolute y value (vertical)
574
5756.2.2 Two finger touch
576 ~~~~~~~~~~~~~~~~
577
578The packet format is exactly the same for two finger touch, except the hardware
579sends two 6 byte packets. The first packet contains data for the first finger,
580the second packet has data for the second finger. So for two finger touch a
581total of 12 bytes are sent.
582
583/////////////////////////////////////////////////////////////////////////////
584
5857. Hardware version 4
586 ==================
587
5887.1 Registers
589 ~~~~~~~~~
590* reg_07
591
592 bit 7 6 5 4 3 2 1 0
593 0 0 0 0 0 0 0 A
594
595 A: 1 = enable absolute tracking
596
5977.2 Native absolute mode 6 byte packet format
598 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
599v4 hardware is a true multitouch touchpad, capable of tracking up to 5 fingers.
600Unfortunately, due to PS/2's limited bandwidth, its packet format is rather
601complex.
602
603Whenever the numbers or identities of the fingers changes, the hardware sends a
604status packet to indicate how many and which fingers is on touchpad, followed by
605head packets or motion packets. A head packet contains data of finger id, finger
606position (absolute x, y values), width, and pressure. A motion packet contains
607two fingers' position delta.
608
609For example, when status packet tells there are 2 fingers on touchpad, then we
610can expect two following head packets. If the finger status doesn't change,
611the following packets would be motion packets, only sending delta of finger
612position, until we receive a status packet.
613
614One exception is one finger touch. when a status packet tells us there is only
615one finger, the hardware would just send head packets afterwards.
616
6177.2.1 Status packet
618 ~~~~~~~~~~~~~
619
620byte 0:
621
622 bit 7 6 5 4 3 2 1 0
623 . . . . 0 1 R L
624
625 L, R = 1 when Left, Right mouse button pressed
626
627byte 1:
628
629 bit 7 6 5 4 3 2 1 0
630 . . . ft4 ft3 ft2 ft1 ft0
631
632 ft4 ft3 ft2 ft1 ft0 ftn = 1 when finger n is on touchpad
633
634byte 2: not used
635
636byte 3:
637
638 bit 7 6 5 4 3 2 1 0
639 . . . 1 0 0 0 0
640
641 constant bits
642
643byte 4:
644
645 bit 7 6 5 4 3 2 1 0
646 p . . . . . . .
647
648 p = 1 for palm
649
650byte 5: not used
651
6527.2.2 Head packet
653 ~~~~~~~~~~~
654
655byte 0:
656
657 bit 7 6 5 4 3 2 1 0
658 w3 w2 w1 w0 0 1 R L
659
660 L, R = 1 when Left, Right mouse button pressed
661 w3..w0 = finger width (spans how many trace lines)
662
663byte 1:
664
665 bit 7 6 5 4 3 2 1 0
666 p7 p6 p5 p4 x11 x10 x9 x8
667
668byte 2:
669
670 bit 7 6 5 4 3 2 1 0
671 x7 x6 x5 x4 x3 x2 x1 x0
672
673 x11..x0 = absolute x value (horizontal)
674
675byte 3:
676
677 bit 7 6 5 4 3 2 1 0
678 id2 id1 id0 1 0 0 0 1
679
680 id2..id0 = finger id
681
682byte 4:
683
684 bit 7 6 5 4 3 2 1 0
685 p3 p1 p2 p0 y11 y10 y9 y8
686
687 p7..p0 = pressure
688
689byte 5:
690
691 bit 7 6 5 4 3 2 1 0
692 y7 y6 y5 y4 y3 y2 y1 y0
693
694 y11..y0 = absolute y value (vertical)
695
6967.2.3 Motion packet
697 ~~~~~~~~~~~~~
698
699byte 0:
700
701 bit 7 6 5 4 3 2 1 0
702 id2 id1 id0 w 0 1 R L
703
704 L, R = 1 when Left, Right mouse button pressed
705 id2..id0 = finger id
706 w = 1 when delta overflows (> 127 or < -128), in this case
707 firmware sends us (delta x / 5) and (delta y / 5)
708
709byte 1:
710
711 bit 7 6 5 4 3 2 1 0
712 x7 x6 x5 x4 x3 x2 x1 x0
713
714 x7..x0 = delta x (two's complement)
715
716byte 2:
717
718 bit 7 6 5 4 3 2 1 0
719 y7 y6 y5 y4 y3 y2 y1 y0
720
721 y7..y0 = delta y (two's complement)
722
723byte 3:
724
725 bit 7 6 5 4 3 2 1 0
726 id2 id1 id0 1 0 0 1 0
727
728 id2..id0 = finger id
729
730byte 4:
731
732 bit 7 6 5 4 3 2 1 0
733 x7 x6 x5 x4 x3 x2 x1 x0
734
735 x7..x0 = delta x (two's complement)
736
737byte 5:
738
739 bit 7 6 5 4 3 2 1 0
740 y7 y6 y5 y4 y3 y2 y1 y0
741
742 y7..y0 = delta y (two's complement)
743
744 byte 0 ~ 2 for one finger
745 byte 3 ~ 5 for another
diff --git a/Documentation/input/input.txt b/Documentation/input/input.txt
index b93c08442e3c..b3d6787b4fb1 100644
--- a/Documentation/input/input.txt
+++ b/Documentation/input/input.txt
@@ -111,7 +111,7 @@ LCDs and many other purposes.
111 111
112 The monitor and speaker controls should be easy to add to the hid/input 112 The monitor and speaker controls should be easy to add to the hid/input
113interface, but for the UPSs and LCDs it doesn't make much sense. For this, 113interface, but for the UPSs and LCDs it doesn't make much sense. For this,
114the hiddev interface was designed. See Documentation/usb/hiddev.txt 114the hiddev interface was designed. See Documentation/hid/hiddev.txt
115for more information about it. 115for more information about it.
116 116
117 The usage of the usbhid module is very simple, it takes no parameters, 117 The usage of the usbhid module is very simple, it takes no parameters,
diff --git a/Documentation/input/multi-touch-protocol.txt b/Documentation/input/multi-touch-protocol.txt
index 71536e78406f..543101c5bf26 100644
--- a/Documentation/input/multi-touch-protocol.txt
+++ b/Documentation/input/multi-touch-protocol.txt
@@ -65,6 +65,20 @@ the full state of each initiated contact has to reside in the receiving
65end. Upon receiving an MT event, one simply updates the appropriate 65end. Upon receiving an MT event, one simply updates the appropriate
66attribute of the current slot. 66attribute of the current slot.
67 67
68Some devices identify and/or track more contacts than they can report to the
69driver. A driver for such a device should associate one type B slot with each
70contact that is reported by the hardware. Whenever the identity of the
71contact associated with a slot changes, the driver should invalidate that
72slot by changing its ABS_MT_TRACKING_ID. If the hardware signals that it is
73tracking more contacts than it is currently reporting, the driver should use
74a BTN_TOOL_*TAP event to inform userspace of the total number of contacts
75being tracked by the hardware at that moment. The driver should do this by
76explicitly sending the corresponding BTN_TOOL_*TAP event and setting
77use_count to false when calling input_mt_report_pointer_emulation().
78The driver should only advertise as many slots as the hardware can report.
79Userspace can detect that a driver can report more total contacts than slots
80by noting that the largest supported BTN_TOOL_*TAP event is larger than the
81total number of type B slots reported in the absinfo for the ABS_MT_SLOT axis.
68 82
69Protocol Example A 83Protocol Example A
70------------------ 84------------------
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 845a191004b1..54078ed96b37 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -319,4 +319,6 @@ Code Seq#(hex) Include File Comments
319 <mailto:thomas@winischhofer.net> 319 <mailto:thomas@winischhofer.net>
3200xF4 00-1F video/mbxfb.h mbxfb 3200xF4 00-1F video/mbxfb.h mbxfb
321 <mailto:raph@8d.com> 321 <mailto:raph@8d.com>
3220xF6 all LTTng Linux Trace Toolkit Next Generation
323 <mailto:mathieu.desnoyers@efficios.com>
3220xFD all linux/dm-ioctl.h 3240xFD all linux/dm-ioctl.h
diff --git a/Documentation/kernel-docs.txt b/Documentation/kernel-docs.txt
index 9a8674629a07..eda1eb1451a0 100644
--- a/Documentation/kernel-docs.txt
+++ b/Documentation/kernel-docs.txt
@@ -300,7 +300,7 @@
300 300
301 * Title: "The Kernel Hacking HOWTO" 301 * Title: "The Kernel Hacking HOWTO"
302 Author: Various Talented People, and Rusty. 302 Author: Various Talented People, and Rusty.
303 Location: in kernel tree, Documentation/DocBook/kernel-hacking/ 303 Location: in kernel tree, Documentation/DocBook/kernel-hacking.tmpl
304 (must be built as "make {htmldocs | psdocs | pdfdocs}) 304 (must be built as "make {htmldocs | psdocs | pdfdocs})
305 Keywords: HOWTO, kernel contexts, deadlock, locking, modules, 305 Keywords: HOWTO, kernel contexts, deadlock, locking, modules,
306 symbols, return conventions. 306 symbols, return conventions.
@@ -351,7 +351,7 @@
351 351
352 * Title: "Linux Kernel Locking HOWTO" 352 * Title: "Linux Kernel Locking HOWTO"
353 Author: Various Talented People, and Rusty. 353 Author: Various Talented People, and Rusty.
354 Location: in kernel tree, Documentation/DocBook/kernel-locking/ 354 Location: in kernel tree, Documentation/DocBook/kernel-locking.tmpl
355 (must be built as "make {htmldocs | psdocs | pdfdocs}) 355 (must be built as "make {htmldocs | psdocs | pdfdocs})
356 Keywords: locks, locking, spinlock, semaphore, atomic, race 356 Keywords: locks, locking, spinlock, semaphore, atomic, race
357 condition, bottom halves, tasklets, softirqs. 357 condition, bottom halves, tasklets, softirqs.
@@ -620,17 +620,6 @@
620 (including this document itself) have been moved there, and might 620 (including this document itself) have been moved there, and might
621 be more up to date than the web version. 621 be more up to date than the web version.
622 622
623 * Name: "Linux Source Driver"
624 URL: http://lsd.linux.cz
625 Keywords: Browsing source code.
626 Description: "Linux Source Driver (LSD) is an application, which
627 can make browsing source codes of Linux kernel easier than you can
628 imagine. You can select between multiple versions of kernel (e.g.
629 0.01, 1.0.0, 2.0.33, 2.0.34pre13, 2.0.0, 2.1.101 etc.). With LSD
630 you can search Linux kernel (fulltext, macros, types, functions
631 and variables) and LSD can generate patches for you on the fly
632 (files, directories or kernel)".
633
634 * Name: "Linux Kernel Source Reference" 623 * Name: "Linux Kernel Source Reference"
635 Author: Thomas Graichen. 624 Author: Thomas Graichen.
636 URL: http://marc.info/?l=linux-kernel&m=96446640102205&w=4 625 URL: http://marc.info/?l=linux-kernel&m=96446640102205&w=4
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index e279b7242912..a0c5c5f4fce6 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -40,6 +40,7 @@ parameter is applicable:
40 ALSA ALSA sound support is enabled. 40 ALSA ALSA sound support is enabled.
41 APIC APIC support is enabled. 41 APIC APIC support is enabled.
42 APM Advanced Power Management support is enabled. 42 APM Advanced Power Management support is enabled.
43 ARM ARM architecture is enabled.
43 AVR32 AVR32 architecture is enabled. 44 AVR32 AVR32 architecture is enabled.
44 AX25 Appropriate AX.25 support is enabled. 45 AX25 Appropriate AX.25 support is enabled.
45 BLACKFIN Blackfin architecture is enabled. 46 BLACKFIN Blackfin architecture is enabled.
@@ -48,7 +49,9 @@ parameter is applicable:
48 EDD BIOS Enhanced Disk Drive Services (EDD) is enabled 49 EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
49 EFI EFI Partitioning (GPT) is enabled 50 EFI EFI Partitioning (GPT) is enabled
50 EIDE EIDE/ATAPI support is enabled. 51 EIDE EIDE/ATAPI support is enabled.
52 EVM Extended Verification Module
51 FB The frame buffer device is enabled. 53 FB The frame buffer device is enabled.
54 FTRACE Function tracing enabled.
52 GCOV GCOV profiling is enabled. 55 GCOV GCOV profiling is enabled.
53 HW Appropriate hardware is enabled. 56 HW Appropriate hardware is enabled.
54 IA-64 IA-64 architecture is enabled. 57 IA-64 IA-64 architecture is enabled.
@@ -69,6 +72,7 @@ parameter is applicable:
69 Documentation/m68k/kernel-options.txt. 72 Documentation/m68k/kernel-options.txt.
70 MCA MCA bus support is enabled. 73 MCA MCA bus support is enabled.
71 MDA MDA console support is enabled. 74 MDA MDA console support is enabled.
75 MIPS MIPS architecture is enabled.
72 MOUSE Appropriate mouse support is enabled. 76 MOUSE Appropriate mouse support is enabled.
73 MSI Message Signaled Interrupts (PCI). 77 MSI Message Signaled Interrupts (PCI).
74 MTD MTD (Memory Technology Device) support is enabled. 78 MTD MTD (Memory Technology Device) support is enabled.
@@ -100,7 +104,6 @@ parameter is applicable:
100 SPARC Sparc architecture is enabled. 104 SPARC Sparc architecture is enabled.
101 SWSUSP Software suspend (hibernation) is enabled. 105 SWSUSP Software suspend (hibernation) is enabled.
102 SUSPEND System suspend states are enabled. 106 SUSPEND System suspend states are enabled.
103 FTRACE Function tracing enabled.
104 TPM TPM drivers are enabled. 107 TPM TPM drivers are enabled.
105 TS Appropriate touchscreen support is enabled. 108 TS Appropriate touchscreen support is enabled.
106 UMS USB Mass Storage support is enabled. 109 UMS USB Mass Storage support is enabled.
@@ -115,7 +118,7 @@ parameter is applicable:
115 X86-64 X86-64 architecture is enabled. 118 X86-64 X86-64 architecture is enabled.
116 More X86-64 boot options can be found in 119 More X86-64 boot options can be found in
117 Documentation/x86/x86_64/boot-options.txt . 120 Documentation/x86/x86_64/boot-options.txt .
118 X86 Either 32bit or 64bit x86 (same as X86-32+X86-64) 121 X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
119 XEN Xen support is enabled 122 XEN Xen support is enabled
120 123
121In addition, the following text indicates that the option: 124In addition, the following text indicates that the option:
@@ -161,7 +164,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
161 rsdt -- prefer RSDT over (default) XSDT 164 rsdt -- prefer RSDT over (default) XSDT
162 copy_dsdt -- copy DSDT to memory 165 copy_dsdt -- copy DSDT to memory
163 166
164 See also Documentation/power/pm.txt, pci=noacpi 167 See also Documentation/power/runtime_pm.txt, pci=noacpi
165 168
166 acpi_rsdp= [ACPI,EFI,KEXEC] 169 acpi_rsdp= [ACPI,EFI,KEXEC]
167 Pass the RSDP address to the kernel, mostly used 170 Pass the RSDP address to the kernel, mostly used
@@ -304,6 +307,19 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
304 behaviour to be specified. Bit 0 enables warnings, 307 behaviour to be specified. Bit 0 enables warnings,
305 bit 1 enables fixups, and bit 2 sends a segfault. 308 bit 1 enables fixups, and bit 2 sends a segfault.
306 309
310 align_va_addr= [X86-64]
311 Align virtual addresses by clearing slice [14:12] when
312 allocating a VMA at process creation time. This option
313 gives you up to 3% performance improvement on AMD F15h
314 machines (where it is enabled by default) for a
315 CPU-intensive style benchmark, and it can vary highly in
316 a microbenchmark depending on workload and compiler.
317
318 1: only for 32-bit processes
319 2: only for 64-bit processes
320 on: enable for both 32- and 64-bit processes
321 off: disable for both 32- and 64-bit processes
322
307 amd_iommu= [HW,X86-84] 323 amd_iommu= [HW,X86-84]
308 Pass parameters to the AMD IOMMU driver in the system. 324 Pass parameters to the AMD IOMMU driver in the system.
309 Possible values are: 325 Possible values are:
@@ -317,7 +333,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
317 amijoy.map= [HW,JOY] Amiga joystick support 333 amijoy.map= [HW,JOY] Amiga joystick support
318 Map of devices attached to JOY0DAT and JOY1DAT 334 Map of devices attached to JOY0DAT and JOY1DAT
319 Format: <a>,<b> 335 Format: <a>,<b>
320 See also Documentation/kernel/input/joystick.txt 336 See also Documentation/input/joystick.txt
321 337
322 analog.map= [HW,JOY] Analog joystick and gamepad support 338 analog.map= [HW,JOY] Analog joystick and gamepad support
323 Specifies type or capabilities of an analog joystick 339 Specifies type or capabilities of an analog joystick
@@ -376,7 +392,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
376 atkbd.softrepeat= [HW] 392 atkbd.softrepeat= [HW]
377 Use software keyboard repeat 393 Use software keyboard repeat
378 394
379 autotest [IA64] 395 autotest [IA-64]
380 396
381 baycom_epp= [HW,AX25] 397 baycom_epp= [HW,AX25]
382 Format: <io>,<mode> 398 Format: <io>,<mode>
@@ -406,7 +422,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
406 bttv.radio= Most important insmod options are available as 422 bttv.radio= Most important insmod options are available as
407 kernel args too. 423 kernel args too.
408 bttv.pll= See Documentation/video4linux/bttv/Insmod-options 424 bttv.pll= See Documentation/video4linux/bttv/Insmod-options
409 bttv.tuner= and Documentation/video4linux/bttv/CARDLIST 425 bttv.tuner=
410 426
411 bulk_remove=off [PPC] This parameter disables the use of the pSeries 427 bulk_remove=off [PPC] This parameter disables the use of the pSeries
412 firmware feature for flushing multiple hpte entries 428 firmware feature for flushing multiple hpte entries
@@ -681,8 +697,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
681 uart[8250],mmio32,<addr>[,options] 697 uart[8250],mmio32,<addr>[,options]
682 Start an early, polled-mode console on the 8250/16550 698 Start an early, polled-mode console on the 8250/16550
683 UART at the specified I/O port or MMIO address. 699 UART at the specified I/O port or MMIO address.
684 MMIO inter-register address stride is either 8bit (mmio) 700 MMIO inter-register address stride is either 8-bit
685 or 32bit (mmio32). 701 (mmio) or 32-bit (mmio32).
686 The options are the same as for ttyS, above. 702 The options are the same as for ttyS, above.
687 703
688 earlyprintk= [X86,SH,BLACKFIN] 704 earlyprintk= [X86,SH,BLACKFIN]
@@ -722,13 +738,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
722 738
723 elevator= [IOSCHED] 739 elevator= [IOSCHED]
724 Format: {"cfq" | "deadline" | "noop"} 740 Format: {"cfq" | "deadline" | "noop"}
725 See Documentation/block/as-iosched.txt and 741 See Documentation/block/cfq-iosched.txt and
726 Documentation/block/deadline-iosched.txt for details. 742 Documentation/block/deadline-iosched.txt for details.
727 743
728 elfcorehdr= [IA64,PPC,SH,X86] 744 elfcorehdr=[size[KMG]@]offset[KMG] [IA64,PPC,SH,X86,S390]
729 Specifies physical address of start of kernel core 745 Specifies physical address of start of kernel core
730 image elf header. Generally kexec loader will 746 image elf header and optionally the size. Generally
731 pass this option to capture kernel. 747 kexec loader will pass this option to capture kernel.
732 See Documentation/kdump/kdump.txt for details. 748 See Documentation/kdump/kdump.txt for details.
733 749
734 enable_mtrr_cleanup [X86] 750 enable_mtrr_cleanup [X86]
@@ -758,12 +774,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
758 This option is obsoleted by the "netdev=" option, which 774 This option is obsoleted by the "netdev=" option, which
759 has equivalent usage. See its documentation for details. 775 has equivalent usage. See its documentation for details.
760 776
777 evm= [EVM]
778 Format: { "fix" }
779 Permit 'security.evm' to be updated regardless of
780 current integrity status.
781
761 failslab= 782 failslab=
762 fail_page_alloc= 783 fail_page_alloc=
763 fail_make_request=[KNL] 784 fail_make_request=[KNL]
764 General fault injection mechanism. 785 General fault injection mechanism.
765 Format: <interval>,<probability>,<space>,<times> 786 Format: <interval>,<probability>,<space>,<times>
766 See also /Documentation/fault-injection/. 787 See also Documentation/fault-injection/.
767 788
768 floppy= [HW] 789 floppy= [HW]
769 See Documentation/blockdev/floppy.txt. 790 See Documentation/blockdev/floppy.txt.
@@ -791,7 +812,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
791 tracer at boot up. function-list is a comma separated 812 tracer at boot up. function-list is a comma separated
792 list of functions. This list can be changed at run 813 list of functions. This list can be changed at run
793 time by the set_ftrace_filter file in the debugfs 814 time by the set_ftrace_filter file in the debugfs
794 tracing directory. 815 tracing directory.
795 816
796 ftrace_notrace=[function-list] 817 ftrace_notrace=[function-list]
797 [FTRACE] Do not trace the functions specified in 818 [FTRACE] Do not trace the functions specified in
@@ -829,7 +850,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
829 850
830 hashdist= [KNL,NUMA] Large hashes allocated during boot 851 hashdist= [KNL,NUMA] Large hashes allocated during boot
831 are distributed across NUMA nodes. Defaults on 852 are distributed across NUMA nodes. Defaults on
832 for 64bit NUMA, off otherwise. 853 for 64-bit NUMA, off otherwise.
833 Format: 0 | 1 (for off | on) 854 Format: 0 | 1 (for off | on)
834 855
835 hcl= [IA-64] SGI's Hardware Graph compatibility layer 856 hcl= [IA-64] SGI's Hardware Graph compatibility layer
@@ -952,6 +973,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
952 ignore_loglevel [KNL] 973 ignore_loglevel [KNL]
953 Ignore loglevel setting - this will print /all/ 974 Ignore loglevel setting - this will print /all/
954 kernel messages to the console. Useful for debugging. 975 kernel messages to the console. Useful for debugging.
976 We also add it as printk module parameter, so users
977 could change it dynamically, usually by
978 /sys/module/printk/parameters/ignore_loglevel.
955 979
956 ihash_entries= [KNL] 980 ihash_entries= [KNL]
957 Set number of hash buckets for inode cache. 981 Set number of hash buckets for inode cache.
@@ -998,10 +1022,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
998 DMA. 1022 DMA.
999 forcedac [x86_64] 1023 forcedac [x86_64]
1000 With this option iommu will not optimize to look 1024 With this option iommu will not optimize to look
1001 for io virtual address below 32 bit forcing dual 1025 for io virtual address below 32-bit forcing dual
1002 address cycle on pci bus for cards supporting greater 1026 address cycle on pci bus for cards supporting greater
1003 than 32 bit addressing. The default is to look 1027 than 32-bit addressing. The default is to look
1004 for translation below 32 bit and if not available 1028 for translation below 32-bit and if not available
1005 then look in the higher range. 1029 then look in the higher range.
1006 strict [Default Off] 1030 strict [Default Off]
1007 With this option on every unmap_single operation will 1031 With this option on every unmap_single operation will
@@ -1012,12 +1036,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1012 has the capability. With this option, super page will 1036 has the capability. With this option, super page will
1013 not be supported. 1037 not be supported.
1014 intremap= [X86-64, Intel-IOMMU] 1038 intremap= [X86-64, Intel-IOMMU]
1015 Format: { on (default) | off | nosid }
1016 on enable Interrupt Remapping (default) 1039 on enable Interrupt Remapping (default)
1017 off disable Interrupt Remapping 1040 off disable Interrupt Remapping
1018 nosid disable Source ID checking 1041 nosid disable Source ID checking
1042 no_x2apic_optout
1043 BIOS x2APIC opt-out request will be ignored
1019 1044
1020 inttest= [IA64] 1045 inttest= [IA-64]
1021 1046
1022 iomem= Disable strict checking of access to MMIO memory 1047 iomem= Disable strict checking of access to MMIO memory
1023 strict regions from userspace. 1048 strict regions from userspace.
@@ -1034,7 +1059,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1034 nomerge 1059 nomerge
1035 forcesac 1060 forcesac
1036 soft 1061 soft
1037 pt [x86, IA64] 1062 pt [x86, IA-64]
1038 1063
1039 io7= [HW] IO7 for Marvel based alpha systems 1064 io7= [HW] IO7 for Marvel based alpha systems
1040 See comment before marvel_specify_io7 in 1065 See comment before marvel_specify_io7 in
@@ -1165,7 +1190,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1165 1190
1166 kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU) 1191 kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU)
1167 for all guests. 1192 for all guests.
1168 Default is 1 (enabled) if in 64bit or 32bit-PAE mode 1193 Default is 1 (enabled) if in 64-bit or 32-bit PAE mode.
1169 1194
1170 kvm-intel.ept= [KVM,Intel] Disable extended page tables 1195 kvm-intel.ept= [KVM,Intel] Disable extended page tables
1171 (virtualized MMU) support on capable Intel chips. 1196 (virtualized MMU) support on capable Intel chips.
@@ -1179,6 +1204,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1179 [KVM,Intel] Disable FlexPriority feature (TPR shadow). 1204 [KVM,Intel] Disable FlexPriority feature (TPR shadow).
1180 Default is 1 (enabled) 1205 Default is 1 (enabled)
1181 1206
1207 kvm-intel.nested=
1208 [KVM,Intel] Enable VMX nesting (nVMX).
1209 Default is 0 (disabled)
1210
1182 kvm-intel.unrestricted_guest= 1211 kvm-intel.unrestricted_guest=
1183 [KVM,Intel] Disable unrestricted guest feature 1212 [KVM,Intel] Disable unrestricted guest feature
1184 (virtualized real and unpaged mode) on capable 1213 (virtualized real and unpaged mode) on capable
@@ -1202,10 +1231,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1202 libata.dma=0 Disable all PATA and SATA DMA 1231 libata.dma=0 Disable all PATA and SATA DMA
1203 libata.dma=1 PATA and SATA Disk DMA only 1232 libata.dma=1 PATA and SATA Disk DMA only
1204 libata.dma=2 ATAPI (CDROM) DMA only 1233 libata.dma=2 ATAPI (CDROM) DMA only
1205 libata.dma=4 Compact Flash DMA only 1234 libata.dma=4 Compact Flash DMA only
1206 Combinations also work, so libata.dma=3 enables DMA 1235 Combinations also work, so libata.dma=3 enables DMA
1207 for disks and CDROMs, but not CFs. 1236 for disks and CDROMs, but not CFs.
1208 1237
1209 libata.ignore_hpa= [LIBATA] Ignore HPA limit 1238 libata.ignore_hpa= [LIBATA] Ignore HPA limit
1210 libata.ignore_hpa=0 keep BIOS limits (default) 1239 libata.ignore_hpa=0 keep BIOS limits (default)
1211 libata.ignore_hpa=1 ignore limits, using full disk 1240 libata.ignore_hpa=1 ignore limits, using full disk
@@ -1331,7 +1360,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1331 ltpc= [NET] 1360 ltpc= [NET]
1332 Format: <io>,<irq>,<dma> 1361 Format: <io>,<irq>,<dma>
1333 1362
1334 machvec= [IA64] Force the use of a particular machine-vector 1363 machvec= [IA-64] Force the use of a particular machine-vector
1335 (machvec) in a generic kernel. 1364 (machvec) in a generic kernel.
1336 Example: machvec=hpzx1_swiotlb 1365 Example: machvec=hpzx1_swiotlb
1337 1366
@@ -1348,9 +1377,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1348 it is equivalent to "nosmp", which also disables 1377 it is equivalent to "nosmp", which also disables
1349 the IO APIC. 1378 the IO APIC.
1350 1379
1351 max_loop= [LOOP] Maximum number of loopback devices that can 1380 max_loop= [LOOP] The number of loop block devices that get
1352 be mounted 1381 (loop.max_loop) unconditionally pre-created at init time. The default
1353 Format: <1-256> 1382 number is configured by BLK_DEV_LOOP_MIN_COUNT. Instead
1383 of statically allocating a predefined number, loop
1384 devices can be requested on-demand with the
1385 /dev/loop-control interface.
1354 1386
1355 mcatest= [IA-64] 1387 mcatest= [IA-64]
1356 1388
@@ -1637,6 +1669,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1637 debugging driver suspend/resume hooks). This may 1669 debugging driver suspend/resume hooks). This may
1638 not work reliably with all consoles, but is known 1670 not work reliably with all consoles, but is known
1639 to work with serial and VGA consoles. 1671 to work with serial and VGA consoles.
1672 To facilitate more flexible debugging, we also add
1673 console_suspend, a printk module parameter to control
1674 it. Users could use console_suspend (usually
1675 /sys/module/printk/parameters/console_suspend) to
1676 turn on/off it dynamically.
1640 1677
1641 noaliencache [MM, NUMA, SLAB] Disables the allocation of alien 1678 noaliencache [MM, NUMA, SLAB] Disables the allocation of alien
1642 caches in the slab allocator. Saves per-node memory, 1679 caches in the slab allocator. Saves per-node memory,
@@ -1734,7 +1771,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1734 1771
1735 nointroute [IA-64] 1772 nointroute [IA-64]
1736 1773
1737 nojitter [IA64] Disables jitter checking for ITC timers. 1774 nojitter [IA-64] Disables jitter checking for ITC timers.
1738 1775
1739 no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver 1776 no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver
1740 1777
@@ -1772,6 +1809,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1772 1809
1773 noresidual [PPC] Don't use residual data on PReP machines. 1810 noresidual [PPC] Don't use residual data on PReP machines.
1774 1811
1812 nordrand [X86] Disable the direct use of the RDRAND
1813 instruction even if it is supported by the
1814 processor. RDRAND is still available to user
1815 space applications.
1816
1775 noresume [SWSUSP] Disables resume and restores original swap 1817 noresume [SWSUSP] Disables resume and restores original swap
1776 space. 1818 space.
1777 1819
@@ -1800,7 +1842,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1800 1842
1801 nox2apic [X86-64,APIC] Do not enable x2APIC mode. 1843 nox2apic [X86-64,APIC] Do not enable x2APIC mode.
1802 1844
1803 nptcg= [IA64] Override max number of concurrent global TLB 1845 nptcg= [IA-64] Override max number of concurrent global TLB
1804 purges which is reported from either PAL_VM_SUMMARY or 1846 purges which is reported from either PAL_VM_SUMMARY or
1805 SAL PALO. 1847 SAL PALO.
1806 1848
@@ -2077,13 +2119,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2077 Format: { parport<nr> | timid | 0 } 2119 Format: { parport<nr> | timid | 0 }
2078 See also Documentation/parport.txt. 2120 See also Documentation/parport.txt.
2079 2121
2080 pmtmr= [X86] Manual setup of pmtmr I/O Port. 2122 pmtmr= [X86] Manual setup of pmtmr I/O Port.
2081 Override pmtimer IOPort with a hex value. 2123 Override pmtimer IOPort with a hex value.
2082 e.g. pmtmr=0x508 2124 e.g. pmtmr=0x508
2083 2125
2084 pnp.debug [PNP] 2126 pnp.debug=1 [PNP]
2085 Enable PNP debug messages. This depends on the 2127 Enable PNP debug messages (depends on the
2086 CONFIG_PNP_DEBUG_MESSAGES option. 2128 CONFIG_PNP_DEBUG_MESSAGES option). Change at run-time
2129 via /sys/module/pnp/parameters/debug. We always show
2130 current resource usage; turning this on also shows
2131 possible settings and some assignment information.
2087 2132
2088 pnpacpi= [ACPI] 2133 pnpacpi= [ACPI]
2089 { off } 2134 { off }
@@ -2232,6 +2277,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2232 in <PAGE_SIZE> units (needed only for swap files). 2277 in <PAGE_SIZE> units (needed only for swap files).
2233 See Documentation/power/swsusp-and-swap-files.txt 2278 See Documentation/power/swsusp-and-swap-files.txt
2234 2279
2280 resumedelay= [HIBERNATION] Delay (in seconds) to pause before attempting to
2281 read the resume files
2282
2283 resumewait [HIBERNATION] Wait (indefinitely) for resume device to show up.
2284 Useful for devices that are detected asynchronously
2285 (e.g. USB and MMC devices).
2286
2235 hibernate= [HIBERNATION] 2287 hibernate= [HIBERNATION]
2236 noresume Don't check if there's a hibernation image 2288 noresume Don't check if there's a hibernation image
2237 present during boot. 2289 present during boot.
@@ -2367,7 +2419,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2367 Format: <integer> 2419 Format: <integer>
2368 2420
2369 sonypi.*= [HW] Sony Programmable I/O Control Device driver 2421 sonypi.*= [HW] Sony Programmable I/O Control Device driver
2370 See Documentation/sonypi.txt 2422 See Documentation/laptops/sonypi.txt
2371 2423
2372 specialix= [HW,SERIAL] Specialix multi-serial port adapter 2424 specialix= [HW,SERIAL] Specialix multi-serial port adapter
2373 See Documentation/serial/specialix.txt. 2425 See Documentation/serial/specialix.txt.
@@ -2635,6 +2687,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2635 medium is write-protected). 2687 medium is write-protected).
2636 Example: quirks=0419:aaf5:rl,0421:0433:rc 2688 Example: quirks=0419:aaf5:rl,0421:0433:rc
2637 2689
2690 user_debug= [KNL,ARM]
2691 Format: <int>
2692 See arch/arm/Kconfig.debug help text.
2693 1 - undefined instruction events
2694 2 - system calls
2695 4 - invalid data aborts
2696 8 - SIGSEGV faults
2697 16 - SIGBUS faults
2698 Example: user_debug=31
2699
2638 userpte= 2700 userpte=
2639 [X86] Flags controlling user PTE allocations. 2701 [X86] Flags controlling user PTE allocations.
2640 2702
@@ -2680,6 +2742,28 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2680 vmpoff= [KNL,S390] Perform z/VM CP command after power off. 2742 vmpoff= [KNL,S390] Perform z/VM CP command after power off.
2681 Format: <command> 2743 Format: <command>
2682 2744
2745 vsyscall= [X86-64]
2746 Controls the behavior of vsyscalls (i.e. calls to
2747 fixed addresses of 0xffffffffff600x00 from legacy
2748 code). Most statically-linked binaries and older
2749 versions of glibc use these calls. Because these
2750 functions are at fixed addresses, they make nice
2751 targets for exploits that can control RIP.
2752
2753 emulate Vsyscalls turn into traps and are emulated
2754 reasonably safely.
2755
2756 native [default] Vsyscalls are native syscall
2757 instructions.
2758 This is a little bit faster than trapping
2759 and makes a few dynamic recompilers work
2760 better than they would in emulation mode.
2761 It also makes exploits much easier to write.
2762
2763 none Vsyscalls don't work at all. This makes
2764 them quite hard to use for exploits but
2765 might break your system.
2766
2683 vt.cur_default= [VT] Default cursor shape. 2767 vt.cur_default= [VT] Default cursor shape.
2684 Format: 0xCCBBAA, where AA, BB, and CC are the same as 2768 Format: 0xCCBBAA, where AA, BB, and CC are the same as
2685 the parameters of the <Esc>[?A;B;Cc escape sequence; 2769 the parameters of the <Esc>[?A;B;Cc escape sequence;
diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt
index 61815483efa3..3ff0dad62d36 100644
--- a/Documentation/laptops/thinkpad-acpi.txt
+++ b/Documentation/laptops/thinkpad-acpi.txt
@@ -736,7 +736,7 @@ status as "unknown". The available commands are:
736sysfs notes: 736sysfs notes:
737 737
738The ThinkLight sysfs interface is documented by the LED class 738The ThinkLight sysfs interface is documented by the LED class
739documentation, in Documentation/leds-class.txt. The ThinkLight LED name 739documentation, in Documentation/leds/leds-class.txt. The ThinkLight LED name
740is "tpacpi::thinklight". 740is "tpacpi::thinklight".
741 741
742Due to limitations in the sysfs LED class, if the status of the ThinkLight 742Due to limitations in the sysfs LED class, if the status of the ThinkLight
@@ -833,7 +833,7 @@ All of the above can be turned on and off and can be made to blink.
833sysfs notes: 833sysfs notes:
834 834
835The ThinkPad LED sysfs interface is described in detail by the LED class 835The ThinkPad LED sysfs interface is described in detail by the LED class
836documentation, in Documentation/leds-class.txt. 836documentation, in Documentation/leds/leds-class.txt.
837 837
838The LEDs are named (in LED ID order, from 0 to 12): 838The LEDs are named (in LED ID order, from 0 to 12):
839"tpacpi::power", "tpacpi:orange:batt", "tpacpi:green:batt", 839"tpacpi::power", "tpacpi:orange:batt", "tpacpi:green:batt",
diff --git a/Documentation/media-framework.txt b/Documentation/media-framework.txt
index 669b5fb03a86..3a0f879533ce 100644
--- a/Documentation/media-framework.txt
+++ b/Documentation/media-framework.txt
@@ -9,8 +9,8 @@ Introduction
9------------ 9------------
10 10
11The media controller API is documented in DocBook format in 11The media controller API is documented in DocBook format in
12Documentation/DocBook/v4l/media-controller.xml. This document will focus on 12Documentation/DocBook/media/v4l/media-controller.xml. This document will focus
13the kernel-side implementation of the media framework. 13on the kernel-side implementation of the media framework.
14 14
15 15
16Abstract media device model 16Abstract media device model
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index f0d3a8026a56..2759f7c188f0 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -438,7 +438,7 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
438 [*] For information on bus mastering DMA and coherency please read: 438 [*] For information on bus mastering DMA and coherency please read:
439 439
440 Documentation/PCI/pci.txt 440 Documentation/PCI/pci.txt
441 Documentation/PCI/PCI-DMA-mapping.txt 441 Documentation/DMA-API-HOWTO.txt
442 Documentation/DMA-API.txt 442 Documentation/DMA-API.txt
443 443
444 444
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
index 4edd78dfb362..bbce1215434a 100644
--- a/Documentation/networking/00-INDEX
+++ b/Documentation/networking/00-INDEX
@@ -1,13 +1,21 @@
100-INDEX 100-INDEX
2 - this file 2 - this file
33c359.txt
4 - information on the 3Com TokenLink Velocity XL (3c5359) driver.
33c505.txt 53c505.txt
4 - information on the 3Com EtherLink Plus (3c505) driver. 6 - information on the 3Com EtherLink Plus (3c505) driver.
73c509.txt
8 - information on the 3Com Etherlink III Series Ethernet cards.
56pack.txt 96pack.txt
6 - info on the 6pack protocol, an alternative to KISS for AX.25 10 - info on the 6pack protocol, an alternative to KISS for AX.25
7DLINK.txt 11DLINK.txt
8 - info on the D-Link DE-600/DE-620 parallel port pocket adapters 12 - info on the D-Link DE-600/DE-620 parallel port pocket adapters
9PLIP.txt 13PLIP.txt
10 - PLIP: The Parallel Line Internet Protocol device driver 14 - PLIP: The Parallel Line Internet Protocol device driver
15README.ipw2100
16 - README for the Intel PRO/Wireless 2100 driver.
17README.ipw2200
18 - README for the Intel PRO/Wireless 2915ABG and 2200BG driver.
11README.sb1000 19README.sb1000
12 - info on General Instrument/NextLevel SURFboard1000 cable modem. 20 - info on General Instrument/NextLevel SURFboard1000 cable modem.
13alias.txt 21alias.txt
@@ -20,8 +28,12 @@ atm.txt
20 - info on where to get ATM programs and support for Linux. 28 - info on where to get ATM programs and support for Linux.
21ax25.txt 29ax25.txt
22 - info on using AX.25 and NET/ROM code for Linux 30 - info on using AX.25 and NET/ROM code for Linux
31batman-adv.txt
32 - B.A.T.M.A.N routing protocol on top of layer 2 Ethernet Frames.
23baycom.txt 33baycom.txt
24 - info on the driver for Baycom style amateur radio modems 34 - info on the driver for Baycom style amateur radio modems
35bonding.txt
36 - Linux Ethernet Bonding Driver HOWTO: link aggregation in Linux.
25bridge.txt 37bridge.txt
26 - where to get user space programs for ethernet bridging with Linux. 38 - where to get user space programs for ethernet bridging with Linux.
27can.txt 39can.txt
@@ -34,32 +46,60 @@ cxacru.txt
34 - Conexant AccessRunner USB ADSL Modem 46 - Conexant AccessRunner USB ADSL Modem
35cxacru-cf.py 47cxacru-cf.py
36 - Conexant AccessRunner USB ADSL Modem configuration file parser 48 - Conexant AccessRunner USB ADSL Modem configuration file parser
49cxgb.txt
50 - Release Notes for the Chelsio N210 Linux device driver.
51dccp.txt
52 - the Datagram Congestion Control Protocol (DCCP) (RFC 4340..42).
37de4x5.txt 53de4x5.txt
38 - the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver 54 - the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver
39decnet.txt 55decnet.txt
40 - info on using the DECnet networking layer in Linux. 56 - info on using the DECnet networking layer in Linux.
41depca.txt 57depca.txt
42 - the Digital DEPCA/EtherWORKS DE1?? and DE2?? LANCE Ethernet driver 58 - the Digital DEPCA/EtherWORKS DE1?? and DE2?? LANCE Ethernet driver
59dl2k.txt
60 - README for D-Link DL2000-based Gigabit Ethernet Adapters (dl2k.ko).
61dm9000.txt
62 - README for the Simtec DM9000 Network driver.
43dmfe.txt 63dmfe.txt
44 - info on the Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver. 64 - info on the Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver.
65dns_resolver.txt
66 - The DNS resolver module allows kernel servies to make DNS queries.
67driver.txt
68 - Softnet driver issues.
45e100.txt 69e100.txt
46 - info on Intel's EtherExpress PRO/100 line of 10/100 boards 70 - info on Intel's EtherExpress PRO/100 line of 10/100 boards
47e1000.txt 71e1000.txt
48 - info on Intel's E1000 line of gigabit ethernet boards 72 - info on Intel's E1000 line of gigabit ethernet boards
73e1000e.txt
74 - README for the Intel Gigabit Ethernet Driver (e1000e).
49eql.txt 75eql.txt
50 - serial IP load balancing 76 - serial IP load balancing
51ewrk3.txt 77ewrk3.txt
52 - the Digital EtherWORKS 3 DE203/4/5 Ethernet driver 78 - the Digital EtherWORKS 3 DE203/4/5 Ethernet driver
79fib_trie.txt
80 - Level Compressed Trie (LC-trie) notes: a structure for routing.
53filter.txt 81filter.txt
54 - Linux Socket Filtering 82 - Linux Socket Filtering
55fore200e.txt 83fore200e.txt
56 - FORE Systems PCA-200E/SBA-200E ATM NIC driver info. 84 - FORE Systems PCA-200E/SBA-200E ATM NIC driver info.
57framerelay.txt 85framerelay.txt
58 - info on using Frame Relay/Data Link Connection Identifier (DLCI). 86 - info on using Frame Relay/Data Link Connection Identifier (DLCI).
87gen_stats.txt
88 - Generic networking statistics for netlink users.
89generic_hdlc.txt
90 - The generic High Level Data Link Control (HDLC) layer.
59generic_netlink.txt 91generic_netlink.txt
60 - info on Generic Netlink 92 - info on Generic Netlink
93gianfar.txt
94 - Gianfar Ethernet Driver.
61ieee802154.txt 95ieee802154.txt
62 - Linux IEEE 802.15.4 implementation, API and drivers 96 - Linux IEEE 802.15.4 implementation, API and drivers
97ifenslave.c
98 - Configure network interfaces for parallel routing (bonding).
99igb.txt
100 - README for the Intel Gigabit Ethernet Driver (igb).
101igbvf.txt
102 - README for the Intel Gigabit Ethernet Driver (igbvf).
63ip-sysctl.txt 103ip-sysctl.txt
64 - /proc/sys/net/ipv4/* variables 104 - /proc/sys/net/ipv4/* variables
65ip_dynaddr.txt 105ip_dynaddr.txt
@@ -68,41 +108,117 @@ ipddp.txt
68 - AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation 108 - AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
69iphase.txt 109iphase.txt
70 - Interphase PCI ATM (i)Chip IA Linux driver info. 110 - Interphase PCI ATM (i)Chip IA Linux driver info.
111ipv6.txt
112 - Options to the ipv6 kernel module.
113ipvs-sysctl.txt
114 - Per-inode explanation of the /proc/sys/net/ipv4/vs interface.
71irda.txt 115irda.txt
72 - where to get IrDA (infrared) utilities and info for Linux. 116 - where to get IrDA (infrared) utilities and info for Linux.
117ixgb.txt
118 - README for the Intel 10 Gigabit Ethernet Driver (ixgb).
119ixgbe.txt
120 - README for the Intel 10 Gigabit Ethernet Driver (ixgbe).
121ixgbevf.txt
122 - README for the Intel Virtual Function (VF) Driver (ixgbevf).
123l2tp.txt
124 - User guide to the L2TP tunnel protocol.
73lapb-module.txt 125lapb-module.txt
74 - programming information of the LAPB module. 126 - programming information of the LAPB module.
75ltpc.txt 127ltpc.txt
76 - the Apple or Farallon LocalTalk PC card driver 128 - the Apple or Farallon LocalTalk PC card driver
129mac80211-injection.txt
130 - HOWTO use packet injection with mac80211
77multicast.txt 131multicast.txt
78 - Behaviour of cards under Multicast 132 - Behaviour of cards under Multicast
133multiqueue.txt
134 - HOWTO for multiqueue network device support.
135netconsole.txt
136 - The network console module netconsole.ko: configuration and notes.
137netdev-features.txt
138 - Network interface features API description.
79netdevices.txt 139netdevices.txt
80 - info on network device driver functions exported to the kernel. 140 - info on network device driver functions exported to the kernel.
141netif-msg.txt
142 - Design of the network interface message level setting (NETIF_MSG_*).
143nfc.txt
144 - The Linux Near Field Communication (NFS) subsystem.
81olympic.txt 145olympic.txt
82 - IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info. 146 - IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info.
147operstates.txt
148 - Overview of network interface operational states.
149packet_mmap.txt
150 - User guide to memory mapped packet socket rings (PACKET_[RT]X_RING).
151phonet.txt
152 - The Phonet packet protocol used in Nokia cellular modems.
153phy.txt
154 - The PHY abstraction layer.
155pktgen.txt
156 - User guide to the kernel packet generator (pktgen.ko).
83policy-routing.txt 157policy-routing.txt
84 - IP policy-based routing 158 - IP policy-based routing
159ppp_generic.txt
160 - Information about the generic PPP driver.
161proc_net_tcp.txt
162 - Per inode overview of the /proc/net/tcp and /proc/net/tcp6 interfaces.
163radiotap-headers.txt
164 - Background on radiotap headers.
85ray_cs.txt 165ray_cs.txt
86 - Raylink Wireless LAN card driver info. 166 - Raylink Wireless LAN card driver info.
167rds.txt
168 - Background on the reliable, ordered datagram delivery method RDS.
169regulatory.txt
170 - Overview of the Linux wireless regulatory infrastructure.
171rxrpc.txt
172 - Guide to the RxRPC protocol.
173s2io.txt
174 - Release notes for Neterion Xframe I/II 10GbE driver.
175scaling.txt
176 - Explanation of network scaling techniques: RSS, RPS, RFS, aRFS, XPS.
177sctp.txt
178 - Notes on the Linux kernel implementation of the SCTP protocol.
179secid.txt
180 - Explanation of the secid member in flow structures.
87skfp.txt 181skfp.txt
88 - SysKonnect FDDI (SK-5xxx, Compaq Netelligent) driver info. 182 - SysKonnect FDDI (SK-5xxx, Compaq Netelligent) driver info.
89smc9.txt 183smc9.txt
90 - the driver for SMC's 9000 series of Ethernet cards 184 - the driver for SMC's 9000 series of Ethernet cards
91smctr.txt 185smctr.txt
92 - SMC TokenCard TokenRing Linux driver info. 186 - SMC TokenCard TokenRing Linux driver info.
187spider-net.txt
188 - README for the Spidernet Driver (as found in PS3 / Cell BE).
189stmmac.txt
190 - README for the STMicro Synopsys Ethernet driver.
191tc-actions-env-rules.txt
192 - rules for traffic control (tc) actions.
193timestamping.txt
194 - overview of network packet timestamping variants.
93tcp.txt 195tcp.txt
94 - short blurb on how TCP output takes place. 196 - short blurb on how TCP output takes place.
197tcp-thin.txt
198 - kernel tuning options for low rate 'thin' TCP streams.
95tlan.txt 199tlan.txt
96 - ThunderLAN (Compaq Netelligent 10/100, Olicom OC-2xxx) driver info. 200 - ThunderLAN (Compaq Netelligent 10/100, Olicom OC-2xxx) driver info.
97tms380tr.txt 201tms380tr.txt
98 - SysKonnect Token Ring ISA/PCI adapter driver info. 202 - SysKonnect Token Ring ISA/PCI adapter driver info.
203tproxy.txt
204 - Transparent proxy support user guide.
99tuntap.txt 205tuntap.txt
100 - TUN/TAP device driver, allowing user space Rx/Tx of packets. 206 - TUN/TAP device driver, allowing user space Rx/Tx of packets.
207udplite.txt
208 - UDP-Lite protocol (RFC 3828) introduction.
101vortex.txt 209vortex.txt
102 - info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards. 210 - info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards.
211vxge.txt
212 - README for the Neterion X3100 PCIe Server Adapter.
103x25.txt 213x25.txt
104 - general info on X.25 development. 214 - general info on X.25 development.
105x25-iface.txt 215x25-iface.txt
106 - description of the X.25 Packet Layer to LAPB device interface. 216 - description of the X.25 Packet Layer to LAPB device interface.
217xfrm_proc.txt
218 - description of the statistics package for XFRM.
219xfrm_sync.txt
220 - sync patches for XFRM enable migration of an SA between hosts.
221xfrm_sysctl.txt
222 - description of the XFRM configuration options.
107z8530drv.txt 223z8530drv.txt
108 - info about Linux driver for Z8530 based HDLC cards for AX.25 224 - info about Linux driver for Z8530 based HDLC cards for AX.25
diff --git a/Documentation/networking/LICENSE.qlcnic b/Documentation/networking/LICENSE.qlcnic
index 29ad4b106420..e7fb2c6023bc 100644
--- a/Documentation/networking/LICENSE.qlcnic
+++ b/Documentation/networking/LICENSE.qlcnic
@@ -1,61 +1,22 @@
1Copyright (c) 2009-2010 QLogic Corporation 1Copyright (c) 2009-2011 QLogic Corporation
2QLogic Linux qlcnic NIC Driver 2QLogic Linux qlcnic NIC Driver
3 3
4This program includes a device driver for Linux 2.6 that may be
5distributed with QLogic hardware specific firmware binary file.
6You may modify and redistribute the device driver code under the 4You may modify and redistribute the device driver code under the
7GNU General Public License (a copy of which is attached hereto as 5GNU General Public License (a copy of which is attached hereto as
8Exhibit A) published by the Free Software Foundation (version 2). 6Exhibit A) published by the Free Software Foundation (version 2).
9 7
10You may redistribute the hardware specific firmware binary file
11under the following terms:
12
13 1. Redistribution of source code (only if applicable),
14 must retain the above copyright notice, this list of
15 conditions and the following disclaimer.
16
17 2. Redistribution in binary form must reproduce the above
18 copyright notice, this list of conditions and the
19 following disclaimer in the documentation and/or other
20 materials provided with the distribution.
21
22 3. The name of QLogic Corporation may not be used to
23 endorse or promote products derived from this software
24 without specific prior written permission
25
26REGARDLESS OF WHAT LICENSING MECHANISM IS USED OR APPLICABLE,
27THIS PROGRAM IS PROVIDED BY QLOGIC CORPORATION "AS IS'' AND ANY
28EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
29IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
30PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR
31BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
32EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
33TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
34DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
35ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
36OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
37OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
38POSSIBILITY OF SUCH DAMAGE.
39
40USER ACKNOWLEDGES AND AGREES THAT USE OF THIS PROGRAM WILL NOT
41CREATE OR GIVE GROUNDS FOR A LICENSE BY IMPLICATION, ESTOPPEL, OR
42OTHERWISE IN ANY INTELLECTUAL PROPERTY RIGHTS (PATENT, COPYRIGHT,
43TRADE SECRET, MASK WORK, OR OTHER PROPRIETARY RIGHT) EMBODIED IN
44ANY OTHER QLOGIC HARDWARE OR SOFTWARE EITHER SOLELY OR IN
45COMBINATION WITH THIS PROGRAM.
46
47 8
48EXHIBIT A 9EXHIBIT A
49 10
50 GNU GENERAL PUBLIC LICENSE 11 GNU GENERAL PUBLIC LICENSE
51 Version 2, June 1991 12 Version 2, June 1991
52 13
53 Copyright (C) 1989, 1991 Free Software Foundation, Inc. 14 Copyright (C) 1989, 1991 Free Software Foundation, Inc.
54 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA 15 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
55 Everyone is permitted to copy and distribute verbatim copies 16 Everyone is permitted to copy and distribute verbatim copies
56 of this license document, but changing it is not allowed. 17 of this license document, but changing it is not allowed.
57 18
58 Preamble 19 Preamble
59 20
60 The licenses for most software are designed to take away your 21 The licenses for most software are designed to take away your
61freedom to share and change it. By contrast, the GNU General Public 22freedom to share and change it. By contrast, the GNU General Public
@@ -105,7 +66,7 @@ patent must be licensed for everyone's free use or not licensed at all.
105 The precise terms and conditions for copying, distribution and 66 The precise terms and conditions for copying, distribution and
106modification follow. 67modification follow.
107 68
108 GNU GENERAL PUBLIC LICENSE 69 GNU GENERAL PUBLIC LICENSE
109 TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 70 TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
110 71
111 0. This License applies to any program or other work which contains 72 0. This License applies to any program or other work which contains
@@ -304,7 +265,7 @@ make exceptions for this. Our decision will be guided by the two goals
304of preserving the free status of all derivatives of our free software and 265of preserving the free status of all derivatives of our free software and
305of promoting the sharing and reuse of software generally. 266of promoting the sharing and reuse of software generally.
306 267
307 NO WARRANTY 268 NO WARRANTY
308 269
309 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY 270 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
310FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN 271FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
diff --git a/Documentation/networking/batman-adv.txt b/Documentation/networking/batman-adv.txt
index 88d4afbdef98..c86d03f18a5b 100644
--- a/Documentation/networking/batman-adv.txt
+++ b/Documentation/networking/batman-adv.txt
@@ -1,4 +1,4 @@
1[state: 17-04-2011] 1[state: 21-08-2011]
2 2
3BATMAN-ADV 3BATMAN-ADV
4---------- 4----------
@@ -68,9 +68,9 @@ All mesh wide settings can be found in batman's own interface
68folder: 68folder:
69 69
70# ls /sys/class/net/bat0/mesh/ 70# ls /sys/class/net/bat0/mesh/
71# aggregated_ogms gw_bandwidth hop_penalty 71# aggregated_ogms fragmentation gw_sel_class vis_mode
72# bonding gw_mode orig_interval 72# ap_isolation gw_bandwidth hop_penalty
73# fragmentation gw_sel_class vis_mode 73# bonding gw_mode orig_interval
74 74
75 75
76There is a special folder for debugging information: 76There is a special folder for debugging information:
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 5dd960d75174..91df678fb7f8 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -238,6 +238,18 @@ ad_select
238 238
239 This option was added in bonding version 3.4.0. 239 This option was added in bonding version 3.4.0.
240 240
241all_slaves_active
242
243 Specifies that duplicate frames (received on inactive ports) should be
244 dropped (0) or delivered (1).
245
246 Normally, bonding will drop duplicate frames (received on inactive
247 ports), which is desirable for most users. But there are some times
248 it is nice to allow duplicate frames to be delivered.
249
250 The default value is 0 (drop duplicate frames received on inactive
251 ports).
252
241arp_interval 253arp_interval
242 254
243 Specifies the ARP link monitoring frequency in milliseconds. 255 Specifies the ARP link monitoring frequency in milliseconds.
@@ -433,6 +445,23 @@ miimon
433 determined. See the High Availability section for additional 445 determined. See the High Availability section for additional
434 information. The default value is 0. 446 information. The default value is 0.
435 447
448min_links
449
450 Specifies the minimum number of links that must be active before
451 asserting carrier. It is similar to the Cisco EtherChannel min-links
452 feature. This allows setting the minimum number of member ports that
453 must be up (link-up state) before marking the bond device as up
454 (carrier on). This is useful for situations where higher level services
455 such as clustering want to ensure a minimum number of low bandwidth
456 links are active before switchover. This option only affect 802.3ad
457 mode.
458
459 The default value is 0. This will cause carrier to be asserted (for
460 802.3ad mode) whenever there is an active aggregator, regardless of the
461 number of available links in that aggregator. Note that, because an
462 aggregator cannot be active without at least one available link,
463 setting this option to 0 or to 1 has the exact same effect.
464
436mode 465mode
437 466
438 Specifies one of the bonding policies. The default is 467 Specifies one of the bonding policies. The default is
diff --git a/Documentation/networking/dmfe.txt b/Documentation/networking/dmfe.txt
index 8006c227fda2..25320bf19c86 100644
--- a/Documentation/networking/dmfe.txt
+++ b/Documentation/networking/dmfe.txt
@@ -1,3 +1,5 @@
1Note: This driver doesn't have a maintainer.
2
1Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux. 3Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux.
2 4
3This program is free software; you can redistribute it and/or 5This program is free software; you can redistribute it and/or
@@ -55,7 +57,6 @@ Test and make sure PCI latency is now correct for all cases.
55Authors: 57Authors:
56 58
57Sten Wang <sten_wang@davicom.com.tw > : Original Author 59Sten Wang <sten_wang@davicom.com.tw > : Original Author
58Tobias Ringstrom <tori@unhappy.mine.nu> : Current Maintainer
59 60
60Contributors: 61Contributors:
61 62
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index db2a4067013c..cb7f3148035d 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -992,7 +992,7 @@ bindv6only - BOOLEAN
992 TRUE: disable IPv4-mapped address feature 992 TRUE: disable IPv4-mapped address feature
993 FALSE: enable IPv4-mapped address feature 993 FALSE: enable IPv4-mapped address feature
994 994
995 Default: FALSE (as specified in RFC2553bis) 995 Default: FALSE (as specified in RFC3493)
996 996
997IPv6 Fragmentation: 997IPv6 Fragmentation:
998 998
@@ -1042,9 +1042,14 @@ conf/interface/*:
1042 The functional behaviour for certain settings is different 1042 The functional behaviour for certain settings is different
1043 depending on whether local forwarding is enabled or not. 1043 depending on whether local forwarding is enabled or not.
1044 1044
1045accept_ra - BOOLEAN 1045accept_ra - INTEGER
1046 Accept Router Advertisements; autoconfigure using them. 1046 Accept Router Advertisements; autoconfigure using them.
1047 1047
1048 It also determines whether or not to transmit Router
1049 Solicitations. If and only if the functional setting is to
1050 accept Router Advertisements, Router Solicitations will be
1051 transmitted.
1052
1048 Possible values are: 1053 Possible values are:
1049 0 Do not accept Router Advertisements. 1054 0 Do not accept Router Advertisements.
1050 1 Accept Router Advertisements if forwarding is disabled. 1055 1 Accept Router Advertisements if forwarding is disabled.
@@ -1106,7 +1111,7 @@ dad_transmits - INTEGER
1106 The amount of Duplicate Address Detection probes to send. 1111 The amount of Duplicate Address Detection probes to send.
1107 Default: 1 1112 Default: 1
1108 1113
1109forwarding - BOOLEAN 1114forwarding - INTEGER
1110 Configure interface-specific Host/Router behaviour. 1115 Configure interface-specific Host/Router behaviour.
1111 1116
1112 Note: It is recommended to have the same setting on all 1117 Note: It is recommended to have the same setting on all
@@ -1115,14 +1120,14 @@ forwarding - BOOLEAN
1115 Possible values are: 1120 Possible values are:
1116 0 Forwarding disabled 1121 0 Forwarding disabled
1117 1 Forwarding enabled 1122 1 Forwarding enabled
1118 2 Forwarding enabled (Hybrid Mode)
1119 1123
1120 FALSE (0): 1124 FALSE (0):
1121 1125
1122 By default, Host behaviour is assumed. This means: 1126 By default, Host behaviour is assumed. This means:
1123 1127
1124 1. IsRouter flag is not set in Neighbour Advertisements. 1128 1. IsRouter flag is not set in Neighbour Advertisements.
1125 2. Router Solicitations are being sent when necessary. 1129 2. If accept_ra is TRUE (default), transmit Router
1130 Solicitations.
1126 3. If accept_ra is TRUE (default), accept Router 1131 3. If accept_ra is TRUE (default), accept Router
1127 Advertisements (and do autoconfiguration). 1132 Advertisements (and do autoconfiguration).
1128 4. If accept_redirects is TRUE (default), accept Redirects. 1133 4. If accept_redirects is TRUE (default), accept Redirects.
@@ -1133,16 +1138,10 @@ forwarding - BOOLEAN
1133 This means exactly the reverse from the above: 1138 This means exactly the reverse from the above:
1134 1139
1135 1. IsRouter flag is set in Neighbour Advertisements. 1140 1. IsRouter flag is set in Neighbour Advertisements.
1136 2. Router Solicitations are not sent. 1141 2. Router Solicitations are not sent unless accept_ra is 2.
1137 3. Router Advertisements are ignored unless accept_ra is 2. 1142 3. Router Advertisements are ignored unless accept_ra is 2.
1138 4. Redirects are ignored. 1143 4. Redirects are ignored.
1139 1144
1140 TRUE (2):
1141
1142 Hybrid mode. Same behaviour as TRUE, except for:
1143
1144 2. Router Solicitations are being sent when necessary.
1145
1146 Default: 0 (disabled) if global forwarding is disabled (default), 1145 Default: 0 (disabled) if global forwarding is disabled (default),
1147 otherwise 1 (enabled). 1146 otherwise 1 (enabled).
1148 1147
diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
index 4ccdbca03811..f2a2488f1bf3 100644
--- a/Documentation/networking/ipvs-sysctl.txt
+++ b/Documentation/networking/ipvs-sysctl.txt
@@ -15,6 +15,23 @@ amemthresh - INTEGER
15 enabled and the variable is automatically set to 2, otherwise 15 enabled and the variable is automatically set to 2, otherwise
16 the strategy is disabled and the variable is set to 1. 16 the strategy is disabled and the variable is set to 1.
17 17
18conntrack - BOOLEAN
19 0 - disabled (default)
20 not 0 - enabled
21
22 If set, maintain connection tracking entries for
23 connections handled by IPVS.
24
25 This should be enabled if connections handled by IPVS are to be
26 also handled by stateful firewall rules. That is, iptables rules
27 that make use of connection tracking. It is a performance
28 optimisation to disable this setting otherwise.
29
30 Connections handled by the IPVS FTP application module
31 will have connection tracking entries regardless of this setting.
32
33 Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.
34
18cache_bypass - BOOLEAN 35cache_bypass - BOOLEAN
19 0 - disabled (default) 36 0 - disabled (default)
20 not 0 - enabled 37 not 0 - enabled
@@ -39,7 +56,7 @@ debug_level - INTEGER
39 11 - IPVS packet handling (ip_vs_in/ip_vs_out) 56 11 - IPVS packet handling (ip_vs_in/ip_vs_out)
40 12 or more - packet traversal 57 12 or more - packet traversal
41 58
42 Only available when IPVS is compiled with the CONFIG_IPVS_DEBUG 59 Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.
43 60
44 Higher debugging levels include the messages for lower debugging 61 Higher debugging levels include the messages for lower debugging
45 levels, so setting debug level 2, includes level 0, 1 and 2 62 levels, so setting debug level 2, includes level 0, 1 and 2
@@ -123,13 +140,11 @@ nat_icmp_send - BOOLEAN
123secure_tcp - INTEGER 140secure_tcp - INTEGER
124 0 - disabled (default) 141 0 - disabled (default)
125 142
126 The secure_tcp defense is to use a more complicated state 143 The secure_tcp defense is to use a more complicated TCP state
127 transition table and some possible short timeouts of each 144 transition table. For VS/NAT, it also delays entering the
128 state. In the VS/NAT, it delays the entering the ESTABLISHED 145 TCP ESTABLISHED state until the three way handshake is completed.
129 until the real server starts to send data and ACK packet
130 (after 3-way handshake).
131 146
132 The value definition is the same as that of drop_entry or 147 The value definition is the same as that of drop_entry and
133 drop_packet. 148 drop_packet.
134 149
135sync_threshold - INTEGER 150sync_threshold - INTEGER
@@ -141,3 +156,36 @@ sync_threshold - INTEGER
141 synchronized, every time the number of its incoming packets 156 synchronized, every time the number of its incoming packets
142 modulus 50 equals the threshold. The range of the threshold is 157 modulus 50 equals the threshold. The range of the threshold is
143 from 0 to 49. 158 from 0 to 49.
159
160snat_reroute - BOOLEAN
161 0 - disabled
162 not 0 - enabled (default)
163
164 If enabled, recalculate the route of SNATed packets from
165 realservers so that they are routed as if they originate from the
166 director. Otherwise they are routed as if they are forwarded by the
167 director.
168
169 If policy routing is in effect then it is possible that the route
170 of a packet originating from a director is routed differently to a
171 packet being forwarded by the director.
172
173 If policy routing is not in effect then the recalculated route will
174 always be the same as the original route so it is an optimisation
175 to disable snat_reroute and avoid the recalculation.
176
177sync_version - INTEGER
178 default 1
179
180 The version of the synchronisation protocol used when sending
181 synchronisation messages.
182
183 0 selects the original synchronisation protocol (version 0). This
184 should be used when sending synchronisation messages to a legacy
185 system that only understands the original synchronisation protocol.
186
187 1 selects the current synchronisation protocol (version 1). This
188 should be used where possible.
189
190 Kernels with this sync_version entry are able to receive messages
191 of both version 1 and version 2 of the synchronisation protocol.
diff --git a/Documentation/networking/mac80211-injection.txt b/Documentation/networking/mac80211-injection.txt
index b30e81ad5307..3a930072b161 100644
--- a/Documentation/networking/mac80211-injection.txt
+++ b/Documentation/networking/mac80211-injection.txt
@@ -23,6 +23,10 @@ radiotap headers and used to control injection:
23 IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the 23 IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
24 current fragmentation threshold. 24 current fragmentation threshold.
25 25
26 * IEEE80211_RADIOTAP_TX_FLAGS
27
28 IEEE80211_RADIOTAP_F_TX_NOACK: frame should be sent without waiting for
29 an ACK even if it is a unicast frame
26 30
27The injection code can also skip all other currently defined radiotap fields 31The injection code can also skip all other currently defined radiotap fields
28facilitating replay of captured radiotap headers directly. 32facilitating replay of captured radiotap headers directly.
diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt
index 87b3d15f523a..89358341682a 100644
--- a/Documentation/networking/netdevices.txt
+++ b/Documentation/networking/netdevices.txt
@@ -73,7 +73,7 @@ dev->hard_start_xmit:
73 has to lock by itself when needed. It is recommended to use a try lock 73 has to lock by itself when needed. It is recommended to use a try lock
74 for this and return NETDEV_TX_LOCKED when the spin lock fails. 74 for this and return NETDEV_TX_LOCKED when the spin lock fails.
75 The locking there should also properly protect against 75 The locking there should also properly protect against
76 set_multicast_list. Note that the use of NETIF_F_LLTX is deprecated. 76 set_rx_mode. Note that the use of NETIF_F_LLTX is deprecated.
77 Don't use it for new drivers. 77 Don't use it for new drivers.
78 78
79 Context: Process with BHs disabled or BH (timer), 79 Context: Process with BHs disabled or BH (timer),
@@ -92,7 +92,7 @@ dev->tx_timeout:
92 Context: BHs disabled 92 Context: BHs disabled
93 Notes: netif_queue_stopped() is guaranteed true 93 Notes: netif_queue_stopped() is guaranteed true
94 94
95dev->set_multicast_list: 95dev->set_rx_mode:
96 Synchronization: netif_tx_lock spinlock. 96 Synchronization: netif_tx_lock spinlock.
97 Context: BHs disabled 97 Context: BHs disabled
98 98
diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
new file mode 100644
index 000000000000..a177de21d28e
--- /dev/null
+++ b/Documentation/networking/scaling.txt
@@ -0,0 +1,378 @@
1Scaling in the Linux Networking Stack
2
3
4Introduction
5============
6
7This document describes a set of complementary techniques in the Linux
8networking stack to increase parallelism and improve performance for
9multi-processor systems.
10
11The following technologies are described:
12
13 RSS: Receive Side Scaling
14 RPS: Receive Packet Steering
15 RFS: Receive Flow Steering
16 Accelerated Receive Flow Steering
17 XPS: Transmit Packet Steering
18
19
20RSS: Receive Side Scaling
21=========================
22
23Contemporary NICs support multiple receive and transmit descriptor queues
24(multi-queue). On reception, a NIC can send different packets to different
25queues to distribute processing among CPUs. The NIC distributes packets by
26applying a filter to each packet that assigns it to one of a small number
27of logical flows. Packets for each flow are steered to a separate receive
28queue, which in turn can be processed by separate CPUs. This mechanism is
29generally known as 鈥淩eceive-side Scaling鈥 (RSS). The goal of RSS and
30the other scaling techniques is to increase performance uniformly.
31Multi-queue distribution can also be used for traffic prioritization, but
32that is not the focus of these techniques.
33
34The filter used in RSS is typically a hash function over the network
35and/or transport layer headers-- for example, a 4-tuple hash over
36IP addresses and TCP ports of a packet. The most common hardware
37implementation of RSS uses a 128-entry indirection table where each entry
38stores a queue number. The receive queue for a packet is determined
39by masking out the low order seven bits of the computed hash for the
40packet (usually a Toeplitz hash), taking this number as a key into the
41indirection table and reading the corresponding value.
42
43Some advanced NICs allow steering packets to queues based on
44programmable filters. For example, webserver bound TCP port 80 packets
45can be directed to their own receive queue. Such 鈥渘-tuple鈥 filters can
46be configured from ethtool (--config-ntuple).
47
48==== RSS Configuration
49
50The driver for a multi-queue capable NIC typically provides a kernel
51module parameter for specifying the number of hardware queues to
52configure. In the bnx2x driver, for instance, this parameter is called
53num_queues. A typical RSS configuration would be to have one receive queue
54for each CPU if the device supports enough queues, or otherwise at least
55one for each memory domain, where a memory domain is a set of CPUs that
56share a particular memory level (L1, L2, NUMA node, etc.).
57
58The indirection table of an RSS device, which resolves a queue by masked
59hash, is usually programmed by the driver at initialization. The
60default mapping is to distribute the queues evenly in the table, but the
61indirection table can be retrieved and modified at runtime using ethtool
62commands (--show-rxfh-indir and --set-rxfh-indir). Modifying the
63indirection table could be done to give different queues different
64relative weights.
65
66== RSS IRQ Configuration
67
68Each receive queue has a separate IRQ associated with it. The NIC triggers
69this to notify a CPU when new packets arrive on the given queue. The
70signaling path for PCIe devices uses message signaled interrupts (MSI-X),
71that can route each interrupt to a particular CPU. The active mapping
72of queues to IRQs can be determined from /proc/interrupts. By default,
73an IRQ may be handled on any CPU. Because a non-negligible part of packet
74processing takes place in receive interrupt handling, it is advantageous
75to spread receive interrupts between CPUs. To manually adjust the IRQ
76affinity of each interrupt see Documentation/IRQ-affinity.txt. Some systems
77will be running irqbalance, a daemon that dynamically optimizes IRQ
78assignments and as a result may override any manual settings.
79
80== Suggested Configuration
81
82RSS should be enabled when latency is a concern or whenever receive
83interrupt processing forms a bottleneck. Spreading load between CPUs
84decreases queue length. For low latency networking, the optimal setting
85is to allocate as many queues as there are CPUs in the system (or the
86NIC maximum, if lower). The most efficient high-rate configuration
87is likely the one with the smallest number of receive queues where no
88receive queue overflows due to a saturated CPU, because in default
89mode with interrupt coalescing enabled, the aggregate number of
90interrupts (and thus work) grows with each additional queue.
91
92Per-cpu load can be observed using the mpstat utility, but note that on
93processors with hyperthreading (HT), each hyperthread is represented as
94a separate CPU. For interrupt handling, HT has shown no benefit in
95initial tests, so limit the number of queues to the number of CPU cores
96in the system.
97
98
99RPS: Receive Packet Steering
100============================
101
102Receive Packet Steering (RPS) is logically a software implementation of
103RSS. Being in software, it is necessarily called later in the datapath.
104Whereas RSS selects the queue and hence CPU that will run the hardware
105interrupt handler, RPS selects the CPU to perform protocol processing
106above the interrupt handler. This is accomplished by placing the packet
107on the desired CPU鈥檚 backlog queue and waking up the CPU for processing.
108RPS has some advantages over RSS: 1) it can be used with any NIC,
1092) software filters can easily be added to hash over new protocols,
1103) it does not increase hardware device interrupt rate (although it does
111introduce inter-processor interrupts (IPIs)).
112
113RPS is called during bottom half of the receive interrupt handler, when
114a driver sends a packet up the network stack with netif_rx() or
115netif_receive_skb(). These call the get_rps_cpu() function, which
116selects the queue that should process a packet.
117
118The first step in determining the target CPU for RPS is to calculate a
119flow hash over the packet鈥檚 addresses or ports (2-tuple or 4-tuple hash
120depending on the protocol). This serves as a consistent hash of the
121associated flow of the packet. The hash is either provided by hardware
122or will be computed in the stack. Capable hardware can pass the hash in
123the receive descriptor for the packet; this would usually be the same
124hash used for RSS (e.g. computed Toeplitz hash). The hash is saved in
125skb->rx_hash and can be used elsewhere in the stack as a hash of the
126packet鈥檚 flow.
127
128Each receive hardware queue has an associated list of CPUs to which
129RPS may enqueue packets for processing. For each received packet,
130an index into the list is computed from the flow hash modulo the size
131of the list. The indexed CPU is the target for processing the packet,
132and the packet is queued to the tail of that CPU鈥檚 backlog queue. At
133the end of the bottom half routine, IPIs are sent to any CPUs for which
134packets have been queued to their backlog queue. The IPI wakes backlog
135processing on the remote CPU, and any queued packets are then processed
136up the networking stack.
137
138==== RPS Configuration
139
140RPS requires a kernel compiled with the CONFIG_RPS kconfig symbol (on
141by default for SMP). Even when compiled in, RPS remains disabled until
142explicitly configured. The list of CPUs to which RPS may forward traffic
143can be configured for each receive queue using a sysfs file entry:
144
145 /sys/class/net/<dev>/queues/rx-<n>/rps_cpus
146
147This file implements a bitmap of CPUs. RPS is disabled when it is zero
148(the default), in which case packets are processed on the interrupting
149CPU. Documentation/IRQ-affinity.txt explains how CPUs are assigned to
150the bitmap.
151
152== Suggested Configuration
153
154For a single queue device, a typical RPS configuration would be to set
155the rps_cpus to the CPUs in the same memory domain of the interrupting
156CPU. If NUMA locality is not an issue, this could also be all CPUs in
157the system. At high interrupt rate, it might be wise to exclude the
158interrupting CPU from the map since that already performs much work.
159
160For a multi-queue system, if RSS is configured so that a hardware
161receive queue is mapped to each CPU, then RPS is probably redundant
162and unnecessary. If there are fewer hardware queues than CPUs, then
163RPS might be beneficial if the rps_cpus for each queue are the ones that
164share the same memory domain as the interrupting CPU for that queue.
165
166
167RFS: Receive Flow Steering
168==========================
169
170While RPS steers packets solely based on hash, and thus generally
171provides good load distribution, it does not take into account
172application locality. This is accomplished by Receive Flow Steering
173(RFS). The goal of RFS is to increase datacache hitrate by steering
174kernel processing of packets to the CPU where the application thread
175consuming the packet is running. RFS relies on the same RPS mechanisms
176to enqueue packets onto the backlog of another CPU and to wake up that
177CPU.
178
179In RFS, packets are not forwarded directly by the value of their hash,
180but the hash is used as index into a flow lookup table. This table maps
181flows to the CPUs where those flows are being processed. The flow hash
182(see RPS section above) is used to calculate the index into this table.
183The CPU recorded in each entry is the one which last processed the flow.
184If an entry does not hold a valid CPU, then packets mapped to that entry
185are steered using plain RPS. Multiple table entries may point to the
186same CPU. Indeed, with many flows and few CPUs, it is very likely that
187a single application thread handles flows with many different flow hashes.
188
189rps_sock_flow_table is a global flow table that contains the *desired* CPU
190for flows: the CPU that is currently processing the flow in userspace.
191Each table value is a CPU index that is updated during calls to recvmsg
192and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
193and tcp_splice_read()).
194
195When the scheduler moves a thread to a new CPU while it has outstanding
196receive packets on the old CPU, packets may arrive out of order. To
197avoid this, RFS uses a second flow table to track outstanding packets
198for each flow: rps_dev_flow_table is a table specific to each hardware
199receive queue of each device. Each table value stores a CPU index and a
200counter. The CPU index represents the *current* CPU onto which packets
201for this flow are enqueued for further kernel processing. Ideally, kernel
202and userspace processing occur on the same CPU, and hence the CPU index
203in both tables is identical. This is likely false if the scheduler has
204recently migrated a userspace thread while the kernel still has packets
205enqueued for kernel processing on the old CPU.
206
207The counter in rps_dev_flow_table values records the length of the current
208CPU's backlog when a packet in this flow was last enqueued. Each backlog
209queue has a head counter that is incremented on dequeue. A tail counter
210is computed as head counter + queue length. In other words, the counter
211in rps_dev_flow_table[i] records the last element in flow i that has
212been enqueued onto the currently designated CPU for flow i (of course,
213entry i is actually selected by hash and multiple flows may hash to the
214same entry i).
215
216And now the trick for avoiding out of order packets: when selecting the
217CPU for packet processing (from get_rps_cpu()) the rps_sock_flow table
218and the rps_dev_flow table of the queue that the packet was received on
219are compared. If the desired CPU for the flow (found in the
220rps_sock_flow table) matches the current CPU (found in the rps_dev_flow
221table), the packet is enqueued onto that CPU鈥檚 backlog. If they differ,
222the current CPU is updated to match the desired CPU if one of the
223following is true:
224
225- The current CPU's queue head counter >= the recorded tail counter
226 value in rps_dev_flow[i]
227- The current CPU is unset (equal to NR_CPUS)
228- The current CPU is offline
229
230After this check, the packet is sent to the (possibly updated) current
231CPU. These rules aim to ensure that a flow only moves to a new CPU when
232there are no packets outstanding on the old CPU, as the outstanding
233packets could arrive later than those about to be processed on the new
234CPU.
235
236==== RFS Configuration
237
238RFS is only available if the kconfig symbol CONFIG_RFS is enabled (on
239by default for SMP). The functionality remains disabled until explicitly
240configured. The number of entries in the global flow table is set through:
241
242 /proc/sys/net/core/rps_sock_flow_entries
243
244The number of entries in the per-queue flow table are set through:
245
246 /sys/class/net/<dev>/queues/rx-<n>/rps_flow_cnt
247
248== Suggested Configuration
249
250Both of these need to be set before RFS is enabled for a receive queue.
251Values for both are rounded up to the nearest power of two. The
252suggested flow count depends on the expected number of active connections
253at any given time, which may be significantly less than the number of open
254connections. We have found that a value of 32768 for rps_sock_flow_entries
255works fairly well on a moderately loaded server.
256
257For a single queue device, the rps_flow_cnt value for the single queue
258would normally be configured to the same value as rps_sock_flow_entries.
259For a multi-queue device, the rps_flow_cnt for each queue might be
260configured as rps_sock_flow_entries / N, where N is the number of
261queues. So for instance, if rps_flow_entries is set to 32768 and there
262are 16 configured receive queues, rps_flow_cnt for each queue might be
263configured as 2048.
264
265
266Accelerated RFS
267===============
268
269Accelerated RFS is to RFS what RSS is to RPS: a hardware-accelerated load
270balancing mechanism that uses soft state to steer flows based on where
271the application thread consuming the packets of each flow is running.
272Accelerated RFS should perform better than RFS since packets are sent
273directly to a CPU local to the thread consuming the data. The target CPU
274will either be the same CPU where the application runs, or at least a CPU
275which is local to the application thread鈥檚 CPU in the cache hierarchy.
276
277To enable accelerated RFS, the networking stack calls the
278ndo_rx_flow_steer driver function to communicate the desired hardware
279queue for packets matching a particular flow. The network stack
280automatically calls this function every time a flow entry in
281rps_dev_flow_table is updated. The driver in turn uses a device specific
282method to program the NIC to steer the packets.
283
284The hardware queue for a flow is derived from the CPU recorded in
285rps_dev_flow_table. The stack consults a CPU to hardware queue map which
286is maintained by the NIC driver. This is an auto-generated reverse map of
287the IRQ affinity table shown by /proc/interrupts. Drivers can use
288functions in the cpu_rmap (鈥淐PU affinity reverse map鈥) kernel library
289to populate the map. For each CPU, the corresponding queue in the map is
290set to be one whose processing CPU is closest in cache locality.
291
292==== Accelerated RFS Configuration
293
294Accelerated RFS is only available if the kernel is compiled with
295CONFIG_RFS_ACCEL and support is provided by the NIC device and driver.
296It also requires that ntuple filtering is enabled via ethtool. The map
297of CPU to queues is automatically deduced from the IRQ affinities
298configured for each receive queue by the driver, so no additional
299configuration should be necessary.
300
301== Suggested Configuration
302
303This technique should be enabled whenever one wants to use RFS and the
304NIC supports hardware acceleration.
305
306XPS: Transmit Packet Steering
307=============================
308
309Transmit Packet Steering is a mechanism for intelligently selecting
310which transmit queue to use when transmitting a packet on a multi-queue
311device. To accomplish this, a mapping from CPU to hardware queue(s) is
312recorded. The goal of this mapping is usually to assign queues
313exclusively to a subset of CPUs, where the transmit completions for
314these queues are processed on a CPU within this set. This choice
315provides two benefits. First, contention on the device queue lock is
316significantly reduced since fewer CPUs contend for the same queue
317(contention can be eliminated completely if each CPU has its own
318transmit queue). Secondly, cache miss rate on transmit completion is
319reduced, in particular for data cache lines that hold the sk_buff
320structures.
321
322XPS is configured per transmit queue by setting a bitmap of CPUs that
323may use that queue to transmit. The reverse mapping, from CPUs to
324transmit queues, is computed and maintained for each network device.
325When transmitting the first packet in a flow, the function
326get_xps_queue() is called to select a queue. This function uses the ID
327of the running CPU as a key into the CPU-to-queue lookup table. If the
328ID matches a single queue, that is used for transmission. If multiple
329queues match, one is selected by using the flow hash to compute an index
330into the set.
331
332The queue chosen for transmitting a particular flow is saved in the
333corresponding socket structure for the flow (e.g. a TCP connection).
334This transmit queue is used for subsequent packets sent on the flow to
335prevent out of order (ooo) packets. The choice also amortizes the cost
336of calling get_xps_queues() over all packets in the flow. To avoid
337ooo packets, the queue for a flow can subsequently only be changed if
338skb->ooo_okay is set for a packet in the flow. This flag indicates that
339there are no outstanding packets in the flow, so the transmit queue can
340change without the risk of generating out of order packets. The
341transport layer is responsible for setting ooo_okay appropriately. TCP,
342for instance, sets the flag when all data for a connection has been
343acknowledged.
344
345==== XPS Configuration
346
347XPS is only available if the kconfig symbol CONFIG_XPS is enabled (on by
348default for SMP). The functionality remains disabled until explicitly
349configured. To enable XPS, the bitmap of CPUs that may use a transmit
350queue is configured using the sysfs file entry:
351
352/sys/class/net/<dev>/queues/tx-<n>/xps_cpus
353
354== Suggested Configuration
355
356For a network device with a single transmission queue, XPS configuration
357has no effect, since there is no choice in this case. In a multi-queue
358system, XPS is preferably configured so that each CPU maps onto one queue.
359If there are as many queues as there are CPUs in the system, then each
360queue can also map onto one CPU, resulting in exclusive pairings that
361experience no contention. If there are fewer queues than CPUs, then the
362best CPUs to share a given queue are probably those that share the cache
363with the CPU that processes transmit completions for that queue
364(transmit interrupts).
365
366
367Further Information
368===================
369RPS and RFS were introduced in kernel 2.6.35. XPS was incorporated into
3702.6.38. Original patches were submitted by Tom Herbert
371(therbert@google.com)
372
373Accelerated RFS was introduced in 2.6.35. Original patches were
374submitted by Ben Hutchings (bhutchings@solarflare.com)
375
376Authors:
377Tom Herbert (therbert@google.com)
378Willem de Bruijn (willemb@google.com)
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index 57a24108b845..8d67980fabe8 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -76,7 +76,16 @@ core.
76 76
774.5) DMA descriptors 774.5) DMA descriptors
78Driver handles both normal and enhanced descriptors. The latter has been only 78Driver handles both normal and enhanced descriptors. The latter has been only
79tested on DWC Ether MAC 10/100/1000 Universal version 3.41a. 79tested on DWC Ether MAC 10/100/1000 Universal version 3.41a and later.
80
81STMMAC supports DMA descriptor to operate both in dual buffer (RING)
82and linked-list(CHAINED) mode. In RING each descriptor points to two
83data buffer pointers whereas in CHAINED mode they point to only one data
84buffer pointer. RING mode is the default.
85
86In CHAINED mode each descriptor will have pointer to next descriptor in
87the list, hence creating the explicit chaining in the descriptor itself,
88whereas such explicit chaining is not possible in RING mode.
80 89
814.6) Ethtool support 904.6) Ethtool support
82Ethtool is supported. Driver statistics and internal errors can be taken using: 91Ethtool is supported. Driver statistics and internal errors can be taken using:
@@ -235,7 +244,38 @@ reset procedure etc).
235 o enh_desc.c: functions for handling enhanced descriptors 244 o enh_desc.c: functions for handling enhanced descriptors
236 o norm_desc.c: functions for handling normal descriptors 245 o norm_desc.c: functions for handling normal descriptors
237 246
2385) TODO: 2475) Debug Information
248
249The driver exports many information i.e. internal statistics,
250debug information, MAC and DMA registers etc.
251
252These can be read in several ways depending on the
253type of the information actually needed.
254
255For example a user can be use the ethtool support
256to get statistics: e.g. using: ethtool -S ethX
257(that shows the Management counters (MMC) if supported)
258or sees the MAC/DMA registers: e.g. using: ethtool -d ethX
259
260Compiling the Kernel with CONFIG_DEBUG_FS and enabling the
261STMMAC_DEBUG_FS option the driver will export the following
262debugfs entries:
263
264/sys/kernel/debug/stmmaceth/descriptors_status
265 To show the DMA TX/RX descriptor rings
266
267Developer can also use the "debug" module parameter to get
268further debug information.
269
270In the end, there are other macros (that cannot be enabled
271via menuconfig) to turn-on the RX/TX DMA debugging,
272specific MAC core debug printk etc. Others to enable the
273debug in the TX and RX processes.
274All these are only useful during the developing stage
275and should never enabled inside the code for general usage.
276In fact, these can generate an huge amount of debug messages.
277
2786) TODO:
239 o XGMAC is not supported. 279 o XGMAC is not supported.
240 o Review the timer optimisation code to use an embedded device that will be 280 o Review the timer optimisation code to use an embedded device that will be
241 available in new chip generations. 281 available in new chip generations.
diff --git a/Documentation/pinctrl.txt b/Documentation/pinctrl.txt
new file mode 100644
index 000000000000..b04cb7d45a16
--- /dev/null
+++ b/Documentation/pinctrl.txt
@@ -0,0 +1,950 @@
1PINCTRL (PIN CONTROL) subsystem
2This document outlines the pin control subsystem in Linux
3
4This subsystem deals with:
5
6- Enumerating and naming controllable pins
7
8- Multiplexing of pins, pads, fingers (etc) see below for details
9
10The intention is to also deal with:
11
12- Software-controlled biasing and driving mode specific pins, such as
13 pull-up/down, open drain etc, load capacitance configuration when controlled
14 by software, etc.
15
16
17Top-level interface
18===================
19
20Definition of PIN CONTROLLER:
21
22- A pin controller is a piece of hardware, usually a set of registers, that
23 can control PINs. It may be able to multiplex, bias, set load capacitance,
24 set drive strength etc for individual pins or groups of pins.
25
26Definition of PIN:
27
28- PINS are equal to pads, fingers, balls or whatever packaging input or
29 output line you want to control and these are denoted by unsigned integers
30 in the range 0..maxpin. This numberspace is local to each PIN CONTROLLER, so
31 there may be several such number spaces in a system. This pin space may
32 be sparse - i.e. there may be gaps in the space with numbers where no
33 pin exists.
34
35When a PIN CONTROLLER is instatiated, it will register a descriptor to the
36pin control framework, and this descriptor contains an array of pin descriptors
37describing the pins handled by this specific pin controller.
38
39Here is an example of a PGA (Pin Grid Array) chip seen from underneath:
40
41 A B C D E F G H
42
43 8 o o o o o o o o
44
45 7 o o o o o o o o
46
47 6 o o o o o o o o
48
49 5 o o o o o o o o
50
51 4 o o o o o o o o
52
53 3 o o o o o o o o
54
55 2 o o o o o o o o
56
57 1 o o o o o o o o
58
59To register a pin controller and name all the pins on this package we can do
60this in our driver:
61
62#include <linux/pinctrl/pinctrl.h>
63
64const struct pinctrl_pin_desc __refdata foo_pins[] = {
65 PINCTRL_PIN(0, "A1"),
66 PINCTRL_PIN(1, "A2"),
67 PINCTRL_PIN(2, "A3"),
68 ...
69 PINCTRL_PIN(61, "H6"),
70 PINCTRL_PIN(62, "H7"),
71 PINCTRL_PIN(63, "H8"),
72};
73
74static struct pinctrl_desc foo_desc = {
75 .name = "foo",
76 .pins = foo_pins,
77 .npins = ARRAY_SIZE(foo_pins),
78 .maxpin = 63,
79 .owner = THIS_MODULE,
80};
81
82int __init foo_probe(void)
83{
84 struct pinctrl_dev *pctl;
85
86 pctl = pinctrl_register(&foo_desc, <PARENT>, NULL);
87 if (IS_ERR(pctl))
88 pr_err("could not register foo pin driver\n");
89}
90
91Pins usually have fancier names than this. You can find these in the dataheet
92for your chip. Notice that the core pinctrl.h file provides a fancy macro
93called PINCTRL_PIN() to create the struct entries. As you can see I enumerated
94the pins from 0 in the upper left corner to 63 in the lower right corner,
95this enumeration was arbitrarily chosen, in practice you need to think
96through your numbering system so that it matches the layout of registers
97and such things in your driver, or the code may become complicated. You must
98also consider matching of offsets to the GPIO ranges that may be handled by
99the pin controller.
100
101For a padring with 467 pads, as opposed to actual pins, I used an enumeration
102like this, walking around the edge of the chip, which seems to be industry
103standard too (all these pads had names, too):
104
105
106 0 ..... 104
107 466 105
108 . .
109 . .
110 358 224
111 357 .... 225
112
113
114Pin groups
115==========
116
117Many controllers need to deal with groups of pins, so the pin controller
118subsystem has a mechanism for enumerating groups of pins and retrieving the
119actual enumerated pins that are part of a certain group.
120
121For example, say that we have a group of pins dealing with an SPI interface
122on { 0, 8, 16, 24 }, and a group of pins dealing with an I2C interface on pins
123on { 24, 25 }.
124
125These two groups are presented to the pin control subsystem by implementing
126some generic pinctrl_ops like this:
127
128#include <linux/pinctrl/pinctrl.h>
129
130struct foo_group {
131 const char *name;
132 const unsigned int *pins;
133 const unsigned num_pins;
134};
135
136static unsigned int spi0_pins[] = { 0, 8, 16, 24 };
137static unsigned int i2c0_pins[] = { 24, 25 };
138
139static const struct foo_group foo_groups[] = {
140 {
141 .name = "spi0_grp",
142 .pins = spi0_pins,
143 .num_pins = ARRAY_SIZE(spi0_pins),
144 },
145 {
146 .name = "i2c0_grp",
147 .pins = i2c0_pins,
148 .num_pins = ARRAY_SIZE(i2c0_pins),
149 },
150};
151
152
153static int foo_list_groups(struct pinctrl_dev *pctldev, unsigned selector)
154{
155 if (selector >= ARRAY_SIZE(foo_groups))
156 return -EINVAL;
157 return 0;
158}
159
160static const char *foo_get_group_name(struct pinctrl_dev *pctldev,
161 unsigned selector)
162{
163 return foo_groups[selector].name;
164}
165
166static int foo_get_group_pins(struct pinctrl_dev *pctldev, unsigned selector,
167 unsigned ** const pins,
168 unsigned * const num_pins)
169{
170 *pins = (unsigned *) foo_groups[selector].pins;
171 *num_pins = foo_groups[selector].num_pins;
172 return 0;
173}
174
175static struct pinctrl_ops foo_pctrl_ops = {
176 .list_groups = foo_list_groups,
177 .get_group_name = foo_get_group_name,
178 .get_group_pins = foo_get_group_pins,
179};
180
181
182static struct pinctrl_desc foo_desc = {
183 ...
184 .pctlops = &foo_pctrl_ops,
185};
186
187The pin control subsystem will call the .list_groups() function repeatedly
188beginning on 0 until it returns non-zero to determine legal selectors, then
189it will call the other functions to retrieve the name and pins of the group.
190Maintaining the data structure of the groups is up to the driver, this is
191just a simple example - in practice you may need more entries in your group
192structure, for example specific register ranges associated with each group
193and so on.
194
195
196Interaction with the GPIO subsystem
197===================================
198
199The GPIO drivers may want to perform operations of various types on the same
200physical pins that are also registered as pin controller pins.
201
202Since the pin controller subsystem have its pinspace local to the pin
203controller we need a mapping so that the pin control subsystem can figure out
204which pin controller handles control of a certain GPIO pin. Since a single
205pin controller may be muxing several GPIO ranges (typically SoCs that have
206one set of pins but internally several GPIO silicon blocks, each modeled as
207a struct gpio_chip) any number of GPIO ranges can be added to a pin controller
208instance like this:
209
210struct gpio_chip chip_a;
211struct gpio_chip chip_b;
212
213static struct pinctrl_gpio_range gpio_range_a = {
214 .name = "chip a",
215 .id = 0,
216 .base = 32,
217 .npins = 16,
218 .gc = &chip_a;
219};
220
221static struct pinctrl_gpio_range gpio_range_a = {
222 .name = "chip b",
223 .id = 0,
224 .base = 48,
225 .npins = 8,
226 .gc = &chip_b;
227};
228
229
230{
231 struct pinctrl_dev *pctl;
232 ...
233 pinctrl_add_gpio_range(pctl, &gpio_range_a);
234 pinctrl_add_gpio_range(pctl, &gpio_range_b);
235}
236
237So this complex system has one pin controller handling two different
238GPIO chips. Chip a has 16 pins and chip b has 8 pins. They are mapped in
239the global GPIO pin space at:
240
241chip a: [32 .. 47]
242chip b: [48 .. 55]
243
244When GPIO-specific functions in the pin control subsystem are called, these
245ranges will be used to look up the apropriate pin controller by inspecting
246and matching the pin to the pin ranges across all controllers. When a
247pin controller handling the matching range is found, GPIO-specific functions
248will be called on that specific pin controller.
249
250For all functionalities dealing with pin biasing, pin muxing etc, the pin
251controller subsystem will subtract the range's .base offset from the passed
252in gpio pin number, and pass that on to the pin control driver, so the driver
253will get an offset into its handled number range. Further it is also passed
254the range ID value, so that the pin controller knows which range it should
255deal with.
256
257For example: if a user issues pinctrl_gpio_set_foo(50), the pin control
258subsystem will find that the second range on this pin controller matches,
259subtract the base 48 and call the
260pinctrl_driver_gpio_set_foo(pinctrl, range, 2) where the latter function has
261this signature:
262
263int pinctrl_driver_gpio_set_foo(struct pinctrl_dev *pctldev,
264 struct pinctrl_gpio_range *rangeid,
265 unsigned offset);
266
267Now the driver knows that we want to do some GPIO-specific operation on the
268second GPIO range handled by "chip b", at offset 2 in that specific range.
269
270(If the GPIO subsystem is ever refactored to use a local per-GPIO controller
271pin space, this mapping will need to be augmented accordingly.)
272
273
274PINMUX interfaces
275=================
276
277These calls use the pinmux_* naming prefix. No other calls should use that
278prefix.
279
280
281What is pinmuxing?
282==================
283
284PINMUX, also known as padmux, ballmux, alternate functions or mission modes
285is a way for chip vendors producing some kind of electrical packages to use
286a certain physical pin (ball, pad, finger, etc) for multiple mutually exclusive
287functions, depending on the application. By "application" in this context
288we usually mean a way of soldering or wiring the package into an electronic
289system, even though the framework makes it possible to also change the function
290at runtime.
291
292Here is an example of a PGA (Pin Grid Array) chip seen from underneath:
293
294 A B C D E F G H
295 +---+
296 8 | o | o o o o o o o
297 | |
298 7 | o | o o o o o o o
299 | |
300 6 | o | o o o o o o o
301 +---+---+
302 5 | o | o | o o o o o o
303 +---+---+ +---+
304 4 o o o o o o | o | o
305 | |
306 3 o o o o o o | o | o
307 | |
308 2 o o o o o o | o | o
309 +-------+-------+-------+---+---+
310 1 | o o | o o | o o | o | o |
311 +-------+-------+-------+---+---+
312
313This is not tetris. The game to think of is chess. Not all PGA/BGA packages
314are chessboard-like, big ones have "holes" in some arrangement according to
315different design patterns, but we're using this as a simple example. Of the
316pins you see some will be taken by things like a few VCC and GND to feed power
317to the chip, and quite a few will be taken by large ports like an external
318memory interface. The remaining pins will often be subject to pin multiplexing.
319
320The example 8x8 PGA package above will have pin numbers 0 thru 63 assigned to
321its physical pins. It will name the pins { A1, A2, A3 ... H6, H7, H8 } using
322pinctrl_register_pins() and a suitable data set as shown earlier.
323
324In this 8x8 BGA package the pins { A8, A7, A6, A5 } can be used as an SPI port
325(these are four pins: CLK, RXD, TXD, FRM). In that case, pin B5 can be used as
326some general-purpose GPIO pin. However, in another setting, pins { A5, B5 } can
327be used as an I2C port (these are just two pins: SCL, SDA). Needless to say,
328we cannot use the SPI port and I2C port at the same time. However in the inside
329of the package the silicon performing the SPI logic can alternatively be routed
330out on pins { G4, G3, G2, G1 }.
331
332On the botton row at { A1, B1, C1, D1, E1, F1, G1, H1 } we have something
333special - it's an external MMC bus that can be 2, 4 or 8 bits wide, and it will
334consume 2, 4 or 8 pins respectively, so either { A1, B1 } are taken or
335{ A1, B1, C1, D1 } or all of them. If we use all 8 bits, we cannot use the SPI
336port on pins { G4, G3, G2, G1 } of course.
337
338This way the silicon blocks present inside the chip can be multiplexed "muxed"
339out on different pin ranges. Often contemporary SoC (systems on chip) will
340contain several I2C, SPI, SDIO/MMC, etc silicon blocks that can be routed to
341different pins by pinmux settings.
342
343Since general-purpose I/O pins (GPIO) are typically always in shortage, it is
344common to be able to use almost any pin as a GPIO pin if it is not currently
345in use by some other I/O port.
346
347
348Pinmux conventions
349==================
350
351The purpose of the pinmux functionality in the pin controller subsystem is to
352abstract and provide pinmux settings to the devices you choose to instantiate
353in your machine configuration. It is inspired by the clk, GPIO and regulator
354subsystems, so devices will request their mux setting, but it's also possible
355to request a single pin for e.g. GPIO.
356
357Definitions:
358
359- FUNCTIONS can be switched in and out by a driver residing with the pin
360 control subsystem in the drivers/pinctrl/* directory of the kernel. The
361 pin control driver knows the possible functions. In the example above you can
362 identify three pinmux functions, one for spi, one for i2c and one for mmc.
363
364- FUNCTIONS are assumed to be enumerable from zero in a one-dimensional array.
365 In this case the array could be something like: { spi0, i2c0, mmc0 }
366 for the three available functions.
367
368- FUNCTIONS have PIN GROUPS as defined on the generic level - so a certain
369 function is *always* associated with a certain set of pin groups, could
370 be just a single one, but could also be many. In the example above the
371 function i2c is associated with the pins { A5, B5 }, enumerated as
372 { 24, 25 } in the controller pin space.
373
374 The Function spi is associated with pin groups { A8, A7, A6, A5 }
375 and { G4, G3, G2, G1 }, which are enumerated as { 0, 8, 16, 24 } and
376 { 38, 46, 54, 62 } respectively.
377
378 Group names must be unique per pin controller, no two groups on the same
379 controller may have the same name.
380
381- The combination of a FUNCTION and a PIN GROUP determine a certain function
382 for a certain set of pins. The knowledge of the functions and pin groups
383 and their machine-specific particulars are kept inside the pinmux driver,
384 from the outside only the enumerators are known, and the driver core can:
385
386 - Request the name of a function with a certain selector (>= 0)
387 - A list of groups associated with a certain function
388 - Request that a certain group in that list to be activated for a certain
389 function
390
391 As already described above, pin groups are in turn self-descriptive, so
392 the core will retrieve the actual pin range in a certain group from the
393 driver.
394
395- FUNCTIONS and GROUPS on a certain PIN CONTROLLER are MAPPED to a certain
396 device by the board file, device tree or similar machine setup configuration
397 mechanism, similar to how regulators are connected to devices, usually by
398 name. Defining a pin controller, function and group thus uniquely identify
399 the set of pins to be used by a certain device. (If only one possible group
400 of pins is available for the function, no group name need to be supplied -
401 the core will simply select the first and only group available.)
402
403 In the example case we can define that this particular machine shall
404 use device spi0 with pinmux function fspi0 group gspi0 and i2c0 on function
405 fi2c0 group gi2c0, on the primary pin controller, we get mappings
406 like these:
407
408 {
409 {"map-spi0", spi0, pinctrl0, fspi0, gspi0},
410 {"map-i2c0", i2c0, pinctrl0, fi2c0, gi2c0}
411 }
412
413 Every map must be assigned a symbolic name, pin controller and function.
414 The group is not compulsory - if it is omitted the first group presented by
415 the driver as applicable for the function will be selected, which is
416 useful for simple cases.
417
418 The device name is present in map entries tied to specific devices. Maps
419 without device names are referred to as SYSTEM pinmuxes, such as can be taken
420 by the machine implementation on boot and not tied to any specific device.
421
422 It is possible to map several groups to the same combination of device,
423 pin controller and function. This is for cases where a certain function on
424 a certain pin controller may use different sets of pins in different
425 configurations.
426
427- PINS for a certain FUNCTION using a certain PIN GROUP on a certain
428 PIN CONTROLLER are provided on a first-come first-serve basis, so if some
429 other device mux setting or GPIO pin request has already taken your physical
430 pin, you will be denied the use of it. To get (activate) a new setting, the
431 old one has to be put (deactivated) first.
432
433Sometimes the documentation and hardware registers will be oriented around
434pads (or "fingers") rather than pins - these are the soldering surfaces on the
435silicon inside the package, and may or may not match the actual number of
436pins/balls underneath the capsule. Pick some enumeration that makes sense to
437you. Define enumerators only for the pins you can control if that makes sense.
438
439Assumptions:
440
441We assume that the number possible function maps to pin groups is limited by
442the hardware. I.e. we assume that there is no system where any function can be
443mapped to any pin, like in a phone exchange. So the available pins groups for
444a certain function will be limited to a few choices (say up to eight or so),
445not hundreds or any amount of choices. This is the characteristic we have found
446by inspecting available pinmux hardware, and a necessary assumption since we
447expect pinmux drivers to present *all* possible function vs pin group mappings
448to the subsystem.
449
450
451Pinmux drivers
452==============
453
454The pinmux core takes care of preventing conflicts on pins and calling
455the pin controller driver to execute different settings.
456
457It is the responsibility of the pinmux driver to impose further restrictions
458(say for example infer electronic limitations due to load etc) to determine
459whether or not the requested function can actually be allowed, and in case it
460is possible to perform the requested mux setting, poke the hardware so that
461this happens.
462
463Pinmux drivers are required to supply a few callback functions, some are
464optional. Usually the enable() and disable() functions are implemented,
465writing values into some certain registers to activate a certain mux setting
466for a certain pin.
467
468A simple driver for the above example will work by setting bits 0, 1, 2, 3 or 4
469into some register named MUX to select a certain function with a certain
470group of pins would work something like this:
471
472#include <linux/pinctrl/pinctrl.h>
473#include <linux/pinctrl/pinmux.h>
474
475struct foo_group {
476 const char *name;
477 const unsigned int *pins;
478 const unsigned num_pins;
479};
480
481static const unsigned spi0_0_pins[] = { 0, 8, 16, 24 };
482static const unsigned spi0_1_pins[] = { 38, 46, 54, 62 };
483static const unsigned i2c0_pins[] = { 24, 25 };
484static const unsigned mmc0_1_pins[] = { 56, 57 };
485static const unsigned mmc0_2_pins[] = { 58, 59 };
486static const unsigned mmc0_3_pins[] = { 60, 61, 62, 63 };
487
488static const struct foo_group foo_groups[] = {
489 {
490 .name = "spi0_0_grp",
491 .pins = spi0_0_pins,
492 .num_pins = ARRAY_SIZE(spi0_0_pins),
493 },
494 {
495 .name = "spi0_1_grp",
496 .pins = spi0_1_pins,
497 .num_pins = ARRAY_SIZE(spi0_1_pins),
498 },
499 {
500 .name = "i2c0_grp",
501 .pins = i2c0_pins,
502 .num_pins = ARRAY_SIZE(i2c0_pins),
503 },
504 {
505 .name = "mmc0_1_grp",
506 .pins = mmc0_1_pins,
507 .num_pins = ARRAY_SIZE(mmc0_1_pins),
508 },
509 {
510 .name = "mmc0_2_grp",
511 .pins = mmc0_2_pins,
512 .num_pins = ARRAY_SIZE(mmc0_2_pins),
513 },
514 {
515 .name = "mmc0_3_grp",
516 .pins = mmc0_3_pins,
517 .num_pins = ARRAY_SIZE(mmc0_3_pins),
518 },
519};
520
521
522static int foo_list_groups(struct pinctrl_dev *pctldev, unsigned selector)
523{
524 if (selector >= ARRAY_SIZE(foo_groups))
525 return -EINVAL;
526 return 0;
527}
528
529static const char *foo_get_group_name(struct pinctrl_dev *pctldev,
530 unsigned selector)
531{
532 return foo_groups[selector].name;
533}
534
535static int foo_get_group_pins(struct pinctrl_dev *pctldev, unsigned selector,
536 unsigned ** const pins,
537 unsigned * const num_pins)
538{
539 *pins = (unsigned *) foo_groups[selector].pins;
540 *num_pins = foo_groups[selector].num_pins;
541 return 0;
542}
543
544static struct pinctrl_ops foo_pctrl_ops = {
545 .list_groups = foo_list_groups,
546 .get_group_name = foo_get_group_name,
547 .get_group_pins = foo_get_group_pins,
548};
549
550struct foo_pmx_func {
551 const char *name;
552 const char * const *groups;
553 const unsigned num_groups;
554};
555
556static const char * const spi0_groups[] = { "spi0_1_grp" };
557static const char * const i2c0_groups[] = { "i2c0_grp" };
558static const char * const mmc0_groups[] = { "mmc0_1_grp", "mmc0_2_grp",
559 "mmc0_3_grp" };
560
561static const struct foo_pmx_func foo_functions[] = {
562 {
563 .name = "spi0",
564 .groups = spi0_groups,
565 .num_groups = ARRAY_SIZE(spi0_groups),
566 },
567 {
568 .name = "i2c0",
569 .groups = i2c0_groups,
570 .num_groups = ARRAY_SIZE(i2c0_groups),
571 },
572 {
573 .name = "mmc0",
574 .groups = mmc0_groups,
575 .num_groups = ARRAY_SIZE(mmc0_groups),
576 },
577};
578
579int foo_list_funcs(struct pinctrl_dev *pctldev, unsigned selector)
580{
581 if (selector >= ARRAY_SIZE(foo_functions))
582 return -EINVAL;
583 return 0;
584}
585
586const char *foo_get_fname(struct pinctrl_dev *pctldev, unsigned selector)
587{
588 return myfuncs[selector].name;
589}
590
591static int foo_get_groups(struct pinctrl_dev *pctldev, unsigned selector,
592 const char * const **groups,
593 unsigned * const num_groups)
594{
595 *groups = foo_functions[selector].groups;
596 *num_groups = foo_functions[selector].num_groups;
597 return 0;
598}
599
600int foo_enable(struct pinctrl_dev *pctldev, unsigned selector,
601 unsigned group)
602{
603 u8 regbit = (1 << group);
604
605 writeb((readb(MUX)|regbit), MUX)
606 return 0;
607}
608
609int foo_disable(struct pinctrl_dev *pctldev, unsigned selector,
610 unsigned group)
611{
612 u8 regbit = (1 << group);
613
614 writeb((readb(MUX) & ~(regbit)), MUX)
615 return 0;
616}
617
618struct pinmux_ops foo_pmxops = {
619 .list_functions = foo_list_funcs,
620 .get_function_name = foo_get_fname,
621 .get_function_groups = foo_get_groups,
622 .enable = foo_enable,
623 .disable = foo_disable,
624};
625
626/* Pinmux operations are handled by some pin controller */
627static struct pinctrl_desc foo_desc = {
628 ...
629 .pctlops = &foo_pctrl_ops,
630 .pmxops = &foo_pmxops,
631};
632
633In the example activating muxing 0 and 1 at the same time setting bits
6340 and 1, uses one pin in common so they would collide.
635
636The beauty of the pinmux subsystem is that since it keeps track of all
637pins and who is using them, it will already have denied an impossible
638request like that, so the driver does not need to worry about such
639things - when it gets a selector passed in, the pinmux subsystem makes
640sure no other device or GPIO assignment is already using the selected
641pins. Thus bits 0 and 1 in the control register will never be set at the
642same time.
643
644All the above functions are mandatory to implement for a pinmux driver.
645
646
647Pinmux interaction with the GPIO subsystem
648==========================================
649
650The function list could become long, especially if you can convert every
651individual pin into a GPIO pin independent of any other pins, and then try
652the approach to define every pin as a function.
653
654In this case, the function array would become 64 entries for each GPIO
655setting and then the device functions.
656
657For this reason there is an additional function a pinmux driver can implement
658to enable only GPIO on an individual pin: .gpio_request_enable(). The same
659.free() function as for other functions is assumed to be usable also for
660GPIO pins.
661
662This function will pass in the affected GPIO range identified by the pin
663controller core, so you know which GPIO pins are being affected by the request
664operation.
665
666Alternatively it is fully allowed to use named functions for each GPIO
667pin, the pinmux_request_gpio() will attempt to obtain the function "gpioN"
668where "N" is the global GPIO pin number if no special GPIO-handler is
669registered.
670
671
672Pinmux board/machine configuration
673==================================
674
675Boards and machines define how a certain complete running system is put
676together, including how GPIOs and devices are muxed, how regulators are
677constrained and how the clock tree looks. Of course pinmux settings are also
678part of this.
679
680A pinmux config for a machine looks pretty much like a simple regulator
681configuration, so for the example array above we want to enable i2c and
682spi on the second function mapping:
683
684#include <linux/pinctrl/machine.h>
685
686static struct pinmux_map pmx_mapping[] = {
687 {
688 .ctrl_dev_name = "pinctrl.0",
689 .function = "spi0",
690 .dev_name = "foo-spi.0",
691 },
692 {
693 .ctrl_dev_name = "pinctrl.0",
694 .function = "i2c0",
695 .dev_name = "foo-i2c.0",
696 },
697 {
698 .ctrl_dev_name = "pinctrl.0",
699 .function = "mmc0",
700 .dev_name = "foo-mmc.0",
701 },
702};
703
704The dev_name here matches to the unique device name that can be used to look
705up the device struct (just like with clockdev or regulators). The function name
706must match a function provided by the pinmux driver handling this pin range.
707
708As you can see we may have several pin controllers on the system and thus
709we need to specify which one of them that contain the functions we wish
710to map. The map can also use struct device * directly, so there is no
711inherent need to use strings to specify .dev_name or .ctrl_dev_name, these
712are for the situation where you do not have a handle to the struct device *,
713for example if they are not yet instantiated or cumbersome to obtain.
714
715You register this pinmux mapping to the pinmux subsystem by simply:
716
717 ret = pinmux_register_mappings(&pmx_mapping, ARRAY_SIZE(pmx_mapping));
718
719Since the above construct is pretty common there is a helper macro to make
720it even more compact which assumes you want to use pinctrl.0 and position
7210 for mapping, for example:
722
723static struct pinmux_map pmx_mapping[] = {
724 PINMUX_MAP_PRIMARY("I2CMAP", "i2c0", "foo-i2c.0"),
725};
726
727
728Complex mappings
729================
730
731As it is possible to map a function to different groups of pins an optional
732.group can be specified like this:
733
734...
735{
736 .name = "spi0-pos-A",
737 .ctrl_dev_name = "pinctrl.0",
738 .function = "spi0",
739 .group = "spi0_0_grp",
740 .dev_name = "foo-spi.0",
741},
742{
743 .name = "spi0-pos-B",
744 .ctrl_dev_name = "pinctrl.0",
745 .function = "spi0",
746 .group = "spi0_1_grp",
747 .dev_name = "foo-spi.0",
748},
749...
750
751This example mapping is used to switch between two positions for spi0 at
752runtime, as described further below under the heading "Runtime pinmuxing".
753
754Further it is possible to match several groups of pins to the same function
755for a single device, say for example in the mmc0 example above, where you can
756additively expand the mmc0 bus from 2 to 4 to 8 pins. If we want to use all
757three groups for a total of 2+2+4 = 8 pins (for an 8-bit MMC bus as is the
758case), we define a mapping like this:
759
760...
761{
762 .name "2bit"
763 .ctrl_dev_name = "pinctrl.0",
764 .function = "mmc0",
765 .group = "mmc0_0_grp",
766 .dev_name = "foo-mmc.0",
767},
768{
769 .name "4bit"
770 .ctrl_dev_name = "pinctrl.0",
771 .function = "mmc0",
772 .group = "mmc0_0_grp",
773 .dev_name = "foo-mmc.0",
774},
775{
776 .name "4bit"
777 .ctrl_dev_name = "pinctrl.0",
778 .function = "mmc0",
779 .group = "mmc0_1_grp",
780 .dev_name = "foo-mmc.0",
781},
782{
783 .name "8bit"
784 .ctrl_dev_name = "pinctrl.0",
785 .function = "mmc0",
786 .group = "mmc0_0_grp",
787 .dev_name = "foo-mmc.0",
788},
789{
790 .name "8bit"
791 .ctrl_dev_name = "pinctrl.0",
792 .function = "mmc0",
793 .group = "mmc0_1_grp",
794 .dev_name = "foo-mmc.0",
795},
796{
797 .name "8bit"
798 .ctrl_dev_name = "pinctrl.0",
799 .function = "mmc0",
800 .group = "mmc0_2_grp",
801 .dev_name = "foo-mmc.0",
802},
803...
804
805The result of grabbing this mapping from the device with something like
806this (see next paragraph):
807
808 pmx = pinmux_get(&device, "8bit");
809
810Will be that you activate all the three bottom records in the mapping at
811once. Since they share the same name, pin controller device, funcion and
812device, and since we allow multiple groups to match to a single device, they
813all get selected, and they all get enabled and disable simultaneously by the
814pinmux core.
815
816
817Pinmux requests from drivers
818============================
819
820Generally it is discouraged to let individual drivers get and enable pinmuxes.
821So if possible, handle the pinmuxes in platform code or some other place where
822you have access to all the affected struct device * pointers. In some cases
823where a driver needs to switch between different mux mappings at runtime
824this is not possible.
825
826A driver may request a certain mux to be activated, usually just the default
827mux like this:
828
829#include <linux/pinctrl/pinmux.h>
830
831struct foo_state {
832 struct pinmux *pmx;
833 ...
834};
835
836foo_probe()
837{
838 /* Allocate a state holder named "state" etc */
839 struct pinmux pmx;
840
841 pmx = pinmux_get(&device, NULL);
842 if IS_ERR(pmx)
843 return PTR_ERR(pmx);
844 pinmux_enable(pmx);
845
846 state->pmx = pmx;
847}
848
849foo_remove()
850{
851 pinmux_disable(state->pmx);
852 pinmux_put(state->pmx);
853}
854
855If you want to grab a specific mux mapping and not just the first one found for
856this device you can specify a specific mapping name, for example in the above
857example the second i2c0 setting: pinmux_get(&device, "spi0-pos-B");
858
859This get/enable/disable/put sequence can just as well be handled by bus drivers
860if you don't want each and every driver to handle it and you know the
861arrangement on your bus.
862
863The semantics of the get/enable respective disable/put is as follows:
864
865- pinmux_get() is called in process context to reserve the pins affected with
866 a certain mapping and set up the pinmux core and the driver. It will allocate
867 a struct from the kernel memory to hold the pinmux state.
868
869- pinmux_enable()/pinmux_disable() is quick and can be called from fastpath
870 (irq context) when you quickly want to set up/tear down the hardware muxing
871 when running a device driver. Usually it will just poke some values into a
872 register.
873
874- pinmux_disable() is called in process context to tear down the pin requests
875 and release the state holder struct for the mux setting.
876
877Usually the pinmux core handled the get/put pair and call out to the device
878drivers bookkeeping operations, like checking available functions and the
879associated pins, whereas the enable/disable pass on to the pin controller
880driver which takes care of activating and/or deactivating the mux setting by
881quickly poking some registers.
882
883The pins are allocated for your device when you issue the pinmux_get() call,
884after this you should be able to see this in the debugfs listing of all pins.
885
886
887System pinmux hogging
888=====================
889
890A system pinmux map entry, i.e. a pinmux setting that does not have a device
891associated with it, can be hogged by the core when the pin controller is
892registered. This means that the core will attempt to call pinmux_get() and
893pinmux_enable() on it immediately after the pin control device has been
894registered.
895
896This is enabled by simply setting the .hog_on_boot field in the map to true,
897like this:
898
899{
900 .name "POWERMAP"
901 .ctrl_dev_name = "pinctrl.0",
902 .function = "power_func",
903 .hog_on_boot = true,
904},
905
906Since it may be common to request the core to hog a few always-applicable
907mux settings on the primary pin controller, there is a convenience macro for
908this:
909
910PINMUX_MAP_PRIMARY_SYS_HOG("POWERMAP", "power_func")
911
912This gives the exact same result as the above construction.
913
914
915Runtime pinmuxing
916=================
917
918It is possible to mux a certain function in and out at runtime, say to move
919an SPI port from one set of pins to another set of pins. Say for example for
920spi0 in the example above, we expose two different groups of pins for the same
921function, but with different named in the mapping as described under
922"Advanced mapping" above. So we have two mappings named "spi0-pos-A" and
923"spi0-pos-B".
924
925This snippet first muxes the function in the pins defined by group A, enables
926it, disables and releases it, and muxes it in on the pins defined by group B:
927
928foo_switch()
929{
930 struct pinmux pmx;
931
932 /* Enable on position A */
933 pmx = pinmux_get(&device, "spi0-pos-A");
934 if IS_ERR(pmx)
935 return PTR_ERR(pmx);
936 pinmux_enable(pmx);
937
938 /* This releases the pins again */
939 pinmux_disable(pmx);
940 pinmux_put(pmx);
941
942 /* Enable on position B */
943 pmx = pinmux_get(&device, "spi0-pos-B");
944 if IS_ERR(pmx)
945 return PTR_ERR(pmx);
946 pinmux_enable(pmx);
947 ...
948}
949
950The above has to be done from process context.
diff --git a/Documentation/power/00-INDEX b/Documentation/power/00-INDEX
index 45e9d4a91284..a4d682f54231 100644
--- a/Documentation/power/00-INDEX
+++ b/Documentation/power/00-INDEX
@@ -26,6 +26,8 @@ s2ram.txt
26 - How to get suspend to ram working (and debug it when it isn't) 26 - How to get suspend to ram working (and debug it when it isn't)
27states.txt 27states.txt
28 - System power management states 28 - System power management states
29suspend-and-cpuhotplug.txt
30 - Explains the interaction between Suspend-to-RAM (S3) and CPU hotplug
29swsusp-and-swap-files.txt 31swsusp-and-swap-files.txt
30 - Using swap files with software suspend (to disk) 32 - Using swap files with software suspend (to disk)
31swsusp-dmcrypt.txt 33swsusp-dmcrypt.txt
diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt
index ddd78172ef73..40a4c65f380a 100644
--- a/Documentation/power/basic-pm-debugging.txt
+++ b/Documentation/power/basic-pm-debugging.txt
@@ -173,7 +173,7 @@ kernel messages using the serial console. This may provide you with some
173information about the reasons of the suspend (resume) failure. Alternatively, 173information about the reasons of the suspend (resume) failure. Alternatively,
174it may be possible to use a FireWire port for debugging with firescope 174it may be possible to use a FireWire port for debugging with firescope
175(ftp://ftp.firstfloor.org/pub/ak/firescope/). On x86 it is also possible to 175(ftp://ftp.firstfloor.org/pub/ak/firescope/). On x86 it is also possible to
176use the PM_TRACE mechanism documented in Documentation/s2ram.txt . 176use the PM_TRACE mechanism documented in Documentation/power/s2ram.txt .
177 177
1782. Testing suspend to RAM (STR) 1782. Testing suspend to RAM (STR)
179 179
@@ -201,3 +201,27 @@ case, you may be able to search for failing drivers by following the procedure
201analogous to the one described in section 1. If you find some failing drivers, 201analogous to the one described in section 1. If you find some failing drivers,
202you will have to unload them every time before an STR transition (ie. before 202you will have to unload them every time before an STR transition (ie. before
203you run s2ram), and please report the problems with them. 203you run s2ram), and please report the problems with them.
204
205There is a debugfs entry which shows the suspend to RAM statistics. Here is an
206example of its output.
207 # mount -t debugfs none /sys/kernel/debug
208 # cat /sys/kernel/debug/suspend_stats
209 success: 20
210 fail: 5
211 failed_freeze: 0
212 failed_prepare: 0
213 failed_suspend: 5
214 failed_suspend_noirq: 0
215 failed_resume: 0
216 failed_resume_noirq: 0
217 failures:
218 last_failed_dev: alarm
219 adc
220 last_failed_errno: -16
221 -16
222 last_failed_step: suspend
223 suspend
224Field success means the success number of suspend to RAM, and field fail means
225the failure number. Others are the failure number of different steps of suspend
226to RAM. suspend_stats just lists the last 2 failed devices, error number and
227failed step of suspend.
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt
index 3384d5996be2..646a89e0c07d 100644
--- a/Documentation/power/devices.txt
+++ b/Documentation/power/devices.txt
@@ -152,7 +152,9 @@ try to use its wakeup mechanism. device_set_wakeup_enable() affects this flag;
152for the most part drivers should not change its value. The initial value of 152for the most part drivers should not change its value. The initial value of
153should_wakeup is supposed to be false for the majority of devices; the major 153should_wakeup is supposed to be false for the majority of devices; the major
154exceptions are power buttons, keyboards, and Ethernet adapters whose WoL 154exceptions are power buttons, keyboards, and Ethernet adapters whose WoL
155(wake-on-LAN) feature has been set up with ethtool. 155(wake-on-LAN) feature has been set up with ethtool. It should also default
156to true for devices that don't generate wakeup requests on their own but merely
157forward wakeup requests from one bus to another (like PCI bridges).
156 158
157Whether or not a device is capable of issuing wakeup events is a hardware 159Whether or not a device is capable of issuing wakeup events is a hardware
158matter, and the kernel is responsible for keeping track of it. By contrast, 160matter, and the kernel is responsible for keeping track of it. By contrast,
@@ -279,10 +281,6 @@ When the system goes into the standby or memory sleep state, the phases are:
279 time.) Unlike the other suspend-related phases, during the prepare 281 time.) Unlike the other suspend-related phases, during the prepare
280 phase the device tree is traversed top-down. 282 phase the device tree is traversed top-down.
281 283
282 In addition to that, if device drivers need to allocate additional
283 memory to be able to hadle device suspend correctly, that should be
284 done in the prepare phase.
285
286 After the prepare callback method returns, no new children may be 284 After the prepare callback method returns, no new children may be
287 registered below the device. The method may also prepare the device or 285 registered below the device. The method may also prepare the device or
288 driver in some way for the upcoming system power transition (for 286 driver in some way for the upcoming system power transition (for
diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt
index bfed898a03fc..17e130a80347 100644
--- a/Documentation/power/pm_qos_interface.txt
+++ b/Documentation/power/pm_qos_interface.txt
@@ -4,14 +4,19 @@ This interface provides a kernel and user mode interface for registering
4performance expectations by drivers, subsystems and user space applications on 4performance expectations by drivers, subsystems and user space applications on
5one of the parameters. 5one of the parameters.
6 6
7Currently we have {cpu_dma_latency, network_latency, network_throughput} as the 7Two different PM QoS frameworks are available:
8initial set of pm_qos parameters. 81. PM QoS classes for cpu_dma_latency, network_latency, network_throughput.
92. the per-device PM QoS framework provides the API to manage the per-device latency
10constraints.
9 11
10Each parameters have defined units: 12Each parameters have defined units:
11 * latency: usec 13 * latency: usec
12 * timeout: usec 14 * timeout: usec
13 * throughput: kbs (kilo bit / sec) 15 * throughput: kbs (kilo bit / sec)
14 16
17
181. PM QoS framework
19
15The infrastructure exposes multiple misc device nodes one per implemented 20The infrastructure exposes multiple misc device nodes one per implemented
16parameter. The set of parameters implement is defined by pm_qos_power_init() 21parameter. The set of parameters implement is defined by pm_qos_power_init()
17and pm_qos_params.h. This is done because having the available parameters 22and pm_qos_params.h. This is done because having the available parameters
@@ -23,14 +28,18 @@ an aggregated target value. The aggregated target value is updated with
23changes to the request list or elements of the list. Typically the 28changes to the request list or elements of the list. Typically the
24aggregated target value is simply the max or min of the request values held 29aggregated target value is simply the max or min of the request values held
25in the parameter list elements. 30in the parameter list elements.
31Note: the aggregated target value is implemented as an atomic variable so that
32reading the aggregated value does not require any locking mechanism.
33
26 34
27From kernel mode the use of this interface is simple: 35From kernel mode the use of this interface is simple:
28 36
29handle = pm_qos_add_request(param_class, target_value): 37void pm_qos_add_request(handle, param_class, target_value):
30Will insert an element into the list for that identified PM_QOS class with the 38Will insert an element into the list for that identified PM QoS class with the
31target value. Upon change to this list the new target is recomputed and any 39target value. Upon change to this list the new target is recomputed and any
32registered notifiers are called only if the target value is now different. 40registered notifiers are called only if the target value is now different.
33Clients of pm_qos need to save the returned handle. 41Clients of pm_qos need to save the returned handle for future use in other
42pm_qos API functions.
34 43
35void pm_qos_update_request(handle, new_target_value): 44void pm_qos_update_request(handle, new_target_value):
36Will update the list element pointed to by the handle with the new target value 45Will update the list element pointed to by the handle with the new target value
@@ -42,6 +51,20 @@ Will remove the element. After removal it will update the aggregate target and
42call the notification tree if the target was changed as a result of removing 51call the notification tree if the target was changed as a result of removing
43the request. 52the request.
44 53
54int pm_qos_request(param_class):
55Returns the aggregated value for a given PM QoS class.
56
57int pm_qos_request_active(handle):
58Returns if the request is still active, i.e. it has not been removed from a
59PM QoS class constraints list.
60
61int pm_qos_add_notifier(param_class, notifier):
62Adds a notification callback function to the PM QoS class. The callback is
63called when the aggregated value for the PM QoS class is changed.
64
65int pm_qos_remove_notifier(int param_class, notifier):
66Removes the notification callback function for the PM QoS class.
67
45 68
46From user mode: 69From user mode:
47Only processes can register a pm_qos request. To provide for automatic 70Only processes can register a pm_qos request. To provide for automatic
@@ -63,4 +86,63 @@ To remove the user mode request for a target value simply close the device
63node. 86node.
64 87
65 88
892. PM QoS per-device latency framework
90
91For each device a list of performance requests is maintained along with
92an aggregated target value. The aggregated target value is updated with
93changes to the request list or elements of the list. Typically the
94aggregated target value is simply the max or min of the request values held
95in the parameter list elements.
96Note: the aggregated target value is implemented as an atomic variable so that
97reading the aggregated value does not require any locking mechanism.
98
99
100From kernel mode the use of this interface is the following:
101
102int dev_pm_qos_add_request(device, handle, value):
103Will insert an element into the list for that identified device with the
104target value. Upon change to this list the new target is recomputed and any
105registered notifiers are called only if the target value is now different.
106Clients of dev_pm_qos need to save the handle for future use in other
107dev_pm_qos API functions.
108
109int dev_pm_qos_update_request(handle, new_value):
110Will update the list element pointed to by the handle with the new target value
111and recompute the new aggregated target, calling the notification trees if the
112target is changed.
113
114int dev_pm_qos_remove_request(handle):
115Will remove the element. After removal it will update the aggregate target and
116call the notification trees if the target was changed as a result of removing
117the request.
118
119s32 dev_pm_qos_read_value(device):
120Returns the aggregated value for a given device's constraints list.
121
122
123Notification mechanisms:
124The per-device PM QoS framework has 2 different and distinct notification trees:
125a per-device notification tree and a global notification tree.
126
127int dev_pm_qos_add_notifier(device, notifier):
128Adds a notification callback function for the device.
129The callback is called when the aggregated value of the device constraints list
130is changed.
131
132int dev_pm_qos_remove_notifier(device, notifier):
133Removes the notification callback function for the device.
134
135int dev_pm_qos_add_global_notifier(notifier):
136Adds a notification callback function in the global notification tree of the
137framework.
138The callback is called when the aggregated value for any device is changed.
139
140int dev_pm_qos_remove_global_notifier(notifier):
141Removes the notification callback function from the global notification tree
142of the framework.
143
144
145From user mode:
146No API for user space access to the per-device latency constraints is provided
147yet - still under discussion.
66 148
diff --git a/Documentation/power/regulator/machine.txt b/Documentation/power/regulator/machine.txt
index b42419b52e44..ce63af0a8e35 100644
--- a/Documentation/power/regulator/machine.txt
+++ b/Documentation/power/regulator/machine.txt
@@ -16,7 +16,7 @@ initialisation code by creating a struct regulator_consumer_supply for
16each regulator. 16each regulator.
17 17
18struct regulator_consumer_supply { 18struct regulator_consumer_supply {
19 struct device *dev; /* consumer */ 19 const char *dev_name; /* consumer dev_name() */
20 const char *supply; /* consumer supply - e.g. "vcc" */ 20 const char *supply; /* consumer supply - e.g. "vcc" */
21}; 21};
22 22
@@ -24,13 +24,13 @@ e.g. for the machine above
24 24
25static struct regulator_consumer_supply regulator1_consumers[] = { 25static struct regulator_consumer_supply regulator1_consumers[] = {
26{ 26{
27 .dev = &platform_consumerB_device.dev, 27 .dev_name = "dev_name(consumer B)",
28 .supply = "Vcc", 28 .supply = "Vcc",
29},}; 29},};
30 30
31static struct regulator_consumer_supply regulator2_consumers[] = { 31static struct regulator_consumer_supply regulator2_consumers[] = {
32{ 32{
33 .dev = &platform_consumerA_device.dev, 33 .dev = "dev_name(consumer A"),
34 .supply = "Vcc", 34 .supply = "Vcc",
35},}; 35},};
36 36
@@ -43,6 +43,7 @@ to their supply regulator :-
43 43
44static struct regulator_init_data regulator1_data = { 44static struct regulator_init_data regulator1_data = {
45 .constraints = { 45 .constraints = {
46 .name = "Regulator-1",
46 .min_uV = 3300000, 47 .min_uV = 3300000,
47 .max_uV = 3300000, 48 .max_uV = 3300000,
48 .valid_modes_mask = REGULATOR_MODE_NORMAL, 49 .valid_modes_mask = REGULATOR_MODE_NORMAL,
@@ -51,13 +52,19 @@ static struct regulator_init_data regulator1_data = {
51 .consumer_supplies = regulator1_consumers, 52 .consumer_supplies = regulator1_consumers,
52}; 53};
53 54
55The name field should be set to something that is usefully descriptive
56for the board for configuration of supplies for other regulators and
57for use in logging and other diagnostic output. Normally the name
58used for the supply rail in the schematic is a good choice. If no
59name is provided then the subsystem will choose one.
60
54Regulator-1 supplies power to Regulator-2. This relationship must be registered 61Regulator-1 supplies power to Regulator-2. This relationship must be registered
55with the core so that Regulator-1 is also enabled when Consumer A enables its 62with the core so that Regulator-1 is also enabled when Consumer A enables its
56supply (Regulator-2). The supply regulator is set by the supply_regulator 63supply (Regulator-2). The supply regulator is set by the supply_regulator
57field below:- 64field below and co:-
58 65
59static struct regulator_init_data regulator2_data = { 66static struct regulator_init_data regulator2_data = {
60 .supply_regulator = "regulator_name", 67 .supply_regulator = "Regulator-1",
61 .constraints = { 68 .constraints = {
62 .min_uV = 1800000, 69 .min_uV = 1800000,
63 .max_uV = 2000000, 70 .max_uV = 2000000,
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index 4ce5450ab6e8..0e856088db7c 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -43,13 +43,18 @@ struct dev_pm_ops {
43 ... 43 ...
44}; 44};
45 45
46The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks are 46The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks
47executed by the PM core for either the device type, or the class (if the device 47are executed by the PM core for either the power domain, or the device type
48type's struct dev_pm_ops object does not exist), or the bus type (if the 48(if the device power domain's struct dev_pm_ops does not exist), or the class
49device type's and class' struct dev_pm_ops objects do not exist) of the given 49(if the device power domain's and type's struct dev_pm_ops object does not
50device (this allows device types to override callbacks provided by bus types or 50exist), or the bus type (if the device power domain's, type's and class'
51classes if necessary). The bus type, device type and class callbacks are 51struct dev_pm_ops objects do not exist) of the given device, so the priority
52referred to as subsystem-level callbacks in what follows. 52order of callbacks from high to low is that power domain callbacks, device
53type callbacks, class callbacks and bus type callbacks, and the high priority
54one will take precedence over low priority one. The bus type, device type and
55class callbacks are referred to as subsystem-level callbacks in what follows,
56and generally speaking, the power domain callbacks are used for representing
57power domains within a SoC.
53 58
54By default, the callbacks are always invoked in process context with interrupts 59By default, the callbacks are always invoked in process context with interrupts
55enabled. However, subsystems can use the pm_runtime_irq_safe() helper function 60enabled. However, subsystems can use the pm_runtime_irq_safe() helper function
@@ -431,8 +436,7 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
431 436
432 void pm_runtime_irq_safe(struct device *dev); 437 void pm_runtime_irq_safe(struct device *dev);
433 - set the power.irq_safe flag for the device, causing the runtime-PM 438 - set the power.irq_safe flag for the device, causing the runtime-PM
434 suspend and resume callbacks (but not the idle callback) to be invoked 439 callbacks to be invoked with interrupts off
435 with interrupts disabled
436 440
437 void pm_runtime_mark_last_busy(struct device *dev); 441 void pm_runtime_mark_last_busy(struct device *dev);
438 - set the power.last_busy field to the current time 442 - set the power.last_busy field to the current time
@@ -478,12 +482,14 @@ pm_runtime_autosuspend_expiration()
478If pm_runtime_irq_safe() has been called for a device then the following helper 482If pm_runtime_irq_safe() has been called for a device then the following helper
479functions may also be used in interrupt context: 483functions may also be used in interrupt context:
480 484
485pm_runtime_idle()
481pm_runtime_suspend() 486pm_runtime_suspend()
482pm_runtime_autosuspend() 487pm_runtime_autosuspend()
483pm_runtime_resume() 488pm_runtime_resume()
484pm_runtime_get_sync() 489pm_runtime_get_sync()
485pm_runtime_put_sync() 490pm_runtime_put_sync()
486pm_runtime_put_sync_suspend() 491pm_runtime_put_sync_suspend()
492pm_runtime_put_sync_autosuspend()
487 493
4885. Runtime PM Initialization, Device Probing and Removal 4945. Runtime PM Initialization, Device Probing and Removal
489 495
diff --git a/Documentation/power/suspend-and-cpuhotplug.txt b/Documentation/power/suspend-and-cpuhotplug.txt
new file mode 100644
index 000000000000..f28f9a6f0347
--- /dev/null
+++ b/Documentation/power/suspend-and-cpuhotplug.txt
@@ -0,0 +1,275 @@
1Interaction of Suspend code (S3) with the CPU hotplug infrastructure
2
3 (C) 2011 Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
4
5
6I. How does the regular CPU hotplug code differ from how the Suspend-to-RAM
7 infrastructure uses it internally? And where do they share common code?
8
9Well, a picture is worth a thousand words... So ASCII art follows :-)
10
11[This depicts the current design in the kernel, and focusses only on the
12interactions involving the freezer and CPU hotplug and also tries to explain
13the locking involved. It outlines the notifications involved as well.
14But please note that here, only the call paths are illustrated, with the aim
15of describing where they take different paths and where they share code.
16What happens when regular CPU hotplug and Suspend-to-RAM race with each other
17is not depicted here.]
18
19On a high level, the suspend-resume cycle goes like this:
20
21|Freeze| -> |Disable nonboot| -> |Do suspend| -> |Enable nonboot| -> |Thaw |
22|tasks | | cpus | | | | cpus | |tasks|
23
24
25More details follow:
26
27 Suspend call path
28 -----------------
29
30 Write 'mem' to
31 /sys/power/state
32 syfs file
33 |
34 v
35 Acquire pm_mutex lock
36 |
37 v
38 Send PM_SUSPEND_PREPARE
39 notifications
40 |
41 v
42 Freeze tasks
43 |
44 |
45 v
46 disable_nonboot_cpus()
47 /* start */
48 |
49 v
50 Acquire cpu_add_remove_lock
51 |
52 v
53 Iterate over CURRENTLY
54 online CPUs
55 |
56 |
57 | ----------
58 v | L
59 ======> _cpu_down() |
60 | [This takes cpuhotplug.lock |
61 Common | before taking down the CPU |
62 code | and releases it when done] | O
63 | While it is at it, notifications |
64 | are sent when notable events occur, |
65 ======> by running all registered callbacks. |
66 | | O
67 | |
68 | |
69 v |
70 Note down these cpus in | P
71 frozen_cpus mask ----------
72 |
73 v
74 Disable regular cpu hotplug
75 by setting cpu_hotplug_disabled=1
76 |
77 v
78 Release cpu_add_remove_lock
79 |
80 v
81 /* disable_nonboot_cpus() complete */
82 |
83 v
84 Do suspend
85
86
87
88Resuming back is likewise, with the counterparts being (in the order of
89execution during resume):
90* enable_nonboot_cpus() which involves:
91 | Acquire cpu_add_remove_lock
92 | Reset cpu_hotplug_disabled to 0, thereby enabling regular cpu hotplug
93 | Call _cpu_up() [for all those cpus in the frozen_cpus mask, in a loop]
94 | Release cpu_add_remove_lock
95 v
96
97* thaw tasks
98* send PM_POST_SUSPEND notifications
99* Release pm_mutex lock.
100
101
102It is to be noted here that the pm_mutex lock is acquired at the very
103beginning, when we are just starting out to suspend, and then released only
104after the entire cycle is complete (i.e., suspend + resume).
105
106
107
108 Regular CPU hotplug call path
109 -----------------------------
110
111 Write 0 (or 1) to
112 /sys/devices/system/cpu/cpu*/online
113 sysfs file
114 |
115 |
116 v
117 cpu_down()
118 |
119 v
120 Acquire cpu_add_remove_lock
121 |
122 v
123 If cpu_hotplug_disabled is 1
124 return gracefully
125 |
126 |
127 v
128 ======> _cpu_down()
129 | [This takes cpuhotplug.lock
130 Common | before taking down the CPU
131 code | and releases it when done]
132 | While it is at it, notifications
133 | are sent when notable events occur,
134 ======> by running all registered callbacks.
135 |
136 |
137 v
138 Release cpu_add_remove_lock
139 [That's it!, for
140 regular CPU hotplug]
141
142
143
144So, as can be seen from the two diagrams (the parts marked as "Common code"),
145regular CPU hotplug and the suspend code path converge at the _cpu_down() and
146_cpu_up() functions. They differ in the arguments passed to these functions,
147in that during regular CPU hotplug, 0 is passed for the 'tasks_frozen'
148argument. But during suspend, since the tasks are already frozen by the time
149the non-boot CPUs are offlined or onlined, the _cpu_*() functions are called
150with the 'tasks_frozen' argument set to 1.
151[See below for some known issues regarding this.]
152
153
154Important files and functions/entry points:
155------------------------------------------
156
157kernel/power/process.c : freeze_processes(), thaw_processes()
158kernel/power/suspend.c : suspend_prepare(), suspend_enter(), suspend_finish()
159kernel/cpu.c: cpu_[up|down](), _cpu_[up|down](), [disable|enable]_nonboot_cpus()
160
161
162
163II. What are the issues involved in CPU hotplug?
164 -------------------------------------------
165
166There are some interesting situations involving CPU hotplug and microcode
167update on the CPUs, as discussed below:
168
169[Please bear in mind that the kernel requests the microcode images from
170userspace, using the request_firmware() function defined in
171drivers/base/firmware_class.c]
172
173
174a. When all the CPUs are identical:
175
176 This is the most common situation and it is quite straightforward: we want
177 to apply the same microcode revision to each of the CPUs.
178 To give an example of x86, the collect_cpu_info() function defined in
179 arch/x86/kernel/microcode_core.c helps in discovering the type of the CPU
180 and thereby in applying the correct microcode revision to it.
181 But note that the kernel does not maintain a common microcode image for the
182 all CPUs, in order to handle case 'b' described below.
183
184
185b. When some of the CPUs are different than the rest:
186
187 In this case since we probably need to apply different microcode revisions
188 to different CPUs, the kernel maintains a copy of the correct microcode
189 image for each CPU (after appropriate CPU type/model discovery using
190 functions such as collect_cpu_info()).
191
192
193c. When a CPU is physically hot-unplugged and a new (and possibly different
194 type of) CPU is hot-plugged into the system:
195
196 In the current design of the kernel, whenever a CPU is taken offline during
197 a regular CPU hotplug operation, upon receiving the CPU_DEAD notification
198 (which is sent by the CPU hotplug code), the microcode update driver's
199 callback for that event reacts by freeing the kernel's copy of the
200 microcode image for that CPU.
201
202 Hence, when a new CPU is brought online, since the kernel finds that it
203 doesn't have the microcode image, it does the CPU type/model discovery
204 afresh and then requests the userspace for the appropriate microcode image
205 for that CPU, which is subsequently applied.
206
207 For example, in x86, the mc_cpu_callback() function (which is the microcode
208 update driver's callback registered for CPU hotplug events) calls
209 microcode_update_cpu() which would call microcode_init_cpu() in this case,
210 instead of microcode_resume_cpu() when it finds that the kernel doesn't
211 have a valid microcode image. This ensures that the CPU type/model
212 discovery is performed and the right microcode is applied to the CPU after
213 getting it from userspace.
214
215
216d. Handling microcode update during suspend/hibernate:
217
218 Strictly speaking, during a CPU hotplug operation which does not involve
219 physically removing or inserting CPUs, the CPUs are not actually powered
220 off during a CPU offline. They are just put to the lowest C-states possible.
221 Hence, in such a case, it is not really necessary to re-apply microcode
222 when the CPUs are brought back online, since they wouldn't have lost the
223 image during the CPU offline operation.
224
225 This is the usual scenario encountered during a resume after a suspend.
226 However, in the case of hibernation, since all the CPUs are completely
227 powered off, during restore it becomes necessary to apply the microcode
228 images to all the CPUs.
229
230 [Note that we don't expect someone to physically pull out nodes and insert
231 nodes with a different type of CPUs in-between a suspend-resume or a
232 hibernate/restore cycle.]
233
234 In the current design of the kernel however, during a CPU offline operation
235 as part of the suspend/hibernate cycle (the CPU_DEAD_FROZEN notification),
236 the existing copy of microcode image in the kernel is not freed up.
237 And during the CPU online operations (during resume/restore), since the
238 kernel finds that it already has copies of the microcode images for all the
239 CPUs, it just applies them to the CPUs, avoiding any re-discovery of CPU
240 type/model and the need for validating whether the microcode revisions are
241 right for the CPUs or not (due to the above assumption that physical CPU
242 hotplug will not be done in-between suspend/resume or hibernate/restore
243 cycles).
244
245
246III. Are there any known problems when regular CPU hotplug and suspend race
247 with each other?
248
249Yes, they are listed below:
250
2511. When invoking regular CPU hotplug, the 'tasks_frozen' argument passed to
252 the _cpu_down() and _cpu_up() functions is *always* 0.
253 This might not reflect the true current state of the system, since the
254 tasks could have been frozen by an out-of-band event such as a suspend
255 operation in progress. Hence, it will lead to wrong notifications being
256 sent during the cpu online/offline events (eg, CPU_ONLINE notification
257 instead of CPU_ONLINE_FROZEN) which in turn will lead to execution of
258 inappropriate code by the callbacks registered for such CPU hotplug events.
259
2602. If a regular CPU hotplug stress test happens to race with the freezer due
261 to a suspend operation in progress at the same time, then we could hit the
262 situation described below:
263
264 * A regular cpu online operation continues its journey from userspace
265 into the kernel, since the freezing has not yet begun.
266 * Then freezer gets to work and freezes userspace.
267 * If cpu online has not yet completed the microcode update stuff by now,
268 it will now start waiting on the frozen userspace in the
269 TASK_UNINTERRUPTIBLE state, in order to get the microcode image.
270 * Now the freezer continues and tries to freeze the remaining tasks. But
271 due to this wait mentioned above, the freezer won't be able to freeze
272 the cpu online hotplug task and hence freezing of tasks fails.
273
274 As a result of this task freezing failure, the suspend operation gets
275 aborted.
diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt
index 1101bee4e822..0e870825c1b9 100644
--- a/Documentation/power/userland-swsusp.txt
+++ b/Documentation/power/userland-swsusp.txt
@@ -77,7 +77,8 @@ SNAPSHOT_SET_SWAP_AREA - set the resume partition and the offset (in <PAGE_SIZE>
77 resume_swap_area, as defined in kernel/power/suspend_ioctls.h, 77 resume_swap_area, as defined in kernel/power/suspend_ioctls.h,
78 containing the resume device specification and the offset); for swap 78 containing the resume device specification and the offset); for swap
79 partitions the offset is always 0, but it is different from zero for 79 partitions the offset is always 0, but it is different from zero for
80 swap files (see Documentation/swsusp-and-swap-files.txt for details). 80 swap files (see Documentation/power/swsusp-and-swap-files.txt for
81 details).
81 82
82SNAPSHOT_PLATFORM_SUPPORT - enable/disable the hibernation platform support, 83SNAPSHOT_PLATFORM_SUPPORT - enable/disable the hibernation platform support,
83 depending on the argument value (enable, if the argument is nonzero) 84 depending on the argument value (enable, if the argument is nonzero)
diff --git a/Documentation/ramoops.txt b/Documentation/ramoops.txt
new file mode 100644
index 000000000000..8fb1ba7fe7bf
--- /dev/null
+++ b/Documentation/ramoops.txt
@@ -0,0 +1,76 @@
1Ramoops oops/panic logger
2=========================
3
4Sergiu Iordache <sergiu@chromium.org>
5
6Updated: 8 August 2011
7
80. Introduction
9
10Ramoops is an oops/panic logger that writes its logs to RAM before the system
11crashes. It works by logging oopses and panics in a circular buffer. Ramoops
12needs a system with persistent RAM so that the content of that area can
13survive after a restart.
14
151. Ramoops concepts
16
17Ramoops uses a predefined memory area to store the dump. The start and size of
18the memory area are set using two variables:
19 * "mem_address" for the start
20 * "mem_size" for the size. The memory size will be rounded down to a
21 power of two.
22
23The memory area is divided into "record_size" chunks (also rounded down to
24power of two) and each oops/panic writes a "record_size" chunk of
25information.
26
27Dumping both oopses and panics can be done by setting 1 in the "dump_oops"
28variable while setting 0 in that variable dumps only the panics.
29
30The module uses a counter to record multiple dumps but the counter gets reset
31on restart (i.e. new dumps after the restart will overwrite old ones).
32
332. Setting the parameters
34
35Setting the ramoops parameters can be done in 2 different manners:
36 1. Use the module parameters (which have the names of the variables described
37 as before).
38 2. Use a platform device and set the platform data. The parameters can then
39 be set through that platform data. An example of doing that is:
40
41#include <linux/ramoops.h>
42[...]
43
44static struct ramoops_platform_data ramoops_data = {
45 .mem_size = <...>,
46 .mem_address = <...>,
47 .record_size = <...>,
48 .dump_oops = <...>,
49};
50
51static struct platform_device ramoops_dev = {
52 .name = "ramoops",
53 .dev = {
54 .platform_data = &ramoops_data,
55 },
56};
57
58[... inside a function ...]
59int ret;
60
61ret = platform_device_register(&ramoops_dev);
62if (ret) {
63 printk(KERN_ERR "unable to register platform device\n");
64 return ret;
65}
66
673. Dump format
68
69The data dump begins with a header, currently defined as "====" followed by a
70timestamp and a new line. The dump then continues with the actual data.
71
724. Reading the data
73
74The dump data can be read from memory (through /dev/mem or other means).
75Getting the module parameters, which are needed in order to parse the data, can
76be done through /sys/module/ramoops/parameters/* .
diff --git a/Documentation/rapidio/rapidio.txt b/Documentation/rapidio/rapidio.txt
index be70ee15f8ca..c75694b35d08 100644
--- a/Documentation/rapidio/rapidio.txt
+++ b/Documentation/rapidio/rapidio.txt
@@ -144,7 +144,7 @@ and the default device ID in order to access the device on the active port.
144 144
145After the host has completed enumeration of the entire network it releases 145After the host has completed enumeration of the entire network it releases
146devices by clearing device ID locks (calls rio_clear_locks()). For each endpoint 146devices by clearing device ID locks (calls rio_clear_locks()). For each endpoint
147in the system, it sets the Master Enable bit in the Port General Control CSR 147in the system, it sets the Discovered bit in the Port General Control CSR
148to indicate that enumeration is completed and agents are allowed to execute 148to indicate that enumeration is completed and agents are allowed to execute
149passive discovery of the network. 149passive discovery of the network.
150 150
diff --git a/Documentation/rapidio/tsi721.txt b/Documentation/rapidio/tsi721.txt
new file mode 100644
index 000000000000..335f3c6087dc
--- /dev/null
+++ b/Documentation/rapidio/tsi721.txt
@@ -0,0 +1,49 @@
1RapidIO subsystem mport driver for IDT Tsi721 PCI Express-to-SRIO bridge.
2=========================================================================
3
4I. Overview
5
6This driver implements all currently defined RapidIO mport callback functions.
7It supports maintenance read and write operations, inbound and outbound RapidIO
8doorbells, inbound maintenance port-writes and RapidIO messaging.
9
10To generate SRIO maintenance transactions this driver uses one of Tsi721 DMA
11channels. This mechanism provides access to larger range of hop counts and
12destination IDs without need for changes in outbound window translation.
13
14RapidIO messaging support uses dedicated messaging channels for each mailbox.
15For inbound messages this driver uses destination ID matching to forward messages
16into the corresponding message queue. Messaging callbacks are implemented to be
17fully compatible with RIONET driver (Ethernet over RapidIO messaging services).
18
19II. Known problems
20
21 None.
22
23III. To do
24
25 Add DMA data transfers (non-messaging).
26 Add inbound region (SRIO-to-PCIe) mapping.
27
28IV. Version History
29
30 1.0.0 - Initial driver release.
31
32V. License
33-----------------------------------------------
34
35 Copyright(c) 2011 Integrated Device Technology, Inc. All rights reserved.
36
37 This program is free software; you can redistribute it and/or modify it
38 under the terms of the GNU General Public License as published by the Free
39 Software Foundation; either version 2 of the License, or (at your option)
40 any later version.
41
42 This program is distributed in the hope that it will be useful, but WITHOUT
43 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
44 FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
45 more details.
46
47 You should have received a copy of the GNU General Public License along with
48 this program; if not, write to the Free Software Foundation, Inc.,
49 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
diff --git a/Documentation/rfkill.txt b/Documentation/rfkill.txt
index 83668e5dd17f..03c9d9299c6b 100644
--- a/Documentation/rfkill.txt
+++ b/Documentation/rfkill.txt
@@ -117,5 +117,4 @@ The contents of these variables corresponds to the "name", "state" and
117"type" sysfs files explained above. 117"type" sysfs files explained above.
118 118
119 119
120For further details consult Documentation/ABI/stable/dev-rfkill and 120For further details consult Documentation/ABI/stable/sysfs-class-rfkill.
121Documentation/ABI/stable/sysfs-class-rfkill.
diff --git a/Documentation/scheduler/sched-bwc.txt b/Documentation/scheduler/sched-bwc.txt
new file mode 100644
index 000000000000..f6b1873f68ab
--- /dev/null
+++ b/Documentation/scheduler/sched-bwc.txt
@@ -0,0 +1,122 @@
1CFS Bandwidth Control
2=====================
3
4[ This document only discusses CPU bandwidth control for SCHED_NORMAL.
5 The SCHED_RT case is covered in Documentation/scheduler/sched-rt-group.txt ]
6
7CFS bandwidth control is a CONFIG_FAIR_GROUP_SCHED extension which allows the
8specification of the maximum CPU bandwidth available to a group or hierarchy.
9
10The bandwidth allowed for a group is specified using a quota and period. Within
11each given "period" (microseconds), a group is allowed to consume only up to
12"quota" microseconds of CPU time. When the CPU bandwidth consumption of a
13group exceeds this limit (for that period), the tasks belonging to its
14hierarchy will be throttled and are not allowed to run again until the next
15period.
16
17A group's unused runtime is globally tracked, being refreshed with quota units
18above at each period boundary. As threads consume this bandwidth it is
19transferred to cpu-local "silos" on a demand basis. The amount transferred
20within each of these updates is tunable and described as the "slice".
21
22Management
23----------
24Quota and period are managed within the cpu subsystem via cgroupfs.
25
26cpu.cfs_quota_us: the total available run-time within a period (in microseconds)
27cpu.cfs_period_us: the length of a period (in microseconds)
28cpu.stat: exports throttling statistics [explained further below]
29
30The default values are:
31 cpu.cfs_period_us=100ms
32 cpu.cfs_quota=-1
33
34A value of -1 for cpu.cfs_quota_us indicates that the group does not have any
35bandwidth restriction in place, such a group is described as an unconstrained
36bandwidth group. This represents the traditional work-conserving behavior for
37CFS.
38
39Writing any (valid) positive value(s) will enact the specified bandwidth limit.
40The minimum quota allowed for the quota or period is 1ms. There is also an
41upper bound on the period length of 1s. Additional restrictions exist when
42bandwidth limits are used in a hierarchical fashion, these are explained in
43more detail below.
44
45Writing any negative value to cpu.cfs_quota_us will remove the bandwidth limit
46and return the group to an unconstrained state once more.
47
48Any updates to a group's bandwidth specification will result in it becoming
49unthrottled if it is in a constrained state.
50
51System wide settings
52--------------------
53For efficiency run-time is transferred between the global pool and CPU local
54"silos" in a batch fashion. This greatly reduces global accounting pressure
55on large systems. The amount transferred each time such an update is required
56is described as the "slice".
57
58This is tunable via procfs:
59 /proc/sys/kernel/sched_cfs_bandwidth_slice_us (default=5ms)
60
61Larger slice values will reduce transfer overheads, while smaller values allow
62for more fine-grained consumption.
63
64Statistics
65----------
66A group's bandwidth statistics are exported via 3 fields in cpu.stat.
67
68cpu.stat:
69- nr_periods: Number of enforcement intervals that have elapsed.
70- nr_throttled: Number of times the group has been throttled/limited.
71- throttled_time: The total time duration (in nanoseconds) for which entities
72 of the group have been throttled.
73
74This interface is read-only.
75
76Hierarchical considerations
77---------------------------
78The interface enforces that an individual entity's bandwidth is always
79attainable, that is: max(c_i) <= C. However, over-subscription in the
80aggregate case is explicitly allowed to enable work-conserving semantics
81within a hierarchy.
82 e.g. \Sum (c_i) may exceed C
83[ Where C is the parent's bandwidth, and c_i its children ]
84
85
86There are two ways in which a group may become throttled:
87 a. it fully consumes its own quota within a period
88 b. a parent's quota is fully consumed within its period
89
90In case b) above, even though the child may have runtime remaining it will not
91be allowed to until the parent's runtime is refreshed.
92
93Examples
94--------
951. Limit a group to 1 CPU worth of runtime.
96
97 If period is 250ms and quota is also 250ms, the group will get
98 1 CPU worth of runtime every 250ms.
99
100 # echo 250000 > cpu.cfs_quota_us /* quota = 250ms */
101 # echo 250000 > cpu.cfs_period_us /* period = 250ms */
102
1032. Limit a group to 2 CPUs worth of runtime on a multi-CPU machine.
104
105 With 500ms period and 1000ms quota, the group can get 2 CPUs worth of
106 runtime every 500ms.
107
108 # echo 1000000 > cpu.cfs_quota_us /* quota = 1000ms */
109 # echo 500000 > cpu.cfs_period_us /* period = 500ms */
110
111 The larger period here allows for increased burst capacity.
112
1133. Limit a group to 20% of 1 CPU.
114
115 With 50ms period, 10ms quota will be equivalent to 20% of 1 CPU.
116
117 # echo 10000 > cpu.cfs_quota_us /* quota = 10ms */
118 # echo 50000 > cpu.cfs_period_us /* period = 50ms */
119
120 By using a small period here we are ensuring a consistent latency
121 response at the expense of burst capacity.
122
diff --git a/Documentation/scsi/00-INDEX b/Documentation/scsi/00-INDEX
index c2e18e109858..b48ded55b555 100644
--- a/Documentation/scsi/00-INDEX
+++ b/Documentation/scsi/00-INDEX
@@ -28,6 +28,8 @@ LICENSE.FlashPoint
28 - Licence of the Flashpoint driver 28 - Licence of the Flashpoint driver
29LICENSE.qla2xxx 29LICENSE.qla2xxx
30 - License for QLogic Linux Fibre Channel HBA Driver firmware. 30 - License for QLogic Linux Fibre Channel HBA Driver firmware.
31LICENSE.qla4xxx
32 - License for QLogic Linux iSCSI HBA Driver.
31Mylex.txt 33Mylex.txt
32 - info on driver for Mylex adapters 34 - info on driver for Mylex adapters
33NinjaSCSI.txt 35NinjaSCSI.txt
diff --git a/Documentation/scsi/ChangeLog.megaraid_sas b/Documentation/scsi/ChangeLog.megaraid_sas
index 1b6e27ddb7f3..64adb98b181c 100644
--- a/Documentation/scsi/ChangeLog.megaraid_sas
+++ b/Documentation/scsi/ChangeLog.megaraid_sas
@@ -1,3 +1,18 @@
1Release Date : Wed. Oct 5, 2011 17:00:00 PST 2010 -
2 (emaild-id:megaraidlinux@lsi.com)
3 Adam Radford
4Current Version : 00.00.06.12-rc1
5Old Version : 00.00.05.40-rc1
6 1. Continue booting immediately if FW in FAULT at driver load time.
7 2. Increase default cmds per lun to 256.
8 3. Fix mismatch in megasas_reset_fusion() mutex lock-unlock.
9 4. Remove some un-necessary code.
10 5. Clear state change interrupts for Fusion/Invader.
11 6. Clear FUSION_IN_RESET before enabling interrupts.
12 7. Add support for MegaRAID 9360/9380 12GB/s controllers.
13 8. Add multiple MSI-X vector/multiple reply queue support.
14 9. Add driver workaround for PERC5/1068 kdump kernel panic.
15-------------------------------------------------------------------------------
1Release Date : Tue. Jul 26, 2011 17:00:00 PST 2010 - 16Release Date : Tue. Jul 26, 2011 17:00:00 PST 2010 -
2 (emaild-id:megaraidlinux@lsi.com) 17 (emaild-id:megaraidlinux@lsi.com)
3 Adam Radford 18 Adam Radford
diff --git a/Documentation/scsi/LICENSE.qla4xxx b/Documentation/scsi/LICENSE.qla4xxx
new file mode 100644
index 000000000000..494980e40491
--- /dev/null
+++ b/Documentation/scsi/LICENSE.qla4xxx
@@ -0,0 +1,310 @@
1Copyright (c) 2003-2011 QLogic Corporation
2QLogic Linux iSCSI HBA Driver
3
4This program includes a device driver for Linux 3.x.
5You may modify and redistribute the device driver code under the
6GNU General Public License (a copy of which is attached hereto as
7Exhibit A) published by the Free Software Foundation (version 2).
8
9REGARDLESS OF WHAT LICENSING MECHANISM IS USED OR APPLICABLE,
10THIS PROGRAM IS PROVIDED BY QLOGIC CORPORATION "AS IS'' AND ANY
11EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
12IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
13PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR
14BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
15EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
16TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
17DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
18ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
19OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
20OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
21POSSIBILITY OF SUCH DAMAGE.
22
23USER ACKNOWLEDGES AND AGREES THAT USE OF THIS PROGRAM WILL NOT
24CREATE OR GIVE GROUNDS FOR A LICENSE BY IMPLICATION, ESTOPPEL, OR
25OTHERWISE IN ANY INTELLECTUAL PROPERTY RIGHTS (PATENT, COPYRIGHT,
26TRADE SECRET, MASK WORK, OR OTHER PROPRIETARY RIGHT) EMBODIED IN
27ANY OTHER QLOGIC HARDWARE OR SOFTWARE EITHER SOLELY OR IN
28COMBINATION WITH THIS PROGRAM.
29
30
31EXHIBIT A
32
33 GNU GENERAL PUBLIC LICENSE
34 Version 2, June 1991
35
36 Copyright (C) 1989, 1991 Free Software Foundation, Inc.
37 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
38 Everyone is permitted to copy and distribute verbatim copies
39 of this license document, but changing it is not allowed.
40
41 Preamble
42
43 The licenses for most software are designed to take away your
44freedom to share and change it. By contrast, the GNU General Public
45License is intended to guarantee your freedom to share and change free
46software--to make sure the software is free for all its users. This
47General Public License applies to most of the Free Software
48Foundation's software and to any other program whose authors commit to
49using it. (Some other Free Software Foundation software is covered by
50the GNU Lesser General Public License instead.) You can apply it to
51your programs, too.
52
53 When we speak of free software, we are referring to freedom, not
54price. Our General Public Licenses are designed to make sure that you
55have the freedom to distribute copies of free software (and charge for
56this service if you wish), that you receive source code or can get it
57if you want it, that you can change the software or use pieces of it
58in new free programs; and that you know you can do these things.
59
60 To protect your rights, we need to make restrictions that forbid
61anyone to deny you these rights or to ask you to surrender the rights.
62These restrictions translate to certain responsibilities for you if you
63distribute copies of the software, or if you modify it.
64
65 For example, if you distribute copies of such a program, whether
66gratis or for a fee, you must give the recipients all the rights that
67you have. You must make sure that they, too, receive or can get the
68source code. And you must show them these terms so they know their
69rights.
70
71 We protect your rights with two steps: (1) copyright the software, and
72(2) offer you this license which gives you legal permission to copy,
73distribute and/or modify the software.
74
75 Also, for each author's protection and ours, we want to make certain
76that everyone understands that there is no warranty for this free
77software. If the software is modified by someone else and passed on, we
78want its recipients to know that what they have is not the original, so
79that any problems introduced by others will not reflect on the original
80authors' reputations.
81
82 Finally, any free program is threatened constantly by software
83patents. We wish to avoid the danger that redistributors of a free
84program will individually obtain patent licenses, in effect making the
85program proprietary. To prevent this, we have made it clear that any
86patent must be licensed for everyone's free use or not licensed at all.
87
88 The precise terms and conditions for copying, distribution and
89modification follow.
90
91 GNU GENERAL PUBLIC LICENSE
92 TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
93
94 0. This License applies to any program or other work which contains
95a notice placed by the copyright holder saying it may be distributed
96under the terms of this General Public License. The "Program", below,
97refers to any such program or work, and a "work based on the Program"
98means either the Program or any derivative work under copyright law:
99that is to say, a work containing the Program or a portion of it,
100either verbatim or with modifications and/or translated into another
101language. (Hereinafter, translation is included without limitation in
102the term "modification".) Each licensee is addressed as "you".
103
104Activities other than copying, distribution and modification are not
105covered by this License; they are outside its scope. The act of
106running the Program is not restricted, and the output from the Program
107is covered only if its contents constitute a work based on the
108Program (independent of having been made by running the Program).
109Whether that is true depends on what the Program does.
110
111 1. You may copy and distribute verbatim copies of the Program's
112source code as you receive it, in any medium, provided that you
113conspicuously and appropriately publish on each copy an appropriate
114copyright notice and disclaimer of warranty; keep intact all the
115notices that refer to this License and to the absence of any warranty;
116and give any other recipients of the Program a copy of this License
117along with the Program.
118
119You may charge a fee for the physical act of transferring a copy, and
120you may at your option offer warranty protection in exchange for a fee.
121
122 2. You may modify your copy or copies of the Program or any portion
123of it, thus forming a work based on the Program, and copy and
124distribute such modifications or work under the terms of Section 1
125above, provided that you also meet all of these conditions:
126
127 a) You must cause the modified files to carry prominent notices
128 stating that you changed the files and the date of any change.
129
130 b) You must cause any work that you distribute or publish, that in
131 whole or in part contains or is derived from the Program or any
132 part thereof, to be licensed as a whole at no charge to all third
133 parties under the terms of this License.
134
135 c) If the modified program normally reads commands interactively
136 when run, you must cause it, when started running for such
137 interactive use in the most ordinary way, to print or display an
138 announcement including an appropriate copyright notice and a
139 notice that there is no warranty (or else, saying that you provide
140 a warranty) and that users may redistribute the program under
141 these conditions, and telling the user how to view a copy of this
142 License. (Exception: if the Program itself is interactive but
143 does not normally print such an announcement, your work based on
144 the Program is not required to print an announcement.)
145
146These requirements apply to the modified work as a whole. If
147identifiable sections of that work are not derived from the Program,
148and can be reasonably considered independent and separate works in
149themselves, then this License, and its terms, do not apply to those
150sections when you distribute them as separate works. But when you
151distribute the same sections as part of a whole which is a work based
152on the Program, the distribution of the whole must be on the terms of
153this License, whose permissions for other licensees extend to the
154entire whole, and thus to each and every part regardless of who wrote it.
155
156Thus, it is not the intent of this section to claim rights or contest
157your rights to work written entirely by you; rather, the intent is to
158exercise the right to control the distribution of derivative or
159collective works based on the Program.
160
161In addition, mere aggregation of another work not based on the Program
162with the Program (or with a work based on the Program) on a volume of
163a storage or distribution medium does not bring the other work under
164the scope of this License.
165
166 3. You may copy and distribute the Program (or a work based on it,
167under Section 2) in object code or executable form under the terms of
168Sections 1 and 2 above provided that you also do one of the following:
169
170 a) Accompany it with the complete corresponding machine-readable
171 source code, which must be distributed under the terms of Sections
172 1 and 2 above on a medium customarily used for software interchange; or,
173
174 b) Accompany it with a written offer, valid for at least three
175 years, to give any third party, for a charge no more than your
176 cost of physically performing source distribution, a complete
177 machine-readable copy of the corresponding source code, to be
178 distributed under the terms of Sections 1 and 2 above on a medium
179 customarily used for software interchange; or,
180
181 c) Accompany it with the information you received as to the offer
182 to distribute corresponding source code. (This alternative is
183 allowed only for noncommercial distribution and only if you
184 received the program in object code or executable form with such
185 an offer, in accord with Subsection b above.)
186
187The source code for a work means the preferred form of the work for
188making modifications to it. For an executable work, complete source
189code means all the source code for all modules it contains, plus any
190associated interface definition files, plus the scripts used to
191control compilation and installation of the executable. However, as a
192special exception, the source code distributed need not include
193anything that is normally distributed (in either source or binary
194form) with the major components (compiler, kernel, and so on) of the
195operating system on which the executable runs, unless that component
196itself accompanies the executable.
197
198If distribution of executable or object code is made by offering
199access to copy from a designated place, then offering equivalent
200access to copy the source code from the same place counts as
201distribution of the source code, even though third parties are not
202compelled to copy the source along with the object code.
203
204 4. You may not copy, modify, sublicense, or distribute the Program
205except as expressly provided under this License. Any attempt
206otherwise to copy, modify, sublicense or distribute the Program is
207void, and will automatically terminate your rights under this License.
208However, parties who have received copies, or rights, from you under
209this License will not have their licenses terminated so long as such
210parties remain in full compliance.
211
212 5. You are not required to accept this License, since you have not
213signed it. However, nothing else grants you permission to modify or
214distribute the Program or its derivative works. These actions are
215prohibited by law if you do not accept this License. Therefore, by
216modifying or distributing the Program (or any work based on the
217Program), you indicate your acceptance of this License to do so, and
218all its terms and conditions for copying, distributing or modifying
219the Program or works based on it.
220
221 6. Each time you redistribute the Program (or any work based on the
222Program), the recipient automatically receives a license from the
223original licensor to copy, distribute or modify the Program subject to
224these terms and conditions. You may not impose any further
225restrictions on the recipients' exercise of the rights granted herein.
226You are not responsible for enforcing compliance by third parties to
227this License.
228
229 7. If, as a consequence of a court judgment or allegation of patent
230infringement or for any other reason (not limited to patent issues),
231conditions are imposed on you (whether by court order, agreement or
232otherwise) that contradict the conditions of this License, they do not
233excuse you from the conditions of this License. If you cannot
234distribute so as to satisfy simultaneously your obligations under this
235License and any other pertinent obligations, then as a consequence you
236may not distribute the Program at all. For example, if a patent
237license would not permit royalty-free redistribution of the Program by
238all those who receive copies directly or indirectly through you, then
239the only way you could satisfy both it and this License would be to
240refrain entirely from distribution of the Program.
241
242If any portion of this section is held invalid or unenforceable under
243any particular circumstance, the balance of the section is intended to
244apply and the section as a whole is intended to apply in other
245circumstances.
246
247It is not the purpose of this section to induce you to infringe any
248patents or other property right claims or to contest validity of any
249such claims; this section has the sole purpose of protecting the
250integrity of the free software distribution system, which is
251implemented by public license practices. Many people have made
252generous contributions to the wide range of software distributed
253through that system in reliance on consistent application of that
254system; it is up to the author/donor to decide if he or she is willing
255to distribute software through any other system and a licensee cannot
256impose that choice.
257
258This section is intended to make thoroughly clear what is believed to
259be a consequence of the rest of this License.
260
261 8. If the distribution and/or use of the Program is restricted in
262certain countries either by patents or by copyrighted interfaces, the
263original copyright holder who places the Program under this License
264may add an explicit geographical distribution limitation excluding
265those countries, so that distribution is permitted only in or among
266countries not thus excluded. In such case, this License incorporates
267the limitation as if written in the body of this License.
268
269 9. The Free Software Foundation may publish revised and/or new versions
270of the General Public License from time to time. Such new versions will
271be similar in spirit to the present version, but may differ in detail to
272address new problems or concerns.
273
274Each version is given a distinguishing version number. If the Program
275specifies a version number of this License which applies to it and "any
276later version", you have the option of following the terms and conditions
277either of that version or of any later version published by the Free
278Software Foundation. If the Program does not specify a version number of
279this License, you may choose any version ever published by the Free Software
280Foundation.
281
282 10. If you wish to incorporate parts of the Program into other free
283programs whose distribution conditions are different, write to the author
284to ask for permission. For software which is copyrighted by the Free
285Software Foundation, write to the Free Software Foundation; we sometimes
286make exceptions for this. Our decision will be guided by the two goals
287of preserving the free status of all derivatives of our free software and
288of promoting the sharing and reuse of software generally.
289
290 NO WARRANTY
291
292 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
293FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
294OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
295PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
296OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
297MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
298TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
299PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
300REPAIR OR CORRECTION.
301
302 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
303WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
304REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
305INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
306OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
307TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
308YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
309PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
310POSSIBILITY OF SUCH DAMAGES.
diff --git a/Documentation/scsi/aic7xxx_old.txt b/Documentation/scsi/aic7xxx_old.txt
index 7bd210ab45a1..ecfc474f36a8 100644
--- a/Documentation/scsi/aic7xxx_old.txt
+++ b/Documentation/scsi/aic7xxx_old.txt
@@ -444,7 +444,7 @@ linux-1.1.x and fairly stable since linux-1.2.x, and are also in FreeBSD
444 Kernel Compile options 444 Kernel Compile options
445 ------------------------------ 445 ------------------------------
446 The various kernel compile time options for this driver are now fairly 446 The various kernel compile time options for this driver are now fairly
447 well documented in the file Documentation/Configure.help. In order to 447 well documented in the file drivers/scsi/Kconfig. In order to
448 see this documentation, you need to use one of the advanced configuration 448 see this documentation, you need to use one of the advanced configuration
449 programs (menuconfig and xconfig). If you are using the "make menuconfig" 449 programs (menuconfig and xconfig). If you are using the "make menuconfig"
450 method of configuring your kernel, then you would simply highlight the 450 method of configuring your kernel, then you would simply highlight the
diff --git a/Documentation/scsi/bnx2fc.txt b/Documentation/scsi/bnx2fc.txt
new file mode 100644
index 000000000000..80823556d62f
--- /dev/null
+++ b/Documentation/scsi/bnx2fc.txt
@@ -0,0 +1,75 @@
1Operating FCoE using bnx2fc
2===========================
3Broadcom FCoE offload through bnx2fc is full stateful hardware offload that
4cooperates with all interfaces provided by the Linux ecosystem for FC/FCoE and
5SCSI controllers. As such, FCoE functionality, once enabled is largely
6transparent. Devices discovered on the SAN will be registered and unregistered
7automatically with the upper storage layers.
8
9Despite the fact that the Broadcom's FCoE offload is fully offloaded, it does
10depend on the state of the network interfaces to operate. As such, the network
11interface (e.g. eth0) associated with the FCoE offload initiator must be 'up'.
12It is recommended that the network interfaces be configured to be brought up
13automatically at boot time.
14
15Furthermore, the Broadcom FCoE offload solution creates VLAN interfaces to
16support the VLANs that have been discovered for FCoE operation (e.g.
17eth0.1001-fcoe). Do not delete or disable these interfaces or FCoE operation
18will be disrupted.
19
20Driver Usage Model:
21===================
22
231. Ensure that fcoe-utils package is installed.
24
252. Configure the interfaces on which bnx2fc driver has to operate on.
26Here are the steps to configure:
27 a. cd /etc/fcoe
28 b. copy cfg-ethx to cfg-eth5 if FCoE has to be enabled on eth5.
29 c. Repeat this for all the interfaces where FCoE has to be enabled.
30 d. Edit all the cfg-eth files to set "no" for DCB_REQUIRED** field, and
31 "yes" for AUTO_VLAN.
32 e. Other configuration parameters should be left as default
33
343. Ensure that "bnx2fc" is in SUPPORTED_DRIVERS list in /etc/fcoe/config.
35
364. Start fcoe service. (service fcoe start). If Broadcom devices are present in
37the system, bnx2fc driver would automatically claim the interfaces, starts vlan
38discovery and log into the targets.
39
405. "Symbolic Name" in 'fcoeadm -i' output would display if bnx2fc has claimed
41the interface.
42Eg:
43[root@bh2 ~]# fcoeadm -i
44 Description: NetXtreme II BCM57712 10 Gigabit Ethernet
45 Revision: 01
46 Manufacturer: Broadcom Corporation
47 Serial Number: 0010186FD558
48 Driver: bnx2x 1.70.00-0
49 Number of Ports: 2
50
51 Symbolic Name: bnx2fc v1.0.5 over eth5.4
52 OS Device Name: host11
53 Node Name: 0x10000010186FD559
54 Port Name: 0x20000010186FD559
55 FabricName: 0x2001000DECB3B681
56 Speed: 10 Gbit
57 Supported Speed: 10 Gbit
58 MaxFrameSize: 2048
59 FC-ID (Port ID): 0x0F0377
60 State: Online
61
626. Verify the vlan discovery is performed by running ifconfig and notice
63<INTERFACE>.<VLAN>-fcoe interfaces are automatically created.
64
65Refer to fcoeadm manpage for more information on fcoeadm operations to
66create/destroy interfaces or to display lun/target information.
67
68NOTE:
69====
70** Broadcom FCoE capable devices implement a DCBX/LLDP client on-chip. Only one
71LLDP client is allowed per interface. For proper operation all host software
72based DCBX/LLDP clients (e.g. lldpad) must be disabled. To disable lldpad on a
73given interface, run the following command:
74
75lldptool set-lldp -i <interface_name> adminStatus=disabled
diff --git a/Documentation/scsi/scsi_mid_low_api.txt b/Documentation/scsi/scsi_mid_low_api.txt
index 5f17d29c59b5..a340b18cd4eb 100644
--- a/Documentation/scsi/scsi_mid_low_api.txt
+++ b/Documentation/scsi/scsi_mid_low_api.txt
@@ -55,11 +55,6 @@ or in the same directory as the C source code. For example to find a url
55about the USB mass storage driver see the 55about the USB mass storage driver see the
56/usr/src/linux/drivers/usb/storage directory. 56/usr/src/linux/drivers/usb/storage directory.
57 57
58The Linux kernel source Documentation/DocBook/scsidrivers.tmpl file
59refers to this file. With the appropriate DocBook tool-set, this permits
60users to generate html, ps and pdf renderings of information within this
61file (e.g. the interface functions).
62
63Driver structure 58Driver structure
64================ 59================
65Traditionally an LLD for the SCSI subsystem has been at least two files in 60Traditionally an LLD for the SCSI subsystem has been at least two files in
diff --git a/Documentation/security/keys-trusted-encrypted.txt b/Documentation/security/keys-trusted-encrypted.txt
index 5f50ccabfc8a..c9e4855ed3d7 100644
--- a/Documentation/security/keys-trusted-encrypted.txt
+++ b/Documentation/security/keys-trusted-encrypted.txt
@@ -156,4 +156,5 @@ Load an encrypted key "evm" from saved blob:
156Other uses for trusted and encrypted keys, such as for disk and file encryption 156Other uses for trusted and encrypted keys, such as for disk and file encryption
157are anticipated. In particular the new format 'ecryptfs' has been defined in 157are anticipated. In particular the new format 'ecryptfs' has been defined in
158in order to use encrypted keys to mount an eCryptfs filesystem. More details 158in order to use encrypted keys to mount an eCryptfs filesystem. More details
159about the usage can be found in the file 'Documentation/keys-ecryptfs.txt'. 159about the usage can be found in the file
160'Documentation/security/keys-ecryptfs.txt'.
diff --git a/Documentation/serial/serial-rs485.txt b/Documentation/serial/serial-rs485.txt
index a4932387bbfb..079cb3df62cf 100644
--- a/Documentation/serial/serial-rs485.txt
+++ b/Documentation/serial/serial-rs485.txt
@@ -28,6 +28,10 @@
28 RS485 communications. This data structure is used to set and configure RS485 28 RS485 communications. This data structure is used to set and configure RS485
29 parameters in the platform data and in ioctls. 29 parameters in the platform data and in ioctls.
30 30
31 The device tree can also provide RS485 boot time parameters (see [2]
32 for bindings). The driver is in charge of filling this data structure from
33 the values given by the device tree.
34
31 Any driver for devices capable of working both as RS232 and RS485 should 35 Any driver for devices capable of working both as RS232 and RS485 should
32 provide at least the following ioctls: 36 provide at least the following ioctls:
33 37
@@ -104,6 +108,9 @@
104 rs485conf.flags |= SER_RS485_RTS_AFTER_SEND; 108 rs485conf.flags |= SER_RS485_RTS_AFTER_SEND;
105 rs485conf.delay_rts_after_send = ...; 109 rs485conf.delay_rts_after_send = ...;
106 110
111 /* Set this flag if you want to receive data even whilst sending data */
112 rs485conf.flags |= SER_RS485_RX_DURING_TX;
113
107 if (ioctl (fd, TIOCSRS485, &rs485conf) < 0) { 114 if (ioctl (fd, TIOCSRS485, &rs485conf) < 0) {
108 /* Error handling. See errno. */ 115 /* Error handling. See errno. */
109 } 116 }
@@ -118,3 +125,4 @@
1185. REFERENCES 1255. REFERENCES
119 126
120 [1] include/linux/serial.h 127 [1] include/linux/serial.h
128 [2] Documentation/devicetree/bindings/serial/rs485.txt
diff --git a/Documentation/sound/oss/PAS16 b/Documentation/sound/oss/PAS16
index 951b3dce51b4..3dca4b75988e 100644
--- a/Documentation/sound/oss/PAS16
+++ b/Documentation/sound/oss/PAS16
@@ -60,8 +60,7 @@ With PAS16 you can use two audio device files at the same time. /dev/dsp (and
60 60
61The new stuff for 2.3.99 and later 61The new stuff for 2.3.99 and later
62============================================================================ 62============================================================================
63The following configuration options from Documentation/Configure.help 63The following configuration options are relevant to configuring the PAS16:
64are relevant to configuring the PAS16:
65 64
66Sound card support 65Sound card support
67CONFIG_SOUND 66CONFIG_SOUND
diff --git a/Documentation/spi/pxa2xx b/Documentation/spi/pxa2xx
index 00511e08db78..3352f97430e4 100644
--- a/Documentation/spi/pxa2xx
+++ b/Documentation/spi/pxa2xx
@@ -2,7 +2,7 @@ PXA2xx SPI on SSP driver HOWTO
2=================================================== 2===================================================
3This a mini howto on the pxa2xx_spi driver. The driver turns a PXA2xx 3This a mini howto on the pxa2xx_spi driver. The driver turns a PXA2xx
4synchronous serial port into a SPI master controller 4synchronous serial port into a SPI master controller
5(see Documentation/spi/spi_summary). The driver has the following features 5(see Documentation/spi/spi-summary). The driver has the following features
6 6
7- Support for any PXA2xx SSP 7- Support for any PXA2xx SSP
8- SSP PIO and SSP DMA data transfers. 8- SSP PIO and SSP DMA data transfers.
@@ -85,7 +85,7 @@ Declaring Slave Devices
85----------------------- 85-----------------------
86Typically each SPI slave (chip) is defined in the arch/.../mach-*/board-*.c 86Typically each SPI slave (chip) is defined in the arch/.../mach-*/board-*.c
87using the "spi_board_info" structure found in "linux/spi/spi.h". See 87using the "spi_board_info" structure found in "linux/spi/spi.h". See
88"Documentation/spi/spi_summary" for additional information. 88"Documentation/spi/spi-summary" for additional information.
89 89
90Each slave device attached to the PXA must provide slave specific configuration 90Each slave device attached to the PXA must provide slave specific configuration
91information via the structure "pxa2xx_spi_chip" found in 91information via the structure "pxa2xx_spi_chip" found in
diff --git a/Documentation/stable_kernel_rules.txt b/Documentation/stable_kernel_rules.txt
index e213f45cf9d7..21fd05c28e73 100644
--- a/Documentation/stable_kernel_rules.txt
+++ b/Documentation/stable_kernel_rules.txt
@@ -24,10 +24,10 @@ Rules on what kind of patches are accepted, and which ones are not, into the
24Procedure for submitting patches to the -stable tree: 24Procedure for submitting patches to the -stable tree:
25 25
26 - Send the patch, after verifying that it follows the above rules, to 26 - Send the patch, after verifying that it follows the above rules, to
27 stable@kernel.org. You must note the upstream commit ID in the changelog 27 stable@vger.kernel.org. You must note the upstream commit ID in the
28 of your submission. 28 changelog of your submission.
29 - To have the patch automatically included in the stable tree, add the tag 29 - To have the patch automatically included in the stable tree, add the tag
30 Cc: stable@kernel.org 30 Cc: stable@vger.kernel.org
31 in the sign-off area. Once the patch is merged it will be applied to 31 in the sign-off area. Once the patch is merged it will be applied to
32 the stable tree without anything else needing to be done by the author 32 the stable tree without anything else needing to be done by the author
33 or subsystem maintainer. 33 or subsystem maintainer.
@@ -35,10 +35,10 @@ Procedure for submitting patches to the -stable tree:
35 cherry-picked than this can be specified in the following format in 35 cherry-picked than this can be specified in the following format in
36 the sign-off area: 36 the sign-off area:
37 37
38 Cc: <stable@kernel.org> # .32.x: a1f84a3: sched: Check for idle 38 Cc: <stable@vger.kernel.org> # .32.x: a1f84a3: sched: Check for idle
39 Cc: <stable@kernel.org> # .32.x: 1b9508f: sched: Rate-limit newidle 39 Cc: <stable@vger.kernel.org> # .32.x: 1b9508f: sched: Rate-limit newidle
40 Cc: <stable@kernel.org> # .32.x: fd21073: sched: Fix affinity logic 40 Cc: <stable@vger.kernel.org> # .32.x: fd21073: sched: Fix affinity logic
41 Cc: <stable@kernel.org> # .32.x 41 Cc: <stable@vger.kernel.org> # .32.x
42 Signed-off-by: Ingo Molnar <mingo@elte.hu> 42 Signed-off-by: Ingo Molnar <mingo@elte.hu>
43 43
44 The tag sequence has the meaning of: 44 The tag sequence has the meaning of:
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 704e474a93df..1f2463671a1a 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -24,6 +24,7 @@ show up in /proc/sys/kernel:
24- bootloader_type [ X86 only ] 24- bootloader_type [ X86 only ]
25- bootloader_version [ X86 only ] 25- bootloader_version [ X86 only ]
26- callhome [ S390 only ] 26- callhome [ S390 only ]
27- cap_last_cap
27- core_pattern 28- core_pattern
28- core_pipe_limit 29- core_pipe_limit
29- core_uses_pid 30- core_uses_pid
@@ -155,6 +156,13 @@ on has a service contract with IBM.
155 156
156============================================================== 157==============================================================
157 158
159cap_last_cap
160
161Highest valid capability of the running kernel. Exports
162CAP_LAST_CAP from the kernel.
163
164==============================================================
165
158core_pattern: 166core_pattern:
159 167
160core_pattern is used to specify a core dumpfile pattern name. 168core_pattern is used to specify a core dumpfile pattern name.
diff --git a/Documentation/timers/highres.txt b/Documentation/timers/highres.txt
index 21332233cef1..e8789976e77c 100644
--- a/Documentation/timers/highres.txt
+++ b/Documentation/timers/highres.txt
@@ -30,7 +30,7 @@ hrtimer base infrastructure
30--------------------------- 30---------------------------
31 31
32The hrtimer base infrastructure was merged into the 2.6.16 kernel. Details of 32The hrtimer base infrastructure was merged into the 2.6.16 kernel. Details of
33the base implementation are covered in Documentation/hrtimers/hrtimer.txt. See 33the base implementation are covered in Documentation/timers/hrtimers.txt. See
34also figure #2 (OLS slides p. 15) 34also figure #2 (OLS slides p. 15)
35 35
36The main differences to the timer wheel, which holds the armed timer_list type 36The main differences to the timer wheel, which holds the armed timer_list type
diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
index 12cecc83cd91..4a37c4759cd2 100644
--- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
+++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl
@@ -379,10 +379,10 @@ EVENT_PROCESS:
379 379
380 # To closer match vmstat scanning statistics, only count isolate_both 380 # To closer match vmstat scanning statistics, only count isolate_both
381 # and isolate_inactive as scanning. isolate_active is rotation 381 # and isolate_inactive as scanning. isolate_active is rotation
382 # isolate_inactive == 0 382 # isolate_inactive == 1
383 # isolate_active == 1 383 # isolate_active == 2
384 # isolate_both == 2 384 # isolate_both == 3
385 if ($isolate_mode != 1) { 385 if ($isolate_mode != 2) {
386 $perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned; 386 $perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned;
387 } 387 }
388 $perprocesspid{$process_pid}->{HIGH_NR_CONTIG_DIRTY} += $nr_contig_dirty; 388 $perprocesspid{$process_pid}->{HIGH_NR_CONTIG_DIRTY} += $nr_contig_dirty;
diff --git a/Documentation/usb/dma.txt b/Documentation/usb/dma.txt
index 84ef865237db..444651e70d95 100644
--- a/Documentation/usb/dma.txt
+++ b/Documentation/usb/dma.txt
@@ -7,7 +7,7 @@ API OVERVIEW
7 7
8The big picture is that USB drivers can continue to ignore most DMA issues, 8The big picture is that USB drivers can continue to ignore most DMA issues,
9though they still must provide DMA-ready buffers (see 9though they still must provide DMA-ready buffers (see
10Documentation/PCI/PCI-DMA-mapping.txt). That's how they've worked through 10Documentation/DMA-API-HOWTO.txt). That's how they've worked through
11the 2.4 (and earlier) kernels. 11the 2.4 (and earlier) kernels.
12 12
13OR: they can now be DMA-aware. 13OR: they can now be DMA-aware.
@@ -57,7 +57,7 @@ and effects like cache-trashing can impose subtle penalties.
57 force a consistent memory access ordering by using memory barriers. It's 57 force a consistent memory access ordering by using memory barriers. It's
58 not using a streaming DMA mapping, so it's good for small transfers on 58 not using a streaming DMA mapping, so it's good for small transfers on
59 systems where the I/O would otherwise thrash an IOMMU mapping. (See 59 systems where the I/O would otherwise thrash an IOMMU mapping. (See
60 Documentation/PCI/PCI-DMA-mapping.txt for definitions of "coherent" and 60 Documentation/DMA-API-HOWTO.txt for definitions of "coherent" and
61 "streaming" DMA mappings.) 61 "streaming" DMA mappings.)
62 62
63 Asking for 1/Nth of a page (as well as asking for N pages) is reasonably 63 Asking for 1/Nth of a page (as well as asking for N pages) is reasonably
@@ -88,7 +88,7 @@ WORKING WITH EXISTING BUFFERS
88Existing buffers aren't usable for DMA without first being mapped into the 88Existing buffers aren't usable for DMA without first being mapped into the
89DMA address space of the device. However, most buffers passed to your 89DMA address space of the device. However, most buffers passed to your
90driver can safely be used with such DMA mapping. (See the first section 90driver can safely be used with such DMA mapping. (See the first section
91of Documentation/PCI/PCI-DMA-mapping.txt, titled "What memory is DMA-able?") 91of Documentation/DMA-API-HOWTO.txt, titled "What memory is DMA-able?")
92 92
93- When you're using scatterlists, you can map everything at once. On some 93- When you're using scatterlists, you can map everything at once. On some
94 systems, this kicks in an IOMMU and turns the scatterlists into single 94 systems, this kicks in an IOMMU and turns the scatterlists into single
diff --git a/Documentation/usb/dwc3.txt b/Documentation/usb/dwc3.txt
new file mode 100644
index 000000000000..7b590edae145
--- /dev/null
+++ b/Documentation/usb/dwc3.txt
@@ -0,0 +1,45 @@
1
2 TODO
3~~~~~~
4Please pick something while reading :)
5
6- Convert interrupt handler to per-ep-thread-irq
7
8 As it turns out some DWC3-commands ~1ms to complete. Currently we spin
9 until the command completes which is bad.
10
11 Implementation idea:
12 - dwc core implements a demultiplexing irq chip for interrupts per
13 endpoint. The interrupt numbers are allocated during probe and belong
14 to the device. If MSI provides per-endpoint interrupt this dummy
15 interrupt chip can be replaced with "real" interrupts.
16 - interrupts are requested / allocated on usb_ep_enable() and removed on
17 usb_ep_disable(). Worst case are 32 interrupts, the lower limit is two
18 for ep0/1.
19 - dwc3_send_gadget_ep_cmd() will sleep in wait_for_completion_timeout()
20 until the command completes.
21 - the interrupt handler is split into the following pieces:
22 - primary handler of the device
23 goes through every event and calls generic_handle_irq() for event
24 it. On return from generic_handle_irq() in acknowledges the event
25 counter so interrupt goes away (eventually).
26
27 - threaded handler of the device
28 none
29
30 - primary handler of the EP-interrupt
31 reads the event and tries to process it. Everything that requries
32 sleeping is handed over to the Thread. The event is saved in an
33 per-endpoint data-structure.
34 We probably have to pay attention not to process events once we
35 handed something to thread so we don't process event X prio Y
36 where X > Y.
37
38 - threaded handler of the EP-interrupt
39 handles the remaining EP work which might sleep such as waiting
40 for command completion.
41
42 Latency:
43 There should be no increase in latency since the interrupt-thread has a
44 high priority and will be run before an average task in user land
45 (except the user changed priorities).
diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt
index c9ffa9ced7ee..12511c98cc4f 100644
--- a/Documentation/usb/power-management.txt
+++ b/Documentation/usb/power-management.txt
@@ -439,10 +439,10 @@ cause autosuspends to fail with -EBUSY if the driver needs to use the
439device. 439device.
440 440
441External suspend calls should never be allowed to fail in this way, 441External suspend calls should never be allowed to fail in this way,
442only autosuspend calls. The driver can tell them apart by checking 442only autosuspend calls. The driver can tell them apart by applying
443the PM_EVENT_AUTO bit in the message.event argument to the suspend 443the PMSG_IS_AUTO() macro to the message argument to the suspend
444method; this bit will be set for internal PM events (autosuspend) and 444method; it will return True for internal PM events (autosuspend) and
445clear for external PM events. 445False for external PM events.
446 446
447 447
448 Mutual exclusion 448 Mutual exclusion
@@ -487,3 +487,29 @@ succeed, it may still remain active and thus cause the system to
487resume as soon as the system suspend is complete. Or the remote 487resume as soon as the system suspend is complete. Or the remote
488wakeup may fail and get lost. Which outcome occurs depends on timing 488wakeup may fail and get lost. Which outcome occurs depends on timing
489and on the hardware and firmware design. 489and on the hardware and firmware design.
490
491
492 xHCI hardware link PM
493 ---------------------
494
495xHCI host controller provides hardware link power management to usb2.0
496(xHCI 1.0 feature) and usb3.0 devices which support link PM. By
497enabling hardware LPM, the host can automatically put the device into
498lower power state(L1 for usb2.0 devices, or U1/U2 for usb3.0 devices),
499which state device can enter and resume very quickly.
500
501The user interface for controlling USB2 hardware LPM is located in the
502power/ subdirectory of each USB device's sysfs directory, that is, in
503/sys/bus/usb/devices/.../power/ where "..." is the device's ID. The
504relevant attribute files is usb2_hardware_lpm.
505
506 power/usb2_hardware_lpm
507
508 When a USB2 device which support LPM is plugged to a
509 xHCI host root hub which support software LPM, the
510 host will run a software LPM test for it; if the device
511 enters L1 state and resume successfully and the host
512 supports USB2 hardware LPM, this file will show up and
513 driver will enable hardware LPM for the device. You
514 can write y/Y/1 or n/N/0 to the file to enable/disable
515 USB2 hardware LPM manually. This is for test purpose mainly.
diff --git a/Documentation/video4linux/CARDLIST.tm6000 b/Documentation/video4linux/CARDLIST.tm6000
new file mode 100644
index 000000000000..b5edce487997
--- /dev/null
+++ b/Documentation/video4linux/CARDLIST.tm6000
@@ -0,0 +1,16 @@
1 1 -> Generic tm5600 board (tm5600) [6000:0001]
2 2 -> Generic tm6000 board (tm6000) [6000:0001]
3 3 -> Generic tm6010 board (tm6010) [6000:0002]
4 4 -> 10Moons UT821 (tm5600) [6000:0001]
5 5 -> 10Moons UT330 (tm5600)
6 6 -> ADSTech Dual TV (tm6000) [06e1:f332]
7 7 -> FreeCom and similar (tm6000) [14aa:0620]
8 8 -> ADSTech Mini Dual TV (tm6000) [06e1:b339]
9 9 -> Hauppauge WinTV HVR-900H/USB2 Stick (tm6010) [2040:6600,2040:6601,2040:6610,2040:6611]
10 10 -> Beholder Wander (tm6010) [6000:dec0]
11 11 -> Beholder Voyager (tm6010) [6000:dec1]
12 12 -> TerraTec Cinergy Hybrid XE/Cinergy Hybrid Stick (tm6010) [0ccd:0086,0ccd:00a5]
13 13 -> TwinHan TU501 (tm6010) [13d3:3240,13d3:3241,13d3:3243,13d3:3264]
14 14 -> Beholder Wander Lite (tm6010) [6000:dec2]
15 15 -> Beholder Voyager Lite (tm6010) [6000:dec3]
16
diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt
index 5bfa9a777d26..b15e29f31121 100644
--- a/Documentation/video4linux/gspca.txt
+++ b/Documentation/video4linux/gspca.txt
@@ -8,6 +8,7 @@ xxxx vend:prod
8---- 8----
9spca501 0000:0000 MystFromOri Unknown Camera 9spca501 0000:0000 MystFromOri Unknown Camera
10spca508 0130:0130 Clone Digital Webcam 11043 10spca508 0130:0130 Clone Digital Webcam 11043
11zc3xx 03f0:1b07 HP Premium Starter Cam
11m5602 0402:5602 ALi Video Camera Controller 12m5602 0402:5602 ALi Video Camera Controller
12spca501 040a:0002 Kodak DVC-325 13spca501 040a:0002 Kodak DVC-325
13spca500 040a:0300 Kodak EZ200 14spca500 040a:0300 Kodak EZ200
@@ -190,6 +191,7 @@ ov519 05a9:0519 OV519 Microphone
190ov519 05a9:0530 OmniVision 191ov519 05a9:0530 OmniVision
191ov519 05a9:2800 OmniVision SuperCAM 192ov519 05a9:2800 OmniVision SuperCAM
192ov519 05a9:4519 Webcam Classic 193ov519 05a9:4519 Webcam Classic
194ov534_9 05a9:8065 OmniVision test kit ov538+ov9712
193ov519 05a9:8519 OmniVision 195ov519 05a9:8519 OmniVision
194ov519 05a9:a511 D-Link USB Digital Video Camera 196ov519 05a9:a511 D-Link USB Digital Video Camera
195ov519 05a9:a518 D-Link DSB-C310 Webcam 197ov519 05a9:a518 D-Link DSB-C310 Webcam
@@ -199,6 +201,8 @@ gl860 05e3:0503 Genesys Logic PC Camera
199gl860 05e3:f191 Genesys Logic PC Camera 201gl860 05e3:f191 Genesys Logic PC Camera
200spca561 060b:a001 Maxell Compact Pc PM3 202spca561 060b:a001 Maxell Compact Pc PM3
201zc3xx 0698:2003 CTX M730V built in 203zc3xx 0698:2003 CTX M730V built in
204topro 06a2:0003 TP6800 PC Camera, CmoX CX0342 webcam
205topro 06a2:6810 Creative Qmax
202nw80x 06a5:0000 Typhoon Webcam 100 USB 206nw80x 06a5:0000 Typhoon Webcam 100 USB
203nw80x 06a5:d001 Divio based webcams 207nw80x 06a5:d001 Divio based webcams
204nw80x 06a5:d800 Divio Chicony TwinkleCam, Trust SpaceCam 208nw80x 06a5:d800 Divio Chicony TwinkleCam, Trust SpaceCam
diff --git a/Documentation/video4linux/omap3isp.txt b/Documentation/video4linux/omap3isp.txt
index 69be2c782b98..5dd1439b61fd 100644
--- a/Documentation/video4linux/omap3isp.txt
+++ b/Documentation/video4linux/omap3isp.txt
@@ -70,10 +70,11 @@ Events
70The OMAP 3 ISP driver does support the V4L2 event interface on CCDC and 70The OMAP 3 ISP driver does support the V4L2 event interface on CCDC and
71statistics (AEWB, AF and histogram) subdevs. 71statistics (AEWB, AF and histogram) subdevs.
72 72
73The CCDC subdev produces V4L2_EVENT_OMAP3ISP_HS_VS type event on HS_VS 73The CCDC subdev produces V4L2_EVENT_FRAME_SYNC type event on HS_VS
74interrupt which is used to signal frame start. The event is triggered exactly 74interrupt which is used to signal frame start. Earlier version of this
75when the reception of the first line of the frame starts in the CCDC module. 75driver used V4L2_EVENT_OMAP3ISP_HS_VS for this purpose. The event is
76The event can be subscribed on the CCDC subdev. 76triggered exactly when the reception of the first line of the frame starts
77in the CCDC module. The event can be subscribed on the CCDC subdev.
77 78
78(When using parallel interface one must pay account to correct configuration 79(When using parallel interface one must pay account to correct configuration
79of the VS signal polarity. This is automatically correct when using the serial 80of the VS signal polarity. This is automatically correct when using the serial
diff --git a/Documentation/video4linux/v4l2-controls.txt b/Documentation/video4linux/v4l2-controls.txt
index 9346fc8cbf2b..26aa0573933e 100644
--- a/Documentation/video4linux/v4l2-controls.txt
+++ b/Documentation/video4linux/v4l2-controls.txt
@@ -285,11 +285,11 @@ implement g_volatile_ctrl like this:
285Note that you use the 'new value' union as well in g_volatile_ctrl. In general 285Note that you use the 'new value' union as well in g_volatile_ctrl. In general
286controls that need to implement g_volatile_ctrl are read-only controls. 286controls that need to implement g_volatile_ctrl are read-only controls.
287 287
288To mark a control as volatile you have to set the is_volatile flag: 288To mark a control as volatile you have to set V4L2_CTRL_FLAG_VOLATILE:
289 289
290 ctrl = v4l2_ctrl_new_std(&sd->ctrl_handler, ...); 290 ctrl = v4l2_ctrl_new_std(&sd->ctrl_handler, ...);
291 if (ctrl) 291 if (ctrl)
292 ctrl->is_volatile = 1; 292 ctrl->flags |= V4L2_CTRL_FLAG_VOLATILE;
293 293
294For try/s_ctrl the new values (i.e. as passed by the user) are filled in and 294For try/s_ctrl the new values (i.e. as passed by the user) are filled in and
295you can modify them in try_ctrl or set them in s_ctrl. The 'cur' union 295you can modify them in try_ctrl or set them in s_ctrl. The 'cur' union
@@ -367,8 +367,7 @@ Driver specific controls can be created using v4l2_ctrl_new_custom():
367The last argument is the priv pointer which can be set to driver-specific 367The last argument is the priv pointer which can be set to driver-specific
368private data. 368private data.
369 369
370The v4l2_ctrl_config struct also has fields to set the is_private and is_volatile 370The v4l2_ctrl_config struct also has a field to set the is_private flag.
371flags.
372 371
373If the name field is not set, then the framework will assume this is a standard 372If the name field is not set, then the framework will assume this is a standard
374control and will fill in the name, type and flags fields accordingly. 373control and will fill in the name, type and flags fields accordingly.
@@ -496,18 +495,20 @@ Handling autogain/gain-type Controls with Auto Clusters
496 495
497A common type of control cluster is one that handles 'auto-foo/foo'-type 496A common type of control cluster is one that handles 'auto-foo/foo'-type
498controls. Typical examples are autogain/gain, autoexposure/exposure, 497controls. Typical examples are autogain/gain, autoexposure/exposure,
499autowhitebalance/red balance/blue balance. In all cases you have one controls 498autowhitebalance/red balance/blue balance. In all cases you have one control
500that determines whether another control is handled automatically by the hardware, 499that determines whether another control is handled automatically by the hardware,
501or whether it is under manual control from the user. 500or whether it is under manual control from the user.
502 501
503If the cluster is in automatic mode, then the manual controls should be 502If the cluster is in automatic mode, then the manual controls should be
504marked inactive. When the volatile controls are read the g_volatile_ctrl 503marked inactive and volatile. When the volatile controls are read the
505operation should return the value that the hardware's automatic mode set up 504g_volatile_ctrl operation should return the value that the hardware's automatic
506automatically. 505mode set up automatically.
507 506
508If the cluster is put in manual mode, then the manual controls should become 507If the cluster is put in manual mode, then the manual controls should become
509active again and the is_volatile flag should be ignored (so g_volatile_ctrl is 508active again and the volatile flag is cleared (so g_volatile_ctrl is no longer
510no longer called while in manual mode). 509called while in manual mode). In addition just before switching to manual mode
510the current values as determined by the auto mode are copied as the new manual
511values.
511 512
512Finally the V4L2_CTRL_FLAG_UPDATE should be set for the auto control since 513Finally the V4L2_CTRL_FLAG_UPDATE should be set for the auto control since
513changing that control affects the control flags of the manual controls. 514changing that control affects the control flags of the manual controls.
@@ -520,7 +521,11 @@ void v4l2_ctrl_auto_cluster(unsigned ncontrols, struct v4l2_ctrl **controls,
520 521
521The first two arguments are identical to v4l2_ctrl_cluster. The third argument 522The first two arguments are identical to v4l2_ctrl_cluster. The third argument
522tells the framework which value switches the cluster into manual mode. The 523tells the framework which value switches the cluster into manual mode. The
523last argument will optionally set the is_volatile flag for the non-auto controls. 524last argument will optionally set V4L2_CTRL_FLAG_VOLATILE for the non-auto controls.
525If it is false, then the manual controls are never volatile. You would typically
526use that if the hardware does not give you the option to read back to values as
527determined by the auto mode (e.g. if autogain is on, the hardware doesn't allow
528you to obtain the current gain value).
524 529
525The first control of the cluster is assumed to be the 'auto' control. 530The first control of the cluster is assumed to be the 'auto' control.
526 531
@@ -681,16 +686,6 @@ if there are no controls at all.
681count if nothing was done yet. If it is less than count then only the controls 686count if nothing was done yet. If it is less than count then only the controls
682up to error_idx-1 were successfully applied. 687up to error_idx-1 were successfully applied.
683 688
6843) When attempting to read a button control the framework will return -EACCES
685instead of -EINVAL as stated in the spec. It seems to make more sense since
686button controls are write-only controls.
687
6884) Attempting to write to a read-only control will return -EACCES instead of
689-EINVAL as the spec says.
690
6915) The spec does not mention what should happen when you try to set/get a
692control class controls. The framework will return -EACCES.
693
694 689
695Proposals for Extensions 690Proposals for Extensions
696======================== 691========================
@@ -703,9 +698,3 @@ decimal. Useful for e.g. video_mute_yuv.
7032) It is possible to mark in the controls array which controls have been 6982) It is possible to mark in the controls array which controls have been
704successfully written and which failed by for example adding a bit to the 699successfully written and which failed by for example adding a bit to the
705control ID. Not sure if it is worth the effort, though. 700control ID. Not sure if it is worth the effort, though.
706
7073) Trying to set volatile inactive controls should result in -EACCESS.
708
7094) Add a new flag to mark volatile controls. Any application that wants
710to store the state of the controls can then skip volatile inactive controls.
711Currently it is not possible to detect such controls.
diff --git a/Documentation/virtual/00-INDEX b/Documentation/virtual/00-INDEX
index fe0251c4cfb7..8e601991d91c 100644
--- a/Documentation/virtual/00-INDEX
+++ b/Documentation/virtual/00-INDEX
@@ -8,3 +8,6 @@ lguest/
8 - Extremely simple hypervisor for experimental/educational use. 8 - Extremely simple hypervisor for experimental/educational use.
9uml/ 9uml/
10 - User Mode Linux, builds/runs Linux kernel as a userspace program. 10 - User Mode Linux, builds/runs Linux kernel as a userspace program.
11virtio.txt
12 - Text version of draft virtio spec.
13 See http://ozlabs.org/~rusty/virtio-spec
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index b0e4b9cd6a66..7945b0bd35e2 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -175,10 +175,30 @@ Parameters: vcpu id (apic id on x86)
175Returns: vcpu fd on success, -1 on error 175Returns: vcpu fd on success, -1 on error
176 176
177This API adds a vcpu to a virtual machine. The vcpu id is a small integer 177This API adds a vcpu to a virtual machine. The vcpu id is a small integer
178in the range [0, max_vcpus). You can use KVM_CAP_NR_VCPUS of the 178in the range [0, max_vcpus).
179KVM_CHECK_EXTENSION ioctl() to determine the value for max_vcpus at run-time. 179
180The recommended max_vcpus value can be retrieved using the KVM_CAP_NR_VCPUS of
181the KVM_CHECK_EXTENSION ioctl() at run-time.
182The maximum possible value for max_vcpus can be retrieved using the
183KVM_CAP_MAX_VCPUS of the KVM_CHECK_EXTENSION ioctl() at run-time.
184
180If the KVM_CAP_NR_VCPUS does not exist, you should assume that max_vcpus is 4 185If the KVM_CAP_NR_VCPUS does not exist, you should assume that max_vcpus is 4
181cpus max. 186cpus max.
187If the KVM_CAP_MAX_VCPUS does not exist, you should assume that max_vcpus is
188same as the value returned from KVM_CAP_NR_VCPUS.
189
190On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
191threads in one or more virtual CPU cores. (This is because the
192hardware requires all the hardware threads in a CPU core to be in the
193same partition.) The KVM_CAP_PPC_SMT capability indicates the number
194of vcpus per virtual core (vcore). The vcore id is obtained by
195dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
196given vcore will always be in the same physical core as each other
197(though that might be a different physical core from time to time).
198Userspace can control the threading (SMT) mode of the guest by its
199allocation of vcpu ids. For example, if userspace wants
200single-threaded guest vcpus, it should make all vcpu ids be a multiple
201of the number of vcpus per vcore.
182 202
183On powerpc using book3s_hv mode, the vcpus are mapped onto virtual 203On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
184threads in one or more virtual CPU cores. (This is because the 204threads in one or more virtual CPU cores. (This is because the
@@ -1633,3 +1653,50 @@ developer registration required to access it).
1633 char padding[256]; 1653 char padding[256];
1634 }; 1654 };
1635}; 1655};
1656
16576. Capabilities that can be enabled
1658
1659There are certain capabilities that change the behavior of the virtual CPU when
1660enabled. To enable them, please see section 4.37. Below you can find a list of
1661capabilities and what their effect on the vCPU is when enabling them.
1662
1663The following information is provided along with the description:
1664
1665 Architectures: which instruction set architectures provide this ioctl.
1666 x86 includes both i386 and x86_64.
1667
1668 Parameters: what parameters are accepted by the capability.
1669
1670 Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
1671 are not detailed, but errors with specific meanings are.
1672
16736.1 KVM_CAP_PPC_OSI
1674
1675Architectures: ppc
1676Parameters: none
1677Returns: 0 on success; -1 on error
1678
1679This capability enables interception of OSI hypercalls that otherwise would
1680be treated as normal system calls to be injected into the guest. OSI hypercalls
1681were invented by Mac-on-Linux to have a standardized communication mechanism
1682between the guest and the host.
1683
1684When this capability is enabled, KVM_EXIT_OSI can occur.
1685
16866.2 KVM_CAP_PPC_PAPR
1687
1688Architectures: ppc
1689Parameters: none
1690Returns: 0 on success; -1 on error
1691
1692This capability enables interception of PAPR hypercalls. PAPR hypercalls are
1693done using the hypercall instruction "sc 1".
1694
1695It also sets the guest privilege level to "supervisor" mode. Usually the guest
1696runs in "hypervisor" privilege mode with a few missing features.
1697
1698In addition to the above, it changes the semantics of SDR1. In this mode, the
1699HTAB address part of SDR1 contains an HVA instead of a GPA, as PAPR keeps the
1700HTAB invisible to the guest.
1701
1702When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur.
diff --git a/Documentation/virtual/lguest/lguest.c b/Documentation/virtual/lguest/lguest.c
index 043bd7df3139..c095d79cae73 100644
--- a/Documentation/virtual/lguest/lguest.c
+++ b/Documentation/virtual/lguest/lguest.c
@@ -436,7 +436,7 @@ static unsigned long load_bzimage(int fd)
436 436
437 /* 437 /*
438 * Go back to the start of the file and read the header. It should be 438 * Go back to the start of the file and read the header. It should be
439 * a Linux boot header (see Documentation/x86/i386/boot.txt) 439 * a Linux boot header (see Documentation/x86/boot.txt)
440 */ 440 */
441 lseek(fd, 0, SEEK_SET); 441 lseek(fd, 0, SEEK_SET);
442 read(fd, &boot, sizeof(boot)); 442 read(fd, &boot, sizeof(boot));
@@ -1996,6 +1996,9 @@ int main(int argc, char *argv[])
1996 /* We use a simple helper to copy the arguments separated by spaces. */ 1996 /* We use a simple helper to copy the arguments separated by spaces. */
1997 concat((char *)(boot + 1), argv+optind+2); 1997 concat((char *)(boot + 1), argv+optind+2);
1998 1998
1999 /* Set kernel alignment to 16M (CONFIG_PHYSICAL_ALIGN) */
2000 boot->hdr.kernel_alignment = 0x1000000;
2001
1999 /* Boot protocol version: 2.07 supports the fields for lguest. */ 2002 /* Boot protocol version: 2.07 supports the fields for lguest. */
2000 boot->hdr.version = 0x207; 2003 boot->hdr.version = 0x207;
2001 2004
diff --git a/Documentation/virtual/uml/UserModeLinux-HOWTO.txt b/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
index 5d0fc8bfcdb9..77dfecf4e2d6 100644
--- a/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
+++ b/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
@@ -134,13 +134,13 @@
134 134
135 ______________________________________________________________________ 135 ______________________________________________________________________
136 136
137 11.. IInnttrroodduuccttiioonn 137 1. Introduction
138 138
139 Welcome to User Mode Linux. It's going to be fun. 139 Welcome to User Mode Linux. It's going to be fun.
140 140
141 141
142 142
143 11..11.. HHooww iiss UUsseerr MMooddee LLiinnuuxx DDiiffffeerreenntt?? 143 1.1. How is User Mode Linux Different?
144 144
145 Normally, the Linux Kernel talks straight to your hardware (video 145 Normally, the Linux Kernel talks straight to your hardware (video
146 card, keyboard, hard drives, etc), and any programs which run ask the 146 card, keyboard, hard drives, etc), and any programs which run ask the
@@ -181,7 +181,7 @@
181 181
182 182
183 183
184 11..22.. WWhhyy WWoouulldd II WWaanntt UUsseerr MMooddee LLiinnuuxx?? 184 1.2. Why Would I Want User Mode Linux?
185 185
186 186
187 1. If User Mode Linux crashes, your host kernel is still fine. 187 1. If User Mode Linux crashes, your host kernel is still fine.
@@ -206,12 +206,12 @@
206 206
207 207
208 208
209 22.. CCoommppiilliinngg tthhee kkeerrnneell aanndd mmoodduulleess 209 2. Compiling the kernel and modules
210 210
211 211
212 212
213 213
214 22..11.. CCoommppiilliinngg tthhee kkeerrnneell 214 2.1. Compiling the kernel
215 215
216 216
217 Compiling the user mode kernel is just like compiling any other 217 Compiling the user mode kernel is just like compiling any other
@@ -322,7 +322,7 @@
322 bug fixes and enhancements that have gone into subsequent releases. 322 bug fixes and enhancements that have gone into subsequent releases.
323 323
324 324
325 22..22.. CCoommppiilliinngg aanndd iinnssttaalllliinngg kkeerrnneell mmoodduulleess 325 2.2. Compiling and installing kernel modules
326 326
327 UML modules are built in the same way as the native kernel (with the 327 UML modules are built in the same way as the native kernel (with the
328 exception of the 'ARCH=um' that you always need for UML): 328 exception of the 'ARCH=um' that you always need for UML):
@@ -386,19 +386,19 @@
386 386
387 387
388 388
389 22..33.. CCoommppiilliinngg aanndd iinnssttaalllliinngg uummll__uuttiilliittiieess 389 2.3. Compiling and installing uml_utilities
390 390
391 Many features of the UML kernel require a user-space helper program, 391 Many features of the UML kernel require a user-space helper program,
392 so a uml_utilities package is distributed separately from the kernel 392 so a uml_utilities package is distributed separately from the kernel
393 patch which provides these helpers. Included within this is: 393 patch which provides these helpers. Included within this is:
394 394
395 +o port-helper - Used by consoles which connect to xterms or ports 395 o port-helper - Used by consoles which connect to xterms or ports
396 396
397 +o tunctl - Configuration tool to create and delete tap devices 397 o tunctl - Configuration tool to create and delete tap devices
398 398
399 +o uml_net - Setuid binary for automatic tap device configuration 399 o uml_net - Setuid binary for automatic tap device configuration
400 400
401 +o uml_switch - User-space virtual switch required for daemon 401 o uml_switch - User-space virtual switch required for daemon
402 transport 402 transport
403 403
404 The uml_utilities tree is compiled with: 404 The uml_utilities tree is compiled with:
@@ -423,11 +423,11 @@
423 423
424 424
425 425
426 33.. RRuunnnniinngg UUMMLL aanndd llooggggiinngg iinn 426 3. Running UML and logging in
427 427
428 428
429 429
430 33..11.. RRuunnnniinngg UUMMLL 430 3.1. Running UML
431 431
432 It runs on 2.2.15 or later, and all 2.4 kernels. 432 It runs on 2.2.15 or later, and all 2.4 kernels.
433 433
@@ -454,7 +454,7 @@
454 454
455 455
456 456
457 33..22.. LLooggggiinngg iinn 457 3.2. Logging in
458 458
459 459
460 460
@@ -468,7 +468,7 @@
468 468
469 There are a couple of other ways to log in: 469 There are a couple of other ways to log in:
470 470
471 +o On a virtual console 471 o On a virtual console
472 472
473 473
474 474
@@ -480,7 +480,7 @@
480 480
481 481
482 482
483 +o Over the serial line 483 o Over the serial line
484 484
485 485
486 In the boot output, find a line that looks like: 486 In the boot output, find a line that looks like:
@@ -503,7 +503,7 @@
503 503
504 504
505 505
506 +o Over the net 506 o Over the net
507 507
508 508
509 If the network is running, then you can telnet to the virtual 509 If the network is running, then you can telnet to the virtual
@@ -514,13 +514,13 @@
514 down and the process will exit. 514 down and the process will exit.
515 515
516 516
517 33..33.. EExxaammpplleess 517 3.3. Examples
518 518
519 Here are some examples of UML in action: 519 Here are some examples of UML in action:
520 520
521 +o A login session <http://user-mode-linux.sourceforge.net/login.html> 521 o A login session <http://user-mode-linux.sourceforge.net/login.html>
522 522
523 +o A virtual network <http://user-mode-linux.sourceforge.net/net.html> 523 o A virtual network <http://user-mode-linux.sourceforge.net/net.html>
524 524
525 525
526 526
@@ -528,12 +528,12 @@
528 528
529 529
530 530
531 44.. UUMMLL oonn 22GG//22GG hhoossttss 531 4. UML on 2G/2G hosts
532 532
533 533
534 534
535 535
536 44..11.. IInnttrroodduuccttiioonn 536 4.1. Introduction
537 537
538 538
539 Most Linux machines are configured so that the kernel occupies the 539 Most Linux machines are configured so that the kernel occupies the
@@ -546,7 +546,7 @@
546 546
547 547
548 548
549 44..22.. TThhee pprroobblleemm 549 4.2. The problem
550 550
551 551
552 The prebuilt UML binaries on this site will not run on 2G/2G hosts 552 The prebuilt UML binaries on this site will not run on 2G/2G hosts
@@ -558,7 +558,7 @@
558 558
559 559
560 560
561 44..33.. TThhee ssoolluuttiioonn 561 4.3. The solution
562 562
563 563
564 The fix for this is to rebuild UML from source after enabling 564 The fix for this is to rebuild UML from source after enabling
@@ -576,7 +576,7 @@
576 576
577 577
578 578
579 55.. SSeettttiinngg uupp sseerriiaall lliinneess aanndd ccoonnssoolleess 579 5. Setting up serial lines and consoles
580 580
581 581
582 It is possible to attach UML serial lines and consoles to many types 582 It is possible to attach UML serial lines and consoles to many types
@@ -586,12 +586,12 @@
586 You can attach them to host ptys, ttys, file descriptors, and ports. 586 You can attach them to host ptys, ttys, file descriptors, and ports.
587 This allows you to do things like 587 This allows you to do things like
588 588
589 +o have a UML console appear on an unused host console, 589 o have a UML console appear on an unused host console,
590 590
591 +o hook two virtual machines together by having one attach to a pty 591 o hook two virtual machines together by having one attach to a pty
592 and having the other attach to the corresponding tty 592 and having the other attach to the corresponding tty
593 593
594 +o make a virtual machine accessible from the net by attaching a 594 o make a virtual machine accessible from the net by attaching a
595 console to a port on the host. 595 console to a port on the host.
596 596
597 597
@@ -599,7 +599,7 @@
599 599
600 600
601 601
602 55..11.. SSppeecciiffyyiinngg tthhee ddeevviiccee 602 5.1. Specifying the device
603 603
604 Devices are specified with "con" or "ssl" (console or serial line, 604 Devices are specified with "con" or "ssl" (console or serial line,
605 respectively), optionally with a device number if you are talking 605 respectively), optionally with a device number if you are talking
@@ -626,13 +626,13 @@
626 626
627 627
628 628
629 55..22.. SSppeecciiffyyiinngg tthhee cchhaannnneell 629 5.2. Specifying the channel
630 630
631 There are a number of different types of channels to attach a UML 631 There are a number of different types of channels to attach a UML
632 device to, each with a different way of specifying exactly what to 632 device to, each with a different way of specifying exactly what to
633 attach to. 633 attach to.
634 634
635 +o pseudo-terminals - device=pty pts terminals - device=pts 635 o pseudo-terminals - device=pty pts terminals - device=pts
636 636
637 637
638 This will cause UML to allocate a free host pseudo-terminal for the 638 This will cause UML to allocate a free host pseudo-terminal for the
@@ -640,20 +640,20 @@
640 log. You access it by attaching a terminal program to the 640 log. You access it by attaching a terminal program to the
641 corresponding tty: 641 corresponding tty:
642 642
643 +o screen /dev/pts/n 643 o screen /dev/pts/n
644 644
645 +o screen /dev/ttyxx 645 o screen /dev/ttyxx
646 646
647 +o minicom -o -p /dev/ttyxx - minicom seems not able to handle pts 647 o minicom -o -p /dev/ttyxx - minicom seems not able to handle pts
648 devices 648 devices
649 649
650 +o kermit - start it up, 'open' the device, then 'connect' 650 o kermit - start it up, 'open' the device, then 'connect'
651 651
652 652
653 653
654 654
655 655
656 +o terminals - device=tty:tty device file 656 o terminals - device=tty:tty device file
657 657
658 658
659 This will make UML attach the device to the specified tty (i.e 659 This will make UML attach the device to the specified tty (i.e
@@ -672,7 +672,7 @@
672 672
673 673
674 674
675 +o xterms - device=xterm 675 o xterms - device=xterm
676 676
677 677
678 UML will run an xterm and the device will be attached to it. 678 UML will run an xterm and the device will be attached to it.
@@ -681,7 +681,7 @@
681 681
682 682
683 683
684 +o Port - device=port:port number 684 o Port - device=port:port number
685 685
686 686
687 This will attach the UML devices to the specified host port. 687 This will attach the UML devices to the specified host port.
@@ -725,7 +725,7 @@
725 725
726 726
727 727
728 +o already-existing file descriptors - device=file descriptor 728 o already-existing file descriptors - device=file descriptor
729 729
730 730
731 If you set up a file descriptor on the UML command line, you can 731 If you set up a file descriptor on the UML command line, you can
@@ -743,7 +743,7 @@
743 743
744 744
745 745
746 +o Nothing - device=null 746 o Nothing - device=null
747 747
748 748
749 This allows the device to be opened, in contrast to 'none', but 749 This allows the device to be opened, in contrast to 'none', but
@@ -754,7 +754,7 @@
754 754
755 755
756 756
757 +o None - device=none 757 o None - device=none
758 758
759 759
760 This causes the device to disappear. 760 This causes the device to disappear.
@@ -770,7 +770,7 @@
770 770
771 771
772 772
773 will cause serial line 3 to accept input on the host's /dev/tty3 and 773 will cause serial line 3 to accept input on the host's /dev/tty2 and
774 display output on an xterm. That's a silly example - the most common 774 display output on an xterm. That's a silly example - the most common
775 use of this syntax is to reattach the main console to stdin and stdout 775 use of this syntax is to reattach the main console to stdin and stdout
776 as shown above. 776 as shown above.
@@ -785,7 +785,7 @@
785 785
786 786
787 787
788 55..33.. EExxaammpplleess 788 5.3. Examples
789 789
790 There are a number of interesting things you can do with this 790 There are a number of interesting things you can do with this
791 capability. 791 capability.
@@ -838,7 +838,7 @@
838 prompt of the other virtual machine. 838 prompt of the other virtual machine.
839 839
840 840
841 66.. SSeettttiinngg uupp tthhee nneettwwoorrkk 841 6. Setting up the network
842 842
843 843
844 844
@@ -858,19 +858,19 @@
858 There are currently five transport types available for a UML virtual 858 There are currently five transport types available for a UML virtual
859 machine to exchange packets with other hosts: 859 machine to exchange packets with other hosts:
860 860
861 +o ethertap 861 o ethertap
862 862
863 +o TUN/TAP 863 o TUN/TAP
864 864
865 +o Multicast 865 o Multicast
866 866
867 +o a switch daemon 867 o a switch daemon
868 868
869 +o slip 869 o slip
870 870
871 +o slirp 871 o slirp
872 872
873 +o pcap 873 o pcap
874 874
875 The TUN/TAP, ethertap, slip, and slirp transports allow a UML 875 The TUN/TAP, ethertap, slip, and slirp transports allow a UML
876 instance to exchange packets with the host. They may be directed 876 instance to exchange packets with the host. They may be directed
@@ -893,28 +893,28 @@
893 With so many host transports, which one should you use? Here's when 893 With so many host transports, which one should you use? Here's when
894 you should use each one: 894 you should use each one:
895 895
896 +o ethertap - if you want access to the host networking and it is 896 o ethertap - if you want access to the host networking and it is
897 running 2.2 897 running 2.2
898 898
899 +o TUN/TAP - if you want access to the host networking and it is 899 o TUN/TAP - if you want access to the host networking and it is
900 running 2.4. Also, the TUN/TAP transport is able to use a 900 running 2.4. Also, the TUN/TAP transport is able to use a
901 preconfigured device, allowing it to avoid using the setuid uml_net 901 preconfigured device, allowing it to avoid using the setuid uml_net
902 helper, which is a security advantage. 902 helper, which is a security advantage.
903 903
904 +o Multicast - if you want a purely virtual network and you don't want 904 o Multicast - if you want a purely virtual network and you don't want
905 to set up anything but the UML 905 to set up anything but the UML
906 906
907 +o a switch daemon - if you want a purely virtual network and you 907 o a switch daemon - if you want a purely virtual network and you
908 don't mind running the daemon in order to get somewhat better 908 don't mind running the daemon in order to get somewhat better
909 performance 909 performance
910 910
911 +o slip - there is no particular reason to run the slip backend unless 911 o slip - there is no particular reason to run the slip backend unless
912 ethertap and TUN/TAP are just not available for some reason 912 ethertap and TUN/TAP are just not available for some reason
913 913
914 +o slirp - if you don't have root access on the host to setup 914 o slirp - if you don't have root access on the host to setup
915 networking, or if you don't want to allocate an IP to your UML 915 networking, or if you don't want to allocate an IP to your UML
916 916
917 +o pcap - not much use for actual network connectivity, but great for 917 o pcap - not much use for actual network connectivity, but great for
918 monitoring traffic on the host 918 monitoring traffic on the host
919 919
920 Ethertap is available on 2.4 and works fine. TUN/TAP is preferred 920 Ethertap is available on 2.4 and works fine. TUN/TAP is preferred
@@ -926,7 +926,7 @@
926 exploit the helper's root privileges. 926 exploit the helper's root privileges.
927 927
928 928
929 66..11.. GGeenneerraall sseettuupp 929 6.1. General setup
930 930
931 First, you must have the virtual network enabled in your UML. If are 931 First, you must have the virtual network enabled in your UML. If are
932 running a prebuilt kernel from this site, everything is already 932 running a prebuilt kernel from this site, everything is already
@@ -995,7 +995,7 @@
995 995
996 996
997 997
998 66..22.. UUsseerrssppaaccee ddaaeemmoonnss 998 6.2. Userspace daemons
999 999
1000 You will likely need the setuid helper, or the switch daemon, or both. 1000 You will likely need the setuid helper, or the switch daemon, or both.
1001 They are both installed with the RPM and deb, so if you've installed 1001 They are both installed with the RPM and deb, so if you've installed
@@ -1011,7 +1011,7 @@
1011 1011
1012 1012
1013 1013
1014 66..33.. SSppeecciiffyyiinngg eetthheerrnneett aaddddrreesssseess 1014 6.3. Specifying ethernet addresses
1015 1015
1016 Below, you will see that the TUN/TAP, ethertap, and daemon interfaces 1016 Below, you will see that the TUN/TAP, ethertap, and daemon interfaces
1017 allow you to specify hardware addresses for the virtual ethernet 1017 allow you to specify hardware addresses for the virtual ethernet
@@ -1023,11 +1023,11 @@
1023 sufficient to guarantee a unique hardware address for the device. A 1023 sufficient to guarantee a unique hardware address for the device. A
1024 couple of exceptions are: 1024 couple of exceptions are:
1025 1025
1026 +o Another set of virtual ethernet devices are on the same network and 1026 o Another set of virtual ethernet devices are on the same network and
1027 they are assigned hardware addresses using a different scheme which 1027 they are assigned hardware addresses using a different scheme which
1028 may conflict with the UML IP address-based scheme 1028 may conflict with the UML IP address-based scheme
1029 1029
1030 +o You aren't going to use the device for IP networking, so you don't 1030 o You aren't going to use the device for IP networking, so you don't
1031 assign the device an IP address 1031 assign the device an IP address
1032 1032
1033 If you let the driver provide the hardware address, you should make 1033 If you let the driver provide the hardware address, you should make
@@ -1049,7 +1049,7 @@
1049 1049
1050 1050
1051 1051
1052 66..44.. UUMMLL iinntteerrffaaccee sseettuupp 1052 6.4. UML interface setup
1053 1053
1054 Once the network devices have been described on the command line, you 1054 Once the network devices have been described on the command line, you
1055 should boot UML and log in. 1055 should boot UML and log in.
@@ -1131,7 +1131,7 @@
1131 1131
1132 1132
1133 1133
1134 66..55.. MMuullttiiccaasstt 1134 6.5. Multicast
1135 1135
1136 The simplest way to set up a virtual network between multiple UMLs is 1136 The simplest way to set up a virtual network between multiple UMLs is
1137 to use the mcast transport. This was written by Harald Welte and is 1137 to use the mcast transport. This was written by Harald Welte and is
@@ -1194,7 +1194,7 @@
1194 1194
1195 1195
1196 1196
1197 66..66.. TTUUNN//TTAAPP wwiitthh tthhee uummll__nneett hheellppeerr 1197 6.6. TUN/TAP with the uml_net helper
1198 1198
1199 TUN/TAP is the preferred mechanism on 2.4 to exchange packets with the 1199 TUN/TAP is the preferred mechanism on 2.4 to exchange packets with the
1200 host. The TUN/TAP backend has been in UML since 2.4.9-3um. 1200 host. The TUN/TAP backend has been in UML since 2.4.9-3um.
@@ -1247,10 +1247,10 @@
1247 There are a couple potential problems with running the TUN/TAP 1247 There are a couple potential problems with running the TUN/TAP
1248 transport on a 2.4 host kernel 1248 transport on a 2.4 host kernel
1249 1249
1250 +o TUN/TAP seems not to work on 2.4.3 and earlier. Upgrade the host 1250 o TUN/TAP seems not to work on 2.4.3 and earlier. Upgrade the host
1251 kernel or use the ethertap transport. 1251 kernel or use the ethertap transport.
1252 1252
1253 +o With an upgraded kernel, TUN/TAP may fail with 1253 o With an upgraded kernel, TUN/TAP may fail with
1254 1254
1255 1255
1256 File descriptor in bad state 1256 File descriptor in bad state
@@ -1269,7 +1269,7 @@
1269 1269
1270 1270
1271 1271
1272 66..77.. TTUUNN//TTAAPP wwiitthh aa pprreeccoonnffiigguurreedd ttaapp ddeevviiccee 1272 6.7. TUN/TAP with a preconfigured tap device
1273 1273
1274 If you prefer not to have UML use uml_net (which is somewhat 1274 If you prefer not to have UML use uml_net (which is somewhat
1275 insecure), with UML 2.4.17-11, you can set up a TUN/TAP device 1275 insecure), with UML 2.4.17-11, you can set up a TUN/TAP device
@@ -1277,7 +1277,7 @@
1277 there is no need for root assistance. Setting up the device is done 1277 there is no need for root assistance. Setting up the device is done
1278 as follows: 1278 as follows:
1279 1279
1280 +o Create the device with tunctl (available from the UML utilities 1280 o Create the device with tunctl (available from the UML utilities
1281 tarball) 1281 tarball)
1282 1282
1283 1283
@@ -1291,7 +1291,7 @@
1291 where uid is the user id or username that UML will be run as. This 1291 where uid is the user id or username that UML will be run as. This
1292 will tell you what device was created. 1292 will tell you what device was created.
1293 1293
1294 +o Configure the device IP (change IP addresses and device name to 1294 o Configure the device IP (change IP addresses and device name to
1295 suit) 1295 suit)
1296 1296
1297 1297
@@ -1303,7 +1303,7 @@
1303 1303
1304 1304
1305 1305
1306 +o Set up routing and arping if desired - this is my recipe, there are 1306 o Set up routing and arping if desired - this is my recipe, there are
1307 other ways of doing the same thing 1307 other ways of doing the same thing
1308 1308
1309 1309
@@ -1338,7 +1338,7 @@
1338 utility which reads the information from a config file and sets up 1338 utility which reads the information from a config file and sets up
1339 devices at boot time. 1339 devices at boot time.
1340 1340
1341 +o Rather than using up two IPs and ARPing for one of them, you can 1341 o Rather than using up two IPs and ARPing for one of them, you can
1342 also provide direct access to your LAN by the UML by using a 1342 also provide direct access to your LAN by the UML by using a
1343 bridge. 1343 bridge.
1344 1344
@@ -1417,7 +1417,7 @@
1417 Note that 'br0' should be setup using ifconfig with the existing IP 1417 Note that 'br0' should be setup using ifconfig with the existing IP
1418 address of eth0, as eth0 no longer has its own IP. 1418 address of eth0, as eth0 no longer has its own IP.
1419 1419
1420 +o 1420 o
1421 1421
1422 1422
1423 Also, the /dev/net/tun device must be writable by the user running 1423 Also, the /dev/net/tun device must be writable by the user running
@@ -1438,11 +1438,11 @@
1438 devices and chgrp /dev/net/tun to that group with mode 664 or 660. 1438 devices and chgrp /dev/net/tun to that group with mode 664 or 660.
1439 1439
1440 1440
1441 +o Once the device is set up, run UML with 'eth0=tuntap,device name' 1441 o Once the device is set up, run UML with 'eth0=tuntap,device name'
1442 (i.e. 'eth0=tuntap,tap0') on the command line (or do it with the 1442 (i.e. 'eth0=tuntap,tap0') on the command line (or do it with the
1443 mconsole config command). 1443 mconsole config command).
1444 1444
1445 +o Bring the eth device up in UML and you're in business. 1445 o Bring the eth device up in UML and you're in business.
1446 1446
1447 If you don't want that tap device any more, you can make it non- 1447 If you don't want that tap device any more, you can make it non-
1448 persistent with 1448 persistent with
@@ -1465,7 +1465,7 @@
1465 1465
1466 1466
1467 1467
1468 66..88.. EEtthheerrttaapp 1468 6.8. Ethertap
1469 1469
1470 Ethertap is the general mechanism on 2.2 for userspace processes to 1470 Ethertap is the general mechanism on 2.2 for userspace processes to
1471 exchange packets with the kernel. 1471 exchange packets with the kernel.
@@ -1561,9 +1561,9 @@
1561 1561
1562 1562
1563 1563
1564 66..99.. TThhee sswwiittcchh ddaaeemmoonn 1564 6.9. The switch daemon
1565 1565
1566 NNoottee: This is the daemon formerly known as uml_router, but which was 1566 Note: This is the daemon formerly known as uml_router, but which was
1567 renamed so the network weenies of the world would stop growling at me. 1567 renamed so the network weenies of the world would stop growling at me.
1568 1568
1569 1569
@@ -1649,7 +1649,7 @@
1649 1649
1650 1650
1651 1651
1652 66..1100.. SSlliipp 1652 6.10. Slip
1653 1653
1654 Slip is another, less general, mechanism for a process to communicate 1654 Slip is another, less general, mechanism for a process to communicate
1655 with the host networking. In contrast to the ethertap interface, 1655 with the host networking. In contrast to the ethertap interface,
@@ -1681,7 +1681,7 @@
1681 1681
1682 1682
1683 1683
1684 66..1111.. SSlliirrpp 1684 6.11. Slirp
1685 1685
1686 slirp uses an external program, usually /usr/bin/slirp, to provide IP 1686 slirp uses an external program, usually /usr/bin/slirp, to provide IP
1687 only networking connectivity through the host. This is similar to IP 1687 only networking connectivity through the host. This is similar to IP
@@ -1737,7 +1737,7 @@
1737 1737
1738 1738
1739 1739
1740 66..1122.. ppccaapp 1740 6.12. pcap
1741 1741
1742 The pcap transport is attached to a UML ethernet device on the command 1742 The pcap transport is attached to a UML ethernet device on the command
1743 line or with uml_mconsole with the following syntax: 1743 line or with uml_mconsole with the following syntax:
@@ -1777,7 +1777,7 @@
1777 1777
1778 1778
1779 1779
1780 66..1133.. SSeettttiinngg uupp tthhee hhoosstt yyoouurrsseellff 1780 6.13. Setting up the host yourself
1781 1781
1782 If you don't specify an address for the host side of the ethertap or 1782 If you don't specify an address for the host side of the ethertap or
1783 slip device, UML won't do any setup on the host. So this is what is 1783 slip device, UML won't do any setup on the host. So this is what is
@@ -1785,7 +1785,7 @@
1785 192.168.0.251 and a UML-side IP of 192.168.0.250 - adjust to suit your 1785 192.168.0.251 and a UML-side IP of 192.168.0.250 - adjust to suit your
1786 own network): 1786 own network):
1787 1787
1788 +o The device needs to be configured with its IP address. Tap devices 1788 o The device needs to be configured with its IP address. Tap devices
1789 are also configured with an mtu of 1484. Slip devices are 1789 are also configured with an mtu of 1484. Slip devices are
1790 configured with a point-to-point address pointing at the UML ip 1790 configured with a point-to-point address pointing at the UML ip
1791 address. 1791 address.
@@ -1805,7 +1805,7 @@
1805 1805
1806 1806
1807 1807
1808 +o If a tap device is being set up, a route is set to the UML IP. 1808 o If a tap device is being set up, a route is set to the UML IP.
1809 1809
1810 1810
1811 UML# route add -host 192.168.0.250 gw 192.168.0.251 1811 UML# route add -host 192.168.0.250 gw 192.168.0.251
@@ -1814,7 +1814,7 @@
1814 1814
1815 1815
1816 1816
1817 +o To allow other hosts on your network to see the virtual machine, 1817 o To allow other hosts on your network to see the virtual machine,
1818 proxy arp is set up for it. 1818 proxy arp is set up for it.
1819 1819
1820 1820
@@ -1824,7 +1824,7 @@
1824 1824
1825 1825
1826 1826
1827 +o Finally, the host is set up to route packets. 1827 o Finally, the host is set up to route packets.
1828 1828
1829 1829
1830 host# echo 1 > /proc/sys/net/ipv4/ip_forward 1830 host# echo 1 > /proc/sys/net/ipv4/ip_forward
@@ -1838,12 +1838,12 @@
1838 1838
1839 1839
1840 1840
1841 77.. SShhaarriinngg FFiilleessyysstteemmss bbeettwweeeenn VViirrttuuaall MMaacchhiinneess 1841 7. Sharing Filesystems between Virtual Machines
1842 1842
1843 1843
1844 1844
1845 1845
1846 77..11.. AA wwaarrnniinngg 1846 7.1. A warning
1847 1847
1848 Don't attempt to share filesystems simply by booting two UMLs from the 1848 Don't attempt to share filesystems simply by booting two UMLs from the
1849 same file. That's the same thing as booting two physical machines 1849 same file. That's the same thing as booting two physical machines
@@ -1851,7 +1851,7 @@
1851 1851
1852 1852
1853 1853
1854 77..22.. UUssiinngg llaayyeerreedd bblloocckk ddeevviicceess 1854 7.2. Using layered block devices
1855 1855
1856 The way to share a filesystem between two virtual machines is to use 1856 The way to share a filesystem between two virtual machines is to use
1857 the copy-on-write (COW) layering capability of the ubd block driver. 1857 the copy-on-write (COW) layering capability of the ubd block driver.
@@ -1896,7 +1896,7 @@
1896 1896
1897 1897
1898 1898
1899 77..33.. NNoottee!! 1899 7.3. Note!
1900 1900
1901 When checking the size of the COW file in order to see the gobs of 1901 When checking the size of the COW file in order to see the gobs of
1902 space that you're saving, make sure you use 'ls -ls' to see the actual 1902 space that you're saving, make sure you use 'ls -ls' to see the actual
@@ -1926,7 +1926,7 @@
1926 1926
1927 1927
1928 1928
1929 77..44.. AAnnootthheerr wwaarrnniinngg 1929 7.4. Another warning
1930 1930
1931 Once a filesystem is being used as a readonly backing file for a COW 1931 Once a filesystem is being used as a readonly backing file for a COW
1932 file, do not boot directly from it or modify it in any way. Doing so 1932 file, do not boot directly from it or modify it in any way. Doing so
@@ -1952,7 +1952,7 @@
1952 1952
1953 1953
1954 1954
1955 77..55.. uummll__mmoooo :: MMeerrggiinngg aa CCOOWW ffiillee wwiitthh iittss bbaacckkiinngg ffiillee 1955 7.5. uml_moo : Merging a COW file with its backing file
1956 1956
1957 Depending on how you use UML and COW devices, it may be advisable to 1957 Depending on how you use UML and COW devices, it may be advisable to
1958 merge the changes in the COW file into the backing file every once in 1958 merge the changes in the COW file into the backing file every once in
@@ -2001,7 +2001,7 @@
2001 2001
2002 2002
2003 2003
2004 88.. CCrreeaattiinngg ffiilleessyysstteemmss 2004 8. Creating filesystems
2005 2005
2006 2006
2007 You may want to create and mount new UML filesystems, either because 2007 You may want to create and mount new UML filesystems, either because
@@ -2015,7 +2015,7 @@
2015 should be easy to translate to the filesystem of your choice. 2015 should be easy to translate to the filesystem of your choice.
2016 2016
2017 2017
2018 88..11.. CCrreeaattee tthhee ffiilleessyysstteemm ffiillee 2018 8.1. Create the filesystem file
2019 2019
2020 dd is your friend. All you need to do is tell dd to create an empty 2020 dd is your friend. All you need to do is tell dd to create an empty
2021 file of the appropriate size. I usually make it sparse to save time 2021 file of the appropriate size. I usually make it sparse to save time
@@ -2032,7 +2032,7 @@
2032 2032
2033 2033
2034 2034
2035 88..22.. AAssssiiggnn tthhee ffiillee ttoo aa UUMMLL ddeevviiccee 2035 8.2. Assign the file to a UML device
2036 2036
2037 Add an argument like the following to the UML command line: 2037 Add an argument like the following to the UML command line:
2038 2038
@@ -2045,7 +2045,7 @@
2045 2045
2046 2046
2047 2047
2048 88..33.. CCrreeaattiinngg aanndd mmoouunnttiinngg tthhee ffiilleessyysstteemm 2048 8.3. Creating and mounting the filesystem
2049 2049
2050 Make sure that the filesystem is available, either by being built into 2050 Make sure that the filesystem is available, either by being built into
2051 the kernel, or available as a module, then boot up UML and log in. If 2051 the kernel, or available as a module, then boot up UML and log in. If
@@ -2096,7 +2096,7 @@
2096 2096
2097 2097
2098 2098
2099 99.. HHoosstt ffiillee aacccceessss 2099 9. Host file access
2100 2100
2101 2101
2102 If you want to access files on the host machine from inside UML, you 2102 If you want to access files on the host machine from inside UML, you
@@ -2112,7 +2112,7 @@
2112 files contained in it just as you would on the host. 2112 files contained in it just as you would on the host.
2113 2113
2114 2114
2115 99..11.. UUssiinngg hhoossttffss 2115 9.1. Using hostfs
2116 2116
2117 To begin with, make sure that hostfs is available inside the virtual 2117 To begin with, make sure that hostfs is available inside the virtual
2118 machine with 2118 machine with
@@ -2151,7 +2151,7 @@
2151 2151
2152 2152
2153 2153
2154 99..22.. hhoossttffss aass tthhee rroooott ffiilleessyysstteemm 2154 9.2. hostfs as the root filesystem
2155 2155
2156 It's possible to boot from a directory hierarchy on the host using 2156 It's possible to boot from a directory hierarchy on the host using
2157 hostfs rather than using the standard filesystem in a file. 2157 hostfs rather than using the standard filesystem in a file.
@@ -2194,20 +2194,20 @@
2194 UML should then boot as it does normally. 2194 UML should then boot as it does normally.
2195 2195
2196 2196
2197 99..33.. BBuuiillddiinngg hhoossttffss 2197 9.3. Building hostfs
2198 2198
2199 If you need to build hostfs because it's not in your kernel, you have 2199 If you need to build hostfs because it's not in your kernel, you have
2200 two choices: 2200 two choices:
2201 2201
2202 2202
2203 2203
2204 +o Compiling hostfs into the kernel: 2204 o Compiling hostfs into the kernel:
2205 2205
2206 2206
2207 Reconfigure the kernel and set the 'Host filesystem' option under 2207 Reconfigure the kernel and set the 'Host filesystem' option under
2208 2208
2209 2209
2210 +o Compiling hostfs as a module: 2210 o Compiling hostfs as a module:
2211 2211
2212 2212
2213 Reconfigure the kernel and set the 'Host filesystem' option under 2213 Reconfigure the kernel and set the 'Host filesystem' option under
@@ -2228,7 +2228,7 @@
2228 2228
2229 2229
2230 2230
2231 1100.. TThhee MMaannaaggeemmeenntt CCoonnssoollee 2231 10. The Management Console
2232 2232
2233 2233
2234 2234
@@ -2240,15 +2240,15 @@
2240 2240
2241 There are a number of things you can do with the mconsole interface: 2241 There are a number of things you can do with the mconsole interface:
2242 2242
2243 +o get the kernel version 2243 o get the kernel version
2244 2244
2245 +o add and remove devices 2245 o add and remove devices
2246 2246
2247 +o halt or reboot the machine 2247 o halt or reboot the machine
2248 2248
2249 +o Send SysRq commands 2249 o Send SysRq commands
2250 2250
2251 +o Pause and resume the UML 2251 o Pause and resume the UML
2252 2252
2253 2253
2254 You need the mconsole client (uml_mconsole) which is present in CVS 2254 You need the mconsole client (uml_mconsole) which is present in CVS
@@ -2300,28 +2300,28 @@
2300 2300
2301 You'll get a prompt, at which you can run one of these commands: 2301 You'll get a prompt, at which you can run one of these commands:
2302 2302
2303 +o version 2303 o version
2304 2304
2305 +o halt 2305 o halt
2306 2306
2307 +o reboot 2307 o reboot
2308 2308
2309 +o config 2309 o config
2310 2310
2311 +o remove 2311 o remove
2312 2312
2313 +o sysrq 2313 o sysrq
2314 2314
2315 +o help 2315 o help
2316 2316
2317 +o cad 2317 o cad
2318 2318
2319 +o stop 2319 o stop
2320 2320
2321 +o go 2321 o go
2322 2322
2323 2323
2324 1100..11.. vveerrssiioonn 2324 10.1. version
2325 2325
2326 This takes no arguments. It prints the UML version. 2326 This takes no arguments. It prints the UML version.
2327 2327
@@ -2342,7 +2342,7 @@
2342 2342
2343 2343
2344 2344
2345 1100..22.. hhaalltt aanndd rreebboooott 2345 10.2. halt and reboot
2346 2346
2347 These take no arguments. They shut the machine down immediately, with 2347 These take no arguments. They shut the machine down immediately, with
2348 no syncing of disks and no clean shutdown of userspace. So, they are 2348 no syncing of disks and no clean shutdown of userspace. So, they are
@@ -2357,7 +2357,7 @@
2357 2357
2358 2358
2359 2359
2360 1100..33.. ccoonnffiigg 2360 10.3. config
2361 2361
2362 "config" adds a new device to the virtual machine. Currently the ubd 2362 "config" adds a new device to the virtual machine. Currently the ubd
2363 and network drivers support this. It takes one argument, which is the 2363 and network drivers support this. It takes one argument, which is the
@@ -2378,7 +2378,7 @@
2378 2378
2379 2379
2380 2380
2381 1100..44.. rreemmoovvee 2381 10.4. remove
2382 2382
2383 "remove" deletes a device from the system. Its argument is just the 2383 "remove" deletes a device from the system. Its argument is just the
2384 name of the device to be removed. The device must be idle in whatever 2384 name of the device to be removed. The device must be idle in whatever
@@ -2397,7 +2397,7 @@
2397 2397
2398 2398
2399 2399
2400 1100..55.. ssyyssrrqq 2400 10.5. sysrq
2401 2401
2402 This takes one argument, which is a single letter. It calls the 2402 This takes one argument, which is a single letter. It calls the
2403 generic kernel's SysRq driver, which does whatever is called for by 2403 generic kernel's SysRq driver, which does whatever is called for by
@@ -2407,14 +2407,14 @@
2407 2407
2408 2408
2409 2409
2410 1100..66.. hheellpp 2410 10.6. help
2411 2411
2412 "help" returns a string listing the valid commands and what each one 2412 "help" returns a string listing the valid commands and what each one
2413 does. 2413 does.
2414 2414
2415 2415
2416 2416
2417 1100..77.. ccaadd 2417 10.7. cad
2418 2418
2419 This invokes the Ctl-Alt-Del action on init. What exactly this ends 2419 This invokes the Ctl-Alt-Del action on init. What exactly this ends
2420 up doing is up to /etc/inittab. Normally, it reboots the machine. 2420 up doing is up to /etc/inittab. Normally, it reboots the machine.
@@ -2432,7 +2432,7 @@
2432 2432
2433 2433
2434 2434
2435 1100..88.. ssttoopp 2435 10.8. stop
2436 2436
2437 This puts the UML in a loop reading mconsole requests until a 'go' 2437 This puts the UML in a loop reading mconsole requests until a 'go'
2438 mconsole command is received. This is very useful for making backups 2438 mconsole command is received. This is very useful for making backups
@@ -2448,7 +2448,7 @@
2448 2448
2449 2449
2450 2450
2451 1100..99.. ggoo 2451 10.9. go
2452 2452
2453 This resumes a UML after being paused by a 'stop' command. Note that 2453 This resumes a UML after being paused by a 'stop' command. Note that
2454 when the UML has resumed, TCP connections may have timed out and if 2454 when the UML has resumed, TCP connections may have timed out and if
@@ -2462,10 +2462,10 @@
2462 2462
2463 2463
2464 2464
2465 1111.. KKeerrnneell ddeebbuuggggiinngg 2465 11. Kernel debugging
2466 2466
2467 2467
2468 NNoottee:: The interface that makes debugging, as described here, possible 2468 Note: The interface that makes debugging, as described here, possible
2469 is present in 2.4.0-test6 kernels and later. 2469 is present in 2.4.0-test6 kernels and later.
2470 2470
2471 2471
@@ -2485,7 +2485,7 @@
2485 2485
2486 2486
2487 2487
2488 1111..11.. SSttaarrttiinngg tthhee kkeerrnneell uunnddeerr ggddbb 2488 11.1. Starting the kernel under gdb
2489 2489
2490 You can have the kernel running under the control of gdb from the 2490 You can have the kernel running under the control of gdb from the
2491 beginning by putting 'debug' on the command line. You will get an 2491 beginning by putting 'debug' on the command line. You will get an
@@ -2498,7 +2498,7 @@
2498 There is a transcript of a debugging session here <debug- 2498 There is a transcript of a debugging session here <debug-
2499 session.html> , with breakpoints being set in the scheduler and in an 2499 session.html> , with breakpoints being set in the scheduler and in an
2500 interrupt handler. 2500 interrupt handler.
2501 1111..22.. EExxaammiinniinngg sslleeeeppiinngg pprroocceesssseess 2501 11.2. Examining sleeping processes
2502 2502
2503 Not every bug is evident in the currently running process. Sometimes, 2503 Not every bug is evident in the currently running process. Sometimes,
2504 processes hang in the kernel when they shouldn't because they've 2504 processes hang in the kernel when they shouldn't because they've
@@ -2516,7 +2516,7 @@
2516 2516
2517 Now what you do is this: 2517 Now what you do is this:
2518 2518
2519 +o detach from the current thread 2519 o detach from the current thread
2520 2520
2521 2521
2522 (UML gdb) det 2522 (UML gdb) det
@@ -2525,7 +2525,7 @@
2525 2525
2526 2526
2527 2527
2528 +o attach to the thread you are interested in 2528 o attach to the thread you are interested in
2529 2529
2530 2530
2531 (UML gdb) att <host pid> 2531 (UML gdb) att <host pid>
@@ -2534,7 +2534,7 @@
2534 2534
2535 2535
2536 2536
2537 +o look at its stack and anything else of interest 2537 o look at its stack and anything else of interest
2538 2538
2539 2539
2540 (UML gdb) bt 2540 (UML gdb) bt
@@ -2545,7 +2545,7 @@
2545 Note that you can't do anything at this point that requires that a 2545 Note that you can't do anything at this point that requires that a
2546 process execute, e.g. calling a function 2546 process execute, e.g. calling a function
2547 2547
2548 +o when you're done looking at that process, reattach to the current 2548 o when you're done looking at that process, reattach to the current
2549 thread and continue it 2549 thread and continue it
2550 2550
2551 2551
@@ -2569,12 +2569,12 @@
2569 2569
2570 2570
2571 2571
2572 1111..33.. RRuunnnniinngg dddddd oonn UUMMLL 2572 11.3. Running ddd on UML
2573 2573
2574 ddd works on UML, but requires a special kludge. The process goes 2574 ddd works on UML, but requires a special kludge. The process goes
2575 like this: 2575 like this:
2576 2576
2577 +o Start ddd 2577 o Start ddd
2578 2578
2579 2579
2580 host% ddd linux 2580 host% ddd linux
@@ -2583,14 +2583,14 @@
2583 2583
2584 2584
2585 2585
2586 +o With ps, get the pid of the gdb that ddd started. You can ask the 2586 o With ps, get the pid of the gdb that ddd started. You can ask the
2587 gdb to tell you, but for some reason that confuses things and 2587 gdb to tell you, but for some reason that confuses things and
2588 causes a hang. 2588 causes a hang.
2589 2589
2590 +o run UML with 'debug=parent gdb-pid=<pid>' added to the command line 2590 o run UML with 'debug=parent gdb-pid=<pid>' added to the command line
2591 - it will just sit there after you hit return 2591 - it will just sit there after you hit return
2592 2592
2593 +o type 'att 1' to the ddd gdb and you will see something like 2593 o type 'att 1' to the ddd gdb and you will see something like
2594 2594
2595 2595
2596 0xa013dc51 in __kill () 2596 0xa013dc51 in __kill ()
@@ -2602,12 +2602,12 @@
2602 2602
2603 2603
2604 2604
2605 +o At this point, type 'c', UML will boot up, and you can use ddd just 2605 o At this point, type 'c', UML will boot up, and you can use ddd just
2606 as you do on any other process. 2606 as you do on any other process.
2607 2607
2608 2608
2609 2609
2610 1111..44.. DDeebbuuggggiinngg mmoodduulleess 2610 11.4. Debugging modules
2611 2611
2612 gdb has support for debugging code which is dynamically loaded into 2612 gdb has support for debugging code which is dynamically loaded into
2613 the process. This support is what is needed to debug kernel modules 2613 the process. This support is what is needed to debug kernel modules
@@ -2823,7 +2823,7 @@
2823 2823
2824 2824
2825 2825
2826 1111..55.. AAttttaacchhiinngg ggddbb ttoo tthhee kkeerrnneell 2826 11.5. Attaching gdb to the kernel
2827 2827
2828 If you don't have the kernel running under gdb, you can attach gdb to 2828 If you don't have the kernel running under gdb, you can attach gdb to
2829 it later by sending the tracing thread a SIGUSR1. The first line of 2829 it later by sending the tracing thread a SIGUSR1. The first line of
@@ -2857,7 +2857,7 @@
2857 2857
2858 2858
2859 2859
2860 1111..66.. UUssiinngg aalltteerrnnaattee ddeebbuuggggeerrss 2860 11.6. Using alternate debuggers
2861 2861
2862 UML has support for attaching to an already running debugger rather 2862 UML has support for attaching to an already running debugger rather
2863 than starting gdb itself. This is present in CVS as of 17 Apr 2001. 2863 than starting gdb itself. This is present in CVS as of 17 Apr 2001.
@@ -2886,7 +2886,7 @@
2886 An example of an alternate debugger is strace. You can strace the 2886 An example of an alternate debugger is strace. You can strace the
2887 actual kernel as follows: 2887 actual kernel as follows:
2888 2888
2889 +o Run the following in a shell 2889 o Run the following in a shell
2890 2890
2891 2891
2892 host% 2892 host%
@@ -2894,10 +2894,10 @@
2894 2894
2895 2895
2896 2896
2897 +o Run UML with 'debug' and 'gdb-pid=<pid>' with the pid printed out 2897 o Run UML with 'debug' and 'gdb-pid=<pid>' with the pid printed out
2898 by the previous command 2898 by the previous command
2899 2899
2900 +o Hit return in the shell, and UML will start running, and strace 2900 o Hit return in the shell, and UML will start running, and strace
2901 output will start accumulating in the output file. 2901 output will start accumulating in the output file.
2902 2902
2903 Note that this is different from running 2903 Note that this is different from running
@@ -2917,9 +2917,9 @@
2917 2917
2918 2918
2919 2919
2920 1122.. KKeerrnneell ddeebbuuggggiinngg eexxaammpplleess 2920 12. Kernel debugging examples
2921 2921
2922 1122..11.. TThhee ccaassee ooff tthhee hhuunngg ffsscckk 2922 12.1. The case of the hung fsck
2923 2923
2924 When booting up the kernel, fsck failed, and dropped me into a shell 2924 When booting up the kernel, fsck failed, and dropped me into a shell
2925 to fix things up. I ran fsck -y, which hung: 2925 to fix things up. I ran fsck -y, which hung:
@@ -3154,9 +3154,9 @@
3154 3154
3155 The interesting things here are : 3155 The interesting things here are :
3156 3156
3157 +o There are two segfaults on this stack (frames 9 and 14) 3157 o There are two segfaults on this stack (frames 9 and 14)
3158 3158
3159 +o The first faulting address (frame 11) is 0x50000800 3159 o The first faulting address (frame 11) is 0x50000800
3160 3160
3161 (gdb) p (void *)1342179328 3161 (gdb) p (void *)1342179328
3162 $16 = (void *) 0x50000800 3162 $16 = (void *) 0x50000800
@@ -3399,7 +3399,7 @@
3399 on will be somewhat clearer. 3399 on will be somewhat clearer.
3400 3400
3401 3401
3402 1122..22.. EEppiissooddee 22:: TThhee ccaassee ooff tthhee hhuunngg ffsscckk 3402 12.2. Episode 2: The case of the hung fsck
3403 3403
3404 After setting a trap in the SEGV handler for accesses to the signal 3404 After setting a trap in the SEGV handler for accesses to the signal
3405 thread's stack, I reran the kernel. 3405 thread's stack, I reran the kernel.
@@ -3788,12 +3788,12 @@
3788 3788
3789 3789
3790 3790
3791 1133.. WWhhaatt ttoo ddoo wwhheenn UUMMLL ddooeessnn''tt wwoorrkk 3791 13. What to do when UML doesn't work
3792 3792
3793 3793
3794 3794
3795 3795
3796 1133..11.. SSttrraannggee ccoommppiillaattiioonn eerrrroorrss wwhheenn yyoouu bbuuiilldd ffrroomm ssoouurrccee 3796 13.1. Strange compilation errors when you build from source
3797 3797
3798 As of test11, it is necessary to have "ARCH=um" in the environment or 3798 As of test11, it is necessary to have "ARCH=um" in the environment or
3799 on the make command line for all steps in building UML, including 3799 on the make command line for all steps in building UML, including
@@ -3824,8 +3824,8 @@
3824 3824
3825 3825
3826 3826
3827 1133..33.. AA vvaarriieettyy ooff ppaanniiccss aanndd hhaannggss wwiitthh //ttmmpp oonn aa rreeiisseerrffss ffiilleessyyss-- 3827 13.3. A variety of panics and hangs with /tmp on a reiserfs filesys-
3828 tteemm 3828 tem
3829 3829
3830 I saw this on reiserfs 3.5.21 and it seems to be fixed in 3.5.27. 3830 I saw this on reiserfs 3.5.21 and it seems to be fixed in 3.5.27.
3831 Panics preceded by 3831 Panics preceded by
@@ -3842,8 +3842,8 @@
3842 3842
3843 3843
3844 3844
3845 1133..44.. TThhee ccoommppiillee ffaaiillss wwiitthh eerrrroorrss aabboouutt ccoonnfflliiccttiinngg ttyyppeess ffoorr 3845 13.4. The compile fails with errors about conflicting types for
3846 ''ooppeenn'',, ''dduupp'',, aanndd ''wwaaiittppiidd'' 3846 'open', 'dup', and 'waitpid'
3847 3847
3848 This happens when you build in /usr/src/linux. The UML build makes 3848 This happens when you build in /usr/src/linux. The UML build makes
3849 the include/asm link point to include/asm-um. /usr/include/asm points 3849 the include/asm link point to include/asm-um. /usr/include/asm points
@@ -3854,14 +3854,14 @@
3854 3854
3855 3855
3856 3856
3857 1133..55.. UUMMLL ddooeessnn''tt wwoorrkk wwhheenn //ttmmpp iiss aann NNFFSS ffiilleessyysstteemm 3857 13.5. UML doesn't work when /tmp is an NFS filesystem
3858 3858
3859 This seems to be a similar situation with the ReiserFS problem above. 3859 This seems to be a similar situation with the ReiserFS problem above.
3860 Some versions of NFS seems not to handle mmap correctly, which UML 3860 Some versions of NFS seems not to handle mmap correctly, which UML
3861 depends on. The workaround is have /tmp be a non-NFS directory. 3861 depends on. The workaround is have /tmp be a non-NFS directory.
3862 3862
3863 3863
3864 1133..66.. UUMMLL hhaannggss oonn bboooott wwhheenn ccoommppiilleedd wwiitthh ggpprrooff ssuuppppoorrtt 3864 13.6. UML hangs on boot when compiled with gprof support
3865 3865
3866 If you build UML with gprof support and, early in the boot, it does 3866 If you build UML with gprof support and, early in the boot, it does
3867 this 3867 this
@@ -3878,7 +3878,7 @@
3878 3878
3879 3879
3880 3880
3881 1133..77.. ssyyssllooggdd ddiieess wwiitthh aa SSIIGGTTEERRMM oonn ssttaarrttuupp 3881 13.7. syslogd dies with a SIGTERM on startup
3882 3882
3883 The exact boot error depends on the distribution that you're booting, 3883 The exact boot error depends on the distribution that you're booting,
3884 but Debian produces this: 3884 but Debian produces this:
@@ -3897,17 +3897,17 @@
3897 3897
3898 3898
3899 3899
3900 1133..88.. TTUUNN//TTAAPP nneettwwoorrkkiinngg ddooeessnn''tt wwoorrkk oonn aa 22..44 hhoosstt 3900 13.8. TUN/TAP networking doesn't work on a 2.4 host
3901 3901
3902 There are a couple of problems which were 3902 There are a couple of problems which were
3903 <http://www.geocrawler.com/lists/3/SourceForge/597/0/> name="pointed 3903 <http://www.geocrawler.com/lists/3/SourceForge/597/0/> name="pointed
3904 out"> by Tim Robinson <timro at trkr dot net> 3904 out"> by Tim Robinson <timro at trkr dot net>
3905 3905
3906 +o It doesn't work on hosts running 2.4.7 (or thereabouts) or earlier. 3906 o It doesn't work on hosts running 2.4.7 (or thereabouts) or earlier.
3907 The fix is to upgrade to something more recent and then read the 3907 The fix is to upgrade to something more recent and then read the
3908 next item. 3908 next item.
3909 3909
3910 +o If you see 3910 o If you see
3911 3911
3912 3912
3913 File descriptor in bad state 3913 File descriptor in bad state
@@ -3921,8 +3921,8 @@
3921 3921
3922 3922
3923 3923
3924 1133..99.. YYoouu ccaann nneettwwoorrkk ttoo tthhee hhoosstt bbuutt nnoott ttoo ootthheerr mmaacchhiinneess oonn tthhee 3924 13.9. You can network to the host but not to other machines on the
3925 nneett 3925 net
3926 3926
3927 If you can connect to the host, and the host can connect to UML, but 3927 If you can connect to the host, and the host can connect to UML, but
3928 you cannot connect to any other machines, then you may need to enable 3928 you cannot connect to any other machines, then you may need to enable
@@ -3972,7 +3972,7 @@
3972 3972
3973 3973
3974 3974
3975 1133..1100.. II hhaavvee nnoo rroooott aanndd II wwaanntt ttoo ssccrreeaamm 3975 13.10. I have no root and I want to scream
3976 3976
3977 Thanks to Birgit Wahlich for telling me about this strange one. It 3977 Thanks to Birgit Wahlich for telling me about this strange one. It
3978 turns out that there's a limit of six environment variables on the 3978 turns out that there's a limit of six environment variables on the
@@ -3987,7 +3987,7 @@
3987 3987
3988 3988
3989 3989
3990 1133..1111.. UUMMLL bbuuiilldd ccoonnfflliicctt bbeettwweeeenn ppttrraaccee..hh aanndd uuccoonntteexxtt..hh 3990 13.11. UML build conflict between ptrace.h and ucontext.h
3991 3991
3992 On some older systems, /usr/include/asm/ptrace.h and 3992 On some older systems, /usr/include/asm/ptrace.h and
3993 /usr/include/sys/ucontext.h define the same names. So, when they're 3993 /usr/include/sys/ucontext.h define the same names. So, when they're
@@ -4007,7 +4007,7 @@
4007 4007
4008 4008
4009 4009
4010 1133..1122.. TThhee UUMMLL BBooggooMMiippss iiss eexxaaccttllyy hhaallff tthhee hhoosstt''ss BBooggooMMiippss 4010 13.12. The UML BogoMips is exactly half the host's BogoMips
4011 4011
4012 On i386 kernels, there are two ways of running the loop that is used 4012 On i386 kernels, there are two ways of running the loop that is used
4013 to calculate the BogoMips rating, using the TSC if it's there or using 4013 to calculate the BogoMips rating, using the TSC if it's there or using
@@ -4019,7 +4019,7 @@
4019 4019
4020 4020
4021 4021
4022 1133..1133.. WWhheenn yyoouu rruunn UUMMLL,, iitt iimmmmeeddiiaatteellyy sseeggffaauullttss 4022 13.13. When you run UML, it immediately segfaults
4023 4023
4024 If the host is configured with the 2G/2G address space split, that's 4024 If the host is configured with the 2G/2G address space split, that's
4025 why. See ``UML on 2G/2G hosts'' for the details on getting UML to 4025 why. See ``UML on 2G/2G hosts'' for the details on getting UML to
@@ -4027,7 +4027,7 @@
4027 4027
4028 4028
4029 4029
4030 1133..1144.. xxtteerrmmss aappppeeaarr,, tthheenn iimmmmeeddiiaatteellyy ddiissaappppeeaarr 4030 13.14. xterms appear, then immediately disappear
4031 4031
4032 If you're running an up to date kernel with an old release of 4032 If you're running an up to date kernel with an old release of
4033 uml_utilities, the port-helper program will not work properly, so 4033 uml_utilities, the port-helper program will not work properly, so
@@ -4039,7 +4039,7 @@
4039 4039
4040 4040
4041 4041
4042 1133..1155.. AAnnyy ootthheerr ppaanniicc,, hhaanngg,, oorr ssttrraannggee bbeehhaavviioorr 4042 13.15. Any other panic, hang, or strange behavior
4043 4043
4044 If you're seeing truly strange behavior, such as hangs or panics that 4044 If you're seeing truly strange behavior, such as hangs or panics that
4045 happen in random places, or you try running the debugger to see what's 4045 happen in random places, or you try running the debugger to see what's
@@ -4059,7 +4059,7 @@
4059 4059
4060 If you want to be super-helpful, read ``Diagnosing Problems'' and 4060 If you want to be super-helpful, read ``Diagnosing Problems'' and
4061 follow the instructions contained therein. 4061 follow the instructions contained therein.
4062 1144.. DDiiaaggnnoossiinngg PPrroobblleemmss 4062 14. Diagnosing Problems
4063 4063
4064 4064
4065 If you get UML to crash, hang, or otherwise misbehave, you should 4065 If you get UML to crash, hang, or otherwise misbehave, you should
@@ -4078,7 +4078,7 @@
4078 ``Kernel debugging'' UML first. 4078 ``Kernel debugging'' UML first.
4079 4079
4080 4080
4081 1144..11.. CCaassee 11 :: NNoorrmmaall kkeerrnneell ppaanniiccss 4081 14.1. Case 1 : Normal kernel panics
4082 4082
4083 The most common case is for a normal thread to panic. To debug this, 4083 The most common case is for a normal thread to panic. To debug this,
4084 you will need to run it under the debugger (add 'debug' to the command 4084 you will need to run it under the debugger (add 'debug' to the command
@@ -4128,7 +4128,7 @@
4128 to get that information from the faulting ip. 4128 to get that information from the faulting ip.
4129 4129
4130 4130
4131 1144..22.. CCaassee 22 :: TTrraacciinngg tthhrreeaadd ppaanniiccss 4131 14.2. Case 2 : Tracing thread panics
4132 4132
4133 The less common and more painful case is when the tracing thread 4133 The less common and more painful case is when the tracing thread
4134 panics. In this case, the kernel debugger will be useless because it 4134 panics. In this case, the kernel debugger will be useless because it
@@ -4161,7 +4161,7 @@
4161 backtrace in and wait for our crack debugging team to fix the problem. 4161 backtrace in and wait for our crack debugging team to fix the problem.
4162 4162
4163 4163
4164 1144..33.. CCaassee 33 :: TTrraacciinngg tthhrreeaadd ppaanniiccss ccaauusseedd bbyy ootthheerr tthhrreeaaddss 4164 14.3. Case 3 : Tracing thread panics caused by other threads
4165 4165
4166 However, there are cases where the misbehavior of another thread 4166 However, there are cases where the misbehavior of another thread
4167 caused the problem. The most common panic of this type is: 4167 caused the problem. The most common panic of this type is:
@@ -4227,7 +4227,7 @@
4227 4227
4228 4228
4229 4229
4230 1144..44.. CCaassee 44 :: HHaannggss 4230 14.4. Case 4 : Hangs
4231 4231
4232 Hangs seem to be fairly rare, but they sometimes happen. When a hang 4232 Hangs seem to be fairly rare, but they sometimes happen. When a hang
4233 happens, we need a backtrace from the offending process. Run the 4233 happens, we need a backtrace from the offending process. Run the
@@ -4257,7 +4257,7 @@
4257 4257
4258 4258
4259 4259
4260 1155.. TThhaannkkss 4260 15. Thanks
4261 4261
4262 4262
4263 A number of people have helped this project in various ways, and this 4263 A number of people have helped this project in various ways, and this
@@ -4274,20 +4274,20 @@
4274 bookkeeping lapses and I forget about contributions. 4274 bookkeeping lapses and I forget about contributions.
4275 4275
4276 4276
4277 1155..11.. CCooddee aanndd DDooccuummeennttaattiioonn 4277 15.1. Code and Documentation
4278 4278
4279 Rusty Russell <rusty at linuxcare.com.au> - 4279 Rusty Russell <rusty at linuxcare.com.au> -
4280 4280
4281 +o wrote the HOWTO <http://user-mode- 4281 o wrote the HOWTO <http://user-mode-
4282 linux.sourceforge.net/UserModeLinux-HOWTO.html> 4282 linux.sourceforge.net/UserModeLinux-HOWTO.html>
4283 4283
4284 +o prodded me into making this project official and putting it on 4284 o prodded me into making this project official and putting it on
4285 SourceForge 4285 SourceForge
4286 4286
4287 +o came up with the way cool UML logo <http://user-mode- 4287 o came up with the way cool UML logo <http://user-mode-
4288 linux.sourceforge.net/uml-small.png> 4288 linux.sourceforge.net/uml-small.png>
4289 4289
4290 +o redid the config process 4290 o redid the config process
4291 4291
4292 4292
4293 Peter Moulder <reiter at netspace.net.au> - Fixed my config and build 4293 Peter Moulder <reiter at netspace.net.au> - Fixed my config and build
@@ -4296,18 +4296,18 @@
4296 4296
4297 Bill Stearns <wstearns at pobox.com> - 4297 Bill Stearns <wstearns at pobox.com> -
4298 4298
4299 +o HOWTO updates 4299 o HOWTO updates
4300 4300
4301 +o lots of bug reports 4301 o lots of bug reports
4302 4302
4303 +o lots of testing 4303 o lots of testing
4304 4304
4305 +o dedicated a box (uml.ists.dartmouth.edu) to support UML development 4305 o dedicated a box (uml.ists.dartmouth.edu) to support UML development
4306 4306
4307 +o wrote the mkrootfs script, which allows bootable filesystems of 4307 o wrote the mkrootfs script, which allows bootable filesystems of
4308 RPM-based distributions to be cranked out 4308 RPM-based distributions to be cranked out
4309 4309
4310 +o cranked out a large number of filesystems with said script 4310 o cranked out a large number of filesystems with said script
4311 4311
4312 4312
4313 Jim Leu <jleu at mindspring.com> - Wrote the virtual ethernet driver 4313 Jim Leu <jleu at mindspring.com> - Wrote the virtual ethernet driver
@@ -4375,176 +4375,176 @@
4375 4375
4376 David Coulson <http://davidcoulson.net> - 4376 David Coulson <http://davidcoulson.net> -
4377 4377
4378 +o Set up the usermodelinux.org <http://usermodelinux.org> site, 4378 o Set up the usermodelinux.org <http://usermodelinux.org> site,
4379 which is a great way of keeping the UML user community on top of 4379 which is a great way of keeping the UML user community on top of
4380 UML goings-on. 4380 UML goings-on.
4381 4381
4382 +o Site documentation and updates 4382 o Site documentation and updates
4383 4383
4384 +o Nifty little UML management daemon UMLd 4384 o Nifty little UML management daemon UMLd
4385 <http://uml.openconsultancy.com/umld/> 4385 <http://uml.openconsultancy.com/umld/>
4386 4386
4387 +o Lots of testing and bug reports 4387 o Lots of testing and bug reports
4388 4388
4389 4389
4390 4390
4391 4391
4392 1155..22.. FFlluusshhiinngg oouutt bbuuggss 4392 15.2. Flushing out bugs
4393 4393
4394 4394
4395 4395
4396 +o Yuri Pudgorodsky 4396 o Yuri Pudgorodsky
4397 4397
4398 +o Gerald Britton 4398 o Gerald Britton
4399 4399
4400 +o Ian Wehrman 4400 o Ian Wehrman
4401 4401
4402 +o Gord Lamb 4402 o Gord Lamb
4403 4403
4404 +o Eugene Koontz 4404 o Eugene Koontz
4405 4405
4406 +o John H. Hartman 4406 o John H. Hartman
4407 4407
4408 +o Anders Karlsson 4408 o Anders Karlsson
4409 4409
4410 +o Daniel Phillips 4410 o Daniel Phillips
4411 4411
4412 +o John Fremlin 4412 o John Fremlin
4413 4413
4414 +o Rainer Burgstaller 4414 o Rainer Burgstaller
4415 4415
4416 +o James Stevenson 4416 o James Stevenson
4417 4417
4418 +o Matt Clay 4418 o Matt Clay
4419 4419
4420 +o Cliff Jefferies 4420 o Cliff Jefferies
4421 4421
4422 +o Geoff Hoff 4422 o Geoff Hoff
4423 4423
4424 +o Lennert Buytenhek 4424 o Lennert Buytenhek
4425 4425
4426 +o Al Viro 4426 o Al Viro
4427 4427
4428 +o Frank Klingenhoefer 4428 o Frank Klingenhoefer
4429 4429
4430 +o Livio Baldini Soares 4430 o Livio Baldini Soares
4431 4431
4432 +o Jon Burgess 4432 o Jon Burgess
4433 4433
4434 +o Petru Paler 4434 o Petru Paler
4435 4435
4436 +o Paul 4436 o Paul
4437 4437
4438 +o Chris Reahard 4438 o Chris Reahard
4439 4439
4440 +o Sverker Nilsson 4440 o Sverker Nilsson
4441 4441
4442 +o Gong Su 4442 o Gong Su
4443 4443
4444 +o johan verrept 4444 o johan verrept
4445 4445
4446 +o Bjorn Eriksson 4446 o Bjorn Eriksson
4447 4447
4448 +o Lorenzo Allegrucci 4448 o Lorenzo Allegrucci
4449 4449
4450 +o Muli Ben-Yehuda 4450 o Muli Ben-Yehuda
4451 4451
4452 +o David Mansfield 4452 o David Mansfield
4453 4453
4454 +o Howard Goff 4454 o Howard Goff
4455 4455
4456 +o Mike Anderson 4456 o Mike Anderson
4457 4457
4458 +o John Byrne 4458 o John Byrne
4459 4459
4460 +o Sapan J. Batia 4460 o Sapan J. Batia
4461 4461
4462 +o Iris Huang 4462 o Iris Huang
4463 4463
4464 +o Jan Hudec 4464 o Jan Hudec
4465 4465
4466 +o Voluspa 4466 o Voluspa
4467 4467
4468 4468
4469 4469
4470 4470
4471 1155..33.. BBuugglleettss aanndd cclleeaann--uuppss 4471 15.3. Buglets and clean-ups
4472 4472
4473 4473
4474 4474
4475 +o Dave Zarzycki 4475 o Dave Zarzycki
4476 4476
4477 +o Adam Lazur 4477 o Adam Lazur
4478 4478
4479 +o Boria Feigin 4479 o Boria Feigin
4480 4480
4481 +o Brian J. Murrell 4481 o Brian J. Murrell
4482 4482
4483 +o JS 4483 o JS
4484 4484
4485 +o Roman Zippel 4485 o Roman Zippel
4486 4486
4487 +o Wil Cooley 4487 o Wil Cooley
4488 4488
4489 +o Ayelet Shemesh 4489 o Ayelet Shemesh
4490 4490
4491 +o Will Dyson 4491 o Will Dyson
4492 4492
4493 +o Sverker Nilsson 4493 o Sverker Nilsson
4494 4494
4495 +o dvorak 4495 o dvorak
4496 4496
4497 +o v.naga srinivas 4497 o v.naga srinivas
4498 4498
4499 +o Shlomi Fish 4499 o Shlomi Fish
4500 4500
4501 +o Roger Binns 4501 o Roger Binns
4502 4502
4503 +o johan verrept 4503 o johan verrept
4504 4504
4505 +o MrChuoi 4505 o MrChuoi
4506 4506
4507 +o Peter Cleve 4507 o Peter Cleve
4508 4508
4509 +o Vincent Guffens 4509 o Vincent Guffens
4510 4510
4511 +o Nathan Scott 4511 o Nathan Scott
4512 4512
4513 +o Patrick Caulfield 4513 o Patrick Caulfield
4514 4514
4515 +o jbearce 4515 o jbearce
4516 4516
4517 +o Catalin Marinas 4517 o Catalin Marinas
4518 4518
4519 +o Shane Spencer 4519 o Shane Spencer
4520 4520
4521 +o Zou Min 4521 o Zou Min
4522 4522
4523 4523
4524 +o Ryan Boder 4524 o Ryan Boder
4525 4525
4526 +o Lorenzo Colitti 4526 o Lorenzo Colitti
4527 4527
4528 +o Gwendal Grignou 4528 o Gwendal Grignou
4529 4529
4530 +o Andre' Breiler 4530 o Andre' Breiler
4531 4531
4532 +o Tsutomu Yasuda 4532 o Tsutomu Yasuda
4533 4533
4534 4534
4535 4535
4536 1155..44.. CCaassee SSttuuddiieess 4536 15.4. Case Studies
4537 4537
4538 4538
4539 +o Jon Wright 4539 o Jon Wright
4540 4540
4541 +o William McEwan 4541 o William McEwan
4542 4542
4543 +o Michael Richardson 4543 o Michael Richardson
4544 4544
4545 4545
4546 4546
4547 1155..55.. OOtthheerr ccoonnttrriibbuuttiioonnss 4547 15.5. Other contributions
4548 4548
4549 4549
4550 Bill Carr <Bill.Carr at compaq.com> made the Red Hat mkrootfs script 4550 Bill Carr <Bill.Carr at compaq.com> made the Red Hat mkrootfs script
diff --git a/Documentation/virtual/virtio-spec.txt b/Documentation/virtual/virtio-spec.txt
new file mode 100644
index 000000000000..a350ae135b8c
--- /dev/null
+++ b/Documentation/virtual/virtio-spec.txt
@@ -0,0 +1,2200 @@
1[Generated file: see http://ozlabs.org/~rusty/virtio-spec/]
2Virtio PCI Card Specification
3v0.9.1 DRAFT
4-
5
6Rusty Russell <rusty@rustcorp.com.au>IBM Corporation (Editor)
7
82011 August 1.
9
10Purpose and Description
11
12This document describes the specifications of the 鈥渧irtio鈥 family
13of PCI[LaTeX Command: nomenclature] devices. These are devices
14are found in virtual environments[LaTeX Command: nomenclature],
15yet by design they are not all that different from physical PCI
16devices, and this document treats them as such. This allows the
17guest to use standard PCI drivers and discovery mechanisms.
18
19The purpose of virtio and this specification is that virtual
20environments and guests should have a straightforward, efficient,
21standard and extensible mechanism for virtual devices, rather
22than boutique per-environment or per-OS mechanisms.
23
24 Straightforward: Virtio PCI devices use normal PCI mechanisms
25 of interrupts and DMA which should be familiar to any device
26 driver author. There is no exotic page-flipping or COW
27 mechanism: it's just a PCI device.[footnote:
28This lack of page-sharing implies that the implementation of the
29device (e.g. the hypervisor or host) needs full access to the
30guest memory. Communication with untrusted parties (i.e.
31inter-guest communication) requires copying.
32]
33
34 Efficient: Virtio PCI devices consist of rings of descriptors
35 for input and output, which are neatly separated to avoid cache
36 effects from both guest and device writing to the same cache
37 lines.
38
39 Standard: Virtio PCI makes no assumptions about the environment
40 in which it operates, beyond supporting PCI. In fact the virtio
41 devices specified in the appendices do not require PCI at all:
42 they have been implemented on non-PCI buses.[footnote:
43The Linux implementation further separates the PCI virtio code
44from the specific virtio drivers: these drivers are shared with
45the non-PCI implementations (currently lguest and S/390).
46]
47
48 Extensible: Virtio PCI devices contain feature bits which are
49 acknowledged by the guest operating system during device setup.
50 This allows forwards and backwards compatibility: the device
51 offers all the features it knows about, and the driver
52 acknowledges those it understands and wishes to use.
53
54 Virtqueues
55
56The mechanism for bulk data transport on virtio PCI devices is
57pretentiously called a virtqueue. Each device can have zero or
58more virtqueues: for example, the network device has one for
59transmit and one for receive.
60
61Each virtqueue occupies two or more physically-contiguous pages
62(defined, for the purposes of this specification, as 4096 bytes),
63and consists of three parts:
64
65
66+-------------------+-----------------------------------+-----------+
67| Descriptor Table | Available Ring (padding) | Used Ring |
68+-------------------+-----------------------------------+-----------+
69
70
71When the driver wants to send buffers to the device, it puts them
72in one or more slots in the descriptor table, and writes the
73descriptor indices into the available ring. It then notifies the
74device. When the device has finished with the buffers, it writes
75the descriptors into the used ring, and sends an interrupt.
76
77Specification
78
79 PCI Discovery
80
81Any PCI device with Vendor ID 0x1AF4, and Device ID 0x1000
82through 0x103F inclusive is a virtio device[footnote:
83The actual value within this range is ignored
84]. The device must also have a Revision ID of 0 to match this
85specification.
86
87The Subsystem Device ID indicates which virtio device is
88supported by the device. The Subsystem Vendor ID should reflect
89the PCI Vendor ID of the environment (it's currently only used
90for informational purposes by the guest).
91
92
93+----------------------+--------------------+---------------+
94| Subsystem Device ID | Virtio Device | Specification |
95+----------------------+--------------------+---------------+
96+----------------------+--------------------+---------------+
97| 1 | network card | Appendix C |
98+----------------------+--------------------+---------------+
99| 2 | block device | Appendix D |
100+----------------------+--------------------+---------------+
101| 3 | console | Appendix E |
102+----------------------+--------------------+---------------+
103| 4 | entropy source | Appendix F |
104+----------------------+--------------------+---------------+
105| 5 | memory ballooning | Appendix G |
106+----------------------+--------------------+---------------+
107| 6 | ioMemory | - |
108+----------------------+--------------------+---------------+
109| 9 | 9P transport | - |
110+----------------------+--------------------+---------------+
111
112
113 Device Configuration
114
115To configure the device, we use the first I/O region of the PCI
116device. This contains a virtio header followed by a
117device-specific region.
118
119There may be different widths of accesses to the I/O region; the 鈥
120natural鈥 access method for each field in the virtio header must
121be used (i.e. 32-bit accesses for 32-bit fields, etc), but the
122device-specific region can be accessed using any width accesses,
123and should obtain the same results.
124
125Note that this is possible because while the virtio header is PCI
126(i.e. little) endian, the device-specific region is encoded in
127the native endian of the guest (where such distinction is
128applicable).
129
130 Device Initialization Sequence
131
132We start with an overview of device initialization, then expand
133on the details of the device and how each step is preformed.
134
135 Reset the device. This is not required on initial start up.
136
137 The ACKNOWLEDGE status bit is set: we have noticed the device.
138
139 The DRIVER status bit is set: we know how to drive the device.
140
141 Device-specific setup, including reading the Device Feature
142 Bits, discovery of virtqueues for the device, optional MSI-X
143 setup, and reading and possibly writing the virtio
144 configuration space.
145
146 The subset of Device Feature Bits understood by the driver is
147 written to the device.
148
149 The DRIVER_OK status bit is set.
150
151 The device can now be used (ie. buffers added to the
152 virtqueues)[footnote:
153Historically, drivers have used the device before steps 5 and 6.
154This is only allowed if the driver does not use any features
155which would alter this early use of the device.
156]
157
158If any of these steps go irrecoverably wrong, the guest should
159set the FAILED status bit to indicate that it has given up on the
160device (it can reset the device later to restart if desired).
161
162We now cover the fields required for general setup in detail.
163
164 Virtio Header
165
166The virtio header looks as follows:
167
168
169+------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+
170| Bits || 32 | 32 | 32 | 16 | 16 | 16 | 8 | 8 |
171+------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+
172| Read/Write || R | R+W | R+W | R | R+W | R+W | R+W | R |
173+------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+
174| Purpose || Device | Guest | Queue | Queue | Queue | Queue | Device | ISR |
175| || Features bits 0:31 | Features bits 0:31 | Address | Size | Select | Notify | Status | Status |
176+------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+
177
178
179If MSI-X is enabled for the device, two additional fields
180immediately follow this header:
181
182
183+------------++----------------+--------+
184| Bits || 16 | 16 |
185 +----------------+--------+
186+------------++----------------+--------+
187| Read/Write || R+W | R+W |
188+------------++----------------+--------+
189| Purpose || Configuration | Queue |
190| (MSI-X) || Vector | Vector |
191+------------++----------------+--------+
192
193
194Finally, if feature bits (VIRTIO_F_FEATURES_HI) this is
195immediately followed by two additional fields:
196
197
198+------------++----------------------+----------------------
199| Bits || 32 | 32
200+------------++----------------------+----------------------
201| Read/Write || R | R+W
202+------------++----------------------+----------------------
203| Purpose || Device | Guest
204| || Features bits 32:63 | Features bits 32:63
205+------------++----------------------+----------------------
206
207
208Immediately following these general headers, there may be
209device-specific headers:
210
211
212+------------++--------------------+
213| Bits || Device Specific |
214 +--------------------+
215+------------++--------------------+
216| Read/Write || Device Specific |
217+------------++--------------------+
218| Purpose || Device Specific... |
219| || |
220+------------++--------------------+
221
222
223 Device Status
224
225The Device Status field is updated by the guest to indicate its
226progress. This provides a simple low-level diagnostic: it's most
227useful to imagine them hooked up to traffic lights on the console
228indicating the status of each device.
229
230The device can be reset by writing a 0 to this field, otherwise
231at least one bit should be set:
232
233 ACKNOWLEDGE (1) Indicates that the guest OS has found the
234 device and recognized it as a valid virtio device.
235
236 DRIVER (2) Indicates that the guest OS knows how to drive the
237 device. Under Linux, drivers can be loadable modules so there
238 may be a significant (or infinite) delay before setting this
239 bit.
240
241 DRIVER_OK (3) Indicates that the driver is set up and ready to
242 drive the device.
243
244 FAILED (8) Indicates that something went wrong in the guest,
245 and it has given up on the device. This could be an internal
246 error, or the driver didn't like the device for some reason, or
247 even a fatal error during device operation. The device must be
248 reset before attempting to re-initialize.
249
250 Feature Bits
251
252The least significant 31 bits of the first configuration field
253indicates the features that the device supports (the high bit is
254reserved, and will be used to indicate the presence of future
255feature bits elsewhere). If more than 31 feature bits are
256supported, the device indicates so by setting feature bit 31 (see
257[cha:Reserved-Feature-Bits]). The bits are allocated as follows:
258
259 0 to 23 Feature bits for the specific device type
260
261 24 to 40 Feature bits reserved for extensions to the queue and
262 feature negotiation mechanisms
263
264 41 to 63 Feature bits reserved for future extensions
265
266For example, feature bit 0 for a network device (i.e. Subsystem
267Device ID 1) indicates that the device supports checksumming of
268packets.
269
270The feature bits are negotiated: the device lists all the
271features it understands in the Device Features field, and the
272guest writes the subset that it understands into the Guest
273Features field. The only way to renegotiate is to reset the
274device.
275
276In particular, new fields in the device configuration header are
277indicated by offering a feature bit, so the guest can check
278before accessing that part of the configuration space.
279
280This allows for forwards and backwards compatibility: if the
281device is enhanced with a new feature bit, older guests will not
282write that feature bit back to the Guest Features field and it
283can go into backwards compatibility mode. Similarly, if a guest
284is enhanced with a feature that the device doesn't support, it
285will not see that feature bit in the Device Features field and
286can go into backwards compatibility mode (or, for poor
287implementations, set the FAILED Device Status bit).
288
289Access to feature bits 32 to 63 is enabled by Guest by setting
290feature bit 31. If this bit is unset, Device must assume that all
291feature bits > 31 are unset.
292
293 Configuration/Queue Vectors
294
295When MSI-X capability is present and enabled in the device
296(through standard PCI configuration space) 4 bytes at byte offset
29720 are used to map configuration change and queue interrupts to
298MSI-X vectors. In this case, the ISR Status field is unused, and
299device specific configuration starts at byte offset 24 in virtio
300header structure. When MSI-X capability is not enabled, device
301specific configuration starts at byte offset 20 in virtio header.
302
303Writing a valid MSI-X Table entry number, 0 to 0x7FF, to one of
304Configuration/Queue Vector registers, maps interrupts triggered
305by the configuration change/selected queue events respectively to
306the corresponding MSI-X vector. To disable interrupts for a
307specific event type, unmap it by writing a special NO_VECTOR
308value:
309
310/* Vector value used to disable MSI for queue */
311
312#define VIRTIO_MSI_NO_VECTOR 0xffff
313
314Reading these registers returns vector mapped to a given event,
315or NO_VECTOR if unmapped. All queue and configuration change
316events are unmapped by default.
317
318Note that mapping an event to vector might require allocating
319internal device resources, and might fail. Devices report such
320failures by returning the NO_VECTOR value when the relevant
321Vector field is read. After mapping an event to vector, the
322driver must verify success by reading the Vector field value: on
323success, the previously written value is returned, and on
324failure, NO_VECTOR is returned. If a mapping failure is detected,
325the driver can retry mapping with fewervectors, or disable MSI-X.
326
327 Virtqueue Configuration
328
329As a device can have zero or more virtqueues for bulk data
330transport (for example, the network driver has two), the driver
331needs to configure them as part of the device-specific
332configuration.
333
334This is done as follows, for each virtqueue a device has:
335
336 Write the virtqueue index (first queue is 0) to the Queue
337 Select field.
338
339 Read the virtqueue size from the Queue Size field, which is
340 always a power of 2. This controls how big the virtqueue is
341 (see below). If this field is 0, the virtqueue does not exist.
342
343 Allocate and zero virtqueue in contiguous physical memory, on a
344 4096 byte alignment. Write the physical address, divided by
345 4096 to the Queue Address field.[footnote:
346The 4096 is based on the x86 page size, but it's also large
347enough to ensure that the separate parts of the virtqueue are on
348separate cache lines.
349]
350
351 Optionally, if MSI-X capability is present and enabled on the
352 device, select a vector to use to request interrupts triggered
353 by virtqueue events. Write the MSI-X Table entry number
354 corresponding to this vector in Queue Vector field. Read the
355 Queue Vector field: on success, previously written value is
356 returned; on failure, NO_VECTOR value is returned.
357
358The Queue Size field controls the total number of bytes required
359for the virtqueue according to the following formula:
360
361#define ALIGN(x) (((x) + 4095) & ~4095)
362
363static inline unsigned vring_size(unsigned int qsz)
364
365{
366
367 return ALIGN(sizeof(struct vring_desc)*qsz + sizeof(u16)*(2
368+ qsz))
369
370 + ALIGN(sizeof(struct vring_used_elem)*qsz);
371
372}
373
374This currently wastes some space with padding, but also allows
375future extensions. The virtqueue layout structure looks like this
376(qsz is the Queue Size field, which is a variable, so this code
377won't compile):
378
379struct vring {
380
381 /* The actual descriptors (16 bytes each) */
382
383 struct vring_desc desc[qsz];
384
385
386
387 /* A ring of available descriptor heads with free-running
388index. */
389
390 struct vring_avail avail;
391
392
393
394 // Padding to the next 4096 boundary.
395
396 char pad[];
397
398
399
400 // A ring of used descriptor heads with free-running index.
401
402 struct vring_used used;
403
404};
405
406 A Note on Virtqueue Endianness
407
408Note that the endian of these fields and everything else in the
409virtqueue is the native endian of the guest, not little-endian as
410PCI normally is. This makes for simpler guest code, and it is
411assumed that the host already has to be deeply aware of the guest
412endian so such an 鈥渆ndian-aware鈥 device is not a significant
413issue.
414
415 Descriptor Table
416
417The descriptor table refers to the buffers the guest is using for
418the device. The addresses are physical addresses, and the buffers
419can be chained via the next field. Each descriptor describes a
420buffer which is read-only or write-only, but a chain of
421descriptors can contain both read-only and write-only buffers.
422
423No descriptor chain may be more than 2^32 bytes long in total.struct vring_desc {
424
425 /* Address (guest-physical). */
426
427 u64 addr;
428
429 /* Length. */
430
431 u32 len;
432
433/* This marks a buffer as continuing via the next field. */
434
435#define VRING_DESC_F_NEXT 1
436
437/* This marks a buffer as write-only (otherwise read-only). */
438
439#define VRING_DESC_F_WRITE 2
440
441/* This means the buffer contains a list of buffer descriptors.
442*/
443
444#define VRING_DESC_F_INDIRECT 4
445
446 /* The flags as indicated above. */
447
448 u16 flags;
449
450 /* Next field if flags & NEXT */
451
452 u16 next;
453
454};
455
456The number of descriptors in the table is specified by the Queue
457Size field for this virtqueue.
458
459 <sub:Indirect-Descriptors>Indirect Descriptors
460
461Some devices benefit by concurrently dispatching a large number
462of large requests. The VIRTIO_RING_F_INDIRECT_DESC feature can be
463used to allow this (see [cha:Reserved-Feature-Bits]). To increase
464ring capacity it is possible to store a table of indirect
465descriptors anywhere in memory, and insert a descriptor in main
466virtqueue (with flags&INDIRECT on) that refers to memory buffer
467containing this indirect descriptor table; fields addr and len
468refer to the indirect table address and length in bytes,
469respectively. The indirect table layout structure looks like this
470(len is the length of the descriptor that refers to this table,
471which is a variable, so this code won't compile):
472
473struct indirect_descriptor_table {
474
475 /* The actual descriptors (16 bytes each) */
476
477 struct vring_desc desc[len / 16];
478
479};
480
481The first indirect descriptor is located at start of the indirect
482descriptor table (index 0), additional indirect descriptors are
483chained by next field. An indirect descriptor without next field
484(with flags&NEXT off) signals the end of the indirect descriptor
485table, and transfers control back to the main virtqueue. An
486indirect descriptor can not refer to another indirect descriptor
487table (flags&INDIRECT must be off). A single indirect descriptor
488table can include both read-only and write-only descriptors;
489write-only flag (flags&WRITE) in the descriptor that refers to it
490is ignored.
491
492 Available Ring
493
494The available ring refers to what descriptors we are offering the
495device: it refers to the head of a descriptor chain. The 鈥渇lags鈥
496field is currently 0 or 1: 1 indicating that we do not need an
497interrupt when the device consumes a descriptor from the
498available ring. Alternatively, the guest can ask the device to
499delay interrupts until an entry with an index specified by the 鈥
500used_event鈥 field is written in the used ring (equivalently,
501until the idx field in the used ring will reach the value
502used_event + 1). The method employed by the device is controlled
503by the VIRTIO_RING_F_EVENT_IDX feature bit (see [cha:Reserved-Feature-Bits]
504). This interrupt suppression is merely an optimization; it may
505not suppress interrupts entirely.
506
507The 鈥渋dx鈥 field indicates where we would put the next descriptor
508entry (modulo the ring size). This starts at 0, and increases.
509
510struct vring_avail {
511
512#define VRING_AVAIL_F_NO_INTERRUPT 1
513
514 u16 flags;
515
516 u16 idx;
517
518 u16 ring[qsz]; /* qsz is the Queue Size field read from device
519*/
520
521 u16 used_event;
522
523};
524
525 Used Ring
526
527The used ring is where the device returns buffers once it is done
528with them. The flags field can be used by the device to hint that
529no notification is necessary when the guest adds to the available
530ring. Alternatively, the 鈥渁vail_event鈥 field can be used by the
531device to hint that no notification is necessary until an entry
532with an index specified by the 鈥渁vail_event鈥 is written in the
533available ring (equivalently, until the idx field in the
534available ring will reach the value avail_event + 1). The method
535employed by the device is controlled by the guest through the
536VIRTIO_RING_F_EVENT_IDX feature bit (see [cha:Reserved-Feature-Bits]
537). [footnote:
538These fields are kept here because this is the only part of the
539virtqueue written by the device
540].
541
542Each entry in the ring is a pair: the head entry of the
543descriptor chain describing the buffer (this matches an entry
544placed in the available ring by the guest earlier), and the total
545of bytes written into the buffer. The latter is extremely useful
546for guests using untrusted buffers: if you do not know exactly
547how much has been written by the device, you usually have to zero
548the buffer to ensure no data leakage occurs.
549
550/* u32 is used here for ids for padding reasons. */
551
552struct vring_used_elem {
553
554 /* Index of start of used descriptor chain. */
555
556 u32 id;
557
558 /* Total length of the descriptor chain which was used
559(written to) */
560
561 u32 len;
562
563};
564
565
566
567struct vring_used {
568
569#define VRING_USED_F_NO_NOTIFY 1
570
571 u16 flags;
572
573 u16 idx;
574
575 struct vring_used_elem ring[qsz];
576
577 u16 avail_event;
578
579};
580
581 Helpers for Managing Virtqueues
582
583The Linux Kernel Source code contains the definitions above and
584helper routines in a more usable form, in
585include/linux/virtio_ring.h. This was explicitly licensed by IBM
586and Red Hat under the (3-clause) BSD license so that it can be
587freely used by all other projects, and is reproduced (with slight
588variation to remove Linux assumptions) in Appendix A.
589
590 Device Operation
591
592There are two parts to device operation: supplying new buffers to
593the device, and processing used buffers from the device. As an
594example, the virtio network device has two virtqueues: the
595transmit virtqueue and the receive virtqueue. The driver adds
596outgoing (read-only) packets to the transmit virtqueue, and then
597frees them after they are used. Similarly, incoming (write-only)
598buffers are added to the receive virtqueue, and processed after
599they are used.
600
601 Supplying Buffers to The Device
602
603Actual transfer of buffers from the guest OS to the device
604operates as follows:
605
606 Place the buffer(s) into free descriptor(s).
607
608 If there are no free descriptors, the guest may choose to
609 notify the device even if notifications are suppressed (to
610 reduce latency).[footnote:
611The Linux drivers do this only for read-only buffers: for
612write-only buffers, it is assumed that the driver is merely
613trying to keep the receive buffer ring full, and no notification
614of this expected condition is necessary.
615]
616
617 Place the id of the buffer in the next ring entry of the
618 available ring.
619
620 The steps (1) and (2) may be performed repeatedly if batching
621 is possible.
622
623 A memory barrier should be executed to ensure the device sees
624 the updated descriptor table and available ring before the next
625 step.
626
627 The available 鈥渋dx鈥 field should be increased by the number of
628 entries added to the available ring.
629
630 A memory barrier should be executed to ensure that we update
631 the idx field before checking for notification suppression.
632
633 If notifications are not suppressed, the device should be
634 notified of the new buffers.
635
636Note that the above code does not take precautions against the
637available ring buffer wrapping around: this is not possible since
638the ring buffer is the same size as the descriptor table, so step
639(1) will prevent such a condition.
640
641In addition, the maximum queue size is 32768 (it must be a power
642of 2 which fits in 16 bits), so the 16-bit 鈥渋dx鈥 value can always
643distinguish between a full and empty buffer.
644
645Here is a description of each stage in more detail.
646
647 Placing Buffers Into The Descriptor Table
648
649A buffer consists of zero or more read-only physically-contiguous
650elements followed by zero or more physically-contiguous
651write-only elements (it must have at least one element). This
652algorithm maps it into the descriptor table:
653
654 for each buffer element, b:
655
656 Get the next free descriptor table entry, d
657
658 Set d.addr to the physical address of the start of b
659
660 Set d.len to the length of b.
661
662 If b is write-only, set d.flags to VRING_DESC_F_WRITE,
663 otherwise 0.
664
665 If there is a buffer element after this:
666
667 Set d.next to the index of the next free descriptor element.
668
669 Set the VRING_DESC_F_NEXT bit in d.flags.
670
671In practice, the d.next fields are usually used to chain free
672descriptors, and a separate count kept to check there are enough
673free descriptors before beginning the mappings.
674
675 Updating The Available Ring
676
677The head of the buffer we mapped is the first d in the algorithm
678above. A naive implementation would do the following:
679
680avail->ring[avail->idx % qsz] = head;
681
682However, in general we can add many descriptors before we update
683the 鈥渋dx鈥 field (at which point they become visible to the
684device), so we keep a counter of how many we've added:
685
686avail->ring[(avail->idx + added++) % qsz] = head;
687
688 Updating The Index Field
689
690Once the idx field of the virtqueue is updated, the device will
691be able to access the descriptor entries we've created and the
692memory they refer to. This is why a memory barrier is generally
693used before the idx update, to ensure it sees the most up-to-date
694copy.
695
696The idx field always increments, and we let it wrap naturally at
69765536:
698
699avail->idx += added;
700
701 <sub:Notifying-The-Device>Notifying The Device
702
703Device notification occurs by writing the 16-bit virtqueue index
704of this virtqueue to the Queue Notify field of the virtio header
705in the first I/O region of the PCI device. This can be expensive,
706however, so the device can suppress such notifications if it
707doesn't need them. We have to be careful to expose the new idx
708value before checking the suppression flag: it's OK to notify
709gratuitously, but not to omit a required notification. So again,
710we use a memory barrier here before reading the flags or the
711avail_event field.
712
713If the VIRTIO_F_RING_EVENT_IDX feature is not negotiated, and if
714the VRING_USED_F_NOTIFY flag is not set, we go ahead and write to
715the PCI configuration space.
716
717If the VIRTIO_F_RING_EVENT_IDX feature is negotiated, we read the
718avail_event field in the available ring structure. If the
719available index crossed_the avail_event field value since the
720last notification, we go ahead and write to the PCI configuration
721space. The avail_event field wraps naturally at 65536 as well:
722
723(u16)(new_idx - avail_event - 1) < (u16)(new_idx - old_idx)
724
725 <sub:Receiving-Used-Buffers>Receiving Used Buffers From The
726 Device
727
728Once the device has used a buffer (read from or written to it, or
729parts of both, depending on the nature of the virtqueue and the
730device), it sends an interrupt, following an algorithm very
731similar to the algorithm used for the driver to send the device a
732buffer:
733
734 Write the head descriptor number to the next field in the used
735 ring.
736
737 Update the used ring idx.
738
739 Determine whether an interrupt is necessary:
740
741 If the VIRTIO_F_RING_EVENT_IDX feature is not negotiated: check
742 if f the VRING_AVAIL_F_NO_INTERRUPT flag is not set in avail-
743 >flags
744
745 If the VIRTIO_F_RING_EVENT_IDX feature is negotiated: check
746 whether the used index crossed the used_event field value
747 since the last update. The used_event field wraps naturally
748 at 65536 as well:(u16)(new_idx - used_event - 1) < (u16)(new_idx - old_idx)
749
750 If an interrupt is necessary:
751
752 If MSI-X capability is disabled:
753
754 Set the lower bit of the ISR Status field for the device.
755
756 Send the appropriate PCI interrupt for the device.
757
758 If MSI-X capability is enabled:
759
760 Request the appropriate MSI-X interrupt message for the
761 device, Queue Vector field sets the MSI-X Table entry
762 number.
763
764 If Queue Vector field value is NO_VECTOR, no interrupt
765 message is requested for this event.
766
767The guest interrupt handler should:
768
769 If MSI-X capability is disabled: read the ISR Status field,
770 which will reset it to zero. If the lower bit is zero, the
771 interrupt was not for this device. Otherwise, the guest driver
772 should look through the used rings of each virtqueue for the
773 device, to see if any progress has been made by the device
774 which requires servicing.
775
776 If MSI-X capability is enabled: look through the used rings of
777 each virtqueue mapped to the specific MSI-X vector for the
778 device, to see if any progress has been made by the device
779 which requires servicing.
780
781For each ring, guest should then disable interrupts by writing
782VRING_AVAIL_F_NO_INTERRUPT flag in avail structure, if required.
783It can then process used ring entries finally enabling interrupts
784by clearing the VRING_AVAIL_F_NO_INTERRUPT flag or updating the
785EVENT_IDX field in the available structure, Guest should then
786execute a memory barrier, and then recheck the ring empty
787condition. This is necessary to handle the case where, after the
788last check and before enabling interrupts, an interrupt has been
789suppressed by the device:
790
791vring_disable_interrupts(vq);
792
793for (;;) {
794
795 if (vq->last_seen_used != vring->used.idx) {
796
797 vring_enable_interrupts(vq);
798
799 mb();
800
801 if (vq->last_seen_used != vring->used.idx)
802
803 break;
804
805 }
806
807 struct vring_used_elem *e =
808vring.used->ring[vq->last_seen_used%vsz];
809
810 process_buffer(e);
811
812 vq->last_seen_used++;
813
814}
815
816 Dealing With Configuration Changes
817
818Some virtio PCI devices can change the device configuration
819state, as reflected in the virtio header in the PCI configuration
820space. In this case:
821
822 If MSI-X capability is disabled: an interrupt is delivered and
823 the second highest bit is set in the ISR Status field to
824 indicate that the driver should re-examine the configuration
825 space.Note that a single interrupt can indicate both that one
826 or more virtqueue has been used and that the configuration
827 space has changed: even if the config bit is set, virtqueues
828 must be scanned.
829
830 If MSI-X capability is enabled: an interrupt message is
831 requested. The Configuration Vector field sets the MSI-X Table
832 entry number to use. If Configuration Vector field value is
833 NO_VECTOR, no interrupt message is requested for this event.
834
835Creating New Device Types
836
837Various considerations are necessary when creating a new device
838type:
839
840 How Many Virtqueues?
841
842It is possible that a very simple device will operate entirely
843through its configuration space, but most will need at least one
844virtqueue in which it will place requests. A device with both
845input and output (eg. console and network devices described here)
846need two queues: one which the driver fills with buffers to
847receive input, and one which the driver places buffers to
848transmit output.
849
850 What Configuration Space Layout?
851
852Configuration space is generally used for rarely-changing or
853initialization-time parameters. But it is a limited resource, so
854it might be better to use a virtqueue to update configuration
855information (the network device does this for filtering,
856otherwise the table in the config space could potentially be very
857large).
858
859Note that this space is generally the guest's native endian,
860rather than PCI's little-endian.
861
862 What Device Number?
863
864Currently device numbers are assigned quite freely: a simple
865request mail to the author of this document or the Linux
866virtualization mailing list[footnote:
867
868https://lists.linux-foundation.org/mailman/listinfo/virtualization
869] will be sufficient to secure a unique one.
870
871Meanwhile for experimental drivers, use 65535 and work backwards.
872
873 How many MSI-X vectors?
874
875Using the optional MSI-X capability devices can speed up
876interrupt processing by removing the need to read ISR Status
877register by guest driver (which might be an expensive operation),
878reducing interrupt sharing between devices and queues within the
879device, and handling interrupts from multiple CPUs. However, some
880systems impose a limit (which might be as low as 256) on the
881total number of MSI-X vectors that can be allocated to all
882devices. Devices and/or device drivers should take this into
883account, limiting the number of vectors used unless the device is
884expected to cause a high volume of interrupts. Devices can
885control the number of vectors used by limiting the MSI-X Table
886Size or not presenting MSI-X capability in PCI configuration
887space. Drivers can control this by mapping events to as small
888number of vectors as possible, or disabling MSI-X capability
889altogether.
890
891 Message Framing
892
893The descriptors used for a buffer should not effect the semantics
894of the message, except for the total length of the buffer. For
895example, a network buffer consists of a 10 byte header followed
896by the network packet. Whether this is presented in the ring
897descriptor chain as (say) a 10 byte buffer and a 1514 byte
898buffer, or a single 1524 byte buffer, or even three buffers,
899should have no effect.
900
901In particular, no implementation should use the descriptor
902boundaries to determine the size of any header in a request.[footnote:
903The current qemu device implementations mistakenly insist that
904the first descriptor cover the header in these cases exactly, so
905a cautious driver should arrange it so.
906]
907
908 Device Improvements
909
910Any change to configuration space, or new virtqueues, or
911behavioural changes, should be indicated by negotiation of a new
912feature bit. This establishes clarity[footnote:
913Even if it does mean documenting design or implementation
914mistakes!
915] and avoids future expansion problems.
916
917Clusters of functionality which are always implemented together
918can use a single bit, but if one feature makes sense without the
919others they should not be gratuitously grouped together to
920conserve feature bits. We can always extend the spec when the
921first person needs more than 24 feature bits for their device.
922
923[LaTeX Command: printnomenclature]
924
925Appendix A: virtio_ring.h
926
927#ifndef VIRTIO_RING_H
928
929#define VIRTIO_RING_H
930
931/* An interface for efficient virtio implementation.
932
933 *
934
935 * This header is BSD licensed so anyone can use the definitions
936
937 * to implement compatible drivers/servers.
938
939 *
940
941 * Copyright 2007, 2009, IBM Corporation
942
943 * Copyright 2011, Red Hat, Inc
944
945 * All rights reserved.
946
947 *
948
949 * Redistribution and use in source and binary forms, with or
950without
951
952 * modification, are permitted provided that the following
953conditions
954
955 * are met:
956
957 * 1. Redistributions of source code must retain the above
958copyright
959
960 * notice, this list of conditions and the following
961disclaimer.
962
963 * 2. Redistributions in binary form must reproduce the above
964copyright
965
966 * notice, this list of conditions and the following
967disclaimer in the
968
969 * documentation and/or other materials provided with the
970distribution.
971
972 * 3. Neither the name of IBM nor the names of its contributors
973
974 * may be used to endorse or promote products derived from
975this software
976
977 * without specific prior written permission.
978
979 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
980CONTRIBUTORS ``AS IS'' AND
981
982 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
983TO, THE
984
985 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
986PARTICULAR PURPOSE
987
988 * ARE DISCLAIMED. IN NO EVENT SHALL IBM OR CONTRIBUTORS BE
989LIABLE
990
991 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
992CONSEQUENTIAL
993
994 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
995SUBSTITUTE GOODS
996
997 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
998INTERRUPTION)
999
1000 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
1001CONTRACT, STRICT
1002
1003 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
1004IN ANY WAY
1005
1006 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
1007POSSIBILITY OF
1008
1009 * SUCH DAMAGE.
1010
1011 */
1012
1013
1014
1015/* This marks a buffer as continuing via the next field. */
1016
1017#define VRING_DESC_F_NEXT 1
1018
1019/* This marks a buffer as write-only (otherwise read-only). */
1020
1021#define VRING_DESC_F_WRITE 2
1022
1023
1024
1025/* The Host uses this in used->flags to advise the Guest: don't
1026kick me
1027
1028 * when you add a buffer. It's unreliable, so it's simply an
1029
1030 * optimization. Guest will still kick if it's out of buffers.
1031*/
1032
1033#define VRING_USED_F_NO_NOTIFY 1
1034
1035/* The Guest uses this in avail->flags to advise the Host: don't
1036
1037 * interrupt me when you consume a buffer. It's unreliable, so
1038it's
1039
1040 * simply an optimization. */
1041
1042#define VRING_AVAIL_F_NO_INTERRUPT 1
1043
1044
1045
1046/* Virtio ring descriptors: 16 bytes.
1047
1048 * These can chain together via "next". */
1049
1050struct vring_desc {
1051
1052 /* Address (guest-physical). */
1053
1054 uint64_t addr;
1055
1056 /* Length. */
1057
1058 uint32_t len;
1059
1060 /* The flags as indicated above. */
1061
1062 uint16_t flags;
1063
1064 /* We chain unused descriptors via this, too */
1065
1066 uint16_t next;
1067
1068};
1069
1070
1071
1072struct vring_avail {
1073
1074 uint16_t flags;
1075
1076 uint16_t idx;
1077
1078 uint16_t ring[];
1079
1080 uint16_t used_event;
1081
1082};
1083
1084
1085
1086/* u32 is used here for ids for padding reasons. */
1087
1088struct vring_used_elem {
1089
1090 /* Index of start of used descriptor chain. */
1091
1092 uint32_t id;
1093
1094 /* Total length of the descriptor chain which was written
1095to. */
1096
1097 uint32_t len;
1098
1099};
1100
1101
1102
1103struct vring_used {
1104
1105 uint16_t flags;
1106
1107 uint16_t idx;
1108
1109 struct vring_used_elem ring[];
1110
1111 uint16_t avail_event;
1112
1113};
1114
1115
1116
1117struct vring {
1118
1119 unsigned int num;
1120
1121
1122
1123 struct vring_desc *desc;
1124
1125 struct vring_avail *avail;
1126
1127 struct vring_used *used;
1128
1129};
1130
1131
1132
1133/* The standard layout for the ring is a continuous chunk of
1134memory which
1135
1136 * looks like this. We assume num is a power of 2.
1137
1138 *
1139
1140 * struct vring {
1141
1142 * // The actual descriptors (16 bytes each)
1143
1144 * struct vring_desc desc[num];
1145
1146 *
1147
1148 * // A ring of available descriptor heads with free-running
1149index.
1150
1151 * __u16 avail_flags;
1152
1153 * __u16 avail_idx;
1154
1155 * __u16 available[num];
1156
1157 *
1158
1159 * // Padding to the next align boundary.
1160
1161 * char pad[];
1162
1163 *
1164
1165 * // A ring of used descriptor heads with free-running
1166index.
1167
1168 * __u16 used_flags;
1169
1170 * __u16 EVENT_IDX;
1171
1172 * struct vring_used_elem used[num];
1173
1174 * };
1175
1176 * Note: for virtio PCI, align is 4096.
1177
1178 */
1179
1180static inline void vring_init(struct vring *vr, unsigned int num,
1181void *p,
1182
1183 unsigned long align)
1184
1185{
1186
1187 vr->num = num;
1188
1189 vr->desc = p;
1190
1191 vr->avail = p + num*sizeof(struct vring_desc);
1192
1193 vr->used = (void *)(((unsigned long)&vr->avail->ring[num]
1194
1195 + align-1)
1196
1197 & ~(align - 1));
1198
1199}
1200
1201
1202
1203static inline unsigned vring_size(unsigned int num, unsigned long
1204align)
1205
1206{
1207
1208 return ((sizeof(struct vring_desc)*num +
1209sizeof(uint16_t)*(2+num)
1210
1211 + align - 1) & ~(align - 1))
1212
1213 + sizeof(uint16_t)*3 + sizeof(struct
1214vring_used_elem)*num;
1215
1216}
1217
1218
1219
1220static inline int vring_need_event(uint16_t event_idx, uint16_t
1221new_idx, uint16_t old_idx)
1222
1223{
1224
1225 return (uint16_t)(new_idx - event_idx - 1) <
1226(uint16_t)(new_idx - old_idx);
1227
1228}
1229
1230#endif /* VIRTIO_RING_H */
1231
1232<cha:Reserved-Feature-Bits>Appendix B: Reserved Feature Bits
1233
1234Currently there are five device-independent feature bits defined:
1235
1236 VIRTIO_F_NOTIFY_ON_EMPTY (24) Negotiating this feature
1237 indicates that the driver wants an interrupt if the device runs
1238 out of available descriptors on a virtqueue, even though
1239 interrupts are suppressed using the VRING_AVAIL_F_NO_INTERRUPT
1240 flag or the used_event field. An example of this is the
1241 networking driver: it doesn't need to know every time a packet
1242 is transmitted, but it does need to free the transmitted
1243 packets a finite time after they are transmitted. It can avoid
1244 using a timer if the device interrupts it when all the packets
1245 are transmitted.
1246
1247 VIRTIO_F_RING_INDIRECT_DESC (28) Negotiating this feature
1248 indicates that the driver can use descriptors with the
1249 VRING_DESC_F_INDIRECT flag set, as described in [sub:Indirect-Descriptors]
1250 .
1251
1252 VIRTIO_F_RING_EVENT_IDX(29) This feature enables the used_event
1253 and the avail_event fields. If set, it indicates that the
1254 device should ignore the flags field in the available ring
1255 structure. Instead, the used_event field in this structure is
1256 used by guest to suppress device interrupts. Further, the
1257 driver should ignore the flags field in the used ring
1258 structure. Instead, the avail_event field in this structure is
1259 used by the device to suppress notifications. If unset, the
1260 driver should ignore the used_event field; the device should
1261 ignore the avail_event field; the flags field is used
1262
1263 VIRTIO_F_BAD_FEATURE(30) This feature should never be
1264 negotiated by the guest; doing so is an indication that the
1265 guest is faulty[footnote:
1266An experimental virtio PCI driver contained in Linux version
12672.6.25 had this problem, and this feature bit can be used to
1268detect it.
1269]
1270
1271 VIRTIO_F_FEATURES_HIGH(31) This feature indicates that the
1272 device supports feature bits 32:63. If unset, feature bits
1273 32:63 are unset.
1274
1275Appendix C: Network Device
1276
1277The virtio network device is a virtual ethernet card, and is the
1278most complex of the devices supported so far by virtio. It has
1279enhanced rapidly and demonstrates clearly how support for new
1280features should be added to an existing device. Empty buffers are
1281placed in one virtqueue for receiving packets, and outgoing
1282packets are enqueued into another for transmission in that order.
1283A third command queue is used to control advanced filtering
1284features.
1285
1286 Configuration
1287
1288 Subsystem Device ID 1
1289
1290 Virtqueues 0:receiveq. 1:transmitq. 2:controlq[footnote:
1291Only if VIRTIO_NET_F_CTRL_VQ set
1292]
1293
1294 Feature bits
1295
1296 VIRTIO_NET_F_CSUM (0) Device handles packets with partial
1297 checksum
1298
1299 VIRTIO_NET_F_GUEST_CSUM (1) Guest handles packets with partial
1300 checksum
1301
1302 VIRTIO_NET_F_MAC (5) Device has given MAC address.
1303
1304 VIRTIO_NET_F_GSO (6) (Deprecated) device handles packets with
1305 any GSO type.[footnote:
1306It was supposed to indicate segmentation offload support, but
1307upon further investigation it became clear that multiple bits
1308were required.
1309]
1310
1311 VIRTIO_NET_F_GUEST_TSO4 (7) Guest can receive TSOv4.
1312
1313 VIRTIO_NET_F_GUEST_TSO6 (8) Guest can receive TSOv6.
1314
1315 VIRTIO_NET_F_GUEST_ECN (9) Guest can receive TSO with ECN.
1316
1317 VIRTIO_NET_F_GUEST_UFO (10) Guest can receive UFO.
1318
1319 VIRTIO_NET_F_HOST_TSO4 (11) Device can receive TSOv4.
1320
1321 VIRTIO_NET_F_HOST_TSO6 (12) Device can receive TSOv6.
1322
1323 VIRTIO_NET_F_HOST_ECN (13) Device can receive TSO with ECN.
1324
1325 VIRTIO_NET_F_HOST_UFO (14) Device can receive UFO.
1326
1327 VIRTIO_NET_F_MRG_RXBUF (15) Guest can merge receive buffers.
1328
1329 VIRTIO_NET_F_STATUS (16) Configuration status field is
1330 available.
1331
1332 VIRTIO_NET_F_CTRL_VQ (17) Control channel is available.
1333
1334 VIRTIO_NET_F_CTRL_RX (18) Control channel RX mode support.
1335
1336 VIRTIO_NET_F_CTRL_VLAN (19) Control channel VLAN filtering.
1337
1338 Device configuration layout Two configuration fields are
1339 currently defined. The mac address field always exists (though
1340 is only valid if VIRTIO_NET_F_MAC is set), and the status field
1341 only exists if VIRTIO_NET_F_STATUS is set. Only one bit is
1342 currently defined for the status field: VIRTIO_NET_S_LINK_UP. #define VIRTIO_NET_S_LINK_UP 1
1343
1344
1345
1346struct virtio_net_config {
1347
1348 u8 mac[6];
1349
1350 u16 status;
1351
1352};
1353
1354 Device Initialization
1355
1356 The initialization routine should identify the receive and
1357 transmission virtqueues.
1358
1359 If the VIRTIO_NET_F_MAC feature bit is set, the configuration
1360 space 鈥渕ac鈥 entry indicates the 鈥減hysical鈥 address of the the
1361 network card, otherwise a private MAC address should be
1362 assigned. All guests are expected to negotiate this feature if
1363 it is set.
1364
1365 If the VIRTIO_NET_F_CTRL_VQ feature bit is negotiated, identify
1366 the control virtqueue.
1367
1368 If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link
1369 status can be read from the bottom bit of the 鈥渟tatus鈥 config
1370 field. Otherwise, the link should be assumed active.
1371
1372 The receive virtqueue should be filled with receive buffers.
1373 This is described in detail below in 鈥淪etting Up Receive
1374 Buffers鈥.
1375
1376 A driver can indicate that it will generate checksumless
1377 packets by negotating the VIRTIO_NET_F_CSUM feature. This 鈥
1378 checksum offload鈥 is a common feature on modern network cards.
1379
1380 If that feature is negotiated, a driver can use TCP or UDP
1381 segmentation offload by negotiating the VIRTIO_NET_F_HOST_TSO4
1382 (IPv4 TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP) and
1383 VIRTIO_NET_F_HOST_UFO (UDP fragmentation) features. It should
1384 not send TCP packets requiring segmentation offload which have
1385 the Explicit Congestion Notification bit set, unless the
1386 VIRTIO_NET_F_HOST_ECN feature is negotiated.[footnote:
1387This is a common restriction in real, older network cards.
1388]
1389
1390 The converse features are also available: a driver can save the
1391 virtual device some work by negotiating these features.[footnote:
1392For example, a network packet transported between two guests on
1393the same system may not require checksumming at all, nor
1394segmentation, if both guests are amenable.
1395] The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
1396 checksummed packets can be received, and if it can do that then
1397 the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
1398 VIRTIO_NET_F_GUEST_UFO and VIRTIO_NET_F_GUEST_ECN are the input
1399 equivalents of the features described above. See 鈥淩eceiving
1400 Packets鈥 below.
1401
1402 Device Operation
1403
1404Packets are transmitted by placing them in the transmitq, and
1405buffers for incoming packets are placed in the receiveq. In each
1406case, the packet itself is preceeded by a header:
1407
1408struct virtio_net_hdr {
1409
1410#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
1411
1412 u8 flags;
1413
1414#define VIRTIO_NET_HDR_GSO_NONE 0
1415
1416#define VIRTIO_NET_HDR_GSO_TCPV4 1
1417
1418#define VIRTIO_NET_HDR_GSO_UDP 3
1419
1420#define VIRTIO_NET_HDR_GSO_TCPV6 4
1421
1422#define VIRTIO_NET_HDR_GSO_ECN 0x80
1423
1424 u8 gso_type;
1425
1426 u16 hdr_len;
1427
1428 u16 gso_size;
1429
1430 u16 csum_start;
1431
1432 u16 csum_offset;
1433
1434/* Only if VIRTIO_NET_F_MRG_RXBUF: */
1435
1436 u16 num_buffers
1437
1438};
1439
1440The controlq is used to control device features such as
1441filtering.
1442
1443 Packet Transmission
1444
1445Transmitting a single packet is simple, but varies depending on
1446the different features the driver negotiated.
1447
1448 If the driver negotiated VIRTIO_NET_F_CSUM, and the packet has
1449 not been fully checksummed, then the virtio_net_hdr's fields
1450 are set as follows. Otherwise, the packet must be fully
1451 checksummed, and flags is zero.
1452
1453 flags has the VIRTIO_NET_HDR_F_NEEDS_CSUM set,
1454
1455 <ite:csum_start-is-set>csum_start is set to the offset within
1456 the packet to begin checksumming, and
1457
1458 csum_offset indicates how many bytes after the csum_start the
1459 new (16 bit ones' complement) checksum should be placed.[footnote:
1460For example, consider a partially checksummed TCP (IPv4) packet.
1461It will have a 14 byte ethernet header and 20 byte IP header
1462followed by the TCP header (with the TCP checksum field 16 bytes
1463into that header). csum_start will be 14+20 = 34 (the TCP
1464checksum includes the header), and csum_offset will be 16. The
1465value in the TCP checksum field will be the sum of the TCP pseudo
1466header, so that replacing it by the ones' complement checksum of
1467the TCP header and body will give the correct result.
1468]
1469
1470 <enu:If-the-driver>If the driver negotiated
1471 VIRTIO_NET_F_HOST_TSO4, TSO6 or UFO, and the packet requires
1472 TCP segmentation or UDP fragmentation, then the 鈥済so_type鈥
1473 field is set to VIRTIO_NET_HDR_GSO_TCPV4, TCPV6 or UDP.
1474 (Otherwise, it is set to VIRTIO_NET_HDR_GSO_NONE). In this
1475 case, packets larger than 1514 bytes can be transmitted: the
1476 metadata indicates how to replicate the packet header to cut it
1477 into smaller packets. The other gso fields are set:
1478
1479 hdr_len is a hint to the device as to how much of the header
1480 needs to be kept to copy into each packet, usually set to the
1481 length of the headers, including the transport header.[footnote:
1482Due to various bugs in implementations, this field is not useful
1483as a guarantee of the transport header size.
1484]
1485
1486 gso_size is the size of the packet beyond that header (ie.
1487 MSS).
1488
1489 If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature, the
1490 VIRTIO_NET_HDR_GSO_ECN bit may be set in 鈥済so_type鈥 as well,
1491 indicating that the TCP packet has the ECN bit set.[footnote:
1492This case is not handled by some older hardware, so is called out
1493specifically in the protocol.
1494]
1495
1496 If the driver negotiated the VIRTIO_NET_F_MRG_RXBUF feature,
1497 the num_buffers field is set to zero.
1498
1499 The header and packet are added as one output buffer to the
1500 transmitq, and the device is notified of the new entry (see [sub:Notifying-The-Device]
1501 ).[footnote:
1502Note that the header will be two bytes longer for the
1503VIRTIO_NET_F_MRG_RXBUF case.
1504]
1505
1506 Packet Transmission Interrupt
1507
1508Often a driver will suppress transmission interrupts using the
1509VRING_AVAIL_F_NO_INTERRUPT flag (see [sub:Receiving-Used-Buffers]
1510) and check for used packets in the transmit path of following
1511packets. However, it will still receive interrupts if the
1512VIRTIO_F_NOTIFY_ON_EMPTY feature is negotiated, indicating that
1513the transmission queue is completely emptied.
1514
1515The normal behavior in this interrupt handler is to retrieve and
1516new descriptors from the used ring and free the corresponding
1517headers and packets.
1518
1519 Setting Up Receive Buffers
1520
1521It is generally a good idea to keep the receive virtqueue as
1522fully populated as possible: if it runs out, network performance
1523will suffer.
1524
1525If the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6 or
1526VIRTIO_NET_F_GUEST_UFO features are used, the Guest will need to
1527accept packets of up to 65550 bytes long (the maximum size of a
1528TCP or UDP packet, plus the 14 byte ethernet header), otherwise
15291514 bytes. So unless VIRTIO_NET_F_MRG_RXBUF is negotiated, every
1530buffer in the receive queue needs to be at least this length [footnote:
1531Obviously each one can be split across multiple descriptor
1532elements.
1533].
1534
1535If VIRTIO_NET_F_MRG_RXBUF is negotiated, each buffer must be at
1536least the size of the struct virtio_net_hdr.
1537
1538 Packet Receive Interrupt
1539
1540When a packet is copied into a buffer in the receiveq, the
1541optimal path is to disable further interrupts for the receiveq
1542(see [sub:Receiving-Used-Buffers]) and process packets until no
1543more are found, then re-enable them.
1544
1545Processing packet involves:
1546
1547 If the driver negotiated the VIRTIO_NET_F_MRG_RXBUF feature,
1548 then the 鈥渘um_buffers鈥 field indicates how many descriptors
1549 this packet is spread over (including this one). This allows
1550 receipt of large packets without having to allocate large
1551 buffers. In this case, there will be at least 鈥渘um_buffers鈥 in
1552 the used ring, and they should be chained together to form a
1553 single packet. The other buffers will not begin with a struct
1554 virtio_net_hdr.
1555
1556 If the VIRTIO_NET_F_MRG_RXBUF feature was not negotiated, or
1557 the 鈥渘um_buffers鈥 field is one, then the entire packet will be
1558 contained within this buffer, immediately following the struct
1559 virtio_net_hdr.
1560
1561 If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
1562 VIRTIO_NET_HDR_F_NEEDS_CSUM bit in the 鈥渇lags鈥 field may be
1563 set: if so, the checksum on the packet is incomplete and the 鈥
1564 csum_start鈥 and 鈥渃sum_offset鈥 fields indicate how to calculate
1565 it (see [ite:csum_start-is-set]).
1566
1567 If the VIRTIO_NET_F_GUEST_TSO4, TSO6 or UFO options were
1568 negotiated, then the 鈥済so_type鈥 may be something other than
1569 VIRTIO_NET_HDR_GSO_NONE, and the 鈥済so_size鈥 field indicates the
1570 desired MSS (see [enu:If-the-driver]).Control Virtqueue
1571
1572The driver uses the control virtqueue (if VIRTIO_NET_F_VTRL_VQ is
1573negotiated) to send commands to manipulate various features of
1574the device which would not easily map into the configuration
1575space.
1576
1577All commands are of the following form:
1578
1579struct virtio_net_ctrl {
1580
1581 u8 class;
1582
1583 u8 command;
1584
1585 u8 command-specific-data[];
1586
1587 u8 ack;
1588
1589};
1590
1591
1592
1593/* ack values */
1594
1595#define VIRTIO_NET_OK 0
1596
1597#define VIRTIO_NET_ERR 1
1598
1599The class, command and command-specific-data are set by the
1600driver, and the device sets the ack byte. There is little it can
1601do except issue a diagnostic if the ack byte is not
1602VIRTIO_NET_OK.
1603
1604 Packet Receive Filtering
1605
1606If the VIRTIO_NET_F_CTRL_RX feature is negotiated, the driver can
1607send control commands for promiscuous mode, multicast receiving,
1608and filtering of MAC addresses.
1609
1610Note that in general, these commands are best-effort: unwanted
1611packets may still arrive.
1612
1613 Setting Promiscuous Mode
1614
1615#define VIRTIO_NET_CTRL_RX 0
1616
1617 #define VIRTIO_NET_CTRL_RX_PROMISC 0
1618
1619 #define VIRTIO_NET_CTRL_RX_ALLMULTI 1
1620
1621The class VIRTIO_NET_CTRL_RX has two commands:
1622VIRTIO_NET_CTRL_RX_PROMISC turns promiscuous mode on and off, and
1623VIRTIO_NET_CTRL_RX_ALLMULTI turns all-multicast receive on and
1624off. The command-specific-data is one byte containing 0 (off) or
16251 (on).
1626
1627 Setting MAC Address Filtering
1628
1629struct virtio_net_ctrl_mac {
1630
1631 u32 entries;
1632
1633 u8 macs[entries][ETH_ALEN];
1634
1635};
1636
1637
1638
1639#define VIRTIO_NET_CTRL_MAC 1
1640
1641 #define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
1642
1643The device can filter incoming packets by any number of
1644destination MAC addresses.[footnote:
1645Since there are no guarentees, it can use a hash filter
1646orsilently switch to allmulti or promiscuous mode if it is given
1647too many addresses.
1648] This table is set using the class VIRTIO_NET_CTRL_MAC and the
1649command VIRTIO_NET_CTRL_MAC_TABLE_SET. The command-specific-data
1650is two variable length tables of 6-byte MAC addresses. The first
1651table contains unicast addresses, and the second contains
1652multicast addresses.
1653
1654 VLAN Filtering
1655
1656If the driver negotiates the VIRTION_NET_F_CTRL_VLAN feature, it
1657can control a VLAN filter table in the device.
1658
1659#define VIRTIO_NET_CTRL_VLAN 2
1660
1661 #define VIRTIO_NET_CTRL_VLAN_ADD 0
1662
1663 #define VIRTIO_NET_CTRL_VLAN_DEL 1
1664
1665Both the VIRTIO_NET_CTRL_VLAN_ADD and VIRTIO_NET_CTRL_VLAN_DEL
1666command take a 16-bit VLAN id as the command-specific-data.
1667
1668Appendix D: Block Device
1669
1670The virtio block device is a simple virtual block device (ie.
1671disk). Read and write requests (and other exotic requests) are
1672placed in the queue, and serviced (probably out of order) by the
1673device except where noted.
1674
1675 Configuration
1676
1677 Subsystem Device ID 2
1678
1679 Virtqueues 0:requestq.
1680
1681 Feature bits
1682
1683 VIRTIO_BLK_F_BARRIER (0) Host supports request barriers.
1684
1685 VIRTIO_BLK_F_SIZE_MAX (1) Maximum size of any single segment is
1686 in 鈥渟ize_max鈥.
1687
1688 VIRTIO_BLK_F_SEG_MAX (2) Maximum number of segments in a
1689 request is in 鈥渟eg_max鈥.
1690
1691 VIRTIO_BLK_F_GEOMETRY (4) Disk-style geometry specified in 鈥
1692 geometry鈥.
1693
1694 VIRTIO_BLK_F_RO (5) Device is read-only.
1695
1696 VIRTIO_BLK_F_BLK_SIZE (6) Block size of disk is in 鈥渂lk_size鈥.
1697
1698 VIRTIO_BLK_F_SCSI (7) Device supports scsi packet commands.
1699
1700 VIRTIO_BLK_F_FLUSH (9) Cache flush command support.
1701
1702
1703
1704 Device configuration layout The capacity of the device
1705 (expressed in 512-byte sectors) is always present. The
1706 availability of the others all depend on various feature bits
1707 as indicated above. struct virtio_blk_config {
1708
1709 u64 capacity;
1710
1711 u32 size_max;
1712
1713 u32 seg_max;
1714
1715 struct virtio_blk_geometry {
1716
1717 u16 cylinders;
1718
1719 u8 heads;
1720
1721 u8 sectors;
1722
1723 } geometry;
1724
1725 u32 blk_size;
1726
1727
1728
1729};
1730
1731 Device Initialization
1732
1733 The device size should be read from the 鈥渃apacity鈥
1734 configuration field. No requests should be submitted which goes
1735 beyond this limit.
1736
1737 If the VIRTIO_BLK_F_BLK_SIZE feature is negotiated, the
1738 blk_size field can be read to determine the optimal sector size
1739 for the driver to use. This does not effect the units used in
1740 the protocol (always 512 bytes), but awareness of the correct
1741 value can effect performance.
1742
1743 If the VIRTIO_BLK_F_RO feature is set by the device, any write
1744 requests will fail.
1745
1746
1747
1748 Device Operation
1749
1750The driver queues requests to the virtqueue, and they are used by
1751the device (not necessarily in order). Each request is of form:
1752
1753struct virtio_blk_req {
1754
1755
1756
1757 u32 type;
1758
1759 u32 ioprio;
1760
1761 u64 sector;
1762
1763 char data[][512];
1764
1765 u8 status;
1766
1767};
1768
1769If the device has VIRTIO_BLK_F_SCSI feature, it can also support
1770scsi packet command requests, each of these requests is of form:struct virtio_scsi_pc_req {
1771
1772 u32 type;
1773
1774 u32 ioprio;
1775
1776 u64 sector;
1777
1778 char cmd[];
1779
1780 char data[][512];
1781
1782#define SCSI_SENSE_BUFFERSIZE 96
1783
1784 u8 sense[SCSI_SENSE_BUFFERSIZE];
1785
1786 u32 errors;
1787
1788 u32 data_len;
1789
1790 u32 sense_len;
1791
1792 u32 residual;
1793
1794 u8 status;
1795
1796};
1797
1798The type of the request is either a read (VIRTIO_BLK_T_IN), a
1799write (VIRTIO_BLK_T_OUT), a scsi packet command
1800(VIRTIO_BLK_T_SCSI_CMD or VIRTIO_BLK_T_SCSI_CMD_OUT[footnote:
1801the SCSI_CMD and SCSI_CMD_OUT types are equivalent, the device
1802does not distinguish between them
1803]) or a flush (VIRTIO_BLK_T_FLUSH or VIRTIO_BLK_T_FLUSH_OUT[footnote:
1804the FLUSH and FLUSH_OUT types are equivalent, the device does not
1805distinguish between them
1806]). If the device has VIRTIO_BLK_F_BARRIER feature the high bit
1807(VIRTIO_BLK_T_BARRIER) indicates that this request acts as a
1808barrier and that all preceeding requests must be complete before
1809this one, and all following requests must not be started until
1810this is complete. Note that a barrier does not flush caches in
1811the underlying backend device in host, and thus does not serve as
1812data consistency guarantee. Driver must use FLUSH request to
1813flush the host cache.
1814
1815#define VIRTIO_BLK_T_IN 0
1816
1817#define VIRTIO_BLK_T_OUT 1
1818
1819#define VIRTIO_BLK_T_SCSI_CMD 2
1820
1821#define VIRTIO_BLK_T_SCSI_CMD_OUT 3
1822
1823#define VIRTIO_BLK_T_FLUSH 4
1824
1825#define VIRTIO_BLK_T_FLUSH_OUT 5
1826
1827#define VIRTIO_BLK_T_BARRIER 0x80000000
1828
1829The ioprio field is a hint about the relative priorities of
1830requests to the device: higher numbers indicate more important
1831requests.
1832
1833The sector number indicates the offset (multiplied by 512) where
1834the read or write is to occur. This field is unused and set to 0
1835for scsi packet commands and for flush commands.
1836
1837The cmd field is only present for scsi packet command requests,
1838and indicates the command to perform. This field must reside in a
1839single, separate read-only buffer; command length can be derived
1840from the length of this buffer.
1841
1842Note that these first three (four for scsi packet commands)
1843fields are always read-only: the data field is either read-only
1844or write-only, depending on the request. The size of the read or
1845write can be derived from the total size of the request buffers.
1846
1847The sense field is only present for scsi packet command requests,
1848and indicates the buffer for scsi sense data.
1849
1850The data_len field is only present for scsi packet command
1851requests, this field is deprecated, and should be ignored by the
1852driver. Historically, devices copied data length there.
1853
1854The sense_len field is only present for scsi packet command
1855requests and indicates the number of bytes actually written to
1856the sense buffer.
1857
1858The residual field is only present for scsi packet command
1859requests and indicates the residual size, calculated as data
1860length - number of bytes actually transferred.
1861
1862The final status byte is written by the device: either
1863VIRTIO_BLK_S_OK for success, VIRTIO_BLK_S_IOERR for host or guest
1864error or VIRTIO_BLK_S_UNSUPP for a request unsupported by host:#define VIRTIO_BLK_S_OK 0
1865
1866#define VIRTIO_BLK_S_IOERR 1
1867
1868#define VIRTIO_BLK_S_UNSUPP 2
1869
1870Historically, devices assumed that the fields type, ioprio and
1871sector reside in a single, separate read-only buffer; the fields
1872errors, data_len, sense_len and residual reside in a single,
1873separate write-only buffer; the sense field in a separate
1874write-only buffer of size 96 bytes, by itself; the fields errors,
1875data_len, sense_len and residual in a single write-only buffer;
1876and the status field is a separate read-only buffer of size 1
1877byte, by itself.
1878
1879Appendix E: Console Device
1880
1881The virtio console device is a simple device for data input and
1882output. A device may have one or more ports. Each port has a pair
1883of input and output virtqueues. Moreover, a device has a pair of
1884control IO virtqueues. The control virtqueues are used to
1885communicate information between the device and the driver about
1886ports being opened and closed on either side of the connection,
1887indication from the host about whether a particular port is a
1888console port, adding new ports, port hot-plug/unplug, etc., and
1889indication from the guest about whether a port or a device was
1890successfully added, port open/close, etc.. For data IO, one or
1891more empty buffers are placed in the receive queue for incoming
1892data and outgoing characters are placed in the transmit queue.
1893
1894 Configuration
1895
1896 Subsystem Device ID 3
1897
1898 Virtqueues 0:receiveq(port0). 1:transmitq(port0), 2:control
1899 receiveq[footnote:
1900Ports 2 onwards only if VIRTIO_CONSOLE_F_MULTIPORT is set
1901], 3:control transmitq, 4:receiveq(port1), 5:transmitq(port1),
1902 ...
1903
1904 Feature bits
1905
1906 VIRTIO_CONSOLE_F_SIZE (0) Configuration cols and rows fields
1907 are valid.
1908
1909 VIRTIO_CONSOLE_F_MULTIPORT(1) Device has support for multiple
1910 ports; configuration fields nr_ports and max_nr_ports are
1911 valid and control virtqueues will be used.
1912
1913 Device configuration layout The size of the console is supplied
1914 in the configuration space if the VIRTIO_CONSOLE_F_SIZE feature
1915 is set. Furthermore, if the VIRTIO_CONSOLE_F_MULTIPORT feature
1916 is set, the maximum number of ports supported by the device can
1917 be fetched.struct virtio_console_config {
1918
1919 u16 cols;
1920
1921 u16 rows;
1922
1923
1924
1925 u32 max_nr_ports;
1926
1927};
1928
1929 Device Initialization
1930
1931 If the VIRTIO_CONSOLE_F_SIZE feature is negotiated, the driver
1932 can read the console dimensions from the configuration fields.
1933
1934 If the VIRTIO_CONSOLE_F_MULTIPORT feature is negotiated, the
1935 driver can spawn multiple ports, not all of which may be
1936 attached to a console. Some could be generic ports. In this
1937 case, the control virtqueues are enabled and according to the
1938 max_nr_ports configuration-space value, the appropriate number
1939 of virtqueues are created. A control message indicating the
1940 driver is ready is sent to the host. The host can then send
1941 control messages for adding new ports to the device. After
1942 creating and initializing each port, a
1943 VIRTIO_CONSOLE_PORT_READY control message is sent to the host
1944 for that port so the host can let us know of any additional
1945 configuration options set for that port.
1946
1947 The receiveq for each port is populated with one or more
1948 receive buffers.
1949
1950 Device Operation
1951
1952 For output, a buffer containing the characters is placed in the
1953 port's transmitq.[footnote:
1954Because this is high importance and low bandwidth, the current
1955Linux implementation polls for the buffer to be used, rather than
1956waiting for an interrupt, simplifying the implementation
1957significantly. However, for generic serial ports with the
1958O_NONBLOCK flag set, the polling limitation is relaxed and the
1959consumed buffers are freed upon the next write or poll call or
1960when a port is closed or hot-unplugged.
1961]
1962
1963 When a buffer is used in the receiveq (signalled by an
1964 interrupt), the contents is the input to the port associated
1965 with the virtqueue for which the notification was received.
1966
1967 If the driver negotiated the VIRTIO_CONSOLE_F_SIZE feature, a
1968 configuration change interrupt may occur. The updated size can
1969 be read from the configuration fields.
1970
1971 If the driver negotiated the VIRTIO_CONSOLE_F_MULTIPORT
1972 feature, active ports are announced by the host using the
1973 VIRTIO_CONSOLE_PORT_ADD control message. The same message is
1974 used for port hot-plug as well.
1975
1976 If the host specified a port `name', a sysfs attribute is
1977 created with the name filled in, so that udev rules can be
1978 written that can create a symlink from the port's name to the
1979 char device for port discovery by applications in the guest.
1980
1981 Changes to ports' state are effected by control messages.
1982 Appropriate action is taken on the port indicated in the
1983 control message. The layout of the structure of the control
1984 buffer and the events associated are:struct virtio_console_control {
1985
1986 uint32_t id; /* Port number */
1987
1988 uint16_t event; /* The kind of control event */
1989
1990 uint16_t value; /* Extra information for the event */
1991
1992};
1993
1994
1995
1996/* Some events for the internal messages (control packets) */
1997
1998
1999
2000#define VIRTIO_CONSOLE_DEVICE_READY 0
2001
2002#define VIRTIO_CONSOLE_PORT_ADD 1
2003
2004#define VIRTIO_CONSOLE_PORT_REMOVE 2
2005
2006#define VIRTIO_CONSOLE_PORT_READY 3
2007
2008#define VIRTIO_CONSOLE_CONSOLE_PORT 4
2009
2010#define VIRTIO_CONSOLE_RESIZE 5
2011
2012#define VIRTIO_CONSOLE_PORT_OPEN 6
2013
2014#define VIRTIO_CONSOLE_PORT_NAME 7
2015
2016Appendix F: Entropy Device
2017
2018The virtio entropy device supplies high-quality randomness for
2019guest use.
2020
2021 Configuration
2022
2023 Subsystem Device ID 4
2024
2025 Virtqueues 0:requestq.
2026
2027 Feature bits None currently defined
2028
2029 Device configuration layout None currently defined.
2030
2031 Device Initialization
2032
2033 The virtqueue is initialized
2034
2035 Device Operation
2036
2037When the driver requires random bytes, it places the descriptor
2038of one or more buffers in the queue. It will be completely filled
2039by random data by the device.
2040
2041Appendix G: Memory Balloon Device
2042
2043The virtio memory balloon device is a primitive device for
2044managing guest memory: the device asks for a certain amount of
2045memory, and the guest supplies it (or withdraws it, if the device
2046has more than it asks for). This allows the guest to adapt to
2047changes in allowance of underlying physical memory. If the
2048feature is negotiated, the device can also be used to communicate
2049guest memory statistics to the host.
2050
2051 Configuration
2052
2053 Subsystem Device ID 5
2054
2055 Virtqueues 0:inflateq. 1:deflateq. 2:statsq.[footnote:
2056Only if VIRTIO_BALLON_F_STATS_VQ set
2057]
2058
2059 Feature bits
2060
2061 VIRTIO_BALLOON_F_MUST_TELL_HOST (0) Host must be told before
2062 pages from the balloon are used.
2063
2064 VIRTIO_BALLOON_F_STATS_VQ (1) A virtqueue for reporting guest
2065 memory statistics is present.
2066
2067 Device configuration layout Both fields of this configuration
2068 are always available. Note that they are little endian, despite
2069 convention that device fields are guest endian:struct virtio_balloon_config {
2070
2071 u32 num_pages;
2072
2073 u32 actual;
2074
2075};
2076
2077 Device Initialization
2078
2079 The inflate and deflate virtqueues are identified.
2080
2081 If the VIRTIO_BALLOON_F_STATS_VQ feature bit is negotiated:
2082
2083 Identify the stats virtqueue.
2084
2085 Add one empty buffer to the stats virtqueue and notify the
2086 host.
2087
2088Device operation begins immediately.
2089
2090 Device Operation
2091
2092 Memory Ballooning The device is driven by the receipt of a
2093 configuration change interrupt.
2094
2095 The 鈥渘um_pages鈥 configuration field is examined. If this is
2096 greater than the 鈥渁ctual鈥 number of pages, memory must be given
2097 to the balloon. If it is less than the 鈥渁ctual鈥 number of
2098 pages, memory may be taken back from the balloon for general
2099 use.
2100
2101 To supply memory to the balloon (aka. inflate):
2102
2103 The driver constructs an array of addresses of unused memory
2104 pages. These addresses are divided by 4096[footnote:
2105This is historical, and independent of the guest page size
2106] and the descriptor describing the resulting 32-bit array is
2107 added to the inflateq.
2108
2109 To remove memory from the balloon (aka. deflate):
2110
2111 The driver constructs an array of addresses of memory pages it
2112 has previously given to the balloon, as described above. This
2113 descriptor is added to the deflateq.
2114
2115 If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is set, the
2116 guest may not use these requested pages until that descriptor
2117 in the deflateq has been used by the device.
2118
2119 Otherwise, the guest may begin to re-use pages previously given
2120 to the balloon before the device has acknowledged their
2121 withdrawl. [footnote:
2122In this case, deflation advice is merely a courtesy
2123]
2124
2125 In either case, once the device has completed the inflation or
2126 deflation, the 鈥渁ctual鈥 field of the configuration should be
2127 updated to reflect the new number of pages in the balloon.[footnote:
2128As updates to configuration space are not atomic, this field
2129isn't particularly reliable, but can be used to diagnose buggy
2130guests.
2131]
2132
2133 Memory Statistics
2134
2135The stats virtqueue is atypical because communication is driven
2136by the device (not the driver). The channel becomes active at
2137driver initialization time when the driver adds an empty buffer
2138and notifies the device. A request for memory statistics proceeds
2139as follows:
2140
2141 The device pushes the buffer onto the used ring and sends an
2142 interrupt.
2143
2144 The driver pops the used buffer and discards it.
2145
2146 The driver collects memory statistics and writes them into a
2147 new buffer.
2148
2149 The driver adds the buffer to the virtqueue and notifies the
2150 device.
2151
2152 The device pops the buffer (retaining it to initiate a
2153 subsequent request) and consumes the statistics.
2154
2155 Memory Statistics Format Each statistic consists of a 16 bit
2156 tag and a 64 bit value. Both quantities are represented in the
2157 native endian of the guest. All statistics are optional and the
2158 driver may choose which ones to supply. To guarantee backwards
2159 compatibility, unsupported statistics should be omitted.
2160
2161 struct virtio_balloon_stat {
2162
2163#define VIRTIO_BALLOON_S_SWAP_IN 0
2164
2165#define VIRTIO_BALLOON_S_SWAP_OUT 1
2166
2167#define VIRTIO_BALLOON_S_MAJFLT 2
2168
2169#define VIRTIO_BALLOON_S_MINFLT 3
2170
2171#define VIRTIO_BALLOON_S_MEMFREE 4
2172
2173#define VIRTIO_BALLOON_S_MEMTOT 5
2174
2175 u16 tag;
2176
2177 u64 val;
2178
2179} __attribute__((packed));
2180
2181 Tags
2182
2183 VIRTIO_BALLOON_S_SWAP_IN The amount of memory that has been
2184 swapped in (in bytes).
2185
2186 VIRTIO_BALLOON_S_SWAP_OUT The amount of memory that has been
2187 swapped out to disk (in bytes).
2188
2189 VIRTIO_BALLOON_S_MAJFLT The number of major page faults that
2190 have occurred.
2191
2192 VIRTIO_BALLOON_S_MINFLT The number of minor page faults that
2193 have occurred.
2194
2195 VIRTIO_BALLOON_S_MEMFREE The amount of memory not being used
2196 for any purpose (in bytes).
2197
2198 VIRTIO_BALLOON_S_MEMTOT The total amount of memory available
2199 (in bytes).
2200
diff --git a/Documentation/vm/00-INDEX b/Documentation/vm/00-INDEX
index dca82d7c83d8..5481c8ba3412 100644
--- a/Documentation/vm/00-INDEX
+++ b/Documentation/vm/00-INDEX
@@ -30,8 +30,6 @@ page_migration
30 - description of page migration in NUMA systems. 30 - description of page migration in NUMA systems.
31pagemap.txt 31pagemap.txt
32 - pagemap, from the userspace perspective 32 - pagemap, from the userspace perspective
33slabinfo.c
34 - source code for a tool to get reports about slabs.
35slub.txt 33slub.txt
36 - a short users guide for SLUB. 34 - a short users guide for SLUB.
37unevictable-lru.txt 35unevictable-lru.txt
diff --git a/Documentation/vm/numa b/Documentation/vm/numa
index a200a386429d..ade01274212d 100644
--- a/Documentation/vm/numa
+++ b/Documentation/vm/numa
@@ -109,11 +109,11 @@ to improve NUMA locality using various CPU affinity command line interfaces,
109such as taskset(1) and numactl(1), and program interfaces such as 109such as taskset(1) and numactl(1), and program interfaces such as
110sched_setaffinity(2). Further, one can modify the kernel's default local 110sched_setaffinity(2). Further, one can modify the kernel's default local
111allocation behavior using Linux NUMA memory policy. 111allocation behavior using Linux NUMA memory policy.
112[see Documentation/vm/numa_memory_policy.] 112[see Documentation/vm/numa_memory_policy.txt.]
113 113
114System administrators can restrict the CPUs and nodes' memories that a non- 114System administrators can restrict the CPUs and nodes' memories that a non-
115privileged user can specify in the scheduling or NUMA commands and functions 115privileged user can specify in the scheduling or NUMA commands and functions
116using control groups and CPUsets. [see Documentation/cgroups/CPUsets.txt] 116using control groups and CPUsets. [see Documentation/cgroups/cpusets.txt]
117 117
118On architectures that do not hide memoryless nodes, Linux will include only 118On architectures that do not hide memoryless nodes, Linux will include only
119zones [nodes] with memory in the zonelists. This means that for a memoryless 119zones [nodes] with memory in the zonelists. This means that for a memoryless
diff --git a/Documentation/vm/slub.txt b/Documentation/vm/slub.txt
index 07375e73981a..f464f47bc60d 100644
--- a/Documentation/vm/slub.txt
+++ b/Documentation/vm/slub.txt
@@ -17,7 +17,7 @@ data and perform operation on the slabs. By default slabinfo only lists
17slabs that have data in them. See "slabinfo -h" for more options when 17slabs that have data in them. See "slabinfo -h" for more options when
18running the command. slabinfo can be compiled with 18running the command. slabinfo can be compiled with
19 19
20gcc -o slabinfo Documentation/vm/slabinfo.c 20gcc -o slabinfo tools/slub/slabinfo.c
21 21
22Some of the modes of operation of slabinfo require that slub debugging 22Some of the modes of operation of slabinfo require that slub debugging
23be enabled on the command line. F.e. no tracking information will be 23be enabled on the command line. F.e. no tracking information will be
diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
index 0924aaca3302..29bdf62aac09 100644
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -123,10 +123,11 @@ be automatically shutdown if it's set to "never".
123khugepaged runs usually at low frequency so while one may not want to 123khugepaged runs usually at low frequency so while one may not want to
124invoke defrag algorithms synchronously during the page faults, it 124invoke defrag algorithms synchronously during the page faults, it
125should be worth invoking defrag at least in khugepaged. However it's 125should be worth invoking defrag at least in khugepaged. However it's
126also possible to disable defrag in khugepaged: 126also possible to disable defrag in khugepaged by writing 0 or enable
127defrag in khugepaged by writing 1:
127 128
128echo yes >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag 129echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag
129echo no >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag 130echo 1 >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag
130 131
131You can also control how many pages khugepaged should scan at each 132You can also control how many pages khugepaged should scan at each
132pass: 133pass:
diff --git a/Documentation/x86/entry_64.txt b/Documentation/x86/entry_64.txt
index 7869f14d055c..bc7226ef5055 100644
--- a/Documentation/x86/entry_64.txt
+++ b/Documentation/x86/entry_64.txt
@@ -27,9 +27,6 @@ Some of these entries are:
27 magically-generated functions that make their way to do_IRQ with 27 magically-generated functions that make their way to do_IRQ with
28 the interrupt number as a parameter. 28 the interrupt number as a parameter.
29 29
30 - emulate_vsyscall: int 0xcc, a special non-ABI entry used by
31 vsyscall emulation.
32
33 - APIC interrupts: Various special-purpose interrupts for things 30 - APIC interrupts: Various special-purpose interrupts for things
34 like TLB shootdown. 31 like TLB shootdown.
35 32
diff --git a/Documentation/zh_CN/SubmitChecklist b/Documentation/zh_CN/SubmitChecklist
deleted file mode 100644
index 4c741d6bc048..000000000000
--- a/Documentation/zh_CN/SubmitChecklist
+++ /dev/null
@@ -1,109 +0,0 @@
1Chinese translated version of Documentation/SubmitChecklist
2
3If you have any comment or update to the content, please contact the
4original document maintainer directly. However, if you have a problem
5communicating in English you can also ask the Chinese maintainer for
6help. Contact the Chinese maintainer if this translation is outdated
7or if there is a problem with the translation.
8
9Chinese maintainer: Harry Wei <harryxiyou@gmail.com>
10---------------------------------------------------------------------
11Documentation/SubmitChecklist 的中文翻译
12
13如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文
14交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻
15译存在问题,请联系中文版维护者。
16
17中文版维护者: 贾威威 Harry Wei <harryxiyou@gmail.com>
18中文版翻译者: 贾威威 Harry Wei <harryxiyou@gmail.com>
19中文版校译者: 贾威威 Harry Wei <harryxiyou@gmail.com>
20
21
22以下为正文
23---------------------------------------------------------------------
24Linux内核提交清单
25~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
26
27这里有一些内核开发者应该做的基本事情,如果他们想看到自己的内核补丁提交
28被接受的更快。
29
30这些都是超出Documentation/SubmittingPatches文档里所提供的以及其他
31关于提交Linux内核补丁的说明。
32
331:如果你使用了一个功能那么就#include定义/声明那个功能的那个文件。
34 不要依靠其他间接引入定义/声明那个功能的头文件。
35
362:构建简洁适用或者更改CONFIG选项 =y,=m,或者=n。
37 不要有编译警告/错误, 不要有链接警告/错误。
38
392b:通过 allnoconfig, allmodconfig
40
412c:当使用 0=builddir 成功地构建
42
433:通过使用本地交叉编译工具或者其他一些构建产所,在多CPU框架上构建。
44
454:ppc64 是一个很好的检查交叉编译的框架,因为它往往把‘unsigned long’
46 当64位值来使用。
47
485:按照Documentation/CodingStyle文件里的详细描述,检查你补丁的整体风格。
49 使用补丁风格检查琐碎的违规(scripts/checkpatch.pl),审核员优先提交。
50 你应该调整遗留在你补丁中的所有违规。
51
526:任何更新或者改动CONFIG选项都不能打乱配置菜单。
53
547:所有的Kconfig选项更新都要有说明文字。
55
568:已经认真地总结了相关的Kconfig组合。这是很难通过测试做好的--脑力在这里下降。
57
589:检查具有简洁性。
59
6010:使用'make checkstack'和'make namespacecheck'检查,然后修改所找到的问题。
61 注意:堆栈检查不会明确地出现问题,但是任何的一个函数在堆栈上使用多于512字节
62 都要准备修改。
63
6411:包含kernel-doc到全局内核APIs文件。(不要求静态的函数,但是包含也无所谓。)
65 使用'make htmldocs'或者'make mandocs'来检查kernel-doc,然后修改任何
66 发现的问题。
67
6812:已经通过CONFIG_PREEMPT, CONFIG_DEBUG_PREEMPT,
69 CONFIG_DEBUG_SLAB, CONFIG_DEBUG_PAGEALLOC, CONFIG_DEBUG_MUTEXES,
70 CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_ATOMIC_SLEEP测试,并且同时都
71 使能。
72
7313:已经都构建并且使用或者不使用 CONFIG_SMP 和 CONFIG_PREEMPT测试执行时间。
74
7514:如果补丁影响IO/Disk,等等:已经通过使用或者不使用 CONFIG_LBDAF 测试。
76
7715:所有的codepaths已经行使所有lockdep启用功能。
78
7916:所有的/proc记录更新都要作成文件放在Documentation/目录下。
80
8117:所有的内核启动参数更新都被记录到Documentation/kernel-parameters.txt文件中。
82
8318:所有的模块参数更新都用MODULE_PARM_DESC()记录。
84
8519:所有的用户空间接口更新都被记录到Documentation/ABI/。查看Documentation/ABI/README
86 可以获得更多的信息。改变用户空间接口的补丁应该被邮件抄送给linux-api@vger.kernel.org。
87
8820:检查它是不是都通过`make headers_check'。
89
9021:已经通过至少引入slab和page-allocation失败检查。查看Documentation/fault-injection/。
91
9222:新加入的源码已经通过`gcc -W'(使用"make EXTRA_CFLAGS=-W")编译。这样将产生很多烦恼,
93 但是对于寻找漏洞很有益处,例如:"warning: comparison between signed and unsigned"。
94
9523:当它被合并到-mm补丁集后再测试,用来确定它是否还和补丁队列中的其他补丁一起工作以及在VM,VFS
96 和其他子系统中各个变化。
97
9824:所有的内存屏障{e.g., barrier(), rmb(), wmb()}需要在源代码中的一个注释来解释他们都是干什么的
99 以及原因。
100
10125:如果有任何输入输出控制的补丁被添加,也要更新Documentation/ioctl/ioctl-number.txt。
102
10326:如果你的更改代码依靠或者使用任何的内核APIs或者与下面的kconfig符号有关系的功能,你就要
104 使用相关的kconfig符号关闭, and/or =m(如果选项提供)[在同一时间不是所用的都启用,仅仅各个或者自由
105 组合他们]:
106
107 CONFIG_SMP, CONFIG_SYSFS, CONFIG_PROC_FS, CONFIG_INPUT, CONFIG_PCI,
108 CONFIG_BLOCK, CONFIG_PM, CONFIG_HOTPLUG, CONFIG_MAGIC_SYSRQ,
109 CONFIG_NET, CONFIG_INET=n (后一个使用 CONFIG_NET=y)