From 113b3a2b901573961509e81a28e9546cf9defef0 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Wed, 27 May 2009 21:46:11 -0400 Subject: ACPI: delete acpi.power_nocheck from kernel-parameters.txt Signed-off-by: Len Brown --- Documentation/kernel-parameters.txt | 8 -------- 1 file changed, 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index fd5cac013037..ed3d8b966c02 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -229,14 +229,6 @@ and is between 256 and 4096 characters. It is defined in the file to assume that this machine's pmtimer latches its value and always returns good values. - acpi.power_nocheck= [HW,ACPI] - Format: 1/0 enable/disable the check of power state. - On some bogus BIOS the _PSC object/_STA object of - power resource can't return the correct device power - state. In such case it is unneccessary to check its - power state again in power transition. - 1 : disable the power state check - acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode Format: { level | edge | high | low } -- cgit v1.2.2 From f21179a47ff8d1046a61c1cf5920244997a4a7bb Mon Sep 17 00:00:00 2001 From: Henrique de Moraes Holschuh Date: Sat, 30 May 2009 13:25:08 -0300 Subject: thinkpad-acpi: enhance led support Add support for extra LEDs on recent ThinkPads, and avoid registering with the led class the LEDs which are not available for a given ThinkPad model. All non-restricted LEDs are always available through the procfs interface, as the firmware doesn't care if an attempt is made to access an invalid LED. Signed-off-by: Henrique de Moraes Holschuh Signed-off-by: Len Brown --- Documentation/laptops/thinkpad-acpi.txt | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index e7e9a69069e1..88fc0661de56 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -920,7 +920,7 @@ The available commands are: echo ' off' >/proc/acpi/ibm/led echo ' blink' >/proc/acpi/ibm/led -The range is 0 to 7. The set of LEDs that can be +The range is 0 to 15. The set of LEDs that can be controlled varies from model to model. Here is the common ThinkPad mapping: @@ -932,6 +932,11 @@ mapping: 5 - UltraBase battery slot 6 - (unknown) 7 - standby + 8 - dock status 1 + 9 - dock status 2 + 10, 11 - (unknown) + 12 - thinkvantage + 13, 14, 15 - (unknown) All of the above can be turned on and off and can be made to blink. @@ -940,10 +945,12 @@ sysfs notes: The ThinkPad LED sysfs interface is described in detail by the LED class documentation, in Documentation/leds-class.txt. -The leds are named (in LED ID order, from 0 to 7): +The LEDs are named (in LED ID order, from 0 to 12): "tpacpi::power", "tpacpi:orange:batt", "tpacpi:green:batt", "tpacpi::dock_active", "tpacpi::bay_active", "tpacpi::dock_batt", -"tpacpi::unknown_led", "tpacpi::standby". +"tpacpi::unknown_led", "tpacpi::standby", "tpacpi::dock_status1", +"tpacpi::dock_status2", "tpacpi::unknown_led2", "tpacpi::unknown_led3", +"tpacpi::thinkvantage". Due to limitations in the sysfs LED class, if the status of the LED indicators cannot be read due to an error, thinkpad-acpi will report it as @@ -958,6 +965,12 @@ ThinkPad indicator LED should blink in hardware accelerated mode, use the "timer" trigger, and leave the delay_on and delay_off parameters set to zero (to request hardware acceleration autodetection). +LEDs that are known not to exist in a given ThinkPad model are not +made available through the sysfs interface. If you have a dock and you +notice there are LEDs listed for your ThinkPad that do not exist (and +are not in the dock), or if you notice that there are missing LEDs, +a report to ibm-acpi-devel@lists.sourceforge.net is appreciated. + ACPI sounds -- /proc/acpi/ibm/beep ---------------------------------- @@ -1555,3 +1568,7 @@ Sysfs interface changelog: 0x020300: hotkey enable/disable support removed, attributes hotkey_bios_enabled and hotkey_enable deprecated and marked for removal. + +0x020400: Marker for 16 LEDs support. Also, LEDs that are known + to not exist in a given model are not registered with + the LED sysfs class anymore. -- cgit v1.2.2 From d7880f10c5d42ba182a97c1fd41d41d0b8837097 Mon Sep 17 00:00:00 2001 From: Henrique de Moraes Holschuh Date: Thu, 18 Jun 2009 00:40:16 -0300 Subject: thinkpad-acpi: forbid the use of HBRV on Lenovo ThinkPads Forcing thinkpad-acpi to do EC-based brightness control (HBRV) on a X61 has very... interesting effects, instead of doing nothing (since it doesn't have EC-based backlight control), it causes "weirdness" in the fan tachometer readings, for example. This means the EC register that used to be HBRV has been reused by Lenovo for something else, but they didn't remove it from the DSDT. Make sure the documentation reflects this data, and forbid the user from forcing the driver to access HBRV on Lenovo ThinkPads. Signed-off-by: Henrique de Moraes Holschuh Signed-off-by: Len Brown --- Documentation/laptops/thinkpad-acpi.txt | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index 88fc0661de56..f2ff638cce8d 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -1169,17 +1169,19 @@ may not be distinct. Later Lenovo models that implement the ACPI display backlight brightness control methods have 16 levels, ranging from 0 to 15. -There are two interfaces to the firmware for direct brightness control, -EC and UCMS (or CMOS). To select which one should be used, use the -brightness_mode module parameter: brightness_mode=1 selects EC mode, -brightness_mode=2 selects UCMS mode, brightness_mode=3 selects EC -mode with NVRAM backing (so that brightness changes are remembered -across shutdown/reboot). +For IBM ThinkPads, there are two interfaces to the firmware for direct +brightness control, EC and UCMS (or CMOS). To select which one should be +used, use the brightness_mode module parameter: brightness_mode=1 selects +EC mode, brightness_mode=2 selects UCMS mode, brightness_mode=3 selects EC +mode with NVRAM backing (so that brightness changes are remembered across +shutdown/reboot). The driver tries to select which interface to use from a table of defaults for each ThinkPad model. If it makes a wrong choice, please report this as a bug, so that we can fix it. +Lenovo ThinkPads only support brightness_mode=2 (UCMS). + When display backlight brightness controls are available through the standard ACPI interface, it is best to use it instead of this direct ThinkPad-specific interface. The driver will disable its native -- cgit v1.2.2 From d73772474f6ebbacbe820c31c0fa1cffa7160246 Mon Sep 17 00:00:00 2001 From: Henrique de Moraes Holschuh Date: Thu, 18 Jun 2009 00:40:17 -0300 Subject: thinkpad-acpi: support the second fan on the X61 Support reading the tachometer of the auxiliary fan of a X60/X61. It was found out by sheer luck, that bit 0 of EC register 0x31 (formely HBRV) selects which fan is active for tachometer readings through EC 0x84/0x085: 0 for fan1, 1 for fan2. Many thanks to Christoph Kl??nter, to Whoopie, and to weasel, who helped confirm that behaviour. Fan control through EC HFSP applies to both fans equally, regardless of the state of bit 0 of EC 0x31. That matches the way the DSDT uses HFSP. In order to better support the secondary fan, export a second tachometer over hwmon, and add defensive measures to make sure we are reading the correct tachometer. Support for the second fan is whitelist-based, as I have not found anything obvious to look for in the DSDT to detect the presence of the second fan. Signed-off-by: Henrique de Moraes Holschuh Signed-off-by: Len Brown --- Documentation/laptops/thinkpad-acpi.txt | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/laptops/thinkpad-acpi.txt b/Documentation/laptops/thinkpad-acpi.txt index f2ff638cce8d..0d5e91379ae8 100644 --- a/Documentation/laptops/thinkpad-acpi.txt +++ b/Documentation/laptops/thinkpad-acpi.txt @@ -1269,7 +1269,7 @@ Fan control and monitoring: fan speed, fan enable/disable procfs: /proc/acpi/ibm/fan sysfs device attributes: (hwmon "thinkpad") fan1_input, pwm1, - pwm1_enable + pwm1_enable, fan2_input sysfs hwmon driver attributes: fan_watchdog NOTE NOTE NOTE: fan control operations are disabled by default for @@ -1282,6 +1282,9 @@ from the hardware registers of the embedded controller. This is known to work on later R, T, X and Z series ThinkPads but may show a bogus value on other models. +Some Lenovo ThinkPads support a secondary fan. This fan cannot be +controlled separately, it shares the main fan control. + Fan levels: Most ThinkPad fans work in "levels" at the firmware interface. Level 0 @@ -1412,6 +1415,11 @@ hwmon device attribute fan1_input: which can take up to two minutes. May return rubbish on older ThinkPads. +hwmon device attribute fan2_input: + Fan tachometer reading, in RPM, for the secondary fan. + Available only on some ThinkPads. If the secondary fan is + not installed, will always read 0. + hwmon driver attribute fan_watchdog: Fan safety watchdog timer interval, in seconds. Minimum is 1 second, maximum is 120 seconds. 0 disables the watchdog. -- cgit v1.2.2 From d4ec36bac3de39b7e10ec8f42fbdd20d9a9ed753 Mon Sep 17 00:00:00 2001 From: Wolfram Sang Date: Sun, 21 Jun 2009 12:32:39 +0200 Subject: sched: Documentation/sched-rt-group: Fix style issues & bump version - add missing closing bracket - fix two 80-chars issues (as the rest of the document adheres to it) - bump a reference to kernel version, so the document does not feel aged Signed-off-by: Wolfram Sang Cc: Peter Zijlstra LKML-Reference: <1245580359-4465-1-git-send-email-w.sang@pengutronix.de> Signed-off-by: Ingo Molnar --- Documentation/scheduler/sched-rt-group.txt | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) (limited to 'Documentation') diff --git a/Documentation/scheduler/sched-rt-group.txt b/Documentation/scheduler/sched-rt-group.txt index 1df7f9cdab05..86eabe6c3419 100644 --- a/Documentation/scheduler/sched-rt-group.txt +++ b/Documentation/scheduler/sched-rt-group.txt @@ -73,7 +73,7 @@ The remaining CPU time will be used for user input and other tasks. Because realtime tasks have explicitly allocated the CPU time they need to perform their tasks, buffer underruns in the graphics or audio can be eliminated. -NOTE: the above example is not fully implemented as of yet (2.6.25). We still +NOTE: the above example is not fully implemented yet. We still lack an EDF scheduler to make non-uniform periods usable. @@ -140,14 +140,15 @@ The other option is: .o CONFIG_CGROUP_SCHED (aka "Basis for grouping tasks" = "Control groups") -This uses the /cgroup virtual file system and "/cgroup//cpu.rt_runtime_us" -to control the CPU time reserved for each control group instead. +This uses the /cgroup virtual file system and +"/cgroup//cpu.rt_runtime_us" to control the CPU time reserved for each +control group instead. For more information on working with control groups, you should read Documentation/cgroups/cgroups.txt as well. -Group settings are checked against the following limits in order to keep the configuration -schedulable: +Group settings are checked against the following limits in order to keep the +configuration schedulable: \Sum_{i} runtime_{i} / global_period <= global_runtime / global_period @@ -189,7 +190,7 @@ Implementing SCHED_EDF might take a while to complete. Priority Inheritance is the biggest challenge as the current linux PI infrastructure is geared towards the limited static priority levels 0-99. With deadline scheduling you need to do deadline inheritance (since priority is inversely proportional to the -deadline delta (deadline - now). +deadline delta (deadline - now)). This means the whole PI machinery will have to be reworked - and that is one of the most complex pieces of code we have. -- cgit v1.2.2 From fa8a7094ba1679b4b9b443e0ac9f5e046c79ee8d Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Mon, 22 Jun 2009 11:56:24 +0900 Subject: x86: implement percpu_alloc kernel parameter According to Andi, it isn't clear whether lpage allocator is worth the trouble as there are many processors where PMD TLB is far scarcer than PTE TLB. The advantage or disadvantage probably depends on the actual size of percpu area and specific processor. As performance degradation due to TLB pressure tends to be highly workload specific and subtle, it is difficult to decide which way to go without more data. This patch implements percpu_alloc kernel parameter to allow selecting which first chunk allocator to use to ease debugging and testing. While at it, make sure all the failure paths report why something failed to help determining why certain allocator isn't working. Also, kill the "Great future plan" comment which had already been realized quite some time ago. [ Impact: allow explicit percpu first chunk allocator selection ] Signed-off-by: Tejun Heo Reported-by: Jan Beulich Cc: Andi Kleen Cc: Ingo Molnar --- Documentation/kernel-parameters.txt | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 08def8deb5f5..ecad946920d1 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1882,6 +1882,12 @@ and is between 256 and 4096 characters. It is defined in the file Format: { 0 | 1 } See arch/parisc/kernel/pdc_chassis.c + percpu_alloc= [X86] Select which percpu first chunk allocator to use. + Allowed values are one of "lpage", "embed" and "4k". + See comments in arch/x86/kernel/setup_percpu.c for + details on each allocator. This parameter is primarily + for debugging and performance comparison. + pf. [PARIDE] See Documentation/blockdev/paride.txt. -- cgit v1.2.2 From fd5e033908b7b743b5650790f196761dd930f988 Mon Sep 17 00:00:00 2001 From: Kiyoshi Ueda Date: Mon, 22 Jun 2009 10:12:27 +0100 Subject: dm mpath: add queue length load balancer This patch adds a dynamic load balancer, dm-queue-length, which balances the number of in-flight I/Os across the paths. The code is based on the patch posted by Stefan Bader: https://www.redhat.com/archives/dm-devel/2005-October/msg00050.html Signed-off-by: Stefan Bader Signed-off-by: Kiyoshi Ueda Signed-off-by: Jun'ichi Nomura Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/dm-queue-length.txt | 39 +++++++++++++++++++++++++ 1 file changed, 39 insertions(+) create mode 100644 Documentation/device-mapper/dm-queue-length.txt (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-queue-length.txt b/Documentation/device-mapper/dm-queue-length.txt new file mode 100644 index 000000000000..f4db2562175c --- /dev/null +++ b/Documentation/device-mapper/dm-queue-length.txt @@ -0,0 +1,39 @@ +dm-queue-length +=============== + +dm-queue-length is a path selector module for device-mapper targets, +which selects a path with the least number of in-flight I/Os. +The path selector name is 'queue-length'. + +Table parameters for each path: [] + : The number of I/Os to dispatch using the selected + path before switching to the next path. + If not given, internal default is used. To check + the default value, see the activated table. + +Status for each path: + : 'A' if the path is active, 'F' if the path is failed. + : The number of path failures. + : The number of in-flight I/Os on the path. + + +Algorithm +========= + +dm-queue-length increments/decrements 'in-flight' when an I/O is +dispatched/completed respectively. +dm-queue-length selects a path with the minimum 'in-flight'. + + +Examples +======== +In case that 2 paths (sda and sdb) are used with repeat_count == 128. + +# echo "0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 1 8:0 A 0 0 8:16 A 0 0 -- cgit v1.2.2 From f392ba889b019602976082bfe7bf486c2594f85c Mon Sep 17 00:00:00 2001 From: Kiyoshi Ueda Date: Mon, 22 Jun 2009 10:12:28 +0100 Subject: dm mpath: add service time load balancer This patch adds a service time oriented dynamic load balancer, dm-service-time, which selects the path with the shortest estimated service time for the incoming I/O. The service time is estimated by dividing the in-flight I/O size by a performance value of each path. The performance value can be given as a table argument at the table loading time. If no performance value is given, all paths are considered equal. Signed-off-by: Kiyoshi Ueda Signed-off-by: Jun'ichi Nomura Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/dm-service-time.txt | 91 +++++++++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100644 Documentation/device-mapper/dm-service-time.txt (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-service-time.txt b/Documentation/device-mapper/dm-service-time.txt new file mode 100644 index 000000000000..7d00668e97bb --- /dev/null +++ b/Documentation/device-mapper/dm-service-time.txt @@ -0,0 +1,91 @@ +dm-service-time +=============== + +dm-service-time is a path selector module for device-mapper targets, +which selects a path with the shortest estimated service time for +the incoming I/O. + +The service time for each path is estimated by dividing the total size +of in-flight I/Os on a path with the performance value of the path. +The performance value is a relative throughput value among all paths +in a path-group, and it can be specified as a table argument. + +The path selector name is 'service-time'. + +Table parameters for each path: [ []] + : The number of I/Os to dispatch using the selected + path before switching to the next path. + If not given, internal default is used. To check + the default value, see the activated table. + : The relative throughput value of the path + among all paths in the path-group. + The valid range is 0-100. + If not given, minimum value '1' is used. + If '0' is given, the path isn't selected while + other paths having a positive value are available. + +Status for each path: \ + + : 'A' if the path is active, 'F' if the path is failed. + : The number of path failures. + : The size of in-flight I/Os on the path. + : The relative throughput value of the path + among all paths in the path-group. + + +Algorithm +========= + +dm-service-time adds the I/O size to 'in-flight-size' when the I/O is +dispatched and substracts when completed. +Basically, dm-service-time selects a path having minimum service time +which is calculated by: + + ('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput' + +However, some optimizations below are used to reduce the calculation +as much as possible. + + 1. If the paths have the same 'relative_throughput', skip + the division and just compare the 'in-flight-size'. + + 2. If the paths have the same 'in-flight-size', skip the division + and just compare the 'relative_throughput'. + + 3. If some paths have non-zero 'relative_throughput' and others + have zero 'relative_throughput', ignore those paths with zero + 'relative_throughput'. + +If such optimizations can't be applied, calculate service time, and +compare service time. +If calculated service time is equal, the path having maximum +'relative_throughput' may be better. So compare 'relative_throughput' +then. + + +Examples +======== +In case that 2 paths (sda and sdb) are used with repeat_count == 128 +and sda has an average throughput 1GB/s and sdb has 4GB/s, +'relative_throughput' value may be '1' for sda and '4' for sdb. + +# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4 + + +Or '2' for sda and '8' for sdb would be also true. + +# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \ + dmsetup create test +# +# dmsetup table +test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8 +# +# dmsetup status +test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8 -- cgit v1.2.2 From f5db4af466e2dca0fe822019812d586ca910b00c Mon Sep 17 00:00:00 2001 From: Jonthan Brassow Date: Mon, 22 Jun 2009 10:12:35 +0100 Subject: dm raid1: add userspace log This patch contains a device-mapper mirror log module that forwards requests to userspace for processing. The structures used for communication between kernel and userspace are located in include/linux/dm-log-userspace.h. Due to the frequency, diversity, and 2-way communication nature of the exchanges between kernel and userspace, 'connector' was chosen as the interface for communication. The first log implementations written in userspace - "clustered-disk" and "clustered-core" - support clustered shared storage. A userspace daemon (in the LVM2 source code repository) uses openAIS/corosync to process requests in an ordered fashion with the rest of the nodes in the cluster so as to prevent log state corruption. Other implementations with no association to LVM or openAIS/corosync, are certainly possible. (Imagine if two machines are writing to the same region of a mirror. They would both mark the region dirty, but you need a cluster-aware entity that can handle properly marking the region clean when they are done. Otherwise, you might clear the region when the first machine is done, not the second.) Signed-off-by: Jonathan Brassow Cc: Evgeniy Polyakov Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/dm-log.txt | 54 ++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) create mode 100644 Documentation/device-mapper/dm-log.txt (limited to 'Documentation') diff --git a/Documentation/device-mapper/dm-log.txt b/Documentation/device-mapper/dm-log.txt new file mode 100644 index 000000000000..994dd75475a6 --- /dev/null +++ b/Documentation/device-mapper/dm-log.txt @@ -0,0 +1,54 @@ +Device-Mapper Logging +===================== +The device-mapper logging code is used by some of the device-mapper +RAID targets to track regions of the disk that are not consistent. +A region (or portion of the address space) of the disk may be +inconsistent because a RAID stripe is currently being operated on or +a machine died while the region was being altered. In the case of +mirrors, a region would be considered dirty/inconsistent while you +are writing to it because the writes need to be replicated for all +the legs of the mirror and may not reach the legs at the same time. +Once all writes are complete, the region is considered clean again. + +There is a generic logging interface that the device-mapper RAID +implementations use to perform logging operations (see +dm_dirty_log_type in include/linux/dm-dirty-log.h). Various different +logging implementations are available and provide different +capabilities. The list includes: + +Type Files +==== ===== +disk drivers/md/dm-log.c +core drivers/md/dm-log.c +userspace drivers/md/dm-log-userspace* include/linux/dm-log-userspace.h + +The "disk" log type +------------------- +This log implementation commits the log state to disk. This way, the +logging state survives reboots/crashes. + +The "core" log type +------------------- +This log implementation keeps the log state in memory. The log state +will not survive a reboot or crash, but there may be a small boost in +performance. This method can also be used if no storage device is +available for storing log state. + +The "userspace" log type +------------------------ +This log type simply provides a way to export the log API to userspace, +so log implementations can be done there. This is done by forwarding most +logging requests to userspace, where a daemon receives and processes the +request. + +The structure used for communication between kernel and userspace are +located in include/linux/dm-log-userspace.h. Due to the frequency, +diversity, and 2-way communication nature of the exchanges between +kernel and userspace, 'connector' is used as the interface for +communication. + +There are currently two userspace log implementations that leverage this +framework - "clustered_disk" and "clustered_core". These implementations +provide a cluster-coherent log for shared-storage. Device-mapper mirroring +can be used in a shared-storage environment when the cluster log implementations +are employed. -- cgit v1.2.2 From b053dc5a722eade28514f2cc922caf7a4baad987 Mon Sep 17 00:00:00 2001 From: Kumar Gala Date: Fri, 19 Jun 2009 08:31:05 -0500 Subject: powerpc: Refactor device tree binding Split device tree binding out of booting-without-of.txt and put them into their own files per binding. Signed-off-by: Kumar Gala --- Documentation/powerpc/booting-without-of.txt | 1168 +--------------------- Documentation/powerpc/dts-bindings/4xx/emac.txt | 148 +++ Documentation/powerpc/dts-bindings/gpio/gpio.txt | 50 + Documentation/powerpc/dts-bindings/gpio/mdio.txt | 19 + Documentation/powerpc/dts-bindings/marvell.txt | 521 ++++++++++ Documentation/powerpc/dts-bindings/phy.txt | 25 + Documentation/powerpc/dts-bindings/spi-bus.txt | 57 ++ Documentation/powerpc/dts-bindings/usb-ehci.txt | 25 + Documentation/powerpc/dts-bindings/xilinx.txt | 295 ++++++ 9 files changed, 1142 insertions(+), 1166 deletions(-) create mode 100644 Documentation/powerpc/dts-bindings/4xx/emac.txt create mode 100644 Documentation/powerpc/dts-bindings/gpio/gpio.txt create mode 100644 Documentation/powerpc/dts-bindings/gpio/mdio.txt create mode 100644 Documentation/powerpc/dts-bindings/marvell.txt create mode 100644 Documentation/powerpc/dts-bindings/phy.txt create mode 100644 Documentation/powerpc/dts-bindings/spi-bus.txt create mode 100644 Documentation/powerpc/dts-bindings/usb-ehci.txt create mode 100644 Documentation/powerpc/dts-bindings/xilinx.txt (limited to 'Documentation') diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index 8d999d862d0e..79f533f38c61 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt @@ -1238,1122 +1238,7 @@ descriptions for the SOC devices for which new nodes have been defined; this list will expand as more and more SOC-containing platforms are moved over to use the flattened-device-tree model. - a) PHY nodes - - Required properties: - - - device_type : Should be "ethernet-phy" - - interrupts : where a is the interrupt number and b is a - field that represents an encoding of the sense and level - information for the interrupt. This should be encoded based on - the information in section 2) depending on the type of interrupt - controller you have. - - interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - - reg : The ID number for the phy, usually a small integer - - linux,phandle : phandle for this node; likely referenced by an - ethernet controller node. - - - Example: - - ethernet-phy@0 { - linux,phandle = <2452000> - interrupt-parent = <40000>; - interrupts = <35 1>; - reg = <0>; - device_type = "ethernet-phy"; - }; - - - b) Interrupt controllers - - Some SOC devices contain interrupt controllers that are different - from the standard Open PIC specification. The SOC device nodes for - these types of controllers should be specified just like a standard - OpenPIC controller. Sense and level information should be encoded - as specified in section 2) of this chapter for each device that - specifies an interrupt. - - Example : - - pic@40000 { - linux,phandle = <40000>; - interrupt-controller; - #address-cells = <0>; - reg = <40000 40000>; - compatible = "chrp,open-pic"; - device_type = "open-pic"; - }; - - c) 4xx/Axon EMAC ethernet nodes - - The EMAC ethernet controller in IBM and AMCC 4xx chips, and also - the Axon bridge. To operate this needs to interact with a ths - special McMAL DMA controller, and sometimes an RGMII or ZMII - interface. In addition to the nodes and properties described - below, the node for the OPB bus on which the EMAC sits must have a - correct clock-frequency property. - - i) The EMAC node itself - - Required properties: - - device_type : "network" - - - compatible : compatible list, contains 2 entries, first is - "ibm,emac-CHIP" where CHIP is the host ASIC (440gx, - 405gp, Axon) and second is either "ibm,emac" or - "ibm,emac4". For Axon, thus, we have: "ibm,emac-axon", - "ibm,emac4" - - interrupts : - - interrupt-parent : optional, if needed for interrupt mapping - - reg : - - local-mac-address : 6 bytes, MAC address - - mal-device : phandle of the associated McMAL node - - mal-tx-channel : 1 cell, index of the tx channel on McMAL associated - with this EMAC - - mal-rx-channel : 1 cell, index of the rx channel on McMAL associated - with this EMAC - - cell-index : 1 cell, hardware index of the EMAC cell on a given - ASIC (typically 0x0 and 0x1 for EMAC0 and EMAC1 on - each Axon chip) - - max-frame-size : 1 cell, maximum frame size supported in bytes - - rx-fifo-size : 1 cell, Rx fifo size in bytes for 10 and 100 Mb/sec - operations. - For Axon, 2048 - - tx-fifo-size : 1 cell, Tx fifo size in bytes for 10 and 100 Mb/sec - operations. - For Axon, 2048. - - fifo-entry-size : 1 cell, size of a fifo entry (used to calculate - thresholds). - For Axon, 0x00000010 - - mal-burst-size : 1 cell, MAL burst size (used to calculate thresholds) - in bytes. - For Axon, 0x00000100 (I think ...) - - phy-mode : string, mode of operations of the PHY interface. - Supported values are: "mii", "rmii", "smii", "rgmii", - "tbi", "gmii", rtbi", "sgmii". - For Axon on CAB, it is "rgmii" - - mdio-device : 1 cell, required iff using shared MDIO registers - (440EP). phandle of the EMAC to use to drive the - MDIO lines for the PHY used by this EMAC. - - zmii-device : 1 cell, required iff connected to a ZMII. phandle of - the ZMII device node - - zmii-channel : 1 cell, required iff connected to a ZMII. Which ZMII - channel or 0xffffffff if ZMII is only used for MDIO. - - rgmii-device : 1 cell, required iff connected to an RGMII. phandle - of the RGMII device node. - For Axon: phandle of plb5/plb4/opb/rgmii - - rgmii-channel : 1 cell, required iff connected to an RGMII. Which - RGMII channel is used by this EMAC. - Fox Axon: present, whatever value is appropriate for each - EMAC, that is the content of the current (bogus) "phy-port" - property. - - Optional properties: - - phy-address : 1 cell, optional, MDIO address of the PHY. If absent, - a search is performed. - - phy-map : 1 cell, optional, bitmap of addresses to probe the PHY - for, used if phy-address is absent. bit 0x00000001 is - MDIO address 0. - For Axon it can be absent, though my current driver - doesn't handle phy-address yet so for now, keep - 0x00ffffff in it. - - rx-fifo-size-gige : 1 cell, Rx fifo size in bytes for 1000 Mb/sec - operations (if absent the value is the same as - rx-fifo-size). For Axon, either absent or 2048. - - tx-fifo-size-gige : 1 cell, Tx fifo size in bytes for 1000 Mb/sec - operations (if absent the value is the same as - tx-fifo-size). For Axon, either absent or 2048. - - tah-device : 1 cell, optional. If connected to a TAH engine for - offload, phandle of the TAH device node. - - tah-channel : 1 cell, optional. If appropriate, channel used on the - TAH engine. - - Example: - - EMAC0: ethernet@40000800 { - device_type = "network"; - compatible = "ibm,emac-440gp", "ibm,emac"; - interrupt-parent = <&UIC1>; - interrupts = <1c 4 1d 4>; - reg = <40000800 70>; - local-mac-address = [00 04 AC E3 1B 1E]; - mal-device = <&MAL0>; - mal-tx-channel = <0 1>; - mal-rx-channel = <0>; - cell-index = <0>; - max-frame-size = <5dc>; - rx-fifo-size = <1000>; - tx-fifo-size = <800>; - phy-mode = "rmii"; - phy-map = <00000001>; - zmii-device = <&ZMII0>; - zmii-channel = <0>; - }; - - ii) McMAL node - - Required properties: - - device_type : "dma-controller" - - compatible : compatible list, containing 2 entries, first is - "ibm,mcmal-CHIP" where CHIP is the host ASIC (like - emac) and the second is either "ibm,mcmal" or - "ibm,mcmal2". - For Axon, "ibm,mcmal-axon","ibm,mcmal2" - - interrupts : . - For Axon: This is _different_ from the current - firmware. We use the "delayed" interrupts for txeob - and rxeob. Thus we end up with mapping those 5 MPIC - interrupts, all level positive sensitive: 10, 11, 32, - 33, 34 (in decimal) - - dcr-reg : < DCR registers range > - - dcr-parent : if needed for dcr-reg - - num-tx-chans : 1 cell, number of Tx channels - - num-rx-chans : 1 cell, number of Rx channels - - iii) ZMII node - - Required properties: - - compatible : compatible list, containing 2 entries, first is - "ibm,zmii-CHIP" where CHIP is the host ASIC (like - EMAC) and the second is "ibm,zmii". - For Axon, there is no ZMII node. - - reg : - - iv) RGMII node - - Required properties: - - compatible : compatible list, containing 2 entries, first is - "ibm,rgmii-CHIP" where CHIP is the host ASIC (like - EMAC) and the second is "ibm,rgmii". - For Axon, "ibm,rgmii-axon","ibm,rgmii" - - reg : - - revision : as provided by the RGMII new version register if - available. - For Axon: 0x0000012a - - d) Xilinx IP cores - - The Xilinx EDK toolchain ships with a set of IP cores (devices) for use - in Xilinx Spartan and Virtex FPGAs. The devices cover the whole range - of standard device types (network, serial, etc.) and miscellaneous - devices (gpio, LCD, spi, etc). Also, since these devices are - implemented within the fpga fabric every instance of the device can be - synthesised with different options that change the behaviour. - - Each IP-core has a set of parameters which the FPGA designer can use to - control how the core is synthesized. Historically, the EDK tool would - extract the device parameters relevant to device drivers and copy them - into an 'xparameters.h' in the form of #define symbols. This tells the - device drivers how the IP cores are configured, but it requres the kernel - to be recompiled every time the FPGA bitstream is resynthesized. - - The new approach is to export the parameters into the device tree and - generate a new device tree each time the FPGA bitstream changes. The - parameters which used to be exported as #defines will now become - properties of the device node. In general, device nodes for IP-cores - will take the following form: - - (name): (generic-name)@(base-address) { - compatible = "xlnx,(ip-core-name)-(HW_VER)" - [, (list of compatible devices), ...]; - reg = <(baseaddr) (size)>; - interrupt-parent = <&interrupt-controller-phandle>; - interrupts = < ... >; - xlnx,(parameter1) = "(string-value)"; - xlnx,(parameter2) = <(int-value)>; - }; - - (generic-name): an open firmware-style name that describes the - generic class of device. Preferably, this is one word, such - as 'serial' or 'ethernet'. - (ip-core-name): the name of the ip block (given after the BEGIN - directive in system.mhs). Should be in lowercase - and all underscores '_' converted to dashes '-'. - (name): is derived from the "PARAMETER INSTANCE" value. - (parameter#): C_* parameters from system.mhs. The C_ prefix is - dropped from the parameter name, the name is converted - to lowercase and all underscore '_' characters are - converted to dashes '-'. - (baseaddr): the baseaddr parameter value (often named C_BASEADDR). - (HW_VER): from the HW_VER parameter. - (size): the address range size (often C_HIGHADDR - C_BASEADDR + 1). - - Typically, the compatible list will include the exact IP core version - followed by an older IP core version which implements the same - interface or any other device with the same interface. - - 'reg', 'interrupt-parent' and 'interrupts' are all optional properties. - - For example, the following block from system.mhs: - - BEGIN opb_uartlite - PARAMETER INSTANCE = opb_uartlite_0 - PARAMETER HW_VER = 1.00.b - PARAMETER C_BAUDRATE = 115200 - PARAMETER C_DATA_BITS = 8 - PARAMETER C_ODD_PARITY = 0 - PARAMETER C_USE_PARITY = 0 - PARAMETER C_CLK_FREQ = 50000000 - PARAMETER C_BASEADDR = 0xEC100000 - PARAMETER C_HIGHADDR = 0xEC10FFFF - BUS_INTERFACE SOPB = opb_7 - PORT OPB_Clk = CLK_50MHz - PORT Interrupt = opb_uartlite_0_Interrupt - PORT RX = opb_uartlite_0_RX - PORT TX = opb_uartlite_0_TX - PORT OPB_Rst = sys_bus_reset_0 - END - - becomes the following device tree node: - - opb_uartlite_0: serial@ec100000 { - device_type = "serial"; - compatible = "xlnx,opb-uartlite-1.00.b"; - reg = ; - interrupt-parent = <&opb_intc_0>; - interrupts = <1 0>; // got this from the opb_intc parameters - current-speed = ; // standard serial device prop - clock-frequency = ; // standard serial device prop - xlnx,data-bits = <8>; - xlnx,odd-parity = <0>; - xlnx,use-parity = <0>; - }; - - Some IP cores actually implement 2 or more logical devices. In - this case, the device should still describe the whole IP core with - a single node and add a child node for each logical device. The - ranges property can be used to translate from parent IP-core to the - registers of each device. In addition, the parent node should be - compatible with the bus type 'xlnx,compound', and should contain - #address-cells and #size-cells, as with any other bus. (Note: this - makes the assumption that both logical devices have the same bus - binding. If this is not true, then separate nodes should be used - for each logical device). The 'cell-index' property can be used to - enumerate logical devices within an IP core. For example, the - following is the system.mhs entry for the dual ps2 controller found - on the ml403 reference design. - - BEGIN opb_ps2_dual_ref - PARAMETER INSTANCE = opb_ps2_dual_ref_0 - PARAMETER HW_VER = 1.00.a - PARAMETER C_BASEADDR = 0xA9000000 - PARAMETER C_HIGHADDR = 0xA9001FFF - BUS_INTERFACE SOPB = opb_v20_0 - PORT Sys_Intr1 = ps2_1_intr - PORT Sys_Intr2 = ps2_2_intr - PORT Clkin1 = ps2_clk_rx_1 - PORT Clkin2 = ps2_clk_rx_2 - PORT Clkpd1 = ps2_clk_tx_1 - PORT Clkpd2 = ps2_clk_tx_2 - PORT Rx1 = ps2_d_rx_1 - PORT Rx2 = ps2_d_rx_2 - PORT Txpd1 = ps2_d_tx_1 - PORT Txpd2 = ps2_d_tx_2 - END - - It would result in the following device tree nodes: - - opb_ps2_dual_ref_0: opb-ps2-dual-ref@a9000000 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "xlnx,compound"; - ranges = <0 a9000000 2000>; - // If this device had extra parameters, then they would - // go here. - ps2@0 { - compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; - reg = <0 40>; - interrupt-parent = <&opb_intc_0>; - interrupts = <3 0>; - cell-index = <0>; - }; - ps2@1000 { - compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; - reg = <1000 40>; - interrupt-parent = <&opb_intc_0>; - interrupts = <3 0>; - cell-index = <0>; - }; - }; - - Also, the system.mhs file defines bus attachments from the processor - to the devices. The device tree structure should reflect the bus - attachments. Again an example; this system.mhs fragment: - - BEGIN ppc405_virtex4 - PARAMETER INSTANCE = ppc405_0 - PARAMETER HW_VER = 1.01.a - BUS_INTERFACE DPLB = plb_v34_0 - BUS_INTERFACE IPLB = plb_v34_0 - END - - BEGIN opb_intc - PARAMETER INSTANCE = opb_intc_0 - PARAMETER HW_VER = 1.00.c - PARAMETER C_BASEADDR = 0xD1000FC0 - PARAMETER C_HIGHADDR = 0xD1000FDF - BUS_INTERFACE SOPB = opb_v20_0 - END - - BEGIN opb_uart16550 - PARAMETER INSTANCE = opb_uart16550_0 - PARAMETER HW_VER = 1.00.d - PARAMETER C_BASEADDR = 0xa0000000 - PARAMETER C_HIGHADDR = 0xa0001FFF - BUS_INTERFACE SOPB = opb_v20_0 - END - - BEGIN plb_v34 - PARAMETER INSTANCE = plb_v34_0 - PARAMETER HW_VER = 1.02.a - END - - BEGIN plb_bram_if_cntlr - PARAMETER INSTANCE = plb_bram_if_cntlr_0 - PARAMETER HW_VER = 1.00.b - PARAMETER C_BASEADDR = 0xFFFF0000 - PARAMETER C_HIGHADDR = 0xFFFFFFFF - BUS_INTERFACE SPLB = plb_v34_0 - END - - BEGIN plb2opb_bridge - PARAMETER INSTANCE = plb2opb_bridge_0 - PARAMETER HW_VER = 1.01.a - PARAMETER C_RNG0_BASEADDR = 0x20000000 - PARAMETER C_RNG0_HIGHADDR = 0x3FFFFFFF - PARAMETER C_RNG1_BASEADDR = 0x60000000 - PARAMETER C_RNG1_HIGHADDR = 0x7FFFFFFF - PARAMETER C_RNG2_BASEADDR = 0x80000000 - PARAMETER C_RNG2_HIGHADDR = 0xBFFFFFFF - PARAMETER C_RNG3_BASEADDR = 0xC0000000 - PARAMETER C_RNG3_HIGHADDR = 0xDFFFFFFF - BUS_INTERFACE SPLB = plb_v34_0 - BUS_INTERFACE MOPB = opb_v20_0 - END - - Gives this device tree (some properties removed for clarity): - - plb@0 { - #address-cells = <1>; - #size-cells = <1>; - compatible = "xlnx,plb-v34-1.02.a"; - device_type = "ibm,plb"; - ranges; // 1:1 translation - - plb_bram_if_cntrl_0: bram@ffff0000 { - reg = ; - } - - opb@20000000 { - #address-cells = <1>; - #size-cells = <1>; - ranges = <20000000 20000000 20000000 - 60000000 60000000 20000000 - 80000000 80000000 40000000 - c0000000 c0000000 20000000>; - - opb_uart16550_0: serial@a0000000 { - reg = ; - }; - - opb_intc_0: interrupt-controller@d1000fc0 { - reg = ; - }; - }; - }; - - That covers the general approach to binding xilinx IP cores into the - device tree. The following are bindings for specific devices: - - i) Xilinx ML300 Framebuffer - - Simple framebuffer device from the ML300 reference design (also on the - ML403 reference design as well as others). - - Optional properties: - - resolution = : pixel resolution of framebuffer. Some - implementations use a different resolution. - Default is - - virt-resolution = : Size of framebuffer in memory. - Default is . - - rotate-display (empty) : rotate display 180 degrees. - - ii) Xilinx SystemACE - - The Xilinx SystemACE device is used to program FPGAs from an FPGA - bitstream stored on a CF card. It can also be used as a generic CF - interface device. - - Optional properties: - - 8-bit (empty) : Set this property for SystemACE in 8 bit mode - - iii) Xilinx EMAC and Xilinx TEMAC - - Xilinx Ethernet devices. In addition to general xilinx properties - listed above, nodes for these devices should include a phy-handle - property, and may include other common network device properties - like local-mac-address. - - iv) Xilinx Uartlite - - Xilinx uartlite devices are simple fixed speed serial ports. - - Required properties: - - current-speed : Baud rate of uartlite - - v) Xilinx hwicap - - Xilinx hwicap devices provide access to the configuration logic - of the FPGA through the Internal Configuration Access Port - (ICAP). The ICAP enables partial reconfiguration of the FPGA, - readback of the configuration information, and some control over - 'warm boots' of the FPGA fabric. - - Required properties: - - xlnx,family : The family of the FPGA, necessary since the - capabilities of the underlying ICAP hardware - differ between different families. May be - 'virtex2p', 'virtex4', or 'virtex5'. - - vi) Xilinx Uart 16550 - - Xilinx UART 16550 devices are very similar to the NS16550 but with - different register spacing and an offset from the base address. - - Required properties: - - clock-frequency : Frequency of the clock input - - reg-offset : A value of 3 is required - - reg-shift : A value of 2 is required - - e) USB EHCI controllers - - Required properties: - - compatible : should be "usb-ehci". - - reg : should contain at least address and length of the standard EHCI - register set for the device. Optional platform-dependent registers - (debug-port or other) can be also specified here, but only after - definition of standard EHCI registers. - - interrupts : one EHCI interrupt should be described here. - If device registers are implemented in big endian mode, the device - node should have "big-endian-regs" property. - If controller implementation operates with big endian descriptors, - "big-endian-desc" property should be specified. - If both big endian registers and descriptors are used by the controller - implementation, "big-endian" property can be specified instead of having - both "big-endian-regs" and "big-endian-desc". - - Example (Sequoia 440EPx): - ehci@e0000300 { - compatible = "ibm,usb-ehci-440epx", "usb-ehci"; - interrupt-parent = <&UIC0>; - interrupts = <1a 4>; - reg = <0 e0000300 90 0 e0000390 70>; - big-endian; - }; - - f) MDIO on GPIOs - - Currently defined compatibles: - - virtual,gpio-mdio - - MDC and MDIO lines connected to GPIO controllers are listed in the - gpios property as described in section VIII.1 in the following order: - - MDC, MDIO. - - Example: - - mdio { - compatible = "virtual,mdio-gpio"; - #address-cells = <1>; - #size-cells = <0>; - gpios = <&qe_pio_a 11 - &qe_pio_c 6>; - }; - - g) SPI (Serial Peripheral Interface) busses - - SPI busses can be described with a node for the SPI master device - and a set of child nodes for each SPI slave on the bus. For this - discussion, it is assumed that the system's SPI controller is in - SPI master mode. This binding does not describe SPI controllers - in slave mode. - - The SPI master node requires the following properties: - - #address-cells - number of cells required to define a chip select - address on the SPI bus. - - #size-cells - should be zero. - - compatible - name of SPI bus controller following generic names - recommended practice. - No other properties are required in the SPI bus node. It is assumed - that a driver for an SPI bus device will understand that it is an SPI bus. - However, the binding does not attempt to define the specific method for - assigning chip select numbers. Since SPI chip select configuration is - flexible and non-standardized, it is left out of this binding with the - assumption that board specific platform code will be used to manage - chip selects. Individual drivers can define additional properties to - support describing the chip select layout. - - SPI slave nodes must be children of the SPI master node and can - contain the following properties. - - reg - (required) chip select address of device. - - compatible - (required) name of SPI device following generic names - recommended practice - - spi-max-frequency - (required) Maximum SPI clocking speed of device in Hz - - spi-cpol - (optional) Empty property indicating device requires - inverse clock polarity (CPOL) mode - - spi-cpha - (optional) Empty property indicating device requires - shifted clock phase (CPHA) mode - - spi-cs-high - (optional) Empty property indicating device requires - chip select active high - - SPI example for an MPC5200 SPI bus: - spi@f00 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "fsl,mpc5200b-spi","fsl,mpc5200-spi"; - reg = <0xf00 0x20>; - interrupts = <2 13 0 2 14 0>; - interrupt-parent = <&mpc5200_pic>; - - ethernet-switch@0 { - compatible = "micrel,ks8995m"; - spi-max-frequency = <1000000>; - reg = <0>; - }; - - codec@1 { - compatible = "ti,tlv320aic26"; - spi-max-frequency = <100000>; - reg = <1>; - }; - }; - -VII - Marvell Discovery mv64[345]6x System Controller chips -=========================================================== - -The Marvell mv64[345]60 series of system controller chips contain -many of the peripherals needed to implement a complete computer -system. In this section, we define device tree nodes to describe -the system controller chip itself and each of the peripherals -which it contains. Compatible string values for each node are -prefixed with the string "marvell,", for Marvell Technology Group Ltd. - -1) The /system-controller node - - This node is used to represent the system-controller and must be - present when the system uses a system controller chip. The top-level - system-controller node contains information that is global to all - devices within the system controller chip. The node name begins - with "system-controller" followed by the unit address, which is - the base address of the memory-mapped register set for the system - controller chip. - - Required properties: - - - ranges : Describes the translation of system controller addresses - for memory mapped registers. - - clock-frequency: Contains the main clock frequency for the system - controller chip. - - reg : This property defines the address and size of the - memory-mapped registers contained within the system controller - chip. The address specified in the "reg" property should match - the unit address of the system-controller node. - - #address-cells : Address representation for system controller - devices. This field represents the number of cells needed to - represent the address of the memory-mapped registers of devices - within the system controller chip. - - #size-cells : Size representation for for the memory-mapped - registers within the system controller chip. - - #interrupt-cells : Defines the width of cells used to represent - interrupts. - - Optional properties: - - - model : The specific model of the system controller chip. Such - as, "mv64360", "mv64460", or "mv64560". - - compatible : A string identifying the compatibility identifiers - of the system controller chip. - - The system-controller node contains child nodes for each system - controller device that the platform uses. Nodes should not be created - for devices which exist on the system controller chip but are not used - - Example Marvell Discovery mv64360 system-controller node: - - system-controller@f1000000 { /* Marvell Discovery mv64360 */ - #address-cells = <1>; - #size-cells = <1>; - model = "mv64360"; /* Default */ - compatible = "marvell,mv64360"; - clock-frequency = <133333333>; - reg = <0xf1000000 0x10000>; - virtual-reg = <0xf1000000>; - ranges = <0x88000000 0x88000000 0x1000000 /* PCI 0 I/O Space */ - 0x80000000 0x80000000 0x8000000 /* PCI 0 MEM Space */ - 0xa0000000 0xa0000000 0x4000000 /* User FLASH */ - 0x00000000 0xf1000000 0x0010000 /* Bridge's regs */ - 0xf2000000 0xf2000000 0x0040000>;/* Integrated SRAM */ - - [ child node definitions... ] - } - -2) Child nodes of /system-controller - - a) Marvell Discovery MDIO bus - - The MDIO is a bus to which the PHY devices are connected. For each - device that exists on this bus, a child node should be created. See - the definition of the PHY node below for an example of how to define - a PHY. - - Required properties: - - #address-cells : Should be <1> - - #size-cells : Should be <0> - - device_type : Should be "mdio" - - compatible : Should be "marvell,mv64360-mdio" - - Example: - - mdio { - #address-cells = <1>; - #size-cells = <0>; - device_type = "mdio"; - compatible = "marvell,mv64360-mdio"; - - ethernet-phy@0 { - ...... - }; - }; - - - b) Marvell Discovery ethernet controller - - The Discover ethernet controller is described with two levels - of nodes. The first level describes an ethernet silicon block - and the second level describes up to 3 ethernet nodes within - that block. The reason for the multiple levels is that the - registers for the node are interleaved within a single set - of registers. The "ethernet-block" level describes the - shared register set, and the "ethernet" nodes describe ethernet - port-specific properties. - - Ethernet block node - - Required properties: - - #address-cells : <1> - - #size-cells : <0> - - compatible : "marvell,mv64360-eth-block" - - reg : Offset and length of the register set for this block - - Example Discovery Ethernet block node: - ethernet-block@2000 { - #address-cells = <1>; - #size-cells = <0>; - compatible = "marvell,mv64360-eth-block"; - reg = <0x2000 0x2000>; - ethernet@0 { - ....... - }; - }; - - Ethernet port node - - Required properties: - - device_type : Should be "network". - - compatible : Should be "marvell,mv64360-eth". - - reg : Should be <0>, <1>, or <2>, according to which registers - within the silicon block the device uses. - - interrupts : where a is the interrupt number for the port. - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - phy : the phandle for the PHY connected to this ethernet - controller. - - local-mac-address : 6 bytes, MAC address - - Example Discovery Ethernet port node: - ethernet@0 { - device_type = "network"; - compatible = "marvell,mv64360-eth"; - reg = <0>; - interrupts = <32>; - interrupt-parent = <&PIC>; - phy = <&PHY0>; - local-mac-address = [ 00 00 00 00 00 00 ]; - }; - - - - c) Marvell Discovery PHY nodes - - Required properties: - - device_type : Should be "ethernet-phy" - - interrupts : where a is the interrupt number for this phy. - - interrupt-parent : the phandle for the interrupt controller that - services interrupts for this device. - - reg : The ID number for the phy, usually a small integer - - Example Discovery PHY node: - ethernet-phy@1 { - device_type = "ethernet-phy"; - compatible = "broadcom,bcm5421"; - interrupts = <76>; /* GPP 12 */ - interrupt-parent = <&PIC>; - reg = <1>; - }; - - - d) Marvell Discovery SDMA nodes - - Represent DMA hardware associated with the MPSC (multiprotocol - serial controllers). - - Required properties: - - compatible : "marvell,mv64360-sdma" - - reg : Offset and length of the register set for this device - - interrupts : where a is the interrupt number for the DMA - device. - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery SDMA node: - sdma@4000 { - compatible = "marvell,mv64360-sdma"; - reg = <0x4000 0xc18>; - virtual-reg = <0xf1004000>; - interrupts = <36>; - interrupt-parent = <&PIC>; - }; - - - e) Marvell Discovery BRG nodes - - Represent baud rate generator hardware associated with the MPSC - (multiprotocol serial controllers). - - Required properties: - - compatible : "marvell,mv64360-brg" - - reg : Offset and length of the register set for this device - - clock-src : A value from 0 to 15 which selects the clock - source for the baud rate generator. This value corresponds - to the CLKS value in the BRGx configuration register. See - the mv64x60 User's Manual. - - clock-frequence : The frequency (in Hz) of the baud rate - generator's input clock. - - current-speed : The current speed setting (presumably by - firmware) of the baud rate generator. - - Example Discovery BRG node: - brg@b200 { - compatible = "marvell,mv64360-brg"; - reg = <0xb200 0x8>; - clock-src = <8>; - clock-frequency = <133333333>; - current-speed = <9600>; - }; - - - f) Marvell Discovery CUNIT nodes - - Represent the Serial Communications Unit device hardware. - - Required properties: - - reg : Offset and length of the register set for this device - - Example Discovery CUNIT node: - cunit@f200 { - reg = <0xf200 0x200>; - }; - - - g) Marvell Discovery MPSCROUTING nodes - - Represent the Discovery's MPSC routing hardware - - Required properties: - - reg : Offset and length of the register set for this device - - Example Discovery CUNIT node: - mpscrouting@b500 { - reg = <0xb400 0xc>; - }; - - - h) Marvell Discovery MPSCINTR nodes - - Represent the Discovery's MPSC DMA interrupt hardware registers - (SDMA cause and mask registers). - - Required properties: - - reg : Offset and length of the register set for this device - - Example Discovery MPSCINTR node: - mpsintr@b800 { - reg = <0xb800 0x100>; - }; - - - i) Marvell Discovery MPSC nodes - - Represent the Discovery's MPSC (Multiprotocol Serial Controller) - serial port. - - Required properties: - - device_type : "serial" - - compatible : "marvell,mv64360-mpsc" - - reg : Offset and length of the register set for this device - - sdma : the phandle for the SDMA node used by this port - - brg : the phandle for the BRG node used by this port - - cunit : the phandle for the CUNIT node used by this port - - mpscrouting : the phandle for the MPSCROUTING node used by this port - - mpscintr : the phandle for the MPSCINTR node used by this port - - cell-index : the hardware index of this cell in the MPSC core - - max_idle : value needed for MPSC CHR3 (Maximum Frame Length) - register - - interrupts : where a is the interrupt number for the MPSC. - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery MPSCINTR node: - mpsc@8000 { - device_type = "serial"; - compatible = "marvell,mv64360-mpsc"; - reg = <0x8000 0x38>; - virtual-reg = <0xf1008000>; - sdma = <&SDMA0>; - brg = <&BRG0>; - cunit = <&CUNIT>; - mpscrouting = <&MPSCROUTING>; - mpscintr = <&MPSCINTR>; - cell-index = <0>; - max_idle = <40>; - interrupts = <40>; - interrupt-parent = <&PIC>; - }; - - - j) Marvell Discovery Watch Dog Timer nodes - - Represent the Discovery's watchdog timer hardware - - Required properties: - - compatible : "marvell,mv64360-wdt" - - reg : Offset and length of the register set for this device - - Example Discovery Watch Dog Timer node: - wdt@b410 { - compatible = "marvell,mv64360-wdt"; - reg = <0xb410 0x8>; - }; - - - k) Marvell Discovery I2C nodes - - Represent the Discovery's I2C hardware - - Required properties: - - device_type : "i2c" - - compatible : "marvell,mv64360-i2c" - - reg : Offset and length of the register set for this device - - interrupts : where a is the interrupt number for the I2C. - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery I2C node: - compatible = "marvell,mv64360-i2c"; - reg = <0xc000 0x20>; - virtual-reg = <0xf100c000>; - interrupts = <37>; - interrupt-parent = <&PIC>; - }; - - - l) Marvell Discovery PIC (Programmable Interrupt Controller) nodes - - Represent the Discovery's PIC hardware - - Required properties: - - #interrupt-cells : <1> - - #address-cells : <0> - - compatible : "marvell,mv64360-pic" - - reg : Offset and length of the register set for this device - - interrupt-controller - - Example Discovery PIC node: - pic { - #interrupt-cells = <1>; - #address-cells = <0>; - compatible = "marvell,mv64360-pic"; - reg = <0x0 0x88>; - interrupt-controller; - }; - - - m) Marvell Discovery MPP (Multipurpose Pins) multiplexing nodes - - Represent the Discovery's MPP hardware - - Required properties: - - compatible : "marvell,mv64360-mpp" - - reg : Offset and length of the register set for this device - - Example Discovery MPP node: - mpp@f000 { - compatible = "marvell,mv64360-mpp"; - reg = <0xf000 0x10>; - }; - - - n) Marvell Discovery GPP (General Purpose Pins) nodes - - Represent the Discovery's GPP hardware - - Required properties: - - compatible : "marvell,mv64360-gpp" - - reg : Offset and length of the register set for this device - - Example Discovery GPP node: - gpp@f000 { - compatible = "marvell,mv64360-gpp"; - reg = <0xf100 0x20>; - }; - - - o) Marvell Discovery PCI host bridge node - - Represents the Discovery's PCI host bridge device. The properties - for this node conform to Rev 2.1 of the PCI Bus Binding to IEEE - 1275-1994. A typical value for the compatible property is - "marvell,mv64360-pci". - - Example Discovery PCI host bridge node - pci@80000000 { - #address-cells = <3>; - #size-cells = <2>; - #interrupt-cells = <1>; - device_type = "pci"; - compatible = "marvell,mv64360-pci"; - reg = <0xcf8 0x8>; - ranges = <0x01000000 0x0 0x0 - 0x88000000 0x0 0x01000000 - 0x02000000 0x0 0x80000000 - 0x80000000 0x0 0x08000000>; - bus-range = <0 255>; - clock-frequency = <66000000>; - interrupt-parent = <&PIC>; - interrupt-map-mask = <0xf800 0x0 0x0 0x7>; - interrupt-map = < - /* IDSEL 0x0a */ - 0x5000 0 0 1 &PIC 80 - 0x5000 0 0 2 &PIC 81 - 0x5000 0 0 3 &PIC 91 - 0x5000 0 0 4 &PIC 93 - - /* IDSEL 0x0b */ - 0x5800 0 0 1 &PIC 91 - 0x5800 0 0 2 &PIC 93 - 0x5800 0 0 3 &PIC 80 - 0x5800 0 0 4 &PIC 81 - - /* IDSEL 0x0c */ - 0x6000 0 0 1 &PIC 91 - 0x6000 0 0 2 &PIC 93 - 0x6000 0 0 3 &PIC 80 - 0x6000 0 0 4 &PIC 81 - - /* IDSEL 0x0d */ - 0x6800 0 0 1 &PIC 93 - 0x6800 0 0 2 &PIC 80 - 0x6800 0 0 3 &PIC 81 - 0x6800 0 0 4 &PIC 91 - >; - }; - - - p) Marvell Discovery CPU Error nodes - - Represent the Discovery's CPU error handler device. - - Required properties: - - compatible : "marvell,mv64360-cpu-error" - - reg : Offset and length of the register set for this device - - interrupts : the interrupt number for this device - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery CPU Error node: - cpu-error@0070 { - compatible = "marvell,mv64360-cpu-error"; - reg = <0x70 0x10 0x128 0x28>; - interrupts = <3>; - interrupt-parent = <&PIC>; - }; - - - q) Marvell Discovery SRAM Controller nodes - - Represent the Discovery's SRAM controller device. - - Required properties: - - compatible : "marvell,mv64360-sram-ctrl" - - reg : Offset and length of the register set for this device - - interrupts : the interrupt number for this device - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery SRAM Controller node: - sram-ctrl@0380 { - compatible = "marvell,mv64360-sram-ctrl"; - reg = <0x380 0x80>; - interrupts = <13>; - interrupt-parent = <&PIC>; - }; - - - r) Marvell Discovery PCI Error Handler nodes - - Represent the Discovery's PCI error handler device. - - Required properties: - - compatible : "marvell,mv64360-pci-error" - - reg : Offset and length of the register set for this device - - interrupts : the interrupt number for this device - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery PCI Error Handler node: - pci-error@1d40 { - compatible = "marvell,mv64360-pci-error"; - reg = <0x1d40 0x40 0xc28 0x4>; - interrupts = <12>; - interrupt-parent = <&PIC>; - }; - - - s) Marvell Discovery Memory Controller nodes - - Represent the Discovery's memory controller device. - - Required properties: - - compatible : "marvell,mv64360-mem-ctrl" - - reg : Offset and length of the register set for this device - - interrupts : the interrupt number for this device - - interrupt-parent : the phandle for the interrupt controller - that services interrupts for this device. - - Example Discovery Memory Controller node: - mem-ctrl@1400 { - compatible = "marvell,mv64360-mem-ctrl"; - reg = <0x1400 0x60>; - interrupts = <17>; - interrupt-parent = <&PIC>; - }; - - -VIII - Specifying interrupt information for devices +VII - Specifying interrupt information for devices =================================================== The device tree represents the busses and devices of a hardware @@ -2439,56 +1324,7 @@ encodings listed below: 2 = high to low edge sensitive type enabled 3 = low to high edge sensitive type enabled -IX - Specifying GPIO information for devices -============================================ - -1) gpios property ------------------ - -Nodes that makes use of GPIOs should define them using `gpios' property, -format of which is: <&gpio-controller1-phandle gpio1-specifier - &gpio-controller2-phandle gpio2-specifier - 0 /* holes are permitted, means no GPIO 3 */ - &gpio-controller4-phandle gpio4-specifier - ...>; - -Note that gpio-specifier length is controller dependent. - -gpio-specifier may encode: bank, pin position inside the bank, -whether pin is open-drain and whether pin is logically inverted. - -Example of the node using GPIOs: - - node { - gpios = <&qe_pio_e 18 0>; - }; - -In this example gpio-specifier is "18 0" and encodes GPIO pin number, -and empty GPIO flags as accepted by the "qe_pio_e" gpio-controller. - -2) gpio-controller nodes ------------------------- - -Every GPIO controller node must have #gpio-cells property defined, -this information will be used to translate gpio-specifiers. - -Example of two SOC GPIO banks defined as gpio-controller nodes: - - qe_pio_a: gpio-controller@1400 { - #gpio-cells = <2>; - compatible = "fsl,qe-pario-bank-a", "fsl,qe-pario-bank"; - reg = <0x1400 0x18>; - gpio-controller; - }; - - qe_pio_e: gpio-controller@1460 { - #gpio-cells = <2>; - compatible = "fsl,qe-pario-bank-e", "fsl,qe-pario-bank"; - reg = <0x1460 0x18>; - gpio-controller; - }; - -X - Specifying Device Power Management Information (sleep property) +VIII - Specifying Device Power Management Information (sleep property) =================================================================== Devices on SOCs often have mechanisms for placing devices into low-power diff --git a/Documentation/powerpc/dts-bindings/4xx/emac.txt b/Documentation/powerpc/dts-bindings/4xx/emac.txt new file mode 100644 index 000000000000..2161334a7ca5 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/4xx/emac.txt @@ -0,0 +1,148 @@ + 4xx/Axon EMAC ethernet nodes + + The EMAC ethernet controller in IBM and AMCC 4xx chips, and also + the Axon bridge. To operate this needs to interact with a ths + special McMAL DMA controller, and sometimes an RGMII or ZMII + interface. In addition to the nodes and properties described + below, the node for the OPB bus on which the EMAC sits must have a + correct clock-frequency property. + + i) The EMAC node itself + + Required properties: + - device_type : "network" + + - compatible : compatible list, contains 2 entries, first is + "ibm,emac-CHIP" where CHIP is the host ASIC (440gx, + 405gp, Axon) and second is either "ibm,emac" or + "ibm,emac4". For Axon, thus, we have: "ibm,emac-axon", + "ibm,emac4" + - interrupts : + - interrupt-parent : optional, if needed for interrupt mapping + - reg : + - local-mac-address : 6 bytes, MAC address + - mal-device : phandle of the associated McMAL node + - mal-tx-channel : 1 cell, index of the tx channel on McMAL associated + with this EMAC + - mal-rx-channel : 1 cell, index of the rx channel on McMAL associated + with this EMAC + - cell-index : 1 cell, hardware index of the EMAC cell on a given + ASIC (typically 0x0 and 0x1 for EMAC0 and EMAC1 on + each Axon chip) + - max-frame-size : 1 cell, maximum frame size supported in bytes + - rx-fifo-size : 1 cell, Rx fifo size in bytes for 10 and 100 Mb/sec + operations. + For Axon, 2048 + - tx-fifo-size : 1 cell, Tx fifo size in bytes for 10 and 100 Mb/sec + operations. + For Axon, 2048. + - fifo-entry-size : 1 cell, size of a fifo entry (used to calculate + thresholds). + For Axon, 0x00000010 + - mal-burst-size : 1 cell, MAL burst size (used to calculate thresholds) + in bytes. + For Axon, 0x00000100 (I think ...) + - phy-mode : string, mode of operations of the PHY interface. + Supported values are: "mii", "rmii", "smii", "rgmii", + "tbi", "gmii", rtbi", "sgmii". + For Axon on CAB, it is "rgmii" + - mdio-device : 1 cell, required iff using shared MDIO registers + (440EP). phandle of the EMAC to use to drive the + MDIO lines for the PHY used by this EMAC. + - zmii-device : 1 cell, required iff connected to a ZMII. phandle of + the ZMII device node + - zmii-channel : 1 cell, required iff connected to a ZMII. Which ZMII + channel or 0xffffffff if ZMII is only used for MDIO. + - rgmii-device : 1 cell, required iff connected to an RGMII. phandle + of the RGMII device node. + For Axon: phandle of plb5/plb4/opb/rgmii + - rgmii-channel : 1 cell, required iff connected to an RGMII. Which + RGMII channel is used by this EMAC. + Fox Axon: present, whatever value is appropriate for each + EMAC, that is the content of the current (bogus) "phy-port" + property. + + Optional properties: + - phy-address : 1 cell, optional, MDIO address of the PHY. If absent, + a search is performed. + - phy-map : 1 cell, optional, bitmap of addresses to probe the PHY + for, used if phy-address is absent. bit 0x00000001 is + MDIO address 0. + For Axon it can be absent, though my current driver + doesn't handle phy-address yet so for now, keep + 0x00ffffff in it. + - rx-fifo-size-gige : 1 cell, Rx fifo size in bytes for 1000 Mb/sec + operations (if absent the value is the same as + rx-fifo-size). For Axon, either absent or 2048. + - tx-fifo-size-gige : 1 cell, Tx fifo size in bytes for 1000 Mb/sec + operations (if absent the value is the same as + tx-fifo-size). For Axon, either absent or 2048. + - tah-device : 1 cell, optional. If connected to a TAH engine for + offload, phandle of the TAH device node. + - tah-channel : 1 cell, optional. If appropriate, channel used on the + TAH engine. + + Example: + + EMAC0: ethernet@40000800 { + device_type = "network"; + compatible = "ibm,emac-440gp", "ibm,emac"; + interrupt-parent = <&UIC1>; + interrupts = <1c 4 1d 4>; + reg = <40000800 70>; + local-mac-address = [00 04 AC E3 1B 1E]; + mal-device = <&MAL0>; + mal-tx-channel = <0 1>; + mal-rx-channel = <0>; + cell-index = <0>; + max-frame-size = <5dc>; + rx-fifo-size = <1000>; + tx-fifo-size = <800>; + phy-mode = "rmii"; + phy-map = <00000001>; + zmii-device = <&ZMII0>; + zmii-channel = <0>; + }; + + ii) McMAL node + + Required properties: + - device_type : "dma-controller" + - compatible : compatible list, containing 2 entries, first is + "ibm,mcmal-CHIP" where CHIP is the host ASIC (like + emac) and the second is either "ibm,mcmal" or + "ibm,mcmal2". + For Axon, "ibm,mcmal-axon","ibm,mcmal2" + - interrupts : . + For Axon: This is _different_ from the current + firmware. We use the "delayed" interrupts for txeob + and rxeob. Thus we end up with mapping those 5 MPIC + interrupts, all level positive sensitive: 10, 11, 32, + 33, 34 (in decimal) + - dcr-reg : < DCR registers range > + - dcr-parent : if needed for dcr-reg + - num-tx-chans : 1 cell, number of Tx channels + - num-rx-chans : 1 cell, number of Rx channels + + iii) ZMII node + + Required properties: + - compatible : compatible list, containing 2 entries, first is + "ibm,zmii-CHIP" where CHIP is the host ASIC (like + EMAC) and the second is "ibm,zmii". + For Axon, there is no ZMII node. + - reg : + + iv) RGMII node + + Required properties: + - compatible : compatible list, containing 2 entries, first is + "ibm,rgmii-CHIP" where CHIP is the host ASIC (like + EMAC) and the second is "ibm,rgmii". + For Axon, "ibm,rgmii-axon","ibm,rgmii" + - reg : + - revision : as provided by the RGMII new version register if + available. + For Axon: 0x0000012a + diff --git a/Documentation/powerpc/dts-bindings/gpio/gpio.txt b/Documentation/powerpc/dts-bindings/gpio/gpio.txt new file mode 100644 index 000000000000..edaa84d288a1 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/gpio/gpio.txt @@ -0,0 +1,50 @@ +Specifying GPIO information for devices +============================================ + +1) gpios property +----------------- + +Nodes that makes use of GPIOs should define them using `gpios' property, +format of which is: <&gpio-controller1-phandle gpio1-specifier + &gpio-controller2-phandle gpio2-specifier + 0 /* holes are permitted, means no GPIO 3 */ + &gpio-controller4-phandle gpio4-specifier + ...>; + +Note that gpio-specifier length is controller dependent. + +gpio-specifier may encode: bank, pin position inside the bank, +whether pin is open-drain and whether pin is logically inverted. + +Example of the node using GPIOs: + + node { + gpios = <&qe_pio_e 18 0>; + }; + +In this example gpio-specifier is "18 0" and encodes GPIO pin number, +and empty GPIO flags as accepted by the "qe_pio_e" gpio-controller. + +2) gpio-controller nodes +------------------------ + +Every GPIO controller node must have #gpio-cells property defined, +this information will be used to translate gpio-specifiers. + +Example of two SOC GPIO banks defined as gpio-controller nodes: + + qe_pio_a: gpio-controller@1400 { + #gpio-cells = <2>; + compatible = "fsl,qe-pario-bank-a", "fsl,qe-pario-bank"; + reg = <0x1400 0x18>; + gpio-controller; + }; + + qe_pio_e: gpio-controller@1460 { + #gpio-cells = <2>; + compatible = "fsl,qe-pario-bank-e", "fsl,qe-pario-bank"; + reg = <0x1460 0x18>; + gpio-controller; + }; + + diff --git a/Documentation/powerpc/dts-bindings/gpio/mdio.txt b/Documentation/powerpc/dts-bindings/gpio/mdio.txt new file mode 100644 index 000000000000..bc9549529014 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/gpio/mdio.txt @@ -0,0 +1,19 @@ +MDIO on GPIOs + +Currently defined compatibles: +- virtual,gpio-mdio + +MDC and MDIO lines connected to GPIO controllers are listed in the +gpios property as described in section VIII.1 in the following order: + +MDC, MDIO. + +Example: + +mdio { + compatible = "virtual,mdio-gpio"; + #address-cells = <1>; + #size-cells = <0>; + gpios = <&qe_pio_a 11 + &qe_pio_c 6>; +}; diff --git a/Documentation/powerpc/dts-bindings/marvell.txt b/Documentation/powerpc/dts-bindings/marvell.txt new file mode 100644 index 000000000000..3708a2fd4747 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/marvell.txt @@ -0,0 +1,521 @@ +Marvell Discovery mv64[345]6x System Controller chips +=========================================================== + +The Marvell mv64[345]60 series of system controller chips contain +many of the peripherals needed to implement a complete computer +system. In this section, we define device tree nodes to describe +the system controller chip itself and each of the peripherals +which it contains. Compatible string values for each node are +prefixed with the string "marvell,", for Marvell Technology Group Ltd. + +1) The /system-controller node + + This node is used to represent the system-controller and must be + present when the system uses a system controller chip. The top-level + system-controller node contains information that is global to all + devices within the system controller chip. The node name begins + with "system-controller" followed by the unit address, which is + the base address of the memory-mapped register set for the system + controller chip. + + Required properties: + + - ranges : Describes the translation of system controller addresses + for memory mapped registers. + - clock-frequency: Contains the main clock frequency for the system + controller chip. + - reg : This property defines the address and size of the + memory-mapped registers contained within the system controller + chip. The address specified in the "reg" property should match + the unit address of the system-controller node. + - #address-cells : Address representation for system controller + devices. This field represents the number of cells needed to + represent the address of the memory-mapped registers of devices + within the system controller chip. + - #size-cells : Size representation for for the memory-mapped + registers within the system controller chip. + - #interrupt-cells : Defines the width of cells used to represent + interrupts. + + Optional properties: + + - model : The specific model of the system controller chip. Such + as, "mv64360", "mv64460", or "mv64560". + - compatible : A string identifying the compatibility identifiers + of the system controller chip. + + The system-controller node contains child nodes for each system + controller device that the platform uses. Nodes should not be created + for devices which exist on the system controller chip but are not used + + Example Marvell Discovery mv64360 system-controller node: + + system-controller@f1000000 { /* Marvell Discovery mv64360 */ + #address-cells = <1>; + #size-cells = <1>; + model = "mv64360"; /* Default */ + compatible = "marvell,mv64360"; + clock-frequency = <133333333>; + reg = <0xf1000000 0x10000>; + virtual-reg = <0xf1000000>; + ranges = <0x88000000 0x88000000 0x1000000 /* PCI 0 I/O Space */ + 0x80000000 0x80000000 0x8000000 /* PCI 0 MEM Space */ + 0xa0000000 0xa0000000 0x4000000 /* User FLASH */ + 0x00000000 0xf1000000 0x0010000 /* Bridge's regs */ + 0xf2000000 0xf2000000 0x0040000>;/* Integrated SRAM */ + + [ child node definitions... ] + } + +2) Child nodes of /system-controller + + a) Marvell Discovery MDIO bus + + The MDIO is a bus to which the PHY devices are connected. For each + device that exists on this bus, a child node should be created. See + the definition of the PHY node below for an example of how to define + a PHY. + + Required properties: + - #address-cells : Should be <1> + - #size-cells : Should be <0> + - device_type : Should be "mdio" + - compatible : Should be "marvell,mv64360-mdio" + + Example: + + mdio { + #address-cells = <1>; + #size-cells = <0>; + device_type = "mdio"; + compatible = "marvell,mv64360-mdio"; + + ethernet-phy@0 { + ...... + }; + }; + + + b) Marvell Discovery ethernet controller + + The Discover ethernet controller is described with two levels + of nodes. The first level describes an ethernet silicon block + and the second level describes up to 3 ethernet nodes within + that block. The reason for the multiple levels is that the + registers for the node are interleaved within a single set + of registers. The "ethernet-block" level describes the + shared register set, and the "ethernet" nodes describe ethernet + port-specific properties. + + Ethernet block node + + Required properties: + - #address-cells : <1> + - #size-cells : <0> + - compatible : "marvell,mv64360-eth-block" + - reg : Offset and length of the register set for this block + + Example Discovery Ethernet block node: + ethernet-block@2000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "marvell,mv64360-eth-block"; + reg = <0x2000 0x2000>; + ethernet@0 { + ....... + }; + }; + + Ethernet port node + + Required properties: + - device_type : Should be "network". + - compatible : Should be "marvell,mv64360-eth". + - reg : Should be <0>, <1>, or <2>, according to which registers + within the silicon block the device uses. + - interrupts : where a is the interrupt number for the port. + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + - phy : the phandle for the PHY connected to this ethernet + controller. + - local-mac-address : 6 bytes, MAC address + + Example Discovery Ethernet port node: + ethernet@0 { + device_type = "network"; + compatible = "marvell,mv64360-eth"; + reg = <0>; + interrupts = <32>; + interrupt-parent = <&PIC>; + phy = <&PHY0>; + local-mac-address = [ 00 00 00 00 00 00 ]; + }; + + + + c) Marvell Discovery PHY nodes + + Required properties: + - device_type : Should be "ethernet-phy" + - interrupts : where a is the interrupt number for this phy. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + - reg : The ID number for the phy, usually a small integer + + Example Discovery PHY node: + ethernet-phy@1 { + device_type = "ethernet-phy"; + compatible = "broadcom,bcm5421"; + interrupts = <76>; /* GPP 12 */ + interrupt-parent = <&PIC>; + reg = <1>; + }; + + + d) Marvell Discovery SDMA nodes + + Represent DMA hardware associated with the MPSC (multiprotocol + serial controllers). + + Required properties: + - compatible : "marvell,mv64360-sdma" + - reg : Offset and length of the register set for this device + - interrupts : where a is the interrupt number for the DMA + device. + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery SDMA node: + sdma@4000 { + compatible = "marvell,mv64360-sdma"; + reg = <0x4000 0xc18>; + virtual-reg = <0xf1004000>; + interrupts = <36>; + interrupt-parent = <&PIC>; + }; + + + e) Marvell Discovery BRG nodes + + Represent baud rate generator hardware associated with the MPSC + (multiprotocol serial controllers). + + Required properties: + - compatible : "marvell,mv64360-brg" + - reg : Offset and length of the register set for this device + - clock-src : A value from 0 to 15 which selects the clock + source for the baud rate generator. This value corresponds + to the CLKS value in the BRGx configuration register. See + the mv64x60 User's Manual. + - clock-frequence : The frequency (in Hz) of the baud rate + generator's input clock. + - current-speed : The current speed setting (presumably by + firmware) of the baud rate generator. + + Example Discovery BRG node: + brg@b200 { + compatible = "marvell,mv64360-brg"; + reg = <0xb200 0x8>; + clock-src = <8>; + clock-frequency = <133333333>; + current-speed = <9600>; + }; + + + f) Marvell Discovery CUNIT nodes + + Represent the Serial Communications Unit device hardware. + + Required properties: + - reg : Offset and length of the register set for this device + + Example Discovery CUNIT node: + cunit@f200 { + reg = <0xf200 0x200>; + }; + + + g) Marvell Discovery MPSCROUTING nodes + + Represent the Discovery's MPSC routing hardware + + Required properties: + - reg : Offset and length of the register set for this device + + Example Discovery CUNIT node: + mpscrouting@b500 { + reg = <0xb400 0xc>; + }; + + + h) Marvell Discovery MPSCINTR nodes + + Represent the Discovery's MPSC DMA interrupt hardware registers + (SDMA cause and mask registers). + + Required properties: + - reg : Offset and length of the register set for this device + + Example Discovery MPSCINTR node: + mpsintr@b800 { + reg = <0xb800 0x100>; + }; + + + i) Marvell Discovery MPSC nodes + + Represent the Discovery's MPSC (Multiprotocol Serial Controller) + serial port. + + Required properties: + - device_type : "serial" + - compatible : "marvell,mv64360-mpsc" + - reg : Offset and length of the register set for this device + - sdma : the phandle for the SDMA node used by this port + - brg : the phandle for the BRG node used by this port + - cunit : the phandle for the CUNIT node used by this port + - mpscrouting : the phandle for the MPSCROUTING node used by this port + - mpscintr : the phandle for the MPSCINTR node used by this port + - cell-index : the hardware index of this cell in the MPSC core + - max_idle : value needed for MPSC CHR3 (Maximum Frame Length) + register + - interrupts : where a is the interrupt number for the MPSC. + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery MPSCINTR node: + mpsc@8000 { + device_type = "serial"; + compatible = "marvell,mv64360-mpsc"; + reg = <0x8000 0x38>; + virtual-reg = <0xf1008000>; + sdma = <&SDMA0>; + brg = <&BRG0>; + cunit = <&CUNIT>; + mpscrouting = <&MPSCROUTING>; + mpscintr = <&MPSCINTR>; + cell-index = <0>; + max_idle = <40>; + interrupts = <40>; + interrupt-parent = <&PIC>; + }; + + + j) Marvell Discovery Watch Dog Timer nodes + + Represent the Discovery's watchdog timer hardware + + Required properties: + - compatible : "marvell,mv64360-wdt" + - reg : Offset and length of the register set for this device + + Example Discovery Watch Dog Timer node: + wdt@b410 { + compatible = "marvell,mv64360-wdt"; + reg = <0xb410 0x8>; + }; + + + k) Marvell Discovery I2C nodes + + Represent the Discovery's I2C hardware + + Required properties: + - device_type : "i2c" + - compatible : "marvell,mv64360-i2c" + - reg : Offset and length of the register set for this device + - interrupts : where a is the interrupt number for the I2C. + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery I2C node: + compatible = "marvell,mv64360-i2c"; + reg = <0xc000 0x20>; + virtual-reg = <0xf100c000>; + interrupts = <37>; + interrupt-parent = <&PIC>; + }; + + + l) Marvell Discovery PIC (Programmable Interrupt Controller) nodes + + Represent the Discovery's PIC hardware + + Required properties: + - #interrupt-cells : <1> + - #address-cells : <0> + - compatible : "marvell,mv64360-pic" + - reg : Offset and length of the register set for this device + - interrupt-controller + + Example Discovery PIC node: + pic { + #interrupt-cells = <1>; + #address-cells = <0>; + compatible = "marvell,mv64360-pic"; + reg = <0x0 0x88>; + interrupt-controller; + }; + + + m) Marvell Discovery MPP (Multipurpose Pins) multiplexing nodes + + Represent the Discovery's MPP hardware + + Required properties: + - compatible : "marvell,mv64360-mpp" + - reg : Offset and length of the register set for this device + + Example Discovery MPP node: + mpp@f000 { + compatible = "marvell,mv64360-mpp"; + reg = <0xf000 0x10>; + }; + + + n) Marvell Discovery GPP (General Purpose Pins) nodes + + Represent the Discovery's GPP hardware + + Required properties: + - compatible : "marvell,mv64360-gpp" + - reg : Offset and length of the register set for this device + + Example Discovery GPP node: + gpp@f000 { + compatible = "marvell,mv64360-gpp"; + reg = <0xf100 0x20>; + }; + + + o) Marvell Discovery PCI host bridge node + + Represents the Discovery's PCI host bridge device. The properties + for this node conform to Rev 2.1 of the PCI Bus Binding to IEEE + 1275-1994. A typical value for the compatible property is + "marvell,mv64360-pci". + + Example Discovery PCI host bridge node + pci@80000000 { + #address-cells = <3>; + #size-cells = <2>; + #interrupt-cells = <1>; + device_type = "pci"; + compatible = "marvell,mv64360-pci"; + reg = <0xcf8 0x8>; + ranges = <0x01000000 0x0 0x0 + 0x88000000 0x0 0x01000000 + 0x02000000 0x0 0x80000000 + 0x80000000 0x0 0x08000000>; + bus-range = <0 255>; + clock-frequency = <66000000>; + interrupt-parent = <&PIC>; + interrupt-map-mask = <0xf800 0x0 0x0 0x7>; + interrupt-map = < + /* IDSEL 0x0a */ + 0x5000 0 0 1 &PIC 80 + 0x5000 0 0 2 &PIC 81 + 0x5000 0 0 3 &PIC 91 + 0x5000 0 0 4 &PIC 93 + + /* IDSEL 0x0b */ + 0x5800 0 0 1 &PIC 91 + 0x5800 0 0 2 &PIC 93 + 0x5800 0 0 3 &PIC 80 + 0x5800 0 0 4 &PIC 81 + + /* IDSEL 0x0c */ + 0x6000 0 0 1 &PIC 91 + 0x6000 0 0 2 &PIC 93 + 0x6000 0 0 3 &PIC 80 + 0x6000 0 0 4 &PIC 81 + + /* IDSEL 0x0d */ + 0x6800 0 0 1 &PIC 93 + 0x6800 0 0 2 &PIC 80 + 0x6800 0 0 3 &PIC 81 + 0x6800 0 0 4 &PIC 91 + >; + }; + + + p) Marvell Discovery CPU Error nodes + + Represent the Discovery's CPU error handler device. + + Required properties: + - compatible : "marvell,mv64360-cpu-error" + - reg : Offset and length of the register set for this device + - interrupts : the interrupt number for this device + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery CPU Error node: + cpu-error@0070 { + compatible = "marvell,mv64360-cpu-error"; + reg = <0x70 0x10 0x128 0x28>; + interrupts = <3>; + interrupt-parent = <&PIC>; + }; + + + q) Marvell Discovery SRAM Controller nodes + + Represent the Discovery's SRAM controller device. + + Required properties: + - compatible : "marvell,mv64360-sram-ctrl" + - reg : Offset and length of the register set for this device + - interrupts : the interrupt number for this device + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery SRAM Controller node: + sram-ctrl@0380 { + compatible = "marvell,mv64360-sram-ctrl"; + reg = <0x380 0x80>; + interrupts = <13>; + interrupt-parent = <&PIC>; + }; + + + r) Marvell Discovery PCI Error Handler nodes + + Represent the Discovery's PCI error handler device. + + Required properties: + - compatible : "marvell,mv64360-pci-error" + - reg : Offset and length of the register set for this device + - interrupts : the interrupt number for this device + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery PCI Error Handler node: + pci-error@1d40 { + compatible = "marvell,mv64360-pci-error"; + reg = <0x1d40 0x40 0xc28 0x4>; + interrupts = <12>; + interrupt-parent = <&PIC>; + }; + + + s) Marvell Discovery Memory Controller nodes + + Represent the Discovery's memory controller device. + + Required properties: + - compatible : "marvell,mv64360-mem-ctrl" + - reg : Offset and length of the register set for this device + - interrupts : the interrupt number for this device + - interrupt-parent : the phandle for the interrupt controller + that services interrupts for this device. + + Example Discovery Memory Controller node: + mem-ctrl@1400 { + compatible = "marvell,mv64360-mem-ctrl"; + reg = <0x1400 0x60>; + interrupts = <17>; + interrupt-parent = <&PIC>; + }; + + diff --git a/Documentation/powerpc/dts-bindings/phy.txt b/Documentation/powerpc/dts-bindings/phy.txt new file mode 100644 index 000000000000..bb8c742eb8c5 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/phy.txt @@ -0,0 +1,25 @@ +PHY nodes + +Required properties: + + - device_type : Should be "ethernet-phy" + - interrupts : where a is the interrupt number and b is a + field that represents an encoding of the sense and level + information for the interrupt. This should be encoded based on + the information in section 2) depending on the type of interrupt + controller you have. + - interrupt-parent : the phandle for the interrupt controller that + services interrupts for this device. + - reg : The ID number for the phy, usually a small integer + - linux,phandle : phandle for this node; likely referenced by an + ethernet controller node. + +Example: + +ethernet-phy@0 { + linux,phandle = <2452000> + interrupt-parent = <40000>; + interrupts = <35 1>; + reg = <0>; + device_type = "ethernet-phy"; +}; diff --git a/Documentation/powerpc/dts-bindings/spi-bus.txt b/Documentation/powerpc/dts-bindings/spi-bus.txt new file mode 100644 index 000000000000..e782add2e457 --- /dev/null +++ b/Documentation/powerpc/dts-bindings/spi-bus.txt @@ -0,0 +1,57 @@ +SPI (Serial Peripheral Interface) busses + +SPI busses can be described with a node for the SPI master device +and a set of child nodes for each SPI slave on the bus. For this +discussion, it is assumed that the system's SPI controller is in +SPI master mode. This binding does not describe SPI controllers +in slave mode. + +The SPI master node requires the following properties: +- #address-cells - number of cells required to define a chip select + address on the SPI bus. +- #size-cells - should be zero. +- compatible - name of SPI bus controller following generic names + recommended practice. +No other properties are required in the SPI bus node. It is assumed +that a driver for an SPI bus device will understand that it is an SPI bus. +However, the binding does not attempt to define the specific method for +assigning chip select numbers. Since SPI chip select configuration is +flexible and non-standardized, it is left out of this binding with the +assumption that board specific platform code will be used to manage +chip selects. Individual drivers can define additional properties to +support describing the chip select layout. + +SPI slave nodes must be children of the SPI master node and can +contain the following properties. +- reg - (required) chip select address of device. +- compatible - (required) name of SPI device following generic names + recommended practice +- spi-max-frequency - (required) Maximum SPI clocking speed of device in Hz +- spi-cpol - (optional) Empty property indicating device requires + inverse clock polarity (CPOL) mode +- spi-cpha - (optional) Empty property indicating device requires + shifted clock phase (CPHA) mode +- spi-cs-high - (optional) Empty property indicating device requires + chip select active high + +SPI example for an MPC5200 SPI bus: + spi@f00 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "fsl,mpc5200b-spi","fsl,mpc5200-spi"; + reg = <0xf00 0x20>; + interrupts = <2 13 0 2 14 0>; + interrupt-parent = <&mpc5200_pic>; + + ethernet-switch@0 { + compatible = "micrel,ks8995m"; + spi-max-frequency = <1000000>; + reg = <0>; + }; + + codec@1 { + compatible = "ti,tlv320aic26"; + spi-max-frequency = <100000>; + reg = <1>; + }; + }; diff --git a/Documentation/powerpc/dts-bindings/usb-ehci.txt b/Documentation/powerpc/dts-bindings/usb-ehci.txt new file mode 100644 index 000000000000..fa18612f757b --- /dev/null +++ b/Documentation/powerpc/dts-bindings/usb-ehci.txt @@ -0,0 +1,25 @@ +USB EHCI controllers + +Required properties: + - compatible : should be "usb-ehci". + - reg : should contain at least address and length of the standard EHCI + register set for the device. Optional platform-dependent registers + (debug-port or other) can be also specified here, but only after + definition of standard EHCI registers. + - interrupts : one EHCI interrupt should be described here. +If device registers are implemented in big endian mode, the device +node should have "big-endian-regs" property. +If controller implementation operates with big endian descriptors, +"big-endian-desc" property should be specified. +If both big endian registers and descriptors are used by the controller +implementation, "big-endian" property can be specified instead of having +both "big-endian-regs" and "big-endian-desc". + +Example (Sequoia 440EPx): + ehci@e0000300 { + compatible = "ibm,usb-ehci-440epx", "usb-ehci"; + interrupt-parent = <&UIC0>; + interrupts = <1a 4>; + reg = <0 e0000300 90 0 e0000390 70>; + big-endian; + }; diff --git a/Documentation/powerpc/dts-bindings/xilinx.txt b/Documentation/powerpc/dts-bindings/xilinx.txt new file mode 100644 index 000000000000..80339fe4300b --- /dev/null +++ b/Documentation/powerpc/dts-bindings/xilinx.txt @@ -0,0 +1,295 @@ + d) Xilinx IP cores + + The Xilinx EDK toolchain ships with a set of IP cores (devices) for use + in Xilinx Spartan and Virtex FPGAs. The devices cover the whole range + of standard device types (network, serial, etc.) and miscellaneous + devices (gpio, LCD, spi, etc). Also, since these devices are + implemented within the fpga fabric every instance of the device can be + synthesised with different options that change the behaviour. + + Each IP-core has a set of parameters which the FPGA designer can use to + control how the core is synthesized. Historically, the EDK tool would + extract the device parameters relevant to device drivers and copy them + into an 'xparameters.h' in the form of #define symbols. This tells the + device drivers how the IP cores are configured, but it requres the kernel + to be recompiled every time the FPGA bitstream is resynthesized. + + The new approach is to export the parameters into the device tree and + generate a new device tree each time the FPGA bitstream changes. The + parameters which used to be exported as #defines will now become + properties of the device node. In general, device nodes for IP-cores + will take the following form: + + (name): (generic-name)@(base-address) { + compatible = "xlnx,(ip-core-name)-(HW_VER)" + [, (list of compatible devices), ...]; + reg = <(baseaddr) (size)>; + interrupt-parent = <&interrupt-controller-phandle>; + interrupts = < ... >; + xlnx,(parameter1) = "(string-value)"; + xlnx,(parameter2) = <(int-value)>; + }; + + (generic-name): an open firmware-style name that describes the + generic class of device. Preferably, this is one word, such + as 'serial' or 'ethernet'. + (ip-core-name): the name of the ip block (given after the BEGIN + directive in system.mhs). Should be in lowercase + and all underscores '_' converted to dashes '-'. + (name): is derived from the "PARAMETER INSTANCE" value. + (parameter#): C_* parameters from system.mhs. The C_ prefix is + dropped from the parameter name, the name is converted + to lowercase and all underscore '_' characters are + converted to dashes '-'. + (baseaddr): the baseaddr parameter value (often named C_BASEADDR). + (HW_VER): from the HW_VER parameter. + (size): the address range size (often C_HIGHADDR - C_BASEADDR + 1). + + Typically, the compatible list will include the exact IP core version + followed by an older IP core version which implements the same + interface or any other device with the same interface. + + 'reg', 'interrupt-parent' and 'interrupts' are all optional properties. + + For example, the following block from system.mhs: + + BEGIN opb_uartlite + PARAMETER INSTANCE = opb_uartlite_0 + PARAMETER HW_VER = 1.00.b + PARAMETER C_BAUDRATE = 115200 + PARAMETER C_DATA_BITS = 8 + PARAMETER C_ODD_PARITY = 0 + PARAMETER C_USE_PARITY = 0 + PARAMETER C_CLK_FREQ = 50000000 + PARAMETER C_BASEADDR = 0xEC100000 + PARAMETER C_HIGHADDR = 0xEC10FFFF + BUS_INTERFACE SOPB = opb_7 + PORT OPB_Clk = CLK_50MHz + PORT Interrupt = opb_uartlite_0_Interrupt + PORT RX = opb_uartlite_0_RX + PORT TX = opb_uartlite_0_TX + PORT OPB_Rst = sys_bus_reset_0 + END + + becomes the following device tree node: + + opb_uartlite_0: serial@ec100000 { + device_type = "serial"; + compatible = "xlnx,opb-uartlite-1.00.b"; + reg = ; + interrupt-parent = <&opb_intc_0>; + interrupts = <1 0>; // got this from the opb_intc parameters + current-speed = ; // standard serial device prop + clock-frequency = ; // standard serial device prop + xlnx,data-bits = <8>; + xlnx,odd-parity = <0>; + xlnx,use-parity = <0>; + }; + + Some IP cores actually implement 2 or more logical devices. In + this case, the device should still describe the whole IP core with + a single node and add a child node for each logical device. The + ranges property can be used to translate from parent IP-core to the + registers of each device. In addition, the parent node should be + compatible with the bus type 'xlnx,compound', and should contain + #address-cells and #size-cells, as with any other bus. (Note: this + makes the assumption that both logical devices have the same bus + binding. If this is not true, then separate nodes should be used + for each logical device). The 'cell-index' property can be used to + enumerate logical devices within an IP core. For example, the + following is the system.mhs entry for the dual ps2 controller found + on the ml403 reference design. + + BEGIN opb_ps2_dual_ref + PARAMETER INSTANCE = opb_ps2_dual_ref_0 + PARAMETER HW_VER = 1.00.a + PARAMETER C_BASEADDR = 0xA9000000 + PARAMETER C_HIGHADDR = 0xA9001FFF + BUS_INTERFACE SOPB = opb_v20_0 + PORT Sys_Intr1 = ps2_1_intr + PORT Sys_Intr2 = ps2_2_intr + PORT Clkin1 = ps2_clk_rx_1 + PORT Clkin2 = ps2_clk_rx_2 + PORT Clkpd1 = ps2_clk_tx_1 + PORT Clkpd2 = ps2_clk_tx_2 + PORT Rx1 = ps2_d_rx_1 + PORT Rx2 = ps2_d_rx_2 + PORT Txpd1 = ps2_d_tx_1 + PORT Txpd2 = ps2_d_tx_2 + END + + It would result in the following device tree nodes: + + opb_ps2_dual_ref_0: opb-ps2-dual-ref@a9000000 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "xlnx,compound"; + ranges = <0 a9000000 2000>; + // If this device had extra parameters, then they would + // go here. + ps2@0 { + compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; + reg = <0 40>; + interrupt-parent = <&opb_intc_0>; + interrupts = <3 0>; + cell-index = <0>; + }; + ps2@1000 { + compatible = "xlnx,opb-ps2-dual-ref-1.00.a"; + reg = <1000 40>; + interrupt-parent = <&opb_intc_0>; + interrupts = <3 0>; + cell-index = <0>; + }; + }; + + Also, the system.mhs file defines bus attachments from the processor + to the devices. The device tree structure should reflect the bus + attachments. Again an example; this system.mhs fragment: + + BEGIN ppc405_virtex4 + PARAMETER INSTANCE = ppc405_0 + PARAMETER HW_VER = 1.01.a + BUS_INTERFACE DPLB = plb_v34_0 + BUS_INTERFACE IPLB = plb_v34_0 + END + + BEGIN opb_intc + PARAMETER INSTANCE = opb_intc_0 + PARAMETER HW_VER = 1.00.c + PARAMETER C_BASEADDR = 0xD1000FC0 + PARAMETER C_HIGHADDR = 0xD1000FDF + BUS_INTERFACE SOPB = opb_v20_0 + END + + BEGIN opb_uart16550 + PARAMETER INSTANCE = opb_uart16550_0 + PARAMETER HW_VER = 1.00.d + PARAMETER C_BASEADDR = 0xa0000000 + PARAMETER C_HIGHADDR = 0xa0001FFF + BUS_INTERFACE SOPB = opb_v20_0 + END + + BEGIN plb_v34 + PARAMETER INSTANCE = plb_v34_0 + PARAMETER HW_VER = 1.02.a + END + + BEGIN plb_bram_if_cntlr + PARAMETER INSTANCE = plb_bram_if_cntlr_0 + PARAMETER HW_VER = 1.00.b + PARAMETER C_BASEADDR = 0xFFFF0000 + PARAMETER C_HIGHADDR = 0xFFFFFFFF + BUS_INTERFACE SPLB = plb_v34_0 + END + + BEGIN plb2opb_bridge + PARAMETER INSTANCE = plb2opb_bridge_0 + PARAMETER HW_VER = 1.01.a + PARAMETER C_RNG0_BASEADDR = 0x20000000 + PARAMETER C_RNG0_HIGHADDR = 0x3FFFFFFF + PARAMETER C_RNG1_BASEADDR = 0x60000000 + PARAMETER C_RNG1_HIGHADDR = 0x7FFFFFFF + PARAMETER C_RNG2_BASEADDR = 0x80000000 + PARAMETER C_RNG2_HIGHADDR = 0xBFFFFFFF + PARAMETER C_RNG3_BASEADDR = 0xC0000000 + PARAMETER C_RNG3_HIGHADDR = 0xDFFFFFFF + BUS_INTERFACE SPLB = plb_v34_0 + BUS_INTERFACE MOPB = opb_v20_0 + END + + Gives this device tree (some properties removed for clarity): + + plb@0 { + #address-cells = <1>; + #size-cells = <1>; + compatible = "xlnx,plb-v34-1.02.a"; + device_type = "ibm,plb"; + ranges; // 1:1 translation + + plb_bram_if_cntrl_0: bram@ffff0000 { + reg = ; + } + + opb@20000000 { + #address-cells = <1>; + #size-cells = <1>; + ranges = <20000000 20000000 20000000 + 60000000 60000000 20000000 + 80000000 80000000 40000000 + c0000000 c0000000 20000000>; + + opb_uart16550_0: serial@a0000000 { + reg = ; + }; + + opb_intc_0: interrupt-controller@d1000fc0 { + reg = ; + }; + }; + }; + + That covers the general approach to binding xilinx IP cores into the + device tree. The following are bindings for specific devices: + + i) Xilinx ML300 Framebuffer + + Simple framebuffer device from the ML300 reference design (also on the + ML403 reference design as well as others). + + Optional properties: + - resolution = : pixel resolution of framebuffer. Some + implementations use a different resolution. + Default is + - virt-resolution = : Size of framebuffer in memory. + Default is . + - rotate-display (empty) : rotate display 180 degrees. + + ii) Xilinx SystemACE + + The Xilinx SystemACE device is used to program FPGAs from an FPGA + bitstream stored on a CF card. It can also be used as a generic CF + interface device. + + Optional properties: + - 8-bit (empty) : Set this property for SystemACE in 8 bit mode + + iii) Xilinx EMAC and Xilinx TEMAC + + Xilinx Ethernet devices. In addition to general xilinx properties + listed above, nodes for these devices should include a phy-handle + property, and may include other common network device properties + like local-mac-address. + + iv) Xilinx Uartlite + + Xilinx uartlite devices are simple fixed speed serial ports. + + Required properties: + - current-speed : Baud rate of uartlite + + v) Xilinx hwicap + + Xilinx hwicap devices provide access to the configuration logic + of the FPGA through the Internal Configuration Access Port + (ICAP). The ICAP enables partial reconfiguration of the FPGA, + readback of the configuration information, and some control over + 'warm boots' of the FPGA fabric. + + Required properties: + - xlnx,family : The family of the FPGA, necessary since the + capabilities of the underlying ICAP hardware + differ between different families. May be + 'virtex2p', 'virtex4', or 'virtex5'. + + vi) Xilinx Uart 16550 + + Xilinx UART 16550 devices are very similar to the NS16550 but with + different register spacing and an offset from the base address. + + Required properties: + - clock-frequency : Frequency of the clock input + - reg-offset : A value of 3 is required + - reg-shift : A value of 2 is required + + -- cgit v1.2.2 From 5054d39e327f76df022163a2ebd02e444c5d65f9 Mon Sep 17 00:00:00 2001 From: Antonio Ospite Date: Fri, 19 Jun 2009 13:55:42 +0200 Subject: leds: LED driver for National Semiconductor LP3944 Funlight Chip LEDs driver for National Semiconductor LP3944 Funlight Chip http://www.national.com/pf/LP/LP3944.html This helper chip can drive up to 8 leds, with two programmable DIM modes; it could even be used as a gpio expander but this driver assumes it is used as a led controller. The DIM modes are used to set _blink_ patterns for leds, the pattern is specified supplying two parameters: - period: from 0s to 1.6s - duty cycle: percentage of the period the led is on, from 0 to 100 LP3944 can be found on Motorola A910 smartphone, where it drives the rgb leds, the camera flash light and the displays backlights. Signed-off-by: Antonio Ospite Signed-off-by: Richard Purdie --- Documentation/leds-lp3944.txt | 50 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) create mode 100644 Documentation/leds-lp3944.txt (limited to 'Documentation') diff --git a/Documentation/leds-lp3944.txt b/Documentation/leds-lp3944.txt new file mode 100644 index 000000000000..c6eda18b15ef --- /dev/null +++ b/Documentation/leds-lp3944.txt @@ -0,0 +1,50 @@ +Kernel driver lp3944 +==================== + + * National Semiconductor LP3944 Fun-light Chip + Prefix: 'lp3944' + Addresses scanned: None (see the Notes section below) + Datasheet: Publicly available at the National Semiconductor website + http://www.national.com/pf/LP/LP3944.html + +Authors: + Antonio Ospite + + +Description +----------- +The LP3944 is a helper chip that can drive up to 8 leds, with two programmable +DIM modes; it could even be used as a gpio expander but this driver assumes it +is used as a led controller. + +The DIM modes are used to set _blink_ patterns for leds, the pattern is +specified supplying two parameters: + - period: from 0s to 1.6s + - duty cycle: percentage of the period the led is on, from 0 to 100 + +Setting a led in DIM0 or DIM1 mode makes it blink according to the pattern. +See the datasheet for details. + +LP3944 can be found on Motorola A910 smartphone, where it drives the rgb +leds, the camera flash light and the lcds power. + + +Notes +----- +The chip is used mainly in embedded contexts, so this driver expects it is +registered using the i2c_board_info mechanism. + +To register the chip at address 0x60 on adapter 0, set the platform data +according to include/linux/leds-lp3944.h, set the i2c board info: + + static struct i2c_board_info __initdata a910_i2c_board_info[] = { + { + I2C_BOARD_INFO("lp3944", 0x60), + .platform_data = &a910_lp3944_leds, + }, + }; + +and register it in the platform init function + + i2c_register_board_info(0, a910_i2c_board_info, + ARRAY_SIZE(a910_i2c_board_info)); -- cgit v1.2.2 From ed88bae6918fa990cbfe47316bd0f790121aaf00 Mon Sep 17 00:00:00 2001 From: Trent Piepho Date: Tue, 12 May 2009 15:33:12 -0700 Subject: leds: Add options to have GPIO LEDs start on or keep their state There already is a "default-on" trigger but there are problems with it. For one, it's a inefficient way to do it and requires led trigger support to be compiled in. But the real reason is that is produces a glitch on the LED. The GPIO is allocate with the LED *off*, then *later* when the trigger runs it is turned back on. If the LED was already on via the GPIO's reset default or action of the firmware, this produces a glitch where the LED goes from on to off to on. While normally this is fast enough that it wouldn't be noticeable to a human observer, there are still serious problems. One is that there may be something else on the GPIO line, like a hardware alarm or watchdog, that is fast enough to notice the glitch. Another is that the kernel may panic before the LED is turned back on, thus hanging with the LED in the wrong state. This is not just speculation, but actually happened to me with an embedded system that has an LED which should turn off when the kernel finishes booting, which was left in the incorrect state due to a bug in the OF LED binding code. We also let GPIO LEDs get their initial value from whatever the current state of the GPIO line is. On some systems the LEDs are put into some state by the firmware or hardware before Linux boots, and it is desired to have them keep this state which is otherwise unknown to Linux. This requires that the underlying GPIO driver support reading the value of output GPIOs. Some drivers support this and some do not. The platform device binding gains a field in the platform data "default_state" that controls this. There are three constants defined to select from on, off, or keeping the current state. The OpenFirmware binding uses a property named "default-state" that can be set to "on", "off", or "keep". The default if the property isn't present is off. Signed-off-by: Trent Piepho Acked-by: Grant Likely Acked-by: Wolfram Sang Acked-by: Sean MacLennan Signed-off-by: Richard Purdie --- Documentation/powerpc/dts-bindings/gpio/led.txt | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/powerpc/dts-bindings/gpio/led.txt b/Documentation/powerpc/dts-bindings/gpio/led.txt index 4fe14deedc0a..064db928c3c1 100644 --- a/Documentation/powerpc/dts-bindings/gpio/led.txt +++ b/Documentation/powerpc/dts-bindings/gpio/led.txt @@ -16,10 +16,17 @@ LED sub-node properties: string defining the trigger assigned to the LED. Current triggers are: "backlight" - LED will act as a back-light, controlled by the framebuffer system - "default-on" - LED will turn on + "default-on" - LED will turn on, but see "default-state" below "heartbeat" - LED "double" flashes at a load average based rate "ide-disk" - LED indicates disk activity "timer" - LED flashes at a fixed, configurable rate +- default-state: (optional) The initial state of the LED. Valid + values are "on", "off", and "keep". If the LED is already on or off + and the default-state property is set the to same value, then no + glitch should be produced where the LED momentarily turns off (or + on). The "keep" setting will keep the LED at whatever its current + state is, without producing a glitch. The default is off if this + property is not present. Examples: @@ -30,14 +37,22 @@ leds { gpios = <&mcu_pio 0 1>; /* Active low */ linux,default-trigger = "ide-disk"; }; + + fault { + gpios = <&mcu_pio 1 0>; + /* Keep LED on if BIOS detected hardware fault */ + default-state = "keep"; + }; }; run-control { compatible = "gpio-leds"; red { gpios = <&mpc8572 6 0>; + default-state = "off"; }; green { gpios = <&mpc8572 7 0>; + default-state = "on"; }; } -- cgit v1.2.2 From c912e7a58054304575fe88574c776be7e684098e Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 24 Jun 2009 14:14:34 +0200 Subject: ALSA: hda - Fix support for Samsung P50 with AD1986A codec Samsung P50 requires the HP auto-muting unlike other Samsung models. Added a new model=samsung-p50 to support this. Signed-off-by: Takashi Iwai --- Documentation/sound/alsa/HD-Audio-Models.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 0d8d23581c44..939a3dd58148 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt @@ -240,6 +240,7 @@ AD1986A laptop-automute 2-channel with EAPD and HP-automute (Lenovo N100) ultra 2-channel with EAPD (Samsung Ultra tablet PC) samsung 2-channel with EAPD (Samsung R65) + samsung-p50 2-channel with HP-automute (Samsung P50) AD1988/AD1988B/AD1989A/AD1989B ============================== -- cgit v1.2.2 From 7e325d3a6b117c7288bfc0755410e9d9d2b71326 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Fri, 19 Jun 2009 20:22:37 +0200 Subject: update Documentation/filesystems/Locking The rules for locking in many superblock operations has changed significantly, so update the documentation for it. Also correct some older updates and ommissions. Signed-off-by: Christoph Hellwig Signed-off-by: Al Viro --- Documentation/filesystems/Locking | 43 ++++++++++++++++++++------------------- 1 file changed, 22 insertions(+), 21 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 229d7b7c50a3..18b9d0ca0630 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -109,27 +109,28 @@ prototypes: locking rules: All may block. - BKL s_lock s_umount -alloc_inode: no no no -destroy_inode: no -dirty_inode: no (must not sleep) -write_inode: no -drop_inode: no !!!inode_lock!!! -delete_inode: no -put_super: yes yes no -write_super: no yes read -sync_fs: no no read -freeze_fs: ? -unfreeze_fs: ? -statfs: no no no -remount_fs: yes yes maybe (see below) -clear_inode: no -umount_begin: yes no no -show_options: no (vfsmount->sem) -quota_read: no no no (see below) -quota_write: no no no (see below) - -->remount_fs() will have the s_umount lock if it's already mounted. + None have BKL + s_umount +alloc_inode: +destroy_inode: +dirty_inode: (must not sleep) +write_inode: +drop_inode: !!!inode_lock!!! +delete_inode: +put_super: write +write_super: read +sync_fs: read +freeze_fs: read +unfreeze_fs: read +statfs: no +remount_fs: maybe (see below) +clear_inode: +umount_begin: no +show_options: no (namespace_sem) +quota_read: no (see below) +quota_write: no (see below) + +->remount_fs() will have the s_umount exclusive lock if it's already mounted. When called from get_sb_single, it does NOT have the s_umount lock. ->quota_read() and ->quota_write() functions are both guaranteed to be the only ones operating on the quota file by the quota code (via -- cgit v1.2.2 From 236e946b53ffd5e2f5d7e6abebbe72a9f0826d15 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Wed, 24 Jun 2009 16:23:03 -0700 Subject: Revert "PCI: use ACPI _CRS data by default" This reverts commit 9e9f46c44e487af0a82eb61b624553e2f7118f5b. Quoting from the commit message: "At this point, it seems to solve more problems than it causes, so let's try using it by default. It's an easy revert if it ends up causing trouble." And guess what? The _CRS code causes trouble. Signed-off-by: Linus Torvalds --- Documentation/kernel-parameters.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 040fee607282..d08759aa0903 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1855,7 +1855,7 @@ and is between 256 and 4096 characters. It is defined in the file IRQ routing is enabled. noacpi [X86] Do not use ACPI for IRQ routing or for PCI scanning. - nocrs [X86] Don't use _CRS for PCI resource + use_crs [X86] Use _CRS for PCI resource allocation. routeirq Do IRQ routing for all PCI devices. This is normally done in pci_enable_device(), -- cgit v1.2.2 From a9d9058abab4ac17b79d500506e6c74bd16cecdc Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Thu, 25 Jun 2009 10:16:11 +0100 Subject: kmemleak: Allow the early log buffer to be configurable. (feature suggested by Sergey Senozhatsky) Kmemleak needs to track all the memory allocations but some of these happen before kmemleak is initialised. These are stored in an internal buffer which may be exceeded in some kernel configurations. This patch adds a configuration option with a default value of 400 and also removes the stack dump when the early log buffer is exceeded. Signed-off-by: Catalin Marinas Acked-by: Sergey Senozhatsky --- Documentation/kmemleak.txt | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt index 0112da3b9ab8..f655308064d7 100644 --- a/Documentation/kmemleak.txt +++ b/Documentation/kmemleak.txt @@ -41,6 +41,10 @@ Memory scanning parameters can be modified at run-time by writing to the Kmemleak can also be disabled at boot-time by passing "kmemleak=off" on the kernel command line. +Memory may be allocated or freed before kmemleak is initialised and +these actions are stored in an early log buffer. The size of this buffer +is configured via the CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option. + Basic Algorithm --------------- -- cgit v1.2.2 From e0a2a1601bec01243bcad44414d06f59dae2eedb Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Fri, 26 Jun 2009 17:38:25 +0100 Subject: kmemleak: Enable task stacks scanning by default This is to reduce the number of false positives reported. Signed-off-by: Catalin Marinas --- Documentation/kmemleak.txt | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt index f655308064d7..9426e94f291a 100644 --- a/Documentation/kmemleak.txt +++ b/Documentation/kmemleak.txt @@ -31,12 +31,12 @@ Memory scanning parameters can be modified at run-time by writing to the /sys/kernel/debug/kmemleak file. The following parameters are supported: off - disable kmemleak (irreversible) - stack=on - enable the task stacks scanning + stack=on - enable the task stacks scanning (default) stack=off - disable the tasks stacks scanning - scan=on - start the automatic memory scanning thread + scan=on - start the automatic memory scanning thread (default) scan=off - stop the automatic memory scanning thread - scan= - set the automatic memory scanning period in seconds (0 - to disable it) + scan= - set the automatic memory scanning period in seconds + (default 600, 0 to stop the automatic scanning) Kmemleak can also be disabled at boot-time by passing "kmemleak=off" on the kernel command line. -- cgit v1.2.2 From bab4a34afc301fdb81b6ea0e3098d96fc356e03a Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Fri, 26 Jun 2009 17:38:26 +0100 Subject: kmemleak: Simplify the reports logged by the scanning thread Because of false positives, the memory scanning thread may print too much information. This patch changes the scanning thread to only print the number of newly suspected leaks. Further information can be read from the /sys/kernel/debug/kmemleak file. Signed-off-by: Catalin Marinas --- Documentation/kmemleak.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt index 9426e94f291a..c06f7ba64993 100644 --- a/Documentation/kmemleak.txt +++ b/Documentation/kmemleak.txt @@ -16,9 +16,9 @@ Usage ----- CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel -thread scans the memory every 10 minutes (by default) and prints any new -unreferenced objects found. To trigger an intermediate scan and display -all the possible memory leaks: +thread scans the memory every 10 minutes (by default) and prints the +number of new unreferenced objects found. To trigger an intermediate +scan and display the details of all the possible memory leaks: # mount -t debugfs nodev /sys/kernel/debug/ # cat /sys/kernel/debug/kmemleak -- cgit v1.2.2 From 4698c1f2bbe44ce852ef1a6716973c1f5401a4c4 Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Fri, 26 Jun 2009 17:38:27 +0100 Subject: kmemleak: Do not trigger a scan when reading the debug/kmemleak file Since there is a kernel thread for automatically scanning the memory, it makes sense for the debug/kmemleak file to only show its findings. This patch also adds support for "echo scan > debug/kmemleak" to trigger an intermediate memory scan and eliminates the kmemleak_mutex (scan_mutex covers all the cases now). Signed-off-by: Catalin Marinas --- Documentation/kmemleak.txt | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt index c06f7ba64993..89068030b01b 100644 --- a/Documentation/kmemleak.txt +++ b/Documentation/kmemleak.txt @@ -17,12 +17,16 @@ Usage CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel thread scans the memory every 10 minutes (by default) and prints the -number of new unreferenced objects found. To trigger an intermediate -scan and display the details of all the possible memory leaks: +number of new unreferenced objects found. To display the details of all +the possible memory leaks: # mount -t debugfs nodev /sys/kernel/debug/ # cat /sys/kernel/debug/kmemleak +To trigger an intermediate memory scan: + + # echo scan > /sys/kernel/debug/kmemleak + Note that the orphan objects are listed in the order they were allocated and one object at the beginning of the list may cause other subsequent objects to be reported as orphan. @@ -37,6 +41,7 @@ Memory scanning parameters can be modified at run-time by writing to the scan=off - stop the automatic memory scanning thread scan= - set the automatic memory scanning period in seconds (default 600, 0 to stop the automatic scanning) + scan - trigger a memory scan Kmemleak can also be disabled at boot-time by passing "kmemleak=off" on the kernel command line. -- cgit v1.2.2 From 972c71a3183ab41c0b1a9e50842be7e3e980954f Mon Sep 17 00:00:00 2001 From: Peter Oberparleiter Date: Tue, 30 Jun 2009 11:41:20 -0700 Subject: gcov: fix documentation Commonly available versions of cp and tar don't work well with special files created using seq_file. Mention this problem in the gcov documentation and update the helper script example to work around these problems. Signed-off-by: Peter Oberparleiter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/gcov.txt | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) (limited to 'Documentation') diff --git a/Documentation/gcov.txt b/Documentation/gcov.txt index e716aadb3a33..40ec63352760 100644 --- a/Documentation/gcov.txt +++ b/Documentation/gcov.txt @@ -188,13 +188,18 @@ Solution: Exclude affected source files from profiling by specifying GCOV_PROFILE := n or GCOV_PROFILE_basename.o := n in the corresponding Makefile. +Problem: Files copied from sysfs appear empty or incomplete. +Cause: Due to the way seq_file works, some tools such as cp or tar + may not correctly copy files from sysfs. +Solution: Use 'cat' to read .gcda files and 'cp -d' to copy links. + Alternatively use the mechanism shown in Appendix B. + Appendix A: gather_on_build.sh ============================== Sample script to gather coverage meta files on the build machine (see 6a): - #!/bin/bash KSRC=$1 @@ -226,7 +231,7 @@ Appendix B: gather_on_test.sh Sample script to gather coverage data files on the test machine (see 6b): -#!/bin/bash +#!/bin/bash -e DEST=$1 GCDA=/sys/kernel/debug/gcov @@ -236,11 +241,13 @@ if [ -z "$DEST" ] ; then exit 1 fi -find $GCDA -name '*.gcno' -o -name '*.gcda' | tar cfz $DEST -T - +TEMPDIR=$(mktemp -d) +echo Collecting data.. +find $GCDA -type d -exec mkdir -p $TEMPDIR/\{\} \; +find $GCDA -name '*.gcda' -exec sh -c 'cat < $0 > '$TEMPDIR'/$0' {} \; +find $GCDA -name '*.gcno' -exec sh -c 'cp -d $0 '$TEMPDIR'/$0' {} \; +tar czf $DEST -C $TEMPDIR sys +rm -rf $TEMPDIR -if [ $? -eq 0 ] ; then - echo "$DEST successfully created, copy to build system and unpack with:" - echo " tar xfz $DEST" -else - echo "Could not create file $DEST" -fi +echo "$DEST successfully created, copy to build system and unpack with:" +echo " tar xfz $DEST" -- cgit v1.2.2 From b55f627feeb9d48fdbde3835e18afbc76712e49b Mon Sep 17 00:00:00 2001 From: David Brownell Date: Tue, 30 Jun 2009 11:41:26 -0700 Subject: spi: new spi->mode bits Add two new spi_device.mode bits to accomodate more protocol options, and pass them through to usermode drivers: * SPI_NO_CS ... a second 3-wire variant, where the chipselect line is removed instead of a data line; transfers are still full duplex. This obviously has STRONG protocol implications since the chipselect transitions can't be used to synchronize state transitions with the SPI master. * SPI_READY ... defines open drain signal that's pulled low to pause the clock. This defines a 5-wire variant (normal 4-wire SPI plus READY) and two 4-wire variants (READY plus each of the 3-wire flavors). Such hardware flow control can be a big win. There are ADC converters and flash chips that expose READY signals, but not many host controllers support it today. The spi_bitbang code should be changed to use SPI_NO_CS instead of its current nonportable hack. That's a mode most hardware can easily support (unlike SPI_READY). Signed-off-by: David Brownell Cc: "Paulraj, Sandeep" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/spi/spidev_test.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/spi/spidev_test.c b/Documentation/spi/spidev_test.c index cf0e3ce0d526..c1a5aad3c75a 100644 --- a/Documentation/spi/spidev_test.c +++ b/Documentation/spi/spidev_test.c @@ -99,11 +99,13 @@ void parse_opts(int argc, char *argv[]) { "lsb", 0, 0, 'L' }, { "cs-high", 0, 0, 'C' }, { "3wire", 0, 0, '3' }, + { "no-cs", 0, 0, 'N' }, + { "ready", 0, 0, 'R' }, { NULL, 0, 0, 0 }, }; int c; - c = getopt_long(argc, argv, "D:s:d:b:lHOLC3", lopts, NULL); + c = getopt_long(argc, argv, "D:s:d:b:lHOLC3NR", lopts, NULL); if (c == -1) break; @@ -139,6 +141,12 @@ void parse_opts(int argc, char *argv[]) case '3': mode |= SPI_3WIRE; break; + case 'N': + mode |= SPI_NO_CS; + break; + case 'R': + mode |= SPI_READY; + break; default: print_usage(argv[0]); break; -- cgit v1.2.2 From b37f2d4de6dfce4bfd6df311af80e4d61458ee1e Mon Sep 17 00:00:00 2001 From: Nikanth Karthikesan Date: Tue, 30 Jun 2009 11:41:36 -0700 Subject: cpusets: document adding/removing cpus to cpuset elaborately By writing a tasks's pid to the file, a process adds that task to that cgroup/cpuset. But to add a cpu/mem to a cpuset, the new list of cpus should be written to the cpuset.mems file which would replace the old list of cpus. Make this clearer in the documentation. Signed-off-by: Nikanth Karthikesan Signed-off-by: Li Zefan Acked-by: Paul Menage Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/cgroups/cpusets.txt | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'Documentation') diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt index f9ca389dddf4..1d7e9784439a 100644 --- a/Documentation/cgroups/cpusets.txt +++ b/Documentation/cgroups/cpusets.txt @@ -777,6 +777,18 @@ in cpuset directories: # /bin/echo 1-4 > cpus -> set cpus list to cpus 1,2,3,4 # /bin/echo 1,2,3,4 > cpus -> set cpus list to cpus 1,2,3,4 +To add a CPU to a cpuset, write the new list of CPUs including the +CPU to be added. To add 6 to the above cpuset: + +# /bin/echo 1-4,6 > cpus -> set cpus list to cpus 1,2,3,4,6 + +Similarly to remove a CPU from a cpuset, write the new list of CPUs +without the CPU to be removed. + +To remove all the CPUs: + +# /bin/echo "" > cpus -> clear cpus list + 2.3 Setting flags ----------------- -- cgit v1.2.2 From 61fd21670d048017c81e62f60894ef1b04b481db Mon Sep 17 00:00:00 2001 From: Andre Noll Date: Thu, 25 Jun 2009 13:03:04 +0200 Subject: Trivial typo fixes in Documentation/block/data-integrity.txt. Signed-off-by: Andre Noll Acked-by: Martin K. Petersen Signed-off-by: Jens Axboe --- Documentation/block/data-integrity.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/block/data-integrity.txt b/Documentation/block/data-integrity.txt index e8ca040ba2cf..2d735b0ae383 100644 --- a/Documentation/block/data-integrity.txt +++ b/Documentation/block/data-integrity.txt @@ -50,7 +50,7 @@ encouraged them to allow separation of the data and integrity metadata scatter-gather lists. The controller will interleave the buffers on write and split them on -read. This means that the Linux can DMA the data buffers to and from +read. This means that Linux can DMA the data buffers to and from host memory without changes to the page cache. Also, the 16-bit CRC checksum mandated by both the SCSI and SATA specs @@ -66,7 +66,7 @@ software RAID5). The IP checksum is weaker than the CRC in terms of detecting bit errors. However, the strength is really in the separation of the data -buffers and the integrity metadata. These two distinct buffers much +buffers and the integrity metadata. These two distinct buffers must match up for an I/O to complete. The separation of the data and integrity metadata buffers as well as -- cgit v1.2.2 From 9f38a920b232800fd4000ba3d4617b41198e017e Mon Sep 17 00:00:00 2001 From: Andy Walls Date: Fri, 3 Jul 2009 11:33:17 -0300 Subject: V4L/DVB (12181): get_dvb_firmware: Add Yuan MPC718 MT352 DVB-T "firmware" extraction Add routine to support extracting the MT352 DVB-T demodulator initialization sequence for Yuan MPC718 cards for use by the cx18 driver. This routine uses a hueristic for extracting a good sequence. It should work on all different versions of the "yuanrap.sys" file, given the way the MT352 tuning sequences are stored in all versions of that file I have seen so far. However, the current patch simply looks for one specific archive URL. Signed-off-by: Andy Walls Signed-off-by: Mauro Carvalho Chehab --- Documentation/dvb/get_dvb_firmware | 52 +++++++++++++++++++++++++++++++++++++- 1 file changed, 51 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware index a52adfc9a57f..64174d6258f0 100644 --- a/Documentation/dvb/get_dvb_firmware +++ b/Documentation/dvb/get_dvb_firmware @@ -25,7 +25,7 @@ use IO::Handle; "tda10046lifeview", "av7110", "dec2000t", "dec2540t", "dec3000s", "vp7041", "dibusb", "nxt2002", "nxt2004", "or51211", "or51132_qam", "or51132_vsb", "bluebird", - "opera1", "cx231xx", "cx18", "cx23885", "pvrusb2" ); + "opera1", "cx231xx", "cx18", "cx23885", "pvrusb2", "mpc718" ); # Check args syntax() if (scalar(@ARGV) != 1); @@ -381,6 +381,56 @@ sub cx18 { $allfiles; } +sub mpc718 { + my $archive = 'Yuan MPC718 TV Tuner Card 2.13.10.1016.zip'; + my $url = "ftp://ftp.work.acer-euro.com/desktop/aspire_idea510/vista/Drivers/$archive"; + my $fwfile = "dvb-cx18-mpc718-mt352.fw"; + my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1); + + checkstandard(); + wgetfile($archive, $url); + unzip($archive, $tmpdir); + + my $sourcefile = "$tmpdir/Yuan MPC718 TV Tuner Card 2.13.10.1016/mpc718_32bit/yuanrap.sys"; + my $found = 0; + + open IN, '<', $sourcefile or die "Couldn't open $sourcefile to extract $fwfile data\n"; + binmode IN; + open OUT, '>', $fwfile; + binmode OUT; + { + # Block scope because we change the line terminator variable $/ + my $prevlen = 0; + my $currlen; + + # Buried in the data segment are 3 runs of almost identical + # register-value pairs that end in 0x5d 0x01 which is a "TUNER GO" + # command for the MT352. + # Pull out the middle run (because it's easy) of register-value + # pairs to make the "firmware" file. + + local $/ = "\x5d\x01"; # MT352 "TUNER GO" + + while () { + $currlen = length($_); + if ($prevlen == $currlen || $currlen <= 64) { + chop; chop; # Get rid of "TUNER GO" + s/^\0\0//; # get rid of leading 00 00 if it's there + printf OUT "$_"; + $found = 1; + last; + } + } + } + close OUT; + close IN; + if (!$found) { + unlink $fwfile; + die "Couldn't find valid register-value sequence in $sourcefile for $fwfile\n"; + } + $fwfile; +} + sub cx23885 { my $url = "http://linuxtv.org/downloads/firmware/"; -- cgit v1.2.2 From 02e7804b2135ff941b8846f5820cf48fbfdadd54 Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Mon, 29 Jun 2009 11:35:05 -0300 Subject: V4L/DVB (12138): em28xx: add support for Silvercrest Webcam This webcam uses a em2710 chipset, that identifies itself as em2820, plus a mt9v011 sensor, and a DY-301P lens. It needs a few different initializations than a normal em28xx device. Thanks to Hans de Goede and Douglas Landgraf for providing the acces for the webcam during this weekend, I could make a patch for it while returning back from FISL/Fudcom LATAM 2009. Signed-off-by: Mauro Carvalho Chehab --- Documentation/video4linux/CARDLIST.em28xx | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index 873630e7e53e..014d255231fc 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx @@ -66,3 +66,4 @@ 68 -> Terratec AV350 (em2860) [0ccd:0084] 69 -> KWorld ATSC 315U HDTV TV Box (em2882) [eb1a:a313] 70 -> Evga inDtube (em2882) + 71 -> Silvercrest Webcam 1.3mpix (em2820/em2840) -- cgit v1.2.2 From 0a6843483c256c859cd9542361812a29403f0fb5 Mon Sep 17 00:00:00 2001 From: Andy Walls Date: Sun, 5 Jul 2009 16:22:45 -0300 Subject: V4L/DVB (12206): get_dvb_firmware: Correct errors in MPC718 firmware extraction logic The extraction routine for the MPC718 "firmware" had 2 bugs in it, where one bug masked the effect of the other. The loop iteration should have set $prevlen = $currlen at the end of the loop, and the if() check should have used && instead of || for deciding if the firmware length is reasonable. Signed-off-by: Andy Walls Signed-off-by: Mauro Carvalho Chehab --- Documentation/dvb/get_dvb_firmware | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/dvb/get_dvb_firmware b/Documentation/dvb/get_dvb_firmware index 64174d6258f0..3d1b0ab70c8e 100644 --- a/Documentation/dvb/get_dvb_firmware +++ b/Documentation/dvb/get_dvb_firmware @@ -413,13 +413,14 @@ sub mpc718 { while () { $currlen = length($_); - if ($prevlen == $currlen || $currlen <= 64) { + if ($prevlen == $currlen && $currlen <= 64) { chop; chop; # Get rid of "TUNER GO" s/^\0\0//; # get rid of leading 00 00 if it's there printf OUT "$_"; $found = 1; last; } + $prevlen = $currlen; } } close OUT; -- cgit v1.2.2 From 37c90e8887efd218dc4af949b7f498ca2da4af9f Mon Sep 17 00:00:00 2001 From: "venkatesh.pallipadi@intel.com" Date: Thu, 2 Jul 2009 17:08:31 -0700 Subject: [CPUFREQ] Mark policy_rwsem as going static in cpufreq.c wont be exported lock_policy_rwsem_* and unlock_policy_rwsem_* routines in cpufreq.c are currently exported to drivers. Improper use of those locks can result in deadlocks and it is better to keep the locks localized. Two previous in-kernel users of these interfaces (ondemand and conservative), do not use this interfaces any more. Schedule them for removal. Signed-off-by: Venkatesh Pallipadi Signed-off-by: Dave Jones --- Documentation/feature-removal-schedule.txt | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index f8cd450be9aa..09e031c55887 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -458,3 +458,13 @@ Why: Remove the old legacy 32bit machine check code. This has been but the old version has been kept around for easier testing. Note this doesn't impact the old P5 and WinChip machine check handlers. Who: Andi Kleen + +---------------------------- + +What: lock_policy_rwsem_* and unlock_policy_rwsem_* will not be + exported interface anymore. +When: 2.6.33 +Why: cpu_policy_rwsem has a new cleaner definition making it local to + cpufreq core and contained inside cpufreq.c. Other dependent + drivers should not use it in order to safely avoid lockdep issues. +Who: Venkatesh Pallipadi -- cgit v1.2.2 From b9744d19e35d74f965fb94bd55f9313d3a7d9e54 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Tue, 7 Jul 2009 11:10:12 +0200 Subject: mac80211: fix docbook These two functions no longer exist in mac80211, so trying to insert them generates warnings in the document. Signed-off-by: Johannes Berg Signed-off-by: John W. Linville --- Documentation/DocBook/mac80211.tmpl | 2 -- 1 file changed, 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/DocBook/mac80211.tmpl b/Documentation/DocBook/mac80211.tmpl index e36986663570..f3f37f141dbd 100644 --- a/Documentation/DocBook/mac80211.tmpl +++ b/Documentation/DocBook/mac80211.tmpl @@ -184,8 +184,6 @@ usage should require reading the full document. !Finclude/net/mac80211.h ieee80211_ctstoself_get !Finclude/net/mac80211.h ieee80211_ctstoself_duration !Finclude/net/mac80211.h ieee80211_generic_frame_duration -!Finclude/net/mac80211.h ieee80211_get_hdrlen_from_skb -!Finclude/net/mac80211.h ieee80211_hdrlen !Finclude/net/mac80211.h ieee80211_wake_queue !Finclude/net/mac80211.h ieee80211_stop_queue !Finclude/net/mac80211.h ieee80211_wake_queues -- cgit v1.2.2 From 8d7ff4f2a0b22b7d6d7bc3982257d1dadea22824 Mon Sep 17 00:00:00 2001 From: Robert Richter Date: Tue, 23 Jun 2009 11:48:14 +0200 Subject: x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon The short name of the achitecture is 'arch_perfmon'. This patch changes the kernel parameter to use this name. Cc: Andi Kleen Signed-off-by: Robert Richter Signed-off-by: Ingo Molnar --- Documentation/kernel-parameters.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 92e1ab8178a8..c59e965a748d 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1728,8 +1728,8 @@ and is between 256 and 4096 characters. It is defined in the file oprofile.cpu_type= Force an oprofile cpu type This might be useful if you have an older oprofile userland or if you want common events. - Format: { archperfmon } - archperfmon: [X86] Force use of architectural + Format: { arch_perfmon } + arch_perfmon: [X86] Force use of architectural perfmon on Intel CPUs instead of the CPU specific event set. -- cgit v1.2.2 From 3697cd9aa80125f7717c3c7e7253cfa49a39a388 Mon Sep 17 00:00:00 2001 From: Amerigo Wang Date: Fri, 10 Jul 2009 15:02:41 -0700 Subject: Doc: update Documentation/exception.txt Update Documentation/exception.txt. Remove trailing whitespaces in it. Signed-off-by: WANG Cong Signed-off-by: Randy Dunlap Signed-off-by: Linus Torvalds --- Documentation/exception.txt | 202 ++++++++++++++++++++++---------------------- 1 file changed, 101 insertions(+), 101 deletions(-) (limited to 'Documentation') diff --git a/Documentation/exception.txt b/Documentation/exception.txt index 2d5aded64247..32901aa36f0a 100644 --- a/Documentation/exception.txt +++ b/Documentation/exception.txt @@ -1,123 +1,123 @@ - Kernel level exception handling in Linux 2.1.8 + Kernel level exception handling in Linux Commentary by Joerg Pommnitz -When a process runs in kernel mode, it often has to access user -mode memory whose address has been passed by an untrusted program. +When a process runs in kernel mode, it often has to access user +mode memory whose address has been passed by an untrusted program. To protect itself the kernel has to verify this address. -In older versions of Linux this was done with the -int verify_area(int type, const void * addr, unsigned long size) +In older versions of Linux this was done with the +int verify_area(int type, const void * addr, unsigned long size) function (which has since been replaced by access_ok()). -This function verified that the memory area starting at address +This function verified that the memory area starting at address 'addr' and of size 'size' was accessible for the operation specified -in type (read or write). To do this, verify_read had to look up the -virtual memory area (vma) that contained the address addr. In the -normal case (correctly working program), this test was successful. +in type (read or write). To do this, verify_read had to look up the +virtual memory area (vma) that contained the address addr. In the +normal case (correctly working program), this test was successful. It only failed for a few buggy programs. In some kernel profiling tests, this normally unneeded verification used up a considerable amount of time. -To overcome this situation, Linus decided to let the virtual memory +To overcome this situation, Linus decided to let the virtual memory hardware present in every Linux-capable CPU handle this test. How does this work? -Whenever the kernel tries to access an address that is currently not -accessible, the CPU generates a page fault exception and calls the -page fault handler +Whenever the kernel tries to access an address that is currently not +accessible, the CPU generates a page fault exception and calls the +page fault handler void do_page_fault(struct pt_regs *regs, unsigned long error_code) -in arch/i386/mm/fault.c. The parameters on the stack are set up by -the low level assembly glue in arch/i386/kernel/entry.S. The parameter -regs is a pointer to the saved registers on the stack, error_code +in arch/x86/mm/fault.c. The parameters on the stack are set up by +the low level assembly glue in arch/x86/kernel/entry_32.S. The parameter +regs is a pointer to the saved registers on the stack, error_code contains a reason code for the exception. -do_page_fault first obtains the unaccessible address from the CPU -control register CR2. If the address is within the virtual address -space of the process, the fault probably occurred, because the page -was not swapped in, write protected or something similar. However, -we are interested in the other case: the address is not valid, there -is no vma that contains this address. In this case, the kernel jumps -to the bad_area label. - -There it uses the address of the instruction that caused the exception -(i.e. regs->eip) to find an address where the execution can continue -(fixup). If this search is successful, the fault handler modifies the -return address (again regs->eip) and returns. The execution will +do_page_fault first obtains the unaccessible address from the CPU +control register CR2. If the address is within the virtual address +space of the process, the fault probably occurred, because the page +was not swapped in, write protected or something similar. However, +we are interested in the other case: the address is not valid, there +is no vma that contains this address. In this case, the kernel jumps +to the bad_area label. + +There it uses the address of the instruction that caused the exception +(i.e. regs->eip) to find an address where the execution can continue +(fixup). If this search is successful, the fault handler modifies the +return address (again regs->eip) and returns. The execution will continue at the address in fixup. Where does fixup point to? -Since we jump to the contents of fixup, fixup obviously points -to executable code. This code is hidden inside the user access macros. -I have picked the get_user macro defined in include/asm/uaccess.h as an -example. The definition is somewhat hard to follow, so let's peek at +Since we jump to the contents of fixup, fixup obviously points +to executable code. This code is hidden inside the user access macros. +I have picked the get_user macro defined in arch/x86/include/asm/uaccess.h +as an example. The definition is somewhat hard to follow, so let's peek at the code generated by the preprocessor and the compiler. I selected -the get_user call in drivers/char/console.c for a detailed examination. +the get_user call in drivers/char/sysrq.c for a detailed examination. -The original code in console.c line 1405: +The original code in sysrq.c line 587: get_user(c, buf); The preprocessor output (edited to become somewhat readable): ( - { - long __gu_err = - 14 , __gu_val = 0; - const __typeof__(*( ( buf ) )) *__gu_addr = ((buf)); - if (((((0 + current_set[0])->tss.segment) == 0x18 ) || - (((sizeof(*(buf))) <= 0xC0000000UL) && - ((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf))))))) + { + long __gu_err = - 14 , __gu_val = 0; + const __typeof__(*( ( buf ) )) *__gu_addr = ((buf)); + if (((((0 + current_set[0])->tss.segment) == 0x18 ) || + (((sizeof(*(buf))) <= 0xC0000000UL) && + ((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf))))))) do { - __gu_err = 0; - switch ((sizeof(*(buf)))) { - case 1: - __asm__ __volatile__( - "1: mov" "b" " %2,%" "b" "1\n" - "2:\n" - ".section .fixup,\"ax\"\n" - "3: movl %3,%0\n" - " xor" "b" " %" "b" "1,%" "b" "1\n" - " jmp 2b\n" - ".section __ex_table,\"a\"\n" - " .align 4\n" - " .long 1b,3b\n" + __gu_err = 0; + switch ((sizeof(*(buf)))) { + case 1: + __asm__ __volatile__( + "1: mov" "b" " %2,%" "b" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "b" " %" "b" "1,%" "b" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" + " .long 1b,3b\n" ".text" : "=r"(__gu_err), "=q" (__gu_val): "m"((*(struct __large_struct *) - ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ; - break; - case 2: + ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ; + break; + case 2: __asm__ __volatile__( - "1: mov" "w" " %2,%" "w" "1\n" - "2:\n" - ".section .fixup,\"ax\"\n" - "3: movl %3,%0\n" - " xor" "w" " %" "w" "1,%" "w" "1\n" - " jmp 2b\n" - ".section __ex_table,\"a\"\n" - " .align 4\n" - " .long 1b,3b\n" + "1: mov" "w" " %2,%" "w" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "w" " %" "w" "1,%" "w" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" + " .long 1b,3b\n" ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *) - ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )); - break; - case 4: - __asm__ __volatile__( - "1: mov" "l" " %2,%" "" "1\n" - "2:\n" - ".section .fixup,\"ax\"\n" - "3: movl %3,%0\n" - " xor" "l" " %" "" "1,%" "" "1\n" - " jmp 2b\n" - ".section __ex_table,\"a\"\n" - " .align 4\n" " .long 1b,3b\n" + ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )); + break; + case 4: + __asm__ __volatile__( + "1: mov" "l" " %2,%" "" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "l" " %" "" "1,%" "" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" " .long 1b,3b\n" ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *) - ( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err)); - break; - default: - (__gu_val) = __get_user_bad(); - } - } while (0) ; - ((c)) = (__typeof__(*((buf))))__gu_val; + ( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err)); + break; + default: + (__gu_val) = __get_user_bad(); + } + } while (0) ; + ((c)) = (__typeof__(*((buf))))__gu_val; __gu_err; } ); @@ -127,12 +127,12 @@ see what code gcc generates: > xorl %edx,%edx > movl current_set,%eax - > cmpl $24,788(%eax) - > je .L1424 + > cmpl $24,788(%eax) + > je .L1424 > cmpl $-1073741825,64(%esp) - > ja .L1423 + > ja .L1423 > .L1424: - > movl %edx,%eax + > movl %edx,%eax > movl 64(%esp),%ebx > #APP > 1: movb (%ebx),%dl /* this is the actual user access */ @@ -149,17 +149,17 @@ see what code gcc generates: > .L1423: > movzbl %dl,%esi -The optimizer does a good job and gives us something we can actually -understand. Can we? The actual user access is quite obvious. Thanks -to the unified address space we can just access the address in user +The optimizer does a good job and gives us something we can actually +understand. Can we? The actual user access is quite obvious. Thanks +to the unified address space we can just access the address in user memory. But what does the .section stuff do????? To understand this we have to look at the final kernel: > objdump --section-headers vmlinux - > + > > vmlinux: file format elf32-i386 - > + > > Sections: > Idx Name Size VMA LMA File off Algn > 0 .text 00098f40 c0100000 c0100000 00001000 2**4 @@ -198,18 +198,18 @@ final kernel executable: The whole user memory access is reduced to 10 x86 machine instructions. The instructions bracketed in the .section directives are no longer -in the normal execution path. They are located in a different section +in the normal execution path. They are located in a different section of the executable file: > objdump --disassemble --section=.fixup vmlinux - > + > > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax > c0199ffa <.fixup+10ba> xorb %dl,%dl > c0199ffc <.fixup+10bc> jmp c017e7a7 And finally: > objdump --full-contents --section=__ex_table vmlinux - > + > > c01aa7c4 93c017c0 e09f19c0 97c017c0 99c017c0 ................ > c01aa7d4 f6c217c0 e99f19c0 a5e717c0 f59f19c0 ................ > c01aa7e4 080a18c0 01a019c0 0a0a18c0 04a019c0 ................ @@ -235,8 +235,8 @@ sections in the ELF object file. So the instructions ended up in the .fixup section of the object file and the addresses .long 1b,3b ended up in the __ex_table section of the object file. 1b and 3b -are local labels. The local label 1b (1b stands for next label 1 -backward) is the address of the instruction that might fault, i.e. +are local labels. The local label 1b (1b stands for next label 1 +backward) is the address of the instruction that might fault, i.e. in our case the address of the label 1 is c017e7a5: the original assembly code: > 1: movb (%ebx),%dl and linked in vmlinux : > c017e7a5 movb (%ebx),%dl @@ -254,7 +254,7 @@ The assembly code becomes the value pair > c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................ ^this is ^this is - 1b 3b + 1b 3b c017e7a5,c0199ff5 in the exception table of the kernel. So, what actually happens if a fault from kernel mode with no suitable @@ -266,9 +266,9 @@ vma occurs? 3.) CPU calls do_page_fault 4.) do page fault calls search_exception_table (regs->eip == c017e7a5); 5.) search_exception_table looks up the address c017e7a5 in the - exception table (i.e. the contents of the ELF section __ex_table) + exception table (i.e. the contents of the ELF section __ex_table) and returns the address of the associated fault handle code c0199ff5. -6.) do_page_fault modifies its own return address to point to the fault +6.) do_page_fault modifies its own return address to point to the fault handle code and returns. 7.) execution continues in the fault handling code. 8.) 8a) EAX becomes -EFAULT (== -14) -- cgit v1.2.2 From c368b4921bc6e309aba2fbee0efcbbc965008d9f Mon Sep 17 00:00:00 2001 From: Amerigo Wang Date: Fri, 10 Jul 2009 15:02:44 -0700 Subject: Doc: move Documentation/exception.txt into x86 subdir exception.txt only explains the code on x86, so it's better to move it into Documentation/x86 directory. And also rename it to exception-tables.txt which looks much more reasonable. This patch is on top of the previous one. Signed-off-by: WANG Cong Signed-off-by: Randy Dunlap Signed-off-by: Linus Torvalds --- Documentation/exception.txt | 292 --------------------------------- Documentation/x86/00-INDEX | 2 + Documentation/x86/exception-tables.txt | 292 +++++++++++++++++++++++++++++++++ 3 files changed, 294 insertions(+), 292 deletions(-) delete mode 100644 Documentation/exception.txt create mode 100644 Documentation/x86/exception-tables.txt (limited to 'Documentation') diff --git a/Documentation/exception.txt b/Documentation/exception.txt deleted file mode 100644 index 32901aa36f0a..000000000000 --- a/Documentation/exception.txt +++ /dev/null @@ -1,292 +0,0 @@ - Kernel level exception handling in Linux - Commentary by Joerg Pommnitz - -When a process runs in kernel mode, it often has to access user -mode memory whose address has been passed by an untrusted program. -To protect itself the kernel has to verify this address. - -In older versions of Linux this was done with the -int verify_area(int type, const void * addr, unsigned long size) -function (which has since been replaced by access_ok()). - -This function verified that the memory area starting at address -'addr' and of size 'size' was accessible for the operation specified -in type (read or write). To do this, verify_read had to look up the -virtual memory area (vma) that contained the address addr. In the -normal case (correctly working program), this test was successful. -It only failed for a few buggy programs. In some kernel profiling -tests, this normally unneeded verification used up a considerable -amount of time. - -To overcome this situation, Linus decided to let the virtual memory -hardware present in every Linux-capable CPU handle this test. - -How does this work? - -Whenever the kernel tries to access an address that is currently not -accessible, the CPU generates a page fault exception and calls the -page fault handler - -void do_page_fault(struct pt_regs *regs, unsigned long error_code) - -in arch/x86/mm/fault.c. The parameters on the stack are set up by -the low level assembly glue in arch/x86/kernel/entry_32.S. The parameter -regs is a pointer to the saved registers on the stack, error_code -contains a reason code for the exception. - -do_page_fault first obtains the unaccessible address from the CPU -control register CR2. If the address is within the virtual address -space of the process, the fault probably occurred, because the page -was not swapped in, write protected or something similar. However, -we are interested in the other case: the address is not valid, there -is no vma that contains this address. In this case, the kernel jumps -to the bad_area label. - -There it uses the address of the instruction that caused the exception -(i.e. regs->eip) to find an address where the execution can continue -(fixup). If this search is successful, the fault handler modifies the -return address (again regs->eip) and returns. The execution will -continue at the address in fixup. - -Where does fixup point to? - -Since we jump to the contents of fixup, fixup obviously points -to executable code. This code is hidden inside the user access macros. -I have picked the get_user macro defined in arch/x86/include/asm/uaccess.h -as an example. The definition is somewhat hard to follow, so let's peek at -the code generated by the preprocessor and the compiler. I selected -the get_user call in drivers/char/sysrq.c for a detailed examination. - -The original code in sysrq.c line 587: - get_user(c, buf); - -The preprocessor output (edited to become somewhat readable): - -( - { - long __gu_err = - 14 , __gu_val = 0; - const __typeof__(*( ( buf ) )) *__gu_addr = ((buf)); - if (((((0 + current_set[0])->tss.segment) == 0x18 ) || - (((sizeof(*(buf))) <= 0xC0000000UL) && - ((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf))))))) - do { - __gu_err = 0; - switch ((sizeof(*(buf)))) { - case 1: - __asm__ __volatile__( - "1: mov" "b" " %2,%" "b" "1\n" - "2:\n" - ".section .fixup,\"ax\"\n" - "3: movl %3,%0\n" - " xor" "b" " %" "b" "1,%" "b" "1\n" - " jmp 2b\n" - ".section __ex_table,\"a\"\n" - " .align 4\n" - " .long 1b,3b\n" - ".text" : "=r"(__gu_err), "=q" (__gu_val): "m"((*(struct __large_struct *) - ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ; - break; - case 2: - __asm__ __volatile__( - "1: mov" "w" " %2,%" "w" "1\n" - "2:\n" - ".section .fixup,\"ax\"\n" - "3: movl %3,%0\n" - " xor" "w" " %" "w" "1,%" "w" "1\n" - " jmp 2b\n" - ".section __ex_table,\"a\"\n" - " .align 4\n" - " .long 1b,3b\n" - ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *) - ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )); - break; - case 4: - __asm__ __volatile__( - "1: mov" "l" " %2,%" "" "1\n" - "2:\n" - ".section .fixup,\"ax\"\n" - "3: movl %3,%0\n" - " xor" "l" " %" "" "1,%" "" "1\n" - " jmp 2b\n" - ".section __ex_table,\"a\"\n" - " .align 4\n" " .long 1b,3b\n" - ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *) - ( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err)); - break; - default: - (__gu_val) = __get_user_bad(); - } - } while (0) ; - ((c)) = (__typeof__(*((buf))))__gu_val; - __gu_err; - } -); - -WOW! Black GCC/assembly magic. This is impossible to follow, so let's -see what code gcc generates: - - > xorl %edx,%edx - > movl current_set,%eax - > cmpl $24,788(%eax) - > je .L1424 - > cmpl $-1073741825,64(%esp) - > ja .L1423 - > .L1424: - > movl %edx,%eax - > movl 64(%esp),%ebx - > #APP - > 1: movb (%ebx),%dl /* this is the actual user access */ - > 2: - > .section .fixup,"ax" - > 3: movl $-14,%eax - > xorb %dl,%dl - > jmp 2b - > .section __ex_table,"a" - > .align 4 - > .long 1b,3b - > .text - > #NO_APP - > .L1423: - > movzbl %dl,%esi - -The optimizer does a good job and gives us something we can actually -understand. Can we? The actual user access is quite obvious. Thanks -to the unified address space we can just access the address in user -memory. But what does the .section stuff do????? - -To understand this we have to look at the final kernel: - - > objdump --section-headers vmlinux - > - > vmlinux: file format elf32-i386 - > - > Sections: - > Idx Name Size VMA LMA File off Algn - > 0 .text 00098f40 c0100000 c0100000 00001000 2**4 - > CONTENTS, ALLOC, LOAD, READONLY, CODE - > 1 .fixup 000016bc c0198f40 c0198f40 00099f40 2**0 - > CONTENTS, ALLOC, LOAD, READONLY, CODE - > 2 .rodata 0000f127 c019a5fc c019a5fc 0009b5fc 2**2 - > CONTENTS, ALLOC, LOAD, READONLY, DATA - > 3 __ex_table 000015c0 c01a9724 c01a9724 000aa724 2**2 - > CONTENTS, ALLOC, LOAD, READONLY, DATA - > 4 .data 0000ea58 c01abcf0 c01abcf0 000abcf0 2**4 - > CONTENTS, ALLOC, LOAD, DATA - > 5 .bss 00018e21 c01ba748 c01ba748 000ba748 2**2 - > ALLOC - > 6 .comment 00000ec4 00000000 00000000 000ba748 2**0 - > CONTENTS, READONLY - > 7 .note 00001068 00000ec4 00000ec4 000bb60c 2**0 - > CONTENTS, READONLY - -There are obviously 2 non standard ELF sections in the generated object -file. But first we want to find out what happened to our code in the -final kernel executable: - - > objdump --disassemble --section=.text vmlinux - > - > c017e785 xorl %edx,%edx - > c017e787 movl 0xc01c7bec,%eax - > c017e78c cmpl $0x18,0x314(%eax) - > c017e793 je c017e79f - > c017e795 cmpl $0xbfffffff,0x40(%esp,1) - > c017e79d ja c017e7a7 - > c017e79f movl %edx,%eax - > c017e7a1 movl 0x40(%esp,1),%ebx - > c017e7a5 movb (%ebx),%dl - > c017e7a7 movzbl %dl,%esi - -The whole user memory access is reduced to 10 x86 machine instructions. -The instructions bracketed in the .section directives are no longer -in the normal execution path. They are located in a different section -of the executable file: - - > objdump --disassemble --section=.fixup vmlinux - > - > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax - > c0199ffa <.fixup+10ba> xorb %dl,%dl - > c0199ffc <.fixup+10bc> jmp c017e7a7 - -And finally: - > objdump --full-contents --section=__ex_table vmlinux - > - > c01aa7c4 93c017c0 e09f19c0 97c017c0 99c017c0 ................ - > c01aa7d4 f6c217c0 e99f19c0 a5e717c0 f59f19c0 ................ - > c01aa7e4 080a18c0 01a019c0 0a0a18c0 04a019c0 ................ - -or in human readable byte order: - - > c01aa7c4 c017c093 c0199fe0 c017c097 c017c099 ................ - > c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................ - ^^^^^^^^^^^^^^^^^ - this is the interesting part! - > c01aa7e4 c0180a08 c019a001 c0180a0a c019a004 ................ - -What happened? The assembly directives - -.section .fixup,"ax" -.section __ex_table,"a" - -told the assembler to move the following code to the specified -sections in the ELF object file. So the instructions -3: movl $-14,%eax - xorb %dl,%dl - jmp 2b -ended up in the .fixup section of the object file and the addresses - .long 1b,3b -ended up in the __ex_table section of the object file. 1b and 3b -are local labels. The local label 1b (1b stands for next label 1 -backward) is the address of the instruction that might fault, i.e. -in our case the address of the label 1 is c017e7a5: -the original assembly code: > 1: movb (%ebx),%dl -and linked in vmlinux : > c017e7a5 movb (%ebx),%dl - -The local label 3 (backwards again) is the address of the code to handle -the fault, in our case the actual value is c0199ff5: -the original assembly code: > 3: movl $-14,%eax -and linked in vmlinux : > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax - -The assembly code - > .section __ex_table,"a" - > .align 4 - > .long 1b,3b - -becomes the value pair - > c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................ - ^this is ^this is - 1b 3b -c017e7a5,c0199ff5 in the exception table of the kernel. - -So, what actually happens if a fault from kernel mode with no suitable -vma occurs? - -1.) access to invalid address: - > c017e7a5 movb (%ebx),%dl -2.) MMU generates exception -3.) CPU calls do_page_fault -4.) do page fault calls search_exception_table (regs->eip == c017e7a5); -5.) search_exception_table looks up the address c017e7a5 in the - exception table (i.e. the contents of the ELF section __ex_table) - and returns the address of the associated fault handle code c0199ff5. -6.) do_page_fault modifies its own return address to point to the fault - handle code and returns. -7.) execution continues in the fault handling code. -8.) 8a) EAX becomes -EFAULT (== -14) - 8b) DL becomes zero (the value we "read" from user space) - 8c) execution continues at local label 2 (address of the - instruction immediately after the faulting user access). - -The steps 8a to 8c in a certain way emulate the faulting instruction. - -That's it, mostly. If you look at our example, you might ask why -we set EAX to -EFAULT in the exception handler code. Well, the -get_user macro actually returns a value: 0, if the user access was -successful, -EFAULT on failure. Our original code did not test this -return value, however the inline assembly code in get_user tries to -return -EFAULT. GCC selected EAX to return this value. - -NOTE: -Due to the way that the exception table is built and needs to be ordered, -only use exceptions for code in the .text section. Any other section -will cause the exception table to not be sorted correctly, and the -exceptions will fail. diff --git a/Documentation/x86/00-INDEX b/Documentation/x86/00-INDEX index dbe3377754af..f37b46d34861 100644 --- a/Documentation/x86/00-INDEX +++ b/Documentation/x86/00-INDEX @@ -2,3 +2,5 @@ - this file mtrr.txt - how to use x86 Memory Type Range Registers to increase performance +exception-tables.txt + - why and how Linux kernel uses exception tables on x86 diff --git a/Documentation/x86/exception-tables.txt b/Documentation/x86/exception-tables.txt new file mode 100644 index 000000000000..32901aa36f0a --- /dev/null +++ b/Documentation/x86/exception-tables.txt @@ -0,0 +1,292 @@ + Kernel level exception handling in Linux + Commentary by Joerg Pommnitz + +When a process runs in kernel mode, it often has to access user +mode memory whose address has been passed by an untrusted program. +To protect itself the kernel has to verify this address. + +In older versions of Linux this was done with the +int verify_area(int type, const void * addr, unsigned long size) +function (which has since been replaced by access_ok()). + +This function verified that the memory area starting at address +'addr' and of size 'size' was accessible for the operation specified +in type (read or write). To do this, verify_read had to look up the +virtual memory area (vma) that contained the address addr. In the +normal case (correctly working program), this test was successful. +It only failed for a few buggy programs. In some kernel profiling +tests, this normally unneeded verification used up a considerable +amount of time. + +To overcome this situation, Linus decided to let the virtual memory +hardware present in every Linux-capable CPU handle this test. + +How does this work? + +Whenever the kernel tries to access an address that is currently not +accessible, the CPU generates a page fault exception and calls the +page fault handler + +void do_page_fault(struct pt_regs *regs, unsigned long error_code) + +in arch/x86/mm/fault.c. The parameters on the stack are set up by +the low level assembly glue in arch/x86/kernel/entry_32.S. The parameter +regs is a pointer to the saved registers on the stack, error_code +contains a reason code for the exception. + +do_page_fault first obtains the unaccessible address from the CPU +control register CR2. If the address is within the virtual address +space of the process, the fault probably occurred, because the page +was not swapped in, write protected or something similar. However, +we are interested in the other case: the address is not valid, there +is no vma that contains this address. In this case, the kernel jumps +to the bad_area label. + +There it uses the address of the instruction that caused the exception +(i.e. regs->eip) to find an address where the execution can continue +(fixup). If this search is successful, the fault handler modifies the +return address (again regs->eip) and returns. The execution will +continue at the address in fixup. + +Where does fixup point to? + +Since we jump to the contents of fixup, fixup obviously points +to executable code. This code is hidden inside the user access macros. +I have picked the get_user macro defined in arch/x86/include/asm/uaccess.h +as an example. The definition is somewhat hard to follow, so let's peek at +the code generated by the preprocessor and the compiler. I selected +the get_user call in drivers/char/sysrq.c for a detailed examination. + +The original code in sysrq.c line 587: + get_user(c, buf); + +The preprocessor output (edited to become somewhat readable): + +( + { + long __gu_err = - 14 , __gu_val = 0; + const __typeof__(*( ( buf ) )) *__gu_addr = ((buf)); + if (((((0 + current_set[0])->tss.segment) == 0x18 ) || + (((sizeof(*(buf))) <= 0xC0000000UL) && + ((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf))))))) + do { + __gu_err = 0; + switch ((sizeof(*(buf)))) { + case 1: + __asm__ __volatile__( + "1: mov" "b" " %2,%" "b" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "b" " %" "b" "1,%" "b" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" + " .long 1b,3b\n" + ".text" : "=r"(__gu_err), "=q" (__gu_val): "m"((*(struct __large_struct *) + ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ; + break; + case 2: + __asm__ __volatile__( + "1: mov" "w" " %2,%" "w" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "w" " %" "w" "1,%" "w" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" + " .long 1b,3b\n" + ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *) + ( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )); + break; + case 4: + __asm__ __volatile__( + "1: mov" "l" " %2,%" "" "1\n" + "2:\n" + ".section .fixup,\"ax\"\n" + "3: movl %3,%0\n" + " xor" "l" " %" "" "1,%" "" "1\n" + " jmp 2b\n" + ".section __ex_table,\"a\"\n" + " .align 4\n" " .long 1b,3b\n" + ".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *) + ( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err)); + break; + default: + (__gu_val) = __get_user_bad(); + } + } while (0) ; + ((c)) = (__typeof__(*((buf))))__gu_val; + __gu_err; + } +); + +WOW! Black GCC/assembly magic. This is impossible to follow, so let's +see what code gcc generates: + + > xorl %edx,%edx + > movl current_set,%eax + > cmpl $24,788(%eax) + > je .L1424 + > cmpl $-1073741825,64(%esp) + > ja .L1423 + > .L1424: + > movl %edx,%eax + > movl 64(%esp),%ebx + > #APP + > 1: movb (%ebx),%dl /* this is the actual user access */ + > 2: + > .section .fixup,"ax" + > 3: movl $-14,%eax + > xorb %dl,%dl + > jmp 2b + > .section __ex_table,"a" + > .align 4 + > .long 1b,3b + > .text + > #NO_APP + > .L1423: + > movzbl %dl,%esi + +The optimizer does a good job and gives us something we can actually +understand. Can we? The actual user access is quite obvious. Thanks +to the unified address space we can just access the address in user +memory. But what does the .section stuff do????? + +To understand this we have to look at the final kernel: + + > objdump --section-headers vmlinux + > + > vmlinux: file format elf32-i386 + > + > Sections: + > Idx Name Size VMA LMA File off Algn + > 0 .text 00098f40 c0100000 c0100000 00001000 2**4 + > CONTENTS, ALLOC, LOAD, READONLY, CODE + > 1 .fixup 000016bc c0198f40 c0198f40 00099f40 2**0 + > CONTENTS, ALLOC, LOAD, READONLY, CODE + > 2 .rodata 0000f127 c019a5fc c019a5fc 0009b5fc 2**2 + > CONTENTS, ALLOC, LOAD, READONLY, DATA + > 3 __ex_table 000015c0 c01a9724 c01a9724 000aa724 2**2 + > CONTENTS, ALLOC, LOAD, READONLY, DATA + > 4 .data 0000ea58 c01abcf0 c01abcf0 000abcf0 2**4 + > CONTENTS, ALLOC, LOAD, DATA + > 5 .bss 00018e21 c01ba748 c01ba748 000ba748 2**2 + > ALLOC + > 6 .comment 00000ec4 00000000 00000000 000ba748 2**0 + > CONTENTS, READONLY + > 7 .note 00001068 00000ec4 00000ec4 000bb60c 2**0 + > CONTENTS, READONLY + +There are obviously 2 non standard ELF sections in the generated object +file. But first we want to find out what happened to our code in the +final kernel executable: + + > objdump --disassemble --section=.text vmlinux + > + > c017e785 xorl %edx,%edx + > c017e787 movl 0xc01c7bec,%eax + > c017e78c cmpl $0x18,0x314(%eax) + > c017e793 je c017e79f + > c017e795 cmpl $0xbfffffff,0x40(%esp,1) + > c017e79d ja c017e7a7 + > c017e79f movl %edx,%eax + > c017e7a1 movl 0x40(%esp,1),%ebx + > c017e7a5 movb (%ebx),%dl + > c017e7a7 movzbl %dl,%esi + +The whole user memory access is reduced to 10 x86 machine instructions. +The instructions bracketed in the .section directives are no longer +in the normal execution path. They are located in a different section +of the executable file: + + > objdump --disassemble --section=.fixup vmlinux + > + > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax + > c0199ffa <.fixup+10ba> xorb %dl,%dl + > c0199ffc <.fixup+10bc> jmp c017e7a7 + +And finally: + > objdump --full-contents --section=__ex_table vmlinux + > + > c01aa7c4 93c017c0 e09f19c0 97c017c0 99c017c0 ................ + > c01aa7d4 f6c217c0 e99f19c0 a5e717c0 f59f19c0 ................ + > c01aa7e4 080a18c0 01a019c0 0a0a18c0 04a019c0 ................ + +or in human readable byte order: + + > c01aa7c4 c017c093 c0199fe0 c017c097 c017c099 ................ + > c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................ + ^^^^^^^^^^^^^^^^^ + this is the interesting part! + > c01aa7e4 c0180a08 c019a001 c0180a0a c019a004 ................ + +What happened? The assembly directives + +.section .fixup,"ax" +.section __ex_table,"a" + +told the assembler to move the following code to the specified +sections in the ELF object file. So the instructions +3: movl $-14,%eax + xorb %dl,%dl + jmp 2b +ended up in the .fixup section of the object file and the addresses + .long 1b,3b +ended up in the __ex_table section of the object file. 1b and 3b +are local labels. The local label 1b (1b stands for next label 1 +backward) is the address of the instruction that might fault, i.e. +in our case the address of the label 1 is c017e7a5: +the original assembly code: > 1: movb (%ebx),%dl +and linked in vmlinux : > c017e7a5 movb (%ebx),%dl + +The local label 3 (backwards again) is the address of the code to handle +the fault, in our case the actual value is c0199ff5: +the original assembly code: > 3: movl $-14,%eax +and linked in vmlinux : > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax + +The assembly code + > .section __ex_table,"a" + > .align 4 + > .long 1b,3b + +becomes the value pair + > c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................ + ^this is ^this is + 1b 3b +c017e7a5,c0199ff5 in the exception table of the kernel. + +So, what actually happens if a fault from kernel mode with no suitable +vma occurs? + +1.) access to invalid address: + > c017e7a5 movb (%ebx),%dl +2.) MMU generates exception +3.) CPU calls do_page_fault +4.) do page fault calls search_exception_table (regs->eip == c017e7a5); +5.) search_exception_table looks up the address c017e7a5 in the + exception table (i.e. the contents of the ELF section __ex_table) + and returns the address of the associated fault handle code c0199ff5. +6.) do_page_fault modifies its own return address to point to the fault + handle code and returns. +7.) execution continues in the fault handling code. +8.) 8a) EAX becomes -EFAULT (== -14) + 8b) DL becomes zero (the value we "read" from user space) + 8c) execution continues at local label 2 (address of the + instruction immediately after the faulting user access). + +The steps 8a to 8c in a certain way emulate the faulting instruction. + +That's it, mostly. If you look at our example, you might ask why +we set EAX to -EFAULT in the exception handler code. Well, the +get_user macro actually returns a value: 0, if the user access was +successful, -EFAULT on failure. Our original code did not test this +return value, however the inline assembly code in get_user tries to +return -EFAULT. GCC selected EAX to return this value. + +NOTE: +Due to the way that the exception table is built and needs to be ordered, +only use exceptions for code in the .text section. Any other section +will cause the exception table to not be sorted correctly, and the +exceptions will fail. -- cgit v1.2.2 From 909662e1e7290945fa3bca038bc3b7bb5d19499f Mon Sep 17 00:00:00 2001 From: vibi sreenivasan Date: Wed, 8 Jul 2009 15:37:03 -0700 Subject: driver model: fix show/store prototypes in doc. FIX prototypes for show & store method in struct driver_attribute Signed-off-by: vibi sreenivasan Signed-off-by: Randy Dunlap Signed-off-by: Greg Kroah-Hartman --- Documentation/driver-model/driver.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/driver-model/driver.txt b/Documentation/driver-model/driver.txt index 82132169d47a..60120fb3b961 100644 --- a/Documentation/driver-model/driver.txt +++ b/Documentation/driver-model/driver.txt @@ -207,8 +207,8 @@ Attributes ~~~~~~~~~~ struct driver_attribute { struct attribute attr; - ssize_t (*show)(struct device_driver *, char * buf, size_t count, loff_t off); - ssize_t (*store)(struct device_driver *, const char * buf, size_t count, loff_t off); + ssize_t (*show)(struct device_driver *driver, char *buf); + ssize_t (*store)(struct device_driver *, const char * buf, size_t count); }; Device drivers can export attributes via their sysfs directories. -- cgit v1.2.2