Diffstat (limited to 'Documentation')
31 files changed, 1257 insertions, 49 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block index 4873c759d535..c1eb41cb9876 100644 --- a/Documentation/ABI/testing/sysfs-block +++ b/Documentation/ABI/testing/sysfs-block | |||
| @@ -142,3 +142,67 @@ Description: | |||
| 142 | with the previous I/O request are enabled. When set to 2, | 142 | with the previous I/O request are enabled. When set to 2, |
| 143 | all merge tries are disabled. The default value is 0 - | 143 | all merge tries are disabled. The default value is 0 - |
| 144 | which enables all types of merge tries. | 144 | which enables all types of merge tries. |
| 145 | |||
| 146 | What: /sys/block/<disk>/discard_alignment | ||
| 147 | Date: May 2011 | ||
| 148 | Contact: Martin K. Petersen <martin.petersen@oracle.com> | ||
| 149 | Description: | ||
| 150 | Devices that support discard functionality may | ||
| 151 | internally allocate space in units that are bigger than | ||
| 152 | the exported logical block size. The discard_alignment | ||
| 153 | parameter indicates how many bytes the beginning of the | ||
| 154 | device is offset from the internal allocation unit's | ||
| 155 | natural alignment. | ||
| 156 | |||
| 157 | What: /sys/block/<disk>/<partition>/discard_alignment | ||
| 158 | Date: May 2011 | ||
| 159 | Contact: Martin K. Petersen <martin.petersen@oracle.com> | ||
| 160 | Description: | ||
| 161 | Devices that support discard functionality may | ||
| 162 | internally allocate space in units that are bigger than | ||
| 163 | the exported logical block size. The discard_alignment | ||
| 164 | parameter indicates how many bytes the beginning of the | ||
| 165 | partition is offset from the internal allocation unit's | ||
| 166 | natural alignment. | ||
| 167 | |||
| 168 | What: /sys/block/<disk>/queue/discard_granularity | ||
| 169 | Date: May 2011 | ||
| 170 | Contact: Martin K. Petersen <martin.petersen@oracle.com> | ||
| 171 | Description: | ||
| 172 | Devices that support discard functionality may | ||
| 173 | internally allocate space using units that are bigger | ||
| 174 | than the logical block size. The discard_granularity | ||
| 175 | parameter indicates the size of the internal allocation | ||
| 176 | unit in bytes if reported by the device. Otherwise the | ||
| 177 | discard_granularity will be set to match the device's | ||
| 178 | physical block size. A discard_granularity of 0 means | ||
| 179 | that the device does not support discard functionality. | ||
| 180 | |||
| 181 | What: /sys/block/<disk>/queue/discard_max_bytes | ||
| 182 | Date: May 2011 | ||
| 183 | Contact: Martin K. Petersen <martin.petersen@oracle.com> | ||
| 184 | Description: | ||
| 185 | Devices that support discard functionality may have | ||
| 186 | internal limits on the number of bytes that can be | ||
| 187 | trimmed or unmapped in a single operation. Some storage | ||
| 188 | protocols also have inherent limits on the number of | ||
| 189 | blocks that can be described in a single command. The | ||
| 190 | discard_max_bytes parameter is set by the device driver | ||
| 191 | to the maximum number of bytes that can be discarded in | ||
| 192 | a single operation. Discard requests issued to the | ||
| 193 | device must not exceed this limit. A discard_max_bytes | ||
| 194 | value of 0 means that the device does not support | ||
| 195 | discard functionality. | ||
| 196 | |||
| 197 | What: /sys/block/<disk>/queue/discard_zeroes_data | ||
| 198 | Date: May 2011 | ||
| 199 | Contact: Martin K. Petersen <martin.petersen@oracle.com> | ||
| 200 | Description: | ||
| 201 | Devices that support discard functionality may return | ||
| 202 | stale or random data when a previously discarded block | ||
| 203 | is read back. This can cause problems if the filesystem | ||
| 204 | expects discarded blocks to be explicitly cleared. If a | ||
| 205 | device reports that it deterministically returns zeroes | ||
| 206 | when a discarded area is read the discard_zeroes_data | ||
| 207 | parameter will be set to one. Otherwise it will be 0 and | ||
| 208 | the result of reading a discarded area is undefined. | ||
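The discard attributes described above can be combined when preparing a discard request. The following is a minimal sketch, not the kernel's own logic: the function names are hypothetical, and it assumes that allocation-unit boundaries are the byte offsets b for which b % discard_granularity equals discard_alignment. It checks for discard support, trims a byte range to whole allocation units, and splits it into pieces no larger than discard_max_bytes.

```python
def discard_supported(granularity, max_bytes):
    """Per the descriptions above, a value of 0 in either attribute
    means the device does not support discard."""
    return granularity > 0 and max_bytes > 0


def shrink_to_granularity(start, length, granularity, alignment=0):
    """Trim [start, start + length) to the largest inner range whose ends
    lie on allocation-unit boundaries.

    ASSUMPTION (not stated in the source): a boundary is any byte offset b
    with b % granularity == alignment % granularity.
    Returns (start, length) or None if no whole unit fits.
    """
    if granularity <= 0:
        return None
    end = start + length
    a = alignment % granularity
    first = start + (a - start) % granularity  # round start up to a boundary
    last = end - (end - a) % granularity       # round end down to a boundary
    if last <= first:
        return None
    return first, last - first


def split_discard(start, length, max_bytes):
    """Yield (offset, size) pieces, each no larger than max_bytes."""
    if max_bytes <= 0:
        raise ValueError("device does not support discard")
    end = start + length
    while start < end:
        size = min(max_bytes, end - start)
        yield start, size
        start += size
```

In practice the three values would be read from /sys/block/<disk>/discard_alignment, queue/discard_granularity and queue/discard_max_bytes; the sketch takes them as plain integers so it can be tried without a block device.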
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cleancache b/Documentation/ABI/testing/sysfs-kernel-mm-cleancache new file mode 100644 index 000000000000..662ae646ea12 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-kernel-mm-cleancache | |||
| @@ -0,0 +1,11 @@ | |||
| 1 | What: /sys/kernel/mm/cleancache/ | ||
| 2 | Date: April 2011 | ||
| 3 | Contact: Dan Magenheimer <dan.magenheimer@oracle.com> | ||
| 4 | Description: | ||
| 5 | /sys/kernel/mm/cleancache/ contains a number of files which | ||
| 6 | record a count of various cleancache operations | ||
| 7 | (sum across all filesystems): | ||
| 8 | succ_gets | ||
| 9 | failed_gets | ||
| 10 | puts | ||
| 11 | flushes | ||
diff --git a/Documentation/ABI/testing/sysfs-ptp b/Documentation/ABI/testing/sysfs-ptp new file mode 100644 index 000000000000..d40d2b550502 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-ptp | |||
| @@ -0,0 +1,98 @@ | |||
| 1 | What: /sys/class/ptp/ | ||
| 2 | Date: September 2010 | ||
| 3 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 4 | Description: | ||
| 5 | This directory contains files and directories | ||
| 6 | providing a standardized interface to the ancillary | ||
| 7 | features of PTP hardware clocks. | ||
| 8 | |||
| 9 | What: /sys/class/ptp/ptpN/ | ||
| 10 | Date: September 2010 | ||
| 11 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 12 | Description: | ||
| 13 | This directory contains the attributes of the Nth PTP | ||
| 14 | hardware clock registered into the PTP class driver | ||
| 15 | subsystem. | ||
| 16 | |||
| 17 | What: /sys/class/ptp/ptpN/clock_name | ||
| 18 | Date: September 2010 | ||
| 19 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 20 | Description: | ||
| 21 | This file contains the name of the PTP hardware clock | ||
| 22 | as a human readable string. | ||
| 23 | |||
| 24 | What: /sys/class/ptp/ptpN/max_adjustment | ||
| 25 | Date: September 2010 | ||
| 26 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 27 | Description: | ||
| 28 | This file contains the PTP hardware clock's maximum | ||
| 29 | frequency adjustment value (a positive integer) in | ||
| 30 | parts per billion. | ||
| 31 | |||
| 32 | What: /sys/class/ptp/ptpN/n_alarms | ||
| 33 | Date: September 2010 | ||
| 34 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 35 | Description: | ||
| 36 | This file contains the number of periodic or one shot | ||
| 37 | alarms offered by the PTP hardware clock. | ||
| 38 | |||
| 39 | What: /sys/class/ptp/ptpN/n_external_timestamps | ||
| 40 | Date: September 2010 | ||
| 41 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 42 | Description: | ||
| 43 | This file contains the number of external timestamp | ||
| 44 | channels offered by the PTP hardware clock. | ||
| 45 | |||
| 46 | What: /sys/class/ptp/ptpN/n_periodic_outputs | ||
| 47 | Date: September 2010 | ||
| 48 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 49 | Description: | ||
| 50 | This file contains the number of programmable periodic | ||
| 51 | output channels offered by the PTP hardware clock. | ||
| 52 | |||
| 53 | What: /sys/class/ptp/ptpN/pps_available | ||
| 54 | Date: September 2010 | ||
| 55 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 56 | Description: | ||
| 57 | This file indicates whether the PTP hardware clock | ||
| 58 | supports a Pulse Per Second to the host CPU. Reading | ||
| 59 | "1" means that the PPS is supported, while "0" means | ||
| 60 | not supported. | ||
| 61 | |||
| 62 | What: /sys/class/ptp/ptpN/extts_enable | ||
| 63 | Date: September 2010 | ||
| 64 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 65 | Description: | ||
| 66 | This write-only file enables or disables external | ||
| 67 | timestamps. To enable external timestamps, write the | ||
| 68 | channel index followed by a "1" into the file. | ||
| 69 | To disable external timestamps, write the channel | ||
| 70 | index followed by a "0" into the file. | ||
| 71 | |||
| 72 | What: /sys/class/ptp/ptpN/fifo | ||
| 73 | Date: September 2010 | ||
| 74 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 75 | Description: | ||
| 76 | This file provides timestamps on external events, in | ||
| 77 | the form of three integers: channel index, seconds, | ||
| 78 | and nanoseconds. | ||
| 79 | |||
| 80 | What: /sys/class/ptp/ptpN/period | ||
| 81 | Date: September 2010 | ||
| 82 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 83 | Description: | ||
| 84 | This write-only file enables or disables periodic | ||
| 85 | outputs. To enable a periodic output, write five | ||
| 86 | integers into the file: channel index, start time | ||
| 87 | seconds, start time nanoseconds, period seconds, and | ||
| 88 | period nanoseconds. To disable a periodic output, set | ||
| 89 | all the seconds and nanoseconds values to zero. | ||
| 90 | |||
| 91 | What: /sys/class/ptp/ptpN/pps_enable | ||
| 92 | Date: September 2010 | ||
| 93 | Contact: Richard Cochran <richardcochran@gmail.com> | ||
| 94 | Description: | ||
| 95 | This write-only file enables or disables delivery of | ||
| 96 | PPS events to the Linux PPS subsystem. To enable PPS | ||
| 97 | events, write a "1" into the file. To disable events, | ||
| 98 | write a "0" into the file. | ||
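The write-only PTP attributes above take short lists of integers. The helpers below are a hedged sketch of how those strings could be built and how a fifo entry could be parsed; the function names are hypothetical, and separating the integers with single spaces is an assumption, since the text above only says which integers to write and in what order.

```python
def extts_command(channel, enable):
    """String for extts_enable: channel index followed by 1 or 0."""
    return f"{channel} {1 if enable else 0}"


def period_command(channel, start_sec, start_nsec, period_sec, period_nsec):
    """String for period: the five integers in the order listed above."""
    return f"{channel} {start_sec} {start_nsec} {period_sec} {period_nsec}"


def period_disable_command(channel):
    """Disabling an output sets all seconds and nanoseconds values to zero."""
    return period_command(channel, 0, 0, 0, 0)


def parse_fifo_line(line):
    """Parse one fifo entry into (channel index, seconds, nanoseconds)."""
    channel, sec, nsec = (int(field) for field in line.split())
    return channel, sec, nsec


def write_attr(ptp_dir, name, text):
    """Write text to an attribute, e.g. write_attr('/sys/class/ptp/ptp0',
    'extts_enable', extts_command(0, True)). Needs a real PTP clock."""
    with open(f"{ptp_dir}/{name}", "w") as f:
        f.write(text)
```

Keeping the string construction separate from the file write makes the formatting easy to check on a machine without PTP hardware.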
diff --git a/Documentation/IRQ-affinity.txt b/Documentation/IRQ-affinity.txt index b4a615b78403..7890fae18529 100644 --- a/Documentation/IRQ-affinity.txt +++ b/Documentation/IRQ-affinity.txt | |||
| @@ -4,10 +4,11 @@ ChangeLog: | |||
| 4 | 4 | ||
| 5 | SMP IRQ affinity | 5 | SMP IRQ affinity |
| 6 | 6 | ||
| 7 | /proc/irq/IRQ#/smp_affinity specifies which target CPUs are permitted | 7 | /proc/irq/IRQ#/smp_affinity and /proc/irq/IRQ#/smp_affinity_list specify |
| 8 | for a given IRQ source. It's a bitmask of allowed CPUs. It's not allowed | 8 | which target CPUs are permitted for a given IRQ source. It's a bitmask |
| 9 | to turn off all CPUs, and if an IRQ controller does not support IRQ | 9 | (smp_affinity) or cpu list (smp_affinity_list) of allowed CPUs. It's not |
| 10 | affinity then the value will not change from the default 0xffffffff. | 10 | allowed to turn off all CPUs, and if an IRQ controller does not support |
| 11 | IRQ affinity then the value will not change from the default of all cpus. | ||
| 11 | 12 | ||
| 12 | /proc/irq/default_smp_affinity specifies default affinity mask that applies | 13 | /proc/irq/default_smp_affinity specifies default affinity mask that applies |
| 13 | to all non-active IRQs. Once IRQ is allocated/activated its affinity bitmask | 14 | to all non-active IRQs. Once IRQ is allocated/activated its affinity bitmask |
| @@ -54,3 +55,11 @@ round-trip min/avg/max = 0.1/0.5/585.4 ms | |||
| 54 | This time around IRQ44 was delivered only to the last four processors. | 55 | This time around IRQ44 was delivered only to the last four processors. |
| 55 | i.e. counters for the CPU0-3 did not change. | 56 | i.e. counters for the CPU0-3 did not change. |
| 56 | 57 | ||
| 58 | Here is an example of limiting that same irq (44) to cpus 1024 to 1031: | ||
| 59 | |||
| 60 | [root@moon 44]# echo 1024-1031 > smp_affinity_list | ||
| 61 | [root@moon 44]# cat smp_affinity_list | ||
| 62 | 1024-1031 | ||
| 63 | |||
| 64 | Note that to do this with a bitmask would require 32 bitmasks of zero | ||
| 65 | to follow the pertinent one. | ||
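The note about needing 32 zero bitmasks can be checked with a small conversion from a cpu list to a bitmask. This is an illustrative sketch only: the function name is hypothetical, and writing the mask as comma-separated 32-bit hex words, most significant first and zero-padded to eight digits, is an assumption about the textual format rather than something stated above.

```python
def cpulist_to_mask(cpulist):
    """Convert a cpu list such as '0-3' or '1024-1031' into a string of
    comma-separated 32-bit hex words, most significant word first."""
    bits = 0
    for part in cpulist.split(","):
        lo, _, hi = part.partition("-")
        first = int(lo)
        last = int(hi) if hi else first
        for cpu in range(first, last + 1):
            bits |= 1 << cpu

    words = []
    while True:
        words.append(f"{bits & 0xffffffff:08x}")  # lowest 32 bits first
        bits >>= 32
        if not bits:
            break
    return ",".join(reversed(words))
```

For cpus 1024 to 1031 the result is one word with the low eight bits set, followed by 32 all-zero words covering cpus 0 to 1023, which is why the list form is far more convenient on large machines.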
diff --git a/Documentation/blockdev/cciss.txt b/Documentation/blockdev/cciss.txt index 89698e8df7d4..c00c6a5ab21f 100644 --- a/Documentation/blockdev/cciss.txt +++ b/Documentation/blockdev/cciss.txt | |||
| @@ -169,3 +169,18 @@ is issued which positions the tape to a known position. Typically you | |||
| 169 | must rewind the tape (by issuing "mt -f /dev/st0 rewind" for example) | 169 | must rewind the tape (by issuing "mt -f /dev/st0 rewind" for example) |
| 170 | before i/o can proceed again to a tape drive which was reset. | 170 | before i/o can proceed again to a tape drive which was reset. |
| 171 | 171 | ||
| 172 | There is a cciss_tape_cmds module parameter which can be used to make cciss | ||
| 173 | allocate more commands for use by tape drives. Ordinarily only a few commands | ||
| 174 | (6) are allocated for tape drives because tape drives are slow and | ||
| 175 | infrequently used and the primary purpose of Smart Array controllers is to | ||
| 176 | act as a RAID controller for disk drives, so the vast majority of commands | ||
| 177 | are allocated for disk devices. However, if you have more than a few tape | ||
| 178 | drives attached to a smart array, the default number of commands may not be | ||
| 179 | enough (for example, if you have 8 tape drives, you could only rewind 6 | ||
| 180 | at one time with the default number of commands.) The cciss_tape_cmds module | ||
| 181 | parameter allows more commands (up to 16 more) to be allocated for use by | ||
| 182 | tape drives. For example: | ||
| 183 | |||
| 184 | insmod cciss.ko cciss_tape_cmds=16 | ||
| 185 | |||
| 186 | Or, as a kernel boot parameter passed in via grub: cciss.cciss_tape_cmds=8 | ||
diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt index 9164ae3b83bc..9b728dc17535 100644 --- a/Documentation/cachetlb.txt +++ b/Documentation/cachetlb.txt | |||
| @@ -16,7 +16,7 @@ on all processors in the system. Don't let this scare you into | |||
| 16 | thinking SMP cache/tlb flushing must be so inefficient, this is in | 16 | thinking SMP cache/tlb flushing must be so inefficient, this is in |
| 17 | fact an area where many optimizations are possible. For example, | 17 | fact an area where many optimizations are possible. For example, |
| 18 | if it can be proven that a user address space has never executed | 18 | if it can be proven that a user address space has never executed |
| 19 | on a cpu (see vma->cpu_vm_mask), one need not perform a flush | 19 | on a cpu (see mm_cpumask()), one need not perform a flush |
| 20 | for this address space on that cpu. | 20 | for this address space on that cpu. |
| 21 | 21 | ||
| 22 | First, the TLB flushing interfaces, since they are the simplest. The | 22 | First, the TLB flushing interfaces, since they are the simplest. The |
diff --git a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt index edb7ae19e868..2c6be0377f55 100644 --- a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt +++ b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt | |||
| @@ -74,3 +74,57 @@ Example: | |||
| 74 | interrupt-parent = <&mpic>; | 74 | interrupt-parent = <&mpic>; |
| 75 | phy-handle = <&phy0> | 75 | phy-handle = <&phy0> |
| 76 | }; | 76 | }; |
| 77 | |||
| 78 | * Gianfar PTP clock nodes | ||
| 79 | |||
| 80 | General Properties: | ||
| 81 | |||
| 82 | - compatible Should be "fsl,etsec-ptp" | ||
| 83 | - reg Offset and length of the register set for the device | ||
| 84 | - interrupts There should be at least two interrupts. Some devices | ||
| 85 | have as many as four PTP related interrupts. | ||
| 86 | |||
| 87 | Clock Properties: | ||
| 88 | |||
| 89 | - fsl,tclk-period Timer reference clock period in nanoseconds. | ||
| 90 | - fsl,tmr-prsc Prescaler, divides the output clock. | ||
| 91 | - fsl,tmr-add Frequency compensation value. | ||
| 92 | - fsl,tmr-fiper1 Fixed interval period pulse generator. | ||
| 93 | - fsl,tmr-fiper2 Fixed interval period pulse generator. | ||
| 94 | - fsl,max-adj Maximum frequency adjustment in parts per billion. | ||
| 95 | |||
| 96 | These properties set the operational parameters for the PTP | ||
| 97 | clock. You must choose these carefully for the clock to work right. | ||
| 98 | Here is how to figure good values: | ||
| 99 | |||
| 100 | TimerOsc = system clock MHz | ||
| 101 | tclk_period = desired clock period nanoseconds | ||
| 102 | NominalFreq = 1000 / tclk_period MHz | ||
| 103 | FreqDivRatio = TimerOsc / NominalFreq (must be greater than 1.0) | ||
| 104 | tmr_add = ceil(2^32 / FreqDivRatio) | ||
| 105 | OutputClock = NominalFreq / tmr_prsc MHz | ||
| 106 | PulseWidth = 1 / OutputClock microseconds | ||
| 107 | FiperFreq1 = desired frequency in Hz | ||
| 108 | FiperDiv1 = 1000000 * OutputClock / FiperFreq1 | ||
| 109 | tmr_fiper1 = tmr_prsc * tclk_period * FiperDiv1 - tclk_period | ||
| 110 | max_adj = 1000000000 * (FreqDivRatio - 1.0) - 1 | ||
| 111 | |||
| 112 | The calculation for tmr_fiper2 is the same as for tmr_fiper1. The | ||
| 113 | driver expects that tmr_fiper1 will be correctly set to produce a 1 | ||
| 114 | Pulse Per Second (PPS) signal, since this will be offered to the PPS | ||
| 115 | subsystem to synchronize the Linux clock. | ||
| 116 | |||
| 117 | Example: | ||
| 118 | |||
| 119 | ptp_clock@24E00 { | ||
| 120 | compatible = "fsl,etsec-ptp"; | ||
| 121 | reg = <0x24E00 0xB0>; | ||
| 122 | interrupts = <12 0x8 13 0x8>; | ||
| 123 | interrupt-parent = < &ipic >; | ||
| 124 | fsl,tclk-period = <10>; | ||
| 125 | fsl,tmr-prsc = <100>; | ||
| 126 | fsl,tmr-add = <0x999999A4>; | ||
| 127 | fsl,tmr-fiper1 = <0x3B9AC9F6>; | ||
| 128 | fsl,tmr-fiper2 = <0x00018696>; | ||
| 129 | fsl,max-adj = <659999998>; | ||
| 130 | }; | ||
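The formulas in the Gianfar PTP binding can be worked through mechanically. The sketch below transcribes them using exact rational arithmetic; the function name and the returned dictionary are hypothetical, and truncating max_adj and tmr_fiper to integers is an assumption, since the binding text does not say how to round them.

```python
from fractions import Fraction
from math import ceil


def ptp_clock_params(timer_osc_mhz, tclk_period_ns, tmr_prsc, fiper_freq_hz):
    """Evaluate the binding's formulas for one fiper output.

    timer_osc_mhz  -- TimerOsc, system clock in MHz
    tclk_period_ns -- desired clock period in nanoseconds
    tmr_prsc       -- prescaler
    fiper_freq_hz  -- desired fiper frequency in Hz
    """
    nominal_freq = Fraction(1000, tclk_period_ns)            # MHz
    freq_div_ratio = Fraction(timer_osc_mhz) / nominal_freq
    if freq_div_ratio <= 1:
        raise ValueError("FreqDivRatio must be greater than 1.0")

    tmr_add = ceil(Fraction(2**32) / freq_div_ratio)
    output_clock = nominal_freq / tmr_prsc                   # MHz
    fiper_div = 1000000 * output_clock / fiper_freq_hz
    tmr_fiper = tmr_prsc * tclk_period_ns * fiper_div - tclk_period_ns
    max_adj = 1000000000 * (freq_div_ratio - 1) - 1

    return {
        "tmr_add": tmr_add,
        "tmr_fiper": int(tmr_fiper),  # truncation is an assumption
        "max_adj": int(max_adj),      # truncation is an assumption
    }
```

With tclk_period = 10, tmr_prsc = 100 and a 1 Hz fiper frequency, tmr_fiper comes out as 999999990 (0x3B9AC9F6), matching fsl,tmr-fiper1 in the example node; a 10000 Hz fiper frequency gives 99990 (0x00018696), matching fsl,tmr-fiper2. The 200 MHz system clock used in the checks is an arbitrary illustrative value, not one taken from the binding.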
diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt index b22abba78fed..13de64c7f0ab 100644 --- a/Documentation/filesystems/9p.txt +++ b/Documentation/filesystems/9p.txt | |||
| @@ -25,6 +25,8 @@ Other applications are described in the following papers: | |||
| 25 | http://xcpu.org/papers/cellfs-talk.pdf | 25 | http://xcpu.org/papers/cellfs-talk.pdf |
| 26 | * PROSE I/O: Using 9p to enable Application Partitions | 26 | * PROSE I/O: Using 9p to enable Application Partitions |
| 27 | http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf | 27 | http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf |
| 28 | * VirtFS: A Virtualization Aware File System pass-through | ||
| 29 | http://goo.gl/3WPDg | ||
| 28 | 30 | ||
| 29 | USAGE | 31 | USAGE |
| 30 | ===== | 32 | ===== |
| @@ -130,31 +132,20 @@ OPTIONS | |||
| 130 | RESOURCES | 132 | RESOURCES |
| 131 | ========= | 133 | ========= |
| 132 | 134 | ||
| 133 | Our current recommendation is to use Inferno (http://www.vitanuova.com/nferno/index.html) | 135 | Protocol specifications are maintained on github: |
| 134 | as the 9p server. You can start a 9p server under Inferno by issuing the | 136 | http://ericvh.github.com/9p-rfc/ |
| 135 | following command: | ||
| 136 | ; styxlisten -A tcp!*!564 export '#U*' | ||
| 137 | 137 | ||
| 138 | The -A specifies an unauthenticated export. The 564 is the port # (you may | 138 | 9p client and server implementations are listed on |
| 139 | have to choose a higher port number if running as a normal user). The '#U*' | 139 | http://9p.cat-v.org/implementations |
| 140 | specifies exporting the root of the Linux name space. You may specify a | ||
| 141 | subset of the namespace by extending the path: '#U*'/tmp would just export | ||
| 142 | /tmp. For more information, see the Inferno manual pages covering styxlisten | ||
| 143 | and export. | ||
| 144 | 140 | ||
| 145 | A Linux version of the 9p server is now maintained under the npfs project | 141 | A 9p2000.L server is being developed by LLNL and can be found |
| 146 | on sourceforge (http://sourceforge.net/projects/npfs). The currently | 142 | at http://code.google.com/p/diod/ |
| 147 | maintained version is the single-threaded version of the server (named spfs) | ||
| 148 | available from the same SVN repository. | ||
| 149 | 143 | ||
| 150 | There are user and developer mailing lists available through the v9fs project | 144 | There are user and developer mailing lists available through the v9fs project |
| 151 | on sourceforge (http://sourceforge.net/projects/v9fs). | 145 | on sourceforge (http://sourceforge.net/projects/v9fs). |
| 152 | 146 | ||
| 153 | A stand-alone version of the module (which should build for any 2.6 kernel) | 147 | News and other information is maintained on a Wiki. |
| 154 | is available via (http://github.com/ericvh/9p-sac/tree/master) | 148 | (http://sf.net/apps/mediawiki/v9fs/index.php). |
| 155 | |||
| 156 | News and other information is maintained on SWiK (http://swik.net/v9fs) | ||
| 157 | and the Wiki (http://sf.net/apps/mediawiki/v9fs/index.php). | ||
| 158 | 149 | ||
| 159 | Bug reports may be issued through the kernel.org bugzilla | 150 | Bug reports may be issued through the kernel.org bugzilla |
| 160 | (http://bugzilla.kernel.org) | 151 | (http://bugzilla.kernel.org) |
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index c79ec58fd7f6..3ae9bc94352a 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt | |||
| @@ -226,10 +226,6 @@ acl Enables POSIX Access Control Lists support. | |||
| 226 | noacl This option disables POSIX Access Control List | 226 | noacl This option disables POSIX Access Control List |
| 227 | support. | 227 | support. |
| 228 | 228 | ||
| 229 | reservation | ||
| 230 | |||
| 231 | noreservation | ||
| 232 | |||
| 233 | bsddf (*) Make 'df' act like BSD. | 229 | bsddf (*) Make 'df' act like BSD. |
| 234 | minixdf Make 'df' act like Minix. | 230 | minixdf Make 'df' act like Minix. |
| 235 | 231 | ||
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index 60740e8ecb37..f48178024067 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt | |||
| @@ -574,6 +574,12 @@ The contents of each smp_affinity file is the same by default: | |||
| 574 | > cat /proc/irq/0/smp_affinity | 574 | > cat /proc/irq/0/smp_affinity |
| 575 | ffffffff | 575 | ffffffff |
| 576 | 576 | ||
| 577 | There is an alternate interface, smp_affinity_list which allows specifying | ||
| 578 | a cpu range instead of a bitmask: | ||
| 579 | |||
| 580 | > cat /proc/irq/0/smp_affinity_list | ||
| 581 | 1024-1031 | ||
| 582 | |||
| 577 | The default_smp_affinity mask applies to all non-active IRQs, which are the | 583 | The default_smp_affinity mask applies to all non-active IRQs, which are the |
| 578 | IRQs which have not yet been allocated/activated, and hence which lack a | 584 | IRQs which have not yet been allocated/activated, and hence which lack a |
| 579 | /proc/irq/[0-9]* directory. | 585 | /proc/irq/[0-9]* directory. |
| @@ -583,12 +589,13 @@ reports itself as being attached. This hardware locality information does not | |||
| 583 | include information about any possible driver locality preference. | 589 | include information about any possible driver locality preference. |
| 584 | 590 | ||
| 585 | prof_cpu_mask specifies which CPUs are to be profiled by the system wide | 591 | prof_cpu_mask specifies which CPUs are to be profiled by the system wide |
| 586 | profiler. Default value is ffffffff (all cpus). | 592 | profiler. Default value is ffffffff (all cpus if there are only 32 of them). |
| 587 | 593 | ||
| 588 | The way IRQs are routed is handled by the IO-APIC, and it's Round Robin | 594 | The way IRQs are routed is handled by the IO-APIC, and it's Round Robin |
| 589 | between all the CPUs which are allowed to handle it. As usual the kernel has | 595 | between all the CPUs which are allowed to handle it. As usual the kernel has |
| 590 | more info than you and does a better job than you, so the defaults are the | 596 | more info than you and does a better job than you, so the defaults are the |
| 591 | best choice for almost everyone. | 597 | best choice for almost everyone. [Note this applies only to those IO-APICs |
| 598 | that support "Round Robin" interrupt distribution.] | ||
| 592 | 599 | ||
| 593 | There are three more important subdirectories in /proc: net, scsi, and sys. | 600 | There are three more important subdirectories in /proc: net, scsi, and sys. |
| 594 | The general rule is that the contents, or even the existence of these | 601 | The general rule is that the contents, or even the existence of these |
diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt index 7bff3e4f35df..3fc0c31a6f5d 100644 --- a/Documentation/filesystems/xfs.txt +++ b/Documentation/filesystems/xfs.txt | |||
| @@ -39,6 +39,12 @@ When mounting an XFS filesystem, the following options are accepted. | |||
| 39 | drive level write caching to be enabled, for devices that | 39 | drive level write caching to be enabled, for devices that |
| 40 | support write barriers. | 40 | support write barriers. |
| 41 | 41 | ||
| 42 | discard | ||
| 43 | Issue command to let the block device reclaim space freed by the | ||
| 44 | filesystem. This is useful for SSD devices, thinly provisioned | ||
| 45 | LUNs and virtual machine images, but may have a performance | ||
| 46 | impact. This option is incompatible with the nodelaylog option. | ||
| 47 | |||
| 42 | dmapi | 48 | dmapi |
| 43 | Enable the DMAPI (Data Management API) event callouts. | 49 | Enable the DMAPI (Data Management API) event callouts. |
| 44 | Use with the "mtpt" option. | 50 | Use with the "mtpt" option. |
diff --git a/Documentation/hwmon/emc6w201 b/Documentation/hwmon/emc6w201 new file mode 100644 index 000000000000..32f355aaf56b --- /dev/null +++ b/Documentation/hwmon/emc6w201 | |||
| @@ -0,0 +1,42 @@ | |||
| 1 | Kernel driver emc6w201 | ||
| 2 | ====================== | ||
| 3 | |||
| 4 | Supported chips: | ||
| 5 | * SMSC EMC6W201 | ||
| 6 | Prefix: 'emc6w201' | ||
| 7 | Addresses scanned: I2C 0x2c, 0x2d, 0x2e | ||
| 8 | Datasheet: Not public | ||
| 9 | |||
| 10 | Author: Jean Delvare <khali@linux-fr.org> | ||
| 11 | |||
| 12 | |||
| 13 | Description | ||
| 14 | ----------- | ||
| 15 | |||
| 16 | From the datasheet: | ||
| 17 | |||
| 18 | "The EMC6W201 is an environmental monitoring device with automatic fan | ||
| 19 | control capability and enhanced system acoustics for noise suppression. | ||
| 20 | This ACPI compliant device provides hardware monitoring for up to six | ||
| 21 | voltages (including its own VCC) and five external thermal sensors, | ||
| 22 | measures the speed of up to five fans, and controls the speed of | ||
| 23 | multiple DC fans using three Pulse Width Modulator (PWM) outputs. Note | ||
| 24 | that it is possible to control more than three fans by connecting two | ||
| 25 | fans to one PWM output. The EMC6W201 will be available in a 36-pin | ||
| 26 | QFN package." | ||
| 27 | |||
| 28 | The device is functionally close to the EMC6D100 series, but is | ||
| 29 | register-incompatible. | ||
| 30 | |||
| 31 | The driver currently only supports the monitoring of the voltages, | ||
| 32 | temperatures and fan speeds. Limits can be changed. Alarms are not | ||
| 33 | supported, and neither is fan speed control. | ||
| 34 | |||
| 35 | |||
| 36 | Known Systems With EMC6W201 | ||
| 37 | --------------------------- | ||
| 38 | |||
| 39 | The EMC6W201 is a rare device, only found on a few systems, made in | ||
| 40 | 2005 and 2006. Known systems with this device: | ||
| 41 | * Dell Precision 670 workstation | ||
| 42 | * Gigabyte 2CEWH mainboard | ||
diff --git a/Documentation/hwmon/f71882fg b/Documentation/hwmon/f71882fg index df02245d1419..84d2623810f3 100644 --- a/Documentation/hwmon/f71882fg +++ b/Documentation/hwmon/f71882fg | |||
| @@ -6,6 +6,10 @@ Supported chips: | |||
| 6 | Prefix: 'f71808e' | 6 | Prefix: 'f71808e' |
| 7 | Addresses scanned: none, address read from Super I/O config space | 7 | Addresses scanned: none, address read from Super I/O config space |
| 8 | Datasheet: Not public | 8 | Datasheet: Not public |
| 9 | * Fintek F71808A | ||
| 10 | Prefix: 'f71808a' | ||
| 11 | Addresses scanned: none, address read from Super I/O config space | ||
| 12 | Datasheet: Not public | ||
| 9 | * Fintek F71858FG | 13 | * Fintek F71858FG |
| 10 | Prefix: 'f71858fg' | 14 | Prefix: 'f71858fg' |
| 11 | Addresses scanned: none, address read from Super I/O config space | 15 | Addresses scanned: none, address read from Super I/O config space |
diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power new file mode 100644 index 000000000000..a92918e0bd69 --- /dev/null +++ b/Documentation/hwmon/fam15h_power | |||
| @@ -0,0 +1,37 @@ | |||
| 1 | Kernel driver fam15h_power | ||
| 2 | ========================== | ||
| 3 | |||
| 4 | Supported chips: | ||
| 5 | * AMD Family 15h Processors | ||
| 6 | |||
| 7 | Prefix: 'fam15h_power' | ||
| 8 | Addresses scanned: PCI space | ||
| 9 | Datasheets: | ||
| 10 | BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors | ||
| 11 | (not yet published) | ||
| 12 | |||
| 13 | Author: Andreas Herrmann <andreas.herrmann3@amd.com> | ||
| 14 | |||
| 15 | Description | ||
| 16 | ----------- | ||
| 17 | |||
| 18 | This driver permits reading of registers providing power information | ||
| 19 | of AMD Family 15h processors. | ||
| 20 | |||
| 21 | For AMD Family 15h processors the following power values can be | ||
| 22 | calculated using different processor northbridge function registers: | ||
| 23 | |||
| 24 | * BasePwrWatts: Specifies in watts the maximum amount of power | ||
| 25 | consumed by the processor for NB and logic external to the core. | ||
| 26 | * ProcessorPwrWatts: Specifies in watts the maximum amount of power | ||
| 27 | the processor can support. | ||
| 28 | * CurrPwrWatts: Specifies in watts the current amount of power being | ||
| 29 | consumed by the processor. | ||
| 30 | |||
| 31 | This driver provides ProcessorPwrWatts and CurrPwrWatts: | ||
| 32 | * power1_crit (ProcessorPwrWatts) | ||
| 33 | * power1_input (CurrPwrWatts) | ||
| 34 | |||
| 35 | On multi-node processors the calculated value is for the entire | ||
| 36 | package and not for a single node. Thus the driver creates sysfs | ||
| 37 | attributes only for internal node0 of a multi-node processor. | ||
diff --git a/Documentation/hwmon/k10temp b/Documentation/hwmon/k10temp index d2b56a4fd1f5..0393c89277c0 100644 --- a/Documentation/hwmon/k10temp +++ b/Documentation/hwmon/k10temp | |||
| @@ -11,6 +11,7 @@ Supported chips: | |||
| 11 | Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra) | 11 | Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra) |
| 12 | * AMD Family 12h processors: "Llano" | 12 | * AMD Family 12h processors: "Llano" |
| 13 | * AMD Family 14h processors: "Brazos" (C/E/G-Series) | 13 | * AMD Family 14h processors: "Brazos" (C/E/G-Series) |
| 14 | * AMD Family 15h processors: "Bulldozer" | ||
| 14 | 15 | ||
| 15 | Prefix: 'k10temp' | 16 | Prefix: 'k10temp' |
| 16 | Addresses scanned: PCI space | 17 | Addresses scanned: PCI space |
| @@ -40,7 +41,7 @@ Description | |||
| 40 | ----------- | 41 | ----------- |
| 41 | 42 | ||
| 42 | This driver permits reading of the internal temperature sensor of AMD | 43 | This driver permits reading of the internal temperature sensor of AMD |
| 43 | Family 10h/11h/12h/14h processors. | 44 | Family 10h/11h/12h/14h/15h processors. |
| 44 | 45 | ||
| 45 | All these processors have a sensor, but on those for Socket F or AM2+, | 46 | All these processors have a sensor, but on those for Socket F or AM2+, |
| 46 | the sensor may return inconsistent values (erratum 319). The driver | 47 | the sensor may return inconsistent values (erratum 319). The driver |
diff --git a/Documentation/hwmon/max6650 b/Documentation/hwmon/max6650 index c565650fcfc6..58d9644a2bde 100644 --- a/Documentation/hwmon/max6650 +++ b/Documentation/hwmon/max6650 | |||
| @@ -2,9 +2,13 @@ Kernel driver max6650 | |||
| 2 | ===================== | 2 | ===================== |
| 3 | 3 | ||
| 4 | Supported chips: | 4 | Supported chips: |
| 5 | * Maxim 6650 / 6651 | 5 | * Maxim MAX6650 |
| 6 | Prefix: 'max6650' | 6 | Prefix: 'max6650' |
| 7 | Addresses scanned: I2C 0x1b, 0x1f, 0x48, 0x4b | 7 | Addresses scanned: none |
| 8 | Datasheet: http://pdfserv.maxim-ic.com/en/ds/MAX6650-MAX6651.pdf | ||
| 9 | * Maxim MAX6651 | ||
| 10 | Prefix: 'max6651' | ||
| 11 | Addresses scanned: none | ||
| 8 | Datasheet: http://pdfserv.maxim-ic.com/en/ds/MAX6650-MAX6651.pdf | 12 | Datasheet: http://pdfserv.maxim-ic.com/en/ds/MAX6650-MAX6651.pdf |
| 9 | 13 | ||
| 10 | Authors: | 14 | Authors: |
| @@ -15,10 +19,10 @@ Authors: | |||
| 15 | Description | 19 | Description |
| 16 | ----------- | 20 | ----------- |
| 17 | 21 | ||
| 18 | This driver implements support for the Maxim 6650/6651 | 22 | This driver implements support for the Maxim MAX6650 and MAX6651. |
| 19 | 23 | ||
| 20 | The 2 devices are very similar, but the Maxim 6550 has a reduced feature | 24 | The 2 devices are very similar, but the MAX6650 has a reduced feature |
| 21 | set, e.g. only one fan-input, instead of 4 for the 6651. | 25 | set, e.g. only one fan-input, instead of 4 for the MAX6651. |
| 22 | 26 | ||
| 23 | The driver is not able to distinguish between the 2 devices. | 27 | The driver is not able to distinguish between the 2 devices. |
| 24 | 28 | ||
| @@ -36,6 +40,13 @@ fan1_div rw sets the speed range the inputs can handle. Legal | |||
| 36 | values are 1, 2, 4, and 8. Use lower values for | 40 | values are 1, 2, 4, and 8. Use lower values for |
| 37 | faster fans. | 41 | faster fans. |
| 38 | 42 | ||
| 43 | Usage notes | ||
| 44 | ----------- | ||
| 45 | |||
| 46 | This driver does not auto-detect devices. You will have to instantiate the | ||
| 47 | devices explicitly. Please see Documentation/i2c/instantiating-devices for | ||
| 48 | details. | ||
| 49 | |||
| 39 | Module parameters | 50 | Module parameters |
| 40 | ----------------- | 51 | ----------------- |
| 41 | 52 | ||
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt index 2d1ad12e2b3e..3a46e360496d 100644 --- a/Documentation/ioctl/ioctl-number.txt +++ b/Documentation/ioctl/ioctl-number.txt | |||
| @@ -304,6 +304,7 @@ Code Seq#(hex) Include File Comments | |||
| 304 | 0xB0 all RATIO devices in development: | 304 | 0xB0 all RATIO devices in development: |
| 305 | <mailto:vgo@ratio.de> | 305 | <mailto:vgo@ratio.de> |
| 306 | 0xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca> | 306 | 0xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca> |
| 307 | 0xB3 00 linux/mmc/ioctl.h | ||
| 307 | 0xC0 00-0F linux/usb/iowarrior.h | 308 | 0xC0 00-0F linux/usb/iowarrior.h |
| 308 | 0xCB 00-1F CBM serial IEC bus in development: | 309 | 0xCB 00-1F CBM serial IEC bus in development: |
| 309 | <mailto:michael.klein@puffin.lb.shuttle.de> | 310 | <mailto:michael.klein@puffin.lb.shuttle.de> |
diff --git a/Documentation/kbuild/kconfig-language.txt b/Documentation/kbuild/kconfig-language.txt index b507d61fd41c..44e2649fbb29 100644 --- a/Documentation/kbuild/kconfig-language.txt +++ b/Documentation/kbuild/kconfig-language.txt | |||
| @@ -113,6 +113,13 @@ applicable everywhere (see syntax). | |||
| 113 | That will limit the usefulness but on the other hand avoid | 113 | That will limit the usefulness but on the other hand avoid |
| 114 | the illegal configurations all over. | 114 | the illegal configurations all over. |
| 115 | 115 | ||
| 116 | - limiting menu display: "visible if" <expr> | ||
| 117 | This attribute is only applicable to menu blocks. If the condition is | ||
| 118 | false, the menu block is not displayed to the user (the symbols | ||
| 119 | contained there can still be selected by other symbols, though). It is | ||
| 120 | similar to a conditional "prompt" attribute for individual menu | ||
| 121 | entries. Default value of "visible" is true. | ||
| 122 | |||
| 116 | - numerical ranges: "range" <symbol> <symbol> ["if" <expr>] | 123 | - numerical ranges: "range" <symbol> <symbol> ["if" <expr>] |
| 117 | This allows to limit the range of possible input values for int | 124 | This allows to limit the range of possible input values for int |
| 118 | and hex symbols. The user can only input a value which is larger than | 125 | and hex symbols. The user can only input a value which is larger than |
| @@ -303,7 +310,8 @@ menu: | |||
| 303 | "endmenu" | 310 | "endmenu" |
| 304 | 311 | ||
| 305 | This defines a menu block, see "Menu structure" above for more | 312 | This defines a menu block, see "Menu structure" above for more |
| 306 | information. The only possible options are dependencies. | 313 | information. The only possible options are dependencies and "visible" |
| 314 | attributes. | ||
| 307 | 315 | ||
| 308 | if: | 316 | if: |
| 309 | 317 | ||
| @@ -381,3 +389,25 @@ config FOO | |||
| 381 | 389 | ||
| 382 | limits FOO to module (=m) or disabled (=n). | 390 | limits FOO to module (=m) or disabled (=n). |
| 383 | 391 | ||
| 392 | Kconfig symbol existence | ||
| 393 | ~~~~~~~~~~~~~~~~~~~~~~~~ | ||
| 394 | The following two methods produce the same kconfig symbol dependencies | ||
| 395 | but differ greatly in kconfig symbol existence (production) in the | ||
| 396 | generated config file. | ||
| 397 | |||
| 398 | case 1: | ||
| 399 | |||
| 400 | config FOO | ||
| 401 | tristate "about foo" | ||
| 402 | depends on BAR | ||
| 403 | |||
| 404 | vs. case 2: | ||
| 405 | |||
| 406 | if BAR | ||
| 407 | config FOO | ||
| 408 | tristate "about foo" | ||
| 409 | endif | ||
| 410 | |||
| 411 | In case 1, the symbol FOO will always exist in the config file (given | ||
| 412 | no other dependencies). In case 2, the symbol FOO will only exist in | ||
| 413 | the config file if BAR is enabled. | ||
diff --git a/Documentation/kbuild/kconfig.txt b/Documentation/kbuild/kconfig.txt index cca46b1a0f6c..c313d71324b4 100644 --- a/Documentation/kbuild/kconfig.txt +++ b/Documentation/kbuild/kconfig.txt | |||
| @@ -48,11 +48,6 @@ KCONFIG_OVERWRITECONFIG | |||
| 48 | If you set KCONFIG_OVERWRITECONFIG in the environment, Kconfig will not | 48 | If you set KCONFIG_OVERWRITECONFIG in the environment, Kconfig will not |
| 49 | break symlinks when .config is a symlink to somewhere else. | 49 | break symlinks when .config is a symlink to somewhere else. |
| 50 | 50 | ||
| 51 | KCONFIG_NOTIMESTAMP | ||
| 52 | -------------------------------------------------- | ||
| 53 | If this environment variable exists and is non-null, the timestamp line | ||
| 54 | in generated .config files is omitted. | ||
| 55 | |||
| 56 | ______________________________________________________________________ | 51 | ______________________________________________________________________ |
| 57 | Environment variables for '{allyes/allmod/allno/rand}config' | 52 | Environment variables for '{allyes/allmod/allno/rand}config' |
| 58 | 53 | ||
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 7c6624e7a5cb..5438a2d7907f 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt | |||
| @@ -1777,9 +1777,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted. | |||
| 1777 | 1777 | ||
| 1778 | nosoftlockup [KNL] Disable the soft-lockup detector. | 1778 | nosoftlockup [KNL] Disable the soft-lockup detector. |
| 1779 | 1779 | ||
| 1780 | noswapaccount [KNL] Disable accounting of swap in memory resource | ||
| 1781 | controller. (See Documentation/cgroups/memory.txt) | ||
| 1782 | |||
| 1783 | nosync [HW,M68K] Disables sync negotiation for all devices. | 1780 | nosync [HW,M68K] Disables sync negotiation for all devices. |
| 1784 | 1781 | ||
| 1785 | notsc [BUGS=X86-32] Disable Time Stamp Counter | 1782 | notsc [BUGS=X86-32] Disable Time Stamp Counter |
diff --git a/Documentation/lockstat.txt b/Documentation/lockstat.txt index 65f4c795015d..9c0a80d17a23 100644 --- a/Documentation/lockstat.txt +++ b/Documentation/lockstat.txt | |||
| @@ -136,7 +136,7 @@ View the top contending locks: | |||
| 136 | dcache_lock: 1037 1161 0.38 45.32 774.51 6611 243371 0.15 306.48 77387.24 | 136 | dcache_lock: 1037 1161 0.38 45.32 774.51 6611 243371 0.15 306.48 77387.24 |
| 137 | &inode->i_mutex: 161 286 18446744073709 62882.54 1244614.55 3653 20598 18446744073709 62318.60 1693822.74 | 137 | &inode->i_mutex: 161 286 18446744073709 62882.54 1244614.55 3653 20598 18446744073709 62318.60 1693822.74 |
| 138 | &zone->lru_lock: 94 94 0.53 7.33 92.10 4366 32690 0.29 59.81 16350.06 | 138 | &zone->lru_lock: 94 94 0.53 7.33 92.10 4366 32690 0.29 59.81 16350.06 |
| 139 | &inode->i_data.i_mmap_lock: 79 79 0.40 3.77 53.03 11779 87755 0.28 116.93 29898.44 | 139 | &inode->i_data.i_mmap_mutex: 79 79 0.40 3.77 53.03 11779 87755 0.28 116.93 29898.44 |
| 140 | &q->__queue_lock: 48 50 0.52 31.62 86.31 774 13131 0.17 113.08 12277.52 | 140 | &q->__queue_lock: 48 50 0.52 31.62 86.31 774 13131 0.17 113.08 12277.52 |
| 141 | &rq->rq_lock_key: 43 47 0.74 68.50 170.63 3706 33929 0.22 107.99 17460.62 | 141 | &rq->rq_lock_key: 43 47 0.74 68.50 170.63 3706 33929 0.22 107.99 17460.62 |
| 142 | &rq->rq_lock_key#2: 39 46 0.75 6.68 49.03 2979 32292 0.17 125.17 17137.63 | 142 | &rq->rq_lock_key#2: 39 46 0.75 6.68 49.03 2979 32292 0.17 125.17 17137.63 |
diff --git a/Documentation/mmc/00-INDEX b/Documentation/mmc/00-INDEX index fca586f5b853..93dd7a714075 100644 --- a/Documentation/mmc/00-INDEX +++ b/Documentation/mmc/00-INDEX | |||
| @@ -2,3 +2,5 @@ | |||
| 2 | - this file | 2 | - this file |
| 3 | mmc-dev-attrs.txt | 3 | mmc-dev-attrs.txt |
| 4 | - info on SD and MMC device attributes | 4 | - info on SD and MMC device attributes |
| 5 | mmc-dev-parts.txt | ||
| 6 | - info on SD and MMC device partitions | ||
diff --git a/Documentation/mmc/mmc-dev-attrs.txt b/Documentation/mmc/mmc-dev-attrs.txt index ff2bd685bced..8898a95b41e5 100644 --- a/Documentation/mmc/mmc-dev-attrs.txt +++ b/Documentation/mmc/mmc-dev-attrs.txt | |||
| @@ -1,3 +1,13 @@ | |||
| 1 | SD and MMC Block Device Attributes | ||
| 2 | ================================== | ||
| 3 | |||
| 4 | These attributes are defined for the block devices associated with the | ||
| 5 | SD or MMC device. | ||
| 6 | |||
| 7 | The following attributes are read/write. | ||
| 8 | |||
| 9 | force_ro Enforce read-only access even if write protect switch is off. | ||
| 10 | |||
| 1 | SD and MMC Device Attributes | 11 | SD and MMC Device Attributes |
| 2 | ============================ | 12 | ============================ |
| 3 | 13 | ||
diff --git a/Documentation/mmc/mmc-dev-parts.txt b/Documentation/mmc/mmc-dev-parts.txt new file mode 100644 index 000000000000..2db28b8e662f --- /dev/null +++ b/Documentation/mmc/mmc-dev-parts.txt | |||
| @@ -0,0 +1,27 @@ | |||
| 1 | SD and MMC Device Partitions | ||
| 2 | ============================ | ||
| 3 | |||
| 4 | Device partitions are additional logical block devices present on the | ||
| 5 | SD/MMC device. | ||
| 6 | |||
| 7 | As of this writing, MMC boot partitions are supported and exposed as | ||
| 8 | /dev/mmcblkXboot0 and /dev/mmcblkXboot1, where X is the index of the | ||
| 9 | parent /dev/mmcblkX. | ||
| 10 | |||
| 11 | MMC Boot Partitions | ||
| 12 | =================== | ||
| 13 | |||
| 14 | Read and write access is provided to the two MMC boot partitions. Due to | ||
| 15 | the sensitive nature of the boot partition contents, which often store | ||
| 16 | a bootloader or bootloader configuration tables crucial to booting the | ||
| 17 | platform, write access is disabled by default to reduce the chance of | ||
| 18 | accidental bricking. | ||
| 19 | |||
| 20 | To enable write access to /dev/mmcblkXbootY, disable the forced read-only | ||
| 21 | access with: | ||
| 22 | |||
| 23 | echo 0 > /sys/block/mmcblkXbootY/force_ro | ||
| 24 | |||
| 25 | To re-enable read-only access: | ||
| 26 | |||
| 27 | echo 1 > /sys/block/mmcblkXbootY/force_ro | ||
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 1f45bd887d65..675612ff41ae 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt | |||
| @@ -770,8 +770,17 @@ resend_igmp | |||
| 770 | a failover event. One membership report is issued immediately after | 770 | a failover event. One membership report is issued immediately after |
| 771 | the failover, subsequent packets are sent in each 200ms interval. | 771 | the failover, subsequent packets are sent in each 200ms interval. |
| 772 | 772 | ||
| 773 | The valid range is 0 - 255; the default value is 1. This option | 773 | The valid range is 0 - 255; the default value is 1. A value of 0 |
| 774 | was added for bonding version 3.7.0. | 774 | prevents the IGMP membership report from being issued in response |
| 775 | to the failover event. | ||
| 776 | |||
| 777 | This option is useful for bonding modes balance-rr (0), active-backup | ||
| 778 | (1), balance-tlb (5) and balance-alb (6), in which a failover can | ||
| 779 | switch the IGMP traffic from one slave to another. Therefore a fresh | ||
| 780 | IGMP report must be issued to cause the switch to forward the incoming | ||
| 781 | IGMP traffic over the newly selected slave. | ||
| 782 | |||
| 783 | This option was added for bonding version 3.7.0. | ||
| 775 | 784 | ||
| 776 | 3. Configuring Bonding Devices | 785 | 3. Configuring Bonding Devices |
| 777 | ============================== | 786 | ============================== |
diff --git a/Documentation/ptp/ptp.txt b/Documentation/ptp/ptp.txt new file mode 100644 index 000000000000..ae8fef86b832 --- /dev/null +++ b/Documentation/ptp/ptp.txt | |||
| @@ -0,0 +1,89 @@ | |||
| 1 | |||
| 2 | * PTP hardware clock infrastructure for Linux | ||
| 3 | |||
| 4 | This patch set introduces support for IEEE 1588 PTP clocks in | ||
| 5 | Linux. Together with the SO_TIMESTAMPING socket options, this | ||
| 6 | presents a standardized method for developing PTP user space | ||
| 7 | programs, synchronizing Linux with external clocks, and using the | ||
| 8 | ancillary features of PTP hardware clocks. | ||
| 9 | |||
| 10 | A new class driver exports a kernel interface for specific clock | ||
| 11 | drivers and a user space interface. The infrastructure supports a | ||
| 12 | complete set of PTP hardware clock functionality. | ||
| 13 | |||
| 14 | + Basic clock operations | ||
| 15 | - Set time | ||
| 16 | - Get time | ||
| 17 | - Shift the clock by a given offset atomically | ||
| 18 | - Adjust clock frequency | ||
| 19 | |||
| 20 | + Ancillary clock features | ||
| 21 | - One shot or periodic alarms, with signal delivery to user program | ||
| 22 | - Time stamp external events | ||
| 23 | - Periodic output signals configurable from user space | ||
| 24 | - Synchronization of the Linux system time via the PPS subsystem | ||
| 25 | |||
| 26 | ** PTP hardware clock kernel API | ||
| 27 | |||
| 28 | A PTP clock driver registers itself with the class driver. The | ||
| 29 | class driver handles all of the dealings with user space. The | ||
| 30 | author of a clock driver need only implement the details of | ||
| 31 | programming the clock hardware. The clock driver notifies the class | ||
| 32 | driver of asynchronous events (alarms and external time stamps) via | ||
| 33 | a simple message passing interface. | ||
| 34 | |||
| 35 | The class driver supports multiple PTP clock drivers. In normal use | ||
| 36 | cases, only one PTP clock is needed. However, for testing and | ||
| 37 | development, it can be useful to have more than one clock in a | ||
| 38 | single system, in order to allow performance comparisons. | ||
| 39 | |||
| 40 | ** PTP hardware clock user space API | ||
| 41 | |||
| 42 | The class driver also creates a character device for each | ||
| 43 | registered clock. User space can use an open file descriptor from | ||
| 44 | the character device as a POSIX clock id and may call | ||
| 45 | clock_gettime, clock_settime, and clock_adjtime. These calls | ||
| 46 | implement the basic clock operations. | ||
| 47 | |||
| 48 | User space programs may control the clock using standardized | ||
| 49 | ioctls. A program may query, enable, configure, and disable the | ||
| 50 | ancillary clock features. User space can receive time stamped | ||
| 51 | events via blocking read() and poll(). One shot and periodic | ||
| 52 | signals may be configured via the POSIX timer_settime() system | ||
| 53 | call. | ||
| 54 | |||
| 55 | ** Writing clock drivers | ||
| 56 | |||
| 57 | Clock drivers include include/linux/ptp_clock_kernel.h and register | ||
| 58 | themselves by presenting a 'struct ptp_clock_info' to the | ||
| 59 | registration method. Clock drivers must implement all of the | ||
| 60 | functions in the interface. If a clock does not offer a particular | ||
| 61 | ancillary feature, then the driver should just return -EOPNOTSUPP | ||
| 62 | from those functions. | ||
| 63 | |||
| 64 | Drivers must ensure that all of the methods in the interface are | ||
| 65 | reentrant. Since most hardware implementations treat the time value | ||
| 66 | as a 64 bit integer accessed as two 32 bit registers, drivers | ||
| 67 | should use spin_lock_irqsave/spin_unlock_irqrestore to protect | ||
| 68 | against concurrent access. This locking cannot be accomplished in the | ||
| 69 | class driver, since the lock may also be needed by the clock | ||
| 70 | driver's interrupt service routine. | ||
| 71 | |||
| 72 | ** Supported hardware | ||
| 73 | |||
| 74 | + Freescale eTSEC gianfar | ||
| 75 | - 2 Time stamp external triggers, programmable polarity (opt. interrupt) | ||
| 76 | - 2 Alarm registers (optional interrupt) | ||
| 77 | - 3 Periodic signals (optional interrupt) | ||
| 78 | |||
| 79 | + National DP83640 | ||
| 80 | - 6 GPIOs programmable as inputs or outputs | ||
| 81 | - 6 GPIOs with dedicated functions (LED/JTAG/clock) can also be | ||
| 82 | used as general inputs or outputs | ||
| 83 | - GPIO inputs can time stamp external triggers | ||
| 84 | - GPIO outputs can produce periodic signals | ||
| 85 | - 1 interrupt pin | ||
| 86 | |||
| 87 | + Intel IXP465 | ||
| 88 | - Auxiliary Slave/Master Mode Snapshot (optional interrupt) | ||
| 89 | - Target Time (optional interrupt) | ||
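The user space API section of ptp.txt says that an open file descriptor on the character device can be used as a POSIX clock id. The following is a minimal sketch of that mapping, modeled on the FD_TO_CLOCKID macro in testptp.c. The helper names fd_to_clockid and clockid_to_fd are illustrative only, and the use of unsigned arithmetic (to avoid shifting negative values) is a choice made for this sketch; it assumes a 32-bit unsigned int and an int-sized clockid_t, as on common Linux targets.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>

/* Tag stored in the low bits of a descriptor-based clock id (value from testptp.c). */
#define CLOCKFD 3

/*
 * Hypothetical helper: same bit layout as FD_TO_CLOCKID in testptp.c,
 * but computed in unsigned arithmetic and then converted back.
 */
static clockid_t fd_to_clockid(int fd)
{
	unsigned int bits;

	assert(fd >= 0);
	bits = (~(unsigned int) fd << 3) | CLOCKFD;
	return (clockid_t) bits;
}

/*
 * Hypothetical inverse: recover the descriptor from a clock id built above.
 * The mask drops the three high bits that the logical right shift leaves set.
 */
static int clockid_to_fd(clockid_t id)
{
	return (int) (~((unsigned int) id >> 3) & 0x1fffffffu);
}
```

A program would pass the result of fd_to_clockid(fd) to clock_gettime(), clock_settime() or clock_adjtime(), as testptp.c does with its own get_clockid() helper.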
diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c new file mode 100644 index 000000000000..f59ded066108 --- /dev/null +++ b/Documentation/ptp/testptp.c | |||
| @@ -0,0 +1,381 @@ | |||
| 1 | /* | ||
| 2 | * PTP 1588 clock support - User space test program | ||
| 3 | * | ||
| 4 | * Copyright (C) 2010 OMICRON electronics GmbH | ||
| 5 | * | ||
| 6 | * This program is free software; you can redistribute it and/or modify | ||
| 7 | * it under the terms of the GNU General Public License as published by | ||
| 8 | * the Free Software Foundation; either version 2 of the License, or | ||
| 9 | * (at your option) any later version. | ||
| 10 | * | ||
| 11 | * This program is distributed in the hope that it will be useful, | ||
| 12 | * but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
| 13 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
| 14 | * GNU General Public License for more details. | ||
| 15 | * | ||
| 16 | * You should have received a copy of the GNU General Public License | ||
| 17 | * along with this program; if not, write to the Free Software | ||
| 18 | * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. | ||
| 19 | */ | ||
| 20 | #include <errno.h> | ||
| 21 | #include <fcntl.h> | ||
| 22 | #include <math.h> | ||
| 23 | #include <signal.h> | ||
| 24 | #include <stdio.h> | ||
| 25 | #include <stdlib.h> | ||
| 26 | #include <string.h> | ||
| 27 | #include <sys/ioctl.h> | ||
| 28 | #include <sys/mman.h> | ||
| 29 | #include <sys/stat.h> | ||
| 30 | #include <sys/time.h> | ||
| 31 | #include <sys/timex.h> | ||
| 32 | #include <sys/types.h> | ||
| 33 | #include <time.h> | ||
| 34 | #include <unistd.h> | ||
| 35 | |||
| 36 | #include <linux/ptp_clock.h> | ||
| 37 | |||
| 38 | #define DEVICE "/dev/ptp0" | ||
| 39 | |||
| 40 | #ifndef ADJ_SETOFFSET | ||
| 41 | #define ADJ_SETOFFSET 0x0100 | ||
| 42 | #endif | ||
| 43 | |||
| 44 | #ifndef CLOCK_INVALID | ||
| 45 | #define CLOCK_INVALID -1 | ||
| 46 | #endif | ||
| 47 | |||
| 48 | /* When glibc offers the syscall, this will go away. */ | ||
| 49 | #include <sys/syscall.h> | ||
| 50 | static int clock_adjtime(clockid_t id, struct timex *tx) | ||
| 51 | { | ||
| 52 | return syscall(__NR_clock_adjtime, id, tx); | ||
| 53 | } | ||
| 54 | |||
| 55 | static clockid_t get_clockid(int fd) | ||
| 56 | { | ||
| 57 | #define CLOCKFD 3 | ||
| 58 | #define FD_TO_CLOCKID(fd) ((~(clockid_t) (fd) << 3) | CLOCKFD) | ||
| 59 | |||
| 60 | return FD_TO_CLOCKID(fd); | ||
| 61 | } | ||
| 62 | |||
| 63 | static void handle_alarm(int s) | ||
| 64 | { | ||
| 65 | printf("received signal %d\n", s); | ||
| 66 | } | ||
| 67 | |||
| 68 | static int install_handler(int signum, void (*handler)(int)) | ||
| 69 | { | ||
| 70 | struct sigaction action; | ||
| 71 | sigset_t mask; | ||
| 72 | |||
| 73 | /* Unblock the signal. */ | ||
| 74 | sigemptyset(&mask); | ||
| 75 | sigaddset(&mask, signum); | ||
| 76 | sigprocmask(SIG_UNBLOCK, &mask, NULL); | ||
| 77 | |||
| 78 | /* Install the signal handler. */ | ||
| 79 | action.sa_handler = handler; | ||
| 80 | action.sa_flags = 0; | ||
| 81 | sigemptyset(&action.sa_mask); | ||
| 82 | sigaction(signum, &action, NULL); | ||
| 83 | |||
| 84 | return 0; | ||
| 85 | } | ||
| 86 | |||
| 87 | static long ppb_to_scaled_ppm(int ppb) | ||
| 88 | { | ||
| 89 | /* | ||
| 90 | * The 'freq' field in the 'struct timex' is in parts per | ||
| 91 | * million, but with a 16 bit binary fractional field. | ||
| 92 | * Instead of calculating either one of | ||
| 93 | * | ||
| 94 | * scaled_ppm = (ppb / 1000) << 16 [1] | ||
| 95 | * scaled_ppm = (ppb << 16) / 1000 [2] | ||
| 96 | * | ||
| 97 | * we simply use double precision math, in order to avoid the | ||
| 98 | * truncation in [1] and the possible overflow in [2]. | ||
| 99 | */ | ||
| 100 | return (long) (ppb * 65.536); | ||
| 101 | } | ||
| 102 | |||
| 103 | static void usage(char *progname) | ||
| 104 | { | ||
| 105 | fprintf(stderr, | ||
| 106 | "usage: %s [options]\n" | ||
| 107 | " -a val request a one-shot alarm after 'val' seconds\n" | ||
| 108 | " -A val request a periodic alarm every 'val' seconds\n" | ||
| 109 | " -c query the ptp clock's capabilities\n" | ||
| 110 | " -d name device to open\n" | ||
| 111 | " -e val read 'val' external time stamp events\n" | ||
| 112 | " -f val adjust the ptp clock frequency by 'val' ppb\n" | ||
| 113 | " -g get the ptp clock time\n" | ||
| 114 | " -h prints this message\n" | ||
| 115 | " -p val enable output with a period of 'val' nanoseconds\n" | ||
| 116 | " -P val enable or disable (val=1|0) the system clock PPS\n" | ||
| 117 | " -s set the ptp clock time from the system time\n" | ||
| 118 | " -S set the system time from the ptp clock time\n" | ||
| 119 | " -t val shift the ptp clock time by 'val' seconds\n", | ||
| 120 | progname); | ||
| 121 | } | ||
| 122 | |||
| 123 | int main(int argc, char *argv[]) | ||
| 124 | { | ||
| 125 | struct ptp_clock_caps caps; | ||
| 126 | struct ptp_extts_event event; | ||
| 127 | struct ptp_extts_request extts_request; | ||
| 128 | struct ptp_perout_request perout_request; | ||
| 129 | struct timespec ts; | ||
| 130 | struct timex tx; | ||
| 131 | |||
| 132 | static timer_t timerid; | ||
| 133 | struct itimerspec timeout; | ||
| 134 | struct sigevent sigevent; | ||
| 135 | |||
| 136 | char *progname; | ||
| 137 | int c, cnt, fd; | ||
| 138 | |||
| 139 | char *device = DEVICE; | ||
| 140 | clockid_t clkid; | ||
| 141 | int adjfreq = 0x7fffffff; | ||
| 142 | int adjtime = 0; | ||
| 143 | int capabilities = 0; | ||
| 144 | int extts = 0; | ||
| 145 | int gettime = 0; | ||
| 146 | int oneshot = 0; | ||
| 147 | int periodic = 0; | ||
| 148 | int perout = -1; | ||
| 149 | int pps = -1; | ||
| 150 | int settime = 0; | ||
| 151 | |||
| 152 | progname = strrchr(argv[0], '/'); | ||
| 153 | progname = progname ? 1+progname : argv[0]; | ||
| 154 | while (EOF != (c = getopt(argc, argv, "a:A:cd:e:f:ghp:P:sSt:v"))) { | ||
| 155 | switch (c) { | ||
| 156 | case 'a': | ||
| 157 | oneshot = atoi(optarg); | ||
| 158 | break; | ||
| 159 | case 'A': | ||
| 160 | periodic = atoi(optarg); | ||
| 161 | break; | ||
| 162 | case 'c': | ||
| 163 | capabilities = 1; | ||
| 164 | break; | ||
| 165 | case 'd': | ||
| 166 | device = optarg; | ||
| 167 | break; | ||
| 168 | case 'e': | ||
| 169 | extts = atoi(optarg); | ||
| 170 | break; | ||
| 171 | case 'f': | ||
| 172 | adjfreq = atoi(optarg); | ||
| 173 | break; | ||
| 174 | case 'g': | ||
| 175 | gettime = 1; | ||
| 176 | break; | ||
| 177 | case 'p': | ||
| 178 | perout = atoi(optarg); | ||
| 179 | break; | ||
| 180 | case 'P': | ||
| 181 | pps = atoi(optarg); | ||
| 182 | break; | ||
| 183 | case 's': | ||
| 184 | settime = 1; | ||
| 185 | break; | ||
| 186 | case 'S': | ||
| 187 | settime = 2; | ||
| 188 | break; | ||
| 189 | case 't': | ||
| 190 | adjtime = atoi(optarg); | ||
| 191 | break; | ||
| 192 | case 'h': | ||
| 193 | usage(progname); | ||
| 194 | return 0; | ||
| 195 | case '?': | ||
| 196 | default: | ||
| 197 | usage(progname); | ||
| 198 | return -1; | ||
| 199 | } | ||
| 200 | } | ||
| 201 | |||
| 202 | fd = open(device, O_RDWR); | ||
| 203 | if (fd < 0) { | ||
| 204 | fprintf(stderr, "opening %s: %s\n", device, strerror(errno)); | ||
| 205 | return -1; | ||
| 206 | } | ||
| 207 | |||
| 208 | clkid = get_clockid(fd); | ||
| 209 | if (CLOCK_INVALID == clkid) { | ||
| 210 | fprintf(stderr, "failed to read clock id\n"); | ||
| 211 | return -1; | ||
| 212 | } | ||
| 213 | |||
| 214 | if (capabilities) { | ||
| 215 | if (ioctl(fd, PTP_CLOCK_GETCAPS, &caps)) { | ||
| 216 | perror("PTP_CLOCK_GETCAPS"); | ||
| 217 | } else { | ||
| 218 | printf("capabilities:\n" | ||
| 219 | " %d maximum frequency adjustment (ppb)\n" | ||
| 220 | " %d programmable alarms\n" | ||
| 221 | " %d external time stamp channels\n" | ||
| 222 | " %d programmable periodic signals\n" | ||
| 223 | " %d pulse per second\n", | ||
| 224 | caps.max_adj, | ||
| 225 | caps.n_alarm, | ||
| 226 | caps.n_ext_ts, | ||
| 227 | caps.n_per_out, | ||
| 228 | caps.pps); | ||
| 229 | } | ||
| 230 | } | ||
| 231 | |||
| 232 | if (0x7fffffff != adjfreq) { | ||
| 233 | memset(&tx, 0, sizeof(tx)); | ||
| 234 | tx.modes = ADJ_FREQUENCY; | ||
| 235 | tx.freq = ppb_to_scaled_ppm(adjfreq); | ||
| 236 | if (clock_adjtime(clkid, &tx)) { | ||
| 237 | perror("clock_adjtime"); | ||
| 238 | } else { | ||
| 239 | puts("frequency adjustment okay"); | ||
| 240 | } | ||
| 241 | } | ||
| 242 | |||
| 243 | if (adjtime) { | ||
| 244 | memset(&tx, 0, sizeof(tx)); | ||
| 245 | tx.modes = ADJ_SETOFFSET; | ||
| 246 | tx.time.tv_sec = adjtime; | ||
| 247 | tx.time.tv_usec = 0; | ||
| 248 | if (clock_adjtime(clkid, &tx) < 0) { | ||
| 249 | perror("clock_adjtime"); | ||
| 250 | } else { | ||
| 251 | puts("time shift okay"); | ||
| 252 | } | ||
| 253 | } | ||
| 254 | |||
| 255 | if (gettime) { | ||
| 256 | if (clock_gettime(clkid, &ts)) { | ||
| 257 | perror("clock_gettime"); | ||
| 258 | } else { | ||
| 259 | printf("clock time: %ld.%09ld or %s", | ||
| 260 | ts.tv_sec, ts.tv_nsec, ctime(&ts.tv_sec)); | ||
| 261 | } | ||
| 262 | } | ||
| 263 | |||
| 264 | if (settime == 1) { | ||
| 265 | clock_gettime(CLOCK_REALTIME, &ts); | ||
| 266 | if (clock_settime(clkid, &ts)) { | ||
| 267 | perror("clock_settime"); | ||
| 268 | } else { | ||
| 269 | puts("set time okay"); | ||
| 270 | } | ||
| 271 | } | ||
| 272 | |||
| 273 | if (settime == 2) { | ||
| 274 | clock_gettime(clkid, &ts); | ||
| 275 | if (clock_settime(CLOCK_REALTIME, &ts)) { | ||
| 276 | perror("clock_settime"); | ||
| 277 | } else { | ||
| 278 | puts("set time okay"); | ||
| 279 | } | ||
| 280 | } | ||
| 281 | |||
| 282 | if (extts) { | ||
| 283 | memset(&extts_request, 0, sizeof(extts_request)); | ||
| 284 | extts_request.index = 0; | ||
| 285 | extts_request.flags = PTP_ENABLE_FEATURE; | ||
| 286 | if (ioctl(fd, PTP_EXTTS_REQUEST, &extts_request)) { | ||
| 287 | perror("PTP_EXTTS_REQUEST"); | ||
| 288 | extts = 0; | ||
| 289 | } else { | ||
| 290 | puts("external time stamp request okay"); | ||
| 291 | } | ||
| 292 | for (; extts; extts--) { | ||
| 293 | cnt = read(fd, &event, sizeof(event)); | ||
| 294 | if (cnt != sizeof(event)) { | ||
| 295 | perror("read"); | ||
| 296 | break; | ||
| 297 | } | ||
| 298 | printf("event index %u at %lld.%09u\n", event.index, | ||
| 299 | event.t.sec, event.t.nsec); | ||
| 300 | fflush(stdout); | ||
| 301 | } | ||
| 302 | /* Disable the feature again. */ | ||
| 303 | extts_request.flags = 0; | ||
| 304 | if (ioctl(fd, PTP_EXTTS_REQUEST, &extts_request)) { | ||
| 305 | perror("PTP_EXTTS_REQUEST"); | ||
| 306 | } | ||
| 307 | } | ||
| 308 | |||
| 309 | if (oneshot) { | ||
| 310 | install_handler(SIGALRM, handle_alarm); | ||
| 311 | /* Create a timer. */ | ||
| 312 | sigevent.sigev_notify = SIGEV_SIGNAL; | ||
| 313 | sigevent.sigev_signo = SIGALRM; | ||
| 314 | if (timer_create(clkid, &sigevent, &timerid)) { | ||
| 315 | perror("timer_create"); | ||
| 316 | return -1; | ||
| 317 | } | ||
| 318 | /* Start the timer. */ | ||
| 319 | memset(&timeout, 0, sizeof(timeout)); | ||
| 320 | timeout.it_value.tv_sec = oneshot; | ||
| 321 | if (timer_settime(timerid, 0, &timeout, NULL)) { | ||
| 322 | perror("timer_settime"); | ||
| 323 | return -1; | ||
| 324 | } | ||
| 325 | pause(); | ||
| 326 | timer_delete(timerid); | ||
| 327 | } | ||
| 328 | |||
| 329 | if (periodic) { | ||
| 330 | install_handler(SIGALRM, handle_alarm); | ||
| 331 | /* Create a timer. */ | ||
| 332 | sigevent.sigev_notify = SIGEV_SIGNAL; | ||
| 333 | sigevent.sigev_signo = SIGALRM; | ||
| 334 | if (timer_create(clkid, &sigevent, &timerid)) { | ||
| 335 | perror("timer_create"); | ||
| 336 | return -1; | ||
| 337 | } | ||
| 338 | /* Start the timer. */ | ||
| 339 | memset(&timeout, 0, sizeof(timeout)); | ||
| 340 | timeout.it_interval.tv_sec = periodic; | ||
| 341 | timeout.it_value.tv_sec = periodic; | ||
| 342 | if (timer_settime(timerid, 0, &timeout, NULL)) { | ||
| 343 | perror("timer_settime"); | ||
| 344 | return -1; | ||
| 345 | } | ||
| 346 | while (1) { | ||
| 347 | pause(); | ||
| 348 | } | ||
| 349 | timer_delete(timerid); | ||
| 350 | } | ||
| 351 | |||
| 352 | if (perout >= 0) { | ||
| 353 | if (clock_gettime(clkid, &ts)) { | ||
| 354 | perror("clock_gettime"); | ||
| 355 | return -1; | ||
| 356 | } | ||
| 357 | memset(&perout_request, 0, sizeof(perout_request)); | ||
| 358 | perout_request.index = 0; | ||
| 359 | perout_request.start.sec = ts.tv_sec + 2; | ||
| 360 | perout_request.start.nsec = 0; | ||
| 361 | perout_request.period.sec = 0; | ||
| 362 | perout_request.period.nsec = perout; | ||
| 363 | if (ioctl(fd, PTP_PEROUT_REQUEST, &perout_request)) { | ||
| 364 | perror("PTP_PEROUT_REQUEST"); | ||
| 365 | } else { | ||
| 366 | puts("periodic output request okay"); | ||
| 367 | } | ||
| 368 | } | ||
| 369 | |||
| 370 | if (pps != -1) { | ||
| 371 | int enable = pps ? 1 : 0; | ||
| 372 | if (ioctl(fd, PTP_ENABLE_PPS, enable)) { | ||
| 373 | perror("PTP_ENABLE_PPS"); | ||
| 374 | } else { | ||
| 375 | puts("pps for system time request okay"); | ||
| 376 | } | ||
| 377 | } | ||
| 378 | |||
| 379 | close(fd); | ||
| 380 | return 0; | ||
| 381 | } | ||
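The comment in ppb_to_scaled_ppm() explains why testptp.c uses double precision math: formula [1] truncates early and formula [2] can overflow a 32-bit intermediate. As a hedged alternative, not what the patch does, formula [2] can be evaluated in 64-bit integer arithmetic, which avoids both problems without floating point. The function name ppb_to_scaled_ppm_i64 is hypothetical, and results may differ from the floating-point version in the last unit for some inputs.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical alternative to ppb_to_scaled_ppm(): formula [2] from the
 * comment in testptp.c, evaluated with a 64-bit intermediate so that the
 * multiplication by 2^16 cannot overflow for any 32-bit ppb value.
 */
static long ppb_to_scaled_ppm_i64(int ppb)
{
	/* ppb * 2^16 / 1000; C integer division truncates toward zero. */
	int64_t scaled = (int64_t) ppb * 65536 / 1000;

	/* On targets with a 32-bit long, very large ppb values would not fit. */
	assert(scaled >= LONG_MIN && scaled <= LONG_MAX);
	return (long) scaled;
}
```

For example, 1000 ppb is exactly 1 ppm, so the scaled value is 1 << 16, and 1 ppb maps to 65 after truncation.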
diff --git a/Documentation/ptp/testptp.mk b/Documentation/ptp/testptp.mk new file mode 100644 index 000000000000..4ef2d9755421 --- /dev/null +++ b/Documentation/ptp/testptp.mk | |||
| @@ -0,0 +1,33 @@ | |||
| 1 | # PTP 1588 clock support - User space test program | ||
| 2 | # | ||
| 3 | # Copyright (C) 2010 OMICRON electronics GmbH | ||
| 4 | # | ||
| 5 | # This program is free software; you can redistribute it and/or modify | ||
| 6 | # it under the terms of the GNU General Public License as published by | ||
| 7 | # the Free Software Foundation; either version 2 of the License, or | ||
| 8 | # (at your option) any later version. | ||
| 9 | # | ||
| 10 | # This program is distributed in the hope that it will be useful, | ||
| 11 | # but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
| 12 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
| 13 | # GNU General Public License for more details. | ||
| 14 | # | ||
| 15 | # You should have received a copy of the GNU General Public License | ||
| 16 | # along with this program; if not, write to the Free Software | ||
| 17 | # Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. | ||
| 18 | |||
| 19 | CC = $(CROSS_COMPILE)gcc | ||
| 20 | INC = -I$(KBUILD_OUTPUT)/usr/include | ||
| 21 | CFLAGS = -Wall $(INC) | ||
| 22 | LDLIBS = -lrt | ||
| 23 | PROGS = testptp | ||
| 24 | |||
| 25 | all: $(PROGS) | ||
| 26 | |||
| 27 | testptp: testptp.o | ||
| 28 | |||
| 29 | clean: | ||
| 30 | rm -f testptp.o | ||
| 31 | |||
| 32 | distclean: clean | ||
| 33 | rm -f $(PROGS) | ||
diff --git a/Documentation/virtual/uml/UserModeLinux-HOWTO.txt b/Documentation/virtual/uml/UserModeLinux-HOWTO.txt index 9b7e1904db1c..5d0fc8bfcdb9 100644 --- a/Documentation/virtual/uml/UserModeLinux-HOWTO.txt +++ b/Documentation/virtual/uml/UserModeLinux-HOWTO.txt | |||
| @@ -1182,6 +1182,16 @@ | |||
| 1182 | forge.net/> and explains these in detail, as well as | 1182 | forge.net/> and explains these in detail, as well as |
| 1183 | some other issues. | 1183 | some other issues. |
| 1184 | 1184 | ||
| 1185 | There is also a related point-to-point only "ucast" transport. | ||
| 1186 | This is useful when your network does not support multicast, and | ||
| 1187 | all network connections are simple point to point links. | ||
| 1188 | |||
| 1189 | The full set of command line options for this transport is | ||
| 1190 | |||
| 1191 | |||
| 1192 | ethn=ucast,ethernet address,remote address,listen port,remote port | ||
| 1193 | |||
| 1194 | |||
| 1185 | 1195 | ||
| 1186 | 1196 | ||
| 1187 | 66..66.. TTUUNN//TTAAPP wwiitthh tthhee uummll__nneett hheellppeerr | 1197 | 66..66.. TTUUNN//TTAAPP wwiitthh tthhee uummll__nneett hheellppeerr |
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt new file mode 100644 index 000000000000..36c367c73084 --- /dev/null +++ b/Documentation/vm/cleancache.txt | |||
| @@ -0,0 +1,278 @@ | |||
| 1 | MOTIVATION | ||
| 2 | |||
| 3 | Cleancache is a new optional feature provided by the VFS layer that | ||
| 4 | potentially dramatically increases page cache effectiveness for | ||
| 5 | many workloads in many environments at a negligible cost. | ||
| 6 | |||
| 7 | Cleancache can be thought of as a page-granularity victim cache for clean | ||
| 8 | pages that the kernel's pageframe replacement algorithm (PFRA) would like | ||
| 9 | to keep around, but can't since there isn't enough memory. So when the | ||
| 10 | PFRA "evicts" a page, it first attempts to use cleancache code to | ||
| 11 | put the data contained in that page into "transcendent memory", memory | ||
| 12 | that is not directly accessible or addressable by the kernel and is | ||
| 13 | of unknown and possibly time-varying size. | ||
| 14 | |||
| 15 | Later, when a cleancache-enabled filesystem wishes to access a page | ||
| 16 | in a file on disk, it first checks cleancache to see if it already | ||
| 17 | contains it; if it does, the page of data is copied into the kernel | ||
| 18 | and a disk access is avoided. | ||
| 19 | |||
| 20 | Transcendent memory "drivers" for cleancache are currently implemented | ||
| 21 | in Xen (using hypervisor memory) and zcache (using in-kernel compressed | ||
| 22 | memory) and other implementations are in development. | ||
| 23 | |||
| 24 | FAQs are included below. | ||
| 25 | |||
| 26 | IMPLEMENTATION OVERVIEW | ||
| 27 | |||
| 28 | A cleancache "backend" that provides transcendent memory registers itself | ||
| 29 | to the kernel's cleancache "frontend" by calling cleancache_register_ops, | ||
| 30 | passing a pointer to a cleancache_ops structure with funcs set appropriately. | ||
| 31 | Note that cleancache_register_ops returns the previous settings so that | ||
| 32 | chaining can be performed if desired. The functions provided must conform to | ||
| 33 | certain semantics as follows: | ||
| 34 | |||
| 35 | Most important, cleancache is "ephemeral". Pages which are copied into | ||
| 36 | cleancache have an indefinite lifetime which is completely unknowable | ||
| 37 | by the kernel and so may or may not still be in cleancache at any later time. | ||
| 38 | Thus, as its name implies, cleancache is not suitable for dirty pages. | ||
| 39 | Cleancache has complete discretion over what pages to preserve and what | ||
| 40 | pages to discard and when. | ||
| 41 | |||
| 42 | Mounting a cleancache-enabled filesystem should call "init_fs" to obtain a | ||
| 43 | pool id which, if non-negative, must be saved in the filesystem's superblock; | ||
| 44 | a negative return value indicates failure. A "put_page" will copy a | ||
| 45 | (presumably about-to-be-evicted) page into cleancache and associate it with | ||
| 46 | the pool id, a file key, and a page index into the file. (The combination | ||
| 47 | of a pool id, a file key, and an index is sometimes called a "handle".) | ||
| 48 | A "get_page" will copy the page, if found, from cleancache into kernel memory. | ||
| 49 | A "flush_page" will ensure the page no longer is present in cleancache; | ||
| 50 | a "flush_inode" will flush all pages associated with the specified file; | ||
| 51 | and, when a filesystem is unmounted, a "flush_fs" will flush all pages in | ||
| 52 | all files specified by the given pool id and also surrender the pool id. | ||
| 53 | |||
| 54 | An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache | ||
| 55 | to treat the pool as shared using a 128-bit UUID as a key. On systems | ||
| 56 | that may run multiple kernels (such as hard partitioned or virtualized | ||
| 57 | systems) that may share a clustered filesystem, and where cleancache | ||
| 58 | may be shared among those kernels, calls to init_shared_fs that specify the | ||
| 59 | same UUID will receive the same pool id, thus allowing the pages to | ||
| 60 | be shared. Note that any security requirements must be imposed outside | ||
| 61 | of the kernel (e.g. by "tools" that control cleancache). Or a | ||
| 62 | cleancache implementation can simply disable init_shared_fs by always | ||
| 63 | returning a negative value. | ||
| 64 | |||
| 65 | If a get_page is successful on a non-shared pool, the page is flushed (thus | ||
| 66 | making cleancache an "exclusive" cache). On a shared pool, the page | ||
| 67 | is NOT flushed on a successful get_page so that it remains accessible to | ||
| 68 | other sharers. The kernel is responsible for ensuring coherency between | ||
| 69 | cleancache (shared or not), the page cache, and the filesystem, using | ||
| 70 | cleancache flush operations as required. | ||
| 71 | |||
| 72 | Note that cleancache must enforce put-put-get coherency and get-get | ||
| 73 | coherency. For the former, if two puts are made to the same handle but | ||
| 74 | with different data, say AAA by the first put and BBB by the second, a | ||
| 75 | subsequent get can never return the stale data (AAA). For get-get coherency, | ||
| 76 | if a get for a given handle fails, subsequent gets for that handle will | ||
| 77 | never succeed unless preceded by a successful put with that handle. | ||
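The semantics described above (handles, exclusive gets on a non-shared pool, put-put-get and get-get coherency) can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: every toy_* name, the fixed-size table, and the 16-byte "page" are hypothetical and are not part of the kernel's cleancache_ops interface.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_SLOTS	8
#define TOY_PAGE_SIZE	16	/* tiny "page" to keep the sketch short */

/* A handle: pool id + file key + page index (hypothetical layout). */
struct toy_handle {
	int pool_id;
	unsigned long file_key;
	unsigned long index;
};

struct toy_slot {
	bool used;
	struct toy_handle h;
	char data[TOY_PAGE_SIZE];
};

static struct toy_slot toy_store[TOY_SLOTS];

static struct toy_slot *toy_find(struct toy_handle h)
{
	for (int i = 0; i < TOY_SLOTS; i++) {
		struct toy_slot *s = &toy_store[i];

		if (s->used && s->h.pool_id == h.pool_id &&
		    s->h.file_key == h.file_key && s->h.index == h.index)
			return s;
	}
	return NULL;
}

/* flush_page: ensure the page is no longer present. */
static void toy_flush_page(struct toy_handle h)
{
	struct toy_slot *s = toy_find(h);

	if (s)
		s->used = false;
}

/*
 * put_page: drop any older copy first so a later get can never return
 * stale data (put-put-get coherency).  If the store is full the page is
 * simply not kept, which models the "ephemeral" property.
 */
static void toy_put_page(struct toy_handle h, const char *page)
{
	toy_flush_page(h);
	for (int i = 0; i < TOY_SLOTS; i++) {
		if (!toy_store[i].used) {
			toy_store[i].used = true;
			toy_store[i].h = h;
			memcpy(toy_store[i].data, page, TOY_PAGE_SIZE);
			return;
		}
	}
}

/*
 * get_page on a non-shared pool: copy the page out and flush it
 * (exclusive cache).  Returns 0 on a hit, -1 on a miss.
 */
static int toy_get_page(struct toy_handle h, char *page)
{
	struct toy_slot *s = toy_find(h);

	if (!s)
		return -1;
	memcpy(page, s->data, TOY_PAGE_SIZE);
	s->used = false;
	return 0;
}
```

In this model, putting AAA and then BBB to the same handle makes the next get return BBB, and every get after that misses until another put succeeds. The model does no locking, matching the statement that callers must serialize access themselves.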
| 78 | |||
| 79 | Last, cleancache provides no SMP serialization guarantees; if two | ||
| 80 | different Linux threads are simultaneously putting and flushing a page | ||
| 81 | with the same handle, the results are indeterminate. Callers must | ||
| 82 | lock the page to ensure serial behavior. | ||
| 83 | |||
| 84 | CLEANCACHE PERFORMANCE METRICS | ||
| 85 | |||
| 86 | Cleancache monitoring is done by sysfs files in the | ||
| 87 | /sys/kernel/mm/cleancache directory. The effectiveness of cleancache | ||
| 88 | can be measured (across all filesystems) with: | ||
| 89 | |||
| 90 | succ_gets - number of gets that were successful | ||
| 91 | failed_gets - number of gets that failed | ||
| 92 | puts - number of puts attempted (all "succeed") | ||
| 93 | flushes - number of flushes attempted | ||
| 94 | |||
| 95 | A backend implementation may provide additional metrics. | ||
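As a rough illustration of how these counters might be used, the following user-space sketch reads a counter and derives a hit ratio from succ_gets and failed_gets. It assumes each sysfs file holds a single decimal integer; the helper names read_counter and hit_ratio are hypothetical.

```c
#include <stdio.h>

/*
 * Read one counter from a sysfs-style file, e.g.
 * /sys/kernel/mm/cleancache/succ_gets.  Assumes the file contains a
 * single decimal integer.  Returns -1 if it cannot be read or parsed.
 */
static long long read_counter(const char *path)
{
	FILE *f = fopen(path, "r");
	long long v;

	if (!f)
		return -1;
	if (fscanf(f, "%lld", &v) != 1)
		v = -1;
	fclose(f);
	return v;
}

/* Fraction of gets that were served from cleancache (0.0 if no gets). */
static double hit_ratio(long long succ_gets, long long failed_gets)
{
	long long total = succ_gets + failed_gets;

	return total > 0 ? (double)succ_gets / (double)total : 0.0;
}
```

A caller would read succ_gets and failed_gets with read_counter(), check both for -1, and pass them to hit_ratio(). A ratio near zero suggests that pages put into cleancache are rarely fetched back before the backend discards them.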
| 96 | |||
| 97 | FAQ | ||
| 98 | |||
| 99 | 1) Where's the value? (Andrew Morton) | ||
| 100 | |||
| 101 | Cleancache provides a significant performance benefit to many workloads | ||
| 102 | in many environments with negligible overhead by improving the | ||
| 103 | effectiveness of the pagecache. Clean pagecache pages are | ||
| 104 | saved in transcendent memory (RAM that is otherwise not directly | ||
| 105 | addressable to the kernel); fetching those pages later avoids "refaults" | ||
| 106 | and thus disk reads. | ||
| 107 | |||
| 108 | Cleancache (and its sister code "frontswap") provide interfaces for | ||
| 109 | this transcendent memory (aka "tmem"), which conceptually lies between | ||
| 110 | fast kernel-directly-addressable RAM and slower DMA/asynchronous devices. | ||
| 111 | Disallowing direct kernel or userland reads/writes to tmem | ||
| 112 | is ideal when data is transformed to a different form and size (such | ||
| 113 | as with compression) or secretly moved (as might be useful for write- | ||
| 114 | balancing for some RAM-like devices). Evicted page-cache pages (and | ||
| 115 | swap pages) are a great use for this kind of slower-than-RAM-but-much- | ||
| 116 | faster-than-disk transcendent memory, and the cleancache (and frontswap) | ||
| 117 | "page-object-oriented" specification provides a nice way to read and | ||
| 118 | write -- and indirectly "name" -- the pages. | ||
| 119 | |||
| 120 | In the virtual case, the whole point of virtualization is to statistically | ||
| 121 | multiplex physical resources across the varying demands of multiple | ||
| 122 | virtual machines. This is really hard to do with RAM and efforts to | ||
| 123 | do it well with no kernel change have essentially failed (except in some | ||
| 124 | well-publicized special-case workloads). Cleancache -- and frontswap -- | ||
| 125 | with a fairly small impact on the kernel, provide a huge amount | ||
| 126 | of flexibility for more dynamic, flexible RAM multiplexing. | ||
| 127 | Specifically, the Xen Transcendent Memory backend allows otherwise | ||
| 128 | "fallow" hypervisor-owned RAM to not only be "time-shared" between multiple | ||
| 129 | virtual machines, but the pages can be compressed and deduplicated to | ||
| 130 | optimize RAM utilization. And when guest OS's are induced to surrender | ||
| 131 | underutilized RAM (e.g. with "self-ballooning"), page cache pages | ||
| 132 | are the first to go, and cleancache allows those pages to be | ||
| 133 | saved and reclaimed if overall host system memory conditions allow. | ||
| 134 | |||
| 135 | And the identical interface used for cleancache can be used in | ||
| 136 | physical systems as well. The zcache driver acts as a memory-hungry | ||
| 137 | device that stores pages of data in a compressed state. And | ||
| 138 | the proposed "RAMster" driver shares RAM across multiple physical | ||
| 139 | systems. | ||
| 140 | |||
| 141 | 2) Why does cleancache have its sticky fingers so deep inside the | ||
| 142 | filesystems and VFS? (Andrew Morton and Christoph Hellwig) | ||
| 143 | |||
| 144 | The core hooks for cleancache in VFS are in most cases a single line | ||
| 145 | and the minimum set are placed precisely where needed to maintain | ||
| 146 | coherency (via cleancache_flush operations) between cleancache, | ||
| 147 | the page cache, and disk. All hooks compile into nothingness if | ||
| 148 | cleancache is config'ed off and turn into a function-pointer- | ||
| 149 | compare-to-NULL if config'ed on but no backend claims the ops | ||
| 150 | functions, or to a compare-struct-element-to-negative if a | ||
| 151 | backend claims the ops functions but a filesystem doesn't enable | ||
| 152 | cleancache. | ||
| 153 | |||
| 154 | Some filesystems are built entirely on top of VFS and the hooks | ||
| 155 | in VFS are sufficient, so don't require an "init_fs" hook; the | ||
| 156 | initial implementation of cleancache didn't provide this hook. | ||
| 157 | But for some filesystems (such as btrfs), the VFS hooks are | ||
| 158 | incomplete and one or more hooks in fs-specific code are required. | ||
| 159 | And for some other filesystems, such as tmpfs, cleancache may | ||
| 160 | be counterproductive. So it seemed prudent to require a filesystem | ||
| 161 | to "opt in" to use cleancache, which requires adding a hook in | ||
| 162 | each filesystem. Not all filesystems are supported by cleancache | ||
| 163 | only because they haven't been tested. The existing set should | ||
| 164 | be sufficient to validate the concept, the opt-in approach means | ||
| 165 | that untested filesystems are not affected, and the hooks in the | ||
| 166 | existing filesystems should make it very easy to add more | ||
| 167 | filesystems in the future. | ||
| 168 | |||
| 169 | The total impact of the hooks to existing fs and mm files is only | ||
| 170 | about 40 lines added (not counting comments and blank lines). | ||
| 171 | |||
| 172 | 3) Why not make cleancache asynchronous and batched so it can | ||
| 173 | more easily interface with real devices with DMA instead | ||
| 174 | of copying each individual page? (Minchan Kim) | ||
| 175 | |||
| 176 | The one-page-at-a-time copy semantics simplifies the implementation | ||
| 177 | on both the frontend and backend and also allows the backend to | ||
| 178 | do fancy things on-the-fly like page compression and | ||
| 179 | page deduplication. And since the data is "gone" (copied into/out | ||
| 180 | of the pageframe) before the cleancache get/put call returns, | ||
| 181 | a great many race conditions and potential coherency issues | ||
| 182 | are avoided. While the interface seems odd for a "real device" | ||
| 183 | or for real kernel-addressable RAM, it makes perfect sense for | ||
| 184 | transcendent memory. | ||
| 185 | |||
| 186 | 4) Why is non-shared cleancache "exclusive"? And where is the | ||
| 187 | page "flushed" after a "get"? (Minchan Kim) | ||
| 188 | |||
| 189 | The main reason is to free up space in transcendent memory and | ||
| 190 | to avoid unnecessary cleancache_flush calls. If you want inclusive, | ||
| 191 | the page can be "put" immediately following the "get". If | ||
| 192 | put-after-get for inclusive becomes common, the interface could | ||
| 193 | be easily extended to add a "get_no_flush" call. | ||
| 194 | |||
| 195 | The flush is done by the cleancache backend implementation. | ||
| 196 | |||
| 197 | 5) What's the performance impact? | ||
| 198 | |||
| 199 | Performance analysis has been presented at OLS'09 and LCA'10. | ||
| 200 | Briefly, performance gains can be significant on most workloads, | ||
| 201 | especially when memory pressure is high (e.g. when RAM is | ||
| 202 | overcommitted in a virtual workload); and because the hooks are | ||
| 203 | invoked primarily in place of or in addition to a disk read/write, | ||
| 204 | overhead is negligible even in worst case workloads. Basically | ||
| 205 | cleancache replaces I/O with memory-copy-CPU-overhead; on older | ||
| 206 | single-core systems with slow memory-copy speeds, cleancache | ||
| 207 | has little value, but in newer multicore machines, especially | ||
| 208 | consolidated/virtualized machines, it has great value. | ||
| 209 | |||
| 210 | 6) How do I add cleancache support for filesystem X? (Boaz Harrosh) | ||
| 211 | |||
| 212 | Filesystems that are well-behaved and conform to certain | ||
| 213 | restrictions can utilize cleancache simply by making a call to | ||
| 214 | cleancache_init_fs at mount time. Unusual, misbehaving, or | ||
| 215 | poorly layered filesystems must either add additional hooks | ||
| 216 | and/or undergo extensive additional testing... or should just | ||
| 217 | not enable the optional cleancache. | ||
| 218 | |||
| 219 | Some points for a filesystem to consider: | ||
| 220 | |||
| 221 | - The FS should be block-device-based (e.g. a ram-based FS such | ||
| 222 | as tmpfs should not enable cleancache) | ||
| 223 | - To ensure coherency/correctness, the FS must ensure that all | ||
| 224 | file removal or truncation operations either go through VFS or | ||
| 225 | add hooks to do the equivalent cleancache "flush" operations | ||
| 226 | - To ensure coherency/correctness, either inode numbers must | ||
| 227 | be unique across the lifetime of the on-disk file OR the | ||
| 228 | FS must provide an "encode_fh" function. | ||
| 229 | - The FS must call the VFS superblock alloc and deactivate routines | ||
| 230 | or add hooks to do the equivalent cleancache calls done there. | ||
| 231 | - To maximize performance, all pages fetched from the FS should | ||
| 232 | go through the do_mpage_readpage routine or the FS should add | ||
| 233 | hooks to do the equivalent (cf. btrfs) | ||
| 234 | - Currently, the FS blocksize must be the same as PAGESIZE. This | ||
| 235 | is not an architectural restriction, but no backends currently | ||
| 236 | support anything different. | ||
| 237 | - A clustered FS should invoke the "init_shared_fs" cleancache | ||
| 238 | hook to get best performance for some backends. | ||
| 239 | |||
| 240 | 7) Why not use the KVA of the inode as the key? (Christoph Hellwig) | ||
| 241 | |||
| 242 | If cleancache used the inode virtual address instead of | ||
| 243 | inode/filehandle, the pool id could be eliminated. But, this | ||
| 244 | won't work because cleancache retains pagecache data pages | ||
| 245 | persistently even when the inode has been pruned from the | ||
| 246 | inode unused list, and only flushes the data page if the file | ||
| 247 | gets removed/truncated. So if cleancache used the inode kva, | ||
| 248 | there would be potential coherency issues if/when the inode | ||
| 249 | kva is reused for a different file. Alternately, if cleancache | ||
| 250 | flushed the pages when the inode kva was freed, much of the value | ||
| 251 | of cleancache would be lost because the cache of pages in cleancache | ||
| 252 | is potentially much larger than the kernel pagecache and is most | ||
| 253 | useful if the pages survive inode cache removal. | ||
| 254 | |||
| 255 | 8) Why is a global variable required? | ||
| 256 | |||
| 257 | The cleancache_enabled flag is checked in all of the frequently-used | ||
| 258 | cleancache hooks. The alternative is a function call to check a static | ||
| 259 | variable. Since cleancache is enabled dynamically at runtime, systems | ||
| 260 | that don't enable cleancache would suffer thousands (possibly | ||
| 261 | tens-of-thousands) of unnecessary function calls per second. So the | ||
| 262 | global variable allows cleancache to be enabled by default at compile | ||
| 263 | time, but have insignificant performance impact when cleancache remains | ||
| 264 | disabled at runtime. | ||
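The trade-off described in this answer can be sketched in user space as an inline wrapper that tests a global flag before making the out-of-line call. This is a minimal illustration, not the kernel's actual hook code; all toy_* names are hypothetical, and the call counter exists only to make the effect observable.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the global flag; false until a backend registers. */
static bool toy_cleancache_enabled;

/* Counts how often the out-of-line path is actually taken. */
static unsigned long toy_slow_calls;

/* Out-of-line path: in this sketch it only counts the call and misses. */
static int toy_slow_get_page(unsigned long index)
{
	(void)index;
	toy_slow_calls++;
	return -1;
}

/*
 * Inline hook: when the flag is clear, the cost is one load and one
 * compare, and no function call is made.
 */
static inline int toy_get_page_hook(unsigned long index)
{
	if (!toy_cleancache_enabled)
		return -1;
	return toy_slow_get_page(index);
}
```

With the flag clear, any number of toy_get_page_hook() calls leaves toy_slow_calls at zero. Only after the flag is set does each call reach the out-of-line function.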
| 265 | |||
| 266 | 9) Does cleancache work with KVM? | ||
| 267 | |||
| 268 | The memory model of KVM is sufficiently different that a cleancache | ||
| 269 | backend may have less value for KVM. This remains to be tested, | ||
| 270 | especially in an overcommitted system. | ||
| 271 | |||
| 272 | 10) Does cleancache work in userspace? It sounds useful for | ||
| 273 | memory hungry caches like web browsers. (Jamie Lokier) | ||
| 274 | |||
| 275 | No plans yet, though we agree it sounds useful, at least for | ||
| 276 | apps that bypass the page cache (e.g. O_DIRECT). | ||
| 277 | |||
| 278 | Last updated: Dan Magenheimer, April 13 2011 | ||
diff --git a/Documentation/vm/locking b/Documentation/vm/locking index 25fadb448760..f61228bd6395 100644 --- a/Documentation/vm/locking +++ b/Documentation/vm/locking | |||
| @@ -66,7 +66,7 @@ in some cases it is not really needed. Eg, vm_start is modified by | |||
| 66 | expand_stack(), it is hard to come up with a destructive scenario without | 66 | expand_stack(), it is hard to come up with a destructive scenario without |
| 67 | having the vmlist protection in this case. | 67 | having the vmlist protection in this case. |
| 68 | 68 | ||
| 69 | The page_table_lock nests with the inode i_mmap_lock and the kmem cache | 69 | The page_table_lock nests with the inode i_mmap_mutex and the kmem cache |
| 70 | c_spinlock spinlocks. This is okay, since the kmem code asks for pages after | 70 | c_spinlock spinlocks. This is okay, since the kmem code asks for pages after |
| 71 | dropping c_spinlock. The page_table_lock also nests with pagecache_lock and | 71 | dropping c_spinlock. The page_table_lock also nests with pagecache_lock and |
| 72 | pagemap_lru_lock spinlocks, and no code asks for memory with these locks | 72 | pagemap_lru_lock spinlocks, and no code asks for memory with these locks |
