aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-block64
-rw-r--r--Documentation/ABI/testing/sysfs-kernel-mm-cleancache11
-rw-r--r--Documentation/ABI/testing/sysfs-ptp98
-rw-r--r--Documentation/IRQ-affinity.txt17
-rw-r--r--Documentation/blockdev/cciss.txt15
-rw-r--r--Documentation/cachetlb.txt2
-rw-r--r--Documentation/devicetree/bindings/net/fsl-tsec-phy.txt54
-rw-r--r--Documentation/filesystems/9p.txt29
-rw-r--r--Documentation/filesystems/ext4.txt4
-rw-r--r--Documentation/filesystems/proc.txt11
-rw-r--r--Documentation/filesystems/xfs.txt6
-rw-r--r--Documentation/hwmon/emc6w20142
-rw-r--r--Documentation/hwmon/f71882fg4
-rw-r--r--Documentation/hwmon/fam15h_power37
-rw-r--r--Documentation/hwmon/k10temp3
-rw-r--r--Documentation/hwmon/max665021
-rw-r--r--Documentation/ioctl/ioctl-number.txt1
-rw-r--r--Documentation/kbuild/kconfig-language.txt32
-rw-r--r--Documentation/kbuild/kconfig.txt5
-rw-r--r--Documentation/kernel-parameters.txt3
-rw-r--r--Documentation/lockstat.txt2
-rw-r--r--Documentation/mmc/00-INDEX2
-rw-r--r--Documentation/mmc/mmc-dev-attrs.txt10
-rw-r--r--Documentation/mmc/mmc-dev-parts.txt27
-rw-r--r--Documentation/networking/bonding.txt13
-rw-r--r--Documentation/ptp/ptp.txt89
-rw-r--r--Documentation/ptp/testptp.c381
-rw-r--r--Documentation/ptp/testptp.mk33
-rw-r--r--Documentation/virtual/uml/UserModeLinux-HOWTO.txt10
-rw-r--r--Documentation/vm/cleancache.txt278
-rw-r--r--Documentation/vm/locking2
31 files changed, 1257 insertions, 49 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 4873c759d535..c1eb41cb9876 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -142,3 +142,67 @@ Description:
142 with the previous I/O request are enabled. When set to 2, 142 with the previous I/O request are enabled. When set to 2,
143 all merge tries are disabled. The default value is 0 - 143 all merge tries are disabled. The default value is 0 -
144 which enables all types of merge tries. 144 which enables all types of merge tries.
145
146What: /sys/block/<disk>/discard_alignment
147Date: May 2011
148Contact: Martin K. Petersen <martin.petersen@oracle.com>
149Description:
150 Devices that support discard functionality may
151 internally allocate space in units that are bigger than
152 the exported logical block size. The discard_alignment
153 parameter indicates how many bytes the beginning of the
154 device is offset from the internal allocation unit's
155 natural alignment.
156
157What: /sys/block/<disk>/<partition>/discard_alignment
158Date: May 2011
159Contact: Martin K. Petersen <martin.petersen@oracle.com>
160Description:
161 Devices that support discard functionality may
162 internally allocate space in units that are bigger than
163 the exported logical block size. The discard_alignment
164 parameter indicates how many bytes the beginning of the
165 partition is offset from the internal allocation unit's
166 natural alignment.
167
168What: /sys/block/<disk>/queue/discard_granularity
169Date: May 2011
170Contact: Martin K. Petersen <martin.petersen@oracle.com>
171Description:
172 Devices that support discard functionality may
173 internally allocate space using units that are bigger
174 than the logical block size. The discard_granularity
175 parameter indicates the size of the internal allocation
176 unit in bytes if reported by the device. Otherwise the
177 discard_granularity will be set to match the device's
178 physical block size. A discard_granularity of 0 means
179 that the device does not support discard functionality.
180
181What: /sys/block/<disk>/queue/discard_max_bytes
182Date: May 2011
183Contact: Martin K. Petersen <martin.petersen@oracle.com>
184Description:
185 Devices that support discard functionality may have
186 internal limits on the number of bytes that can be
187 trimmed or unmapped in a single operation. Some storage
188 protocols also have inherent limits on the number of
189 blocks that can be described in a single command. The
190 discard_max_bytes parameter is set by the device driver
191 to the maximum number of bytes that can be discarded in
192 a single operation. Discard requests issued to the
193 device must not exceed this limit. A discard_max_bytes
194 value of 0 means that the device does not support
195 discard functionality.
196
197What: /sys/block/<disk>/queue/discard_zeroes_data
198Date: May 2011
199Contact: Martin K. Petersen <martin.petersen@oracle.com>
200Description:
201 Devices that support discard functionality may return
202 stale or random data when a previously discarded block
203 is read back. This can cause problems if the filesystem
204 expects discarded blocks to be explicitly cleared. If a
205 device reports that it deterministically returns zeroes
206 when a discarded area is read the discard_zeroes_data
207 parameter will be set to one. Otherwise it will be 0 and
208 the result of reading a discarded area is undefined.
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cleancache b/Documentation/ABI/testing/sysfs-kernel-mm-cleancache
new file mode 100644
index 000000000000..662ae646ea12
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-cleancache
@@ -0,0 +1,11 @@
1What: /sys/kernel/mm/cleancache/
2Date: April 2011
3Contact: Dan Magenheimer <dan.magenheimer@oracle.com>
4Description:
5 /sys/kernel/mm/cleancache/ contains a number of files which
6 record a count of various cleancache operations
7 (sum across all filesystems):
8 succ_gets
9 failed_gets
10 puts
11 flushes
diff --git a/Documentation/ABI/testing/sysfs-ptp b/Documentation/ABI/testing/sysfs-ptp
new file mode 100644
index 000000000000..d40d2b550502
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-ptp
@@ -0,0 +1,98 @@
1What: /sys/class/ptp/
2Date: September 2010
3Contact: Richard Cochran <richardcochran@gmail.com>
4Description:
5 This directory contains files and directories
6 providing a standardized interface to the ancillary
7 features of PTP hardware clocks.
8
9What: /sys/class/ptp/ptpN/
10Date: September 2010
11Contact: Richard Cochran <richardcochran@gmail.com>
12Description:
13 This directory contains the attributes of the Nth PTP
14 hardware clock registered into the PTP class driver
15 subsystem.
16
17What: /sys/class/ptp/ptpN/clock_name
18Date: September 2010
19Contact: Richard Cochran <richardcochran@gmail.com>
20Description:
21 This file contains the name of the PTP hardware clock
22 as a human readable string.
23
24What: /sys/class/ptp/ptpN/max_adjustment
25Date: September 2010
26Contact: Richard Cochran <richardcochran@gmail.com>
27Description:
28 This file contains the PTP hardware clock's maximum
29 frequency adjustment value (a positive integer) in
30 parts per billion.
31
32What: /sys/class/ptp/ptpN/n_alarms
33Date: September 2010
34Contact: Richard Cochran <richardcochran@gmail.com>
35Description:
36 This file contains the number of periodic or one shot
37 alarms offer by the PTP hardware clock.
38
39What: /sys/class/ptp/ptpN/n_external_timestamps
40Date: September 2010
41Contact: Richard Cochran <richardcochran@gmail.com>
42Description:
43 This file contains the number of external timestamp
44 channels offered by the PTP hardware clock.
45
46What: /sys/class/ptp/ptpN/n_periodic_outputs
47Date: September 2010
48Contact: Richard Cochran <richardcochran@gmail.com>
49Description:
50 This file contains the number of programmable periodic
51 output channels offered by the PTP hardware clock.
52
53What: /sys/class/ptp/ptpN/pps_avaiable
54Date: September 2010
55Contact: Richard Cochran <richardcochran@gmail.com>
56Description:
57 This file indicates whether the PTP hardware clock
58 supports a Pulse Per Second to the host CPU. Reading
59 "1" means that the PPS is supported, while "0" means
60 not supported.
61
62What: /sys/class/ptp/ptpN/extts_enable
63Date: September 2010
64Contact: Richard Cochran <richardcochran@gmail.com>
65Description:
66 This write-only file enables or disables external
67 timestamps. To enable external timestamps, write the
68 channel index followed by a "1" into the file.
69 To disable external timestamps, write the channel
70 index followed by a "0" into the file.
71
72What: /sys/class/ptp/ptpN/fifo
73Date: September 2010
74Contact: Richard Cochran <richardcochran@gmail.com>
75Description:
76 This file provides timestamps on external events, in
77 the form of three integers: channel index, seconds,
78 and nanoseconds.
79
80What: /sys/class/ptp/ptpN/period
81Date: September 2010
82Contact: Richard Cochran <richardcochran@gmail.com>
83Description:
84 This write-only file enables or disables periodic
85 outputs. To enable a periodic output, write five
86 integers into the file: channel index, start time
87 seconds, start time nanoseconds, period seconds, and
88 period nanoseconds. To disable a periodic output, set
89 all the seconds and nanoseconds values to zero.
90
91What: /sys/class/ptp/ptpN/pps_enable
92Date: September 2010
93Contact: Richard Cochran <richardcochran@gmail.com>
94Description:
95 This write-only file enables or disables delivery of
96 PPS events to the Linux PPS subsystem. To enable PPS
97 events, write a "1" into the file. To disable events,
98 write a "0" into the file.
diff --git a/Documentation/IRQ-affinity.txt b/Documentation/IRQ-affinity.txt
index b4a615b78403..7890fae18529 100644
--- a/Documentation/IRQ-affinity.txt
+++ b/Documentation/IRQ-affinity.txt
@@ -4,10 +4,11 @@ ChangeLog:
4 4
5SMP IRQ affinity 5SMP IRQ affinity
6 6
7/proc/irq/IRQ#/smp_affinity specifies which target CPUs are permitted 7/proc/irq/IRQ#/smp_affinity and /proc/irq/IRQ#/smp_affinity_list specify
8for a given IRQ source. It's a bitmask of allowed CPUs. It's not allowed 8which target CPUs are permitted for a given IRQ source. It's a bitmask
9to turn off all CPUs, and if an IRQ controller does not support IRQ 9(smp_affinity) or cpu list (smp_affinity_list) of allowed CPUs. It's not
10affinity then the value will not change from the default 0xffffffff. 10allowed to turn off all CPUs, and if an IRQ controller does not support
11IRQ affinity then the value will not change from the default of all cpus.
11 12
12/proc/irq/default_smp_affinity specifies default affinity mask that applies 13/proc/irq/default_smp_affinity specifies default affinity mask that applies
13to all non-active IRQs. Once IRQ is allocated/activated its affinity bitmask 14to all non-active IRQs. Once IRQ is allocated/activated its affinity bitmask
@@ -54,3 +55,11 @@ round-trip min/avg/max = 0.1/0.5/585.4 ms
54This time around IRQ44 was delivered only to the last four processors. 55This time around IRQ44 was delivered only to the last four processors.
55i.e counters for the CPU0-3 did not change. 56i.e counters for the CPU0-3 did not change.
56 57
58Here is an example of limiting that same irq (44) to cpus 1024 to 1031:
59
60[root@moon 44]# echo 1024-1031 > smp_affinity
61[root@moon 44]# cat smp_affinity
621024-1031
63
64Note that to do this with a bitmask would require 32 bitmasks of zero
65to follow the pertinent one.
diff --git a/Documentation/blockdev/cciss.txt b/Documentation/blockdev/cciss.txt
index 89698e8df7d4..c00c6a5ab21f 100644
--- a/Documentation/blockdev/cciss.txt
+++ b/Documentation/blockdev/cciss.txt
@@ -169,3 +169,18 @@ is issued which positions the tape to a known position. Typically you
169must rewind the tape (by issuing "mt -f /dev/st0 rewind" for example) 169must rewind the tape (by issuing "mt -f /dev/st0 rewind" for example)
170before i/o can proceed again to a tape drive which was reset. 170before i/o can proceed again to a tape drive which was reset.
171 171
172There is a cciss_tape_cmds module parameter which can be used to make cciss
173allocate more commands for use by tape drives. Ordinarily only a few commands
174(6) are allocated for tape drives because tape drives are slow and
175infrequently used and the primary purpose of Smart Array controllers is to
176act as a RAID controller for disk drives, so the vast majority of commands
177are allocated for disk devices. However, if you have more than a few tape
178drives attached to a smart array, the default number of commands may not be
179enought (for example, if you have 8 tape drives, you could only rewind 6
180at one time with the default number of commands.) The cciss_tape_cmds module
181parameter allows more commands (up to 16 more) to be allocated for use by
182tape drives. For example:
183
184 insmod cciss.ko cciss_tape_cmds=16
185
186Or, as a kernel boot parameter passed in via grub: cciss.cciss_tape_cmds=8
diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt
index 9164ae3b83bc..9b728dc17535 100644
--- a/Documentation/cachetlb.txt
+++ b/Documentation/cachetlb.txt
@@ -16,7 +16,7 @@ on all processors in the system. Don't let this scare you into
16thinking SMP cache/tlb flushing must be so inefficient, this is in 16thinking SMP cache/tlb flushing must be so inefficient, this is in
17fact an area where many optimizations are possible. For example, 17fact an area where many optimizations are possible. For example,
18if it can be proven that a user address space has never executed 18if it can be proven that a user address space has never executed
19on a cpu (see vma->cpu_vm_mask), one need not perform a flush 19on a cpu (see mm_cpumask()), one need not perform a flush
20for this address space on that cpu. 20for this address space on that cpu.
21 21
22First, the TLB flushing interfaces, since they are the simplest. The 22First, the TLB flushing interfaces, since they are the simplest. The
diff --git a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
index edb7ae19e868..2c6be0377f55 100644
--- a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
+++ b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
@@ -74,3 +74,57 @@ Example:
74 interrupt-parent = <&mpic>; 74 interrupt-parent = <&mpic>;
75 phy-handle = <&phy0> 75 phy-handle = <&phy0>
76 }; 76 };
77
78* Gianfar PTP clock nodes
79
80General Properties:
81
82 - compatible Should be "fsl,etsec-ptp"
83 - reg Offset and length of the register set for the device
84 - interrupts There should be at least two interrupts. Some devices
85 have as many as four PTP related interrupts.
86
87Clock Properties:
88
89 - fsl,tclk-period Timer reference clock period in nanoseconds.
90 - fsl,tmr-prsc Prescaler, divides the output clock.
91 - fsl,tmr-add Frequency compensation value.
92 - fsl,tmr-fiper1 Fixed interval period pulse generator.
93 - fsl,tmr-fiper2 Fixed interval period pulse generator.
94 - fsl,max-adj Maximum frequency adjustment in parts per billion.
95
96 These properties set the operational parameters for the PTP
97 clock. You must choose these carefully for the clock to work right.
98 Here is how to figure good values:
99
100 TimerOsc = system clock MHz
101 tclk_period = desired clock period nanoseconds
102 NominalFreq = 1000 / tclk_period MHz
103 FreqDivRatio = TimerOsc / NominalFreq (must be greater that 1.0)
104 tmr_add = ceil(2^32 / FreqDivRatio)
105 OutputClock = NominalFreq / tmr_prsc MHz
106 PulseWidth = 1 / OutputClock microseconds
107 FiperFreq1 = desired frequency in Hz
108 FiperDiv1 = 1000000 * OutputClock / FiperFreq1
109 tmr_fiper1 = tmr_prsc * tclk_period * FiperDiv1 - tclk_period
110 max_adj = 1000000000 * (FreqDivRatio - 1.0) - 1
111
112 The calculation for tmr_fiper2 is the same as for tmr_fiper1. The
113 driver expects that tmr_fiper1 will be correctly set to produce a 1
114 Pulse Per Second (PPS) signal, since this will be offered to the PPS
115 subsystem to synchronize the Linux clock.
116
117Example:
118
119 ptp_clock@24E00 {
120 compatible = "fsl,etsec-ptp";
121 reg = <0x24E00 0xB0>;
122 interrupts = <12 0x8 13 0x8>;
123 interrupt-parent = < &ipic >;
124 fsl,tclk-period = <10>;
125 fsl,tmr-prsc = <100>;
126 fsl,tmr-add = <0x999999A4>;
127 fsl,tmr-fiper1 = <0x3B9AC9F6>;
128 fsl,tmr-fiper2 = <0x00018696>;
129 fsl,max-adj = <659999998>;
130 };
diff --git a/Documentation/filesystems/9p.txt b/Documentation/filesystems/9p.txt
index b22abba78fed..13de64c7f0ab 100644
--- a/Documentation/filesystems/9p.txt
+++ b/Documentation/filesystems/9p.txt
@@ -25,6 +25,8 @@ Other applications are described in the following papers:
25 http://xcpu.org/papers/cellfs-talk.pdf 25 http://xcpu.org/papers/cellfs-talk.pdf
26 * PROSE I/O: Using 9p to enable Application Partitions 26 * PROSE I/O: Using 9p to enable Application Partitions
27 http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf 27 http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf
28 * VirtFS: A Virtualization Aware File System pass-through
29 http://goo.gl/3WPDg
28 30
29USAGE 31USAGE
30===== 32=====
@@ -130,31 +132,20 @@ OPTIONS
130RESOURCES 132RESOURCES
131========= 133=========
132 134
133Our current recommendation is to use Inferno (http://www.vitanuova.com/nferno/index.html) 135Protocol specifications are maintained on github:
134as the 9p server. You can start a 9p server under Inferno by issuing the 136http://ericvh.github.com/9p-rfc/
135following command:
136 ; styxlisten -A tcp!*!564 export '#U*'
137 137
138The -A specifies an unauthenticated export. The 564 is the port # (you may 1389p client and server implementations are listed on
139have to choose a higher port number if running as a normal user). The '#U*' 139http://9p.cat-v.org/implementations
140specifies exporting the root of the Linux name space. You may specify a
141subset of the namespace by extending the path: '#U*'/tmp would just export
142/tmp. For more information, see the Inferno manual pages covering styxlisten
143and export.
144 140
145A Linux version of the 9p server is now maintained under the npfs project 141A 9p2000.L server is being developed by LLNL and can be found
146on sourceforge (http://sourceforge.net/projects/npfs). The currently 142at http://code.google.com/p/diod/
147maintained version is the single-threaded version of the server (named spfs)
148available from the same SVN repository.
149 143
150There are user and developer mailing lists available through the v9fs project 144There are user and developer mailing lists available through the v9fs project
151on sourceforge (http://sourceforge.net/projects/v9fs). 145on sourceforge (http://sourceforge.net/projects/v9fs).
152 146
153A stand-alone version of the module (which should build for any 2.6 kernel) 147News and other information is maintained on a Wiki.
154is available via (http://github.com/ericvh/9p-sac/tree/master) 148(http://sf.net/apps/mediawiki/v9fs/index.php).
155
156News and other information is maintained on SWiK (http://swik.net/v9fs)
157and the Wiki (http://sf.net/apps/mediawiki/v9fs/index.php).
158 149
159Bug reports may be issued through the kernel.org bugzilla 150Bug reports may be issued through the kernel.org bugzilla
160(http://bugzilla.kernel.org) 151(http://bugzilla.kernel.org)
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index c79ec58fd7f6..3ae9bc94352a 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -226,10 +226,6 @@ acl Enables POSIX Access Control Lists support.
226noacl This option disables POSIX Access Control List 226noacl This option disables POSIX Access Control List
227 support. 227 support.
228 228
229reservation
230
231noreservation
232
233bsddf (*) Make 'df' act like BSD. 229bsddf (*) Make 'df' act like BSD.
234minixdf Make 'df' act like Minix. 230minixdf Make 'df' act like Minix.
235 231
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 60740e8ecb37..f48178024067 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -574,6 +574,12 @@ The contents of each smp_affinity file is the same by default:
574 > cat /proc/irq/0/smp_affinity 574 > cat /proc/irq/0/smp_affinity
575 ffffffff 575 ffffffff
576 576
577There is an alternate interface, smp_affinity_list which allows specifying
578a cpu range instead of a bitmask:
579
580 > cat /proc/irq/0/smp_affinity_list
581 1024-1031
582
577The default_smp_affinity mask applies to all non-active IRQs, which are the 583The default_smp_affinity mask applies to all non-active IRQs, which are the
578IRQs which have not yet been allocated/activated, and hence which lack a 584IRQs which have not yet been allocated/activated, and hence which lack a
579/proc/irq/[0-9]* directory. 585/proc/irq/[0-9]* directory.
@@ -583,12 +589,13 @@ reports itself as being attached. This hardware locality information does not
583include information about any possible driver locality preference. 589include information about any possible driver locality preference.
584 590
585prof_cpu_mask specifies which CPUs are to be profiled by the system wide 591prof_cpu_mask specifies which CPUs are to be profiled by the system wide
586profiler. Default value is ffffffff (all cpus). 592profiler. Default value is ffffffff (all cpus if there are only 32 of them).
587 593
588The way IRQs are routed is handled by the IO-APIC, and it's Round Robin 594The way IRQs are routed is handled by the IO-APIC, and it's Round Robin
589between all the CPUs which are allowed to handle it. As usual the kernel has 595between all the CPUs which are allowed to handle it. As usual the kernel has
590more info than you and does a better job than you, so the defaults are the 596more info than you and does a better job than you, so the defaults are the
591best choice for almost everyone. 597best choice for almost everyone. [Note this applies only to those IO-APIC's
598that support "Round Robin" interrupt distribution.]
592 599
593There are three more important subdirectories in /proc: net, scsi, and sys. 600There are three more important subdirectories in /proc: net, scsi, and sys.
594The general rule is that the contents, or even the existence of these 601The general rule is that the contents, or even the existence of these
diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
index 7bff3e4f35df..3fc0c31a6f5d 100644
--- a/Documentation/filesystems/xfs.txt
+++ b/Documentation/filesystems/xfs.txt
@@ -39,6 +39,12 @@ When mounting an XFS filesystem, the following options are accepted.
39 drive level write caching to be enabled, for devices that 39 drive level write caching to be enabled, for devices that
40 support write barriers. 40 support write barriers.
41 41
42 discard
43 Issue command to let the block device reclaim space freed by the
44 filesystem. This is useful for SSD devices, thinly provisioned
45 LUNs and virtual machine images, but may have a performance
46 impact. This option is incompatible with the nodelaylog option.
47
42 dmapi 48 dmapi
43 Enable the DMAPI (Data Management API) event callouts. 49 Enable the DMAPI (Data Management API) event callouts.
44 Use with the "mtpt" option. 50 Use with the "mtpt" option.
diff --git a/Documentation/hwmon/emc6w201 b/Documentation/hwmon/emc6w201
new file mode 100644
index 000000000000..32f355aaf56b
--- /dev/null
+++ b/Documentation/hwmon/emc6w201
@@ -0,0 +1,42 @@
1Kernel driver emc6w201
2======================
3
4Supported chips:
5 * SMSC EMC6W201
6 Prefix: 'emc6w201'
7 Addresses scanned: I2C 0x2c, 0x2d, 0x2e
8 Datasheet: Not public
9
10Author: Jean Delvare <khali@linux-fr.org>
11
12
13Description
14-----------
15
16From the datasheet:
17
18"The EMC6W201 is an environmental monitoring device with automatic fan
19control capability and enhanced system acoustics for noise suppression.
20This ACPI compliant device provides hardware monitoring for up to six
21voltages (including its own VCC) and five external thermal sensors,
22measures the speed of up to five fans, and controls the speed of
23multiple DC fans using three Pulse Width Modulator (PWM) outputs. Note
24that it is possible to control more than three fans by connecting two
25fans to one PWM output. The EMC6W201 will be available in a 36-pin
26QFN package."
27
28The device is functionally close to the EMC6D100 series, but is
29register-incompatible.
30
31The driver currently only supports the monitoring of the voltages,
32temperatures and fan speeds. Limits can be changed. Alarms are not
33supported, and neither is fan speed control.
34
35
36Known Systems With EMC6W201
37---------------------------
38
39The EMC6W201 is a rare device, only found on a few systems, made in
402005 and 2006. Known systems with this device:
41* Dell Precision 670 workstation
42* Gigabyte 2CEWH mainboard
diff --git a/Documentation/hwmon/f71882fg b/Documentation/hwmon/f71882fg
index df02245d1419..84d2623810f3 100644
--- a/Documentation/hwmon/f71882fg
+++ b/Documentation/hwmon/f71882fg
@@ -6,6 +6,10 @@ Supported chips:
6 Prefix: 'f71808e' 6 Prefix: 'f71808e'
7 Addresses scanned: none, address read from Super I/O config space 7 Addresses scanned: none, address read from Super I/O config space
8 Datasheet: Not public 8 Datasheet: Not public
9 * Fintek F71808A
10 Prefix: 'f71808a'
11 Addresses scanned: none, address read from Super I/O config space
12 Datasheet: Not public
9 * Fintek F71858FG 13 * Fintek F71858FG
10 Prefix: 'f71858fg' 14 Prefix: 'f71858fg'
11 Addresses scanned: none, address read from Super I/O config space 15 Addresses scanned: none, address read from Super I/O config space
diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power
new file mode 100644
index 000000000000..a92918e0bd69
--- /dev/null
+++ b/Documentation/hwmon/fam15h_power
@@ -0,0 +1,37 @@
1Kernel driver fam15h_power
2==========================
3
4Supported chips:
5* AMD Family 15h Processors
6
7 Prefix: 'fam15h_power'
8 Addresses scanned: PCI space
9 Datasheets:
10 BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors
11 (not yet published)
12
13Author: Andreas Herrmann <andreas.herrmann3@amd.com>
14
15Description
16-----------
17
18This driver permits reading of registers providing power information
19of AMD Family 15h processors.
20
21For AMD Family 15h processors the following power values can be
22calculated using different processor northbridge function registers:
23
24* BasePwrWatts: Specifies in watts the maximum amount of power
25 consumed by the processor for NB and logic external to the core.
26* ProcessorPwrWatts: Specifies in watts the maximum amount of power
27 the processor can support.
28* CurrPwrWatts: Specifies in watts the current amount of power being
29 consumed by the processor.
30
31This driver provides ProcessorPwrWatts and CurrPwrWatts:
32* power1_crit (ProcessorPwrWatts)
33* power1_input (CurrPwrWatts)
34
35On multi-node processors the calculated value is for the entire
36package and not for a single node. Thus the driver creates sysfs
37attributes only for internal node0 of a multi-node processor.
diff --git a/Documentation/hwmon/k10temp b/Documentation/hwmon/k10temp
index d2b56a4fd1f5..0393c89277c0 100644
--- a/Documentation/hwmon/k10temp
+++ b/Documentation/hwmon/k10temp
@@ -11,6 +11,7 @@ Supported chips:
11 Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra) 11 Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra)
12* AMD Family 12h processors: "Llano" 12* AMD Family 12h processors: "Llano"
13* AMD Family 14h processors: "Brazos" (C/E/G-Series) 13* AMD Family 14h processors: "Brazos" (C/E/G-Series)
14* AMD Family 15h processors: "Bulldozer"
14 15
15 Prefix: 'k10temp' 16 Prefix: 'k10temp'
16 Addresses scanned: PCI space 17 Addresses scanned: PCI space
@@ -40,7 +41,7 @@ Description
40----------- 41-----------
41 42
42This driver permits reading of the internal temperature sensor of AMD 43This driver permits reading of the internal temperature sensor of AMD
43Family 10h/11h/12h/14h processors. 44Family 10h/11h/12h/14h/15h processors.
44 45
45All these processors have a sensor, but on those for Socket F or AM2+, 46All these processors have a sensor, but on those for Socket F or AM2+,
46the sensor may return inconsistent values (erratum 319). The driver 47the sensor may return inconsistent values (erratum 319). The driver
diff --git a/Documentation/hwmon/max6650 b/Documentation/hwmon/max6650
index c565650fcfc6..58d9644a2bde 100644
--- a/Documentation/hwmon/max6650
+++ b/Documentation/hwmon/max6650
@@ -2,9 +2,13 @@ Kernel driver max6650
2===================== 2=====================
3 3
4Supported chips: 4Supported chips:
5 * Maxim 6650 / 6651 5 * Maxim MAX6650
6 Prefix: 'max6650' 6 Prefix: 'max6650'
7 Addresses scanned: I2C 0x1b, 0x1f, 0x48, 0x4b 7 Addresses scanned: none
8 Datasheet: http://pdfserv.maxim-ic.com/en/ds/MAX6650-MAX6651.pdf
9 * Maxim MAX6651
10 Prefix: 'max6651'
11 Addresses scanned: none
8 Datasheet: http://pdfserv.maxim-ic.com/en/ds/MAX6650-MAX6651.pdf 12 Datasheet: http://pdfserv.maxim-ic.com/en/ds/MAX6650-MAX6651.pdf
9 13
10Authors: 14Authors:
@@ -15,10 +19,10 @@ Authors:
15Description 19Description
16----------- 20-----------
17 21
18This driver implements support for the Maxim 6650/6651 22This driver implements support for the Maxim MAX6650 and MAX6651.
19 23
20The 2 devices are very similar, but the Maxim 6550 has a reduced feature 24The 2 devices are very similar, but the MAX6550 has a reduced feature
21set, e.g. only one fan-input, instead of 4 for the 6651. 25set, e.g. only one fan-input, instead of 4 for the MAX6651.
22 26
23The driver is not able to distinguish between the 2 devices. 27The driver is not able to distinguish between the 2 devices.
24 28
@@ -36,6 +40,13 @@ fan1_div rw sets the speed range the inputs can handle. Legal
36 values are 1, 2, 4, and 8. Use lower values for 40 values are 1, 2, 4, and 8. Use lower values for
37 faster fans. 41 faster fans.
38 42
43Usage notes
44-----------
45
46This driver does not auto-detect devices. You will have to instantiate the
47devices explicitly. Please see Documentation/i2c/instantiating-devices for
48details.
49
39Module parameters 50Module parameters
40----------------- 51-----------------
41 52
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
index 2d1ad12e2b3e..3a46e360496d 100644
--- a/Documentation/ioctl/ioctl-number.txt
+++ b/Documentation/ioctl/ioctl-number.txt
@@ -304,6 +304,7 @@ Code Seq#(hex) Include File Comments
3040xB0 all RATIO devices in development: 3040xB0 all RATIO devices in development:
305 <mailto:vgo@ratio.de> 305 <mailto:vgo@ratio.de>
3060xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca> 3060xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca>
3070xB3 00 linux/mmc/ioctl.h
3070xC0 00-0F linux/usb/iowarrior.h 3080xC0 00-0F linux/usb/iowarrior.h
3080xCB 00-1F CBM serial IEC bus in development: 3090xCB 00-1F CBM serial IEC bus in development:
309 <mailto:michael.klein@puffin.lb.shuttle.de> 310 <mailto:michael.klein@puffin.lb.shuttle.de>
diff --git a/Documentation/kbuild/kconfig-language.txt b/Documentation/kbuild/kconfig-language.txt
index b507d61fd41c..44e2649fbb29 100644
--- a/Documentation/kbuild/kconfig-language.txt
+++ b/Documentation/kbuild/kconfig-language.txt
@@ -113,6 +113,13 @@ applicable everywhere (see syntax).
113 That will limit the usefulness but on the other hand avoid 113 That will limit the usefulness but on the other hand avoid
114 the illegal configurations all over. 114 the illegal configurations all over.
115 115
116- limiting menu display: "visible if" <expr>
117 This attribute is only applicable to menu blocks, if the condition is
118 false, the menu block is not displayed to the user (the symbols
119 contained there can still be selected by other symbols, though). It is
120 similar to a conditional "prompt" attribude for individual menu
121 entries. Default value of "visible" is true.
122
116- numerical ranges: "range" <symbol> <symbol> ["if" <expr>] 123- numerical ranges: "range" <symbol> <symbol> ["if" <expr>]
117 This allows to limit the range of possible input values for int 124 This allows to limit the range of possible input values for int
118 and hex symbols. The user can only input a value which is larger than 125 and hex symbols. The user can only input a value which is larger than
@@ -303,7 +310,8 @@ menu:
303 "endmenu" 310 "endmenu"
304 311
305This defines a menu block, see "Menu structure" above for more 312This defines a menu block, see "Menu structure" above for more
306information. The only possible options are dependencies. 313information. The only possible options are dependencies and "visible"
314attributes.
307 315
308if: 316if:
309 317
@@ -381,3 +389,25 @@ config FOO
381 389
382limits FOO to module (=m) or disabled (=n). 390limits FOO to module (=m) or disabled (=n).
383 391
392Kconfig symbol existence
393~~~~~~~~~~~~~~~~~~~~~~~~
394The following two methods produce the same kconfig symbol dependencies
395but differ greatly in kconfig symbol existence (production) in the
396generated config file.
397
398case 1:
399
400config FOO
401 tristate "about foo"
402 depends on BAR
403
404vs. case 2:
405
406if BAR
407config FOO
408 tristate "about foo"
409endif
410
411In case 1, the symbol FOO will always exist in the config file (given
412no other dependencies). In case 2, the symbol FOO will only exist in
413the config file if BAR is enabled.
diff --git a/Documentation/kbuild/kconfig.txt b/Documentation/kbuild/kconfig.txt
index cca46b1a0f6c..c313d71324b4 100644
--- a/Documentation/kbuild/kconfig.txt
+++ b/Documentation/kbuild/kconfig.txt
@@ -48,11 +48,6 @@ KCONFIG_OVERWRITECONFIG
48If you set KCONFIG_OVERWRITECONFIG in the environment, Kconfig will not 48If you set KCONFIG_OVERWRITECONFIG in the environment, Kconfig will not
49break symlinks when .config is a symlink to somewhere else. 49break symlinks when .config is a symlink to somewhere else.
50 50
51KCONFIG_NOTIMESTAMP
52--------------------------------------------------
53If this environment variable exists and is non-null, the timestamp line
54in generated .config files is omitted.
55
56______________________________________________________________________ 51______________________________________________________________________
57Environment variables for '{allyes/allmod/allno/rand}config' 52Environment variables for '{allyes/allmod/allno/rand}config'
58 53
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 7c6624e7a5cb..5438a2d7907f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1777,9 +1777,6 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1777 1777
1778 nosoftlockup [KNL] Disable the soft-lockup detector. 1778 nosoftlockup [KNL] Disable the soft-lockup detector.
1779 1779
1780 noswapaccount [KNL] Disable accounting of swap in memory resource
1781 controller. (See Documentation/cgroups/memory.txt)
1782
1783 nosync [HW,M68K] Disables sync negotiation for all devices. 1780 nosync [HW,M68K] Disables sync negotiation for all devices.
1784 1781
1785 notsc [BUGS=X86-32] Disable Time Stamp Counter 1782 notsc [BUGS=X86-32] Disable Time Stamp Counter
diff --git a/Documentation/lockstat.txt b/Documentation/lockstat.txt
index 65f4c795015d..9c0a80d17a23 100644
--- a/Documentation/lockstat.txt
+++ b/Documentation/lockstat.txt
@@ -136,7 +136,7 @@ View the top contending locks:
136 dcache_lock: 1037 1161 0.38 45.32 774.51 6611 243371 0.15 306.48 77387.24 136 dcache_lock: 1037 1161 0.38 45.32 774.51 6611 243371 0.15 306.48 77387.24
137 &inode->i_mutex: 161 286 18446744073709 62882.54 1244614.55 3653 20598 18446744073709 62318.60 1693822.74 137 &inode->i_mutex: 161 286 18446744073709 62882.54 1244614.55 3653 20598 18446744073709 62318.60 1693822.74
138 &zone->lru_lock: 94 94 0.53 7.33 92.10 4366 32690 0.29 59.81 16350.06 138 &zone->lru_lock: 94 94 0.53 7.33 92.10 4366 32690 0.29 59.81 16350.06
139 &inode->i_data.i_mmap_lock: 79 79 0.40 3.77 53.03 11779 87755 0.28 116.93 29898.44 139 &inode->i_data.i_mmap_mutex: 79 79 0.40 3.77 53.03 11779 87755 0.28 116.93 29898.44
140 &q->__queue_lock: 48 50 0.52 31.62 86.31 774 13131 0.17 113.08 12277.52 140 &q->__queue_lock: 48 50 0.52 31.62 86.31 774 13131 0.17 113.08 12277.52
141 &rq->rq_lock_key: 43 47 0.74 68.50 170.63 3706 33929 0.22 107.99 17460.62 141 &rq->rq_lock_key: 43 47 0.74 68.50 170.63 3706 33929 0.22 107.99 17460.62
142 &rq->rq_lock_key#2: 39 46 0.75 6.68 49.03 2979 32292 0.17 125.17 17137.63 142 &rq->rq_lock_key#2: 39 46 0.75 6.68 49.03 2979 32292 0.17 125.17 17137.63
diff --git a/Documentation/mmc/00-INDEX b/Documentation/mmc/00-INDEX
index fca586f5b853..93dd7a714075 100644
--- a/Documentation/mmc/00-INDEX
+++ b/Documentation/mmc/00-INDEX
@@ -2,3 +2,5 @@
2 - this file 2 - this file
3mmc-dev-attrs.txt 3mmc-dev-attrs.txt
4 - info on SD and MMC device attributes 4 - info on SD and MMC device attributes
5mmc-dev-parts.txt
6 - info on SD and MMC device partitions
diff --git a/Documentation/mmc/mmc-dev-attrs.txt b/Documentation/mmc/mmc-dev-attrs.txt
index ff2bd685bced..8898a95b41e5 100644
--- a/Documentation/mmc/mmc-dev-attrs.txt
+++ b/Documentation/mmc/mmc-dev-attrs.txt
@@ -1,3 +1,13 @@
1SD and MMC Block Device Attributes
2==================================
3
4These attributes are defined for the block devices associated with the
5SD or MMC device.
6
7The following attributes are read/write.
8
9 force_ro Enforce read-only access even if write protect switch is off.
10
1SD and MMC Device Attributes 11SD and MMC Device Attributes
2============================ 12============================
3 13
diff --git a/Documentation/mmc/mmc-dev-parts.txt b/Documentation/mmc/mmc-dev-parts.txt
new file mode 100644
index 000000000000..2db28b8e662f
--- /dev/null
+++ b/Documentation/mmc/mmc-dev-parts.txt
@@ -0,0 +1,27 @@
1SD and MMC Device Partitions
2============================
3
4Device partitions are additional logical block devices present on the
5SD/MMC device.
6
7As of this writing, MMC boot partitions as supported and exposed as
8/dev/mmcblkXboot0 and /dev/mmcblkXboot1, where X is the index of the
9parent /dev/mmcblkX.
10
11MMC Boot Partitions
12===================
13
14Read and write access is provided to the two MMC boot partitions. Due to
15the sensitive nature of the boot partition contents, which often store
16a bootloader or bootloader configuration tables crucial to booting the
17platform, write access is disabled by default to reduce the chance of
18accidental bricking.
19
20To enable write access to /dev/mmcblkXbootY, disable the forced read-only
21access with:
22
23echo 0 > /sys/block/mmcblkXbootY/force_ro
24
25To re-enable read-only access:
26
27echo 1 > /sys/block/mmcblkXbootY/force_ro
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 1f45bd887d65..675612ff41ae 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -770,8 +770,17 @@ resend_igmp
770 a failover event. One membership report is issued immediately after 770 a failover event. One membership report is issued immediately after
771 the failover, subsequent packets are sent in each 200ms interval. 771 the failover, subsequent packets are sent in each 200ms interval.
772 772
773 The valid range is 0 - 255; the default value is 1. This option 773 The valid range is 0 - 255; the default value is 1. A value of 0
774 was added for bonding version 3.7.0. 774 prevents the IGMP membership report from being issued in response
775 to the failover event.
776
777 This option is useful for bonding modes balance-rr (0), active-backup
778 (1), balance-tlb (5) and balance-alb (6), in which a failover can
779 switch the IGMP traffic from one slave to another. Therefore a fresh
780 IGMP report must be issued to cause the switch to forward the incoming
781 IGMP traffic over the newly selected slave.
782
783 This option was added for bonding version 3.7.0.
775 784
7763. Configuring Bonding Devices 7853. Configuring Bonding Devices
777============================== 786==============================
diff --git a/Documentation/ptp/ptp.txt b/Documentation/ptp/ptp.txt
new file mode 100644
index 000000000000..ae8fef86b832
--- /dev/null
+++ b/Documentation/ptp/ptp.txt
@@ -0,0 +1,89 @@
1
2* PTP hardware clock infrastructure for Linux
3
4 This patch set introduces support for IEEE 1588 PTP clocks in
5 Linux. Together with the SO_TIMESTAMPING socket options, this
6 presents a standardized method for developing PTP user space
7 programs, synchronizing Linux with external clocks, and using the
8 ancillary features of PTP hardware clocks.
9
10 A new class driver exports a kernel interface for specific clock
11 drivers and a user space interface. The infrastructure supports a
12 complete set of PTP hardware clock functionality.
13
14 + Basic clock operations
15 - Set time
16 - Get time
17 - Shift the clock by a given offset atomically
18 - Adjust clock frequency
19
20 + Ancillary clock features
21 - One short or periodic alarms, with signal delivery to user program
22 - Time stamp external events
23 - Period output signals configurable from user space
24 - Synchronization of the Linux system time via the PPS subsystem
25
26** PTP hardware clock kernel API
27
28 A PTP clock driver registers itself with the class driver. The
29 class driver handles all of the dealings with user space. The
30 author of a clock driver need only implement the details of
31 programming the clock hardware. The clock driver notifies the class
32 driver of asynchronous events (alarms and external time stamps) via
33 a simple message passing interface.
34
35 The class driver supports multiple PTP clock drivers. In normal use
36 cases, only one PTP clock is needed. However, for testing and
37 development, it can be useful to have more than one clock in a
38 single system, in order to allow performance comparisons.
39
40** PTP hardware clock user space API
41
42 The class driver also creates a character device for each
43 registered clock. User space can use an open file descriptor from
44 the character device as a POSIX clock id and may call
45 clock_gettime, clock_settime, and clock_adjtime. These calls
46 implement the basic clock operations.
47
48 User space programs may control the clock using standardized
49 ioctls. A program may query, enable, configure, and disable the
50 ancillary clock features. User space can receive time stamped
51 events via blocking read() and poll(). One shot and periodic
52 signals may be configured via the POSIX timer_settime() system
53 call.
54
55** Writing clock drivers
56
57 Clock drivers include include/linux/ptp_clock_kernel.h and register
58 themselves by presenting a 'struct ptp_clock_info' to the
59 registration method. Clock drivers must implement all of the
60 functions in the interface. If a clock does not offer a particular
61 ancillary feature, then the driver should just return -EOPNOTSUPP
62 from those functions.
63
64 Drivers must ensure that all of the methods in interface are
65 reentrant. Since most hardware implementations treat the time value
66 as a 64 bit integer accessed as two 32 bit registers, drivers
67 should use spin_lock_irqsave/spin_unlock_irqrestore to protect
68 against concurrent access. This locking cannot be accomplished in
69 class driver, since the lock may also be needed by the clock
70 driver's interrupt service routine.
71
72** Supported hardware
73
74 + Freescale eTSEC gianfar
75 - 2 Time stamp external triggers, programmable polarity (opt. interrupt)
76 - 2 Alarm registers (optional interrupt)
77 - 3 Periodic signals (optional interrupt)
78
79 + National DP83640
80 - 6 GPIOs programmable as inputs or outputs
81 - 6 GPIOs with dedicated functions (LED/JTAG/clock) can also be
82 used as general inputs or outputs
83 - GPIO inputs can time stamp external triggers
84 - GPIO outputs can produce periodic signals
85 - 1 interrupt pin
86
87 + Intel IXP465
88 - Auxiliary Slave/Master Mode Snapshot (optional interrupt)
89 - Target Time (optional interrupt)
diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
new file mode 100644
index 000000000000..f59ded066108
--- /dev/null
+++ b/Documentation/ptp/testptp.c
@@ -0,0 +1,381 @@
1/*
2 * PTP 1588 clock support - User space test program
3 *
4 * Copyright (C) 2010 OMICRON electronics GmbH
5 *
6 * This program is free software; you can redistribute it and/or modify
7 * it under the terms of the GNU General Public License as published by
8 * the Free Software Foundation; either version 2 of the License, or
9 * (at your option) any later version.
10 *
11 * This program is distributed in the hope that it will be useful,
12 * but WITHOUT ANY WARRANTY; without even the implied warranty of
13 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
14 * GNU General Public License for more details.
15 *
16 * You should have received a copy of the GNU General Public License
17 * along with this program; if not, write to the Free Software
18 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
19 */
20#include <errno.h>
21#include <fcntl.h>
22#include <math.h>
23#include <signal.h>
24#include <stdio.h>
25#include <stdlib.h>
26#include <string.h>
27#include <sys/ioctl.h>
28#include <sys/mman.h>
29#include <sys/stat.h>
30#include <sys/time.h>
31#include <sys/timex.h>
32#include <sys/types.h>
33#include <time.h>
34#include <unistd.h>
35
36#include <linux/ptp_clock.h>
37
38#define DEVICE "/dev/ptp0"
39
40#ifndef ADJ_SETOFFSET
41#define ADJ_SETOFFSET 0x0100
42#endif
43
44#ifndef CLOCK_INVALID
45#define CLOCK_INVALID -1
46#endif
47
48/* When glibc offers the syscall, this will go away. */
49#include <sys/syscall.h>
50static int clock_adjtime(clockid_t id, struct timex *tx)
51{
52 return syscall(__NR_clock_adjtime, id, tx);
53}
54
55static clockid_t get_clockid(int fd)
56{
57#define CLOCKFD 3
58#define FD_TO_CLOCKID(fd) ((~(clockid_t) (fd) << 3) | CLOCKFD)
59
60 return FD_TO_CLOCKID(fd);
61}
62
63static void handle_alarm(int s)
64{
65 printf("received signal %d\n", s);
66}
67
68static int install_handler(int signum, void (*handler)(int))
69{
70 struct sigaction action;
71 sigset_t mask;
72
73 /* Unblock the signal. */
74 sigemptyset(&mask);
75 sigaddset(&mask, signum);
76 sigprocmask(SIG_UNBLOCK, &mask, NULL);
77
78 /* Install the signal handler. */
79 action.sa_handler = handler;
80 action.sa_flags = 0;
81 sigemptyset(&action.sa_mask);
82 sigaction(signum, &action, NULL);
83
84 return 0;
85}
86
87static long ppb_to_scaled_ppm(int ppb)
88{
89 /*
90 * The 'freq' field in the 'struct timex' is in parts per
91 * million, but with a 16 bit binary fractional field.
92 * Instead of calculating either one of
93 *
94 * scaled_ppm = (ppb / 1000) << 16 [1]
95 * scaled_ppm = (ppb << 16) / 1000 [2]
96 *
97 * we simply use double precision math, in order to avoid the
98 * truncation in [1] and the possible overflow in [2].
99 */
100 return (long) (ppb * 65.536);
101}
102
103static void usage(char *progname)
104{
105 fprintf(stderr,
106 "usage: %s [options]\n"
107 " -a val request a one-shot alarm after 'val' seconds\n"
108 " -A val request a periodic alarm every 'val' seconds\n"
109 " -c query the ptp clock's capabilities\n"
110 " -d name device to open\n"
111 " -e val read 'val' external time stamp events\n"
112 " -f val adjust the ptp clock frequency by 'val' ppb\n"
113 " -g get the ptp clock time\n"
114 " -h prints this message\n"
115 " -p val enable output with a period of 'val' nanoseconds\n"
116 " -P val enable or disable (val=1|0) the system clock PPS\n"
117 " -s set the ptp clock time from the system time\n"
118 " -S set the system time from the ptp clock time\n"
119 " -t val shift the ptp clock time by 'val' seconds\n",
120 progname);
121}
122
123int main(int argc, char *argv[])
124{
125 struct ptp_clock_caps caps;
126 struct ptp_extts_event event;
127 struct ptp_extts_request extts_request;
128 struct ptp_perout_request perout_request;
129 struct timespec ts;
130 struct timex tx;
131
132 static timer_t timerid;
133 struct itimerspec timeout;
134 struct sigevent sigevent;
135
136 char *progname;
137 int c, cnt, fd;
138
139 char *device = DEVICE;
140 clockid_t clkid;
141 int adjfreq = 0x7fffffff;
142 int adjtime = 0;
143 int capabilities = 0;
144 int extts = 0;
145 int gettime = 0;
146 int oneshot = 0;
147 int periodic = 0;
148 int perout = -1;
149 int pps = -1;
150 int settime = 0;
151
152 progname = strrchr(argv[0], '/');
153 progname = progname ? 1+progname : argv[0];
154 while (EOF != (c = getopt(argc, argv, "a:A:cd:e:f:ghp:P:sSt:v"))) {
155 switch (c) {
156 case 'a':
157 oneshot = atoi(optarg);
158 break;
159 case 'A':
160 periodic = atoi(optarg);
161 break;
162 case 'c':
163 capabilities = 1;
164 break;
165 case 'd':
166 device = optarg;
167 break;
168 case 'e':
169 extts = atoi(optarg);
170 break;
171 case 'f':
172 adjfreq = atoi(optarg);
173 break;
174 case 'g':
175 gettime = 1;
176 break;
177 case 'p':
178 perout = atoi(optarg);
179 break;
180 case 'P':
181 pps = atoi(optarg);
182 break;
183 case 's':
184 settime = 1;
185 break;
186 case 'S':
187 settime = 2;
188 break;
189 case 't':
190 adjtime = atoi(optarg);
191 break;
192 case 'h':
193 usage(progname);
194 return 0;
195 case '?':
196 default:
197 usage(progname);
198 return -1;
199 }
200 }
201
202 fd = open(device, O_RDWR);
203 if (fd < 0) {
204 fprintf(stderr, "opening %s: %s\n", device, strerror(errno));
205 return -1;
206 }
207
208 clkid = get_clockid(fd);
209 if (CLOCK_INVALID == clkid) {
210 fprintf(stderr, "failed to read clock id\n");
211 return -1;
212 }
213
214 if (capabilities) {
215 if (ioctl(fd, PTP_CLOCK_GETCAPS, &caps)) {
216 perror("PTP_CLOCK_GETCAPS");
217 } else {
218 printf("capabilities:\n"
219 " %d maximum frequency adjustment (ppb)\n"
220 " %d programmable alarms\n"
221 " %d external time stamp channels\n"
222 " %d programmable periodic signals\n"
223 " %d pulse per second\n",
224 caps.max_adj,
225 caps.n_alarm,
226 caps.n_ext_ts,
227 caps.n_per_out,
228 caps.pps);
229 }
230 }
231
232 if (0x7fffffff != adjfreq) {
233 memset(&tx, 0, sizeof(tx));
234 tx.modes = ADJ_FREQUENCY;
235 tx.freq = ppb_to_scaled_ppm(adjfreq);
236 if (clock_adjtime(clkid, &tx)) {
237 perror("clock_adjtime");
238 } else {
239 puts("frequency adjustment okay");
240 }
241 }
242
243 if (adjtime) {
244 memset(&tx, 0, sizeof(tx));
245 tx.modes = ADJ_SETOFFSET;
246 tx.time.tv_sec = adjtime;
247 tx.time.tv_usec = 0;
248 if (clock_adjtime(clkid, &tx) < 0) {
249 perror("clock_adjtime");
250 } else {
251 puts("time shift okay");
252 }
253 }
254
255 if (gettime) {
256 if (clock_gettime(clkid, &ts)) {
257 perror("clock_gettime");
258 } else {
259 printf("clock time: %ld.%09ld or %s",
260 ts.tv_sec, ts.tv_nsec, ctime(&ts.tv_sec));
261 }
262 }
263
264 if (settime == 1) {
265 clock_gettime(CLOCK_REALTIME, &ts);
266 if (clock_settime(clkid, &ts)) {
267 perror("clock_settime");
268 } else {
269 puts("set time okay");
270 }
271 }
272
273 if (settime == 2) {
274 clock_gettime(clkid, &ts);
275 if (clock_settime(CLOCK_REALTIME, &ts)) {
276 perror("clock_settime");
277 } else {
278 puts("set time okay");
279 }
280 }
281
282 if (extts) {
283 memset(&extts_request, 0, sizeof(extts_request));
284 extts_request.index = 0;
285 extts_request.flags = PTP_ENABLE_FEATURE;
286 if (ioctl(fd, PTP_EXTTS_REQUEST, &extts_request)) {
287 perror("PTP_EXTTS_REQUEST");
288 extts = 0;
289 } else {
290 puts("external time stamp request okay");
291 }
292 for (; extts; extts--) {
293 cnt = read(fd, &event, sizeof(event));
294 if (cnt != sizeof(event)) {
295 perror("read");
296 break;
297 }
298 printf("event index %u at %lld.%09u\n", event.index,
299 event.t.sec, event.t.nsec);
300 fflush(stdout);
301 }
302 /* Disable the feature again. */
303 extts_request.flags = 0;
304 if (ioctl(fd, PTP_EXTTS_REQUEST, &extts_request)) {
305 perror("PTP_EXTTS_REQUEST");
306 }
307 }
308
309 if (oneshot) {
310 install_handler(SIGALRM, handle_alarm);
311 /* Create a timer. */
312 sigevent.sigev_notify = SIGEV_SIGNAL;
313 sigevent.sigev_signo = SIGALRM;
314 if (timer_create(clkid, &sigevent, &timerid)) {
315 perror("timer_create");
316 return -1;
317 }
318 /* Start the timer. */
319 memset(&timeout, 0, sizeof(timeout));
320 timeout.it_value.tv_sec = oneshot;
321 if (timer_settime(timerid, 0, &timeout, NULL)) {
322 perror("timer_settime");
323 return -1;
324 }
325 pause();
326 timer_delete(timerid);
327 }
328
329 if (periodic) {
330 install_handler(SIGALRM, handle_alarm);
331 /* Create a timer. */
332 sigevent.sigev_notify = SIGEV_SIGNAL;
333 sigevent.sigev_signo = SIGALRM;
334 if (timer_create(clkid, &sigevent, &timerid)) {
335 perror("timer_create");
336 return -1;
337 }
338 /* Start the timer. */
339 memset(&timeout, 0, sizeof(timeout));
340 timeout.it_interval.tv_sec = periodic;
341 timeout.it_value.tv_sec = periodic;
342 if (timer_settime(timerid, 0, &timeout, NULL)) {
343 perror("timer_settime");
344 return -1;
345 }
346 while (1) {
347 pause();
348 }
349 timer_delete(timerid);
350 }
351
352 if (perout >= 0) {
353 if (clock_gettime(clkid, &ts)) {
354 perror("clock_gettime");
355 return -1;
356 }
357 memset(&perout_request, 0, sizeof(perout_request));
358 perout_request.index = 0;
359 perout_request.start.sec = ts.tv_sec + 2;
360 perout_request.start.nsec = 0;
361 perout_request.period.sec = 0;
362 perout_request.period.nsec = perout;
363 if (ioctl(fd, PTP_PEROUT_REQUEST, &perout_request)) {
364 perror("PTP_PEROUT_REQUEST");
365 } else {
366 puts("periodic output request okay");
367 }
368 }
369
370 if (pps != -1) {
371 int enable = pps ? 1 : 0;
372 if (ioctl(fd, PTP_ENABLE_PPS, enable)) {
373 perror("PTP_ENABLE_PPS");
374 } else {
375 puts("pps for system time request okay");
376 }
377 }
378
379 close(fd);
380 return 0;
381}
diff --git a/Documentation/ptp/testptp.mk b/Documentation/ptp/testptp.mk
new file mode 100644
index 000000000000..4ef2d9755421
--- /dev/null
+++ b/Documentation/ptp/testptp.mk
@@ -0,0 +1,33 @@
1# PTP 1588 clock support - User space test program
2#
3# Copyright (C) 2010 OMICRON electronics GmbH
4#
5# This program is free software; you can redistribute it and/or modify
6# it under the terms of the GNU General Public License as published by
7# the Free Software Foundation; either version 2 of the License, or
8# (at your option) any later version.
9#
10# This program is distributed in the hope that it will be useful,
11# but WITHOUT ANY WARRANTY; without even the implied warranty of
12# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13# GNU General Public License for more details.
14#
15# You should have received a copy of the GNU General Public License
16# along with this program; if not, write to the Free Software
17# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
18
19CC = $(CROSS_COMPILE)gcc
20INC = -I$(KBUILD_OUTPUT)/usr/include
21CFLAGS = -Wall $(INC)
22LDLIBS = -lrt
23PROGS = testptp
24
25all: $(PROGS)
26
27testptp: testptp.o
28
29clean:
30 rm -f testptp.o
31
32distclean: clean
33 rm -f $(PROGS)
diff --git a/Documentation/virtual/uml/UserModeLinux-HOWTO.txt b/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
index 9b7e1904db1c..5d0fc8bfcdb9 100644
--- a/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
+++ b/Documentation/virtual/uml/UserModeLinux-HOWTO.txt
@@ -1182,6 +1182,16 @@
1182 forge.net/> and explains these in detail, as well as 1182 forge.net/> and explains these in detail, as well as
1183 some other issues. 1183 some other issues.
1184 1184
1185 There is also a related point-to-point only "ucast" transport.
1186 This is useful when your network does not support multicast, and
1187 all network connections are simple point to point links.
1188
1189 The full set of command line options for this transport are
1190
1191
1192 ethn=ucast,ethernet address,remote address,listen port,remote port
1193
1194
1185 1195
1186 1196
1187 66..66.. TTUUNN//TTAAPP wwiitthh tthhee uummll__nneett hheellppeerr 1197 66..66.. TTUUNN//TTAAPP wwiitthh tthhee uummll__nneett hheellppeerr
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt
new file mode 100644
index 000000000000..36c367c73084
--- /dev/null
+++ b/Documentation/vm/cleancache.txt
@@ -0,0 +1,278 @@
1MOTIVATION
2
3Cleancache is a new optional feature provided by the VFS layer that
4potentially dramatically increases page cache effectiveness for
5many workloads in many environments at a negligible cost.
6
7Cleancache can be thought of as a page-granularity victim cache for clean
8pages that the kernel's pageframe replacement algorithm (PFRA) would like
9to keep around, but can't since there isn't enough memory. So when the
10PFRA "evicts" a page, it first attempts to use cleancache code to
11put the data contained in that page into "transcendent memory", memory
12that is not directly accessible or addressable by the kernel and is
13of unknown and possibly time-varying size.
14
15Later, when a cleancache-enabled filesystem wishes to access a page
16in a file on disk, it first checks cleancache to see if it already
17contains it; if it does, the page of data is copied into the kernel
18and a disk access is avoided.
19
20Transcendent memory "drivers" for cleancache are currently implemented
21in Xen (using hypervisor memory) and zcache (using in-kernel compressed
22memory) and other implementations are in development.
23
24FAQs are included below.
25
26IMPLEMENTATION OVERVIEW
27
28A cleancache "backend" that provides transcendent memory registers itself
29to the kernel's cleancache "frontend" by calling cleancache_register_ops,
30passing a pointer to a cleancache_ops structure with funcs set appropriately.
31Note that cleancache_register_ops returns the previous settings so that
32chaining can be performed if desired. The functions provided must conform to
33certain semantics as follows:
34
35Most important, cleancache is "ephemeral". Pages which are copied into
36cleancache have an indefinite lifetime which is completely unknowable
37by the kernel and so may or may not still be in cleancache at any later time.
38Thus, as its name implies, cleancache is not suitable for dirty pages.
39Cleancache has complete discretion over what pages to preserve and what
40pages to discard and when.
41
42Mounting a cleancache-enabled filesystem should call "init_fs" to obtain a
43pool id which, if positive, must be saved in the filesystem's superblock;
44a negative return value indicates failure. A "put_page" will copy a
45(presumably about-to-be-evicted) page into cleancache and associate it with
46the pool id, a file key, and a page index into the file. (The combination
47of a pool id, a file key, and an index is sometimes called a "handle".)
48A "get_page" will copy the page, if found, from cleancache into kernel memory.
49A "flush_page" will ensure the page no longer is present in cleancache;
50a "flush_inode" will flush all pages associated with the specified file;
51and, when a filesystem is unmounted, a "flush_fs" will flush all pages in
52all files specified by the given pool id and also surrender the pool id.
53
54An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache
55to treat the pool as shared using a 128-bit UUID as a key. On systems
56that may run multiple kernels (such as hard partitioned or virtualized
57systems) that may share a clustered filesystem, and where cleancache
58may be shared among those kernels, calls to init_shared_fs that specify the
59same UUID will receive the same pool id, thus allowing the pages to
60be shared. Note that any security requirements must be imposed outside
61of the kernel (e.g. by "tools" that control cleancache). Or a
62cleancache implementation can simply disable shared_init by always
63returning a negative value.
64
65If a get_page is successful on a non-shared pool, the page is flushed (thus
66making cleancache an "exclusive" cache). On a shared pool, the page
67is NOT flushed on a successful get_page so that it remains accessible to
68other sharers. The kernel is responsible for ensuring coherency between
69cleancache (shared or not), the page cache, and the filesystem, using
70cleancache flush operations as required.
71
72Note that cleancache must enforce put-put-get coherency and get-get
73coherency. For the former, if two puts are made to the same handle but
74with different data, say AAA by the first put and BBB by the second, a
75subsequent get can never return the stale data (AAA). For get-get coherency,
76if a get for a given handle fails, subsequent gets for that handle will
77never succeed unless preceded by a successful put with that handle.
78
79Last, cleancache provides no SMP serialization guarantees; if two
80different Linux threads are simultaneously putting and flushing a page
81with the same handle, the results are indeterminate. Callers must
82lock the page to ensure serial behavior.
83
84CLEANCACHE PERFORMANCE METRICS
85
86Cleancache monitoring is done by sysfs files in the
87/sys/kernel/mm/cleancache directory. The effectiveness of cleancache
88can be measured (across all filesystems) with:
89
90succ_gets - number of gets that were successful
91failed_gets - number of gets that failed
92puts - number of puts attempted (all "succeed")
93flushes - number of flushes attempted
94
95A backend implementatation may provide additional metrics.
96
97FAQ
98
991) Where's the value? (Andrew Morton)
100
101Cleancache provides a significant performance benefit to many workloads
102in many environments with negligible overhead by improving the
103effectiveness of the pagecache. Clean pagecache pages are
104saved in transcendent memory (RAM that is otherwise not directly
105addressable to the kernel); fetching those pages later avoids "refaults"
106and thus disk reads.
107
108Cleancache (and its sister code "frontswap") provide interfaces for
109this transcendent memory (aka "tmem"), which conceptually lies between
110fast kernel-directly-addressable RAM and slower DMA/asynchronous devices.
111Disallowing direct kernel or userland reads/writes to tmem
112is ideal when data is transformed to a different form and size (such
113as with compression) or secretly moved (as might be useful for write-
114balancing for some RAM-like devices). Evicted page-cache pages (and
115swap pages) are a great use for this kind of slower-than-RAM-but-much-
116faster-than-disk transcendent memory, and the cleancache (and frontswap)
117"page-object-oriented" specification provides a nice way to read and
118write -- and indirectly "name" -- the pages.
119
120In the virtual case, the whole point of virtualization is to statistically
121multiplex physical resources across the varying demands of multiple
122virtual machines. This is really hard to do with RAM and efforts to
123do it well with no kernel change have essentially failed (except in some
124well-publicized special-case workloads). Cleancache -- and frontswap --
125with a fairly small impact on the kernel, provide a huge amount
126of flexibility for more dynamic, flexible RAM multiplexing.
127Specifically, the Xen Transcendent Memory backend allows otherwise
128"fallow" hypervisor-owned RAM to not only be "time-shared" between multiple
129virtual machines, but the pages can be compressed and deduplicated to
130optimize RAM utilization. And when guest OS's are induced to surrender
131underutilized RAM (e.g. with "self-ballooning"), page cache pages
132are the first to go, and cleancache allows those pages to be
133saved and reclaimed if overall host system memory conditions allow.
134
135And the identical interface used for cleancache can be used in
136physical systems as well. The zcache driver acts as a memory-hungry
137device that stores pages of data in a compressed state. And
138the proposed "RAMster" driver shares RAM across multiple physical
139systems.
140
1412) Why does cleancache have its sticky fingers so deep inside the
142 filesystems and VFS? (Andrew Morton and Christoph Hellwig)
143
144The core hooks for cleancache in VFS are in most cases a single line
145and the minimum set are placed precisely where needed to maintain
146coherency (via cleancache_flush operations) between cleancache,
147the page cache, and disk. All hooks compile into nothingness if
148cleancache is config'ed off and turn into a function-pointer-
149compare-to-NULL if config'ed on but no backend claims the ops
150functions, or to a compare-struct-element-to-negative if a
151backend claims the ops functions but a filesystem doesn't enable
152cleancache.
153
154Some filesystems are built entirely on top of VFS and the hooks
155in VFS are sufficient, so don't require an "init_fs" hook; the
156initial implementation of cleancache didn't provide this hook.
157But for some filesystems (such as btrfs), the VFS hooks are
158incomplete and one or more hooks in fs-specific code are required.
159And for some other filesystems, such as tmpfs, cleancache may
160be counterproductive. So it seemed prudent to require a filesystem
161to "opt in" to use cleancache, which requires adding a hook in
162each filesystem. Not all filesystems are supported by cleancache
163only because they haven't been tested. The existing set should
164be sufficient to validate the concept, the opt-in approach means
165that untested filesystems are not affected, and the hooks in the
166existing filesystems should make it very easy to add more
167filesystems in the future.
168
169The total impact of the hooks to existing fs and mm files is only
170about 40 lines added (not counting comments and blank lines).
171
1723) Why not make cleancache asynchronous and batched so it can
173 more easily interface with real devices with DMA instead
174 of copying each individual page? (Minchan Kim)
175
176The one-page-at-a-time copy semantics simplifies the implementation
177on both the frontend and backend and also allows the backend to
178do fancy things on-the-fly like page compression and
179page deduplication. And since the data is "gone" (copied into/out
180of the pageframe) before the cleancache get/put call returns,
181a great deal of race conditions and potential coherency issues
182are avoided. While the interface seems odd for a "real device"
183or for real kernel-addressable RAM, it makes perfect sense for
184transcendent memory.
185
1864) Why is non-shared cleancache "exclusive"? And where is the
187 page "flushed" after a "get"? (Minchan Kim)
188
189The main reason is to free up space in transcendent memory and
190to avoid unnecessary cleancache_flush calls. If you want inclusive,
191the page can be "put" immediately following the "get". If
192put-after-get for inclusive becomes common, the interface could
193be easily extended to add a "get_no_flush" call.
194
195The flush is done by the cleancache backend implementation.
196
1975) What's the performance impact?
198
199Performance analysis has been presented at OLS'09 and LCA'10.
200Briefly, performance gains can be significant on most workloads,
201especially when memory pressure is high (e.g. when RAM is
202overcommitted in a virtual workload); and because the hooks are
203invoked primarily in place of or in addition to a disk read/write,
204overhead is negligible even in worst case workloads. Basically
205cleancache replaces I/O with memory-copy-CPU-overhead; on older
206single-core systems with slow memory-copy speeds, cleancache
207has little value, but in newer multicore machines, especially
208consolidated/virtualized machines, it has great value.
209
2106) How do I add cleancache support for filesystem X? (Boaz Harrash)
211
212Filesystems that are well-behaved and conform to certain
213restrictions can utilize cleancache simply by making a call to
214cleancache_init_fs at mount time. Unusual, misbehaving, or
215poorly layered filesystems must either add additional hooks
216and/or undergo extensive additional testing... or should just
217not enable the optional cleancache.
218
219Some points for a filesystem to consider:
220
221- The FS should be block-device-based (e.g. a ram-based FS such
222 as tmpfs should not enable cleancache)
223- To ensure coherency/correctness, the FS must ensure that all
224 file removal or truncation operations either go through VFS or
225 add hooks to do the equivalent cleancache "flush" operations
226- To ensure coherency/correctness, either inode numbers must
227 be unique across the lifetime of the on-disk file OR the
228 FS must provide an "encode_fh" function.
229- The FS must call the VFS superblock alloc and deactivate routines
230 or add hooks to do the equivalent cleancache calls done there.
231- To maximize performance, all pages fetched from the FS should
232 go through the do_mpag_readpage routine or the FS should add
233 hooks to do the equivalent (cf. btrfs)
234- Currently, the FS blocksize must be the same as PAGESIZE. This
235 is not an architectural restriction, but no backends currently
236 support anything different.
237- A clustered FS should invoke the "shared_init_fs" cleancache
238 hook to get best performance for some backends.
239
2407) Why not use the KVA of the inode as the key? (Christoph Hellwig)
241
242If cleancache would use the inode virtual address instead of
243inode/filehandle, the pool id could be eliminated. But, this
244won't work because cleancache retains pagecache data pages
245persistently even when the inode has been pruned from the
246inode unused list, and only flushes the data page if the file
247gets removed/truncated. So if cleancache used the inode kva,
248there would be potential coherency issues if/when the inode
249kva is reused for a different file. Alternately, if cleancache
250flushed the pages when the inode kva was freed, much of the value
251of cleancache would be lost because the cache of pages in cleanache
252is potentially much larger than the kernel pagecache and is most
253useful if the pages survive inode cache removal.
254
2558) Why is a global variable required?
256
257The cleancache_enabled flag is checked in all of the frequently-used
258cleancache hooks. The alternative is a function call to check a static
259variable. Since cleancache is enabled dynamically at runtime, systems
260that don't enable cleancache would suffer thousands (possibly
261tens-of-thousands) of unnecessary function calls per second. So the
262global variable allows cleancache to be enabled by default at compile
263time, but have insignificant performance impact when cleancache remains
264disabled at runtime.
265
2669) Does cleanache work with KVM?
267
268The memory model of KVM is sufficiently different that a cleancache
269backend may have less value for KVM. This remains to be tested,
270especially in an overcommitted system.
271
27210) Does cleancache work in userspace? It sounds useful for
273 memory hungry caches like web browsers. (Jamie Lokier)
274
275No plans yet, though we agree it sounds useful, at least for
276apps that bypass the page cache (e.g. O_DIRECT).
277
278Last updated: Dan Magenheimer, April 13 2011
diff --git a/Documentation/vm/locking b/Documentation/vm/locking
index 25fadb448760..f61228bd6395 100644
--- a/Documentation/vm/locking
+++ b/Documentation/vm/locking
@@ -66,7 +66,7 @@ in some cases it is not really needed. Eg, vm_start is modified by
66expand_stack(), it is hard to come up with a destructive scenario without 66expand_stack(), it is hard to come up with a destructive scenario without
67having the vmlist protection in this case. 67having the vmlist protection in this case.
68 68
69The page_table_lock nests with the inode i_mmap_lock and the kmem cache 69The page_table_lock nests with the inode i_mmap_mutex and the kmem cache
70c_spinlock spinlocks. This is okay, since the kmem code asks for pages after 70c_spinlock spinlocks. This is okay, since the kmem code asks for pages after
71dropping c_spinlock. The page_table_lock also nests with pagecache_lock and 71dropping c_spinlock. The page_table_lock also nests with pagecache_lock and
72pagemap_lru_lock spinlocks, and no code asks for memory with these locks 72pagemap_lru_lock spinlocks, and no code asks for memory with these locks