aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-block13
-rw-r--r--Documentation/ABI/testing/sysfs-bus-rbd7
-rw-r--r--Documentation/DocBook/debugobjects.tmpl50
-rw-r--r--Documentation/DocBook/uio-howto.tmpl7
-rw-r--r--Documentation/HOWTO4
-rw-r--r--Documentation/RCU/checklist.txt6
-rw-r--r--Documentation/RCU/rcu.txt10
-rw-r--r--Documentation/RCU/stallwarn.txt16
-rw-r--r--Documentation/RCU/torture.txt13
-rw-r--r--Documentation/RCU/trace.txt4
-rw-r--r--Documentation/RCU/whatisRCU.txt19
-rw-r--r--Documentation/arm/memory.txt11
-rw-r--r--Documentation/atomic_ops.txt87
-rw-r--r--Documentation/blockdev/cciss.txt14
-rw-r--r--Documentation/cgroups/memory.txt28
-rw-r--r--Documentation/cgroups/net_prio.txt53
-rw-r--r--Documentation/development-process/5.Posting8
-rw-r--r--Documentation/devicetree/bindings/arm/gic.txt4
-rw-r--r--Documentation/devicetree/bindings/arm/vic.txt29
-rw-r--r--Documentation/devicetree/bindings/i2c/i2c-designware.txt22
-rw-r--r--Documentation/devicetree/bindings/i2c/trivial-devices.txt58
-rw-r--r--Documentation/devicetree/bindings/net/calxeda-xgmac.txt15
-rw-r--r--Documentation/devicetree/bindings/net/can/cc770.txt53
-rw-r--r--Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt163
-rw-r--r--Documentation/devicetree/bindings/powerpc/fsl/srio.txt103
-rw-r--r--Documentation/devicetree/bindings/vendor-prefixes.txt4
-rw-r--r--Documentation/driver-model/devres.txt1
-rw-r--r--Documentation/feature-removal-schedule.txt14
-rw-r--r--Documentation/filesystems/Locking8
-rw-r--r--Documentation/filesystems/btrfs.txt4
-rw-r--r--Documentation/filesystems/configfs/configfs.txt2
-rw-r--r--Documentation/filesystems/debugfs.txt56
-rw-r--r--Documentation/filesystems/sysfs.txt2
-rw-r--r--Documentation/filesystems/vfs.txt8
-rw-r--r--Documentation/i2c/ten-bit-addresses36
-rw-r--r--Documentation/kernel-parameters.txt18
-rw-r--r--Documentation/lockdep-design.txt63
-rw-r--r--Documentation/networking/00-INDEX2
-rw-r--r--Documentation/networking/batman-adv.txt7
-rw-r--r--Documentation/networking/bonding.txt17
-rw-r--r--Documentation/networking/ieee802154.txt27
-rw-r--r--Documentation/networking/ifenslave.c2
-rw-r--r--Documentation/networking/ip-sysctl.txt25
-rw-r--r--Documentation/networking/openvswitch.txt195
-rw-r--r--Documentation/networking/packet_mmap.txt2
-rw-r--r--Documentation/networking/scaling.txt8
-rw-r--r--Documentation/networking/stmmac.txt16
-rw-r--r--Documentation/networking/team.txt2
-rw-r--r--Documentation/power/devices.txt118
-rw-r--r--Documentation/power/freezing-of-tasks.txt39
-rw-r--r--Documentation/power/runtime_pm.txt158
-rw-r--r--Documentation/serial/serial-rs485.txt14
-rw-r--r--Documentation/sound/alsa/HD-Audio.txt8
-rw-r--r--Documentation/sound/alsa/soc/machine.txt6
-rw-r--r--Documentation/trace/events.txt2
-rw-r--r--Documentation/usb/linux-cdc-acm.inf4
-rw-r--r--Documentation/virtual/kvm/api.txt16
57 files changed, 1398 insertions, 283 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 2b5d56127fce..c1eb41cb9876 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -206,16 +206,3 @@ Description:
206 when a discarded area is read the discard_zeroes_data 206 when a discarded area is read the discard_zeroes_data
207 parameter will be set to one. Otherwise it will be 0 and 207 parameter will be set to one. Otherwise it will be 0 and
208 the result of reading a discarded area is undefined. 208 the result of reading a discarded area is undefined.
209What: /sys/block/<disk>/alias
210Date: Aug 2011
211Contact: Nao Nishijima <nao.nishijima.xt@hitachi.com>
212Description:
213 A raw device name of a disk does not always point a same disk
214 each boot-up time. Therefore, users have to use persistent
215 device names, which udev creates when the kernel finds a disk,
216 instead of raw device name. However, kernel doesn't show those
217 persistent names on its messages (e.g. dmesg).
218 This file can store an alias of the disk and it would be
219 appeared in kernel messages if it is set. A disk can have an
220 alias which length is up to 255bytes. Users can use alphabets,
221 numbers, "-" and "_" in alias name. This file is writeonce.
diff --git a/Documentation/ABI/testing/sysfs-bus-rbd b/Documentation/ABI/testing/sysfs-bus-rbd
index fa72ccb2282e..dbedafb095e2 100644
--- a/Documentation/ABI/testing/sysfs-bus-rbd
+++ b/Documentation/ABI/testing/sysfs-bus-rbd
@@ -57,13 +57,6 @@ create_snap
57 57
58 $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_create 58 $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_create
59 59
60rollback_snap
61
62 Rolls back data to the specified snapshot. This goes over the entire
63 list of rados blocks and sends a rollback command to each.
64
65 $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_rollback
66
67snap_* 60snap_*
68 61
69 A directory per each snapshot 62 A directory per each snapshot
diff --git a/Documentation/DocBook/debugobjects.tmpl b/Documentation/DocBook/debugobjects.tmpl
index 08ff908aa7a2..24979f691e3e 100644
--- a/Documentation/DocBook/debugobjects.tmpl
+++ b/Documentation/DocBook/debugobjects.tmpl
@@ -96,6 +96,7 @@
96 <listitem><para>debug_object_deactivate</para></listitem> 96 <listitem><para>debug_object_deactivate</para></listitem>
97 <listitem><para>debug_object_destroy</para></listitem> 97 <listitem><para>debug_object_destroy</para></listitem>
98 <listitem><para>debug_object_free</para></listitem> 98 <listitem><para>debug_object_free</para></listitem>
99 <listitem><para>debug_object_assert_init</para></listitem>
99 </itemizedlist> 100 </itemizedlist>
100 Each of these functions takes the address of the real object and 101 Each of these functions takes the address of the real object and
101 a pointer to the object type specific debug description 102 a pointer to the object type specific debug description
@@ -273,6 +274,26 @@
273 debug checks. 274 debug checks.
274 </para> 275 </para>
275 </sect1> 276 </sect1>
277
278 <sect1 id="debug_object_assert_init">
279 <title>debug_object_assert_init</title>
280 <para>
281 This function is called to assert that an object has been
282 initialized.
283 </para>
284 <para>
285 When the real object is not tracked by debugobjects, it calls
286 fixup_assert_init of the object type description structure
287 provided by the caller, with the hardcoded object state
288 ODEBUG_NOT_AVAILABLE. The fixup function can correct the problem
289 by calling debug_object_init and other specific initializing
290 functions.
291 </para>
292 <para>
293 When the real object is already tracked by debugobjects it is
294 ignored.
295 </para>
296 </sect1>
276 </chapter> 297 </chapter>
277 <chapter id="fixupfunctions"> 298 <chapter id="fixupfunctions">
278 <title>Fixup functions</title> 299 <title>Fixup functions</title>
@@ -381,6 +402,35 @@
381 statistics. 402 statistics.
382 </para> 403 </para>
383 </sect1> 404 </sect1>
405 <sect1 id="fixup_assert_init">
406 <title>fixup_assert_init</title>
407 <para>
408 This function is called from the debug code whenever a problem
409 in debug_object_assert_init is detected.
410 </para>
411 <para>
412 Called from debug_object_assert_init() with a hardcoded state
413 ODEBUG_STATE_NOTAVAILABLE when the object is not found in the
414 debug bucket.
415 </para>
416 <para>
417 The function returns 1 when the fixup was successful,
418 otherwise 0. The return value is used to update the
419 statistics.
420 </para>
421 <para>
422 Note, this function should make sure debug_object_init() is
423 called before returning.
424 </para>
425 <para>
426 The handling of statically initialized objects is a special
427 case. The fixup function should check if this is a legitimate
428 case of a statically initialized object or not. In this case only
429 debug_object_init() should be called to make the object known to
430 the tracker. Then the function should return 0 because this is not
431 a real fixup.
432 </para>
433 </sect1>
384 </chapter> 434 </chapter>
385 <chapter id="bugs"> 435 <chapter id="bugs">
386 <title>Known Bugs And Assumptions</title> 436 <title>Known Bugs And Assumptions</title>
diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl
index 54883de5d5f9..ac3d0018140c 100644
--- a/Documentation/DocBook/uio-howto.tmpl
+++ b/Documentation/DocBook/uio-howto.tmpl
@@ -521,6 +521,11 @@ Here's a description of the fields of <varname>struct uio_mem</varname>:
521 521
522<itemizedlist> 522<itemizedlist>
523<listitem><para> 523<listitem><para>
524<varname>const char *name</varname>: Optional. Set this to help identify
525the memory region, it will show up in the corresponding sysfs node.
526</para></listitem>
527
528<listitem><para>
524<varname>int memtype</varname>: Required if the mapping is used. Set this to 529<varname>int memtype</varname>: Required if the mapping is used. Set this to
525<varname>UIO_MEM_PHYS</varname> if you you have physical memory on your 530<varname>UIO_MEM_PHYS</varname> if you you have physical memory on your
526card to be mapped. Use <varname>UIO_MEM_LOGICAL</varname> for logical 531card to be mapped. Use <varname>UIO_MEM_LOGICAL</varname> for logical
@@ -553,7 +558,7 @@ instead to remember such an address.
553</itemizedlist> 558</itemizedlist>
554 559
555<para> 560<para>
556Please do not touch the <varname>kobj</varname> element of 561Please do not touch the <varname>map</varname> element of
557<varname>struct uio_mem</varname>! It is used by the UIO framework 562<varname>struct uio_mem</varname>! It is used by the UIO framework
558to set up sysfs files for this mapping. Simply leave it alone. 563to set up sysfs files for this mapping. Simply leave it alone.
559</para> 564</para>
diff --git a/Documentation/HOWTO b/Documentation/HOWTO
index 81bc1a9ab9d8..f7ade3b3b40d 100644
--- a/Documentation/HOWTO
+++ b/Documentation/HOWTO
@@ -275,8 +275,8 @@ versions.
275If no 2.6.x.y kernel is available, then the highest numbered 2.6.x 275If no 2.6.x.y kernel is available, then the highest numbered 2.6.x
276kernel is the current stable kernel. 276kernel is the current stable kernel.
277 277
2782.6.x.y are maintained by the "stable" team <stable@kernel.org>, and are 2782.6.x.y are maintained by the "stable" team <stable@vger.kernel.org>, and
279released as needs dictate. The normal release period is approximately 279are released as needs dictate. The normal release period is approximately
280two weeks, but it can be longer if there are no pressing problems. A 280two weeks, but it can be longer if there are no pressing problems. A
281security-related problem, instead, can cause a release to happen almost 281security-related problem, instead, can cause a release to happen almost
282instantly. 282instantly.
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 0c134f8afc6f..bff2d8be1e18 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -328,6 +328,12 @@ over a rather long period of time, but improvements are always welcome!
328 RCU rather than SRCU, because RCU is almost always faster and 328 RCU rather than SRCU, because RCU is almost always faster and
329 easier to use than is SRCU. 329 easier to use than is SRCU.
330 330
331 If you need to enter your read-side critical section in a
332 hardirq or exception handler, and then exit that same read-side
333 critical section in the task that was interrupted, then you need
334 to srcu_read_lock_raw() and srcu_read_unlock_raw(), which avoid
335 the lockdep checking that would otherwise this practice illegal.
336
331 Also unlike other forms of RCU, explicit initialization 337 Also unlike other forms of RCU, explicit initialization
332 and cleanup is required via init_srcu_struct() and 338 and cleanup is required via init_srcu_struct() and
333 cleanup_srcu_struct(). These are passed a "struct srcu_struct" 339 cleanup_srcu_struct(). These are passed a "struct srcu_struct"
diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt
index 31852705b586..bf778332a28f 100644
--- a/Documentation/RCU/rcu.txt
+++ b/Documentation/RCU/rcu.txt
@@ -38,11 +38,11 @@ o How can the updater tell when a grace period has completed
38 38
39 Preemptible variants of RCU (CONFIG_TREE_PREEMPT_RCU) get the 39 Preemptible variants of RCU (CONFIG_TREE_PREEMPT_RCU) get the
40 same effect, but require that the readers manipulate CPU-local 40 same effect, but require that the readers manipulate CPU-local
41 counters. These counters allow limited types of blocking 41 counters. These counters allow limited types of blocking within
42 within RCU read-side critical sections. SRCU also uses 42 RCU read-side critical sections. SRCU also uses CPU-local
43 CPU-local counters, and permits general blocking within 43 counters, and permits general blocking within RCU read-side
44 RCU read-side critical sections. These two variants of 44 critical sections. These variants of RCU detect grace periods
45 RCU detect grace periods by sampling these counters. 45 by sampling these counters.
46 46
47o If I am running on a uniprocessor kernel, which can only do one 47o If I am running on a uniprocessor kernel, which can only do one
48 thing at a time, why should I wait for a grace period? 48 thing at a time, why should I wait for a grace period?
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 4e959208f736..083d88cbc089 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -101,6 +101,11 @@ o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
101 CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning 101 CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
102 messages. 102 messages.
103 103
104o A hardware or software issue shuts off the scheduler-clock
105 interrupt on a CPU that is not in dyntick-idle mode. This
106 problem really has happened, and seems to be most likely to
107 result in RCU CPU stall warnings for CONFIG_NO_HZ=n kernels.
108
104o A bug in the RCU implementation. 109o A bug in the RCU implementation.
105 110
106o A hardware failure. This is quite unlikely, but has occurred 111o A hardware failure. This is quite unlikely, but has occurred
@@ -109,12 +114,11 @@ o A hardware failure. This is quite unlikely, but has occurred
109 This resulted in a series of RCU CPU stall warnings, eventually 114 This resulted in a series of RCU CPU stall warnings, eventually
110 leading the realization that the CPU had failed. 115 leading the realization that the CPU had failed.
111 116
112The RCU, RCU-sched, and RCU-bh implementations have CPU stall 117The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning.
113warning. SRCU does not have its own CPU stall warnings, but its 118SRCU does not have its own CPU stall warnings, but its calls to
114calls to synchronize_sched() will result in RCU-sched detecting 119synchronize_sched() will result in RCU-sched detecting RCU-sched-related
115RCU-sched-related CPU stalls. Please note that RCU only detects 120CPU stalls. Please note that RCU only detects CPU stalls when there is
116CPU stalls when there is a grace period in progress. No grace period, 121a grace period in progress. No grace period, no CPU stall warnings.
117no CPU stall warnings.
118 122
119To diagnose the cause of the stall, inspect the stack traces. 123To diagnose the cause of the stall, inspect the stack traces.
120The offending function will usually be near the top of the stack. 124The offending function will usually be near the top of the stack.
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 783d6c134d3f..d67068d0d2b9 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -61,11 +61,24 @@ nreaders This is the number of RCU reading threads supported.
61 To properly exercise RCU implementations with preemptible 61 To properly exercise RCU implementations with preemptible
62 read-side critical sections. 62 read-side critical sections.
63 63
64onoff_interval
65 The number of seconds between each attempt to execute a
66 randomly selected CPU-hotplug operation. Defaults to
67 zero, which disables CPU hotplugging. In HOTPLUG_CPU=n
68 kernels, rcutorture will silently refuse to do any
69 CPU-hotplug operations regardless of what value is
70 specified for onoff_interval.
71
64shuffle_interval 72shuffle_interval
65 The number of seconds to keep the test threads affinitied 73 The number of seconds to keep the test threads affinitied
66 to a particular subset of the CPUs, defaults to 3 seconds. 74 to a particular subset of the CPUs, defaults to 3 seconds.
67 Used in conjunction with test_no_idle_hz. 75 Used in conjunction with test_no_idle_hz.
68 76
77shutdown_secs The number of seconds to run the test before terminating
78 the test and powering off the system. The default is
79 zero, which disables test termination and system shutdown.
80 This capability is useful for automated testing.
81
69stat_interval The number of seconds between output of torture 82stat_interval The number of seconds between output of torture
70 statistics (via printk()). Regardless of the interval, 83 statistics (via printk()). Regardless of the interval,
71 statistics are printed when the module is unloaded. 84 statistics are printed when the module is unloaded.
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index aaf65f6c6cd7..49587abfc2f7 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -105,14 +105,10 @@ o "dt" is the current value of the dyntick counter that is incremented
105 or one greater than the interrupt-nesting depth otherwise. 105 or one greater than the interrupt-nesting depth otherwise.
106 The number after the second "/" is the NMI nesting depth. 106 The number after the second "/" is the NMI nesting depth.
107 107
108 This field is displayed only for CONFIG_NO_HZ kernels.
109
110o "df" is the number of times that some other CPU has forced a 108o "df" is the number of times that some other CPU has forced a
111 quiescent state on behalf of this CPU due to this CPU being in 109 quiescent state on behalf of this CPU due to this CPU being in
112 dynticks-idle state. 110 dynticks-idle state.
113 111
114 This field is displayed only for CONFIG_NO_HZ kernels.
115
116o "of" is the number of times that some other CPU has forced a 112o "of" is the number of times that some other CPU has forced a
117 quiescent state on behalf of this CPU due to this CPU being 113 quiescent state on behalf of this CPU due to this CPU being
118 offline. In a perfect world, this might never happen, but it 114 offline. In a perfect world, this might never happen, but it
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 6ef692667e2f..6bbe8dcdc3da 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -4,6 +4,7 @@ to start learning about RCU:
41. What is RCU, Fundamentally? http://lwn.net/Articles/262464/ 41. What is RCU, Fundamentally? http://lwn.net/Articles/262464/
52. What is RCU? Part 2: Usage http://lwn.net/Articles/263130/ 52. What is RCU? Part 2: Usage http://lwn.net/Articles/263130/
63. RCU part 3: the RCU API http://lwn.net/Articles/264090/ 63. RCU part 3: the RCU API http://lwn.net/Articles/264090/
74. The RCU API, 2010 Edition http://lwn.net/Articles/418853/
7 8
8 9
9What is RCU? 10What is RCU?
@@ -834,6 +835,8 @@ SRCU: Critical sections Grace period Barrier
834 835
835 srcu_read_lock synchronize_srcu N/A 836 srcu_read_lock synchronize_srcu N/A
836 srcu_read_unlock synchronize_srcu_expedited 837 srcu_read_unlock synchronize_srcu_expedited
838 srcu_read_lock_raw
839 srcu_read_unlock_raw
837 srcu_dereference 840 srcu_dereference
838 841
839SRCU: Initialization/cleanup 842SRCU: Initialization/cleanup
@@ -855,27 +858,33 @@ list can be helpful:
855 858
856a. Will readers need to block? If so, you need SRCU. 859a. Will readers need to block? If so, you need SRCU.
857 860
858b. What about the -rt patchset? If readers would need to block 861b. Is it necessary to start a read-side critical section in a
862 hardirq handler or exception handler, and then to complete
863 this read-side critical section in the task that was
864 interrupted? If so, you need SRCU's srcu_read_lock_raw() and
865 srcu_read_unlock_raw() primitives.
866
867c. What about the -rt patchset? If readers would need to block
859 in an non-rt kernel, you need SRCU. If readers would block 868 in an non-rt kernel, you need SRCU. If readers would block
860 in a -rt kernel, but not in a non-rt kernel, SRCU is not 869 in a -rt kernel, but not in a non-rt kernel, SRCU is not
861 necessary. 870 necessary.
862 871
863c. Do you need to treat NMI handlers, hardirq handlers, 872d. Do you need to treat NMI handlers, hardirq handlers,
864 and code segments with preemption disabled (whether 873 and code segments with preemption disabled (whether
865 via preempt_disable(), local_irq_save(), local_bh_disable(), 874 via preempt_disable(), local_irq_save(), local_bh_disable(),
866 or some other mechanism) as if they were explicit RCU readers? 875 or some other mechanism) as if they were explicit RCU readers?
867 If so, you need RCU-sched. 876 If so, you need RCU-sched.
868 877
869d. Do you need RCU grace periods to complete even in the face 878e. Do you need RCU grace periods to complete even in the face
870 of softirq monopolization of one or more of the CPUs? For 879 of softirq monopolization of one or more of the CPUs? For
871 example, is your code subject to network-based denial-of-service 880 example, is your code subject to network-based denial-of-service
872 attacks? If so, you need RCU-bh. 881 attacks? If so, you need RCU-bh.
873 882
874e. Is your workload too update-intensive for normal use of 883f. Is your workload too update-intensive for normal use of
875 RCU, but inappropriate for other synchronization mechanisms? 884 RCU, but inappropriate for other synchronization mechanisms?
876 If so, consider SLAB_DESTROY_BY_RCU. But please be careful! 885 If so, consider SLAB_DESTROY_BY_RCU. But please be careful!
877 886
878f. Otherwise, use RCU. 887g. Otherwise, use RCU.
879 888
880Of course, this all assumes that you have determined that RCU is in fact 889Of course, this all assumes that you have determined that RCU is in fact
881the right tool for your job. 890the right tool for your job.
diff --git a/Documentation/arm/memory.txt b/Documentation/arm/memory.txt
index 771d48d3b335..208a2d465b92 100644
--- a/Documentation/arm/memory.txt
+++ b/Documentation/arm/memory.txt
@@ -51,15 +51,14 @@ ffc00000 ffefffff DMA memory mapping region. Memory returned
51ff000000 ffbfffff Reserved for future expansion of DMA 51ff000000 ffbfffff Reserved for future expansion of DMA
52 mapping region. 52 mapping region.
53 53
54VMALLOC_END feffffff Free for platform use, recommended.
55 VMALLOC_END must be aligned to a 2MB
56 boundary.
57
58VMALLOC_START VMALLOC_END-1 vmalloc() / ioremap() space. 54VMALLOC_START VMALLOC_END-1 vmalloc() / ioremap() space.
59 Memory returned by vmalloc/ioremap will 55 Memory returned by vmalloc/ioremap will
60 be dynamically placed in this region. 56 be dynamically placed in this region.
61 VMALLOC_START may be based upon the value 57 Machine specific static mappings are also
62 of the high_memory variable. 58 located here through iotable_init().
59 VMALLOC_START is based upon the value
60 of the high_memory variable, and VMALLOC_END
61 is equal to 0xff000000.
63 62
64PAGE_OFFSET high_memory-1 Kernel direct-mapped RAM region. 63PAGE_OFFSET high_memory-1 Kernel direct-mapped RAM region.
65 This maps the platforms RAM, and typically 64 This maps the platforms RAM, and typically
diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index 3bd585b44927..27f2b21a9d5c 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -84,6 +84,93 @@ compiler optimizes the section accessing atomic_t variables.
84 84
85*** YOU HAVE BEEN WARNED! *** 85*** YOU HAVE BEEN WARNED! ***
86 86
87Properly aligned pointers, longs, ints, and chars (and unsigned
88equivalents) may be atomically loaded from and stored to in the same
89sense as described for atomic_read() and atomic_set(). The ACCESS_ONCE()
90macro should be used to prevent the compiler from using optimizations
91that might otherwise optimize accesses out of existence on the one hand,
92or that might create unsolicited accesses on the other.
93
94For example consider the following code:
95
96 while (a > 0)
97 do_something();
98
99If the compiler can prove that do_something() does not store to the
100variable a, then the compiler is within its rights transforming this to
101the following:
102
103 tmp = a;
104 if (a > 0)
105 for (;;)
106 do_something();
107
108If you don't want the compiler to do this (and you probably don't), then
109you should use something like the following:
110
111 while (ACCESS_ONCE(a) < 0)
112 do_something();
113
114Alternatively, you could place a barrier() call in the loop.
115
116For another example, consider the following code:
117
118 tmp_a = a;
119 do_something_with(tmp_a);
120 do_something_else_with(tmp_a);
121
122If the compiler can prove that do_something_with() does not store to the
123variable a, then the compiler is within its rights to manufacture an
124additional load as follows:
125
126 tmp_a = a;
127 do_something_with(tmp_a);
128 tmp_a = a;
129 do_something_else_with(tmp_a);
130
131This could fatally confuse your code if it expected the same value
132to be passed to do_something_with() and do_something_else_with().
133
134The compiler would be likely to manufacture this additional load if
135do_something_with() was an inline function that made very heavy use
136of registers: reloading from variable a could save a flush to the
137stack and later reload. To prevent the compiler from attacking your
138code in this manner, write the following:
139
140 tmp_a = ACCESS_ONCE(a);
141 do_something_with(tmp_a);
142 do_something_else_with(tmp_a);
143
144For a final example, consider the following code, assuming that the
145variable a is set at boot time before the second CPU is brought online
146and never changed later, so that memory barriers are not needed:
147
148 if (a)
149 b = 9;
150 else
151 b = 42;
152
153The compiler is within its rights to manufacture an additional store
154by transforming the above code into the following:
155
156 b = 42;
157 if (a)
158 b = 9;
159
160This could come as a fatal surprise to other code running concurrently
161that expected b to never have the value 42 if a was zero. To prevent
162the compiler from doing this, write something like:
163
164 if (a)
165 ACCESS_ONCE(b) = 9;
166 else
167 ACCESS_ONCE(b) = 42;
168
169Don't even -think- about doing this without proper use of memory barriers,
170locks, or atomic operations if variable a can change at runtime!
171
172*** WARNING: ACCESS_ONCE() DOES NOT IMPLY A BARRIER! ***
173
87Now, we move onto the atomic operation interfaces typically implemented with 174Now, we move onto the atomic operation interfaces typically implemented with
88the help of assembly code. 175the help of assembly code.
89 176
diff --git a/Documentation/blockdev/cciss.txt b/Documentation/blockdev/cciss.txt
index 71464e09ec18..b79d0a13e7cd 100644
--- a/Documentation/blockdev/cciss.txt
+++ b/Documentation/blockdev/cciss.txt
@@ -98,14 +98,12 @@ You must enable "SCSI tape drive support for Smart Array 5xxx" and
98"SCSI support" in your kernel configuration to be able to use SCSI 98"SCSI support" in your kernel configuration to be able to use SCSI
99tape drives with your Smart Array 5xxx controller. 99tape drives with your Smart Array 5xxx controller.
100 100
101Additionally, note that the driver will not engage the SCSI core at init 101Additionally, note that the driver will engage the SCSI core at init
102time. The driver must be directed to dynamically engage the SCSI core via 102time if any tape drives or medium changers are detected. The driver may
103the /proc filesystem entry which the "block" side of the driver creates as 103also be directed to dynamically engage the SCSI core via the /proc filesystem
104/proc/driver/cciss/cciss* at runtime. This is because at driver init time, 104entry which the "block" side of the driver creates as
105the SCSI core may not yet be initialized (because the driver is a block 105/proc/driver/cciss/cciss* at runtime. This is best done via a script.
106driver) and attempting to register it with the SCSI core in such a case 106
107would cause a hang. This is best done via an initialization script
108(typically in /etc/init.d, but could vary depending on distribution).
109For example: 107For example:
110 108
111 for x in /proc/driver/cciss/cciss[0-9]* 109 for x in /proc/driver/cciss/cciss[0-9]*
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index cc0ebc5241b3..4d8774f6f48a 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -44,8 +44,8 @@ Features:
44 - oom-killer disable knob and oom-notifier 44 - oom-killer disable knob and oom-notifier
45 - Root cgroup has no limit controls. 45 - Root cgroup has no limit controls.
46 46
47 Kernel memory and Hugepages are not under control yet. We just manage 47 Kernel memory support is work in progress, and the current version provides
48 pages on LRU. To add more controls, we have to take care of performance. 48 basically functionality. (See Section 2.7)
49 49
50Brief summary of control files. 50Brief summary of control files.
51 51
@@ -72,6 +72,9 @@ Brief summary of control files.
72 memory.oom_control # set/show oom controls. 72 memory.oom_control # set/show oom controls.
73 memory.numa_stat # show the number of memory usage per numa node 73 memory.numa_stat # show the number of memory usage per numa node
74 74
75 memory.kmem.tcp.limit_in_bytes # set/show hard limit for tcp buf memory
76 memory.kmem.tcp.usage_in_bytes # show current tcp buf memory allocation
77
751. History 781. History
76 79
77The memory controller has a long history. A request for comments for the memory 80The memory controller has a long history. A request for comments for the memory
@@ -255,6 +258,27 @@ When oom event notifier is registered, event will be delivered.
255 per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by 258 per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by
256 zone->lru_lock, it has no lock of its own. 259 zone->lru_lock, it has no lock of its own.
257 260
2612.7 Kernel Memory Extension (CONFIG_CGROUP_MEM_RES_CTLR_KMEM)
262
263With the Kernel memory extension, the Memory Controller is able to limit
264the amount of kernel memory used by the system. Kernel memory is fundamentally
265different than user memory, since it can't be swapped out, which makes it
266possible to DoS the system by consuming too much of this precious resource.
267
268Kernel memory limits are not imposed for the root cgroup. Usage for the root
269cgroup may or may not be accounted.
270
271Currently no soft limit is implemented for kernel memory. It is future work
272to trigger slab reclaim when those limits are reached.
273
2742.7.1 Current Kernel Memory resources accounted
275
276* sockets memory pressure: some sockets protocols have memory pressure
277thresholds. The Memory Controller allows them to be controlled individually
278per cgroup, instead of globally.
279
280* tcp memory pressure: sockets memory pressure for the tcp protocol.
281
2583. User Interface 2823. User Interface
259 283
2600. Configuration 2840. Configuration
diff --git a/Documentation/cgroups/net_prio.txt b/Documentation/cgroups/net_prio.txt
new file mode 100644
index 000000000000..01b322635591
--- /dev/null
+++ b/Documentation/cgroups/net_prio.txt
@@ -0,0 +1,53 @@
1Network priority cgroup
2-------------------------
3
4The Network priority cgroup provides an interface to allow an administrator to
5dynamically set the priority of network traffic generated by various
6applications
7
8Nominally, an application would set the priority of its traffic via the
9SO_PRIORITY socket option. This however, is not always possible because:
10
111) The application may not have been coded to set this value
122) The priority of application traffic is often a site-specific administrative
13 decision rather than an application defined one.
14
15This cgroup allows an administrator to assign a process to a group which defines
16the priority of egress traffic on a given interface. Network priority groups can
17be created by first mounting the cgroup filesystem.
18
19# mount -t cgroup -onet_prio none /sys/fs/cgroup/net_prio
20
21With the above step, the initial group acting as the parent accounting group
22becomes visible at '/sys/fs/cgroup/net_prio'. This group includes all tasks in
23the system. '/sys/fs/cgroup/net_prio/tasks' lists the tasks in this cgroup.
24
25Each net_prio cgroup contains two files that are subsystem specific
26
27net_prio.prioidx
28This file is read-only, and is simply informative. It contains a unique integer
29value that the kernel uses as an internal representation of this cgroup.
30
31net_prio.ifpriomap
32This file contains a map of the priorities assigned to traffic originating from
33processes in this group and egressing the system on various interfaces. It
34contains a list of tuples in the form <ifname priority>. Contents of this file
35can be modified by echoing a string into the file using the same tuple format.
36for example:
37
38echo "eth0 5" > /sys/fs/cgroups/net_prio/iscsi/net_prio.ifpriomap
39
40This command would force any traffic originating from processes belonging to the
41iscsi net_prio cgroup and egressing on interface eth0 to have the priority of
42said traffic set to the value 5. The parent accounting group also has a
43writeable 'net_prio.ifpriomap' file that can be used to set a system default
44priority.
45
46Priorities are set immediately prior to queueing a frame to the device
47queueing discipline (qdisc) so priorities will be assigned prior to the hardware
48queue selection being made.
49
50One usage for the net_prio cgroup is with mqprio qdisc allowing application
51traffic to be steered to hardware/driver based traffic classes. These mappings
52can then be managed by administrators or other networking protocols such as
53DCBX.
diff --git a/Documentation/development-process/5.Posting b/Documentation/development-process/5.Posting
index 903a2546f138..8a48c9b62864 100644
--- a/Documentation/development-process/5.Posting
+++ b/Documentation/development-process/5.Posting
@@ -271,10 +271,10 @@ copies should go to:
271 the linux-kernel list. 271 the linux-kernel list.
272 272
273 - If you are fixing a bug, think about whether the fix should go into the 273 - If you are fixing a bug, think about whether the fix should go into the
274 next stable update. If so, stable@kernel.org should get a copy of the 274 next stable update. If so, stable@vger.kernel.org should get a copy of
275 patch. Also add a "Cc: stable@kernel.org" to the tags within the patch 275 the patch. Also add a "Cc: stable@vger.kernel.org" to the tags within
276 itself; that will cause the stable team to get a notification when your 276 the patch itself; that will cause the stable team to get a notification
277 fix goes into the mainline. 277 when your fix goes into the mainline.
278 278
279When selecting recipients for a patch, it is good to have an idea of who 279When selecting recipients for a patch, it is good to have an idea of who
280you think will eventually accept the patch and get it merged. While it 280you think will eventually accept the patch and get it merged. While it
diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
index 52916b4aa1fe..9b4b82a721b6 100644
--- a/Documentation/devicetree/bindings/arm/gic.txt
+++ b/Documentation/devicetree/bindings/arm/gic.txt
@@ -42,6 +42,10 @@ Optional
42- interrupts : Interrupt source of the parent interrupt controller. Only 42- interrupts : Interrupt source of the parent interrupt controller. Only
43 present on secondary GICs. 43 present on secondary GICs.
44 44
45- cpu-offset : per-cpu offset within the distributor and cpu interface
46 regions, used when the GIC doesn't have banked registers. The offset is
47 cpu-offset * cpu-nr.
48
45Example: 49Example:
46 50
47 intc: interrupt-controller@fff11000 { 51 intc: interrupt-controller@fff11000 {
diff --git a/Documentation/devicetree/bindings/arm/vic.txt b/Documentation/devicetree/bindings/arm/vic.txt
new file mode 100644
index 000000000000..266716b23437
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/vic.txt
@@ -0,0 +1,29 @@
1* ARM Vectored Interrupt Controller
2
3One or more Vectored Interrupt Controllers (VIC's) can be connected in an ARM
4system for interrupt routing. For multiple controllers they can either be
5nested or have the outputs wire-OR'd together.
6
7Required properties:
8
9- compatible : should be one of
10 "arm,pl190-vic"
11 "arm,pl192-vic"
12- interrupt-controller : Identifies the node as an interrupt controller
13- #interrupt-cells : The number of cells to define the interrupts. Must be 1 as
14 the VIC has no configuration options for interrupt sources. The cell is a u32
15 and defines the interrupt number.
16- reg : The register bank for the VIC.
17
18Optional properties:
19
20- interrupts : Interrupt source for parent controllers if the VIC is nested.
21
22Example:
23
24 vic0: interrupt-controller@60000 {
25 compatible = "arm,pl192-vic";
26 interrupt-controller;
27 #interrupt-cells = <1>;
28 reg = <0x60000 0x1000>;
29 };
diff --git a/Documentation/devicetree/bindings/i2c/i2c-designware.txt b/Documentation/devicetree/bindings/i2c/i2c-designware.txt
new file mode 100644
index 000000000000..e42a2ee233e6
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/i2c-designware.txt
@@ -0,0 +1,22 @@
1* Synopsys DesignWare I2C
2
3Required properties :
4
5 - compatible : should be "snps,designware-i2c"
6 - reg : Offset and length of the register set for the device
7 - interrupts : <IRQ> where IRQ is the interrupt number.
8
9Recommended properties :
10
11 - clock-frequency : desired I2C bus clock frequency in Hz.
12
13Example :
14
15 i2c@f0000 {
16 #address-cells = <1>;
17 #size-cells = <0>;
18 compatible = "snps,designware-i2c";
19 reg = <0xf0000 0x1000>;
20 interrupts = <11>;
21 clock-frequency = <400000>;
22 };
diff --git a/Documentation/devicetree/bindings/i2c/trivial-devices.txt b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
new file mode 100644
index 000000000000..1a85f986961b
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
@@ -0,0 +1,58 @@
1This is a list of trivial i2c devices that have simple device tree
2bindings, consisting only of a compatible field, an address and
3possibly an interrupt line.
4
5If a device needs more specific bindings, such as properties to
6describe some aspect of it, there needs to be a specific binding
7document for it just like any other devices.
8
9
10Compatible Vendor / Chip
11========== =============
12ad,ad7414 SMBus/I2C Digital Temperature Sensor in 6-Pin SOT with SMBus Alert and Over Temperature Pin
13ad,adm9240 ADM9240: Complete System Hardware Monitor for uProcessor-Based Systems
14adi,adt7461 +/-1C TDM Extended Temp Range I.C
15adt7461 +/-1C TDM Extended Temp Range I.C
16at,24c08 i2c serial eeprom (24cxx)
17atmel,24c02 i2c serial eeprom (24cxx)
18catalyst,24c32 i2c serial eeprom
19dallas,ds1307 64 x 8, Serial, I2C Real-Time Clock
20dallas,ds1338 I2C RTC with 56-Byte NV RAM
21dallas,ds1339 I2C Serial Real-Time Clock
22dallas,ds1340 I2C RTC with Trickle Charger
23dallas,ds1374 I2C, 32-Bit Binary Counter Watchdog RTC with Trickle Charger and Reset Input/Output
24dallas,ds1631 High-Precision Digital Thermometer
25dallas,ds1682 Total-Elapsed-Time Recorder with Alarm
26dallas,ds1775 Tiny Digital Thermometer and Thermostat
27dallas,ds3232 Extremely Accurate I²C RTC with Integrated Crystal and SRAM
28dallas,ds4510 CPU Supervisor with Nonvolatile Memory and Programmable I/O
29dallas,ds75 Digital Thermometer and Thermostat
30dialog,da9053 DA9053: flexible system level PMIC with multicore support
31epson,rx8025 High-Stability. I2C-Bus INTERFACE REAL TIME CLOCK MODULE
32epson,rx8581 I2C-BUS INTERFACE REAL TIME CLOCK MODULE
33fsl,mag3110 MAG3110: Xtrinsic High Accuracy, 3D Magnetometer
34fsl,mc13892 MC13892: Power Management Integrated Circuit (PMIC) for i.MX35/51
35fsl,mma8450 MMA8450Q: Xtrinsic Low-power, 3-axis Xtrinsic Accelerometer
36fsl,mpr121 MPR121: Proximity Capacitive Touch Sensor Controller
37fsl,sgtl5000 SGTL5000: Ultra Low-Power Audio Codec
38maxim,ds1050 5 Bit Programmable, Pulse-Width Modulator
39maxim,max1237 Low-Power, 4-/12-Channel, 2-Wire Serial, 12-Bit ADCs
40maxim,max6625 9-Bit/12-Bit Temperature Sensors with I²C-Compatible Serial Interface
41mc,rv3029c2 Real Time Clock Module with I2C-Bus
42national,lm75 I2C TEMP SENSOR
43national,lm80 Serial Interface ACPI-Compatible Microprocessor System Hardware Monitor
44national,lm92 ±0.33°C Accurate, 12-Bit + Sign Temperature Sensor and Thermal Window Comparator with Two-Wire Interface
45nxp,pca9556 Octal SMBus and I2C registered interface
46nxp,pca9557 8-bit I2C-bus and SMBus I/O port with reset
47nxp,pcf8563 Real-time clock/calendar
48ovti,ov5642 OV5642: Color CMOS QSXGA (5-megapixel) Image Sensor with OmniBSI and Embedded TrueFocus
49pericom,pt7c4338 Real-time Clock Module
50plx,pex8648 48-Lane, 12-Port PCI Express Gen 2 (5.0 GT/s) Switch
51ramtron,24c64 i2c serial eeprom (24cxx)
52ricoh,rs5c372a I2C bus SERIAL INTERFACE REAL-TIME CLOCK IC
53samsung,24ad0xd1 S524AD0XF1 (128K/256K-bit Serial EEPROM for Low Power)
54st-micro,24c256 i2c serial eeprom (24cxx)
55stm,m41t00 Serial Access TIMEKEEPER
56stm,m41t62 Serial real-time clock (RTC) with alarm
57stm,m41t80 M41T80 - SERIAL ACCESS RTC WITH ALARMS
58ti,tsc2003 I2C Touch-Screen Controller
diff --git a/Documentation/devicetree/bindings/net/calxeda-xgmac.txt b/Documentation/devicetree/bindings/net/calxeda-xgmac.txt
new file mode 100644
index 000000000000..411727a3f82d
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/calxeda-xgmac.txt
@@ -0,0 +1,15 @@
1* Calxeda Highbank 10Gb XGMAC Ethernet
2
3Required properties:
4- compatible : Should be "calxeda,hb-xgmac"
5- reg : Address and length of the register set for the device
6- interrupts : Should contain 3 xgmac interrupts. The 1st is main interrupt.
7 The 2nd is pwr mgt interrupt. The 3rd is low power state interrupt.
8
9Example:
10
11ethernet@fff50000 {
12 compatible = "calxeda,hb-xgmac";
13 reg = <0xfff50000 0x1000>;
14 interrupts = <0 77 4 0 78 4 0 79 4>;
15};
diff --git a/Documentation/devicetree/bindings/net/can/cc770.txt b/Documentation/devicetree/bindings/net/can/cc770.txt
new file mode 100644
index 000000000000..77027bf6460a
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/can/cc770.txt
@@ -0,0 +1,53 @@
1Memory mapped Bosch CC770 and Intel AN82527 CAN controller
2
3Note: The CC770 is a CAN controller from Bosch, which is 100%
4compatible with the old AN82527 from Intel, but with "bugs" being fixed.
5
6Required properties:
7
8- compatible : should be "bosch,cc770" for the CC770 and "intc,82527"
9 for the AN82527.
10
11- reg : should specify the chip select, address offset and size required
12 to map the registers of the controller. The size is usually 0x80.
13
14- interrupts : property with a value describing the interrupt source
15 (number and sensitivity) required for the controller.
16
17Optional properties:
18
19- bosch,external-clock-frequency : frequency of the external oscillator
20 clock in Hz. Note that the internal clock frequency used by the
21 controller is half of that value. If not specified, a default
22 value of 16000000 (16 MHz) is used.
23
24- bosch,clock-out-frequency : slock frequency in Hz on the CLKOUT pin.
25 If not specified or if the specified value is 0, the CLKOUT pin
26 will be disabled.
27
28- bosch,slew-rate : slew rate of the CLKOUT signal. If not specified,
29 a resonable value will be calculated.
30
31- bosch,disconnect-rx0-input : see data sheet.
32
33- bosch,disconnect-rx1-input : see data sheet.
34
35- bosch,disconnect-tx1-output : see data sheet.
36
37- bosch,polarity-dominant : see data sheet.
38
39- bosch,divide-memory-clock : see data sheet.
40
41- bosch,iso-low-speed-mux : see data sheet.
42
43For further information, please have a look to the CC770 or AN82527.
44
45Examples:
46
47can@3,100 {
48 compatible = "bosch,cc770";
49 reg = <3 0x100 0x80>;
50 interrupts = <2 0>;
51 interrupt-parent = <&mpic>;
52 bosch,external-clock-frequency = <16000000>;
53};
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt b/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt
new file mode 100644
index 000000000000..b9a8a2bcfae7
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/srio-rmu.txt
@@ -0,0 +1,163 @@
1Message unit node:
2
3For SRIO controllers that implement the message unit as part of the controller
4this node is required. For devices with RMAN this node should NOT exist. The
5node is composed of three types of sub-nodes ("fsl-srio-msg-unit",
6"fsl-srio-dbell-unit" and "fsl-srio-port-write-unit").
7
8See srio.txt for more details about generic SRIO controller details.
9
10 - compatible
11 Usage: required
12 Value type: <string>
13 Definition: Must include "fsl,srio-rmu-vX.Y", "fsl,srio-rmu".
14
15 The version X.Y should match the general SRIO controller's IP Block
16 revision register's Major(X) and Minor (Y) value.
17
18 - reg
19 Usage: required
20 Value type: <prop-encoded-array>
21 Definition: A standard property. Specifies the physical address and
22 length of the SRIO configuration registers for message units
23 and doorbell units.
24
25 - fsl,liodn
26 Usage: optional-but-recommended (for devices with PAMU)
27 Value type: <prop-encoded-array>
28 Definition: The logical I/O device number for the PAMU (IOMMU) to be
29 correctly configured for SRIO accesses. The property should
30 not exist on devices that do not support PAMU.
31
32 The LIODN value is associated with all RMU transactions
33 (msg-unit, doorbell, port-write).
34
35Sub-Nodes for RMU: The RMU node is composed of multiple sub-nodes that
36correspond to the actual sub-controllers in the RMU. The manual for a given
37SoC will detail which and how many of these sub-controllers are implemented.
38
39Message Unit:
40
41 - compatible
42 Usage: required
43 Value type: <string>
44 Definition: Must include "fsl,srio-msg-unit-vX.Y", "fsl,srio-msg-unit".
45
46 The version X.Y should match the general SRIO controller's IP Block
47 revision register's Major(X) and Minor (Y) value.
48
49 - reg
50 Usage: required
51 Value type: <prop-encoded-array>
52 Definition: A standard property. Specifies the physical address and
53 length of the SRIO configuration registers for message units
54 and doorbell units.
55
56 - interrupts
57 Usage: required
58 Value type: <prop_encoded-array>
59 Definition: Specifies the interrupts generated by this device. The
60 value of the interrupts property consists of one interrupt
61 specifier. The format of the specifier is defined by the
62 binding document describing the node's interrupt parent.
63
64 A pair of IRQs are specified in this property. The first
65 element is associated with the transmit (TX) interrupt and the
66 second element is associated with the receive (RX) interrupt.
67
68Doorbell Unit:
69
70 - compatible
71 Usage: required
72 Value type: <string>
73 Definition: Must include:
74 "fsl,srio-dbell-unit-vX.Y", "fsl,srio-dbell-unit"
75
76 The version X.Y should match the general SRIO controller's IP Block
77 revision register's Major(X) and Minor (Y) value.
78
79 - reg
80 Usage: required
81 Value type: <prop-encoded-array>
82 Definition: A standard property. Specifies the physical address and
83 length of the SRIO configuration registers for message units
84 and doorbell units.
85
86 - interrupts
87 Usage: required
88 Value type: <prop_encoded-array>
89 Definition: Specifies the interrupts generated by this device. The
90 value of the interrupts property consists of one interrupt
91 specifier. The format of the specifier is defined by the
92 binding document describing the node's interrupt parent.
93
94 A pair of IRQs are specified in this property. The first
95 element is associated with the transmit (TX) interrupt and the
96 second element is associated with the receive (RX) interrupt.
97
98Port-Write Unit:
99
100 - compatible
101 Usage: required
102 Value type: <string>
103 Definition: Must include:
104 "fsl,srio-port-write-unit-vX.Y", "fsl,srio-port-write-unit"
105
106 The version X.Y should match the general SRIO controller's IP Block
107 revision register's Major(X) and Minor (Y) value.
108
109 - reg
110 Usage: required
111 Value type: <prop-encoded-array>
112 Definition: A standard property. Specifies the physical address and
113 length of the SRIO configuration registers for message units
114 and doorbell units.
115
116 - interrupts
117 Usage: required
118 Value type: <prop_encoded-array>
119 Definition: Specifies the interrupts generated by this device. The
120 value of the interrupts property consists of one interrupt
121 specifier. The format of the specifier is defined by the
122 binding document describing the node's interrupt parent.
123
124 A single IRQ that handles port-write conditions is
125 specified by this property. (Typically shared with error).
126
127 Note: All other standard properties (see the ePAPR) are allowed
128 but are optional.
129
130Example:
131 rmu: rmu@d3000 {
132 compatible = "fsl,srio-rmu";
133 reg = <0xd3000 0x400>;
134 ranges = <0x0 0xd3000 0x400>;
135 fsl,liodn = <0xc8>;
136
137 message-unit@0 {
138 compatible = "fsl,srio-msg-unit";
139 reg = <0x0 0x100>;
140 interrupts = <
141 60 2 0 0 /* msg1_tx_irq */
142 61 2 0 0>;/* msg1_rx_irq */
143 };
144 message-unit@100 {
145 compatible = "fsl,srio-msg-unit";
146 reg = <0x100 0x100>;
147 interrupts = <
148 62 2 0 0 /* msg2_tx_irq */
149 63 2 0 0>;/* msg2_rx_irq */
150 };
151 doorbell-unit@400 {
152 compatible = "fsl,srio-dbell-unit";
153 reg = <0x400 0x80>;
154 interrupts = <
155 56 2 0 0 /* bell_outb_irq */
156 57 2 0 0>;/* bell_inb_irq */
157 };
158 port-write-unit@4e0 {
159 compatible = "fsl,srio-port-write-unit";
160 reg = <0x4e0 0x20>;
161 interrupts = <16 2 1 11>;
162 };
163 };
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/srio.txt b/Documentation/devicetree/bindings/powerpc/fsl/srio.txt
new file mode 100644
index 000000000000..b039bcbee134
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/fsl/srio.txt
@@ -0,0 +1,103 @@
1* Freescale Serial RapidIO (SRIO) Controller
2
3RapidIO port node:
4Properties:
5 - compatible
6 Usage: required
7 Value type: <string>
8 Definition: Must include "fsl,srio" for IP blocks with IP Block
9 Revision Register (SRIO IPBRR1) Major ID equal to 0x01c0.
10
11 Optionally, a compatiable string of "fsl,srio-vX.Y" where X is Major
12 version in IP Block Revision Register and Y is Minor version. If this
13 compatiable is provided it should be ordered before "fsl,srio".
14
15 - reg
16 Usage: required
17 Value type: <prop-encoded-array>
18 Definition: A standard property. Specifies the physical address and
19 length of the SRIO configuration registers. The size should
20 be set to 0x11000.
21
22 - interrupts
23 Usage: required
24 Value type: <prop_encoded-array>
25 Definition: Specifies the interrupts generated by this device. The
26 value of the interrupts property consists of one interrupt
27 specifier. The format of the specifier is defined by the
28 binding document describing the node's interrupt parent.
29
30 A single IRQ that handles error conditions is specified by this
31 property. (Typically shared with port-write).
32
33 - fsl,srio-rmu-handle:
34 Usage: required if rmu node is defined
35 Value type: <phandle>
36 Definition: A single <phandle> value that points to the RMU.
37 (See srio-rmu.txt for more details on RMU node binding)
38
39Port Child Nodes: There should a port child node for each port that exists in
40the controller. The ports are numbered starting at one (1) and should have
41the following properties:
42
43 - cell-index
44 Usage: required
45 Value type: <u32>
46 Definition: A standard property. Matches the port id.
47
48 - ranges
49 Usage: required if local access windows preset
50 Value type: <prop-encoded-array>
51 Definition: A standard property. Utilized to describe the memory mapped
52 IO space utilized by the controller. This corresponds to the
53 setting of the local access windows that are targeted to this
54 SRIO port.
55
56 - fsl,liodn
57 Usage: optional-but-recommended (for devices with PAMU)
58 Value type: <prop-encoded-array>
59 Definition: The logical I/O device number for the PAMU (IOMMU) to be
60 correctly configured for SRIO accesses. The property should
61 not exist on devices that do not support PAMU.
62
63 For HW (ie, the P4080) that only supports a LIODN for both
64 memory and maintenance transactions then a single LIODN is
65 represented in the property for both transactions.
66
67 For HW (ie, the P304x/P5020, etc) that supports an LIODN for
68 memory transactions and a unique LIODN for maintenance
69 transactions then a pair of LIODNs are represented in the
70 property. Within the pair, the first element represents the
71 LIODN associated with memory transactions and the second element
72 represents the LIODN associated with maintenance transactions
73 for the port.
74
75Note: All other standard properties (see ePAPR) are allowed but are optional.
76
77Example:
78
79 rapidio: rapidio@ffe0c0000 {
80 #address-cells = <2>;
81 #size-cells = <2>;
82 reg = <0xf 0xfe0c0000 0 0x11000>;
83 compatible = "fsl,srio";
84 interrupts = <16 2 1 11>; /* err_irq */
85 fsl,srio-rmu-handle = <&rmu>;
86 ranges;
87
88 port1 {
89 cell-index = <1>;
90 #address-cells = <2>;
91 #size-cells = <2>;
92 fsl,liodn = <34>;
93 ranges = <0 0 0xc 0x20000000 0 0x10000000>;
94 };
95
96 port2 {
97 cell-index = <2>;
98 #address-cells = <2>;
99 #size-cells = <2>;
100 fsl,liodn = <48>;
101 ranges = <0 0 0xc 0x30000000 0 0x10000000>;
102 };
103 };
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index e8552782b440..18626965159e 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -8,7 +8,9 @@ amcc Applied Micro Circuits Corporation (APM, formally AMCC)
8apm Applied Micro Circuits Corporation (APM) 8apm Applied Micro Circuits Corporation (APM)
9arm ARM Ltd. 9arm ARM Ltd.
10atmel Atmel Corporation 10atmel Atmel Corporation
11cavium Cavium, Inc.
11chrp Common Hardware Reference Platform 12chrp Common Hardware Reference Platform
13cortina Cortina Systems, Inc.
12dallas Maxim Integrated Products (formerly Dallas Semiconductor) 14dallas Maxim Integrated Products (formerly Dallas Semiconductor)
13denx Denx Software Engineering 15denx Denx Software Engineering
14epson Seiko Epson Corp. 16epson Seiko Epson Corp.
@@ -33,8 +35,10 @@ qcom Qualcomm, Inc.
33ramtron Ramtron International 35ramtron Ramtron International
34samsung Samsung Semiconductor 36samsung Samsung Semiconductor
35schindler Schindler 37schindler Schindler
38sil Silicon Image
36simtek 39simtek
37sirf SiRF Technology, Inc. 40sirf SiRF Technology, Inc.
41st STMicroelectronics
38stericsson ST-Ericsson 42stericsson ST-Ericsson
39ti Texas Instruments 43ti Texas Instruments
40xlnx Xilinx 44xlnx Xilinx
diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt
index d79aead9418b..10c64c8a13d4 100644
--- a/Documentation/driver-model/devres.txt
+++ b/Documentation/driver-model/devres.txt
@@ -262,6 +262,7 @@ IOMAP
262 devm_ioremap() 262 devm_ioremap()
263 devm_ioremap_nocache() 263 devm_ioremap_nocache()
264 devm_iounmap() 264 devm_iounmap()
265 devm_request_and_ioremap() : checks resource, requests region, ioremaps
265 pcim_iomap() 266 pcim_iomap()
266 pcim_iounmap() 267 pcim_iounmap()
267 pcim_iomap_table() : array of mapped addresses indexed by BAR 268 pcim_iomap_table() : array of mapped addresses indexed by BAR
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 3d849122b5b1..a1e7f3eec98f 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -85,17 +85,6 @@ Who: Robin Getz <rgetz@blackfin.uclinux.org> & Matt Mackall <mpm@selenic.com>
85 85
86--------------------------- 86---------------------------
87 87
88What: Deprecated snapshot ioctls
89When: 2.6.36
90
91Why: The ioctls in kernel/power/user.c were marked as deprecated long time
92 ago. Now they notify users about that so that they need to replace
93 their userspace. After some more time, remove them completely.
94
95Who: Jiri Slaby <jirislaby@gmail.com>
96
97---------------------------
98
99What: The ieee80211_regdom module parameter 88What: The ieee80211_regdom module parameter
100When: March 2010 / desktop catchup 89When: March 2010 / desktop catchup
101 90
@@ -263,8 +252,7 @@ Who: Ravikiran Thirumalai <kiran@scalex86.org>
263 252
264What: Code that is now under CONFIG_WIRELESS_EXT_SYSFS 253What: Code that is now under CONFIG_WIRELESS_EXT_SYSFS
265 (in net/core/net-sysfs.c) 254 (in net/core/net-sysfs.c)
266When: After the only user (hal) has seen a release with the patches 255When: 3.5
267 for enough time, probably some time in 2010.
268Why: Over 1K .text/.data size reduction, data is available in other 256Why: Over 1K .text/.data size reduction, data is available in other
269 ways (ioctls) 257 ways (ioctls)
270Who: Johannes Berg <johannes@sipsolutions.net> 258Who: Johannes Berg <johannes@sipsolutions.net>
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index d819ba16a0c7..4fca82e5276e 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -37,15 +37,15 @@ d_manage: no no yes (ref-walk) maybe
37 37
38--------------------------- inode_operations --------------------------- 38--------------------------- inode_operations ---------------------------
39prototypes: 39prototypes:
40 int (*create) (struct inode *,struct dentry *,int, struct nameidata *); 40 int (*create) (struct inode *,struct dentry *,umode_t, struct nameidata *);
41 struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameid 41 struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameid
42ata *); 42ata *);
43 int (*link) (struct dentry *,struct inode *,struct dentry *); 43 int (*link) (struct dentry *,struct inode *,struct dentry *);
44 int (*unlink) (struct inode *,struct dentry *); 44 int (*unlink) (struct inode *,struct dentry *);
45 int (*symlink) (struct inode *,struct dentry *,const char *); 45 int (*symlink) (struct inode *,struct dentry *,const char *);
46 int (*mkdir) (struct inode *,struct dentry *,int); 46 int (*mkdir) (struct inode *,struct dentry *,umode_t);
47 int (*rmdir) (struct inode *,struct dentry *); 47 int (*rmdir) (struct inode *,struct dentry *);
48 int (*mknod) (struct inode *,struct dentry *,int,dev_t); 48 int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t);
49 int (*rename) (struct inode *, struct dentry *, 49 int (*rename) (struct inode *, struct dentry *,
50 struct inode *, struct dentry *); 50 struct inode *, struct dentry *);
51 int (*readlink) (struct dentry *, char __user *,int); 51 int (*readlink) (struct dentry *, char __user *,int);
@@ -117,7 +117,7 @@ prototypes:
117 int (*statfs) (struct dentry *, struct kstatfs *); 117 int (*statfs) (struct dentry *, struct kstatfs *);
118 int (*remount_fs) (struct super_block *, int *, char *); 118 int (*remount_fs) (struct super_block *, int *, char *);
119 void (*umount_begin) (struct super_block *); 119 void (*umount_begin) (struct super_block *);
120 int (*show_options)(struct seq_file *, struct vfsmount *); 120 int (*show_options)(struct seq_file *, struct dentry *);
121 ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); 121 ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
122 ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t); 122 ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
123 int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t); 123 int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
diff --git a/Documentation/filesystems/btrfs.txt b/Documentation/filesystems/btrfs.txt
index 64087c34327f..7671352216f1 100644
--- a/Documentation/filesystems/btrfs.txt
+++ b/Documentation/filesystems/btrfs.txt
@@ -63,8 +63,8 @@ IRC network.
63Userspace tools for creating and manipulating Btrfs file systems are 63Userspace tools for creating and manipulating Btrfs file systems are
64available from the git repository at the following location: 64available from the git repository at the following location:
65 65
66 http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git 66 http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git
67 git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git 67 git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
68 68
69These include the following tools: 69These include the following tools:
70 70
diff --git a/Documentation/filesystems/configfs/configfs.txt b/Documentation/filesystems/configfs/configfs.txt
index dd57bb6bb390..b40fec9d3f53 100644
--- a/Documentation/filesystems/configfs/configfs.txt
+++ b/Documentation/filesystems/configfs/configfs.txt
@@ -192,7 +192,7 @@ attribute value uses the store_attribute() method.
192 struct configfs_attribute { 192 struct configfs_attribute {
193 char *ca_name; 193 char *ca_name;
194 struct module *ca_owner; 194 struct module *ca_owner;
195 mode_t ca_mode; 195 umode_t ca_mode;
196 }; 196 };
197 197
198When a config_item wants an attribute to appear as a file in the item's 198When a config_item wants an attribute to appear as a file in the item's
diff --git a/Documentation/filesystems/debugfs.txt b/Documentation/filesystems/debugfs.txt
index 742cc06e138f..6872c91bce35 100644
--- a/Documentation/filesystems/debugfs.txt
+++ b/Documentation/filesystems/debugfs.txt
@@ -35,7 +35,7 @@ described below will work.
35 35
36The most general way to create a file within a debugfs directory is with: 36The most general way to create a file within a debugfs directory is with:
37 37
38 struct dentry *debugfs_create_file(const char *name, mode_t mode, 38 struct dentry *debugfs_create_file(const char *name, umode_t mode,
39 struct dentry *parent, void *data, 39 struct dentry *parent, void *data,
40 const struct file_operations *fops); 40 const struct file_operations *fops);
41 41
@@ -53,13 +53,13 @@ actually necessary; the debugfs code provides a number of helper functions
53for simple situations. Files containing a single integer value can be 53for simple situations. Files containing a single integer value can be
54created with any of: 54created with any of:
55 55
56 struct dentry *debugfs_create_u8(const char *name, mode_t mode, 56 struct dentry *debugfs_create_u8(const char *name, umode_t mode,
57 struct dentry *parent, u8 *value); 57 struct dentry *parent, u8 *value);
58 struct dentry *debugfs_create_u16(const char *name, mode_t mode, 58 struct dentry *debugfs_create_u16(const char *name, umode_t mode,
59 struct dentry *parent, u16 *value); 59 struct dentry *parent, u16 *value);
60 struct dentry *debugfs_create_u32(const char *name, mode_t mode, 60 struct dentry *debugfs_create_u32(const char *name, umode_t mode,
61 struct dentry *parent, u32 *value); 61 struct dentry *parent, u32 *value);
62 struct dentry *debugfs_create_u64(const char *name, mode_t mode, 62 struct dentry *debugfs_create_u64(const char *name, umode_t mode,
63 struct dentry *parent, u64 *value); 63 struct dentry *parent, u64 *value);
64 64
65These files support both reading and writing the given value; if a specific 65These files support both reading and writing the given value; if a specific
@@ -67,13 +67,13 @@ file should not be written to, simply set the mode bits accordingly. The
67values in these files are in decimal; if hexadecimal is more appropriate, 67values in these files are in decimal; if hexadecimal is more appropriate,
68the following functions can be used instead: 68the following functions can be used instead:
69 69
70 struct dentry *debugfs_create_x8(const char *name, mode_t mode, 70 struct dentry *debugfs_create_x8(const char *name, umode_t mode,
71 struct dentry *parent, u8 *value); 71 struct dentry *parent, u8 *value);
72 struct dentry *debugfs_create_x16(const char *name, mode_t mode, 72 struct dentry *debugfs_create_x16(const char *name, umode_t mode,
73 struct dentry *parent, u16 *value); 73 struct dentry *parent, u16 *value);
74 struct dentry *debugfs_create_x32(const char *name, mode_t mode, 74 struct dentry *debugfs_create_x32(const char *name, umode_t mode,
75 struct dentry *parent, u32 *value); 75 struct dentry *parent, u32 *value);
76 struct dentry *debugfs_create_x64(const char *name, mode_t mode, 76 struct dentry *debugfs_create_x64(const char *name, umode_t mode,
77 struct dentry *parent, u64 *value); 77 struct dentry *parent, u64 *value);
78 78
79These functions are useful as long as the developer knows the size of the 79These functions are useful as long as the developer knows the size of the
@@ -81,7 +81,7 @@ value to be exported. Some types can have different widths on different
81architectures, though, complicating the situation somewhat. There is a 81architectures, though, complicating the situation somewhat. There is a
82function meant to help out in one special case: 82function meant to help out in one special case:
83 83
84 struct dentry *debugfs_create_size_t(const char *name, mode_t mode, 84 struct dentry *debugfs_create_size_t(const char *name, umode_t mode,
85 struct dentry *parent, 85 struct dentry *parent,
86 size_t *value); 86 size_t *value);
87 87
@@ -90,21 +90,22 @@ a variable of type size_t.
90 90
91Boolean values can be placed in debugfs with: 91Boolean values can be placed in debugfs with:
92 92
93 struct dentry *debugfs_create_bool(const char *name, mode_t mode, 93 struct dentry *debugfs_create_bool(const char *name, umode_t mode,
94 struct dentry *parent, u32 *value); 94 struct dentry *parent, u32 *value);
95 95
96A read on the resulting file will yield either Y (for non-zero values) or 96A read on the resulting file will yield either Y (for non-zero values) or
97N, followed by a newline. If written to, it will accept either upper- or 97N, followed by a newline. If written to, it will accept either upper- or
98lower-case values, or 1 or 0. Any other input will be silently ignored. 98lower-case values, or 1 or 0. Any other input will be silently ignored.
99 99
100Finally, a block of arbitrary binary data can be exported with: 100Another option is exporting a block of arbitrary binary data, with
101this structure and function:
101 102
102 struct debugfs_blob_wrapper { 103 struct debugfs_blob_wrapper {
103 void *data; 104 void *data;
104 unsigned long size; 105 unsigned long size;
105 }; 106 };
106 107
107 struct dentry *debugfs_create_blob(const char *name, mode_t mode, 108 struct dentry *debugfs_create_blob(const char *name, umode_t mode,
108 struct dentry *parent, 109 struct dentry *parent,
109 struct debugfs_blob_wrapper *blob); 110 struct debugfs_blob_wrapper *blob);
110 111
@@ -115,6 +116,35 @@ can be used to export binary information, but there does not appear to be
115any code which does so in the mainline. Note that all files created with 116any code which does so in the mainline. Note that all files created with
116debugfs_create_blob() are read-only. 117debugfs_create_blob() are read-only.
117 118
119If you want to dump a block of registers (something that happens quite
120often during development, even if little such code reaches mainline.
121Debugfs offers two functions: one to make a registers-only file, and
122another to insert a register block in the middle of another sequential
123file.
124
125 struct debugfs_reg32 {
126 char *name;
127 unsigned long offset;
128 };
129
130 struct debugfs_regset32 {
131 struct debugfs_reg32 *regs;
132 int nregs;
133 void __iomem *base;
134 };
135
136 struct dentry *debugfs_create_regset32(const char *name, mode_t mode,
137 struct dentry *parent,
138 struct debugfs_regset32 *regset);
139
140 int debugfs_print_regs32(struct seq_file *s, struct debugfs_reg32 *regs,
141 int nregs, void __iomem *base, char *prefix);
142
143The "base" argument may be 0, but you may want to build the reg32 array
144using __stringify, and a number of register names (macros) are actually
145byte offsets over a base for the register block.
146
147
118There are a couple of other directory-oriented helper functions: 148There are a couple of other directory-oriented helper functions:
119 149
120 struct dentry *debugfs_rename(struct dentry *old_dir, 150 struct dentry *debugfs_rename(struct dentry *old_dir,
diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt
index 07235caec22c..a6619b7064b9 100644
--- a/Documentation/filesystems/sysfs.txt
+++ b/Documentation/filesystems/sysfs.txt
@@ -70,7 +70,7 @@ An attribute definition is simply:
70struct attribute { 70struct attribute {
71 char * name; 71 char * name;
72 struct module *owner; 72 struct module *owner;
73 mode_t mode; 73 umode_t mode;
74}; 74};
75 75
76 76
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 43cbd0821721..3d9393b845b8 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -225,7 +225,7 @@ struct super_operations {
225 void (*clear_inode) (struct inode *); 225 void (*clear_inode) (struct inode *);
226 void (*umount_begin) (struct super_block *); 226 void (*umount_begin) (struct super_block *);
227 227
228 int (*show_options)(struct seq_file *, struct vfsmount *); 228 int (*show_options)(struct seq_file *, struct dentry *);
229 229
230 ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); 230 ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
231 ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t); 231 ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
@@ -341,14 +341,14 @@ This describes how the VFS can manipulate an inode in your
341filesystem. As of kernel 2.6.22, the following members are defined: 341filesystem. As of kernel 2.6.22, the following members are defined:
342 342
343struct inode_operations { 343struct inode_operations {
344 int (*create) (struct inode *,struct dentry *,int, struct nameidata *); 344 int (*create) (struct inode *,struct dentry *, umode_t, struct nameidata *);
345 struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *); 345 struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
346 int (*link) (struct dentry *,struct inode *,struct dentry *); 346 int (*link) (struct dentry *,struct inode *,struct dentry *);
347 int (*unlink) (struct inode *,struct dentry *); 347 int (*unlink) (struct inode *,struct dentry *);
348 int (*symlink) (struct inode *,struct dentry *,const char *); 348 int (*symlink) (struct inode *,struct dentry *,const char *);
349 int (*mkdir) (struct inode *,struct dentry *,int); 349 int (*mkdir) (struct inode *,struct dentry *,umode_t);
350 int (*rmdir) (struct inode *,struct dentry *); 350 int (*rmdir) (struct inode *,struct dentry *);
351 int (*mknod) (struct inode *,struct dentry *,int,dev_t); 351 int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t);
352 int (*rename) (struct inode *, struct dentry *, 352 int (*rename) (struct inode *, struct dentry *,
353 struct inode *, struct dentry *); 353 struct inode *, struct dentry *);
354 int (*readlink) (struct dentry *, char __user *,int); 354 int (*readlink) (struct dentry *, char __user *,int);
diff --git a/Documentation/i2c/ten-bit-addresses b/Documentation/i2c/ten-bit-addresses
index e9890709c508..cdfe13901b99 100644
--- a/Documentation/i2c/ten-bit-addresses
+++ b/Documentation/i2c/ten-bit-addresses
@@ -1,22 +1,24 @@
1The I2C protocol knows about two kinds of device addresses: normal 7 bit 1The I2C protocol knows about two kinds of device addresses: normal 7 bit
2addresses, and an extended set of 10 bit addresses. The sets of addresses 2addresses, and an extended set of 10 bit addresses. The sets of addresses
3do not intersect: the 7 bit address 0x10 is not the same as the 10 bit 3do not intersect: the 7 bit address 0x10 is not the same as the 10 bit
4address 0x10 (though a single device could respond to both of them). You 4address 0x10 (though a single device could respond to both of them).
5select a 10 bit address by adding an extra byte after the address
6byte:
7 S Addr7 Rd/Wr ....
8becomes
9 S 11110 Addr10 Rd/Wr
10S is the start bit, Rd/Wr the read/write bit, and if you count the number
11of bits, you will see the there are 8 after the S bit for 7 bit addresses,
12and 16 after the S bit for 10 bit addresses.
13 5
14WARNING! The current 10 bit address support is EXPERIMENTAL. There are 6I2C messages to and from 10-bit address devices have a different format.
15several places in the code that will cause SEVERE PROBLEMS with 10 bit 7See the I2C specification for the details.
16addresses, even though there is some basic handling and hooks. Also,
17almost no supported adapter handles the 10 bit addresses correctly.
18 8
19As soon as a real 10 bit address device is spotted 'in the wild', we 9The current 10 bit address support is minimal. It should work, however
20can and will add proper support. Right now, 10 bit address devices 10you can expect some problems along the way:
21are defined by the I2C protocol, but we have never seen a single device 11* Not all bus drivers support 10-bit addresses. Some don't because the
22which supports them. 12 hardware doesn't support them (SMBus doesn't require 10-bit address
13 support for example), some don't because nobody bothered adding the
14 code (or it's there but not working properly.) Software implementation
15 (i2c-algo-bit) is known to work.
16* Some optional features do not support 10-bit addresses. This is the
17 case of automatic detection and instantiation of devices by their,
18 drivers, for example.
19* Many user-space packages (for example i2c-tools) lack support for
20 10-bit addresses.
21
22Note that 10-bit address devices are still pretty rare, so the limitations
23listed above could stay for a long time, maybe even forever if nobody
24needs them to be fixed.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a0c5c5f4fce6..e229769606f2 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -315,12 +315,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
315 CPU-intensive style benchmark, and it can vary highly in 315 CPU-intensive style benchmark, and it can vary highly in
316 a microbenchmark depending on workload and compiler. 316 a microbenchmark depending on workload and compiler.
317 317
318 1: only for 32-bit processes 318 32: only for 32-bit processes
319 2: only for 64-bit processes 319 64: only for 64-bit processes
320 on: enable for both 32- and 64-bit processes 320 on: enable for both 32- and 64-bit processes
321 off: disable for both 32- and 64-bit processes 321 off: disable for both 32- and 64-bit processes
322 322
323 amd_iommu= [HW,X86-84] 323 amd_iommu= [HW,X86-64]
324 Pass parameters to the AMD IOMMU driver in the system. 324 Pass parameters to the AMD IOMMU driver in the system.
325 Possible values are: 325 Possible values are:
326 fullflush - enable flushing of IO/TLB entries when 326 fullflush - enable flushing of IO/TLB entries when
@@ -1885,6 +1885,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
1885 arch_perfmon: [X86] Force use of architectural 1885 arch_perfmon: [X86] Force use of architectural
1886 perfmon on Intel CPUs instead of the 1886 perfmon on Intel CPUs instead of the
1887 CPU specific event set. 1887 CPU specific event set.
1888 timer: [X86] Force use of architectural NMI
1889 timer mode (see also oprofile.timer
1890 for generic hr timer mode)
1891 [s390] Force legacy basic mode sampling
1892 (report cpu_type "timer")
1888 1893
1889 oops=panic Always panic on oopses. Default is to just kill the 1894 oops=panic Always panic on oopses. Default is to just kill the
1890 process, but there is a small probability of 1895 process, but there is a small probability of
@@ -2750,11 +2755,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
2750 functions are at fixed addresses, they make nice 2755 functions are at fixed addresses, they make nice
2751 targets for exploits that can control RIP. 2756 targets for exploits that can control RIP.
2752 2757
2753 emulate Vsyscalls turn into traps and are emulated 2758 emulate [default] Vsyscalls turn into traps and are
2754 reasonably safely. 2759 emulated reasonably safely.
2755 2760
2756 native [default] Vsyscalls are native syscall 2761 native Vsyscalls are native syscall instructions.
2757 instructions.
2758 This is a little bit faster than trapping 2762 This is a little bit faster than trapping
2759 and makes a few dynamic recompilers work 2763 and makes a few dynamic recompilers work
2760 better than they would in emulation mode. 2764 better than they would in emulation mode.
diff --git a/Documentation/lockdep-design.txt b/Documentation/lockdep-design.txt
index abf768c681e2..5dbc99c04f6e 100644
--- a/Documentation/lockdep-design.txt
+++ b/Documentation/lockdep-design.txt
@@ -221,3 +221,66 @@ when the chain is validated for the first time, is then put into a hash
221table, which hash-table can be checked in a lockfree manner. If the 221table, which hash-table can be checked in a lockfree manner. If the
222locking chain occurs again later on, the hash table tells us that we 222locking chain occurs again later on, the hash table tells us that we
223dont have to validate the chain again. 223dont have to validate the chain again.
224
225Troubleshooting:
226----------------
227
228The validator tracks a maximum of MAX_LOCKDEP_KEYS number of lock classes.
229Exceeding this number will trigger the following lockdep warning:
230
231 (DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
232
233By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
234desktop systems have less than 1,000 lock classes, so this warning
235normally results from lock-class leakage or failure to properly
236initialize locks. These two problems are illustrated below:
237
2381. Repeated module loading and unloading while running the validator
239 will result in lock-class leakage. The issue here is that each
240 load of the module will create a new set of lock classes for
241 that module's locks, but module unloading does not remove old
242 classes (see below discussion of reuse of lock classes for why).
243 Therefore, if that module is loaded and unloaded repeatedly,
244 the number of lock classes will eventually reach the maximum.
245
2462. Using structures such as arrays that have large numbers of
247 locks that are not explicitly initialized. For example,
248 a hash table with 8192 buckets where each bucket has its own
249 spinlock_t will consume 8192 lock classes -unless- each spinlock
250 is explicitly initialized at runtime, for example, using the
251 run-time spin_lock_init() as opposed to compile-time initializers
252 such as __SPIN_LOCK_UNLOCKED(). Failure to properly initialize
253 the per-bucket spinlocks would guarantee lock-class overflow.
254 In contrast, a loop that called spin_lock_init() on each lock
255 would place all 8192 locks into a single lock class.
256
257 The moral of this story is that you should always explicitly
258 initialize your locks.
259
260One might argue that the validator should be modified to allow
261lock classes to be reused. However, if you are tempted to make this
262argument, first review the code and think through the changes that would
263be required, keeping in mind that the lock classes to be removed are
264likely to be linked into the lock-dependency graph. This turns out to
265be harder to do than to say.
266
267Of course, if you do run out of lock classes, the next thing to do is
268to find the offending lock classes. First, the following command gives
269you the number of lock classes currently in use along with the maximum:
270
271 grep "lock-classes" /proc/lockdep_stats
272
273This command produces the following output on a modest system:
274
275 lock-classes: 748 [max: 8191]
276
277If the number allocated (748 above) increases continually over time,
278then there is likely a leak. The following command can be used to
279identify the leaking lock classes:
280
281 grep "BD" /proc/lockdep
282
283Run the command and save the output, then compare against the output from
284a later run of this command to identify the leakers. This same output
285can also help you find situations where runtime lock initialization has
286been omitted.
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
index bbce1215434a..9ad9ddeb384c 100644
--- a/Documentation/networking/00-INDEX
+++ b/Documentation/networking/00-INDEX
@@ -144,6 +144,8 @@ nfc.txt
144 - The Linux Near Field Communication (NFS) subsystem. 144 - The Linux Near Field Communication (NFS) subsystem.
145olympic.txt 145olympic.txt
146 - IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info. 146 - IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info.
147openvswitch.txt
148 - Open vSwitch developer documentation.
147operstates.txt 149operstates.txt
148 - Overview of network interface operational states. 150 - Overview of network interface operational states.
149packet_mmap.txt 151packet_mmap.txt
diff --git a/Documentation/networking/batman-adv.txt b/Documentation/networking/batman-adv.txt
index c86d03f18a5b..221ad0cdf11f 100644
--- a/Documentation/networking/batman-adv.txt
+++ b/Documentation/networking/batman-adv.txt
@@ -200,15 +200,16 @@ abled during run time. Following log_levels are defined:
200 200
2010 - All debug output disabled 2010 - All debug output disabled
2021 - Enable messages related to routing / flooding / broadcasting 2021 - Enable messages related to routing / flooding / broadcasting
2032 - Enable route or tt entry added / changed / deleted 2032 - Enable messages related to route added / changed / deleted
2043 - Enable all messages 2044 - Enable messages related to translation table operations
2057 - Enable all messages
205 206
206The debug output can be changed at runtime using the file 207The debug output can be changed at runtime using the file
207/sys/class/net/bat0/mesh/log_level. e.g. 208/sys/class/net/bat0/mesh/log_level. e.g.
208 209
209# echo 2 > /sys/class/net/bat0/mesh/log_level 210# echo 2 > /sys/class/net/bat0/mesh/log_level
210 211
211will enable debug messages for when routes or TTs change. 212will enable debug messages for when routes change.
212 213
213 214
214BATCTL 215BATCTL
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 91df678fb7f8..080ad26690ae 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -196,6 +196,23 @@ or, for backwards compatibility, the option value. E.g.,
196 196
197 The parameters are as follows: 197 The parameters are as follows:
198 198
199active_slave
200
201 Specifies the new active slave for modes that support it
202 (active-backup, balance-alb and balance-tlb). Possible values
203 are the name of any currently enslaved interface, or an empty
204 string. If a name is given, the slave and its link must be up in order
205 to be selected as the new active slave. If an empty string is
206 specified, the current active slave is cleared, and a new active
207 slave is selected automatically.
208
209 Note that this is only available through the sysfs interface. No module
210 parameter by this name exists.
211
212 The normal value of this option is the name of the currently
213 active slave, or the empty string if there is no active slave or
214 the current mode does not use an active slave.
215
199ad_select 216ad_select
200 217
201 Specifies the 802.3ad aggregation selection logic to use. The 218 Specifies the 802.3ad aggregation selection logic to use. The
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
index f41ea2405220..1dc1c24a7547 100644
--- a/Documentation/networking/ieee802154.txt
+++ b/Documentation/networking/ieee802154.txt
@@ -78,3 +78,30 @@ in software. This is currently WIP.
78 78
79See header include/net/mac802154.h and several drivers in drivers/ieee802154/. 79See header include/net/mac802154.h and several drivers in drivers/ieee802154/.
80 80
816LoWPAN Linux implementation
82============================
83
84The IEEE 802.15.4 standard specifies an MTU of 128 bytes, yielding about 80
85octets of actual MAC payload once security is turned on, on a wireless link
86with a link throughput of 250 kbps or less. The 6LoWPAN adaptation format
87[RFC4944] was specified to carry IPv6 datagrams over such constrained links,
88taking into account limited bandwidth, memory, or energy resources that are
89expected in applications such as wireless Sensor Networks. [RFC4944] defines
90a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header
91to support the IPv6 minimum MTU requirement [RFC2460], and stateless header
92compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the
93relatively large IPv6 and UDP headers down to (in the best case) several bytes.
94
95In Semptember 2011 the standard update was published - [RFC6282].
96It deprecates HC1 and HC2 compression and defines IPHC encoding format which is
97used in this Linux implementation.
98
99All the code related to 6lowpan you may find in files: net/ieee802154/6lowpan.*
100
101To setup 6lowpan interface you need (busybox release > 1.17.0):
1021. Add IEEE802.15.4 interface and initialize PANid;
1032. Add 6lowpan interface by command like:
104 # ip link add link wpan0 name lowpan0 type lowpan
1053. Set MAC (if needs):
106 # ip link set lowpan0 address de:ad:be:ef:ca:fe:ba:be
1074. Bring up 'lowpan0' interface
diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c
index 65968fbf1e49..ac5debb2f16c 100644
--- a/Documentation/networking/ifenslave.c
+++ b/Documentation/networking/ifenslave.c
@@ -539,12 +539,14 @@ static int if_getconfig(char *ifname)
539 metric = 0; 539 metric = 0;
540 } else 540 } else
541 metric = ifr.ifr_metric; 541 metric = ifr.ifr_metric;
542 printf("The result of SIOCGIFMETRIC is %d\n", metric);
542 543
543 strcpy(ifr.ifr_name, ifname); 544 strcpy(ifr.ifr_name, ifname);
544 if (ioctl(skfd, SIOCGIFMTU, &ifr) < 0) 545 if (ioctl(skfd, SIOCGIFMTU, &ifr) < 0)
545 mtu = 0; 546 mtu = 0;
546 else 547 else
547 mtu = ifr.ifr_mtu; 548 mtu = ifr.ifr_mtu;
549 printf("The result of SIOCGIFMTU is %d\n", mtu);
548 550
549 strcpy(ifr.ifr_name, ifname); 551 strcpy(ifr.ifr_name, ifname);
550 if (ioctl(skfd, SIOCGIFDSTADDR, &ifr) < 0) { 552 if (ioctl(skfd, SIOCGIFDSTADDR, &ifr) < 0) {
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index cb7f3148035d..ad3e80e17b4f 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -20,7 +20,7 @@ ip_no_pmtu_disc - BOOLEAN
20 default FALSE 20 default FALSE
21 21
22min_pmtu - INTEGER 22min_pmtu - INTEGER
23 default 562 - minimum discovered Path MTU 23 default 552 - minimum discovered Path MTU
24 24
25route/max_size - INTEGER 25route/max_size - INTEGER
26 Maximum number of routes allowed in the kernel. Increase 26 Maximum number of routes allowed in the kernel. Increase
@@ -31,6 +31,16 @@ neigh/default/gc_thresh3 - INTEGER
31 when using large numbers of interfaces and when communicating 31 when using large numbers of interfaces and when communicating
32 with large numbers of directly-connected peers. 32 with large numbers of directly-connected peers.
33 33
34neigh/default/unres_qlen_bytes - INTEGER
35 The maximum number of bytes which may be used by packets
36 queued for each unresolved address by other network layers.
37 (added in linux 3.3)
38
39neigh/default/unres_qlen - INTEGER
40 The maximum number of packets which may be queued for each
41 unresolved address by other network layers.
42 (deprecated in linux 3.3) : use unres_qlen_bytes instead.
43
34mtu_expires - INTEGER 44mtu_expires - INTEGER
35 Time, in seconds, that cached PMTU information is kept. 45 Time, in seconds, that cached PMTU information is kept.
36 46
@@ -165,6 +175,9 @@ tcp_congestion_control - STRING
165 connections. The algorithm "reno" is always available, but 175 connections. The algorithm "reno" is always available, but
166 additional choices may be available based on kernel configuration. 176 additional choices may be available based on kernel configuration.
167 Default is set as part of kernel configuration. 177 Default is set as part of kernel configuration.
178 For passive connections, the listener congestion control choice
179 is inherited.
180 [see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ]
168 181
169tcp_cookie_size - INTEGER 182tcp_cookie_size - INTEGER
170 Default size of TCP Cookie Transactions (TCPCT) option, that may be 183 Default size of TCP Cookie Transactions (TCPCT) option, that may be
@@ -282,11 +295,11 @@ tcp_max_ssthresh - INTEGER
282 Default: 0 (off) 295 Default: 0 (off)
283 296
284tcp_max_syn_backlog - INTEGER 297tcp_max_syn_backlog - INTEGER
285 Maximal number of remembered connection requests, which are 298 Maximal number of remembered connection requests, which have not
286 still did not receive an acknowledgment from connecting client. 299 received an acknowledgment from connecting client.
287 Default value is 1024 for systems with more than 128Mb of memory, 300 The minimal value is 128 for low memory machines, and it will
288 and 128 for low memory machines. If server suffers of overload, 301 increase in proportion to the memory of machine.
289 try to increase this number. 302 If server suffers from overload, try increasing this number.
290 303
291tcp_max_tw_buckets - INTEGER 304tcp_max_tw_buckets - INTEGER
292 Maximal number of timewait sockets held by system simultaneously. 305 Maximal number of timewait sockets held by system simultaneously.
diff --git a/Documentation/networking/openvswitch.txt b/Documentation/networking/openvswitch.txt
new file mode 100644
index 000000000000..b8a048b8df3a
--- /dev/null
+++ b/Documentation/networking/openvswitch.txt
@@ -0,0 +1,195 @@
1Open vSwitch datapath developer documentation
2=============================================
3
4The Open vSwitch kernel module allows flexible userspace control over
5flow-level packet processing on selected network devices. It can be
6used to implement a plain Ethernet switch, network device bonding,
7VLAN processing, network access control, flow-based network control,
8and so on.
9
10The kernel module implements multiple "datapaths" (analogous to
11bridges), each of which can have multiple "vports" (analogous to ports
12within a bridge). Each datapath also has associated with it a "flow
13table" that userspace populates with "flows" that map from keys based
14on packet headers and metadata to sets of actions. The most common
15action forwards the packet to another vport; other actions are also
16implemented.
17
18When a packet arrives on a vport, the kernel module processes it by
19extracting its flow key and looking it up in the flow table. If there
20is a matching flow, it executes the associated actions. If there is
21no match, it queues the packet to userspace for processing (as part of
22its processing, userspace will likely set up a flow to handle further
23packets of the same type entirely in-kernel).
24
25
26Flow key compatibility
27----------------------
28
29Network protocols evolve over time. New protocols become important
30and existing protocols lose their prominence. For the Open vSwitch
31kernel module to remain relevant, it must be possible for newer
32versions to parse additional protocols as part of the flow key. It
33might even be desirable, someday, to drop support for parsing
34protocols that have become obsolete. Therefore, the Netlink interface
35to Open vSwitch is designed to allow carefully written userspace
36applications to work with any version of the flow key, past or future.
37
38To support this forward and backward compatibility, whenever the
39kernel module passes a packet to userspace, it also passes along the
40flow key that it parsed from the packet. Userspace then extracts its
41own notion of a flow key from the packet and compares it against the
42kernel-provided version:
43
44 - If userspace's notion of the flow key for the packet matches the
45 kernel's, then nothing special is necessary.
46
47 - If the kernel's flow key includes more fields than the userspace
48 version of the flow key, for example if the kernel decoded IPv6
49 headers but userspace stopped at the Ethernet type (because it
50 does not understand IPv6), then again nothing special is
51 necessary. Userspace can still set up a flow in the usual way,
52 as long as it uses the kernel-provided flow key to do it.
53
54 - If the userspace flow key includes more fields than the
55 kernel's, for example if userspace decoded an IPv6 header but
56 the kernel stopped at the Ethernet type, then userspace can
57 forward the packet manually, without setting up a flow in the
58 kernel. This case is bad for performance because every packet
59 that the kernel considers part of the flow must go to userspace,
60 but the forwarding behavior is correct. (If userspace can
61 determine that the values of the extra fields would not affect
62 forwarding behavior, then it could set up a flow anyway.)
63
64How flow keys evolve over time is important to making this work, so
65the following sections go into detail.
66
67
68Flow key format
69---------------
70
71A flow key is passed over a Netlink socket as a sequence of Netlink
72attributes. Some attributes represent packet metadata, defined as any
73information about a packet that cannot be extracted from the packet
74itself, e.g. the vport on which the packet was received. Most
75attributes, however, are extracted from headers within the packet,
76e.g. source and destination addresses from Ethernet, IP, or TCP
77headers.
78
79The <linux/openvswitch.h> header file defines the exact format of the
80flow key attributes. For informal explanatory purposes here, we write
81them as comma-separated strings, with parentheses indicating arguments
82and nesting. For example, the following could represent a flow key
83corresponding to a TCP packet that arrived on vport 1:
84
85 in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
86 eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=17, tos=0,
87 frag=no), tcp(src=49163, dst=80)
88
89Often we ellipsize arguments not important to the discussion, e.g.:
90
91 in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
92
93
94Basic rule for evolving flow keys
95---------------------------------
96
97Some care is needed to really maintain forward and backward
98compatibility for applications that follow the rules listed under
99"Flow key compatibility" above.
100
101The basic rule is obvious:
102
103 ------------------------------------------------------------------
104 New network protocol support must only supplement existing flow
105 key attributes. It must not change the meaning of already defined
106 flow key attributes.
107 ------------------------------------------------------------------
108
109This rule does have less-obvious consequences so it is worth working
110through a few examples. Suppose, for example, that the kernel module
111did not already implement VLAN parsing. Instead, it just interpreted
112the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
113packet. The flow key for any packet with an 802.1Q header would look
114essentially like this, ignoring metadata:
115
116 eth(...), eth_type(0x8100)
117
118Naively, to add VLAN support, it makes sense to add a new "vlan" flow
119key attribute to contain the VLAN tag, then continue to decode the
120encapsulated headers beyond the VLAN tag using the existing field
121definitions. With this change, an TCP packet in VLAN 10 would have a
122flow key much like this:
123
124 eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
125
126But this change would negatively affect a userspace application that
127has not been updated to understand the new "vlan" flow key attribute.
128The application could, following the flow compatibility rules above,
129ignore the "vlan" attribute that it does not understand and therefore
130assume that the flow contained IP packets. This is a bad assumption
131(the flow only contains IP packets if one parses and skips over the
132802.1Q header) and it could cause the application's behavior to change
133across kernel versions even though it follows the compatibility rules.
134
135The solution is to use a set of nested attributes. This is, for
136example, why 802.1Q support uses nested attributes. A TCP packet in
137VLAN 10 is actually expressed as:
138
139 eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
140 ip(proto=6, ...), tcp(...)))
141
142Notice how the "eth_type", "ip", and "tcp" flow key attributes are
143nested inside the "encap" attribute. Thus, an application that does
144not understand the "vlan" key will not see either of those attributes
145and therefore will not misinterpret them. (Also, the outer eth_type
146is still 0x8100, not changed to 0x0800.)
147
148Handling malformed packets
149--------------------------
150
151Don't drop packets in the kernel for malformed protocol headers, bad
152checksums, etc. This would prevent userspace from implementing a
153simple Ethernet switch that forwards every packet.
154
155Instead, in such a case, include an attribute with "empty" content.
156It doesn't matter if the empty content could be valid protocol values,
157as long as those values are rarely seen in practice, because userspace
158can always forward all packets with those values to userspace and
159handle them individually.
160
161For example, consider a packet that contains an IP header that
162indicates protocol 6 for TCP, but which is truncated just after the IP
163header, so that the TCP header is missing. The flow key for this
164packet would include a tcp attribute with all-zero src and dst, like
165this:
166
167 eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
168
169As another example, consider a packet with an Ethernet type of 0x8100,
170indicating that a VLAN TCI should follow, but which is truncated just
171after the Ethernet type. The flow key for this packet would include
172an all-zero-bits vlan and an empty encap attribute, like this:
173
174 eth(...), eth_type(0x8100), vlan(0), encap()
175
176Unlike a TCP packet with source and destination ports 0, an
177all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
178VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
179attribute expressly to allow this situation to be distinguished.
180Thus, the flow key in this second example unambiguously indicates a
181missing or malformed VLAN TCI.
182
183Other rules
184-----------
185
186The other rules for flow keys are much less subtle:
187
188 - Duplicate attributes are not allowed at a given nesting level.
189
190 - Ordering of attributes is not significant.
191
192 - When the kernel sends a given flow key to userspace, it always
193 composes it the same way. This allows userspace to hash and
194 compare entire flow keys that it may not be able to fully
195 interpret.
diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt
index 4acea6603720..1c08a4b0981f 100644
--- a/Documentation/networking/packet_mmap.txt
+++ b/Documentation/networking/packet_mmap.txt
@@ -155,7 +155,7 @@ As capture, each frame contains two parts:
155 155
156 /* fill sockaddr_ll struct to prepare binding */ 156 /* fill sockaddr_ll struct to prepare binding */
157 my_addr.sll_family = AF_PACKET; 157 my_addr.sll_family = AF_PACKET;
158 my_addr.sll_protocol = ETH_P_ALL; 158 my_addr.sll_protocol = htons(ETH_P_ALL);
159 my_addr.sll_ifindex = s_ifr.ifr_ifindex; 159 my_addr.sll_ifindex = s_ifr.ifr_ifindex;
160 160
161 /* bind socket to eth0 */ 161 /* bind socket to eth0 */
diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
index a177de21d28e..579994afbe06 100644
--- a/Documentation/networking/scaling.txt
+++ b/Documentation/networking/scaling.txt
@@ -208,7 +208,7 @@ The counter in rps_dev_flow_table values records the length of the current
208CPU's backlog when a packet in this flow was last enqueued. Each backlog 208CPU's backlog when a packet in this flow was last enqueued. Each backlog
209queue has a head counter that is incremented on dequeue. A tail counter 209queue has a head counter that is incremented on dequeue. A tail counter
210is computed as head counter + queue length. In other words, the counter 210is computed as head counter + queue length. In other words, the counter
211in rps_dev_flow_table[i] records the last element in flow i that has 211in rps_dev_flow[i] records the last element in flow i that has
212been enqueued onto the currently designated CPU for flow i (of course, 212been enqueued onto the currently designated CPU for flow i (of course,
213entry i is actually selected by hash and multiple flows may hash to the 213entry i is actually selected by hash and multiple flows may hash to the
214same entry i). 214same entry i).
@@ -224,7 +224,7 @@ following is true:
224 224
225- The current CPU's queue head counter >= the recorded tail counter 225- The current CPU's queue head counter >= the recorded tail counter
226 value in rps_dev_flow[i] 226 value in rps_dev_flow[i]
227- The current CPU is unset (equal to NR_CPUS) 227- The current CPU is unset (equal to RPS_NO_CPU)
228- The current CPU is offline 228- The current CPU is offline
229 229
230After this check, the packet is sent to the (possibly updated) current 230After this check, the packet is sent to the (possibly updated) current
@@ -235,7 +235,7 @@ CPU.
235 235
236==== RFS Configuration 236==== RFS Configuration
237 237
238RFS is only available if the kconfig symbol CONFIG_RFS is enabled (on 238RFS is only available if the kconfig symbol CONFIG_RPS is enabled (on
239by default for SMP). The functionality remains disabled until explicitly 239by default for SMP). The functionality remains disabled until explicitly
240configured. The number of entries in the global flow table is set through: 240configured. The number of entries in the global flow table is set through:
241 241
@@ -258,7 +258,7 @@ For a single queue device, the rps_flow_cnt value for the single queue
258would normally be configured to the same value as rps_sock_flow_entries. 258would normally be configured to the same value as rps_sock_flow_entries.
259For a multi-queue device, the rps_flow_cnt for each queue might be 259For a multi-queue device, the rps_flow_cnt for each queue might be
260configured as rps_sock_flow_entries / N, where N is the number of 260configured as rps_sock_flow_entries / N, where N is the number of
261queues. So for instance, if rps_flow_entries is set to 32768 and there 261queues. So for instance, if rps_sock_flow_entries is set to 32768 and there
262are 16 configured receive queues, rps_flow_cnt for each queue might be 262are 16 configured receive queues, rps_flow_cnt for each queue might be
263configured as 2048. 263configured as 2048.
264 264
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index 8d67980fabe8..d0aeeadd264b 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -4,14 +4,16 @@ Copyright (C) 2007-2010 STMicroelectronics Ltd
4Author: Giuseppe Cavallaro <peppe.cavallaro@st.com> 4Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
5 5
6This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers 6This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers
7(Synopsys IP blocks); it has been fully tested on STLinux platforms. 7(Synopsys IP blocks).
8 8
9Currently this network device driver is for all STM embedded MAC/GMAC 9Currently this network device driver is for all STM embedded MAC/GMAC
10(i.e. 7xxx/5xxx SoCs) and it's known working on other platforms i.e. ARM SPEAr. 10(i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XLINX XC2V3000
11FF1152AMT0221 D1215994A VIRTEX FPGA board.
11 12
12DWC Ether MAC 10/100/1000 Universal version 3.41a and DWC Ether MAC 10/100 13DWC Ether MAC 10/100/1000 Universal version 3.60a (and older) and DWC Ether MAC 10/100
13Universal version 4.0 have been used for developing the first code 14Universal version 4.0 have been used for developing this driver.
14implementation. 15
16This driver supports both the platform bus and PCI.
15 17
16Please, for more information also visit: www.stlinux.com 18Please, for more information also visit: www.stlinux.com
17 19
@@ -277,5 +279,5 @@ In fact, these can generate an huge amount of debug messages.
277 279
2786) TODO: 2806) TODO:
279 o XGMAC is not supported. 281 o XGMAC is not supported.
280 o Review the timer optimisation code to use an embedded device that will be 282 o Add the EEE - Energy Efficient Ethernet
281 available in new chip generations. 283 o Add the PTP - precision time protocol
diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
new file mode 100644
index 000000000000..5a013686b9ea
--- /dev/null
+++ b/Documentation/networking/team.txt
@@ -0,0 +1,2 @@
1Team devices are driven from userspace via libteam library which is here:
2 https://github.com/jpirko/libteam
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt
index 646a89e0c07d..20af7def23c8 100644
--- a/Documentation/power/devices.txt
+++ b/Documentation/power/devices.txt
@@ -123,9 +123,12 @@ please refer directly to the source code for more information about it.
123Subsystem-Level Methods 123Subsystem-Level Methods
124----------------------- 124-----------------------
125The core methods to suspend and resume devices reside in struct dev_pm_ops 125The core methods to suspend and resume devices reside in struct dev_pm_ops
126pointed to by the pm member of struct bus_type, struct device_type and 126pointed to by the ops member of struct dev_pm_domain, or by the pm member of
127struct class. They are mostly of interest to the people writing infrastructure 127struct bus_type, struct device_type and struct class. They are mostly of
128for buses, like PCI or USB, or device type and device class drivers. 128interest to the people writing infrastructure for platforms and buses, like PCI
129or USB, or device type and device class drivers. They also are relevant to the
130writers of device drivers whose subsystems (PM domains, device types, device
131classes and bus types) don't provide all power management methods.
129 132
130Bus drivers implement these methods as appropriate for the hardware and the 133Bus drivers implement these methods as appropriate for the hardware and the
131drivers using it; PCI works differently from USB, and so on. Not many people 134drivers using it; PCI works differently from USB, and so on. Not many people
@@ -139,41 +142,57 @@ sequencing in the driver model tree.
139 142
140/sys/devices/.../power/wakeup files 143/sys/devices/.../power/wakeup files
141----------------------------------- 144-----------------------------------
142All devices in the driver model have two flags to control handling of wakeup 145All device objects in the driver model contain fields that control the handling
143events (hardware signals that can force the device and/or system out of a low 146of system wakeup events (hardware signals that can force the system out of a
144power state). These flags are initialized by bus or device driver code using 147sleep state). These fields are initialized by bus or device driver code using
145device_set_wakeup_capable() and device_set_wakeup_enable(), defined in 148device_set_wakeup_capable() and device_set_wakeup_enable(), defined in
146include/linux/pm_wakeup.h. 149include/linux/pm_wakeup.h.
147 150
148The "can_wakeup" flag just records whether the device (and its driver) can 151The "power.can_wakeup" flag just records whether the device (and its driver) can
149physically support wakeup events. The device_set_wakeup_capable() routine 152physically support wakeup events. The device_set_wakeup_capable() routine
150affects this flag. The "should_wakeup" flag controls whether the device should 153affects this flag. The "power.wakeup" field is a pointer to an object of type
151try to use its wakeup mechanism. device_set_wakeup_enable() affects this flag; 154struct wakeup_source used for controlling whether or not the device should use
152for the most part drivers should not change its value. The initial value of 155its system wakeup mechanism and for notifying the PM core of system wakeup
153should_wakeup is supposed to be false for the majority of devices; the major 156events signaled by the device. This object is only present for wakeup-capable
154exceptions are power buttons, keyboards, and Ethernet adapters whose WoL 157devices (i.e. devices whose "can_wakeup" flags are set) and is created (or
155(wake-on-LAN) feature has been set up with ethtool. It should also default 158removed) by device_set_wakeup_capable().
156to true for devices that don't generate wakeup requests on their own but merely
157forward wakeup requests from one bus to another (like PCI bridges).
158 159
159Whether or not a device is capable of issuing wakeup events is a hardware 160Whether or not a device is capable of issuing wakeup events is a hardware
160matter, and the kernel is responsible for keeping track of it. By contrast, 161matter, and the kernel is responsible for keeping track of it. By contrast,
161whether or not a wakeup-capable device should issue wakeup events is a policy 162whether or not a wakeup-capable device should issue wakeup events is a policy
162decision, and it is managed by user space through a sysfs attribute: the 163decision, and it is managed by user space through a sysfs attribute: the
163power/wakeup file. User space can write the strings "enabled" or "disabled" to 164"power/wakeup" file. User space can write the strings "enabled" or "disabled"
164set or clear the "should_wakeup" flag, respectively. This file is only present 165to it to indicate whether or not, respectively, the device is supposed to signal
165for wakeup-capable devices (i.e. devices whose "can_wakeup" flags are set) 166system wakeup. This file is only present if the "power.wakeup" object exists
166and is created (or removed) by device_set_wakeup_capable(). Reads from the 167for the given device and is created (or removed) along with that object, by
167file will return the corresponding string. 168device_set_wakeup_capable(). Reads from the file will return the corresponding
168 169string.
169The device_may_wakeup() routine returns true only if both flags are set. 170
171The "power/wakeup" file is supposed to contain the "disabled" string initially
172for the majority of devices; the major exceptions are power buttons, keyboards,
173and Ethernet adapters whose WoL (wake-on-LAN) feature has been set up with
174ethtool. It should also default to "enabled" for devices that don't generate
175wakeup requests on their own but merely forward wakeup requests from one bus to
176another (like PCI Express ports).
177
178The device_may_wakeup() routine returns true only if the "power.wakeup" object
179exists and the corresponding "power/wakeup" file contains the string "enabled".
170This information is used by subsystems, like the PCI bus type code, to see 180This information is used by subsystems, like the PCI bus type code, to see
171whether or not to enable the devices' wakeup mechanisms. If device wakeup 181whether or not to enable the devices' wakeup mechanisms. If device wakeup
172mechanisms are enabled or disabled directly by drivers, they also should use 182mechanisms are enabled or disabled directly by drivers, they also should use
173device_may_wakeup() to decide what to do during a system sleep transition. 183device_may_wakeup() to decide what to do during a system sleep transition.
174However for runtime power management, wakeup events should be enabled whenever 184Device drivers, however, are not supposed to call device_set_wakeup_enable()
175the device and driver both support them, regardless of the should_wakeup flag. 185directly in any case.
176 186
187It ought to be noted that system wakeup is conceptually different from "remote
188wakeup" used by runtime power management, although it may be supported by the
189same physical mechanism. Remote wakeup is a feature allowing devices in
190low-power states to trigger specific interrupts to signal conditions in which
191they should be put into the full-power state. Those interrupts may or may not
192be used to signal system wakeup events, depending on the hardware design. On
193some systems it is impossible to trigger them from system sleep states. In any
194case, remote wakeup should always be enabled for runtime power management for
195all devices and drivers that support it.
177 196
178/sys/devices/.../power/control files 197/sys/devices/.../power/control files
179------------------------------------ 198------------------------------------
@@ -249,23 +268,37 @@ for every device before the next phase begins. Not all busses or classes
249support all these callbacks and not all drivers use all the callbacks. The 268support all these callbacks and not all drivers use all the callbacks. The
250various phases always run after tasks have been frozen and before they are 269various phases always run after tasks have been frozen and before they are
251unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have 270unfrozen. Furthermore, the *_noirq phases run at a time when IRQ handlers have
252been disabled (except for those marked with the IRQ_WAKEUP flag). 271been disabled (except for those marked with the IRQF_NO_SUSPEND flag).
272
273All phases use PM domain, bus, type, class or driver callbacks (that is, methods
274defined in dev->pm_domain->ops, dev->bus->pm, dev->type->pm, dev->class->pm or
275dev->driver->pm). These callbacks are regarded by the PM core as mutually
276exclusive. Moreover, PM domain callbacks always take precedence over all of the
277other callbacks and, for example, type callbacks take precedence over bus, class
278and driver callbacks. To be precise, the following rules are used to determine
279which callback to execute in the given phase:
280
281 1. If dev->pm_domain is present, the PM core will choose the callback
282 included in dev->pm_domain->ops for execution
283
284 2. Otherwise, if both dev->type and dev->type->pm are present, the callback
285 included in dev->type->pm will be chosen for execution.
286
287 3. Otherwise, if both dev->class and dev->class->pm are present, the
288 callback included in dev->class->pm will be chosen for execution.
289
290 4. Otherwise, if both dev->bus and dev->bus->pm are present, the callback
291 included in dev->bus->pm will be chosen for execution.
292
293This allows PM domains and device types to override callbacks provided by bus
294types or device classes if necessary.
253 295
254All phases use bus, type, or class callbacks (that is, methods defined in 296The PM domain, type, class and bus callbacks may in turn invoke device- or
255dev->bus->pm, dev->type->pm, or dev->class->pm). These callbacks are mutually 297driver-specific methods stored in dev->driver->pm, but they don't have to do
256exclusive, so if the device type provides a struct dev_pm_ops object pointed to 298that.
257by its pm field (i.e. both dev->type and dev->type->pm are defined), the
258callbacks included in that object (i.e. dev->type->pm) will be used. Otherwise,
259if the class provides a struct dev_pm_ops object pointed to by its pm field
260(i.e. both dev->class and dev->class->pm are defined), the PM core will use the
261callbacks from that object (i.e. dev->class->pm). Finally, if the pm fields of
262both the device type and class objects are NULL (or those objects do not exist),
263the callbacks provided by the bus (that is, the callbacks from dev->bus->pm)
264will be used (this allows device types to override callbacks provided by bus
265types or classes if necessary).
266 299
267These callbacks may in turn invoke device- or driver-specific methods stored in 300If the subsystem callback chosen for execution is not present, the PM core will
268dev->driver->pm, but they don't have to. 301execute the corresponding method from dev->driver->pm instead if there is one.
269 302
270 303
271Entering System Suspend 304Entering System Suspend
@@ -283,9 +316,8 @@ When the system goes into the standby or memory sleep state, the phases are:
283 316
284 After the prepare callback method returns, no new children may be 317 After the prepare callback method returns, no new children may be
285 registered below the device. The method may also prepare the device or 318 registered below the device. The method may also prepare the device or
286 driver in some way for the upcoming system power transition (for 319 driver in some way for the upcoming system power transition, but it
287 example, by allocating additional memory required for this purpose), but 320 should not put the device into a low-power state.
288 it should not put the device into a low-power state.
289 321
290 2. The suspend methods should quiesce the device to stop it from performing 322 2. The suspend methods should quiesce the device to stop it from performing
291 I/O. They also may save the device registers and put it into the 323 I/O. They also may save the device registers and put it into the
diff --git a/Documentation/power/freezing-of-tasks.txt b/Documentation/power/freezing-of-tasks.txt
index 316c2ba187f4..6ccb68f68da6 100644
--- a/Documentation/power/freezing-of-tasks.txt
+++ b/Documentation/power/freezing-of-tasks.txt
@@ -21,7 +21,7 @@ freeze_processes() (defined in kernel/power/process.c) is called. It executes
21try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and 21try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and
22either wakes them up, if they are kernel threads, or sends fake signals to them, 22either wakes them up, if they are kernel threads, or sends fake signals to them,
23if they are user space processes. A task that has TIF_FREEZE set, should react 23if they are user space processes. A task that has TIF_FREEZE set, should react
24to it by calling the function called refrigerator() (defined in 24to it by calling the function called __refrigerator() (defined in
25kernel/freezer.c), which sets the task's PF_FROZEN flag, changes its state 25kernel/freezer.c), which sets the task's PF_FROZEN flag, changes its state
26to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is cleared for it. 26to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is cleared for it.
27Then, we say that the task is 'frozen' and therefore the set of functions 27Then, we say that the task is 'frozen' and therefore the set of functions
@@ -29,10 +29,10 @@ handling this mechanism is referred to as 'the freezer' (these functions are
29defined in kernel/power/process.c, kernel/freezer.c & include/linux/freezer.h). 29defined in kernel/power/process.c, kernel/freezer.c & include/linux/freezer.h).
30User space processes are generally frozen before kernel threads. 30User space processes are generally frozen before kernel threads.
31 31
32It is not recommended to call refrigerator() directly. Instead, it is 32__refrigerator() must not be called directly. Instead, use the
33recommended to use the try_to_freeze() function (defined in 33try_to_freeze() function (defined in include/linux/freezer.h), that checks
34include/linux/freezer.h), that checks the task's TIF_FREEZE flag and makes the 34the task's TIF_FREEZE flag and makes the task enter __refrigerator() if the
35task enter refrigerator() if the flag is set. 35flag is set.
36 36
37For user space processes try_to_freeze() is called automatically from the 37For user space processes try_to_freeze() is called automatically from the
38signal-handling code, but the freezable kernel threads need to call it 38signal-handling code, but the freezable kernel threads need to call it
@@ -61,13 +61,13 @@ wait_event_freezable() and wait_event_freezable_timeout() macros.
61After the system memory state has been restored from a hibernation image and 61After the system memory state has been restored from a hibernation image and
62devices have been reinitialized, the function thaw_processes() is called in 62devices have been reinitialized, the function thaw_processes() is called in
63order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that 63order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that
64have been frozen leave refrigerator() and continue running. 64have been frozen leave __refrigerator() and continue running.
65 65
66III. Which kernel threads are freezable? 66III. Which kernel threads are freezable?
67 67
68Kernel threads are not freezable by default. However, a kernel thread may clear 68Kernel threads are not freezable by default. However, a kernel thread may clear
69PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE 69PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE
70directly is strongly discouraged). From this point it is regarded as freezable 70directly is not allowed). From this point it is regarded as freezable
71and must call try_to_freeze() in a suitable place. 71and must call try_to_freeze() in a suitable place.
72 72
73IV. Why do we do that? 73IV. Why do we do that?
@@ -176,3 +176,28 @@ tasks, since it generally exists anyway.
176A driver must have all firmwares it may need in RAM before suspend() is called. 176A driver must have all firmwares it may need in RAM before suspend() is called.
177If keeping them is not practical, for example due to their size, they must be 177If keeping them is not practical, for example due to their size, they must be
178requested early enough using the suspend notifier API described in notifiers.txt. 178requested early enough using the suspend notifier API described in notifiers.txt.
179
180VI. Are there any precautions to be taken to prevent freezing failures?
181
182Yes, there are.
183
184First of all, grabbing the 'pm_mutex' lock to mutually exclude a piece of code
185from system-wide sleep such as suspend/hibernation is not encouraged.
186If possible, that piece of code must instead hook onto the suspend/hibernation
187notifiers to achieve mutual exclusion. Look at the CPU-Hotplug code
188(kernel/cpu.c) for an example.
189
190However, if that is not feasible, and grabbing 'pm_mutex' is deemed necessary,
191it is strongly discouraged to directly call mutex_[un]lock(&pm_mutex) since
192that could lead to freezing failures, because if the suspend/hibernate code
193successfully acquired the 'pm_mutex' lock, and hence that other entity failed
194to acquire the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE
195state. As a consequence, the freezer would not be able to freeze that task,
196leading to freezing failure.
197
198However, the [un]lock_system_sleep() APIs are safe to use in this scenario,
199since they ask the freezer to skip freezing this task, since it is anyway
200"frozen enough" as it is blocked on 'pm_mutex', which will be released
201only after the entire suspend/hibernation sequence is complete.
202So, to summarize, use [un]lock_system_sleep() instead of directly using
203mutex_[un]lock(&pm_mutex). That would prevent freezing failures.
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
index 5336149f831b..4abe83e1045a 100644
--- a/Documentation/power/runtime_pm.txt
+++ b/Documentation/power/runtime_pm.txt
@@ -44,98 +44,112 @@ struct dev_pm_ops {
44}; 44};
45 45
46The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks 46The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks
47are executed by the PM core for either the power domain, or the device type 47are executed by the PM core for the device's subsystem that may be either of
48(if the device power domain's struct dev_pm_ops does not exist), or the class 48the following:
49(if the device power domain's and type's struct dev_pm_ops object does not 49
50exist), or the bus type (if the device power domain's, type's and class' 50 1. PM domain of the device, if the device's PM domain object, dev->pm_domain,
51struct dev_pm_ops objects do not exist) of the given device, so the priority 51 is present.
52order of callbacks from high to low is that power domain callbacks, device 52
53type callbacks, class callbacks and bus type callbacks, and the high priority 53 2. Device type of the device, if both dev->type and dev->type->pm are present.
54one will take precedence over low priority one. The bus type, device type and 54
55class callbacks are referred to as subsystem-level callbacks in what follows, 55 3. Device class of the device, if both dev->class and dev->class->pm are
56and generally speaking, the power domain callbacks are used for representing 56 present.
57power domains within a SoC. 57
58 4. Bus type of the device, if both dev->bus and dev->bus->pm are present.
59
60If the subsystem chosen by applying the above rules doesn't provide the relevant
61callback, the PM core will invoke the corresponding driver callback stored in
62dev->driver->pm directly (if present).
63
64The PM core always checks which callback to use in the order given above, so the
65priority order of callbacks from high to low is: PM domain, device type, class
66and bus type. Moreover, the high-priority one will always take precedence over
67a low-priority one. The PM domain, bus type, device type and class callbacks
68are referred to as subsystem-level callbacks in what follows.
58 69
59By default, the callbacks are always invoked in process context with interrupts 70By default, the callbacks are always invoked in process context with interrupts
60enabled. However, subsystems can use the pm_runtime_irq_safe() helper function 71enabled. However, the pm_runtime_irq_safe() helper function can be used to tell
61to tell the PM core that a device's ->runtime_suspend() and ->runtime_resume() 72the PM core that it is safe to run the ->runtime_suspend(), ->runtime_resume()
62callbacks should be invoked in atomic context with interrupts disabled. 73and ->runtime_idle() callbacks for the given device in atomic context with
63This implies that these callback routines must not block or sleep, but it also 74interrupts disabled. This implies that the callback routines in question must
64means that the synchronous helper functions listed at the end of Section 4 can 75not block or sleep, but it also means that the synchronous helper functions
65be used within an interrupt handler or in an atomic context. 76listed at the end of Section 4 may be used for that device within an interrupt
66 77handler or generally in an atomic context.
67The subsystem-level suspend callback is _entirely_ _responsible_ for handling 78
68the suspend of the device as appropriate, which may, but need not include 79The subsystem-level suspend callback, if present, is _entirely_ _responsible_
69executing the device driver's own ->runtime_suspend() callback (from the 80for handling the suspend of the device as appropriate, which may, but need not
81include executing the device driver's own ->runtime_suspend() callback (from the
70PM core's point of view it is not necessary to implement a ->runtime_suspend() 82PM core's point of view it is not necessary to implement a ->runtime_suspend()
71callback in a device driver as long as the subsystem-level suspend callback 83callback in a device driver as long as the subsystem-level suspend callback
72knows what to do to handle the device). 84knows what to do to handle the device).
73 85
74 * Once the subsystem-level suspend callback has completed successfully 86 * Once the subsystem-level suspend callback (or the driver suspend callback,
75 for given device, the PM core regards the device as suspended, which need 87 if invoked directly) has completed successfully for the given device, the PM
76 not mean that the device has been put into a low power state. It is 88 core regards the device as suspended, which need not mean that it has been
77 supposed to mean, however, that the device will not process data and will 89 put into a low power state. It is supposed to mean, however, that the
78 not communicate with the CPU(s) and RAM until the subsystem-level resume 90 device will not process data and will not communicate with the CPU(s) and
79 callback is executed for it. The runtime PM status of a device after 91 RAM until the appropriate resume callback is executed for it. The runtime
80 successful execution of the subsystem-level suspend callback is 'suspended'. 92 PM status of a device after successful execution of the suspend callback is
81 93 'suspended'.
82 * If the subsystem-level suspend callback returns -EBUSY or -EAGAIN, 94
83 the device's runtime PM status is 'active', which means that the device 95 * If the suspend callback returns -EBUSY or -EAGAIN, the device's runtime PM
84 _must_ be fully operational afterwards. 96 status remains 'active', which means that the device _must_ be fully
85 97 operational afterwards.
86 * If the subsystem-level suspend callback returns an error code different 98
87 from -EBUSY or -EAGAIN, the PM core regards this as a fatal error and will 99 * If the suspend callback returns an error code different from -EBUSY and
88 refuse to run the helper functions described in Section 4 for the device, 100 -EAGAIN, the PM core regards this as a fatal error and will refuse to run
89 until the status of it is directly set either to 'active', or to 'suspended' 101 the helper functions described in Section 4 for the device until its status
90 (the PM core provides special helper functions for this purpose). 102 is directly set to either'active', or 'suspended' (the PM core provides
91 103 special helper functions for this purpose).
92In particular, if the driver requires remote wake-up capability (i.e. hardware 104
105In particular, if the driver requires remote wakeup capability (i.e. hardware
93mechanism allowing the device to request a change of its power state, such as 106mechanism allowing the device to request a change of its power state, such as
94PCI PME) for proper functioning and device_run_wake() returns 'false' for the 107PCI PME) for proper functioning and device_run_wake() returns 'false' for the
95device, then ->runtime_suspend() should return -EBUSY. On the other hand, if 108device, then ->runtime_suspend() should return -EBUSY. On the other hand, if
96device_run_wake() returns 'true' for the device and the device is put into a low 109device_run_wake() returns 'true' for the device and the device is put into a
97power state during the execution of the subsystem-level suspend callback, it is 110low-power state during the execution of the suspend callback, it is expected
98expected that remote wake-up will be enabled for the device. Generally, remote 111that remote wakeup will be enabled for the device. Generally, remote wakeup
99wake-up should be enabled for all input devices put into a low power state at 112should be enabled for all input devices put into low-power states at run time.
100run time. 113
101 114The subsystem-level resume callback, if present, is _entirely_ _responsible_ for
102The subsystem-level resume callback is _entirely_ _responsible_ for handling the 115handling the resume of the device as appropriate, which may, but need not
103resume of the device as appropriate, which may, but need not include executing 116include executing the device driver's own ->runtime_resume() callback (from the
104the device driver's own ->runtime_resume() callback (from the PM core's point of 117PM core's point of view it is not necessary to implement a ->runtime_resume()
105view it is not necessary to implement a ->runtime_resume() callback in a device 118callback in a device driver as long as the subsystem-level resume callback knows
106driver as long as the subsystem-level resume callback knows what to do to handle 119what to do to handle the device).
107the device). 120
108 121 * Once the subsystem-level resume callback (or the driver resume callback, if
109 * Once the subsystem-level resume callback has completed successfully, the PM 122 invoked directly) has completed successfully, the PM core regards the device
110 core regards the device as fully operational, which means that the device 123 as fully operational, which means that the device _must_ be able to complete
111 _must_ be able to complete I/O operations as needed. The runtime PM status 124 I/O operations as needed. The runtime PM status of the device is then
112 of the device is then 'active'. 125 'active'.
113 126
114 * If the subsystem-level resume callback returns an error code, the PM core 127 * If the resume callback returns an error code, the PM core regards this as a
115 regards this as a fatal error and will refuse to run the helper functions 128 fatal error and will refuse to run the helper functions described in Section
116 described in Section 4 for the device, until its status is directly set 129 4 for the device, until its status is directly set to either 'active', or
117 either to 'active' or to 'suspended' (the PM core provides special helper 130 'suspended' (by means of special helper functions provided by the PM core
118 functions for this purpose). 131 for this purpose).
119 132
120The subsystem-level idle callback is executed by the PM core whenever the device 133The idle callback (a subsystem-level one, if present, or the driver one) is
121appears to be idle, which is indicated to the PM core by two counters, the 134executed by the PM core whenever the device appears to be idle, which is
122device's usage counter and the counter of 'active' children of the device. 135indicated to the PM core by two counters, the device's usage counter and the
136counter of 'active' children of the device.
123 137
124 * If any of these counters is decreased using a helper function provided by 138 * If any of these counters is decreased using a helper function provided by
125 the PM core and it turns out to be equal to zero, the other counter is 139 the PM core and it turns out to be equal to zero, the other counter is
126 checked. If that counter also is equal to zero, the PM core executes the 140 checked. If that counter also is equal to zero, the PM core executes the
127 subsystem-level idle callback with the device as an argument. 141 idle callback with the device as its argument.
128 142
129The action performed by a subsystem-level idle callback is totally dependent on 143The action performed by the idle callback is totally dependent on the subsystem
130the subsystem in question, but the expected and recommended action is to check 144(or driver) in question, but the expected and recommended action is to check
131if the device can be suspended (i.e. if all of the conditions necessary for 145if the device can be suspended (i.e. if all of the conditions necessary for
132suspending the device are satisfied) and to queue up a suspend request for the 146suspending the device are satisfied) and to queue up a suspend request for the
133device in that case. The value returned by this callback is ignored by the PM 147device in that case. The value returned by this callback is ignored by the PM
134core. 148core.
135 149
136The helper functions provided by the PM core, described in Section 4, guarantee 150The helper functions provided by the PM core, described in Section 4, guarantee
137that the following constraints are met with respect to the bus type's runtime 151that the following constraints are met with respect to runtime PM callbacks for
138PM callbacks: 152one device:
139 153
140(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute 154(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
141 ->runtime_suspend() in parallel with ->runtime_resume() or with another 155 ->runtime_suspend() in parallel with ->runtime_resume() or with another
diff --git a/Documentation/serial/serial-rs485.txt b/Documentation/serial/serial-rs485.txt
index 079cb3df62cf..41c8378c0b2f 100644
--- a/Documentation/serial/serial-rs485.txt
+++ b/Documentation/serial/serial-rs485.txt
@@ -97,15 +97,23 @@
97 97
98 struct serial_rs485 rs485conf; 98 struct serial_rs485 rs485conf;
99 99
100 /* Set RS485 mode: */ 100 /* Enable RS485 mode: */
101 rs485conf.flags |= SER_RS485_ENABLED; 101 rs485conf.flags |= SER_RS485_ENABLED;
102 102
103 /* Set logical level for RTS pin equal to 1 when sending: */
104 rs485conf.flags |= SER_RS485_RTS_ON_SEND;
105 /* or, set logical level for RTS pin equal to 0 when sending: */
106 rs485conf.flags &= ~(SER_RS485_RTS_ON_SEND);
107
108 /* Set logical level for RTS pin equal to 1 after sending: */
109 rs485conf.flags |= SER_RS485_RTS_AFTER_SEND;
110 /* or, set logical level for RTS pin equal to 0 after sending: */
111 rs485conf.flags &= ~(SER_RS485_RTS_AFTER_SEND);
112
103 /* Set rts delay before send, if needed: */ 113 /* Set rts delay before send, if needed: */
104 rs485conf.flags |= SER_RS485_RTS_BEFORE_SEND;
105 rs485conf.delay_rts_before_send = ...; 114 rs485conf.delay_rts_before_send = ...;
106 115
107 /* Set rts delay after send, if needed: */ 116 /* Set rts delay after send, if needed: */
108 rs485conf.flags |= SER_RS485_RTS_AFTER_SEND;
109 rs485conf.delay_rts_after_send = ...; 117 rs485conf.delay_rts_after_send = ...;
110 118
111 /* Set this flag if you want to receive data even whilst sending data */ 119 /* Set this flag if you want to receive data even whilst sending data */
diff --git a/Documentation/sound/alsa/HD-Audio.txt b/Documentation/sound/alsa/HD-Audio.txt
index 03e2771ddeef..91fee3b45fb8 100644
--- a/Documentation/sound/alsa/HD-Audio.txt
+++ b/Documentation/sound/alsa/HD-Audio.txt
@@ -579,7 +579,7 @@ Development Tree
579~~~~~~~~~~~~~~~~ 579~~~~~~~~~~~~~~~~
580The latest development codes for HD-audio are found on sound git tree: 580The latest development codes for HD-audio are found on sound git tree:
581 581
582- git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6.git 582- git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
583 583
584The master branch or for-next branches can be used as the main 584The master branch or for-next branches can be used as the main
585development branches in general while the HD-audio specific patches 585development branches in general while the HD-audio specific patches
@@ -594,7 +594,7 @@ is, installed via the usual spells: configure, make and make
594install(-modules). See INSTALL in the package. The snapshot tarballs 594install(-modules). See INSTALL in the package. The snapshot tarballs
595are found at: 595are found at:
596 596
597- ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/snapshot/ 597- ftp://ftp.suse.com/pub/people/tiwai/snapshot/
598 598
599 599
600Sending a Bug Report 600Sending a Bug Report
@@ -696,7 +696,7 @@ via hda-verb won't change the mixer value.
696 696
697The hda-verb program is found in the ftp directory: 697The hda-verb program is found in the ftp directory:
698 698
699- ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/misc/ 699- ftp://ftp.suse.com/pub/people/tiwai/misc/
700 700
701Also a git repository is available: 701Also a git repository is available:
702 702
@@ -764,7 +764,7 @@ operation, the jack plugging simulation, etc.
764 764
765The package is found in: 765The package is found in:
766 766
767- ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/misc/ 767- ftp://ftp.suse.com/pub/people/tiwai/misc/
768 768
769A git repository is available: 769A git repository is available:
770 770
diff --git a/Documentation/sound/alsa/soc/machine.txt b/Documentation/sound/alsa/soc/machine.txt
index 3e2ec9cbf397..d50c14df3411 100644
--- a/Documentation/sound/alsa/soc/machine.txt
+++ b/Documentation/sound/alsa/soc/machine.txt
@@ -50,8 +50,7 @@ Machine DAI Configuration
50The machine DAI configuration glues all the codec and CPU DAIs together. It can 50The machine DAI configuration glues all the codec and CPU DAIs together. It can
51also be used to set up the DAI system clock and for any machine related DAI 51also be used to set up the DAI system clock and for any machine related DAI
52initialisation e.g. the machine audio map can be connected to the codec audio 52initialisation e.g. the machine audio map can be connected to the codec audio
53map, unconnected codec pins can be set as such. Please see corgi.c, spitz.c 53map, unconnected codec pins can be set as such.
54for examples.
55 54
56struct snd_soc_dai_link is used to set up each DAI in your machine. e.g. 55struct snd_soc_dai_link is used to set up each DAI in your machine. e.g.
57 56
@@ -83,8 +82,7 @@ Machine Power Map
83The machine driver can optionally extend the codec power map and to become an 82The machine driver can optionally extend the codec power map and to become an
84audio power map of the audio subsystem. This allows for automatic power up/down 83audio power map of the audio subsystem. This allows for automatic power up/down
85of speaker/HP amplifiers, etc. Codec pins can be connected to the machines jack 84of speaker/HP amplifiers, etc. Codec pins can be connected to the machines jack
86sockets in the machine init function. See soc/pxa/spitz.c and dapm.txt for 85sockets in the machine init function.
87details.
88 86
89 87
90Machine Controls 88Machine Controls
diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt
index b510564aac7e..bb24c2a0e870 100644
--- a/Documentation/trace/events.txt
+++ b/Documentation/trace/events.txt
@@ -191,8 +191,6 @@ And for string fields they are:
191 191
192Currently, only exact string matches are supported. 192Currently, only exact string matches are supported.
193 193
194Currently, the maximum number of predicates in a filter is 16.
195
1965.2 Setting filters 1945.2 Setting filters
197------------------- 195-------------------
198 196
diff --git a/Documentation/usb/linux-cdc-acm.inf b/Documentation/usb/linux-cdc-acm.inf
index 37a02ce54841..f0ffc27d4c0a 100644
--- a/Documentation/usb/linux-cdc-acm.inf
+++ b/Documentation/usb/linux-cdc-acm.inf
@@ -90,10 +90,10 @@ ServiceBinary=%12%\USBSER.sys
90[SourceDisksFiles] 90[SourceDisksFiles]
91[SourceDisksNames] 91[SourceDisksNames]
92[DeviceList] 92[DeviceList]
93%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02 93%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02, USB\VID_1D6B&PID_0106&MI_00
94 94
95[DeviceList.NTamd64] 95[DeviceList.NTamd64]
96%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02 96%DESCRIPTION%=DriverInstall, USB\VID_0525&PID_A4A7, USB\VID_1D6B&PID_0104&MI_02, USB\VID_1D6B&PID_0106&MI_00
97 97
98 98
99;------------------------------------------------------------------------------ 99;------------------------------------------------------------------------------
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 7945b0bd35e2..e2a4b5287361 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1100,6 +1100,15 @@ emulate them efficiently. The fields in each entry are defined as follows:
1100 eax, ebx, ecx, edx: the values returned by the cpuid instruction for 1100 eax, ebx, ecx, edx: the values returned by the cpuid instruction for
1101 this function/index combination 1101 this function/index combination
1102 1102
1103The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
1104as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
1105support. Instead it is reported via
1106
1107 ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
1108
1109if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
1110feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
1111
11034.47 KVM_PPC_GET_PVINFO 11124.47 KVM_PPC_GET_PVINFO
1104 1113
1105Capability: KVM_CAP_PPC_GET_PVINFO 1114Capability: KVM_CAP_PPC_GET_PVINFO
@@ -1151,6 +1160,13 @@ following flags are specified:
1151/* Depends on KVM_CAP_IOMMU */ 1160/* Depends on KVM_CAP_IOMMU */
1152#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) 1161#define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
1153 1162
1163The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
1164isolation of the device. Usages not specifying this flag are deprecated.
1165
1166Only PCI header type 0 devices with PCI BAR resources are supported by
1167device assignment. The user requesting this ioctl must have read/write
1168access to the PCI sysfs resource files associated with the device.
1169
11544.49 KVM_DEASSIGN_PCI_DEVICE 11704.49 KVM_DEASSIGN_PCI_DEVICE
1155 1171
1156Capability: KVM_CAP_DEVICE_DEASSIGNMENT 1172Capability: KVM_CAP_DEVICE_DEASSIGNMENT