diff options
Diffstat (limited to 'Documentation')
56 files changed, 4276 insertions, 246 deletions
diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index d05737aaa84b..06b982affe76 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX | |||
@@ -82,6 +82,8 @@ block/ | |||
82 | - info on the Block I/O (BIO) layer. | 82 | - info on the Block I/O (BIO) layer. |
83 | blockdev/ | 83 | blockdev/ |
84 | - info on block devices & drivers | 84 | - info on block devices & drivers |
85 | btmrvl.txt | ||
86 | - info on Marvell Bluetooth driver usage. | ||
85 | cachetlb.txt | 87 | cachetlb.txt |
86 | - describes the cache/TLB flushing interfaces Linux uses. | 88 | - describes the cache/TLB flushing interfaces Linux uses. |
87 | cdrom/ | 89 | cdrom/ |
diff --git a/Documentation/RCU/RTFP.txt b/Documentation/RCU/RTFP.txt index 9f711d2df91b..d2b85237c76e 100644 --- a/Documentation/RCU/RTFP.txt +++ b/Documentation/RCU/RTFP.txt | |||
@@ -743,3 +743,80 @@ Revised: | |||
743 | RCU, realtime RCU, sleepable RCU, performance. | 743 | RCU, realtime RCU, sleepable RCU, performance. |
744 | " | 744 | " |
745 | } | 745 | } |
746 | |||
747 | @article{PaulEMcKenney2008RCUOSR | ||
748 | ,author="Paul E. McKenney and Jonathan Walpole" | ||
749 | ,title="Introducing technology into the {Linux} kernel: a case study" | ||
750 | ,Year="2008" | ||
751 | ,journal="SIGOPS Oper. Syst. Rev." | ||
752 | ,volume="42" | ||
753 | ,number="5" | ||
754 | ,pages="4--17" | ||
755 | ,issn="0163-5980" | ||
756 | ,doi={http://doi.acm.org/10.1145/1400097.1400099} | ||
757 | ,publisher="ACM" | ||
758 | ,address="New York, NY, USA" | ||
759 | ,annotation={ | ||
760 | Linux changed RCU to a far greater degree than RCU has changed Linux. | ||
761 | } | ||
762 | } | ||
763 | |||
764 | @unpublished{PaulEMcKenney2008HierarchicalRCU | ||
765 | ,Author="Paul E. McKenney" | ||
766 | ,Title="Hierarchical {RCU}" | ||
767 | ,month="November" | ||
768 | ,day="3" | ||
769 | ,year="2008" | ||
770 | ,note="Available: | ||
771 | \url{http://lwn.net/Articles/305782/} | ||
772 | [Viewed November 6, 2008]" | ||
773 | ,annotation=" | ||
774 | RCU with combining-tree-based grace-period detection, | ||
775 | permitting it to handle thousands of CPUs. | ||
776 | " | ||
777 | } | ||
778 | |||
779 | @conference{PaulEMcKenney2009MaliciousURCU | ||
780 | ,Author="Paul E. McKenney" | ||
781 | ,Title="Using a Malicious User-Level {RCU} to Torture {RCU}-Based Algorithms" | ||
782 | ,Booktitle="linux.conf.au 2009" | ||
783 | ,month="January" | ||
784 | ,year="2009" | ||
785 | ,address="Hobart, Australia" | ||
786 | ,note="Available: | ||
787 | \url{http://www.rdrop.com/users/paulmck/RCU/urcutorture.2009.01.22a.pdf} | ||
788 | [Viewed February 2, 2009]" | ||
789 | ,annotation=" | ||
790 | Realtime RCU and torture-testing RCU uses. | ||
791 | " | ||
792 | } | ||
793 | |||
794 | @unpublished{MathieuDesnoyers2009URCU | ||
795 | ,Author="Mathieu Desnoyers" | ||
796 | ,Title="[{RFC} git tree] Userspace {RCU} (urcu) for {Linux}" | ||
797 | ,month="February" | ||
798 | ,day="5" | ||
799 | ,year="2009" | ||
800 | ,note="Available: | ||
801 | \url{http://lkml.org/lkml/2009/2/5/572} | ||
802 | \url{git://lttng.org/userspace-rcu.git} | ||
803 | [Viewed February 20, 2009]" | ||
804 | ,annotation=" | ||
805 | Mathieu Desnoyers's user-space RCU implementation. | ||
806 | git://lttng.org/userspace-rcu.git | ||
807 | " | ||
808 | } | ||
809 | |||
810 | @unpublished{PaulEMcKenney2009BloatWatchRCU | ||
811 | ,Author="Paul E. McKenney" | ||
812 | ,Title="{RCU}: The {Bloatwatch} Edition" | ||
813 | ,month="March" | ||
814 | ,day="17" | ||
815 | ,year="2009" | ||
816 | ,note="Available: | ||
817 | \url{http://lwn.net/Articles/323929/} | ||
818 | [Viewed March 20, 2009]" | ||
819 | ,annotation=" | ||
820 | Uniprocessor assumptions allow simplified RCU implementation. | ||
821 | " | ||
822 | } | ||
diff --git a/Documentation/RCU/UP.txt b/Documentation/RCU/UP.txt index aab4a9ec3931..90ec5341ee98 100644 --- a/Documentation/RCU/UP.txt +++ b/Documentation/RCU/UP.txt | |||
@@ -2,14 +2,13 @@ RCU on Uniprocessor Systems | |||
2 | 2 | ||
3 | 3 | ||
4 | A common misconception is that, on UP systems, the call_rcu() primitive | 4 | A common misconception is that, on UP systems, the call_rcu() primitive |
5 | may immediately invoke its function, and that the synchronize_rcu() | 5 | may immediately invoke its function. The basis of this misconception |
6 | primitive may return immediately. The basis of this misconception | ||
7 | is that since there is only one CPU, it should not be necessary to | 6 | is that since there is only one CPU, it should not be necessary to |
8 | wait for anything else to get done, since there are no other CPUs for | 7 | wait for anything else to get done, since there are no other CPUs for |
9 | anything else to be happening on. Although this approach will -sort- -of- | 8 | anything else to be happening on. Although this approach will -sort- -of- |
10 | work a surprising amount of the time, it is a very bad idea in general. | 9 | work a surprising amount of the time, it is a very bad idea in general. |
11 | This document presents three examples that demonstrate exactly how bad an | 10 | This document presents three examples that demonstrate exactly how bad |
12 | idea this is. | 11 | an idea this is. |
13 | 12 | ||
14 | 13 | ||
15 | Example 1: softirq Suicide | 14 | Example 1: softirq Suicide |
@@ -82,11 +81,18 @@ Quick Quiz #2: What locking restriction must RCU callbacks respect? | |||
82 | 81 | ||
83 | Summary | 82 | Summary |
84 | 83 | ||
85 | Permitting call_rcu() to immediately invoke its arguments or permitting | 84 | Permitting call_rcu() to immediately invoke its arguments breaks RCU, |
86 | synchronize_rcu() to immediately return breaks RCU, even on a UP system. | 85 | even on a UP system. So do not do it! Even on a UP system, the RCU |
87 | So do not do it! Even on a UP system, the RCU infrastructure -must- | 86 | infrastructure -must- respect grace periods, and -must- invoke callbacks |
88 | respect grace periods, and -must- invoke callbacks from a known environment | 87 | from a known environment in which no locks are held. |
89 | in which no locks are held. | 88 | |
89 | It -is- safe for synchronize_sched() and synchronize_rcu_bh() to return | ||
90 | immediately on an UP system. It is also safe for synchronize_rcu() | ||
91 | to return immediately on UP systems, except when running preemptable | ||
92 | RCU. | ||
93 | |||
94 | Quick Quiz #3: Why can't synchronize_rcu() return immediately on | ||
95 | UP systems running preemptable RCU? | ||
90 | 96 | ||
91 | 97 | ||
92 | Answer to Quick Quiz #1: | 98 | Answer to Quick Quiz #1: |
@@ -117,3 +123,13 @@ Answer to Quick Quiz #2: | |||
117 | callbacks acquire locks directly. However, a great many RCU | 123 | callbacks acquire locks directly. However, a great many RCU |
118 | callbacks do acquire locks -indirectly-, for example, via | 124 | callbacks do acquire locks -indirectly-, for example, via |
119 | the kfree() primitive. | 125 | the kfree() primitive. |
126 | |||
127 | Answer to Quick Quiz #3: | ||
128 | Why can't synchronize_rcu() return immediately on UP systems | ||
129 | running preemptable RCU? | ||
130 | |||
131 | Because some other task might have been preempted in the middle | ||
132 | of an RCU read-side critical section. If synchronize_rcu() | ||
133 | simply immediately returned, it would prematurely signal the | ||
134 | end of the grace period, which would come as a nasty shock to | ||
135 | that other thread when it started running again. | ||
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index accfe2f5247d..51525a30e8b4 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt | |||
@@ -11,7 +11,10 @@ over a rather long period of time, but improvements are always welcome! | |||
11 | structure is updated more than about 10% of the time, then | 11 | structure is updated more than about 10% of the time, then |
12 | you should strongly consider some other approach, unless | 12 | you should strongly consider some other approach, unless |
13 | detailed performance measurements show that RCU is nonetheless | 13 | detailed performance measurements show that RCU is nonetheless |
14 | the right tool for the job. | 14 | the right tool for the job. Yes, you might think of RCU |
15 | as simply cutting overhead off of the readers and imposing it | ||
16 | on the writers. That is exactly why normal uses of RCU will | ||
17 | do much more reading than updating. | ||
15 | 18 | ||
16 | Another exception is where performance is not an issue, and RCU | 19 | Another exception is where performance is not an issue, and RCU |
17 | provides a simpler implementation. An example of this situation | 20 | provides a simpler implementation. An example of this situation |
@@ -240,10 +243,11 @@ over a rather long period of time, but improvements are always welcome! | |||
240 | instead need to use synchronize_irq() or synchronize_sched(). | 243 | instead need to use synchronize_irq() or synchronize_sched(). |
241 | 244 | ||
242 | 12. Any lock acquired by an RCU callback must be acquired elsewhere | 245 | 12. Any lock acquired by an RCU callback must be acquired elsewhere |
243 | with irq disabled, e.g., via spin_lock_irqsave(). Failing to | 246 | with softirq disabled, e.g., via spin_lock_irqsave(), |
244 | disable irq on a given acquisition of that lock will result in | 247 | spin_lock_bh(), etc. Failing to disable irq on a given |
245 | deadlock as soon as the RCU callback happens to interrupt that | 248 | acquisition of that lock will result in deadlock as soon as the |
246 | acquisition's critical section. | 249 | RCU callback happens to interrupt that acquisition's critical |
250 | section. | ||
247 | 251 | ||
248 | 13. RCU callbacks can be and are executed in parallel. In many cases, | 252 | 13. RCU callbacks can be and are executed in parallel. In many cases, |
249 | the callback code simply wrappers around kfree(), so that this | 253 | the callback code simply wrappers around kfree(), so that this |
@@ -310,3 +314,9 @@ over a rather long period of time, but improvements are always welcome! | |||
310 | Because these primitives only wait for pre-existing readers, | 314 | Because these primitives only wait for pre-existing readers, |
311 | it is the caller's responsibility to guarantee safety to | 315 | it is the caller's responsibility to guarantee safety to |
312 | any subsequent readers. | 316 | any subsequent readers. |
317 | |||
318 | 16. The various RCU read-side primitives do -not- contain memory | ||
319 | barriers. The CPU (and in some cases, the compiler) is free | ||
320 | to reorder code into and out of RCU read-side critical sections. | ||
321 | It is the responsibility of the RCU update-side primitives to | ||
322 | deal with this. | ||
diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt index 7aa2002ade77..2a23523ce471 100644 --- a/Documentation/RCU/rcu.txt +++ b/Documentation/RCU/rcu.txt | |||
@@ -36,7 +36,7 @@ o How can the updater tell when a grace period has completed | |||
36 | executed in user mode, or executed in the idle loop, we can | 36 | executed in user mode, or executed in the idle loop, we can |
37 | safely free up that item. | 37 | safely free up that item. |
38 | 38 | ||
39 | Preemptible variants of RCU (CONFIG_PREEMPT_RCU) get the | 39 | Preemptible variants of RCU (CONFIG_TREE_PREEMPT_RCU) get the |
40 | same effect, but require that the readers manipulate CPU-local | 40 | same effect, but require that the readers manipulate CPU-local |
41 | counters. These counters allow limited types of blocking | 41 | counters. These counters allow limited types of blocking |
42 | within RCU read-side critical sections. SRCU also uses | 42 | within RCU read-side critical sections. SRCU also uses |
@@ -79,10 +79,10 @@ o I hear that RCU is patented? What is with that? | |||
79 | o I hear that RCU needs work in order to support realtime kernels? | 79 | o I hear that RCU needs work in order to support realtime kernels? |
80 | 80 | ||
81 | This work is largely completed. Realtime-friendly RCU can be | 81 | This work is largely completed. Realtime-friendly RCU can be |
82 | enabled via the CONFIG_PREEMPT_RCU kernel configuration parameter. | 82 | enabled via the CONFIG_TREE_PREEMPT_RCU kernel configuration |
83 | However, work is in progress for enabling priority boosting of | 83 | parameter. However, work is in progress for enabling priority |
84 | preempted RCU read-side critical sections. This is needed if you | 84 | boosting of preempted RCU read-side critical sections. This is |
85 | have CPU-bound realtime threads. | 85 | needed if you have CPU-bound realtime threads. |
86 | 86 | ||
87 | o Where can I find more information on RCU? | 87 | o Where can I find more information on RCU? |
88 | 88 | ||
diff --git a/Documentation/RCU/rcubarrier.txt b/Documentation/RCU/rcubarrier.txt index 909602d409bb..e439a0edee22 100644 --- a/Documentation/RCU/rcubarrier.txt +++ b/Documentation/RCU/rcubarrier.txt | |||
@@ -170,6 +170,13 @@ module invokes call_rcu() from timers, you will need to first cancel all | |||
170 | the timers, and only then invoke rcu_barrier() to wait for any remaining | 170 | the timers, and only then invoke rcu_barrier() to wait for any remaining |
171 | RCU callbacks to complete. | 171 | RCU callbacks to complete. |
172 | 172 | ||
173 | Of course, if you module uses call_rcu_bh(), you will need to invoke | ||
174 | rcu_barrier_bh() before unloading. Similarly, if your module uses | ||
175 | call_rcu_sched(), you will need to invoke rcu_barrier_sched() before | ||
176 | unloading. If your module uses call_rcu(), call_rcu_bh(), -and- | ||
177 | call_rcu_sched(), then you will need to invoke each of rcu_barrier(), | ||
178 | rcu_barrier_bh(), and rcu_barrier_sched(). | ||
179 | |||
173 | 180 | ||
174 | Implementing rcu_barrier() | 181 | Implementing rcu_barrier() |
175 | 182 | ||
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt index a342b6e1cc10..9dba3bb90e60 100644 --- a/Documentation/RCU/torture.txt +++ b/Documentation/RCU/torture.txt | |||
@@ -76,8 +76,10 @@ torture_type The type of RCU to test: "rcu" for the rcu_read_lock() API, | |||
76 | "rcu_sync" for rcu_read_lock() with synchronous reclamation, | 76 | "rcu_sync" for rcu_read_lock() with synchronous reclamation, |
77 | "rcu_bh" for the rcu_read_lock_bh() API, "rcu_bh_sync" for | 77 | "rcu_bh" for the rcu_read_lock_bh() API, "rcu_bh_sync" for |
78 | rcu_read_lock_bh() with synchronous reclamation, "srcu" for | 78 | rcu_read_lock_bh() with synchronous reclamation, "srcu" for |
79 | the "srcu_read_lock()" API, and "sched" for the use of | 79 | the "srcu_read_lock()" API, "sched" for the use of |
80 | preempt_disable() together with synchronize_sched(). | 80 | preempt_disable() together with synchronize_sched(), |
81 | and "sched_expedited" for the use of preempt_disable() | ||
82 | with synchronize_sched_expedited(). | ||
81 | 83 | ||
82 | verbose Enable debug printk()s. Default is disabled. | 84 | verbose Enable debug printk()s. Default is disabled. |
83 | 85 | ||
@@ -162,6 +164,23 @@ of the "old" and "current" counters for the corresponding CPU. The | |||
162 | "idx" value maps the "old" and "current" values to the underlying array, | 164 | "idx" value maps the "old" and "current" values to the underlying array, |
163 | and is useful for debugging. | 165 | and is useful for debugging. |
164 | 166 | ||
167 | Similarly, sched_expedited RCU provides the following: | ||
168 | |||
169 | sched_expedited-torture: rtc: d0000000016c1880 ver: 1090796 tfle: 0 rta: 1090796 rtaf: 0 rtf: 1090787 rtmbe: 0 nt: 27713319 | ||
170 | sched_expedited-torture: Reader Pipe: 12660320201 95875 0 0 0 0 0 0 0 0 0 | ||
171 | sched_expedited-torture: Reader Batch: 12660424885 0 0 0 0 0 0 0 0 0 0 | ||
172 | sched_expedited-torture: Free-Block Circulation: 1090795 1090795 1090794 1090793 1090792 1090791 1090790 1090789 1090788 1090787 0 | ||
173 | state: -1 / 0:0 3:0 4:0 | ||
174 | |||
175 | As before, the first four lines are similar to those for RCU. | ||
176 | The last line shows the task-migration state. The first number is | ||
177 | -1 if synchronize_sched_expedited() is idle, -2 if in the process of | ||
178 | posting wakeups to the migration kthreads, and N when waiting on CPU N. | ||
179 | Each of the colon-separated fields following the "/" is a CPU:state pair. | ||
180 | Valid states are "0" for idle, "1" for waiting for quiescent state, | ||
181 | "2" for passed through quiescent state, and "3" when a race with a | ||
182 | CPU-hotplug event forces use of the synchronize_sched() primitive. | ||
183 | |||
165 | 184 | ||
166 | USAGE | 185 | USAGE |
167 | 186 | ||
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index 02cced183b2d..187bbf10c923 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt | |||
@@ -191,8 +191,7 @@ rcu/rcuhier (which displays the struct rcu_node hierarchy). | |||
191 | 191 | ||
192 | The output of "cat rcu/rcudata" looks as follows: | 192 | The output of "cat rcu/rcudata" looks as follows: |
193 | 193 | ||
194 | rcu: | 194 | rcu_sched: |
195 | rcu: | ||
196 | 0 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=10951/1 dn=0 df=1101 of=0 ri=36 ql=0 b=10 | 195 | 0 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=10951/1 dn=0 df=1101 of=0 ri=36 ql=0 b=10 |
197 | 1 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=16117/1 dn=0 df=1015 of=0 ri=0 ql=0 b=10 | 196 | 1 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=16117/1 dn=0 df=1015 of=0 ri=0 ql=0 b=10 |
198 | 2 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=1445/1 dn=0 df=1839 of=0 ri=0 ql=0 b=10 | 197 | 2 c=17829 g=17829 pq=1 pqc=17829 qp=0 dt=1445/1 dn=0 df=1839 of=0 ri=0 ql=0 b=10 |
@@ -306,7 +305,7 @@ comma-separated-variable spreadsheet format. | |||
306 | 305 | ||
307 | The output of "cat rcu/rcugp" looks as follows: | 306 | The output of "cat rcu/rcugp" looks as follows: |
308 | 307 | ||
309 | rcu: completed=33062 gpnum=33063 | 308 | rcu_sched: completed=33062 gpnum=33063 |
310 | rcu_bh: completed=464 gpnum=464 | 309 | rcu_bh: completed=464 gpnum=464 |
311 | 310 | ||
312 | Again, this output is for both "rcu" and "rcu_bh". The fields are | 311 | Again, this output is for both "rcu" and "rcu_bh". The fields are |
@@ -413,7 +412,7 @@ o Each element of the form "1/1 0:127 ^0" represents one struct | |||
413 | 412 | ||
414 | The output of "cat rcu/rcu_pending" looks as follows: | 413 | The output of "cat rcu/rcu_pending" looks as follows: |
415 | 414 | ||
416 | rcu: | 415 | rcu_sched: |
417 | 0 np=255892 qsp=53936 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 | 416 | 0 np=255892 qsp=53936 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 |
418 | 1 np=261224 qsp=54638 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 | 417 | 1 np=261224 qsp=54638 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 |
419 | 2 np=237496 qsp=49664 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 | 418 | 2 np=237496 qsp=49664 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 |
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 96170824a717..e41a7fecf0d3 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt | |||
@@ -136,10 +136,10 @@ rcu_read_lock() | |||
136 | Used by a reader to inform the reclaimer that the reader is | 136 | Used by a reader to inform the reclaimer that the reader is |
137 | entering an RCU read-side critical section. It is illegal | 137 | entering an RCU read-side critical section. It is illegal |
138 | to block while in an RCU read-side critical section, though | 138 | to block while in an RCU read-side critical section, though |
139 | kernels built with CONFIG_PREEMPT_RCU can preempt RCU read-side | 139 | kernels built with CONFIG_TREE_PREEMPT_RCU can preempt RCU |
140 | critical sections. Any RCU-protected data structure accessed | 140 | read-side critical sections. Any RCU-protected data structure |
141 | during an RCU read-side critical section is guaranteed to remain | 141 | accessed during an RCU read-side critical section is guaranteed to |
142 | unreclaimed for the full duration of that critical section. | 142 | remain unreclaimed for the full duration of that critical section. |
143 | Reference counts may be used in conjunction with RCU to maintain | 143 | Reference counts may be used in conjunction with RCU to maintain |
144 | longer-term references to data structures. | 144 | longer-term references to data structures. |
145 | 145 | ||
@@ -785,6 +785,7 @@ RCU pointer/list traversal: | |||
785 | rcu_dereference | 785 | rcu_dereference |
786 | list_for_each_entry_rcu | 786 | list_for_each_entry_rcu |
787 | hlist_for_each_entry_rcu | 787 | hlist_for_each_entry_rcu |
788 | hlist_nulls_for_each_entry_rcu | ||
788 | 789 | ||
789 | list_for_each_continue_rcu (to be deprecated in favor of new | 790 | list_for_each_continue_rcu (to be deprecated in favor of new |
790 | list_for_each_entry_continue_rcu) | 791 | list_for_each_entry_continue_rcu) |
@@ -807,19 +808,23 @@ RCU: Critical sections Grace period Barrier | |||
807 | 808 | ||
808 | rcu_read_lock synchronize_net rcu_barrier | 809 | rcu_read_lock synchronize_net rcu_barrier |
809 | rcu_read_unlock synchronize_rcu | 810 | rcu_read_unlock synchronize_rcu |
811 | synchronize_rcu_expedited | ||
810 | call_rcu | 812 | call_rcu |
811 | 813 | ||
812 | 814 | ||
813 | bh: Critical sections Grace period Barrier | 815 | bh: Critical sections Grace period Barrier |
814 | 816 | ||
815 | rcu_read_lock_bh call_rcu_bh rcu_barrier_bh | 817 | rcu_read_lock_bh call_rcu_bh rcu_barrier_bh |
816 | rcu_read_unlock_bh | 818 | rcu_read_unlock_bh synchronize_rcu_bh |
819 | synchronize_rcu_bh_expedited | ||
817 | 820 | ||
818 | 821 | ||
819 | sched: Critical sections Grace period Barrier | 822 | sched: Critical sections Grace period Barrier |
820 | 823 | ||
821 | [preempt_disable] synchronize_sched rcu_barrier_sched | 824 | rcu_read_lock_sched synchronize_sched rcu_barrier_sched |
822 | [and friends] call_rcu_sched | 825 | rcu_read_unlock_sched call_rcu_sched |
826 | [preempt_disable] synchronize_sched_expedited | ||
827 | [and friends] | ||
823 | 828 | ||
824 | 829 | ||
825 | SRCU: Critical sections Grace period Barrier | 830 | SRCU: Critical sections Grace period Barrier |
@@ -827,6 +832,9 @@ SRCU: Critical sections Grace period Barrier | |||
827 | srcu_read_lock synchronize_srcu N/A | 832 | srcu_read_lock synchronize_srcu N/A |
828 | srcu_read_unlock | 833 | srcu_read_unlock |
829 | 834 | ||
835 | SRCU: Initialization/cleanup | ||
836 | init_srcu_struct | ||
837 | cleanup_srcu_struct | ||
830 | 838 | ||
831 | See the comment headers in the source code (or the docbook generated | 839 | See the comment headers in the source code (or the docbook generated |
832 | from them) for more information. | 840 | from them) for more information. |
diff --git a/Documentation/arm/SA1100/ADSBitsy b/Documentation/arm/SA1100/ADSBitsy index ab47c3833908..7197a9e958ee 100644 --- a/Documentation/arm/SA1100/ADSBitsy +++ b/Documentation/arm/SA1100/ADSBitsy | |||
@@ -40,4 +40,4 @@ Notes: | |||
40 | mode, the timing is off so the image is corrupted. This will be | 40 | mode, the timing is off so the image is corrupted. This will be |
41 | fixed soon. | 41 | fixed soon. |
42 | 42 | ||
43 | Any contribution can be sent to nico@cam.org and will be greatly welcome! | 43 | Any contribution can be sent to nico@fluxnic.net and will be greatly welcome! |
diff --git a/Documentation/arm/SA1100/Assabet b/Documentation/arm/SA1100/Assabet index 78bc1c1b04e5..91f7ce7ba426 100644 --- a/Documentation/arm/SA1100/Assabet +++ b/Documentation/arm/SA1100/Assabet | |||
@@ -240,7 +240,7 @@ Then, rebooting the Assabet is just a matter of waiting for the login prompt. | |||
240 | 240 | ||
241 | 241 | ||
242 | Nicolas Pitre | 242 | Nicolas Pitre |
243 | nico@cam.org | 243 | nico@fluxnic.net |
244 | June 12, 2001 | 244 | June 12, 2001 |
245 | 245 | ||
246 | 246 | ||
diff --git a/Documentation/arm/SA1100/Brutus b/Documentation/arm/SA1100/Brutus index 2254c8f0b326..b1cfd405dccc 100644 --- a/Documentation/arm/SA1100/Brutus +++ b/Documentation/arm/SA1100/Brutus | |||
@@ -60,7 +60,7 @@ little modifications. | |||
60 | 60 | ||
61 | Any contribution is welcome. | 61 | Any contribution is welcome. |
62 | 62 | ||
63 | Please send patches to nico@cam.org | 63 | Please send patches to nico@fluxnic.net |
64 | 64 | ||
65 | Have Fun ! | 65 | Have Fun ! |
66 | 66 | ||
diff --git a/Documentation/arm/SA1100/GraphicsClient b/Documentation/arm/SA1100/GraphicsClient index 8fa7e8027ff1..6c9c4f5a36e1 100644 --- a/Documentation/arm/SA1100/GraphicsClient +++ b/Documentation/arm/SA1100/GraphicsClient | |||
@@ -4,7 +4,7 @@ For more details, contact Applied Data Systems or see | |||
4 | http://www.applieddata.net/products.html | 4 | http://www.applieddata.net/products.html |
5 | 5 | ||
6 | The original Linux support for this product has been provided by | 6 | The original Linux support for this product has been provided by |
7 | Nicolas Pitre <nico@cam.org>. Continued development work by | 7 | Nicolas Pitre <nico@fluxnic.net>. Continued development work by |
8 | Woojung Huh <whuh@applieddata.net> | 8 | Woojung Huh <whuh@applieddata.net> |
9 | 9 | ||
10 | It's currently possible to mount a root filesystem via NFS providing a | 10 | It's currently possible to mount a root filesystem via NFS providing a |
@@ -94,5 +94,5 @@ Notes: | |||
94 | mode, the timing is off so the image is corrupted. This will be | 94 | mode, the timing is off so the image is corrupted. This will be |
95 | fixed soon. | 95 | fixed soon. |
96 | 96 | ||
97 | Any contribution can be sent to nico@cam.org and will be greatly welcome! | 97 | Any contribution can be sent to nico@fluxnic.net and will be greatly welcome! |
98 | 98 | ||
diff --git a/Documentation/arm/SA1100/GraphicsMaster b/Documentation/arm/SA1100/GraphicsMaster index dd28745ac521..ee7c6595f23f 100644 --- a/Documentation/arm/SA1100/GraphicsMaster +++ b/Documentation/arm/SA1100/GraphicsMaster | |||
@@ -4,7 +4,7 @@ For more details, contact Applied Data Systems or see | |||
4 | http://www.applieddata.net/products.html | 4 | http://www.applieddata.net/products.html |
5 | 5 | ||
6 | The original Linux support for this product has been provided by | 6 | The original Linux support for this product has been provided by |
7 | Nicolas Pitre <nico@cam.org>. Continued development work by | 7 | Nicolas Pitre <nico@fluxnic.net>. Continued development work by |
8 | Woojung Huh <whuh@applieddata.net> | 8 | Woojung Huh <whuh@applieddata.net> |
9 | 9 | ||
10 | Use 'make graphicsmaster_config' before any 'make config'. | 10 | Use 'make graphicsmaster_config' before any 'make config'. |
@@ -50,4 +50,4 @@ Notes: | |||
50 | mode, the timing is off so the image is corrupted. This will be | 50 | mode, the timing is off so the image is corrupted. This will be |
51 | fixed soon. | 51 | fixed soon. |
52 | 52 | ||
53 | Any contribution can be sent to nico@cam.org and will be greatly welcome! | 53 | Any contribution can be sent to nico@fluxnic.net and will be greatly welcome! |
diff --git a/Documentation/arm/SA1100/Victor b/Documentation/arm/SA1100/Victor index 01e81fc49461..f938a29fdc20 100644 --- a/Documentation/arm/SA1100/Victor +++ b/Documentation/arm/SA1100/Victor | |||
@@ -9,7 +9,7 @@ Of course Victor is using Linux as its main operating system. | |||
9 | The Victor implementation for Linux is maintained by Nicolas Pitre: | 9 | The Victor implementation for Linux is maintained by Nicolas Pitre: |
10 | 10 | ||
11 | nico@visuaide.com | 11 | nico@visuaide.com |
12 | nico@cam.org | 12 | nico@fluxnic.net |
13 | 13 | ||
14 | For any comments, please feel free to contact me through the above | 14 | For any comments, please feel free to contact me through the above |
15 | addresses. | 15 | addresses. |
diff --git a/Documentation/arm/Samsung-S3C24XX/CPUfreq.txt b/Documentation/arm/Samsung-S3C24XX/CPUfreq.txt new file mode 100644 index 000000000000..76b3a11e90be --- /dev/null +++ b/Documentation/arm/Samsung-S3C24XX/CPUfreq.txt | |||
@@ -0,0 +1,75 @@ | |||
1 | S3C24XX CPUfreq support | ||
2 | ======================= | ||
3 | |||
4 | Introduction | ||
5 | ------------ | ||
6 | |||
7 | The S3C24XX series support a number of power saving systems, such as | ||
8 | the ability to change the core, memory and peripheral operating | ||
9 | frequencies. The core control is exported via the CPUFreq driver | ||
10 | which has a number of different manual or automatic controls over the | ||
11 | rate the core is running at. | ||
12 | |||
13 | There are two forms of the driver depending on the specific CPU and | ||
14 | how the clocks are arranged. The first implementation used as single | ||
15 | PLL to feed the ARM, memory and peripherals via a series of dividers | ||
16 | and muxes and this is the implementation that is documented here. A | ||
17 | newer version where there is a seperate PLL and clock divider for the | ||
18 | ARM core is available as a seperate driver. | ||
19 | |||
20 | |||
21 | Layout | ||
22 | ------ | ||
23 | |||
24 | The code core manages the CPU specific drivers, any data that they | ||
25 | need to register and the interface to the generic drivers/cpufreq | ||
26 | system. Each CPU registers a driver to control the PLL, clock dividers | ||
27 | and anything else associated with it. Any board that wants to use this | ||
28 | framework needs to supply at least basic details of what is required. | ||
29 | |||
30 | The core registers with drivers/cpufreq at init time if all the data | ||
31 | necessary has been supplied. | ||
32 | |||
33 | |||
34 | CPU support | ||
35 | ----------- | ||
36 | |||
37 | The support for each CPU depends on the facilities provided by the | ||
38 | SoC and the driver as each device has different PLL and clock chains | ||
39 | associated with it. | ||
40 | |||
41 | |||
42 | Slow Mode | ||
43 | --------- | ||
44 | |||
45 | The SLOW mode where the PLL is turned off altogether and the | ||
46 | system is fed by the external crystal input is currently not | ||
47 | supported. | ||
48 | |||
49 | |||
50 | sysfs | ||
51 | ----- | ||
52 | |||
53 | The core code exports extra information via sysfs in the directory | ||
54 | devices/system/cpu/cpu0/arch-freq. | ||
55 | |||
56 | |||
57 | Board Support | ||
58 | ------------- | ||
59 | |||
60 | Each board that wants to use the cpufreq code must register some basic | ||
61 | information with the core driver to provide information about what the | ||
62 | board requires and any restrictions being placed on it. | ||
63 | |||
64 | The board needs to supply information about whether it needs the IO bank | ||
65 | timings changing, any maximum frequency limits and information about the | ||
66 | SDRAM refresh rate. | ||
67 | |||
68 | |||
69 | |||
70 | |||
71 | Document Author | ||
72 | --------------- | ||
73 | |||
74 | Ben Dooks, Copyright 2009 Simtec Electronics | ||
75 | Licensed under GPLv2 | ||
diff --git a/Documentation/btmrvl.txt b/Documentation/btmrvl.txt new file mode 100644 index 000000000000..34916a46c099 --- /dev/null +++ b/Documentation/btmrvl.txt | |||
@@ -0,0 +1,119 @@ | |||
1 | ======================================================================= | ||
2 | README for btmrvl driver | ||
3 | ======================================================================= | ||
4 | |||
5 | |||
6 | All commands are used via debugfs interface. | ||
7 | |||
8 | ===================== | ||
9 | Set/get driver configurations: | ||
10 | |||
11 | Path: /debug/btmrvl/config/ | ||
12 | |||
13 | gpiogap=[n] | ||
14 | hscfgcmd | ||
15 | These commands are used to configure the host sleep parameters. | ||
16 | bit 8:0 -- Gap | ||
17 | bit 16:8 -- GPIO | ||
18 | |||
19 | where GPIO is the pin number of GPIO used to wake up the host. | ||
20 | It could be any valid GPIO pin# (e.g. 0-7) or 0xff (SDIO interface | ||
21 | wakeup will be used instead). | ||
22 | |||
23 | where Gap is the gap in milli seconds between wakeup signal and | ||
24 | wakeup event, or 0xff for special host sleep setting. | ||
25 | |||
26 | Usage: | ||
27 | # Use SDIO interface to wake up the host and set GAP to 0x80: | ||
28 | echo 0xff80 > /debug/btmrvl/config/gpiogap | ||
29 | echo 1 > /debug/btmrvl/config/hscfgcmd | ||
30 | |||
31 | # Use GPIO pin #3 to wake up the host and set GAP to 0xff: | ||
32 | echo 0x03ff > /debug/btmrvl/config/gpiogap | ||
33 | echo 1 > /debug/btmrvl/config/hscfgcmd | ||
34 | |||
35 | psmode=[n] | ||
36 | pscmd | ||
37 | These commands are used to enable/disable auto sleep mode | ||
38 | |||
39 | where the option is: | ||
40 | 1 -- Enable auto sleep mode | ||
41 | 0 -- Disable auto sleep mode | ||
42 | |||
43 | Usage: | ||
44 | # Enable auto sleep mode | ||
45 | echo 1 > /debug/btmrvl/config/psmode | ||
46 | echo 1 > /debug/btmrvl/config/pscmd | ||
47 | |||
48 | # Disable auto sleep mode | ||
49 | echo 0 > /debug/btmrvl/config/psmode | ||
50 | echo 1 > /debug/btmrvl/config/pscmd | ||
51 | |||
52 | |||
53 | hsmode=[n] | ||
54 | hscmd | ||
55 | These commands are used to enable host sleep or wake up firmware | ||
56 | |||
57 | where the option is: | ||
58 | 1 -- Enable host sleep | ||
59 | 0 -- Wake up firmware | ||
60 | |||
61 | Usage: | ||
62 | # Enable host sleep | ||
63 | echo 1 > /debug/btmrvl/config/hsmode | ||
64 | echo 1 > /debug/btmrvl/config/hscmd | ||
65 | |||
66 | # Wake up firmware | ||
67 | echo 0 > /debug/btmrvl/config/hsmode | ||
68 | echo 1 > /debug/btmrvl/config/hscmd | ||
69 | |||
70 | |||
71 | ====================== | ||
72 | Get driver status: | ||
73 | |||
74 | Path: /debug/btmrvl/status/ | ||
75 | |||
76 | Usage: | ||
77 | cat /debug/btmrvl/status/<args> | ||
78 | |||
79 | where the args are: | ||
80 | |||
81 | curpsmode | ||
82 | This command displays current auto sleep status. | ||
83 | |||
84 | psstate | ||
85 | This command display the power save state. | ||
86 | |||
87 | hsstate | ||
88 | This command display the host sleep state. | ||
89 | |||
90 | txdnldrdy | ||
91 | This command displays the value of Tx download ready flag. | ||
92 | |||
93 | |||
94 | ===================== | ||
95 | |||
96 | Use hcitool to issue raw hci command, refer to hcitool manual | ||
97 | |||
98 | Usage: Hcitool cmd <ogf> <ocf> [Parameters] | ||
99 | |||
100 | Interface Control Command | ||
101 | hcitool cmd 0x3f 0x5b 0xf5 0x01 0x00 --Enable All interface | ||
102 | hcitool cmd 0x3f 0x5b 0xf5 0x01 0x01 --Enable Wlan interface | ||
103 | hcitool cmd 0x3f 0x5b 0xf5 0x01 0x02 --Enable BT interface | ||
104 | hcitool cmd 0x3f 0x5b 0xf5 0x00 0x00 --Disable All interface | ||
105 | hcitool cmd 0x3f 0x5b 0xf5 0x00 0x01 --Disable Wlan interface | ||
106 | hcitool cmd 0x3f 0x5b 0xf5 0x00 0x02 --Disable BT interface | ||
107 | |||
108 | ======================================================================= | ||
109 | |||
110 | |||
111 | SD8688 firmware: | ||
112 | |||
113 | /lib/firmware/sd8688_helper.bin | ||
114 | /lib/firmware/sd8688.bin | ||
115 | |||
116 | |||
117 | The images can be downloaded from: | ||
118 | |||
119 | git.infradead.org/users/dwmw2/linux-firmware.git/libertas/ | ||
diff --git a/Documentation/connector/Makefile b/Documentation/connector/Makefile index 8df1a7285a06..d98e4df98e24 100644 --- a/Documentation/connector/Makefile +++ b/Documentation/connector/Makefile | |||
@@ -9,3 +9,8 @@ hostprogs-y := ucon | |||
9 | always := $(hostprogs-y) | 9 | always := $(hostprogs-y) |
10 | 10 | ||
11 | HOSTCFLAGS_ucon.o += -I$(objtree)/usr/include | 11 | HOSTCFLAGS_ucon.o += -I$(objtree)/usr/include |
12 | |||
13 | all: modules | ||
14 | |||
15 | modules clean: | ||
16 | $(MAKE) -C ../.. SUBDIRS=$(PWD) $@ | ||
diff --git a/Documentation/connector/cn_test.c b/Documentation/connector/cn_test.c index 6a5be5d5c8e4..1711adc33373 100644 --- a/Documentation/connector/cn_test.c +++ b/Documentation/connector/cn_test.c | |||
@@ -19,6 +19,8 @@ | |||
19 | * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA | 19 | * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
20 | */ | 20 | */ |
21 | 21 | ||
22 | #define pr_fmt(fmt) "cn_test: " fmt | ||
23 | |||
22 | #include <linux/kernel.h> | 24 | #include <linux/kernel.h> |
23 | #include <linux/module.h> | 25 | #include <linux/module.h> |
24 | #include <linux/moduleparam.h> | 26 | #include <linux/moduleparam.h> |
@@ -27,18 +29,17 @@ | |||
27 | 29 | ||
28 | #include <linux/connector.h> | 30 | #include <linux/connector.h> |
29 | 31 | ||
30 | static struct cb_id cn_test_id = { 0x123, 0x456 }; | 32 | static struct cb_id cn_test_id = { CN_NETLINK_USERS + 3, 0x456 }; |
31 | static char cn_test_name[] = "cn_test"; | 33 | static char cn_test_name[] = "cn_test"; |
32 | static struct sock *nls; | 34 | static struct sock *nls; |
33 | static struct timer_list cn_test_timer; | 35 | static struct timer_list cn_test_timer; |
34 | 36 | ||
35 | void cn_test_callback(void *data) | 37 | static void cn_test_callback(struct cn_msg *msg) |
36 | { | 38 | { |
37 | struct cn_msg *msg = (struct cn_msg *)data; | 39 | pr_info("%s: %lu: idx=%x, val=%x, seq=%u, ack=%u, len=%d: %s.\n", |
38 | 40 | __func__, jiffies, msg->id.idx, msg->id.val, | |
39 | printk("%s: %lu: idx=%x, val=%x, seq=%u, ack=%u, len=%d: %s.\n", | 41 | msg->seq, msg->ack, msg->len, |
40 | __func__, jiffies, msg->id.idx, msg->id.val, | 42 | msg->len ? (char *)msg->data : ""); |
41 | msg->seq, msg->ack, msg->len, (char *)msg->data); | ||
42 | } | 43 | } |
43 | 44 | ||
44 | /* | 45 | /* |
@@ -63,9 +64,7 @@ static int cn_test_want_notify(void) | |||
63 | 64 | ||
64 | skb = alloc_skb(size, GFP_ATOMIC); | 65 | skb = alloc_skb(size, GFP_ATOMIC); |
65 | if (!skb) { | 66 | if (!skb) { |
66 | printk(KERN_ERR "Failed to allocate new skb with size=%u.\n", | 67 | pr_err("failed to allocate new skb with size=%u\n", size); |
67 | size); | ||
68 | |||
69 | return -ENOMEM; | 68 | return -ENOMEM; |
70 | } | 69 | } |
71 | 70 | ||
@@ -114,12 +113,12 @@ static int cn_test_want_notify(void) | |||
114 | //netlink_broadcast(nls, skb, 0, ctl->group, GFP_ATOMIC); | 113 | //netlink_broadcast(nls, skb, 0, ctl->group, GFP_ATOMIC); |
115 | netlink_unicast(nls, skb, 0, 0); | 114 | netlink_unicast(nls, skb, 0, 0); |
116 | 115 | ||
117 | printk(KERN_INFO "Request was sent. Group=0x%x.\n", ctl->group); | 116 | pr_info("request was sent: group=0x%x\n", ctl->group); |
118 | 117 | ||
119 | return 0; | 118 | return 0; |
120 | 119 | ||
121 | nlmsg_failure: | 120 | nlmsg_failure: |
122 | printk(KERN_ERR "Failed to send %u.%u\n", msg->seq, msg->ack); | 121 | pr_err("failed to send %u.%u\n", msg->seq, msg->ack); |
123 | kfree_skb(skb); | 122 | kfree_skb(skb); |
124 | return -EINVAL; | 123 | return -EINVAL; |
125 | } | 124 | } |
@@ -131,6 +130,8 @@ static void cn_test_timer_func(unsigned long __data) | |||
131 | struct cn_msg *m; | 130 | struct cn_msg *m; |
132 | char data[32]; | 131 | char data[32]; |
133 | 132 | ||
133 | pr_debug("%s: timer fired with data %lu\n", __func__, __data); | ||
134 | |||
134 | m = kzalloc(sizeof(*m) + sizeof(data), GFP_ATOMIC); | 135 | m = kzalloc(sizeof(*m) + sizeof(data), GFP_ATOMIC); |
135 | if (m) { | 136 | if (m) { |
136 | 137 | ||
@@ -150,7 +151,7 @@ static void cn_test_timer_func(unsigned long __data) | |||
150 | 151 | ||
151 | cn_test_timer_counter++; | 152 | cn_test_timer_counter++; |
152 | 153 | ||
153 | mod_timer(&cn_test_timer, jiffies + HZ); | 154 | mod_timer(&cn_test_timer, jiffies + msecs_to_jiffies(1000)); |
154 | } | 155 | } |
155 | 156 | ||
156 | static int cn_test_init(void) | 157 | static int cn_test_init(void) |
@@ -168,8 +169,10 @@ static int cn_test_init(void) | |||
168 | } | 169 | } |
169 | 170 | ||
170 | setup_timer(&cn_test_timer, cn_test_timer_func, 0); | 171 | setup_timer(&cn_test_timer, cn_test_timer_func, 0); |
171 | cn_test_timer.expires = jiffies + HZ; | 172 | mod_timer(&cn_test_timer, jiffies + msecs_to_jiffies(1000)); |
172 | add_timer(&cn_test_timer); | 173 | |
174 | pr_info("initialized with id={%u.%u}\n", | ||
175 | cn_test_id.idx, cn_test_id.val); | ||
173 | 176 | ||
174 | return 0; | 177 | return 0; |
175 | 178 | ||
diff --git a/Documentation/connector/connector.txt b/Documentation/connector/connector.txt index ad6e0ba7b38c..81e6bf6ead57 100644 --- a/Documentation/connector/connector.txt +++ b/Documentation/connector/connector.txt | |||
@@ -5,10 +5,10 @@ Kernel Connector. | |||
5 | Kernel connector - new netlink based userspace <-> kernel space easy | 5 | Kernel connector - new netlink based userspace <-> kernel space easy |
6 | to use communication module. | 6 | to use communication module. |
7 | 7 | ||
8 | Connector driver adds possibility to connect various agents using | 8 | The Connector driver makes it easy to connect various agents using a |
9 | netlink based network. One must register callback and | 9 | netlink based network. One must register a callback and an identifier. |
10 | identifier. When driver receives special netlink message with | 10 | When the driver receives a special netlink message with the appropriate |
11 | appropriate identifier, appropriate callback will be called. | 11 | identifier, the appropriate callback will be called. |
12 | 12 | ||
13 | From the userspace point of view it's quite straightforward: | 13 | From the userspace point of view it's quite straightforward: |
14 | 14 | ||
@@ -17,10 +17,10 @@ From the userspace point of view it's quite straightforward: | |||
17 | send(); | 17 | send(); |
18 | recv(); | 18 | recv(); |
19 | 19 | ||
20 | But if kernelspace want to use full power of such connections, driver | 20 | But if kernelspace wants to use the full power of such connections, the |
21 | writer must create special sockets, must know about struct sk_buff | 21 | driver writer must create special sockets, must know about struct sk_buff |
22 | handling... Connector allows any kernelspace agents to use netlink | 22 | handling, etc... The Connector driver allows any kernelspace agents to use |
23 | based networking for inter-process communication in a significantly | 23 | netlink based networking for inter-process communication in a significantly |
24 | easier way: | 24 | easier way: |
25 | 25 | ||
26 | int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *)); | 26 | int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *)); |
@@ -32,15 +32,15 @@ struct cb_id | |||
32 | __u32 val; | 32 | __u32 val; |
33 | }; | 33 | }; |
34 | 34 | ||
35 | idx and val are unique identifiers which must be registered in | 35 | idx and val are unique identifiers which must be registered in the |
36 | connector.h for in-kernel usage. void (*callback) (void *) - is a | 36 | connector.h header for in-kernel usage. void (*callback) (void *) is a |
37 | callback function which will be called when message with above idx.val | 37 | callback function which will be called when a message with above idx.val |
38 | will be received by connector core. Argument for that function must | 38 | is received by the connector core. The argument for that function must |
39 | be dereferenced to struct cn_msg *. | 39 | be dereferenced to struct cn_msg *. |
40 | 40 | ||
41 | struct cn_msg | 41 | struct cn_msg |
42 | { | 42 | { |
43 | struct cb_id id; | 43 | struct cb_id id; |
44 | 44 | ||
45 | __u32 seq; | 45 | __u32 seq; |
46 | __u32 ack; | 46 | __u32 ack; |
@@ -55,92 +55,95 @@ Connector interfaces. | |||
55 | 55 | ||
56 | int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *)); | 56 | int cn_add_callback(struct cb_id *id, char *name, void (*callback) (void *)); |
57 | 57 | ||
58 | Registers new callback with connector core. | 58 | Registers new callback with connector core. |
59 | 59 | ||
60 | struct cb_id *id - unique connector's user identifier. | 60 | struct cb_id *id - unique connector's user identifier. |
61 | It must be registered in connector.h for legal in-kernel users. | 61 | It must be registered in connector.h for legal in-kernel users. |
62 | char *name - connector's callback symbolic name. | 62 | char *name - connector's callback symbolic name. |
63 | void (*callback) (void *) - connector's callback. | 63 | void (*callback) (void *) - connector's callback. |
64 | Argument must be dereferenced to struct cn_msg *. | 64 | Argument must be dereferenced to struct cn_msg *. |
65 | 65 | ||
66 | |||
66 | void cn_del_callback(struct cb_id *id); | 67 | void cn_del_callback(struct cb_id *id); |
67 | 68 | ||
68 | Unregisters new callback with connector core. | 69 | Unregisters new callback with connector core. |
70 | |||
71 | struct cb_id *id - unique connector's user identifier. | ||
69 | 72 | ||
70 | struct cb_id *id - unique connector's user identifier. | ||
71 | 73 | ||
72 | int cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask); | 74 | int cn_netlink_send(struct cn_msg *msg, u32 __groups, int gfp_mask); |
73 | 75 | ||
74 | Sends message to the specified groups. It can be safely called from | 76 | Sends message to the specified groups. It can be safely called from |
75 | softirq context, but may silently fail under strong memory pressure. | 77 | softirq context, but may silently fail under strong memory pressure. |
76 | If there are no listeners for given group -ESRCH can be returned. | 78 | If there are no listeners for given group -ESRCH can be returned. |
77 | 79 | ||
78 | struct cn_msg * - message header(with attached data). | 80 | struct cn_msg * - message header(with attached data). |
79 | u32 __group - destination group. | 81 | u32 __group - destination group. |
80 | If __group is zero, then appropriate group will | 82 | If __group is zero, then appropriate group will |
81 | be searched through all registered connector users, | 83 | be searched through all registered connector users, |
82 | and message will be delivered to the group which was | 84 | and message will be delivered to the group which was |
83 | created for user with the same ID as in msg. | 85 | created for user with the same ID as in msg. |
84 | If __group is not zero, then message will be delivered | 86 | If __group is not zero, then message will be delivered |
85 | to the specified group. | 87 | to the specified group. |
86 | int gfp_mask - GFP mask. | 88 | int gfp_mask - GFP mask. |
87 | 89 | ||
88 | Note: When registering new callback user, connector core assigns | 90 | Note: When registering new callback user, connector core assigns |
89 | netlink group to the user which is equal to it's id.idx. | 91 | netlink group to the user which is equal to it's id.idx. |
90 | 92 | ||
91 | /*****************************************/ | 93 | /*****************************************/ |
92 | Protocol description. | 94 | Protocol description. |
93 | /*****************************************/ | 95 | /*****************************************/ |
94 | 96 | ||
95 | Current offers transport layer with fixed header. Recommended | 97 | The current framework offers a transport layer with fixed headers. The |
96 | protocol which uses such header is following: | 98 | recommended protocol which uses such a header is as following: |
97 | 99 | ||
98 | msg->seq and msg->ack are used to determine message genealogy. When | 100 | msg->seq and msg->ack are used to determine message genealogy. When |
99 | someone sends message it puts there locally unique sequence and random | 101 | someone sends a message, they use a locally unique sequence and random |
100 | acknowledge numbers. Sequence number may be copied into | 102 | acknowledge number. The sequence number may be copied into |
101 | nlmsghdr->nlmsg_seq too. | 103 | nlmsghdr->nlmsg_seq too. |
102 | 104 | ||
103 | Sequence number is incremented with each message to be sent. | 105 | The sequence number is incremented with each message sent. |
104 | 106 | ||
105 | If we expect reply to our message, then sequence number in received | 107 | If you expect a reply to the message, then the sequence number in the |
106 | message MUST be the same as in original message, and acknowledge | 108 | received message MUST be the same as in the original message, and the |
107 | number MUST be the same + 1. | 109 | acknowledge number MUST be the same + 1. |
108 | 110 | ||
109 | If we receive message and it's sequence number is not equal to one we | 111 | If we receive a message and its sequence number is not equal to one we |
110 | are expecting, then it is new message. If we receive message and it's | 112 | are expecting, then it is a new message. If we receive a message and |
111 | sequence number is the same as one we are expecting, but it's | 113 | its sequence number is the same as one we are expecting, but its |
112 | acknowledge is not equal acknowledge number in original message + 1, | 114 | acknowledge is not equal to the acknowledge number in the original |
113 | then it is new message. | 115 | message + 1, then it is a new message. |
114 | 116 | ||
115 | Obviously, protocol header contains above id. | 117 | Obviously, the protocol header contains the above id. |
116 | 118 | ||
117 | connector allows event notification in the following form: kernel | 119 | The connector allows event notification in the following form: kernel |
118 | driver or userspace process can ask connector to notify it when | 120 | driver or userspace process can ask connector to notify it when |
119 | selected id's will be turned on or off(registered or unregistered it's | 121 | selected ids will be turned on or off (registered or unregistered its |
120 | callback). It is done by sending special command to connector | 122 | callback). It is done by sending a special command to the connector |
121 | driver(it also registers itself with id={-1, -1}). | 123 | driver (it also registers itself with id={-1, -1}). |
122 | 124 | ||
123 | As example of usage Documentation/connector now contains cn_test.c - | 125 | As example of this usage can be found in the cn_test.c module which |
124 | testing module which uses connector to request notification and to | 126 | uses the connector to request notification and to send messages. |
125 | send messages. | ||
126 | 127 | ||
127 | /*****************************************/ | 128 | /*****************************************/ |
128 | Reliability. | 129 | Reliability. |
129 | /*****************************************/ | 130 | /*****************************************/ |
130 | 131 | ||
131 | Netlink itself is not reliable protocol, that means that messages can | 132 | Netlink itself is not a reliable protocol. That means that messages can |
132 | be lost due to memory pressure or process' receiving queue overflowed, | 133 | be lost due to memory pressure or process' receiving queue overflowed, |
133 | so caller is warned must be prepared. That is why struct cn_msg [main | 134 | so caller is warned that it must be prepared. That is why the struct |
134 | connector's message header] contains u32 seq and u32 ack fields. | 135 | cn_msg [main connector's message header] contains u32 seq and u32 ack |
136 | fields. | ||
135 | 137 | ||
136 | /*****************************************/ | 138 | /*****************************************/ |
137 | Userspace usage. | 139 | Userspace usage. |
138 | /*****************************************/ | 140 | /*****************************************/ |
141 | |||
139 | 2.6.14 has a new netlink socket implementation, which by default does not | 142 | 2.6.14 has a new netlink socket implementation, which by default does not |
140 | allow to send data to netlink groups other than 1. | 143 | allow people to send data to netlink groups other than 1. |
141 | So, if to use netlink socket (for example using connector) | 144 | So, if you wish to use a netlink socket (for example using connector) |
142 | with different group number userspace application must subscribe to | 145 | with a different group number, the userspace application must subscribe to |
143 | that group. It can be achieved by following pseudocode: | 146 | that group first. It can be achieved by the following pseudocode: |
144 | 147 | ||
145 | s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR); | 148 | s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR); |
146 | 149 | ||
@@ -160,8 +163,8 @@ if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) { | |||
160 | } | 163 | } |
161 | 164 | ||
162 | Where 270 above is SOL_NETLINK, and 1 is a NETLINK_ADD_MEMBERSHIP socket | 165 | Where 270 above is SOL_NETLINK, and 1 is a NETLINK_ADD_MEMBERSHIP socket |
163 | option. To drop multicast subscription one should call above socket option | 166 | option. To drop a multicast subscription, one should call the above socket |
164 | with NETLINK_DROP_MEMBERSHIP parameter which is defined as 0. | 167 | option with the NETLINK_DROP_MEMBERSHIP parameter which is defined as 0. |
165 | 168 | ||
166 | 2.6.14 netlink code only allows to select a group which is less or equal to | 169 | 2.6.14 netlink code only allows to select a group which is less or equal to |
167 | the maximum group number, which is used at netlink_kernel_create() time. | 170 | the maximum group number, which is used at netlink_kernel_create() time. |
diff --git a/Documentation/connector/ucon.c b/Documentation/connector/ucon.c index c5092ad0ce4b..4848db8c71ff 100644 --- a/Documentation/connector/ucon.c +++ b/Documentation/connector/ucon.c | |||
@@ -30,18 +30,24 @@ | |||
30 | 30 | ||
31 | #include <arpa/inet.h> | 31 | #include <arpa/inet.h> |
32 | 32 | ||
33 | #include <stdbool.h> | ||
33 | #include <stdio.h> | 34 | #include <stdio.h> |
34 | #include <stdlib.h> | 35 | #include <stdlib.h> |
35 | #include <unistd.h> | 36 | #include <unistd.h> |
36 | #include <string.h> | 37 | #include <string.h> |
37 | #include <errno.h> | 38 | #include <errno.h> |
38 | #include <time.h> | 39 | #include <time.h> |
40 | #include <getopt.h> | ||
39 | 41 | ||
40 | #include <linux/connector.h> | 42 | #include <linux/connector.h> |
41 | 43 | ||
42 | #define DEBUG | 44 | #define DEBUG |
43 | #define NETLINK_CONNECTOR 11 | 45 | #define NETLINK_CONNECTOR 11 |
44 | 46 | ||
47 | /* Hopefully your userspace connector.h matches this kernel */ | ||
48 | #define CN_TEST_IDX CN_NETLINK_USERS + 3 | ||
49 | #define CN_TEST_VAL 0x456 | ||
50 | |||
45 | #ifdef DEBUG | 51 | #ifdef DEBUG |
46 | #define ulog(f, a...) fprintf(stdout, f, ##a) | 52 | #define ulog(f, a...) fprintf(stdout, f, ##a) |
47 | #else | 53 | #else |
@@ -83,6 +89,25 @@ static int netlink_send(int s, struct cn_msg *msg) | |||
83 | return err; | 89 | return err; |
84 | } | 90 | } |
85 | 91 | ||
92 | static void usage(void) | ||
93 | { | ||
94 | printf( | ||
95 | "Usage: ucon [options] [output file]\n" | ||
96 | "\n" | ||
97 | "\t-h\tthis help screen\n" | ||
98 | "\t-s\tsend buffers to the test module\n" | ||
99 | "\n" | ||
100 | "The default behavior of ucon is to subscribe to the test module\n" | ||
101 | "and wait for state messages. Any ones received are dumped to the\n" | ||
102 | "specified output file (or stdout). The test module is assumed to\n" | ||
103 | "have an id of {%u.%u}\n" | ||
104 | "\n" | ||
105 | "If you get no output, then verify the cn_test module id matches\n" | ||
106 | "the expected id above.\n" | ||
107 | , CN_TEST_IDX, CN_TEST_VAL | ||
108 | ); | ||
109 | } | ||
110 | |||
86 | int main(int argc, char *argv[]) | 111 | int main(int argc, char *argv[]) |
87 | { | 112 | { |
88 | int s; | 113 | int s; |
@@ -94,17 +119,34 @@ int main(int argc, char *argv[]) | |||
94 | FILE *out; | 119 | FILE *out; |
95 | time_t tm; | 120 | time_t tm; |
96 | struct pollfd pfd; | 121 | struct pollfd pfd; |
122 | bool send_msgs = false; | ||
97 | 123 | ||
98 | if (argc < 2) | 124 | while ((s = getopt(argc, argv, "hs")) != -1) { |
99 | out = stdout; | 125 | switch (s) { |
100 | else { | 126 | case 's': |
101 | out = fopen(argv[1], "a+"); | 127 | send_msgs = true; |
128 | break; | ||
129 | |||
130 | case 'h': | ||
131 | usage(); | ||
132 | return 0; | ||
133 | |||
134 | default: | ||
135 | /* getopt() outputs an error for us */ | ||
136 | usage(); | ||
137 | return 1; | ||
138 | } | ||
139 | } | ||
140 | |||
141 | if (argc != optind) { | ||
142 | out = fopen(argv[optind], "a+"); | ||
102 | if (!out) { | 143 | if (!out) { |
103 | ulog("Unable to open %s for writing: %s\n", | 144 | ulog("Unable to open %s for writing: %s\n", |
104 | argv[1], strerror(errno)); | 145 | argv[1], strerror(errno)); |
105 | out = stdout; | 146 | out = stdout; |
106 | } | 147 | } |
107 | } | 148 | } else |
149 | out = stdout; | ||
108 | 150 | ||
109 | memset(buf, 0, sizeof(buf)); | 151 | memset(buf, 0, sizeof(buf)); |
110 | 152 | ||
@@ -115,9 +157,11 @@ int main(int argc, char *argv[]) | |||
115 | } | 157 | } |
116 | 158 | ||
117 | l_local.nl_family = AF_NETLINK; | 159 | l_local.nl_family = AF_NETLINK; |
118 | l_local.nl_groups = 0x123; /* bitmask of requested groups */ | 160 | l_local.nl_groups = -1; /* bitmask of requested groups */ |
119 | l_local.nl_pid = 0; | 161 | l_local.nl_pid = 0; |
120 | 162 | ||
163 | ulog("subscribing to %u.%u\n", CN_TEST_IDX, CN_TEST_VAL); | ||
164 | |||
121 | if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) { | 165 | if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) { |
122 | perror("bind"); | 166 | perror("bind"); |
123 | close(s); | 167 | close(s); |
@@ -130,15 +174,15 @@ int main(int argc, char *argv[]) | |||
130 | setsockopt(s, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &on, sizeof(on)); | 174 | setsockopt(s, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &on, sizeof(on)); |
131 | } | 175 | } |
132 | #endif | 176 | #endif |
133 | if (0) { | 177 | if (send_msgs) { |
134 | int i, j; | 178 | int i, j; |
135 | 179 | ||
136 | memset(buf, 0, sizeof(buf)); | 180 | memset(buf, 0, sizeof(buf)); |
137 | 181 | ||
138 | data = (struct cn_msg *)buf; | 182 | data = (struct cn_msg *)buf; |
139 | 183 | ||
140 | data->id.idx = 0x123; | 184 | data->id.idx = CN_TEST_IDX; |
141 | data->id.val = 0x456; | 185 | data->id.val = CN_TEST_VAL; |
142 | data->seq = seq++; | 186 | data->seq = seq++; |
143 | data->ack = 0; | 187 | data->ack = 0; |
144 | data->len = 0; | 188 | data->len = 0; |
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 09e031c55887..503d21216d58 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt | |||
@@ -6,6 +6,35 @@ be removed from this file. | |||
6 | 6 | ||
7 | --------------------------- | 7 | --------------------------- |
8 | 8 | ||
9 | What: PRISM54 | ||
10 | When: 2.6.34 | ||
11 | |||
12 | Why: prism54 FullMAC PCI / Cardbus devices used to be supported only by the | ||
13 | prism54 wireless driver. After Intersil stopped selling these | ||
14 | devices in preference for the newer more flexible SoftMAC devices | ||
15 | a SoftMAC device driver was required and prism54 did not support | ||
16 | them. The p54pci driver now exists and has been present in the kernel for | ||
17 | a while. This driver supports both SoftMAC devices and FullMAC devices. | ||
18 | The main difference between these devices was the amount of memory which | ||
19 | could be used for the firmware. The SoftMAC devices support a smaller | ||
20 | amount of memory. Because of this the SoftMAC firmware fits into FullMAC | ||
21 | devices's memory. p54pci supports not only PCI / Cardbus but also USB | ||
22 | and SPI. Since p54pci supports all devices prism54 supports | ||
23 | you will have a conflict. I'm not quite sure how distributions are | ||
24 | handling this conflict right now. prism54 was kept around due to | ||
25 | claims users may experience issues when using the SoftMAC driver. | ||
26 | Time has passed users have not reported issues. If you use prism54 | ||
27 | and for whatever reason you cannot use p54pci please let us know! | ||
28 | E-mail us at: linux-wireless@vger.kernel.org | ||
29 | |||
30 | For more information see the p54 wiki page: | ||
31 | |||
32 | http://wireless.kernel.org/en/users/Drivers/p54 | ||
33 | |||
34 | Who: Luis R. Rodriguez <lrodriguez@atheros.com> | ||
35 | |||
36 | --------------------------- | ||
37 | |||
9 | What: IRQF_SAMPLE_RANDOM | 38 | What: IRQF_SAMPLE_RANDOM |
10 | Check: IRQF_SAMPLE_RANDOM | 39 | Check: IRQF_SAMPLE_RANDOM |
11 | When: July 2009 | 40 | When: July 2009 |
@@ -206,24 +235,6 @@ Who: Len Brown <len.brown@intel.com> | |||
206 | 235 | ||
207 | --------------------------- | 236 | --------------------------- |
208 | 237 | ||
209 | What: libata spindown skipping and warning | ||
210 | When: Dec 2008 | ||
211 | Why: Some halt(8) implementations synchronize caches for and spin | ||
212 | down libata disks because libata didn't use to spin down disk on | ||
213 | system halt (only synchronized caches). | ||
214 | Spin down on system halt is now implemented. sysfs node | ||
215 | /sys/class/scsi_disk/h:c:i:l/manage_start_stop is present if | ||
216 | spin down support is available. | ||
217 | Because issuing spin down command to an already spun down disk | ||
218 | makes some disks spin up just to spin down again, libata tracks | ||
219 | device spindown status to skip the extra spindown command and | ||
220 | warn about it. | ||
221 | This is to give userspace tools the time to get updated and will | ||
222 | be removed after userspace is reasonably updated. | ||
223 | Who: Tejun Heo <htejun@gmail.com> | ||
224 | |||
225 | --------------------------- | ||
226 | |||
227 | What: i386/x86_64 bzImage symlinks | 238 | What: i386/x86_64 bzImage symlinks |
228 | When: April 2010 | 239 | When: April 2010 |
229 | 240 | ||
@@ -235,31 +246,6 @@ Who: Thomas Gleixner <tglx@linutronix.de> | |||
235 | --------------------------- | 246 | --------------------------- |
236 | 247 | ||
237 | What (Why): | 248 | What (Why): |
238 | - include/linux/netfilter_ipv4/ipt_TOS.h ipt_tos.h header files | ||
239 | (superseded by xt_TOS/xt_tos target & match) | ||
240 | |||
241 | - "forwarding" header files like ipt_mac.h in | ||
242 | include/linux/netfilter_ipv4/ and include/linux/netfilter_ipv6/ | ||
243 | |||
244 | - xt_CONNMARK match revision 0 | ||
245 | (superseded by xt_CONNMARK match revision 1) | ||
246 | |||
247 | - xt_MARK target revisions 0 and 1 | ||
248 | (superseded by xt_MARK match revision 2) | ||
249 | |||
250 | - xt_connmark match revision 0 | ||
251 | (superseded by xt_connmark match revision 1) | ||
252 | |||
253 | - xt_conntrack match revision 0 | ||
254 | (superseded by xt_conntrack match revision 1) | ||
255 | |||
256 | - xt_iprange match revision 0, | ||
257 | include/linux/netfilter_ipv4/ipt_iprange.h | ||
258 | (superseded by xt_iprange match revision 1) | ||
259 | |||
260 | - xt_mark match revision 0 | ||
261 | (superseded by xt_mark match revision 1) | ||
262 | |||
263 | - xt_recent: the old ipt_recent proc dir | 249 | - xt_recent: the old ipt_recent proc dir |
264 | (superseded by /proc/net/xt_recent) | 250 | (superseded by /proc/net/xt_recent) |
265 | 251 | ||
@@ -394,15 +380,6 @@ Who: Thomas Gleixner <tglx@linutronix.de> | |||
394 | 380 | ||
395 | ----------------------------- | 381 | ----------------------------- |
396 | 382 | ||
397 | What: obsolete generic irq defines and typedefs | ||
398 | When: 2.6.30 | ||
399 | Why: The defines and typedefs (hw_interrupt_type, no_irq_type, irq_desc_t) | ||
400 | have been kept around for migration reasons. After more than two years | ||
401 | it's time to remove them finally | ||
402 | Who: Thomas Gleixner <tglx@linutronix.de> | ||
403 | |||
404 | --------------------------- | ||
405 | |||
406 | What: fakephp and associated sysfs files in /sys/bus/pci/slots/ | 383 | What: fakephp and associated sysfs files in /sys/bus/pci/slots/ |
407 | When: 2011 | 384 | When: 2011 |
408 | Why: In 2.6.27, the semantics of /sys/bus/pci/slots was redefined to | 385 | Why: In 2.6.27, the semantics of /sys/bus/pci/slots was redefined to |
@@ -468,3 +445,27 @@ Why: cpu_policy_rwsem has a new cleaner definition making it local to | |||
468 | cpufreq core and contained inside cpufreq.c. Other dependent | 445 | cpufreq core and contained inside cpufreq.c. Other dependent |
469 | drivers should not use it in order to safely avoid lockdep issues. | 446 | drivers should not use it in order to safely avoid lockdep issues. |
470 | Who: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> | 447 | Who: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> |
448 | |||
449 | ---------------------------- | ||
450 | |||
451 | What: sound-slot/service-* module aliases and related clutters in | ||
452 | sound/sound_core.c | ||
453 | When: August 2010 | ||
454 | Why: OSS sound_core grabs all legacy minors (0-255) of SOUND_MAJOR | ||
455 | (14) and requests modules using custom sound-slot/service-* | ||
456 | module aliases. The only benefit of doing this is allowing | ||
457 | use of custom module aliases which might as well be considered | ||
458 | a bug at this point. This preemptive claiming prevents | ||
459 | alternative OSS implementations. | ||
460 | |||
461 | Till the feature is removed, the kernel will be requesting | ||
462 | both sound-slot/service-* and the standard char-major-* module | ||
463 | aliases and allow turning off the pre-claiming selectively via | ||
464 | CONFIG_SOUND_OSS_CORE_PRECLAIM and soundcore.preclaim_oss | ||
465 | kernel parameter. | ||
466 | |||
467 | After the transition phase is complete, both the custom module | ||
468 | aliases and switches to disable it will go away. This removal | ||
469 | will also allow making ALSA OSS emulation independent of | ||
470 | sound_core. The dependency will be broken then too. | ||
471 | Who: Tejun Heo <tj@kernel.org> | ||
diff --git a/Documentation/filesystems/gfs2-uevents.txt b/Documentation/filesystems/gfs2-uevents.txt new file mode 100644 index 000000000000..fd966dc9979a --- /dev/null +++ b/Documentation/filesystems/gfs2-uevents.txt | |||
@@ -0,0 +1,100 @@ | |||
1 | uevents and GFS2 | ||
2 | ================== | ||
3 | |||
4 | During the lifetime of a GFS2 mount, a number of uevents are generated. | ||
5 | This document explains what the events are and what they are used | ||
6 | for (by gfs_controld in gfs2-utils). | ||
7 | |||
8 | A list of GFS2 uevents | ||
9 | ----------------------- | ||
10 | |||
11 | 1. ADD | ||
12 | |||
13 | The ADD event occurs at mount time. It will always be the first | ||
14 | uevent generated by the newly created filesystem. If the mount | ||
15 | is successful, an ONLINE uevent will follow. If it is not successful | ||
16 | then a REMOVE uevent will follow. | ||
17 | |||
18 | The ADD uevent has two environment variables: SPECTATOR=[0|1] | ||
19 | and RDONLY=[0|1] that specify the spectator status (a read-only mount | ||
20 | with no journal assigned), and read-only (with journal assigned) status | ||
21 | of the filesystem respectively. | ||
22 | |||
23 | 2. ONLINE | ||
24 | |||
25 | The ONLINE uevent is generated after a successful mount or remount. It | ||
26 | has the same environment variables as the ADD uevent. The ONLINE | ||
27 | uevent, along with the two environment variables for spectator and | ||
28 | RDONLY are a relatively recent addition (2.6.32-rc+) and will not | ||
29 | be generated by older kernels. | ||
30 | |||
31 | 3. CHANGE | ||
32 | |||
33 | The CHANGE uevent is used in two places. One is when reporting the | ||
34 | successful mount of the filesystem by the first node (FIRSTMOUNT=Done). | ||
35 | This is used as a signal by gfs_controld that it is then ok for other | ||
36 | nodes in the cluster to mount the filesystem. | ||
37 | |||
38 | The other CHANGE uevent is used to inform of the completion | ||
39 | of journal recovery for one of the filesystems journals. It has | ||
40 | two environment variables, JID= which specifies the journal id which | ||
41 | has just been recovered, and RECOVERY=[Done|Failed] to indicate the | ||
42 | success (or otherwise) of the operation. These uevents are generated | ||
43 | for every journal recovered, whether it is during the initial mount | ||
44 | process or as the result of gfs_controld requesting a specific journal | ||
45 | recovery via the /sys/fs/gfs2/<fsname>/lock_module/recovery file. | ||
46 | |||
47 | Because the CHANGE uevent was used (in early versions of gfs_controld) | ||
48 | without checking the environment variables to discover the state, we | ||
49 | cannot add any more functions to it without running the risk of | ||
50 | someone using an older version of the user tools and breaking their | ||
51 | cluster. For this reason the ONLINE uevent was used when adding a new | ||
52 | uevent for a successful mount or remount. | ||
53 | |||
54 | 4. OFFLINE | ||
55 | |||
56 | The OFFLINE uevent is only generated due to filesystem errors and is used | ||
57 | as part of the "withdraw" mechanism. Currently this doesn't give any | ||
58 | information about what the error is, which is something that needs to | ||
59 | be fixed. | ||
60 | |||
61 | 5. REMOVE | ||
62 | |||
63 | The REMOVE uevent is generated at the end of an unsuccessful mount | ||
64 | or at the end of a umount of the filesystem. All REMOVE uevents will | ||
65 | have been preceeded by at least an ADD uevent for the same fileystem, | ||
66 | and unlike the other uevents is generated automatically by the kernel's | ||
67 | kobject subsystem. | ||
68 | |||
69 | |||
70 | Information common to all GFS2 uevents (uevent environment variables) | ||
71 | ---------------------------------------------------------------------- | ||
72 | |||
73 | 1. LOCKTABLE= | ||
74 | |||
75 | The LOCKTABLE is a string, as supplied on the mount command | ||
76 | line (locktable=) or via fstab. It is used as a filesystem label | ||
77 | as well as providing the information for a lock_dlm mount to be | ||
78 | able to join the cluster. | ||
79 | |||
80 | 2. LOCKPROTO= | ||
81 | |||
82 | The LOCKPROTO is a string, and its value depends on what is set | ||
83 | on the mount command line, or via fstab. It will be either | ||
84 | lock_nolock or lock_dlm. In the future other lock managers | ||
85 | may be supported. | ||
86 | |||
87 | 3. JOURNALID= | ||
88 | |||
89 | If a journal is in use by the filesystem (journals are not | ||
90 | assigned for spectator mounts) then this will give the | ||
91 | numeric journal id in all GFS2 uevents. | ||
92 | |||
93 | 4. UUID= | ||
94 | |||
95 | With recent versions of gfs2-utils, mkfs.gfs2 writes a UUID | ||
96 | into the filesystem superblock. If it exists, this will | ||
97 | be included in every uevent relating to the filesystem. | ||
98 | |||
99 | |||
100 | |||
diff --git a/Documentation/filesystems/nfs.txt b/Documentation/filesystems/nfs.txt new file mode 100644 index 000000000000..f50f26ce6cd0 --- /dev/null +++ b/Documentation/filesystems/nfs.txt | |||
@@ -0,0 +1,98 @@ | |||
1 | |||
2 | The NFS client | ||
3 | ============== | ||
4 | |||
5 | The NFS version 2 protocol was first documented in RFC1094 (March 1989). | ||
6 | Since then two more major releases of NFS have been published, with NFSv3 | ||
7 | being documented in RFC1813 (June 1995), and NFSv4 in RFC3530 (April | ||
8 | 2003). | ||
9 | |||
10 | The Linux NFS client currently supports all the above published versions, | ||
11 | and work is in progress on adding support for minor version 1 of the NFSv4 | ||
12 | protocol. | ||
13 | |||
14 | The purpose of this document is to provide information on some of the | ||
15 | upcall interfaces that are used in order to provide the NFS client with | ||
16 | some of the information that it requires in order to fully comply with | ||
17 | the NFS spec. | ||
18 | |||
19 | The DNS resolver | ||
20 | ================ | ||
21 | |||
22 | NFSv4 allows for one server to refer the NFS client to data that has been | ||
23 | migrated onto another server by means of the special "fs_locations" | ||
24 | attribute. See | ||
25 | http://tools.ietf.org/html/rfc3530#section-6 | ||
26 | and | ||
27 | http://tools.ietf.org/html/draft-ietf-nfsv4-referrals-00 | ||
28 | |||
29 | The fs_locations information can take the form of either an ip address and | ||
30 | a path, or a DNS hostname and a path. The latter requires the NFS client to | ||
31 | do a DNS lookup in order to mount the new volume, and hence the need for an | ||
32 | upcall to allow userland to provide this service. | ||
33 | |||
34 | Assuming that the user has the 'rpc_pipefs' filesystem mounted in the usual | ||
35 | /var/lib/nfs/rpc_pipefs, the upcall consists of the following steps: | ||
36 | |||
37 | (1) The process checks the dns_resolve cache to see if it contains a | ||
38 | valid entry. If so, it returns that entry and exits. | ||
39 | |||
40 | (2) If no valid entry exists, the helper script '/sbin/nfs_cache_getent' | ||
41 | (may be changed using the 'nfs.cache_getent' kernel boot parameter) | ||
42 | is run, with two arguments: | ||
43 | - the cache name, "dns_resolve" | ||
44 | - the hostname to resolve | ||
45 | |||
46 | (3) After looking up the corresponding ip address, the helper script | ||
47 | writes the result into the rpc_pipefs pseudo-file | ||
48 | '/var/lib/nfs/rpc_pipefs/cache/dns_resolve/channel' | ||
49 | in the following (text) format: | ||
50 | |||
51 | "<ip address> <hostname> <ttl>\n" | ||
52 | |||
53 | Where <ip address> is in the usual IPv4 (123.456.78.90) or IPv6 | ||
54 | (ffee:ddcc:bbaa:9988:7766:5544:3322:1100, ffee::1100, ...) format. | ||
55 | <hostname> is identical to the second argument of the helper | ||
56 | script, and <ttl> is the 'time to live' of this cache entry (in | ||
57 | units of seconds). | ||
58 | |||
59 | Note: If <ip address> is invalid, say the string "0", then a negative | ||
60 | entry is created, which will cause the kernel to treat the hostname | ||
61 | as having no valid DNS translation. | ||
62 | |||
63 | |||
64 | |||
65 | |||
66 | A basic sample /sbin/nfs_cache_getent | ||
67 | ===================================== | ||
68 | |||
69 | #!/bin/bash | ||
70 | # | ||
71 | ttl=600 | ||
72 | # | ||
73 | cut=/usr/bin/cut | ||
74 | getent=/usr/bin/getent | ||
75 | rpc_pipefs=/var/lib/nfs/rpc_pipefs | ||
76 | # | ||
77 | die() | ||
78 | { | ||
79 | echo "Usage: $0 cache_name entry_name" | ||
80 | exit 1 | ||
81 | } | ||
82 | |||
83 | [ $# -lt 2 ] && die | ||
84 | cachename="$1" | ||
85 | cache_path=${rpc_pipefs}/cache/${cachename}/channel | ||
86 | |||
87 | case "${cachename}" in | ||
88 | dns_resolve) | ||
89 | name="$2" | ||
90 | result="$(${getent} hosts ${name} | ${cut} -f1 -d\ )" | ||
91 | [ -z "${result}" ] && result="0" | ||
92 | ;; | ||
93 | *) | ||
94 | die | ||
95 | ;; | ||
96 | esac | ||
97 | echo "${result} ${name} ${ttl}" >${cache_path} | ||
98 | |||
diff --git a/Documentation/filesystems/seq_file.txt b/Documentation/filesystems/seq_file.txt index b843743aa0b5..0d15ebccf5b0 100644 --- a/Documentation/filesystems/seq_file.txt +++ b/Documentation/filesystems/seq_file.txt | |||
@@ -46,7 +46,7 @@ better to do. The file is seekable, in that one can do something like the | |||
46 | following: | 46 | following: |
47 | 47 | ||
48 | dd if=/proc/sequence of=out1 count=1 | 48 | dd if=/proc/sequence of=out1 count=1 |
49 | dd if=/proc/sequence skip=1 out=out2 count=1 | 49 | dd if=/proc/sequence skip=1 of=out2 count=1 |
50 | 50 | ||
51 | Then concatenate the output files out1 and out2 and get the right | 51 | Then concatenate the output files out1 and out2 and get the right |
52 | result. Yes, it is a thoroughly useless module, but the point is to show | 52 | result. Yes, it is a thoroughly useless module, but the point is to show |
diff --git a/Documentation/flexible-arrays.txt b/Documentation/flexible-arrays.txt new file mode 100644 index 000000000000..84eb26808dee --- /dev/null +++ b/Documentation/flexible-arrays.txt | |||
@@ -0,0 +1,99 @@ | |||
1 | Using flexible arrays in the kernel | ||
2 | Last updated for 2.6.31 | ||
3 | Jonathan Corbet <corbet@lwn.net> | ||
4 | |||
5 | Large contiguous memory allocations can be unreliable in the Linux kernel. | ||
6 | Kernel programmers will sometimes respond to this problem by allocating | ||
7 | pages with vmalloc(). This solution not ideal, though. On 32-bit systems, | ||
8 | memory from vmalloc() must be mapped into a relatively small address space; | ||
9 | it's easy to run out. On SMP systems, the page table changes required by | ||
10 | vmalloc() allocations can require expensive cross-processor interrupts on | ||
11 | all CPUs. And, on all systems, use of space in the vmalloc() range | ||
12 | increases pressure on the translation lookaside buffer (TLB), reducing the | ||
13 | performance of the system. | ||
14 | |||
15 | In many cases, the need for memory from vmalloc() can be eliminated by | ||
16 | piecing together an array from smaller parts; the flexible array library | ||
17 | exists to make this task easier. | ||
18 | |||
19 | A flexible array holds an arbitrary (within limits) number of fixed-sized | ||
20 | objects, accessed via an integer index. Sparse arrays are handled | ||
21 | reasonably well. Only single-page allocations are made, so memory | ||
22 | allocation failures should be relatively rare. The down sides are that the | ||
23 | arrays cannot be indexed directly, individual object size cannot exceed the | ||
24 | system page size, and putting data into a flexible array requires a copy | ||
25 | operation. It's also worth noting that flexible arrays do no internal | ||
26 | locking at all; if concurrent access to an array is possible, then the | ||
27 | caller must arrange for appropriate mutual exclusion. | ||
28 | |||
29 | The creation of a flexible array is done with: | ||
30 | |||
31 | #include <linux/flex_array.h> | ||
32 | |||
33 | struct flex_array *flex_array_alloc(int element_size, | ||
34 | unsigned int total, | ||
35 | gfp_t flags); | ||
36 | |||
37 | The individual object size is provided by element_size, while total is the | ||
38 | maximum number of objects which can be stored in the array. The flags | ||
39 | argument is passed directly to the internal memory allocation calls. With | ||
40 | the current code, using flags to ask for high memory is likely to lead to | ||
41 | notably unpleasant side effects. | ||
42 | |||
43 | Storing data into a flexible array is accomplished with a call to: | ||
44 | |||
45 | int flex_array_put(struct flex_array *array, unsigned int element_nr, | ||
46 | void *src, gfp_t flags); | ||
47 | |||
48 | This call will copy the data from src into the array, in the position | ||
49 | indicated by element_nr (which must be less than the maximum specified when | ||
50 | the array was created). If any memory allocations must be performed, flags | ||
51 | will be used. The return value is zero on success, a negative error code | ||
52 | otherwise. | ||
53 | |||
54 | There might possibly be a need to store data into a flexible array while | ||
55 | running in some sort of atomic context; in this situation, sleeping in the | ||
56 | memory allocator would be a bad thing. That can be avoided by using | ||
57 | GFP_ATOMIC for the flags value, but, often, there is a better way. The | ||
58 | trick is to ensure that any needed memory allocations are done before | ||
59 | entering atomic context, using: | ||
60 | |||
61 | int flex_array_prealloc(struct flex_array *array, unsigned int start, | ||
62 | unsigned int end, gfp_t flags); | ||
63 | |||
64 | This function will ensure that memory for the elements indexed in the range | ||
65 | defined by start and end has been allocated. Thereafter, a | ||
66 | flex_array_put() call on an element in that range is guaranteed not to | ||
67 | block. | ||
68 | |||
69 | Getting data back out of the array is done with: | ||
70 | |||
71 | void *flex_array_get(struct flex_array *fa, unsigned int element_nr); | ||
72 | |||
73 | The return value is a pointer to the data element, or NULL if that | ||
74 | particular element has never been allocated. | ||
75 | |||
76 | Note that it is possible to get back a valid pointer for an element which | ||
77 | has never been stored in the array. Memory for array elements is allocated | ||
78 | one page at a time; a single allocation could provide memory for several | ||
79 | adjacent elements. The flexible array code does not know if a specific | ||
80 | element has been written; it only knows if the associated memory is | ||
81 | present. So a flex_array_get() call on an element which was never stored | ||
82 | in the array has the potential to return a pointer to random data. If the | ||
83 | caller does not have a separate way to know which elements were actually | ||
84 | stored, it might be wise, at least, to add GFP_ZERO to the flags argument | ||
85 | to ensure that all elements are zeroed. | ||
86 | |||
87 | There is no way to remove a single element from the array. It is possible, | ||
88 | though, to remove all elements with a call to: | ||
89 | |||
90 | void flex_array_free_parts(struct flex_array *array); | ||
91 | |||
92 | This call frees all elements, but leaves the array itself in place. | ||
93 | Freeing the entire array is done with: | ||
94 | |||
95 | void flex_array_free(struct flex_array *array); | ||
96 | |||
97 | As of this writing, there are no users of flexible arrays in the mainline | ||
98 | kernel. The functions described here are also not exported to modules; | ||
99 | that will probably be fixed when somebody comes up with a need for it. | ||
diff --git a/Documentation/input/sentelic.txt b/Documentation/input/sentelic.txt new file mode 100644 index 000000000000..f7160a2fb6a2 --- /dev/null +++ b/Documentation/input/sentelic.txt | |||
@@ -0,0 +1,475 @@ | |||
1 | Copyright (C) 2002-2008 Sentelic Corporation. | ||
2 | Last update: Oct-31-2008 | ||
3 | |||
4 | ============================================================================== | ||
5 | * Finger Sensing Pad Intellimouse Mode(scrolling wheel, 4th and 5th buttons) | ||
6 | ============================================================================== | ||
7 | A) MSID 4: Scrolling wheel mode plus Forward page(4th button) and Backward | ||
8 | page (5th button) | ||
9 | @1. Set sample rate to 200; | ||
10 | @2. Set sample rate to 200; | ||
11 | @3. Set sample rate to 80; | ||
12 | @4. Issuing the "Get device ID" command (0xF2) and waits for the response; | ||
13 | @5. FSP will respond 0x04. | ||
14 | |||
15 | Packet 1 | ||
16 | Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 | ||
17 | BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| | ||
18 | 1 |Y|X|y|x|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 | | |B|F|W|W|W|W| | ||
19 | |---------------| |---------------| |---------------| |---------------| | ||
20 | |||
21 | Byte 1: Bit7 => Y overflow | ||
22 | Bit6 => X overflow | ||
23 | Bit5 => Y sign bit | ||
24 | Bit4 => X sign bit | ||
25 | Bit3 => 1 | ||
26 | Bit2 => Middle Button, 1 is pressed, 0 is not pressed. | ||
27 | Bit1 => Right Button, 1 is pressed, 0 is not pressed. | ||
28 | Bit0 => Left Button, 1 is pressed, 0 is not pressed. | ||
29 | Byte 2: X Movement(9-bit 2's complement integers) | ||
30 | Byte 3: Y Movement(9-bit 2's complement integers) | ||
31 | Byte 4: Bit3~Bit0 => the scrolling wheel's movement since the last data report. | ||
32 | valid values, -8 ~ +7 | ||
33 | Bit4 => 1 = 4th mouse button is pressed, Forward one page. | ||
34 | 0 = 4th mouse button is not pressed. | ||
35 | Bit5 => 1 = 5th mouse button is pressed, Backward one page. | ||
36 | 0 = 5th mouse button is not pressed. | ||
37 | |||
38 | B) MSID 6: Horizontal and Vertical scrolling. | ||
39 | @ Set bit 1 in register 0x40 to 1 | ||
40 | |||
41 | # FSP replaces scrolling wheel's movement as 4 bits to show horizontal and | ||
42 | vertical scrolling. | ||
43 | |||
44 | Packet 1 | ||
45 | Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 | ||
46 | BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| | ||
47 | 1 |Y|X|y|x|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 | | |B|F|l|r|u|d| | ||
48 | |---------------| |---------------| |---------------| |---------------| | ||
49 | |||
50 | Byte 1: Bit7 => Y overflow | ||
51 | Bit6 => X overflow | ||
52 | Bit5 => Y sign bit | ||
53 | Bit4 => X sign bit | ||
54 | Bit3 => 1 | ||
55 | Bit2 => Middle Button, 1 is pressed, 0 is not pressed. | ||
56 | Bit1 => Right Button, 1 is pressed, 0 is not pressed. | ||
57 | Bit0 => Left Button, 1 is pressed, 0 is not pressed. | ||
58 | Byte 2: X Movement(9-bit 2's complement integers) | ||
59 | Byte 3: Y Movement(9-bit 2's complement integers) | ||
60 | Byte 4: Bit0 => the Vertical scrolling movement downward. | ||
61 | Bit1 => the Vertical scrolling movement upward. | ||
62 | Bit2 => the Vertical scrolling movement rightward. | ||
63 | Bit3 => the Vertical scrolling movement leftward. | ||
64 | Bit4 => 1 = 4th mouse button is pressed, Forward one page. | ||
65 | 0 = 4th mouse button is not pressed. | ||
66 | Bit5 => 1 = 5th mouse button is pressed, Backward one page. | ||
67 | 0 = 5th mouse button is not pressed. | ||
68 | |||
69 | C) MSID 7: | ||
70 | # FSP uses 2 packets(8 Bytes) data to represent Absolute Position | ||
71 | so we have PACKET NUMBER to identify packets. | ||
72 | If PACKET NUMBER is 0, the packet is Packet 1. | ||
73 | If PACKET NUMBER is 1, the packet is Packet 2. | ||
74 | Please count this number in program. | ||
75 | |||
76 | # MSID6 special packet will be enable at the same time when enable MSID 7. | ||
77 | |||
78 | ============================================================================== | ||
79 | * Absolute position for STL3886-G0. | ||
80 | ============================================================================== | ||
81 | @ Set bit 2 or 3 in register 0x40 to 1 | ||
82 | @ Set bit 6 in register 0x40 to 1 | ||
83 | |||
84 | Packet 1 (ABSOLUTE POSITION) | ||
85 | Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 | ||
86 | BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| | ||
87 | 1 |0|1|V|1|1|M|R|L| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |r|l|d|u|X|X|Y|Y| | ||
88 | |---------------| |---------------| |---------------| |---------------| | ||
89 | |||
90 | Byte 1: Bit7~Bit6 => 00, Normal data packet | ||
91 | => 01, Absolute coordination packet | ||
92 | => 10, Notify packet | ||
93 | Bit5 => valid bit | ||
94 | Bit4 => 1 | ||
95 | Bit3 => 1 | ||
96 | Bit2 => Middle Button, 1 is pressed, 0 is not pressed. | ||
97 | Bit1 => Right Button, 1 is pressed, 0 is not pressed. | ||
98 | Bit0 => Left Button, 1 is pressed, 0 is not pressed. | ||
99 | Byte 2: X coordinate (xpos[9:2]) | ||
100 | Byte 3: Y coordinate (ypos[9:2]) | ||
101 | Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0]) | ||
102 | Bit3~Bit2 => X coordinate (ypos[1:0]) | ||
103 | Bit4 => scroll up | ||
104 | Bit5 => scroll down | ||
105 | Bit6 => scroll left | ||
106 | Bit7 => scroll right | ||
107 | |||
108 | Notify Packet for G0 | ||
109 | Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 | ||
110 | BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| | ||
111 | 1 |1|0|0|1|1|M|R|L| 2 |C|C|C|C|C|C|C|C| 3 |M|M|M|M|M|M|M|M| 4 |0|0|0|0|0|0|0|0| | ||
112 | |---------------| |---------------| |---------------| |---------------| | ||
113 | |||
114 | Byte 1: Bit7~Bit6 => 00, Normal data packet | ||
115 | => 01, Absolute coordination packet | ||
116 | => 10, Notify packet | ||
117 | Bit5 => 0 | ||
118 | Bit4 => 1 | ||
119 | Bit3 => 1 | ||
120 | Bit2 => Middle Button, 1 is pressed, 0 is not pressed. | ||
121 | Bit1 => Right Button, 1 is pressed, 0 is not pressed. | ||
122 | Bit0 => Left Button, 1 is pressed, 0 is not pressed. | ||
123 | Byte 2: Message Type => 0x5A (Enable/Disable status packet) | ||
124 | Mode Type => 0xA5 (Normal/Icon mode status) | ||
125 | Byte 3: Message Type => 0x00 (Disabled) | ||
126 | => 0x01 (Enabled) | ||
127 | Mode Type => 0x00 (Normal) | ||
128 | => 0x01 (Icon) | ||
129 | Byte 4: Bit7~Bit0 => Don't Care | ||
130 | |||
131 | ============================================================================== | ||
132 | * Absolute position for STL3888-A0. | ||
133 | ============================================================================== | ||
134 | Packet 1 (ABSOLUTE POSITION) | ||
135 | Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 | ||
136 | BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| | ||
137 | 1 |0|1|V|A|1|L|0|1| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |x|x|y|y|X|X|Y|Y| | ||
138 | |---------------| |---------------| |---------------| |---------------| | ||
139 | |||
140 | Byte 1: Bit7~Bit6 => 00, Normal data packet | ||
141 | => 01, Absolute coordination packet | ||
142 | => 10, Notify packet | ||
143 | Bit5 => Valid bit, 0 means that the coordinate is invalid or finger up. | ||
144 | When both fingers are up, the last two reports have zero valid | ||
145 | bit. | ||
146 | Bit4 => arc | ||
147 | Bit3 => 1 | ||
148 | Bit2 => Left Button, 1 is pressed, 0 is released. | ||
149 | Bit1 => 0 | ||
150 | Bit0 => 1 | ||
151 | Byte 2: X coordinate (xpos[9:2]) | ||
152 | Byte 3: Y coordinate (ypos[9:2]) | ||
153 | Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0]) | ||
154 | Bit3~Bit2 => X coordinate (ypos[1:0]) | ||
155 | Bit5~Bit4 => y1_g | ||
156 | Bit7~Bit6 => x1_g | ||
157 | |||
158 | Packet 2 (ABSOLUTE POSITION) | ||
159 | Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 | ||
160 | BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| | ||
161 | 1 |0|1|V|A|1|R|1|0| 2 |X|X|X|X|X|X|X|X| 3 |Y|Y|Y|Y|Y|Y|Y|Y| 4 |x|x|y|y|X|X|Y|Y| | ||
162 | |---------------| |---------------| |---------------| |---------------| | ||
163 | |||
164 | Byte 1: Bit7~Bit6 => 00, Normal data packet | ||
165 | => 01, Absolute coordinates packet | ||
166 | => 10, Notify packet | ||
167 | Bit5 => Valid bit, 0 means that the coordinate is invalid or finger up. | ||
168 | When both fingers are up, the last two reports have zero valid | ||
169 | bit. | ||
170 | Bit4 => arc | ||
171 | Bit3 => 1 | ||
172 | Bit2 => Right Button, 1 is pressed, 0 is released. | ||
173 | Bit1 => 1 | ||
174 | Bit0 => 0 | ||
175 | Byte 2: X coordinate (xpos[9:2]) | ||
176 | Byte 3: Y coordinate (ypos[9:2]) | ||
177 | Byte 4: Bit1~Bit0 => Y coordinate (xpos[1:0]) | ||
178 | Bit3~Bit2 => X coordinate (ypos[1:0]) | ||
179 | Bit5~Bit4 => y2_g | ||
180 | Bit7~Bit6 => x2_g | ||
181 | |||
182 | Notify Packet for STL3888-A0 | ||
183 | Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 | ||
184 | BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| | ||
185 | 1 |1|0|1|P|1|M|R|L| 2 |C|C|C|C|C|C|C|C| 3 |0|0|F|F|0|0|0|i| 4 |r|l|d|u|0|0|0|0| | ||
186 | |---------------| |---------------| |---------------| |---------------| | ||
187 | |||
188 | Byte 1: Bit7~Bit6 => 00, Normal data packet | ||
189 | => 01, Absolute coordination packet | ||
190 | => 10, Notify packet | ||
191 | Bit5 => 1 | ||
192 | Bit4 => when in absolute coordinates mode (valid when EN_PKT_GO is 1): | ||
193 | 0: left button is generated by the on-pad command | ||
194 | 1: left button is generated by the external button | ||
195 | Bit3 => 1 | ||
196 | Bit2 => Middle Button, 1 is pressed, 0 is not pressed. | ||
197 | Bit1 => Right Button, 1 is pressed, 0 is not pressed. | ||
198 | Bit0 => Left Button, 1 is pressed, 0 is not pressed. | ||
199 | Byte 2: Message Type => 0xB7 (Multi Finger, Multi Coordinate mode) | ||
200 | Byte 3: Bit7~Bit6 => Don't care | ||
201 | Bit5~Bit4 => Number of fingers | ||
202 | Bit3~Bit1 => Reserved | ||
203 | Bit0 => 1: enter gesture mode; 0: leaving gesture mode | ||
204 | Byte 4: Bit7 => scroll right button | ||
205 | Bit6 => scroll left button | ||
206 | Bit5 => scroll down button | ||
207 | Bit4 => scroll up button | ||
208 | * Note that if gesture and additional button (Bit4~Bit7) | ||
209 | happen at the same time, the button information will not | ||
210 | be sent. | ||
211 | Bit3~Bit0 => Reserved | ||
212 | |||
213 | Sample sequence of Multi-finger, Multi-coordinate mode: | ||
214 | |||
215 | notify packet (valid bit == 1), abs pkt 1, abs pkt 2, abs pkt 1, | ||
216 | abs pkt 2, ..., notify packet(valid bit == 0) | ||
217 | |||
218 | ============================================================================== | ||
219 | * FSP Enable/Disable packet | ||
220 | ============================================================================== | ||
221 | Bit 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 | ||
222 | BYTE |---------------|BYTE |---------------|BYTE|---------------|BYTE|---------------| | ||
223 | 1 |Y|X|0|0|1|M|R|L| 2 |0|1|0|1|1|0|1|E| 3 | | | | | | | | | 4 | | | | | | | | | | ||
224 | |---------------| |---------------| |---------------| |---------------| | ||
225 | |||
226 | FSP will send out enable/disable packet when FSP receive PS/2 enable/disable | ||
227 | command. Host will receive the packet which Middle, Right, Left button will | ||
228 | be set. The packet only use byte 0 and byte 1 as a pattern of original packet. | ||
229 | Ignore the other bytes of the packet. | ||
230 | |||
231 | Byte 1: Bit7 => 0, Y overflow | ||
232 | Bit6 => 0, X overflow | ||
233 | Bit5 => 0, Y sign bit | ||
234 | Bit4 => 0, X sign bit | ||
235 | Bit3 => 1 | ||
236 | Bit2 => 1, Middle Button | ||
237 | Bit1 => 1, Right Button | ||
238 | Bit0 => 1, Left Button | ||
239 | Byte 2: Bit7~1 => (0101101b) | ||
240 | Bit0 => 1 = Enable | ||
241 | 0 = Disable | ||
242 | Byte 3: Don't care | ||
243 | Byte 4: Don't care (MOUSE ID 3, 4) | ||
244 | Byte 5~8: Don't care (Absolute packet) | ||
245 | |||
246 | ============================================================================== | ||
247 | * PS/2 Command Set | ||
248 | ============================================================================== | ||
249 | |||
250 | FSP supports basic PS/2 commanding set and modes, refer to following URL for | ||
251 | details about PS/2 commands: | ||
252 | |||
253 | http://www.computer-engineering.org/index.php?title=PS/2_Mouse_Interface | ||
254 | |||
255 | ============================================================================== | ||
256 | * Programming Sequence for Determining Packet Parsing Flow | ||
257 | ============================================================================== | ||
258 | 1. Identify FSP by reading device ID(0x00) and version(0x01) register | ||
259 | |||
260 | 2. Determine number of buttons by reading status2 (0x0b) register | ||
261 | |||
262 | buttons = reg[0x0b] & 0x30 | ||
263 | |||
264 | if buttons == 0x30 or buttons == 0x20: | ||
265 | # two/four buttons | ||
266 | Refer to 'Finger Sensing Pad PS/2 Mouse Intellimouse' | ||
267 | section A for packet parsing detail(ignore byte 4, bit ~ 7) | ||
268 | elif buttons == 0x10: | ||
269 | # 6 buttons | ||
270 | Refer to 'Finger Sensing Pad PS/2 Mouse Intellimouse' | ||
271 | section B for packet parsing detail | ||
272 | elif buttons == 0x00: | ||
273 | # 6 buttons | ||
274 | Refer to 'Finger Sensing Pad PS/2 Mouse Intellimouse' | ||
275 | section A for packet parsing detail | ||
276 | |||
277 | ============================================================================== | ||
278 | * Programming Sequence for Register Reading/Writing | ||
279 | ============================================================================== | ||
280 | |||
281 | Register inversion requirement: | ||
282 | |||
283 | Following values needed to be inverted(the '~' operator in C) before being | ||
284 | sent to FSP: | ||
285 | |||
286 | 0xe9, 0xee, 0xf2 and 0xff. | ||
287 | |||
288 | Register swapping requirement: | ||
289 | |||
290 | Following values needed to have their higher 4 bits and lower 4 bits being | ||
291 | swapped before being sent to FSP: | ||
292 | |||
293 | 10, 20, 40, 60, 80, 100 and 200. | ||
294 | |||
295 | Register reading sequence: | ||
296 | |||
297 | 1. send 0xf3 PS/2 command to FSP; | ||
298 | |||
299 | 2. send 0x66 PS/2 command to FSP; | ||
300 | |||
301 | 3. send 0x88 PS/2 command to FSP; | ||
302 | |||
303 | 4. send 0xf3 PS/2 command to FSP; | ||
304 | |||
305 | 5. if the register address being to read is not required to be | ||
306 | inverted(refer to the 'Register inversion requirement' section), | ||
307 | goto step 6 | ||
308 | |||
309 | 5a. send 0x68 PS/2 command to FSP; | ||
310 | |||
311 | 5b. send the inverted register address to FSP and goto step 8; | ||
312 | |||
313 | 6. if the register address being to read is not required to be | ||
314 | swapped(refer to the 'Register swapping requirement' section), | ||
315 | goto step 7 | ||
316 | |||
317 | 6a. send 0xcc PS/2 command to FSP; | ||
318 | |||
319 | 6b. send the swapped register address to FSP and goto step 8; | ||
320 | |||
321 | 7. send 0x66 PS/2 command to FSP; | ||
322 | |||
323 | 7a. send the original register address to FSP and goto step 8; | ||
324 | |||
325 | 8. send 0xe9(status request) PS/2 command to FSP; | ||
326 | |||
327 | 9. the response read from FSP should be the requested register value. | ||
328 | |||
329 | Register writing sequence: | ||
330 | |||
331 | 1. send 0xf3 PS/2 command to FSP; | ||
332 | |||
333 | 2. if the register address being to write is not required to be | ||
334 | inverted(refer to the 'Register inversion requirement' section), | ||
335 | goto step 3 | ||
336 | |||
337 | 2a. send 0x74 PS/2 command to FSP; | ||
338 | |||
339 | 2b. send the inverted register address to FSP and goto step 5; | ||
340 | |||
341 | 3. if the register address being to write is not required to be | ||
342 | swapped(refer to the 'Register swapping requirement' section), | ||
343 | goto step 4 | ||
344 | |||
345 | 3a. send 0x77 PS/2 command to FSP; | ||
346 | |||
347 | 3b. send the swapped register address to FSP and goto step 5; | ||
348 | |||
349 | 4. send 0x55 PS/2 command to FSP; | ||
350 | |||
351 | 4a. send the register address to FSP and goto step 5; | ||
352 | |||
353 | 5. send 0xf3 PS/2 command to FSP; | ||
354 | |||
355 | 6. if the register value being to write is not required to be | ||
356 | inverted(refer to the 'Register inversion requirement' section), | ||
357 | goto step 7 | ||
358 | |||
359 | 6a. send 0x47 PS/2 command to FSP; | ||
360 | |||
361 | 6b. send the inverted register value to FSP and goto step 9; | ||
362 | |||
363 | 7. if the register value being to write is not required to be | ||
364 | swapped(refer to the 'Register swapping requirement' section), | ||
365 | goto step 8 | ||
366 | |||
367 | 7a. send 0x44 PS/2 command to FSP; | ||
368 | |||
369 | 7b. send the swapped register value to FSP and goto step 9; | ||
370 | |||
371 | 8. send 0x33 PS/2 command to FSP; | ||
372 | |||
373 | 8a. send the register value to FSP; | ||
374 | |||
375 | 9. the register writing sequence is completed. | ||
376 | |||
377 | ============================================================================== | ||
378 | * Register Listing | ||
379 | ============================================================================== | ||
380 | |||
381 | offset width default r/w name | ||
382 | 0x00 bit7~bit0 0x01 RO device ID | ||
383 | |||
384 | 0x01 bit7~bit0 0xc0 RW version ID | ||
385 | |||
386 | 0x02 bit7~bit0 0x01 RO vendor ID | ||
387 | |||
388 | 0x03 bit7~bit0 0x01 RO product ID | ||
389 | |||
390 | 0x04 bit3~bit0 0x01 RW revision ID | ||
391 | |||
392 | 0x0b RO test mode status 1 | ||
393 | bit3 1 RO 0: rotate 180 degree, 1: no rotation | ||
394 | |||
395 | bit5~bit4 RO number of buttons | ||
396 | 11 => 2, lbtn/rbtn | ||
397 | 10 => 4, lbtn/rbtn/scru/scrd | ||
398 | 01 => 6, lbtn/rbtn/scru/scrd/scrl/scrr | ||
399 | 00 => 6, lbtn/rbtn/scru/scrd/fbtn/bbtn | ||
400 | |||
401 | 0x0f RW register file page control | ||
402 | bit0 0 RW 1 to enable page 1 register files | ||
403 | |||
404 | 0x10 RW system control 1 | ||
405 | bit0 1 RW Reserved, must be 1 | ||
406 | bit1 0 RW Reserved, must be 0 | ||
407 | bit4 1 RW Reserved, must be 0 | ||
408 | bit5 0 RW register clock gating enable | ||
409 | 0: read only, 1: read/write enable | ||
410 | (Note that following registers does not require clock gating being | ||
411 | enabled prior to write: 05 06 07 08 09 0c 0f 10 11 12 16 17 18 23 2e | ||
412 | 40 41 42 43.) | ||
413 | |||
414 | 0x31 RW on-pad command detection | ||
415 | bit7 0 RW on-pad command left button down tag | ||
416 | enable | ||
417 | 0: disable, 1: enable | ||
418 | |||
419 | 0x34 RW on-pad command control 5 | ||
420 | bit4~bit0 0x05 RW XLO in 0s/4/1, so 03h = 0010.1b = 2.5 | ||
421 | (Note that position unit is in 0.5 scanline) | ||
422 | |||
423 | bit7 0 RW on-pad tap zone enable | ||
424 | 0: disable, 1: enable | ||
425 | |||
426 | 0x35 RW on-pad command control 6 | ||
427 | bit4~bit0 0x1d RW XHI in 0s/4/1, so 19h = 1100.1b = 12.5 | ||
428 | (Note that position unit is in 0.5 scanline) | ||
429 | |||
430 | 0x36 RW on-pad command control 7 | ||
431 | bit4~bit0 0x04 RW YLO in 0s/4/1, so 03h = 0010.1b = 2.5 | ||
432 | (Note that position unit is in 0.5 scanline) | ||
433 | |||
434 | 0x37 RW on-pad command control 8 | ||
435 | bit4~bit0 0x13 RW YHI in 0s/4/1, so 11h = 1000.1b = 8.5 | ||
436 | (Note that position unit is in 0.5 scanline) | ||
437 | |||
438 | 0x40 RW system control 5 | ||
439 | bit1 0 RW FSP Intellimouse mode enable | ||
440 | 0: disable, 1: enable | ||
441 | |||
442 | bit2 0 RW movement + abs. coordinate mode enable | ||
443 | 0: disable, 1: enable | ||
444 | (Note that this function has the functionality of bit 1 even when | ||
445 | bit 1 is not set. However, the format is different from that of bit 1. | ||
446 | In addition, when bit 1 and bit 2 are set at the same time, bit 2 will | ||
447 | override bit 1.) | ||
448 | |||
449 | bit3 0 RW abs. coordinate only mode enable | ||
450 | 0: disable, 1: enable | ||
451 | (Note that this function has the functionality of bit 1 even when | ||
452 | bit 1 is not set. However, the format is different from that of bit 1. | ||
453 | In addition, when bit 1, bit 2 and bit 3 are set at the same time, | ||
454 | bit 3 will override bit 1 and 2.) | ||
455 | |||
456 | bit5 0 RW auto switch enable | ||
457 | 0: disable, 1: enable | ||
458 | |||
459 | bit6 0 RW G0 abs. + notify packet format enable | ||
460 | 0: disable, 1: enable | ||
461 | (Note that the absolute/relative coordinate output still depends on | ||
462 | bit 2 and 3. That is, if any of those bit is 1, host will receive | ||
463 | absolute coordinates; otherwise, host only receives packets with | ||
464 | relative coordinate.) | ||
465 | |||
466 | 0x43 RW on-pad control | ||
467 | bit0 0 RW on-pad control enable | ||
468 | 0: disable, 1: enable | ||
469 | (Note that if this bit is cleared, bit 3/5 will be ineffective) | ||
470 | |||
471 | bit3 0 RW on-pad fix vertical scrolling enable | ||
472 | 0: disable, 1: enable | ||
473 | |||
474 | bit5 0 RW on-pad fix horizontal scrolling enable | ||
475 | 0: disable, 1: enable | ||
diff --git a/Documentation/intel_txt.txt b/Documentation/intel_txt.txt new file mode 100644 index 000000000000..f40a1f030019 --- /dev/null +++ b/Documentation/intel_txt.txt | |||
@@ -0,0 +1,210 @@ | |||
1 | Intel(R) TXT Overview: | ||
2 | ===================== | ||
3 | |||
4 | Intel's technology for safer computing, Intel(R) Trusted Execution | ||
5 | Technology (Intel(R) TXT), defines platform-level enhancements that | ||
6 | provide the building blocks for creating trusted platforms. | ||
7 | |||
8 | Intel TXT was formerly known by the code name LaGrande Technology (LT). | ||
9 | |||
10 | Intel TXT in Brief: | ||
11 | o Provides dynamic root of trust for measurement (DRTM) | ||
12 | o Data protection in case of improper shutdown | ||
13 | o Measurement and verification of launched environment | ||
14 | |||
15 | Intel TXT is part of the vPro(TM) brand and is also available some | ||
16 | non-vPro systems. It is currently available on desktop systems | ||
17 | based on the Q35, X38, Q45, and Q43 Express chipsets (e.g. Dell | ||
18 | Optiplex 755, HP dc7800, etc.) and mobile systems based on the GM45, | ||
19 | PM45, and GS45 Express chipsets. | ||
20 | |||
21 | For more information, see http://www.intel.com/technology/security/. | ||
22 | This site also has a link to the Intel TXT MLE Developers Manual, | ||
23 | which has been updated for the new released platforms. | ||
24 | |||
25 | Intel TXT has been presented at various events over the past few | ||
26 | years, some of which are: | ||
27 | LinuxTAG 2008: | ||
28 | http://www.linuxtag.org/2008/en/conf/events/vp-donnerstag/ | ||
29 | details.html?talkid=110 | ||
30 | TRUST2008: | ||
31 | http://www.trust2008.eu/downloads/Keynote-Speakers/ | ||
32 | 3_David-Grawrock_The-Front-Door-of-Trusted-Computing.pdf | ||
33 | IDF 2008, Shanghai: | ||
34 | http://inteldeveloperforum.com.edgesuite.net/shanghai_2008/ | ||
35 | aep/PROS003/index.html | ||
36 | IDFs 2006, 2007 (I'm not sure if/where they are online) | ||
37 | |||
38 | Trusted Boot Project Overview: | ||
39 | ============================= | ||
40 | |||
41 | Trusted Boot (tboot) is an open source, pre- kernel/VMM module that | ||
42 | uses Intel TXT to perform a measured and verified launch of an OS | ||
43 | kernel/VMM. | ||
44 | |||
45 | It is hosted on SourceForge at http://sourceforge.net/projects/tboot. | ||
46 | The mercurial source repo is available at http://www.bughost.org/ | ||
47 | repos.hg/tboot.hg. | ||
48 | |||
49 | Tboot currently supports launching Xen (open source VMM/hypervisor | ||
50 | w/ TXT support since v3.2), and now Linux kernels. | ||
51 | |||
52 | |||
53 | Value Proposition for Linux or "Why should you care?" | ||
54 | ===================================================== | ||
55 | |||
56 | While there are many products and technologies that attempt to | ||
57 | measure or protect the integrity of a running kernel, they all | ||
58 | assume the kernel is "good" to begin with. The Integrity | ||
59 | Measurement Architecture (IMA) and Linux Integrity Module interface | ||
60 | are examples of such solutions. | ||
61 | |||
62 | To get trust in the initial kernel without using Intel TXT, a | ||
63 | static root of trust must be used. This bases trust in BIOS | ||
64 | starting at system reset and requires measurement of all code | ||
65 | executed between system reset through the completion of the kernel | ||
66 | boot as well as data objects used by that code. In the case of a | ||
67 | Linux kernel, this means all of BIOS, any option ROMs, the | ||
68 | bootloader and the boot config. In practice, this is a lot of | ||
69 | code/data, much of which is subject to change from boot to boot | ||
70 | (e.g. changing NICs may change option ROMs). Without reference | ||
71 | hashes, these measurement changes are difficult to assess or | ||
72 | confirm as benign. This process also does not provide DMA | ||
73 | protection, memory configuration/alias checks and locks, crash | ||
74 | protection, or policy support. | ||
75 | |||
76 | By using the hardware-based root of trust that Intel TXT provides, | ||
77 | many of these issues can be mitigated. Specifically: many | ||
78 | pre-launch components can be removed from the trust chain, DMA | ||
79 | protection is provided to all launched components, a large number | ||
80 | of platform configuration checks are performed and values locked, | ||
81 | protection is provided for any data in the event of an improper | ||
82 | shutdown, and there is support for policy-based execution/verification. | ||
83 | This provides a more stable measurement and a higher assurance of | ||
84 | system configuration and initial state than would be otherwise | ||
85 | possible. Since the tboot project is open source, source code for | ||
86 | almost all parts of the trust chain is available (excepting SMM and | ||
87 | Intel-provided firmware). | ||
88 | |||
89 | How Does it Work? | ||
90 | ================= | ||
91 | |||
92 | o Tboot is an executable that is launched by the bootloader as | ||
93 | the "kernel" (the binary the bootloader executes). | ||
94 | o It performs all of the work necessary to determine if the | ||
95 | platform supports Intel TXT and, if so, executes the GETSEC[SENTER] | ||
96 | processor instruction that initiates the dynamic root of trust. | ||
97 | - If tboot determines that the system does not support Intel TXT | ||
98 | or is not configured correctly (e.g. the SINIT AC Module was | ||
99 | incorrect), it will directly launch the kernel with no changes | ||
100 | to any state. | ||
101 | - Tboot will output various information about its progress to the | ||
102 | terminal, serial port, and/or an in-memory log; the output | ||
103 | locations can be configured with a command line switch. | ||
104 | o The GETSEC[SENTER] instruction will return control to tboot and | ||
105 | tboot then verifies certain aspects of the environment (e.g. TPM NV | ||
106 | lock, e820 table does not have invalid entries, etc.). | ||
107 | o It will wake the APs from the special sleep state the GETSEC[SENTER] | ||
108 | instruction had put them in and place them into a wait-for-SIPI | ||
109 | state. | ||
110 | - Because the processors will not respond to an INIT or SIPI when | ||
111 | in the TXT environment, it is necessary to create a small VT-x | ||
112 | guest for the APs. When they run in this guest, they will | ||
113 | simply wait for the INIT-SIPI-SIPI sequence, which will cause | ||
114 | VMEXITs, and then disable VT and jump to the SIPI vector. This | ||
115 | approach seemed like a better choice than having to insert | ||
116 | special code into the kernel's MP wakeup sequence. | ||
117 | o Tboot then applies an (optional) user-defined launch policy to | ||
118 | verify the kernel and initrd. | ||
119 | - This policy is rooted in TPM NV and is described in the tboot | ||
120 | project. The tboot project also contains code for tools to | ||
121 | create and provision the policy. | ||
122 | - Policies are completely under user control and if not present | ||
123 | then any kernel will be launched. | ||
124 | - Policy action is flexible and can include halting on failures | ||
125 | or simply logging them and continuing. | ||
126 | o Tboot adjusts the e820 table provided by the bootloader to reserve | ||
127 | its own location in memory as well as to reserve certain other | ||
128 | TXT-related regions. | ||
129 | o As part of it's launch, tboot DMA protects all of RAM (using the | ||
130 | VT-d PMRs). Thus, the kernel must be booted with 'intel_iommu=on' | ||
131 | in order to remove this blanket protection and use VT-d's | ||
132 | page-level protection. | ||
133 | o Tboot will populate a shared page with some data about itself and | ||
134 | pass this to the Linux kernel as it transfers control. | ||
135 | - The location of the shared page is passed via the boot_params | ||
136 | struct as a physical address. | ||
137 | o The kernel will look for the tboot shared page address and, if it | ||
138 | exists, map it. | ||
139 | o As one of the checks/protections provided by TXT, it makes a copy | ||
140 | of the VT-d DMARs in a DMA-protected region of memory and verifies | ||
141 | them for correctness. The VT-d code will detect if the kernel was | ||
142 | launched with tboot and use this copy instead of the one in the | ||
143 | ACPI table. | ||
144 | o At this point, tboot and TXT are out of the picture until a | ||
145 | shutdown (S<n>) | ||
146 | o In order to put a system into any of the sleep states after a TXT | ||
147 | launch, TXT must first be exited. This is to prevent attacks that | ||
148 | attempt to crash the system to gain control on reboot and steal | ||
149 | data left in memory. | ||
150 | - The kernel will perform all of its sleep preparation and | ||
151 | populate the shared page with the ACPI data needed to put the | ||
152 | platform in the desired sleep state. | ||
153 | - Then the kernel jumps into tboot via the vector specified in the | ||
154 | shared page. | ||
155 | - Tboot will clean up the environment and disable TXT, then use the | ||
156 | kernel-provided ACPI information to actually place the platform | ||
157 | into the desired sleep state. | ||
158 | - In the case of S3, tboot will also register itself as the resume | ||
159 | vector. This is necessary because it must re-establish the | ||
160 | measured environment upon resume. Once the TXT environment | ||
161 | has been restored, it will restore the TPM PCRs and then | ||
162 | transfer control back to the kernel's S3 resume vector. | ||
163 | In order to preserve system integrity across S3, the kernel | ||
164 | provides tboot with a set of memory ranges (kernel | ||
165 | code/data/bss, S3 resume code, and AP trampoline) that tboot | ||
166 | will calculate a MAC (message authentication code) over and then | ||
167 | seal with the TPM. On resume and once the measured environment | ||
168 | has been re-established, tboot will re-calculate the MAC and | ||
169 | verify it against the sealed value. Tboot's policy determines | ||
170 | what happens if the verification fails. | ||
171 | |||
172 | That's pretty much it for TXT support. | ||
173 | |||
174 | |||
175 | Configuring the System: | ||
176 | ====================== | ||
177 | |||
178 | This code works with 32bit, 32bit PAE, and 64bit (x86_64) kernels. | ||
179 | |||
180 | In BIOS, the user must enable: TPM, TXT, VT-x, VT-d. Not all BIOSes | ||
181 | allow these to be individually enabled/disabled and the screens in | ||
182 | which to find them are BIOS-specific. | ||
183 | |||
184 | grub.conf needs to be modified as follows: | ||
185 | title Linux 2.6.29-tip w/ tboot | ||
186 | root (hd0,0) | ||
187 | kernel /tboot.gz logging=serial,vga,memory | ||
188 | module /vmlinuz-2.6.29-tip intel_iommu=on ro | ||
189 | root=LABEL=/ rhgb console=ttyS0,115200 3 | ||
190 | module /initrd-2.6.29-tip.img | ||
191 | module /Q35_SINIT_17.BIN | ||
192 | |||
193 | The kernel option for enabling Intel TXT support is found under the | ||
194 | Security top-level menu and is called "Enable Intel(R) Trusted | ||
195 | Execution Technology (TXT)". It is marked as EXPERIMENTAL and | ||
196 | depends on the generic x86 support (to allow maximum flexibility in | ||
197 | kernel build options), since the tboot code will detect whether the | ||
198 | platform actually supports Intel TXT and thus whether any of the | ||
199 | kernel code is executed. | ||
200 | |||
201 | The Q35_SINIT_17.BIN file is what Intel TXT refers to as an | ||
202 | Authenticated Code Module. It is specific to the chipset in the | ||
203 | system and can also be found on the Trusted Boot site. It is an | ||
204 | (unencrypted) module signed by Intel that is used as part of the | ||
205 | DRTM process to verify and configure the system. It is signed | ||
206 | because it operates at a higher privilege level in the system than | ||
207 | any other macrocode and its correct operation is critical to the | ||
208 | establishment of the DRTM. The process for determining the correct | ||
209 | SINIT ACM for a system is documented in the SINIT-guide.txt file | ||
210 | that is on the tboot SourceForge site under the SINIT ACM downloads. | ||
diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt index dbea4f95fc85..aafca0a8f66a 100644 --- a/Documentation/ioctl/ioctl-number.txt +++ b/Documentation/ioctl/ioctl-number.txt | |||
@@ -121,6 +121,7 @@ Code Seq# Include File Comments | |||
121 | 'c' 00-7F linux/comstats.h conflict! | 121 | 'c' 00-7F linux/comstats.h conflict! |
122 | 'c' 00-7F linux/coda.h conflict! | 122 | 'c' 00-7F linux/coda.h conflict! |
123 | 'c' 80-9F arch/s390/include/asm/chsc.h | 123 | 'c' 80-9F arch/s390/include/asm/chsc.h |
124 | 'c' A0-AF arch/x86/include/asm/msr.h | ||
124 | 'd' 00-FF linux/char/drm/drm/h conflict! | 125 | 'd' 00-FF linux/char/drm/drm/h conflict! |
125 | 'd' F0-FF linux/digi1.h | 126 | 'd' F0-FF linux/digi1.h |
126 | 'e' all linux/digi1.h conflict! | 127 | 'e' all linux/digi1.h conflict! |
@@ -192,7 +193,7 @@ Code Seq# Include File Comments | |||
192 | 0xAD 00 Netfilter device in development: | 193 | 0xAD 00 Netfilter device in development: |
193 | <mailto:rusty@rustcorp.com.au> | 194 | <mailto:rusty@rustcorp.com.au> |
194 | 0xAE all linux/kvm.h Kernel-based Virtual Machine | 195 | 0xAE all linux/kvm.h Kernel-based Virtual Machine |
195 | <mailto:kvm-devel@lists.sourceforge.net> | 196 | <mailto:kvm@vger.kernel.org> |
196 | 0xB0 all RATIO devices in development: | 197 | 0xB0 all RATIO devices in development: |
197 | <mailto:vgo@ratio.de> | 198 | <mailto:vgo@ratio.de> |
198 | 0xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca> | 199 | 0xB1 00-1F PPPoX <mailto:mostrows@styx.uwaterloo.ca> |
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 7936b801fe6a..4c12a290bee5 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt | |||
@@ -57,6 +57,7 @@ parameter is applicable: | |||
57 | ISAPNP ISA PnP code is enabled. | 57 | ISAPNP ISA PnP code is enabled. |
58 | ISDN Appropriate ISDN support is enabled. | 58 | ISDN Appropriate ISDN support is enabled. |
59 | JOY Appropriate joystick support is enabled. | 59 | JOY Appropriate joystick support is enabled. |
60 | KVM Kernel Virtual Machine support is enabled. | ||
60 | LIBATA Libata driver is enabled | 61 | LIBATA Libata driver is enabled |
61 | LP Printer support is enabled. | 62 | LP Printer support is enabled. |
62 | LOOP Loopback device support is enabled. | 63 | LOOP Loopback device support is enabled. |
@@ -1098,6 +1099,44 @@ and is between 256 and 4096 characters. It is defined in the file | |||
1098 | kstack=N [X86] Print N words from the kernel stack | 1099 | kstack=N [X86] Print N words from the kernel stack |
1099 | in oops dumps. | 1100 | in oops dumps. |
1100 | 1101 | ||
1102 | kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs. | ||
1103 | Default is 0 (don't ignore, but inject #GP) | ||
1104 | |||
1105 | kvm.oos_shadow= [KVM] Disable out-of-sync shadow paging. | ||
1106 | Default is 1 (enabled) | ||
1107 | |||
1108 | kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM. | ||
1109 | Default is 0 (off) | ||
1110 | |||
1111 | kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU) | ||
1112 | for all guests. | ||
1113 | Default is 1 (enabled) if in 64bit or 32bit-PAE mode | ||
1114 | |||
1115 | kvm-intel.bypass_guest_pf= | ||
1116 | [KVM,Intel] Disables bypassing of guest page faults | ||
1117 | on Intel chips. Default is 1 (enabled) | ||
1118 | |||
1119 | kvm-intel.ept= [KVM,Intel] Disable extended page tables | ||
1120 | (virtualized MMU) support on capable Intel chips. | ||
1121 | Default is 1 (enabled) | ||
1122 | |||
1123 | kvm-intel.emulate_invalid_guest_state= | ||
1124 | [KVM,Intel] Enable emulation of invalid guest states | ||
1125 | Default is 0 (disabled) | ||
1126 | |||
1127 | kvm-intel.flexpriority= | ||
1128 | [KVM,Intel] Disable FlexPriority feature (TPR shadow). | ||
1129 | Default is 1 (enabled) | ||
1130 | |||
1131 | kvm-intel.unrestricted_guest= | ||
1132 | [KVM,Intel] Disable unrestricted guest feature | ||
1133 | (virtualized real and unpaged mode) on capable | ||
1134 | Intel chips. Default is 1 (enabled) | ||
1135 | |||
1136 | kvm-intel.vpid= [KVM,Intel] Disable Virtual Processor Identification | ||
1137 | feature (tagged TLBs) on capable Intel chips. | ||
1138 | Default is 1 (enabled) | ||
1139 | |||
1101 | l2cr= [PPC] | 1140 | l2cr= [PPC] |
1102 | 1141 | ||
1103 | l3cr= [PPC] | 1142 | l3cr= [PPC] |
@@ -1503,6 +1542,14 @@ and is between 256 and 4096 characters. It is defined in the file | |||
1503 | [NFS] set the TCP port on which the NFSv4 callback | 1542 | [NFS] set the TCP port on which the NFSv4 callback |
1504 | channel should listen. | 1543 | channel should listen. |
1505 | 1544 | ||
1545 | nfs.cache_getent= | ||
1546 | [NFS] sets the pathname to the program which is used | ||
1547 | to update the NFS client cache entries. | ||
1548 | |||
1549 | nfs.cache_getent_timeout= | ||
1550 | [NFS] sets the timeout after which an attempt to | ||
1551 | update a cache entry is deemed to have failed. | ||
1552 | |||
1506 | nfs.idmap_cache_timeout= | 1553 | nfs.idmap_cache_timeout= |
1507 | [NFS] set the maximum lifetime for idmapper cache | 1554 | [NFS] set the maximum lifetime for idmapper cache |
1508 | entries. | 1555 | entries. |
@@ -1535,6 +1582,11 @@ and is between 256 and 4096 characters. It is defined in the file | |||
1535 | symbolic names: lapic and ioapic | 1582 | symbolic names: lapic and ioapic |
1536 | Example: nmi_watchdog=2 or nmi_watchdog=panic,lapic | 1583 | Example: nmi_watchdog=2 or nmi_watchdog=panic,lapic |
1537 | 1584 | ||
1585 | netpoll.carrier_timeout= | ||
1586 | [NET] Specifies amount of time (in seconds) that | ||
1587 | netpoll should wait for a carrier. By default netpoll | ||
1588 | waits 4 seconds. | ||
1589 | |||
1538 | no387 [BUGS=X86-32] Tells the kernel to use the 387 maths | 1590 | no387 [BUGS=X86-32] Tells the kernel to use the 387 maths |
1539 | emulation library even if a 387 maths coprocessor | 1591 | emulation library even if a 387 maths coprocessor |
1540 | is present. | 1592 | is present. |
@@ -1919,11 +1971,12 @@ and is between 256 and 4096 characters. It is defined in the file | |||
1919 | Format: { 0 | 1 } | 1971 | Format: { 0 | 1 } |
1920 | See arch/parisc/kernel/pdc_chassis.c | 1972 | See arch/parisc/kernel/pdc_chassis.c |
1921 | 1973 | ||
1922 | percpu_alloc= [X86] Select which percpu first chunk allocator to use. | 1974 | percpu_alloc= Select which percpu first chunk allocator to use. |
1923 | Allowed values are one of "lpage", "embed" and "4k". | 1975 | Currently supported values are "embed" and "page". |
1924 | See comments in arch/x86/kernel/setup_percpu.c for | 1976 | Archs may support subset or none of the selections. |
1925 | details on each allocator. This parameter is primarily | 1977 | See comments in mm/percpu.c for details on each |
1926 | for debugging and performance comparison. | 1978 | allocator. This parameter is primarily for debugging |
1979 | and performance comparison. | ||
1927 | 1980 | ||
1928 | pf. [PARIDE] | 1981 | pf. [PARIDE] |
1929 | See Documentation/blockdev/paride.txt. | 1982 | See Documentation/blockdev/paride.txt. |
@@ -2395,6 +2448,18 @@ and is between 256 and 4096 characters. It is defined in the file | |||
2395 | stifb= [HW] | 2448 | stifb= [HW] |
2396 | Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]] | 2449 | Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]] |
2397 | 2450 | ||
2451 | sunrpc.min_resvport= | ||
2452 | sunrpc.max_resvport= | ||
2453 | [NFS,SUNRPC] | ||
2454 | SunRPC servers often require that client requests | ||
2455 | originate from a privileged port (i.e. a port in the | ||
2456 | range 0 < portnr < 1024). | ||
2457 | An administrator who wishes to reserve some of these | ||
2458 | ports for other uses may adjust the range that the | ||
2459 | kernel's sunrpc client considers to be privileged | ||
2460 | using these two parameters to set the minimum and | ||
2461 | maximum port values. | ||
2462 | |||
2398 | sunrpc.pool_mode= | 2463 | sunrpc.pool_mode= |
2399 | [NFS] | 2464 | [NFS] |
2400 | Control how the NFS server code allocates CPUs to | 2465 | Control how the NFS server code allocates CPUs to |
@@ -2411,6 +2476,15 @@ and is between 256 and 4096 characters. It is defined in the file | |||
2411 | pernode one pool for each NUMA node (equivalent | 2476 | pernode one pool for each NUMA node (equivalent |
2412 | to global on non-NUMA machines) | 2477 | to global on non-NUMA machines) |
2413 | 2478 | ||
2479 | sunrpc.tcp_slot_table_entries= | ||
2480 | sunrpc.udp_slot_table_entries= | ||
2481 | [NFS,SUNRPC] | ||
2482 | Sets the upper limit on the number of simultaneous | ||
2483 | RPC calls that can be sent from the client to a | ||
2484 | server. Increasing these values may allow you to | ||
2485 | improve throughput, but will also increase the | ||
2486 | amount of memory reserved for use by the client. | ||
2487 | |||
2414 | swiotlb= [IA-64] Number of I/O TLB slabs | 2488 | swiotlb= [IA-64] Number of I/O TLB slabs |
2415 | 2489 | ||
2416 | switches= [HW,M68k] | 2490 | switches= [HW,M68k] |
@@ -2480,6 +2554,11 @@ and is between 256 and 4096 characters. It is defined in the file | |||
2480 | trace_buf_size=nn[KMG] | 2554 | trace_buf_size=nn[KMG] |
2481 | [FTRACE] will set tracing buffer size. | 2555 | [FTRACE] will set tracing buffer size. |
2482 | 2556 | ||
2557 | trace_event=[event-list] | ||
2558 | [FTRACE] Set and start specified trace events in order | ||
2559 | to facilitate early boot debugging. | ||
2560 | See also Documentation/trace/events.txt | ||
2561 | |||
2483 | trix= [HW,OSS] MediaTrix AudioTrix Pro | 2562 | trix= [HW,OSS] MediaTrix AudioTrix Pro |
2484 | Format: | 2563 | Format: |
2485 | <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq> | 2564 | <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq> |
diff --git a/Documentation/keys.txt b/Documentation/keys.txt index b56aacc1fff8..e4dbbdb1bd96 100644 --- a/Documentation/keys.txt +++ b/Documentation/keys.txt | |||
@@ -26,7 +26,7 @@ This document has the following sections: | |||
26 | - Notes on accessing payload contents | 26 | - Notes on accessing payload contents |
27 | - Defining a key type | 27 | - Defining a key type |
28 | - Request-key callback service | 28 | - Request-key callback service |
29 | - Key access filesystem | 29 | - Garbage collection |
30 | 30 | ||
31 | 31 | ||
32 | ============ | 32 | ============ |
@@ -113,6 +113,9 @@ Each key has a number of attributes: | |||
113 | 113 | ||
114 | (*) Dead. The key's type was unregistered, and so the key is now useless. | 114 | (*) Dead. The key's type was unregistered, and so the key is now useless. |
115 | 115 | ||
116 | Keys in the last three states are subject to garbage collection. See the | ||
117 | section on "Garbage collection". | ||
118 | |||
116 | 119 | ||
117 | ==================== | 120 | ==================== |
118 | KEY SERVICE OVERVIEW | 121 | KEY SERVICE OVERVIEW |
@@ -754,6 +757,26 @@ The keyctl syscall functions are: | |||
754 | successful. | 757 | successful. |
755 | 758 | ||
756 | 759 | ||
760 | (*) Install the calling process's session keyring on its parent. | ||
761 | |||
762 | long keyctl(KEYCTL_SESSION_TO_PARENT); | ||
763 | |||
764 | This functions attempts to install the calling process's session keyring | ||
765 | on to the calling process's parent, replacing the parent's current session | ||
766 | keyring. | ||
767 | |||
768 | The calling process must have the same ownership as its parent, the | ||
769 | keyring must have the same ownership as the calling process, the calling | ||
770 | process must have LINK permission on the keyring and the active LSM module | ||
771 | mustn't deny permission, otherwise error EPERM will be returned. | ||
772 | |||
773 | Error ENOMEM will be returned if there was insufficient memory to complete | ||
774 | the operation, otherwise 0 will be returned to indicate success. | ||
775 | |||
776 | The keyring will be replaced next time the parent process leaves the | ||
777 | kernel and resumes executing userspace. | ||
778 | |||
779 | |||
757 | =============== | 780 | =============== |
758 | KERNEL SERVICES | 781 | KERNEL SERVICES |
759 | =============== | 782 | =============== |
@@ -1231,3 +1254,17 @@ by executing: | |||
1231 | 1254 | ||
1232 | In this case, the program isn't required to actually attach the key to a ring; | 1255 | In this case, the program isn't required to actually attach the key to a ring; |
1233 | the rings are provided for reference. | 1256 | the rings are provided for reference. |
1257 | |||
1258 | |||
1259 | ================== | ||
1260 | GARBAGE COLLECTION | ||
1261 | ================== | ||
1262 | |||
1263 | Dead keys (for which the type has been removed) will be automatically unlinked | ||
1264 | from those keyrings that point to them and deleted as soon as possible by a | ||
1265 | background garbage collector. | ||
1266 | |||
1267 | Similarly, revoked and expired keys will be garbage collected, but only after a | ||
1268 | certain amount of time has passed. This time is set as a number of seconds in: | ||
1269 | |||
1270 | /proc/sys/kernel/keys/gc_delay | ||
diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt index 89068030b01b..34f6638aa5ac 100644 --- a/Documentation/kmemleak.txt +++ b/Documentation/kmemleak.txt | |||
@@ -27,6 +27,13 @@ To trigger an intermediate memory scan: | |||
27 | 27 | ||
28 | # echo scan > /sys/kernel/debug/kmemleak | 28 | # echo scan > /sys/kernel/debug/kmemleak |
29 | 29 | ||
30 | To clear the list of all current possible memory leaks: | ||
31 | |||
32 | # echo clear > /sys/kernel/debug/kmemleak | ||
33 | |||
34 | New leaks will then come up upon reading /sys/kernel/debug/kmemleak | ||
35 | again. | ||
36 | |||
30 | Note that the orphan objects are listed in the order they were allocated | 37 | Note that the orphan objects are listed in the order they were allocated |
31 | and one object at the beginning of the list may cause other subsequent | 38 | and one object at the beginning of the list may cause other subsequent |
32 | objects to be reported as orphan. | 39 | objects to be reported as orphan. |
@@ -42,6 +49,9 @@ Memory scanning parameters can be modified at run-time by writing to the | |||
42 | scan=<secs> - set the automatic memory scanning period in seconds | 49 | scan=<secs> - set the automatic memory scanning period in seconds |
43 | (default 600, 0 to stop the automatic scanning) | 50 | (default 600, 0 to stop the automatic scanning) |
44 | scan - trigger a memory scan | 51 | scan - trigger a memory scan |
52 | clear - clear list of current memory leak suspects, done by | ||
53 | marking all current reported unreferenced objects grey | ||
54 | dump=<addr> - dump information about the object found at <addr> | ||
45 | 55 | ||
46 | Kmemleak can also be disabled at boot-time by passing "kmemleak=off" on | 56 | Kmemleak can also be disabled at boot-time by passing "kmemleak=off" on |
47 | the kernel command line. | 57 | the kernel command line. |
@@ -86,6 +96,27 @@ avoid this, kmemleak can also store the number of values pointing to an | |||
86 | address inside the block address range that need to be found so that the | 96 | address inside the block address range that need to be found so that the |
87 | block is not considered a leak. One example is __vmalloc(). | 97 | block is not considered a leak. One example is __vmalloc(). |
88 | 98 | ||
99 | Testing specific sections with kmemleak | ||
100 | --------------------------------------- | ||
101 | |||
102 | Upon initial bootup your /sys/kernel/debug/kmemleak output page may be | ||
103 | quite extensive. This can also be the case if you have very buggy code | ||
104 | when doing development. To work around these situations you can use the | ||
105 | 'clear' command to clear all reported unreferenced objects from the | ||
106 | /sys/kernel/debug/kmemleak output. By issuing a 'scan' after a 'clear' | ||
107 | you can find new unreferenced objects; this should help with testing | ||
108 | specific sections of code. | ||
109 | |||
110 | To test a critical section on demand with a clean kmemleak do: | ||
111 | |||
112 | # echo clear > /sys/kernel/debug/kmemleak | ||
113 | ... test your kernel or modules ... | ||
114 | # echo scan > /sys/kernel/debug/kmemleak | ||
115 | |||
116 | Then as usual to get your report with: | ||
117 | |||
118 | # cat /sys/kernel/debug/kmemleak | ||
119 | |||
89 | Kmemleak API | 120 | Kmemleak API |
90 | ------------ | 121 | ------------ |
91 | 122 | ||
diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt new file mode 100644 index 000000000000..5a4bc8cf6d04 --- /dev/null +++ b/Documentation/kvm/api.txt | |||
@@ -0,0 +1,759 @@ | |||
1 | The Definitive KVM (Kernel-based Virtual Machine) API Documentation | ||
2 | =================================================================== | ||
3 | |||
4 | 1. General description | ||
5 | |||
6 | The kvm API is a set of ioctls that are issued to control various aspects | ||
7 | of a virtual machine. The ioctls belong to three classes | ||
8 | |||
9 | - System ioctls: These query and set global attributes which affect the | ||
10 | whole kvm subsystem. In addition a system ioctl is used to create | ||
11 | virtual machines | ||
12 | |||
13 | - VM ioctls: These query and set attributes that affect an entire virtual | ||
14 | machine, for example memory layout. In addition a VM ioctl is used to | ||
15 | create virtual cpus (vcpus). | ||
16 | |||
17 | Only run VM ioctls from the same process (address space) that was used | ||
18 | to create the VM. | ||
19 | |||
20 | - vcpu ioctls: These query and set attributes that control the operation | ||
21 | of a single virtual cpu. | ||
22 | |||
23 | Only run vcpu ioctls from the same thread that was used to create the | ||
24 | vcpu. | ||
25 | |||
26 | 2. File descritpors | ||
27 | |||
28 | The kvm API is centered around file descriptors. An initial | ||
29 | open("/dev/kvm") obtains a handle to the kvm subsystem; this handle | ||
30 | can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this | ||
31 | handle will create a VM file descripror which can be used to issue VM | ||
32 | ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu | ||
33 | and return a file descriptor pointing to it. Finally, ioctls on a vcpu | ||
34 | fd can be used to control the vcpu, including the important task of | ||
35 | actually running guest code. | ||
36 | |||
37 | In general file descriptors can be migrated among processes by means | ||
38 | of fork() and the SCM_RIGHTS facility of unix domain socket. These | ||
39 | kinds of tricks are explicitly not supported by kvm. While they will | ||
40 | not cause harm to the host, their actual behavior is not guaranteed by | ||
41 | the API. The only supported use is one virtual machine per process, | ||
42 | and one vcpu per thread. | ||
43 | |||
44 | 3. Extensions | ||
45 | |||
46 | As of Linux 2.6.22, the KVM ABI has been stabilized: no backward | ||
47 | incompatible change are allowed. However, there is an extension | ||
48 | facility that allows backward-compatible extensions to the API to be | ||
49 | queried and used. | ||
50 | |||
51 | The extension mechanism is not based on on the Linux version number. | ||
52 | Instead, kvm defines extension identifiers and a facility to query | ||
53 | whether a particular extension identifier is available. If it is, a | ||
54 | set of ioctls is available for application use. | ||
55 | |||
56 | 4. API description | ||
57 | |||
58 | This section describes ioctls that can be used to control kvm guests. | ||
59 | For each ioctl, the following information is provided along with a | ||
60 | description: | ||
61 | |||
62 | Capability: which KVM extension provides this ioctl. Can be 'basic', | ||
63 | which means that is will be provided by any kernel that supports | ||
64 | API version 12 (see section 4.1), or a KVM_CAP_xyz constant, which | ||
65 | means availability needs to be checked with KVM_CHECK_EXTENSION | ||
66 | (see section 4.4). | ||
67 | |||
68 | Architectures: which instruction set architectures provide this ioctl. | ||
69 | x86 includes both i386 and x86_64. | ||
70 | |||
71 | Type: system, vm, or vcpu. | ||
72 | |||
73 | Parameters: what parameters are accepted by the ioctl. | ||
74 | |||
75 | Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL) | ||
76 | are not detailed, but errors with specific meanings are. | ||
77 | |||
78 | 4.1 KVM_GET_API_VERSION | ||
79 | |||
80 | Capability: basic | ||
81 | Architectures: all | ||
82 | Type: system ioctl | ||
83 | Parameters: none | ||
84 | Returns: the constant KVM_API_VERSION (=12) | ||
85 | |||
86 | This identifies the API version as the stable kvm API. It is not | ||
87 | expected that this number will change. However, Linux 2.6.20 and | ||
88 | 2.6.21 report earlier versions; these are not documented and not | ||
89 | supported. Applications should refuse to run if KVM_GET_API_VERSION | ||
90 | returns a value other than 12. If this check passes, all ioctls | ||
91 | described as 'basic' will be available. | ||
92 | |||
93 | 4.2 KVM_CREATE_VM | ||
94 | |||
95 | Capability: basic | ||
96 | Architectures: all | ||
97 | Type: system ioctl | ||
98 | Parameters: none | ||
99 | Returns: a VM fd that can be used to control the new virtual machine. | ||
100 | |||
101 | The new VM has no virtual cpus and no memory. An mmap() of a VM fd | ||
102 | will access the virtual machine's physical address space; offset zero | ||
103 | corresponds to guest physical address zero. Use of mmap() on a VM fd | ||
104 | is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is | ||
105 | available. | ||
106 | |||
107 | 4.3 KVM_GET_MSR_INDEX_LIST | ||
108 | |||
109 | Capability: basic | ||
110 | Architectures: x86 | ||
111 | Type: system | ||
112 | Parameters: struct kvm_msr_list (in/out) | ||
113 | Returns: 0 on success; -1 on error | ||
114 | Errors: | ||
115 | E2BIG: the msr index list is to be to fit in the array specified by | ||
116 | the user. | ||
117 | |||
118 | struct kvm_msr_list { | ||
119 | __u32 nmsrs; /* number of msrs in entries */ | ||
120 | __u32 indices[0]; | ||
121 | }; | ||
122 | |||
123 | This ioctl returns the guest msrs that are supported. The list varies | ||
124 | by kvm version and host processor, but does not change otherwise. The | ||
125 | user fills in the size of the indices array in nmsrs, and in return | ||
126 | kvm adjusts nmsrs to reflect the actual number of msrs and fills in | ||
127 | the indices array with their numbers. | ||
128 | |||
129 | 4.4 KVM_CHECK_EXTENSION | ||
130 | |||
131 | Capability: basic | ||
132 | Architectures: all | ||
133 | Type: system ioctl | ||
134 | Parameters: extension identifier (KVM_CAP_*) | ||
135 | Returns: 0 if unsupported; 1 (or some other positive integer) if supported | ||
136 | |||
137 | The API allows the application to query about extensions to the core | ||
138 | kvm API. Userspace passes an extension identifier (an integer) and | ||
139 | receives an integer that describes the extension availability. | ||
140 | Generally 0 means no and 1 means yes, but some extensions may report | ||
141 | additional information in the integer return value. | ||
142 | |||
143 | 4.5 KVM_GET_VCPU_MMAP_SIZE | ||
144 | |||
145 | Capability: basic | ||
146 | Architectures: all | ||
147 | Type: system ioctl | ||
148 | Parameters: none | ||
149 | Returns: size of vcpu mmap area, in bytes | ||
150 | |||
151 | The KVM_RUN ioctl (cf.) communicates with userspace via a shared | ||
152 | memory region. This ioctl returns the size of that region. See the | ||
153 | KVM_RUN documentation for details. | ||
154 | |||
155 | 4.6 KVM_SET_MEMORY_REGION | ||
156 | |||
157 | Capability: basic | ||
158 | Architectures: all | ||
159 | Type: vm ioctl | ||
160 | Parameters: struct kvm_memory_region (in) | ||
161 | Returns: 0 on success, -1 on error | ||
162 | |||
163 | struct kvm_memory_region { | ||
164 | __u32 slot; | ||
165 | __u32 flags; | ||
166 | __u64 guest_phys_addr; | ||
167 | __u64 memory_size; /* bytes */ | ||
168 | }; | ||
169 | |||
170 | /* for kvm_memory_region::flags */ | ||
171 | #define KVM_MEM_LOG_DIRTY_PAGES 1UL | ||
172 | |||
173 | This ioctl allows the user to create or modify a guest physical memory | ||
174 | slot. When changing an existing slot, it may be moved in the guest | ||
175 | physical memory space, or its flags may be modified. It may not be | ||
176 | resized. Slots may not overlap. | ||
177 | |||
178 | The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which | ||
179 | instructs kvm to keep track of writes to memory within the slot. See | ||
180 | the KVM_GET_DIRTY_LOG ioctl. | ||
181 | |||
182 | It is recommended to use the KVM_SET_USER_MEMORY_REGION ioctl instead | ||
183 | of this API, if available. This newer API allows placing guest memory | ||
184 | at specified locations in the host address space, yielding better | ||
185 | control and easy access. | ||
186 | |||
187 | 4.6 KVM_CREATE_VCPU | ||
188 | |||
189 | Capability: basic | ||
190 | Architectures: all | ||
191 | Type: vm ioctl | ||
192 | Parameters: vcpu id (apic id on x86) | ||
193 | Returns: vcpu fd on success, -1 on error | ||
194 | |||
195 | This API adds a vcpu to a virtual machine. The vcpu id is a small integer | ||
196 | in the range [0, max_vcpus). | ||
197 | |||
198 | 4.7 KVM_GET_DIRTY_LOG (vm ioctl) | ||
199 | |||
200 | Capability: basic | ||
201 | Architectures: x86 | ||
202 | Type: vm ioctl | ||
203 | Parameters: struct kvm_dirty_log (in/out) | ||
204 | Returns: 0 on success, -1 on error | ||
205 | |||
206 | /* for KVM_GET_DIRTY_LOG */ | ||
207 | struct kvm_dirty_log { | ||
208 | __u32 slot; | ||
209 | __u32 padding; | ||
210 | union { | ||
211 | void __user *dirty_bitmap; /* one bit per page */ | ||
212 | __u64 padding; | ||
213 | }; | ||
214 | }; | ||
215 | |||
216 | Given a memory slot, return a bitmap containing any pages dirtied | ||
217 | since the last call to this ioctl. Bit 0 is the first page in the | ||
218 | memory slot. Ensure the entire structure is cleared to avoid padding | ||
219 | issues. | ||
220 | |||
221 | 4.8 KVM_SET_MEMORY_ALIAS | ||
222 | |||
223 | Capability: basic | ||
224 | Architectures: x86 | ||
225 | Type: vm ioctl | ||
226 | Parameters: struct kvm_memory_alias (in) | ||
227 | Returns: 0 (success), -1 (error) | ||
228 | |||
229 | struct kvm_memory_alias { | ||
230 | __u32 slot; /* this has a different namespace than memory slots */ | ||
231 | __u32 flags; | ||
232 | __u64 guest_phys_addr; | ||
233 | __u64 memory_size; | ||
234 | __u64 target_phys_addr; | ||
235 | }; | ||
236 | |||
237 | Defines a guest physical address space region as an alias to another | ||
238 | region. Useful for aliased address, for example the VGA low memory | ||
239 | window. Should not be used with userspace memory. | ||
240 | |||
241 | 4.9 KVM_RUN | ||
242 | |||
243 | Capability: basic | ||
244 | Architectures: all | ||
245 | Type: vcpu ioctl | ||
246 | Parameters: none | ||
247 | Returns: 0 on success, -1 on error | ||
248 | Errors: | ||
249 | EINTR: an unmasked signal is pending | ||
250 | |||
251 | This ioctl is used to run a guest virtual cpu. While there are no | ||
252 | explicit parameters, there is an implicit parameter block that can be | ||
253 | obtained by mmap()ing the vcpu fd at offset 0, with the size given by | ||
254 | KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct | ||
255 | kvm_run' (see below). | ||
256 | |||
257 | 4.10 KVM_GET_REGS | ||
258 | |||
259 | Capability: basic | ||
260 | Architectures: all | ||
261 | Type: vcpu ioctl | ||
262 | Parameters: struct kvm_regs (out) | ||
263 | Returns: 0 on success, -1 on error | ||
264 | |||
265 | Reads the general purpose registers from the vcpu. | ||
266 | |||
267 | /* x86 */ | ||
268 | struct kvm_regs { | ||
269 | /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */ | ||
270 | __u64 rax, rbx, rcx, rdx; | ||
271 | __u64 rsi, rdi, rsp, rbp; | ||
272 | __u64 r8, r9, r10, r11; | ||
273 | __u64 r12, r13, r14, r15; | ||
274 | __u64 rip, rflags; | ||
275 | }; | ||
276 | |||
277 | 4.11 KVM_SET_REGS | ||
278 | |||
279 | Capability: basic | ||
280 | Architectures: all | ||
281 | Type: vcpu ioctl | ||
282 | Parameters: struct kvm_regs (in) | ||
283 | Returns: 0 on success, -1 on error | ||
284 | |||
285 | Writes the general purpose registers into the vcpu. | ||
286 | |||
287 | See KVM_GET_REGS for the data structure. | ||
288 | |||
289 | 4.12 KVM_GET_SREGS | ||
290 | |||
291 | Capability: basic | ||
292 | Architectures: x86 | ||
293 | Type: vcpu ioctl | ||
294 | Parameters: struct kvm_sregs (out) | ||
295 | Returns: 0 on success, -1 on error | ||
296 | |||
297 | Reads special registers from the vcpu. | ||
298 | |||
299 | /* x86 */ | ||
300 | struct kvm_sregs { | ||
301 | struct kvm_segment cs, ds, es, fs, gs, ss; | ||
302 | struct kvm_segment tr, ldt; | ||
303 | struct kvm_dtable gdt, idt; | ||
304 | __u64 cr0, cr2, cr3, cr4, cr8; | ||
305 | __u64 efer; | ||
306 | __u64 apic_base; | ||
307 | __u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64]; | ||
308 | }; | ||
309 | |||
310 | interrupt_bitmap is a bitmap of pending external interrupts. At most | ||
311 | one bit may be set. This interrupt has been acknowledged by the APIC | ||
312 | but not yet injected into the cpu core. | ||
313 | |||
314 | 4.13 KVM_SET_SREGS | ||
315 | |||
316 | Capability: basic | ||
317 | Architectures: x86 | ||
318 | Type: vcpu ioctl | ||
319 | Parameters: struct kvm_sregs (in) | ||
320 | Returns: 0 on success, -1 on error | ||
321 | |||
322 | Writes special registers into the vcpu. See KVM_GET_SREGS for the | ||
323 | data structures. | ||
324 | |||
325 | 4.14 KVM_TRANSLATE | ||
326 | |||
327 | Capability: basic | ||
328 | Architectures: x86 | ||
329 | Type: vcpu ioctl | ||
330 | Parameters: struct kvm_translation (in/out) | ||
331 | Returns: 0 on success, -1 on error | ||
332 | |||
333 | Translates a virtual address according to the vcpu's current address | ||
334 | translation mode. | ||
335 | |||
336 | struct kvm_translation { | ||
337 | /* in */ | ||
338 | __u64 linear_address; | ||
339 | |||
340 | /* out */ | ||
341 | __u64 physical_address; | ||
342 | __u8 valid; | ||
343 | __u8 writeable; | ||
344 | __u8 usermode; | ||
345 | __u8 pad[5]; | ||
346 | }; | ||
347 | |||
348 | 4.15 KVM_INTERRUPT | ||
349 | |||
350 | Capability: basic | ||
351 | Architectures: x86 | ||
352 | Type: vcpu ioctl | ||
353 | Parameters: struct kvm_interrupt (in) | ||
354 | Returns: 0 on success, -1 on error | ||
355 | |||
356 | Queues a hardware interrupt vector to be injected. This is only | ||
357 | useful if in-kernel local APIC is not used. | ||
358 | |||
359 | /* for KVM_INTERRUPT */ | ||
360 | struct kvm_interrupt { | ||
361 | /* in */ | ||
362 | __u32 irq; | ||
363 | }; | ||
364 | |||
365 | Note 'irq' is an interrupt vector, not an interrupt pin or line. | ||
366 | |||
367 | 4.16 KVM_DEBUG_GUEST | ||
368 | |||
369 | Capability: basic | ||
370 | Architectures: none | ||
371 | Type: vcpu ioctl | ||
372 | Parameters: none) | ||
373 | Returns: -1 on error | ||
374 | |||
375 | Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead. | ||
376 | |||
377 | 4.17 KVM_GET_MSRS | ||
378 | |||
379 | Capability: basic | ||
380 | Architectures: x86 | ||
381 | Type: vcpu ioctl | ||
382 | Parameters: struct kvm_msrs (in/out) | ||
383 | Returns: 0 on success, -1 on error | ||
384 | |||
385 | Reads model-specific registers from the vcpu. Supported msr indices can | ||
386 | be obtained using KVM_GET_MSR_INDEX_LIST. | ||
387 | |||
388 | struct kvm_msrs { | ||
389 | __u32 nmsrs; /* number of msrs in entries */ | ||
390 | __u32 pad; | ||
391 | |||
392 | struct kvm_msr_entry entries[0]; | ||
393 | }; | ||
394 | |||
395 | struct kvm_msr_entry { | ||
396 | __u32 index; | ||
397 | __u32 reserved; | ||
398 | __u64 data; | ||
399 | }; | ||
400 | |||
401 | Application code should set the 'nmsrs' member (which indicates the | ||
402 | size of the entries array) and the 'index' member of each array entry. | ||
403 | kvm will fill in the 'data' member. | ||
404 | |||
405 | 4.18 KVM_SET_MSRS | ||
406 | |||
407 | Capability: basic | ||
408 | Architectures: x86 | ||
409 | Type: vcpu ioctl | ||
410 | Parameters: struct kvm_msrs (in) | ||
411 | Returns: 0 on success, -1 on error | ||
412 | |||
413 | Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the | ||
414 | data structures. | ||
415 | |||
416 | Application code should set the 'nmsrs' member (which indicates the | ||
417 | size of the entries array), and the 'index' and 'data' members of each | ||
418 | array entry. | ||
419 | |||
420 | 4.19 KVM_SET_CPUID | ||
421 | |||
422 | Capability: basic | ||
423 | Architectures: x86 | ||
424 | Type: vcpu ioctl | ||
425 | Parameters: struct kvm_cpuid (in) | ||
426 | Returns: 0 on success, -1 on error | ||
427 | |||
428 | Defines the vcpu responses to the cpuid instruction. Applications | ||
429 | should use the KVM_SET_CPUID2 ioctl if available. | ||
430 | |||
431 | |||
432 | struct kvm_cpuid_entry { | ||
433 | __u32 function; | ||
434 | __u32 eax; | ||
435 | __u32 ebx; | ||
436 | __u32 ecx; | ||
437 | __u32 edx; | ||
438 | __u32 padding; | ||
439 | }; | ||
440 | |||
441 | /* for KVM_SET_CPUID */ | ||
442 | struct kvm_cpuid { | ||
443 | __u32 nent; | ||
444 | __u32 padding; | ||
445 | struct kvm_cpuid_entry entries[0]; | ||
446 | }; | ||
447 | |||
448 | 4.20 KVM_SET_SIGNAL_MASK | ||
449 | |||
450 | Capability: basic | ||
451 | Architectures: x86 | ||
452 | Type: vcpu ioctl | ||
453 | Parameters: struct kvm_signal_mask (in) | ||
454 | Returns: 0 on success, -1 on error | ||
455 | |||
456 | Defines which signals are blocked during execution of KVM_RUN. This | ||
457 | signal mask temporarily overrides the threads signal mask. Any | ||
458 | unblocked signal received (except SIGKILL and SIGSTOP, which retain | ||
459 | their traditional behaviour) will cause KVM_RUN to return with -EINTR. | ||
460 | |||
461 | Note the signal will only be delivered if not blocked by the original | ||
462 | signal mask. | ||
463 | |||
464 | /* for KVM_SET_SIGNAL_MASK */ | ||
465 | struct kvm_signal_mask { | ||
466 | __u32 len; | ||
467 | __u8 sigset[0]; | ||
468 | }; | ||
469 | |||
470 | 4.21 KVM_GET_FPU | ||
471 | |||
472 | Capability: basic | ||
473 | Architectures: x86 | ||
474 | Type: vcpu ioctl | ||
475 | Parameters: struct kvm_fpu (out) | ||
476 | Returns: 0 on success, -1 on error | ||
477 | |||
478 | Reads the floating point state from the vcpu. | ||
479 | |||
480 | /* for KVM_GET_FPU and KVM_SET_FPU */ | ||
481 | struct kvm_fpu { | ||
482 | __u8 fpr[8][16]; | ||
483 | __u16 fcw; | ||
484 | __u16 fsw; | ||
485 | __u8 ftwx; /* in fxsave format */ | ||
486 | __u8 pad1; | ||
487 | __u16 last_opcode; | ||
488 | __u64 last_ip; | ||
489 | __u64 last_dp; | ||
490 | __u8 xmm[16][16]; | ||
491 | __u32 mxcsr; | ||
492 | __u32 pad2; | ||
493 | }; | ||
494 | |||
495 | 4.22 KVM_SET_FPU | ||
496 | |||
497 | Capability: basic | ||
498 | Architectures: x86 | ||
499 | Type: vcpu ioctl | ||
500 | Parameters: struct kvm_fpu (in) | ||
501 | Returns: 0 on success, -1 on error | ||
502 | |||
503 | Writes the floating point state to the vcpu. | ||
504 | |||
505 | /* for KVM_GET_FPU and KVM_SET_FPU */ | ||
506 | struct kvm_fpu { | ||
507 | __u8 fpr[8][16]; | ||
508 | __u16 fcw; | ||
509 | __u16 fsw; | ||
510 | __u8 ftwx; /* in fxsave format */ | ||
511 | __u8 pad1; | ||
512 | __u16 last_opcode; | ||
513 | __u64 last_ip; | ||
514 | __u64 last_dp; | ||
515 | __u8 xmm[16][16]; | ||
516 | __u32 mxcsr; | ||
517 | __u32 pad2; | ||
518 | }; | ||
519 | |||
520 | 4.23 KVM_CREATE_IRQCHIP | ||
521 | |||
522 | Capability: KVM_CAP_IRQCHIP | ||
523 | Architectures: x86, ia64 | ||
524 | Type: vm ioctl | ||
525 | Parameters: none | ||
526 | Returns: 0 on success, -1 on error | ||
527 | |||
528 | Creates an interrupt controller model in the kernel. On x86, creates a virtual | ||
529 | ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a | ||
530 | local APIC. IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSI 16-23 | ||
531 | only go to the IOAPIC. On ia64, a IOSAPIC is created. | ||
532 | |||
533 | 4.24 KVM_IRQ_LINE | ||
534 | |||
535 | Capability: KVM_CAP_IRQCHIP | ||
536 | Architectures: x86, ia64 | ||
537 | Type: vm ioctl | ||
538 | Parameters: struct kvm_irq_level | ||
539 | Returns: 0 on success, -1 on error | ||
540 | |||
541 | Sets the level of a GSI input to the interrupt controller model in the kernel. | ||
542 | Requires that an interrupt controller model has been previously created with | ||
543 | KVM_CREATE_IRQCHIP. Note that edge-triggered interrupts require the level | ||
544 | to be set to 1 and then back to 0. | ||
545 | |||
546 | struct kvm_irq_level { | ||
547 | union { | ||
548 | __u32 irq; /* GSI */ | ||
549 | __s32 status; /* not used for KVM_IRQ_LEVEL */ | ||
550 | }; | ||
551 | __u32 level; /* 0 or 1 */ | ||
552 | }; | ||
553 | |||
554 | 4.25 KVM_GET_IRQCHIP | ||
555 | |||
556 | Capability: KVM_CAP_IRQCHIP | ||
557 | Architectures: x86, ia64 | ||
558 | Type: vm ioctl | ||
559 | Parameters: struct kvm_irqchip (in/out) | ||
560 | Returns: 0 on success, -1 on error | ||
561 | |||
562 | Reads the state of a kernel interrupt controller created with | ||
563 | KVM_CREATE_IRQCHIP into a buffer provided by the caller. | ||
564 | |||
565 | struct kvm_irqchip { | ||
566 | __u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */ | ||
567 | __u32 pad; | ||
568 | union { | ||
569 | char dummy[512]; /* reserving space */ | ||
570 | struct kvm_pic_state pic; | ||
571 | struct kvm_ioapic_state ioapic; | ||
572 | } chip; | ||
573 | }; | ||
574 | |||
575 | 4.26 KVM_SET_IRQCHIP | ||
576 | |||
577 | Capability: KVM_CAP_IRQCHIP | ||
578 | Architectures: x86, ia64 | ||
579 | Type: vm ioctl | ||
580 | Parameters: struct kvm_irqchip (in) | ||
581 | Returns: 0 on success, -1 on error | ||
582 | |||
583 | Sets the state of a kernel interrupt controller created with | ||
584 | KVM_CREATE_IRQCHIP from a buffer provided by the caller. | ||
585 | |||
586 | struct kvm_irqchip { | ||
587 | __u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */ | ||
588 | __u32 pad; | ||
589 | union { | ||
590 | char dummy[512]; /* reserving space */ | ||
591 | struct kvm_pic_state pic; | ||
592 | struct kvm_ioapic_state ioapic; | ||
593 | } chip; | ||
594 | }; | ||
595 | |||
596 | 5. The kvm_run structure | ||
597 | |||
598 | Application code obtains a pointer to the kvm_run structure by | ||
599 | mmap()ing a vcpu fd. From that point, application code can control | ||
600 | execution by changing fields in kvm_run prior to calling the KVM_RUN | ||
601 | ioctl, and obtain information about the reason KVM_RUN returned by | ||
602 | looking up structure members. | ||
603 | |||
604 | struct kvm_run { | ||
605 | /* in */ | ||
606 | __u8 request_interrupt_window; | ||
607 | |||
608 | Request that KVM_RUN return when it becomes possible to inject external | ||
609 | interrupts into the guest. Useful in conjunction with KVM_INTERRUPT. | ||
610 | |||
611 | __u8 padding1[7]; | ||
612 | |||
613 | /* out */ | ||
614 | __u32 exit_reason; | ||
615 | |||
616 | When KVM_RUN has returned successfully (return value 0), this informs | ||
617 | application code why KVM_RUN has returned. Allowable values for this | ||
618 | field are detailed below. | ||
619 | |||
620 | __u8 ready_for_interrupt_injection; | ||
621 | |||
622 | If request_interrupt_window has been specified, this field indicates | ||
623 | an interrupt can be injected now with KVM_INTERRUPT. | ||
624 | |||
625 | __u8 if_flag; | ||
626 | |||
627 | The value of the current interrupt flag. Only valid if in-kernel | ||
628 | local APIC is not used. | ||
629 | |||
630 | __u8 padding2[2]; | ||
631 | |||
632 | /* in (pre_kvm_run), out (post_kvm_run) */ | ||
633 | __u64 cr8; | ||
634 | |||
635 | The value of the cr8 register. Only valid if in-kernel local APIC is | ||
636 | not used. Both input and output. | ||
637 | |||
638 | __u64 apic_base; | ||
639 | |||
640 | The value of the APIC BASE msr. Only valid if in-kernel local | ||
641 | APIC is not used. Both input and output. | ||
642 | |||
643 | union { | ||
644 | /* KVM_EXIT_UNKNOWN */ | ||
645 | struct { | ||
646 | __u64 hardware_exit_reason; | ||
647 | } hw; | ||
648 | |||
649 | If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown | ||
650 | reasons. Further architecture-specific information is available in | ||
651 | hardware_exit_reason. | ||
652 | |||
653 | /* KVM_EXIT_FAIL_ENTRY */ | ||
654 | struct { | ||
655 | __u64 hardware_entry_failure_reason; | ||
656 | } fail_entry; | ||
657 | |||
658 | If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due | ||
659 | to unknown reasons. Further architecture-specific information is | ||
660 | available in hardware_entry_failure_reason. | ||
661 | |||
662 | /* KVM_EXIT_EXCEPTION */ | ||
663 | struct { | ||
664 | __u32 exception; | ||
665 | __u32 error_code; | ||
666 | } ex; | ||
667 | |||
668 | Unused. | ||
669 | |||
670 | /* KVM_EXIT_IO */ | ||
671 | struct { | ||
672 | #define KVM_EXIT_IO_IN 0 | ||
673 | #define KVM_EXIT_IO_OUT 1 | ||
674 | __u8 direction; | ||
675 | __u8 size; /* bytes */ | ||
676 | __u16 port; | ||
677 | __u32 count; | ||
678 | __u64 data_offset; /* relative to kvm_run start */ | ||
679 | } io; | ||
680 | |||
681 | If exit_reason is KVM_EXIT_IO_IN or KVM_EXIT_IO_OUT, then the vcpu has | ||
682 | executed a port I/O instruction which could not be satisfied by kvm. | ||
683 | data_offset describes where the data is located (KVM_EXIT_IO_OUT) or | ||
684 | where kvm expects application code to place the data for the next | ||
685 | KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a patcked array. | ||
686 | |||
687 | struct { | ||
688 | struct kvm_debug_exit_arch arch; | ||
689 | } debug; | ||
690 | |||
691 | Unused. | ||
692 | |||
693 | /* KVM_EXIT_MMIO */ | ||
694 | struct { | ||
695 | __u64 phys_addr; | ||
696 | __u8 data[8]; | ||
697 | __u32 len; | ||
698 | __u8 is_write; | ||
699 | } mmio; | ||
700 | |||
701 | If exit_reason is KVM_EXIT_MMIO or KVM_EXIT_IO_OUT, then the vcpu has | ||
702 | executed a memory-mapped I/O instruction which could not be satisfied | ||
703 | by kvm. The 'data' member contains the written data if 'is_write' is | ||
704 | true, and should be filled by application code otherwise. | ||
705 | |||
706 | /* KVM_EXIT_HYPERCALL */ | ||
707 | struct { | ||
708 | __u64 nr; | ||
709 | __u64 args[6]; | ||
710 | __u64 ret; | ||
711 | __u32 longmode; | ||
712 | __u32 pad; | ||
713 | } hypercall; | ||
714 | |||
715 | Unused. | ||
716 | |||
717 | /* KVM_EXIT_TPR_ACCESS */ | ||
718 | struct { | ||
719 | __u64 rip; | ||
720 | __u32 is_write; | ||
721 | __u32 pad; | ||
722 | } tpr_access; | ||
723 | |||
724 | To be documented (KVM_TPR_ACCESS_REPORTING). | ||
725 | |||
726 | /* KVM_EXIT_S390_SIEIC */ | ||
727 | struct { | ||
728 | __u8 icptcode; | ||
729 | __u64 mask; /* psw upper half */ | ||
730 | __u64 addr; /* psw lower half */ | ||
731 | __u16 ipa; | ||
732 | __u32 ipb; | ||
733 | } s390_sieic; | ||
734 | |||
735 | s390 specific. | ||
736 | |||
737 | /* KVM_EXIT_S390_RESET */ | ||
738 | #define KVM_S390_RESET_POR 1 | ||
739 | #define KVM_S390_RESET_CLEAR 2 | ||
740 | #define KVM_S390_RESET_SUBSYSTEM 4 | ||
741 | #define KVM_S390_RESET_CPU_INIT 8 | ||
742 | #define KVM_S390_RESET_IPL 16 | ||
743 | __u64 s390_reset_flags; | ||
744 | |||
745 | s390 specific. | ||
746 | |||
747 | /* KVM_EXIT_DCR */ | ||
748 | struct { | ||
749 | __u32 dcrn; | ||
750 | __u32 data; | ||
751 | __u8 is_write; | ||
752 | } dcr; | ||
753 | |||
754 | powerpc specific. | ||
755 | |||
756 | /* Fix the size of the union. */ | ||
757 | char padding[256]; | ||
758 | }; | ||
759 | }; | ||
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index 1634c6dcecae..50189bf07d53 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX | |||
@@ -60,6 +60,8 @@ framerelay.txt | |||
60 | - info on using Frame Relay/Data Link Connection Identifier (DLCI). | 60 | - info on using Frame Relay/Data Link Connection Identifier (DLCI). |
61 | generic_netlink.txt | 61 | generic_netlink.txt |
62 | - info on Generic Netlink | 62 | - info on Generic Netlink |
63 | ieee802154.txt | ||
64 | - Linux IEEE 802.15.4 implementation, API and drivers | ||
63 | ip-sysctl.txt | 65 | ip-sysctl.txt |
64 | - /proc/sys/net/ipv4/* variables | 66 | - /proc/sys/net/ipv4/* variables |
65 | ip_dynaddr.txt | 67 | ip_dynaddr.txt |
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt index a0280ad2edc9..23c995e64032 100644 --- a/Documentation/networking/ieee802154.txt +++ b/Documentation/networking/ieee802154.txt | |||
@@ -22,7 +22,7 @@ int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0); | |||
22 | ..... | 22 | ..... |
23 | 23 | ||
24 | The address family, socket addresses etc. are defined in the | 24 | The address family, socket addresses etc. are defined in the |
25 | include/net/ieee802154/af_ieee802154.h header or in the special header | 25 | include/net/af_ieee802154.h header or in the special header |
26 | in our userspace package (see either linux-zigbee sourceforge download page | 26 | in our userspace package (see either linux-zigbee sourceforge download page |
27 | or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee). | 27 | or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee). |
28 | 28 | ||
@@ -33,7 +33,7 @@ MLME - MAC Level Management | |||
33 | ============================ | 33 | ============================ |
34 | 34 | ||
35 | Most of IEEE 802.15.4 MLME interfaces are directly mapped on netlink commands. | 35 | Most of IEEE 802.15.4 MLME interfaces are directly mapped on netlink commands. |
36 | See the include/net/ieee802154/nl802154.h header. Our userspace tools package | 36 | See the include/net/nl802154.h header. Our userspace tools package |
37 | (see above) provides CLI configuration utility for radio interfaces and simple | 37 | (see above) provides CLI configuration utility for radio interfaces and simple |
38 | coordinator for IEEE 802.15.4 networks as an example users of MLME protocol. | 38 | coordinator for IEEE 802.15.4 networks as an example users of MLME protocol. |
39 | 39 | ||
@@ -54,10 +54,14 @@ Those types of devices require different approach to be hooked into Linux kernel | |||
54 | HardMAC | 54 | HardMAC |
55 | ======= | 55 | ======= |
56 | 56 | ||
57 | See the header include/net/ieee802154/netdevice.h. You have to implement Linux | 57 | See the header include/net/ieee802154_netdev.h. You have to implement Linux |
58 | net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family | 58 | net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family |
59 | code via plain sk_buffs. The control block of sk_buffs will contain additional | 59 | code via plain sk_buffs. On skb reception skb->cb must contain additional |
60 | info as described in the struct ieee802154_mac_cb. | 60 | info as described in the struct ieee802154_mac_cb. During packet transmission |
61 | the skb->cb is used to provide additional data to device's header_ops->create | ||
62 | function. Be aware, that this data can be overriden later (when socket code | ||
63 | submits skb to qdisc), so if you need something from that cb later, you should | ||
64 | store info in the skb->data on your own. | ||
61 | 65 | ||
62 | To hook the MLME interface you have to populate the ml_priv field of your | 66 | To hook the MLME interface you have to populate the ml_priv field of your |
63 | net_device with a pointer to struct ieee802154_mlme_ops instance. All fields are | 67 | net_device with a pointer to struct ieee802154_mlme_ops instance. All fields are |
@@ -69,8 +73,8 @@ We provide an example of simple HardMAC driver at drivers/ieee802154/fakehard.c | |||
69 | SoftMAC | 73 | SoftMAC |
70 | ======= | 74 | ======= |
71 | 75 | ||
72 | We are going to provide intermediate layer impelementing IEEE 802.15.4 MAC | 76 | We are going to provide intermediate layer implementing IEEE 802.15.4 MAC |
73 | in software. This is currently WIP. | 77 | in software. This is currently WIP. |
74 | 78 | ||
75 | See header include/net/ieee802154/mac802154.h and several drivers in | 79 | See header include/net/mac802154.h and several drivers in drivers/ieee802154/. |
76 | drivers/ieee802154/ | 80 | |
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 8be76235fe67..fbe427a6580c 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt | |||
@@ -311,9 +311,12 @@ tcp_no_metrics_save - BOOLEAN | |||
311 | connections. | 311 | connections. |
312 | 312 | ||
313 | tcp_orphan_retries - INTEGER | 313 | tcp_orphan_retries - INTEGER |
314 | How may times to retry before killing TCP connection, closed | 314 | This value influences the timeout of a locally closed TCP connection, |
315 | by our side. Default value 7 corresponds to ~50sec-16min | 315 | when RTO retransmissions remain unacknowledged. |
316 | depending on RTO. If you machine is loaded WEB server, | 316 | See tcp_retries2 for more details. |
317 | |||
318 | The default value is 7. | ||
319 | If your machine is a loaded WEB server, | ||
317 | you should think about lowering this value, such sockets | 320 | you should think about lowering this value, such sockets |
318 | may consume significant resources. Cf. tcp_max_orphans. | 321 | may consume significant resources. Cf. tcp_max_orphans. |
319 | 322 | ||
@@ -327,16 +330,28 @@ tcp_retrans_collapse - BOOLEAN | |||
327 | certain TCP stacks. | 330 | certain TCP stacks. |
328 | 331 | ||
329 | tcp_retries1 - INTEGER | 332 | tcp_retries1 - INTEGER |
330 | How many times to retry before deciding that something is wrong | 333 | This value influences the time, after which TCP decides, that |
331 | and it is necessary to report this suspicion to network layer. | 334 | something is wrong due to unacknowledged RTO retransmissions, |
332 | Minimal RFC value is 3, it is default, which corresponds | 335 | and reports this suspicion to the network layer. |
333 | to ~3sec-8min depending on RTO. | 336 | See tcp_retries2 for more details. |
337 | |||
338 | RFC 1122 recommends at least 3 retransmissions, which is the | ||
339 | default. | ||
334 | 340 | ||
335 | tcp_retries2 - INTEGER | 341 | tcp_retries2 - INTEGER |
336 | How may times to retry before killing alive TCP connection. | 342 | This value influences the timeout of an alive TCP connection, |
337 | RFC1122 says that the limit should be longer than 100 sec. | 343 | when RTO retransmissions remain unacknowledged. |
338 | It is too small number. Default value 15 corresponds to ~13-30min | 344 | Given a value of N, a hypothetical TCP connection following |
339 | depending on RTO. | 345 | exponential backoff with an initial RTO of TCP_RTO_MIN would |
346 | retransmit N times before killing the connection at the (N+1)th RTO. | ||
347 | |||
348 | The default value of 15 yields a hypothetical timeout of 924.6 | ||
349 | seconds and is a lower bound for the effective timeout. | ||
350 | TCP will effectively time out at the first RTO which exceeds the | ||
351 | hypothetical timeout. | ||
352 | |||
353 | RFC 1122 recommends at least 100 seconds for the timeout, | ||
354 | which corresponds to a value of at least 8. | ||
340 | 355 | ||
341 | tcp_rfc1337 - BOOLEAN | 356 | tcp_rfc1337 - BOOLEAN |
342 | If set, the TCP stack behaves conforming to RFC1337. If unset, | 357 | If set, the TCP stack behaves conforming to RFC1337. If unset, |
@@ -1282,6 +1297,16 @@ sctp_rmem - vector of 3 INTEGERs: min, default, max | |||
1282 | sctp_wmem - vector of 3 INTEGERs: min, default, max | 1297 | sctp_wmem - vector of 3 INTEGERs: min, default, max |
1283 | See tcp_wmem for a description. | 1298 | See tcp_wmem for a description. |
1284 | 1299 | ||
1300 | addr_scope_policy - INTEGER | ||
1301 | Control IPv4 address scoping - draft-stewart-tsvwg-sctp-ipv4-00 | ||
1302 | |||
1303 | 0 - Disable IPv4 address scoping | ||
1304 | 1 - Enable IPv4 address scoping | ||
1305 | 2 - Follow draft but allow IPv4 private addresses | ||
1306 | 3 - Follow draft but allow IPv4 link local addresses | ||
1307 | |||
1308 | Default: 1 | ||
1309 | |||
1285 | 1310 | ||
1286 | /proc/sys/net/core/* | 1311 | /proc/sys/net/core/* |
1287 | dev_weight - INTEGER | 1312 | dev_weight - INTEGER |
diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt new file mode 100644 index 000000000000..f49a33b704d2 --- /dev/null +++ b/Documentation/power/runtime_pm.txt | |||
@@ -0,0 +1,378 @@ | |||
1 | Run-time Power Management Framework for I/O Devices | ||
2 | |||
3 | (C) 2009 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc. | ||
4 | |||
5 | 1. Introduction | ||
6 | |||
7 | Support for run-time power management (run-time PM) of I/O devices is provided | ||
8 | at the power management core (PM core) level by means of: | ||
9 | |||
10 | * The power management workqueue pm_wq in which bus types and device drivers can | ||
11 | put their PM-related work items. It is strongly recommended that pm_wq be | ||
12 | used for queuing all work items related to run-time PM, because this allows | ||
13 | them to be synchronized with system-wide power transitions (suspend to RAM, | ||
14 | hibernation and resume from system sleep states). pm_wq is declared in | ||
15 | include/linux/pm_runtime.h and defined in kernel/power/main.c. | ||
16 | |||
17 | * A number of run-time PM fields in the 'power' member of 'struct device' (which | ||
18 | is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can | ||
19 | be used for synchronizing run-time PM operations with one another. | ||
20 | |||
21 | * Three device run-time PM callbacks in 'struct dev_pm_ops' (defined in | ||
22 | include/linux/pm.h). | ||
23 | |||
24 | * A set of helper functions defined in drivers/base/power/runtime.c that can be | ||
25 | used for carrying out run-time PM operations in such a way that the | ||
26 | synchronization between them is taken care of by the PM core. Bus types and | ||
27 | device drivers are encouraged to use these functions. | ||
28 | |||
29 | The run-time PM callbacks present in 'struct dev_pm_ops', the device run-time PM | ||
30 | fields of 'struct dev_pm_info' and the core helper functions provided for | ||
31 | run-time PM are described below. | ||
32 | |||
33 | 2. Device Run-time PM Callbacks | ||
34 | |||
35 | There are three device run-time PM callbacks defined in 'struct dev_pm_ops': | ||
36 | |||
37 | struct dev_pm_ops { | ||
38 | ... | ||
39 | int (*runtime_suspend)(struct device *dev); | ||
40 | int (*runtime_resume)(struct device *dev); | ||
41 | void (*runtime_idle)(struct device *dev); | ||
42 | ... | ||
43 | }; | ||
44 | |||
45 | The ->runtime_suspend() callback is executed by the PM core for the bus type of | ||
46 | the device being suspended. The bus type's callback is then _entirely_ | ||
47 | _responsible_ for handling the device as appropriate, which may, but need not | ||
48 | include executing the device driver's own ->runtime_suspend() callback (from the | ||
49 | PM core's point of view it is not necessary to implement a ->runtime_suspend() | ||
50 | callback in a device driver as long as the bus type's ->runtime_suspend() knows | ||
51 | what to do to handle the device). | ||
52 | |||
53 | * Once the bus type's ->runtime_suspend() callback has completed successfully | ||
54 | for given device, the PM core regards the device as suspended, which need | ||
55 | not mean that the device has been put into a low power state. It is | ||
56 | supposed to mean, however, that the device will not process data and will | ||
57 | not communicate with the CPU(s) and RAM until its bus type's | ||
58 | ->runtime_resume() callback is executed for it. The run-time PM status of | ||
59 | a device after successful execution of its bus type's ->runtime_suspend() | ||
60 | callback is 'suspended'. | ||
61 | |||
62 | * If the bus type's ->runtime_suspend() callback returns -EBUSY or -EAGAIN, | ||
63 | the device's run-time PM status is supposed to be 'active', which means that | ||
64 | the device _must_ be fully operational afterwards. | ||
65 | |||
66 | * If the bus type's ->runtime_suspend() callback returns an error code | ||
67 | different from -EBUSY or -EAGAIN, the PM core regards this as a fatal | ||
68 | error and will refuse to run the helper functions described in Section 4 | ||
69 | for the device, until the status of it is directly set either to 'active' | ||
70 | or to 'suspended' (the PM core provides special helper functions for this | ||
71 | purpose). | ||
72 | |||
73 | In particular, if the driver requires remote wakeup capability for proper | ||
74 | functioning and device_may_wakeup() returns 'false' for the device, then | ||
75 | ->runtime_suspend() should return -EBUSY. On the other hand, if | ||
76 | device_may_wakeup() returns 'true' for the device and the device is put | ||
77 | into a low power state during the execution of its bus type's | ||
78 | ->runtime_suspend(), it is expected that remote wake-up (i.e. hardware mechanism | ||
79 | allowing the device to request a change of its power state, such as PCI PME) | ||
80 | will be enabled for the device. Generally, remote wake-up should be enabled | ||
81 | for all input devices put into a low power state at run time. | ||
82 | |||
83 | The ->runtime_resume() callback is executed by the PM core for the bus type of | ||
84 | the device being woken up. The bus type's callback is then _entirely_ | ||
85 | _responsible_ for handling the device as appropriate, which may, but need not | ||
86 | include executing the device driver's own ->runtime_resume() callback (from the | ||
87 | PM core's point of view it is not necessary to implement a ->runtime_resume() | ||
88 | callback in a device driver as long as the bus type's ->runtime_resume() knows | ||
89 | what to do to handle the device). | ||
90 | |||
91 | * Once the bus type's ->runtime_resume() callback has completed successfully, | ||
92 | the PM core regards the device as fully operational, which means that the | ||
93 | device _must_ be able to complete I/O operations as needed. The run-time | ||
94 | PM status of the device is then 'active'. | ||
95 | |||
96 | * If the bus type's ->runtime_resume() callback returns an error code, the PM | ||
97 | core regards this as a fatal error and will refuse to run the helper | ||
98 | functions described in Section 4 for the device, until its status is | ||
99 | directly set either to 'active' or to 'suspended' (the PM core provides | ||
100 | special helper functions for this purpose). | ||
101 | |||
102 | The ->runtime_idle() callback is executed by the PM core for the bus type of | ||
103 | given device whenever the device appears to be idle, which is indicated to the | ||
104 | PM core by two counters, the device's usage counter and the counter of 'active' | ||
105 | children of the device. | ||
106 | |||
107 | * If any of these counters is decreased using a helper function provided by | ||
108 | the PM core and it turns out to be equal to zero, the other counter is | ||
109 | checked. If that counter also is equal to zero, the PM core executes the | ||
110 | device bus type's ->runtime_idle() callback (with the device as an | ||
111 | argument). | ||
112 | |||
113 | The action performed by a bus type's ->runtime_idle() callback is totally | ||
114 | dependent on the bus type in question, but the expected and recommended action | ||
115 | is to check if the device can be suspended (i.e. if all of the conditions | ||
116 | necessary for suspending the device are satisfied) and to queue up a suspend | ||
117 | request for the device in that case. | ||
118 | |||
119 | The helper functions provided by the PM core, described in Section 4, guarantee | ||
120 | that the following constraints are met with respect to the bus type's run-time | ||
121 | PM callbacks: | ||
122 | |||
123 | (1) The callbacks are mutually exclusive (e.g. it is forbidden to execute | ||
124 | ->runtime_suspend() in parallel with ->runtime_resume() or with another | ||
125 | instance of ->runtime_suspend() for the same device) with the exception that | ||
126 | ->runtime_suspend() or ->runtime_resume() can be executed in parallel with | ||
127 | ->runtime_idle() (although ->runtime_idle() will not be started while any | ||
128 | of the other callbacks is being executed for the same device). | ||
129 | |||
130 | (2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active' | ||
131 | devices (i.e. the PM core will only execute ->runtime_idle() or | ||
132 | ->runtime_suspend() for the devices the run-time PM status of which is | ||
133 | 'active'). | ||
134 | |||
135 | (3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device | ||
136 | the usage counter of which is equal to zero _and_ either the counter of | ||
137 | 'active' children of which is equal to zero, or the 'power.ignore_children' | ||
138 | flag of which is set. | ||
139 | |||
140 | (4) ->runtime_resume() can only be executed for 'suspended' devices (i.e. the | ||
141 | PM core will only execute ->runtime_resume() for the devices the run-time | ||
142 | PM status of which is 'suspended'). | ||
143 | |||
144 | Additionally, the helper functions provided by the PM core obey the following | ||
145 | rules: | ||
146 | |||
147 | * If ->runtime_suspend() is about to be executed or there's a pending request | ||
148 | to execute it, ->runtime_idle() will not be executed for the same device. | ||
149 | |||
150 | * A request to execute or to schedule the execution of ->runtime_suspend() | ||
151 | will cancel any pending requests to execute ->runtime_idle() for the same | ||
152 | device. | ||
153 | |||
154 | * If ->runtime_resume() is about to be executed or there's a pending request | ||
155 | to execute it, the other callbacks will not be executed for the same device. | ||
156 | |||
157 | * A request to execute ->runtime_resume() will cancel any pending or | ||
158 | scheduled requests to execute the other callbacks for the same device. | ||
159 | |||
160 | 3. Run-time PM Device Fields | ||
161 | |||
162 | The following device run-time PM fields are present in 'struct dev_pm_info', as | ||
163 | defined in include/linux/pm.h: | ||
164 | |||
165 | struct timer_list suspend_timer; | ||
166 | - timer used for scheduling (delayed) suspend request | ||
167 | |||
168 | unsigned long timer_expires; | ||
169 | - timer expiration time, in jiffies (if this is different from zero, the | ||
170 | timer is running and will expire at that time, otherwise the timer is not | ||
171 | running) | ||
172 | |||
173 | struct work_struct work; | ||
174 | - work structure used for queuing up requests (i.e. work items in pm_wq) | ||
175 | |||
176 | wait_queue_head_t wait_queue; | ||
177 | - wait queue used if any of the helper functions needs to wait for another | ||
178 | one to complete | ||
179 | |||
180 | spinlock_t lock; | ||
181 | - lock used for synchronisation | ||
182 | |||
183 | atomic_t usage_count; | ||
184 | - the usage counter of the device | ||
185 | |||
186 | atomic_t child_count; | ||
187 | - the count of 'active' children of the device | ||
188 | |||
189 | unsigned int ignore_children; | ||
190 | - if set, the value of child_count is ignored (but still updated) | ||
191 | |||
192 | unsigned int disable_depth; | ||
193 | - used for disabling the helper funcions (they work normally if this is | ||
194 | equal to zero); the initial value of it is 1 (i.e. run-time PM is | ||
195 | initially disabled for all devices) | ||
196 | |||
197 | unsigned int runtime_error; | ||
198 | - if set, there was a fatal error (one of the callbacks returned error code | ||
199 | as described in Section 2), so the helper funtions will not work until | ||
200 | this flag is cleared; this is the error code returned by the failing | ||
201 | callback | ||
202 | |||
203 | unsigned int idle_notification; | ||
204 | - if set, ->runtime_idle() is being executed | ||
205 | |||
206 | unsigned int request_pending; | ||
207 | - if set, there's a pending request (i.e. a work item queued up into pm_wq) | ||
208 | |||
209 | enum rpm_request request; | ||
210 | - type of request that's pending (valid if request_pending is set) | ||
211 | |||
212 | unsigned int deferred_resume; | ||
213 | - set if ->runtime_resume() is about to be run while ->runtime_suspend() is | ||
214 | being executed for that device and it is not practical to wait for the | ||
215 | suspend to complete; means "start a resume as soon as you've suspended" | ||
216 | |||
217 | enum rpm_status runtime_status; | ||
218 | - the run-time PM status of the device; this field's initial value is | ||
219 | RPM_SUSPENDED, which means that each device is initially regarded by the | ||
220 | PM core as 'suspended', regardless of its real hardware status | ||
221 | |||
222 | All of the above fields are members of the 'power' member of 'struct device'. | ||
223 | |||
224 | 4. Run-time PM Device Helper Functions | ||
225 | |||
226 | The following run-time PM helper functions are defined in | ||
227 | drivers/base/power/runtime.c and include/linux/pm_runtime.h: | ||
228 | |||
229 | void pm_runtime_init(struct device *dev); | ||
230 | - initialize the device run-time PM fields in 'struct dev_pm_info' | ||
231 | |||
232 | void pm_runtime_remove(struct device *dev); | ||
233 | - make sure that the run-time PM of the device will be disabled after | ||
234 | removing the device from device hierarchy | ||
235 | |||
236 | int pm_runtime_idle(struct device *dev); | ||
237 | - execute ->runtime_idle() for the device's bus type; returns 0 on success | ||
238 | or error code on failure, where -EINPROGRESS means that ->runtime_idle() | ||
239 | is already being executed | ||
240 | |||
241 | int pm_runtime_suspend(struct device *dev); | ||
242 | - execute ->runtime_suspend() for the device's bus type; returns 0 on | ||
243 | success, 1 if the device's run-time PM status was already 'suspended', or | ||
244 | error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt | ||
245 | to suspend the device again in future | ||
246 | |||
247 | int pm_runtime_resume(struct device *dev); | ||
248 | - execute ->runtime_resume() for the device's bus type; returns 0 on | ||
249 | success, 1 if the device's run-time PM status was already 'active' or | ||
250 | error code on failure, where -EAGAIN means it may be safe to attempt to | ||
251 | resume the device again in future, but 'power.runtime_error' should be | ||
252 | checked additionally | ||
253 | |||
254 | int pm_request_idle(struct device *dev); | ||
255 | - submit a request to execute ->runtime_idle() for the device's bus type | ||
256 | (the request is represented by a work item in pm_wq); returns 0 on success | ||
257 | or error code if the request has not been queued up | ||
258 | |||
259 | int pm_schedule_suspend(struct device *dev, unsigned int delay); | ||
260 | - schedule the execution of ->runtime_suspend() for the device's bus type | ||
261 | in future, where 'delay' is the time to wait before queuing up a suspend | ||
262 | work item in pm_wq, in milliseconds (if 'delay' is zero, the work item is | ||
263 | queued up immediately); returns 0 on success, 1 if the device's PM | ||
264 | run-time status was already 'suspended', or error code if the request | ||
265 | hasn't been scheduled (or queued up if 'delay' is 0); if the execution of | ||
266 | ->runtime_suspend() is already scheduled and not yet expired, the new | ||
267 | value of 'delay' will be used as the time to wait | ||
268 | |||
269 | int pm_request_resume(struct device *dev); | ||
270 | - submit a request to execute ->runtime_resume() for the device's bus type | ||
271 | (the request is represented by a work item in pm_wq); returns 0 on | ||
272 | success, 1 if the device's run-time PM status was already 'active', or | ||
273 | error code if the request hasn't been queued up | ||
274 | |||
275 | void pm_runtime_get_noresume(struct device *dev); | ||
276 | - increment the device's usage counter | ||
277 | |||
278 | int pm_runtime_get(struct device *dev); | ||
279 | - increment the device's usage counter, run pm_request_resume(dev) and | ||
280 | return its result | ||
281 | |||
282 | int pm_runtime_get_sync(struct device *dev); | ||
283 | - increment the device's usage counter, run pm_runtime_resume(dev) and | ||
284 | return its result | ||
285 | |||
286 | void pm_runtime_put_noidle(struct device *dev); | ||
287 | - decrement the device's usage counter | ||
288 | |||
289 | int pm_runtime_put(struct device *dev); | ||
290 | - decrement the device's usage counter, run pm_request_idle(dev) and return | ||
291 | its result | ||
292 | |||
293 | int pm_runtime_put_sync(struct device *dev); | ||
294 | - decrement the device's usage counter, run pm_runtime_idle(dev) and return | ||
295 | its result | ||
296 | |||
297 | void pm_runtime_enable(struct device *dev); | ||
298 | - enable the run-time PM helper functions to run the device bus type's | ||
299 | run-time PM callbacks described in Section 2 | ||
300 | |||
301 | int pm_runtime_disable(struct device *dev); | ||
302 | - prevent the run-time PM helper functions from running the device bus | ||
303 | type's run-time PM callbacks, make sure that all of the pending run-time | ||
304 | PM operations on the device are either completed or canceled; returns | ||
305 | 1 if there was a resume request pending and it was necessary to execute | ||
306 | ->runtime_resume() for the device's bus type to satisfy that request, | ||
307 | otherwise 0 is returned | ||
308 | |||
309 | void pm_suspend_ignore_children(struct device *dev, bool enable); | ||
310 | - set/unset the power.ignore_children flag of the device | ||
311 | |||
312 | int pm_runtime_set_active(struct device *dev); | ||
313 | - clear the device's 'power.runtime_error' flag, set the device's run-time | ||
314 | PM status to 'active' and update its parent's counter of 'active' | ||
315 | children as appropriate (it is only valid to use this function if | ||
316 | 'power.runtime_error' is set or 'power.disable_depth' is greater than | ||
317 | zero); it will fail and return error code if the device has a parent | ||
318 | which is not active and the 'power.ignore_children' flag of which is unset | ||
319 | |||
320 | void pm_runtime_set_suspended(struct device *dev); | ||
321 | - clear the device's 'power.runtime_error' flag, set the device's run-time | ||
322 | PM status to 'suspended' and update its parent's counter of 'active' | ||
323 | children as appropriate (it is only valid to use this function if | ||
324 | 'power.runtime_error' is set or 'power.disable_depth' is greater than | ||
325 | zero) | ||
326 | |||
327 | It is safe to execute the following helper functions from interrupt context: | ||
328 | |||
329 | pm_request_idle() | ||
330 | pm_schedule_suspend() | ||
331 | pm_request_resume() | ||
332 | pm_runtime_get_noresume() | ||
333 | pm_runtime_get() | ||
334 | pm_runtime_put_noidle() | ||
335 | pm_runtime_put() | ||
336 | pm_suspend_ignore_children() | ||
337 | pm_runtime_set_active() | ||
338 | pm_runtime_set_suspended() | ||
339 | pm_runtime_enable() | ||
340 | |||
341 | 5. Run-time PM Initialization, Device Probing and Removal | ||
342 | |||
343 | Initially, the run-time PM is disabled for all devices, which means that the | ||
344 | majority of the run-time PM helper funtions described in Section 4 will return | ||
345 | -EAGAIN until pm_runtime_enable() is called for the device. | ||
346 | |||
347 | In addition to that, the initial run-time PM status of all devices is | ||
348 | 'suspended', but it need not reflect the actual physical state of the device. | ||
349 | Thus, if the device is initially active (i.e. it is able to process I/O), its | ||
350 | run-time PM status must be changed to 'active', with the help of | ||
351 | pm_runtime_set_active(), before pm_runtime_enable() is called for the device. | ||
352 | |||
353 | However, if the device has a parent and the parent's run-time PM is enabled, | ||
354 | calling pm_runtime_set_active() for the device will affect the parent, unless | ||
355 | the parent's 'power.ignore_children' flag is set. Namely, in that case the | ||
356 | parent won't be able to suspend at run time, using the PM core's helper | ||
357 | functions, as long as the child's status is 'active', even if the child's | ||
358 | run-time PM is still disabled (i.e. pm_runtime_enable() hasn't been called for | ||
359 | the child yet or pm_runtime_disable() has been called for it). For this reason, | ||
360 | once pm_runtime_set_active() has been called for the device, pm_runtime_enable() | ||
361 | should be called for it too as soon as reasonably possible or its run-time PM | ||
362 | status should be changed back to 'suspended' with the help of | ||
363 | pm_runtime_set_suspended(). | ||
364 | |||
365 | If the default initial run-time PM status of the device (i.e. 'suspended') | ||
366 | reflects the actual state of the device, its bus type's or its driver's | ||
367 | ->probe() callback will likely need to wake it up using one of the PM core's | ||
368 | helper functions described in Section 4. In that case, pm_runtime_resume() | ||
369 | should be used. Of course, for this purpose the device's run-time PM has to be | ||
370 | enabled earlier by calling pm_runtime_enable(). | ||
371 | |||
372 | If the device bus type's or driver's ->probe() or ->remove() callback runs | ||
373 | pm_runtime_suspend() or pm_runtime_idle() or their asynchronous counterparts, | ||
374 | they will fail returning -EAGAIN, because the device's usage counter is | ||
375 | incremented by the core before executing ->probe() and ->remove(). Still, it | ||
376 | may be desirable to suspend the device as soon as ->probe() or ->remove() has | ||
377 | finished, so the PM core uses pm_runtime_idle_sync() to invoke the device bus | ||
378 | type's ->runtime_idle() callback at that time. | ||
diff --git a/Documentation/s390/s390dbf.txt b/Documentation/s390/s390dbf.txt index 2d10053dd97e..ae66f9b90a25 100644 --- a/Documentation/s390/s390dbf.txt +++ b/Documentation/s390/s390dbf.txt | |||
@@ -495,6 +495,13 @@ and for each vararg a long value. So e.g. for a debug entry with a format | |||
495 | string plus two varargs one would need to allocate a (3 * sizeof(long)) | 495 | string plus two varargs one would need to allocate a (3 * sizeof(long)) |
496 | byte data area in the debug_register() function. | 496 | byte data area in the debug_register() function. |
497 | 497 | ||
498 | IMPORTANT: Using "%s" in sprintf event functions is dangerous. You can only | ||
499 | use "%s" in the sprintf event functions, if the memory for the passed string is | ||
500 | available as long as the debug feature exists. The reason behind this is that | ||
501 | due to performance considerations only a pointer to the string is stored in | ||
502 | the debug feature. If you log a string that is freed afterwards, you will get | ||
503 | an OOPS when inspecting the debug feature, because then the debug feature will | ||
504 | access the already freed memory. | ||
498 | 505 | ||
499 | NOTE: If using the sprintf view do NOT use other event/exception functions | 506 | NOTE: If using the sprintf view do NOT use other event/exception functions |
500 | than the sprintf-event and -exception functions. | 507 | than the sprintf-event and -exception functions. |
diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 4252697a95d6..1c8eb4518ce0 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt | |||
@@ -60,6 +60,12 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. | |||
60 | slots - Reserve the slot index for the given driver. | 60 | slots - Reserve the slot index for the given driver. |
61 | This option takes multiple strings. | 61 | This option takes multiple strings. |
62 | See "Module Autoloading Support" section for details. | 62 | See "Module Autoloading Support" section for details. |
63 | debug - Specifies the debug message level | ||
64 | (0 = disable debug prints, 1 = normal debug messages, | ||
65 | 2 = verbose debug messages) | ||
66 | This option appears only when CONFIG_SND_DEBUG=y. | ||
67 | This option can be dynamically changed via sysfs | ||
68 | /sys/modules/snd/parameters/debug file. | ||
63 | 69 | ||
64 | Module snd-pcm-oss | 70 | Module snd-pcm-oss |
65 | ------------------ | 71 | ------------------ |
@@ -513,6 +519,26 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. | |||
513 | or input, but you may use this module for any application which | 519 | or input, but you may use this module for any application which |
514 | requires a sound card (like RealPlayer). | 520 | requires a sound card (like RealPlayer). |
515 | 521 | ||
522 | pcm_devs - Number of PCM devices assigned to each card | ||
523 | (default = 1, up to 4) | ||
524 | pcm_substreams - Number of PCM substreams assigned to each PCM | ||
525 | (default = 8, up to 16) | ||
526 | hrtimer - Use hrtimer (=1, default) or system timer (=0) | ||
527 | fake_buffer - Fake buffer allocations (default = 1) | ||
528 | |||
529 | When multiple PCM devices are created, snd-dummy gives different | ||
530 | behavior to each PCM device: | ||
531 | 0 = interleaved with mmap support | ||
532 | 1 = non-interleaved with mmap support | ||
533 | 2 = interleaved without mmap | ||
534 | 3 = non-interleaved without mmap | ||
535 | |||
536 | As default, snd-dummy drivers doesn't allocate the real buffers | ||
537 | but either ignores read/write or mmap a single dummy page to all | ||
538 | buffer pages, in order to save the resouces. If your apps need | ||
539 | the read/ written buffer data to be consistent, pass fake_buffer=0 | ||
540 | option. | ||
541 | |||
516 | The power-management is supported. | 542 | The power-management is supported. |
517 | 543 | ||
518 | Module snd-echo3g | 544 | Module snd-echo3g |
@@ -768,6 +794,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. | |||
768 | bdl_pos_adj - Specifies the DMA IRQ timing delay in samples. | 794 | bdl_pos_adj - Specifies the DMA IRQ timing delay in samples. |
769 | Passing -1 will make the driver to choose the appropriate | 795 | Passing -1 will make the driver to choose the appropriate |
770 | value based on the controller chip. | 796 | value based on the controller chip. |
797 | patch - Specifies the early "patch" files to modify the HD-audio | ||
798 | setup before initializing the codecs. This option is | ||
799 | available only when CONFIG_SND_HDA_PATCH_LOADER=y is set. | ||
800 | See HD-Audio.txt for details. | ||
771 | 801 | ||
772 | [Single (global) options] | 802 | [Single (global) options] |
773 | single_cmd - Use single immediate commands to communicate with | 803 | single_cmd - Use single immediate commands to communicate with |
diff --git a/Documentation/sound/alsa/HD-Audio-Models.txt b/Documentation/sound/alsa/HD-Audio-Models.txt index 939a3dd58148..97eebd63bedc 100644 --- a/Documentation/sound/alsa/HD-Audio-Models.txt +++ b/Documentation/sound/alsa/HD-Audio-Models.txt | |||
@@ -114,8 +114,8 @@ ALC662/663/272 | |||
114 | samsung-nc10 Samsung NC10 mini notebook | 114 | samsung-nc10 Samsung NC10 mini notebook |
115 | auto auto-config reading BIOS (default) | 115 | auto auto-config reading BIOS (default) |
116 | 116 | ||
117 | ALC882/885 | 117 | ALC882/883/885/888/889 |
118 | ========== | 118 | ====================== |
119 | 3stack-dig 3-jack with SPDIF I/O | 119 | 3stack-dig 3-jack with SPDIF I/O |
120 | 6stack-dig 6-jack digital with SPDIF I/O | 120 | 6stack-dig 6-jack digital with SPDIF I/O |
121 | arima Arima W820Di1 | 121 | arima Arima W820Di1 |
@@ -127,12 +127,8 @@ ALC882/885 | |||
127 | mbp3 Macbook Pro rev3 | 127 | mbp3 Macbook Pro rev3 |
128 | imac24 iMac 24'' with jack detection | 128 | imac24 iMac 24'' with jack detection |
129 | w2jc ASUS W2JC | 129 | w2jc ASUS W2JC |
130 | auto auto-config reading BIOS (default) | 130 | 3stack-2ch-dig 3-jack with SPDIF I/O (ALC883) |
131 | 131 | alc883-6stack-dig 6-jack digital with SPDIF I/O (ALC883) | |
132 | ALC883/888 | ||
133 | ========== | ||
134 | 3stack-dig 3-jack with SPDIF I/O | ||
135 | 6stack-dig 6-jack digital with SPDIF I/O | ||
136 | 3stack-6ch 3-jack 6-channel | 132 | 3stack-6ch 3-jack 6-channel |
137 | 3stack-6ch-dig 3-jack 6-channel with SPDIF I/O | 133 | 3stack-6ch-dig 3-jack 6-channel with SPDIF I/O |
138 | 6stack-dig-demo 6-jack digital for Intel demo board | 134 | 6stack-dig-demo 6-jack digital for Intel demo board |
@@ -140,6 +136,7 @@ ALC883/888 | |||
140 | acer-aspire Acer Aspire 9810 | 136 | acer-aspire Acer Aspire 9810 |
141 | acer-aspire-4930g Acer Aspire 4930G | 137 | acer-aspire-4930g Acer Aspire 4930G |
142 | acer-aspire-6530g Acer Aspire 6530G | 138 | acer-aspire-6530g Acer Aspire 6530G |
139 | acer-aspire-7730g Acer Aspire 7730G | ||
143 | acer-aspire-8930g Acer Aspire 8930G | 140 | acer-aspire-8930g Acer Aspire 8930G |
144 | medion Medion Laptops | 141 | medion Medion Laptops |
145 | medion-md2 Medion MD2 | 142 | medion-md2 Medion MD2 |
@@ -155,10 +152,13 @@ ALC883/888 | |||
155 | 3stack-hp HP machines with 3stack (Lucknow, Samba boards) | 152 | 3stack-hp HP machines with 3stack (Lucknow, Samba boards) |
156 | 6stack-dell Dell machines with 6stack (Inspiron 530) | 153 | 6stack-dell Dell machines with 6stack (Inspiron 530) |
157 | mitac Mitac 8252D | 154 | mitac Mitac 8252D |
155 | clevo-m540r Clevo M540R (6ch + digital) | ||
158 | clevo-m720 Clevo M720 laptop series | 156 | clevo-m720 Clevo M720 laptop series |
159 | fujitsu-pi2515 Fujitsu AMILO Pi2515 | 157 | fujitsu-pi2515 Fujitsu AMILO Pi2515 |
160 | fujitsu-xa3530 Fujitsu AMILO XA3530 | 158 | fujitsu-xa3530 Fujitsu AMILO XA3530 |
161 | 3stack-6ch-intel Intel DG33* boards | 159 | 3stack-6ch-intel Intel DG33* boards |
160 | intel-alc889a Intel IbexPeak with ALC889A | ||
161 | intel-x58 Intel DX58 with ALC889 | ||
162 | asus-p5q ASUS P5Q-EM boards | 162 | asus-p5q ASUS P5Q-EM boards |
163 | mb31 MacBook 3,1 | 163 | mb31 MacBook 3,1 |
164 | sony-vaio-tt Sony VAIO TT | 164 | sony-vaio-tt Sony VAIO TT |
@@ -229,7 +229,7 @@ AD1984 | |||
229 | ====== | 229 | ====== |
230 | basic default configuration | 230 | basic default configuration |
231 | thinkpad Lenovo Thinkpad T61/X61 | 231 | thinkpad Lenovo Thinkpad T61/X61 |
232 | dell Dell T3400 | 232 | dell_desktop Dell T3400 |
233 | 233 | ||
234 | AD1986A | 234 | AD1986A |
235 | ======= | 235 | ======= |
@@ -258,6 +258,7 @@ Conexant 5045 | |||
258 | laptop-micsense Laptop with Mic sense (old model fujitsu) | 258 | laptop-micsense Laptop with Mic sense (old model fujitsu) |
259 | laptop-hpmicsense Laptop with HP and Mic senses | 259 | laptop-hpmicsense Laptop with HP and Mic senses |
260 | benq Benq R55E | 260 | benq Benq R55E |
261 | laptop-hp530 HP 530 laptop | ||
261 | test for testing/debugging purpose, almost all controls | 262 | test for testing/debugging purpose, almost all controls |
262 | can be adjusted. Appearing only when compiled with | 263 | can be adjusted. Appearing only when compiled with |
263 | $CONFIG_SND_DEBUG=y | 264 | $CONFIG_SND_DEBUG=y |
@@ -278,9 +279,16 @@ Conexant 5051 | |||
278 | hp-dv6736 HP dv6736 | 279 | hp-dv6736 HP dv6736 |
279 | lenovo-x200 Lenovo X200 laptop | 280 | lenovo-x200 Lenovo X200 laptop |
280 | 281 | ||
282 | Conexant 5066 | ||
283 | ============= | ||
284 | laptop Basic Laptop config (default) | ||
285 | dell-laptop Dell laptops | ||
286 | olpc-xo-1_5 OLPC XO 1.5 | ||
287 | |||
281 | STAC9200 | 288 | STAC9200 |
282 | ======== | 289 | ======== |
283 | ref Reference board | 290 | ref Reference board |
291 | oqo OQO Model 2 | ||
284 | dell-d21 Dell (unknown) | 292 | dell-d21 Dell (unknown) |
285 | dell-d22 Dell (unknown) | 293 | dell-d22 Dell (unknown) |
286 | dell-d23 Dell (unknown) | 294 | dell-d23 Dell (unknown) |
@@ -368,10 +376,12 @@ STAC92HD73* | |||
368 | =========== | 376 | =========== |
369 | ref Reference board | 377 | ref Reference board |
370 | no-jd BIOS setup but without jack-detection | 378 | no-jd BIOS setup but without jack-detection |
379 | intel Intel DG45* mobos | ||
371 | dell-m6-amic Dell desktops/laptops with analog mics | 380 | dell-m6-amic Dell desktops/laptops with analog mics |
372 | dell-m6-dmic Dell desktops/laptops with digital mics | 381 | dell-m6-dmic Dell desktops/laptops with digital mics |
373 | dell-m6 Dell desktops/laptops with both type of mics | 382 | dell-m6 Dell desktops/laptops with both type of mics |
374 | dell-eq Dell desktops/laptops | 383 | dell-eq Dell desktops/laptops |
384 | alienware Alienware M17x | ||
375 | auto BIOS setup (default) | 385 | auto BIOS setup (default) |
376 | 386 | ||
377 | STAC92HD83* | 387 | STAC92HD83* |
@@ -385,3 +395,8 @@ STAC9872 | |||
385 | ======== | 395 | ======== |
386 | vaio VAIO laptop without SPDIF | 396 | vaio VAIO laptop without SPDIF |
387 | auto BIOS setup (default) | 397 | auto BIOS setup (default) |
398 | |||
399 | Cirrus Logic CS4206/4207 | ||
400 | ======================== | ||
401 | mbp55 MacBook Pro 5,5 | ||
402 | auto BIOS setup (default) | ||
diff --git a/Documentation/sound/alsa/HD-Audio.txt b/Documentation/sound/alsa/HD-Audio.txt index 71ac995b1915..7b8a5f947d1d 100644 --- a/Documentation/sound/alsa/HD-Audio.txt +++ b/Documentation/sound/alsa/HD-Audio.txt | |||
@@ -139,6 +139,10 @@ The driver checks PCI SSID and looks through the static configuration | |||
139 | table until any matching entry is found. If you have a new machine, | 139 | table until any matching entry is found. If you have a new machine, |
140 | you may see a message like below: | 140 | you may see a message like below: |
141 | ------------------------------------------------------------------------ | 141 | ------------------------------------------------------------------------ |
142 | hda_codec: ALC880: BIOS auto-probing. | ||
143 | ------------------------------------------------------------------------ | ||
144 | Meanwhile, in the earlier versions, you would see a message like: | ||
145 | ------------------------------------------------------------------------ | ||
142 | hda_codec: Unknown model for ALC880, trying auto-probe from BIOS... | 146 | hda_codec: Unknown model for ALC880, trying auto-probe from BIOS... |
143 | ------------------------------------------------------------------------ | 147 | ------------------------------------------------------------------------ |
144 | Even if you see such a message, DON'T PANIC. Take a deep breath and | 148 | Even if you see such a message, DON'T PANIC. Take a deep breath and |
@@ -403,6 +407,66 @@ re-configure based on that state, run like below: | |||
403 | ------------------------------------------------------------------------ | 407 | ------------------------------------------------------------------------ |
404 | 408 | ||
405 | 409 | ||
410 | Early Patching | ||
411 | ~~~~~~~~~~~~~~ | ||
412 | When CONFIG_SND_HDA_PATCH_LOADER=y is set, you can pass a "patch" as a | ||
413 | firmware file for modifying the HD-audio setup before initializing the | ||
414 | codec. This can work basically like the reconfiguration via sysfs in | ||
415 | the above, but it does it before the first codec configuration. | ||
416 | |||
417 | A patch file is a plain text file which looks like below: | ||
418 | |||
419 | ------------------------------------------------------------------------ | ||
420 | [codec] | ||
421 | 0x12345678 0xabcd1234 2 | ||
422 | |||
423 | [model] | ||
424 | auto | ||
425 | |||
426 | [pincfg] | ||
427 | 0x12 0x411111f0 | ||
428 | |||
429 | [verb] | ||
430 | 0x20 0x500 0x03 | ||
431 | 0x20 0x400 0xff | ||
432 | |||
433 | [hint] | ||
434 | hp_detect = yes | ||
435 | ------------------------------------------------------------------------ | ||
436 | |||
437 | The file needs to have a line `[codec]`. The next line should contain | ||
438 | three numbers indicating the codec vendor-id (0x12345678 in the | ||
439 | example), the codec subsystem-id (0xabcd1234) and the address (2) of | ||
440 | the codec. The rest patch entries are applied to this specified codec | ||
441 | until another codec entry is given. | ||
442 | |||
443 | The `[model]` line allows to change the model name of the each codec. | ||
444 | In the example above, it will be changed to model=auto. | ||
445 | Note that this overrides the module option. | ||
446 | |||
447 | After the `[pincfg]` line, the contents are parsed as the initial | ||
448 | default pin-configurations just like `user_pin_configs` sysfs above. | ||
449 | The values can be shown in user_pin_configs sysfs file, too. | ||
450 | |||
451 | Similarly, the lines after `[verb]` are parsed as `init_verbs` | ||
452 | sysfs entries, and the lines after `[hint]` are parsed as `hints` | ||
453 | sysfs entries, respectively. | ||
454 | |||
455 | The hd-audio driver reads the file via request_firmware(). Thus, | ||
456 | a patch file has to be located on the appropriate firmware path, | ||
457 | typically, /lib/firmware. For example, when you pass the option | ||
458 | `patch=hda-init.fw`, the file /lib/firmware/hda-init-fw must be | ||
459 | present. | ||
460 | |||
461 | The patch module option is specific to each card instance, and you | ||
462 | need to give one file name for each instance, separated by commas. | ||
463 | For example, if you have two cards, one for an on-board analog and one | ||
464 | for an HDMI video board, you may pass patch option like below: | ||
465 | ------------------------------------------------------------------------ | ||
466 | options snd-hda-intel patch=on-board-patch,hdmi-patch | ||
467 | ------------------------------------------------------------------------ | ||
468 | |||
469 | |||
406 | Power-Saving | 470 | Power-Saving |
407 | ~~~~~~~~~~~~ | 471 | ~~~~~~~~~~~~ |
408 | The power-saving is a kind of auto-suspend of the device. When the | 472 | The power-saving is a kind of auto-suspend of the device. When the |
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index 322a00bb99d9..2dbff53369d0 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt | |||
@@ -19,6 +19,7 @@ Currently, these files might (depending on your configuration) | |||
19 | show up in /proc/sys/kernel: | 19 | show up in /proc/sys/kernel: |
20 | - acpi_video_flags | 20 | - acpi_video_flags |
21 | - acct | 21 | - acct |
22 | - callhome [ S390 only ] | ||
22 | - auto_msgmni | 23 | - auto_msgmni |
23 | - core_pattern | 24 | - core_pattern |
24 | - core_uses_pid | 25 | - core_uses_pid |
@@ -91,6 +92,21 @@ valid for 30 seconds. | |||
91 | 92 | ||
92 | ============================================================== | 93 | ============================================================== |
93 | 94 | ||
95 | callhome: | ||
96 | |||
97 | Controls the kernel's callhome behavior in case of a kernel panic. | ||
98 | |||
99 | The s390 hardware allows an operating system to send a notification | ||
100 | to a service organization (callhome) in case of an operating system panic. | ||
101 | |||
102 | When the value in this file is 0 (which is the default behavior) | ||
103 | nothing happens in case of a kernel panic. If this value is set to "1" | ||
104 | the complete kernel oops message is send to the IBM customer service | ||
105 | organization in case the mainframe the Linux operating system is running | ||
106 | on has a service contract with IBM. | ||
107 | |||
108 | ============================================================== | ||
109 | |||
94 | core_pattern: | 110 | core_pattern: |
95 | 111 | ||
96 | core_pattern is used to specify a core dumpfile pattern name. | 112 | core_pattern is used to specify a core dumpfile pattern name. |
diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt index f157d7594ea7..2bcc8d4dea29 100644 --- a/Documentation/trace/events.txt +++ b/Documentation/trace/events.txt | |||
@@ -83,6 +83,15 @@ When reading one of these enable files, there are four results: | |||
83 | X - there is a mixture of events enabled and disabled | 83 | X - there is a mixture of events enabled and disabled |
84 | ? - this file does not affect any event | 84 | ? - this file does not affect any event |
85 | 85 | ||
86 | 2.3 Boot option | ||
87 | --------------- | ||
88 | |||
89 | In order to facilitate early boot debugging, use boot option: | ||
90 | |||
91 | trace_event=[event-list] | ||
92 | |||
93 | The format of this boot option is the same as described in section 2.1. | ||
94 | |||
86 | 3. Defining an event-enabled tracepoint | 95 | 3. Defining an event-enabled tracepoint |
87 | ======================================= | 96 | ======================================= |
88 | 97 | ||
diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index a39b3c749de5..355d0f1f8c50 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt | |||
@@ -85,26 +85,19 @@ of ftrace. Here is a list of some of the key files: | |||
85 | This file holds the output of the trace in a human | 85 | This file holds the output of the trace in a human |
86 | readable format (described below). | 86 | readable format (described below). |
87 | 87 | ||
88 | latency_trace: | ||
89 | |||
90 | This file shows the same trace but the information | ||
91 | is organized more to display possible latencies | ||
92 | in the system (described below). | ||
93 | |||
94 | trace_pipe: | 88 | trace_pipe: |
95 | 89 | ||
96 | The output is the same as the "trace" file but this | 90 | The output is the same as the "trace" file but this |
97 | file is meant to be streamed with live tracing. | 91 | file is meant to be streamed with live tracing. |
98 | Reads from this file will block until new data | 92 | Reads from this file will block until new data is |
99 | is retrieved. Unlike the "trace" and "latency_trace" | 93 | retrieved. Unlike the "trace" file, this file is a |
100 | files, this file is a consumer. This means reading | 94 | consumer. This means reading from this file causes |
101 | from this file causes sequential reads to display | 95 | sequential reads to display more current data. Once |
102 | more current data. Once data is read from this | 96 | data is read from this file, it is consumed, and |
103 | file, it is consumed, and will not be read | 97 | will not be read again with a sequential read. The |
104 | again with a sequential read. The "trace" and | 98 | "trace" file is static, and if the tracer is not |
105 | "latency_trace" files are static, and if the | 99 | adding more data,they will display the same |
106 | tracer is not adding more data, they will display | 100 | information every time they are read. |
107 | the same information every time they are read. | ||
108 | 101 | ||
109 | trace_options: | 102 | trace_options: |
110 | 103 | ||
@@ -117,10 +110,10 @@ of ftrace. Here is a list of some of the key files: | |||
117 | Some of the tracers record the max latency. | 110 | Some of the tracers record the max latency. |
118 | For example, the time interrupts are disabled. | 111 | For example, the time interrupts are disabled. |
119 | This time is saved in this file. The max trace | 112 | This time is saved in this file. The max trace |
120 | will also be stored, and displayed by either | 113 | will also be stored, and displayed by "trace". |
121 | "trace" or "latency_trace". A new max trace will | 114 | A new max trace will only be recorded if the |
122 | only be recorded if the latency is greater than | 115 | latency is greater than the value in this |
123 | the value in this file. (in microseconds) | 116 | file. (in microseconds) |
124 | 117 | ||
125 | buffer_size_kb: | 118 | buffer_size_kb: |
126 | 119 | ||
@@ -210,7 +203,7 @@ Here is the list of current tracers that may be configured. | |||
210 | the trace with the longest max latency. | 203 | the trace with the longest max latency. |
211 | See tracing_max_latency. When a new max is recorded, | 204 | See tracing_max_latency. When a new max is recorded, |
212 | it replaces the old trace. It is best to view this | 205 | it replaces the old trace. It is best to view this |
213 | trace via the latency_trace file. | 206 | trace with the latency-format option enabled. |
214 | 207 | ||
215 | "preemptoff" | 208 | "preemptoff" |
216 | 209 | ||
@@ -307,8 +300,8 @@ the lowest priority thread (pid 0). | |||
307 | Latency trace format | 300 | Latency trace format |
308 | -------------------- | 301 | -------------------- |
309 | 302 | ||
310 | For traces that display latency times, the latency_trace file | 303 | When the latency-format option is enabled, the trace file gives |
311 | gives somewhat more information to see why a latency happened. | 304 | somewhat more information to see why a latency happened. |
312 | Here is a typical trace. | 305 | Here is a typical trace. |
313 | 306 | ||
314 | # tracer: irqsoff | 307 | # tracer: irqsoff |
@@ -380,9 +373,10 @@ explains which is which. | |||
380 | 373 | ||
381 | The above is mostly meaningful for kernel developers. | 374 | The above is mostly meaningful for kernel developers. |
382 | 375 | ||
383 | time: This differs from the trace file output. The trace file output | 376 | time: When the latency-format option is enabled, the trace file |
384 | includes an absolute timestamp. The timestamp used by the | 377 | output includes a timestamp relative to the start of the |
385 | latency_trace file is relative to the start of the trace. | 378 | trace. This differs from the output when latency-format |
379 | is disabled, which includes an absolute timestamp. | ||
386 | 380 | ||
387 | delay: This is just to help catch your eye a bit better. And | 381 | delay: This is just to help catch your eye a bit better. And |
388 | needs to be fixed to be only relative to the same CPU. | 382 | needs to be fixed to be only relative to the same CPU. |
@@ -440,7 +434,8 @@ Here are the available options: | |||
440 | sym-addr: | 434 | sym-addr: |
441 | bash-4000 [01] 1477.606694: simple_strtoul <c0339346> | 435 | bash-4000 [01] 1477.606694: simple_strtoul <c0339346> |
442 | 436 | ||
443 | verbose - This deals with the latency_trace file. | 437 | verbose - This deals with the trace file when the |
438 | latency-format option is enabled. | ||
444 | 439 | ||
445 | bash 4000 1 0 00000000 00010a95 [58127d26] 1720.415ms \ | 440 | bash 4000 1 0 00000000 00010a95 [58127d26] 1720.415ms \ |
446 | (+0.000ms): simple_strtoul (strict_strtoul) | 441 | (+0.000ms): simple_strtoul (strict_strtoul) |
@@ -472,7 +467,7 @@ Here are the available options: | |||
472 | the app is no longer running | 467 | the app is no longer running |
473 | 468 | ||
474 | The lookup is performed when you read | 469 | The lookup is performed when you read |
475 | trace,trace_pipe,latency_trace. Example: | 470 | trace,trace_pipe. Example: |
476 | 471 | ||
477 | a.out-1623 [000] 40874.465068: /root/a.out[+0x480] <-/root/a.out[+0 | 472 | a.out-1623 [000] 40874.465068: /root/a.out[+0x480] <-/root/a.out[+0 |
478 | x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6] | 473 | x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6] |
@@ -481,6 +476,11 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6] | |||
481 | every scheduling event. Will add overhead if | 476 | every scheduling event. Will add overhead if |
482 | there's a lot of tasks running at once. | 477 | there's a lot of tasks running at once. |
483 | 478 | ||
479 | latency-format - This option changes the trace. When | ||
480 | it is enabled, the trace displays | ||
481 | additional information about the | ||
482 | latencies, as described in "Latency | ||
483 | trace format". | ||
484 | 484 | ||
485 | sched_switch | 485 | sched_switch |
486 | ------------ | 486 | ------------ |
@@ -596,12 +596,13 @@ To reset the maximum, echo 0 into tracing_max_latency. Here is | |||
596 | an example: | 596 | an example: |
597 | 597 | ||
598 | # echo irqsoff > current_tracer | 598 | # echo irqsoff > current_tracer |
599 | # echo latency-format > trace_options | ||
599 | # echo 0 > tracing_max_latency | 600 | # echo 0 > tracing_max_latency |
600 | # echo 1 > tracing_enabled | 601 | # echo 1 > tracing_enabled |
601 | # ls -ltr | 602 | # ls -ltr |
602 | [...] | 603 | [...] |
603 | # echo 0 > tracing_enabled | 604 | # echo 0 > tracing_enabled |
604 | # cat latency_trace | 605 | # cat trace |
605 | # tracer: irqsoff | 606 | # tracer: irqsoff |
606 | # | 607 | # |
607 | irqsoff latency trace v1.1.5 on 2.6.26 | 608 | irqsoff latency trace v1.1.5 on 2.6.26 |
@@ -703,12 +704,13 @@ which preemption was disabled. The control of preemptoff tracer | |||
703 | is much like the irqsoff tracer. | 704 | is much like the irqsoff tracer. |
704 | 705 | ||
705 | # echo preemptoff > current_tracer | 706 | # echo preemptoff > current_tracer |
707 | # echo latency-format > trace_options | ||
706 | # echo 0 > tracing_max_latency | 708 | # echo 0 > tracing_max_latency |
707 | # echo 1 > tracing_enabled | 709 | # echo 1 > tracing_enabled |
708 | # ls -ltr | 710 | # ls -ltr |
709 | [...] | 711 | [...] |
710 | # echo 0 > tracing_enabled | 712 | # echo 0 > tracing_enabled |
711 | # cat latency_trace | 713 | # cat trace |
712 | # tracer: preemptoff | 714 | # tracer: preemptoff |
713 | # | 715 | # |
714 | preemptoff latency trace v1.1.5 on 2.6.26-rc8 | 716 | preemptoff latency trace v1.1.5 on 2.6.26-rc8 |
@@ -850,12 +852,13 @@ Again, using this trace is much like the irqsoff and preemptoff | |||
850 | tracers. | 852 | tracers. |
851 | 853 | ||
852 | # echo preemptirqsoff > current_tracer | 854 | # echo preemptirqsoff > current_tracer |
855 | # echo latency-format > trace_options | ||
853 | # echo 0 > tracing_max_latency | 856 | # echo 0 > tracing_max_latency |
854 | # echo 1 > tracing_enabled | 857 | # echo 1 > tracing_enabled |
855 | # ls -ltr | 858 | # ls -ltr |
856 | [...] | 859 | [...] |
857 | # echo 0 > tracing_enabled | 860 | # echo 0 > tracing_enabled |
858 | # cat latency_trace | 861 | # cat trace |
859 | # tracer: preemptirqsoff | 862 | # tracer: preemptirqsoff |
860 | # | 863 | # |
861 | preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8 | 864 | preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8 |
@@ -1012,11 +1015,12 @@ Instead of performing an 'ls', we will run 'sleep 1' under | |||
1012 | 'chrt' which changes the priority of the task. | 1015 | 'chrt' which changes the priority of the task. |
1013 | 1016 | ||
1014 | # echo wakeup > current_tracer | 1017 | # echo wakeup > current_tracer |
1018 | # echo latency-format > trace_options | ||
1015 | # echo 0 > tracing_max_latency | 1019 | # echo 0 > tracing_max_latency |
1016 | # echo 1 > tracing_enabled | 1020 | # echo 1 > tracing_enabled |
1017 | # chrt -f 5 sleep 1 | 1021 | # chrt -f 5 sleep 1 |
1018 | # echo 0 > tracing_enabled | 1022 | # echo 0 > tracing_enabled |
1019 | # cat latency_trace | 1023 | # cat trace |
1020 | # tracer: wakeup | 1024 | # tracer: wakeup |
1021 | # | 1025 | # |
1022 | wakeup latency trace v1.1.5 on 2.6.26-rc8 | 1026 | wakeup latency trace v1.1.5 on 2.6.26-rc8 |
diff --git a/Documentation/trace/function-graph-fold.vim b/Documentation/trace/function-graph-fold.vim new file mode 100644 index 000000000000..0544b504c8b0 --- /dev/null +++ b/Documentation/trace/function-graph-fold.vim | |||
@@ -0,0 +1,42 @@ | |||
1 | " Enable folding for ftrace function_graph traces. | ||
2 | " | ||
3 | " To use, :source this file while viewing a function_graph trace, or use vim's | ||
4 | " -S option to load from the command-line together with a trace. You can then | ||
5 | " use the usual vim fold commands, such as "za", to open and close nested | ||
6 | " functions. While closed, a fold will show the total time taken for a call, | ||
7 | " as would normally appear on the line with the closing brace. Folded | ||
8 | " functions will not include finish_task_switch(), so folding should remain | ||
9 | " relatively sane even through a context switch. | ||
10 | " | ||
11 | " Note that this will almost certainly only work well with a | ||
12 | " single-CPU trace (e.g. trace-cmd report --cpu 1). | ||
13 | |||
14 | function! FunctionGraphFoldExpr(lnum) | ||
15 | let line = getline(a:lnum) | ||
16 | if line[-1:] == '{' | ||
17 | if line =~ 'finish_task_switch() {$' | ||
18 | return '>1' | ||
19 | endif | ||
20 | return 'a1' | ||
21 | elseif line[-1:] == '}' | ||
22 | return 's1' | ||
23 | else | ||
24 | return '=' | ||
25 | endif | ||
26 | endfunction | ||
27 | |||
28 | function! FunctionGraphFoldText() | ||
29 | let s = split(getline(v:foldstart), '|', 1) | ||
30 | if getline(v:foldend+1) =~ 'finish_task_switch() {$' | ||
31 | let s[2] = ' task switch ' | ||
32 | else | ||
33 | let e = split(getline(v:foldend), '|', 1) | ||
34 | let s[2] = e[2] | ||
35 | endif | ||
36 | return join(s, '|') | ||
37 | endfunction | ||
38 | |||
39 | setlocal foldexpr=FunctionGraphFoldExpr(v:lnum) | ||
40 | setlocal foldtext=FunctionGraphFoldText() | ||
41 | setlocal foldcolumn=12 | ||
42 | setlocal foldmethod=expr | ||
diff --git a/Documentation/trace/ring-buffer-design.txt b/Documentation/trace/ring-buffer-design.txt new file mode 100644 index 000000000000..5b1d23d604c5 --- /dev/null +++ b/Documentation/trace/ring-buffer-design.txt | |||
@@ -0,0 +1,955 @@ | |||
1 | Lockless Ring Buffer Design | ||
2 | =========================== | ||
3 | |||
4 | Copyright 2009 Red Hat Inc. | ||
5 | Author: Steven Rostedt <srostedt@redhat.com> | ||
6 | License: The GNU Free Documentation License, Version 1.2 | ||
7 | (dual licensed under the GPL v2) | ||
8 | Reviewers: Mathieu Desnoyers, Huang Ying, Hidetoshi Seto, | ||
9 | and Frederic Weisbecker. | ||
10 | |||
11 | |||
12 | Written for: 2.6.31 | ||
13 | |||
14 | Terminology used in this Document | ||
15 | --------------------------------- | ||
16 | |||
17 | tail - where new writes happen in the ring buffer. | ||
18 | |||
19 | head - where new reads happen in the ring buffer. | ||
20 | |||
21 | producer - the task that writes into the ring buffer (same as writer) | ||
22 | |||
23 | writer - same as producer | ||
24 | |||
25 | consumer - the task that reads from the buffer (same as reader) | ||
26 | |||
27 | reader - same as consumer. | ||
28 | |||
29 | reader_page - A page outside the ring buffer used solely (for the most part) | ||
30 | by the reader. | ||
31 | |||
32 | head_page - a pointer to the page that the reader will use next | ||
33 | |||
34 | tail_page - a pointer to the page that will be written to next | ||
35 | |||
36 | commit_page - a pointer to the page with the last finished non nested write. | ||
37 | |||
38 | cmpxchg - hardware assisted atomic transaction that performs the following: | ||
39 | |||
40 | A = B iff previous A == C | ||
41 | |||
42 | R = cmpxchg(A, C, B) is saying that we replace A with B if and only if | ||
43 | current A is equal to C, and we put the old (current) A into R | ||
44 | |||
45 | R gets the previous A regardless if A is updated with B or not. | ||
46 | |||
47 | To see if the update was successful a compare of R == C may be used. | ||
48 | |||
49 | The Generic Ring Buffer | ||
50 | ----------------------- | ||
51 | |||
52 | The ring buffer can be used in either an overwrite mode or in | ||
53 | producer/consumer mode. | ||
54 | |||
55 | Producer/consumer mode is where the producer were to fill up the | ||
56 | buffer before the consumer could free up anything, the producer | ||
57 | will stop writing to the buffer. This will lose most recent events. | ||
58 | |||
59 | Overwrite mode is where the produce were to fill up the buffer | ||
60 | before the consumer could free up anything, the producer will | ||
61 | overwrite the older data. This will lose the oldest events. | ||
62 | |||
63 | No two writers can write at the same time (on the same per cpu buffer), | ||
64 | but a writer may interrupt another writer, but it must finish writing | ||
65 | before the previous writer may continue. This is very important to the | ||
66 | algorithm. The writers act like a "stack". The way interrupts works | ||
67 | enforces this behavior. | ||
68 | |||
69 | |||
70 | writer1 start | ||
71 | <preempted> writer2 start | ||
72 | <preempted> writer3 start | ||
73 | writer3 finishes | ||
74 | writer2 finishes | ||
75 | writer1 finishes | ||
76 | |||
77 | This is very much like a writer being preempted by an interrupt and | ||
78 | the interrupt doing a write as well. | ||
79 | |||
80 | Readers can happen at any time. But no two readers may run at the | ||
81 | same time, nor can a reader preempt/interrupt another reader. A reader | ||
82 | can not preempt/interrupt a writer, but it may read/consume from the | ||
83 | buffer at the same time as a writer is writing, but the reader must be | ||
84 | on another processor to do so. A reader may read on its own processor | ||
85 | and can be preempted by a writer. | ||
86 | |||
87 | A writer can preempt a reader, but a reader can not preempt a writer. | ||
88 | But a reader can read the buffer at the same time (on another processor) | ||
89 | as a writer. | ||
90 | |||
91 | The ring buffer is made up of a list of pages held together by a link list. | ||
92 | |||
93 | At initialization a reader page is allocated for the reader that is not | ||
94 | part of the ring buffer. | ||
95 | |||
96 | The head_page, tail_page and commit_page are all initialized to point | ||
97 | to the same page. | ||
98 | |||
99 | The reader page is initialized to have its next pointer pointing to | ||
100 | the head page, and its previous pointer pointing to a page before | ||
101 | the head page. | ||
102 | |||
103 | The reader has its own page to use. At start up time, this page is | ||
104 | allocated but is not attached to the list. When the reader wants | ||
105 | to read from the buffer, if its page is empty (like it is on start up) | ||
106 | it will swap its page with the head_page. The old reader page will | ||
107 | become part of the ring buffer and the head_page will be removed. | ||
108 | The page after the inserted page (old reader_page) will become the | ||
109 | new head page. | ||
110 | |||
111 | Once the new page is given to the reader, the reader could do what | ||
112 | it wants with it, as long as a writer has left that page. | ||
113 | |||
114 | A sample of how the reader page is swapped: Note this does not | ||
115 | show the head page in the buffer, it is for demonstrating a swap | ||
116 | only. | ||
117 | |||
118 | +------+ | ||
119 | |reader| RING BUFFER | ||
120 | |page | | ||
121 | +------+ | ||
122 | +---+ +---+ +---+ | ||
123 | | |-->| |-->| | | ||
124 | | |<--| |<--| | | ||
125 | +---+ +---+ +---+ | ||
126 | ^ | ^ | | ||
127 | | +-------------+ | | ||
128 | +-----------------+ | ||
129 | |||
130 | |||
131 | +------+ | ||
132 | |reader| RING BUFFER | ||
133 | |page |-------------------+ | ||
134 | +------+ v | ||
135 | | +---+ +---+ +---+ | ||
136 | | | |-->| |-->| | | ||
137 | | | |<--| |<--| |<-+ | ||
138 | | +---+ +---+ +---+ | | ||
139 | | ^ | ^ | | | ||
140 | | | +-------------+ | | | ||
141 | | +-----------------+ | | ||
142 | +------------------------------------+ | ||
143 | |||
144 | +------+ | ||
145 | |reader| RING BUFFER | ||
146 | |page |-------------------+ | ||
147 | +------+ <---------------+ v | ||
148 | | ^ +---+ +---+ +---+ | ||
149 | | | | |-->| |-->| | | ||
150 | | | | | | |<--| |<-+ | ||
151 | | | +---+ +---+ +---+ | | ||
152 | | | | ^ | | | ||
153 | | | +-------------+ | | | ||
154 | | +-----------------------------+ | | ||
155 | +------------------------------------+ | ||
156 | |||
157 | +------+ | ||
158 | |buffer| RING BUFFER | ||
159 | |page |-------------------+ | ||
160 | +------+ <---------------+ v | ||
161 | | ^ +---+ +---+ +---+ | ||
162 | | | | | | |-->| | | ||
163 | | | New | | | |<--| |<-+ | ||
164 | | | Reader +---+ +---+ +---+ | | ||
165 | | | page ----^ | | | ||
166 | | | | | | ||
167 | | +-----------------------------+ | | ||
168 | +------------------------------------+ | ||
169 | |||
170 | |||
171 | |||
172 | It is possible that the page swapped is the commit page and the tail page, | ||
173 | if what is in the ring buffer is less than what is held in a buffer page. | ||
174 | |||
175 | |||
176 | reader page commit page tail page | ||
177 | | | | | ||
178 | v | | | ||
179 | +---+ | | | ||
180 | | |<----------+ | | ||
181 | | |<------------------------+ | ||
182 | | |------+ | ||
183 | +---+ | | ||
184 | | | ||
185 | v | ||
186 | +---+ +---+ +---+ +---+ | ||
187 | <---| |--->| |--->| |--->| |---> | ||
188 | --->| |<---| |<---| |<---| |<--- | ||
189 | +---+ +---+ +---+ +---+ | ||
190 | |||
191 | This case is still valid for this algorithm. | ||
192 | When the writer leaves the page, it simply goes into the ring buffer | ||
193 | since the reader page still points to the next location in the ring | ||
194 | buffer. | ||
195 | |||
196 | |||
197 | The main pointers: | ||
198 | |||
199 | reader page - The page used solely by the reader and is not part | ||
200 | of the ring buffer (may be swapped in) | ||
201 | |||
202 | head page - the next page in the ring buffer that will be swapped | ||
203 | with the reader page. | ||
204 | |||
205 | tail page - the page where the next write will take place. | ||
206 | |||
207 | commit page - the page that last finished a write. | ||
208 | |||
209 | The commit page only is updated by the outer most writer in the | ||
210 | writer stack. A writer that preempts another writer will not move the | ||
211 | commit page. | ||
212 | |||
213 | When data is written into the ring buffer, a position is reserved | ||
214 | in the ring buffer and passed back to the writer. When the writer | ||
215 | is finished writing data into that position, it commits the write. | ||
216 | |||
217 | Another write (or a read) may take place at anytime during this | ||
218 | transaction. If another write happens it must finish before continuing | ||
219 | with the previous write. | ||
220 | |||
221 | |||
222 | Write reserve: | ||
223 | |||
224 | Buffer page | ||
225 | +---------+ | ||
226 | |written | | ||
227 | +---------+ <--- given back to writer (current commit) | ||
228 | |reserved | | ||
229 | +---------+ <--- tail pointer | ||
230 | | empty | | ||
231 | +---------+ | ||
232 | |||
233 | Write commit: | ||
234 | |||
235 | Buffer page | ||
236 | +---------+ | ||
237 | |written | | ||
238 | +---------+ | ||
239 | |written | | ||
240 | +---------+ <--- next positon for write (current commit) | ||
241 | | empty | | ||
242 | +---------+ | ||
243 | |||
244 | |||
245 | If a write happens after the first reserve: | ||
246 | |||
247 | Buffer page | ||
248 | +---------+ | ||
249 | |written | | ||
250 | +---------+ <-- current commit | ||
251 | |reserved | | ||
252 | +---------+ <--- given back to second writer | ||
253 | |reserved | | ||
254 | +---------+ <--- tail pointer | ||
255 | |||
256 | After second writer commits: | ||
257 | |||
258 | |||
259 | Buffer page | ||
260 | +---------+ | ||
261 | |written | | ||
262 | +---------+ <--(last full commit) | ||
263 | |reserved | | ||
264 | +---------+ | ||
265 | |pending | | ||
266 | |commit | | ||
267 | +---------+ <--- tail pointer | ||
268 | |||
269 | When the first writer commits: | ||
270 | |||
271 | Buffer page | ||
272 | +---------+ | ||
273 | |written | | ||
274 | +---------+ | ||
275 | |written | | ||
276 | +---------+ | ||
277 | |written | | ||
278 | +---------+ <--(last full commit and tail pointer) | ||
279 | |||
280 | |||
281 | The commit pointer points to the last write location that was | ||
282 | committed without preempting another write. When a write that | ||
283 | preempted another write is committed, it only becomes a pending commit | ||
284 | and will not be a full commit till all writes have been committed. | ||
285 | |||
286 | The commit page points to the page that has the last full commit. | ||
287 | The tail page points to the page with the last write (before | ||
288 | committing). | ||
289 | |||
290 | The tail page is always equal to or after the commit page. It may | ||
291 | be several pages ahead. If the tail page catches up to the commit | ||
292 | page then no more writes may take place (regardless of the mode | ||
293 | of the ring buffer: overwrite and produce/consumer). | ||
294 | |||
295 | The order of pages are: | ||
296 | |||
297 | head page | ||
298 | commit page | ||
299 | tail page | ||
300 | |||
301 | Possible scenario: | ||
302 | tail page | ||
303 | head page commit page | | ||
304 | | | | | ||
305 | v v v | ||
306 | +---+ +---+ +---+ +---+ | ||
307 | <---| |--->| |--->| |--->| |---> | ||
308 | --->| |<---| |<---| |<---| |<--- | ||
309 | +---+ +---+ +---+ +---+ | ||
310 | |||
311 | There is a special case that the head page is after either the commit page | ||
312 | and possibly the tail page. That is when the commit (and tail) page has been | ||
313 | swapped with the reader page. This is because the head page is always | ||
314 | part of the ring buffer, but the reader page is not. When ever there | ||
315 | has been less than a full page that has been committed inside the ring buffer, | ||
316 | and a reader swaps out a page, it will be swapping out the commit page. | ||
317 | |||
318 | |||
319 | reader page commit page tail page | ||
320 | | | | | ||
321 | v | | | ||
322 | +---+ | | | ||
323 | | |<----------+ | | ||
324 | | |<------------------------+ | ||
325 | | |------+ | ||
326 | +---+ | | ||
327 | | | ||
328 | v | ||
329 | +---+ +---+ +---+ +---+ | ||
330 | <---| |--->| |--->| |--->| |---> | ||
331 | --->| |<---| |<---| |<---| |<--- | ||
332 | +---+ +---+ +---+ +---+ | ||
333 | ^ | ||
334 | | | ||
335 | head page | ||
336 | |||
337 | |||
338 | In this case, the head page will not move when the tail and commit | ||
339 | move back into the ring buffer. | ||
340 | |||
341 | The reader can not swap a page into the ring buffer if the commit page | ||
342 | is still on that page. If the read meets the last commit (real commit | ||
343 | not pending or reserved), then there is nothing more to read. | ||
344 | The buffer is considered empty until another full commit finishes. | ||
345 | |||
346 | When the tail meets the head page, if the buffer is in overwrite mode, | ||
347 | the head page will be pushed ahead one. If the buffer is in producer/consumer | ||
348 | mode, the write will fail. | ||
349 | |||
350 | Overwrite mode: | ||
351 | |||
352 | tail page | ||
353 | | | ||
354 | v | ||
355 | +---+ +---+ +---+ +---+ | ||
356 | <---| |--->| |--->| |--->| |---> | ||
357 | --->| |<---| |<---| |<---| |<--- | ||
358 | +---+ +---+ +---+ +---+ | ||
359 | ^ | ||
360 | | | ||
361 | head page | ||
362 | |||
363 | |||
364 | tail page | ||
365 | | | ||
366 | v | ||
367 | +---+ +---+ +---+ +---+ | ||
368 | <---| |--->| |--->| |--->| |---> | ||
369 | --->| |<---| |<---| |<---| |<--- | ||
370 | +---+ +---+ +---+ +---+ | ||
371 | ^ | ||
372 | | | ||
373 | head page | ||
374 | |||
375 | |||
376 | tail page | ||
377 | | | ||
378 | v | ||
379 | +---+ +---+ +---+ +---+ | ||
380 | <---| |--->| |--->| |--->| |---> | ||
381 | --->| |<---| |<---| |<---| |<--- | ||
382 | +---+ +---+ +---+ +---+ | ||
383 | ^ | ||
384 | | | ||
385 | head page | ||
386 | |||
387 | Note, the reader page will still point to the previous head page. | ||
388 | But when a swap takes place, it will use the most recent head page. | ||
389 | |||
390 | |||
391 | Making the Ring Buffer Lockless: | ||
392 | -------------------------------- | ||
393 | |||
394 | The main idea behind the lockless algorithm is to combine the moving | ||
395 | of the head_page pointer with the swapping of pages with the reader. | ||
396 | State flags are placed inside the pointer to the page. To do this, | ||
397 | each page must be aligned in memory by 4 bytes. This will allow the 2 | ||
398 | least significant bits of the address to be used as flags. Since | ||
399 | they will always be zero for the address. To get the address, | ||
400 | simply mask out the flags. | ||
401 | |||
402 | MASK = ~3 | ||
403 | |||
404 | address & MASK | ||
405 | |||
406 | Two flags will be kept by these two bits: | ||
407 | |||
408 | HEADER - the page being pointed to is a head page | ||
409 | |||
410 | UPDATE - the page being pointed to is being updated by a writer | ||
411 | and was or is about to be a head page. | ||
412 | |||
413 | |||
414 | reader page | ||
415 | | | ||
416 | v | ||
417 | +---+ | ||
418 | | |------+ | ||
419 | +---+ | | ||
420 | | | ||
421 | v | ||
422 | +---+ +---+ +---+ +---+ | ||
423 | <---| |--->| |-H->| |--->| |---> | ||
424 | --->| |<---| |<---| |<---| |<--- | ||
425 | +---+ +---+ +---+ +---+ | ||
426 | |||
427 | |||
428 | The above pointer "-H->" would have the HEADER flag set. That is | ||
429 | the next page is the next page to be swapped out by the reader. | ||
430 | This pointer means the next page is the head page. | ||
431 | |||
432 | When the tail page meets the head pointer, it will use cmpxchg to | ||
433 | change the pointer to the UPDATE state: | ||
434 | |||
435 | |||
436 | tail page | ||
437 | | | ||
438 | v | ||
439 | +---+ +---+ +---+ +---+ | ||
440 | <---| |--->| |-H->| |--->| |---> | ||
441 | --->| |<---| |<---| |<---| |<--- | ||
442 | +---+ +---+ +---+ +---+ | ||
443 | |||
444 | tail page | ||
445 | | | ||
446 | v | ||
447 | +---+ +---+ +---+ +---+ | ||
448 | <---| |--->| |-U->| |--->| |---> | ||
449 | --->| |<---| |<---| |<---| |<--- | ||
450 | +---+ +---+ +---+ +---+ | ||
451 | |||
452 | "-U->" represents a pointer in the UPDATE state. | ||
453 | |||
454 | Any access to the reader will need to take some sort of lock to serialize | ||
455 | the readers. But the writers will never take a lock to write to the | ||
456 | ring buffer. This means we only need to worry about a single reader, | ||
457 | and writes only preempt in "stack" formation. | ||
458 | |||
459 | When the reader tries to swap the page with the ring buffer, it | ||
460 | will also use cmpxchg. If the flag bit in the pointer to the | ||
461 | head page does not have the HEADER flag set, the compare will fail | ||
462 | and the reader will need to look for the new head page and try again. | ||
463 | Note, the flag UPDATE and HEADER are never set at the same time. | ||
464 | |||
465 | The reader swaps the reader page as follows: | ||
466 | |||
467 | +------+ | ||
468 | |reader| RING BUFFER | ||
469 | |page | | ||
470 | +------+ | ||
471 | +---+ +---+ +---+ | ||
472 | | |--->| |--->| | | ||
473 | | |<---| |<---| | | ||
474 | +---+ +---+ +---+ | ||
475 | ^ | ^ | | ||
476 | | +---------------+ | | ||
477 | +-----H-------------+ | ||
478 | |||
479 | The reader sets the reader page next pointer as HEADER to the page after | ||
480 | the head page. | ||
481 | |||
482 | |||
483 | +------+ | ||
484 | |reader| RING BUFFER | ||
485 | |page |-------H-----------+ | ||
486 | +------+ v | ||
487 | | +---+ +---+ +---+ | ||
488 | | | |--->| |--->| | | ||
489 | | | |<---| |<---| |<-+ | ||
490 | | +---+ +---+ +---+ | | ||
491 | | ^ | ^ | | | ||
492 | | | +---------------+ | | | ||
493 | | +-----H-------------+ | | ||
494 | +--------------------------------------+ | ||
495 | |||
496 | It does a cmpxchg with the pointer to the previous head page to make it | ||
497 | point to the reader page. Note that the new pointer does not have the HEADER | ||
498 | flag set. This action atomically moves the head page forward. | ||
499 | |||
500 | +------+ | ||
501 | |reader| RING BUFFER | ||
502 | |page |-------H-----------+ | ||
503 | +------+ v | ||
504 | | ^ +---+ +---+ +---+ | ||
505 | | | | |-->| |-->| | | ||
506 | | | | |<--| |<--| |<-+ | ||
507 | | | +---+ +---+ +---+ | | ||
508 | | | | ^ | | | ||
509 | | | +-------------+ | | | ||
510 | | +-----------------------------+ | | ||
511 | +------------------------------------+ | ||
512 | |||
513 | After the new head page is set, the previous pointer of the head page is | ||
514 | updated to the reader page. | ||
515 | |||
516 | +------+ | ||
517 | |reader| RING BUFFER | ||
518 | |page |-------H-----------+ | ||
519 | +------+ <---------------+ v | ||
520 | | ^ +---+ +---+ +---+ | ||
521 | | | | |-->| |-->| | | ||
522 | | | | | | |<--| |<-+ | ||
523 | | | +---+ +---+ +---+ | | ||
524 | | | | ^ | | | ||
525 | | | +-------------+ | | | ||
526 | | +-----------------------------+ | | ||
527 | +------------------------------------+ | ||
528 | |||
529 | +------+ | ||
530 | |buffer| RING BUFFER | ||
531 | |page |-------H-----------+ <--- New head page | ||
532 | +------+ <---------------+ v | ||
533 | | ^ +---+ +---+ +---+ | ||
534 | | | | | | |-->| | | ||
535 | | | New | | | |<--| |<-+ | ||
536 | | | Reader +---+ +---+ +---+ | | ||
537 | | | page ----^ | | | ||
538 | | | | | | ||
539 | | +-----------------------------+ | | ||
540 | +------------------------------------+ | ||
541 | |||
542 | Another important point. The page that the reader page points back to | ||
543 | by its previous pointer (the one that now points to the new head page) | ||
544 | never points back to the reader page. That is because the reader page is | ||
545 | not part of the ring buffer. Traversing the ring buffer via the next pointers | ||
546 | will always stay in the ring buffer. Traversing the ring buffer via the | ||
547 | prev pointers may not. | ||
548 | |||
549 | Note, the way to determine a reader page is simply by examining the previous | ||
550 | pointer of the page. If the next pointer of the previous page does not | ||
551 | point back to the original page, then the original page is a reader page: | ||
552 | |||
553 | |||
554 | +--------+ | ||
555 | | reader | next +----+ | ||
556 | | page |-------->| |<====== (buffer page) | ||
557 | +--------+ +----+ | ||
558 | | | ^ | ||
559 | | v | next | ||
560 | prev | +----+ | ||
561 | +------------->| | | ||
562 | +----+ | ||
563 | |||
564 | The way the head page moves forward: | ||
565 | |||
566 | When the tail page meets the head page and the buffer is in overwrite mode | ||
567 | and more writes take place, the head page must be moved forward before the | ||
568 | writer may move the tail page. The way this is done is that the writer | ||
569 | performs a cmpxchg to convert the pointer to the head page from the HEADER | ||
570 | flag to have the UPDATE flag set. Once this is done, the reader will | ||
571 | not be able to swap the head page from the buffer, nor will it be able to | ||
572 | move the head page, until the writer is finished with the move. | ||
573 | |||
574 | This eliminates any races that the reader can have on the writer. The reader | ||
575 | must spin, and this is why the reader can not preempt the writer. | ||
576 | |||
577 | tail page | ||
578 | | | ||
579 | v | ||
580 | +---+ +---+ +---+ +---+ | ||
581 | <---| |--->| |-H->| |--->| |---> | ||
582 | --->| |<---| |<---| |<---| |<--- | ||
583 | +---+ +---+ +---+ +---+ | ||
584 | |||
585 | tail page | ||
586 | | | ||
587 | v | ||
588 | +---+ +---+ +---+ +---+ | ||
589 | <---| |--->| |-U->| |--->| |---> | ||
590 | --->| |<---| |<---| |<---| |<--- | ||
591 | +---+ +---+ +---+ +---+ | ||
592 | |||
593 | The following page will be made into the new head page. | ||
594 | |||
595 | tail page | ||
596 | | | ||
597 | v | ||
598 | +---+ +---+ +---+ +---+ | ||
599 | <---| |--->| |-U->| |-H->| |---> | ||
600 | --->| |<---| |<---| |<---| |<--- | ||
601 | +---+ +---+ +---+ +---+ | ||
602 | |||
603 | After the new head page has been set, we can set the old head page | ||
604 | pointer back to NORMAL. | ||
605 | |||
606 | tail page | ||
607 | | | ||
608 | v | ||
609 | +---+ +---+ +---+ +---+ | ||
610 | <---| |--->| |--->| |-H->| |---> | ||
611 | --->| |<---| |<---| |<---| |<--- | ||
612 | +---+ +---+ +---+ +---+ | ||
613 | |||
614 | After the head page has been moved, the tail page may now move forward. | ||
615 | |||
616 | tail page | ||
617 | | | ||
618 | v | ||
619 | +---+ +---+ +---+ +---+ | ||
620 | <---| |--->| |--->| |-H->| |---> | ||
621 | --->| |<---| |<---| |<---| |<--- | ||
622 | +---+ +---+ +---+ +---+ | ||
623 | |||
624 | |||
625 | The above are the trivial updates. Now for the more complex scenarios. | ||
626 | |||
627 | |||
628 | As stated before, if enough writes preempt the first write, the | ||
629 | tail page may make it all the way around the buffer and meet the commit | ||
630 | page. At this time, we must start dropping writes (usually with some kind | ||
631 | of warning to the user). But what happens if the commit was still on the | ||
632 | reader page? The commit page is not part of the ring buffer. The tail page | ||
633 | must account for this. | ||
634 | |||
635 | |||
636 | reader page commit page | ||
637 | | | | ||
638 | v | | ||
639 | +---+ | | ||
640 | | |<----------+ | ||
641 | | | | ||
642 | | |------+ | ||
643 | +---+ | | ||
644 | | | ||
645 | v | ||
646 | +---+ +---+ +---+ +---+ | ||
647 | <---| |--->| |-H->| |--->| |---> | ||
648 | --->| |<---| |<---| |<---| |<--- | ||
649 | +---+ +---+ +---+ +---+ | ||
650 | ^ | ||
651 | | | ||
652 | tail page | ||
653 | |||
654 | If the tail page were to simply push the head page forward, the commit when | ||
655 | leaving the reader page would not be pointing to the correct page. | ||
656 | |||
657 | The solution to this is to test if the commit page is on the reader page | ||
658 | before pushing the head page. If it is, then it can be assumed that the | ||
659 | tail page wrapped the buffer, and we must drop new writes. | ||
660 | |||
661 | This is not a race condition, because the commit page can only be moved | ||
662 | by the outter most writer (the writer that was preempted). | ||
663 | This means that the commit will not move while a writer is moving the | ||
664 | tail page. The reader can not swap the reader page if it is also being | ||
665 | used as the commit page. The reader can simply check that the commit | ||
666 | is off the reader page. Once the commit page leaves the reader page | ||
667 | it will never go back on it unless a reader does another swap with the | ||
668 | buffer page that is also the commit page. | ||
669 | |||
670 | |||
671 | Nested writes | ||
672 | ------------- | ||
673 | |||
674 | In the pushing forward of the tail page we must first push forward | ||
675 | the head page if the head page is the next page. If the head page | ||
676 | is not the next page, the tail page is simply updated with a cmpxchg. | ||
677 | |||
678 | Only writers move the tail page. This must be done atomically to protect | ||
679 | against nested writers. | ||
680 | |||
681 | temp_page = tail_page | ||
682 | next_page = temp_page->next | ||
683 | cmpxchg(tail_page, temp_page, next_page) | ||
684 | |||
685 | The above will update the tail page if it is still pointing to the expected | ||
686 | page. If this fails, a nested write pushed it forward, the the current write | ||
687 | does not need to push it. | ||
688 | |||
689 | |||
690 | temp page | ||
691 | | | ||
692 | v | ||
693 | tail page | ||
694 | | | ||
695 | v | ||
696 | +---+ +---+ +---+ +---+ | ||
697 | <---| |--->| |--->| |--->| |---> | ||
698 | --->| |<---| |<---| |<---| |<--- | ||
699 | +---+ +---+ +---+ +---+ | ||
700 | |||
701 | Nested write comes in and moves the tail page forward: | ||
702 | |||
703 | tail page (moved by nested writer) | ||
704 | temp page | | ||
705 | | | | ||
706 | v v | ||
707 | +---+ +---+ +---+ +---+ | ||
708 | <---| |--->| |--->| |--->| |---> | ||
709 | --->| |<---| |<---| |<---| |<--- | ||
710 | +---+ +---+ +---+ +---+ | ||
711 | |||
712 | The above would fail the cmpxchg, but since the tail page has already | ||
713 | been moved forward, the writer will just try again to reserve storage | ||
714 | on the new tail page. | ||
715 | |||
716 | But the moving of the head page is a bit more complex. | ||
717 | |||
718 | tail page | ||
719 | | | ||
720 | v | ||
721 | +---+ +---+ +---+ +---+ | ||
722 | <---| |--->| |-H->| |--->| |---> | ||
723 | --->| |<---| |<---| |<---| |<--- | ||
724 | +---+ +---+ +---+ +---+ | ||
725 | |||
726 | The write converts the head page pointer to UPDATE. | ||
727 | |||
728 | tail page | ||
729 | | | ||
730 | v | ||
731 | +---+ +---+ +---+ +---+ | ||
732 | <---| |--->| |-U->| |--->| |---> | ||
733 | --->| |<---| |<---| |<---| |<--- | ||
734 | +---+ +---+ +---+ +---+ | ||
735 | |||
736 | But if a nested writer preempts here. It will see that the next | ||
737 | page is a head page, but it is also nested. It will detect that | ||
738 | it is nested and will save that information. The detection is the | ||
739 | fact that it sees the UPDATE flag instead of a HEADER or NORMAL | ||
740 | pointer. | ||
741 | |||
742 | The nested writer will set the new head page pointer. | ||
743 | |||
744 | tail page | ||
745 | | | ||
746 | v | ||
747 | +---+ +---+ +---+ +---+ | ||
748 | <---| |--->| |-U->| |-H->| |---> | ||
749 | --->| |<---| |<---| |<---| |<--- | ||
750 | +---+ +---+ +---+ +---+ | ||
751 | |||
752 | But it will not reset the update back to normal. Only the writer | ||
753 | that converted a pointer from HEAD to UPDATE will convert it back | ||
754 | to NORMAL. | ||
755 | |||
756 | tail page | ||
757 | | | ||
758 | v | ||
759 | +---+ +---+ +---+ +---+ | ||
760 | <---| |--->| |-U->| |-H->| |---> | ||
761 | --->| |<---| |<---| |<---| |<--- | ||
762 | +---+ +---+ +---+ +---+ | ||
763 | |||
764 | After the nested writer finishes, the outer most writer will convert | ||
765 | the UPDATE pointer to NORMAL. | ||
766 | |||
767 | |||
768 | tail page | ||
769 | | | ||
770 | v | ||
771 | +---+ +---+ +---+ +---+ | ||
772 | <---| |--->| |--->| |-H->| |---> | ||
773 | --->| |<---| |<---| |<---| |<--- | ||
774 | +---+ +---+ +---+ +---+ | ||
775 | |||
776 | |||
777 | It can be even more complex if several nested writes came in and moved | ||
778 | the tail page ahead several pages: | ||
779 | |||
780 | |||
781 | (first writer) | ||
782 | |||
783 | tail page | ||
784 | | | ||
785 | v | ||
786 | +---+ +---+ +---+ +---+ | ||
787 | <---| |--->| |-H->| |--->| |---> | ||
788 | --->| |<---| |<---| |<---| |<--- | ||
789 | +---+ +---+ +---+ +---+ | ||
790 | |||
791 | The write converts the head page pointer to UPDATE. | ||
792 | |||
793 | tail page | ||
794 | | | ||
795 | v | ||
796 | +---+ +---+ +---+ +---+ | ||
797 | <---| |--->| |-U->| |--->| |---> | ||
798 | --->| |<---| |<---| |<---| |<--- | ||
799 | +---+ +---+ +---+ +---+ | ||
800 | |||
801 | Next writer comes in, and sees the update and sets up the new | ||
802 | head page. | ||
803 | |||
804 | (second writer) | ||
805 | |||
806 | tail page | ||
807 | | | ||
808 | v | ||
809 | +---+ +---+ +---+ +---+ | ||
810 | <---| |--->| |-U->| |-H->| |---> | ||
811 | --->| |<---| |<---| |<---| |<--- | ||
812 | +---+ +---+ +---+ +---+ | ||
813 | |||
814 | The nested writer moves the tail page forward. But does not set the old | ||
815 | update page to NORMAL because it is not the outer most writer. | ||
816 | |||
817 | tail page | ||
818 | | | ||
819 | v | ||
820 | +---+ +---+ +---+ +---+ | ||
821 | <---| |--->| |-U->| |-H->| |---> | ||
822 | --->| |<---| |<---| |<---| |<--- | ||
823 | +---+ +---+ +---+ +---+ | ||
824 | |||
825 | Another writer preempts and sees the page after the tail page is a head page. | ||
826 | It changes it from HEAD to UPDATE. | ||
827 | |||
828 | (third writer) | ||
829 | |||
830 | tail page | ||
831 | | | ||
832 | v | ||
833 | +---+ +---+ +---+ +---+ | ||
834 | <---| |--->| |-U->| |-U->| |---> | ||
835 | --->| |<---| |<---| |<---| |<--- | ||
836 | +---+ +---+ +---+ +---+ | ||
837 | |||
838 | The writer will move the head page forward: | ||
839 | |||
840 | |||
841 | (third writer) | ||
842 | |||
843 | tail page | ||
844 | | | ||
845 | v | ||
846 | +---+ +---+ +---+ +---+ | ||
847 | <---| |--->| |-U->| |-U->| |-H-> | ||
848 | --->| |<---| |<---| |<---| |<--- | ||
849 | +---+ +---+ +---+ +---+ | ||
850 | |||
851 | But now that the third writer did change the HEAD flag to UPDATE it | ||
852 | will convert it to normal: | ||
853 | |||
854 | |||
855 | (third writer) | ||
856 | |||
857 | tail page | ||
858 | | | ||
859 | v | ||
860 | +---+ +---+ +---+ +---+ | ||
861 | <---| |--->| |-U->| |--->| |-H-> | ||
862 | --->| |<---| |<---| |<---| |<--- | ||
863 | +---+ +---+ +---+ +---+ | ||
864 | |||
865 | |||
866 | Then it will move the tail page, and return back to the second writer. | ||
867 | |||
868 | |||
869 | (second writer) | ||
870 | |||
871 | tail page | ||
872 | | | ||
873 | v | ||
874 | +---+ +---+ +---+ +---+ | ||
875 | <---| |--->| |-U->| |--->| |-H-> | ||
876 | --->| |<---| |<---| |<---| |<--- | ||
877 | +---+ +---+ +---+ +---+ | ||
878 | |||
879 | |||
880 | The second writer will fail to move the tail page because it was already | ||
881 | moved, so it will try again and add its data to the new tail page. | ||
882 | It will return to the first writer. | ||
883 | |||
884 | |||
885 | (first writer) | ||
886 | |||
887 | tail page | ||
888 | | | ||
889 | v | ||
890 | +---+ +---+ +---+ +---+ | ||
891 | <---| |--->| |-U->| |--->| |-H-> | ||
892 | --->| |<---| |<---| |<---| |<--- | ||
893 | +---+ +---+ +---+ +---+ | ||
894 | |||
895 | The first writer can not know atomically test if the tail page moved | ||
896 | while it updates the HEAD page. It will then update the head page to | ||
897 | what it thinks is the new head page. | ||
898 | |||
899 | |||
900 | (first writer) | ||
901 | |||
902 | tail page | ||
903 | | | ||
904 | v | ||
905 | +---+ +---+ +---+ +---+ | ||
906 | <---| |--->| |-U->| |-H->| |-H-> | ||
907 | --->| |<---| |<---| |<---| |<--- | ||
908 | +---+ +---+ +---+ +---+ | ||
909 | |||
910 | Since the cmpxchg returns the old value of the pointer the first writer | ||
911 | will see it succeeded in updating the pointer from NORMAL to HEAD. | ||
912 | But as we can see, this is not good enough. It must also check to see | ||
913 | if the tail page is either where it use to be or on the next page: | ||
914 | |||
915 | |||
916 | (first writer) | ||
917 | |||
918 | A B tail page | ||
919 | | | | | ||
920 | v v v | ||
921 | +---+ +---+ +---+ +---+ | ||
922 | <---| |--->| |-U->| |-H->| |-H-> | ||
923 | --->| |<---| |<---| |<---| |<--- | ||
924 | +---+ +---+ +---+ +---+ | ||
925 | |||
926 | If tail page != A and tail page does not equal B, then it must reset the | ||
927 | pointer back to NORMAL. The fact that it only needs to worry about | ||
928 | nested writers, it only needs to check this after setting the HEAD page. | ||
929 | |||
930 | |||
931 | (first writer) | ||
932 | |||
933 | A B tail page | ||
934 | | | | | ||
935 | v v v | ||
936 | +---+ +---+ +---+ +---+ | ||
937 | <---| |--->| |-U->| |--->| |-H-> | ||
938 | --->| |<---| |<---| |<---| |<--- | ||
939 | +---+ +---+ +---+ +---+ | ||
940 | |||
941 | Now the writer can update the head page. This is also why the head page must | ||
942 | remain in UPDATE and only reset by the outer most writer. This prevents | ||
943 | the reader from seeing the incorrect head page. | ||
944 | |||
945 | |||
946 | (first writer) | ||
947 | |||
948 | A B tail page | ||
949 | | | | | ||
950 | v v v | ||
951 | +---+ +---+ +---+ +---+ | ||
952 | <---| |--->| |--->| |--->| |-H-> | ||
953 | --->| |<---| |<---| |<---| |<--- | ||
954 | +---+ +---+ +---+ +---+ | ||
955 | |||
diff --git a/Documentation/video4linux/CARDLIST.cx23885 b/Documentation/video4linux/CARDLIST.cx23885 index 450b8f8c389b..525edb37c758 100644 --- a/Documentation/video4linux/CARDLIST.cx23885 +++ b/Documentation/video4linux/CARDLIST.cx23885 | |||
@@ -21,3 +21,5 @@ | |||
21 | 20 -> Hauppauge WinTV-HVR1255 [0070:2251] | 21 | 20 -> Hauppauge WinTV-HVR1255 [0070:2251] |
22 | 21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295] | 22 | 21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295] |
23 | 22 -> Mygica X8506 DMB-TH [14f1:8651] | 23 | 22 -> Mygica X8506 DMB-TH [14f1:8651] |
24 | 23 -> Magic-Pro ProHDTV Extreme 2 [14f1:8657] | ||
25 | 24 -> Hauppauge WinTV-HVR1850 [0070:8541] | ||
diff --git a/Documentation/video4linux/CARDLIST.cx88 b/Documentation/video4linux/CARDLIST.cx88 index 0736518b2f88..3385f8b094a5 100644 --- a/Documentation/video4linux/CARDLIST.cx88 +++ b/Documentation/video4linux/CARDLIST.cx88 | |||
@@ -80,3 +80,4 @@ | |||
80 | 79 -> Terratec Cinergy HT PCI MKII [153b:1177] | 80 | 79 -> Terratec Cinergy HT PCI MKII [153b:1177] |
81 | 80 -> Hauppauge WinTV-IR Only [0070:9290] | 81 | 80 -> Hauppauge WinTV-IR Only [0070:9290] |
82 | 81 -> Leadtek WinFast DTV1800 Hybrid [107d:6654] | 82 | 81 -> Leadtek WinFast DTV1800 Hybrid [107d:6654] |
83 | 82 -> WinFast DTV2000 H rev. J [107d:6f2b] | ||
diff --git a/Documentation/video4linux/CARDLIST.em28xx b/Documentation/video4linux/CARDLIST.em28xx index e352d754875c..b13fcbd5d94b 100644 --- a/Documentation/video4linux/CARDLIST.em28xx +++ b/Documentation/video4linux/CARDLIST.em28xx | |||
@@ -7,7 +7,7 @@ | |||
7 | 6 -> Terratec Cinergy 200 USB (em2800) | 7 | 6 -> Terratec Cinergy 200 USB (em2800) |
8 | 7 -> Leadtek Winfast USB II (em2800) [0413:6023] | 8 | 7 -> Leadtek Winfast USB II (em2800) [0413:6023] |
9 | 8 -> Kworld USB2800 (em2800) | 9 | 8 -> Kworld USB2800 (em2800) |
10 | 9 -> Pinnacle Dazzle DVC 90/100/101/107 / Kaiser Baas Video to DVD maker (em2820/em2840) [1b80:e302,2304:0207,2304:021a] | 10 | 9 -> Pinnacle Dazzle DVC 90/100/101/107 / Kaiser Baas Video to DVD maker (em2820/em2840) [1b80:e302,1b80:e304,2304:0207,2304:021a] |
11 | 10 -> Hauppauge WinTV HVR 900 (em2880) [2040:6500] | 11 | 10 -> Hauppauge WinTV HVR 900 (em2880) [2040:6500] |
12 | 11 -> Terratec Hybrid XS (em2880) [0ccd:0042] | 12 | 11 -> Terratec Hybrid XS (em2880) [0ccd:0042] |
13 | 12 -> Kworld PVR TV 2800 RF (em2820/em2840) | 13 | 12 -> Kworld PVR TV 2800 RF (em2820/em2840) |
@@ -33,7 +33,7 @@ | |||
33 | 34 -> Terratec Cinergy A Hybrid XS (em2860) [0ccd:004f] | 33 | 34 -> Terratec Cinergy A Hybrid XS (em2860) [0ccd:004f] |
34 | 35 -> Typhoon DVD Maker (em2860) | 34 | 35 -> Typhoon DVD Maker (em2860) |
35 | 36 -> NetGMBH Cam (em2860) | 35 | 36 -> NetGMBH Cam (em2860) |
36 | 37 -> Gadmei UTV330 (em2860) | 36 | 37 -> Gadmei UTV330 (em2860) [eb1a:50a6] |
37 | 38 -> Yakumo MovieMixer (em2861) | 37 | 38 -> Yakumo MovieMixer (em2861) |
38 | 39 -> KWorld PVRTV 300U (em2861) [eb1a:e300] | 38 | 39 -> KWorld PVRTV 300U (em2861) [eb1a:e300] |
39 | 40 -> Plextor ConvertX PX-TV100U (em2861) [093b:a005] | 39 | 40 -> Plextor ConvertX PX-TV100U (em2861) [093b:a005] |
@@ -67,3 +67,4 @@ | |||
67 | 69 -> KWorld ATSC 315U HDTV TV Box (em2882) [eb1a:a313] | 67 | 69 -> KWorld ATSC 315U HDTV TV Box (em2882) [eb1a:a313] |
68 | 70 -> Evga inDtube (em2882) | 68 | 70 -> Evga inDtube (em2882) |
69 | 71 -> Silvercrest Webcam 1.3mpix (em2820/em2840) | 69 | 71 -> Silvercrest Webcam 1.3mpix (em2820/em2840) |
70 | 72 -> Gadmei UTV330+ (em2861) | ||
diff --git a/Documentation/video4linux/CARDLIST.saa7134 b/Documentation/video4linux/CARDLIST.saa7134 index c913e5614195..0ac4d2544778 100644 --- a/Documentation/video4linux/CARDLIST.saa7134 +++ b/Documentation/video4linux/CARDLIST.saa7134 | |||
@@ -167,3 +167,7 @@ | |||
167 | 166 -> Beholder BeholdTV 607 RDS [5ace:6073] | 167 | 166 -> Beholder BeholdTV 607 RDS [5ace:6073] |
168 | 167 -> Beholder BeholdTV 609 RDS [5ace:6092] | 168 | 167 -> Beholder BeholdTV 609 RDS [5ace:6092] |
169 | 168 -> Beholder BeholdTV 609 RDS [5ace:6093] | 169 | 168 -> Beholder BeholdTV 609 RDS [5ace:6093] |
170 | 169 -> Compro VideoMate S350/S300 [185b:c900] | ||
171 | 170 -> AverMedia AverTV Studio 505 [1461:a115] | ||
172 | 171 -> Beholder BeholdTV X7 [5ace:7595] | ||
173 | 172 -> RoverMedia TV Link Pro FM [19d1:0138] | ||
diff --git a/Documentation/video4linux/CARDLIST.tuner b/Documentation/video4linux/CARDLIST.tuner index be67844074dd..ba9fa679e2d3 100644 --- a/Documentation/video4linux/CARDLIST.tuner +++ b/Documentation/video4linux/CARDLIST.tuner | |||
@@ -78,3 +78,4 @@ tuner=77 - TCL tuner MF02GIP-5N-E | |||
78 | tuner=78 - Philips FMD1216MEX MK3 Hybrid Tuner | 78 | tuner=78 - Philips FMD1216MEX MK3 Hybrid Tuner |
79 | tuner=79 - Philips PAL/SECAM multi (FM1216 MK5) | 79 | tuner=79 - Philips PAL/SECAM multi (FM1216 MK5) |
80 | tuner=80 - Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough | 80 | tuner=80 - Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough |
81 | tuner=81 - Partsnic (Daewoo) PTI-5NF05 | ||
diff --git a/Documentation/video4linux/CQcam.txt b/Documentation/video4linux/CQcam.txt index 04986efb731c..d230878e473e 100644 --- a/Documentation/video4linux/CQcam.txt +++ b/Documentation/video4linux/CQcam.txt | |||
@@ -18,8 +18,8 @@ Table of Contents | |||
18 | 18 | ||
19 | 1.0 Introduction | 19 | 1.0 Introduction |
20 | 20 | ||
21 | The file ../drivers/char/c-qcam.c is a device driver for the | 21 | The file ../../drivers/media/video/c-qcam.c is a device driver for |
22 | Logitech (nee Connectix) parallel port interface color CCD camera. | 22 | the Logitech (nee Connectix) parallel port interface color CCD camera. |
23 | This is a fairly inexpensive device for capturing images. Logitech | 23 | This is a fairly inexpensive device for capturing images. Logitech |
24 | does not currently provide information for developers, but many people | 24 | does not currently provide information for developers, but many people |
25 | have engineered several solutions for non-Microsoft use of the Color | 25 | have engineered several solutions for non-Microsoft use of the Color |
diff --git a/Documentation/video4linux/gspca.txt b/Documentation/video4linux/gspca.txt index 573f95b58807..4686e84dd800 100644 --- a/Documentation/video4linux/gspca.txt +++ b/Documentation/video4linux/gspca.txt | |||
@@ -140,6 +140,7 @@ spca500 04fc:7333 PalmPixDC85 | |||
140 | sunplus 04fc:ffff Pure DigitalDakota | 140 | sunplus 04fc:ffff Pure DigitalDakota |
141 | spca501 0506:00df 3Com HomeConnect Lite | 141 | spca501 0506:00df 3Com HomeConnect Lite |
142 | sunplus 052b:1513 Megapix V4 | 142 | sunplus 052b:1513 Megapix V4 |
143 | sunplus 052b:1803 MegaImage VI | ||
143 | tv8532 0545:808b Veo Stingray | 144 | tv8532 0545:808b Veo Stingray |
144 | tv8532 0545:8333 Veo Stingray | 145 | tv8532 0545:8333 Veo Stingray |
145 | sunplus 0546:3155 Polaroid PDC3070 | 146 | sunplus 0546:3155 Polaroid PDC3070 |
@@ -182,6 +183,7 @@ ov534 06f8:3002 Hercules Blog Webcam | |||
182 | ov534 06f8:3003 Hercules Dualpix HD Weblog | 183 | ov534 06f8:3003 Hercules Dualpix HD Weblog |
183 | sonixj 06f8:3004 Hercules Classic Silver | 184 | sonixj 06f8:3004 Hercules Classic Silver |
184 | sonixj 06f8:3008 Hercules Deluxe Optical Glass | 185 | sonixj 06f8:3008 Hercules Deluxe Optical Glass |
186 | pac7311 06f8:3009 Hercules Classic Link | ||
185 | spca508 0733:0110 ViewQuest VQ110 | 187 | spca508 0733:0110 ViewQuest VQ110 |
186 | spca508 0130:0130 Clone Digital Webcam 11043 | 188 | spca508 0130:0130 Clone Digital Webcam 11043 |
187 | spca501 0733:0401 Intel Create and Share | 189 | spca501 0733:0401 Intel Create and Share |
@@ -235,8 +237,10 @@ pac7311 093a:2621 PAC731x | |||
235 | pac7311 093a:2622 Genius Eye 312 | 237 | pac7311 093a:2622 Genius Eye 312 |
236 | pac7311 093a:2624 PAC7302 | 238 | pac7311 093a:2624 PAC7302 |
237 | pac7311 093a:2626 Labtec 2200 | 239 | pac7311 093a:2626 Labtec 2200 |
240 | pac7311 093a:2629 Genious iSlim 300 | ||
238 | pac7311 093a:262a Webcam 300k | 241 | pac7311 093a:262a Webcam 300k |
239 | pac7311 093a:262c Philips SPC 230 NC | 242 | pac7311 093a:262c Philips SPC 230 NC |
243 | jeilinj 0979:0280 Sakar 57379 | ||
240 | zc3xx 0ac8:0302 Z-star Vimicro zc0302 | 244 | zc3xx 0ac8:0302 Z-star Vimicro zc0302 |
241 | vc032x 0ac8:0321 Vimicro generic vc0321 | 245 | vc032x 0ac8:0321 Vimicro generic vc0321 |
242 | vc032x 0ac8:0323 Vimicro Vc0323 | 246 | vc032x 0ac8:0323 Vimicro Vc0323 |
@@ -247,6 +251,7 @@ zc3xx 0ac8:305b Z-star Vimicro zc0305b | |||
247 | zc3xx 0ac8:307b Ldlc VC302+Ov7620 | 251 | zc3xx 0ac8:307b Ldlc VC302+Ov7620 |
248 | vc032x 0ac8:c001 Sony embedded vimicro | 252 | vc032x 0ac8:c001 Sony embedded vimicro |
249 | vc032x 0ac8:c002 Sony embedded vimicro | 253 | vc032x 0ac8:c002 Sony embedded vimicro |
254 | vc032x 0ac8:c301 Samsung Q1 Ultra Premium | ||
250 | spca508 0af9:0010 Hama USB Sightcam 100 | 255 | spca508 0af9:0010 Hama USB Sightcam 100 |
251 | spca508 0af9:0011 Hama USB Sightcam 100 | 256 | spca508 0af9:0011 Hama USB Sightcam 100 |
252 | sonixb 0c45:6001 Genius VideoCAM NB | 257 | sonixb 0c45:6001 Genius VideoCAM NB |
@@ -284,6 +289,7 @@ sonixj 0c45:613a Microdia Sonix PC Camera | |||
284 | sonixj 0c45:613b Surfer SN-206 | 289 | sonixj 0c45:613b Surfer SN-206 |
285 | sonixj 0c45:613c Sonix Pccam168 | 290 | sonixj 0c45:613c Sonix Pccam168 |
286 | sonixj 0c45:6143 Sonix Pccam168 | 291 | sonixj 0c45:6143 Sonix Pccam168 |
292 | sonixj 0c45:6148 Digitus DA-70811/ZSMC USB PC Camera ZS211/Microdia | ||
287 | sn9c20x 0c45:6240 PC Camera (SN9C201 + MT9M001) | 293 | sn9c20x 0c45:6240 PC Camera (SN9C201 + MT9M001) |
288 | sn9c20x 0c45:6242 PC Camera (SN9C201 + MT9M111) | 294 | sn9c20x 0c45:6242 PC Camera (SN9C201 + MT9M111) |
289 | sn9c20x 0c45:6248 PC Camera (SN9C201 + OV9655) | 295 | sn9c20x 0c45:6248 PC Camera (SN9C201 + OV9655) |
diff --git a/Documentation/video4linux/si4713.txt b/Documentation/video4linux/si4713.txt new file mode 100644 index 000000000000..25abdb78209d --- /dev/null +++ b/Documentation/video4linux/si4713.txt | |||
@@ -0,0 +1,176 @@ | |||
1 | Driver for I2C radios for the Silicon Labs Si4713 FM Radio Transmitters | ||
2 | |||
3 | Copyright (c) 2009 Nokia Corporation | ||
4 | Contact: Eduardo Valentin <eduardo.valentin@nokia.com> | ||
5 | |||
6 | |||
7 | Information about the Device | ||
8 | ============================ | ||
9 | This chip is a Silicon Labs product. It is a I2C device, currently on 0x63 address. | ||
10 | Basically, it has transmission and signal noise level measurement features. | ||
11 | |||
12 | The Si4713 integrates transmit functions for FM broadcast stereo transmission. | ||
13 | The chip also allows integrated receive power scanning to identify low signal | ||
14 | power FM channels. | ||
15 | |||
16 | The chip is programmed using commands and responses. There are also several | ||
17 | properties which can change the behavior of this chip. | ||
18 | |||
19 | Users must comply with local regulations on radio frequency (RF) transmission. | ||
20 | |||
21 | Device driver description | ||
22 | ========================= | ||
23 | There are two modules to handle this device. One is a I2C device driver | ||
24 | and the other is a platform driver. | ||
25 | |||
26 | The I2C device driver exports a v4l2-subdev interface to the kernel. | ||
27 | All properties can also be accessed by v4l2 extended controls interface, by | ||
28 | using the v4l2-subdev calls (g_ext_ctrls, s_ext_ctrls). | ||
29 | |||
30 | The platform device driver exports a v4l2 radio device interface to user land. | ||
31 | So, it uses the I2C device driver as a sub device in order to send the user | ||
32 | commands to the actual device. Basically it is a wrapper to the I2C device driver. | ||
33 | |||
34 | Applications can use v4l2 radio API to specify frequency of operation, mute state, | ||
35 | etc. But mostly of its properties will be present in the extended controls. | ||
36 | |||
37 | When the v4l2 mute property is set to 1 (true), the driver will turn the chip off. | ||
38 | |||
39 | Properties description | ||
40 | ====================== | ||
41 | |||
42 | The properties can be accessed using v4l2 extended controls. | ||
43 | Here is an output from v4l2-ctl util: | ||
44 | / # v4l2-ctl -d /dev/radio0 --all -L | ||
45 | Driver Info: | ||
46 | Driver name : radio-si4713 | ||
47 | Card type : Silicon Labs Si4713 Modulator | ||
48 | Bus info : | ||
49 | Driver version: 0 | ||
50 | Capabilities : 0x00080800 | ||
51 | RDS Output | ||
52 | Modulator | ||
53 | Audio output: 0 (FM Modulator Audio Out) | ||
54 | Frequency: 1408000 (88.000000 MHz) | ||
55 | Video Standard = 0x00000000 | ||
56 | Modulator: | ||
57 | Name : FM Modulator | ||
58 | Capabilities : 62.5 Hz stereo rds | ||
59 | Frequency range : 76.0 MHz - 108.0 MHz | ||
60 | Subchannel modulation: stereo+rds | ||
61 | |||
62 | User Controls | ||
63 | |||
64 | mute (bool) : default=1 value=0 | ||
65 | |||
66 | FM Radio Modulator Controls | ||
67 | |||
68 | rds_signal_deviation (int) : min=0 max=90000 step=10 default=200 value=200 flags=slider | ||
69 | rds_program_id (int) : min=0 max=65535 step=1 default=0 value=0 | ||
70 | rds_program_type (int) : min=0 max=31 step=1 default=0 value=0 | ||
71 | rds_ps_name (str) : min=0 max=96 step=8 value='si4713 ' | ||
72 | rds_radio_text (str) : min=0 max=384 step=32 value='' | ||
73 | audio_limiter_feature_enabled (bool) : default=1 value=1 | ||
74 | audio_limiter_release_time (int) : min=250 max=102390 step=50 default=5010 value=5010 flags=slider | ||
75 | audio_limiter_deviation (int) : min=0 max=90000 step=10 default=66250 value=66250 flags=slider | ||
76 | audio_compression_feature_enabl (bool) : default=1 value=1 | ||
77 | audio_compression_gain (int) : min=0 max=20 step=1 default=15 value=15 flags=slider | ||
78 | audio_compression_threshold (int) : min=-40 max=0 step=1 default=-40 value=-40 flags=slider | ||
79 | audio_compression_attack_time (int) : min=0 max=5000 step=500 default=0 value=0 flags=slider | ||
80 | audio_compression_release_time (int) : min=100000 max=1000000 step=100000 default=1000000 value=1000000 flags=slider | ||
81 | pilot_tone_feature_enabled (bool) : default=1 value=1 | ||
82 | pilot_tone_deviation (int) : min=0 max=90000 step=10 default=6750 value=6750 flags=slider | ||
83 | pilot_tone_frequency (int) : min=0 max=19000 step=1 default=19000 value=19000 flags=slider | ||
84 | pre_emphasis_settings (menu) : min=0 max=2 default=1 value=1 | ||
85 | tune_power_level (int) : min=0 max=120 step=1 default=88 value=88 flags=slider | ||
86 | tune_antenna_capacitor (int) : min=0 max=191 step=1 default=0 value=110 flags=slider | ||
87 | / # | ||
88 | |||
89 | Here is a summary of them: | ||
90 | |||
91 | * Pilot is an audible tone sent by the device. | ||
92 | |||
93 | pilot_frequency - Configures the frequency of the stereo pilot tone. | ||
94 | pilot_deviation - Configures pilot tone frequency deviation level. | ||
95 | pilot_enabled - Enables or disables the pilot tone feature. | ||
96 | |||
97 | * The si4713 device is capable of applying audio compression to the transmitted signal. | ||
98 | |||
99 | acomp_enabled - Enables or disables the audio dynamic range control feature. | ||
100 | acomp_gain - Sets the gain for audio dynamic range control. | ||
101 | acomp_threshold - Sets the threshold level for audio dynamic range control. | ||
102 | acomp_attack_time - Sets the attack time for audio dynamic range control. | ||
103 | acomp_release_time - Sets the release time for audio dynamic range control. | ||
104 | |||
105 | * Limiter setups audio deviation limiter feature. Once a over deviation occurs, | ||
106 | it is possible to adjust the front-end gain of the audio input and always | ||
107 | prevent over deviation. | ||
108 | |||
109 | limiter_enabled - Enables or disables the limiter feature. | ||
110 | limiter_deviation - Configures audio frequency deviation level. | ||
111 | limiter_release_time - Sets the limiter release time. | ||
112 | |||
113 | * Tuning power | ||
114 | |||
115 | power_level - Sets the output power level for signal transmission. | ||
116 | antenna_capacitor - This selects the value of antenna tuning capacitor manually | ||
117 | or automatically if set to zero. | ||
118 | |||
119 | * RDS related | ||
120 | |||
121 | rds_ps_name - Sets the RDS ps name field for transmission. | ||
122 | rds_radio_text - Sets the RDS radio text for transmission. | ||
123 | rds_pi - Sets the RDS PI field for transmission. | ||
124 | rds_pty - Sets the RDS PTY field for transmission. | ||
125 | |||
126 | * Region related | ||
127 | |||
128 | preemphasis - sets the preemphasis to be applied for transmission. | ||
129 | |||
130 | RNL | ||
131 | === | ||
132 | |||
133 | This device also has an interface to measure received noise level. To do that, you should | ||
134 | ioctl the device node. Here is an code of example: | ||
135 | |||
136 | int main (int argc, char *argv[]) | ||
137 | { | ||
138 | struct si4713_rnl rnl; | ||
139 | int fd = open("/dev/radio0", O_RDWR); | ||
140 | int rval; | ||
141 | |||
142 | if (argc < 2) | ||
143 | return -EINVAL; | ||
144 | |||
145 | if (fd < 0) | ||
146 | return fd; | ||
147 | |||
148 | sscanf(argv[1], "%d", &rnl.frequency); | ||
149 | |||
150 | rval = ioctl(fd, SI4713_IOC_MEASURE_RNL, &rnl); | ||
151 | if (rval < 0) | ||
152 | return rval; | ||
153 | |||
154 | printf("received noise level: %d\n", rnl.rnl); | ||
155 | |||
156 | close(fd); | ||
157 | } | ||
158 | |||
159 | The struct si4713_rnl and SI4713_IOC_MEASURE_RNL are defined under | ||
160 | include/media/si4713.h. | ||
161 | |||
162 | Stereo/Mono and RDS subchannels | ||
163 | =============================== | ||
164 | |||
165 | The device can also be configured using the available sub channels for | ||
166 | transmission. To do that use S/G_MODULATOR ioctl and configure txsubchans properly. | ||
167 | Refer to v4l2-spec for proper use of this ioctl. | ||
168 | |||
169 | Testing | ||
170 | ======= | ||
171 | Testing is usually done with v4l2-ctl utility for managing FM tuner cards. | ||
172 | The tool can be found in v4l-dvb repository under v4l2-apps/util directory. | ||
173 | |||
174 | Example for setting rds ps name: | ||
175 | # v4l2-ctl -d /dev/radio0 --set-ctrl=rds_ps_name="Dummy" | ||
176 | |||
diff --git a/Documentation/vm/slub.txt b/Documentation/vm/slub.txt index bb1f5c6e28b3..510917ff59ed 100644 --- a/Documentation/vm/slub.txt +++ b/Documentation/vm/slub.txt | |||
@@ -41,6 +41,8 @@ Possible debug options are | |||
41 | P Poisoning (object and padding) | 41 | P Poisoning (object and padding) |
42 | U User tracking (free and alloc) | 42 | U User tracking (free and alloc) |
43 | T Trace (please only use on single slabs) | 43 | T Trace (please only use on single slabs) |
44 | O Switch debugging off for caches that would have | ||
45 | caused higher minimum slab orders | ||
44 | - Switch all debugging off (useful if the kernel is | 46 | - Switch all debugging off (useful if the kernel is |
45 | configured with CONFIG_SLUB_DEBUG_ON) | 47 | configured with CONFIG_SLUB_DEBUG_ON) |
46 | 48 | ||
@@ -59,6 +61,14 @@ to the dentry cache with | |||
59 | 61 | ||
60 | slub_debug=F,dentry | 62 | slub_debug=F,dentry |
61 | 63 | ||
64 | Debugging options may require the minimum possible slab order to increase as | ||
65 | a result of storing the metadata (for example, caches with PAGE_SIZE object | ||
66 | sizes). This has a higher liklihood of resulting in slab allocation errors | ||
67 | in low memory situations or if there's high fragmentation of memory. To | ||
68 | switch off debugging for such caches by default, use | ||
69 | |||
70 | slub_debug=O | ||
71 | |||
62 | In case you forgot to enable debugging on the kernel command line: It is | 72 | In case you forgot to enable debugging on the kernel command line: It is |
63 | possible to enable debugging manually when the kernel is up. Look at the | 73 | possible to enable debugging manually when the kernel is up. Look at the |
64 | contents of: | 74 | contents of: |
diff --git a/Documentation/x86/zero-page.txt b/Documentation/x86/zero-page.txt index 4f913857b8a2..feb37e177010 100644 --- a/Documentation/x86/zero-page.txt +++ b/Documentation/x86/zero-page.txt | |||
@@ -12,6 +12,7 @@ Offset Proto Name Meaning | |||
12 | 000/040 ALL screen_info Text mode or frame buffer information | 12 | 000/040 ALL screen_info Text mode or frame buffer information |
13 | (struct screen_info) | 13 | (struct screen_info) |
14 | 040/014 ALL apm_bios_info APM BIOS information (struct apm_bios_info) | 14 | 040/014 ALL apm_bios_info APM BIOS information (struct apm_bios_info) |
15 | 058/008 ALL tboot_addr Physical address of tboot shared page | ||
15 | 060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information | 16 | 060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information |
16 | (struct ist_info) | 17 | (struct ist_info) |
17 | 080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!! | 18 | 080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!! |