diff options
author | Jeff Garzik <jeff@garzik.org> | 2007-02-17 15:11:43 -0500 |
---|---|---|
committer | Jeff Garzik <jeff@garzik.org> | 2007-02-17 15:11:43 -0500 |
commit | f630fe2817601314b2eb7ca5ddc23c7834646731 (patch) | |
tree | 3bfb4939b7bbc3859575ca8b58fa3f929b015941 /Documentation | |
parent | 48c871c1f6a7c7044dd76774fb469e65c7e2e4e8 (diff) | |
parent | 8a03d9a498eaf02c8a118752050a5154852c13bf (diff) |
Merge branch 'master' into upstream
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/gpio.txt | 17 | ||||
-rw-r--r-- | Documentation/hrtimer/timer_stats.txt | 68 | ||||
-rw-r--r-- | Documentation/hrtimers/highres.txt | 249 | ||||
-rw-r--r-- | Documentation/hrtimers/hrtimers.txt (renamed from Documentation/hrtimers.txt) | 0 | ||||
-rw-r--r-- | Documentation/i2c/busses/i2c-i801 | 60 | ||||
-rw-r--r-- | Documentation/i2c/busses/i2c-parport | 15 | ||||
-rw-r--r-- | Documentation/i2c/busses/i2c-piix4 | 2 | ||||
-rw-r--r-- | Documentation/i2c/busses/i2c-viapro | 7 | ||||
-rw-r--r-- | Documentation/i2c/porting-clients | 6 | ||||
-rw-r--r-- | Documentation/i2c/smbus-protocol | 2 | ||||
-rw-r--r-- | Documentation/i2c/writing-clients | 58 | ||||
-rw-r--r-- | Documentation/kernel-parameters.txt | 16 | ||||
-rw-r--r-- | Documentation/powerpc/booting-without-of.txt | 4 | ||||
-rw-r--r-- | Documentation/powerpc/mpc52xx-device-tree-bindings.txt | 183 | ||||
-rw-r--r-- | Documentation/x86_64/boot-options.txt | 132 | ||||
-rw-r--r-- | Documentation/x86_64/cpu-hotplug-spec | 2 | ||||
-rw-r--r-- | Documentation/x86_64/kernel-stacks | 26 | ||||
-rw-r--r-- | Documentation/x86_64/machinecheck | 70 | ||||
-rw-r--r-- | Documentation/x86_64/mm.txt | 22 |
19 files changed, 780 insertions, 159 deletions
diff --git a/Documentation/gpio.txt b/Documentation/gpio.txt index 09dd510c4a5f..576ce463cf44 100644 --- a/Documentation/gpio.txt +++ b/Documentation/gpio.txt | |||
@@ -78,7 +78,8 @@ Identifying GPIOs | |||
78 | ----------------- | 78 | ----------------- |
79 | GPIOs are identified by unsigned integers in the range 0..MAX_INT. That | 79 | GPIOs are identified by unsigned integers in the range 0..MAX_INT. That |
80 | reserves "negative" numbers for other purposes like marking signals as | 80 | reserves "negative" numbers for other purposes like marking signals as |
81 | "not available on this board", or indicating faults. | 81 | "not available on this board", or indicating faults. Code that doesn't |
82 | touch the underlying hardware treats these integers as opaque cookies. | ||
82 | 83 | ||
83 | Platforms define how they use those integers, and usually #define symbols | 84 | Platforms define how they use those integers, and usually #define symbols |
84 | for the GPIO lines so that board-specific setup code directly corresponds | 85 | for the GPIO lines so that board-specific setup code directly corresponds |
@@ -139,10 +140,10 @@ issues including wire-OR and output latencies. | |||
139 | The get/set calls have no error returns because "invalid GPIO" should have | 140 | The get/set calls have no error returns because "invalid GPIO" should have |
140 | been reported earlier in gpio_set_direction(). However, note that not all | 141 | been reported earlier in gpio_set_direction(). However, note that not all |
141 | platforms can read the value of output pins; those that can't should always | 142 | platforms can read the value of output pins; those that can't should always |
142 | return zero. Also, these calls will be ignored for GPIOs that can't safely | 143 | return zero. Also, using these calls for GPIOs that can't safely be accessed |
143 | be accessed wihtout sleeping (see below). | 144 | without sleeping (see below) is an error. |
144 | 145 | ||
145 | Platform-specific implementations are encouraged to optimise the two | 146 | Platform-specific implementations are encouraged to optimize the two |
146 | calls to access the GPIO value in cases where the GPIO number (and for | 147 | calls to access the GPIO value in cases where the GPIO number (and for |
147 | output, value) are constant. It's normal for them to need only a couple | 148 | output, value) are constant. It's normal for them to need only a couple |
148 | of instructions in such cases (reading or writing a hardware register), | 149 | of instructions in such cases (reading or writing a hardware register), |
@@ -239,7 +240,8 @@ options are part of the IRQ interface, e.g. IRQF_TRIGGER_FALLING, as are | |||
239 | system wakeup capabilities. | 240 | system wakeup capabilities. |
240 | 241 | ||
241 | Non-error values returned from irq_to_gpio() would most commonly be used | 242 | Non-error values returned from irq_to_gpio() would most commonly be used |
242 | with gpio_get_value(). | 243 | with gpio_get_value(), for example to initialize or update driver state |
244 | when the IRQ is edge-triggered. | ||
243 | 245 | ||
244 | 246 | ||
245 | 247 | ||
@@ -260,9 +262,10 @@ pullups (or pulldowns) so that the on-chip ones should not be used. | |||
260 | There are other system-specific mechanisms that are not specified here, | 262 | There are other system-specific mechanisms that are not specified here, |
261 | like the aforementioned options for input de-glitching and wire-OR output. | 263 | like the aforementioned options for input de-glitching and wire-OR output. |
262 | Hardware may support reading or writing GPIOs in gangs, but that's usually | 264 | Hardware may support reading or writing GPIOs in gangs, but that's usually |
263 | configuration dependednt: for GPIOs sharing the same bank. (GPIOs are | 265 | configuration dependent: for GPIOs sharing the same bank. (GPIOs are |
264 | commonly grouped in banks of 16 or 32, with a given SOC having several such | 266 | commonly grouped in banks of 16 or 32, with a given SOC having several such |
265 | banks.) Code relying on such mechanisms will necessarily be nonportable. | 267 | banks.) Some systems can trigger IRQs from output GPIOs. Code relying on |
268 | such mechanisms will necessarily be nonportable. | ||
266 | 269 | ||
267 | Dynamic definition of GPIOs is not currently supported; for example, as | 270 | Dynamic definition of GPIOs is not currently supported; for example, as |
268 | a side effect of configuring an add-on board with some GPIO expanders. | 271 | a side effect of configuring an add-on board with some GPIO expanders. |
diff --git a/Documentation/hrtimer/timer_stats.txt b/Documentation/hrtimer/timer_stats.txt new file mode 100644 index 000000000000..27f782e3593f --- /dev/null +++ b/Documentation/hrtimer/timer_stats.txt | |||
@@ -0,0 +1,68 @@ | |||
1 | timer_stats - timer usage statistics | ||
2 | ------------------------------------ | ||
3 | |||
4 | timer_stats is a debugging facility to make the timer (ab)usage in a Linux | ||
5 | system visible to kernel and userspace developers. It is not intended for | ||
6 | production usage as it adds significant overhead to the (hr)timer code and the | ||
7 | (hr)timer data structures. | ||
8 | |||
9 | timer_stats should be used by kernel and userspace developers to verify that | ||
10 | their code does not make unduly use of timers. This helps to avoid unnecessary | ||
11 | wakeups, which should be avoided to optimize power consumption. | ||
12 | |||
13 | It can be enabled by CONFIG_TIMER_STATS in the "Kernel hacking" configuration | ||
14 | section. | ||
15 | |||
16 | timer_stats collects information about the timer events which are fired in a | ||
17 | Linux system over a sample period: | ||
18 | |||
19 | - the pid of the task(process) which initialized the timer | ||
20 | - the name of the process which initialized the timer | ||
21 | - the function where the timer was intialized | ||
22 | - the callback function which is associated to the timer | ||
23 | - the number of events (callbacks) | ||
24 | |||
25 | timer_stats adds an entry to /proc: /proc/timer_stats | ||
26 | |||
27 | This entry is used to control the statistics functionality and to read out the | ||
28 | sampled information. | ||
29 | |||
30 | The timer_stats functionality is inactive on bootup. | ||
31 | |||
32 | To activate a sample period issue: | ||
33 | # echo 1 >/proc/timer_stats | ||
34 | |||
35 | To stop a sample period issue: | ||
36 | # echo 0 >/proc/timer_stats | ||
37 | |||
38 | The statistics can be retrieved by: | ||
39 | # cat /proc/timer_stats | ||
40 | |||
41 | The readout of /proc/timer_stats automatically disables sampling. The sampled | ||
42 | information is kept until a new sample period is started. This allows multiple | ||
43 | readouts. | ||
44 | |||
45 | Sample output of /proc/timer_stats: | ||
46 | |||
47 | Timerstats sample period: 3.888770 s | ||
48 | 12, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) | ||
49 | 15, 1 swapper hcd_submit_urb (rh_timer_func) | ||
50 | 4, 959 kedac schedule_timeout (process_timeout) | ||
51 | 1, 0 swapper page_writeback_init (wb_timer_fn) | ||
52 | 28, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) | ||
53 | 22, 2948 IRQ 4 tty_flip_buffer_push (delayed_work_timer_fn) | ||
54 | 3, 3100 bash schedule_timeout (process_timeout) | ||
55 | 1, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) | ||
56 | 1, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) | ||
57 | 1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer) | ||
58 | 1, 2292 ip __netdev_watchdog_up (dev_watchdog) | ||
59 | 1, 23 events/1 do_cache_clean (delayed_work_timer_fn) | ||
60 | 90 total events, 30.0 events/sec | ||
61 | |||
62 | The first column is the number of events, the second column the pid, the third | ||
63 | column is the name of the process. The forth column shows the function which | ||
64 | initialized the timer and in parantheses the callback function which was | ||
65 | executed on expiry. | ||
66 | |||
67 | Thomas, Ingo | ||
68 | |||
diff --git a/Documentation/hrtimers/highres.txt b/Documentation/hrtimers/highres.txt new file mode 100644 index 000000000000..ce0e9a91e157 --- /dev/null +++ b/Documentation/hrtimers/highres.txt | |||
@@ -0,0 +1,249 @@ | |||
1 | High resolution timers and dynamic ticks design notes | ||
2 | ----------------------------------------------------- | ||
3 | |||
4 | Further information can be found in the paper of the OLS 2006 talk "hrtimers | ||
5 | and beyond". The paper is part of the OLS 2006 Proceedings Volume 1, which can | ||
6 | be found on the OLS website: | ||
7 | http://www.linuxsymposium.org/2006/linuxsymposium_procv1.pdf | ||
8 | |||
9 | The slides to this talk are available from: | ||
10 | http://tglx.de/projects/hrtimers/ols2006-hrtimers.pdf | ||
11 | |||
12 | The slides contain five figures (pages 2, 15, 18, 20, 22), which illustrate the | ||
13 | changes in the time(r) related Linux subsystems. Figure #1 (p. 2) shows the | ||
14 | design of the Linux time(r) system before hrtimers and other building blocks | ||
15 | got merged into mainline. | ||
16 | |||
17 | Note: the paper and the slides are talking about "clock event source", while we | ||
18 | switched to the name "clock event devices" in meantime. | ||
19 | |||
20 | The design contains the following basic building blocks: | ||
21 | |||
22 | - hrtimer base infrastructure | ||
23 | - timeofday and clock source management | ||
24 | - clock event management | ||
25 | - high resolution timer functionality | ||
26 | - dynamic ticks | ||
27 | |||
28 | |||
29 | hrtimer base infrastructure | ||
30 | --------------------------- | ||
31 | |||
32 | The hrtimer base infrastructure was merged into the 2.6.16 kernel. Details of | ||
33 | the base implementation are covered in Documentation/hrtimers/hrtimer.txt. See | ||
34 | also figure #2 (OLS slides p. 15) | ||
35 | |||
36 | The main differences to the timer wheel, which holds the armed timer_list type | ||
37 | timers are: | ||
38 | - time ordered enqueueing into a rb-tree | ||
39 | - independent of ticks (the processing is based on nanoseconds) | ||
40 | |||
41 | |||
42 | timeofday and clock source management | ||
43 | ------------------------------------- | ||
44 | |||
45 | John Stultz's Generic Time Of Day (GTOD) framework moves a large portion of | ||
46 | code out of the architecture-specific areas into a generic management | ||
47 | framework, as illustrated in figure #3 (OLS slides p. 18). The architecture | ||
48 | specific portion is reduced to the low level hardware details of the clock | ||
49 | sources, which are registered in the framework and selected on a quality based | ||
50 | decision. The low level code provides hardware setup and readout routines and | ||
51 | initializes data structures, which are used by the generic time keeping code to | ||
52 | convert the clock ticks to nanosecond based time values. All other time keeping | ||
53 | related functionality is moved into the generic code. The GTOD base patch got | ||
54 | merged into the 2.6.18 kernel. | ||
55 | |||
56 | Further information about the Generic Time Of Day framework is available in the | ||
57 | OLS 2005 Proceedings Volume 1: | ||
58 | http://www.linuxsymposium.org/2005/linuxsymposium_procv1.pdf | ||
59 | |||
60 | The paper "We Are Not Getting Any Younger: A New Approach to Time and | ||
61 | Timers" was written by J. Stultz, D.V. Hart, & N. Aravamudan. | ||
62 | |||
63 | Figure #3 (OLS slides p.18) illustrates the transformation. | ||
64 | |||
65 | |||
66 | clock event management | ||
67 | ---------------------- | ||
68 | |||
69 | While clock sources provide read access to the monotonically increasing time | ||
70 | value, clock event devices are used to schedule the next event | ||
71 | interrupt(s). The next event is currently defined to be periodic, with its | ||
72 | period defined at compile time. The setup and selection of the event device | ||
73 | for various event driven functionalities is hardwired into the architecture | ||
74 | dependent code. This results in duplicated code across all architectures and | ||
75 | makes it extremely difficult to change the configuration of the system to use | ||
76 | event interrupt devices other than those already built into the | ||
77 | architecture. Another implication of the current design is that it is necessary | ||
78 | to touch all the architecture-specific implementations in order to provide new | ||
79 | functionality like high resolution timers or dynamic ticks. | ||
80 | |||
81 | The clock events subsystem tries to address this problem by providing a generic | ||
82 | solution to manage clock event devices and their usage for the various clock | ||
83 | event driven kernel functionalities. The goal of the clock event subsystem is | ||
84 | to minimize the clock event related architecture dependent code to the pure | ||
85 | hardware related handling and to allow easy addition and utilization of new | ||
86 | clock event devices. It also minimizes the duplicated code across the | ||
87 | architectures as it provides generic functionality down to the interrupt | ||
88 | service handler, which is almost inherently hardware dependent. | ||
89 | |||
90 | Clock event devices are registered either by the architecture dependent boot | ||
91 | code or at module insertion time. Each clock event device fills a data | ||
92 | structure with clock-specific property parameters and callback functions. The | ||
93 | clock event management decides, by using the specified property parameters, the | ||
94 | set of system functions a clock event device will be used to support. This | ||
95 | includes the distinction of per-CPU and per-system global event devices. | ||
96 | |||
97 | System-level global event devices are used for the Linux periodic tick. Per-CPU | ||
98 | event devices are used to provide local CPU functionality such as process | ||
99 | accounting, profiling, and high resolution timers. | ||
100 | |||
101 | The management layer assignes one or more of the folliwing functions to a clock | ||
102 | event device: | ||
103 | - system global periodic tick (jiffies update) | ||
104 | - cpu local update_process_times | ||
105 | - cpu local profiling | ||
106 | - cpu local next event interrupt (non periodic mode) | ||
107 | |||
108 | The clock event device delegates the selection of those timer interrupt related | ||
109 | functions completely to the management layer. The clock management layer stores | ||
110 | a function pointer in the device description structure, which has to be called | ||
111 | from the hardware level handler. This removes a lot of duplicated code from the | ||
112 | architecture specific timer interrupt handlers and hands the control over the | ||
113 | clock event devices and the assignment of timer interrupt related functionality | ||
114 | to the core code. | ||
115 | |||
116 | The clock event layer API is rather small. Aside from the clock event device | ||
117 | registration interface it provides functions to schedule the next event | ||
118 | interrupt, clock event device notification service and support for suspend and | ||
119 | resume. | ||
120 | |||
121 | The framework adds about 700 lines of code which results in a 2KB increase of | ||
122 | the kernel binary size. The conversion of i386 removes about 100 lines of | ||
123 | code. The binary size decrease is in the range of 400 byte. We believe that the | ||
124 | increase of flexibility and the avoidance of duplicated code across | ||
125 | architectures justifies the slight increase of the binary size. | ||
126 | |||
127 | The conversion of an architecture has no functional impact, but allows to | ||
128 | utilize the high resolution and dynamic tick functionalites without any change | ||
129 | to the clock event device and timer interrupt code. After the conversion the | ||
130 | enabling of high resolution timers and dynamic ticks is simply provided by | ||
131 | adding the kernel/time/Kconfig file to the architecture specific Kconfig and | ||
132 | adding the dynamic tick specific calls to the idle routine (a total of 3 lines | ||
133 | added to the idle function and the Kconfig file) | ||
134 | |||
135 | Figure #4 (OLS slides p.20) illustrates the transformation. | ||
136 | |||
137 | |||
138 | high resolution timer functionality | ||
139 | ----------------------------------- | ||
140 | |||
141 | During system boot it is not possible to use the high resolution timer | ||
142 | functionality, while making it possible would be difficult and would serve no | ||
143 | useful function. The initialization of the clock event device framework, the | ||
144 | clock source framework (GTOD) and hrtimers itself has to be done and | ||
145 | appropriate clock sources and clock event devices have to be registered before | ||
146 | the high resolution functionality can work. Up to the point where hrtimers are | ||
147 | initialized, the system works in the usual low resolution periodic mode. The | ||
148 | clock source and the clock event device layers provide notification functions | ||
149 | which inform hrtimers about availability of new hardware. hrtimers validates | ||
150 | the usability of the registered clock sources and clock event devices before | ||
151 | switching to high resolution mode. This ensures also that a kernel which is | ||
152 | configured for high resolution timers can run on a system which lacks the | ||
153 | necessary hardware support. | ||
154 | |||
155 | The high resolution timer code does not support SMP machines which have only | ||
156 | global clock event devices. The support of such hardware would involve IPI | ||
157 | calls when an interrupt happens. The overhead would be much larger than the | ||
158 | benefit. This is the reason why we currently disable high resolution and | ||
159 | dynamic ticks on i386 SMP systems which stop the local APIC in C3 power | ||
160 | state. A workaround is available as an idea, but the problem has not been | ||
161 | tackled yet. | ||
162 | |||
163 | The time ordered insertion of timers provides all the infrastructure to decide | ||
164 | whether the event device has to be reprogrammed when a timer is added. The | ||
165 | decision is made per timer base and synchronized across per-cpu timer bases in | ||
166 | a support function. The design allows the system to utilize separate per-CPU | ||
167 | clock event devices for the per-CPU timer bases, but currently only one | ||
168 | reprogrammable clock event device per-CPU is utilized. | ||
169 | |||
170 | When the timer interrupt happens, the next event interrupt handler is called | ||
171 | from the clock event distribution code and moves expired timers from the | ||
172 | red-black tree to a separate double linked list and invokes the softirq | ||
173 | handler. An additional mode field in the hrtimer structure allows the system to | ||
174 | execute callback functions directly from the next event interrupt handler. This | ||
175 | is restricted to code which can safely be executed in the hard interrupt | ||
176 | context. This applies, for example, to the common case of a wakeup function as | ||
177 | used by nanosleep. The advantage of executing the handler in the interrupt | ||
178 | context is the avoidance of up to two context switches - from the interrupted | ||
179 | context to the softirq and to the task which is woken up by the expired | ||
180 | timer. | ||
181 | |||
182 | Once a system has switched to high resolution mode, the periodic tick is | ||
183 | switched off. This disables the per system global periodic clock event device - | ||
184 | e.g. the PIT on i386 SMP systems. | ||
185 | |||
186 | The periodic tick functionality is provided by an per-cpu hrtimer. The callback | ||
187 | function is executed in the next event interrupt context and updates jiffies | ||
188 | and calls update_process_times and profiling. The implementation of the hrtimer | ||
189 | based periodic tick is designed to be extended with dynamic tick functionality. | ||
190 | This allows to use a single clock event device to schedule high resolution | ||
191 | timer and periodic events (jiffies tick, profiling, process accounting) on UP | ||
192 | systems. This has been proved to work with the PIT on i386 and the Incrementer | ||
193 | on PPC. | ||
194 | |||
195 | The softirq for running the hrtimer queues and executing the callbacks has been | ||
196 | separated from the tick bound timer softirq to allow accurate delivery of high | ||
197 | resolution timer signals which are used by itimer and POSIX interval | ||
198 | timers. The execution of this softirq can still be delayed by other softirqs, | ||
199 | but the overall latencies have been significantly improved by this separation. | ||
200 | |||
201 | Figure #5 (OLS slides p.22) illustrates the transformation. | ||
202 | |||
203 | |||
204 | dynamic ticks | ||
205 | ------------- | ||
206 | |||
207 | Dynamic ticks are the logical consequence of the hrtimer based periodic tick | ||
208 | replacement (sched_tick). The functionality of the sched_tick hrtimer is | ||
209 | extended by three functions: | ||
210 | |||
211 | - hrtimer_stop_sched_tick | ||
212 | - hrtimer_restart_sched_tick | ||
213 | - hrtimer_update_jiffies | ||
214 | |||
215 | hrtimer_stop_sched_tick() is called when a CPU goes into idle state. The code | ||
216 | evaluates the next scheduled timer event (from both hrtimers and the timer | ||
217 | wheel) and in case that the next event is further away than the next tick it | ||
218 | reprograms the sched_tick to this future event, to allow longer idle sleeps | ||
219 | without worthless interruption by the periodic tick. The function is also | ||
220 | called when an interrupt happens during the idle period, which does not cause a | ||
221 | reschedule. The call is necessary as the interrupt handler might have armed a | ||
222 | new timer whose expiry time is before the time which was identified as the | ||
223 | nearest event in the previous call to hrtimer_stop_sched_tick. | ||
224 | |||
225 | hrtimer_restart_sched_tick() is called when the CPU leaves the idle state before | ||
226 | it calls schedule(). hrtimer_restart_sched_tick() resumes the periodic tick, | ||
227 | which is kept active until the next call to hrtimer_stop_sched_tick(). | ||
228 | |||
229 | hrtimer_update_jiffies() is called from irq_enter() when an interrupt happens | ||
230 | in the idle period to make sure that jiffies are up to date and the interrupt | ||
231 | handler has not to deal with an eventually stale jiffy value. | ||
232 | |||
233 | The dynamic tick feature provides statistical values which are exported to | ||
234 | userspace via /proc/stats and can be made available for enhanced power | ||
235 | management control. | ||
236 | |||
237 | The implementation leaves room for further development like full tickless | ||
238 | systems, where the time slice is controlled by the scheduler, variable | ||
239 | frequency profiling, and a complete removal of jiffies in the future. | ||
240 | |||
241 | |||
242 | Aside the current initial submission of i386 support, the patchset has been | ||
243 | extended to x86_64 and ARM already. Initial (work in progress) support is also | ||
244 | available for MIPS and PowerPC. | ||
245 | |||
246 | Thomas, Ingo | ||
247 | |||
248 | |||
249 | |||
diff --git a/Documentation/hrtimers.txt b/Documentation/hrtimers/hrtimers.txt index ce31f65e12e7..ce31f65e12e7 100644 --- a/Documentation/hrtimers.txt +++ b/Documentation/hrtimers/hrtimers.txt | |||
diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801 index 3db69a086c41..c34f0db78a30 100644 --- a/Documentation/i2c/busses/i2c-i801 +++ b/Documentation/i2c/busses/i2c-i801 | |||
@@ -48,14 +48,9 @@ following: | |||
48 | The SMBus controller is function 3 in device 1f. Class 0c05 is SMBus Serial | 48 | The SMBus controller is function 3 in device 1f. Class 0c05 is SMBus Serial |
49 | Controller. | 49 | Controller. |
50 | 50 | ||
51 | If you do NOT see the 24x3 device at function 3, and you can't figure out | ||
52 | any way in the BIOS to enable it, | ||
53 | |||
54 | The ICH chips are quite similar to Intel's PIIX4 chip, at least in the | 51 | The ICH chips are quite similar to Intel's PIIX4 chip, at least in the |
55 | SMBus controller. | 52 | SMBus controller. |
56 | 53 | ||
57 | See the file i2c-piix4 for some additional information. | ||
58 | |||
59 | 54 | ||
60 | Process Call Support | 55 | Process Call Support |
61 | -------------------- | 56 | -------------------- |
@@ -74,6 +69,61 @@ SMBus 2.0 Support | |||
74 | 69 | ||
75 | The 82801DB (ICH4) and later chips support several SMBus 2.0 features. | 70 | The 82801DB (ICH4) and later chips support several SMBus 2.0 features. |
76 | 71 | ||
72 | |||
73 | Hidden ICH SMBus | ||
74 | ---------------- | ||
75 | |||
76 | If your system has an Intel ICH south bridge, but you do NOT see the | ||
77 | SMBus device at 00:1f.3 in lspci, and you can't figure out any way in the | ||
78 | BIOS to enable it, it means it has been hidden by the BIOS code. Asus is | ||
79 | well known for first doing this on their P4B motherboard, and many other | ||
80 | boards after that. Some vendor machines are affected as well. | ||
81 | |||
82 | The first thing to try is the "i2c_ec" ACPI driver. It could be that the | ||
83 | SMBus was hidden on purpose because it'll be driven by ACPI. If the | ||
84 | i2c_ec driver works for you, just forget about the i2c-i801 driver and | ||
85 | don't try to unhide the ICH SMBus. Even if i2c_ec doesn't work, you | ||
86 | better make sure that the SMBus isn't used by the ACPI code. Try loading | ||
87 | the "fan" and "thermal" drivers, and check in /proc/acpi/fan and | ||
88 | /proc/acpi/thermal_zone. If you find anything there, it's likely that | ||
89 | the ACPI is accessing the SMBus and it's safer not to unhide it. Only | ||
90 | once you are certain that ACPI isn't using the SMBus, you can attempt | ||
91 | to unhide it. | ||
92 | |||
93 | In order to unhide the SMBus, we need to change the value of a PCI | ||
94 | register before the kernel enumerates the PCI devices. This is done in | ||
95 | drivers/pci/quirks.c, where all affected boards must be listed (see | ||
96 | function asus_hides_smbus_hostbridge.) If the SMBus device is missing, | ||
97 | and you think there's something interesting on the SMBus (e.g. a | ||
98 | hardware monitoring chip), you need to add your board to the list. | ||
99 | |||
100 | The motherboard is identified using the subvendor and subdevice IDs of the | ||
101 | host bridge PCI device. Get yours with "lspci -n -v -s 00:00.0": | ||
102 | |||
103 | 00:00.0 Class 0600: 8086:2570 (rev 02) | ||
104 | Subsystem: 1043:80f2 | ||
105 | Flags: bus master, fast devsel, latency 0 | ||
106 | Memory at fc000000 (32-bit, prefetchable) [size=32M] | ||
107 | Capabilities: [e4] #09 [2106] | ||
108 | Capabilities: [a0] AGP version 3.0 | ||
109 | |||
110 | Here the host bridge ID is 2570 (82865G/PE/P), the subvendor ID is 1043 | ||
111 | (Asus) and the subdevice ID is 80f2 (P4P800-X). You can find the symbolic | ||
112 | names for the bridge ID and the subvendor ID in include/linux/pci_ids.h, | ||
113 | and then add a case for your subdevice ID at the right place in | ||
114 | drivers/pci/quirks.c. Then please give it very good testing, to make sure | ||
115 | that the unhidden SMBus doesn't conflict with e.g. ACPI. | ||
116 | |||
117 | If it works, proves useful (i.e. there are usable chips on the SMBus) | ||
118 | and seems safe, please submit a patch for inclusion into the kernel. | ||
119 | |||
120 | Note: There's a useful script in lm_sensors 2.10.2 and later, named | ||
121 | unhide_ICH_SMBus (in prog/hotplug), which uses the fakephp driver to | ||
122 | temporarily unhide the SMBus without having to patch and recompile your | ||
123 | kernel. It's very convenient if you just want to check if there's | ||
124 | anything interesting on your hidden ICH SMBus. | ||
125 | |||
126 | |||
77 | ********************** | 127 | ********************** |
78 | The lm_sensors project gratefully acknowledges the support of Texas | 128 | The lm_sensors project gratefully acknowledges the support of Texas |
79 | Instruments in the initial development of this driver. | 129 | Instruments in the initial development of this driver. |
diff --git a/Documentation/i2c/busses/i2c-parport b/Documentation/i2c/busses/i2c-parport index 77b995dfca22..dceaba1ad930 100644 --- a/Documentation/i2c/busses/i2c-parport +++ b/Documentation/i2c/busses/i2c-parport | |||
@@ -19,6 +19,7 @@ It currently supports the following devices: | |||
19 | * (type=4) Analog Devices ADM1032 evaluation board | 19 | * (type=4) Analog Devices ADM1032 evaluation board |
20 | * (type=5) Analog Devices evaluation boards: ADM1025, ADM1030, ADM1031 | 20 | * (type=5) Analog Devices evaluation boards: ADM1025, ADM1030, ADM1031 |
21 | * (type=6) Barco LPT->DVI (K5800236) adapter | 21 | * (type=6) Barco LPT->DVI (K5800236) adapter |
22 | * (type=7) One For All JP1 parallel port adapter | ||
22 | 23 | ||
23 | These devices use different pinout configurations, so you have to tell | 24 | These devices use different pinout configurations, so you have to tell |
24 | the driver what you have, using the type module parameter. There is no | 25 | the driver what you have, using the type module parameter. There is no |
@@ -157,3 +158,17 @@ many more, using /dev/velleman. | |||
157 | http://home.wanadoo.nl/hihihi/libk8005.htm | 158 | http://home.wanadoo.nl/hihihi/libk8005.htm |
158 | http://struyve.mine.nu:8080/index.php?block=k8000 | 159 | http://struyve.mine.nu:8080/index.php?block=k8000 |
159 | http://sourceforge.net/projects/libk8005/ | 160 | http://sourceforge.net/projects/libk8005/ |
161 | |||
162 | |||
163 | One For All JP1 parallel port adapter | ||
164 | ------------------------------------- | ||
165 | |||
166 | The JP1 project revolves around a set of remote controls which expose | ||
167 | the I2C bus their internal configuration EEPROM lives on via a 6 pin | ||
168 | jumper in the battery compartment. More details can be found at: | ||
169 | |||
170 | http://www.hifi-remote.com/jp1/ | ||
171 | |||
172 | Details of the simple parallel port hardware can be found at: | ||
173 | |||
174 | http://www.hifi-remote.com/jp1/hardware.shtml | ||
diff --git a/Documentation/i2c/busses/i2c-piix4 b/Documentation/i2c/busses/i2c-piix4 index 921476333235..7cbe43fa2701 100644 --- a/Documentation/i2c/busses/i2c-piix4 +++ b/Documentation/i2c/busses/i2c-piix4 | |||
@@ -6,7 +6,7 @@ Supported adapters: | |||
6 | Datasheet: Publicly available at the Intel website | 6 | Datasheet: Publicly available at the Intel website |
7 | * ServerWorks OSB4, CSB5, CSB6 and HT-1000 southbridges | 7 | * ServerWorks OSB4, CSB5, CSB6 and HT-1000 southbridges |
8 | Datasheet: Only available via NDA from ServerWorks | 8 | Datasheet: Only available via NDA from ServerWorks |
9 | * ATI IXP southbridges IXP200, IXP300, IXP400 | 9 | * ATI IXP200, IXP300, IXP400 and SB600 southbridges |
10 | Datasheet: Not publicly available | 10 | Datasheet: Not publicly available |
11 | * Standard Microsystems (SMSC) SLC90E66 (Victory66) southbridge | 11 | * Standard Microsystems (SMSC) SLC90E66 (Victory66) southbridge |
12 | Datasheet: Publicly available at the SMSC website http://www.smsc.com | 12 | Datasheet: Publicly available at the SMSC website http://www.smsc.com |
diff --git a/Documentation/i2c/busses/i2c-viapro b/Documentation/i2c/busses/i2c-viapro index 25680346e0ac..775f489e86f6 100644 --- a/Documentation/i2c/busses/i2c-viapro +++ b/Documentation/i2c/busses/i2c-viapro | |||
@@ -13,6 +13,9 @@ Supported adapters: | |||
13 | * VIA Technologies, Inc. VT8235, VT8237R, VT8237A, VT8251 | 13 | * VIA Technologies, Inc. VT8235, VT8237R, VT8237A, VT8251 |
14 | Datasheet: available on request and under NDA from VIA | 14 | Datasheet: available on request and under NDA from VIA |
15 | 15 | ||
16 | * VIA Technologies, Inc. CX700 | ||
17 | Datasheet: available on request and under NDA from VIA | ||
18 | |||
16 | Authors: | 19 | Authors: |
17 | Kyösti Mälkki <kmalkki@cc.hut.fi>, | 20 | Kyösti Mälkki <kmalkki@cc.hut.fi>, |
18 | Mark D. Studebaker <mdsxyz123@yahoo.com>, | 21 | Mark D. Studebaker <mdsxyz123@yahoo.com>, |
@@ -44,6 +47,7 @@ Your lspci -n listing must show one of these : | |||
44 | device 1106:3227 (VT8237R) | 47 | device 1106:3227 (VT8237R) |
45 | device 1106:3337 (VT8237A) | 48 | device 1106:3337 (VT8237A) |
46 | device 1106:3287 (VT8251) | 49 | device 1106:3287 (VT8251) |
50 | device 1106:8324 (CX700) | ||
47 | 51 | ||
48 | If none of these show up, you should look in the BIOS for settings like | 52 | If none of these show up, you should look in the BIOS for settings like |
49 | enable ACPI / SMBus or even USB. | 53 | enable ACPI / SMBus or even USB. |
@@ -51,3 +55,6 @@ enable ACPI / SMBus or even USB. | |||
51 | Except for the oldest chips (VT82C596A/B, VT82C686A and most probably | 55 | Except for the oldest chips (VT82C596A/B, VT82C686A and most probably |
52 | VT8231), this driver supports I2C block transactions. Such transactions | 56 | VT8231), this driver supports I2C block transactions. Such transactions |
53 | are mainly useful to read from and write to EEPROMs. | 57 | are mainly useful to read from and write to EEPROMs. |
58 | |||
59 | The CX700 additionally appears to support SMBus PEC, although this driver | ||
60 | doesn't implement it yet. | ||
diff --git a/Documentation/i2c/porting-clients b/Documentation/i2c/porting-clients index f03c2a02f806..ca272b263a92 100644 --- a/Documentation/i2c/porting-clients +++ b/Documentation/i2c/porting-clients | |||
@@ -129,6 +129,12 @@ Technical changes: | |||
129 | structure, those name member should be initialized to a driver name | 129 | structure, those name member should be initialized to a driver name |
130 | string. i2c_driver itself has no name member anymore. | 130 | string. i2c_driver itself has no name member anymore. |
131 | 131 | ||
132 | * [Driver model] Instead of shutdown or reboot notifiers, provide a | ||
133 | shutdown() method in your driver. | ||
134 | |||
135 | * [Power management] Use the driver model suspend() and resume() | ||
136 | callbacks instead of the obsolete pm_register() calls. | ||
137 | |||
132 | Coding policy: | 138 | Coding policy: |
133 | 139 | ||
134 | * [Copyright] Use (C), not (c), for copyright. | 140 | * [Copyright] Use (C), not (c), for copyright. |
diff --git a/Documentation/i2c/smbus-protocol b/Documentation/i2c/smbus-protocol index 09f5e5ca4927..8a653c60d25a 100644 --- a/Documentation/i2c/smbus-protocol +++ b/Documentation/i2c/smbus-protocol | |||
@@ -97,7 +97,7 @@ SMBus Write Word Data | |||
97 | ===================== | 97 | ===================== |
98 | 98 | ||
99 | This is the opposite operation of the Read Word Data command. 16 bits | 99 | This is the opposite operation of the Read Word Data command. 16 bits |
100 | of data is read from a device, from a designated register that is | 100 | of data is written to a device, to the designated register that is |
101 | specified through the Comm byte. | 101 | specified through the Comm byte. |
102 | 102 | ||
103 | S Addr Wr [A] Comm [A] DataLow [A] DataHigh [A] P | 103 | S Addr Wr [A] Comm [A] DataLow [A] DataHigh [A] P |
diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index 3a057c8e5507..fbcff96f4ca1 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients | |||
@@ -21,20 +21,26 @@ The driver structure | |||
21 | 21 | ||
22 | Usually, you will implement a single driver structure, and instantiate | 22 | Usually, you will implement a single driver structure, and instantiate |
23 | all clients from it. Remember, a driver structure contains general access | 23 | all clients from it. Remember, a driver structure contains general access |
24 | routines, a client structure specific information like the actual I2C | 24 | routines, and should be zero-initialized except for fields with data you |
25 | address. | 25 | provide. A client structure holds device-specific information like the |
26 | driver model device node, and its I2C address. | ||
26 | 27 | ||
27 | static struct i2c_driver foo_driver = { | 28 | static struct i2c_driver foo_driver = { |
28 | .driver = { | 29 | .driver = { |
29 | .name = "foo", | 30 | .name = "foo", |
30 | }, | 31 | }, |
31 | .attach_adapter = &foo_attach_adapter, | 32 | .attach_adapter = foo_attach_adapter, |
32 | .detach_client = &foo_detach_client, | 33 | .detach_client = foo_detach_client, |
33 | .command = &foo_command /* may be NULL */ | 34 | .shutdown = foo_shutdown, /* optional */ |
35 | .suspend = foo_suspend, /* optional */ | ||
36 | .resume = foo_resume, /* optional */ | ||
37 | .command = foo_command, /* optional */ | ||
34 | } | 38 | } |
35 | 39 | ||
36 | The name field must match the driver name, including the case. It must not | 40 | The name field is the driver name, and must not contain spaces. It |
37 | contain spaces, and may be up to 31 characters long. | 41 | should match the module name (if the driver can be compiled as a module), |
42 | although you can use MODULE_ALIAS (passing "foo" in this example) to add | ||
43 | another name for the module. | ||
38 | 44 | ||
39 | All other fields are for call-back functions which will be explained | 45 | All other fields are for call-back functions which will be explained |
40 | below. | 46 | below. |
@@ -43,11 +49,18 @@ below. | |||
43 | Extra client data | 49 | Extra client data |
44 | ================= | 50 | ================= |
45 | 51 | ||
46 | The client structure has a special `data' field that can point to any | 52 | Each client structure has a special `data' field that can point to any |
47 | structure at all. You can use this to keep client-specific data. You | 53 | structure at all. You should use this to keep device-specific data, |
54 | especially in drivers that handle multiple I2C or SMBUS devices. You | ||
48 | do not always need this, but especially for `sensors' drivers, it can | 55 | do not always need this, but especially for `sensors' drivers, it can |
49 | be very useful. | 56 | be very useful. |
50 | 57 | ||
58 | /* store the value */ | ||
59 | void i2c_set_clientdata(struct i2c_client *client, void *data); | ||
60 | |||
61 | /* retrieve the value */ | ||
62 | void *i2c_get_clientdata(struct i2c_client *client); | ||
63 | |||
51 | An example structure is below. | 64 | An example structure is below. |
52 | 65 | ||
53 | struct foo_data { | 66 | struct foo_data { |
@@ -493,6 +506,33 @@ by `__init_data'. Hose functions and structures can be removed after | |||
493 | kernel booting (or module loading) is completed. | 506 | kernel booting (or module loading) is completed. |
494 | 507 | ||
495 | 508 | ||
509 | Power Management | ||
510 | ================ | ||
511 | |||
512 | If your I2C device needs special handling when entering a system low | ||
513 | power state -- like putting a transceiver into a low power mode, or | ||
514 | activating a system wakeup mechanism -- do that in the suspend() method. | ||
515 | The resume() method should reverse what the suspend() method does. | ||
516 | |||
517 | These are standard driver model calls, and they work just like they | ||
518 | would for any other driver stack. The calls can sleep, and can use | ||
519 | I2C messaging to the device being suspended or resumed (since their | ||
520 | parent I2C adapter is active when these calls are issued, and IRQs | ||
521 | are still enabled). | ||
522 | |||
523 | |||
524 | System Shutdown | ||
525 | =============== | ||
526 | |||
527 | If your I2C device needs special handling when the system shuts down | ||
528 | or reboots (including kexec) -- like turning something off -- use a | ||
529 | shutdown() method. | ||
530 | |||
531 | Again, this is a standard driver model call, working just like it | ||
532 | would for any other driver stack: the calls can sleep, and can use | ||
533 | I2C messaging. | ||
534 | |||
535 | |||
496 | Command function | 536 | Command function |
497 | ================ | 537 | ================ |
498 | 538 | ||
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index d25acd51e181..abd575cfc759 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt | |||
@@ -104,6 +104,9 @@ loader, and have no meaning to the kernel directly. | |||
104 | Do not modify the syntax of boot loader parameters without extreme | 104 | Do not modify the syntax of boot loader parameters without extreme |
105 | need or coordination with <Documentation/i386/boot.txt>. | 105 | need or coordination with <Documentation/i386/boot.txt>. |
106 | 106 | ||
107 | There are also arch-specific kernel-parameters not documented here. | ||
108 | See for example <Documentation/x86_64/boot-options.txt>. | ||
109 | |||
107 | Note that ALL kernel parameters listed below are CASE SENSITIVE, and that | 110 | Note that ALL kernel parameters listed below are CASE SENSITIVE, and that |
108 | a trailing = on the name of any parameter states that that parameter will | 111 | a trailing = on the name of any parameter states that that parameter will |
109 | be entered as an environment variable, whereas its absence indicates that | 112 | be entered as an environment variable, whereas its absence indicates that |
@@ -361,6 +364,11 @@ and is between 256 and 4096 characters. It is defined in the file | |||
361 | clocksource is not available, it defaults to PIT. | 364 | clocksource is not available, it defaults to PIT. |
362 | Format: { pit | tsc | cyclone | pmtmr } | 365 | Format: { pit | tsc | cyclone | pmtmr } |
363 | 366 | ||
367 | code_bytes [IA32] How many bytes of object code to print in an | ||
368 | oops report. | ||
369 | Range: 0 - 8192 | ||
370 | Default: 64 | ||
371 | |||
364 | disable_8254_timer | 372 | disable_8254_timer |
365 | enable_8254_timer | 373 | enable_8254_timer |
366 | [IA32/X86_64] Disable/Enable interrupt 0 timer routing | 374 | [IA32/X86_64] Disable/Enable interrupt 0 timer routing |
@@ -601,6 +609,10 @@ and is between 256 and 4096 characters. It is defined in the file | |||
601 | highmem otherwise. This also works to reduce highmem | 609 | highmem otherwise. This also works to reduce highmem |
602 | size on bigger boxes. | 610 | size on bigger boxes. |
603 | 611 | ||
612 | highres= [KNL] Enable/disable high resolution timer mode. | ||
613 | Valid parameters: "on", "off" | ||
614 | Default: "on" | ||
615 | |||
604 | hisax= [HW,ISDN] | 616 | hisax= [HW,ISDN] |
605 | See Documentation/isdn/README.HiSax. | 617 | See Documentation/isdn/README.HiSax. |
606 | 618 | ||
@@ -1070,6 +1082,10 @@ and is between 256 and 4096 characters. It is defined in the file | |||
1070 | in certain environments such as networked servers or | 1082 | in certain environments such as networked servers or |
1071 | real-time systems. | 1083 | real-time systems. |
1072 | 1084 | ||
1085 | nohz= [KNL] Boottime enable/disable dynamic ticks | ||
1086 | Valid arguments: on, off | ||
1087 | Default: on | ||
1088 | |||
1073 | noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing | 1089 | noirqbalance [IA-32,SMP,KNL] Disable kernel irq balancing |
1074 | 1090 | ||
1075 | noirqdebug [IA-32] Disables the code which attempts to detect and | 1091 | noirqdebug [IA-32] Disables the code which attempts to detect and |
diff --git a/Documentation/powerpc/booting-without-of.txt b/Documentation/powerpc/booting-without-of.txt index 33994271cb3b..3b514672b80e 100644 --- a/Documentation/powerpc/booting-without-of.txt +++ b/Documentation/powerpc/booting-without-of.txt | |||
@@ -1334,6 +1334,9 @@ platforms are moved over to use the flattened-device-tree model. | |||
1334 | fsl-usb2-mph compatible controllers. Either this property or | 1334 | fsl-usb2-mph compatible controllers. Either this property or |
1335 | "port0" (or both) must be defined for "fsl-usb2-mph" compatible | 1335 | "port0" (or both) must be defined for "fsl-usb2-mph" compatible |
1336 | controllers. | 1336 | controllers. |
1337 | - dr_mode : indicates the working mode for "fsl-usb2-dr" compatible | ||
1338 | controllers. Can be "host", "peripheral", or "otg". Default to | ||
1339 | "host" if not defined for backward compatibility. | ||
1337 | 1340 | ||
1338 | Recommended properties : | 1341 | Recommended properties : |
1339 | - interrupts : <a b> where a is the interrupt number and b is a | 1342 | - interrupts : <a b> where a is the interrupt number and b is a |
@@ -1367,6 +1370,7 @@ platforms are moved over to use the flattened-device-tree model. | |||
1367 | #size-cells = <0>; | 1370 | #size-cells = <0>; |
1368 | interrupt-parent = <700>; | 1371 | interrupt-parent = <700>; |
1369 | interrupts = <26 1>; | 1372 | interrupts = <26 1>; |
1373 | dr_mode = "otg"; | ||
1370 | phy = "ulpi"; | 1374 | phy = "ulpi"; |
1371 | }; | 1375 | }; |
1372 | 1376 | ||
diff --git a/Documentation/powerpc/mpc52xx-device-tree-bindings.txt b/Documentation/powerpc/mpc52xx-device-tree-bindings.txt index 69f016f02bb0..e59fcbbe338c 100644 --- a/Documentation/powerpc/mpc52xx-device-tree-bindings.txt +++ b/Documentation/powerpc/mpc52xx-device-tree-bindings.txt | |||
@@ -1,7 +1,7 @@ | |||
1 | MPC52xx Device Tree Bindings | 1 | MPC5200 Device Tree Bindings |
2 | ---------------------------- | 2 | ---------------------------- |
3 | 3 | ||
4 | (c) 2006 Secret Lab Technologies Ltd | 4 | (c) 2006-2007 Secret Lab Technologies Ltd |
5 | Grant Likely <grant.likely at secretlab.ca> | 5 | Grant Likely <grant.likely at secretlab.ca> |
6 | 6 | ||
7 | ********** DRAFT *********** | 7 | ********** DRAFT *********** |
@@ -20,11 +20,11 @@ described in Documentation/powerpc/booting-without-of.txt), or passed | |||
20 | by Open Firmare (IEEE 1275) compatible firmware using an OF compatible | 20 | by Open Firmare (IEEE 1275) compatible firmware using an OF compatible |
21 | client interface API. | 21 | client interface API. |
22 | 22 | ||
23 | This document specifies the requirements on the device-tree for mpc52xx | 23 | This document specifies the requirements on the device-tree for mpc5200 |
24 | based boards. These requirements are above and beyond the details | 24 | based boards. These requirements are above and beyond the details |
25 | specified in either the OpenFirmware spec or booting-without-of.txt | 25 | specified in either the OpenFirmware spec or booting-without-of.txt |
26 | 26 | ||
27 | All new mpc52xx-based boards are expected to match this document. In | 27 | All new mpc5200-based boards are expected to match this document. In |
28 | cases where this document is not sufficient to support a new board port, | 28 | cases where this document is not sufficient to support a new board port, |
29 | this document should be updated as part of adding the new board support. | 29 | this document should be updated as part of adding the new board support. |
30 | 30 | ||
@@ -32,26 +32,26 @@ II - Philosophy | |||
32 | =============== | 32 | =============== |
33 | The core of this document is naming convention. The whole point of | 33 | The core of this document is naming convention. The whole point of |
34 | defining this convention is to reduce or eliminate the number of | 34 | defining this convention is to reduce or eliminate the number of |
35 | special cases required to support a 52xx board. If all 52xx boards | 35 | special cases required to support a 5200 board. If all 5200 boards |
36 | follow the same convention, then generic 52xx support code will work | 36 | follow the same convention, then generic 5200 support code will work |
37 | rather than coding special cases for each new board. | 37 | rather than coding special cases for each new board. |
38 | 38 | ||
39 | This section tries to capture the thought process behind why the naming | 39 | This section tries to capture the thought process behind why the naming |
40 | convention is what it is. | 40 | convention is what it is. |
41 | 41 | ||
42 | 1. Node names | 42 | 1. names |
43 | ------------- | 43 | --------- |
44 | There is strong convention/requirements already established for children | 44 | There is strong convention/requirements already established for children |
45 | of the root node. 'cpus' describes the processor cores, 'memory' | 45 | of the root node. 'cpus' describes the processor cores, 'memory' |
46 | describes memory, and 'chosen' provides boot configuration. Other nodes | 46 | describes memory, and 'chosen' provides boot configuration. Other nodes |
47 | are added to describe devices attached to the processor local bus. | 47 | are added to describe devices attached to the processor local bus. |
48 | |||
48 | Following convention already established with other system-on-chip | 49 | Following convention already established with other system-on-chip |
49 | processors, MPC52xx boards must have an 'soc5200' node as a child of the | 50 | processors, 5200 device trees should use the name 'soc5200' for the |
50 | root node. | 51 | parent node of on chip devices, and the root node should be its parent. |
51 | 52 | ||
52 | The soc5200 node holds child nodes for all on chip devices. Child nodes | 53 | Child nodes are typically named after the configured function. ie. |
53 | are typically named after the configured function. ie. the FEC node is | 54 | the FEC node is named 'ethernet', and a PSC in uart mode is named 'serial'. |
54 | named 'ethernet', and a PSC in uart mode is named 'serial'. | ||
55 | 55 | ||
56 | 2. device_type property | 56 | 2. device_type property |
57 | ----------------------- | 57 | ----------------------- |
@@ -66,28 +66,47 @@ exactly. | |||
66 | Since device_type isn't enough to match devices to drivers, there also | 66 | Since device_type isn't enough to match devices to drivers, there also |
67 | needs to be a naming convention for the compatible property. Compatible | 67 | needs to be a naming convention for the compatible property. Compatible |
68 | is an list of device descriptions sorted from specific to generic. For | 68 | is an list of device descriptions sorted from specific to generic. For |
69 | the mpc52xx, the required format for each compatible value is | 69 | the mpc5200, the required format for each compatible value is |
70 | <chip>-<device>[-<mode>]. At the minimum, the list shall contain two | 70 | <chip>-<device>[-<mode>]. The OS should be able to match a device driver |
71 | items; the first specifying the exact chip, and the second specifying | 71 | to the device based solely on the compatible value. If two drivers |
72 | mpc52xx for the chip. | 72 | match on the compatible list; the 'most compatible' driver should be |
73 | 73 | selected. | |
74 | ie. ethernet on mpc5200b: compatible = "mpc5200b-ethernet\0mpc52xx-ethernet" | 74 | |
75 | 75 | The split between the MPC5200 and the MPC5200B leaves a bit of a | |
76 | The idea here is that most drivers will match to the most generic field | 76 | connundrum. How should the compatible property be set up to provide |
77 | in the compatible list (mpc52xx-*), but can also test the more specific | 77 | maximum compatability information; but still acurately describe the |
78 | field for enabling bug fixes or extra features. | 78 | chip? For the MPC5200; the answer is easy. Most of the SoC devices |
79 | originally appeared on the MPC5200. Since they didn't exist anywhere | ||
80 | else; the 5200 compatible properties will contain only one item; | ||
81 | "mpc5200-<device>". | ||
82 | |||
83 | The 5200B is almost the same as the 5200, but not quite. It fixes | ||
84 | silicon bugs and it adds a small number of enhancements. Most of the | ||
85 | devices either provide exactly the same interface as on the 5200. A few | ||
86 | devices have extra functions but still have a backwards compatible mode. | ||
87 | To express this infomation as completely as possible, 5200B device trees | ||
88 | should have two items in the compatible list; | ||
89 | "mpc5200b-<device>\0mpc5200-<device>". It is *strongly* recommended | ||
90 | that 5200B device trees follow this convention (instead of only listing | ||
91 | the base mpc5200 item). | ||
92 | |||
93 | If another chip appear on the market with one of the mpc5200 SoC | ||
94 | devices, then the compatible list should include mpc5200-<device>. | ||
95 | |||
96 | ie. ethernet on mpc5200: compatible = "mpc5200-ethernet" | ||
97 | ethernet on mpc5200b: compatible = "mpc5200b-ethernet\0mpc5200-ethernet" | ||
79 | 98 | ||
80 | Modal devices, like PSCs, also append the configured function to the | 99 | Modal devices, like PSCs, also append the configured function to the |
81 | end of the compatible field. ie. A PSC in i2s mode would specify | 100 | end of the compatible field. ie. A PSC in i2s mode would specify |
82 | "mpc52xx-psc-i2s", not "mpc52xx-i2s". This convention is chosen to | 101 | "mpc5200-psc-i2s", not "mpc5200-i2s". This convention is chosen to |
83 | avoid naming conflicts with non-psc devices providing the same | 102 | avoid naming conflicts with non-psc devices providing the same |
84 | function. For example, "mpc52xx-spi" and "mpc52xx-psc-spi" describe | 103 | function. For example, "mpc5200-spi" and "mpc5200-psc-spi" describe |
85 | the mpc5200 simple spi device and a PSC spi mode respectively. | 104 | the mpc5200 simple spi device and a PSC spi mode respectively. |
86 | 105 | ||
87 | If the soc device is more generic and present on other SOCs, the | 106 | If the soc device is more generic and present on other SOCs, the |
88 | compatible property can specify the more generic device type also. | 107 | compatible property can specify the more generic device type also. |
89 | 108 | ||
90 | ie. mscan: compatible = "mpc5200-mscan\0mpc52xx-mscan\0fsl,mscan"; | 109 | ie. mscan: compatible = "mpc5200-mscan\0fsl,mscan"; |
91 | 110 | ||
92 | At the time of writing, exact chip may be either 'mpc5200' or | 111 | At the time of writing, exact chip may be either 'mpc5200' or |
93 | 'mpc5200b'. | 112 | 'mpc5200b'. |
@@ -96,7 +115,7 @@ Device drivers should always try to match as generically as possible. | |||
96 | 115 | ||
97 | III - Structure | 116 | III - Structure |
98 | =============== | 117 | =============== |
99 | The device tree for an mpc52xx board follows the structure defined in | 118 | The device tree for an mpc5200 board follows the structure defined in |
100 | booting-without-of.txt with the following additional notes: | 119 | booting-without-of.txt with the following additional notes: |
101 | 120 | ||
102 | 0) the root node | 121 | 0) the root node |
@@ -115,7 +134,7 @@ Typical memory description node; see booting-without-of. | |||
115 | 134 | ||
116 | 3) The soc5200 node | 135 | 3) The soc5200 node |
117 | ------------------- | 136 | ------------------- |
118 | This node describes the on chip SOC peripherals. Every mpc52xx based | 137 | This node describes the on chip SOC peripherals. Every mpc5200 based |
119 | board will have this node, and as such there is a common naming | 138 | board will have this node, and as such there is a common naming |
120 | convention for SOC devices. | 139 | convention for SOC devices. |
121 | 140 | ||
@@ -125,71 +144,111 @@ name type description | |||
125 | device_type string must be "soc" | 144 | device_type string must be "soc" |
126 | ranges int should be <0 baseaddr baseaddr+10000> | 145 | ranges int should be <0 baseaddr baseaddr+10000> |
127 | reg int must be <baseaddr 10000> | 146 | reg int must be <baseaddr 10000> |
147 | compatible string mpc5200: "mpc5200-soc" | ||
148 | mpc5200b: "mpc5200b-soc\0mpc5200-soc" | ||
149 | system-frequency int Fsystem frequency; source of all | ||
150 | other clocks. | ||
151 | bus-frequency int IPB bus frequency in HZ. Clock rate | ||
152 | used by most of the soc devices. | ||
153 | #interrupt-cells int must be <3>. | ||
128 | 154 | ||
129 | Recommended properties: | 155 | Recommended properties: |
130 | name type description | 156 | name type description |
131 | ---- ---- ----------- | 157 | ---- ---- ----------- |
132 | compatible string should be "<chip>-soc\0mpc52xx-soc" | 158 | model string Exact model of the chip; |
133 | ie. "mpc5200b-soc\0mpc52xx-soc" | 159 | ie: model="fsl,mpc5200" |
134 | #interrupt-cells int must be <3>. If it is not defined | 160 | revision string Silicon revision of chip |
135 | here then it must be defined in every | 161 | ie: revision="M08A" |
136 | soc device node. | 162 | |
137 | bus-frequency int IPB bus frequency in HZ. Clock rate | 163 | The 'model' and 'revision' properties are *strongly* recommended. Having |
138 | used by most of the soc devices. | 164 | them presence acts as a bit of a safety net for working around as yet |
139 | Defining it here avoids needing it | 165 | undiscovered bugs on one version of silicon. For example, device drivers |
140 | added to every device node. | 166 | can use the model and revision properties to decide if a bug fix should |
167 | be turned on. | ||
141 | 168 | ||
142 | 4) soc5200 child nodes | 169 | 4) soc5200 child nodes |
143 | ---------------------- | 170 | ---------------------- |
144 | Any on chip SOC devices available to Linux must appear as soc5200 child nodes. | 171 | Any on chip SOC devices available to Linux must appear as soc5200 child nodes. |
145 | 172 | ||
146 | Note: in the tables below, '*' matches all <chip> values. ie. | 173 | Note: The tables below show the value for the mpc5200. A mpc5200b device |
147 | *-pic would translate to "mpc5200-pic\0mpc52xx-pic" | 174 | tree should use the "mpc5200b-<device>\0mpc5200-<device> form. |
148 | 175 | ||
149 | Required soc5200 child nodes: | 176 | Required soc5200 child nodes: |
150 | name device_type compatible Description | 177 | name device_type compatible Description |
151 | ---- ----------- ---------- ----------- | 178 | ---- ----------- ---------- ----------- |
152 | cdm@<addr> cdm *-cmd Clock Distribution | 179 | cdm@<addr> cdm mpc5200-cmd Clock Distribution |
153 | pic@<addr> interrupt-controller *-pic need an interrupt | 180 | pic@<addr> interrupt-controller mpc5200-pic need an interrupt |
154 | controller to boot | 181 | controller to boot |
155 | bestcomm@<addr> dma-controller *-bestcomm 52xx pic also requires | 182 | bestcomm@<addr> dma-controller mpc5200-bestcomm 5200 pic also requires |
156 | the bestcomm device | 183 | the bestcomm device |
157 | 184 | ||
158 | Recommended soc5200 child nodes; populate as needed for your board | 185 | Recommended soc5200 child nodes; populate as needed for your board |
159 | name device_type compatible Description | 186 | name device_type compatible Description |
160 | ---- ----------- ---------- ----------- | 187 | ---- ----------- ---------- ----------- |
161 | gpt@<addr> gpt *-gpt General purpose timers | 188 | gpt@<addr> gpt mpc5200-gpt General purpose timers |
162 | rtc@<addr> rtc *-rtc Real time clock | 189 | rtc@<addr> rtc mpc5200-rtc Real time clock |
163 | mscan@<addr> mscan *-mscan CAN bus controller | 190 | mscan@<addr> mscan mpc5200-mscan CAN bus controller |
164 | pci@<addr> pci *-pci PCI bridge | 191 | pci@<addr> pci mpc5200-pci PCI bridge |
165 | serial@<addr> serial *-psc-uart PSC in serial mode | 192 | serial@<addr> serial mpc5200-psc-uart PSC in serial mode |
166 | i2s@<addr> sound *-psc-i2s PSC in i2s mode | 193 | i2s@<addr> sound mpc5200-psc-i2s PSC in i2s mode |
167 | ac97@<addr> sound *-psc-ac97 PSC in ac97 mode | 194 | ac97@<addr> sound mpc5200-psc-ac97 PSC in ac97 mode |
168 | spi@<addr> spi *-psc-spi PSC in spi mode | 195 | spi@<addr> spi mpc5200-psc-spi PSC in spi mode |
169 | irda@<addr> irda *-psc-irda PSC in IrDA mode | 196 | irda@<addr> irda mpc5200-psc-irda PSC in IrDA mode |
170 | spi@<addr> spi *-spi MPC52xx spi device | 197 | spi@<addr> spi mpc5200-spi MPC5200 spi device |
171 | ethernet@<addr> network *-fec MPC52xx ethernet device | 198 | ethernet@<addr> network mpc5200-fec MPC5200 ethernet device |
172 | ata@<addr> ata *-ata IDE ATA interface | 199 | ata@<addr> ata mpc5200-ata IDE ATA interface |
173 | i2c@<addr> i2c *-i2c I2C controller | 200 | i2c@<addr> i2c mpc5200-i2c I2C controller |
174 | usb@<addr> usb-ohci-be *-ohci,ohci-be USB controller | 201 | usb@<addr> usb-ohci-be mpc5200-ohci,ohci-be USB controller |
175 | xlb@<addr> xlb *-xlb XLB arbritrator | 202 | xlb@<addr> xlb mpc5200-xlb XLB arbritrator |
203 | |||
204 | Important child node properties | ||
205 | name type description | ||
206 | ---- ---- ----------- | ||
207 | cell-index int When multiple devices are present, is the | ||
208 | index of the device in the hardware (ie. There | ||
209 | are 6 PSC on the 5200 numbered PSC1 to PSC6) | ||
210 | PSC1 has 'cell-index = <0>' | ||
211 | PSC4 has 'cell-index = <3>' | ||
212 | |||
213 | 5) General Purpose Timer nodes (child of soc5200 node) | ||
214 | On the mpc5200 and 5200b, GPT0 has a watchdog timer function. If the board | ||
215 | design supports the internal wdt, then the device node for GPT0 should | ||
216 | include the empty property 'has-wdt'. | ||
217 | |||
218 | 6) PSC nodes (child of soc5200 node) | ||
219 | PSC nodes can define the optional 'port-number' property to force assignment | ||
220 | order of serial ports. For example, PSC5 might be physically connected to | ||
221 | the port labeled 'COM1' and PSC1 wired to 'COM1'. In this case, PSC5 would | ||
222 | have a "port-number = <0>" property, and PSC1 would have "port-number = <1>". | ||
223 | |||
224 | PSC in i2s mode: The mpc5200 and mpc5200b PSCs are not compatible when in | ||
225 | i2s mode. An 'mpc5200b-psc-i2s' node cannot include 'mpc5200-psc-i2s' in the | ||
226 | compatible field. | ||
176 | 227 | ||
177 | IV - Extra Notes | 228 | IV - Extra Notes |
178 | ================ | 229 | ================ |
179 | 230 | ||
180 | 1. Interrupt mapping | 231 | 1. Interrupt mapping |
181 | -------------------- | 232 | -------------------- |
182 | The mpc52xx pic driver splits hardware IRQ numbers into two levels. The | 233 | The mpc5200 pic driver splits hardware IRQ numbers into two levels. The |
183 | split reflects the layout of the PIC hardware itself, which groups | 234 | split reflects the layout of the PIC hardware itself, which groups |
184 | interrupts into one of three groups; CRIT, MAIN or PERP. Also, the | 235 | interrupts into one of three groups; CRIT, MAIN or PERP. Also, the |
185 | Bestcomm dma engine has it's own set of interrupt sources which are | 236 | Bestcomm dma engine has it's own set of interrupt sources which are |
186 | cascaded off of peripheral interrupt 0, which the driver interprets as a | 237 | cascaded off of peripheral interrupt 0, which the driver interprets as a |
187 | fourth group, SDMA. | 238 | fourth group, SDMA. |
188 | 239 | ||
189 | The interrupts property for device nodes using the mpc52xx pic consists | 240 | The interrupts property for device nodes using the mpc5200 pic consists |
190 | of three cells; <L1 L2 level> | 241 | of three cells; <L1 L2 level> |
191 | 242 | ||
192 | L1 := [CRIT=0, MAIN=1, PERP=2, SDMA=3] | 243 | L1 := [CRIT=0, MAIN=1, PERP=2, SDMA=3] |
193 | L2 := interrupt number; directly mapped from the value in the | 244 | L2 := interrupt number; directly mapped from the value in the |
194 | "ICTL PerStat, MainStat, CritStat Encoded Register" | 245 | "ICTL PerStat, MainStat, CritStat Encoded Register" |
195 | level := [LEVEL_HIGH=0, EDGE_RISING=1, EDGE_FALLING=2, LEVEL_LOW=3] | 246 | level := [LEVEL_HIGH=0, EDGE_RISING=1, EDGE_FALLING=2, LEVEL_LOW=3] |
247 | |||
248 | 2. Shared registers | ||
249 | ------------------- | ||
250 | Some SoC devices share registers between them. ie. the i2c devices use | ||
251 | a single clock control register, and almost all device are affected by | ||
252 | the port_config register. Devices which need to manipulate shared regs | ||
253 | should look to the parent SoC node. The soc node is responsible | ||
254 | for arbitrating all shared register access. | ||
diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt index 5c86ed6f0448..625a21db0c2a 100644 --- a/Documentation/x86_64/boot-options.txt +++ b/Documentation/x86_64/boot-options.txt | |||
@@ -180,40 +180,81 @@ PCI | |||
180 | pci=lastbus=NUMBER Scan upto NUMBER busses, no matter what the mptable says. | 180 | pci=lastbus=NUMBER Scan upto NUMBER busses, no matter what the mptable says. |
181 | pci=noacpi Don't use ACPI to set up PCI interrupt routing. | 181 | pci=noacpi Don't use ACPI to set up PCI interrupt routing. |
182 | 182 | ||
183 | IOMMU | 183 | IOMMU (input/output memory management unit) |
184 | 184 | ||
185 | iommu=[size][,noagp][,off][,force][,noforce][,leak][,memaper[=order]][,merge] | 185 | Currently four x86-64 PCI-DMA mapping implementations exist: |
186 | [,forcesac][,fullflush][,nomerge][,noaperture][,calgary] | 186 | |
187 | size set size of iommu (in bytes) | 187 | 1. <arch/x86_64/kernel/pci-nommu.c>: use no hardware/software IOMMU at all |
188 | noagp don't initialize the AGP driver and use full aperture. | 188 | (e.g. because you have < 3 GB memory). |
189 | off don't use the IOMMU | 189 | Kernel boot message: "PCI-DMA: Disabling IOMMU" |
190 | leak turn on simple iommu leak tracing (only when CONFIG_IOMMU_LEAK is on) | 190 | |
191 | memaper[=order] allocate an own aperture over RAM with size 32MB^order. | 191 | 2. <arch/x86_64/kernel/pci-gart.c>: AMD GART based hardware IOMMU. |
192 | noforce don't force IOMMU usage. Default. | 192 | Kernel boot message: "PCI-DMA: using GART IOMMU" |
193 | force Force IOMMU. | 193 | |
194 | merge Do SG merging. Implies force (experimental) | 194 | 3. <arch/x86_64/kernel/pci-swiotlb.c> : Software IOMMU implementation. Used |
195 | nomerge Don't do SG merging. | 195 | e.g. if there is no hardware IOMMU in the system and it is need because |
196 | forcesac For SAC mode for masks <40bits (experimental) | 196 | you have >3GB memory or told the kernel to us it (iommu=soft)) |
197 | fullflush Flush IOMMU on each allocation (default) | 197 | Kernel boot message: "PCI-DMA: Using software bounce buffering |
198 | nofullflush Don't use IOMMU fullflush | 198 | for IO (SWIOTLB)" |
199 | allowed overwrite iommu off workarounds for specific chipsets. | 199 | |
200 | soft Use software bounce buffering (default for Intel machines) | 200 | 4. <arch/x86_64/pci-calgary.c> : IBM Calgary hardware IOMMU. Used in IBM |
201 | noaperture Don't touch the aperture for AGP. | 201 | pSeries and xSeries servers. This hardware IOMMU supports DMA address |
202 | allowdac Allow DMA >4GB | 202 | mapping with memory protection, etc. |
203 | When off all DMA over >4GB is forced through an IOMMU or bounce | 203 | Kernel boot message: "PCI-DMA: Using Calgary IOMMU" |
204 | buffering. | 204 | |
205 | nodac Forbid DMA >4GB | 205 | iommu=[<size>][,noagp][,off][,force][,noforce][,leak[=<nr_of_leak_pages>] |
206 | panic Always panic when IOMMU overflows | 206 | [,memaper[=<order>]][,merge][,forcesac][,fullflush][,nomerge] |
207 | calgary Use the Calgary IOMMU if it is available | 207 | [,noaperture][,calgary] |
208 | 208 | ||
209 | swiotlb=pages[,force] | 209 | General iommu options: |
210 | 210 | off Don't initialize and use any kind of IOMMU. | |
211 | pages Prereserve that many 128K pages for the software IO bounce buffering. | 211 | noforce Don't force hardware IOMMU usage when it is not needed. |
212 | force Force all IO through the software TLB. | 212 | (default). |
213 | 213 | force Force the use of the hardware IOMMU even when it is | |
214 | calgary=[64k,128k,256k,512k,1M,2M,4M,8M] | 214 | not actually needed (e.g. because < 3 GB memory). |
215 | calgary=[translate_empty_slots] | 215 | soft Use software bounce buffering (SWIOTLB) (default for |
216 | calgary=[disable=<PCI bus number>] | 216 | Intel machines). This can be used to prevent the usage |
217 | of an available hardware IOMMU. | ||
218 | |||
219 | iommu options only relevant to the AMD GART hardware IOMMU: | ||
220 | <size> Set the size of the remapping area in bytes. | ||
221 | allowed Overwrite iommu off workarounds for specific chipsets. | ||
222 | fullflush Flush IOMMU on each allocation (default). | ||
223 | nofullflush Don't use IOMMU fullflush. | ||
224 | leak Turn on simple iommu leak tracing (only when | ||
225 | CONFIG_IOMMU_LEAK is on). Default number of leak pages | ||
226 | is 20. | ||
227 | memaper[=<order>] Allocate an own aperture over RAM with size 32MB<<order. | ||
228 | (default: order=1, i.e. 64MB) | ||
229 | merge Do scatter-gather (SG) merging. Implies "force" | ||
230 | (experimental). | ||
231 | nomerge Don't do scatter-gather (SG) merging. | ||
232 | noaperture Ask the IOMMU not to touch the aperture for AGP. | ||
233 | forcesac Force single-address cycle (SAC) mode for masks <40bits | ||
234 | (experimental). | ||
235 | noagp Don't initialize the AGP driver and use full aperture. | ||
236 | allowdac Allow double-address cycle (DAC) mode, i.e. DMA >4GB. | ||
237 | DAC is used with 32-bit PCI to push a 64-bit address in | ||
238 | two cycles. When off all DMA over >4GB is forced through | ||
239 | an IOMMU or software bounce buffering. | ||
240 | nodac Forbid DAC mode, i.e. DMA >4GB. | ||
241 | panic Always panic when IOMMU overflows. | ||
242 | calgary Use the Calgary IOMMU if it is available | ||
243 | |||
244 | iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU | ||
245 | implementation: | ||
246 | swiotlb=<pages>[,force] | ||
247 | <pages> Prereserve that many 128K pages for the software IO | ||
248 | bounce buffering. | ||
249 | force Force all IO through the software TLB. | ||
250 | |||
251 | Settings for the IBM Calgary hardware IOMMU currently found in IBM | ||
252 | pSeries and xSeries machines: | ||
253 | |||
254 | calgary=[64k,128k,256k,512k,1M,2M,4M,8M] | ||
255 | calgary=[translate_empty_slots] | ||
256 | calgary=[disable=<PCI bus number>] | ||
257 | panic Always panic when IOMMU overflows | ||
217 | 258 | ||
218 | 64k,...,8M - Set the size of each PCI slot's translation table | 259 | 64k,...,8M - Set the size of each PCI slot's translation table |
219 | when using the Calgary IOMMU. This is the size of the translation | 260 | when using the Calgary IOMMU. This is the size of the translation |
@@ -234,14 +275,14 @@ IOMMU | |||
234 | 275 | ||
235 | Debugging | 276 | Debugging |
236 | 277 | ||
237 | oops=panic Always panic on oopses. Default is to just kill the process, | 278 | oops=panic Always panic on oopses. Default is to just kill the process, |
238 | but there is a small probability of deadlocking the machine. | 279 | but there is a small probability of deadlocking the machine. |
239 | This will also cause panics on machine check exceptions. | 280 | This will also cause panics on machine check exceptions. |
240 | Useful together with panic=30 to trigger a reboot. | 281 | Useful together with panic=30 to trigger a reboot. |
241 | 282 | ||
242 | kstack=N Print that many words from the kernel stack in oops dumps. | 283 | kstack=N Print N words from the kernel stack in oops dumps. |
243 | 284 | ||
244 | pagefaulttrace Dump all page faults. Only useful for extreme debugging | 285 | pagefaulttrace Dump all page faults. Only useful for extreme debugging |
245 | and will create a lot of output. | 286 | and will create a lot of output. |
246 | 287 | ||
247 | call_trace=[old|both|newfallback|new] | 288 | call_trace=[old|both|newfallback|new] |
@@ -251,15 +292,8 @@ Debugging | |||
251 | newfallback: use new unwinder but fall back to old if it gets | 292 | newfallback: use new unwinder but fall back to old if it gets |
252 | stuck (default) | 293 | stuck (default) |
253 | 294 | ||
254 | call_trace=[old|both|newfallback|new] | 295 | Miscellaneous |
255 | old: use old inexact backtracer | ||
256 | new: use new exact dwarf2 unwinder | ||
257 | both: print entries from both | ||
258 | newfallback: use new unwinder but fall back to old if it gets | ||
259 | stuck (default) | ||
260 | |||
261 | Misc | ||
262 | 296 | ||
263 | noreplacement Don't replace instructions with more appropriate ones | 297 | noreplacement Don't replace instructions with more appropriate ones |
264 | for the CPU. This may be useful on asymmetric MP systems | 298 | for the CPU. This may be useful on asymmetric MP systems |
265 | where some CPU have less capabilities than the others. | 299 | where some CPUs have less capabilities than others. |
diff --git a/Documentation/x86_64/cpu-hotplug-spec b/Documentation/x86_64/cpu-hotplug-spec index 5c0fa345e556..3c23e0587db3 100644 --- a/Documentation/x86_64/cpu-hotplug-spec +++ b/Documentation/x86_64/cpu-hotplug-spec | |||
@@ -2,7 +2,7 @@ Firmware support for CPU hotplug under Linux/x86-64 | |||
2 | --------------------------------------------------- | 2 | --------------------------------------------------- |
3 | 3 | ||
4 | Linux/x86-64 supports CPU hotplug now. For various reasons Linux wants to | 4 | Linux/x86-64 supports CPU hotplug now. For various reasons Linux wants to |
5 | know in advance boot time the maximum number of CPUs that could be plugged | 5 | know in advance of boot time the maximum number of CPUs that could be plugged |
6 | into the system. ACPI 3.0 currently has no official way to supply | 6 | into the system. ACPI 3.0 currently has no official way to supply |
7 | this information from the firmware to the operating system. | 7 | this information from the firmware to the operating system. |
8 | 8 | ||
diff --git a/Documentation/x86_64/kernel-stacks b/Documentation/x86_64/kernel-stacks index bddfddd466ab..5ad65d51fb95 100644 --- a/Documentation/x86_64/kernel-stacks +++ b/Documentation/x86_64/kernel-stacks | |||
@@ -9,9 +9,9 @@ zombie. While the thread is in user space the kernel stack is empty | |||
9 | except for the thread_info structure at the bottom. | 9 | except for the thread_info structure at the bottom. |
10 | 10 | ||
11 | In addition to the per thread stacks, there are specialized stacks | 11 | In addition to the per thread stacks, there are specialized stacks |
12 | associated with each cpu. These stacks are only used while the kernel | 12 | associated with each CPU. These stacks are only used while the kernel |
13 | is in control on that cpu, when a cpu returns to user space the | 13 | is in control on that CPU; when a CPU returns to user space the |
14 | specialized stacks contain no useful data. The main cpu stacks is | 14 | specialized stacks contain no useful data. The main CPU stacks are: |
15 | 15 | ||
16 | * Interrupt stack. IRQSTACKSIZE | 16 | * Interrupt stack. IRQSTACKSIZE |
17 | 17 | ||
@@ -32,17 +32,17 @@ x86_64 also has a feature which is not available on i386, the ability | |||
32 | to automatically switch to a new stack for designated events such as | 32 | to automatically switch to a new stack for designated events such as |
33 | double fault or NMI, which makes it easier to handle these unusual | 33 | double fault or NMI, which makes it easier to handle these unusual |
34 | events on x86_64. This feature is called the Interrupt Stack Table | 34 | events on x86_64. This feature is called the Interrupt Stack Table |
35 | (IST). There can be up to 7 IST entries per cpu. The IST code is an | 35 | (IST). There can be up to 7 IST entries per CPU. The IST code is an |
36 | index into the Task State Segment (TSS), the IST entries in the TSS | 36 | index into the Task State Segment (TSS). The IST entries in the TSS |
37 | point to dedicated stacks, each stack can be a different size. | 37 | point to dedicated stacks; each stack can be a different size. |
38 | 38 | ||
39 | An IST is selected by an non-zero value in the IST field of an | 39 | An IST is selected by a non-zero value in the IST field of an |
40 | interrupt-gate descriptor. When an interrupt occurs and the hardware | 40 | interrupt-gate descriptor. When an interrupt occurs and the hardware |
41 | loads such a descriptor, the hardware automatically sets the new stack | 41 | loads such a descriptor, the hardware automatically sets the new stack |
42 | pointer based on the IST value, then invokes the interrupt handler. If | 42 | pointer based on the IST value, then invokes the interrupt handler. If |
43 | software wants to allow nested IST interrupts then the handler must | 43 | software wants to allow nested IST interrupts then the handler must |
44 | adjust the IST values on entry to and exit from the interrupt handler. | 44 | adjust the IST values on entry to and exit from the interrupt handler. |
45 | (this is occasionally done, e.g. for debug exceptions) | 45 | (This is occasionally done, e.g. for debug exceptions.) |
46 | 46 | ||
47 | Events with different IST codes (i.e. with different stacks) can be | 47 | Events with different IST codes (i.e. with different stacks) can be |
48 | nested. For example, a debug interrupt can safely be interrupted by an | 48 | nested. For example, a debug interrupt can safely be interrupted by an |
@@ -58,17 +58,17 @@ The currently assigned IST stacks are :- | |||
58 | 58 | ||
59 | Used for interrupt 12 - Stack Fault Exception (#SS). | 59 | Used for interrupt 12 - Stack Fault Exception (#SS). |
60 | 60 | ||
61 | This allows to recover from invalid stack segments. Rarely | 61 | This allows the CPU to recover from invalid stack segments. Rarely |
62 | happens. | 62 | happens. |
63 | 63 | ||
64 | * DOUBLEFAULT_STACK. EXCEPTION_STKSZ (PAGE_SIZE). | 64 | * DOUBLEFAULT_STACK. EXCEPTION_STKSZ (PAGE_SIZE). |
65 | 65 | ||
66 | Used for interrupt 8 - Double Fault Exception (#DF). | 66 | Used for interrupt 8 - Double Fault Exception (#DF). |
67 | 67 | ||
68 | Invoked when handling a exception causes another exception. Happens | 68 | Invoked when handling one exception causes another exception. Happens |
69 | when the kernel is very confused (e.g. kernel stack pointer corrupt) | 69 | when the kernel is very confused (e.g. kernel stack pointer corrupt). |
70 | Using a separate stack allows to recover from it well enough in many | 70 | Using a separate stack allows the kernel to recover from it well enough |
71 | cases to still output an oops. | 71 | in many cases to still output an oops. |
72 | 72 | ||
73 | * NMI_STACK. EXCEPTION_STKSZ (PAGE_SIZE). | 73 | * NMI_STACK. EXCEPTION_STKSZ (PAGE_SIZE). |
74 | 74 | ||
diff --git a/Documentation/x86_64/machinecheck b/Documentation/x86_64/machinecheck new file mode 100644 index 000000000000..068a6d9904b9 --- /dev/null +++ b/Documentation/x86_64/machinecheck | |||
@@ -0,0 +1,70 @@ | |||
1 | |||
2 | Configurable sysfs parameters for the x86-64 machine check code. | ||
3 | |||
4 | Machine checks report internal hardware error conditions detected | ||
5 | by the CPU. Uncorrected errors typically cause a machine check | ||
6 | (often with panic), corrected ones cause a machine check log entry. | ||
7 | |||
8 | Machine checks are organized in banks (normally associated with | ||
9 | a hardware subsystem) and subevents in a bank. The exact meaning | ||
10 | of the banks and subevent is CPU specific. | ||
11 | |||
12 | mcelog knows how to decode them. | ||
13 | |||
14 | When you see the "Machine check errors logged" message in the system | ||
15 | log then mcelog should run to collect and decode machine check entries | ||
16 | from /dev/mcelog. Normally mcelog should be run regularly from a cronjob. | ||
17 | |||
18 | Each CPU has a directory in /sys/devices/system/machinecheck/machinecheckN | ||
19 | (N = CPU number) | ||
20 | |||
21 | The directory contains some configurable entries: | ||
22 | |||
23 | Entries: | ||
24 | |||
25 | bankNctl | ||
26 | (N bank number) | ||
27 | 64bit Hex bitmask enabling/disabling specific subevents for bank N | ||
28 | When a bit in the bitmask is zero then the respective | ||
29 | subevent will not be reported. | ||
30 | By default all events are enabled. | ||
31 | Note that BIOS maintain another mask to disable specific events | ||
32 | per bank. This is not visible here | ||
33 | |||
34 | The following entries appear for each CPU, but they are truly shared | ||
35 | between all CPUs. | ||
36 | |||
37 | check_interval | ||
38 | How often to poll for corrected machine check errors, in seconds | ||
39 | (Note output is hexademical). Default 5 minutes. | ||
40 | |||
41 | tolerant | ||
42 | Tolerance level. When a machine check exception occurs for a non | ||
43 | corrected machine check the kernel can take different actions. | ||
44 | Since machine check exceptions can happen any time it is sometimes | ||
45 | risky for the kernel to kill a process because it defies | ||
46 | normal kernel locking rules. The tolerance level configures | ||
47 | how hard the kernel tries to recover even at some risk of deadlock. | ||
48 | |||
49 | 0: always panic, | ||
50 | 1: panic if deadlock possible, | ||
51 | 2: try to avoid panic, | ||
52 | 3: never panic or exit (for testing only) | ||
53 | |||
54 | Default: 1 | ||
55 | |||
56 | Note this only makes a difference if the CPU allows recovery | ||
57 | from a machine check exception. Current x86 CPUs generally do not. | ||
58 | |||
59 | trigger | ||
60 | Program to run when a machine check event is detected. | ||
61 | This is an alternative to running mcelog regularly from cron | ||
62 | and allows to detect events faster. | ||
63 | |||
64 | TBD document entries for AMD threshold interrupt configuration | ||
65 | |||
66 | For more details about the x86 machine check architecture | ||
67 | see the Intel and AMD architecture manuals from their developer websites. | ||
68 | |||
69 | For more details about the architecture see | ||
70 | see http://one.firstfloor.org/~andi/mce.pdf | ||
diff --git a/Documentation/x86_64/mm.txt b/Documentation/x86_64/mm.txt index 133561b9cb0c..f42798ed1c54 100644 --- a/Documentation/x86_64/mm.txt +++ b/Documentation/x86_64/mm.txt | |||
@@ -3,26 +3,26 @@ | |||
3 | 3 | ||
4 | Virtual memory map with 4 level page tables: | 4 | Virtual memory map with 4 level page tables: |
5 | 5 | ||
6 | 0000000000000000 - 00007fffffffffff (=47bits) user space, different per mm | 6 | 0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm |
7 | hole caused by [48:63] sign extension | 7 | hole caused by [48:63] sign extension |
8 | ffff800000000000 - ffff80ffffffffff (=40bits) guard hole | 8 | ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole |
9 | ffff810000000000 - ffffc0ffffffffff (=46bits) direct mapping of all phys. memory | 9 | ffff810000000000 - ffffc0ffffffffff (=46 bits) direct mapping of all phys. memory |
10 | ffffc10000000000 - ffffc1ffffffffff (=40bits) hole | 10 | ffffc10000000000 - ffffc1ffffffffff (=40 bits) hole |
11 | ffffc20000000000 - ffffe1ffffffffff (=45bits) vmalloc/ioremap space | 11 | ffffc20000000000 - ffffe1ffffffffff (=45 bits) vmalloc/ioremap space |
12 | ... unused hole ... | 12 | ... unused hole ... |
13 | ffffffff80000000 - ffffffff82800000 (=40MB) kernel text mapping, from phys 0 | 13 | ffffffff80000000 - ffffffff82800000 (=40 MB) kernel text mapping, from phys 0 |
14 | ... unused hole ... | 14 | ... unused hole ... |
15 | ffffffff88000000 - fffffffffff00000 (=1919MB) module mapping space | 15 | ffffffff88000000 - fffffffffff00000 (=1919 MB) module mapping space |
16 | 16 | ||
17 | The direct mapping covers all memory in the system upto the highest | 17 | The direct mapping covers all memory in the system up to the highest |
18 | memory address (this means in some cases it can also include PCI memory | 18 | memory address (this means in some cases it can also include PCI memory |
19 | holes) | 19 | holes). |
20 | 20 | ||
21 | vmalloc space is lazily synchronized into the different PML4 pages of | 21 | vmalloc space is lazily synchronized into the different PML4 pages of |
22 | the processes using the page fault handler, with init_level4_pgt as | 22 | the processes using the page fault handler, with init_level4_pgt as |
23 | reference. | 23 | reference. |
24 | 24 | ||
25 | Current X86-64 implementations only support 40 bit of address space, | 25 | Current X86-64 implementations only support 40 bits of address space, |
26 | but we support upto 46bits. This expands into MBZ space in the page tables. | 26 | but we support up to 46 bits. This expands into MBZ space in the page tables. |
27 | 27 | ||
28 | -Andi Kleen, Jul 2004 | 28 | -Andi Kleen, Jul 2004 |