Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/CodingStyle | 21 |
-rw-r--r-- | Documentation/DocBook/kernel-hacking.tmpl | 2 |
-rw-r--r-- | Documentation/dell_rbu.txt | 18 |
-rw-r--r-- | Documentation/filesystems/relayfs.txt | 2 |
-rw-r--r-- | Documentation/ia64/mca.txt | 194 |
5 files changed, 229 insertions, 8 deletions
diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle
index 22e5f9036f3c..eb7db3c19227 100644
--- a/Documentation/CodingStyle
+++ b/Documentation/CodingStyle
@@ -410,7 +410,26 @@ Kernel messages do not have to be terminated with a period. | |||
410 | Printing numbers in parentheses (%d) adds no value and should be avoided. | 410 | Printing numbers in parentheses (%d) adds no value and should be avoided. |
411 | 411 | ||
412 | 412 | ||
413 | Chapter 13: References | 413 | Chapter 13: Allocating memory |
414 | |||
415 | The kernel provides the following general purpose memory allocators: | ||
416 | kmalloc(), kzalloc(), kcalloc(), and vmalloc(). Please refer to the API | ||
417 | documentation for further information about them. | ||
418 | |||
419 | The preferred form for passing a size of a struct is the following: | ||
420 | |||
421 | p = kmalloc(sizeof(*p), ...); | ||
422 | |||
423 | The alternative form where struct name is spelled out hurts readability and | ||
424 | introduces an opportunity for a bug when the pointer variable type is changed | ||
425 | but the corresponding sizeof that is passed to a memory allocator is not. | ||
426 | |||
427 | Casting the return value which is a void pointer is redundant. The conversion | ||
428 | from void pointer to any other pointer type is guaranteed by the C programming | ||
429 | language. | ||
430 | |||
431 | |||
432 | Chapter 14: References | ||
414 | 433 | ||
415 | The C Programming Language, Second Edition | 434 | The C Programming Language, Second Edition |
416 | by Brian W. Kernighan and Dennis M. Ritchie. | 435 | by Brian W. Kernighan and Dennis M. Ritchie. |
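
The allocation idiom recommended by the new CodingStyle chapter can be tried outside the kernel. The following is a minimal userspace sketch, assuming malloc() as a stand-in for kmalloc(); struct widget and widget_alloc() are hypothetical names used only for illustration and do not appear in the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example type; not taken from the kernel. */
struct widget {
	int id;
	char name[16];
};

/*
 * Userspace analogue of "p = kmalloc(sizeof(*p), ...)".
 * The size is derived from the pointer itself, so changing the type of p
 * cannot leave a stale sizeof(struct ...) behind. The void pointer returned
 * by malloc() is assigned without a cast.
 */
struct widget *widget_alloc(int id, const char *name)
{
	struct widget *p;

	p = malloc(sizeof(*p));
	if (!p)
		return NULL;

	/* Zero the object, then fill in the fields. */
	memset(p, 0, sizeof(*p));
	p->id = id;
	strncpy(p->name, name, sizeof(p->name) - 1);
	return p;
}
```

If struct widget is later replaced by another type, only the declaration of p has to change; the allocation size follows automatically.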
diff --git a/Documentation/DocBook/kernel-hacking.tmpl b/Documentation/DocBook/kernel-hacking.tmpl
index 6367bba32d22..582032eea872 100644
--- a/Documentation/DocBook/kernel-hacking.tmpl
+++ b/Documentation/DocBook/kernel-hacking.tmpl
@@ -1105,7 +1105,7 @@ static struct block_device_operations opt_fops = { | |||
1105 | </listitem> | 1105 | </listitem> |
1106 | <listitem> | 1106 | <listitem> |
1107 | <para> | 1107 | <para> |
1108 | Function names as strings (__func__). | 1108 | Function names as strings (__FUNCTION__). |
1109 | </para> | 1109 | </para> |
1110 | </listitem> | 1110 | </listitem> |
1111 | <listitem> | 1111 | <listitem> |
diff --git a/Documentation/dell_rbu.txt b/Documentation/dell_rbu.txt
index bcfa5c35036b..95d7f62e4dbc 100644
--- a/Documentation/dell_rbu.txt
+++ b/Documentation/dell_rbu.txt
@@ -13,6 +13,8 @@ the BIOS on Dell servers (starting from servers sold since 1999), desktops | |||
13 | and notebooks (starting from those sold in 2005). | 13 | and notebooks (starting from those sold in 2005). |
14 | Please go to http://support.dell.com register and you can find info on | 14 | Please go to http://support.dell.com register and you can find info on |
15 | OpenManage and Dell Update packages (DUP). | 15 | OpenManage and Dell Update packages (DUP). |
16 | Libsmbios can also be used to update the BIOS on Dell systems; go to | ||
17 | http://linux.dell.com/libsmbios/ for details. | ||
16 | 18 | ||
17 | Dell_RBU driver supports BIOS update using the monolithic image and packetized | 19 | Dell_RBU driver supports BIOS update using the monolithic image and packetized |
18 | image methods. In case of monolithic the driver allocates a contiguous chunk | 20 | image methods. In case of monolithic the driver allocates a contiguous chunk |
@@ -22,8 +24,8 @@ would place each packet in contiguous physical memory. The driver also | |||
22 | maintains a link list of packets for reading them back. | 24 | maintains a link list of packets for reading them back. |
23 | If the dell_rbu driver is unloaded all the allocated memory is freed. | 25 | If the dell_rbu driver is unloaded all the allocated memory is freed. |
24 | 26 | ||
25 | The rbu driver needs to have an application which will inform the BIOS to | 27 | The rbu driver needs to have an application (as mentioned above) which will |
26 | enable the update in the next system reboot. | 28 | inform the BIOS to enable the update in the next system reboot. |
27 | 29 | ||
28 | The user should not unload the rbu driver after downloading the BIOS image | 30 | The user should not unload the rbu driver after downloading the BIOS image |
29 | or updating. | 31 | or updating. |
@@ -42,9 +44,11 @@ In case of packet mechanism the single memory can be broken in smaller chuks | |||
42 | of contiguous memory and the BIOS image is scattered in these packets. | 44 | of contiguous memory and the BIOS image is scattered in these packets. |
43 | 45 | ||
44 | By default the driver uses monolithic memory for the update type. This can be | 46 | By default the driver uses monolithic memory for the update type. This can be |
45 | changed to contiguous during the driver load time by specifying the load | 47 | changed to packets during the driver load time by specifying the load |
46 | parameter image_type=packet. This can also be changed later as below | 48 | parameter image_type=packet. This can also be changed later as below |
47 | echo packet > /sys/devices/platform/dell_rbu/image_type | 49 | echo packet > /sys/devices/platform/dell_rbu/image_type |
50 | Also echoing either mono, packet or init into image_type will free up the | ||
51 | memory allocated by the driver. | ||
48 | 52 | ||
49 | Do the steps below to download the BIOS image. | 53 | Do the steps below to download the BIOS image. |
50 | 1) echo 1 > /sys/class/firmware/dell_rbu/loading | 54 | 1) echo 1 > /sys/class/firmware/dell_rbu/loading |
@@ -53,9 +57,13 @@ Do the steps below to download the BIOS image. | |||
53 | 57 | ||
54 | The /sys/class/firmware/dell_rbu/ entries will remain till the following is | 58 | The /sys/class/firmware/dell_rbu/ entries will remain till the following is |
55 | done. | 59 | done. |
56 | echo -1 > /sys/class/firmware/dell_rbu/loading | 60 | echo -1 > /sys/class/firmware/dell_rbu/loading. |
57 | |||
58 | Until this step is completed the driver cannot be unloaded. | 61 | Until this step is completed the driver cannot be unloaded. |
62 | If a user accidentally executes steps 1 and 3 above without executing step 2, | ||
63 | the /sys/class/firmware/dell_rbu/ entries will disappear. | ||
64 | The entries can be recreated by doing the following | ||
65 | echo init > /sys/devices/platform/dell_rbu/image_type | ||
66 | NOTE: echoing init in image_type does not change its original value. | ||
59 | 67 | ||
60 | Also the driver provides /sys/devices/platform/dell_rbu/data readonly file to | 68 | Also the driver provides /sys/devices/platform/dell_rbu/data readonly file to |
61 | read back the image downloaded. This is useful in case of packet update | 69 | read back the image downloaded. This is useful in case of packet update |
diff --git a/Documentation/filesystems/relayfs.txt b/Documentation/filesystems/relayfs.txt
index d24e1b0d4f39..d803abed29f0 100644
--- a/Documentation/filesystems/relayfs.txt
+++ b/Documentation/filesystems/relayfs.txt
@@ -15,7 +15,7 @@ retrieve the data as it becomes available. | |||
15 | 15 | ||
16 | The format of the data logged into the channel buffers is completely | 16 | The format of the data logged into the channel buffers is completely |
17 | up to the relayfs client; relayfs does however provide hooks which | 17 | up to the relayfs client; relayfs does however provide hooks which |
18 | allow clients to impose some stucture on the buffer data. Nor does | 18 | allow clients to impose some structure on the buffer data. Nor does |
19 | relayfs implement any form of data filtering - this also is left to | 19 | relayfs implement any form of data filtering - this also is left to |
20 | the client. The purpose is to keep relayfs as simple as possible. | 20 | the client. The purpose is to keep relayfs as simple as possible. |
21 | 21 | ||
diff --git a/Documentation/ia64/mca.txt b/Documentation/ia64/mca.txt
new file mode 100644
index 000000000000..a71cc6a67ef7
--- /dev/null
+++ b/Documentation/ia64/mca.txt
@@ -0,0 +1,194 @@ | |||
1 | An ad-hoc collection of notes on IA64 MCA and INIT processing. Feel | ||
2 | free to update it with notes about any area that is not clear. | ||
3 | |||
4 | --- | ||
5 | |||
6 | MCA/INIT are completely asynchronous. They can occur at any time, when | ||
7 | the OS is in any state. Including when one of the cpus is already | ||
8 | holding a spinlock. Trying to get any lock from MCA/INIT state is | ||
9 | asking for deadlock. Also the state of structures that are protected | ||
10 | by locks is indeterminate, including linked lists. | ||
11 | |||
12 | --- | ||
13 | |||
14 | The complicated ia64 MCA process. All of this is mandated by Intel's | ||
15 | specification for ia64 SAL, error recovery and unwind, it is not as | ||
16 | if we have a choice here. | ||
17 | |||
18 | * MCA occurs on one cpu, usually due to a double bit memory error. | ||
19 | This is the monarch cpu. | ||
20 | |||
21 | * SAL sends an MCA rendezvous interrupt (which is a normal interrupt) | ||
22 | to all the other cpus, the slaves. | ||
23 | |||
24 | * Slave cpus that receive the MCA interrupt call down into SAL, they | ||
25 | end up spinning disabled while the MCA is being serviced. | ||
26 | |||
27 | * If any slave cpu was already spinning disabled when the MCA occurred | ||
28 | then it cannot service the MCA interrupt. SAL waits ~20 seconds then | ||
29 | sends an unmaskable INIT event to the slave cpus that have not | ||
30 | already rendezvoused. | ||
31 | |||
32 | * Because MCA/INIT can be delivered at any time, including when the cpu | ||
33 | is down in PAL in physical mode, the registers at the time of the | ||
34 | event are _completely_ undefined. In particular the MCA/INIT | ||
35 | handlers cannot rely on the thread pointer, PAL physical mode can | ||
36 | (and does) modify TP. It is allowed to do that as long as it resets | ||
37 | TP on return. However MCA/INIT events expose us to these PAL | ||
38 | internal TP changes. Hence curr_task(). | ||
39 | |||
40 | * If an MCA/INIT event occurs while the kernel was running (not user | ||
41 | space) and the kernel has called PAL then the MCA/INIT handler cannot | ||
42 | assume that the kernel stack is in a fit state to be used. Mainly | ||
43 | because PAL may or may not maintain the stack pointer internally. | ||
44 | Because the MCA/INIT handlers cannot trust the kernel stack, they | ||
45 | have to use their own, per-cpu stacks. The MCA/INIT stacks are | ||
46 | preformatted with just enough task state to let the relevant handlers | ||
47 | do their job. | ||
48 | |||
49 | * Unlike most other architectures, the ia64 struct task is embedded in | ||
50 | the kernel stack[1]. So switching to a new kernel stack means that | ||
51 | we switch to a new task as well. Because various bits of the kernel | ||
52 | assume that current points into the struct task, switching to a new | ||
53 | stack also means a new value for current. | ||
54 | |||
55 | * Once all slaves have rendezvoused and are spinning disabled, the | ||
56 | monarch is entered. The monarch now tries to diagnose the problem | ||
57 | and decide if it can recover or not. | ||
58 | |||
59 | * Part of the monarch's job is to look at the state of all the other | ||
60 | tasks. The only way to do that on ia64 is to call the unwinder, | ||
61 | as mandated by Intel. | ||
62 | |||
63 | * The starting point for the unwind depends on whether a task is | ||
64 | running or not. That is, whether it is on a cpu or is blocked. The | ||
65 | monarch has to determine whether or not a task is on a cpu before it | ||
66 | knows how to start unwinding it. The tasks that received an MCA or | ||
67 | INIT event are no longer running, they have been converted to blocked | ||
68 | tasks. But (and it's a big but), the cpus that received the MCA | ||
69 | rendezvous interrupt are still running on their normal kernel stacks! | ||
70 | |||
71 | * To distinguish between these two cases, the monarch must know which | ||
72 | tasks are on a cpu and which are not. Hence each slave cpu that | ||
73 | switches to an MCA/INIT stack, registers its new stack using | ||
74 | set_curr_task(), so the monarch can tell that the _original_ task is | ||
75 | no longer running on that cpu. That gives us a decent chance of | ||
76 | getting a valid backtrace of the _original_ task. | ||
77 | |||
78 | * MCA/INIT can be nested, to a depth of 2 on any cpu. In the case of a | ||
79 | nested error, we want diagnostics on the MCA/INIT handler that | ||
80 | failed, not on the task that was originally running. Again this | ||
81 | requires set_curr_task() so the MCA/INIT handlers can register their | ||
82 | own stack as running on that cpu. Then a recursive error gets a | ||
83 | trace of the failing handler's "task". | ||
84 | |||
85 | [1] My (Keith Owens) original design called for ia64 to separate its | ||
86 | struct task and the kernel stacks. Then the MCA/INIT data would be | ||
87 | chained stacks like i386 interrupt stacks. But that required | ||
88 | radical surgery on the rest of ia64, plus extra hard wired TLB | ||
89 | entries with its associated performance degradation. David | ||
90 | Mosberger vetoed that approach. Which meant that separate kernel | ||
91 | stacks meant separate "tasks" for the MCA/INIT handlers. | ||
92 | |||
93 | --- | ||
94 | |||
95 | INIT is less complicated than MCA. Pressing the nmi button or using | ||
96 | the equivalent command on the management console sends INIT to all | ||
97 | cpus. SAL picks one of the cpus as the monarch and the rest are | ||
98 | slaves. All the OS INIT handlers are entered at approximately the same | ||
99 | time. The OS monarch prints the state of all tasks and returns, after | ||
100 | which the slaves return and the system resumes. | ||
101 | |||
102 | At least that is what is supposed to happen. Alas there are broken | ||
103 | versions of SAL out there. Some drive all the cpus as monarchs. Some | ||
104 | drive them all as slaves. Some drive one cpu as monarch, wait for that | ||
105 | cpu to return from the OS then drive the rest as slaves. Some versions | ||
106 | of SAL cannot even cope with returning from the OS, they spin inside | ||
107 | SAL on resume. The OS INIT code has workarounds for some of these | ||
108 | broken SAL symptoms, but some simply cannot be fixed from the OS side. | ||
109 | |||
110 | --- | ||
111 | |||
112 | The scheduler hooks used by ia64 (curr_task, set_curr_task) are layer | ||
113 | violations. Unfortunately MCA/INIT start off as massive layer | ||
114 | violations (can occur at _any_ time) and they build from there. | ||
115 | |||
116 | At least ia64 makes an attempt at recovering from hardware errors, but | ||
117 | it is a difficult problem because of the asynchronous nature of these | ||
118 | errors. When processing an unmaskable interrupt we sometimes need | ||
119 | special code to cope with our inability to take any locks. | ||
120 | |||
121 | --- | ||
122 | |||
123 | How is ia64 MCA/INIT different from x86 NMI? | ||
124 | |||
125 | * x86 NMI typically gets delivered to one cpu. MCA/INIT gets sent to | ||
126 | all cpus. | ||
127 | |||
128 | * x86 NMI cannot be nested. MCA/INIT can be nested, to a depth of 2 | ||
129 | per cpu. | ||
130 | |||
131 | * x86 has a separate struct task which points to one of multiple kernel | ||
132 | stacks. ia64 has the struct task embedded in the single kernel | ||
133 | stack, so switching stack means switching task. | ||
134 | |||
135 | * x86 does not call the BIOS so the NMI handler does not have to worry | ||
136 | about any registers having changed. MCA/INIT can occur while the cpu | ||
137 | is in PAL in physical mode, with undefined registers and an undefined | ||
138 | kernel stack. | ||
139 | |||
140 | * i386 backtrace is not very sensitive to whether a process is running | ||
141 | or not. ia64 unwind is very, very sensitive to whether a process is | ||
142 | running or not. | ||
143 | |||
144 | --- | ||
145 | |||
146 | What happens when MCA/INIT is delivered while a cpu is running user | ||
147 | space code? | ||
148 | |||
149 | The user mode registers are stored in the RSE area of the MCA/INIT stack on | ||
150 | entry to the OS and are restored from there on return to SAL, so user | ||
151 | mode registers are preserved across a recoverable MCA/INIT. Since the | ||
152 | OS has no idea what unwind data is available for the user space stack, | ||
153 | MCA/INIT never tries to backtrace user space. Which means that the OS | ||
154 | does not bother making the user space process look like a blocked task, | ||
155 | i.e. the OS does not copy pt_regs and switch_stack to the user space | ||
156 | stack. Also the OS has no idea how big the user space RSE and memory | ||
157 | stacks are, which makes it too risky to copy the saved state to a user | ||
158 | mode stack. | ||
159 | |||
160 | --- | ||
161 | |||
162 | How do we get a backtrace on the tasks that were running when MCA/INIT | ||
163 | was delivered? | ||
164 | |||
165 | mca.c::ia64_mca_modify_original_stack(). That identifies and | ||
166 | verifies the original kernel stack, copies the dirty registers from | ||
167 | the MCA/INIT stack's RSE to the original stack's RSE, copies the | ||
168 | skeleton struct pt_regs and switch_stack to the original stack, fills | ||
169 | in the skeleton structures from the PAL minstate area and updates the | ||
170 | original stack's thread.ksp. That makes the original stack look | ||
171 | exactly like any other blocked task, i.e. it now appears to be | ||
172 | sleeping. To get a backtrace, just start with thread.ksp for the | ||
173 | original task and unwind like any other sleeping task. | ||
174 | |||
175 | --- | ||
176 | |||
177 | How do we identify the tasks that were running when MCA/INIT was | ||
178 | delivered? | ||
179 | |||
180 | If the previous task has been verified and converted to a blocked | ||
181 | state, then sos->prev_task on the MCA/INIT stack is updated to point to | ||
182 | the previous task. You can look at that field in dumps or debuggers. | ||
183 | To help distinguish between the handler and the original tasks, | ||
184 | handlers have _TIF_MCA_INIT set in thread_info.flags. | ||
185 | |||
186 | The sos data is always in the MCA/INIT handler stack, at offset | ||
187 | MCA_SOS_OFFSET. You can get that value from mca_asm.h or calculate it | ||
188 | as KERNEL_STACK_SIZE - sizeof(struct pt_regs) - sizeof(struct | ||
189 | ia64_sal_os_state), with 16 byte alignment for all structures. | ||
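
The offset arithmetic described above can be sketched in plain C. This is only an illustration of the stated formula: the sizes passed in are placeholders rather than the real ia64 structure sizes, the names align16_down() and mca_sos_offset() are hypothetical, and rounding each offset down to a 16 byte boundary is an assumption about how the alignment is applied. The authoritative value is the one in mca_asm.h.

```c
/*
 * Round x down to a multiple of 16.
 * Assumption: structures are carved from the top of the stack downwards,
 * so aligning means rounding the offset down.
 */
static unsigned long align16_down(unsigned long x)
{
	return x & ~15UL;
}

/*
 * Sketch of: KERNEL_STACK_SIZE - sizeof(struct pt_regs)
 *            - sizeof(struct ia64_sal_os_state), 16 byte aligned.
 * All three sizes are supplied by the caller because the real values
 * depend on the ia64 kernel headers and configuration.
 */
unsigned long mca_sos_offset(unsigned long stack_size,
			     unsigned long pt_regs_size,
			     unsigned long sos_size)
{
	/* pt_regs sits at the top of the stack, aligned. */
	unsigned long pt_regs_off = align16_down(stack_size - pt_regs_size);

	/* The sos data sits directly below pt_regs, aligned again. */
	return align16_down(pt_regs_off - sos_size);
}
```

With placeholder sizes of 32768, 400 and 1000 bytes the function returns 31360, which is a multiple of 16 as required.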
190 | |||
191 | Also the comm field of the MCA/INIT task is modified to include the pid | ||
192 | of the original task, for humans to use. For example, a comm field of | ||
193 | 'MCA 12159' means that pid 12159 was running when the MCA was | ||
194 | delivered. | ||
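
When post-processing a dump, the original pid can be recovered from such a comm field. The following is a minimal userspace sketch, assuming the field has the form "<type> <pid>" as in the 'MCA 12159' example above; original_pid_from_comm() is a hypothetical helper, not a kernel function.

```c
#include <stdio.h>

/*
 * Extract the pid of the original task from a handler comm field
 * such as "MCA 12159". Returns the pid, or -1 if the string does not
 * consist of a type word followed by a non-negative number.
 */
int original_pid_from_comm(const char *comm)
{
	char type[16];
	int pid;

	/* %15s bounds the type word to the size of the buffer. */
	if (sscanf(comm, "%15s %d", type, &pid) != 2)
		return -1;
	if (pid < 0)
		return -1;
	return pid;
}
```

An ordinary comm value such as "bash" has no numeric part, so the helper returns -1 for it.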