-rw-r--r-- | Documentation/kernel-parameters.txt | 37 | ||||
-rw-r--r-- | Documentation/vm/slub.txt | 135 |
2 files changed, 158 insertions, 14 deletions
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index aae2282600ca..ce91560229f5 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt | |||
@@ -1132,9 +1132,9 @@ and is between 256 and 4096 characters. It is defined in the file | |||
1132 | when set. | 1132 | when set. |
1133 | Format: <int> | 1133 | Format: <int> |
1134 | 1134 | ||
1135 | noaliencache [MM, NUMA] Disables the allcoation of alien caches in | 1135 | noaliencache [MM, NUMA, SLAB] Disables the allocation of alien |
1136 | the slab allocator. Saves per-node memory, but will | 1136 | caches in the slab allocator. Saves per-node memory, |
1137 | impact performance on real NUMA hardware. | 1137 | but will impact performance. |
1138 | 1138 | ||
1139 | noalign [KNL,ARM] | 1139 | noalign [KNL,ARM] |
1140 | 1140 | ||
@@ -1613,6 +1613,37 @@ and is between 256 and 4096 characters. It is defined in the file | |||
1613 | 1613 | ||
1614 | slram= [HW,MTD] | 1614 | slram= [HW,MTD] |
1615 | 1615 | ||
1616 | slub_debug [MM, SLUB] | ||
1617 | Enabling slub_debug allows one to determine the culprit | ||
1618 | if slab objects become corrupted. Enabling slub_debug | ||
1619 | creates guard zones around objects and poisons objects | ||
1620 | when not in use. Also tracks the last alloc / free. | ||
1621 | For more information see Documentation/vm/slub.txt. | ||
1622 | |||
1623 | slub_max_order= [MM, SLUB] | ||
1624 | Determines the maximum allowed order for slabs. Setting | ||
1625 | this too high may cause fragmentation. | ||
1626 | For more information see Documentation/vm/slub.txt. | ||
1627 | |||
1628 | slub_min_objects= [MM, SLUB] | ||
1629 | The minimum objects per slab. SLUB will increase the | ||
1630 | slab order up to slub_max_order to generate a | ||
1631 | sufficiently big slab to satisfy the number of objects. | ||
1632 | The higher the number of objects the smaller the overhead | ||
1633 | of tracking slabs. | ||
1634 | For more information see Documentation/vm/slub.txt. | ||
1635 | |||
1636 | slub_min_order= [MM, SLUB] | ||
1637 | Determines the minimum page order for slabs. Must be | ||
1638 | lower than slub_max_order. | ||
1639 | For more information see Documentation/vm/slub.txt. | ||
1640 | |||
1641 | slub_nomerge [MM, SLUB] | ||
1642 | Disable merging of slabs of similar size. May be | ||
1643 | necessary if there is some reason to distinguish | ||
1644 | allocs to different slabs. | ||
1645 | For more information see Documentation/vm/slub.txt. | ||
1646 | |||
1616 | smart2= [HW] | 1647 | smart2= [HW] |
1617 | Format: <io1>[,<io2>[,...,<io8>]] | 1648 | Format: <io1>[,<io2>[,...,<io8>]] |
1618 | 1649 | ||
diff --git a/Documentation/vm/slub.txt b/Documentation/vm/slub.txt index 727c8d81aeaf..1523320abd87 100644 --- a/Documentation/vm/slub.txt +++ b/Documentation/vm/slub.txt | |||
@@ -1,13 +1,9 @@ | |||
1 | Short users guide for SLUB | 1 | Short users guide for SLUB |
2 | -------------------------- | 2 | -------------------------- |
3 | 3 | ||
4 | First of all slub should transparently replace SLAB. If you enable | ||
5 | SLUB then everything should work the same (Note the word "should". | ||
6 | There is likely not much value in that word at this point). | ||
7 | |||
8 | The basic philosophy of SLUB is very different from SLAB. SLAB | 4 | The basic philosophy of SLUB is very different from SLAB. SLAB |
9 | requires rebuilding the kernel to activate debug options for all | 5 | requires rebuilding the kernel to activate debug options for all |
10 | SLABS. SLUB always includes full debugging but its off by default. | 6 | slab caches. SLUB always includes full debugging but it is off by default. |
11 | SLUB can enable debugging only for selected slabs in order to avoid | 7 | SLUB can enable debugging only for selected slabs in order to avoid |
12 | an impact on overall system performance which may make a bug more | 8 | an impact on overall system performance which may make a bug more |
13 | difficult to find. | 9 | difficult to find. |
@@ -76,13 +72,28 @@ of objects. | |||
76 | Careful with tracing: It may spew out lots of information and never stop if | 72 | Careful with tracing: It may spew out lots of information and never stop if |
77 | used on the wrong slab. | 73 | used on the wrong slab. |
78 | 74 | ||
79 | SLAB Merging | 75 | Slab merging |
80 | ------------ | 76 | ------------ |
81 | 77 | ||
82 | If no debugging is specified then SLUB may merge similar slabs together | 78 | If no debug options are specified then SLUB may merge similar slabs together |
83 | in order to reduce overhead and increase cache hotness of objects. | 79 | in order to reduce overhead and increase cache hotness of objects. |
84 | slabinfo -a displays which slabs were merged together. | 80 | slabinfo -a displays which slabs were merged together. |
85 | 81 | ||
82 | Slab validation | ||
83 | --------------- | ||
84 | |||
85 | SLUB can validate all objects if the kernel was booted with slub_debug. In | ||
86 | order to do so you must have the slabinfo tool. Then you can do | ||
87 | |||
88 | slabinfo -v | ||
89 | |||
90 | which will test all objects. Output will be generated to the syslog. | ||
91 | |||
92 | This also works in a more limited way if the kernel was booted without slub_debug. | ||
93 | In that case slabinfo -v simply tests all reachable objects. Usually | ||
94 | these are in the cpu slabs and the partial slabs. Full slabs are not | ||
95 | tracked by SLUB in a non-debug situation. | ||
96 | |||
86 | Getting more performance | 97 | Getting more performance |
87 | ------------------------ | 98 | ------------------------ |
88 | 99 | ||
@@ -91,9 +102,9 @@ list_lock once in a while to deal with partial slabs. That overhead is | |||
91 | governed by the order of the allocation for each slab. The allocations | 102 | governed by the order of the allocation for each slab. The allocations |
92 | can be influenced by kernel parameters: | 103 | can be influenced by kernel parameters: |
93 | 104 | ||
94 | slub_min_objects=x (default 8) | 105 | slub_min_objects=x (default 4) |
95 | slub_min_order=x (default 0) | 106 | slub_min_order=x (default 0) |
96 | slub_max_order=x (default 4) | 107 | slub_max_order=x (default 1) |
97 | 108 | ||
98 | slub_min_objects allows to specify how many objects must at least fit | 109 | slub_min_objects allows to specify how many objects must at least fit |
99 | into one slab in order for the allocation order to be acceptable. | 110 | into one slab in order for the allocation order to be acceptable. |
@@ -109,5 +120,107 @@ longer be checked. This is useful to avoid SLUB trying to generate | |||
109 | super large order pages to fit slub_min_objects of a slab cache with | 120 | super large order pages to fit slub_min_objects of a slab cache with |
110 | large object sizes into one high order page. | 121 | large object sizes into one high order page. |
111 | 122 | ||
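The interaction of the three parameters can be sketched in a few lines of Python. This is a deliberately simplified model, not the kernel's actual order calculation: it ignores any wasted-space heuristics, and `PAGE_SIZE`, `ORDER_LIMIT` and the function name `pick_slab_order` are assumptions made for illustration only. It shows only the rule described above: below slub_max_order a slab must hold at least slub_min_objects objects, and from slub_max_order upward it only has to hold one.

```python
PAGE_SIZE = 4096    # assumption: 4 KiB pages
ORDER_LIMIT = 11    # assumption: hypothetical upper bound on the page order


def pick_slab_order(object_size, min_objects=4, min_order=0, max_order=1):
    """Return the smallest acceptable page order for one slab (simplified)."""
    for order in range(min_order, ORDER_LIMIT):
        slab_size = PAGE_SIZE << order
        # Below max_order the slab must hold at least min_objects objects.
        if order < max_order and slab_size // object_size < min_objects:
            continue
        # From max_order on, min_objects is no longer checked, but the
        # slab still has to hold at least one object.
        if slab_size < object_size:
            continue
        return order
    raise ValueError("object too large for any slab order")


small = pick_slab_order(256)      # 16 objects fit into one order-0 page
medium = pick_slab_order(2048)    # only 2 fit at order 0, so order 1 is used
large = pick_slab_order(16384)    # needs order 2 just to hold one object
```

With the new defaults shown in the diff (slub_min_objects=4, slub_max_order=1), a 2048 byte object ends up in an order-1 slab even though it would fit at order 0, because an order-0 slab could hold only two objects.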
112 | 123 | SLUB Debug output | |
113 | Christoph Lameter, <clameter@sgi.com>, April 10, 2007 | 124 | ----------------- |
125 | |||
126 | Here is a sample of slub debug output: | ||
127 | |||
128 | *** SLUB kmalloc-8: Redzone Active@0xc90f6d20 slab 0xc528c530 offset=3360 flags=0x400000c3 inuse=61 freelist=0xc90f6d58 | ||
129 | Bytes b4 0xc90f6d10: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ | ||
130 | Object 0xc90f6d20: 31 30 31 39 2e 30 30 35 1019.005 | ||
131 | Redzone 0xc90f6d28: 00 cc cc cc . | ||
132 | FreePointer 0xc90f6d2c -> 0xc90f6d58 | ||
133 | Last alloc: get_modalias+0x61/0xf5 jiffies_ago=53 cpu=1 pid=554 | ||
134 | Filler 0xc90f6d50: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ | ||
135 | [<c010523d>] dump_trace+0x63/0x1eb | ||
136 | [<c01053df>] show_trace_log_lvl+0x1a/0x2f | ||
137 | [<c010601d>] show_trace+0x12/0x14 | ||
138 | [<c0106035>] dump_stack+0x16/0x18 | ||
139 | [<c017e0fa>] object_err+0x143/0x14b | ||
140 | [<c017e2cc>] check_object+0x66/0x234 | ||
141 | [<c017eb43>] __slab_free+0x239/0x384 | ||
142 | [<c017f446>] kfree+0xa6/0xc6 | ||
143 | [<c02e2335>] get_modalias+0xb9/0xf5 | ||
144 | [<c02e23b7>] dmi_dev_uevent+0x27/0x3c | ||
145 | [<c027866a>] dev_uevent+0x1ad/0x1da | ||
146 | [<c0205024>] kobject_uevent_env+0x20a/0x45b | ||
147 | [<c020527f>] kobject_uevent+0xa/0xf | ||
148 | [<c02779f1>] store_uevent+0x4f/0x58 | ||
149 | [<c027758e>] dev_attr_store+0x29/0x2f | ||
150 | [<c01bec4f>] sysfs_write_file+0x16e/0x19c | ||
151 | [<c0183ba7>] vfs_write+0xd1/0x15a | ||
152 | [<c01841d7>] sys_write+0x3d/0x72 | ||
153 | [<c0104112>] sysenter_past_esp+0x5f/0x99 | ||
154 | [<b7f7b410>] 0xb7f7b410 | ||
155 | ======================= | ||
156 | @@@ SLUB kmalloc-8: Restoring redzone (0xcc) from 0xc90f6d28-0xc90f6d2b | ||
157 | |||
158 | |||
159 | |||
160 | If SLUB encounters a corrupted object then it will perform the following | ||
161 | actions: | ||
162 | |||
163 | 1. Isolation and report of the issue | ||
164 | |||
165 | This will be a message in the system log starting with | ||
166 | |||
167 | *** SLUB <slab cache affected>: <What went wrong>@<object address> | ||
168 | offset=<offset of object into slab> flags=<slabflags> | ||
169 | inuse=<objects in use in this slab> freelist=<first free object in slab> | ||
170 | |||
171 | 2. Report on how the problem was dealt with in order to ensure the continued | ||
172 | operation of the system. | ||
173 | |||
174 | These are messages in the system log beginning with | ||
175 | |||
176 | @@@ SLUB <slab cache affected>: <corrective action taken> | ||
177 | |||
178 | |||
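When scanning a system log for such reports it can help to split the first line into its fields. The following Python sketch does this for the `*** SLUB` line format shown above. The function name `parse_slub_report` and the dictionary keys `cache`, `problem` and `address` are made up for this example. Only the `key=value` fields are picked up, so the `slab 0x...` token from the sample is ignored.

```python
import re

# "*** SLUB <slab cache affected>: <What went wrong>@<object address>"
HEADER_RE = re.compile(
    r"^\*\*\* SLUB (?P<cache>[^:]+): (?P<problem>.+?)@(?P<address>0x[0-9a-fA-F]+)"
)
# Trailing fields such as offset=3360 or inuse=61
FIELD_RE = re.compile(r"(\w+)=(\S+)")


def parse_slub_report(line):
    """Split the first line of a SLUB error report into a dict of strings."""
    match = HEADER_RE.match(line)
    if match is None:
        return None  # not a "*** SLUB" report line
    report = match.groupdict()
    # Collect every key=value pair that follows the object address.
    report.update(FIELD_RE.findall(line[match.end():]))
    return report


sample = (
    "*** SLUB kmalloc-8: Redzone Active@0xc90f6d20 slab 0xc528c530 "
    "offset=3360 flags=0x400000c3 inuse=61 freelist=0xc90f6d58"
)
report = parse_slub_report(sample)
```

All values are kept as strings. Converting addresses and counters to integers is left to the caller.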
179 | In the above sample SLUB found that the Redzone of an active object has | ||
180 | been overwritten. Here a string of 8 characters was written into a slab that | ||
181 | has the length of 8 characters. However, an 8 character string needs a | ||
182 | terminating 0. That zero has overwritten the first byte of the Redzone field. | ||
183 | After reporting the details of the issue encountered the @@@ SLUB message | ||
184 | tells us that SLUB has restored the redzone to its proper value and then | ||
185 | system operations continue. | ||
186 | |||
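The overflow in the sample can be reproduced with a small userspace model. The Python sketch below is not how SLUB is implemented. It only mimics the layout visible in the sample (an 8 byte object followed by four redzone bytes of 0xcc), and all names in it are invented for illustration. Writing the 8 character string "1019.005" plus its terminating zero clobbers the first redzone byte, giving the `00 cc cc cc` pattern from the report.

```python
OBJECT_SIZE = 8        # kmalloc-8 object from the sample
REDZONE_SIZE = 4       # the sample shows four redzone bytes
REDZONE_BYTE = 0xCC    # value named in the "Restoring redzone (0xcc)" line


def make_slot():
    """One object followed by its redzone, as a mutable byte buffer."""
    return bytearray(OBJECT_SIZE) + bytearray([REDZONE_BYTE] * REDZONE_SIZE)


def write_c_string(slot, text):
    """Copy text plus a terminating zero byte, without any bounds check."""
    data = text.encode("ascii") + b"\x00"
    slot[0:len(data)] = data


def check_redzone(slot):
    """Return the offsets of redzone bytes that no longer hold REDZONE_BYTE."""
    redzone = slot[OBJECT_SIZE:OBJECT_SIZE + REDZONE_SIZE]
    return [i for i, value in enumerate(redzone) if value != REDZONE_BYTE]


def restore_redzone(slot):
    """Put the redzone back to its expected value."""
    slot[OBJECT_SIZE:OBJECT_SIZE + REDZONE_SIZE] = bytes([REDZONE_BYTE] * REDZONE_SIZE)


slot = make_slot()
write_c_string(slot, "1019.005")          # 8 characters + 1 zero byte = 9 bytes
damaged = check_redzone(slot)             # offsets of overwritten redzone bytes
redzone_before = bytes(slot[OBJECT_SIZE:OBJECT_SIZE + REDZONE_SIZE])
restore_redzone(slot)                     # corresponds to the "@@@ SLUB" step
```

Allocating 9 bytes instead of 8 would leave room for the terminating zero and keep the redzone intact.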
187 | Various types of lines can follow the *** SLUB line: | ||
188 | |||
189 | Bytes b4 <address> : <bytes> | ||
190 | Show a few bytes before the object where the problem was detected. | ||
191 | Can be useful if the corruption does not stop with the start of the | ||
192 | object. | ||
193 | |||
194 | Object <address> : <bytes> | ||
195 | The bytes of the object. If the object is inactive then the bytes | ||
196 | typically contain poisoning values. Any non-poison value shows a | ||
197 | corruption by a write after free. | ||
198 | |||
199 | Redzone <address> : <bytes> | ||
200 | The redzone following the object. The redzone is used to detect | ||
201 | writes after the object. All bytes should always have the same | ||
202 | value. If there is any deviation then it is due to a write after | ||
203 | the object boundary. | ||
204 | |||
205 | FreePointer | ||
206 | The pointer to the next free object in the slab. May become | ||
207 | corrupted if overwriting continues after the redzone. | ||
208 | |||
209 | Last alloc: | ||
210 | Last free: | ||
211 | Shows the address from which the object was allocated/freed last. | ||
212 | We note the pid, the time and the CPU that did so. This is usually | ||
213 | the most useful information to figure out where things went wrong. | ||
214 | Here get_modalias() did a kmalloc(8) instead of a kmalloc(9). | ||
215 | |||
216 | Filler <address> : <bytes> | ||
217 | Unused data to fill up the space in order to get the next object | ||
218 | properly aligned. In the debug case we make sure that there are | ||
219 | at least 4 bytes of filler. This allows for the detection of writes | ||
220 | before the object. | ||
221 | |||
222 | Following the filler will be a stackdump. That stackdump describes the | ||
223 | location where the error was detected. The cause of the corruption is more | ||
224 | likely to be found by looking at the information about the last alloc / free. | ||
225 | |||
226 | Christoph Lameter, <clameter@sgi.com>, May 23, 2007 | ||