author | Jonathan Corbet <corbet@lwn.net> | 2016-08-19 13:38:36 -0400 |
---|---|---|
committer | Jonathan Corbet <corbet@lwn.net> | 2016-08-19 13:38:36 -0400 |
commit | 5512128f027aec63a9a2ca792858801554a57baf (patch) | |
tree | fff0d5541614d4d17dbc1b709f8b450acf924cf1 /Documentation/dev-tools | |
parent | 44f4ddd1bff04196349ab229a6a08e5223fe1594 (diff) | |
parent | 5f0962748d46c63aaf5c46dcb1c8f52dfb7b717f (diff) |
Merge branch 'dev-tools' into doc/4.9
Coalesce development-tool documents into a single directory and sphinxify
them.
Diffstat (limited to 'Documentation/dev-tools')
-rw-r--r-- | Documentation/dev-tools/coccinelle.rst | 491 | ||||
-rw-r--r-- | Documentation/dev-tools/gcov.rst | 256 | ||||
-rw-r--r-- | Documentation/dev-tools/gdb-kernel-debugging.rst | 173 | ||||
-rw-r--r-- | Documentation/dev-tools/kasan.rst | 173 | ||||
-rw-r--r-- | Documentation/dev-tools/kcov.rst | 111 | ||||
-rw-r--r-- | Documentation/dev-tools/kmemcheck.rst | 733 | ||||
-rw-r--r-- | Documentation/dev-tools/kmemleak.rst | 210 | ||||
-rw-r--r-- | Documentation/dev-tools/sparse.rst | 117 | ||||
-rw-r--r-- | Documentation/dev-tools/tools.rst | 25 | ||||
-rw-r--r-- | Documentation/dev-tools/ubsan.rst | 88 |
10 files changed, 2377 insertions, 0 deletions
diff --git a/Documentation/dev-tools/coccinelle.rst b/Documentation/dev-tools/coccinelle.rst
new file mode 100644
index 000000000000..4a64b4c69d3f
--- /dev/null
+++ b/Documentation/dev-tools/coccinelle.rst
@@ -0,0 +1,491 @@ | |||
1 | .. Copyright 2010 Nicolas Palix <npalix@diku.dk> | ||
2 | .. Copyright 2010 Julia Lawall <julia@diku.dk> | ||
3 | .. Copyright 2010 Gilles Muller <Gilles.Muller@lip6.fr> | ||
4 | |||
5 | .. highlight:: none | ||
6 | |||
7 | Coccinelle | ||
8 | ========== | ||
9 | |||
10 | Coccinelle is a tool for pattern matching and text transformation that has | ||
11 | many uses in kernel development, including the application of complex, | ||
12 | tree-wide patches and detection of problematic programming patterns. | ||
13 | |||
14 | Getting Coccinelle | ||
15 | ------------------- | ||
16 | |||
17 | The semantic patches included in the kernel use features and options | ||
18 | which are provided by Coccinelle version 1.0.0-rc11 and above. | ||
19 | Using earlier versions will fail as the option names used by | ||
20 | the Coccinelle files and coccicheck have been updated. | ||
21 | |||
22 | Coccinelle is available through the package manager | ||
23 | of many distributions, e.g. : | ||
24 | |||
25 | - Debian | ||
26 | - Fedora | ||
27 | - Ubuntu | ||
28 | - OpenSUSE | ||
29 | - Arch Linux | ||
30 | - NetBSD | ||
31 | - FreeBSD | ||
32 | |||
33 | You can get the latest version released from the Coccinelle homepage at | ||
34 | http://coccinelle.lip6.fr/ | ||
35 | |||
36 | Information and tips about Coccinelle are also provided on the wiki | ||
37 | pages at http://cocci.ekstranet.diku.dk/wiki/doku.php | ||
38 | |||
39 | Once you have it, run the following command:: | ||
40 | |||
41 | ./configure | ||
42 | make | ||
43 | |||
44 | as a regular user, and install it with:: | ||
45 | |||
46 | sudo make install | ||
47 | |||
48 | Supplemental documentation | ||
49 | --------------------------- | ||
50 | |||
51 | For supplemental documentation refer to the wiki: | ||
52 | |||
53 | https://bottest.wiki.kernel.org/coccicheck | ||
54 | |||
55 | The wiki documentation always refers to the linux-next version of the script. | ||
56 | |||
57 | Using Coccinelle on the Linux kernel | ||
58 | ------------------------------------ | ||
59 | |||
60 | A Coccinelle-specific target is defined in the top level | ||
61 | Makefile. This target is named ``coccicheck`` and calls the ``coccicheck`` | ||
62 | front-end in the ``scripts`` directory. | ||
63 | |||
64 | Four basic modes are defined: ``patch``, ``report``, ``context``, and | ||
65 | ``org``. The mode to use is specified by setting the MODE variable with | ||
66 | ``MODE=<mode>``. | ||
67 | |||
68 | - ``patch`` proposes a fix, when possible. | ||
69 | |||
70 | - ``report`` generates a list in the following format: | ||
71 | file:line:column-column: message | ||
72 | |||
73 | - ``context`` highlights lines of interest and their context in a | ||
74 | diff-like style. Lines of interest are indicated with ``-``. | ||
75 | |||
76 | - ``org`` generates a report in the Org mode format of Emacs. | ||
77 | |||
78 | Note that not all semantic patches implement all modes. For easy use | ||
79 | of Coccinelle, the default mode is "report". | ||
80 | |||
81 | Two other modes provide some common combinations of these modes. | ||
82 | |||
83 | - ``chain`` tries the previous modes in the order above until one succeeds. | ||
84 | |||
85 | - ``rep+ctxt`` runs the report mode and then the context mode. | ||
86 | It should be used with the C option (described later) | ||
87 | which checks the code on a file basis. | ||
88 | |||
89 | Examples | ||
90 | ~~~~~~~~ | ||
91 | |||
92 | To make a report for every semantic patch, run the following command:: | ||
93 | |||
94 | make coccicheck MODE=report | ||
95 | |||
96 | To produce patches, run:: | ||
97 | |||
98 | make coccicheck MODE=patch | ||
99 | |||
100 | |||
101 | The coccicheck target applies every semantic patch available in the | ||
102 | sub-directories of ``scripts/coccinelle`` to the entire Linux kernel. | ||
103 | |||
104 | For each semantic patch, a commit message is proposed. It gives a | ||
105 | description of the problem being checked by the semantic patch, and | ||
106 | includes a reference to Coccinelle. | ||
107 | |||
108 | Like any static code analyzer, Coccinelle produces false | ||
109 | positives. Thus, reports must be carefully checked, and patches | ||
110 | reviewed. | ||
111 | |||
112 | To enable verbose messages set the V= variable, for example:: | ||
113 | |||
114 | make coccicheck MODE=report V=1 | ||
115 | |||
116 | Coccinelle parallelization | ||
117 | --------------------------- | ||
118 | |||
119 | By default, coccicheck tries to run as parallel as possible. To change | ||
120 | the parallelism, set the J= variable. For example, to run across 4 CPUs:: | ||
121 | |||
122 | make coccicheck MODE=report J=4 | ||
123 | |||
124 | As of Coccinelle 1.0.2, Coccinelle uses OCaml parmap for parallelization. | ||
125 | If support for this is detected, you will benefit from parmap parallelization. | ||
126 | |||
127 | When parmap is enabled, coccicheck enables dynamic load balancing by passing | ||
128 | the ``--chunksize 1`` argument. This ensures we keep feeding threads with work | ||
129 | one by one, so that we avoid the situation where most work gets done by only | ||
130 | a few threads. With dynamic load balancing, if a thread finishes early we keep | ||
131 | feeding it more work. | ||
132 | |||
133 | When parmap is enabled, if an error occurs in Coccinelle, this error | ||
134 | value is propagated back, and the return value of ``make coccicheck`` | ||
135 | captures it. | ||
136 | |||
137 | Using Coccinelle with a single semantic patch | ||
138 | --------------------------------------------- | ||
139 | |||
140 | The optional make variable COCCI can be used to check a single | ||
141 | semantic patch. In that case, the variable must be initialized with | ||
142 | the name of the semantic patch to apply. | ||
143 | |||
144 | For instance:: | ||
145 | |||
146 | make coccicheck COCCI=<my_SP.cocci> MODE=patch | ||
147 | |||
148 | or:: | ||
149 | |||
150 | make coccicheck COCCI=<my_SP.cocci> MODE=report | ||
151 | |||
152 | |||
153 | Controlling Which Files are Processed by Coccinelle | ||
154 | --------------------------------------------------- | ||
155 | |||
156 | By default the entire kernel source tree is checked. | ||
157 | |||
158 | To apply Coccinelle to a specific directory, ``M=`` can be used. | ||
159 | For example, to check drivers/net/wireless/ one may write:: | ||
160 | |||
161 | make coccicheck M=drivers/net/wireless/ | ||
162 | |||
163 | To apply Coccinelle on a file basis, instead of a directory basis, the | ||
164 | following command may be used:: | ||
165 | |||
166 | make C=1 CHECK="scripts/coccicheck" | ||
167 | |||
168 | To check only newly edited code, use the value 2 for the C flag, i.e.:: | ||
169 | |||
170 | make C=2 CHECK="scripts/coccicheck" | ||
171 | |||
172 | In these modes, which work on a file basis, there is no information | ||
173 | about semantic patches displayed, and no commit message proposed. | ||
174 | |||
175 | This runs every semantic patch in scripts/coccinelle by default. The | ||
176 | COCCI variable may additionally be used to only apply a single | ||
177 | semantic patch as shown in the previous section. | ||
178 | |||
179 | The "report" mode is the default. You can select another one with the | ||
180 | MODE variable explained above. | ||
181 | |||
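The variables described in this section (``MODE``, ``COCCI`` and ``M``) can be
combined on one command line. As a minimal sketch, the helper below only prints
the resulting ``make`` invocation so it can be inspected before it is run. The
function name ``coccicheck_cmd`` is hypothetical and is not part of the kernel
tree.

```shell
# Hypothetical helper: print (do not run) a coccicheck command line.
# Usage: coccicheck_cmd <mode> [semantic-patch] [directory]
coccicheck_cmd() {
    cc_mode=$1
    cc_cocci=$2
    cc_dir=$3

    cc_cmd="make coccicheck MODE=$cc_mode"
    # COCCI= restricts the run to a single semantic patch.
    if [ -n "$cc_cocci" ]; then
        cc_cmd="$cc_cmd COCCI=$cc_cocci"
    fi
    # M= restricts the run to one directory.
    if [ -n "$cc_dir" ]; then
        cc_cmd="$cc_cmd M=$cc_dir"
    fi
    echo "$cc_cmd"
}

# Example: show the command for one semantic patch on one directory.
coccicheck_cmd report scripts/coccinelle/api/err_cast.cocci drivers/net/wireless/
```

Printing the command first is a deliberate choice here: a tree-wide run can take
a long time, so it helps to confirm the variables before starting it.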
182 | Debugging Coccinelle SmPL patches | ||
183 | --------------------------------- | ||
184 | |||
185 | Using coccicheck is best, as it passes to the spatch command line the | ||
186 | include options matching those used when we compile the kernel. | ||
187 | You can learn what these options are by using V=1; you could then | ||
188 | manually run Coccinelle with debug options added. | ||
189 | |||
190 | Alternatively, you can debug running Coccinelle against SmPL patches | ||
191 | by asking for stderr to be redirected to a file. By default stderr | ||
192 | is redirected to /dev/null; if you'd like to capture stderr you | ||
193 | can specify the ``DEBUG_FILE="file.txt"`` option to coccicheck. For | ||
194 | instance:: | ||
195 | |||
196 | rm -f cocci.err | ||
197 | make coccicheck COCCI=scripts/coccinelle/free/kfree.cocci MODE=report DEBUG_FILE=cocci.err | ||
198 | cat cocci.err | ||
199 | |||
200 | You can use SPFLAGS to add debugging flags; for instance, you may want to | ||
201 | add both ``--profile`` and ``--show-trying`` to SPFLAGS when debugging. | ||
202 | For example:: | ||
203 | |||
204 | rm -f err.log | ||
205 | export COCCI=scripts/coccinelle/misc/irqf_oneshot.cocci | ||
206 | make coccicheck DEBUG_FILE="err.log" MODE=report SPFLAGS="--profile --show-trying" M=./drivers/mfd/arizona-irq.c | ||
207 | |||
208 | err.log will now have the profiling information, while stdout will | ||
209 | provide some progress information as Coccinelle moves forward with | ||
210 | work. | ||
211 | |||
212 | DEBUG_FILE is only supported when using coccinelle >= 1.2. | ||
213 | |||
214 | .cocciconfig support | ||
215 | -------------------- | ||
216 | |||
217 | Coccinelle supports reading .cocciconfig for default Coccinelle options that | ||
218 | should be used every time spatch is spawned. The order of precedence for | ||
219 | variables for .cocciconfig is as follows: | ||
220 | |||
221 | - Your current user's home directory is processed first | ||
222 | - The directory from which spatch is called is processed next | ||
223 | - The directory provided with the --dir option is processed last, if used | ||
224 | |||
225 | Since coccicheck runs through make, it naturally runs from the kernel | ||
226 | proper dir; as such, the second rule above would be implied for picking up a | ||
227 | .cocciconfig when using ``make coccicheck``. | ||
228 | |||
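The processing order listed above can be sketched as a small shell helper that
prints the .cocciconfig files that exist, in the order they would be read. This
is only an illustration under stated assumptions: the function name
``cocciconfig_order`` is hypothetical, and the helper looks at file locations
only, it does not ask spatch what it actually loaded.

```shell
# Hypothetical helper: list existing .cocciconfig files in processing order:
# home directory first, then the current directory, then the --dir argument.
# Usage: cocciconfig_order [directory-given-to---dir]
cocciconfig_order() {
    for co_dir in "$HOME" "$PWD" "$1"; do
        # The --dir directory is optional; skip it when it is not given.
        [ -n "$co_dir" ] || continue
        if [ -f "$co_dir/.cocciconfig" ]; then
            echo "$co_dir/.cocciconfig"
        fi
    done
}
```

To confirm which options spatch really uses, ``spatch --print-options-only``
remains the authoritative check.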
229 | ``make coccicheck`` also supports using M= targets. If you do not supply | ||
230 | any M= target, it is assumed you want to target the entire kernel. | ||
231 | The kernel coccicheck script has:: | ||
232 | |||
233 | if [ "$KBUILD_EXTMOD" = "" ] ; then | ||
234 | OPTIONS="--dir $srctree $COCCIINCLUDE" | ||
235 | else | ||
236 | OPTIONS="--dir $KBUILD_EXTMOD $COCCIINCLUDE" | ||
237 | fi | ||
238 | |||
239 | KBUILD_EXTMOD is set when an explicit target with M= is used. For both cases | ||
240 | the spatch --dir argument is used; as such, the third rule applies whether M= | ||
241 | is used or not, and when M= is used the target directory can have its own | ||
242 | .cocciconfig file. When M= is not passed as an argument to coccicheck the | ||
243 | target directory is the same as the directory from where spatch was called. | ||
244 | |||
245 | If not using the kernel's coccicheck target, keep the above precedence | ||
246 | order logic of .cocciconfig reading. If using the kernel's coccicheck target, | ||
247 | override any of the kernel's .cocciconfig settings using SPFLAGS. | ||
248 | |||
249 | We help Coccinelle when used against Linux with a set of sensible default | ||
250 | options for Linux with our own Linux .cocciconfig. This hints to coccinelle | ||
251 | that git can be used for ``git grep`` queries over coccigrep. A timeout of 200 | ||
252 | seconds should suffice for now. | ||
253 | |||
254 | The options picked up by coccinelle when reading a .cocciconfig do not appear | ||
255 | as arguments to spatch processes running on your system. To confirm what | ||
256 | options will be used by Coccinelle, run:: | ||
257 | |||
258 | spatch --print-options-only | ||
259 | |||
260 | You can override with your own preferred index option by using SPFLAGS. Take | ||
261 | note that when there are conflicting options Coccinelle gives precedence to | ||
262 | the last options passed. It is possible to use idutils via .cocciconfig; however, | ||
263 | given the order of precedence followed by Coccinelle, since the kernel now | ||
264 | carries its own .cocciconfig, you will need to use SPFLAGS to use idutils if | ||
265 | desired. See the "Additional flags" section below for more details on how to use | ||
266 | idutils. | ||
267 | |||
268 | Additional flags | ||
269 | ---------------- | ||
270 | |||
271 | Additional flags can be passed to spatch through the SPFLAGS | ||
272 | variable. This works as Coccinelle respects the last flags | ||
273 | given to it when options are in conflict. :: | ||
274 | |||
275 | make SPFLAGS=--use-glimpse coccicheck | ||
276 | |||
277 | Coccinelle supports idutils as well, but this requires coccinelle >= 1.0.6. | ||
278 | When no ID file is specified, coccinelle assumes your ID database file | ||
279 | is in the file .id-utils.index on the top level of the kernel. Coccinelle | ||
280 | carries a script scripts/idutils_index.sh which creates the database with:: | ||
281 | |||
282 | mkid -i C --output .id-utils.index | ||
283 | |||
284 | If you have another database filename you can also just symlink with this | ||
285 | name. :: | ||
286 | |||
287 | make SPFLAGS=--use-idutils coccicheck | ||
288 | |||
289 | Alternatively you can specify the database filename explicitly, for | ||
290 | instance:: | ||
291 | |||
292 | make SPFLAGS="--use-idutils /full-path/to/ID" coccicheck | ||
293 | |||
294 | See ``spatch --help`` to learn more about spatch options. | ||
295 | |||
296 | Note that the ``--use-glimpse`` and ``--use-idutils`` options | ||
297 | require external tools for indexing the code. None of them is | ||
298 | thus active by default. However, by indexing the code with | ||
299 | one of these tools, and according to the cocci file used, | ||
300 | spatch could process the entire code base more quickly. | ||
301 | |||
302 | SmPL patch specific options | ||
303 | --------------------------- | ||
304 | |||
305 | SmPL patches can have their own requirements for options passed | ||
306 | to Coccinelle. SmPL patch specific options can be provided by | ||
307 | providing them at the top of the SmPL patch, for instance:: | ||
308 | |||
309 | // Options: --no-includes --include-headers | ||
310 | |||
311 | SmPL patch Coccinelle requirements | ||
312 | ---------------------------------- | ||
313 | |||
314 | As Coccinelle features get added, some more advanced SmPL patches | ||
315 | may require newer versions of Coccinelle. If an SmPL patch requires | ||
316 | a minimum version of Coccinelle, this can be specified as follows, | ||
317 | for example when requiring at least Coccinelle 1.0.5:: | ||
318 | |||
319 | // Requires: 1.0.5 | ||
320 | |||
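As an illustration of how such a requirement could be checked before running a
semantic patch, the sketch below reads the ``// Requires:`` line and compares
it with a given version. Both function names are hypothetical, this is not how
coccicheck itself is documented to perform the check, and the comparison
assumes a ``sort`` that supports ``-V`` (version sort), as GNU coreutils does.

```shell
# Hypothetical helper: print the version named on a "// Requires:" line.
# Usage: required_version <file.cocci>
required_version() {
    sed -n 's|^// Requires: *||p' "$1" | head -n 1
}

# Hypothetical helper: succeed when <have> is at least <need>.
# Usage: version_at_least <need> <have>
version_at_least() {
    va_need=$1
    va_have=$2
    # With version sort, the lowest of the two comes first.
    va_lowest=$(printf '%s\n%s\n' "$va_need" "$va_have" | sort -V | head -n 1)
    [ "$va_lowest" = "$va_need" ]
}
```

A caller could combine the two, for example
``version_at_least "$(required_version my_SP.cocci)" 1.0.6``, where the
installed version (``1.0.6`` here) has to be supplied by the caller.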
321 | Proposing new semantic patches | ||
322 | ------------------------------- | ||
323 | |||
324 | New semantic patches can be proposed and submitted by kernel | ||
325 | developers. For the sake of clarity, they should be organized in the | ||
326 | sub-directories of ``scripts/coccinelle/``. | ||
327 | |||
328 | |||
329 | Detailed description of the ``report`` mode | ||
330 | ------------------------------------------- | ||
331 | |||
332 | ``report`` generates a list in the following format:: | ||
333 | |||
334 | file:line:column-column: message | ||
335 | |||
336 | Example | ||
337 | ~~~~~~~ | ||
338 | |||
339 | Running:: | ||
340 | |||
341 | make coccicheck MODE=report COCCI=scripts/coccinelle/api/err_cast.cocci | ||
342 | |||
343 | will execute the following part of the SmPL script:: | ||
344 | |||
345 | <smpl> | ||
346 | @r depends on !context && !patch && (org || report)@ | ||
347 | expression x; | ||
348 | position p; | ||
349 | @@ | ||
350 | |||
351 | ERR_PTR@p(PTR_ERR(x)) | ||
352 | |||
353 | @script:python depends on report@ | ||
354 | p << r.p; | ||
355 | x << r.x; | ||
356 | @@ | ||
357 | |||
358 | msg="ERR_CAST can be used with %s" % (x) | ||
359 | coccilib.report.print_report(p[0], msg) | ||
360 | </smpl> | ||
361 | |||
362 | This SmPL excerpt generates entries on the standard output, as | ||
363 | illustrated below:: | ||
364 | |||
365 | /home/user/linux/crypto/ctr.c:188:9-16: ERR_CAST can be used with alg | ||
366 | /home/user/linux/crypto/authenc.c:619:9-16: ERR_CAST can be used with auth | ||
367 | /home/user/linux/crypto/xts.c:227:9-16: ERR_CAST can be used with alg | ||
368 | |||
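Because every report line follows the ``file:line:column-column: message``
format, the output is easy to post-process. The sketch below splits each line
into tab-separated fields with awk. The function name ``split_report`` is
hypothetical, and the sketch assumes that file names contain no ``:``
character.

```shell
# Hypothetical helper: read report lines on stdin and print
# file, line, first column, last column and message, separated by tabs.
split_report() {
    awk -F: '{
        split($3, cols, "-")
        # Re-join the message in case it contains ":" itself.
        msg = $4
        for (i = 5; i <= NF; i++)
            msg = msg ":" $i
        sub(/^ /, "", msg)
        printf "%s\t%s\t%s\t%s\t%s\n", $1, $2, cols[1], cols[2], msg
    }'
}

# Possible use:
#   make coccicheck MODE=report COCCI=scripts/coccinelle/api/err_cast.cocci | split_report
```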
369 | |||
370 | Detailed description of the ``patch`` mode | ||
371 | ------------------------------------------ | ||
372 | |||
373 | When the ``patch`` mode is available, it proposes a fix for each problem | ||
374 | identified. | ||
375 | |||
376 | Example | ||
377 | ~~~~~~~ | ||
378 | |||
379 | Running:: | ||
380 | |||
381 | make coccicheck MODE=patch COCCI=scripts/coccinelle/api/err_cast.cocci | ||
382 | |||
383 | will execute the following part of the SmPL script:: | ||
384 | |||
385 | <smpl> | ||
386 | @ depends on !context && patch && !org && !report @ | ||
387 | expression x; | ||
388 | @@ | ||
389 | |||
390 | - ERR_PTR(PTR_ERR(x)) | ||
391 | + ERR_CAST(x) | ||
392 | </smpl> | ||
393 | |||
394 | This SmPL excerpt generates patch hunks on the standard output, as | ||
395 | illustrated below:: | ||
396 | |||
397 | diff -u -p a/crypto/ctr.c b/crypto/ctr.c | ||
398 | --- a/crypto/ctr.c 2010-05-26 10:49:38.000000000 +0200 | ||
399 | +++ b/crypto/ctr.c 2010-06-03 23:44:49.000000000 +0200 | ||
400 | @@ -185,7 +185,7 @@ static struct crypto_instance *crypto_ct | ||
401 | alg = crypto_attr_alg(tb[1], CRYPTO_ALG_TYPE_CIPHER, | ||
402 | CRYPTO_ALG_TYPE_MASK); | ||
403 | if (IS_ERR(alg)) | ||
404 | - return ERR_PTR(PTR_ERR(alg)); | ||
405 | + return ERR_CAST(alg); | ||
406 | |||
407 | /* Block size must be >= 4 bytes. */ | ||
408 | err = -EINVAL; | ||
409 | |||
410 | Detailed description of the ``context`` mode | ||
411 | -------------------------------------------- | ||
412 | |||
413 | ``context`` highlights lines of interest and their context | ||
414 | in a diff-like style. | ||
415 | |||
416 | **NOTE**: The diff-like output generated is NOT an applicable patch. The | ||
417 | intent of the ``context`` mode is to highlight the important lines | ||
418 | (annotated with minus, ``-``) and give some surrounding context | ||
419 | lines. This output can be used with the diff mode of | ||
420 | Emacs to review the code. | ||
421 | |||
422 | Example | ||
423 | ~~~~~~~ | ||
424 | |||
425 | Running:: | ||
426 | |||
427 | make coccicheck MODE=context COCCI=scripts/coccinelle/api/err_cast.cocci | ||
428 | |||
429 | will execute the following part of the SmPL script:: | ||
430 | |||
431 | <smpl> | ||
432 | @ depends on context && !patch && !org && !report@ | ||
433 | expression x; | ||
434 | @@ | ||
435 | |||
436 | * ERR_PTR(PTR_ERR(x)) | ||
437 | </smpl> | ||
438 | |||
439 | This SmPL excerpt generates diff hunks on the standard output, as | ||
440 | illustrated below:: | ||
441 | |||
442 | diff -u -p /home/user/linux/crypto/ctr.c /tmp/nothing | ||
443 | --- /home/user/linux/crypto/ctr.c 2010-05-26 10:49:38.000000000 +0200 | ||
444 | +++ /tmp/nothing | ||
445 | @@ -185,7 +185,6 @@ static struct crypto_instance *crypto_ct | ||
446 | alg = crypto_attr_alg(tb[1], CRYPTO_ALG_TYPE_CIPHER, | ||
447 | CRYPTO_ALG_TYPE_MASK); | ||
448 | if (IS_ERR(alg)) | ||
449 | - return ERR_PTR(PTR_ERR(alg)); | ||
450 | |||
451 | /* Block size must be >= 4 bytes. */ | ||
452 | err = -EINVAL; | ||
453 | |||
454 | Detailed description of the ``org`` mode | ||
455 | ---------------------------------------- | ||
456 | |||
457 | ``org`` generates a report in the Org mode format of Emacs. | ||
458 | |||
459 | Example | ||
460 | ~~~~~~~ | ||
461 | |||
462 | Running:: | ||
463 | |||
464 | make coccicheck MODE=org COCCI=scripts/coccinelle/api/err_cast.cocci | ||
465 | |||
466 | will execute the following part of the SmPL script:: | ||
467 | |||
468 | <smpl> | ||
469 | @r depends on !context && !patch && (org || report)@ | ||
470 | expression x; | ||
471 | position p; | ||
472 | @@ | ||
473 | |||
474 | ERR_PTR@p(PTR_ERR(x)) | ||
475 | |||
476 | @script:python depends on org@ | ||
477 | p << r.p; | ||
478 | x << r.x; | ||
479 | @@ | ||
480 | |||
481 | msg="ERR_CAST can be used with %s" % (x) | ||
482 | msg_safe=msg.replace("[","@(").replace("]",")") | ||
483 | coccilib.org.print_todo(p[0], msg_safe) | ||
484 | </smpl> | ||
485 | |||
486 | This SmPL excerpt generates Org entries on the standard output, as | ||
487 | illustrated below:: | ||
488 | |||
489 | * TODO [[view:/home/user/linux/crypto/ctr.c::face=ovl-face1::linb=188::colb=9::cole=16][ERR_CAST can be used with alg]] | ||
490 | * TODO [[view:/home/user/linux/crypto/authenc.c::face=ovl-face1::linb=619::colb=9::cole=16][ERR_CAST can be used with auth]] | ||
491 | * TODO [[view:/home/user/linux/crypto/xts.c::face=ovl-face1::linb=227::colb=9::cole=16][ERR_CAST can be used with alg]] | ||
diff --git a/Documentation/dev-tools/gcov.rst b/Documentation/dev-tools/gcov.rst
new file mode 100644
index 000000000000..19eedfea8800
--- /dev/null
+++ b/Documentation/dev-tools/gcov.rst
@@ -0,0 +1,256 @@ | |||
1 | Using gcov with the Linux kernel | ||
2 | ================================ | ||
3 | |||
4 | gcov profiling kernel support enables the use of GCC's coverage testing | ||
5 | tool gcov_ with the Linux kernel. Coverage data of a running kernel | ||
6 | is exported in gcov-compatible format via the "gcov" debugfs directory. | ||
7 | To get coverage data for a specific file, change to the kernel build | ||
8 | directory and use gcov with the ``-o`` option as follows (requires root):: | ||
9 | |||
10 | # cd /tmp/linux-out | ||
11 | # gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c | ||
12 | |||
13 | This will create source code files annotated with execution counts | ||
14 | in the current directory. In addition, graphical gcov front-ends such | ||
15 | as lcov_ can be used to automate the process of collecting data | ||
16 | for the entire kernel and provide coverage overviews in HTML format. | ||
17 | |||
18 | Possible uses: | ||
19 | |||
20 | * debugging (has this line been reached at all?) | ||
21 | * test improvement (how do I change my test to cover these lines?) | ||
22 | * minimizing kernel configurations (do I need this option if the | ||
23 | associated code is never run?) | ||
24 | |||
25 | .. _gcov: http://gcc.gnu.org/onlinedocs/gcc/Gcov.html | ||
26 | .. _lcov: http://ltp.sourceforge.net/coverage/lcov.php | ||
27 | |||
28 | |||
29 | Preparation | ||
30 | ----------- | ||
31 | |||
32 | Configure the kernel with:: | ||
33 | |||
34 | CONFIG_DEBUG_FS=y | ||
35 | CONFIG_GCOV_KERNEL=y | ||
36 | |||
37 | select gcc's gcov format (the default is to autodetect it based on the gcc version):: | ||
38 | |||
39 | CONFIG_GCOV_FORMAT_AUTODETECT=y | ||
40 | |||
41 | and to get coverage data for the entire kernel:: | ||
42 | |||
43 | CONFIG_GCOV_PROFILE_ALL=y | ||
44 | |||
45 | Note that kernels compiled with profiling flags will be significantly | ||
46 | larger and run slower. Also CONFIG_GCOV_PROFILE_ALL may not be supported | ||
47 | on all architectures. | ||
48 | |||
49 | Profiling data will only become accessible once debugfs has been | ||
50 | mounted:: | ||
51 | |||
52 | mount -t debugfs none /sys/kernel/debug | ||
53 | |||
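Since debugfs may already be mounted, it can be useful to check before
mounting. The sketch below looks for a debugfs entry in a mounts table in
``/proc/mounts`` format. The function name ``debugfs_mounted`` is hypothetical,
and the optional second argument exists only so the check can be tried against
a sample file.

```shell
# Hypothetical helper: succeed when debugfs is mounted at <dir> according
# to a mounts table (fields: device, mount point, file system type, ...).
# Usage: debugfs_mounted [dir] [mounts-file]
debugfs_mounted() {
    dm_dir=${1:-/sys/kernel/debug}
    dm_table=${2:-/proc/mounts}
    awk -v dir="$dm_dir" '
        $2 == dir && $3 == "debugfs" { found = 1 }
        END { exit !found }
    ' "$dm_table"
}

# Possible use (requires root for the mount itself):
#   debugfs_mounted || mount -t debugfs none /sys/kernel/debug
```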
54 | |||
55 | Customization | ||
56 | ------------- | ||
57 | |||
58 | To enable profiling for specific files or directories, add a line | ||
59 | similar to the following to the respective kernel Makefile: | ||
60 | |||
61 | - For a single file (e.g. main.o):: | ||
62 | |||
63 | GCOV_PROFILE_main.o := y | ||
64 | |||
65 | - For all files in one directory:: | ||
66 | |||
67 | GCOV_PROFILE := y | ||
68 | |||
69 | To exclude files from being profiled even when CONFIG_GCOV_PROFILE_ALL | ||
70 | is specified, use:: | ||
71 | |||
72 | GCOV_PROFILE_main.o := n | ||
73 | |||
74 | and:: | ||
75 | |||
76 | GCOV_PROFILE := n | ||
77 | |||
78 | Only files which are linked to the main kernel image or are compiled as | ||
79 | kernel modules are supported by this mechanism. | ||
80 | |||
81 | |||
82 | Files | ||
83 | ----- | ||
84 | |||
85 | The gcov kernel support creates the following files in debugfs: | ||
86 | |||
87 | ``/sys/kernel/debug/gcov`` | ||
88 | Parent directory for all gcov-related files. | ||
89 | |||
90 | ``/sys/kernel/debug/gcov/reset`` | ||
91 | Global reset file: resets all coverage data to zero when | ||
92 | written to. | ||
93 | |||
94 | ``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda`` | ||
95 | The actual gcov data file as understood by the gcov | ||
96 | tool. Resets file coverage data to zero when written to. | ||
97 | |||
98 | ``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno`` | ||
99 | Symbolic link to a static data file required by the gcov | ||
100 | tool. This file is generated by gcc when compiling with | ||
101 | option ``-ftest-coverage``. | ||
102 | |||
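The files above suggest a simple workflow: reset the counters, run a test, then
look at the data files. A minimal sketch follows. The function names and the
``GCOV_DIR`` variable are hypothetical conveniences, and writing ``0`` is an
arbitrary choice, since the text above only says that the reset happens when
the file is written to.

```shell
# Hypothetical setting: where the gcov files live (debugfs must be mounted).
GCOV_DIR=${GCOV_DIR:-/sys/kernel/debug/gcov}

# Reset all coverage data by writing to the global reset file (requires root).
gcov_reset_all() {
    echo 0 > "$GCOV_DIR/reset"
}

# List every .gcda data file currently exported below GCOV_DIR.
gcov_list_data() {
    find "$GCOV_DIR" -name '*.gcda' | sort
}
```

A test run would then look like ``gcov_reset_all``, followed by the workload
of interest, followed by ``gcov_list_data`` to see which data files exist.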
103 | |||
104 | Modules | ||
105 | ------- | ||
106 | |||
107 | Kernel modules may contain cleanup code which is only run during | ||
108 | module unload time. The gcov mechanism provides a means to collect | ||
109 | coverage data for such code by keeping a copy of the data associated | ||
110 | with the unloaded module. This data remains available through debugfs. | ||
111 | Once the module is loaded again, the associated coverage counters are | ||
112 | initialized with the data from its previous instantiation. | ||
113 | |||
114 | This behavior can be deactivated by specifying the gcov_persist kernel | ||
115 | parameter:: | ||
116 | |||
117 | gcov_persist=0 | ||
118 | |||
119 | At run-time, a user can also choose to discard data for an unloaded | ||
120 | module by writing to its data file or the global reset file. | ||
121 | |||
122 | |||
123 | Separated build and test machines | ||
124 | --------------------------------- | ||
125 | |||
126 | The gcov kernel profiling infrastructure is designed to work out-of-the-box | ||
127 | for setups where kernels are built and run on the same machine. In | ||
128 | cases where the kernel runs on a separate machine, special preparations | ||
129 | must be made, depending on where the gcov tool is used: | ||
130 | |||
131 | a) gcov is run on the TEST machine | ||
132 | |||
133 | The gcov tool version on the test machine must be compatible with the | ||
134 | gcc version used for kernel build. Also the following files need to be | ||
135 | copied from build to test machine: | ||
136 | |||
137 | from the source tree: | ||
138 | - all C source files + headers | ||
139 | |||
140 | from the build tree: | ||
141 | - all C source files + headers | ||
142 | - all .gcda and .gcno files | ||
143 | - all links to directories | ||
144 | |||
145 | It is important to note that these files need to be placed into the | ||
146 | exact same file system location on the test machine as on the build | ||
147 | machine. If any of the path components is a symbolic link, the actual | ||
148 | directory needs to be used instead (due to make's CURDIR handling). | ||
149 | |||
150 | b) gcov is run on the BUILD machine | ||
151 | |||
152 | The following files need to be copied after each test case from test | ||
153 | to build machine: | ||
154 | |||
155 | from the gcov directory in sysfs: | ||
156 | - all .gcda files | ||
157 | - all links to .gcno files | ||
158 | |||
159 | These files can be copied to any location on the build machine. gcov | ||
160 | must then be called with the -o option pointing to that directory. | ||
161 | |||
162 | Example directory setup on the build machine:: | ||
163 | |||
164 | /tmp/linux: kernel source tree | ||
165 | /tmp/out: kernel build directory as specified by make O= | ||
166 | /tmp/coverage: location of the files copied from the test machine | ||
167 | |||
168 | [user@build] cd /tmp/out | ||
169 | [user@build] gcov -o /tmp/coverage/tmp/out/init main.c | ||
170 | |||
171 | |||
172 | Troubleshooting | ||
173 | --------------- | ||
174 | |||
175 | Problem | ||
176 | Compilation aborts during linker step. | ||
177 | |||
178 | Cause | ||
179 | Profiling flags are specified for source files which are not | ||
180 | linked to the main kernel or which are linked by a custom | ||
181 | linker procedure. | ||
182 | |||
183 | Solution | ||
184 | Exclude affected source files from profiling by specifying | ||
185 | ``GCOV_PROFILE := n`` or ``GCOV_PROFILE_basename.o := n`` in the | ||
186 | corresponding Makefile. | ||
187 | |||
188 | Problem | ||
189 | Files copied from sysfs appear empty or incomplete. | ||
190 | |||
191 | Cause | ||
192 | Due to the way seq_file works, some tools such as cp or tar | ||
193 | may not correctly copy files from sysfs. | ||
194 | |||
195 | Solution | ||
196 | Use ``cat`` to read ``.gcda`` files and ``cp -d`` to copy links. | ||
197 | Alternatively use the mechanism shown in Appendix B. | ||
198 | |||
199 | |||
200 | Appendix A: gather_on_build.sh | ||
201 | ------------------------------ | ||
202 | |||
203 | Sample script to gather coverage meta files on the build machine | ||
204 | (see case a) under "Separated build and test machines"):: | ||
205 | |||
206 | #!/bin/bash | ||
207 | |||
208 | KSRC=$1 | ||
209 | KOBJ=$2 | ||
210 | DEST=$3 | ||
211 | |||
212 | if [ -z "$KSRC" ] || [ -z "$KOBJ" ] || [ -z "$DEST" ]; then | ||
213 | echo "Usage: $0 <ksrc directory> <kobj directory> <output.tar.gz>" >&2 | ||
214 | exit 1 | ||
215 | fi | ||
216 | |||
217 | KSRC=$(cd $KSRC; printf "all:\n\t@echo \${CURDIR}\n" | make -f -) | ||
218 | KOBJ=$(cd $KOBJ; printf "all:\n\t@echo \${CURDIR}\n" | make -f -) | ||
219 | |||
220 | find $KSRC $KOBJ \( -name '*.gcno' -o -name '*.[ch]' -o -type l \) -a \ | ||
221 | -perm /u+r,g+r | tar cfz $DEST -P -T - | ||
222 | |||
223 | if [ $? -eq 0 ] ; then | ||
224 | echo "$DEST successfully created, copy to test system and unpack with:" | ||
225 | echo " tar xfz $DEST -P" | ||
226 | else | ||
227 | echo "Could not create file $DEST" | ||
228 | fi | ||
229 | |||
230 | |||
231 | Appendix B: gather_on_test.sh | ||
232 | ----------------------------- | ||
233 | |||
234 | Sample script to gather coverage data files on the test machine | ||
235 | (see case b) under "Separated build and test machines"):: | ||
236 | |||
237 | #!/bin/bash -e | ||
238 | |||
239 | DEST=$1 | ||
240 | GCDA=/sys/kernel/debug/gcov | ||
241 | |||
242 | if [ -z "$DEST" ] ; then | ||
243 | echo "Usage: $0 <output.tar.gz>" >&2 | ||
244 | exit 1 | ||
245 | fi | ||
246 | |||
247 | TEMPDIR=$(mktemp -d) | ||
248 | echo Collecting data.. | ||
249 | find $GCDA -type d -exec mkdir -p $TEMPDIR/\{\} \; | ||
250 | find $GCDA -name '*.gcda' -exec sh -c 'cat < $0 > '$TEMPDIR'/$0' {} \; | ||
251 | find $GCDA -name '*.gcno' -exec sh -c 'cp -d $0 '$TEMPDIR'/$0' {} \; | ||
252 | tar czf $DEST -C $TEMPDIR sys | ||
253 | rm -rf $TEMPDIR | ||
254 | |||
255 | echo "$DEST successfully created, copy to build system and unpack with:" | ||
256 | echo " tar xfz $DEST" | ||
diff --git a/Documentation/dev-tools/gdb-kernel-debugging.rst b/Documentation/dev-tools/gdb-kernel-debugging.rst
new file mode 100644
index 000000000000..5e93c9bc6619
--- /dev/null
+++ b/Documentation/dev-tools/gdb-kernel-debugging.rst
@@ -0,0 +1,173 @@ | |||
1 | .. highlight:: none | ||
2 | |||
3 | Debugging kernel and modules via gdb | ||
4 | ==================================== | ||
5 | |||
6 | The kernel debugger kgdb, hypervisors like QEMU, and JTAG-based hardware | ||
7 | interfaces make it possible to debug the Linux kernel and its modules at | ||
8 | runtime using gdb. gdb comes with a powerful scripting interface for Python. | ||
9 | The kernel provides a collection of helper scripts that can simplify typical | ||
10 | kernel debugging steps. This is a short tutorial on how to enable and use | ||
11 | them. It focuses on QEMU/KVM virtual machines as the target, but the examples | ||
12 | can be transferred to the other gdb stubs as well. | ||
13 | |||
14 | |||
15 | Requirements | ||
16 | ------------ | ||
17 | |||
18 | - gdb 7.2+ (recommended: 7.4+) with python support enabled (typically true | ||
19 | for distributions) | ||
20 | |||
21 | |||
22 | Setup | ||
23 | ----- | ||
24 | |||
25 | - Create a virtual Linux machine for QEMU/KVM (see www.linux-kvm.org and | ||
26 | www.qemu.org for more details). For cross-development, | ||
27 | http://landley.net/aboriginal/bin keeps a pool of machine images and | ||
28 | toolchains that can be helpful to start from. | ||
29 | |||
30 | - Build the kernel with CONFIG_GDB_SCRIPTS enabled, but leave | ||
31 | CONFIG_DEBUG_INFO_REDUCED off. If your architecture supports | ||
32 | CONFIG_FRAME_POINTER, keep it enabled. | ||
33 | |||
34 | - Install that kernel on the guest. | ||
35 | Alternatively, QEMU can boot the kernel directly using the -kernel, | ||
36 | -append, and -initrd command-line switches. This is generally only useful if | ||
37 | you do not depend on modules. See the QEMU documentation for more details on | ||
38 | this mode. | ||
39 | |||
40 | - Enable the gdb stub of QEMU/KVM, either | ||
41 | |||
42 | - at VM startup time by appending "-s" to the QEMU command line | ||
43 | |||
44 | or | ||
45 | |||
46 | - during runtime by issuing "gdbserver" from the QEMU monitor | ||
47 | console | ||
48 | |||
49 | - cd /path/to/linux-build | ||
50 | |||
51 | - Start gdb: gdb vmlinux | ||
52 | |||
53 | Note: Some distros may restrict auto-loading of gdb scripts to known safe | ||
54 | directories. If gdb reports that it refuses to load vmlinux-gdb.py, add:: | ||
55 | |||
56 | add-auto-load-safe-path /path/to/linux-build | ||
57 | |||
58 | to ~/.gdbinit. See gdb help for more details. | ||
59 | |||
60 | - Attach to the booted guest:: | ||
61 | |||
62 | (gdb) target remote :1234 | ||
63 | |||
64 | |||
65 | Examples of using the Linux-provided gdb helpers | ||
66 | ------------------------------------------------ | ||
67 | |||
68 | - Load module (and main kernel) symbols:: | ||
69 | |||
70 | (gdb) lx-symbols | ||
71 | loading vmlinux | ||
72 | scanning for modules in /home/user/linux/build | ||
73 | loading @0xffffffffa0020000: /home/user/linux/build/net/netfilter/xt_tcpudp.ko | ||
74 | loading @0xffffffffa0016000: /home/user/linux/build/net/netfilter/xt_pkttype.ko | ||
75 | loading @0xffffffffa0002000: /home/user/linux/build/net/netfilter/xt_limit.ko | ||
76 | loading @0xffffffffa00ca000: /home/user/linux/build/net/packet/af_packet.ko | ||
77 | loading @0xffffffffa003c000: /home/user/linux/build/fs/fuse/fuse.ko | ||
78 | ... | ||
79 | loading @0xffffffffa0000000: /home/user/linux/build/drivers/ata/ata_generic.ko | ||
80 | |||
81 | - Set a breakpoint on some not yet loaded module function, e.g.:: | ||
82 | |||
83 | (gdb) b btrfs_init_sysfs | ||
84 | Function "btrfs_init_sysfs" not defined. | ||
85 | Make breakpoint pending on future shared library load? (y or [n]) y | ||
86 | Breakpoint 1 (btrfs_init_sysfs) pending. | ||
87 | |||
88 | - Continue the target:: | ||
89 | |||
90 | (gdb) c | ||
91 | |||
92 | - Load the module on the target and watch the symbols being loaded as well as | ||
93 | the breakpoint hit:: | ||
94 | |||
95 | loading @0xffffffffa0034000: /home/user/linux/build/lib/libcrc32c.ko | ||
96 | loading @0xffffffffa0050000: /home/user/linux/build/lib/lzo/lzo_compress.ko | ||
97 | loading @0xffffffffa006e000: /home/user/linux/build/lib/zlib_deflate/zlib_deflate.ko | ||
98 | loading @0xffffffffa01b1000: /home/user/linux/build/fs/btrfs/btrfs.ko | ||
99 | |||
100 | Breakpoint 1, btrfs_init_sysfs () at /home/user/linux/fs/btrfs/sysfs.c:36 | ||
101 | 36 btrfs_kset = kset_create_and_add("btrfs", NULL, fs_kobj); | ||
102 | |||
103 | - Dump the log buffer of the target kernel:: | ||
104 | |||
105 | (gdb) lx-dmesg | ||
106 | [ 0.000000] Initializing cgroup subsys cpuset | ||
107 | [ 0.000000] Initializing cgroup subsys cpu | ||
108 | [ 0.000000] Linux version 3.8.0-rc4-dbg+ (... | ||
109 | [ 0.000000] Command line: root=/dev/sda2 resume=/dev/sda1 vga=0x314 | ||
110 | [ 0.000000] e820: BIOS-provided physical RAM map: | ||
111 | [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable | ||
112 | [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved | ||
113 | .... | ||
114 | |||
115 | - Examine fields of the current task struct:: | ||
116 | |||
117 | (gdb) p $lx_current().pid | ||
118 | $1 = 4998 | ||
119 | (gdb) p $lx_current().comm | ||
120 | $2 = "modprobe\000\000\000\000\000\000\000" | ||
121 | |||
122 | - Make use of the per-cpu function for the current or a specified CPU:: | ||
123 | |||
124 | (gdb) p $lx_per_cpu("runqueues").nr_running | ||
125 | $3 = 1 | ||
126 | (gdb) p $lx_per_cpu("runqueues", 2).nr_running | ||
127 | $4 = 0 | ||
128 | |||
129 | - Dig into hrtimers using the container_of helper:: | ||
130 | |||
131 | (gdb) set $next = $lx_per_cpu("hrtimer_bases").clock_base[0].active.next | ||
132 | (gdb) p *$container_of($next, "struct hrtimer", "node") | ||
133 | $5 = { | ||
134 | node = { | ||
135 | node = { | ||
136 | __rb_parent_color = 18446612133355256072, | ||
137 | rb_right = 0x0 <irq_stack_union>, | ||
138 | rb_left = 0x0 <irq_stack_union> | ||
139 | }, | ||
140 | expires = { | ||
141 | tv64 = 1835268000000 | ||
142 | } | ||
143 | }, | ||
144 | _softexpires = { | ||
145 | tv64 = 1835268000000 | ||
146 | }, | ||
147 | function = 0xffffffff81078232 <tick_sched_timer>, | ||
148 | base = 0xffff88003fd0d6f0, | ||
149 | state = 1, | ||
150 | start_pid = 0, | ||
151 | start_site = 0xffffffff81055c1f <hrtimer_start_range_ns+20>, | ||
152 | start_comm = "swapper/2\000\000\000\000\000\000" | ||
153 | } | ||
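
The container_of helper used in the last example relies on plain address arithmetic: subtracting a member's offset from the member's address yields the start of the enclosing structure. The following is a minimal Python sketch of that arithmetic using ctypes, run outside of gdb. The ``Node`` and ``Timer`` structures and this ``container_of`` function are hypothetical stand-ins for illustration; they are not the kernel's struct hrtimer or the implementation of the gdb helper.

```python
import ctypes


class Node(ctypes.Structure):
    # Hypothetical embedded member, loosely modelled on a tree node.
    _fields_ = [("rb_left", ctypes.c_void_p), ("rb_right", ctypes.c_void_p)]


class Timer(ctypes.Structure):
    # Hypothetical container that embeds a Node after another field.
    _fields_ = [("state", ctypes.c_ulong), ("node", Node)]


def container_of(member_addr, container_type, member_name):
    """Return the address of the container, given the address of a member."""
    offset = getattr(container_type, member_name).offset
    return member_addr - offset


timer = Timer(state=1)
# Pretend that only the address of the embedded node is known.
node_addr = ctypes.addressof(timer.node)
recovered = Timer.from_address(container_of(node_addr, Timer, "node"))
print(recovered.state)
```

In the gdb session above, the same idea is applied to a pointer taken from the per-cpu ``hrtimer_bases`` data, with the type and member passed as strings.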
154 | |||
155 | |||
156 | List of commands and functions | ||
157 | ------------------------------ | ||
158 | |||
159 | The set of commands and convenience functions may evolve over time; | ||
160 | this is just a snapshot of the initial version:: | ||
161 | |||
162 | (gdb) apropos lx | ||
163 | function lx_current -- Return current task | ||
164 | function lx_module -- Find module by name and return the module variable | ||
165 | function lx_per_cpu -- Return per-cpu variable | ||
166 | function lx_task_by_pid -- Find Linux task by PID and return the task_struct variable | ||
167 | function lx_thread_info -- Calculate Linux thread_info from task variable | ||
168 | lx-dmesg -- Print Linux kernel log buffer | ||
169 | lx-lsmod -- List currently loaded modules | ||
170 | lx-symbols -- (Re-)load symbols of Linux kernel and currently loaded modules | ||
171 | |||
172 | Detailed help can be obtained via "help <command-name>" for commands and "help | ||
173 | function <function-name>" for convenience functions. | ||
diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst new file mode 100644 index 000000000000..f7a18f274357 --- /dev/null +++ b/Documentation/dev-tools/kasan.rst | |||
@@ -0,0 +1,173 @@ | |||
1 | The Kernel Address Sanitizer (KASAN) | ||
2 | ==================================== | ||
3 | |||
4 | Overview | ||
5 | -------- | ||
6 | |||
7 | KernelAddressSANitizer (KASAN) is a dynamic memory error detector. It provides | ||
8 | a fast and comprehensive solution for finding use-after-free and out-of-bounds | ||
9 | bugs. | ||
10 | |||
11 | KASAN uses compile-time instrumentation for checking every memory access, | ||
12 | therefore you will need a GCC version 4.9.2 or later. GCC 5.0 or later is | ||
13 | required for detection of out-of-bounds accesses to stack or global variables. | ||
14 | |||
15 | Currently KASAN is supported only for the x86_64 and arm64 architectures. | ||
16 | |||
17 | Usage | ||
18 | ----- | ||
19 | |||
20 | To enable KASAN, configure the kernel with:: | ||
21 | |||
22 | CONFIG_KASAN = y | ||
23 | |||
24 | and choose between CONFIG_KASAN_OUTLINE and CONFIG_KASAN_INLINE. Outline and | ||
25 | inline are compiler instrumentation types. The former produces a smaller | ||
26 | binary, while the latter is 1.1 to 2 times faster. Inline instrumentation | ||
27 | requires GCC version 5.0 or later. | ||
28 | |||
29 | KASAN works with both SLUB and SLAB memory allocators. | ||
30 | For better bug detection and nicer reporting, enable CONFIG_STACKTRACE. | ||
31 | |||
32 | To disable instrumentation for specific files or directories, add a line | ||
33 | similar to the following to the respective kernel Makefile: | ||
34 | |||
35 | - For a single file (e.g. main.o):: | ||
36 | |||
37 | KASAN_SANITIZE_main.o := n | ||
38 | |||
39 | - For all files in one directory:: | ||
40 | |||
41 | KASAN_SANITIZE := n | ||
42 | |||
43 | Error reports | ||
44 | ~~~~~~~~~~~~~ | ||
45 | |||
46 | A typical out-of-bounds access report looks like this:: | ||
47 | |||
48 | ================================================================== | ||
49 | BUG: AddressSanitizer: out of bounds access in kmalloc_oob_right+0x65/0x75 [test_kasan] at addr ffff8800693bc5d3 | ||
50 | Write of size 1 by task modprobe/1689 | ||
51 | ============================================================================= | ||
52 | BUG kmalloc-128 (Not tainted): kasan error | ||
53 | ----------------------------------------------------------------------------- | ||
54 | |||
55 | Disabling lock debugging due to kernel taint | ||
56 | INFO: Allocated in kmalloc_oob_right+0x3d/0x75 [test_kasan] age=0 cpu=0 pid=1689 | ||
57 | __slab_alloc+0x4b4/0x4f0 | ||
58 | kmem_cache_alloc_trace+0x10b/0x190 | ||
59 | kmalloc_oob_right+0x3d/0x75 [test_kasan] | ||
60 | init_module+0x9/0x47 [test_kasan] | ||
61 | do_one_initcall+0x99/0x200 | ||
62 | load_module+0x2cb3/0x3b20 | ||
63 | SyS_finit_module+0x76/0x80 | ||
64 | system_call_fastpath+0x12/0x17 | ||
65 | INFO: Slab 0xffffea0001a4ef00 objects=17 used=7 fp=0xffff8800693bd728 flags=0x100000000004080 | ||
66 | INFO: Object 0xffff8800693bc558 @offset=1368 fp=0xffff8800693bc720 | ||
67 | |||
68 | Bytes b4 ffff8800693bc548: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ | ||
69 | Object ffff8800693bc558: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk | ||
70 | Object ffff8800693bc568: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk | ||
71 | Object ffff8800693bc578: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk | ||
72 | Object ffff8800693bc588: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk | ||
73 | Object ffff8800693bc598: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk | ||
74 | Object ffff8800693bc5a8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk | ||
75 | Object ffff8800693bc5b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk | ||
76 | Object ffff8800693bc5c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk. | ||
77 | Redzone ffff8800693bc5d8: cc cc cc cc cc cc cc cc ........ | ||
78 | Padding ffff8800693bc718: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ | ||
79 | CPU: 0 PID: 1689 Comm: modprobe Tainted: G B 3.18.0-rc1-mm1+ #98 | ||
80 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 | ||
81 | ffff8800693bc000 0000000000000000 ffff8800693bc558 ffff88006923bb78 | ||
82 | ffffffff81cc68ae 00000000000000f3 ffff88006d407600 ffff88006923bba8 | ||
83 | ffffffff811fd848 ffff88006d407600 ffffea0001a4ef00 ffff8800693bc558 | ||
84 | Call Trace: | ||
85 | [<ffffffff81cc68ae>] dump_stack+0x46/0x58 | ||
86 | [<ffffffff811fd848>] print_trailer+0xf8/0x160 | ||
87 | [<ffffffffa00026a7>] ? kmem_cache_oob+0xc3/0xc3 [test_kasan] | ||
88 | [<ffffffff811ff0f5>] object_err+0x35/0x40 | ||
89 | [<ffffffffa0002065>] ? kmalloc_oob_right+0x65/0x75 [test_kasan] | ||
90 | [<ffffffff8120b9fa>] kasan_report_error+0x38a/0x3f0 | ||
91 | [<ffffffff8120a79f>] ? kasan_poison_shadow+0x2f/0x40 | ||
92 | [<ffffffff8120b344>] ? kasan_unpoison_shadow+0x14/0x40 | ||
93 | [<ffffffff8120a79f>] ? kasan_poison_shadow+0x2f/0x40 | ||
94 | [<ffffffffa00026a7>] ? kmem_cache_oob+0xc3/0xc3 [test_kasan] | ||
95 | [<ffffffff8120a995>] __asan_store1+0x75/0xb0 | ||
96 | [<ffffffffa0002601>] ? kmem_cache_oob+0x1d/0xc3 [test_kasan] | ||
97 | [<ffffffffa0002065>] ? kmalloc_oob_right+0x65/0x75 [test_kasan] | ||
98 | [<ffffffffa0002065>] kmalloc_oob_right+0x65/0x75 [test_kasan] | ||
99 | [<ffffffffa00026b0>] init_module+0x9/0x47 [test_kasan] | ||
100 | [<ffffffff810002d9>] do_one_initcall+0x99/0x200 | ||
101 | [<ffffffff811e4e5c>] ? __vunmap+0xec/0x160 | ||
102 | [<ffffffff81114f63>] load_module+0x2cb3/0x3b20 | ||
103 | [<ffffffff8110fd70>] ? m_show+0x240/0x240 | ||
104 | [<ffffffff81115f06>] SyS_finit_module+0x76/0x80 | ||
105 | [<ffffffff81cd3129>] system_call_fastpath+0x12/0x17 | ||
106 | Memory state around the buggy address: | ||
107 | ffff8800693bc300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc | ||
108 | ffff8800693bc380: fc fc 00 00 00 00 00 00 00 00 00 00 00 00 00 fc | ||
109 | ffff8800693bc400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc | ||
110 | ffff8800693bc480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc | ||
111 | ffff8800693bc500: fc fc fc fc fc fc fc fc fc fc fc 00 00 00 00 00 | ||
112 | >ffff8800693bc580: 00 00 00 00 00 00 00 00 00 00 03 fc fc fc fc fc | ||
113 | ^ | ||
114 | ffff8800693bc600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc | ||
115 | ffff8800693bc680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc | ||
116 | ffff8800693bc700: fc fc fc fc fb fb fb fb fb fb fb fb fb fb fb fb | ||
117 | ffff8800693bc780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb | ||
118 | ffff8800693bc800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb | ||
119 | ================================================================== | ||
120 | |||
121 | The header of the report describes what kind of bug happened and what kind of | ||
122 | access caused it. It is followed by a description of the accessed SLUB object | ||
123 | (see 'SLUB Debug output' section in Documentation/vm/slub.txt for details) and | ||
124 | a description of the accessed memory page. | ||
125 | |||
126 | In the last section the report shows memory state around the accessed address. | ||
127 | Reading this part requires some understanding of how KASAN works. | ||
128 | |||
129 | The state of each aligned 8-byte block of memory is encoded in one shadow byte. | ||
130 | Those 8 bytes can be accessible, partially accessible, freed, or part of a redzone. | ||
131 | We use the following encoding for each shadow byte: 0 means that all 8 bytes | ||
132 | of the corresponding memory region are accessible; number N (1 <= N <= 7) means | ||
133 | that the first N bytes are accessible, and other (8 - N) bytes are not; | ||
134 | any negative value indicates that the entire 8-byte word is inaccessible. | ||
135 | We use different negative values to distinguish between different kinds of | ||
136 | inaccessible memory like redzones or freed memory (see mm/kasan/kasan.h). | ||
137 | |||
138 | In the report above, the arrow points to the shadow byte 03, which means that | ||
139 | the accessed address is partially accessible. | ||
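
The shadow-byte encoding can be made concrete with a small sketch. The Python below is illustrative only and is not kernel code: the function names are hypothetical, and it models a single-byte access using just the rules stated above (0, 1 to 7, or negative when read as a signed 8-bit value). Values from 8 to 0x7f are not described in the text, so the sketch rejects them.

```python
def accessible_bytes(shadow):
    """Number of leading accessible bytes in the 8-byte block for a shadow byte."""
    if not 0 <= shadow <= 0xFF:
        raise ValueError("a shadow byte must fit in 8 bits")
    if shadow == 0:
        return 8          # all 8 bytes are accessible
    if shadow <= 7:
        return shadow     # only the first N bytes are accessible
    if shadow >= 0x80:
        return 0          # negative as a signed byte: nothing is accessible
    raise ValueError("encoding not described in the text above")


def byte_is_accessible(addr, shadow):
    """Check a 1-byte access at addr against the shadow byte of its block."""
    return (addr & 7) < accessible_bytes(shadow)


# The report above shows a 1-byte write at ffff8800693bc5d3 with shadow 03.
print(byte_is_accessible(0xffff8800693bc5d3, 0x03))
print(byte_is_accessible(0xffff8800693bc5d2, 0x03))
```

The faulting address has offset 3 within its 8-byte block, so with shadow byte 03 it is the first byte past the accessible part, while the byte just before it is still valid.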
140 | |||
141 | |||
142 | Implementation details | ||
143 | ---------------------- | ||
144 | |||
145 | From a high level, our approach to memory error detection is similar to that | ||
146 | of kmemcheck: use shadow memory to record whether each byte of memory is safe | ||
147 | to access, and use compile-time instrumentation to check shadow memory on each | ||
148 | memory access. | ||
149 | |||
150 | AddressSanitizer dedicates 1/8 of kernel memory to its shadow memory | ||
151 | (e.g. 16TB to cover 128TB on x86_64) and uses direct mapping with a scale and | ||
152 | offset to translate a memory address to its corresponding shadow address. | ||
153 | |||
154 | Here is the function which translates an address to its corresponding shadow | ||
155 | address:: | ||
156 | |||
157 | static inline void *kasan_mem_to_shadow(const void *addr) | ||
158 | { | ||
159 | return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT) | ||
160 | + KASAN_SHADOW_OFFSET; | ||
161 | } | ||
162 | |||
163 | where ``KASAN_SHADOW_SCALE_SHIFT = 3``. | ||
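
As a rough illustration of this mapping, the arithmetic can be replayed in Python. This is a sketch under stated assumptions: ``EXAMPLE_OFFSET`` is a made-up placeholder, not a real ``KASAN_SHADOW_OFFSET`` (the real value depends on architecture and configuration), and a 64-bit ``unsigned long`` is assumed.

```python
KASAN_SHADOW_SCALE_SHIFT = 3            # value given in the text
UNSIGNED_LONG_MASK = (1 << 64) - 1      # assumption: 64-bit unsigned long
EXAMPLE_OFFSET = 0x1000                 # hypothetical placeholder offset


def kasan_mem_to_shadow(addr, shadow_offset):
    """Scale the address down by 8 and add the shadow offset."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + shadow_offset) & UNSIGNED_LONG_MASK


# Eight consecutive bytes of an aligned block share a single shadow byte.
base = 0xffff8800693bc5d0
shadows = {kasan_mem_to_shadow(base + i, EXAMPLE_OFFSET) for i in range(8)}
print(len(shadows))

# The 1/8 ratio: covering 128TB of memory takes 16TB of shadow.
TB = 1 << 40
print(((128 * TB) >> KASAN_SHADOW_SCALE_SHIFT) // TB)
```

The shift by 3 is what produces the 1/8 memory overhead mentioned above: one shadow byte describes one aligned 8-byte block.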
164 | |||
165 | Compile-time instrumentation is used for checking memory accesses. The compiler | ||
166 | inserts function calls (__asan_load*(addr), __asan_store*(addr)) before each | ||
167 | memory access of size 1, 2, 4, 8 or 16. These functions check whether a memory | ||
168 | access is valid by checking the corresponding shadow memory. | ||
169 | |||
170 | GCC 5.0 can perform inline instrumentation. Instead of making function calls, | ||
171 | GCC directly inserts the code to check the shadow memory. This option | ||
172 | significantly enlarges the kernel, but it gives a 1.1x-2x performance boost | ||
173 | over an outline-instrumented kernel. | ||
diff --git a/Documentation/dev-tools/kcov.rst b/Documentation/dev-tools/kcov.rst new file mode 100644 index 000000000000..aca0e27ca197 --- /dev/null +++ b/Documentation/dev-tools/kcov.rst | |||
@@ -0,0 +1,111 @@ | |||
1 | kcov: code coverage for fuzzing | ||
2 | =============================== | ||
3 | |||
4 | kcov exposes kernel code coverage information in a form suitable for coverage- | ||
5 | guided fuzzing (randomized testing). Coverage data of a running kernel is | ||
6 | exported via the "kcov" debugfs file. Coverage collection is enabled on a task | ||
7 | basis, and thus it can capture precise coverage of a single system call. | ||
8 | |||
9 | Note that kcov does not aim to collect as much coverage as possible. It aims | ||
10 | to collect more or less stable coverage that is a function of syscall inputs. | ||
11 | To achieve this goal, it does not collect coverage in soft/hard interrupts, | ||
12 | and instrumentation of some inherently non-deterministic parts of the kernel is | ||
13 | disabled (e.g. scheduler, locking). | ||
14 | |||
15 | Usage | ||
16 | ----- | ||
17 | |||
18 | Configure the kernel with:: | ||
19 | |||
20 | CONFIG_KCOV=y | ||
21 | |||
22 | CONFIG_KCOV requires gcc built on revision 231296 or later. | ||
23 | Profiling data will only become accessible once debugfs has been mounted:: | ||
24 | |||
25 | mount -t debugfs none /sys/kernel/debug | ||
26 | |||
27 | The following program demonstrates kcov usage from within a test program:: | ||
28 | |||
29 | #include <stdio.h> | ||
30 | #include <stddef.h> | ||
31 | #include <stdint.h> | ||
32 | #include <stdlib.h> | ||
33 | #include <sys/types.h> | ||
34 | #include <sys/stat.h> | ||
35 | #include <sys/ioctl.h> | ||
36 | #include <sys/mman.h> | ||
37 | #include <unistd.h> | ||
38 | #include <fcntl.h> | ||
39 | |||
40 | #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) | ||
41 | #define KCOV_ENABLE _IO('c', 100) | ||
42 | #define KCOV_DISABLE _IO('c', 101) | ||
43 | #define COVER_SIZE (64<<10) | ||
44 | |||
45 | int main(int argc, char **argv) | ||
46 | { | ||
47 | int fd; | ||
48 | unsigned long *cover, n, i; | ||
49 | |||
50 | /* A single file descriptor allows coverage collection on a single | ||
51 | * thread. | ||
52 | */ | ||
53 | fd = open("/sys/kernel/debug/kcov", O_RDWR); | ||
54 | if (fd == -1) | ||
55 | perror("open"), exit(1); | ||
56 | /* Setup trace mode and trace size. */ | ||
57 | if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE)) | ||
58 | perror("ioctl"), exit(1); | ||
59 | /* Mmap buffer shared between kernel- and user-space. */ | ||
60 | cover = (unsigned long*)mmap(NULL, COVER_SIZE * sizeof(unsigned long), | ||
61 | PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); | ||
62 | if ((void*)cover == MAP_FAILED) | ||
63 | perror("mmap"), exit(1); | ||
64 | /* Enable coverage collection on the current thread. */ | ||
65 | if (ioctl(fd, KCOV_ENABLE, 0)) | ||
66 | perror("ioctl"), exit(1); | ||
67 | /* Reset coverage from the tail of the ioctl() call. */ | ||
68 | __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED); | ||
69 | /* This is the target syscall. */ | ||
70 | read(-1, NULL, 0); | ||
71 | /* Read number of PCs collected. */ | ||
72 | n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED); | ||
73 | for (i = 0; i < n; i++) | ||
74 | printf("0x%lx\n", cover[i + 1]); | ||
75 | /* Disable coverage collection for the current thread. After this call | ||
76 | * coverage can be enabled for a different thread. | ||
77 | */ | ||
78 | if (ioctl(fd, KCOV_DISABLE, 0)) | ||
79 | perror("ioctl"), exit(1); | ||
80 | /* Free resources. */ | ||
81 | if (munmap(cover, COVER_SIZE * sizeof(unsigned long))) | ||
82 | perror("munmap"), exit(1); | ||
83 | if (close(fd)) | ||
84 | perror("close"), exit(1); | ||
85 | return 0; | ||
86 | } | ||
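
The C program above treats the shared buffer as an array of ``unsigned long``: word 0 holds the number of collected PCs, and the PCs follow from word 1 onwards. The Python sketch below decodes a buffer with that layout. It is only an illustration: ``decode_cover`` is a hypothetical helper, and the sample buffer is synthetic data built with ``struct``, not real kcov output.

```python
import struct

WORD = struct.calcsize("L")  # size of a native unsigned long


def decode_cover(buf):
    """Return the PCs stored in a kcov-style buffer (count word, then PCs)."""
    (n,) = struct.unpack_from("L", buf, 0)
    return list(struct.unpack_from(f"{n}L", buf, WORD))


# Synthetic buffer: a count of 2, two made-up PCs, and one unused slot.
sample = struct.pack("4L", 2, 0x1000, 0x2000, 0)
for pc in decode_cover(sample):
    print(f"0x{pc:x}")
```

Only the first ``n`` entries after the count are meaningful; the rest of the buffer is unused space.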
87 | |||
88 | After piping through addr2line, the output of the program looks as follows:: | ||
89 | |||
90 | SyS_read | ||
91 | fs/read_write.c:562 | ||
92 | __fdget_pos | ||
93 | fs/file.c:774 | ||
94 | __fget_light | ||
95 | fs/file.c:746 | ||
96 | __fget_light | ||
97 | fs/file.c:750 | ||
98 | __fget_light | ||
99 | fs/file.c:760 | ||
100 | __fdget_pos | ||
101 | fs/file.c:784 | ||
102 | SyS_read | ||
103 | fs/read_write.c:562 | ||
104 | |||
105 | If a program needs to collect coverage from several threads (independently), | ||
106 | it needs to open /sys/kernel/debug/kcov in each thread separately. | ||
107 | |||
108 | The interface is fine-grained to allow efficient forking of test processes. | ||
109 | That is, a parent process opens /sys/kernel/debug/kcov, enables trace mode, | ||
110 | mmaps the coverage buffer, and then forks child processes in a loop. Child processes | ||
111 | only need to enable coverage (it is disabled automatically when the thread ends). | ||
diff --git a/Documentation/dev-tools/kmemcheck.rst b/Documentation/dev-tools/kmemcheck.rst new file mode 100644 index 000000000000..7f3d1985de74 --- /dev/null +++ b/Documentation/dev-tools/kmemcheck.rst | |||
@@ -0,0 +1,733 @@ | |||
1 | Getting started with kmemcheck | ||
2 | ============================== | ||
3 | |||
4 | Vegard Nossum <vegardno@ifi.uio.no> | ||
5 | |||
6 | |||
7 | Introduction | ||
8 | ------------ | ||
9 | |||
10 | kmemcheck is a debugging feature for the Linux Kernel. More specifically, it | ||
11 | is a dynamic checker that detects and warns about some uses of uninitialized | ||
12 | memory. | ||
13 | |||
14 | Userspace programmers might be familiar with Valgrind's memcheck. The main | ||
15 | difference between memcheck and kmemcheck is that memcheck works for userspace | ||
16 | programs only, and kmemcheck works for the kernel only. The implementations | ||
17 | are of course vastly different. Because of this, kmemcheck is not as accurate | ||
18 | as memcheck, but it turns out to be good enough in practice to discover real | ||
19 | programmer errors that the compiler is not able to find through static | ||
20 | analysis. | ||
21 | |||
22 | Enabling kmemcheck on a kernel will probably slow it down to the extent that | ||
23 | the machine will not be usable for normal workloads such as an | ||
24 | interactive desktop. kmemcheck will also cause the kernel to use about twice | ||
25 | as much memory as normal. For this reason, kmemcheck is strictly a debugging | ||
26 | feature. | ||
27 | |||
28 | |||
29 | Downloading | ||
30 | ----------- | ||
31 | |||
32 | As of version 2.6.31-rc1, kmemcheck is included in the mainline kernel. | ||
33 | |||
34 | |||
35 | Configuring and compiling | ||
36 | ------------------------- | ||
37 | |||
38 | kmemcheck only works for the x86 (both 32- and 64-bit) platform. A number of | ||
39 | configuration variables must have specific settings in order for the kmemcheck | ||
40 | menu to even appear in "menuconfig". These are: | ||
41 | |||
42 | - ``CONFIG_CC_OPTIMIZE_FOR_SIZE=n`` | ||
43 | This option is located under "General setup" / "Optimize for size". | ||
44 | |||
45 | Without this, gcc will use certain optimizations that usually lead to | ||
46 | false positive warnings from kmemcheck. An example of this is a 16-bit | ||
47 | field in a struct, where gcc may load 32 bits, then discard the upper | ||
48 | 16 bits. kmemcheck sees only the 32-bit load, and may trigger a | ||
49 | warning for the upper 16 bits (if they're uninitialized). | ||
50 | |||
51 | - ``CONFIG_SLAB=y`` or ``CONFIG_SLUB=y`` | ||
52 | This option is located under "General setup" / "Choose SLAB | ||
53 | allocator". | ||
54 | |||
55 | - ``CONFIG_FUNCTION_TRACER=n`` | ||
56 | This option is located under "Kernel hacking" / "Tracers" / "Kernel | ||
57 | Function Tracer" | ||
58 | |||
59 | When function tracing is compiled in, gcc emits a call to another | ||
60 | function at the beginning of every function. This means that when the | ||
61 | page fault handler is called, the ftrace framework will be called | ||
62 | before kmemcheck has had a chance to handle the fault. If ftrace then | ||
63 | modifies memory that was tracked by kmemcheck, the result is an | ||
64 | endless recursive page fault. | ||
65 | |||
66 | - ``CONFIG_DEBUG_PAGEALLOC=n`` | ||
67 | This option is located under "Kernel hacking" / "Memory Debugging" | ||
68 | / "Debug page memory allocations". | ||
69 | |||
70 | In addition, I highly recommend turning on ``CONFIG_DEBUG_INFO=y``. This is also | ||
71 | located under "Kernel hacking". With this, you will be able to get line number | ||
72 | information from the kmemcheck warnings, which is extremely valuable in | ||
73 | debugging a problem. This option is not mandatory, however, because it slows | ||
74 | down the compilation process and produces a much bigger kernel image. | ||
75 | |||
76 | Now the kmemcheck menu should be visible (under "Kernel hacking" / "Memory | ||
77 | Debugging" / "kmemcheck: trap use of uninitialized memory"). Here follows | ||
78 | a description of the kmemcheck configuration variables: | ||
79 | |||
80 | - ``CONFIG_KMEMCHECK`` | ||
81 | This must be enabled in order to use kmemcheck at all... | ||
82 | |||
83 | - ``CONFIG_KMEMCHECK_``[``DISABLED`` | ``ENABLED`` | ``ONESHOT``]``_BY_DEFAULT`` | ||
84 | This option controls the status of kmemcheck at boot-time. "Enabled" | ||
85 | will enable kmemcheck right from the start, "disabled" will boot the | ||
86 | kernel as normal (but with the kmemcheck code compiled in, so it can | ||
87 | be enabled at run-time after the kernel has booted), and "one-shot" is | ||
88 | a special mode which will turn kmemcheck off automatically after | ||
89 | detecting the first use of uninitialized memory. | ||
90 | |||
91 | If you are using kmemcheck to actively debug a problem, then you | ||
92 | probably want to choose "enabled" here. | ||
93 | |||
94 | The one-shot mode is mostly useful in automated test setups because it | ||
95 | can prevent floods of warnings and increase the chances of the machine | ||
96 | surviving in case something is really wrong. In other cases, the one- | ||
97 | shot mode could actually be counter-productive because it would turn | ||
98 | itself off at the very first error -- in the case of a false positive | ||
99 | too -- and this would get in the way of debugging the specific | ||
100 | problem you were interested in. | ||
101 | |||
102 | If you would like to use your kernel as normal, but with a chance to | ||
103 | enable kmemcheck in case of some problem, it might be a good idea to | ||
104 | choose "disabled" here. When kmemcheck is disabled, most of the run- | ||
105 | time overhead is not incurred, and the kernel will be almost as fast | ||
106 | as normal. | ||
107 | |||
108 | - ``CONFIG_KMEMCHECK_QUEUE_SIZE`` | ||
109 | Select the maximum number of error reports to store in an internal | ||
110 | (fixed-size) buffer. Since errors can occur virtually anywhere and in | ||
111 | any context, we need a temporary storage area which is guaranteed not | ||
112 | to generate any other page faults when accessed. The queue will be | ||
113 | emptied as soon as a tasklet may be scheduled. If the queue is full, | ||
114 | new error reports will be lost. | ||
115 | |||
116 | The default value of 64 is probably fine. If some code produces more | ||
117 | than 64 errors within an irqs-off section, then the code is likely to | ||
118 | produce many, many more, too, and these additional reports seldom give | ||
119 | any more information (the first report is usually the most valuable | ||
120 | anyway). | ||
121 | |||
122 | This number might have to be adjusted if you are not using serial | ||
123 | console or similar to capture the kernel log. If you are using the | ||
124 | "dmesg" command to save the log, then getting a lot of kmemcheck | ||
125 | warnings might overflow the kernel log itself, and the earlier reports | ||
126 | will get lost in that way instead. Try setting this to 10 or so on | ||
127 | such a setup. | ||
128 | |||
129 | - ``CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT`` | ||
130 | Select the number of shadow bytes to save along with each entry of the | ||
131 | error-report queue. These bytes indicate what parts of an allocation | ||
132 | are initialized, uninitialized, etc. and will be displayed when an | ||
133 | error is detected to help the debugging of a particular problem. | ||
134 | |||
135 | The number entered here is actually the logarithm of the number of | ||
136 | bytes that will be saved. So if you pick for example 5 here, kmemcheck | ||
137 | will save 2^5 = 32 bytes. | ||
138 | |||
139 | The default value should be fine for debugging most problems. It also | ||
140 | fits nicely within 80 columns. | ||
141 | |||
142 | - ``CONFIG_KMEMCHECK_PARTIAL_OK`` | ||
143 | This option (when enabled) works around certain GCC optimizations that | ||
144 | produce 32-bit reads from 16-bit variables where the upper 16 bits are | ||
145 | thrown away afterwards. | ||
146 | |||
147 | The default value (enabled) is recommended. This may of course hide | ||
148 | some real errors, but disabling it would probably produce a lot of | ||
149 | false positives. | ||
150 | |||
151 | - ``CONFIG_KMEMCHECK_BITOPS_OK`` | ||
152 | This option silences warnings that would be generated for bit-field | ||
153 | accesses where not all the bits are initialized at the same time. This | ||
154 | may also hide some real bugs. | ||
155 | |||
156 | This option is probably obsolete, or it should be replaced with | ||
157 | the kmemcheck-/bitfield-annotations for the code in question. The | ||
158 | default value is therefore fine. | ||
159 | |||
160 | Now compile the kernel as usual. | ||
161 | |||
162 | |||
163 | How to use | ||
164 | ---------- | ||
165 | |||
166 | Booting | ||
167 | ~~~~~~~ | ||
168 | |||
169 | First some information about the command-line options. There is only one | ||
170 | option specific to kmemcheck, and this is called "kmemcheck". It can be used | ||
171 | to override the default mode as chosen by the ``CONFIG_KMEMCHECK_*_BY_DEFAULT`` | ||
172 | option. Its possible settings are: | ||
173 | |||
174 | - ``kmemcheck=0`` (disabled) | ||
175 | - ``kmemcheck=1`` (enabled) | ||
176 | - ``kmemcheck=2`` (one-shot mode) | ||
177 | |||
178 | If SLUB debugging has been enabled in the kernel, it may take precedence over | ||
179 | kmemcheck in such a way that the slab caches which are under SLUB debugging | ||
180 | will not be tracked by kmemcheck. In order to ensure that this doesn't happen | ||
181 | (even though it shouldn't by default), use SLUB's boot option ``slub_debug``, | ||
182 | like this: ``slub_debug=-`` | ||
183 | |||
184 | In fact, this option may also be used for fine-grained control over SLUB vs. | ||
185 | kmemcheck. For example, if the command line includes | ||
186 | ``kmemcheck=1 slub_debug=,dentry``, then SLUB debugging will be used only | ||
187 | for the "dentry" slab cache, with kmemcheck tracking all the other | ||
188 | caches. This is advanced usage, however, and is not generally recommended. | ||
189 | |||
190 | |||
191 | Run-time enable/disable | ||
192 | ~~~~~~~~~~~~~~~~~~~~~~~ | ||
193 | |||
194 | When the kernel has booted, it is possible to enable or disable kmemcheck at | ||
195 | run-time. WARNING: This feature is still experimental and may cause false | ||
196 | positive warnings to appear. Therefore, try not to use this. If you find that | ||
197 | it doesn't work properly (e.g. you see an unreasonable number of warnings), I | ||
198 | will be happy to take bug reports. | ||
199 | |||
200 | Use the file ``/proc/sys/kernel/kmemcheck`` for this purpose, e.g.:: | ||
201 | |||
202 | $ echo 0 > /proc/sys/kernel/kmemcheck # disables kmemcheck | ||
203 | |||
204 | The numbers are the same as for the ``kmemcheck=`` command-line option. | ||
205 | |||
206 | |||
207 | Debugging | ||
208 | ~~~~~~~~~ | ||
209 | |||
210 | A typical report will look something like this:: | ||
211 | |||
212 | WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024) | ||
213 | 80000000000000000000000000000000000000000088ffff0000000000000000 | ||
214 | i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u | ||
215 | ^ | ||
216 | |||
217 | Pid: 1856, comm: ntpdate Not tainted 2.6.29-rc5 #264 945P-A | ||
218 | RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190 | ||
219 | RSP: 0018:ffff88003cdf7d98 EFLAGS: 00210002 | ||
220 | RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009 | ||
221 | RDX: ffff88003e5d6018 RSI: ffff88003e5d6024 RDI: ffff88003cdf7e84 | ||
222 | RBP: ffff88003cdf7db8 R08: ffff88003e5d6000 R09: 0000000000000000 | ||
223 | R10: 0000000000000080 R11: 0000000000000000 R12: 000000000000000e | ||
224 | R13: ffff88003cdf7e78 R14: ffff88003d530710 R15: ffff88003d5a98c8 | ||
225 | FS: 0000000000000000(0000) GS:ffff880001982000(0063) knlGS:00000 | ||
226 | CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 | ||
227 | CR2: ffff88003f806ea0 CR3: 000000003c036000 CR4: 00000000000006a0 | ||
228 | DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 | ||
229 | DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400 | ||
230 | [<ffffffff8104f04e>] dequeue_signal+0x8e/0x170 | ||
231 | [<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390 | ||
232 | [<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0 | ||
233 | [<ffffffff8100c7b5>] int_signal+0x12/0x17 | ||
234 | [<ffffffffffffffff>] 0xffffffffffffffff | ||
235 | |||
236 | The single most valuable piece of information in this report is the RIP (or | ||
237 | EIP on 32-bit) value. This will help us pinpoint exactly which instruction | ||
238 | caused the warning. | ||
239 | |||
240 | If your kernel was compiled with ``CONFIG_DEBUG_INFO=y``, then all we have to do | ||
241 | is give this address to the addr2line program, like this:: | ||
242 | |||
243 | $ addr2line -e vmlinux -i ffffffff8104ede8 | ||
244 | arch/x86/include/asm/string_64.h:12 | ||
245 | include/asm-generic/siginfo.h:287 | ||
246 | kernel/signal.c:380 | ||
247 | kernel/signal.c:410 | ||
248 | |||
249 | The "``-e vmlinux``" tells addr2line which file to look in. **IMPORTANT:** | ||
250 | This must be the vmlinux of the kernel that produced the warning in the | ||
251 | first place! If not, the line number information will almost certainly be | ||
252 | wrong. | ||
253 | |||
254 | The "``-i``" tells addr2line to also print the line numbers of inlined | ||
255 | functions. In this case, the flag was very important, because otherwise, | ||
256 | it would only have printed the first line, which is just a call to | ||
257 | ``memcpy()``, which could be called from a thousand places in the kernel, and | ||
258 | is therefore not very useful. These inlined functions would not show up in | ||
259 | the stack trace above, simply because the kernel doesn't load the extra | ||
260 | debugging information. This technique can of course be used with ordinary | ||
261 | kernel oopses as well. | ||
262 | |||
263 | In this case, it's the caller of ``memcpy()`` that is interesting, and it can be | ||
264 | found in ``include/asm-generic/siginfo.h``, line 287:: | ||
265 | |||
266 | 281 static inline void copy_siginfo(struct siginfo *to, struct siginfo *from) | ||
267 | 282 { | ||
268 | 283 if (from->si_code < 0) | ||
269 | 284 memcpy(to, from, sizeof(*to)); | ||
270 | 285 else | ||
271 | 286 /* _sigchld is currently the largest know union member */ | ||
272 | 287 memcpy(to, from, __ARCH_SI_PREAMBLE_SIZE + sizeof(from->_sifields._sigchld)); | ||
273 | 288 } | ||
274 | |||
275 | Since this was a read (kmemcheck usually warns about reads only, though it can | ||
276 | warn about writes to unallocated or freed memory as well), it was probably the | ||
277 | "from" argument which contained some uninitialized bytes. Following the chain | ||
278 | of calls, we move upwards to see where "from" was allocated or initialized, | ||
279 | ``kernel/signal.c``, line 380:: | ||
280 | |||
281 | 359 static void collect_signal(int sig, struct sigpending *list, siginfo_t *info) | ||
282 | 360 { | ||
283 | ... | ||
284 | 367 list_for_each_entry(q, &list->list, list) { | ||
285 | 368 if (q->info.si_signo == sig) { | ||
286 | 369 if (first) | ||
287 | 370 goto still_pending; | ||
288 | 371 first = q; | ||
289 | ... | ||
290 | 377 if (first) { | ||
291 | 378 still_pending: | ||
292 | 379 list_del_init(&first->list); | ||
293 | 380 copy_siginfo(info, &first->info); | ||
294 | 381 __sigqueue_free(first); | ||
295 | ... | ||
296 | 392 } | ||
297 | 393 } | ||
298 | |||
299 | Here, it is ``&first->info`` that is being passed on to ``copy_siginfo()``. The | ||
300 | variable ``first`` was found on a list -- passed in as the second argument to | ||
301 | ``collect_signal()``. We continue our journey through the stack, to figure out | ||
302 | where the item on "list" was allocated or initialized. We move to line 410:: | ||
303 | |||
304 | 395 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask, | ||
305 | 396 siginfo_t *info) | ||
306 | 397 { | ||
307 | ... | ||
308 | 410 collect_signal(sig, pending, info); | ||
309 | ... | ||
310 | 414 } | ||
311 | |||
312 | Now we need to follow the ``pending`` pointer, since that is being passed on to | ||
313 | ``collect_signal()`` as ``list``. At this point, we've run out of lines from the | ||
314 | "addr2line" output. Not to worry, we just paste the next addresses from the | ||
315 | kmemcheck stack dump, i.e.:: | ||
316 | |||
317 | [<ffffffff8104f04e>] dequeue_signal+0x8e/0x170 | ||
318 | [<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390 | ||
319 | [<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0 | ||
320 | [<ffffffff8100c7b5>] int_signal+0x12/0x17 | ||
321 | |||
322 | $ addr2line -e vmlinux -i ffffffff8104f04e ffffffff81050bd8 \ | ||
323 | ffffffff8100b87d ffffffff8100c7b5 | ||
324 | kernel/signal.c:446 | ||
325 | kernel/signal.c:1806 | ||
326 | arch/x86/kernel/signal.c:805 | ||
327 | arch/x86/kernel/signal.c:871 | ||
328 | arch/x86/kernel/entry_64.S:694 | ||
329 | |||
330 | Remember that since these addresses were found on the stack and not as the | ||
331 | RIP value, they actually point to the _next_ instruction (they are return | ||
332 | addresses). This becomes obvious when we look at the code for line 446:: | ||
333 | |||
334 | 422 int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info) | ||
335 | 423 { | ||
336 | ... | ||
337 | 431 signr = __dequeue_signal(&tsk->signal->shared_pending, | ||
338 | 432 mask, info); | ||
339 | 433 /* | ||
340 | 434 * itimer signal ? | ||
341 | 435 * | ||
342 | 436 * itimers are process shared and we restart periodic | ||
343 | 437 * itimers in the signal delivery path to prevent DoS | ||
344 | 438 * attacks in the high resolution timer case. This is | ||
345 | 439 * compliant with the old way of self restarting | ||
346 | 440 * itimers, as the SIGALRM is a legacy signal and only | ||
347 | 441 * queued once. Changing the restart behaviour to | ||
348 | 442 * restart the timer in the signal dequeue path is | ||
349 | 443 * reducing the timer noise on heavy loaded !highres | ||
350 | 444 * systems too. | ||
351 | 445 */ | ||
352 | 446 if (unlikely(signr == SIGALRM)) { | ||
353 | ... | ||
354 | 489 } | ||
355 | |||
356 | So instead of looking at 446, we should be looking at 431, which is the line | ||
357 | that executes just before 446. Here we see that what we are looking for is | ||
358 | ``&tsk->signal->shared_pending``. | ||
359 | |||
360 | Our next task is now to figure out which function puts items on this | ||
361 | ``shared_pending`` list. A crude but efficient tool is ``git grep``:: | ||
362 | |||
363 | $ git grep -n 'shared_pending' kernel/ | ||
364 | ... | ||
365 | kernel/signal.c:828: pending = group ? &t->signal->shared_pending : &t->pending; | ||
366 | kernel/signal.c:1339: pending = group ? &t->signal->shared_pending : &t->pending; | ||
367 | ... | ||
368 | |||
369 | There were more results, but none of them were related to list operations, | ||
370 | and these were the only assignments. We inspect the line numbers more closely | ||
371 | and find that this is indeed where items are being added to the list:: | ||
372 | |||
373 | 816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t, | ||
374 | 817 int group) | ||
375 | 818 { | ||
376 | ... | ||
377 | 828 pending = group ? &t->signal->shared_pending : &t->pending; | ||
378 | ... | ||
379 | 851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN && | ||
380 | 852 (is_si_special(info) || | ||
381 | 853 info->si_code >= 0))); | ||
382 | 854 if (q) { | ||
383 | 855 list_add_tail(&q->list, &pending->list); | ||
384 | ... | ||
385 | 890 } | ||
386 | |||
387 | and:: | ||
388 | |||
389 | 1309 int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group) | ||
390 | 1310 { | ||
391 | .... | ||
392 | 1339 pending = group ? &t->signal->shared_pending : &t->pending; | ||
393 | 1340 list_add_tail(&q->list, &pending->list); | ||
394 | .... | ||
395 | 1347 } | ||
396 | |||
397 | In the first case, the list element we are looking for, ``q``, is being | ||
398 | returned from the function ``__sigqueue_alloc()``, which looks like an | ||
399 | allocation function. Let's take a look at it:: | ||
400 | |||
401 | 187 static struct sigqueue *__sigqueue_alloc(struct task_struct *t, gfp_t flags, | ||
402 | 188 int override_rlimit) | ||
403 | 189 { | ||
404 | 190 struct sigqueue *q = NULL; | ||
405 | 191 struct user_struct *user; | ||
406 | 192 | ||
407 | 193 /* | ||
408 | 194 * We won't get problems with the target's UID changing under us | ||
409 | 195 * because changing it requires RCU be used, and if t != current, the | ||
410 | 196 * caller must be holding the RCU readlock (by way of a spinlock) and | ||
411 | 197 * we use RCU protection here | ||
412 | 198 */ | ||
413 | 199 user = get_uid(__task_cred(t)->user); | ||
414 | 200 atomic_inc(&user->sigpending); | ||
415 | 201 if (override_rlimit || | ||
416 | 202 atomic_read(&user->sigpending) <= | ||
417 | 203 t->signal->rlim[RLIMIT_SIGPENDING].rlim_cur) | ||
418 | 204 q = kmem_cache_alloc(sigqueue_cachep, flags); | ||
419 | 205 if (unlikely(q == NULL)) { | ||
420 | 206 atomic_dec(&user->sigpending); | ||
421 | 207 free_uid(user); | ||
422 | 208 } else { | ||
423 | 209 INIT_LIST_HEAD(&q->list); | ||
424 | 210 q->flags = 0; | ||
425 | 211 q->user = user; | ||
426 | 212 } | ||
427 | 213 | ||
428 | 214 return q; | ||
429 | 215 } | ||
430 | |||
431 | We see that this function initializes ``q->list``, ``q->flags``, and | ||
432 | ``q->user``. It seems that now is the time to look at the definition of | ||
433 | ``struct sigqueue``, e.g.:: | ||
434 | |||
435 | 14 struct sigqueue { | ||
436 | 15 struct list_head list; | ||
437 | 16 int flags; | ||
438 | 17 siginfo_t info; | ||
439 | 18 struct user_struct *user; | ||
440 | 19 }; | ||
441 | |||
442 | And, you might remember, it was a ``memcpy()`` on ``&first->info`` that | ||
443 | caused the warning, so this makes perfect sense. It also seems reasonable | ||
444 | to assume that it is the caller of ``__sigqueue_alloc()`` that has the | ||
445 | responsibility of filling out (initializing) this member. | ||
446 | |||
447 | But just which fields of the struct were uninitialized? Let's look at | ||
448 | kmemcheck's report again:: | ||
449 | |||
450 | WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024) | ||
451 | 80000000000000000000000000000000000000000088ffff0000000000000000 | ||
452 | i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u | ||
453 | ^ | ||
454 | |||
455 | These first two lines are the memory dump of the memory object itself, and | ||
456 | the shadow bytemap, respectively. The memory object itself is in this case | ||
457 | ``&first->info``. Just beware that the start of this dump is NOT the start | ||
458 | of the object itself! The position of the caret (^) corresponds with the | ||
459 | address of the read (ffff88003e4a2024). | ||
460 | |||
461 | The shadow bytemap dump legend is as follows: | ||
462 | |||
463 | - i: initialized | ||
464 | - u: uninitialized | ||
465 | - a: unallocated (memory has been allocated by the slab layer, but has not | ||
466 | yet been handed off to anybody) | ||
467 | - f: freed (memory has been allocated by the slab layer, but has been freed | ||
468 | by the previous owner) | ||
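
The legend above can be turned into a small user-space helper for reading a shadow line by hand. This is only an illustrative sketch: the function names ``shadow_state_name()`` and ``shadow_count_suspect()`` are hypothetical and are not part of kmemcheck.

```c
#include <stddef.h>

/* Translate one character of the shadow dump into its meaning. */
static const char *shadow_state_name(char c)
{
	switch (c) {
	case 'i': return "initialized";
	case 'u': return "uninitialized";
	case 'a': return "unallocated";
	case 'f': return "freed";
	default:  return "unknown";
	}
}

/*
 * Count the bytes in one shadow dump line (for example "i i u u") whose
 * state is anything other than initialized. Separating spaces are skipped.
 */
static size_t shadow_count_suspect(const char *line)
{
	size_t n = 0;

	for (; *line; line++)
		if (*line != ' ' && *line != 'i')
			n++;
	return n;
}
```

Applied to the shadow line of the report above, the second helper counts 20 of the 32 dumped bytes as not initialized.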
469 | |||
470 | In order to figure out where (relative to the start of the object) the | ||
471 | uninitialized memory was located, we have to look at the disassembly. For | ||
472 | that, we'll need the RIP address again:: | ||
473 | |||
474 | RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190 | ||
475 | |||
476 | $ objdump -d --no-show-raw-insn vmlinux | grep -C 8 ffffffff8104ede8: | ||
477 | ffffffff8104edc8: mov %r8,0x8(%r8) | ||
478 | ffffffff8104edcc: test %r10d,%r10d | ||
479 | ffffffff8104edcf: js ffffffff8104ee88 <__dequeue_signal+0x168> | ||
480 | ffffffff8104edd5: mov %rax,%rdx | ||
481 | ffffffff8104edd8: mov $0xc,%ecx | ||
482 | ffffffff8104eddd: mov %r13,%rdi | ||
483 | ffffffff8104ede0: mov $0x30,%eax | ||
484 | ffffffff8104ede5: mov %rdx,%rsi | ||
485 | ffffffff8104ede8: rep movsl %ds:(%rsi),%es:(%rdi) | ||
486 | ffffffff8104edea: test $0x2,%al | ||
487 | ffffffff8104edec: je ffffffff8104edf0 <__dequeue_signal+0xd0> | ||
488 | ffffffff8104edee: movsw %ds:(%rsi),%es:(%rdi) | ||
489 | ffffffff8104edf0: test $0x1,%al | ||
490 | ffffffff8104edf2: je ffffffff8104edf5 <__dequeue_signal+0xd5> | ||
491 | ffffffff8104edf4: movsb %ds:(%rsi),%es:(%rdi) | ||
492 | ffffffff8104edf5: mov %r8,%rdi | ||
493 | ffffffff8104edf8: callq ffffffff8104de60 <__sigqueue_free> | ||
494 | |||
495 | As expected, it's the "``rep movsl``" instruction from the ``memcpy()`` | ||
496 | that causes the warning. We know that ``REP MOVSL`` uses the register | ||
497 | ``RCX`` to count the number of remaining iterations. By taking a look at the | ||
498 | register dump again (from the kmemcheck report), we can figure out how many | ||
499 | bytes were left to copy:: | ||
500 | |||
501 | RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009 | ||
502 | |||
503 | By looking at the disassembly, we also see that ``%ecx`` is being loaded | ||
504 | with the value ``$0xc`` just before (ffffffff8104edd8), so we are very | ||
505 | lucky. Keep in mind that this is the number of iterations, not bytes. And | ||
506 | since this is a "long" operation, we need to multiply by 4 to get the | ||
507 | number of bytes. So this means that the uninitialized value was encountered | ||
508 | at 4 * (0xc - 0x9) = 12 bytes from the start of the object. | ||
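
The arithmetic above can be written down as a tiny helper. This is a sketch for illustration only; ``rep_movs_fault_offset()`` is a hypothetical name, and the helper assumes the initial count is known from the disassembly and the remaining count from the register dump.

```c
#include <stddef.h>

/*
 * Byte offset, relative to the start of the copy, at which a "rep movs"
 * instruction was interrupted.
 *
 * initial_count:   iteration count loaded into RCX/ECX before the copy
 * remaining_count: RCX value taken from the register dump
 * width:           bytes moved per iteration (4 for movsl)
 */
static size_t rep_movs_fault_offset(size_t initial_count,
				    size_t remaining_count,
				    size_t width)
{
	return (initial_count - remaining_count) * width;
}
```

With the values from this report, ``rep_movs_fault_offset(0xc, 0x9, 4)`` evaluates to 12.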
509 | |||
510 | We can now try to figure out which field of the "``struct siginfo``" | ||
511 | was not initialized. This is the beginning of the struct:: | ||
512 | |||
513 | 40 typedef struct siginfo { | ||
514 | 41 int si_signo; | ||
515 | 42 int si_errno; | ||
516 | 43 int si_code; | ||
517 | 44 | ||
518 | 45 union { | ||
519 | .. | ||
520 | 92 } _sifields; | ||
521 | 93 } siginfo_t; | ||
522 | |||
523 | On 64-bit, the int is 4 bytes long, so it must be the union member that has | ||
524 | not been initialized. We can verify this using gdb:: | ||
525 | |||
526 | $ gdb vmlinux | ||
527 | ... | ||
528 | (gdb) p &((struct siginfo *) 0)->_sifields | ||
529 | $1 = (union {...} *) 0x10 | ||
530 | |||
531 | Actually, it seems that the union member is located at offset 0x10 -- which | ||
532 | means that gcc has inserted 4 bytes of padding between the members ``si_code`` | ||
533 | and ``_sifields``. We can now get a fuller picture of the memory dump:: | ||
534 | |||
535 | _----------------------------=> si_code | ||
536 | / _--------------------=> (padding) | ||
537 | | / _------------=> _sifields(._kill._pid) | ||
538 | | | / _----=> _sifields(._kill._uid) | ||
539 | | | | / | ||
540 | -------|-------|-------|-------| | ||
541 | 80000000000000000000000000000000000000000088ffff0000000000000000 | ||
542 | i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u | ||
543 | |||
544 | This allows us to realize another important fact: ``si_code`` contains the | ||
545 | value 0x80. Remember that x86 is little endian, so the first 4 bytes | ||
546 | "80000000" are really the number 0x00000080. With a bit of research, we | ||
547 | find that this is actually the constant ``SI_KERNEL`` defined in | ||
548 | ``include/asm-generic/siginfo.h``:: | ||
549 | |||
550 | 144 #define SI_KERNEL 0x80 /* sent by the kernel from somewhere */ | ||
551 | |||
552 | This macro is used in exactly one place in the x86 kernel: In ``send_signal()`` | ||
553 | in ``kernel/signal.c``:: | ||
554 | |||
555 | 816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t, | ||
556 | 817 int group) | ||
557 | 818 { | ||
558 | ... | ||
559 | 828 pending = group ? &t->signal->shared_pending : &t->pending; | ||
560 | ... | ||
561 | 851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN && | ||
562 | 852 (is_si_special(info) || | ||
563 | 853 info->si_code >= 0))); | ||
564 | 854 if (q) { | ||
565 | 855 list_add_tail(&q->list, &pending->list); | ||
566 | 856 switch ((unsigned long) info) { | ||
567 | ... | ||
568 | 865 case (unsigned long) SEND_SIG_PRIV: | ||
569 | 866 q->info.si_signo = sig; | ||
570 | 867 q->info.si_errno = 0; | ||
571 | 868 q->info.si_code = SI_KERNEL; | ||
572 | 869 q->info.si_pid = 0; | ||
573 | 870 q->info.si_uid = 0; | ||
574 | 871 break; | ||
575 | ... | ||
576 | 890 } | ||
577 | |||
578 | Not only does this match with the ``.si_code`` member, it also matches the place | ||
579 | we found earlier when looking for where siginfo_t objects are enqueued on the | ||
580 | ``shared_pending`` list. | ||
581 | |||
582 | So to sum up: It seems that it is the padding introduced by the compiler | ||
583 | between two struct fields that is uninitialized, and this gets reported when | ||
584 | we do a ``memcpy()`` on the struct. This means that we have identified a false | ||
585 | positive warning. | ||
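
The padding can be reproduced in user space. The sketch below uses a hypothetical ``struct demo_siginfo`` that only mimics the start of the real structure (three ints followed by a union holding a pointer); it is not the kernel definition, and the result depends on the ABI.

```c
#include <stddef.h>

/* Hypothetical stand-in for the start of struct siginfo. */
struct demo_siginfo {
	int si_signo;
	int si_errno;
	int si_code;
	/* The compiler may insert padding here to align the union. */
	union {
		struct { int pid; unsigned int uid; } kill;
		struct { void *addr; } fault;
	} sifields;
};

/* Number of padding bytes between si_code and the union. */
static size_t demo_padding_bytes(void)
{
	size_t end_of_si_code =
		offsetof(struct demo_siginfo, si_code) + sizeof(int);

	return offsetof(struct demo_siginfo, sifields) - end_of_si_code;
}
```

On a typical 64-bit ABI, where pointers are 8 bytes and 8-byte aligned, this reports 4 bytes of padding. Assigning the named fields one by one never writes those bytes, which is why a whole-struct ``memcpy()`` can read uninitialized memory.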
586 | |||
587 | Normally, kmemcheck will not report uninitialized accesses in ``memcpy()`` calls | ||
588 | when both the source and destination addresses are tracked. (Instead, we copy | ||
589 | the shadow bytemap as well). In this case, the destination address clearly | ||
590 | was not tracked. We can dig a little deeper into the stack trace from above:: | ||
591 | |||
592 | arch/x86/kernel/signal.c:805 | ||
593 | arch/x86/kernel/signal.c:871 | ||
594 | arch/x86/kernel/entry_64.S:694 | ||
595 | |||
596 | And we clearly see that the destination siginfo object is located on the | ||
597 | stack:: | ||
598 | |||
599 | 782 static void do_signal(struct pt_regs *regs) | ||
600 | 783 { | ||
601 | 784 struct k_sigaction ka; | ||
602 | 785 siginfo_t info; | ||
603 | ... | ||
604 | 804 signr = get_signal_to_deliver(&info, &ka, regs, NULL); | ||
605 | ... | ||
606 | 854 } | ||
607 | |||
608 | And this ``&info`` is what eventually gets passed to ``copy_siginfo()`` as the | ||
609 | destination argument. | ||
610 | |||
611 | Now, even though we didn't find an actual error here, the example is still a | ||
612 | good one, because it shows how one would go about finding out what the report | ||
613 | was all about. | ||
614 | |||
615 | |||
616 | Annotating false positives | ||
617 | ~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
618 | |||
619 | There are a few different ways to make annotations in the source code that | ||
620 | will keep kmemcheck from checking and reporting certain allocations. Here | ||
621 | they are: | ||
622 | |||
623 | - ``__GFP_NOTRACK_FALSE_POSITIVE`` | ||
624 | This flag can be passed to ``kmalloc()`` or ``kmem_cache_alloc()`` | ||
625 | (therefore also to other functions that end up calling one of | ||
626 | these) to indicate that the allocation should not be tracked | ||
627 | because it would lead to a false positive report. This is a "big | ||
628 | hammer" way of silencing kmemcheck; after all, even if the false | ||
629 | positive pertains to a particular field in a struct, for example, we | ||
630 | will now lose the ability to find (real) errors in other parts of | ||
631 | the same struct. | ||
632 | |||
633 | Example:: | ||
634 | |||
635 | /* No warnings will ever trigger on accessing any part of x */ | ||
636 | x = kmalloc(sizeof *x, GFP_KERNEL | __GFP_NOTRACK_FALSE_POSITIVE); | ||
637 | |||
638 | - ``kmemcheck_bitfield_begin(name)``/``kmemcheck_bitfield_end(name)`` and | ||
639 | ``kmemcheck_annotate_bitfield(ptr, name)`` | ||
640 | The first two of these three macros can be used inside struct | ||
641 | definitions to signal, respectively, the beginning and end of a | ||
642 | bitfield. Additionally, this will assign the bitfield a name, which | ||
643 | is given as an argument to the macros. | ||
644 | |||
645 | Having used these markers, one can later use | ||
646 | ``kmemcheck_annotate_bitfield()`` at the point of allocation, to indicate | ||
647 | which parts of the allocation are part of a bitfield. | ||
648 | |||
649 | Example:: | ||
650 | |||
651 | struct foo { | ||
652 | int x; | ||
653 | |||
654 | kmemcheck_bitfield_begin(flags); | ||
655 | int flag_a:1; | ||
656 | int flag_b:1; | ||
657 | kmemcheck_bitfield_end(flags); | ||
658 | |||
659 | int y; | ||
660 | }; | ||
661 | |||
662 | struct foo *x = kmalloc(sizeof *x, GFP_KERNEL); | ||
663 | |||
664 | /* No warnings will trigger on accessing the bitfield of x */ | ||
665 | kmemcheck_annotate_bitfield(x, flags); | ||
666 | |||
667 | Note that ``kmemcheck_annotate_bitfield()`` can be used even before the | ||
668 | return value of ``kmalloc()`` is checked -- in other words, passing NULL | ||
669 | as the first argument is legal (and will do nothing). | ||
670 | |||
671 | |||
672 | Reporting errors | ||
673 | ---------------- | ||
674 | |||
675 | As we have seen, kmemcheck will produce false positive reports. Therefore, it | ||
676 | is not very wise to blindly post kmemcheck warnings to mailing lists and | ||
677 | maintainers. Instead, I encourage maintainers and developers to find errors | ||
678 | in their own code. If you get a warning, you can try to work around it, try | ||
679 | to figure out if it's a real error or not, or simply ignore it. Most | ||
680 | developers know their own code and will quickly and efficiently determine the | ||
681 | root cause of a kmemcheck report. This is therefore also the most efficient | ||
682 | way to work with kmemcheck. | ||
683 | |||
684 | That said, we (the kmemcheck maintainers) will always be on the lookout for | ||
685 | false positives that we can annotate and silence. So whatever you find, | ||
686 | please drop us a note privately! Kernel configs and steps to reproduce (if | ||
687 | available) are of course a great help too. | ||
688 | |||
689 | Happy hacking! | ||
690 | |||
691 | |||
692 | Technical description | ||
693 | --------------------- | ||
694 | |||
695 | kmemcheck works by marking memory pages non-present. This means that whenever | ||
696 | somebody attempts to access the page, a page fault is generated. The page | ||
697 | fault handler notices that the page was in fact only hidden, and so it calls | ||
698 | on the kmemcheck code to make further investigations. | ||
699 | |||
700 | When the investigations are completed, kmemcheck "shows" the page by marking | ||
701 | it present (as it would be under normal circumstances). This way, the | ||
702 | interrupted code can continue as usual. | ||
703 | |||
704 | But after the instruction has been executed, we should hide the page again, so | ||
705 | that we can catch the next access too! Now kmemcheck makes use of a debugging | ||
706 | feature of the processor, namely single-stepping. When the processor has | ||
707 | finished the one instruction that generated the memory access, a debug | ||
708 | exception is raised. From here, we simply hide the page again and continue | ||
709 | execution, this time with the single-stepping feature turned off. | ||
710 | |||
711 | kmemcheck requires some assistance from the memory allocator in order to work. | ||
712 | The memory allocator needs to | ||
713 | |||
714 | 1. Tell kmemcheck about newly allocated pages and pages that are about to | ||
715 | be freed. This allows kmemcheck to set up and tear down the shadow memory | ||
716 | for the pages in question. The shadow memory stores the status of each | ||
717 | byte in the allocation proper, e.g. whether it is initialized or | ||
718 | uninitialized. | ||
719 | |||
720 | 2. Tell kmemcheck which parts of memory should be marked uninitialized. | ||
721 | There are actually a few more states, such as "not yet allocated" and | ||
722 | "recently freed". | ||
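
The bookkeeping described in the two points above can be illustrated with a toy user-space model. It is only a sketch of the shadow-memory idea: it tracks one state per byte of a small pool and flags reads of bytes that were never written. It does not model the page-fault and single-step machinery, and every name in it (``toy_alloc()``, ``toy_read()`` and so on) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum shadow_state {
	SHADOW_UNALLOCATED,	/* zero value: initial state of the pool */
	SHADOW_UNINITIALIZED,
	SHADOW_INITIALIZED,
	SHADOW_FREED
};

#define TOY_POOL_SIZE 64

static unsigned char toy_pool[TOY_POOL_SIZE];
static enum shadow_state toy_shadow[TOY_POOL_SIZE];	/* one state per byte */

/* Callers must keep off + len within TOY_POOL_SIZE; the toy does not check. */
static void toy_mark(size_t off, size_t len, enum shadow_state state)
{
	size_t i;

	for (i = 0; i < len; i++)
		toy_shadow[off + i] = state;
}

/* Allocator hooks: tell the checker about allocation and freeing. */
static void toy_alloc(size_t off, size_t len)
{
	toy_mark(off, len, SHADOW_UNINITIALIZED);
}

static void toy_free(size_t off, size_t len)
{
	toy_mark(off, len, SHADOW_FREED);
}

/* A write initializes the bytes it touches. */
static void toy_write(size_t off, const void *src, size_t len)
{
	memcpy(&toy_pool[off], src, len);
	toy_mark(off, len, SHADOW_INITIALIZED);
}

/*
 * A read succeeds either way, but returns -1 if any byte it touched was
 * not initialized (the point at which a warning would be printed).
 */
static int toy_read(size_t off, void *dst, size_t len)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < len; i++)
		if (toy_shadow[off + i] != SHADOW_INITIALIZED)
			ret = -1;
	memcpy(dst, &toy_pool[off], len);
	return ret;
}
```

In this model a read immediately after ``toy_alloc()`` is flagged, a read after ``toy_write()`` to the same bytes is not, and a read after ``toy_free()`` is flagged again.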
723 | |||
724 | If a slab cache is set up using the SLAB_NOTRACK flag, it will never return | ||
725 | memory that can take page faults because of kmemcheck. | ||
726 | |||
727 | If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still | ||
728 | request memory with the __GFP_NOTRACK or __GFP_NOTRACK_FALSE_POSITIVE flags. | ||
729 | This does not prevent the page faults from occurring, however, but marks the | ||
730 | object in question as being initialized so that no warnings will ever be | ||
731 | produced for this object. | ||
732 | |||
733 | Currently, the SLAB and SLUB allocators are supported by kmemcheck. | ||
diff --git a/Documentation/dev-tools/kmemleak.rst b/Documentation/dev-tools/kmemleak.rst new file mode 100644 index 000000000000..1788722d5495 --- /dev/null +++ b/Documentation/dev-tools/kmemleak.rst | |||
@@ -0,0 +1,210 @@ | |||
1 | Kernel Memory Leak Detector | ||
2 | =========================== | ||
3 | |||
4 | Kmemleak provides a way of detecting possible kernel memory leaks in a | ||
5 | way similar to a tracing garbage collector | ||
6 | (https://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Tracing_garbage_collectors), | ||
7 | with the difference that the orphan objects are not freed but only | ||
8 | reported via /sys/kernel/debug/kmemleak. A similar method is used by the | ||
9 | Valgrind tool (``memcheck --leak-check``) to detect the memory leaks in | ||
10 | user-space applications. | ||
11 | Kmemleak is supported on x86, arm, powerpc, sparc, sh, microblaze, mips, s390, metag and tile.
12 | |||
13 | Usage | ||
14 | ----- | ||
15 | |||
16 | CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel | ||
17 | thread scans the memory every 10 minutes (by default) and prints the | ||
18 | number of new unreferenced objects found. To display the details of all | ||
19 | the possible memory leaks:: | ||
20 | |||
21 | # mount -t debugfs nodev /sys/kernel/debug/ | ||
22 | # cat /sys/kernel/debug/kmemleak | ||
23 | |||
24 | To trigger an intermediate memory scan:: | ||
25 | |||
26 | # echo scan > /sys/kernel/debug/kmemleak | ||
27 | |||
28 | To clear the list of all current possible memory leaks:: | ||
29 | |||
30 | # echo clear > /sys/kernel/debug/kmemleak | ||
31 | |||
32 | New leaks will then come up upon reading ``/sys/kernel/debug/kmemleak`` | ||
33 | again. | ||
34 | |||
35 | Note that the orphan objects are listed in the order they were allocated | ||
36 | and one object at the beginning of the list may cause other subsequent | ||
37 | objects to be reported as orphan. | ||
38 | |||
39 | Memory scanning parameters can be modified at run-time by writing to the | ||
40 | ``/sys/kernel/debug/kmemleak`` file. The following parameters are supported: | ||
41 | |||
42 | - off | ||
43 | disable kmemleak (irreversible) | ||
44 | - stack=on | ||
45 | enable the task stacks scanning (default) | ||
46 | - stack=off | ||
47 | disable the task stacks scanning | ||
48 | - scan=on | ||
49 | start the automatic memory scanning thread (default) | ||
50 | - scan=off | ||
51 | stop the automatic memory scanning thread | ||
52 | - scan=<secs> | ||
53 | set the automatic memory scanning period in seconds | ||
54 | (default 600, 0 to stop the automatic scanning) | ||
55 | - scan | ||
56 | trigger a memory scan | ||
57 | - clear | ||
58 | clear list of current memory leak suspects, done by | ||
59 | marking all current reported unreferenced objects grey, | ||
60 | or free all kmemleak objects if kmemleak has been disabled. | ||
61 | - dump=<addr> | ||
62 | dump information about the object found at <addr> | ||
63 | |||
64 | Kmemleak can also be disabled at boot-time by passing ``kmemleak=off`` on | ||
65 | the kernel command line. | ||
66 | |||
67 | Memory may be allocated or freed before kmemleak is initialised and | ||
68 | these actions are stored in an early log buffer. The size of this buffer | ||
69 | is configured via the CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option. | ||
70 | |||
71 | If CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is enabled, kmemleak is | ||
72 | disabled by default. Passing ``kmemleak=on`` on the kernel command | ||
73 | line enables the function. | ||
74 | |||
75 | Basic Algorithm | ||
76 | --------------- | ||
77 | |||
78 | The memory allocations via :c:func:`kmalloc`, :c:func:`vmalloc`, | ||
79 | :c:func:`kmem_cache_alloc` and | ||
80 | friends are traced and the pointers, together with additional | ||
81 | information like size and stack trace, are stored in a rbtree. | ||
82 | The corresponding freeing function calls are tracked and the pointers | ||
83 | removed from the kmemleak data structures. | ||
84 | |||
85 | An allocated block of memory is considered orphan if no pointer to its | ||
86 | start address or to any location inside the block can be found by | ||
87 | scanning the memory (including saved registers). This means that there | ||
88 | might be no way for the kernel to pass the address of the allocated | ||
89 | block to a freeing function and therefore the block is considered a | ||
90 | memory leak. | ||
91 | |||
92 | The scanning algorithm steps: | ||
93 | |||
94 | 1. mark all objects as white (remaining white objects will later be | ||
95 | considered orphan) | ||
96 | 2. scan the memory starting with the data section and stacks, checking | ||
97 | the values against the addresses stored in the rbtree. If | ||
98 | a pointer to a white object is found, the object is added to the | ||
99 | gray list | ||
100 | 3. scan the gray objects for matching addresses (some white objects | ||
101 | can become gray and be added at the end of the gray list) until the | ||
102 | gray set is finished | ||
103 | 4. the remaining white objects are considered orphan and reported via | ||
104 | /sys/kernel/debug/kmemleak | ||
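
The four steps above can be sketched as a toy user-space program. This is not kmemleak's implementation: it uses a linear lookup instead of an rbtree, takes the root memory as an explicit array, and all names (``struct toy_object``, ``toy_kmemleak_scan()``) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_color { TOY_WHITE, TOY_GRAY };

#define TOY_MAX_OBJECTS 16

/* One tracked allocation, viewed as an array of pointer-sized words. */
struct toy_object {
	uintptr_t *mem;
	size_t nwords;
	enum toy_color color;
};

/* Find the object whose address range contains value, if any. */
static struct toy_object *toy_lookup(struct toy_object *objs, size_t n,
				     uintptr_t value)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uintptr_t start = (uintptr_t)objs[i].mem;
		uintptr_t end = start + objs[i].nwords * sizeof(uintptr_t);

		if (value >= start && value < end)
			return &objs[i];
	}
	return NULL;
}

/* Scan a memory range; every white object it points to becomes gray. */
static void toy_scan_words(struct toy_object *objs, size_t n,
			   const uintptr_t *words, size_t nwords,
			   struct toy_object **gray, size_t *ngray)
{
	size_t i;

	for (i = 0; i < nwords; i++) {
		struct toy_object *o = toy_lookup(objs, n, words[i]);

		if (o && o->color == TOY_WHITE) {
			o->color = TOY_GRAY;
			gray[(*ngray)++] = o;
		}
	}
}

/* Returns the number of orphans; they are the objects left white. */
static size_t toy_kmemleak_scan(struct toy_object *objs, size_t n,
				const uintptr_t *roots, size_t nroots)
{
	struct toy_object *gray[TOY_MAX_OBJECTS];
	size_t ngray = 0, next = 0, orphans = 0, i;

	if (n > TOY_MAX_OBJECTS)	/* toy limit: ignore the excess */
		n = TOY_MAX_OBJECTS;

	for (i = 0; i < n; i++)		/* step 1: everything is white */
		objs[i].color = TOY_WHITE;

	/* step 2: scan the roots (data section and stacks in the kernel) */
	toy_scan_words(objs, n, roots, nroots, gray, &ngray);

	while (next < ngray) {		/* step 3: scan the gray objects */
		struct toy_object *o = gray[next++];

		toy_scan_words(objs, n, o->mem, o->nwords, gray, &ngray);
	}

	for (i = 0; i < n; i++)		/* step 4: remaining white objects */
		if (objs[i].color == TOY_WHITE)
			orphans++;
	return orphans;
}
```

Because an object turns gray at most once, the gray list never holds more than ``n`` entries. A pointer into the middle of a block counts as a reference, matching the description of orphan blocks given earlier.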
105 | |||
106 | Some allocated memory blocks have pointers stored in the kernel's | ||
107 | internal data structures and they cannot be detected as orphans. To | ||
108 | avoid this, kmemleak can also store the number of values pointing to an | ||
109 | address inside the block address range that need to be found so that the | ||
110 | block is not considered a leak. One example is __vmalloc(). | ||
111 | |||
112 | Testing specific sections with kmemleak | ||
113 | --------------------------------------- | ||
114 | |||
115 | Upon initial bootup your /sys/kernel/debug/kmemleak output page may be | ||
116 | quite extensive. This can also be the case if you have very buggy code | ||
117 | when doing development. To work around these situations you can use the | ||
118 | 'clear' command to clear all reported unreferenced objects from the | ||
119 | /sys/kernel/debug/kmemleak output. By issuing a 'scan' after a 'clear' | ||
120 | you can find new unreferenced objects; this should help with testing | ||
121 | specific sections of code. | ||
122 | |||
123 | To test a critical section on demand with a clean kmemleak do:: | ||
124 | |||
125 | # echo clear > /sys/kernel/debug/kmemleak | ||
126 | ... test your kernel or modules ... | ||
127 | # echo scan > /sys/kernel/debug/kmemleak | ||
128 | |||
129 | Then, as usual, get your report with:: | ||
130 | |||
131 | # cat /sys/kernel/debug/kmemleak | ||
132 | |||
133 | Freeing kmemleak internal objects | ||
134 | --------------------------------- | ||
135 | |||
136 | To allow access to previously found memory leaks after kmemleak has been | ||
137 | disabled by the user or due to a fatal error, internal kmemleak objects | ||
138 | won't be freed when kmemleak is disabled, and those objects may occupy | ||
139 | a large part of physical memory. | ||
140 | |||
141 | In this situation, you may reclaim memory with:: | ||
142 | |||
143 | # echo clear > /sys/kernel/debug/kmemleak | ||
144 | |||
145 | Kmemleak API | ||
146 | ------------ | ||
147 | |||
148 | See the include/linux/kmemleak.h header for the function prototypes. | ||
149 | |||
150 | - ``kmemleak_init`` - initialize kmemleak | ||
151 | - ``kmemleak_alloc`` - notify of a memory block allocation | ||
152 | - ``kmemleak_alloc_percpu`` - notify of a percpu memory block allocation | ||
153 | - ``kmemleak_free`` - notify of a memory block freeing | ||
154 | - ``kmemleak_free_part`` - notify of a partial memory block freeing | ||
155 | - ``kmemleak_free_percpu`` - notify of a percpu memory block freeing | ||
156 | - ``kmemleak_update_trace`` - update object allocation stack trace | ||
157 | - ``kmemleak_not_leak`` - mark an object as not a leak | ||
158 | - ``kmemleak_ignore`` - do not scan or report an object as leak | ||
159 | - ``kmemleak_scan_area`` - add scan areas inside a memory block | ||
160 | - ``kmemleak_no_scan`` - do not scan a memory block | ||
161 | - ``kmemleak_erase`` - erase an old value in a pointer variable | ||
162 | - ``kmemleak_alloc_recursive`` - as kmemleak_alloc but checks the recursiveness | ||
163 | - ``kmemleak_free_recursive`` - as kmemleak_free but checks the recursiveness | ||
164 | |||
165 | Dealing with false positives/negatives | ||
166 | -------------------------------------- | ||
167 | |||
168 | The false negatives are real memory leaks (orphan objects) but not | ||
169 | reported by kmemleak because values found during the memory scanning | ||
170 | point to such objects. To reduce the number of false negatives, kmemleak | ||
171 | provides the kmemleak_ignore, kmemleak_scan_area, kmemleak_no_scan and | ||
172 | kmemleak_erase functions (see above). The task stacks also increase the | ||
173 | number of false negatives and their scanning is not enabled by default. | ||
174 | |||
175 | The false positives are objects wrongly reported as being memory leaks | ||
176 | (orphan). For objects known not to be leaks, kmemleak provides the | ||
177 | kmemleak_not_leak function. The kmemleak_ignore could also be used if | ||
178 | the memory block is known not to contain other pointers and it will no | ||
179 | longer be scanned. | ||
180 | |||
181 | Some of the reported leaks are only transient, especially on SMP | ||
182 | systems, because of pointers temporarily stored in CPU registers or | ||
183 | stacks. Kmemleak defines MSECS_MIN_AGE (defaulting to 1000) representing | ||
184 | the minimum age of an object to be reported as a memory leak. | ||
185 | |||
186 | Limitations and Drawbacks | ||
187 | ------------------------- | ||
188 | |||
189 | The main drawback is the reduced performance of memory allocation and | ||
190 | freeing. To avoid other penalties, the memory scanning is only performed | ||
191 | when the /sys/kernel/debug/kmemleak file is read. Anyway, this tool is | ||
192 | intended for debugging purposes where the performance might not be the | ||
193 | most important requirement. | ||
194 | |||
195 | To keep the algorithm simple, kmemleak scans for values pointing to any | ||
196 | address inside a block's address range. This may lead to an increased | ||
197 | number of false negatives. However, it is likely that a real memory leak | ||
198 | will eventually become visible. | ||
199 | |||
200 | Another source of false negatives is the data stored in non-pointer | ||
201 | values. In a future version, kmemleak could scan only the pointer | ||
202 | members in the allocated structures. This feature would solve many of | ||
203 | the false negative cases described above. | ||
204 | |||
205 | The tool can report false positives. These are cases where an allocated | ||
206 | block doesn't need to be freed (some cases in the init_call functions), | ||
207 | the pointer is calculated by other methods than the usual container_of | ||
208 | macro or the pointer is stored in a location not scanned by kmemleak. | ||
209 | |||
210 | Page allocations and ioremap are not tracked. | ||
diff --git a/Documentation/dev-tools/sparse.rst b/Documentation/dev-tools/sparse.rst new file mode 100644 index 000000000000..8c250e8a2105 --- /dev/null +++ b/Documentation/dev-tools/sparse.rst | |||
@@ -0,0 +1,117 @@ | |||
1 | .. Copyright 2004 Linus Torvalds | ||
2 | .. Copyright 2004 Pavel Machek <pavel@ucw.cz> | ||
3 | .. Copyright 2006 Bob Copeland <me@bobcopeland.com> | ||
4 | |||
5 | Sparse | ||
6 | ====== | ||
7 | |||
8 | Sparse is a semantic checker for C programs; it can be used to find a | ||
9 | number of potential problems with kernel code. See | ||
10 | https://lwn.net/Articles/689907/ for an overview of sparse; this document | ||
11 | contains some kernel-specific sparse information. | ||
12 | |||
13 | |||
14 | Using sparse for typechecking | ||
15 | ----------------------------- | ||
16 | |||
17 | "__bitwise" is a type attribute, so you have to do something like this:: | ||
18 | |||
19 | typedef int __bitwise pm_request_t; | ||
20 | |||
21 | enum pm_request { | ||
22 | PM_SUSPEND = (__force pm_request_t) 1, | ||
23 | PM_RESUME = (__force pm_request_t) 2 | ||
24 | }; | ||
25 | |||
26 | which makes PM_SUSPEND and PM_RESUME "bitwise" integers (the "__force" is | ||
27 | there because sparse will complain about casting to/from a bitwise type, | ||
28 | but in this case we really _do_ want to force the conversion). And because | ||
29 | the enum values are all the same type, now "enum pm_request" will be that | ||
30 | type too. | ||
31 | |||
32 | And with gcc, all the "__bitwise"/"__force" stuff goes away, and it all | ||
33 | ends up looking just like integers to gcc. | ||
34 | |||
35 | Quite frankly, you don't need the enum there. The above all really just | ||
36 | boils down to one special "int __bitwise" type. | ||
37 | |||
38 | So the simpler way is to just do:: | ||
39 | |||
40 | typedef int __bitwise pm_request_t; | ||
41 | |||
42 | #define PM_SUSPEND ((__force pm_request_t) 1) | ||
43 | #define PM_RESUME ((__force pm_request_t) 2) | ||
44 | |||
45 | and you now have all the infrastructure needed for strict typechecking. | ||
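As a rough sketch of how this fits together outside the kernel tree, the fragment below defines the two annotations so that they expand to sparse attributes only when ``__CHECKER__`` is defined (sparse defines it; gcc does not). The helpers ``pm_request_valid`` and ``pm_request_to_int`` are hypothetical and exist only to show where a ``__force`` cast is needed.

```c
#include <assert.h>

/* Sparse sees the attributes; a normal compiler sees empty macros. */
#ifdef __CHECKER__
#define __bitwise __attribute__((bitwise))
#define __force   __attribute__((force))
#else
#define __bitwise
#define __force
#endif

typedef int __bitwise pm_request_t;

#define PM_SUSPEND ((__force pm_request_t) 1)
#define PM_RESUME  ((__force pm_request_t) 2)

/* Hypothetical helper: comparing two values of the same bitwise type is fine. */
int pm_request_valid(pm_request_t req)
{
	return req == PM_SUSPEND || req == PM_RESUME;
}

/* Hypothetical helper: leaving the bitwise type needs an explicit __force. */
int pm_request_to_int(pm_request_t req)
{
	assert(pm_request_valid(req));
	return (__force int) req;
}
```

With gcc this compiles as plain integer code. Under sparse, passing a bare ``1`` to ``pm_request_valid()`` or dropping the ``__force`` in ``pm_request_to_int()`` would be expected to draw a warning about mismatched types.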
46 | |||
47 | One small note: the constant integer "0" is special. You can use a | ||
48 | constant zero as a bitwise integer type without sparse ever complaining. | ||
49 | This is because "bitwise" (as the name implies) was designed for making | ||
50 | sure that bitwise types don't get mixed up (little-endian vs big-endian | ||
51 | vs cpu-endian vs whatever), and there the constant "0" really _is_ | ||
52 | special. | ||
53 | |||
54 | __bitwise__ - to be used for relatively compact stuff (gfp_t, etc.) that | ||
55 | is mostly warning-free and is supposed to stay that way. Warnings will | ||
56 | be generated without __CHECK_ENDIAN__. | ||
57 | |||
58 | __bitwise - noisy stuff; in particular, __le*/__be* are that. We really | ||
59 | don't want to drown in noise unless we'd explicitly asked for it. | ||
60 | |||
61 | Using sparse for lock checking | ||
62 | ------------------------------ | ||
63 | |||
64 | The following macros are undefined for gcc and defined during a sparse | ||
65 | run to use the "context" tracking feature of sparse, applied to | ||
66 | locking. These annotations tell sparse when a lock is held, with | ||
67 | regard to the annotated function's entry and exit. | ||
68 | |||
69 | __must_hold - The specified lock is held on function entry and exit. | ||
70 | |||
71 | __acquires - The specified lock is held on function exit, but not entry. | ||
72 | |||
73 | __releases - The specified lock is held on function entry, but not exit. | ||
74 | |||
75 | If the function enters and exits without the lock held, acquiring and | ||
76 | releasing the lock inside the function in a balanced way, no | ||
77 | annotation is needed. The three annotations above are for cases where | ||
78 | sparse would otherwise report a context imbalance. | ||
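The annotations can be illustrated with a toy, single-threaded lock. This is a sketch under assumptions: ``struct toy_lock``, ``struct counter`` and the ``counter_*`` functions are made up for the example, and the macro definitions are written out locally in the style the kernel headers use, so that the file also builds with a plain compiler.

```c
#include <assert.h>

#ifdef __CHECKER__
#define __must_hold(x) __attribute__((context(x, 1, 1)))
#define __acquires(x)  __attribute__((context(x, 0, 1)))
#define __releases(x)  __attribute__((context(x, 1, 0)))
#define __acquire(x)   __context__(x, 1)
#define __release(x)   __context__(x, -1)
#else
#define __must_hold(x)
#define __acquires(x)
#define __releases(x)
#define __acquire(x)   (void)0
#define __release(x)   (void)0
#endif

/* Hypothetical toy lock; it only records whether it is held. */
struct toy_lock { int held; };

struct counter {
	struct toy_lock lock;
	int value;
};

/* Lock is held on exit but not on entry. */
void counter_lock(struct counter *c) __acquires(&c->lock)
{
	assert(!c->lock.held);
	c->lock.held = 1;
	__acquire(&c->lock);
}

/* Lock is held on entry but not on exit. */
void counter_unlock(struct counter *c) __releases(&c->lock)
{
	assert(c->lock.held);
	__release(&c->lock);
	c->lock.held = 0;
}

/* Lock is held on both entry and exit. */
void counter_add_locked(struct counter *c, int n) __must_hold(&c->lock)
{
	assert(c->lock.held);
	c->value += n;
}

/* Balanced acquire/release inside the function: no annotation needed. */
int counter_add(struct counter *c, int n)
{
	int v;

	counter_lock(c);
	counter_add_locked(c, n);
	v = c->value;
	counter_unlock(c);
	return v;
}
```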
79 | |||
80 | Getting sparse | ||
81 | -------------- | ||
82 | |||
83 | You can get the latest released versions from the Sparse homepage at | ||
84 | https://sparse.wiki.kernel.org/index.php/Main_Page | ||
85 | |||
86 | Alternatively, you can get snapshots of the latest development version | ||
87 | of sparse using git to clone:: | ||
88 | |||
89 | git://git.kernel.org/pub/scm/devel/sparse/sparse.git | ||
90 | |||
91 | DaveJ has hourly generated tarballs of the git tree available at:: | ||
92 | |||
93 | http://www.codemonkey.org.uk/projects/git-snapshots/sparse/ | ||
94 | |||
95 | |||
96 | Once you have it, just do:: | ||
97 | |||
98 | make | ||
99 | make install | ||
100 | |||
101 | as a regular user, and it will install sparse in your ~/bin directory. | ||
102 | |||
103 | Using sparse | ||
104 | ------------ | ||
105 | |||
106 | Do a kernel make with "make C=1" to run sparse on all the C files that get | ||
107 | recompiled, or use "make C=2" to run sparse on the files whether they need to | ||
108 | be recompiled or not. The latter is a fast way to check the whole tree if you | ||
109 | have already built it. | ||
110 | |||
111 | The optional make variable CF can be used to pass arguments to sparse. The | ||
112 | build system passes -Wbitwise to sparse automatically. To perform endianness | ||
113 | checks, you may define __CHECK_ENDIAN__:: | ||
114 | |||
115 | make C=2 CF="-D__CHECK_ENDIAN__" | ||
116 | |||
117 | These checks are disabled by default as they generate a host of warnings. | ||
diff --git a/Documentation/dev-tools/tools.rst b/Documentation/dev-tools/tools.rst new file mode 100644 index 000000000000..824ae8e54dd5 --- /dev/null +++ b/Documentation/dev-tools/tools.rst | |||
@@ -0,0 +1,25 @@ | |||
1 | ================================ | ||
2 | Development tools for the kernel | ||
3 | ================================ | ||
4 | |||
5 | This document is a collection of documents about development tools that can | ||
6 | be used to work on the kernel. For now, the documents have been pulled | ||
7 | together without any significant effort to integrate them into a coherent | ||
8 | whole; patches welcome! | ||
9 | |||
10 | .. class:: toc-title | ||
11 | |||
12 | Table of contents | ||
13 | |||
14 | .. toctree:: | ||
15 | :maxdepth: 2 | ||
16 | |||
17 | coccinelle | ||
18 | sparse | ||
19 | kcov | ||
20 | gcov | ||
21 | kasan | ||
22 | ubsan | ||
23 | kmemleak | ||
24 | kmemcheck | ||
25 | gdb-kernel-debugging | ||
diff --git a/Documentation/dev-tools/ubsan.rst b/Documentation/dev-tools/ubsan.rst new file mode 100644 index 000000000000..655e6b63c227 --- /dev/null +++ b/Documentation/dev-tools/ubsan.rst | |||
@@ -0,0 +1,88 @@ | |||
1 | The Undefined Behavior Sanitizer - UBSAN | ||
2 | ======================================== | ||
3 | |||
4 | UBSAN is a runtime undefined behaviour checker. | ||
5 | |||
6 | UBSAN uses compile-time instrumentation to catch undefined behavior (UB). | ||
7 | The compiler inserts code that performs certain kinds of checks before | ||
8 | operations that may cause UB. If a check fails (i.e. UB is detected), a | ||
9 | __ubsan_handle_* function is called to print an error message. | ||
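To make the idea of an inserted check concrete, here is a hand-written user-space analogue of an instrumented left shift. It is not what the compiler actually emits: ``checked_shl``, ``report_shift_out_of_bounds`` and ``report_count`` are hypothetical names, and the sketch returns 0 after reporting instead of carrying on with the undefined shift.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define UINT_BITS (sizeof(unsigned int) * CHAR_BIT)

static unsigned int reports;  /* number of reports emitted so far */

/* Hypothetical stand-in for a __ubsan_handle_* function. */
static void report_shift_out_of_bounds(unsigned int exponent)
{
	reports++;
	fprintf(stderr,
		"shift exponent %u is too large for %zu-bit type 'unsigned int'\n",
		exponent, UINT_BITS);
}

/* Hand-written equivalent of an instrumented "value << exponent". */
unsigned int checked_shl(unsigned int value, unsigned int exponent)
{
	if (exponent >= UINT_BITS) {
		report_shift_out_of_bounds(exponent);
		return 0;  /* arbitrary result chosen for this sketch */
	}
	assert(exponent < UINT_BITS);  /* the shift below is well defined */
	return value << exponent;
}

/* Lets a caller see how many checks have failed. */
unsigned int report_count(void)
{
	return reports;
}
```

The check runs before the operation, so the report points at the offending source location rather than at some later symptom.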
10 | |||
11 | GCC has had that feature since 4.9.x [1_] (see ``-fsanitize=undefined`` option and | ||
12 | its suboptions). GCC 5.x has more checkers implemented [2_]. | ||
13 | |||
14 | Report example | ||
15 | -------------- | ||
16 | |||
17 | :: | ||
18 | |||
19 | ================================================================================ | ||
20 | UBSAN: Undefined behaviour in ../include/linux/bitops.h:110:33 | ||
21 | shift exponent 32 is to large for 32-bit type 'unsigned int' | ||
22 | CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.0-rc1+ #26 | ||
23 | 0000000000000000 ffffffff82403cc8 ffffffff815e6cd6 0000000000000001 | ||
24 | ffffffff82403cf8 ffffffff82403ce0 ffffffff8163a5ed 0000000000000020 | ||
25 | ffffffff82403d78 ffffffff8163ac2b ffffffff815f0001 0000000000000002 | ||
26 | Call Trace: | ||
27 | [<ffffffff815e6cd6>] dump_stack+0x45/0x5f | ||
28 | [<ffffffff8163a5ed>] ubsan_epilogue+0xd/0x40 | ||
29 | [<ffffffff8163ac2b>] __ubsan_handle_shift_out_of_bounds+0xeb/0x130 | ||
30 | [<ffffffff815f0001>] ? radix_tree_gang_lookup_slot+0x51/0x150 | ||
31 | [<ffffffff8173c586>] _mix_pool_bytes+0x1e6/0x480 | ||
32 | [<ffffffff83105653>] ? dmi_walk_early+0x48/0x5c | ||
33 | [<ffffffff8173c881>] add_device_randomness+0x61/0x130 | ||
34 | [<ffffffff83105b35>] ? dmi_save_one_device+0xaa/0xaa | ||
35 | [<ffffffff83105653>] dmi_walk_early+0x48/0x5c | ||
36 | [<ffffffff831066ae>] dmi_scan_machine+0x278/0x4b4 | ||
37 | [<ffffffff8111d58a>] ? vprintk_default+0x1a/0x20 | ||
38 | [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120 | ||
39 | [<ffffffff830b2240>] setup_arch+0x405/0xc2c | ||
40 | [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120 | ||
41 | [<ffffffff830ae053>] start_kernel+0x83/0x49a | ||
42 | [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120 | ||
43 | [<ffffffff830ad386>] x86_64_start_reservations+0x2a/0x2c | ||
44 | [<ffffffff830ad4f3>] x86_64_start_kernel+0x16b/0x17a | ||
45 | ================================================================================ | ||
46 | |||
47 | Usage | ||
48 | ----- | ||
49 | |||
50 | To enable UBSAN, configure the kernel with:: | ||
51 | |||
52 | CONFIG_UBSAN=y | ||
53 | |||
54 | and to check the entire kernel:: | ||
55 | |||
56 | CONFIG_UBSAN_SANITIZE_ALL=y | ||
57 | |||
58 | To enable instrumentation for specific files or directories, add a line | ||
59 | similar to the following to the respective kernel Makefile: | ||
60 | |||
61 | - For a single file (e.g. main.o):: | ||
62 | |||
63 | UBSAN_SANITIZE_main.o := y | ||
64 | |||
65 | - For all files in one directory:: | ||
66 | |||
67 | UBSAN_SANITIZE := y | ||
68 | |||
69 | To exclude files from being instrumented even if | ||
70 | ``CONFIG_UBSAN_SANITIZE_ALL=y``, use:: | ||
71 | |||
72 | UBSAN_SANITIZE_main.o := n | ||
73 | |||
74 | and:: | ||
75 | |||
76 | UBSAN_SANITIZE := n | ||
77 | |||
78 | Detection of unaligned accesses is controlled through a separate option, | ||
79 | CONFIG_UBSAN_ALIGNMENT. It is off by default on architectures that support | ||
80 | unaligned accesses (CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y). It can | ||
81 | still be enabled in the config, but note that it will produce a lot of UBSAN | ||
82 | reports. | ||
83 | |||
84 | References | ||
85 | ---------- | ||
86 | |||
87 | .. _1: https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html | ||
88 | .. _2: https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html | ||