aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/ftrace.txt
diff options
context:
space:
mode:
authorIngo Molnar <mingo@elte.hu>2008-11-12 06:39:21 -0500
committerIngo Molnar <mingo@elte.hu>2008-11-12 06:39:21 -0500
commit708b8eae0fd532af73ea8350e6dcc10255ff7376 (patch)
treef336436934fd79bc91aff7112a9beb10bc4e839f /Documentation/ftrace.txt
parentd98d38f2014ab79f28c126ff175d034891f7aefc (diff)
parentf21f237cf55494c3a4209de323281a3b0528da10 (diff)
Merge branch 'linus' into core/locking
Diffstat (limited to 'Documentation/ftrace.txt')
-rw-r--r--Documentation/ftrace.txt174
1 files changed, 76 insertions, 98 deletions
diff --git a/Documentation/ftrace.txt b/Documentation/ftrace.txt
index d330fe3103da..9cc4d685dde5 100644
--- a/Documentation/ftrace.txt
+++ b/Documentation/ftrace.txt
@@ -8,7 +8,7 @@ Copyright 2008 Red Hat Inc.
8Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton, 8Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
9 John Kacur, and David Teigland. 9 John Kacur, and David Teigland.
10 10
11Written for: 2.6.27-rc1 11Written for: 2.6.28-rc2
12 12
13Introduction 13Introduction
14------------ 14------------
@@ -50,26 +50,26 @@ of ftrace. Here is a list of some of the key files:
50 50
51 Note: all time values are in microseconds. 51 Note: all time values are in microseconds.
52 52
53 current_tracer : This is used to set or display the current tracer 53 current_tracer: This is used to set or display the current tracer
54 that is configured. 54 that is configured.
55 55
56 available_tracers : This holds the different types of tracers that 56 available_tracers: This holds the different types of tracers that
57 have been compiled into the kernel. The tracers 57 have been compiled into the kernel. The tracers
58 listed here can be configured by echoing their name 58 listed here can be configured by echoing their name
59 into current_tracer. 59 into current_tracer.
60 60
61 tracing_enabled : This sets or displays whether the current_tracer 61 tracing_enabled: This sets or displays whether the current_tracer
62 is activated and tracing or not. Echo 0 into this 62 is activated and tracing or not. Echo 0 into this
63 file to disable the tracer or 1 to enable it. 63 file to disable the tracer or 1 to enable it.
64 64
65 trace : This file holds the output of the trace in a human readable 65 trace: This file holds the output of the trace in a human readable
66 format (described below). 66 format (described below).
67 67
68 latency_trace : This file shows the same trace but the information 68 latency_trace: This file shows the same trace but the information
69 is organized more to display possible latencies 69 is organized more to display possible latencies
70 in the system (described below). 70 in the system (described below).
71 71
72 trace_pipe : The output is the same as the "trace" file but this 72 trace_pipe: The output is the same as the "trace" file but this
73 file is meant to be streamed with live tracing. 73 file is meant to be streamed with live tracing.
74 Reads from this file will block until new data 74 Reads from this file will block until new data
75 is retrieved. Unlike the "trace" and "latency_trace" 75 is retrieved. Unlike the "trace" and "latency_trace"
@@ -82,11 +82,11 @@ of ftrace. Here is a list of some of the key files:
82 tracer is not adding more data, they will display 82 tracer is not adding more data, they will display
83 the same information every time they are read. 83 the same information every time they are read.
84 84
85 iter_ctrl : This file lets the user control the amount of data 85 iter_ctrl: This file lets the user control the amount of data
86 that is displayed in one of the above output 86 that is displayed in one of the above output
87 files. 87 files.
88 88
89 trace_max_latency : Some of the tracers record the max latency. 89 trace_max_latency: Some of the tracers record the max latency.
90 For example, the time interrupts are disabled. 90 For example, the time interrupts are disabled.
91 This time is saved in this file. The max trace 91 This time is saved in this file. The max trace
92 will also be stored, and displayed by either 92 will also be stored, and displayed by either
@@ -94,29 +94,26 @@ of ftrace. Here is a list of some of the key files:
94 only be recorded if the latency is greater than 94 only be recorded if the latency is greater than
95 the value in this file. (in microseconds) 95 the value in this file. (in microseconds)
96 96
97 trace_entries : This sets or displays the number of trace 97 trace_entries: This sets or displays the number of bytes each CPU
98 entries each CPU buffer can hold. The tracer buffers 98 buffer can hold. The tracer buffers are the same size
99 are the same size for each CPU. The displayed number 99 for each CPU. The displayed number is the size of the
100 is the size of the CPU buffer and not total size. The 100 CPU buffer and not total size of all buffers. The
101 trace buffers are allocated in pages (blocks of memory 101 trace buffers are allocated in pages (blocks of memory
102 that the kernel uses for allocation, usually 4 KB in size). 102 that the kernel uses for allocation, usually 4 KB in size).
103 Since each entry is smaller than a page, if the last 103 If the last page allocated has room for more bytes
104 allocated page has room for more entries than were 104 than requested, the rest of the page will be used,
105 requested, the rest of the page is used to allocate 105 making the actual allocation bigger than requested.
106 entries. 106 (Note, the size may not be a multiple of the page size due
107 to buffer managment overhead.)
107 108
108 This can only be updated when the current_tracer 109 This can only be updated when the current_tracer
109 is set to "none". 110 is set to "nop".
110 111
111 NOTE: It is planned on changing the allocated buffers 112 tracing_cpumask: This is a mask that lets the user only trace
112 from being the number of possible CPUS to
113 the number of online CPUS.
114
115 tracing_cpumask : This is a mask that lets the user only trace
116 on specified CPUS. The format is a hex string 113 on specified CPUS. The format is a hex string
117 representing the CPUS. 114 representing the CPUS.
118 115
119 set_ftrace_filter : When dynamic ftrace is configured in (see the 116 set_ftrace_filter: When dynamic ftrace is configured in (see the
120 section below "dynamic ftrace"), the code is dynamically 117 section below "dynamic ftrace"), the code is dynamically
121 modified (code text rewrite) to disable calling of the 118 modified (code text rewrite) to disable calling of the
122 function profiler (mcount). This lets tracing be configured 119 function profiler (mcount). This lets tracing be configured
@@ -130,14 +127,11 @@ of ftrace. Here is a list of some of the key files:
130 be traced. If a function exists in both set_ftrace_filter 127 be traced. If a function exists in both set_ftrace_filter
131 and set_ftrace_notrace, the function will _not_ be traced. 128 and set_ftrace_notrace, the function will _not_ be traced.
132 129
133 available_filter_functions : When a function is encountered the first 130 available_filter_functions: This lists the functions that ftrace
134 time by the dynamic tracer, it is recorded and 131 has processed and can trace. These are the function
135 later the call is converted into a nop. This file 132 names that you can pass to "set_ftrace_filter" or
136 lists the functions that have been recorded 133 "set_ftrace_notrace". (See the section "dynamic ftrace"
137 by the dynamic tracer and these functions can 134 below for more details.)
138 be used to set the ftrace filter by the above
139 "set_ftrace_filter" file. (See the section "dynamic ftrace"
140 below for more details).
141 135
142 136
143The Tracers 137The Tracers
@@ -145,7 +139,7 @@ The Tracers
145 139
146Here is the list of current tracers that may be configured. 140Here is the list of current tracers that may be configured.
147 141
148 ftrace - function tracer that uses mcount to trace all functions. 142 function - function tracer that uses mcount to trace all functions.
149 143
150 sched_switch - traces the context switches between tasks. 144 sched_switch - traces the context switches between tasks.
151 145
@@ -166,8 +160,8 @@ Here is the list of current tracers that may be configured.
166 the highest priority task to get scheduled after 160 the highest priority task to get scheduled after
167 it has been woken up. 161 it has been woken up.
168 162
169 none - This is not a tracer. To remove all tracers from tracing 163 nop - This is not a tracer. To remove all tracers from tracing
170 simply echo "none" into current_tracer. 164 simply echo "nop" into current_tracer.
171 165
172 166
173Examples of using the tracer 167Examples of using the tracer
@@ -182,7 +176,7 @@ Output format:
182Here is an example of the output format of the file "trace" 176Here is an example of the output format of the file "trace"
183 177
184 -------- 178 --------
185# tracer: ftrace 179# tracer: function
186# 180#
187# TASK-PID CPU# TIMESTAMP FUNCTION 181# TASK-PID CPU# TIMESTAMP FUNCTION
188# | | | | | 182# | | | | |
@@ -192,7 +186,7 @@ Here is an example of the output format of the file "trace"
192 -------- 186 --------
193 187
194A header is printed with the tracer name that is represented by the trace. 188A header is printed with the tracer name that is represented by the trace.
195In this case the tracer is "ftrace". Then a header showing the format. Task 189In this case the tracer is "function". Then a header showing the format. Task
196name "bash", the task PID "4251", the CPU that it was running on 190name "bash", the task PID "4251", the CPU that it was running on
197"01", the timestamp in <secs>.<usecs> format, the function name that was 191"01", the timestamp in <secs>.<usecs> format, the function name that was
198traced "path_put" and the parent function that called this function 192traced "path_put" and the parent function that called this function
@@ -291,6 +285,9 @@ explains which is which.
291 CPU#: The CPU which the process was running on. 285 CPU#: The CPU which the process was running on.
292 286
293 irqs-off: 'd' interrupts are disabled. '.' otherwise. 287 irqs-off: 'd' interrupts are disabled. '.' otherwise.
288 Note: If the architecture does not support a way to
289 read the irq flags variable, an 'X' will always
290 be printed here.
294 291
295 need-resched: 'N' task need_resched is set, '.' otherwise. 292 need-resched: 'N' task need_resched is set, '.' otherwise.
296 293
@@ -1000,22 +997,20 @@ is the stack for the hard interrupt. This hides the fact that NEED_RESCHED
1000has been set. We do not see the 'N' until we switch back to the task's 997has been set. We do not see the 'N' until we switch back to the task's
1001assigned stack. 998assigned stack.
1002 999
1003ftrace 1000function
1004------ 1001--------
1005 1002
1006ftrace is not only the name of the tracing infrastructure, but it 1003This tracer is the function tracer. Enabling the function tracer
1007is also a name of one of the tracers. The tracer is the function 1004can be done from the debug file system. Make sure the ftrace_enabled is
1008tracer. Enabling the function tracer can be done from the 1005set; otherwise this tracer is a nop.
1009debug file system. Make sure the ftrace_enabled is set otherwise
1010this tracer is a nop.
1011 1006
1012 # sysctl kernel.ftrace_enabled=1 1007 # sysctl kernel.ftrace_enabled=1
1013 # echo ftrace > /debug/tracing/current_tracer 1008 # echo function > /debug/tracing/current_tracer
1014 # echo 1 > /debug/tracing/tracing_enabled 1009 # echo 1 > /debug/tracing/tracing_enabled
1015 # usleep 1 1010 # usleep 1
1016 # echo 0 > /debug/tracing/tracing_enabled 1011 # echo 0 > /debug/tracing/tracing_enabled
1017 # cat /debug/tracing/trace 1012 # cat /debug/tracing/trace
1018# tracer: ftrace 1013# tracer: function
1019# 1014#
1020# TASK-PID CPU# TIMESTAMP FUNCTION 1015# TASK-PID CPU# TIMESTAMP FUNCTION
1021# | | | | | 1016# | | | | |
@@ -1037,10 +1032,10 @@ this tracer is a nop.
1037[...] 1032[...]
1038 1033
1039 1034
1040Note: ftrace uses ring buffers to store the above entries. The newest data 1035Note: function tracer uses ring buffers to store the above entries.
1041may overwrite the oldest data. Sometimes using echo to stop the trace 1036The newest data may overwrite the oldest data. Sometimes using echo to
1042is not sufficient because the tracing could have overwritten the data 1037stop the trace is not sufficient because the tracing could have overwritten
1043that you wanted to record. For this reason, it is sometimes better to 1038the data that you wanted to record. For this reason, it is sometimes better to
1044disable tracing directly from a program. This allows you to stop the 1039disable tracing directly from a program. This allows you to stop the
1045tracing at the point that you hit the part that you are interested in. 1040tracing at the point that you hit the part that you are interested in.
1046To disable the tracing directly from a C program, something like following 1041To disable the tracing directly from a C program, something like following
@@ -1074,18 +1069,31 @@ every kernel function, produced by the -pg switch in gcc), starts
1074of pointing to a simple return. (Enabling FTRACE will include the 1069of pointing to a simple return. (Enabling FTRACE will include the
1075-pg switch in the compiling of the kernel.) 1070-pg switch in the compiling of the kernel.)
1076 1071
1077When dynamic ftrace is initialized, it calls kstop_machine to make 1072At compile time every C file object is run through the
1078the machine act like a uniprocessor so that it can freely modify code 1073recordmcount.pl script (located in the scripts directory). This
1079without worrying about other processors executing that same code. At 1074script will process the C object using objdump to find all the
1080initialization, the mcount calls are changed to call a "record_ip" 1075locations in the .text section that call mcount. (Note, only
1081function. After this, the first time a kernel function is called, 1076the .text section is processed, since processing other sections
1082it has the calling address saved in a hash table. 1077like .init.text may cause races due to those sections being freed).
1083 1078
1084Later on the ftraced kernel thread is awoken and will again call 1079A new section called "__mcount_loc" is created that holds references
1085kstop_machine if new functions have been recorded. The ftraced thread 1080to all the mcount call sites in the .text section. This section is
1086will change all calls to mcount to "nop". Just calling mcount 1081compiled back into the original object. The final linker will add
1087and having mcount return has shown a 10% overhead. By converting 1082all these references into a single table.
1088it to a nop, there is no measurable overhead to the system. 1083
1084On boot up, before SMP is initialized, the dynamic ftrace code
1085scans this table and updates all the locations into nops. It also
1086records the locations, which are added to the available_filter_functions
1087list. Modules are processed as they are loaded and before they are
1088executed. When a module is unloaded, it also removes its functions from
1089the ftrace function list. This is automatic in the module unload
1090code, and the module author does not need to worry about it.
1091
1092When tracing is enabled, kstop_machine is called to prevent races
1093with the CPUS executing code being modified (which can cause the
1094CPU to do undesireable things), and the nops are patched back
1095to calls. But this time, they do not call mcount (which is just
1096a function stub). They now call into the ftrace infrastructure.
1089 1097
1090One special side-effect to the recording of the functions being 1098One special side-effect to the recording of the functions being
1091traced is that we can now selectively choose which functions we 1099traced is that we can now selectively choose which functions we
@@ -1248,36 +1256,6 @@ Produces:
1248 1256
1249We can see that there's no more lock or preempt tracing. 1257We can see that there's no more lock or preempt tracing.
1250 1258
1251ftraced
1252-------
1253
1254As mentioned above, when dynamic ftrace is configured in, a kernel
1255thread wakes up once a second and checks to see if there are mcount
1256calls that need to be converted into nops. If there are not any, then
1257it simply goes back to sleep. But if there are some, it will call
1258kstop_machine to convert the calls to nops.
1259
1260There may be a case in which you do not want this added latency.
1261Perhaps you are doing some audio recording and this activity might
1262cause skips in the playback. There is an interface to disable
1263and enable the "ftraced" kernel thread.
1264
1265 # echo 0 > /debug/tracing/ftraced_enabled
1266
1267This will disable the calling of kstop_machine to update the
1268mcount calls to nops. Remember that there is a large overhead
1269to calling mcount. Without this kernel thread, that overhead will
1270exist.
1271
1272If there are recorded calls to mcount, any write to the ftraced_enabled
1273file will cause the kstop_machine to run. This means that a
1274user can manually perform the updates when they want to by simply
1275echoing a '0' into the ftraced_enabled file.
1276
1277The updates are also done at the beginning of enabling a tracer
1278that uses ftrace function recording.
1279
1280
1281trace_pipe 1259trace_pipe
1282---------- 1260----------
1283 1261
@@ -1286,14 +1264,14 @@ on the tracing is different. Every read from trace_pipe is consumed.
1286This means that subsequent reads will be different. The trace 1264This means that subsequent reads will be different. The trace
1287is live. 1265is live.
1288 1266
1289 # echo ftrace > /debug/tracing/current_tracer 1267 # echo function > /debug/tracing/current_tracer
1290 # cat /debug/tracing/trace_pipe > /tmp/trace.out & 1268 # cat /debug/tracing/trace_pipe > /tmp/trace.out &
1291[1] 4153 1269[1] 4153
1292 # echo 1 > /debug/tracing/tracing_enabled 1270 # echo 1 > /debug/tracing/tracing_enabled
1293 # usleep 1 1271 # usleep 1
1294 # echo 0 > /debug/tracing/tracing_enabled 1272 # echo 0 > /debug/tracing/tracing_enabled
1295 # cat /debug/tracing/trace 1273 # cat /debug/tracing/trace
1296# tracer: ftrace 1274# tracer: function
1297# 1275#
1298# TASK-PID CPU# TIMESTAMP FUNCTION 1276# TASK-PID CPU# TIMESTAMP FUNCTION
1299# | | | | | 1277# | | | | |
@@ -1314,7 +1292,7 @@ is live.
1314 1292
1315Note, reading the trace_pipe file will block until more input is added. 1293Note, reading the trace_pipe file will block until more input is added.
1316By changing the tracer, trace_pipe will issue an EOF. We needed 1294By changing the tracer, trace_pipe will issue an EOF. We needed
1317to set the ftrace tracer _before_ cating the trace_pipe file. 1295to set the function tracer _before_ we "cat" the trace_pipe file.
1318 1296
1319 1297
1320trace entries 1298trace entries
@@ -1331,10 +1309,10 @@ number of entries.
133165620 130965620
1332 1310
1333Note, to modify this, you must have tracing completely disabled. To do that, 1311Note, to modify this, you must have tracing completely disabled. To do that,
1334echo "none" into the current_tracer. If the current_tracer is not set 1312echo "nop" into the current_tracer. If the current_tracer is not set
1335to "none", an EINVAL error will be returned. 1313to "nop", an EINVAL error will be returned.
1336 1314
1337 # echo none > /debug/tracing/current_tracer 1315 # echo nop > /debug/tracing/current_tracer
1338 # echo 100000 > /debug/tracing/trace_entries 1316 # echo 100000 > /debug/tracing/trace_entries
1339 # cat /debug/tracing/trace_entries 1317 # cat /debug/tracing/trace_entries
1340100045 1318100045