authorBjoern Brandenburg <bbb@mpi-sws.org>2016-03-28 14:59:23 -0400
committerBjoern Brandenburg <bbb@mpi-sws.org>2016-03-28 15:00:44 -0400
commit1baefa0bacd9df3f6aed9b097c4916989cf2caab (patch)
treef44738c9f73398b65c9445e7a230bfc7de92d261
parent1f39b952832779cf7626afb35f720f485f4787b4 (diff)
Add documentation
-rw-r--r--  README.md                                   28
-rw-r--r--  doc/howto-trace-and-analyze-a-schedule.md  111
-rw-r--r--  doc/howto-trace-and-process-overheads.md   222
3 files changed, 361 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..37d8449
--- /dev/null
+++ b/README.md
@@ -0,0 +1,28 @@

# What is this?

This repository contains tools and scripts for working with the tracing facilities in [LITMUS^RT](http://www.litmus-rt.org). In particular, tools are provided for

1. recording and processing **overhead data**, obtained from [LITMUS^RT](http://www.litmus-rt.org) kernels with the help of the Feather-Trace tracing infrastructure, and for

2. recording, visualizing, and analyzing **scheduling traces** in the `sched_trace` format, which are also typically obtained from [LITMUS^RT](http://www.litmus-rt.org) kernels.


# Prefix Convention

With one exception, tools and scripts prefixed with `ft` work on Feather-Trace overhead data (in either "raw" binary form or in derived formats). Conversely, tools and scripts prefixed with `st` work on scheduling traces in the `sched_trace` binary format.

The one notable exception is the low-level tool `ftcat`, which is used to record both overheads and scheduling events.

# Documentation

There are two guides that provide an overview of how to work with the provided tools.

- [HOWTO: Trace and process overhead data](doc/howto-trace-and-process-overheads.md)

- [HOWTO: Trace and analyze a schedule](doc/howto-trace-and-analyze-a-schedule.md)

Some additional information is also available on the [LITMUS^RT wiki](https://wiki.litmus-rt.org/litmus/Tracing).

Users are expected to be comfortable reading the (generally clean) source code. When in doubt, consult the source. If still confused, then contact the [LITMUS^RT mailing list](https://wiki.litmus-rt.org/litmus/Mailinglist).

diff --git a/doc/howto-trace-and-analyze-a-schedule.md b/doc/howto-trace-and-analyze-a-schedule.md
new file mode 100644
index 0000000..8231f11
--- /dev/null
+++ b/doc/howto-trace-and-analyze-a-schedule.md
@@ -0,0 +1,111 @@

# HOWTO: Trace and Analyze a Schedule

Whereas Feather-Trace data records *how long* a scheduling decision or context switch takes, the `sched_trace` interface records *which* tasks are scheduled at what points in time, together with the corresponding job releases and deadlines.

## Recording a Schedule with `sched_trace`

The main high-level tool for recording scheduling decisions is the script `st-trace-schedule` (➞ [source](../st-trace-schedule)).

To record the execution of a task system, follow this rough outline:

1. start recording all scheduling decisions with `st-trace-schedule`;

2. launch and initialize all real-time tasks such that they wait for a *synchronous task system release* (see the `release_ts` utility in `liblitmus`);

3. release the task set with `release_ts`; and finally

4. stop `st-trace-schedule` when the benchmark has completed.

Example:

    st-trace-schedule my-trace
    CPU 0: 17102 > schedule_host=rts5_scheduler=GSN-EDF_trace=my-trace_cpu=0.bin [0]
    CPU 1: 17104 > schedule_host=rts5_scheduler=GSN-EDF_trace=my-trace_cpu=1.bin [0]
    CPU 2: 17106 > schedule_host=rts5_scheduler=GSN-EDF_trace=my-trace_cpu=2.bin [0]
    CPU 3: 17108 > schedule_host=rts5_scheduler=GSN-EDF_trace=my-trace_cpu=3.bin [0]
    Press Enter to end tracing...
    # [launch tasks]
    # [kick off experiment with release_ts]
    # [press Enter when done]
    Ending Trace...
    Disabling 10 events.
    Disabling 10 events.
    Disabling 10 events.
    Disabling 10 events.
    /dev/litmus/sched_trace2: XXX bytes read.
    /dev/litmus/sched_trace3: XXX bytes read.
    /dev/litmus/sched_trace1: XXX bytes read.
    /dev/litmus/sched_trace0: XXX bytes read.

As the output suggests, `st-trace-schedule` records one trace file per processor.

## What does `sched_trace` data look like?

A scheduling event is recorded whenever

- a task is dispatched (switched to),
- a task is preempted (switched away),
- a task suspends (i.e., blocks), or
- a task resumes (i.e., wakes up).

Furthermore, the release time, deadline, and completion time of each job are recorded, as are each task's parameters and the name of its executable (i.e., the `comm` field in the Linux kernel's PCB `struct task_struct`). Finally, the time of a synchronous task system release (if any) is recorded as a reference point for "time zero".

The binary format and all recorded events are documented in the header file `sched_trace.h` (➞ [source](../include/sched_trace.h)). A Python parser for trace files is available in the module `sched_trace` (➞ [low-level parser](../sched_trace/format.py) and [high-level interface](../sched_trace/__init__.py#60)).

The tool `st-dump` (➞ [source](../src/stdump.c)) may be used to print traces in a human-readable format for debugging purposes.
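
As a rough illustration of how such a binary trace might be consumed, the sketch below walks fixed-size records with Python's `struct` module. The record layout shown here (an 8-byte header of event type, CPU, PID, and job number, followed by a 16-byte payload) is an *assumption for illustration only*; consult `sched_trace.h` and the `sched_trace` Python module for the authoritative definitions.

```python
import struct

# ASSUMED illustrative layout (not authoritative; see sched_trace.h):
# u8 type, u8 cpu, u16 pid, u32 job, then a 16-byte event-specific payload.
HEADER = struct.Struct("<BBHI")
RECORD_SIZE = HEADER.size + 16

def read_records(data):
    """Yield (type, cpu, pid, job, payload) tuples from raw trace bytes."""
    for off in range(0, len(data) - RECORD_SIZE + 1, RECORD_SIZE):
        event_type, cpu, pid, job = HEADER.unpack_from(data, off)
        payload = data[off + HEADER.size : off + RECORD_SIZE]
        yield event_type, cpu, pid, job, payload
```

A parser along these lines is what the bundled `sched_trace/format.py` module provides in full; prefer the shipped module for real traces.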

## Drawing a Schedule

The [pycairo](http://cairographics.org/pycairo/)-based tool `st-draw` (➞ [source](../st-draw)) renders a trace as a PDF. By default, it renders the first one thousand milliseconds after *time zero*, which is either the time of the first synchronous task system release (if any) or the time of the first event in the trace (otherwise).

Example:

    st-draw schedule_host=rts5_scheduler=GSN-EDF_trace=my-trace_cpu=*.bin
    # Will render the schedule as schedule_host=rts5_scheduler=GSN-EDF_trace=my-trace.pdf

Invoke `st-draw -h` for a list of possible options. If the tool takes a long time to complete, run it in verbose mode (`-v`) and try to render a shorter schedule (`-l`).

## Obtaining Job Statistics

The tool `st-job-stats` (➞ [source](../src/job_stats.c)) produces a CSV file with relevant per-job statistics for further processing with (for example) a spreadsheet application.

Example:

    st-job-stats schedule_host=rts5_scheduler=GSN-EDF_trace=my-trace_cpu=*.bin
    # Task, Job, Period, Response, DL Miss?, Lateness, Tardiness, Forced?, ACET
    # task NAME=rtspin PID=17406 COST=590000 PERIOD=113000000 CPU=254
    17406, 3, 113000000, 17128309, 0, -95871691, 0, 0, 388179
    17406, 4, 113000000, 12138793, 0, -100861207, 0, 0, 382776
    17406, 5, 113000000, 7137743, 0, -105862257, 0, 0, 382334
    17406, 6, 113000000, 2236774, 0, -110763226, 0, 0, 382352
    17406, 7, 113000000, 561701, 0, -112438299, 0, 0, 559208
    17406, 8, 113000000, 384752, 0, -112615248, 0, 0, 382539
    17406, 9, 113000000, 565317, 0, -112434683, 0, 0, 561602
    17406, 10, 113000000, 379963, 0, -112620037, 0, 0, 377526
    [...]

There is one row for each job. The columns record:

1. the task's PID,

2. the job sequence number,

3. the task's period,

4. the job's response time,

5. a flag indicating whether the deadline was missed,

6. the job's lateness (i.e., response time - relative deadline),

7. the job's tardiness (i.e., max(0, lateness)),

8. a flag indicating whether the LITMUS^RT budget enforcement mechanism inserted an artificial job completion, and finally

9. the actual execution time (ACET) of the job.

Note that the *Forced?* flag is always zero for proper reservation-based schedulers (e.g., under `P-RES`). Forced job completion is an artefact of the LITMUS^RT legacy budget enforcement mechanism under process-based schedulers (such as `GSN-EDF` or `P-FP`).
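
The lateness and tardiness columns follow directly from the definitions above. Assuming implicit deadlines (relative deadline equal to the period, as in the `rtspin` example), the first data row can be reproduced with a few lines of arithmetic:

```python
def job_stats(response_time, relative_deadline):
    """Return (deadline_miss, lateness, tardiness) per the column definitions."""
    lateness = response_time - relative_deadline   # column 6
    tardiness = max(0, lateness)                   # column 7
    return lateness > 0, lateness, tardiness

# Values from job 3 of PID 17406 in the example output above.
print(job_stats(17128309, 113000000))  # -> (False, -95871691, 0)
```

Note that a negative lateness simply means the job finished well before its deadline.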
diff --git a/doc/howto-trace-and-process-overheads.md b/doc/howto-trace-and-process-overheads.md
new file mode 100644
index 0000000..37a6fcc
--- /dev/null
+++ b/doc/howto-trace-and-process-overheads.md
@@ -0,0 +1,222 @@

# HOWTO: Trace and Process Kernel Overheads

This guide documents how to trace and process system overheads (such as context-switch costs, scheduling costs, task wake-up latencies, etc.) in a [LITMUS^RT](http://www.litmus-rt.org) system.

## Recording Overheads with Feather-Trace

To record overheads, use the high-level wrapper script `ft-trace-overheads` (➞ [source](../ft-trace-overheads)) on a system running a LITMUS^RT kernel that has been compiled with overhead tracing enabled in the kernel configuration (i.e., `CONFIG_SCHED_OVERHEAD_TRACE=y`).

Use the script as follows. First, activate the scheduler that you are interested in (e.g., `GSN-EDF`). Then simply run `ft-trace-overheads` (as root) with a name that identifies the experiment. While `ft-trace-overheads` is running, execute your benchmark to exercise the kernel. When the benchmark has completed, terminate `ft-trace-overheads` by pressing the Enter key.

Example:

    $ setsched GSN-EDF
    $ ft-trace-overheads my-experiment
    [II] Recording /dev/litmus/ft_cpu_trace0 -> overheads_host=rts5_scheduler=GSN-EDF_trace=my-experiment_cpu=0.bin
    [II] Recording /dev/litmus/ft_cpu_trace1 -> overheads_host=rts5_scheduler=GSN-EDF_trace=my-experiment_cpu=1.bin
    [II] Recording /dev/litmus/ft_cpu_trace2 -> overheads_host=rts5_scheduler=GSN-EDF_trace=my-experiment_cpu=2.bin
    [II] Recording /dev/litmus/ft_cpu_trace3 -> overheads_host=rts5_scheduler=GSN-EDF_trace=my-experiment_cpu=3.bin
    [II] Recording /dev/litmus/ft_msg_trace0 -> overheads_host=rts5_scheduler=GSN-EDF_trace=my-experiment_msg=0.bin
    [II] Recording /dev/litmus/ft_msg_trace1 -> overheads_host=rts5_scheduler=GSN-EDF_trace=my-experiment_msg=1.bin
    [II] Recording /dev/litmus/ft_msg_trace2 -> overheads_host=rts5_scheduler=GSN-EDF_trace=my-experiment_msg=2.bin
    [II] Recording /dev/litmus/ft_msg_trace3 -> overheads_host=rts5_scheduler=GSN-EDF_trace=my-experiment_msg=3.bin
    Press Enter to end tracing...
    # [run your benchmark]
    # [press Enter when done]
    Ending Trace...
    Disabling 4 events.
    Disabling 4 events.
    Disabling 4 events.
    Disabling 4 events.
    Disabling 18 events.
    Disabling 18 events.
    Disabling 18 events.
    Disabling 18 events.
    /dev/litmus/ft_msg_trace3: XXX bytes read.
    /dev/litmus/ft_msg_trace0: XXX bytes read.
    /dev/litmus/ft_msg_trace1: XXX bytes read.
    /dev/litmus/ft_cpu_trace2: XXX bytes read.
    /dev/litmus/ft_cpu_trace1: XXX bytes read.
    /dev/litmus/ft_cpu_trace3: XXX bytes read.
    /dev/litmus/ft_cpu_trace0: XXX bytes read.
    /dev/litmus/ft_msg_trace2: XXX bytes read.

For performance reasons, Feather-Trace records overhead data in separate per-processor trace buffers and treats core-local events and inter-processor interrupts (IPIs) differently. Correspondingly, `ft-trace-overheads` records two trace files for each core in the system:

1. The file `overheads…cpu=$CPU.bin` contains all overhead samples related to CPU-local events such as context switches.

2. The file `overheads…msg=$CPU.bin` contains overhead samples stemming from IPIs, such as reschedule notifications.

### Key-Value Encoding

To aid with keeping track of relevant setup information, the tool encodes the system's host name and the currently active scheduler in a simple `key=value` format in the filename.

We recommend adopting the same encoding scheme for the experiment tags. For example, when running an experiment named "foo" with (say) 40 tasks and a total utilization of 75 percent, launch `ft-trace-overheads` as `ft-trace-overheads foo_n=40_u=75`; the additional parameters will be added transparently to the final trace file name.

Example:

    ft-trace-overheads foo_n=40_u=75
    [II] Recording /dev/litmus/ft_cpu_trace0 -> overheads_host=rts5_scheduler=GSN-EDF_trace=foo_n=40_u=75_cpu=0.bin

However, this convention is purely optional.
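
A filename in this convention is easy to decode programmatically. The following sketch (a hypothetical helper, not part of the toolchain) splits a trace file name back into its key=value pairs:

```python
import os

def parse_trace_name(path):
    """Split a key=value-encoded trace file name into a dict."""
    stem = os.path.basename(path)
    stem = stem.rsplit(".", 1)[0]  # drop the .bin/.float32 extension
    fields = {}
    for token in stem.split("_"):
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields

fields = parse_trace_name(
    "overheads_host=rts5_scheduler=GSN-EDF_trace=foo_n=40_u=75_cpu=0.bin")
print(fields["scheduler"], fields["n"])  # GSN-EDF 40
```

Note that this simple scheme works because neither keys nor values contain underscores.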

### Automating `ft-trace-overheads`

It can be useful to terminate `ft-trace-overheads` from another script by sending a signal. For this purpose, pass the `-s` flag to `ft-trace-overheads`, which makes it terminate cleanly when it receives the `SIGUSR1` signal.

When recording overheads on a large platform, it can take a few seconds until all tracer processes have finished initialization. To ensure that all overheads are recorded, do not start the benchmark workload until initialization is complete. To this end, the script guarantees that the string "to end tracing..." does not appear in its output (on STDOUT) until initialization has completed on all cores.
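
One way to drive this from a script (a sketch, assuming `ft-trace-overheads` is on the `PATH` and was started with `-s`) is to scan the tool's standard output for that marker before launching the benchmark, and later deliver `SIGUSR1`:

```python
import signal
import subprocess

MARKER = "to end tracing..."

def wait_for_marker(stream, marker=MARKER):
    """Consume lines until the readiness marker appears; True if it was seen."""
    for line in stream:
        if marker in line:
            return True
    return False

def run_traced(benchmark_argv, trace_name):
    # Hypothetical wrapper: exact paths and arguments depend on your setup.
    tracer = subprocess.Popen(["ft-trace-overheads", "-s", trace_name],
                              stdout=subprocess.PIPE, text=True)
    if not wait_for_marker(tracer.stdout):
        raise RuntimeError("tracer exited before initialization completed")
    subprocess.run(benchmark_argv, check=True)  # [run your benchmark]
    tracer.send_signal(signal.SIGUSR1)          # end tracing cleanly
    tracer.wait()
```

The marker scan is what guarantees that no overhead samples are missed at the start of the benchmark.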

## What does Feather-Trace data look like?

Feather-Trace produces "raw" overhead files. Each file contains simple event records. Each event record consists of the following fields (➞ [see definition](https://github.com/LITMUS-RT/feather-trace-tools/blob/master/include/timestamp.h#L18)):

- an event ID (e.g., `SCHED_START`),
- the ID of the processor on which the event was recorded (0-255),
- a per-processor sequence number (32 bits),
- the PID of the affected or involved process (if applicable, zero otherwise),
- the type of the affected or involved process (best-effort, real-time, or unknown),
- a timestamp (48 bits),
- a flag that indicates whether any interrupts occurred since the last recorded event, and
- an approximate interrupt count (0-31, may overflow).
The timestamp is typically a raw cycle count (e.g., obtained with `rdtsc`). However, for certain events such as `RELEASE_LATENCY`, the kernel records the time value directly in nanoseconds.

**Note**: Feather-Trace records data in native endianness. When processing data files on a machine with a different endianness, endianness swapping is required prior to further processing (see `ftsort` below).
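
To make the field list concrete, the sketch below unpacks one 16-byte record under an *assumed* little-endian bitfield layout. The exact packing is compiler- and architecture-specific, and `ft2csv` is the supported way to decode traces; treat this purely as an illustration of the bit widths listed above (see `timestamp.h` for the authoritative definition).

```python
import struct

def decode_record(buf):
    """Decode one 16-byte record, assuming a little-endian bitfield layout."""
    word, seq, cpu, event, flags = struct.unpack("<QIBBBx", buf)
    return {
        "timestamp": word & ((1 << 48) - 1),  # 48-bit cycle count
        "pid":       word >> 48,              # PID of the involved process
        "seq":       seq,                     # 32-bit sequence number
        "cpu":       cpu,                     # recording processor (0-255)
        "event":     event,                   # event ID
        "task_type": flags & 0x3,             # 2 bits: BE / RT / unknown
        "irq_flag":  (flags >> 2) & 0x1,      # interrupts since last event?
        "irq_count": flags >> 3,              # 5 bits, may overflow
    }
```

On a machine with the opposite endianness, the raw words would have to be byte-swapped first, which is exactly the normalization `ftsort` can perform.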

## Event Pairs

Most Feather-Trace events come in pairs. For example, context-switch overheads are measured by first recording a `CXS_START` event prior to the context switch, and then a `CXS_END` event just after the context switch. The context-switch overhead is given by the difference between the two timestamps.

There are two event pairs related to scheduling: the pair `SCHED_START`-`SCHED_END` records the scheduling overhead prior to a context switch, and the pair `SCHED2_START`-`SCHED2_END` records the scheduling overhead after a context switch (i.e., any clean-up).

To see which event records are available, simply record a trace with `ft-trace-overheads` and look through it with `ftdump` (see below), or have a look at the list of event IDs (➞ [see definitions](https://github.com/LITMUS-RT/feather-trace-tools/blob/master/include/timestamp.h#L41)).
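
The pairing logic itself is straightforward once records are decoded: for each CPU, match every start event with the next matching end event and subtract the timestamps. A hypothetical sketch (the real extraction is done by `ft2csv`):

```python
def pair_overheads(records, start, end):
    """Yield timestamp deltas for start/end event pairs, matched per CPU."""
    open_start = {}  # cpu -> timestamp of the pending start event
    for rec in records:
        if rec["event"] == start:
            open_start[rec["cpu"]] = rec["timestamp"]
        elif rec["event"] == end and rec["cpu"] in open_start:
            yield rec["timestamp"] - open_start.pop(rec["cpu"])

trace = [
    {"event": "CXS_START", "cpu": 0, "timestamp": 1000},
    {"event": "CXS_START", "cpu": 1, "timestamp": 1100},
    {"event": "CXS_END",   "cpu": 0, "timestamp": 1450},
    {"event": "CXS_END",   "cpu": 1, "timestamp": 1600},
]
print(list(pair_overheads(trace, "CXS_START", "CXS_END")))  # [450, 500]
```

Matching per CPU is essential: start/end events of different processors interleave freely in time.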

## Low-Level Tools

This repository provides three main low-level tools that operate on raw overhead trace files. These tools provide the basis for the higher-level tools discussed below.

1. `ftdump` prints a human-readable version of a trace file's contents. This is useful primarily for manual inspection. Run as `ftdump <MY-TRACE-FILE>`.

2. `ftsort` sorts a Feather-Trace binary trace file by the recorded sequence numbers, which is useful to normalize traces prior to further processing in case events were stored out of order. Run as `ftsort <MY-TRACE-FILE>`. `ftsort` can also carry out endianness swaps if needed. Run `ftsort -h` to see the available options.

3. `ft2csv` is used to extract overhead data from raw trace files. For example, to extract all context-switch overhead samples, run `ft2csv CXS <MY-TRACE-FILE>`. Run `ft2csv -h` to see the available options.

By default, `ft2csv` produces CSV data. It can also produce binary output compatible with NumPy's `float32` format, which allows for efficient processing of overhead data with NumPy's `numpy.memmap()` facility.
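
Such a `float32` file is just a flat sequence of 4-byte samples, so it can be consumed directly; for example, with the standard `array` module (or, for large files, `numpy.memmap`). A small self-contained sketch, assuming a little-endian host:

```python
import array
import tempfile

def load_float32(path):
    """Read a raw file of machine float32 samples into a Python array."""
    samples = array.array("f")  # C float, 4 bytes on common platforms
    with open(path, "rb") as f:
        samples.frombytes(f.read())
    return samples

# Tiny demo with a synthetic sample file (values in cycles, for instance).
with tempfile.NamedTemporaryFile(suffix=".float32", delete=False) as tmp:
    array.array("f", [1250.0, 980.0, 1710.0]).tofile(tmp)

samples = load_float32(tmp.name)
print(len(samples), max(samples))  # 3 1710.0
```

For multi-gigabyte traces, `numpy.memmap(path, dtype="float32")` avoids reading the whole file into memory, which is why the tools target that format.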

## High-Level Tools

This repository provides a couple of scripts around `ftsort` and `ft2csv` that automate common post-processing steps. We recommend that novice users stick to these high-level scripts until they have acquired some familiarity with the LITMUS^RT tracing infrastructure.

Post-processing of (a large collection of) overhead files typically involves:

1. sorting all files with `ftsort`,

2. splitting out all recorded overhead samples from all trace files,

3. combining data from per-CPU trace files and from traces with different task counts, system utilizations, etc. into single data files for further processing,

4. counting how many events of each type were recorded,

5. shuffling and truncating all sample files, and finally

6. extracting simple statistics such as the observed median, mean, and maximum values.

Note that steps 4 and 5 are required to allow a statistically meaningful comparison of the sampled maxima. (That is, to avoid sampling bias, do not compare the maxima of trace files containing different numbers of samples.)

Corresponding to the above steps, this repository provides a number of scripts that automate these tasks.


### Sorting Feather-Trace Files

The `ft-sort-traces` script (➞ [source](../ft-sort-traces)) simply runs `ftsort` on all trace files. Invoke as `ft-sort-traces <MY-TRACE-FILES>`. We recommend keeping a log of all post-processing steps with `tee`.

Example:

    ft-sort-traces overheads_*.bin 2>&1 | tee -a overhead-processing.log

Sorting used to be an essential step, but in recent versions of LITMUS^RT, most traces do not contain any out-of-order samples.

### Extracting Overhead Samples

The script `ft-extract-samples` (➞ [source](../ft-extract-samples)) extracts all samples from all provided files with `ft2csv`.

Example:

    ft-extract-samples overheads_*.bin 2>&1 | tee -a overhead-processing.log

The underlying `ft2csv` tool automatically discards any samples that were disturbed by interrupts.

### Combining Samples

The script `ft-combine-samples` (➞ [source](../ft-combine-samples)) combines several data files into a single data file for further processing. This script assumes that file names follow the key=value naming convention already mentioned above:

    <basename>_key1=value1_key2=value2...keyN=valueN.float32

The script simply strips certain key=value pairs to concatenate files that have matching values for all parameters that were not stripped. For instance, to combine all trace data irrespective of task count, as specified by "_n=<NUMBER>_", invoke as `ft-combine-samples -n <MY-DATA-FILES>`. The option `--std` combines files with different task counts (`_n=`), different utilizations (`_u=`), for all sequence numbers (`_seq=`), and for all CPU IDs (`_cpu=` and `_msg=`).

Example:

    ft-combine-samples --std overheads_*.float32 2>&1 | tee -a overhead-processing.log
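
The stripping operation can be pictured as a small filename rewrite. The helper below is hypothetical (shown only to illustrate the grouping rule); the real script operates on the file contents as well:

```python
import re

def strip_keys(name, keys):
    """Remove the given _key=value fields so matching files share one name."""
    for key in keys:
        name = re.sub(rf"_{re.escape(key)}=[^_.]+", "", name)
    return name

print(strip_keys(
    "overheads_host=rts5_scheduler=GSN-EDF_trace=foo_n=40_u=75_cpu=0.float32",
    ["n", "cpu"]))
# overheads_host=rts5_scheduler=GSN-EDF_trace=foo_u=75.float32
```

All input files that map to the same stripped name are concatenated into one combined data file.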


### Counting Samples

The script `ft-count-samples` simply looks at all provided trace files and, for each overhead type, determines the minimum number of samples recorded. The output is formatted as a CSV file.

Example:

    ft-count-samples combined-overheads_*.float32 > counts.csv

### Random Sample Selection

To allow for an unbiased comparison of the sample maxima, it is important to use the same number of samples for all compared traces. For example, to compare scheduling overhead under different schedulers, make sure you use the same number of samples for all schedulers. If the traces contain different numbers of samples (which is very likely), then a subset must be selected prior to computing any statistics.

The approach chosen here is to randomly shuffle and then truncate (a copy of) the files containing the samples. This is automated by the script `ft-select-samples` (➞ [source](../ft-select-samples)).

**Note**: the first argument to `ft-select-samples` *must* be a CSV file produced by `ft-count-samples`.

Example:

    ft-select-samples counts.csv combined-overheads_*.float32 2>&1 | tee -a overhead-processing.log

The script does not modify the original sample files. Instead, it produces new files of uniform size containing the randomly selected samples. These files are given the extension `sf32` (= shuffled float32).
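
Conceptually, the shuffle-and-truncate step is just the following (a sketch, not the script's actual implementation; the common count comes from the `ft-count-samples` output):

```python
import random

def select_samples(samples, count, seed=0):
    """Randomly shuffle a copy of the samples and truncate to a common count."""
    chosen = list(samples)
    random.Random(seed).shuffle(chosen)  # deterministic for a fixed seed
    return chosen[:count]

picked = select_samples(range(1000), 250)
print(len(picked))  # 250
```

Shuffling before truncating ensures that the retained subset is a uniform random sample rather than, say, only the earliest samples of the trace.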

### Computing Statistics

The script `ft-compute-stats` (➞ [source](../ft-compute-stats)) processes `sf32` or `float32` files to extract the maximum, average, median, and minimum observed overheads, as well as the standard deviation and variance. The output is provided as a CSV file for further processing (e.g., formatting with a spreadsheet application).

**Note**: Feather-Trace records most overheads in cycles. To convert to microseconds, one must provide the speed of the experimental platform, measured in processor cycles per microsecond, with the `--cycles-per-usec` option. The speed can be inferred from the processor's spec sheet (e.g., a 2 GHz processor executes 2000 cycles per microsecond) or from `/proc/cpuinfo` (on x86 platforms). The LITMUS^RT user-space library [liblitmus](https://github.com/LITMUS-RT/liblitmus) also contains a tool `cycles` that can help measure this value.

Example:

    ft-compute-stats combined-overheads_*.sf32 > stats.csv
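
The cycles-to-microseconds conversion and the reported summary statistics amount to the following (a sketch of the computation; the script itself may differ in detail):

```python
import statistics

def overhead_stats(samples_cycles, cycles_per_usec):
    """Convert cycle samples to microseconds and summarize them."""
    us = [s / cycles_per_usec for s in samples_cycles]
    return {
        "max":    max(us),
        "min":    min(us),
        "mean":   statistics.mean(us),
        "median": statistics.median(us),
        "std":    statistics.stdev(us),
        "var":    statistics.variance(us),
    }

# A 2 GHz processor executes 2000 cycles per microsecond.
print(overhead_stats([2000, 4000, 6000], 2000))
```

Reporting the maximum alongside the median and mean matters in this context: worst-case analyses are typically driven by the observed maxima.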
## Complete Example

Suppose all overhead files collected with `ft-trace-overheads` are located in the directory `$DIR`. Overhead statistics can be extracted as follows.

    # (1) Sort
    ft-sort-traces overheads_*.bin 2>&1 | tee -a overhead-processing.log

    # (2) Split
    ft-extract-samples overheads_*.bin 2>&1 | tee -a overhead-processing.log

    # (3) Combine
    ft-combine-samples --std overheads_*.float32 2>&1 | tee -a overhead-processing.log

    # (4) Count available samples
    ft-count-samples combined-overheads_*.float32 > counts.csv

    # (5) Shuffle & truncate
    ft-select-samples counts.csv combined-overheads_*.float32 2>&1 | tee -a overhead-processing.log

    # (6) Compute statistics
    ft-compute-stats combined-overheads_*.sf32 > stats.csv