commit 4f97e3e478b4b248d993bce56c1c6bb737decbbe
Author: Jonathan Herman <hermanjl@cs.unc.edu>
Date:   2013-03-19 15:06:36 -0400

    Added README.
I. INTRODUCTION
These scripts provide a common way to create, run, parse, and
plot experiments under LITMUS^RT. They are designed with the
following principles in mind:

1. Little or no configuration: all scripts accept parameters that
configure their behavior. However, if the user does not give these
parameters, the scripts will examine the properties of the user's
system to pick a suitable default. Requiring user input is a last
resort.

2. Interruptibility: the scripts save their work as they evaluate
multiple directories. When the scripts are interrupted, or if new data
is added to those directories, the scripts can be re-run and will
resume where they left off. This vastly decreases turnaround time for
testing new features.

3. Maximum safety: where possible, scripts save metadata in their output
directories describing the data contained there. This metadata can be used
by the other scripts to safely use the data later.

4. Independence / legacy support: none of these scripts assume their
input was generated by another of these scripts. They are designed to
recognize generic input formats inspired by past LITMUS^RT
experimental setups. (The exception to this is gen_exps.py, which
has only user input and creates output only for run_exps.py.)

5. Save everything: all output and parameters (even from subprocesses)
are saved for debugging / reproducibility. This data is saved in tmp/
directories while scripts are running in case a script fails.

These scripts require that the following repos are in the user's PATH:
1. liblitmus - for real-time executable simulation and task set release
2. feather-trace-tools - for recording and parsing overheads and
   scheduling events

Optionally, additional features are enabled if these repos are
present in the PATH:
1. rt-kernelshark - to record ftrace events for kernelshark visualization
2. sched_trace - to output a file containing scheduling events as
   strings

Each of these scripts is designed to operate independently of the
others. For example, parse_exps.py will find any feather-trace
files resembling ft-xyz.bin or xyz.ft and print out overhead
statistics for the records inside. However, the scripts provide the
most features (especially safety) when their results are chained
together, like so:

gen_exps.py --> [exps/*] --> run_exps.py --> [run-data/*] --.
.------------------------------------------------------------'
'--> parse_exps.py --> [parse-data/*] --> plot_exps.py --> [plot-data/*.pdf]

0. Create experiments with gen_exps.py or some other script.
1. Run experiments using run_exps.py, generating binary files in run-data/.
2. Parse binary data in run-data/ using parse_exps.py, generating csv
   files in parse-data/.
3. Plot parse-data/ using plot_exps.py, generating pdfs in plot-data/.

Each of these scripts is described below. The run_exps.py script comes
first because gen_exps.py creates schedule files which depend on run_exps.py.


II. RUN_EXPS
Usage: run_exps.py [OPTIONS] [SCHED_FILE]... [SCHED_DIR]...
       where a SCHED_DIR resembles:
       SCHED_DIR/
         SCHED_FILE
         PARAM_FILE

Output: OUT_DIR/[files] or OUT_DIR/SCHED_DIR/[files] or
        OUT_DIR/SCHED_FILE/[files] depending on input
        If all features are enabled, these files are:
        OUT_DIR/[.*/]
          trace.slog    # LITMUS^RT logging
          st-[1..m].bin # sched_trace data
          ft.bin        # feather-trace overhead data
          trace.dat     # ftrace data for kernelshark
          params.py     # schedule parameters
          exec-out.txt  # standard output from schedule processes
          exec-err.txt  # standard error from schedule processes

Defaults: SCHED_FILE = sched.py, PARAM_FILE = params.py,
          DURATION = 30, OUT_DIR = run-data/

The run_exps.py script reads schedule files and executes real-time
task systems, recording all overhead, logging, and trace data which is
enabled in the system. For example, if trace logging is enabled and
rt-kernelshark is found in the PATH, but feather-trace is disabled
(the devices are not present), only trace logs and kernelshark logs
will be recorded.

While run_exps.py is running a schedule file, temporary data is saved
in a 'tmp' directory in the same directory as the schedule file. When
execution completes, this data is moved into a directory under the
run_exps.py output directory (default: 'run-data/', changeable with
the -o option). When multiple schedules are run, each schedule's data
is saved in a unique directory under the output directory.

If a schedule has been run and its data is in the output directory,
run_exps.py will not re-run the schedule unless the -f option is
specified. This is useful if your system crashes midway through a set
of experiments.

Schedule files have one of the following two formats:

a) simple format
   path/to/proc{proc_value}
   ...
   path/to/proc{proc_value}
   [real_time_task: default rtspin] task_arguments...
   ...
   [real_time_task] task_arguments...

b) python format
   {'proc':[
      ('path/to/proc','proc_value'),
      ...,
      ('path/to/proc','proc_value')
    ],
    'spin':[
      ('real_time_task', 'task_arguments'),
      ...
      ('real_time_task', 'task_arguments')
    ]
   }

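The simple format maps mechanically onto the python format. The sketch
below shows one way that translation could work (parse_simple_sched is a
hypothetical helper for illustration, not the script's actual parser):

```python
def parse_simple_sched(text, default_task='rtspin'):
    """Convert the simple schedule format into the python-format dict.
    A line like 'path/to/proc{value}' becomes a proc entry; any other
    line is '[task] task_arguments...', defaulting to rtspin when the
    line starts with a number."""
    sched = {'proc': [], 'spin': []}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith('}') and '{' in line:
            path, _, value = line[:-1].partition('{')
            sched['proc'].append((path, value))
        else:
            first, _, rest = line.partition(' ')
            if first[0].isdigit():
                sched['spin'].append((default_task, line))
            else:
                sched['spin'].append((first, rest))
    return sched
```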
The following creates a simple 3-task system with a total utilization of
about 1.92, which is then run under the GSN-EDF plugin:

$ echo "10 20
30 40
60 90" > test.sched
$ run_exps.py -s GSN-EDF test.sched

The following will set a release master by writing
/proc/litmus/release_master:

$ echo "release_master{2}
10 20" > test.sched && run_exps.py -s GSN-EDF test.sched

The longer form can be used for proc entries not in /proc/litmus:

$ echo "/proc/sys/something{hello}
10 20" > test.sched

You can also specify your own spin programs to run instead of
rtspin by putting their name at the beginning of the line:

$ echo "colorspin -f color1.csv 10 20" > test.sched

This example also shows how you can reference files in the same
directory as the schedule file on the command line.

You can specify parameters for an experiment in a file instead of on
the command line using params.py (the -p option lets you choose the
name of this file if params.py is not for you):

$ echo "{'scheduler':'GSN-EDF', 'duration':10}" > params.py
$ run_exps.py test.sched

You can also run multiple experiments with a single command, provided
a directory with a schedule file exists for each. By default, the
program will look for sched.py for the schedule file and params.py for
the parameter file, but this behavior can be changed using the -p and
-c options.

You can include extra parameters which run_exps.py does not
understand in params.py. These parameters will be saved with the data
output by run_exps.py. This is useful for tracking variations in
system parameters versus experimental results.

In the following example, multiple experiments are demonstrated and an
extra parameter 'test-param' is included:

$ mkdir test1
# The duration will default to 30 and need not be specified
$ echo "{'scheduler':'C-EDF', 'test-param':1}" > test1/params.py
$ echo "10 20" > test1/sched.py
$ cp -r test1 test2
$ echo "{'scheduler':'GSN-EDF', 'test-param':2}" > test2/params.py
$ run_exps.py test*

Finally, you can specify system properties in params.py which the
environment must match for the experiment to run. These are useful if
you have a large batch of experiments which must be run under
different kernels. The first property is a regular expression for the
uname of the system:

$ uname -r
3.0.0-litmus
$ cp params.py old_params.py
$ echo "{'uname': r'.*linux.*'}" >> params.py
# run_exps.py will now complain of an invalid environment for this
# experiment
$ cp old_params.py params.py
$ echo "{'uname': r'.*litmus.*'}" >> params.py
# run_exps.py will now succeed

The second property is a set of kernel configuration options. These
assume the configuration is stored at /boot/config-`uname -r`. You can
specify them like so:

$ echo "{'config-options':{
'RELEASE_MASTER' : 'y',
'ARM' : 'y'}}" >> params.py
# Only executes on ARM systems with the release master enabled

III. GEN_EXPS
Usage: gen_exps.py [options] [files...] [generators...] [param=val[,val]...]
Output: exps/EXP_DIRS which each contain sched.py and params.py
Defaults: generators = G-EDF P-EDF C-EDF

The gen_exps.py script uses 'generators', one for each supported
LITMUS^RT scheduler, each of which has different properties which can
be varied to generate different types of schedules. Each of these
properties has a default value which can be modified on the command
line for quick and easy experiment generation.

This script as written should be used to create debugging task sets,
but not task sets for experiments shown in papers. That
is because the safety features of run_exps.py described above (uname,
config-options) are not used here. If you are creating experiments for
a paper, you should create your own generator which outputs the values
for 'config-options' required by your plugin, so that you cannot ruin
your experiments at run time.

The -l option lists the supported generators which can be specified:

$ gen_exps.py -l
G-EDF, P-EDF, C-EDF

The -d option describes the properties of a generator or
generators and their default values. Note that some of these defaults
will vary depending on the system the script is run on. For example,
'cpus' defaults to the number of cpus on the current system, in this
example 24.

$ gen_exps.py -d G-EDF,P-EDF
Generator GSN-EDF:
  num_tasks -- Number of tasks per experiment.
    Default: [24, 48, 72, 96]
    Allowed: <type 'int'>
  ....

Generator PSN-EDF:
  num_tasks -- Number of tasks per experiment.
    Default: [24, 48, 72, 96]
    Allowed: <type 'int'>
  cpus -- Number of processors on target system.
    Default: [24]
    Allowed: <type 'int'>
  ....

You create experiments by specifying a generator. The following will
create 4 schedules, with 24, 48, 72, and 96 tasks, because the
default value of num_tasks is an array of these values:

$ gen_exps.py P-EDF
$ ls exps/
sched=PSN-EDF_num-tasks=24/ sched=PSN-EDF_num-tasks=48/
sched=PSN-EDF_num-tasks=72/ sched=PSN-EDF_num-tasks=96/

You can override the default with a single value (the -f option deletes
previous experiments in the output directory, which defaults to 'exps/'
and is changeable with -o):

$ gen_exps.py -f P-EDF num_tasks=24
$ ls exps/
sched=PSN-EDF_num-tasks=24/

Or with an array of values, specified as a comma-separated list:

$ gen_exps.py -f num_tasks=`seq -s, 24 2 30` P-EDF
sched=PSN-EDF_num-tasks=24/ sched=PSN-EDF_num-tasks=26/
sched=PSN-EDF_num-tasks=28/ sched=PSN-EDF_num-tasks=30/

The generator will create a different directory for each possible
configuration of the parameters. Each parameter which is varied is
included in the name of the schedule directory. For example, to vary
the number of CPUs but not the number of tasks:

$ gen_exps.py -f num_tasks=24 cpus=3,6 P-EDF
$ ls exps
sched=PSN-EDF_cpus=3/ sched=PSN-EDF_cpus=6/

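The directory naming above can be sketched as a cross product over the
list-valued parameters. The exp_dirs helper below is hypothetical and
simplified (it ignores, e.g., the -n trial numbering):

```python
from itertools import product

def exp_dirs(scheduler, params):
    """One directory name per combination of list-valued parameters;
    scalar parameters stay fixed and are kept out of the name."""
    varying = sorted(k for k, v in params.items() if isinstance(v, list))
    names = []
    for combo in product(*(params[k] for k in varying)):
        parts = ['sched=%s' % scheduler]
        parts += ['%s=%s' % (k.replace('_', '-'), v)
                  for k, v in zip(varying, combo)]
        names.append('_'.join(parts))
    return names
```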
The values of non-varying parameters are still saved in
params.py. Continuing the example above:

$ cat exps/sched\=PSN-EDF_cpus\=3/params.py
{'periods': 'harmonic', 'release_master': False, 'duration': 30,
 'utils': 'uni-medium', 'scheduler': 'PSN-EDF', 'cpus': 3}

You can also have multiple schedules generated with the same
configuration using the -n option:

$ gen_exps.py -f num_tasks=24 -n 5 P-EDF
$ ls exps/
sched=PSN-EDF_trial=0/ sched=PSN-EDF_trial=1/ sched=PSN-EDF_trial=2/
sched=PSN-EDF_trial=3/ sched=PSN-EDF_trial=4/


IV. PARSE_EXPS
Usage: parse_exps.py [options] [data_dir1] [data_dir2]...
       where data_dirs contain feather-trace and sched-trace data,
       e.g. ft.bin, mysched.ft, or st-*.bin.

Output: print out all parsed data or
        OUT_FILE where OUT_FILE is a python map of the data or
        OUT_DIR/[FIELD]*/[PARAM]/[TYPE]/[TYPE]/[LINE].csv

        The goal is to create csv files which record how varying PARAM
        changes the value of FIELD. Only PARAMs which vary are
        considered.

        FIELD is a parsed value, e.g. 'RELEASE' overhead or 'miss-ratio'.
        PARAM is a parameter which is varied, e.g. 'tasks'.
        A single LINE is created for every configuration of parameters
        other than PARAM.

        TYPE is the type of measurement, i.e. Max, Min, Avg, or
        Var[iance]. The two TYPE levels differentiate between
        measurements across tasks in a single task set and
        measurements across all task sets. E.g. miss-ratio/*/Max/Avg
        is the maximum of all the average miss ratios for each task set,
        while miss-ratio/*/Avg/Max is the average of the maximum miss
        ratios for each task set.

Defaults: OUT_DIR or OUT_FILE = parse-data, data_dir1 = '.'

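The two nested statistics above can be illustrated with a small sketch
(summarize is an illustrative helper for this README, not part of
parse_exps.py):

```python
def summarize(per_taskset):
    """per_taskset holds one list of per-task measurements per task set.
    The inner TYPE is taken within each task set, the outer TYPE across
    task sets -- e.g. Max/Avg = max over sets of the per-set averages."""
    avgs = [sum(ts) / float(len(ts)) for ts in per_taskset]
    maxes = [max(ts) for ts in per_taskset]
    return {'Max/Avg': max(avgs),                     # maximum of per-set averages
            'Avg/Max': sum(maxes) / float(len(maxes))}  # average of per-set maxima
```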
The parse_exps.py script reads a directory or directories, parses the
binary files inside for feather-trace or sched-trace data, then
summarizes and organizes the results for output. The output can go to
the console, to a python map, or to a directory tree of csvs (the
default). The python map (via -m) can be used for
schedulability tests. The directory tree can be used to examine how
changing parameters affects certain measurements.

The script will use half of the current computer's CPUs to process data.

In the following example, too little data was found to create csv
files, so the data is output to the console even though the user did
not specify the -v option. This mode is the easiest for quick overhead
evaluation and debugging. Note that for overhead measurements like
these, parse_exps.py will use the 'clock-frequency' parameter saved in
a params.py file by run_exps.py to calculate overhead measurements. If
a param file is not present, as in this case, the current CPU's
frequency will be used.

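The conversion itself is simple: a clock frequency of f MHz means f
cycles per microsecond. A sketch, with a hypothetical fallback that
scrapes /proc/cpuinfo-style text when no params.py is available (both
helper names are assumptions for illustration):

```python
import re

def cycles_to_us(cycles, clock_mhz):
    # f MHz = f cycles per microsecond, so us = cycles / f.
    return cycles / float(clock_mhz)

def cpu_mhz_fallback(cpuinfo_text):
    """Fall back to the current CPU's frequency as reported in
    /proc/cpuinfo when no 'clock-frequency' parameter was saved."""
    m = re.search(r'cpu MHz\s*:\s*([\d.]+)', cpuinfo_text)
    return float(m.group(1)) if m else None
```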
$ ls run-data/
taskset_scheduler=C-FL-split-L3_host=ludwig_n=10_idx=05_split=randsplit.ft
$ parse_exps.py run-data/
Loading experiments...
Parsing data...
  0.00%
Writing result...
Too little data to make csv files.
<ExpPoint-/home/hermanjl/tmp/run-data>
  CXS:   Avg: 5.053  Max: 59.925  Min: 0.241
  SCHED: Avg: 4.410  Max: 39.350  Min: 0.357
  TICK:  Avg: 1.812  Max: 21.380  Min: 0.241

In the next example, because the value of num-tasks varies, csvs can
be created:

$ ls run-data/
sched=C-EDF_num-tasks=4/  sched=GSN-EDF_num-tasks=4/
sched=C-EDF_num-tasks=8/  sched=GSN-EDF_num-tasks=8/
sched=C-EDF_num-tasks=12/ sched=GSN-EDF_num-tasks=12/
sched=C-EDF_num-tasks=16/ sched=GSN-EDF_num-tasks=16/
$ parse_exps.py run-data/*
$ ls parse-data/
avg-block/ avg-tard/ max-block/ max-tard/ miss-ratio/

The varying parameters were found by reading the params.py files under
each run-data subdirectory.

You can use the -v option to print out the values measured even when
csvs could be created.

You can use the -i option to ignore variations in a certain parameter
(or parameters, if a comma-separated list is given). In the following
example, the user has decided that 'option' does not matter after
viewing the output. Note that the 'trial' parameter, used by gen_exps.py
to create multiple schedules with the same configuration, is always
ignored.

$ ls run-data/
sched=C-EDF_num-tasks=4_option=1/ sched=C-EDF_num-tasks=4_option=2/
sched=C-EDF_num-tasks=8_option=1/ sched=C-EDF_num-tasks=8_option=2/
$ parse_exps.py run-data/*
$ for i in `ls parse-data/miss-ratio/tasks/Avg/Avg/`; do echo $i; cat
$i; done
option=1.csv
  4 .1
  8 .2
option=2.csv
  4 .2
  8 .4
# Now ignore 'option' for more accurate results
$ parse_exps.py -i option run-data/*
$ for i in `ls parse-data/miss-ratio/tasks/Avg/Avg/`; do echo $i; cat
$i; done
line.csv
  4 .2
  8 .3

The second command will also run faster than the first. This is
because parse_exps.py saves the data it parses in tmp/ directories
before it attempts to sort it into csvs. Parsing takes far longer than
sorting, so this saves a lot of time. The -f flag can be used to
re-parse files and overwrite this saved data.

All output from the feather-trace-tools programs used to parse data is
stored in the tmp/ directories created in the input directories. If
the sched_trace repo is found in the user's PATH, st_show will be used
to create a human-readable version of the sched-trace data, which will
also be stored there.


V. PLOT_EXPS
Usage: plot_exps.py [options] [csv_dir]...
       where a csv_dir is a directory or a directory of directories
       (and so on) containing csvs, like:
       csv_dir/[subdirs/...]
         line1.csv
         line2.csv
         line3.csv

Output: OUT_DIR/[csv_dir/]*[plot]*.pdf
        where a single plot exists for each directory of csvs, with a
        line for each csv file in that directory. If only a
        single csv_dir is specified, all plots are placed directly
        under OUT_DIR.

Defaults: OUT_DIR = 'plot-data/', csv_dir = '.'

The plot_exps.py script takes directories of csvs (or directories
formatted as specified below) and creates a pdf plot of each csv
directory found. A line is created for each .csv file contained in a
plot. Matplotlib is used to do the plotting.

If the csv files are named like:

  param=value_param2=value2.csv

the variation of these parameters will be used to style the lines in
the most readable way. For instance, if there are three parameters,
variations in one parameter will change line color, another line
style (dashes/dots/etc), and a third line markers
(triangles/circles/etc).

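Recovering the parameters from such file names is a simple split. The
sketch below (illustrative helpers, not plot_exps.py's real code) also
shows one way each parameter could be given its own visual dimension:

```python
def parse_line_name(name):
    """Split a csv name like 'scheduler=C-EDF_option=1.csv' into a
    parameter dict (assumes the values themselves contain no '_')."""
    stem = name[:-4] if name.endswith('.csv') else name
    return dict(part.split('=', 1) for part in stem.split('_'))

def style_dimensions(params):
    """Assign each varying parameter its own visual dimension, in
    order: color first, then dash style, then marker."""
    return dict(zip(sorted(params), ['color', 'linestyle', 'marker']))
```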
If a directory of directories is passed in, the script will assume the
top-level directory is the measured value and the next level is the
variable, i.e.:

  value/variable/[..../]line.csv

and will title the plot "Value by variable (...)". Otherwise,
the name of the top-level directory will be the title, like "Value".

A directory with some lines:
$ ls
line1.csv line2.csv
$ plot_exps.py
$ ls plot-data/
plot.pdf

A directory with a few subdirectories:
$ ls test/
apples/ oranges/
$ ls test/apples/
line1.csv line2.csv
$ plot_exps.py test/
$ ls plot-data/
apples.pdf oranges.pdf

A directory with many subdirectories:
$ ls parse-data
avg-block/ avg-tard/ max-block/ max-tard/ miss-ratio/
$ ls parse-data/avg-block/tasks/Avg/Avg
scheduler=C-EDF.csv scheduler=PSN-EDF.csv
$ plot_exps.py parse-data
$ ls plot-data
avg-block_tasks_Avg_Avg.pdf avg-block_tasks_Avg_Max.pdf avg-block_tasks_Avg_Min.pdf
avg-block_tasks_Max_Avg.pdf avg-block_tasks_Max_Max.pdf avg-block_tasks_Max_Min.pdf
avg-block_tasks_Min_Avg.pdf avg-block_tasks_Min_Max.pdf avg-block_tasks_Min_Min.pdf
avg-block_tasks_Var_Avg.pdf avg-block_tasks_Var_Max.pdf avg-block_tasks_Var_Min.pdf
.......

If you run the previous example directly on the subdirectories,
subdirectories will be created in the output:

$ plot_exps.py parse-data/*
$ ls plot-data/
avg-block/ max-tard/ avg-tard/ miss-ratio/ max-block/
$ ls plot-data/avg-block/
tasks_Avg_Avg.pdf tasks_Avg_Min.pdf tasks_Max_Max.pdf
tasks_Min_Avg.pdf tasks_Min_Min.pdf tasks_Var_Max.pdf
tasks_Avg_Max.pdf tasks_Max_Avg.pdf tasks_Max_Min.pdf
tasks_Min_Max.pdf tasks_Var_Avg.pdf tasks_Var_Min.pdf

However, when a single directory of directories is given, the script
assumes the experiments are related, which lets it match line styles
across different plots and parallelize the plotting more effectively.