From d89ab8abb843f2df1c16584f3403e5d687d9fb3d Mon Sep 17 00:00:00 2001
From: Mac Mollison <mollison@cs.unc.edu>
Date: Sun, 14 Mar 2010 04:52:06 -0400
Subject: Bring doc up to date

---
 doc/index.txt | 111 ++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 76 insertions(+), 35 deletions(-)

diff --git a/doc/index.txt b/doc/index.txt
index 7ac39b1..78551c9 100644
--- a/doc/index.txt
+++ b/doc/index.txt
@@ -16,32 +16,26 @@ It is the complete documentation for Unit-Trace.
 ## Architecture ##
 Before trying to use Unit-Trace, it will help to understand the architecture of Unit-Trace.
 
-The `unit_trace` Python module is installed in the Python `site_packages` directory, so that it can be
-imported from and Python script on the system.
-Thus, work with Unit-Trace is done by writing a short frontend script that makes use of the submodules
-you need for the task at hand.
-
-The `unit_trace` module consists of a number of submodules used to work with traces.
-(Here, a `trace` is a record of scheduling events, produced by a scheduler, and may be contained in one or more `trace files` (for example,
-one per CPU).
-
-In a typical Unit-Trace script, a `parser` submodule produces a **stream** of **records**, which are then piped to
-subsequent submodules, and finally piped to one or more modules that produce output.
-
-For example, a user might write a script to invoke a parser on a set of trace files; pipe the resulting stream of event
-records to the global EDF testing submodule, which adds error errors to the stream; pipe the stream to a submodule that computes statistics (for example,
-the 10 lengthiest priority inversions) and produces relevant records; and finally, pipe the stream to a submodule that outputs all scheduling events, errors, and the
-statistical information that was computed previously.
-Seeing errors, the user may then wish to generate a visualization for a given time interval.
-(All the functionality described in this scenario is available in the current version of Unit-Trace.)
-
-This architecture provides a clean and flexible interface for working with scheduler traces. Because Python iterators are "lazy," producing items
-(in this case, various records) only when necessary, this architecture avoids requiring that all trace information be read into memory at one time.
-
-We provide a several frontend scripts for common tasks, but encourage users to customize these to better fix their specific needs. 
-We also provide useful submodules, but expect that users will need new submodules that do not yet exist 
-(for example, not PFAIR testing submodules exists). 
-We hope users will contribute any useful code that they produce back to the project.
+Unit-Trace performs various operations on **trace files**. Oftentimes, when scheduler tracing takes
+place, multiple trace files are generated for each experiment (e.g. one per CPU). We call a
+related group of trace files a **trace**.
+
+The user interacts with the tool using `unit-trace`, a Python script that should be installed
+on the local executable path.
+`unit-trace` invokes operations provided by the `unit_trace` Python module, which is installed
+in the local `site-packages` directory (the default install location for Python modules).
+
+The `unit_trace` module provides submodule(s) for parsing trace files into a **stream** of Python objects (which we call **records**).
+This stream is then passed from one `unit_trace` submodule to the next (like a pipe), undergoing
+various transformations in the process, and ultimately arriving at one or more output submodules.
+Intermediate modules generally add records; for example, the `gedf_test` module adds records to indicate
+priority inversions, which can be treated appropriately by output submodules.
+
+This architecture provides a clean and easy-to-use interface for both users and contributors.
+The stream is implemented using Python iterators.
+This allows records to be evaluated lazily (i.e. on an as-needed basis), making it possible to pull into memory only
+the records that are needed, and only as long as they are needed.
+This is important for dealing with very large traces.
 
 ## Obtaining Unit-Trace ##
 Members of UNC's Real-Time Group can obtain Unit-Trace using:  
@@ -50,19 +44,66 @@ Members of UNC's Real-Time Group can obtain Unit-Trace using:
 ## Installing Unit-Trace ##
 Unit-Trace is based on Python 2.6, so make sure that is available on your system.
 
-Unit-Trace is installed by copying the unit_trace folder (a Python module) to the system's Python 2.6 `site-packages` directory, usually located at
-`/usr/lib/python2.6/site-packages/`.
-Frontend scripts (i.e., scripts that `import unit_trace`) can then be used anywhere on the system.
-The Unit-Trace code includes `install.py`, which automates (re)installation when called with `sudo`.
+Unit-Trace can be installed manually by copying the `unit-trace` script and `unit_trace` Python module, as described previously.
+Alternatively, you can use `sudo install.py` to (re)install Unit-Trace.
 
 ## Using Unit-Trace ##
-Example frontend scripts are included in the `scripts` folder, and can be used as-is for many tasks.
-
-- gedf_test.py reads trace files passed as command-line arguments and prints out all scheduling events, priority inversions, and some statistics
-- visualize.py reads traces files passed as command-line arguments and draws the corresponding schedule.
+Type `unit-trace` (without options) to view usage information.
+
+In summary, trace files must be specified.
+Flags are used to enable any desired submodules.
+Some flags have accompanying parameter(s) that are passed to the submodule.
+The order in which submodules process records is pre-determined by the `unit-trace` script,
+so the user can specify flags on the command line in any order.
+
+## Example/Use Case ##
+Suppose that Alice wants to perform G-EDF testing on the LITMUS<sup>RT</sup> traces included in the sample_traces/ folder.
+
+The LITMUS<sup>RT</sup> tracing mechanism outputs superfluous events at the beginning of the trace that will not appear "correct" for
+G-EDF testing (for example, multiple job releases that never complete, indicating the initialization of a task).
+Alice uses the following command to print out (-o) the first 50 records (-m 50), looking for the end of the bogus records:  
+<codeblock>unit-trace -m 50 -o *.bin</codeblock>.
+
+Seeing that she hasn't yet reached useful records, she uses the following command to print out (-o) 50 records (-m 50), skipping the
+first 50 (-s 50).  
+<codeblock>unit-trace -m 50 -s 50 -o *.bin</codeblock>.  
+She is able to see that meaningful releases begin at time `37917282934190`.
+
+She now commences G-EDF testing (-g), starting at the time of interest (-e <earliest time>).
+Because of the lengthy output to be expected, she redirects standard out to a file.
+She also uses the -c option to clean up additional records that are known to be erroneous and
+would otherwise break the G-EDF tester.  
+<codeblock>unit-trace -c -e 37917282934190 -g -o *.bin > output</codeblock>.
+
+OK, everything worked. Alice can now grep through the output file and see priority inversions.
+She sees a particularly long priority inversion, and decides to generate a visualization (-v) of part of the schedule.  
+<codeblock>unit-trace -c -e 37918340000000 -l 37919000000000 -v *.bin</codeblock>.  
+(NOTE: Currently, this still shows the entire schedule, which likely won't be feasible for larger traces and is a bug.)
 
 ## Submodules ##
-TODO: All submodules will be documented thoroughly here.
+### Input Submodules ###
+<table border=1>
+<tr><td>Name</td><td>Flag</td><td>Options</td><td>Description</td></tr>
+<tr><td>trace_parser</td><td>(on by default)</td><td>(None)</td><td>Parses LITMUS<sup>RT</sup> traces</td></tr>
+</table>
+### Intermediate Submodules ###
+<table border=1>
+<tr><td>Name</td><td>Flag</td><td>Options</td><td>Description</td></tr>
+<tr><td>earliest</td><td>-e</td><td>time</td><td>Filters out records before the given time</td></tr>
+<tr><td>latest</td><td>-l</td><td>time</td><td>Filters out records after the given time</td></tr>
+<tr><td>skipper</td><td>-s</td><td>number n</td><td>Skips the first n records</td></tr>
+<tr><td>maxer</td><td>-m</td><td>number n</td><td>Allows at most n records to be parsed</td></tr>
+<tr><td>sanitizer</td><td>-c</td><td>(None)</td><td>Cleans up LITMUS<sup>RT</sup> traces for G-EDF testing.</td></tr>
+<tr><td>progress</td><td>-p</td><td>(None)</td><td>Outputs progress info (e.g. number of records parsed so far, total time to process trace) to standard error.</td></tr>
+<tr><td>stats</td><td>-i</td><td>(None)</td><td>Outputs statistics about G-EDF inversions. To be deprecated (incorporated into G-EDF tester).</td></tr>
+<tr><td>gedf_test</td><td>-g</td><td>(None)</td><td>Performs G-EDF testing.</td></tr>
+</table>
+### Output Submodules ###
+<table border=1>
+<tr><td>Name</td><td>Flag</td><td>Options</td><td>Description</td></tr>
+<tr><td>stdout_printer</td><td>-o</td><td>(None)</td><td>Prints records to standard out</td></tr>
+<tr><td>visualizer</td><td>-v</td><td>(None)</td><td>Visualizes records</td></tr>
+</table>
 
 ## Development ##
 TODO: Information on how to contribute will be documented thoroughly here.
-- 
cgit v1.2.2