diff options
author | Mac Mollison <mollison@cs.unc.edu> | 2010-12-15 09:53:23 -0500 |
---|---|---|
committer | Mac Mollison <mollison@cs.unc.edu> | 2010-12-15 09:53:23 -0500 |
commit | c364f1d807eeb246ca67184246fd2c8d7933b8b6 (patch) | |
tree | 4146a91547d660e055a4d60f023fe7879075d11a | |
parent | 51e246d367d043913a882080abde3d8bae5ce4d4 (diff) |
Improve behavior when out-of-order record detected
In order to sort records from different files into a single
stream (in order of timestamp), unit-trace uses per-cpu buffers;
the length of the buffers was previously hardcoded. If that
length proved insufficient (resulting in out-of-order records),
before this commit, unit-trace produced a FATAL ERROR and
terminated.
Now, unit-trace merely keeps a record of all out-of-order
records and prints a warning at the end, listing them.
The motivation for this change was the observation that
at least sometimes, grossly out-of-order records were at
the very beginning of the trace (e.g. task system release),
so they don't really matter. If we know the IDs of the records
that are unordered, we are able to check (with the -o output)
to see if their misordering actually matters or not.
Moreover, the buffer size can now be specified with -b, and
the previously hard-coded value (200) is the default. Making
this number smaller greatly improves runtime, and vice versa.
I suspect that further investigation into the problem of sorting
records will show that the current method is overkill; down the
road, we may be able to replace this method with something much
faster. (The current method has the advantage that it is extremely
scalable, but I don't think that pays off for the size of traces
we typically examine.)
-rwxr-xr-x | unit-trace | 12 | ||||
-rw-r--r-- | unit_trace/trace_reader.py | 32 | ||||
-rw-r--r-- | unit_trace/warning_printer.py | 26 |
3 files changed, 52 insertions, 18 deletions
@@ -35,6 +35,8 @@ parser.add_option("-e", "--earliest", default=0, type=int, dest="earliest", | |||
35 | help="Earliest timestamp of interest") | 35 | help="Earliest timestamp of interest") |
36 | parser.add_option("-l", "--latest", default=0, type=int, dest="latest", | 36 | parser.add_option("-l", "--latest", default=0, type=int, dest="latest", |
37 | help="Latest timestamp of interest") | 37 | help="Latest timestamp of interest") |
38 | parser.add_option("-b", "--bufsize", dest="buffsize", default=200, type=int, | ||
39 | help="Per-CPU buffer size for sorting records") | ||
38 | (options, traces) = parser.parse_args() | 40 | (options, traces) = parser.parse_args() |
39 | traces = list(traces) | 41 | traces = list(traces) |
40 | if len(traces) < 1: | 42 | if len(traces) < 1: |
@@ -50,7 +52,7 @@ import unit_trace | |||
50 | 52 | ||
51 | # Read events from traces | 53 | # Read events from traces |
52 | from unit_trace import trace_reader | 54 | from unit_trace import trace_reader |
53 | stream = trace_reader.trace_reader(traces) | 55 | stream = trace_reader.trace_reader(traces, options.buffsize) |
54 | 56 | ||
55 | # Skip over records | 57 | # Skip over records |
56 | if options.skipnum > 0: | 58 | if options.skipnum > 0: |
@@ -100,7 +102,7 @@ if options.gedf is True: | |||
100 | # This might cause a performance bottleneck that could be eliminated by | 102 | # This might cause a performance bottleneck that could be eliminated by |
101 | # checking how many we actually need :-) | 103 | # checking how many we actually need :-) |
102 | import itertools | 104 | import itertools |
103 | stream1, stream2, stream3 = itertools.tee(stream,3) | 105 | stream1, stream2, stream3, stream4 = itertools.tee(stream,4) |
104 | 106 | ||
105 | # Call standard out printer | 107 | # Call standard out printer |
106 | if options.stdout is True: | 108 | if options.stdout is True: |
@@ -117,7 +119,11 @@ if options.num_inversions > -1: | |||
117 | from unit_trace import gedf_inversion_stat_printer | 119 | from unit_trace import gedf_inversion_stat_printer |
118 | gedf_inversion_stat_printer.gedf_inversion_stat_printer(stream2,options.num_inversions) | 120 | gedf_inversion_stat_printer.gedf_inversion_stat_printer(stream2,options.num_inversions) |
119 | 121 | ||
122 | # Print any warnings | ||
123 | from unit_trace import warning_printer | ||
124 | warning_printer.warning_printer(stream3) | ||
125 | |||
120 | # Call visualizer | 126 | # Call visualizer |
121 | if options.visualize is True: | 127 | if options.visualize is True: |
122 | from unit_trace import viz | 128 | from unit_trace import viz |
123 | viz.visualizer.visualizer(stream3, options.time_per_maj) | 129 | viz.visualizer.visualizer(stream4, options.time_per_maj) |
diff --git a/unit_trace/trace_reader.py b/unit_trace/trace_reader.py index a4c3c05..db36c13 100644 --- a/unit_trace/trace_reader.py +++ b/unit_trace/trace_reader.py | |||
@@ -34,7 +34,7 @@ import sys | |||
34 | ############################################################################### | 34 | ############################################################################### |
35 | 35 | ||
36 | # Generator function returning an iterable over records in a trace file. | 36 | # Generator function returning an iterable over records in a trace file. |
37 | def trace_reader(files): | 37 | def trace_reader(files, buffsize): |
38 | 38 | ||
39 | # Yield a record containing the input files | 39 | # Yield a record containing the input files |
40 | # This is used by progress.py to calculate progress | 40 | # This is used by progress.py to calculate progress |
@@ -66,17 +66,15 @@ def trace_reader(files): | |||
66 | # Forget that file iter | 66 | # Forget that file iter |
67 | file_iters.pop() | 67 | file_iters.pop() |
68 | 68 | ||
69 | # We keep 200 records in each buffer and then keep the buffer sorted | 69 | # We keep some number of records (given as buffsize) in each buffer and |
70 | # This is because records may have been recorded slightly out of order | 70 | # then keep the buffer sorted. This is because records may have been |
71 | # This cannot guarantee records are produced in order, but it makes it | 71 | # recorded slightly out of order. The 'try' and 'except' catches the case |
72 | # overwhelmingly probably. | 72 | # where there are less than buffsize records in a file (throwing a |
73 | # The 'try' and 'except' catches the case where there are less than 200 | 73 | # StopIteration) which otherwise would propogate up and cause the |
74 | # records in a file (throwing a StopIteration) which otherwise would | 74 | # trace_reader generator itself to throw a StopIteration. |
75 | # propogate up and cause the trace_reader generator itself to throw a | ||
76 | # StopIteration. | ||
77 | for x in range(0,len(file_iter_buff)): | 75 | for x in range(0,len(file_iter_buff)): |
78 | try: | 76 | try: |
79 | for y in range(0,200): | 77 | for y in range(0,buffsize): |
80 | file_iter_buff[x].append(file_iters[x].next()) | 78 | file_iter_buff[x].append(file_iters[x].next()) |
81 | except StopIteration: | 79 | except StopIteration: |
82 | pass | 80 | pass |
@@ -119,16 +117,20 @@ def trace_reader(files): | |||
119 | del file_iter_buff[buff_to_refill] | 117 | del file_iter_buff[buff_to_refill] |
120 | del file_iters[buff_to_refill] | 118 | del file_iters[buff_to_refill] |
121 | 119 | ||
120 | # Give the record an id number | ||
121 | id += 1 | ||
122 | earliest.id = id | ||
123 | |||
122 | # Check for monotonically increasing time | 124 | # Check for monotonically increasing time |
123 | if last_time is not None and earliest.when < last_time: | 125 | if last_time is not None and earliest.when < last_time: |
124 | exit("FATAL ERROR: trace_reader.py: out-of-order record produced") | 126 | record = Obj() |
127 | record.record_type = "meta" | ||
128 | record.type_name = "out_of_order_warning" | ||
129 | record.id = earliest.id | ||
130 | yield record | ||
125 | else: | 131 | else: |
126 | last_time = earliest.when | 132 | last_time = earliest.when |
127 | 133 | ||
128 | # Give the record an id number | ||
129 | id += 1 | ||
130 | earliest.id = id | ||
131 | |||
132 | # Yield the record | 134 | # Yield the record |
133 | yield earliest | 135 | yield earliest |
134 | 136 | ||
diff --git a/unit_trace/warning_printer.py b/unit_trace/warning_printer.py new file mode 100644 index 0000000..0f726ff --- /dev/null +++ b/unit_trace/warning_printer.py | |||
@@ -0,0 +1,26 @@ | |||
1 | ############################################################################### | ||
2 | # Description | ||
3 | ############################################################################### | ||
4 | |||
5 | # Display any warnings | ||
6 | |||
7 | ############################################################################### | ||
8 | # Imports | ||
9 | ############################################################################### | ||
10 | |||
11 | import sys | ||
12 | |||
13 | ############################################################################### | ||
14 | # Public functions | ||
15 | ############################################################################### | ||
16 | |||
17 | def warning_printer(stream): | ||
18 | out_of_order_ids = [] | ||
19 | for record in stream: | ||
20 | if (record.record_type == "meta" and | ||
21 | record.type_name == "out_of_order_warning"): | ||
22 | out_of_order_ids.append(record.id) | ||
23 | if len(out_of_order_ids) > 0: | ||
24 | sys.stderr.write( | ||
25 | "WARNING: The following {0} records were out of order:\n{1}\n".format( | ||
26 | len(out_of_order_ids), out_of_order_ids)) | ||