Improve behavior when out-of-order record detected

To sort records from different files into a single stream (in timestamp order), unit-trace uses per-CPU buffers whose length was previously hard-coded. Before this commit, if that length proved insufficient (resulting in out-of-order records), unit-trace produced a FATAL ERROR and terminated. Now, unit-trace merely keeps track of all out-of-order records and prints a warning at the end, listing them.

The motivation for this change was the observation that, at least sometimes, grossly out-of-order records occur at the very beginning of the trace (e.g. task system release), so they don't really matter. If we know the IDs of the out-of-order records, we can check (with the -o output) whether their misordering actually matters.

Moreover, the buffer size can now be specified with -b; the previously hard-coded value (200) is the default. Making this number smaller greatly improves runtime, and making it larger has the opposite effect.

I suspect that further investigation into the problem of sorting records will show that the current method is overkill; down the road, we may be able to replace it with something much faster. (The current method has the advantage of being extremely scalable, but I don't think that pays off for the size of traces we typically examine.)
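
The trade-off described above can be illustrated with a rough sketch. This is not unit-trace's actual sorting code: it swaps the per-CPU buffers for a single bounded min-heap, and the `Record` type and `sort_with_window` function are hypothetical names invented for the example (Python 3). It only shows the general idea of a fixed-size reordering buffer that records late arrivals instead of aborting.

```python
import heapq
from collections import namedtuple

# Hypothetical record type; real unit-trace records carry more fields.
Record = namedtuple("Record", ["when", "id"])


def sort_with_window(records, bufsize=200):
    """Reorder records by timestamp using a bounded min-heap.

    Returns (ordered, out_of_order_ids). A record is counted as out of
    order when it is emitted with a timestamp earlier than one that was
    already emitted, i.e. it arrived too late for the buffer to fix.
    """
    heap = []
    ordered = []
    out_of_order_ids = []
    last_when = None

    def emit():
        nonlocal last_when
        when, _, rec = heapq.heappop(heap)
        if last_when is not None and when < last_when:
            # Too late to place correctly: remember it instead of aborting.
            out_of_order_ids.append(rec.id)
        else:
            last_when = when
        ordered.append(rec)

    for seq, rec in enumerate(records):
        # seq breaks timestamp ties so Record objects are never compared.
        heapq.heappush(heap, (rec.when, seq, rec))
        if len(heap) > bufsize:
            emit()
    while heap:
        emit()
    return ordered, out_of_order_ids


trace = [Record(5, 1), Record(6, 2), Record(7, 3), Record(1, 4), Record(8, 5)]
small_ordered, small_bad = sort_with_window(trace, bufsize=2)
large_ordered, large_bad = sort_with_window(trace, bufsize=4)
```

In this sketch, the record with timestamp 1 arrives too late for a buffer of 2 and is reported by ID, while a buffer of 4 absorbs it and produces a fully sorted stream. A larger buffer tolerates later arrivals at the cost of more work per record.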
author: Mac Mollison <mollison@cs.unc.edu> 2010-12-15 09:53:23 -0500
committer: Mac Mollison <mollison@cs.unc.edu> 2010-12-15 09:53:23 -0500
commit: c364f1d807eeb246ca67184246fd2c8d7933b8b6 (patch)
tree: 4146a91547d660e055a4d60f023fe7879075d11a /unit-trace
parent: 51e246d367d043913a882080abde3d8bae5ce4d4 (diff)
1 files changed, 9 insertions, 3 deletions
diff --git a/unit-trace b/unit-trace
index 5362113..15ba636 100755
--- a/unit-trace
+++ b/unit-trace
@@ -35,6 +35,8 @@ parser.add_option("-e", "--earliest", default=0, type=int, dest="earliest",
    help="Earliest timestamp of interest")
 parser.add_option("-l", "--latest", default=0, type=int, dest="latest",
    help="Latest timestamp of interest")
+parser.add_option("-b", "--bufsize", dest="buffsize", default=200, type=int,
+    help="Per-CPU buffer size for sorting records")
 (options, traces) = parser.parse_args()
 traces = list(traces)
 if len(traces) < 1:
@@ -50,7 +52,7 @@ import unit_trace
 # Read events from traces
 from unit_trace import trace_reader
-stream = trace_reader.trace_reader(traces)
+stream = trace_reader.trace_reader(traces, options.buffsize)
 # Skip over records
 if options.skipnum > 0:
@@ -100,7 +102,7 @@ if options.gedf is True:
 # This might cause a performance bottleneck that could be eliminated by
 # checking how many we actually need :-)
 import itertools
-stream1, stream2, stream3 = itertools.tee(stream,3)
+stream1, stream2, stream3, stream4 = itertools.tee(stream,4)
 # Call standard out printer
 if options.stdout is True:
@@ -117,7 +119,11 @@ if options.num_inversions > -1:
        from unit_trace import gedf_inversion_stat_printer
        gedf_inversion_stat_printer.gedf_inversion_stat_printer(stream2,options.num_inversions)
+# Print any warnings
+from unit_trace import warning_printer
+warning_printer.warning_printer(stream3)
 # Call visualizer
 if options.visualize is True:
    from unit_trace import viz
-    viz.visualizer.visualizer(stream3, options.time_per_maj)
+    viz.visualizer.visualizer(stream4, options.time_per_maj)
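
The patch adds a fourth branch to the `itertools.tee` call so that the warning printer gets its own copy of the stream. As a minimal illustration of how `tee` behaves (Python 3; the `records` generator is a hypothetical stand-in for the `trace_reader` stream, not unit-trace code):

```python
import itertools


def records():
    # Hypothetical stand-in for the stream produced by trace_reader.
    for when in (10, 20, 30):
        yield {"when": when}


# tee returns independent iterators over the same underlying stream.
# The original generator should not be used directly after this call.
stream1, stream2 = itertools.tee(records(), 2)

# Each branch sees every record, even when consumed one after the other.
first_pass = [r["when"] for r in stream1]
second_pass = [r["when"] for r in stream2]
```

When one branch is fully consumed before the next one starts, `tee` has to hold the items in memory for the branches that have not caught up yet. That is consistent with the comment in the diff about a possible performance bottleneck.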