aboutsummaryrefslogtreecommitdiffstats
path: root/tools
diff options
context:
space:
mode:
authorFrederic Weisbecker <fweisbec@gmail.com>2009-09-13 21:01:12 -0400
committerIngo Molnar <mingo@elte.hu>2009-09-14 09:45:10 -0400
commitaa1ab9d26ae9fe2566a9036e3cb83e7d555b3987 (patch)
treebd9a25fdf5e53d29b5c062a3b8291fbafeedd35a /tools
parentd13025222cdb0043e2239b3b819389358bb31bc0 (diff)
perf tools: Fix processing of randomly serialized sched traces
Currently it's possible to meet such too high latency results with 'perf sched latency'. ----------------------------------------------------------------------------------- Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | ----------------------------------------------------------------------------------- xfce4-panel | 0.222 ms | 2 | avg: 4718.345 ms | max: 9436.493 ms | scsi_eh_3 | 3.962 ms | 36 | avg: 55.957 ms | max: 1977.829 ms | The origin is on traces that are sometimes badly serialized across cpus. For example the raw traces that raised such results for xfce4-panel: (1) [init]-0 [000] 1494.663899990: sched_switch: task swapper:0 [140] (R) ==> xfce4-panel:4569 [120] (2) xfce4-panel-4569 [000] 1494.663928373: sched_switch: task xfce4-panel:4569 [120] (S) ==> swapper:0 [140] (3) Xorg-4276 [001] 1494.663860125: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (4) Xorg-4276 [001] 1504.098252756: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (5) perf-5219 [000] 1504.100353302: sched_switch: task perf:5219 [120] (S) ==> xfce4-panel:4569 [120] The traces are processed in the order they arrive. Then in (2), xfce4-panel sleeps, it is first waken up in (3) and eventually scheduled in (5). The latency reported is then 1504 - 1495 = 9 secs, as reported by perf sched. But this is wrong, we are confident in the fact the traces are nicely serialized while we should actually more trust the timestamps. If we reorder by timestamps we get: (1) Xorg-4276 [001] 1494.663860125: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (2) [init]-0 [000] 1494.663899990: sched_switch: task swapper:0 [140] (R) ==> xfce4-panel:4569 [120] (3) xfce4-panel-4569 [000] 1494.663928373: sched_switch: task xfce4-panel:4569 [120] (S) ==> swapper:0 [140] (4) Xorg-4276 [001] 1504.098252756: sched_wakeup: task xfce4-panel:4569 [120] success=1 [000] (5) perf-5219 [000] 1504.100353302: sched_switch: task perf:5219 [120] (S) ==> xfce4-panel:4569 [120] Now the trace make more sense, xfce4-panel is sleeping. Then it is woken up in (1), scheduled in (2) It goes to sleep in (3), woken up in (4) and scheduled in (5). Now, latency captured between (1) and (2) is of 39 us. And between (4) and (5) it is 2.1 ms. Such pattern of bad serializing is the origin of the high latencies reported by perf sched. Basically, we need to check whether wake up time is higher than schedule out time. If it's not the case, we need to tag the current work atom as invalid. Beside that, we may need to work later on a better ordering of the traces given by the kernel. After this patch: xfce4-session | 0.221 ms | 1 | avg: 0.538 ms | max: 0.538 ms | Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Diffstat (limited to 'tools')
-rw-r--r--tools/perf/builtin-sched.c8
1 files changed, 7 insertions, 1 deletions
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 686af633b35b..3e003237c42f 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -129,6 +129,7 @@ enum thread_state {
129struct work_atom { 129struct work_atom {
130 struct list_head list; 130 struct list_head list;
131 enum thread_state state; 131 enum thread_state state;
132 u64 sched_out_time;
132 u64 wake_up_time; 133 u64 wake_up_time;
133 u64 sched_in_time; 134 u64 sched_in_time;
134 u64 runtime; 135 u64 runtime;
@@ -988,9 +989,11 @@ lat_sched_out(struct task_atoms *atoms,
988 if (!atom) 989 if (!atom)
989 die("Non memory"); 990 die("Non memory");
990 991
992 atom->sched_out_time = timestamp;
993
991 if (sched_out_state(switch_event) == 'R') { 994 if (sched_out_state(switch_event) == 'R') {
992 atom->state = THREAD_WAIT_CPU; 995 atom->state = THREAD_WAIT_CPU;
993 atom->wake_up_time = timestamp; 996 atom->wake_up_time = atom->sched_out_time;
994 } 997 }
995 998
996 atom->runtime = delta; 999 atom->runtime = delta;
@@ -1106,6 +1109,9 @@ latency_wakeup_event(struct trace_wakeup_event *wakeup_event,
1106 if (atom->state != THREAD_SLEEPING) 1109 if (atom->state != THREAD_SLEEPING)
1107 return; 1110 return;
1108 1111
1112 if (atom->sched_out_time > timestamp)
1113 return;
1114
1109 atom->state = THREAD_WAIT_CPU; 1115 atom->state = THREAD_WAIT_CPU;
1110 atom->wake_up_time = timestamp; 1116 atom->wake_up_time = timestamp;
1111} 1117}