path: root/kernel/sched_debug.c
Commit message (author, date)
* sched: improve affine wakeups (Ingo Molnar, 2008-03-18)
  improve affine wakeups. Maintain the 'overlap' metric based on CFS's sum_exec_runtime - which means the amount of time a task executes after it wakes up some other task.

  Use the 'overlap' for the wakeup decisions: if the 'overlap' is short, it means there's strong workload coupling between this task and the woken up task. If the 'overlap' is large then the workload is decoupled and the scheduler will move them to separate CPUs more easily.

  ( Also slightly move the preempt_check within try_to_wake_up() - this has no effect on functionality but allows 'early wakeups' (for still-on-rq tasks) to be correctly accounted as well.)

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
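The 'overlap' bookkeeping described in this commit message can be sketched in user-space C. This is only an illustration under assumptions: the struct, the function names, the 1/8 averaging weight and the threshold parameter are hypothetical and are not taken from the commit itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task bookkeeping; field names follow the commit message. */
struct task_sketch {
    uint64_t sum_exec_runtime; /* total ns the task has executed */
    uint64_t last_wakeup;      /* sum_exec_runtime when it last woke another task */
    uint64_t avg_overlap;      /* running average of the overlap, in ns */
};

/* Move the average one eighth of the way towards the new sample (assumed weight). */
static void update_avg(uint64_t *avg, uint64_t sample)
{
    int64_t diff = (int64_t)sample - (int64_t)*avg;

    *avg = (uint64_t)((int64_t)*avg + diff / 8);
}

/* Called when the task wakes up some other task. */
static void note_wakeup(struct task_sketch *t)
{
    t->last_wakeup = t->sum_exec_runtime;
}

/* Called when the task stops running: the overlap is how long it kept
 * executing after the wakeup it issued. */
static void note_stop(struct task_sketch *t)
{
    update_avg(&t->avg_overlap, t->sum_exec_runtime - t->last_wakeup);
}

/* Short overlap on both sides suggests tightly coupled tasks, so keeping
 * them on the same CPU is preferred (threshold is an assumed tunable). */
static bool prefer_same_cpu(const struct task_sketch *waker,
                            const struct task_sketch *wakee,
                            uint64_t threshold_ns)
{
    return waker->avg_overlap < threshold_ns &&
           wakee->avg_overlap < threshold_ns;
}
```

A running average is used here so that a single unusually long or short run after a wakeup does not flip the decision on its own.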
* sched: keep total / count stats in addition to the max for (Arjan van de Ven, 2008-01-25)
  Right now, the linux kernel (with scheduler statistics enabled) keeps track of the maximum time a process is waiting to be scheduled. While the maximum is a very useful metric, tracking average and total is equally useful (at least for latencytop) to figure out the accumulated effect of scheduler delays. The accumulated effect is important to judge the performance impact of scheduler tuning/behavior.

  Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
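As a rough illustration of why a total and a count are kept next to the maximum, the following user-space sketch derives an average from them. The struct and function names are hypothetical and are not the kernel's schedstat fields.

```c
#include <stdint.h>

/* Hypothetical wait-time statistics: maximum plus total and count. */
struct wait_stats_sketch {
    uint64_t wait_max;   /* longest single wait, in ns */
    uint64_t wait_sum;   /* accumulated wait time, in ns */
    uint64_t wait_count; /* number of waits recorded */
};

/* Record one scheduling delay. */
static void record_wait(struct wait_stats_sketch *s, uint64_t delta_ns)
{
    if (delta_ns > s->wait_max)
        s->wait_max = delta_ns;
    s->wait_sum += delta_ns;
    s->wait_count++;
}

/* The average is derived on demand; zero waits yields zero. */
static uint64_t average_wait(const struct wait_stats_sketch *s)
{
    return s->wait_count ? s->wait_sum / s->wait_count : 0;
}
```

Storing the sum and count rather than a precomputed average keeps the hot path to an add and an increment, and leaves the division to whoever reads the statistics.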
* sched: monitor clock underflows in /proc/sched_debug (Guillaume Chazarain, 2008-01-25)
  We monitor clock overflows, let's also monitor clock underflows.

  Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: fix gcc warnings (Ingo Molnar, 2007-12-30)
  Meelis Roos reported these warnings on sparc64:

    CC kernel/sched.o
    In file included from kernel/sched.c:879:
    kernel/sched_debug.c: In function 'nsec_high':
    kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast

  the debug check in do_div() is over-eager here, because the long long is always positive in these places. Mark this by casting them to unsigned long long.

  no change in code output:

     text    data     bss     dec     hex filename
    51471    6582     376   58429    e43d sched.o.before
    51471    6582     376   58429    e43d sched.o.after

  md5:
    7f7729c111f185bf3ccea4d542abc049  sched.o.before.asm
    7f7729c111f185bf3ccea4d542abc049  sched.o.after.asm

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: clean up overlong line in kernel/sched_debug.c (Ingo Molnar, 2007-11-28)
  clean up overlong line in kernel/sched_debug.c.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: bump version of kernel/sched_debug.c (Ingo Molnar, 2007-11-26)
  bump version of kernel/sched_debug.c and remove CFS version information from it.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: reintroduce the sched_min_granularity tunable (Peter Zijlstra, 2007-11-09)
  we lost the sched_min_granularity tunable to a clever optimization that uses the sched_latency/min_granularity ratio - but the ratio is quite unintuitive to users and can also crash the kernel if the ratio is set to 0.

  So reintroduce the min_granularity tunable, while keeping the ratio maintained internally.

  no functionality changed.

  [ mingo@elte.hu: some fixlets. ]

  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
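One way to picture "keeping the ratio maintained internally" is sketched below in user-space C. Everything here is an assumption made for illustration: the struct, the function names, the rejection of a zero granularity, and the way the period is stretched once more tasks are runnable than the ratio allows.

```c
#include <stdint.h>

/* Hypothetical tunables: users set latency and min_granularity,
 * the ratio nr_latency is derived and never set directly. */
struct tunables_sketch {
    uint64_t latency_ns;
    uint64_t min_granularity_ns;
    unsigned int nr_latency; /* latency / min_granularity, kept >= 1 */
};

/* Update the user-visible tunable and recompute the internal ratio.
 * A zero granularity is rejected so the ratio can never divide by zero. */
static int set_min_granularity(struct tunables_sketch *t, uint64_t ns)
{
    if (ns == 0)
        return -1;
    t->min_granularity_ns = ns;
    t->nr_latency = (unsigned int)(t->latency_ns / ns);
    if (t->nr_latency == 0)
        t->nr_latency = 1;
    return 0;
}

/* Assumed use of the ratio: with few tasks the period equals the latency
 * target; beyond nr_latency tasks each one still gets min_granularity. */
static uint64_t period_for(const struct tunables_sketch *t,
                           unsigned int nr_running)
{
    if (nr_running > t->nr_latency)
        return t->min_granularity_ns * nr_running;
    return t->latency_ns;
}
```

Exposing the granularity and deriving the ratio means a user-supplied value can be validated once, at write time, instead of being trusted on every scheduling decision.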
* sched: fix unconditional irq lock (Peter Zijlstra, 2007-10-25)
  Lockdep noticed that this lock can also be taken from hardirq context, and can thus not unconditionally disable/enable irqs.

    WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
    [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
    [show_trace+18/32] show_trace+0x12/0x20
    [dump_stack+22/32] dump_stack+0x16/0x20
    [trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
    [_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
    [sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
    [show_state_filter+326/368] show_state_filter+0x146/0x170
    [sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
    [__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
    [handle_sysrq+40/64] handle_sysrq+0x28/0x40
    [kbd_event+1045/1680] kbd_event+0x415/0x690
    [input_pass_event+206/208] input_pass_event+0xce/0xd0
    [input_handle_event+170/928] input_handle_event+0xaa/0x3a0
    [input_event+95/112] input_event+0x5f/0x70
    [atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
    [serio_interrupt+59/128] serio_interrupt+0x3b/0x80
    [i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
    [handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
    [handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
    [do_IRQ+64/128] do_IRQ+0x40/0x80
    [common_interrupt+46/52] common_interrupt+0x2e/0x34

  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
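The problem lockdep reports here can be modelled in a few lines of user-space C. This is a toy model, not kernel code: a single boolean stands in for the CPU's interrupt-enable state, no real locking is done, and all `model_*` names are hypothetical. It contrasts an unconditional disable/enable pair with a save/restore pair, which is the usual remedy for this class of bug.

```c
#include <stdbool.h>

/* Modelled interrupt-enable state of the current CPU. */
static bool irqs_enabled = true;

/* Unconditional variant: always disables on lock, always enables on unlock. */
static void model_lock_irq(void)
{
    irqs_enabled = false;
}

static void model_unlock_irq(void)
{
    irqs_enabled = true; /* wrong if the caller entered with irqs off */
}

/* Save/restore variant: remembers the state found on entry. */
static bool model_lock_irqsave(void)
{
    bool flags = irqs_enabled;

    irqs_enabled = false;
    return flags;
}

static void model_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags; /* back to whatever the caller had */
}
```

When the caller is already running with interrupts disabled, as in the hardirq path shown in the trace, the unconditional pair leaves interrupts enabled on exit, while the save/restore pair leaves them as it found them.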
* sched: reduce schedstat variable overhead a bit (Ken Chen, 2007-10-18)
  schedstat is useful in investigating CPU scheduler behavior. Ideally, I think it is beneficial to have it on all the time. However, the cost of turning it on in a production system is quite high, largely due to the number of events it collects and also due to its large memory footprint.

  Most of the fields probably don't need to be full 64-bit on 64-bit arch. Rolling over 4 billion events will most likely take a long time and user space tools can be made to accommodate that. I'm proposing that the kernel cut back most of the variable width on 64-bit systems. (note, the following patch doesn't affect 32-bit systems).

  Signed-off-by: Ken Chen <kenchen@google.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
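The trade-off described here, narrower counters in exchange for possible rollover, can be illustrated with a small user-space sketch. The struct layouts and the helper name are hypothetical; the point is only that an unsigned subtraction lets a reader of the statistics cope with a single wrap between two samples.

```c
#include <stdint.h>

/* Hypothetical counter blocks: the same three event counters,
 * once at 64-bit width and once at 32-bit width. */
struct stats_wide_sketch {
    uint64_t yld_count, sched_count, ttwu_count;
};

struct stats_narrow_sketch {
    uint32_t yld_count, sched_count, ttwu_count;
};

/* Events between two samples of a 32-bit counter. Unsigned arithmetic
 * is modulo 2^32, so one rollover between samples is handled. */
static uint32_t counter_delta(uint32_t prev, uint32_t now)
{
    return (uint32_t)(now - prev);
}
```

A monitoring tool that samples often enough that fewer than 2^32 events occur between samples never sees an incorrect delta, which is the accommodation the commit message alludes to.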
* Make scheduler debug file operations const (Arjan van de Ven, 2007-10-15)
  In general, struct file_operations are const in the kernel, to not have false cacheline sharing and to catch bugs at compiletime with accidental writes to them. The new scheduler code introduces a new non-const one; fix this up.

  Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: debug, improve migration statistics (Ingo Molnar, 2007-10-15)
  add new migration statistics when SCHED_DEBUG and SCHEDSTATS are enabled. Available in /proc/<PID>/sched.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: debug: increase width of debug line (Ingo Molnar, 2007-10-15)
  increase width of debug line - in preparation of more debugging info.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: group scheduling, sysfs tunables (Dhaval Giani, 2007-10-15)
  Add tunables in sysfs to modify a user's cpu share.

  A directory is created in sysfs for each new user in the system.

    /sys/kernel/uids/<uid>/cpu_share

  Reading this file returns the cpu shares granted for the user. Writing into this file modifies the cpu share for the user. Only an administrator is allowed to modify a user's cpu share.

  Ex:
    # cd /sys/kernel/uids/
    # cat 512/cpu_share
    1024
    # echo 2048 > 512/cpu_share
    # cat 512/cpu_share
    2048
    #

  Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
  Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
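The shell session in this commit message could also be driven from a program. The sketch below uses only standard C I/O and the path layout quoted in the message; the helper names are hypothetical, and whether the file exists on a given kernel is not assumed (both helpers simply fail if it cannot be opened).

```c
#include <stddef.h>
#include <stdio.h>

/* Build "/sys/kernel/uids/<uid>/cpu_share"; returns -1 if buf is too small. */
static int cpu_share_path(char *buf, size_t len, unsigned int uid)
{
    int n = snprintf(buf, len, "/sys/kernel/uids/%u/cpu_share", uid);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the share value from the given file; returns 0 on success. */
static int read_cpu_share(const char *path, unsigned long *share)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = fscanf(f, "%lu", share) == 1;
    fclose(f);
    return ok ? 0 : -1;
}

/* Write a new share value; per the commit message this is expected to
 * succeed only for an administrator. Returns 0 on success. */
static int write_cpu_share(const char *path, unsigned long share)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fprintf(f, "%lu\n", share) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Checking the result of `fclose()` on the write path matters because a buffered write may only be attempted when the stream is closed, so a rejected write can surface there.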
* sched: cleanup: rename task_grp to task_group (Ingo Molnar, 2007-10-15)
  cleanup: rename task_grp to task_group. No need to save two characters and 'grp' is annoying to read.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: group scheduler, fix coding style issues (Srivatsa Vaddagiri, 2007-10-15)
  Fix coding style issues reported by Randy Dunlap and others

  Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
  Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: speed up and simplify vslice calculations (Peter Zijlstra, 2007-10-15)
  speed up and simplify vslice calculations.

  [ From: Mike Galbraith <efault@gmx.de>: build fix ]

  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: clean up schedstats, cnt -> count (Ingo Molnar, 2007-10-15)
  rename all 'cnt' fields and variables to the less yucky 'count' name.

  yuckage noticed by Andrew Morton.

  no change in code, other than the /proc/sched_debug bkl_count string got a bit larger:

     text    data     bss     dec     hex filename
    38236    3506      24   41766    a326 sched.o.before
    38240    3506      24   41770    a32a sched.o.after

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched debug: check spread (Peter Zijlstra, 2007-10-15)
  debug feature: check how well we schedule within a reasonable vruntime 'spread' range. (note that CPU overload can increase the spread, so this is not a hard condition, but normal loads should be within the spread.)

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* sched debug: more width for parameter printouts (Ingo Molnar, 2007-10-15)
  more width for parameter printouts in /proc/sched_debug.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched debug: print settings (Ingo Molnar, 2007-10-15)
  print the current value of all tunables in /proc/sched_debug output.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched debug: BKL usage statistics, fix (S.Caglar Onur, 2007-10-15)
  build fix for the SCHED_DEBUG && !SCHEDSTATS case.

  Signed-off-by: S.Ceglar Onur <caglar@pardus.org.tr>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched debug: BKL usage statistics (Ingo Molnar, 2007-10-15)
  add per task and per rq BKL usage statistics.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: add fair-user scheduler (Srivatsa Vaddagiri, 2007-10-15)
  Enable user-id based fair group scheduling. This is useful for anyone who wants to test the group scheduler w/o having to enable CONFIG_CGROUPS.

  A separate scheduling group (i.e. struct task_grp) is automatically created for every new user added to the system. Upon uid change for a task, it is made to move to the corresponding scheduling group.

  A /proc tunable (/proc/root_user_share) is also provided to tune root user's quota of cpu bandwidth.

  Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
  Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: print nr_running and load in /proc/sched_debug (Srivatsa Vaddagiri, 2007-10-15)
  - print nr_running and load information for cfs_rq in /proc/sched_debug

  Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
  Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: fix formatting of /proc/sched_debug (Mike Galbraith, 2007-10-15)
  fix formatting of /proc/sched_debug

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: enhance debug output (Ingo Molnar, 2007-10-15)
  enhance debug output by changing 12345678 nsecs to 12.345678 output, this is more human-readable.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
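The 12345678 to 12.345678 conversion amounts to splitting a nanosecond count at 10^6. A user-space sketch follows; the function name and the explicit sign handling are choices made for this example, not a copy of the kernel's helpers.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a signed nanosecond count as "<whole>.<six digits>", i.e. split
 * at 10^6 so that 12345678 becomes "12.345678". The sign is printed
 * separately so values between -1000000 and 0 keep their minus sign.
 * Returns what snprintf() returns. */
static int format_nsec(char *buf, size_t len, long long nsec)
{
    /* Magnitude computed in unsigned arithmetic to avoid overflow on
     * the most negative value. */
    unsigned long long mag = nsec < 0 ? 0ULL - (unsigned long long)nsec
                                      : (unsigned long long)nsec;

    return snprintf(buf, len, "%s%llu.%06llu",
                    nsec < 0 ? "-" : "",
                    mag / 1000000ULL, mag % 1000000ULL);
}
```

The `%06llu` conversion zero-pads the fractional part, so 12000005 is rendered as 12.000005 rather than 12.5.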
* sched: prettify /proc/sched_debug output (Ingo Molnar, 2007-10-15)
  print the correct amount of dashes in /proc/sched_debug.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: do not keep current in the tree and get rid of sched_entity::fair_key (Dmitry Adamushko, 2007-10-15)
  Get rid of 'sched_entity::fair_key'.

  As a side effect, 'current' is not kept within the tree for SCHED_NORMAL/BATCH tasks anymore. This simplifies some parts of code (e.g. entity_tick() and yield_task_fair()) and also somewhat optimizes them (e.g. a single update_curr() now vs. dequeue/enqueue() before in entity_tick()).

  Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: remove wait_runtime fields and features (Ingo Molnar, 2007-10-15)
  remove wait_runtime based fields and features, now that the CFS math has been changed over to the vruntime metric.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Mike Galbraith <efault@gmx.de>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: remove wait_runtime limit (Ingo Molnar, 2007-10-15)
  remove the wait_runtime-limit fields and the code depending on it, now that the math has been changed over to rely on the vruntime metric.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Mike Galbraith <efault@gmx.de>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: clean up struct load_stat (Dmitry Adamushko, 2007-10-15)
  'struct load_stat' is redundant now so let's get rid of it.

  Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Mike Galbraith <efault@gmx.de>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: add more vruntime statistics (Ingo Molnar, 2007-10-15)
  add more vruntime statistics.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Mike Galbraith <efault@gmx.de>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: add se->vruntime debugging (Ingo Molnar, 2007-10-15)
  debug se->vruntime fields.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Mike Galbraith <efault@gmx.de>
* sched: remove precise CPU load (Ingo Molnar, 2007-10-15)
  CPU load calculations are statistical anyway, and there's little benefit from having it calculated on every scheduling event. So remove this code, it gets rid of a divide from the scheduler wakeup and context-switch fastpath.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Mike Galbraith <efault@gmx.de>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: debug: track maximum 'slice' (Ingo Molnar, 2007-10-15)
  track the maximum amount of time a task has executed while the CPU load was at least 2x. (i.e. at least two nice-0 tasks were runnable)

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Mike Galbraith <efault@gmx.de>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
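The condition "load at least 2x" can be sketched as a guarded maximum. In this user-space illustration the struct, the function name, and the constant are assumptions: the weight of one nice-0 task is taken to be 1024, and the load is taken to be the sum of runnable task weights.

```c
#include <stdint.h>

/* Assumed weight of a single nice-0 task. */
#define NICE_0_LOAD_SKETCH 1024u

/* Hypothetical run-queue bookkeeping for this example. */
struct rq_sketch {
    unsigned long load_weight; /* sum of weights of runnable tasks */
    uint64_t slice_max;        /* longest slice seen under contention, ns */
};

/* Account one completed slice. Slices are only considered when at least
 * two nice-0 tasks' worth of load was runnable, since a task running
 * alone can legitimately run for an arbitrarily long time. */
static void account_slice(struct rq_sketch *rq, uint64_t ran_ns)
{
    if (rq->load_weight >= 2 * NICE_0_LOAD_SKETCH && ran_ns > rq->slice_max)
        rq->slice_max = ran_ns;
}
```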
* sched: small sched_debug cleanup (Ingo Molnar, 2007-10-15)
  small kernel/sched_debug.c cleanup - break up multi-variable assignment.

  no code changed:

     text    data     bss     dec     hex filename
    38869    3550      24   42443    a5cb sched.o.before
    38869    3550      24   42443    a5cb sched.o.after

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
  Signed-off-by: Mike Galbraith <efault@gmx.de>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
* sched: debug: fix sum_exec_runtime clearing (Ingo Molnar, 2007-09-05)
  when cleaning sched-stats also clear prev_sum_exec_runtime.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: sched_clock_idle_[sleep|wakeup]_event() (Ingo Molnar, 2007-08-23)
  construct a more or less wall-clock time out of sched_clock(), by using ACPI-idle's existing knowledge about how much time we spent idling. This allows the rq clock to work around TSC-stops-in-C2, TSC-gets-corrupted-in-C3 type of problems.

  ( Besides the scheduler's statistics this also benefits blktrace and printk-timestamps. )

  Furthermore, the precise before-C2/C3-sleep and after-C2/C3-wakeup callbacks allow the scheduler to get the most out of the period where the CPU has a reliable TSC. This results in slightly more precise task statistics.

  the ACPI bits were acked by Len.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Acked-by: Len Brown <len.brown@intel.com>
* sched debug: dont print kernel address in /proc/sched_debug (Ingo Molnar, 2007-08-10)
  Arjan van de Ven pointed out that we should not print kernel addresses in world-readable /proc files - fix that.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
* sched debug: remove the 'u64 now' parameter from print_task()/_rq() (Ingo Molnar, 2007-08-09)
  remove the 'u64 now' parameter from sched_debug.c:print_task()/_rq().

  ( identity transformation that causes no change in functionality. )

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* sched: remove the 'u64 now' parameter from print_cfs_rq() (Ingo Molnar, 2007-08-09)
  remove the 'u64 now' parameter from print_cfs_rq().

  ( identity transformation that causes no change in functionality. )

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* take sched_debug.c out of nasal demon territory (Al Viro, 2007-08-06)
  C99 6.10.3[11]: preprocessing directive within the argument list of macro invocation => undefined behaviour. Don't do that...

  Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
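The rule cited here can be shown with a small stand-alone example. The macro `P`, the config symbol `SKETCH_STATS` and the function `show_stats` are hypothetical and exist only for this illustration; the problematic form is kept inside a comment so the file stays well-defined.

```c
#include <stddef.h>
#include <stdio.h>

/* A function-like macro, standing in for a printing helper. */
#define P(buf, len, name, val) snprintf((buf), (len), "%s=%d", (name), (val))

/*
 * Undefined behaviour per C99 6.10.3p11, because the directives sit
 * inside the macro's argument list:
 *
 *     P(buf, len, "stats",
 *     #ifdef SKETCH_STATS
 *         1
 *     #else
 *         0
 *     #endif
 *     );
 */

/* Well-defined alternative: resolve the condition outside the invocation
 * and pass the result in as an ordinary argument. */
#ifdef SKETCH_STATS
#define STATS_VALUE 1
#else
#define STATS_VALUE 0
#endif

static int show_stats(char *buf, size_t len)
{
    return P(buf, len, "stats", STATS_VALUE);
}
```

Compilers often accept the commented-out form and do what the author meant, which is why such code can survive unnoticed; the standard simply gives no guarantee.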
* [PATCH] sched: reduce debug code (Ingo Molnar, 2007-08-02)
  move the rest of the debugging/instrumentation code to under CONFIG_SCHEDSTATS too. This reduces code size and speeds code up:

     text    data     bss     dec     hex filename
    33044    4122      28   37194    914a sched.o.before
    32708    4122      28   36858    8ffa sched.o.after

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Fix leaks on /proc/{*/sched,sched_debug,timer_list,timer_stats} (Alexey Dobriyan, 2007-07-31)
  On every open/close one struct seq_operations leaks. Kudos to /proc/slab_allocators.

  Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
  Acked-by: Ingo Molnar <mingo@elte.hu>
  Cc: <stable@kernel.org>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] sched: mark sysrq_sched_debug_show() static (Josh Triplett, 2007-07-26)
  Only sched.c uses sysrq_sched_debug_show, and sched.c includes sched_debug.c, so all uses of sysrq_sched_debug_show occur in the same source file. Eliminates a sparse warning:

    warning: symbol 'sysrq_sched_debug_show' was not declared. Should it be static?

  Signed-off-by: Josh Triplett <josh@kernel.org>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* [PATCH] sched: remove stale version info from kernel/sched_debug.c (Ingo Molnar, 2007-07-13)
  kernel/sched_debug.c referred to CFS -v20, but there's no CFS versioning needed within the upstream kernel.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sched: scheduler debugging, core (Ingo Molnar, 2007-07-09)
  scheduler debugging core: implement /proc/sched_debug and /proc/<PID>/sched files for scheduler debugging.

  Signed-off-by: Ingo Molnar <mingo@elte.hu>