| Commit message (Collapse) | Author | Age |
| | |
This patch changes sched_trace.c to use the miscdevice API
instead of doing all the cdev management ourselves. This removes a
chunk of code, and we get sysfs / udev integration for free.
On systems with default udev rules, this will result in a /dev/litmus/log
device being created automatically.
| | |
This fixes a bug found by liblitmus's regression test suite.
Before:
> ** LITMUS^RT test suite.
> ** Running tests for LINUX.
> ** Testing: don't open FMLP semaphores if FMLP is not supported...
> !! TEST FAILURE open_fmlp_sem(fd, 0) -> -16, Success (expected: EBUSY)
> at tests/fdso.c:21 (test_fmlp_not_active)
> ** Testing: reject invalid object descriptors... ok.
> ** Testing: reject invalid object types...
> !! TEST FAILURE od_open(0, -1, 0) -> -22, Bad file descriptor (expected: EINVAL)
> at tests/fdso.c:51 (test_invalid_obj_type)
> ** Testing: reject invalid rt_task pointers... ok.
> ** Result: 2 ok, 2 failed.
After:
> ** LITMUS^RT test suite.
> ** Running tests for LINUX.
> ** Testing: don't open FMLP semaphores if FMLP is not supported... ok.
> ** Testing: reject invalid object descriptors... ok.
> ** Testing: reject invalid object types... ok.
> ** Testing: reject invalid rt_task pointers... ok.
> ** Result: 4 ok, 0 failed.
| | | |
If the number of online CPUs is less than the minimum cluster size
(currently set to 4), it is pointless to load the C-EDF plugin.
This fixes the following memory corruption problem reported by Bjoern:
It always hangs after initializing Feather-Trace.
[ 0.151575] TCP bind hash table entries: 65536 (order: 9, 3670016 bytes)
[ 0.163623] TCP: Hash tables configured (established 262144 bind 65536)
[ 0.164728] TCP reno registered
[ 0.165667] NET: Registered protocol family 1
[ 0.166383] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[ 0.167319] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[ 0.168270] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[ 0.181699] NTFS driver 2.1.29 [Flags: R/W].
[ 0.182751] msgmni has been set to 3868
[ 0.184463] alg: No test for cipher_null (cipher_null-generic)
[ 0.185464] alg: No test for ecb(cipher_null) (ecb-cipher_null)
[ 0.186430] alg: No test for digest_null (digest_null-generic)
[ 0.187387] alg: No test for compress_null (compress_null-generic)
[ 0.190236] alg: No test for fcrypt (fcrypt-generic)
[ 0.193586] alg: No test for stdrng (krng)
[ 0.202158] alg: No test for ghash (ghash-generic)
[ 0.202969] io scheduler noop registered
[ 0.203615] io scheduler anticipatory registered
[ 0.204324] io scheduler deadline registered
[ 0.205019] io scheduler cfq registered (default)
[ 0.205749] Starting LITMUS^RT kernel
[ 0.206302] Registering LITMUS^RT plugin Linux.
[ 0.207066] Registered kill rt tasks magic sysrq.
[ 0.207862] Initializing SRP per-CPU ceilings... done!
[ 0.208696] Initializing LITMUS^RT control device.
[ 0.209539] Registering LITMUS^RT plugin GSN-EDF.
[ 0.210291] Registering LITMUS^RT plugin PSN-EDF.
[ 0.211029] Registering LITMUS^RT plugin C-EDF.
[ 0.211837] Registering LITMUS^RT plugin PFAIR.
[ 0.212611] Initializing TRACE() device
[ 0.213209] Registered dump-trace-buffer(Y) magic sysrq.
[ 0.214027] Initializing Feather-Trace overhead tracing device.
I've attached GDB and got the following stack trace:
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
[New Thread 1]
delay_tsc (loops=2127684) at arch/x86/lib/delay.c:62
62 rdtscl(now);
(gdb) bt
at kernel/panic.c:138
at kernel/exit.c:722
signr=1) at arch/x86/kernel/dumpstack.c:244
str=0xffffffff815e78b1 "general protection fault", regs=0xffff88007c8e19a8,
err=0) at arch/x86/kernel/dumpstack.c:303
error_code=0) at arch/x86/kernel/traps.c:308
(gdb) quit
So we got some memory corruption, and even the panic() fails.
According to git bisect, the problem appeared in:
6834f41a1aa2f92e5b7ca6ae8c80b6fee0fa1208 is the first bad commit
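The guard described at the top of this message can be sketched in user-space C as follows. This is only an illustration: `MIN_CLUSTER_SIZE`, `cedf_can_activate()`, and the choice of `-EINVAL` are assumptions made for the sketch, not names or values taken from the patch.

```c
#include <errno.h>

/* Assumed constant: the message says the minimum cluster size is currently 4. */
#define MIN_CLUSTER_SIZE 4

/*
 * Hypothetical helper: refuse to set up C-EDF when fewer CPUs are online
 * than one cluster needs. The error code is chosen for illustration only.
 */
int cedf_can_activate(int online_cpus)
{
    if (online_cpus < MIN_CLUSTER_SIZE)
        return -EINVAL;
    return 0;
}
```

A caller would check the return value before touching any per-cluster state, so that no cluster structures are initialized on a machine that cannot host them.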
| | |
The od_table is strictly per-thread and should not be inherited across
a fork/clone. This caused memory corruption when a task exited, which
ultimately could lead to oopses in unrelated code.
Bug and testcase initially reported by Glenn.
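A small user-space model of the problem, assuming fork starts from a byte-for-byte copy of the parent's task structure. The types and the `fork_task_sketch()` helper are hypothetical and only mirror the idea: per-thread pointers must be reset in the child, otherwise parent and child both believe they own the same table.

```c
#include <stddef.h>
#include <string.h>

struct od_entry { int used; };

/* Hypothetical, heavily reduced task structure. */
struct task_sketch {
    struct od_entry *od_table; /* strictly per-thread */
    int pid;
};

/* Models the struct copy done at fork, then clears the per-thread state. */
void fork_task_sketch(struct task_sketch *child,
                      const struct task_sketch *parent, int pid)
{
    memcpy(child, parent, sizeof(*child));
    child->pid = pid;
    child->od_table = NULL; /* do not inherit the parent's table */
}
```

Without the final assignment, both tasks would release the same table when they exit, which is the kind of double release that corrupts memory.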
| | |
select_task_rq() -> select_task_rq_litmus() was missing -- my bad :( --
from the litmus sched_class. This caused a bug when executing a task
using for example an execv-like function (rt_launch uses execvp ...)
| | |
1) High priority task tied to FMLP semaphore in P-EDF scheduling is
incorrectly tracked for tasks acquiring the lock without
contention. (HP is always set to CPU 0 instead of proper CPU.)
2) Race in a print statement from P-EDF's pi_block() causes NULL
pointer dereference.
| | |
The task block flow-path in PSN-EDF may race with schedule()
(e.g., fast task release, hrtimer, etc.). As psnedf_task_block does not
reset the pedf->scheduled field, the BUG_ON(pedf->scheduled != prev) condition
in psnedf_schedule() fires.
Setting pedf->scheduled to NULL in psnedf_task_block() is not enough, as we
may lose a rescheduling point (as we skip the check for new real-time tasks).
We therefore need to trace the block event in the first rescheduling point.
The BUG was first reported by Glenn:
[ 46.986089] kernel BUG at litmus/sched_psn_edf.c:138!
[ 46.986089] invalid opcode: 0000 [#1] 0P
TtaEsEk_MmoPdTe( LISMP
MUS_RT_T[AS K) o k46.986089] last sysfs file: /sys/devices/pci0000:00/0000:00:01.1/ide0/0.0/block/hda/size
[ 46.986089] CPU 1 .
[
4] Wai[ ti ng 4f6.986089] Modules linked in:
6r T[S re lea se4.
.986089] Pid: 1488, comm: longtest Not tainted 2.6.32-litmus2010 #3
[ 46.986089] RIP: 0010:[<ffffffff811f3f40>] [<ffffffff811f3f40>] psnedf_schedule+0x360/0x370
[ 46.986089] RSP: 0018:ffff88007bf7fd18 EFLAGS: 00010087
[ 46.986089] RAX: 000000000000f5f5 RBX: ffffffff814412c0 RCX: ffff88007befef90
[ 46.986089] RDX: ffff88007befc830 RSI: ffff88007befef90 RDI: ffff880001850a40
[ 46.986089] RBP: ffff88007bf7fd58 R08: 0000000000000000 R09: ffff88000180dc50
[ 46.986089] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880001850a40
[ 46.986089] R13: ffff88007befef90 R14: ffff88000186cec0 R15: ffff88007befef90
[ 46.986089] FS: 00007fc0694e4910(0000) GS:ffff880001840000(0000) knlGS:0000000000000000
[ 46.986089] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 46.986089] CR2: 00000000006dc2d8 CR3: 000000007b2ed000 CR4: 00000000000006a0
[ 46.986089] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 46.986089] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 46.986089] Process longtest (pid: 1488, threadinfo ffff88007bf7e000, task ffff88007befef90)
[ 46.986089] Stack:
[ 46.986089] 0000000000000000 0000000000000000 0000000000000000 ffffffff814412c0
[ 46.986089] <0> 0000000000000000 ffff88007befef90 ffff88000186cec0 ffff88000184db08
[ 46.986089] <0> ffff88007bf7fdb8 ffffffff81035f64 0000000000000000 0000000000000000
[ 46.986089] Call Trace:
[ 46.986089] [<ffffffff81035f64>] pick_next_task_litmus+0x44/0x400
[ 46.986089] [<ffffffff814381e9>] schedule+0x239/0x356
[ 46.986089] [<ffffffff810cf1b7>] ? nameidata_to_filp+0x57/0x70
[ 46.986089] [<ffffffff8143a515>] __down_write_nested+0x85/0xd0
[ 46.986089] [<ffffffff8143a56b>] __down_write+0xb/0x10
[ 46.986089] [<ffffffff81439c2e>] down_write+0xe/0x10
[ 46.986089] [<ffffffff810103cd>] sys_mmap+0xdd/0x120
[ 46.986089] [<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b
[ 46.986089] Code: c0 83 c6 01 48 c7 c7 80 07 55 81 e8 eb 44 00 00 48 8b 83 50 06 00 00 c7 40 04 01 00 00 00 e9 d5 fd ff ff 0f 0b eb fe 0f 1f 40 00 <0
[ 46.986089] RIP [<ffffffff811f3f40>] psnedf_schedule+0x360/0x370
[ 46.986089] RSP <ffff88007bf7fd18>
[ 46.986089] ---[ end trace 6205a69dc6b27ca5 ]---
| |
This patch updates non-preemptive section support in
GSN- and PSN-EDF.
| |
Dealing with preemptions across CPUs in the presence of non-preemptive
sections can be tricky and should not be replicated across (event-driven) plugins.
This patch introduces a generic preemption function that handles
non-preemptive sections (hopefully) correctly.
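One way to picture such a generic helper is the user-space sketch below. It assumes that a preemption request against a CPU inside a non-preemptive section is recorded and acted on when the section ends; the struct, its fields, and both function names are made up for this illustration and are not the patch's API.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; field names are illustrative only. */
struct cpu_state_sketch {
    bool in_np_section;     /* current task must not be preempted */
    bool delayed_preempt;   /* a preemption was requested meanwhile */
    bool resched_requested; /* stands in for triggering a reschedule */
};

/* Shared by all plugins: either preempt now or remember to do it later. */
void request_preemption_sketch(struct cpu_state_sketch *cpu)
{
    if (cpu->in_np_section)
        cpu->delayed_preempt = true;
    else
        cpu->resched_requested = true;
}

/* Leaving the NP section carries out any preemption that was delayed. */
void exit_np_section_sketch(struct cpu_state_sketch *cpu)
{
    cpu->in_np_section = false;
    if (cpu->delayed_preempt) {
        cpu->delayed_preempt = false;
        cpu->resched_requested = true;
    }
}
```

Keeping this logic in one place means each event-driven plugin only decides which CPU to preempt, not how to do so safely.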
| |
Re-introduce NP sections in the configuration and in litmus.h. Remove the old
np_flag from rt_param.
If CONFIG_NP_SECTION is disabled, then all non-preemptive section checks are
constant expressions which should get removed by the dead code elimination
during optimization.
Instead of re-implementing sys_exit_np(), we simply repurposed sched_yield()
for calling into the scheduler to trigger delayed preemptions.
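The effect of making the checks fold to a constant can be illustrated with a small sketch. The macro and struct below are written for this example, under the assumption that the check is a macro selected by the config option; the real definitions in litmus.h may look different.

```c
/* Hypothetical task state used only for this sketch. */
struct np_task_sketch {
    int np_depth; /* > 0 while inside a non-preemptive section */
};

#ifdef CONFIG_NP_SECTION
#define is_np_sketch(t) ((t)->np_depth > 0)
#else
/* Feature disabled: the compiler folds this to 0. */
#define is_np_sketch(t) ((void)(t), 0)
#endif

/* Returns 1 if a pending higher-priority task should preempt t right now. */
int should_preempt_now(const struct np_task_sketch *t, int higher_prio_ready)
{
    if (is_np_sketch(t))
        return 0; /* this branch is dead code when the feature is off */
    return higher_prio_ready;
}
```

With the option disabled, an optimizing compiler can drop the first branch entirely, so callers pay nothing for a feature they did not configure.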
| |
This device only supports mmap()'ing a single page.
This page is shared RW between the kernel and userspace.
It is intended to allow near-zero-overhead communication
between the kernel and userspace. Its first use will be a
proper implementation of user-signaled
non-preemptable section support.
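The single-page restriction amounts to a length check on the requested mapping. A user-space sketch of that check is shown below; the page size, the function name, and the `-EINVAL` return value are assumptions for illustration and are not taken from the driver.

```c
#include <errno.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size for illustration */

/*
 * Hypothetical validation step: accept a mapping only if it covers
 * exactly one page, as the device description above requires.
 */
int ctrl_page_mmap_check(unsigned long vm_start, unsigned long vm_end)
{
    if (vm_end - vm_start != SKETCH_PAGE_SIZE)
        return -EINVAL;
    return 0;
}
```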
| |
Having GSN-EDF log so many things each tick is useful
when tracking down race conditions, but it also makes
it really hard to find anything else. Thus, turn it off by
default but leave it in for future debugging fun.
| |
kfifo_alloc is called from rb_alloc_buf with interrupts disabled. Use GFP_ATOMIC instead of GFP_KERNEL.
Fixes the following warning:
[ 33.596013] WARNING: at kernel/lockdep.c:2465 lockdep_trace_alloc+0xa7/0xe0()
[ 33.596013] Hardware name:
[ 33.596013] Modules linked in:
[ 33.596013] Pid: 1454, comm: cat Not tainted 2.6.32-litmus2010 #38
[ 33.596013] Call Trace:
[ 33.596013] [<ffffffff810737ff>] ? save_trace+0x3f/0xd0
[ 33.596013] [<ffffffff81074ae7>] ? lockdep_trace_alloc+0xa7/0xe0
[ 33.596013] [<ffffffff81044290>] warn_slowpath_common+0x80/0xd0
[ 33.596013] [<ffffffff810442f4>] warn_slowpath_null+0x14/0x20
[ 33.596013] [<ffffffff81074ae7>] lockdep_trace_alloc+0xa7/0xe0
[ 33.596013] [<ffffffff810b5ed3>] __alloc_pages_nodemask+0xa3/0x710
[ 33.596013] [<ffffffff81074a1c>] ? mark_held_locks+0x6c/0x90
[ 33.596013] [<ffffffff81487585>] ? mutex_lock_nested+0x315/0x3a0
[ 33.596013] [<ffffffff81074d15>] ? trace_hardirqs_on_caller+0x145/0x190
[ 33.596013] [<ffffffff810b655d>] __get_free_pages+0x1d/0x60
[ 33.596013] [<ffffffff810e533f>] __kmalloc+0x1af/0x240
[ 33.596013] [<ffffffff81063e16>] kfifo_alloc+0x66/0xe0
[ 33.596013] [<ffffffff81222da4>] rb_alloc_buf+0x34/0x80
[ 33.596013] [<ffffffff81222e40>] log_open+0x50/0xb0
[ 33.596013] [<ffffffff810ee46a>] chrdev_open+0x1ba/0x2d0
[ 33.596013] [<ffffffff81488a95>] ? _spin_unlock+0x35/0x60
[ 33.596013] [<ffffffff810e8c21>] __dentry_open+0x1b1/0x3f0
[ 33.596013] [<ffffffff810ee2b0>] ? chrdev_open+0x0/0x2d0
[ 33.596013] [<ffffffff810e8f77>] nameidata_to_filp+0x57/0x70
[ 33.596013] [<ffffffff810f904a>] do_filp_open+0x73a/0xb20
[ 33.596013] [<ffffffff811042b1>] ? alloc_fd+0x131/0x160
[ 33.596013] [<ffffffff810e8973>] do_sys_open+0x83/0x110
[ 33.596013] [<ffffffff810e8a40>] sys_open+0x20/0x30
[ 33.596013] [<ffffffff8100b46b>] system_call_fastpath+0x16/0x1b
[ 33.596013] ---[ end trace dbd83780c3496912 ]---
Signed-off-by: Andrea Bastoni <bastoni@cs.unc.edu>
| |
When a real-time task forks, then its LITMUS^RT-specific fields should be cleared,
because we don't want real-time tasks to spawn new real-time tasks that bypass
the plugin's admission control (if any).
This was broken in three ways:
1) kernel/fork.c did not erase all of tsk->rt_param, only the first few bytes due to
a wrong size argument to memset().
2) It should have been calling litmus_fork() instead anyway.
3) litmus_fork() was _also_ not clearing all of tsk->rt_param, due to another size
argument bug.
Interestingly, 1) and 2) can be traced back to the 2007->2008 port,
whereas 3) was added by Mitchell much later on (to dead code, no less).
I'm really surprised that this never blew up before.
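For illustration, clearing the whole structure can be written so that the size is derived from the object itself. The message does not say what the wrong size argument was; a common cause of this class of bug is taking `sizeof` of a pointer rather than of the pointed-to struct. The types and function names below are stand-ins invented for the sketch, not the kernel's definitions.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in types; the real rt_param has many more fields. */
struct rt_param_sketch {
    long exec_cost;
    long period;
    int cpu;
    int flags[8];
};

struct task_sketch {
    struct rt_param_sketch rt_param;
};

/* Size comes from the member itself, so it cannot drift out of sync. */
void clear_rt_param_sketch(struct task_sketch *tsk)
{
    memset(&tsk->rt_param, 0, sizeof(tsk->rt_param));
}

/* Helper for checking the result: 1 if every byte of rt_param is zero. */
int rt_param_is_clear(const struct task_sketch *tsk)
{
    const unsigned char *b = (const unsigned char *)&tsk->rt_param;
    size_t i;

    for (i = 0; i < sizeof(tsk->rt_param); i++)
        if (b[i])
            return 0;
    return 1;
}
```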
| |
This solves (? -- or at least solves one of the possible causes of ;)
a BUG in sched_gsnedf where a real-time task could enter gsnedf_schedule
with entry->schedule = NULL (and this is BAD!)
| |
Fix BUG introduced by f01618e24f233b4e7e12d66d0078ce513f4bad2d:
litmus_tick() must be called independently of the currently executing task.
In Bjoern's words:
"The reason why we had litmus_tick() as a special case is that a plugin might
want to reschedule a non-RT task even if it is currently not executing a Litmus
RT task.
This is particularly a concern for non-work-conserving plugins such as PFAIR
(without early-releasing enabled). A 1/11 weight task does not execute most of
the time, but when it receives a quantum allocation, LITMUS needs to preempt
whatever non-Litmus task is currently executing.
In the case of PFAIR (and other quantum-driven schedulers), this requires the
tick function to be called on every quantum boundary (i.e., every time that the
periodic timer tick fires, no matter whether the currently executing task is a
Litmus task)."
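The point of the quoted explanation can be modelled in a few lines of user-space C. Everything here is a simplified stand-in (the structs, the field names, and both functions are hypothetical); the only thing it is meant to show is that the tick path has no "is the current task real-time?" guard.

```c
#include <stdbool.h>

struct tick_task_sketch {
    bool is_realtime;
};

/* Hypothetical plugin state for a quantum-driven scheduler. */
struct plugin_state_sketch {
    bool rt_task_ready; /* a real-time task received a quantum allocation */
    bool need_resched;
};

/* Plugin tick: may decide to preempt a non-real-time task. */
void plugin_tick_sketch(struct plugin_state_sketch *s,
                        const struct tick_task_sketch *curr)
{
    if (s->rt_task_ready && !curr->is_realtime)
        s->need_resched = true;
}

/* Tick entry point: calls the plugin no matter what curr is. */
void scheduler_tick_sketch(struct plugin_state_sketch *s,
                           const struct tick_task_sketch *curr)
{
    plugin_tick_sketch(s, curr); /* no is_realtime guard here */
}
```

If the call were guarded by `curr->is_realtime`, the background task in this model would never be preempted, and the ready real-time task would miss its quantum.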
| |
Setting FT_TASK_TRACE_MAJOR, LOG_MAJOR, and FT_TRACE_MAJOR to 0
lets the kernel assign them automatically.
| |
Use kfifo [kernel/kfifo.c] to implement the ring buffer used
for sched_trace (TRACE() and TRACE_TASK() macros).
This patch also includes some reorganization of the sched_trace.c code.
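For readers unfamiliar with this style of buffer, here is a user-space sketch of a byte FIFO with a power-of-two size and free-running in/out counters, which is the general design kfifo follows. It is not the kernel's kfifo API; all names and the buffer size are invented for the example.

```c
#include <string.h>

#define RB_SIZE 16u /* must be a power of two */

struct ring_sketch {
    unsigned char buf[RB_SIZE];
    unsigned int in;  /* total bytes ever written */
    unsigned int out; /* total bytes ever read */
};

/* Copies up to len bytes in; returns how many actually fit. */
unsigned int ring_put(struct ring_sketch *r, const void *src, unsigned int len)
{
    unsigned int free_space = RB_SIZE - (r->in - r->out);
    unsigned int off, first;

    if (len > free_space)
        len = free_space;
    off = r->in & (RB_SIZE - 1);
    first = len < RB_SIZE - off ? len : RB_SIZE - off;
    memcpy(r->buf + off, src, first);                                /* up to the end */
    memcpy(r->buf, (const unsigned char *)src + first, len - first); /* wrapped part */
    r->in += len;
    return len;
}

/* Copies up to len bytes out; returns how many were available. */
unsigned int ring_get(struct ring_sketch *r, void *dst, unsigned int len)
{
    unsigned int avail = r->in - r->out;
    unsigned int off, first;

    if (len > avail)
        len = avail;
    off = r->out & (RB_SIZE - 1);
    first = len < RB_SIZE - off ? len : RB_SIZE - off;
    memcpy(dst, r->buf + off, first);
    memcpy((unsigned char *)dst + first, r->buf, len - first);
    r->out += len;
    return len;
}
```

Because the counters are unsigned and the size is a power of two, `in - out` stays correct across wrap-around and the index is a simple mask.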
| |
- remove "likely" condition from branch
- add litmus.nr_running counter
| |
- Move the LITMUS choice of the next task inside the pick_next_task() function
- Unfortunately, LITMUS plugins' scheduling decisions need to access
the status of the prev task. The prev task's status is saved in pre_schedule()
- This patch also introduces a new struct litmus_rq to hold the LITMUS fields
on struct rq
| |
- remove the call to litmus_tick() from scheduler_tick() just after
having performed the class task_tick() and integrate
litmus_tick() in task_tick_litmus()
- task_tick_litmus() is the handler for the litmus class task_tick()
method. It is called in non-queued mode from scheduler_tick()
| |
- The binomial heap "heap" names conflicted with the priority heap
of cgroup in the kernel
- This patch changes the binomial heap "heap" names to "bheap" (I wasn't
able to come up with a more interesting name, so proposals are welcome)
| |
- insert arm_release_timer() in the add_release() path
| |
to be merged:
- arm_release_timer() with no rq locking
| |
- fix requesting more than 2^11 pages (MAX_ORDER)
from the system allocator
to be merged:
- feather-trace generic implementation
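A fix of this kind typically caps the request before it reaches the allocator. The sketch below shows the arithmetic only; the function names are hypothetical, and the limit is passed in as a parameter because the exact bound used by the patch is not stated beyond the 2^11 pages mentioned above.

```c
/* Smallest order such that 2^order pages cover the request. */
unsigned int order_for_pages(unsigned long pages)
{
    unsigned int order = 0;

    while ((1UL << order) < pages)
        order++;
    return order;
}

/* Caps a request at 2^max_order pages (max_order is the caller's assumption). */
unsigned long clamp_pages(unsigned long wanted, unsigned int max_order)
{
    unsigned long limit = 1UL << max_order;

    return wanted > limit ? limit : wanted;
}
```

A caller would compute `clamp_pages(wanted, limit_order)` first and only then derive the order, so that an oversized trace buffer request degrades to the largest allowed buffer instead of failing inside the allocator.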
|
to be merged:
- SRP (sched.c)
- feather-trace implementation (to be fixed)
- sync support (KConfig)
litmus_sched_class implements 3 new methods:
.prio_changed:
void
.switched_to:
void
.get_rr_interval:
return infinity (i.e., 0)
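As a rough picture of what the three methods look like, here is a user-space sketch of a scheduling-class table with two empty hooks and a get_rr_interval that returns 0. The struct, the simplified signatures, and all names ending in `_sketch` are assumptions for illustration; the kernel's `struct sched_class` has many more members and its own prototypes.

```c
struct rq;          /* opaque here */
struct task_struct; /* opaque here */

/* Reduced, hypothetical version of a scheduling-class method table. */
struct sched_class_sketch {
    void (*prio_changed)(struct rq *rq, struct task_struct *p,
                         int oldprio, int running);
    void (*switched_to)(struct rq *rq, struct task_struct *p, int running);
    unsigned int (*get_rr_interval)(struct task_struct *p);
};

/* Nothing to do: the message lists this method as "void". */
void prio_changed_sketch(struct rq *rq, struct task_struct *p,
                         int oldprio, int running)
{
    (void)rq; (void)p; (void)oldprio; (void)running;
}

/* Nothing to do here either. */
void switched_to_sketch(struct rq *rq, struct task_struct *p, int running)
{
    (void)rq; (void)p; (void)running;
}

/* 0 stands for "infinity", as stated in the message. */
unsigned int get_rr_interval_sketch(struct task_struct *p)
{
    (void)p;
    return 0;
}

const struct sched_class_sketch litmus_sched_class_sketch = {
    .prio_changed    = prio_changed_sketch,
    .switched_to     = switched_to_sketch,
    .get_rr_interval = get_rr_interval_sketch,
};
```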
|