From 4c5640cb5f5a6fd780d99397eca028b575cb1206 Mon Sep 17 00:00:00 2001 From: David Meybohm Date: Mon, 22 Aug 2005 13:11:08 -0700 Subject: [PATCH] preempt race in getppid With CONFIG_PREEMPT && !CONFIG_SMP, it's possible for sys_getppid to return a bogus value if the parent's task_struct gets reallocated after current->group_leader->real_parent is read: asmlinkage long sys_getppid(void) { int pid; struct task_struct *me = current; struct task_struct *parent; parent = me->group_leader->real_parent; RACE HERE => for (;;) { pid = parent->tgid; #ifdef CONFIG_SMP { struct task_struct *old = parent; /* * Make sure we read the pid before re-reading the * parent pointer: */ smp_rmb(); parent = me->group_leader->real_parent; if (old != parent) continue; } #endif break; } return pid; } If the process gets preempted at the indicated point, the parent process can go ahead and call exit() and then get wait()'d on to reap its task_struct. When the preempted process gets resumed, it will not do any further checks of the parent pointer on !CONFIG_SMP: it will read the bad pid and return. So, the same algorithm used when SMP is enabled should be used when preempt is enabled, which will recheck ->real_parent in this case. Signed-off-by: David Meybohm Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- kernel/timer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'kernel') diff --git a/kernel/timer.c b/kernel/timer.c index f2a11887a726..5377f40723ff 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -1023,7 +1023,7 @@ asmlinkage long sys_getppid(void) parent = me->group_leader->real_parent; for (;;) { pid = parent->tgid; -#ifdef CONFIG_SMP +#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT) { struct task_struct *old = parent; -- cgit v1.2.2 From d10689b68aff7b48e3de1a3f7fcd6567bd2905af Mon Sep 17 00:00:00 2001 From: Paul Jackson Date: Tue, 23 Aug 2005 01:04:27 -0700 Subject: [PATCH] cpu_exclusive sched domains on partial nodes temp fix This keeps the kernel/cpuset.c routine update_cpu_domains() from invoking the sched.c routine partition_sched_domains() if the cpuset in question doesn't fall on node boundaries. I have boot tested this on an SN2, and with the help of a couple of ad hoc printk's, determined that it does indeed avoid calling the partition_sched_domains() routine on partial nodes. I did not directly verify that this avoids setting up bogus sched domains or avoids the oops that Hawkes saw. This patch imposes a silent artificial constraint on which cpusets can be used to define dynamic sched domains. This patch should allow proceeding with this new feature in 2.6.13 for the configurations in which it is useful (node alligned sched domains) while avoiding trying to setup sched domains in the less useful cases that can cause the kernel corruption and oops. Signed-off-by: Paul Jackson Acked-by: Ingo Molnar Acked-by: Dinakar Guniguntala Acked-by: John Hawkes Signed-off-by: Linus Torvalds --- kernel/cpuset.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'kernel') diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 21a4e3b2cbda..e0d296c5b302 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -635,6 +635,23 @@ static void update_cpu_domains(struct cpuset *cur) if (par == NULL || cpus_empty(cur->cpus_allowed)) return; + /* + * Hack to avoid 2.6.13 partial node dynamic sched domain bug. + * Require the 'cpu_exclusive' cpuset to include all (or none) + * of the CPUs on each node, or return w/o changing sched domains. + * Remove this hack when dynamic sched domains fixed. + */ + { + int i, j; + + for_each_cpu_mask(i, cur->cpus_allowed) { + for_each_cpu_mask(j, node_to_cpumask(cpu_to_node(i))) { + if (!cpu_isset(j, cur->cpus_allowed)) + return; + } + } + } + /* * Get all cpus from parent's cpus_allowed not part of exclusive * children -- cgit v1.2.2