From 4c5640cb5f5a6fd780d99397eca028b575cb1206 Mon Sep 17 00:00:00 2001
From: David Meybohm <dmeybohmlkml@bellsouth.net>
Date: Mon, 22 Aug 2005 13:11:08 -0700
Subject: [PATCH] preempt race in getppid

With CONFIG_PREEMPT && !CONFIG_SMP, it's possible for sys_getppid to
return a bogus value if the parent's task_struct gets reallocated after
current->group_leader->real_parent is read:

        asmlinkage long sys_getppid(void)
        {
                int pid;
                struct task_struct *me = current;
                struct task_struct *parent;

                parent = me->group_leader->real_parent;
RACE HERE =>    for (;;) {
                        pid = parent->tgid;
        #ifdef CONFIG_SMP
        {
                        struct task_struct *old = parent;

                        /*
                         * Make sure we read the pid before re-reading the
                         * parent pointer:
                         */
                        smp_rmb();
                        parent = me->group_leader->real_parent;
                        if (old != parent)
                                continue;
        }
        #endif
                        break;
                }
                return pid;
        }

If the process gets preempted at the indicated point, the parent process
can go ahead and call exit() and then get wait()'d on to reap its
task_struct. When the preempted process gets resumed, it will not do any
further checks of the parent pointer on !CONFIG_SMP: it will read the
bad pid and return.

So, the same algorithm used when SMP is enabled should be used when
preempt is enabled, which will recheck ->real_parent in this case.

Signed-off-by: David Meybohm <dmeybohmlkml@bellsouth.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 kernel/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'kernel')

diff --git a/kernel/timer.c b/kernel/timer.c
index f2a11887a726..5377f40723ff 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1023,7 +1023,7 @@ asmlinkage long sys_getppid(void)
 	parent = me->group_leader->real_parent;
 	for (;;) {
 		pid = parent->tgid;
-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
 {
 		struct task_struct *old = parent;
 
-- 
cgit v1.2.2


From d10689b68aff7b48e3de1a3f7fcd6567bd2905af Mon Sep 17 00:00:00 2001
From: Paul Jackson <pj@sgi.com>
Date: Tue, 23 Aug 2005 01:04:27 -0700
Subject: [PATCH] cpu_exclusive sched domains on partial nodes temp fix

This keeps the kernel/cpuset.c routine update_cpu_domains() from
invoking the sched.c routine partition_sched_domains() if the cpuset in
question doesn't fall on node boundaries.

I have boot tested this on an SN2, and with the help of a couple of ad
hoc printk's, determined that it does indeed avoid calling the
partition_sched_domains() routine on partial nodes.

I did not directly verify that this avoids setting up bogus sched
domains or avoids the oops that Hawkes saw.

This patch imposes a silent artificial constraint on which cpusets can
be used to define dynamic sched domains.

This patch should allow proceeding with this new feature in 2.6.13 for
the configurations in which it is useful (node alligned sched domains)
while avoiding trying to setup sched domains in the less useful cases
that can cause the kernel corruption and oops.

Signed-off-by: Paul Jackson <pj@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Dinakar Guniguntala <dino@in.ibm.com>
Acked-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 kernel/cpuset.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

(limited to 'kernel')

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 21a4e3b2cbda..e0d296c5b302 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -635,6 +635,23 @@ static void update_cpu_domains(struct cpuset *cur)
 	if (par == NULL || cpus_empty(cur->cpus_allowed))
 		return;
 
+	/*
+	 * Hack to avoid 2.6.13 partial node dynamic sched domain bug.
+	 * Require the 'cpu_exclusive' cpuset to include all (or none)
+	 * of the CPUs on each node, or return w/o changing sched domains.
+	 * Remove this hack when dynamic sched domains fixed.
+	 */
+	{
+		int i, j;
+
+		for_each_cpu_mask(i, cur->cpus_allowed) {
+			for_each_cpu_mask(j, node_to_cpumask(cpu_to_node(i))) {
+				if (!cpu_isset(j, cur->cpus_allowed))
+					return;
+			}
+		}
+	}
+
 	/*
 	 * Get all cpus from parent's cpus_allowed not part of exclusive
 	 * children
-- 
cgit v1.2.2