From de12a7878c11f3b282d640888aa635e0711d0b5e Mon Sep 17 00:00:00 2001 From: "Eric W. Biederman" Date: Mon, 10 Apr 2006 17:16:49 -0600 Subject: [PATCH] de_thread: Don't confuse users do_each_thread. Oleg Nesterov spotted two interesting bugs with the current de_thread code. The simplest is a long standing double decrement of __get_cpu_var(process_counts) in __unhash_process. Caused by two processes exiting when only one was created. The other is that since we no longer detach from the thread_group list it is possible for do_each_thread when run under the tasklist_lock to see the same task_struct twice. Once on the task list as a thread_group_leader, and once on the thread list of another thread. The double appearance in do_each_thread can cause a double increment of mm_core_waiters in zap_threads resulting in problems later on in coredump_wait. To remedy those two problems this patch takes the simple approach of changing the old thread group leader into a child thread. The only routine in release_task that cares is __unhash_process, and it can be trivially seen that we handle cleaning up a thread group leader properly. Since de_thread doesn't change the pid of the exiting leader process and instead shares it with the new leader process. I change thread_group_leader to recognize group leadership based on the group_leader field and not based on pids. This should also be slightly cheaper then the existing thread_group_leader macro. I performed a quick audit and I couldn't see any user of thread_group_leader that cared about the difference. Signed-off-by: Eric W. Biederman Signed-off-by: Linus Torvalds --- fs/exec.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index 0291a68a3626..4d38ad0b70d6 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -723,7 +723,12 @@ static int de_thread(struct task_struct *tsk) current->parent = current->real_parent = leader->real_parent; leader->parent = leader->real_parent = child_reaper; current->group_leader = current; - leader->group_leader = leader; + leader->group_leader = current; + + /* Reduce leader to a thread */ + detach_pid(leader, PIDTYPE_PGID); + detach_pid(leader, PIDTYPE_SID); + list_del_init(&leader->tasks); add_parent(current); add_parent(leader); -- cgit v1.2.2 From f5e902817fee1589badca1284f49eecc0ef0c200 Mon Sep 17 00:00:00 2001 From: Roland McGrath Date: Mon, 10 Apr 2006 22:54:16 -0700 Subject: [PATCH] process accounting: take original leader's start_time in non-leader exec The only record we have of the real-time age of a process, regardless of execs it's done, is start_time. When a non-leader thread exec, the original start_time of the process is lost. Things looking at the real-time age of the process are fooled, for example the process accounting record when the process finally dies. This change makes the oldest start_time stick around with the process after a non-leader exec. This way the association between PID and start_time is kept constant, which seems correct to me. Signed-off-by: Roland McGrath Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/exec.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'fs/exec.c') diff --git a/fs/exec.c b/fs/exec.c index 4d38ad0b70d6..3234a0c32d54 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -678,6 +678,18 @@ static int de_thread(struct task_struct *tsk) while (leader->exit_state != EXIT_ZOMBIE) yield(); + /* + * The only record we have of the real-time age of a + * process, regardless of execs it's done, is start_time. + * All the past CPU time is accumulated in signal_struct + * from sister threads now dead. But in this non-leader + * exec, nothing survives from the original leader thread, + * whose birth marks the true age of this process now. + * When we take on its identity by switching to its PID, we + * also take its birthdate (always earlier than our own). + */ + current->start_time = leader->start_time; + spin_lock(&leader->proc_lock); spin_lock(¤t->proc_lock); proc_dentry1 = proc_pid_unhash(current); -- cgit v1.2.2