author | Kirill Tkhai <ktkhai@parallels.com> | 2014-11-11 04:46:29 -0500 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2014-11-16 04:59:01 -0500 |
commit | d8b163c4c657478ef33c082cff78d03a4ca07bb2 (patch) | |
tree | 9653413e705b55c48c23bb57bf6c21bdf774a40a | |
parent | c1a2b5f6293caa14804adca1840eeea1e8f6b322 (diff) |
sched/numa: Init numa balancing fields of init_task
We do not initialize init_task.numa_preferred_nid,
but this value is inherited by the userspace "init"
process:

  rest_init()->kernel_thread(kernel_init)->do_fork(CLONE_VM);

  __sched_fork()
  {
	if (clone_flags & CLONE_VM)
		p->numa_preferred_nid = current->numa_preferred_nid;
	else
		p->numa_preferred_nid = -1;
  }

kernel_init() becomes the userspace "init" process.
So we propagate a garbage nid to userspace, and it may be used
during NUMA balancing.
So far there are no reports of this causing a problem,
but it seems we should set it explicitly.
Even if init_task.numa_preferred_nid is zero, we may meet a weird
configuration without nid#0. On sparc64, where processors are
numbered physically, I saw a machine without cpu#1, while cpu#2
existed. Possibly, something similar may happen with NUMA nodes.
So, let's initialize it and be sure we're safe.
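
The inheritance path described above can be modelled in userspace. This is only a sketch, not kernel code: struct sketch_task, sketch_sched_fork() and the two sketch_init_task_* objects are hypothetical stand-ins, and the model assumes the uninitialized field reads as zero because the object has static storage.

```c
#include <assert.h>

#define SKETCH_CLONE_VM 0x00000100UL	/* same value as the kernel's CLONE_VM */

/* Hypothetical stand-in for the single task_struct field of interest. */
struct sketch_task {
	int numa_preferred_nid;
};

/* No initializer for the field: a static object is zero-filled in C. */
static struct sketch_task sketch_init_task_unset;

/* With an explicit initializer: -1 means "no preferred node". */
static struct sketch_task sketch_init_task_set = { .numa_preferred_nid = -1 };

/* Mirrors the branch quoted from __sched_fork() in the commit message. */
static void sketch_sched_fork(unsigned long clone_flags,
			      const struct sketch_task *parent,
			      struct sketch_task *child)
{
	if (clone_flags & SKETCH_CLONE_VM)
		child->numa_preferred_nid = parent->numa_preferred_nid;
	else
		child->numa_preferred_nid = -1;
}

/* Walks both cases: the child copies whatever the parent holds. */
static void sketch_demo(void)
{
	struct sketch_task child;

	/* Unset parent: nid 0 leaks into the child, valid node or not. */
	sketch_sched_fork(SKETCH_CLONE_VM, &sketch_init_task_unset, &child);
	assert(child.numa_preferred_nid == 0);

	/* Initialized parent: the child starts with no preferred node. */
	sketch_sched_fork(SKETCH_CLONE_VM, &sketch_init_task_set, &child);
	assert(child.numa_preferred_nid == -1);
}
```

Calling sketch_demo() from a test harness exercises both cases. The point of the model is that with CLONE_VM set the child never gets the -1 default on its own, so the only place to establish it is the parent's static initializer.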
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Sergey Dyasly <dserrg@gmail.com>
Link: http://lkml.kernel.org/r/1415699189.15631.6.camel@tkhai
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-rw-r--r-- | include/linux/init_task.h | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 77fc43f8fb72..5f30ac8c82bc 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -166,6 +166,15 @@ extern struct task_group root_task_group;
 # define INIT_RT_MUTEXES(tsk)
 #endif
 
+#ifdef CONFIG_NUMA_BALANCING
+# define INIT_NUMA_BALANCING(tsk)					\
+	.numa_preferred_nid = -1,					\
+	.numa_group = NULL,						\
+	.numa_faults = NULL,
+#else
+# define INIT_NUMA_BALANCING(tsk)
+#endif
+
 /*
  * INIT_TASK is used to set up the first task table, touch at
  * your own risk!. Base=0, limit=0x1fffff (=2MB)
@@ -237,6 +246,7 @@ extern struct task_group root_task_group;
 	INIT_CPUSET_SEQ(tsk)						\
 	INIT_RT_MUTEXES(tsk)						\
 	INIT_VTIME(tsk)							\
+	INIT_NUMA_BALANCING(tsk)					\
 }
 
 
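
The conditional-initializer pattern used by the patch can be tried outside the kernel. The following is a minimal sketch under stated assumptions: every demo_/DEMO_ name is hypothetical, the field types are simplified stand-ins for the real task_struct members, and the prio value is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical config switch; remove it to model a build without the feature. */
#define DEMO_CONFIG_NUMA_BALANCING 1

/* Simplified stand-in for task_struct (types are assumptions). */
struct demo_task {
	int prio;
#ifdef DEMO_CONFIG_NUMA_BALANCING
	int numa_preferred_nid;
	void *numa_group;
	unsigned long *numa_faults;
#endif
};

/* Expands to designated initializers, or to nothing when the option is off. */
#ifdef DEMO_CONFIG_NUMA_BALANCING
# define DEMO_INIT_NUMA_BALANCING(tsk)		\
	.numa_preferred_nid = -1,		\
	.numa_group = NULL,			\
	.numa_faults = NULL,
#else
# define DEMO_INIT_NUMA_BALANCING(tsk)
#endif

/* The sub-macro supplies its own trailing comma, so it slots in unchanged. */
#define DEMO_INIT_TASK(tsk)			\
{						\
	.prio = 120,				\
	DEMO_INIT_NUMA_BALANCING(tsk)		\
}

static struct demo_task demo_init_task = DEMO_INIT_TASK(demo_init_task);

/* Checks that the static object picked up the explicit values. */
static void demo_check(void)
{
	assert(demo_init_task.prio == 120);
#ifdef DEMO_CONFIG_NUMA_BALANCING
	assert(demo_init_task.numa_preferred_nid == -1);
	assert(demo_init_task.numa_group == NULL);
	assert(demo_init_task.numa_faults == NULL);
#endif
}
```

Because the macro expands to nothing when the option is disabled, the enclosing initializer compiles in both configurations without any #ifdef at the use site.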