diff options
author | Andrew Vagin <avagin@openvz.org> | 2012-10-25 16:38:07 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-10-25 17:37:53 -0400 |
commit | f2302505775fd13ba93f034206f1e2a587017929 (patch) | |
tree | 0397e3cbf7556e93618ed7ad7316578049707ca1 | |
parent | d5ea7b5ec1ee4dac868143806c0bd94855754677 (diff) |
pidns: limit the nesting depth of pid namespaces
'struct pid' is a "variable sized struct" - a header with an array of
upids at the end.
The size of the array depends on a level (depth) of pid namespaces. Now a
level of pidns is not limited, so 'struct pid' can be more than one page.
Looks reasonable, that it should be less than a page. MAX_PIS_NS_LEVEL is
not calculated from PAGE_SIZE, because in this case it depends on
architectures, config options and it will be reduced, if someone adds a
new fields in struct pid or struct upid.
I suggest to set MAX_PIS_NS_LEVEL = 32, because it saves ability to expand
"struct pid" and it's more than enough for all known for me use-cases.
When someone finds a reasonable use case, we can add a config option or a
sysctl parameter.
In addition it will reduce the effect of another problem, when we have
many nested namespaces and the oldest one starts dying.
zap_pid_ns_processe will be called for each namespace and find_vpid will
be called for each process in a namespace. find_vpid will be called
minimum max_level^2 / 2 times. The reason of that is that when we found a
bit in pidmap, we can't determine this pidns is top for this process or it
isn't.
vpid is a heavy operation, so a fork bomb, which create many nested
namespace, can make a system inaccessible for a long time. For example my
system becomes inaccessible for a few minutes with 4000 processes.
[akpm@linux-foundation.org: return -EINVAL in response to excessive nesting, not -ENOMEM]
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r-- | kernel/pid_namespace.c | 12 |
1 files changed, 11 insertions, 1 deletions
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c index eb00be205811..7b07cc0dfb75 100644 --- a/kernel/pid_namespace.c +++ b/kernel/pid_namespace.c | |||
@@ -71,12 +71,22 @@ err_alloc: | |||
71 | return NULL; | 71 | return NULL; |
72 | } | 72 | } |
73 | 73 | ||
74 | /* MAX_PID_NS_LEVEL is needed for limiting size of 'struct pid' */ | ||
75 | #define MAX_PID_NS_LEVEL 32 | ||
76 | |||
74 | static struct pid_namespace *create_pid_namespace(struct pid_namespace *parent_pid_ns) | 77 | static struct pid_namespace *create_pid_namespace(struct pid_namespace *parent_pid_ns) |
75 | { | 78 | { |
76 | struct pid_namespace *ns; | 79 | struct pid_namespace *ns; |
77 | unsigned int level = parent_pid_ns->level + 1; | 80 | unsigned int level = parent_pid_ns->level + 1; |
78 | int i, err = -ENOMEM; | 81 | int i; |
82 | int err; | ||
83 | |||
84 | if (level > MAX_PID_NS_LEVEL) { | ||
85 | err = -EINVAL; | ||
86 | goto out; | ||
87 | } | ||
79 | 88 | ||
89 | err = -ENOMEM; | ||
80 | ns = kmem_cache_zalloc(pid_ns_cachep, GFP_KERNEL); | 90 | ns = kmem_cache_zalloc(pid_ns_cachep, GFP_KERNEL); |
81 | if (ns == NULL) | 91 | if (ns == NULL) |
82 | goto out; | 92 | goto out; |