Diffstat (limited to 'Documentation/prctl/no_new_privs.txt')
-rw-r--r-- | Documentation/prctl/no_new_privs.txt | 50 |
1 files changed, 50 insertions, 0 deletions
diff --git a/Documentation/prctl/no_new_privs.txt b/Documentation/prctl/no_new_privs.txt
new file mode 100644
index 000000000000..cb705ec69abe
--- /dev/null
+++ b/Documentation/prctl/no_new_privs.txt
@@ -0,0 +1,50 @@
The execve system call can grant a newly-started program privileges that
its parent did not have. The most obvious examples are setuid/setgid
programs and file capabilities. To prevent the parent program from
gaining these privileges as well, the kernel and user code must be
careful to prevent the parent from doing anything that could subvert the
child. For example:

- The dynamic loader handles LD_* environment variables differently if
  a program is setuid.

- chroot is disallowed to unprivileged processes, since it would allow
  /etc/passwd to be replaced from the point of view of a process that
  inherited chroot.

- The exec code has special handling for ptrace.

These are all ad-hoc fixes. The no_new_privs bit (since Linux 3.5) is a
new, generic mechanism to make it safe for a process to modify its
execution environment in a manner that persists across execve. Any task
can set no_new_privs. Once the bit is set, it is inherited across fork,
clone, and execve and cannot be unset. With no_new_privs set, execve
promises not to grant the privilege to do anything that could not have
been done without the execve call. For example, the setuid and setgid
bits will no longer change the uid or gid, file capabilities will not
add to the permitted set, and LSMs will not relax constraints after
execve.
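
The behavior described above can be exercised from user space through
prctl(2). The following is a minimal sketch, not a definitive
implementation: the helper names (set_no_new_privs, get_no_new_privs,
try_clear_no_new_privs, child_sees_no_new_privs) are hypothetical and
exist only for illustration, and the fallback constant values are an
assumption for builds against userspace headers that predate Linux 3.5.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallbacks for old userspace headers (values from linux/prctl.h). */
#ifndef PR_SET_NO_NEW_PRIVS
#define PR_SET_NO_NEW_PRIVS 38
#endif
#ifndef PR_GET_NO_NEW_PRIVS
#define PR_GET_NO_NEW_PRIVS 39
#endif

/* Set the bit for the calling task. Returns 0 on success, -1 on error. */
int set_no_new_privs(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Returns 1 if the bit is set, 0 if it is clear, -1 on error. */
int get_no_new_privs(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}

/*
 * Attempt to clear the bit. The kernel is expected to refuse.
 * Returns the errno of the failed call, or 0 if it succeeded.
 */
int try_clear_no_new_privs(void)
{
	if (prctl(PR_SET_NO_NEW_PRIVS, 0, 0, 0, 0) == 0)
		return 0;
	return errno;
}

/*
 * Fork a child and report whether the child observes the bit.
 * Returns 1 if the child sees it set, 0 if not, -1 on failure.
 */
int child_sees_no_new_privs(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(get_no_new_privs() == 1 ? 1 : 0);
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A caller would typically invoke set_no_new_privs() once, early, before
changing anything else about its execution environment. Because the bit
cannot be cleared, there is no matching "undo" helper.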

Note that no_new_privs does not prevent privilege changes that do not
involve execve. An appropriately privileged task can still call
setuid(2) and receive SCM_RIGHTS datagrams.

There are two main use cases for no_new_privs so far:

- Filters installed for the seccomp mode 2 sandbox persist across
  execve and can change the behavior of newly-executed programs.
  Unprivileged users are therefore only allowed to install such filters
  if no_new_privs is set.

- By itself, no_new_privs can be used to reduce the attack surface
  available to an unprivileged user. If everything running with a
  given uid has no_new_privs set, then that uid will be unable to
  escalate its privileges by directly attacking setuid and setgid
  binaries or binaries with file capabilities; it will need to
  compromise something without the no_new_privs bit set first.
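
The first use case can be sketched as follows. This is an illustrative
example under stated assumptions, not the canonical way to build a
sandbox: the function name install_allow_all_filter is hypothetical, and
the one-instruction filter allows every system call, so it only
demonstrates the required ordering of the two prctl(2) calls. A task
without CAP_SYS_ADMIN that skips the first call should see the second
one fail with EACCES.

```c
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>

/*
 * Set no_new_privs, then install a trivial seccomp mode 2 filter.
 * Returns 0 on success, -1 on error (errno is set by prctl).
 */
int install_allow_all_filter(void)
{
	/* One-instruction BPF program: allow every system call. */
	struct sock_filter insns[] = {
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
		.filter = insns,
	};

	/* Required first for tasks that lack CAP_SYS_ADMIN. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	/* The filter persists across execve, as does no_new_privs. */
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

A real sandbox would replace the single SECCOMP_RET_ALLOW instruction
with a program that inspects the system call number and arguments.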

In the future, other potentially dangerous kernel features could become
available to unprivileged tasks if no_new_privs is set. In principle,
several options to unshare(2) and clone(2) would be safe when
no_new_privs is set, and no_new_privs + chroot is considerably less
dangerous than chroot by itself.