author David Herrmann <dh.herrmann@gmail.com> 2019-01-08 07:58:52 -0500
committer Linus Torvalds <torvalds@linux-foundation.org> 2019-01-08 12:40:53 -0500
commit 7b55851367136b1efd84d98fea81ba57a98304cf
tree d105f4f145187af07460d35c4a7b68c2882d3da2 /kernel/fork.c
parent 3bd6e94bec122a951d462c239b47954cf5f36e33
fork: record start_time late
This changes the fork(2) syscall to record the process start_time after
initializing the basic task structure but still before making the new
process visible to user-space.

Technically, we could record the start_time anytime during fork(2). But
this might lead to scenarios where a start_time is recorded long before
a process becomes visible to user-space. For instance, with
userfaultfd(2) and TLS, user-space can delay the execution of fork(2)
for an indefinite amount of time (and will, if this causes network
access, or similar).

By recording the start_time late, it much closer reflects the point in
time where the process becomes live and can be observed by other
processes.

Lastly, this makes it much harder for user-space to predict and control
the start_time they get assigned. Previously, user-space could fork a
process and stall it in copy_thread_tls() before its pid is allocated,
but after its start_time is recorded. This can be misused to later-on
cycle through PIDs and resume the stalled fork(2) yielding a process
that has the same pid and start_time as a process that existed before.
This can be used to circumvent security systems that identify processes
by their pid+start_time combination.

Even though user-space was always aware that start_time recording is
flaky (but several projects are known to still rely on start_time-based
identification), changing the start_time to be recorded late will help
mitigate existing attacks and make it much harder for user-space to
control the start_time a process gets assigned.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
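For readers unfamiliar with the pid+start_time identification the message refers to, the following is a minimal user-space sketch of how such a check is commonly built on Linux: it reads field 22 (starttime, in clock ticks since boot) from /proc/<pid>/stat. The function names parse_starttime() and read_starttime() are hypothetical and are not taken from this commit or from any particular project; the sketch only illustrates the kind of consumer this change is meant to protect.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract field 22 (starttime) from the text of a
 * /proc/<pid>/stat line. Returns 0 on success, -1 on malformed input.
 */
int parse_starttime(const char *stat_line, unsigned long long *out)
{
	/* comm (field 2) may contain spaces and ')', so skip to the last ')'. */
	const char *p = strrchr(stat_line, ')');
	char *end;
	unsigned long long v;
	int field;

	if (!p)
		return -1;
	p++;

	/* Skip fields 3..21; each is a single space-separated token. */
	for (field = 3; field < 22; field++) {
		while (*p == ' ')
			p++;
		if (*p == '\0')
			return -1;
		while (*p != ' ' && *p != '\0')
			p++;
	}
	while (*p == ' ')
		p++;
	if (*p < '0' || *p > '9')
		return -1;

	errno = 0;
	v = strtoull(p, &end, 10);
	if (errno != 0 || end == p)
		return -1;
	*out = v;
	return 0;
}

/* Hypothetical helper: read the start time of a live process from procfs. */
int read_starttime(long pid, unsigned long long *out)
{
	char path[64];
	char buf[1024];
	size_t n;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%ld/stat", pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_starttime(buf, out);
}
```

A caller would store the (pid, starttime) pair when it first sees a process and later compare it against a fresh read_starttime() result. As the message notes, this pairing was never a strong identity; recording start_time late only makes a collision harder to engineer.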
Diffstat (limited to 'kernel/fork.c')
-rw-r--r-- kernel/fork.c 13
1 files changed, 11 insertions, 2 deletions
diff --git a/kernel/fork.c b/kernel/fork.c
index a60459947f18..7f49be94eba9 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1833,8 +1833,6 @@ static __latent_entropy struct task_struct *copy_process(
 
 	posix_cpu_timers_init(p);
 
-	p->start_time = ktime_get_ns();
-	p->real_start_time = ktime_get_boot_ns();
 	p->io_context = NULL;
 	audit_set_context(p, NULL);
 	cgroup_fork(p);
@@ -2001,6 +1999,17 @@ static __latent_entropy struct task_struct *copy_process(
 		goto bad_fork_free_pid;
 
 	/*
+	 * From this point on we must avoid any synchronous user-space
+	 * communication until we take the tasklist-lock. In particular, we do
+	 * not want user-space to be able to predict the process start-time by
+	 * stalling fork(2) after we recorded the start_time but before it is
+	 * visible to the system.
+	 */
+
+	p->start_time = ktime_get_ns();
+	p->real_start_time = ktime_get_boot_ns();
+
+	/*
 	 * Make it visible to the rest of the system, but dont wake it up yet.
 	 * Need tasklist lock for parent etc handling!
 	 */
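
The effect of moving the two assignments can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not kernel code: struct toy_task, toy_fork(), stall_ms() and now_ns() are hypothetical names, the sleep stands in for a user-controlled stall inside fork(2), and CLOCK_MONOTONIC is used as the user-space counterpart of ktime_get_ns().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Toy stand-in for the two points in time that matter in the commit. */
struct toy_task {
	uint64_t start_time;	/* when the start time was recorded */
	uint64_t visible_at;	/* when the task would become visible */
};

/* Read a clock in nanoseconds. */
uint64_t now_ns(clockid_t clk)
{
	struct timespec ts;

	clock_gettime(clk, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Models a stall that user-space can stretch at will. */
void stall_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
}

/*
 * Toy model of the ordering only: with record_late == 0 the timestamp is
 * taken before the stall (old behavior), otherwise after it (new behavior).
 */
void toy_fork(struct toy_task *t, int record_late, long stall)
{
	if (!record_late)
		t->start_time = now_ns(CLOCK_MONOTONIC);

	stall_ms(stall);	/* user-controlled delay */

	if (record_late)
		t->start_time = now_ns(CLOCK_MONOTONIC);

	t->visible_at = now_ns(CLOCK_MONOTONIC);
}
```

Calling toy_fork() once with record_late set to 0 and once with it set to 1 shows the difference: in the early case the gap between start_time and visible_at is at least the stall duration, while in the late case it shrinks to the cost of two back-to-back clock reads.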