author Linus Torvalds <torvalds@linux-foundation.org> 2019-01-02 12:48:13 -0500
committer Linus Torvalds <torvalds@linux-foundation.org> 2019-01-02 12:48:13 -0500
commit d9a7fa67b4bfe6ce93ee9aab23ae2e7ca0763e84
tree ea15c22c088160107c09da1c8d380753bb0c8d21 /Documentation/userspace-api
parent f218a29c25ad8abdb961435d6b8139f462061364
parent 55b8cbe470d103b44104c64dbf89e5cad525d4e0
Merge branch 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull seccomp updates from James Morris:

 - Add SECCOMP_RET_USER_NOTIF

 - seccomp fixes for sparse warnings and s390 build (Tycho)

* 'next-seccomp' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  seccomp, s390: fix build for syscall type change
  seccomp: fix poor type promotion
  samples: add an example of seccomp user trap
  seccomp: add a return code to trap to userspace
  seccomp: switch system call argument type to void *
  seccomp: hoist struct seccomp_data recalculation higher
Diffstat (limited to 'Documentation/userspace-api')
-rw-r--r-- Documentation/userspace-api/seccomp_filter.rst 84
1 files changed, 84 insertions, 0 deletions
diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
index 82a468bc7560..b1b846d8a094 100644
--- a/Documentation/userspace-api/seccomp_filter.rst
+++ b/Documentation/userspace-api/seccomp_filter.rst
@@ -122,6 +122,11 @@ In precedence order, they are:
	Results in the lower 16-bits of the return value being passed
	to userland as the errno without executing the system call.

``SECCOMP_RET_USER_NOTIF``:
	Results in a ``struct seccomp_notif`` message sent on the userspace
	notification fd, if it is attached, or ``-ENOSYS`` if it is not. See below
	for a discussion of how to handle user notifications.

``SECCOMP_RET_TRACE``:
	When returned, this value will cause the kernel to attempt to
	notify a ``ptrace()``-based tracer prior to executing the system
@@ -183,6 +188,85 @@ The ``samples/seccomp/`` directory contains both an x86-specific example
and a more generic example of a higher level macro interface for BPF
program generation.

Userspace Notification
======================

The ``SECCOMP_RET_USER_NOTIF`` return code lets seccomp filters pass a
particular syscall to userspace to be handled. This may be useful for
applications like container managers, which wish to intercept particular
syscalls (``mount()``, ``finit_module()``, etc.) and change their behavior.

To acquire a notification FD, use the ``SECCOMP_FILTER_FLAG_NEW_LISTENER``
argument to the ``seccomp()`` syscall:

.. code-block:: c

    fd = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_FLAG_NEW_LISTENER, &prog);

which (on success) will return a listener fd for the filter, which can then be
passed around via ``SCM_RIGHTS`` or similar. Note that filter fds correspond to
a particular filter, and not a particular task. So if this task then forks,
notifications from both tasks will appear on the same filter fd. Reads and
writes to/from a filter fd are also synchronized, so a filter fd can safely
have many readers.
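
As a rough sketch of the step above, the helpers below build a small BPF
program that returns ``SECCOMP_RET_USER_NOTIF`` for a single syscall and
install it with a listener. The names ``seccomp()`` (a raw-syscall wrapper,
assuming the C library does not provide one), ``build_notify_filter()`` and
``install_notify_filter()`` are illustrative assumptions rather than kernel
API, and a production filter should also check ``seccomp_data.arch``.

```c
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#define NOTIFY_FILTER_LEN 4

/* Assumed wrapper: invoke the seccomp() syscall directly. */
static int seccomp(unsigned int op, unsigned int flags, void *args)
{
	return (int)syscall(__NR_seccomp, op, flags, args);
}

/* Fill out[0..3]: trap syscall nr to userspace, allow everything else. */
static void build_notify_filter(unsigned int nr,
				struct sock_filter out[NOTIFY_FILTER_LEN])
{
	struct sock_filter f[NOTIFY_FILTER_LEN] = {
		/* Load the syscall number from struct seccomp_data. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* On a match fall through, otherwise skip one instruction. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, nr, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_USER_NOTIF),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};

	memcpy(out, f, sizeof(f));
}

/* Returns the listener fd on success, -1 with errno set on failure. */
static int install_notify_filter(unsigned int nr)
{
	struct sock_filter filter[NOTIFY_FILTER_LEN];
	struct sock_fprog prog = {
		.len = NOTIFY_FILTER_LEN,
		.filter = filter,
	};

	build_notify_filter(nr, filter);

	/* Unprivileged tasks must set no_new_privs before adding a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) < 0)
		return -1;

	return seccomp(SECCOMP_SET_MODE_FILTER,
		       SECCOMP_FILTER_FLAG_NEW_LISTENER, &prog);
}
```

A supervised task might call ``install_notify_filter(__NR_mount)`` and then
hand the returned fd to its supervisor, for instance over ``SCM_RIGHTS``.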

The interface for a seccomp notification fd consists of two structures,
``struct seccomp_notif`` and ``struct seccomp_notif_resp``; a third,
``struct seccomp_notif_sizes``, reports their sizes:

.. code-block:: c

    struct seccomp_notif_sizes {
        __u16 seccomp_notif;
        __u16 seccomp_notif_resp;
        __u16 seccomp_data;
    };

    struct seccomp_notif {
        __u64 id;
        __u32 pid;
        __u32 flags;
        struct seccomp_data data;
    };

    struct seccomp_notif_resp {
        __u64 id;
        __s64 val;
        __s32 error;
        __u32 flags;
    };

The ``struct seccomp_notif_sizes`` structure can be used to determine the size
of the various structures used in seccomp notifications. The size of ``struct
seccomp_data`` may change in the future, so code should use:

.. code-block:: c

    struct seccomp_notif_sizes sizes;
    seccomp(SECCOMP_GET_NOTIF_SIZES, 0, &sizes);

to determine the size of the various structures to allocate. See
samples/seccomp/user-trap.c for an example.
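
A minimal sketch of that allocation pattern follows. The helper name
``alloc_notif_buffers()`` is hypothetical, and the raw ``syscall()``
invocation assumes the C library has no ``seccomp()`` wrapper.

```c
#include <linux/seccomp.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: size the buffers from what the running kernel
 * reports instead of using sizeof() on the compile-time structures.
 * On success returns 0 and stores zeroed buffers the caller must free();
 * on failure returns -1 and leaves both pointers NULL.
 */
static int alloc_notif_buffers(struct seccomp_notif **req,
			       struct seccomp_notif_resp **resp,
			       struct seccomp_notif_sizes *sizes)
{
	*req = NULL;
	*resp = NULL;

	if (syscall(__NR_seccomp, SECCOMP_GET_NOTIF_SIZES, 0, sizes) < 0)
		return -1;

	*req = calloc(1, sizes->seccomp_notif);
	*resp = calloc(1, sizes->seccomp_notif_resp);
	if (!*req || !*resp) {
		free(*req);
		free(*resp);
		*req = NULL;
		*resp = NULL;
		return -1;
	}
	return 0;
}
```

Keeping ``sizes`` around lets the caller clear the whole request buffer
again before each receive.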

Users can call ``ioctl(SECCOMP_IOCTL_NOTIF_RECV)`` on a seccomp notification
fd (optionally waiting with ``poll()`` until a notification is pending) to
receive a ``struct seccomp_notif``, which contains four members: a
unique-per-filter ``id``, the ``pid`` of the task which triggered this request
(which may be 0 if the task is in a pid ns not visible from the listener's pid
namespace), a ``flags`` member which for now only has
``SECCOMP_NOTIF_FLAG_SIGNALED``, representing whether or not the notification
is a result of a non-fatal signal, and the ``data`` passed to seccomp.
Userspace can then decide what to do based on this information and send a
response with ``ioctl(SECCOMP_IOCTL_NOTIF_SEND)``, indicating what should be
returned to the task that made the syscall. The ``id`` member of ``struct
seccomp_notif_resp`` should be the same ``id`` as in ``struct
seccomp_notif``.
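
The receive/respond cycle described above could look roughly like the sketch
below. ``deny_with_errno()`` and ``handle_one()`` are hypothetical names, the
policy (always refuse) is only a placeholder, and the ``error`` field is
assumed to take a negative errno value, following
samples/seccomp/user-trap.c.

```c
#include <errno.h>
#include <linux/seccomp.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical policy: fail the trapped syscall with the given errno. */
static void deny_with_errno(const struct seccomp_notif *req,
			    struct seccomp_notif_resp *resp, int err)
{
	memset(resp, 0, sizeof(*resp));
	resp->id = req->id;	/* must echo the id of the request */
	resp->error = -err;	/* assumed: negative errno, as in the sample */
	resp->val = 0;
}

/* Receive one notification and refuse it; 0 on success, -1 on failure. */
static int handle_one(int listener, struct seccomp_notif *req,
		      size_t req_size, struct seccomp_notif_resp *resp)
{
	/* Start from a cleared request buffer for every receive. */
	memset(req, 0, req_size);
	if (ioctl(listener, SECCOMP_IOCTL_NOTIF_RECV, req) < 0)
		return -1;

	deny_with_errno(req, resp, EPERM);

	if (ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, resp) < 0)
		return -1;
	return 0;
}
```

A supervisor would typically call ``handle_one()`` in a loop, replacing the
placeholder policy with a decision based on ``req->data``.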

It is worth noting that ``struct seccomp_data`` contains the values of register
arguments to the syscall, but not the memory that any pointer arguments refer
to. The task's memory is accessible to suitably privileged tracers via
``ptrace()`` or ``/proc/pid/mem``. However, care should be taken to avoid the
TOCTOU issue mentioned above in this document: all arguments being read from
the tracee's memory should be read into the tracer's memory before any policy
decisions are made. This allows for an atomic decision on syscall arguments.

Sysctls
=======