author     Linus Torvalds <torvalds@linux-foundation.org>  2016-07-29 18:54:19 -0400
committer  Linus Torvalds <torvalds@linux-foundation.org>  2016-07-29 18:54:19 -0400
commit     a867d7349e94b6409b08629886a819f802377e91
tree       cf26734d638bbeee4e8f1ec58161933a55b922e2
parent     601f887d6105ddd28dc569a1504595bdf8df8a5b
parent     aeaa4a79ff6a5ed912b7362f206cf8576fca538b

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull userns vfs updates from Eric Biederman:

"This tree contains some very long awaited work on generalizing the
user namespace support for mounting filesystems to include filesystems
with a backing store.  The real world target is fuse but the goal is to
update the vfs to allow any filesystem to be supported.  This patchset
is based on a lot of code review and testing to approach that goal.

While looking at what is needed to support the fuse filesystem it
became clear that there were things like xattrs for security modules
that needed special treatment, that the resolution of those concerns
would not be fuse specific, and that sorting out these general issues
made most sense at the generic level, where the right people could be
drawn into the conversation, and the issues could be solved for
everyone.

At a high level this patchset does a couple of simple things:

 - Add a user namespace owner (s_user_ns) to struct super_block.

 - Teach the vfs to handle filesystem uids and gids not mapping into
   kuids and kgids and being reported as INVALID_UID and INVALID_GID
   in vfs data structures.

By assigning a user namespace owner, filesystems that are mounted with
only user namespace privilege can be detected.  This allows security
modules and the like to know which mounts may not be trusted.  This
also allows the set of uids and gids that are communicated to the
filesystem to be capped at the set of kuids and kgids that are in the
owning user namespace of the filesystem.

One of the crazier corner cases this handles is the case of inodes
whose i_uid or i_gid are not mapped into the vfs.  Most of the code
simply doesn't care but it is easy to confuse the inode writeback path
so no operation that could cause an inode write-back is permitted for
such inodes (aka only reads are allowed).

This set of changes starts out by cleaning up the code paths involved
in user namespace permitted mounts.  Then when things are clean enough
it adds code that cleanly sets s_user_ns.
Then additional restrictions are added that are possible now that the
filesystem superblock contains owner information.

These changes should not affect anyone in practice, but there are some
parts of these restrictions that are changes in behavior.

 - Andy's restriction on suid executables that does not honor the
   suid bit when the path is from another mount namespace (think
   /proc/[pid]/fd/) or when the filesystem was mounted by a less
   privileged user.

 - The replacement of the user namespace implicit setting of MNT_NODEV
   with implicitly setting SB_I_NODEV on the filesystem superblock
   instead.

   Using SB_I_NODEV is a stronger form that happens to make this state
   user invisible.  The user visibility can be managed but it caused
   problems when it was introduced from applications reasonably
   expecting mount flags to be what they were set to.

There is a little bit of work remaining before it is safe to support
mounting filesystems with backing store in user namespaces, beyond
what is in this set of changes.

 - Verifying the mounter has permission to read/write the block device
   during mount.

 - Teaching the integrity modules IMA and EVM to handle filesystems
   mounted with only user namespace root and to reduce trust in their
   security xattrs accordingly.

 - Capturing the mounter's credentials and using that for permission
   checks in d_automount and the like.  (Given that overlayfs already
   does this, and we need the work in d_automount, it makes sense to
   generalize this case).

Furthermore there are a few changes that are on the wishlist:

 - Get all filesystems supporting posix acls using the generic posix
   acls so that posix_acl_fix_xattr_from_user and
   posix_acl_fix_xattr_to_user may be removed.  [Maintainability]

 - Reducing the permission checks in places such as remount to allow
   the superblock owner to perform them.

 - Allowing the superblock owner to chown files with unmapped uids and
   gids to something that is mapped so the files may be treated
   normally.

I am not considering even obvious relaxations of permission checks
until it is clear there are no more corner cases that need to be locked
down and handled generically.

Many thanks to Seth Forshee for keeping this code alive, for putting
up with me rewriting substantial portions of what he did to handle
more corner cases, and for his diligent testing and reviewing of my
changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
  fs: Call d_automount with the filesystems creds
  fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
  evm: Translate user/group ids relative to s_user_ns when computing HMAC
  dquot: For now explicitly don't support filesystems outside of init_user_ns
  quota: Handle quota data stored in s_user_ns in quota_setxquota
  quota: Ensure qids map to the filesystem
  vfs: Don't create inodes with a uid or gid unknown to the vfs
  vfs: Don't modify inodes with a uid or gid unknown to the vfs
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  vfs: Verify acls are valid within superblock's s_user_ns.
  userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
  fs: Refuse uid/gid changes which don't map into s_user_ns
  selinux: Add support for unprivileged mounts from user namespaces
  Smack: Handle labels consistently in untrusted mounts
  Smack: Add support for unprivileged mounts from user namespaces
  fs: Treat foreign mounts as nosuid
  fs: Limit file caps to the user namespace of the super block
  userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
  userns: Remove implicit MNT_NODEV fragility.
  ...

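The central idea of the pull message, that an id the owning namespace cannot represent surfaces as an invalid id, can be sketched in userspace C. This is only a model under stated assumptions: every `sketch_` name, the extent layout, and the sentinel value are hypothetical stand-ins for illustration, not the kernel's own types or helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel for "no such id"; plays the role INVALID_UID/INVALID_GID play. */
#define SKETCH_INVALID_ID ((uint32_t)-1)

/*
 * One contiguous range: ids [first, first + count) inside the namespace
 * correspond to kernel-wide ids [lower_first, lower_first + count).
 */
struct sketch_extent {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

/* Hypothetical namespace: just a table of ranges. */
struct sketch_userns {
	const struct sketch_extent *extents;
	size_t nr_extents;
};

/* Namespace-local id -> kernel-wide id, or the sentinel if unmapped. */
uint32_t sketch_make_kid(const struct sketch_userns *ns, uint32_t id)
{
	assert(ns != NULL);
	for (size_t i = 0; i < ns->nr_extents; i++) {
		const struct sketch_extent *e = &ns->extents[i];

		if (id >= e->first && id - e->first < e->count)
			return e->lower_first + (id - e->first);
	}
	return SKETCH_INVALID_ID;
}

/* Kernel-wide id -> namespace-local id, or the sentinel if unmapped. */
uint32_t sketch_from_kid(const struct sketch_userns *ns, uint32_t kid)
{
	assert(ns != NULL);
	for (size_t i = 0; i < ns->nr_extents; i++) {
		const struct sketch_extent *e = &ns->extents[i];

		if (kid >= e->lower_first && kid - e->lower_first < e->count)
			return e->first + (kid - e->lower_first);
	}
	return SKETCH_INVALID_ID;
}

/* An id "has a mapping" when the reverse translation succeeds. */
bool sketch_kid_has_mapping(const struct sketch_userns *ns, uint32_t kid)
{
	return sketch_from_kid(ns, kid) != SKETCH_INVALID_ID;
}
```

With a single range mapping namespace ids 0..65535 onto kernel-wide ids starting at 100000, an on-disk owner of 70000 falls outside the range and comes back as the sentinel, which is the situation the unmapped-inode restrictions in this series are written for.
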
 drivers/staging/lustre/lustre/mdc/mdc_request.c |  2
 fs/9p/acl.c                                     |  2
 fs/attr.c                                       | 19
 fs/block_dev.c                                  |  2
 fs/devpts/inode.c                               |  3
 fs/exec.c                                       |  2
 fs/inode.c                                      |  7
 fs/kernfs/mount.c                               |  5
 fs/namei.c                                      | 55
 fs/namespace.c                                  | 99
 fs/nfsd/nfsctl.c                                | 13
 fs/posix_acl.c                                  |  8
 fs/proc/inode.c                                 | 15
 fs/proc/internal.h                              |  3
 fs/proc/root.c                                  | 61
 fs/quota/dquot.c                                |  8
 fs/quota/quota.c                                | 14
 fs/super.c                                      | 69
 fs/sysfs/mount.c                                |  5
 fs/xattr.c                                      |  7
 include/linux/fs.h                              | 79
 include/linux/mount.h                           |  1
 include/linux/posix_acl.h                       |  2
 include/linux/quota.h                           | 10
 include/linux/uidgid.h                          |  4
 include/linux/user_namespace.h                  |  6
 ipc/mqueue.c                                    | 20
 ipc/namespace.c                                 |  5
 kernel/cred.c                                   |  2
 kernel/user_namespace.c                         | 14
 net/sunrpc/rpc_pipe.c                           |  8
 security/commoncap.c                            | 10
 security/integrity/evm/evm_crypto.c             |  4
 security/selinux/hooks.c                        | 25
 security/smack/smack.h                          |  8
 security/smack/smack_lsm.c                      | 34
36 files changed, 418 insertions, 213 deletions
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index d4cc73bb6e1e..542801f04b0d 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -415,7 +415,7 @@ static int mdc_unpack_acl(struct ptlrpc_request *req, struct lustre_md *md)
 		return rc;
 	}
 
-	rc = posix_acl_valid(acl);
+	rc = posix_acl_valid(&init_user_ns, acl);
 	if (rc) {
 		CERROR("validate acl: %d\n", rc);
 		posix_acl_release(acl);
diff --git a/fs/9p/acl.c b/fs/9p/acl.c
index 0576eaeb60b9..5b6a1743ea17 100644
--- a/fs/9p/acl.c
+++ b/fs/9p/acl.c
@@ -266,7 +266,7 @@ static int v9fs_xattr_set_acl(const struct xattr_handler *handler,
 		if (IS_ERR(acl))
 			return PTR_ERR(acl);
 		else if (acl) {
-			retval = posix_acl_valid(acl);
+			retval = posix_acl_valid(inode->i_sb->s_user_ns, acl);
 			if (retval)
 				goto err_out;
 		}
diff --git a/fs/attr.c b/fs/attr.c
index 25b24d0f6c88..42bb42bb3c72 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -255,6 +255,25 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de
 	if (!(attr->ia_valid & ~(ATTR_KILL_SUID | ATTR_KILL_SGID)))
 		return 0;
 
+	/*
+	 * Verify that uid/gid changes are valid in the target
+	 * namespace of the superblock.
+	 */
+	if (ia_valid & ATTR_UID &&
+	    !kuid_has_mapping(inode->i_sb->s_user_ns, attr->ia_uid))
+		return -EOVERFLOW;
+	if (ia_valid & ATTR_GID &&
+	    !kgid_has_mapping(inode->i_sb->s_user_ns, attr->ia_gid))
+		return -EOVERFLOW;
+
+	/* Don't allow modifications of files with invalid uids or
+	 * gids unless those uids & gids are being made valid.
+	 */
+	if (!(ia_valid & ATTR_UID) && !uid_valid(inode->i_uid))
+		return -EOVERFLOW;
+	if (!(ia_valid & ATTR_GID) && !gid_valid(inode->i_gid))
+		return -EOVERFLOW;
+
 	error = security_inode_setattr(dentry, attr);
 	if (error)
 		return error;
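The two groups of checks added to notify_change() above can be read as one decision: a new owner must be representable in the superblock's namespace, and an inode whose current owner is unknown may only be touched by a change that replaces that owner. A minimal userspace sketch of that decision follows. All `sketch_` names, the single-range namespace, and the attribute bit values are assumptions made for illustration; they are not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_ID ((uint32_t)-1)
#define SKETCH_ATTR_UID   (1u << 0)	/* illustrative bit, not the kernel value */
#define SKETCH_ATTR_GID   (1u << 1)	/* illustrative bit, not the kernel value */

/* Hypothetical namespace owning one range of kernel-wide ids. */
struct sketch_userns {
	uint32_t lower_first;
	uint32_t count;
};

struct sketch_inode {
	uint32_t i_uid;
	uint32_t i_gid;
	const struct sketch_userns *s_user_ns;	/* owner of the superblock */
};

struct sketch_iattr {
	unsigned int ia_valid;
	uint32_t ia_uid;
	uint32_t ia_gid;
};

bool sketch_has_mapping(const struct sketch_userns *ns, uint32_t kid)
{
	assert(ns != NULL);
	return kid != SKETCH_INVALID_ID &&
	       kid >= ns->lower_first && kid - ns->lower_first < ns->count;
}

/* Returns 0 when the change may proceed, -EOVERFLOW otherwise. */
int sketch_check_owner_change(const struct sketch_inode *inode,
			      const struct sketch_iattr *attr)
{
	const struct sketch_userns *ns = inode->s_user_ns;

	/* A requested owner must be representable by the filesystem. */
	if ((attr->ia_valid & SKETCH_ATTR_UID) &&
	    !sketch_has_mapping(ns, attr->ia_uid))
		return -EOVERFLOW;
	if ((attr->ia_valid & SKETCH_ATTR_GID) &&
	    !sketch_has_mapping(ns, attr->ia_gid))
		return -EOVERFLOW;

	/* An unknown owner blocks every change that does not replace it. */
	if (!(attr->ia_valid & SKETCH_ATTR_UID) &&
	    inode->i_uid == SKETCH_INVALID_ID)
		return -EOVERFLOW;
	if (!(attr->ia_valid & SKETCH_ATTR_GID) &&
	    inode->i_gid == SKETCH_INVALID_ID)
		return -EOVERFLOW;

	return 0;
}
```

Note that in this model an inode with both ids unknown can only be rescued by a change that sets both at once; setting the uid alone still leaves the gid unknown and is refused.
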
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 5cbd5391667e..ada42cf42d06 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1846,7 +1846,7 @@ struct block_device *lookup_bdev(const char *pathname)
 	if (!S_ISBLK(inode->i_mode))
 		goto fail;
 	error = -EACCES;
-	if (path.mnt->mnt_flags & MNT_NODEV)
+	if (!may_open_dev(&path))
 		goto fail;
 	error = -ENOMEM;
 	bdev = bd_acquire(inode);
diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index 37c134a132c7..d116453b0276 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -396,6 +396,7 @@ devpts_fill_super(struct super_block *s, void *data, int silent)
 {
 	struct inode *inode;
 
+	s->s_iflags &= ~SB_I_NODEV;
 	s->s_blocksize = 1024;
 	s->s_blocksize_bits = 10;
 	s->s_magic = DEVPTS_SUPER_MAGIC;
@@ -480,7 +481,7 @@ static struct file_system_type devpts_fs_type = {
 	.name		= "devpts",
 	.mount		= devpts_mount,
 	.kill_sb	= devpts_kill_sb,
-	.fs_flags	= FS_USERNS_MOUNT | FS_USERNS_DEV_MOUNT,
+	.fs_flags	= FS_USERNS_MOUNT,
 };
 
 /*
diff --git a/fs/exec.c b/fs/exec.c
index 887c1c955df8..ca239fc86d8d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1411,7 +1411,7 @@ static void bprm_fill_uid(struct linux_binprm *bprm)
 	bprm->cred->euid = current_euid();
 	bprm->cred->egid = current_egid();
 
-	if (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID)
+	if (!mnt_may_suid(bprm->file->f_path.mnt))
 		return;
 
 	if (task_no_new_privs(current))
diff --git a/fs/inode.c b/fs/inode.c
index e171f7b5f9e4..9cef4e16aeda 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1619,6 +1619,13 @@ bool atime_needs_update(const struct path *path, struct inode *inode)
 
 	if (inode->i_flags & S_NOATIME)
 		return false;
+
+	/* Atime updates will likely cause i_uid and i_gid to be written
+	 * back improprely if their true value is unknown to the vfs.
+	 */
+	if (HAS_UNMAPPED_ID(inode))
+		return false;
+
 	if (IS_NOATIME(inode))
 		return false;
 	if ((inode->i_sb->s_flags & MS_NODIRATIME) && S_ISDIR(inode->i_mode))
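The atime hunk above is one instance of the rule from the pull message that nothing which could trigger an inode write-back is allowed on an inode with an unknown owner. A small userspace sketch of that guard follows. It assumes that HAS_UNMAPPED_ID() is true when either i_uid or i_gid is invalid (its definition is not part of the hunks shown here), and every `sketch_` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_ID ((uint32_t)-1)

struct sketch_inode {
	uint32_t i_uid;
	uint32_t i_gid;
	bool noatime;	/* stands in for the S_NOATIME / IS_NOATIME checks */
};

/* Assumed meaning of HAS_UNMAPPED_ID: either owner id is unknown. */
bool sketch_has_unmapped_id(const struct sketch_inode *inode)
{
	assert(inode != NULL);
	return inode->i_uid == SKETCH_INVALID_ID ||
	       inode->i_gid == SKETCH_INVALID_ID;
}

/* Decide whether a read should dirty the inode with a new access time. */
bool sketch_atime_needs_update(const struct sketch_inode *inode)
{
	if (inode->noatime)
		return false;

	/*
	 * Dirtying the inode would eventually write the placeholder
	 * owner ids back to disk, so skip the update entirely.
	 */
	if (sketch_has_unmapped_id(inode))
		return false;

	return true;
}
```

The design choice worth noticing is that the guard returns "no update needed" rather than an error: reads of such inodes keep working, they simply stop having side effects.
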
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 63534f5f9073..b3d73ad52b22 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -152,6 +152,8 @@ static int kernfs_fill_super(struct super_block *sb, unsigned long magic)
 	struct dentry *root;
 
 	info->sb = sb;
+	/* Userspace would break if executables or devices appear on sysfs */
+	sb->s_iflags |= SB_I_NOEXEC | SB_I_NODEV;
 	sb->s_blocksize = PAGE_SIZE;
 	sb->s_blocksize_bits = PAGE_SHIFT;
 	sb->s_magic = magic;
@@ -241,7 +243,8 @@ struct dentry *kernfs_mount_ns(struct file_system_type *fs_type, int flags,
 	info->root = root;
 	info->ns = ns;
 
-	sb = sget(fs_type, kernfs_test_super, kernfs_set_super, flags, info);
+	sb = sget_userns(fs_type, kernfs_test_super, kernfs_set_super, flags,
+			 &init_user_ns, info);
 	if (IS_ERR(sb) || sb->s_fs_info != info)
 		kfree(info);
 	if (IS_ERR(sb))
diff --git a/fs/namei.c b/fs/namei.c
index 68a896c804b7..c386a329ab20 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -36,6 +36,7 @@
 #include <linux/posix_acl.h>
 #include <linux/hash.h>
 #include <linux/bitops.h>
+#include <linux/init_task.h>
 #include <asm/uaccess.h>
 
 #include "internal.h"
@@ -410,6 +411,14 @@ int __inode_permission(struct inode *inode, int mask)
 		 */
 		if (IS_IMMUTABLE(inode))
 			return -EACCES;
+
+		/*
+		 * Updating mtime will likely cause i_uid and i_gid to be
+		 * written back improperly if their true value is unknown
+		 * to the vfs.
+		 */
+		if (HAS_UNMAPPED_ID(inode))
+			return -EACCES;
 	}
 
 	retval = do_inode_permission(inode, mask);
@@ -901,6 +910,7 @@ static inline int may_follow_link(struct nameidata *nd)
 {
 	const struct inode *inode;
 	const struct inode *parent;
+	kuid_t puid;
 
 	if (!sysctl_protected_symlinks)
 		return 0;
@@ -916,7 +926,8 @@ static inline int may_follow_link(struct nameidata *nd)
 		return 0;
 
 	/* Allowed if parent directory and link owner match. */
-	if (uid_eq(parent->i_uid, inode->i_uid))
+	puid = parent->i_uid;
+	if (uid_valid(puid) && uid_eq(puid, inode->i_uid))
 		return 0;
 
 	if (nd->flags & LOOKUP_RCU)
@@ -1089,6 +1100,7 @@ static int follow_automount(struct path *path, struct nameidata *nd,
 			    bool *need_mntput)
 {
 	struct vfsmount *mnt;
+	const struct cred *old_cred;
 	int err;
 
 	if (!path->dentry->d_op || !path->dentry->d_op->d_automount)
@@ -1110,11 +1122,16 @@ static int follow_automount(struct path *path, struct nameidata *nd,
 	    path->dentry->d_inode)
 		return -EISDIR;
 
+	if (path->dentry->d_sb->s_user_ns != &init_user_ns)
+		return -EACCES;
+
 	nd->total_link_count++;
 	if (nd->total_link_count >= 40)
 		return -ELOOP;
 
+	old_cred = override_creds(&init_cred);
 	mnt = path->dentry->d_op->d_automount(path);
+	revert_creds(old_cred);
 	if (IS_ERR(mnt)) {
 		/*
 		 * The filesystem is allowed to return -EISDIR here to indicate
@@ -2741,10 +2758,11 @@ EXPORT_SYMBOL(__check_sticky);
  *     c. have CAP_FOWNER capability
  *  6. If the victim is append-only or immutable we can't do antyhing with
  *     links pointing to it.
- *  7. If we were asked to remove a directory and victim isn't one - ENOTDIR.
- *  8. If we were asked to remove a non-directory and victim isn't one - EISDIR.
- *  9. We can't remove a root or mountpoint.
- * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
+ *  7. If the victim has an unknown uid or gid we can't change the inode.
+ *  8. If we were asked to remove a directory and victim isn't one - ENOTDIR.
+ *  9. If we were asked to remove a non-directory and victim isn't one - EISDIR.
+ * 10. We can't remove a root or mountpoint.
+ * 11. We don't allow removal of NFS sillyrenamed files; it's handled by
  *     nfs_async_unlink().
  */
 static int may_delete(struct inode *dir, struct dentry *victim, bool isdir)
@@ -2766,7 +2784,7 @@ static int may_delete(struct inode *dir, struct dentry *victim, bool isdir)
 		return -EPERM;
 
 	if (check_sticky(dir, inode) || IS_APPEND(inode) ||
-	    IS_IMMUTABLE(inode) || IS_SWAPFILE(inode))
+	    IS_IMMUTABLE(inode) || IS_SWAPFILE(inode) || HAS_UNMAPPED_ID(inode))
 		return -EPERM;
 	if (isdir) {
 		if (!d_is_dir(victim))
@@ -2787,16 +2805,22 @@ static int may_delete(struct inode *dir, struct dentry *victim, bool isdir)
  *  1. We can't do it if child already exists (open has special treatment for
  *     this case, but since we are inlined it's OK)
  *  2. We can't do it if dir is read-only (done in permission())
- *  3. We should have write and exec permissions on dir
- *  4. We can't do it if dir is immutable (done in permission())
+ *  3. We can't do it if the fs can't represent the fsuid or fsgid.
+ *  4. We should have write and exec permissions on dir
+ *  5. We can't do it if dir is immutable (done in permission())
  */
 static inline int may_create(struct inode *dir, struct dentry *child)
 {
+	struct user_namespace *s_user_ns;
 	audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE);
 	if (child->d_inode)
 		return -EEXIST;
 	if (IS_DEADDIR(dir))
 		return -ENOENT;
+	s_user_ns = dir->i_sb->s_user_ns;
+	if (!kuid_has_mapping(s_user_ns, current_fsuid()) ||
+	    !kgid_has_mapping(s_user_ns, current_fsgid()))
+		return -EOVERFLOW;
 	return inode_permission(dir, MAY_WRITE | MAY_EXEC);
 }
 
@@ -2865,6 +2889,12 @@ int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 }
 EXPORT_SYMBOL(vfs_create);
 
+bool may_open_dev(const struct path *path)
+{
+	return !(path->mnt->mnt_flags & MNT_NODEV) &&
+		!(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
+}
+
 static int may_open(struct path *path, int acc_mode, int flag)
 {
 	struct dentry *dentry = path->dentry;
@@ -2883,7 +2913,7 @@ static int may_open(struct path *path, int acc_mode, int flag)
 		break;
 	case S_IFBLK:
 	case S_IFCHR:
-		if (path->mnt->mnt_flags & MNT_NODEV)
+		if (!may_open_dev(path))
 			return -EACCES;
 		/*FALLTHRU*/
 	case S_IFIFO:
@@ -4135,6 +4165,13 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 	 */
 	if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
 		return -EPERM;
+	/*
+	 * Updating the link count will likely cause i_uid and i_gid to
+	 * be writen back improperly if their true value is unknown to
+	 * the vfs.
+	 */
+	if (HAS_UNMAPPED_ID(inode))
+		return -EPERM;
 	if (!dir->i_op->link)
 		return -EPERM;
 	if (S_ISDIR(inode->i_mode))
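The new may_open_dev() helper in this file replaces the bare MNT_NODEV tests: a device node may be opened only when neither the mount nor its superblock forbids devices. A userspace sketch of that two-level check follows. The struct layouts, the `sketch_` names, and the bit values are assumptions for illustration only; the real constants live in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values, not taken from the kernel headers. */
#define SKETCH_MNT_NODEV  0x02u	/* per-mount flag, visible to userspace */
#define SKETCH_SB_I_NODEV 0x04u	/* per-superblock internal flag */

struct sketch_super {
	unsigned int s_iflags;
};

struct sketch_mount {
	unsigned int mnt_flags;
	const struct sketch_super *mnt_sb;
};

/* Device nodes are usable only if both levels allow them. */
bool sketch_may_open_dev(const struct sketch_mount *mnt)
{
	assert(mnt != NULL && mnt->mnt_sb != NULL);
	return !(mnt->mnt_flags & SKETCH_MNT_NODEV) &&
	       !(mnt->mnt_sb->s_iflags & SKETCH_SB_I_NODEV);
}
```

Because the superblock flag applies to every mount of that filesystem, including bind mounts, it cannot be undone by a remount that clears the per-mount flag, which matches the pull message's description of SB_I_NODEV as the stronger form.
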
diff --git a/fs/namespace.c b/fs/namespace.c
index 419f746d851d..7bb2cda3bfef 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2186,13 +2186,7 @@ static int do_remount(struct path *path, int flags, int mnt_flags,
 	}
 	if ((mnt->mnt.mnt_flags & MNT_LOCK_NODEV) &&
 	    !(mnt_flags & MNT_NODEV)) {
-		/* Was the nodev implicitly added in mount? */
-		if ((mnt->mnt_ns->user_ns != &init_user_ns) &&
-		    !(sb->s_type->fs_flags & FS_USERNS_DEV_MOUNT)) {
-			mnt_flags |= MNT_NODEV;
-		} else {
-			return -EPERM;
-		}
+		return -EPERM;
 	}
 	if ((mnt->mnt.mnt_flags & MNT_LOCK_NOSUID) &&
 	    !(mnt_flags & MNT_NOSUID)) {
@@ -2376,7 +2370,7 @@ unlock:
 	return err;
 }
 
-static bool fs_fully_visible(struct file_system_type *fs_type, int *new_mnt_flags);
+static bool mount_too_revealing(struct vfsmount *mnt, int *new_mnt_flags);
 
 /*
  * create a new mount for userspace and request it to be added into the
@@ -2386,7 +2380,6 @@ static int do_new_mount(struct path *path, const char *fstype, int flags,
 			int mnt_flags, const char *name, void *data)
 {
 	struct file_system_type *type;
-	struct user_namespace *user_ns = current->nsproxy->mnt_ns->user_ns;
 	struct vfsmount *mnt;
 	int err;
 
@@ -2397,26 +2390,6 @@ static int do_new_mount(struct path *path, const char *fstype, int flags,
 	if (!type)
 		return -ENODEV;
 
-	if (user_ns != &init_user_ns) {
-		if (!(type->fs_flags & FS_USERNS_MOUNT)) {
-			put_filesystem(type);
-			return -EPERM;
-		}
-		/* Only in special cases allow devices from mounts
-		 * created outside the initial user namespace.
-		 */
-		if (!(type->fs_flags & FS_USERNS_DEV_MOUNT)) {
-			flags |= MS_NODEV;
-			mnt_flags |= MNT_NODEV | MNT_LOCK_NODEV;
-		}
-		if (type->fs_flags & FS_USERNS_VISIBLE) {
-			if (!fs_fully_visible(type, &mnt_flags)) {
-				put_filesystem(type);
-				return -EPERM;
-			}
-		}
-	}
-
 	mnt = vfs_kern_mount(type, flags, name, data);
 	if (!IS_ERR(mnt) && (type->fs_flags & FS_HAS_SUBTYPE) &&
 	    !mnt->mnt_sb->s_subtype)
@@ -2426,6 +2399,11 @@ static int do_new_mount(struct path *path, const char *fstype, int flags,
 	if (IS_ERR(mnt))
 		return PTR_ERR(mnt);
 
+	if (mount_too_revealing(mnt, &mnt_flags)) {
+		mntput(mnt);
+		return -EPERM;
+	}
+
 	err = do_add_mount(real_mount(mnt), path, mnt_flags);
 	if (err)
 		mntput(mnt);
@@ -3217,22 +3195,19 @@ bool current_chrooted(void)
 	return chrooted;
 }
 
-static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags)
+static bool mnt_already_visible(struct mnt_namespace *ns, struct vfsmount *new,
+				int *new_mnt_flags)
 {
-	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
 	int new_flags = *new_mnt_flags;
 	struct mount *mnt;
 	bool visible = false;
 
-	if (unlikely(!ns))
-		return false;
-
 	down_read(&namespace_sem);
 	list_for_each_entry(mnt, &ns->list, mnt_list) {
 		struct mount *child;
 		int mnt_flags;
 
-		if (mnt->mnt.mnt_sb->s_type != type)
+		if (mnt->mnt.mnt_sb->s_type != new->mnt_sb->s_type)
 			continue;
 
 		/* This mount is not fully visible if it's root directory
@@ -3241,12 +3216,8 @@ static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags)
 		if (mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root)
 			continue;
 
-		/* Read the mount flags and filter out flags that
-		 * may safely be ignored.
-		 */
+		/* A local view of the mount flags */
 		mnt_flags = mnt->mnt.mnt_flags;
-		if (mnt->mnt.mnt_sb->s_iflags & SB_I_NOEXEC)
-			mnt_flags &= ~(MNT_LOCK_NOSUID | MNT_LOCK_NOEXEC);
 
 		/* Don't miss readonly hidden in the superblock flags */
 		if (mnt->mnt.mnt_sb->s_flags & MS_RDONLY)
@@ -3258,15 +3229,6 @@ static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags)
 		if ((mnt_flags & MNT_LOCK_READONLY) &&
 		    !(new_flags & MNT_READONLY))
 			continue;
-		if ((mnt_flags & MNT_LOCK_NODEV) &&
-		    !(new_flags & MNT_NODEV))
-			continue;
-		if ((mnt_flags & MNT_LOCK_NOSUID) &&
-		    !(new_flags & MNT_NOSUID))
-			continue;
-		if ((mnt_flags & MNT_LOCK_NOEXEC) &&
-		    !(new_flags & MNT_NOEXEC))
-			continue;
 		if ((mnt_flags & MNT_LOCK_ATIME) &&
 		    ((mnt_flags & MNT_ATIME_MASK) != (new_flags & MNT_ATIME_MASK)))
 			continue;
@@ -3286,9 +3248,6 @@ static bool fs_fully_visible(struct file_system_type *type, int *new_mnt_flags)
 		}
 		/* Preserve the locked attributes */
 		*new_mnt_flags |= mnt_flags & (MNT_LOCK_READONLY | \
-					       MNT_LOCK_NODEV | \
-					       MNT_LOCK_NOSUID | \
-					       MNT_LOCK_NOEXEC | \
 					       MNT_LOCK_ATIME);
 		visible = true;
 		goto found;
@@ -3299,6 +3258,42 @@ found:
 	return visible;
 }
 
+static bool mount_too_revealing(struct vfsmount *mnt, int *new_mnt_flags)
+{
+	const unsigned long required_iflags = SB_I_NOEXEC | SB_I_NODEV;
+	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
+	unsigned long s_iflags;
+
+	if (ns->user_ns == &init_user_ns)
+		return false;
+
+	/* Can this filesystem be too revealing? */
+	s_iflags = mnt->mnt_sb->s_iflags;
+	if (!(s_iflags & SB_I_USERNS_VISIBLE))
+		return false;
+
+	if ((s_iflags & required_iflags) != required_iflags) {
+		WARN_ONCE(1, "Expected s_iflags to contain 0x%lx\n",
+			  required_iflags);
+		return true;
+	}
+
+	return !mnt_already_visible(ns, mnt, new_mnt_flags);
+}
+
+bool mnt_may_suid(struct vfsmount *mnt)
+{
+	/*
+	 * Foreign mounts (accessed via fchdir or through /proc
+	 * symlinks) are always treated as if they are nosuid.  This
+	 * prevents namespaces from trusting potentially unsafe
+	 * suid/sgid bits, file caps, or security labels that originate
+	 * in other namespaces.
+	 */
+	return !(mnt->mnt_flags & MNT_NOSUID) && check_mnt(real_mount(mnt)) &&
+	       current_in_userns(mnt->mnt_sb->s_user_ns);
+}
+
 static struct ns_common *mntns_get(struct task_struct *task)
 {
 	struct ns_common *ns = NULL;
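mnt_may_suid() above combines three conditions: the mount is not nosuid, it belongs to the caller's mount namespace, and the caller lives in the user namespace that owns the superblock or in one of its descendants. The last condition is what makes a filesystem mounted by a less privileged user behave as nosuid for more privileged callers. A userspace sketch of the combined check follows. It assumes the namespace test is a walk up the parent chain, and all `sketch_` types, fields, and flag values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MNT_NOSUID 0x01u	/* illustrative bit, not the kernel value */

/* Hypothetical user namespace: only the parent link matters here. */
struct sketch_userns {
	const struct sketch_userns *parent;	/* NULL for the initial namespace */
};

struct sketch_mount {
	unsigned int mnt_flags;
	int mnt_ns_id;				/* mount namespace the mount lives in */
	const struct sketch_userns *s_user_ns;	/* owner of the superblock */
};

struct sketch_task {
	int mnt_ns_id;				/* caller's mount namespace */
	const struct sketch_userns *user_ns;	/* caller's user namespace */
};

/* True if ns is target itself or one of target's descendants. */
bool sketch_in_userns(const struct sketch_userns *ns,
		      const struct sketch_userns *target)
{
	for (; ns != NULL; ns = ns->parent) {
		if (ns == target)
			return true;
	}
	return false;
}

/* Honour suid/sgid bits only for trusted, local mounts. */
bool sketch_mnt_may_suid(const struct sketch_mount *mnt,
			 const struct sketch_task *task)
{
	assert(mnt != NULL && task != NULL);
	return !(mnt->mnt_flags & SKETCH_MNT_NOSUID) &&
	       mnt->mnt_ns_id == task->mnt_ns_id &&
	       sketch_in_userns(task->user_ns, mnt->s_user_ns);
}
```

In this model a task in the initial namespace executing a binary from a filesystem owned by a child namespace fails the last test, so the suid bit is ignored, while a task inside the child namespace still sees it honoured within that namespace.
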
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index e7787777620e..65ad0165a94f 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1151,20 +1151,15 @@ static int nfsd_fill_super(struct super_block * sb, void * data, int silent)
 #endif
 		/* last one */ {""}
 	};
-	struct net *net = data;
-	int ret;
-
-	ret = simple_fill_super(sb, 0x6e667364, nfsd_files);
-	if (ret)
-		return ret;
-	sb->s_fs_info = get_net(net);
-	return 0;
+	get_net(sb->s_fs_info);
+	return simple_fill_super(sb, 0x6e667364, nfsd_files);
 }
 
 static struct dentry *nfsd_mount(struct file_system_type *fs_type,
 	int flags, const char *dev_name, void *data)
 {
-	return mount_ns(fs_type, flags, current->nsproxy->net_ns, nfsd_fill_super);
+	struct net *net = current->nsproxy->net_ns;
+	return mount_ns(fs_type, flags, data, net, net->user_ns, nfsd_fill_super);
 }
 
 static void nfsd_umount(struct super_block *sb)
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index edc452c2a563..59d47ab0791a 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -205,7 +205,7 @@ posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
  * Check if an acl is valid. Returns 0 if it is, or -E... otherwise.
  */
 int
-posix_acl_valid(const struct posix_acl *acl)
+posix_acl_valid(struct user_namespace *user_ns, const struct posix_acl *acl)
 {
 	const struct posix_acl_entry *pa, *pe;
 	int state = ACL_USER_OBJ;
@@ -225,7 +225,7 @@ posix_acl_valid(const struct posix_acl *acl)
 			case ACL_USER:
 				if (state != ACL_USER)
 					return -EINVAL;
-				if (!uid_valid(pa->e_uid))
+				if (!kuid_has_mapping(user_ns, pa->e_uid))
 					return -EINVAL;
 				needs_mask = 1;
 				break;
@@ -240,7 +240,7 @@ posix_acl_valid(const struct posix_acl *acl)
 			case ACL_GROUP:
 				if (state != ACL_GROUP)
 					return -EINVAL;
-				if (!gid_valid(pa->e_gid))
+				if (!kgid_has_mapping(user_ns, pa->e_gid))
 					return -EINVAL;
 				needs_mask = 1;
 				break;
@@ -834,7 +834,7 @@ set_posix_acl(struct inode *inode, int type, struct posix_acl *acl)
834 return -EPERM; 834 return -EPERM;
835 835
836 if (acl) { 836 if (acl) {
837 int ret = posix_acl_valid(acl); 837 int ret = posix_acl_valid(inode->i_sb->s_user_ns, acl);
838 if (ret) 838 if (ret)
839 return ret; 839 return ret;
840 } 840 }
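The fs/posix_acl.c hunks above replace `uid_valid()`/`gid_valid()` with `kuid_has_mapping()`/`kgid_has_mapping()` against the superblock's `s_user_ns`. A minimal userspace sketch of why the new check is stricter, assuming a toy namespace with a single contiguous mapping extent. The `toy_userns` type and every `toy_*` helper are hypothetical stand-ins for illustration, not kernel APIs:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INVALID_ID ((uint32_t)-1)

/* Hypothetical stand-in for a user namespace with one mapping extent:
 * ids [first, first + count) inside the namespace correspond to
 * kernel-wide ids [lower_first, lower_first + count). */
struct toy_userns {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

/* Analogue of uid_valid(): only rejects the "invalid" sentinel. */
static bool toy_kid_valid(uint32_t kid)
{
	return kid != TOY_INVALID_ID;
}

/* Analogue of from_kuid(): kernel-wide id -> id inside the namespace,
 * or the sentinel when the id falls outside the extent. */
static uint32_t toy_from_kid(const struct toy_userns *ns, uint32_t kid)
{
	if (kid >= ns->lower_first && kid - ns->lower_first < ns->count)
		return ns->first + (kid - ns->lower_first);
	return TOY_INVALID_ID;
}

/* Analogue of kuid_has_mapping(): the id must be representable in ns. */
static bool toy_kid_has_mapping(const struct toy_userns *ns, uint32_t kid)
{
	return toy_from_kid(ns, kid) != TOY_INVALID_ID;
}
```

In this sketch an id can be valid kernel-wide and still have no mapping in the filesystem's namespace, which is the case the new `-EINVAL` path in `posix_acl_valid()` appears to target.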
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 42305ddcbaa0..c1b72388e571 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -457,17 +457,30 @@ struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de)
457 return inode; 457 return inode;
458} 458}
459 459
460int proc_fill_super(struct super_block *s) 460int proc_fill_super(struct super_block *s, void *data, int silent)
461{ 461{
462 struct pid_namespace *ns = get_pid_ns(s->s_fs_info);
462 struct inode *root_inode; 463 struct inode *root_inode;
463 int ret; 464 int ret;
464 465
466 if (!proc_parse_options(data, ns))
467 return -EINVAL;
468
469 /* User space would break if executables or devices appear on proc */
470 s->s_iflags |= SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV;
465 s->s_flags |= MS_NODIRATIME | MS_NOSUID | MS_NOEXEC; 471 s->s_flags |= MS_NODIRATIME | MS_NOSUID | MS_NOEXEC;
466 s->s_blocksize = 1024; 472 s->s_blocksize = 1024;
467 s->s_blocksize_bits = 10; 473 s->s_blocksize_bits = 10;
468 s->s_magic = PROC_SUPER_MAGIC; 474 s->s_magic = PROC_SUPER_MAGIC;
469 s->s_op = &proc_sops; 475 s->s_op = &proc_sops;
470 s->s_time_gran = 1; 476 s->s_time_gran = 1;
477
478 /*
479 * procfs isn't actually a stacking filesystem; however, there is
480 * too much magic going on inside it to permit stacking things on
481 * top of it
482 */
483 s->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH;
471 484
472 pde_get(&proc_root); 485 pde_get(&proc_root);
473 root_inode = proc_get_inode(s, &proc_root); 486 root_inode = proc_get_inode(s, &proc_root);
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index aa2781095bd1..7931c558c192 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -212,7 +212,7 @@ extern const struct inode_operations proc_pid_link_inode_operations;
212 212
213extern void proc_init_inodecache(void); 213extern void proc_init_inodecache(void);
214extern struct inode *proc_get_inode(struct super_block *, struct proc_dir_entry *); 214extern struct inode *proc_get_inode(struct super_block *, struct proc_dir_entry *);
215extern int proc_fill_super(struct super_block *); 215extern int proc_fill_super(struct super_block *, void *data, int flags);
216extern void proc_entry_rundown(struct proc_dir_entry *); 216extern void proc_entry_rundown(struct proc_dir_entry *);
217 217
218/* 218/*
@@ -268,6 +268,7 @@ static inline void proc_tty_init(void) {}
268 * root.c 268 * root.c
269 */ 269 */
270extern struct proc_dir_entry proc_root; 270extern struct proc_dir_entry proc_root;
271extern int proc_parse_options(char *options, struct pid_namespace *pid);
271 272
272extern void proc_self_init(void); 273extern void proc_self_init(void);
273extern int proc_remount(struct super_block *, int *, char *); 274extern int proc_remount(struct super_block *, int *, char *);
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 06702783bf40..8d3e484055a6 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -23,21 +23,6 @@
23 23
24#include "internal.h" 24#include "internal.h"
25 25
26static int proc_test_super(struct super_block *sb, void *data)
27{
28 return sb->s_fs_info == data;
29}
30
31static int proc_set_super(struct super_block *sb, void *data)
32{
33 int err = set_anon_super(sb, NULL);
34 if (!err) {
35 struct pid_namespace *ns = (struct pid_namespace *)data;
36 sb->s_fs_info = get_pid_ns(ns);
37 }
38 return err;
39}
40
41enum { 26enum {
42 Opt_gid, Opt_hidepid, Opt_err, 27 Opt_gid, Opt_hidepid, Opt_err,
43}; 28};
@@ -48,7 +33,7 @@ static const match_table_t tokens = {
48 {Opt_err, NULL}, 33 {Opt_err, NULL},
49}; 34};
50 35
51static int proc_parse_options(char *options, struct pid_namespace *pid) 36int proc_parse_options(char *options, struct pid_namespace *pid)
52{ 37{
53 char *p; 38 char *p;
54 substring_t args[MAX_OPT_ARGS]; 39 substring_t args[MAX_OPT_ARGS];
@@ -100,52 +85,16 @@ int proc_remount(struct super_block *sb, int *flags, char *data)
100static struct dentry *proc_mount(struct file_system_type *fs_type, 85static struct dentry *proc_mount(struct file_system_type *fs_type,
101 int flags, const char *dev_name, void *data) 86 int flags, const char *dev_name, void *data)
102{ 87{
103 int err;
104 struct super_block *sb;
105 struct pid_namespace *ns; 88 struct pid_namespace *ns;
106 char *options;
107 89
108 if (flags & MS_KERNMOUNT) { 90 if (flags & MS_KERNMOUNT) {
109 ns = (struct pid_namespace *)data; 91 ns = data;
110 options = NULL; 92 data = NULL;
111 } else { 93 } else {
112 ns = task_active_pid_ns(current); 94 ns = task_active_pid_ns(current);
113 options = data;
114
115 /* Does the mounter have privilege over the pid namespace? */
116 if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
117 return ERR_PTR(-EPERM);
118 }
119
120 sb = sget(fs_type, proc_test_super, proc_set_super, flags, ns);
121 if (IS_ERR(sb))
122 return ERR_CAST(sb);
123
124 /*
125 * procfs isn't actually a stacking filesystem; however, there is
126 * too much magic going on inside it to permit stacking things on
127 * top of it
128 */
129 sb->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH;
130
131 if (!proc_parse_options(options, ns)) {
132 deactivate_locked_super(sb);
133 return ERR_PTR(-EINVAL);
134 }
135
136 if (!sb->s_root) {
137 err = proc_fill_super(sb);
138 if (err) {
139 deactivate_locked_super(sb);
140 return ERR_PTR(err);
141 }
142
143 sb->s_flags |= MS_ACTIVE;
144 /* User space would break if executables appear on proc */
145 sb->s_iflags |= SB_I_NOEXEC;
146 } 95 }
147 96
148 return dget(sb->s_root); 97 return mount_ns(fs_type, flags, data, ns, ns->user_ns, proc_fill_super);
149} 98}
150 99
151static void proc_kill_sb(struct super_block *sb) 100static void proc_kill_sb(struct super_block *sb)
@@ -165,7 +114,7 @@ static struct file_system_type proc_fs_type = {
165 .name = "proc", 114 .name = "proc",
166 .mount = proc_mount, 115 .mount = proc_mount,
167 .kill_sb = proc_kill_sb, 116 .kill_sb = proc_kill_sb,
168 .fs_flags = FS_USERNS_VISIBLE | FS_USERNS_MOUNT, 117 .fs_flags = FS_USERNS_MOUNT,
169}; 118};
170 119
171void __init proc_root_init(void) 120void __init proc_root_init(void)
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index b1322dd9d136..1bfac28b7e7d 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -841,6 +841,9 @@ struct dquot *dqget(struct super_block *sb, struct kqid qid)
841 unsigned int hashent = hashfn(sb, qid); 841 unsigned int hashent = hashfn(sb, qid);
842 struct dquot *dquot, *empty = NULL; 842 struct dquot *dquot, *empty = NULL;
843 843
844 if (!qid_has_mapping(sb->s_user_ns, qid))
845 return ERR_PTR(-EINVAL);
846
844 if (!sb_has_quota_active(sb, qid.type)) 847 if (!sb_has_quota_active(sb, qid.type))
845 return ERR_PTR(-ESRCH); 848 return ERR_PTR(-ESRCH);
846we_slept: 849we_slept:
@@ -2268,6 +2271,11 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
2268 error = -EINVAL; 2271 error = -EINVAL;
2269 goto out_fmt; 2272 goto out_fmt;
2270 } 2273 }
2274 /* Filesystems outside of init_user_ns not yet supported */
2275 if (sb->s_user_ns != &init_user_ns) {
2276 error = -EINVAL;
2277 goto out_fmt;
2278 }
2271 /* Usage always has to be set... */ 2279 /* Usage always has to be set... */
2272 if (!(flags & DQUOT_USAGE_ENABLED)) { 2280 if (!(flags & DQUOT_USAGE_ENABLED)) {
2273 error = -EINVAL; 2281 error = -EINVAL;
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 0f10ee9892ce..35df08ee9c97 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -211,7 +211,7 @@ static int quota_getquota(struct super_block *sb, int type, qid_t id,
211 if (!sb->s_qcop->get_dqblk) 211 if (!sb->s_qcop->get_dqblk)
212 return -ENOSYS; 212 return -ENOSYS;
213 qid = make_kqid(current_user_ns(), type, id); 213 qid = make_kqid(current_user_ns(), type, id);
214 if (!qid_valid(qid)) 214 if (!qid_has_mapping(sb->s_user_ns, qid))
215 return -EINVAL; 215 return -EINVAL;
216 ret = sb->s_qcop->get_dqblk(sb, qid, &fdq); 216 ret = sb->s_qcop->get_dqblk(sb, qid, &fdq);
217 if (ret) 217 if (ret)
@@ -237,7 +237,7 @@ static int quota_getnextquota(struct super_block *sb, int type, qid_t id,
237 if (!sb->s_qcop->get_nextdqblk) 237 if (!sb->s_qcop->get_nextdqblk)
238 return -ENOSYS; 238 return -ENOSYS;
239 qid = make_kqid(current_user_ns(), type, id); 239 qid = make_kqid(current_user_ns(), type, id);
240 if (!qid_valid(qid)) 240 if (!qid_has_mapping(sb->s_user_ns, qid))
241 return -EINVAL; 241 return -EINVAL;
242 ret = sb->s_qcop->get_nextdqblk(sb, &qid, &fdq); 242 ret = sb->s_qcop->get_nextdqblk(sb, &qid, &fdq);
243 if (ret) 243 if (ret)
@@ -288,7 +288,7 @@ static int quota_setquota(struct super_block *sb, int type, qid_t id,
288 if (!sb->s_qcop->set_dqblk) 288 if (!sb->s_qcop->set_dqblk)
289 return -ENOSYS; 289 return -ENOSYS;
290 qid = make_kqid(current_user_ns(), type, id); 290 qid = make_kqid(current_user_ns(), type, id);
291 if (!qid_valid(qid)) 291 if (!qid_has_mapping(sb->s_user_ns, qid))
292 return -EINVAL; 292 return -EINVAL;
293 copy_from_if_dqblk(&fdq, &idq); 293 copy_from_if_dqblk(&fdq, &idq);
294 return sb->s_qcop->set_dqblk(sb, qid, &fdq); 294 return sb->s_qcop->set_dqblk(sb, qid, &fdq);
@@ -581,10 +581,10 @@ static int quota_setxquota(struct super_block *sb, int type, qid_t id,
581 if (!sb->s_qcop->set_dqblk) 581 if (!sb->s_qcop->set_dqblk)
582 return -ENOSYS; 582 return -ENOSYS;
583 qid = make_kqid(current_user_ns(), type, id); 583 qid = make_kqid(current_user_ns(), type, id);
584 if (!qid_valid(qid)) 584 if (!qid_has_mapping(sb->s_user_ns, qid))
585 return -EINVAL; 585 return -EINVAL;
586 /* Are we actually setting timer / warning limits for all users? */ 586 /* Are we actually setting timer / warning limits for all users? */
587 if (from_kqid(&init_user_ns, qid) == 0 && 587 if (from_kqid(sb->s_user_ns, qid) == 0 &&
588 fdq.d_fieldmask & (FS_DQ_WARNS_MASK | FS_DQ_TIMER_MASK)) { 588 fdq.d_fieldmask & (FS_DQ_WARNS_MASK | FS_DQ_TIMER_MASK)) {
589 struct qc_info qinfo; 589 struct qc_info qinfo;
590 int ret; 590 int ret;
@@ -642,7 +642,7 @@ static int quota_getxquota(struct super_block *sb, int type, qid_t id,
642 if (!sb->s_qcop->get_dqblk) 642 if (!sb->s_qcop->get_dqblk)
643 return -ENOSYS; 643 return -ENOSYS;
644 qid = make_kqid(current_user_ns(), type, id); 644 qid = make_kqid(current_user_ns(), type, id);
645 if (!qid_valid(qid)) 645 if (!qid_has_mapping(sb->s_user_ns, qid))
646 return -EINVAL; 646 return -EINVAL;
647 ret = sb->s_qcop->get_dqblk(sb, qid, &qdq); 647 ret = sb->s_qcop->get_dqblk(sb, qid, &qdq);
648 if (ret) 648 if (ret)
@@ -669,7 +669,7 @@ static int quota_getnextxquota(struct super_block *sb, int type, qid_t id,
669 if (!sb->s_qcop->get_nextdqblk) 669 if (!sb->s_qcop->get_nextdqblk)
670 return -ENOSYS; 670 return -ENOSYS;
671 qid = make_kqid(current_user_ns(), type, id); 671 qid = make_kqid(current_user_ns(), type, id);
672 if (!qid_valid(qid)) 672 if (!qid_has_mapping(sb->s_user_ns, qid))
673 return -EINVAL; 673 return -EINVAL;
674 ret = sb->s_qcop->get_nextdqblk(sb, &qid, &qdq); 674 ret = sb->s_qcop->get_nextdqblk(sb, &qid, &qdq);
675 if (ret) 675 if (ret)
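The fs/quota/quota.c hunks above all follow the same pattern: the id is first translated from the caller's namespace with `make_kqid(current_user_ns(), ...)`, then checked with `qid_has_mapping(sb->s_user_ns, qid)`. A rough userspace model of that two-step translation, assuming single-extent namespaces. The `toy_ns` type, the `toy_*` helpers and the `fs_id` out-parameter are illustrative assumptions, not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_INVALID ((uint32_t)-1)
#define TOY_EINVAL  22

/* Hypothetical single-extent namespace: ids [first, first + count)
 * map to kernel-wide ids [lower_first, lower_first + count). */
struct toy_ns {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

/* Analogue of make_kqid(): namespace id -> kernel-wide id. */
static uint32_t toy_make_kid(const struct toy_ns *ns, uint32_t id)
{
	if (id >= ns->first && id - ns->first < ns->count)
		return ns->lower_first + (id - ns->first);
	return TOY_INVALID;
}

/* Analogue of from_kqid(): kernel-wide id -> namespace id. */
static uint32_t toy_from_kid(const struct toy_ns *ns, uint32_t kid)
{
	if (kid >= ns->lower_first && kid - ns->lower_first < ns->count)
		return ns->first + (kid - ns->lower_first);
	return TOY_INVALID;
}

/* Translate the caller's id, then require that it maps into the
 * filesystem's namespace. On success, *fs_id holds the id as the
 * filesystem's namespace sees it. */
static int toy_quota_lookup(const struct toy_ns *caller_ns,
			    const struct toy_ns *sb_ns,
			    uint32_t id, uint32_t *fs_id)
{
	uint32_t kid = toy_make_kid(caller_ns, id);
	uint32_t mapped = toy_from_kid(sb_ns, kid);

	if (mapped == TOY_INVALID)
		return -TOY_EINVAL;
	*fs_id = mapped;
	return 0;
}
```

The same model suggests why `quota_setxquota()` now compares `from_kqid(sb->s_user_ns, qid)` against 0: id 0 as seen by the filesystem's namespace need not be kernel-wide id 0.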
diff --git a/fs/super.c b/fs/super.c
index 5806ffd45563..c2ff475c1711 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -33,6 +33,7 @@
33#include <linux/cleancache.h> 33#include <linux/cleancache.h>
34#include <linux/fsnotify.h> 34#include <linux/fsnotify.h>
35#include <linux/lockdep.h> 35#include <linux/lockdep.h>
36#include <linux/user_namespace.h>
36#include "internal.h" 37#include "internal.h"
37 38
38 39
@@ -165,6 +166,7 @@ static void destroy_super(struct super_block *s)
165 list_lru_destroy(&s->s_inode_lru); 166 list_lru_destroy(&s->s_inode_lru);
166 security_sb_free(s); 167 security_sb_free(s);
167 WARN_ON(!list_empty(&s->s_mounts)); 168 WARN_ON(!list_empty(&s->s_mounts));
169 put_user_ns(s->s_user_ns);
168 kfree(s->s_subtype); 170 kfree(s->s_subtype);
169 kfree(s->s_options); 171 kfree(s->s_options);
170 call_rcu(&s->rcu, destroy_super_rcu); 172 call_rcu(&s->rcu, destroy_super_rcu);
@@ -174,11 +176,13 @@ static void destroy_super(struct super_block *s)
174 * alloc_super - create new superblock 176 * alloc_super - create new superblock
175 * @type: filesystem type superblock should belong to 177 * @type: filesystem type superblock should belong to
176 * @flags: the mount flags 178 * @flags: the mount flags
179 * @user_ns: User namespace for the super_block
177 * 180 *
178 * Allocates and initializes a new &struct super_block. alloc_super() 181 * Allocates and initializes a new &struct super_block. alloc_super()
179 * returns a pointer to the new superblock or %NULL if allocation had failed. 182 * returns a pointer to the new superblock or %NULL if allocation had failed.
180 */ 183 */
181static struct super_block *alloc_super(struct file_system_type *type, int flags) 184static struct super_block *alloc_super(struct file_system_type *type, int flags,
185 struct user_namespace *user_ns)
182{ 186{
183 struct super_block *s = kzalloc(sizeof(struct super_block), GFP_USER); 187 struct super_block *s = kzalloc(sizeof(struct super_block), GFP_USER);
184 static const struct super_operations default_op; 188 static const struct super_operations default_op;
@@ -188,6 +192,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
188 return NULL; 192 return NULL;
189 193
190 INIT_LIST_HEAD(&s->s_mounts); 194 INIT_LIST_HEAD(&s->s_mounts);
195 s->s_user_ns = get_user_ns(user_ns);
191 196
192 if (security_sb_alloc(s)) 197 if (security_sb_alloc(s))
193 goto fail; 198 goto fail;
@@ -201,6 +206,8 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
201 init_waitqueue_head(&s->s_writers.wait_unfrozen); 206 init_waitqueue_head(&s->s_writers.wait_unfrozen);
202 s->s_bdi = &noop_backing_dev_info; 207 s->s_bdi = &noop_backing_dev_info;
203 s->s_flags = flags; 208 s->s_flags = flags;
209 if (s->s_user_ns != &init_user_ns)
210 s->s_iflags |= SB_I_NODEV;
204 INIT_HLIST_NODE(&s->s_instances); 211 INIT_HLIST_NODE(&s->s_instances);
205 INIT_HLIST_BL_HEAD(&s->s_anon); 212 INIT_HLIST_BL_HEAD(&s->s_anon);
206 mutex_init(&s->s_sync_lock); 213 mutex_init(&s->s_sync_lock);
@@ -445,29 +452,42 @@ void generic_shutdown_super(struct super_block *sb)
445EXPORT_SYMBOL(generic_shutdown_super); 452EXPORT_SYMBOL(generic_shutdown_super);
446 453
447/** 454/**
448 * sget - find or create a superblock 455 * sget_userns - find or create a superblock
449 * @type: filesystem type superblock should belong to 456 * @type: filesystem type superblock should belong to
450 * @test: comparison callback 457 * @test: comparison callback
451 * @set: setup callback 458 * @set: setup callback
452 * @flags: mount flags 459 * @flags: mount flags
460 * @user_ns: User namespace for the super_block
453 * @data: argument to each of them 461 * @data: argument to each of them
454 */ 462 */
455struct super_block *sget(struct file_system_type *type, 463struct super_block *sget_userns(struct file_system_type *type,
456 int (*test)(struct super_block *,void *), 464 int (*test)(struct super_block *,void *),
457 int (*set)(struct super_block *,void *), 465 int (*set)(struct super_block *,void *),
458 int flags, 466 int flags, struct user_namespace *user_ns,
459 void *data) 467 void *data)
460{ 468{
461 struct super_block *s = NULL; 469 struct super_block *s = NULL;
462 struct super_block *old; 470 struct super_block *old;
463 int err; 471 int err;
464 472
473 if (!(flags & MS_KERNMOUNT) &&
474 !(type->fs_flags & FS_USERNS_MOUNT) &&
475 !capable(CAP_SYS_ADMIN))
476 return ERR_PTR(-EPERM);
465retry: 477retry:
466 spin_lock(&sb_lock); 478 spin_lock(&sb_lock);
467 if (test) { 479 if (test) {
468 hlist_for_each_entry(old, &type->fs_supers, s_instances) { 480 hlist_for_each_entry(old, &type->fs_supers, s_instances) {
469 if (!test(old, data)) 481 if (!test(old, data))
470 continue; 482 continue;
483 if (user_ns != old->s_user_ns) {
484 spin_unlock(&sb_lock);
485 if (s) {
486 up_write(&s->s_umount);
487 destroy_super(s);
488 }
489 return ERR_PTR(-EBUSY);
490 }
471 if (!grab_super(old)) 491 if (!grab_super(old))
472 goto retry; 492 goto retry;
473 if (s) { 493 if (s) {
@@ -480,7 +500,7 @@ retry:
480 } 500 }
481 if (!s) { 501 if (!s) {
482 spin_unlock(&sb_lock); 502 spin_unlock(&sb_lock);
483 s = alloc_super(type, flags); 503 s = alloc_super(type, flags, user_ns);
484 if (!s) 504 if (!s)
485 return ERR_PTR(-ENOMEM); 505 return ERR_PTR(-ENOMEM);
486 goto retry; 506 goto retry;
@@ -503,6 +523,31 @@ retry:
503 return s; 523 return s;
504} 524}
505 525
526EXPORT_SYMBOL(sget_userns);
527
528/**
529 * sget - find or create a superblock
530 * @type: filesystem type superblock should belong to
531 * @test: comparison callback
532 * @set: setup callback
533 * @flags: mount flags
534 * @data: argument to each of them
535 */
536struct super_block *sget(struct file_system_type *type,
537 int (*test)(struct super_block *,void *),
538 int (*set)(struct super_block *,void *),
539 int flags,
540 void *data)
541{
542 struct user_namespace *user_ns = current_user_ns();
543
544 /* Ensure the requestor has permissions over the target filesystem */
545 if (!(flags & MS_KERNMOUNT) && !ns_capable(user_ns, CAP_SYS_ADMIN))
546 return ERR_PTR(-EPERM);
547
548 return sget_userns(type, test, set, flags, user_ns, data);
549}
550
506EXPORT_SYMBOL(sget); 551EXPORT_SYMBOL(sget);
507 552
508void drop_super(struct super_block *sb) 553void drop_super(struct super_block *sb)
@@ -920,12 +965,20 @@ static int ns_set_super(struct super_block *sb, void *data)
920 return set_anon_super(sb, NULL); 965 return set_anon_super(sb, NULL);
921} 966}
922 967
923struct dentry *mount_ns(struct file_system_type *fs_type, int flags, 968struct dentry *mount_ns(struct file_system_type *fs_type,
924 void *data, int (*fill_super)(struct super_block *, void *, int)) 969 int flags, void *data, void *ns, struct user_namespace *user_ns,
970 int (*fill_super)(struct super_block *, void *, int))
925{ 971{
926 struct super_block *sb; 972 struct super_block *sb;
927 973
928 sb = sget(fs_type, ns_test_super, ns_set_super, flags, data); 974 /* Don't allow mounting unless the caller has CAP_SYS_ADMIN
975 * over the namespace.
976 */
977 if (!(flags & MS_KERNMOUNT) && !ns_capable(user_ns, CAP_SYS_ADMIN))
978 return ERR_PTR(-EPERM);
979
980 sb = sget_userns(fs_type, ns_test_super, ns_set_super, flags,
981 user_ns, ns);
929 if (IS_ERR(sb)) 982 if (IS_ERR(sb))
930 return ERR_CAST(sb); 983 return ERR_CAST(sb);
931 984
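The `sget_userns()` hunk above adds one new outcome to the superblock lookup: a superblock that matches the `test()` callback but is owned by a different user namespace is not shared, and the caller gets `-EBUSY`. A simplified sketch of just that decision, with locking, reference counting and allocation left out. The `toy_*` types and the flat array used in place of `type->fs_supers` are assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_EBUSY 16

struct toy_ns { int id; };

/* Hypothetical superblock: `key` plays the role of the data that
 * test() compares, `user_ns` the role of s_user_ns. */
struct toy_sb {
	const void *key;
	const struct toy_ns *user_ns;
};

/* Returns 0 with *out set to the matching superblock, 0 with *out set
 * to NULL when nothing matched (the caller would allocate a new one),
 * or -TOY_EBUSY when a match belongs to another namespace. */
static int toy_find_super(const struct toy_sb *sbs, size_t n,
			  const void *key, const struct toy_ns *user_ns,
			  const struct toy_sb **out)
{
	*out = NULL;
	for (size_t i = 0; i < n; i++) {
		if (sbs[i].key != key)		/* the test() callback */
			continue;
		if (sbs[i].user_ns != user_ns)	/* owner mismatch */
			return -TOY_EBUSY;
		*out = &sbs[i];
		return 0;
	}
	return 0;
}
```

Comparing namespace pointers rather than contents mirrors the `user_ns != old->s_user_ns` test in the hunk.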
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
index f3db82071cfb..20b8f82e115b 100644
--- a/fs/sysfs/mount.c
+++ b/fs/sysfs/mount.c
@@ -41,8 +41,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type,
41 if (IS_ERR(root) || !new_sb) 41 if (IS_ERR(root) || !new_sb)
42 kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); 42 kobj_ns_drop(KOBJ_NS_TYPE_NET, ns);
43 else if (new_sb) 43 else if (new_sb)
44 /* Userspace would break if executables appear on sysfs */ 44 root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE;
45 root->d_sb->s_iflags |= SB_I_NOEXEC;
46 45
47 return root; 46 return root;
48} 47}
@@ -59,7 +58,7 @@ static struct file_system_type sysfs_fs_type = {
59 .name = "sysfs", 58 .name = "sysfs",
60 .mount = sysfs_mount, 59 .mount = sysfs_mount,
61 .kill_sb = sysfs_kill_sb, 60 .kill_sb = sysfs_kill_sb,
62 .fs_flags = FS_USERNS_VISIBLE | FS_USERNS_MOUNT, 61 .fs_flags = FS_USERNS_MOUNT,
63}; 62};
64 63
65int __init sysfs_init(void) 64int __init sysfs_init(void)
diff --git a/fs/xattr.c b/fs/xattr.c
index 4beafc43daa5..c243905835ab 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -38,6 +38,13 @@ xattr_permission(struct inode *inode, const char *name, int mask)
38 if (mask & MAY_WRITE) { 38 if (mask & MAY_WRITE) {
39 if (IS_IMMUTABLE(inode) || IS_APPEND(inode)) 39 if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
40 return -EPERM; 40 return -EPERM;
41 /*
42 * Updating an xattr will likely cause i_uid and i_gid
43 * to be written back improperly if their true value is
44 * unknown to the vfs.
45 */
46 if (HAS_UNMAPPED_ID(inode))
47 return -EPERM;
41 } 48 }
42 49
43 /* 50 /*
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f65a6801f609..577365a77b47 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -829,31 +829,6 @@ static inline void i_size_write(struct inode *inode, loff_t i_size)
829#endif 829#endif
830} 830}
831 831
832/* Helper functions so that in most cases filesystems will
833 * not need to deal directly with kuid_t and kgid_t and can
834 * instead deal with the raw numeric values that are stored
835 * in the filesystem.
836 */
837static inline uid_t i_uid_read(const struct inode *inode)
838{
839 return from_kuid(&init_user_ns, inode->i_uid);
840}
841
842static inline gid_t i_gid_read(const struct inode *inode)
843{
844 return from_kgid(&init_user_ns, inode->i_gid);
845}
846
847static inline void i_uid_write(struct inode *inode, uid_t uid)
848{
849 inode->i_uid = make_kuid(&init_user_ns, uid);
850}
851
852static inline void i_gid_write(struct inode *inode, gid_t gid)
853{
854 inode->i_gid = make_kgid(&init_user_ns, gid);
855}
856
857static inline unsigned iminor(const struct inode *inode) 832static inline unsigned iminor(const struct inode *inode)
858{ 833{
859 return MINOR(inode->i_rdev); 834 return MINOR(inode->i_rdev);
@@ -1320,6 +1295,10 @@ struct mm_struct;
1320/* sb->s_iflags */ 1295/* sb->s_iflags */
1321#define SB_I_CGROUPWB 0x00000001 /* cgroup-aware writeback enabled */ 1296#define SB_I_CGROUPWB 0x00000001 /* cgroup-aware writeback enabled */
1322#define SB_I_NOEXEC 0x00000002 /* Ignore executables on this fs */ 1297#define SB_I_NOEXEC 0x00000002 /* Ignore executables on this fs */
1298#define SB_I_NODEV 0x00000004 /* Ignore devices on this fs */
1299
1300/* sb->s_iflags to limit user namespace mounts */
1301#define SB_I_USERNS_VISIBLE 0x00000010 /* fstype already mounted */
1323 1302
1324/* Possible states of 'frozen' field */ 1303/* Possible states of 'frozen' field */
1325enum { 1304enum {
@@ -1423,6 +1402,13 @@ struct super_block {
1423 struct hlist_head s_pins; 1402 struct hlist_head s_pins;
1424 1403
1425 /* 1404 /*
1405 * Owning user namespace and default context in which to
1406 * interpret filesystem uids, gids, quotas, device nodes,
1407 * xattrs and security labels.
1408 */
1409 struct user_namespace *s_user_ns;
1410
1411 /*
1426 * Keep the lru lists last in the structure so they always sit on their 1412 * Keep the lru lists last in the structure so they always sit on their
1427 * own individual cachelines. 1413 * own individual cachelines.
1428 */ 1414 */
@@ -1446,6 +1432,31 @@ struct super_block {
1446 struct list_head s_inodes_wb; /* writeback inodes */ 1432 struct list_head s_inodes_wb; /* writeback inodes */
1447}; 1433};
1448 1434
1435/* Helper functions so that in most cases filesystems will
1436 * not need to deal directly with kuid_t and kgid_t and can
1437 * instead deal with the raw numeric values that are stored
1438 * in the filesystem.
1439 */
1440static inline uid_t i_uid_read(const struct inode *inode)
1441{
1442 return from_kuid(inode->i_sb->s_user_ns, inode->i_uid);
1443}
1444
1445static inline gid_t i_gid_read(const struct inode *inode)
1446{
1447 return from_kgid(inode->i_sb->s_user_ns, inode->i_gid);
1448}
1449
1450static inline void i_uid_write(struct inode *inode, uid_t uid)
1451{
1452 inode->i_uid = make_kuid(inode->i_sb->s_user_ns, uid);
1453}
1454
1455static inline void i_gid_write(struct inode *inode, gid_t gid)
1456{
1457 inode->i_gid = make_kgid(inode->i_sb->s_user_ns, gid);
1458}
1459
1449extern struct timespec current_fs_time(struct super_block *sb); 1460extern struct timespec current_fs_time(struct super_block *sb);
1450 1461
1451/* 1462/*
@@ -1588,6 +1599,7 @@ extern int vfs_whiteout(struct inode *, struct dentry *);
1588 */ 1599 */
1589extern void inode_init_owner(struct inode *inode, const struct inode *dir, 1600extern void inode_init_owner(struct inode *inode, const struct inode *dir,
1590 umode_t mode); 1601 umode_t mode);
1602extern bool may_open_dev(const struct path *path);
1591/* 1603/*
1592 * VFS FS_IOC_FIEMAP helper definitions. 1604 * VFS FS_IOC_FIEMAP helper definitions.
1593 */ 1605 */
@@ -1858,6 +1870,11 @@ struct super_operations {
1858#define IS_WHITEOUT(inode) (S_ISCHR(inode->i_mode) && \ 1870#define IS_WHITEOUT(inode) (S_ISCHR(inode->i_mode) && \
1859 (inode)->i_rdev == WHITEOUT_DEV) 1871 (inode)->i_rdev == WHITEOUT_DEV)
1860 1872
1873static inline bool HAS_UNMAPPED_ID(struct inode *inode)
1874{
1875 return !uid_valid(inode->i_uid) || !gid_valid(inode->i_gid);
1876}
1877
1861/* 1878/*
1862 * Inode state bits. Protected by inode->i_lock 1879 * Inode state bits. Protected by inode->i_lock
1863 * 1880 *
@@ -2006,8 +2023,6 @@ struct file_system_type {
2006#define FS_BINARY_MOUNTDATA 2 2023#define FS_BINARY_MOUNTDATA 2
2007#define FS_HAS_SUBTYPE 4 2024#define FS_HAS_SUBTYPE 4
2008#define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */ 2025#define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */
2009#define FS_USERNS_DEV_MOUNT 16 /* A userns mount does not imply MNT_NODEV */
2010#define FS_USERNS_VISIBLE 32 /* FS must already be visible */
2011#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ 2026#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
2012 struct dentry *(*mount) (struct file_system_type *, int, 2027 struct dentry *(*mount) (struct file_system_type *, int,
2013 const char *, void *); 2028 const char *, void *);
@@ -2028,8 +2043,9 @@ struct file_system_type {
2028 2043
2029#define MODULE_ALIAS_FS(NAME) MODULE_ALIAS("fs-" NAME) 2044#define MODULE_ALIAS_FS(NAME) MODULE_ALIAS("fs-" NAME)
2030 2045
2031extern struct dentry *mount_ns(struct file_system_type *fs_type, int flags, 2046extern struct dentry *mount_ns(struct file_system_type *fs_type,
2032 void *data, int (*fill_super)(struct super_block *, void *, int)); 2047 int flags, void *data, void *ns, struct user_namespace *user_ns,
2048 int (*fill_super)(struct super_block *, void *, int));
2033extern struct dentry *mount_bdev(struct file_system_type *fs_type, 2049extern struct dentry *mount_bdev(struct file_system_type *fs_type,
2034 int flags, const char *dev_name, void *data, 2050 int flags, const char *dev_name, void *data,
2035 int (*fill_super)(struct super_block *, void *, int)); 2051 int (*fill_super)(struct super_block *, void *, int));
@@ -2049,6 +2065,11 @@ void deactivate_locked_super(struct super_block *sb);
2049int set_anon_super(struct super_block *s, void *data); 2065int set_anon_super(struct super_block *s, void *data);
2050int get_anon_bdev(dev_t *); 2066int get_anon_bdev(dev_t *);
2051void free_anon_bdev(dev_t); 2067void free_anon_bdev(dev_t);
2068struct super_block *sget_userns(struct file_system_type *type,
2069 int (*test)(struct super_block *,void *),
2070 int (*set)(struct super_block *,void *),
2071 int flags, struct user_namespace *user_ns,
2072 void *data);
2052struct super_block *sget(struct file_system_type *type, 2073struct super_block *sget(struct file_system_type *type,
2053 int (*test)(struct super_block *,void *), 2074 int (*test)(struct super_block *,void *),
2054 int (*set)(struct super_block *,void *), 2075 int (*set)(struct super_block *,void *),
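The include/linux/fs.h hunks above move `i_uid_read()`/`i_uid_write()` and their gid counterparts below `struct super_block` so they can translate through `inode->i_sb->s_user_ns` instead of `&init_user_ns`, and add `HAS_UNMAPPED_ID()`. A small userspace model of the round trip, assuming a single-extent namespace. All `toy_*` names are hypothetical and only the uid side is shown:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INVALID ((uint32_t)-1)

/* Hypothetical single-extent namespace: on-disk ids [first, first + count)
 * map to kernel-wide ids [lower_first, lower_first + count). */
struct toy_ns { uint32_t first, lower_first, count; };
struct toy_sb { const struct toy_ns *s_user_ns; };
struct toy_inode {
	const struct toy_sb *i_sb;
	uint32_t i_uid;		/* kernel-wide id, like kuid_t */
	uint32_t i_gid;		/* kernel-wide id, like kgid_t */
};

/* Analogue of i_uid_write(): on-disk uid -> kernel-wide id. */
static void toy_i_uid_write(struct toy_inode *inode, uint32_t uid)
{
	const struct toy_ns *ns = inode->i_sb->s_user_ns;

	if (uid >= ns->first && uid - ns->first < ns->count)
		inode->i_uid = ns->lower_first + (uid - ns->first);
	else
		inode->i_uid = TOY_INVALID;
}

/* Analogue of i_uid_read(): kernel-wide id -> on-disk uid. */
static uint32_t toy_i_uid_read(const struct toy_inode *inode)
{
	const struct toy_ns *ns = inode->i_sb->s_user_ns;
	uint32_t kid = inode->i_uid;

	if (kid >= ns->lower_first && kid - ns->lower_first < ns->count)
		return ns->first + (kid - ns->lower_first);
	return TOY_INVALID;
}

/* Analogue of HAS_UNMAPPED_ID(): either owner id is the sentinel. */
static bool toy_has_unmapped_id(const struct toy_inode *inode)
{
	return inode->i_uid == TOY_INVALID || inode->i_gid == TOY_INVALID;
}
```

Once an on-disk id falls outside the mapping, the in-memory owner becomes the invalid sentinel and the original value cannot be recovered from it, which is consistent with the `-EPERM` that the fs/xattr.c hunk returns for writes to such inodes.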
diff --git a/include/linux/mount.h b/include/linux/mount.h
index f822c3c11377..54a594d49733 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -81,6 +81,7 @@ extern void mntput(struct vfsmount *mnt);
81extern struct vfsmount *mntget(struct vfsmount *mnt); 81extern struct vfsmount *mntget(struct vfsmount *mnt);
82extern struct vfsmount *mnt_clone_internal(struct path *path); 82extern struct vfsmount *mnt_clone_internal(struct path *path);
83extern int __mnt_is_readonly(struct vfsmount *mnt); 83extern int __mnt_is_readonly(struct vfsmount *mnt);
84extern bool mnt_may_suid(struct vfsmount *mnt);
84 85
85struct path; 86struct path;
86extern struct vfsmount *clone_private_mount(struct path *path); 87extern struct vfsmount *clone_private_mount(struct path *path);
diff --git a/include/linux/posix_acl.h b/include/linux/posix_acl.h
index c818772d9f9d..d5d3d741f028 100644
--- a/include/linux/posix_acl.h
+++ b/include/linux/posix_acl.h
@@ -79,7 +79,7 @@ posix_acl_release(struct posix_acl *acl)
79 79
80extern void posix_acl_init(struct posix_acl *, int); 80extern void posix_acl_init(struct posix_acl *, int);
81extern struct posix_acl *posix_acl_alloc(int, gfp_t); 81extern struct posix_acl *posix_acl_alloc(int, gfp_t);
82extern int posix_acl_valid(const struct posix_acl *); 82extern int posix_acl_valid(struct user_namespace *, const struct posix_acl *);
83extern int posix_acl_permission(struct inode *, const struct posix_acl *, int); 83extern int posix_acl_permission(struct inode *, const struct posix_acl *, int);
84extern struct posix_acl *posix_acl_from_mode(umode_t, gfp_t); 84extern struct posix_acl *posix_acl_from_mode(umode_t, gfp_t);
85extern int posix_acl_equiv_mode(const struct posix_acl *, umode_t *); 85extern int posix_acl_equiv_mode(const struct posix_acl *, umode_t *);
diff --git a/include/linux/quota.h b/include/linux/quota.h
index 8486d27cf360..55107a8ff887 100644
--- a/include/linux/quota.h
+++ b/include/linux/quota.h
@@ -179,6 +179,16 @@ static inline struct kqid make_kqid_projid(kprojid_t projid)
179 return kqid; 179 return kqid;
180} 180}
181 181
182/**
183 * qid_has_mapping - Report if a qid maps into a user namespace.
184 * @ns: The user namespace to see if a value maps into.
185 * @qid: The kernel internal quota identifier to test.
186 */
187static inline bool qid_has_mapping(struct user_namespace *ns, struct kqid qid)
188{
189 return from_kqid(ns, qid) != (qid_t) -1;
190}
191
182 192
183extern spinlock_t dq_data_lock; 193extern spinlock_t dq_data_lock;
184 194
diff --git a/include/linux/uidgid.h b/include/linux/uidgid.h
index 03835522dfcb..25e9d9216340 100644
--- a/include/linux/uidgid.h
+++ b/include/linux/uidgid.h
@@ -177,12 +177,12 @@ static inline gid_t from_kgid_munged(struct user_namespace *to, kgid_t kgid)
177 177
178static inline bool kuid_has_mapping(struct user_namespace *ns, kuid_t uid) 178static inline bool kuid_has_mapping(struct user_namespace *ns, kuid_t uid)
179{ 179{
180 return true; 180 return uid_valid(uid);
181} 181}
182 182
183static inline bool kgid_has_mapping(struct user_namespace *ns, kgid_t gid) 183static inline bool kgid_has_mapping(struct user_namespace *ns, kgid_t gid)
184{ 184{
185 return true; 185 return gid_valid(gid);
186} 186}
187 187
188#endif /* CONFIG_USER_NS */ 188#endif /* CONFIG_USER_NS */
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 8297e5b341d8..9217169c64cb 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -72,6 +72,7 @@ extern ssize_t proc_projid_map_write(struct file *, const char __user *, size_t,
72extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t, loff_t *); 72extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t, loff_t *);
73extern int proc_setgroups_show(struct seq_file *m, void *v); 73extern int proc_setgroups_show(struct seq_file *m, void *v);
74extern bool userns_may_setgroups(const struct user_namespace *ns); 74extern bool userns_may_setgroups(const struct user_namespace *ns);
75extern bool current_in_userns(const struct user_namespace *target_ns);
75#else 76#else
76 77
77static inline struct user_namespace *get_user_ns(struct user_namespace *ns) 78static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
@@ -100,6 +101,11 @@ static inline bool userns_may_setgroups(const struct user_namespace *ns)
 {
 	return true;
 }
+
+static inline bool current_in_userns(const struct user_namespace *target_ns)
+{
+	return true;
+}
 #endif
 
 #endif /* _LINUX_USER_H */
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index ade739f67f1d..0b13ace266f2 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -305,8 +305,9 @@ err:
 static int mqueue_fill_super(struct super_block *sb, void *data, int silent)
 {
 	struct inode *inode;
-	struct ipc_namespace *ns = data;
+	struct ipc_namespace *ns = sb->s_fs_info;
 
+	sb->s_iflags |= SB_I_NOEXEC | SB_I_NODEV;
 	sb->s_blocksize = PAGE_SIZE;
 	sb->s_blocksize_bits = PAGE_SHIFT;
 	sb->s_magic = MQUEUE_MAGIC;
@@ -326,17 +327,14 @@ static struct dentry *mqueue_mount(struct file_system_type *fs_type,
 			 int flags, const char *dev_name,
 			 void *data)
 {
-	if (!(flags & MS_KERNMOUNT)) {
-		struct ipc_namespace *ns = current->nsproxy->ipc_ns;
-		/* Don't allow mounting unless the caller has CAP_SYS_ADMIN
-		 * over the ipc namespace.
-		 */
-		if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
-			return ERR_PTR(-EPERM);
-
-		data = ns;
+	struct ipc_namespace *ns;
+	if (flags & MS_KERNMOUNT) {
+		ns = data;
+		data = NULL;
+	} else {
+		ns = current->nsproxy->ipc_ns;
 	}
-	return mount_ns(fs_type, flags, data, mqueue_fill_super);
+	return mount_ns(fs_type, flags, data, ns, ns->user_ns, mqueue_fill_super);
 }
 
 static void init_once(void *foo)
diff --git a/ipc/namespace.c b/ipc/namespace.c
index 068caf18d565..04cb07eb81f1 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -34,8 +34,11 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns,
 	ns->ns.ops = &ipcns_operations;
 
 	atomic_set(&ns->count, 1);
+	ns->user_ns = get_user_ns(user_ns);
+
 	err = mq_init_ns(ns);
 	if (err) {
+		put_user_ns(ns->user_ns);
 		ns_free_inum(&ns->ns);
 		kfree(ns);
 		return ERR_PTR(err);
@@ -46,8 +49,6 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns,
 	msg_init_ns(ns);
 	shm_init_ns(ns);
 
-	ns->user_ns = get_user_ns(user_ns);
-
 	return ns;
 }
 
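
Note on the two ipc/namespace.c hunks: the get_user_ns() call moves ahead of mq_init_ns(), and the error path gains a matching put_user_ns(). A plausible reason, given that mqueue_mount() now passes ns->user_ns to mount_ns(), is that the field must be populated before the mqueue filesystem is set up; that reading is an inference and is not stated in the diff. The sketch below models only the reference-counting shape of the change. Every model_* name is hypothetical, and the failing init step is simulated with a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the reference count and the link matter. */
struct model_userns { int refs; };
struct model_ipcns { struct model_userns *user_ns; };

static struct model_userns *model_get(struct model_userns *u)
{
	u->refs++;
	return u;
}

static void model_put(struct model_userns *u)
{
	u->refs--;
}

/* Simulated init step: it needs user_ns to be set, and can be told to fail. */
static int model_mq_init(const struct model_ipcns *ns, int fail)
{
	if (ns->user_ns == NULL)
		return -1;
	return fail ? -2 : 0;
}

static int model_create(struct model_ipcns *ns, struct model_userns *u, int fail)
{
	int err;

	/* Take the reference first so the init step can rely on it. */
	ns->user_ns = model_get(u);

	err = model_mq_init(ns, fail);
	if (err) {
		/* Undo the reference on failure (the kernel also frees ns here). */
		model_put(ns->user_ns);
		ns->user_ns = NULL;
		return err;
	}
	return 0;
}
```

In this model a successful call leaves one extra reference on the user namespace, and a failed call leaves the count unchanged.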
diff --git a/kernel/cred.c b/kernel/cred.c
index 0c0cd8a62285..5f264fb5737d 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -689,6 +689,8 @@ EXPORT_SYMBOL(set_security_override_from_ctx);
  */
 int set_create_files_as(struct cred *new, struct inode *inode)
 {
+	if (!uid_valid(inode->i_uid) || !gid_valid(inode->i_gid))
+		return -EINVAL;
 	new->fsuid = inode->i_uid;
 	new->fsgid = inode->i_gid;
 	return security_kernel_create_files_as(new, inode);
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 9bafc211930c..68f594212759 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -938,6 +938,20 @@ bool userns_may_setgroups(const struct user_namespace *ns)
 	return allowed;
 }
 
+/*
+ * Returns true if @ns is the same namespace as or a descendant of
+ * @target_ns.
+ */
+bool current_in_userns(const struct user_namespace *target_ns)
+{
+	struct user_namespace *ns;
+	for (ns = current_user_ns(); ns; ns = ns->parent) {
+		if (ns == target_ns)
+			return true;
+	}
+	return false;
+}
+
 static inline struct user_namespace *to_user_ns(struct ns_common *ns)
 {
 	return container_of(ns, struct user_namespace, ns);
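
Note on current_in_userns(): the function takes no @ns parameter, so the @ns in its comment refers to the loop's starting point, current_user_ns(). The check walks the parent chain from the caller's namespace and succeeds if it reaches target_ns. A minimal userspace sketch of the same walk follows. The struct model_userns type and model_in_userns() are hypothetical, and the starting namespace is passed explicitly because there is no current task outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct user_namespace: only the parent link. */
struct model_userns {
	const struct model_userns *parent;
};

/*
 * Returns true if start is target itself or a descendant of target,
 * by following parent links until the root (parent == NULL).
 */
static bool model_in_userns(const struct model_userns *start,
			    const struct model_userns *target)
{
	const struct model_userns *ns;

	for (ns = start; ns; ns = ns->parent) {
		if (ns == target)
			return true;
	}
	return false;
}
```

The relation is one-directional: a grandchild namespace is "in" its ancestors, but a parent is not "in" its child, and siblings are not "in" each other.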
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index fc48eca21fd2..84f98cbe31c3 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1386,7 +1386,7 @@ rpc_fill_super(struct super_block *sb, void *data, int silent)
 {
 	struct inode *inode;
 	struct dentry *root, *gssd_dentry;
-	struct net *net = data;
+	struct net *net = get_net(sb->s_fs_info);
 	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
 	int err;
 
@@ -1419,7 +1419,6 @@ rpc_fill_super(struct super_block *sb, void *data, int silent)
 					   sb);
 	if (err)
 		goto err_depopulate;
-	sb->s_fs_info = get_net(net);
 	mutex_unlock(&sn->pipefs_sb_lock);
 	return 0;
 
@@ -1448,7 +1447,8 @@ static struct dentry *
 rpc_mount(struct file_system_type *fs_type,
 		int flags, const char *dev_name, void *data)
 {
-	return mount_ns(fs_type, flags, current->nsproxy->net_ns, rpc_fill_super);
+	struct net *net = current->nsproxy->net_ns;
+	return mount_ns(fs_type, flags, data, net, net->user_ns, rpc_fill_super);
 }
 
 static void rpc_kill_sb(struct super_block *sb)
@@ -1468,9 +1468,9 @@ static void rpc_kill_sb(struct super_block *sb)
 					   RPC_PIPEFS_UMOUNT,
 					   sb);
 	mutex_unlock(&sn->pipefs_sb_lock);
-	put_net(net);
 out:
 	kill_litter_super(sb);
+	put_net(net);
 }
 
 static struct file_system_type rpc_pipe_fs_type = {
diff --git a/security/commoncap.c b/security/commoncap.c
index e7fadde737f4..14540bd78561 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -453,7 +453,15 @@ static int get_file_caps(struct linux_binprm *bprm, bool *effective, bool *has_c
 	if (!file_caps_enabled)
 		return 0;
 
-	if (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID)
+	if (!mnt_may_suid(bprm->file->f_path.mnt))
+		return 0;
+
+	/*
+	 * This check is redundant with mnt_may_suid() but is kept to make
+	 * explicit that capability bits are limited to s_user_ns and its
+	 * descendants.
+	 */
+	if (!current_in_userns(bprm->file->f_path.mnt->mnt_sb->s_user_ns))
 		return 0;
 
 	rc = get_vfs_caps_from_disk(bprm->file->f_path.dentry, &vcaps);
diff --git a/security/integrity/evm/evm_crypto.c b/security/integrity/evm/evm_crypto.c
index 30b6b7d0429f..11c1d30bd705 100644
--- a/security/integrity/evm/evm_crypto.c
+++ b/security/integrity/evm/evm_crypto.c
@@ -151,8 +151,8 @@ static void hmac_add_misc(struct shash_desc *desc, struct inode *inode,
 	memset(&hmac_misc, 0, sizeof(hmac_misc));
 	hmac_misc.ino = inode->i_ino;
 	hmac_misc.generation = inode->i_generation;
-	hmac_misc.uid = from_kuid(&init_user_ns, inode->i_uid);
-	hmac_misc.gid = from_kgid(&init_user_ns, inode->i_gid);
+	hmac_misc.uid = from_kuid(inode->i_sb->s_user_ns, inode->i_uid);
+	hmac_misc.gid = from_kgid(inode->i_sb->s_user_ns, inode->i_gid);
 	hmac_misc.mode = inode->i_mode;
 	crypto_shash_update(desc, (const u8 *)&hmac_misc, sizeof(hmac_misc));
 	if (evm_hmac_attrs & EVM_ATTR_FSUUID)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index a86d537eb79b..19be9d39c742 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -830,6 +830,28 @@ static int selinux_set_mnt_opts(struct super_block *sb,
 			goto out;
 		}
 	}
+
+	/*
+	 * If this is a user namespace mount, no contexts are allowed
+	 * on the command line and security labels must be ignored.
+	 */
+	if (sb->s_user_ns != &init_user_ns) {
+		if (context_sid || fscontext_sid || rootcontext_sid ||
+		    defcontext_sid) {
+			rc = -EACCES;
+			goto out;
+		}
+		if (sbsec->behavior == SECURITY_FS_USE_XATTR) {
+			sbsec->behavior = SECURITY_FS_USE_MNTPOINT;
+			rc = security_transition_sid(current_sid(), current_sid(),
+						     SECCLASS_FILE, NULL,
+						     &sbsec->mntpoint_sid);
+			if (rc)
+				goto out;
+		}
+		goto out_set_opts;
+	}
+
 	/* sets the context of the superblock for the fs being mounted. */
 	if (fscontext_sid) {
 		rc = may_context_mount_sb_relabel(fscontext_sid, sbsec, cred);
@@ -898,6 +920,7 @@ static int selinux_set_mnt_opts(struct super_block *sb,
 		sbsec->def_sid = defcontext_sid;
 	}
 
+out_set_opts:
 	rc = sb_finish_set_opts(sb);
 out:
 	mutex_unlock(&sbsec->lock);
@@ -2259,7 +2282,7 @@ static int check_nnp_nosuid(const struct linux_binprm *bprm,
 			    const struct task_security_struct *new_tsec)
 {
 	int nnp = (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS);
-	int nosuid = (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID);
+	int nosuid = !mnt_may_suid(bprm->file->f_path.mnt);
 	int rc;
 
 	if (!nnp && !nosuid)
diff --git a/security/smack/smack.h b/security/smack/smack.h
index 6c91156ae225..26e58f1804b1 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -90,9 +90,15 @@ struct superblock_smack {
 	struct smack_known *smk_floor;
 	struct smack_known *smk_hat;
 	struct smack_known *smk_default;
-	int smk_initialized;
+	int smk_flags;
 };
 
+/*
+ * Superblock flags
+ */
+#define SMK_SB_INITIALIZED 0x01
+#define SMK_SB_UNTRUSTED 0x02
+
 struct socket_smack {
 	struct smack_known *smk_out; /* outbound label */
 	struct smack_known *smk_in; /* inbound label */
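
Note on the smack.h hunk: the single-purpose smk_initialized integer becomes a bitmask, smk_flags, so that a second bit (SMK_SB_UNTRUSTED) can sit alongside the initialized bit. The smack_lsm.c changes in this merge then reject access on an untrusted superblock when an inode's label differs from the superblock's root label. The sketch below is a userspace model of those two uses, not the Smack implementation. The two flag values are copied from the hunk, while struct model_label, struct model_sb, and both functions are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag values as defined in the hunk above. */
#define SMK_SB_INITIALIZED 0x01
#define SMK_SB_UNTRUSTED   0x02

/* Hypothetical stand-ins; labels are compared by pointer identity. */
struct model_label { const char *name; };

struct model_sb {
	const struct model_label *smk_root;
	int smk_flags;
};

/* Returns 1 the first time it is called for a superblock, 0 afterwards. */
static int model_mark_initialized(struct model_sb *sb)
{
	if (sb->smk_flags & SMK_SB_INITIALIZED)
		return 0;
	sb->smk_flags |= SMK_SB_INITIALIZED;
	return 1;
}

/*
 * Models only the new early check: on an untrusted superblock, any label
 * other than the root label is refused. A return of 0 means "not refused
 * by this check"; the real hook goes on to further access checks.
 */
static int model_inode_permission(const struct model_sb *sb,
				  const struct model_label *inode_label)
{
	if ((sb->smk_flags & SMK_SB_UNTRUSTED) && inode_label != sb->smk_root)
		return -EACCES;
	return 0;
}
```

Because the two bits are independent, marking a superblock untrusted leaves its initialized state intact, which a plain 0/1 integer could not express.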
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 6777295f4b2b..b75634dbf53b 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -549,7 +549,7 @@ static int smack_sb_alloc_security(struct super_block *sb)
 	sbsp->smk_floor = &smack_known_floor;
 	sbsp->smk_hat = &smack_known_hat;
 	/*
-	 * smk_initialized will be zero from kzalloc.
+	 * SMK_SB_INITIALIZED will be zero from kzalloc.
 	 */
 	sb->s_security = sbsp;
 
@@ -766,10 +766,10 @@ static int smack_set_mnt_opts(struct super_block *sb,
 	int num_opts = opts->num_mnt_opts;
 	int transmute = 0;
 
-	if (sp->smk_initialized)
+	if (sp->smk_flags & SMK_SB_INITIALIZED)
 		return 0;
 
-	sp->smk_initialized = 1;
+	sp->smk_flags |= SMK_SB_INITIALIZED;
 
 	for (i = 0; i < num_opts; i++) {
 		switch (opts->mnt_opts_flags[i]) {
@@ -821,6 +821,17 @@ static int smack_set_mnt_opts(struct super_block *sb,
 		skp = smk_of_current();
 		sp->smk_root = skp;
 		sp->smk_default = skp;
+		/*
+		 * For a handful of fs types with no user-controlled
+		 * backing store it's okay to trust security labels
+		 * in the filesystem. The rest are untrusted.
+		 */
+		if (sb->s_user_ns != &init_user_ns &&
+		    sb->s_magic != SYSFS_MAGIC && sb->s_magic != TMPFS_MAGIC &&
+		    sb->s_magic != RAMFS_MAGIC) {
+			transmute = 1;
+			sp->smk_flags |= SMK_SB_UNTRUSTED;
+		}
 	}
 
 	/*
@@ -908,6 +919,7 @@ static int smack_bprm_set_creds(struct linux_binprm *bprm)
 	struct inode *inode = file_inode(bprm->file);
 	struct task_smack *bsp = bprm->cred->security;
 	struct inode_smack *isp;
+	struct superblock_smack *sbsp;
 	int rc;
 
 	if (bprm->cred_prepared)
@@ -917,6 +929,11 @@ static int smack_bprm_set_creds(struct linux_binprm *bprm)
 	if (isp->smk_task == NULL || isp->smk_task == bsp->smk_task)
 		return 0;
 
+	sbsp = inode->i_sb->s_security;
+	if ((sbsp->smk_flags & SMK_SB_UNTRUSTED) &&
+	    isp->smk_task != sbsp->smk_root)
+		return 0;
+
 	if (bprm->unsafe & (LSM_UNSAFE_PTRACE | LSM_UNSAFE_PTRACE_CAP)) {
 		struct task_struct *tracer;
 		rc = 0;
@@ -1203,6 +1220,7 @@ static int smack_inode_rename(struct inode *old_inode,
  */
 static int smack_inode_permission(struct inode *inode, int mask)
 {
+	struct superblock_smack *sbsp = inode->i_sb->s_security;
 	struct smk_audit_info ad;
 	int no_block = mask & MAY_NOT_BLOCK;
 	int rc;
@@ -1214,6 +1232,11 @@ static int smack_inode_permission(struct inode *inode, int mask)
 	if (mask == 0)
 		return 0;
 
+	if (sbsp->smk_flags & SMK_SB_UNTRUSTED) {
+		if (smk_of_inode(inode) != sbsp->smk_root)
+			return -EACCES;
+	}
+
 	/* May be droppable after audit */
 	if (no_block)
 		return -ECHILD;
@@ -1708,6 +1731,7 @@ static int smack_mmap_file(struct file *file,
 	struct task_smack *tsp;
 	struct smack_known *okp;
 	struct inode_smack *isp;
+	struct superblock_smack *sbsp;
 	int may;
 	int mmay;
 	int tmay;
@@ -1719,6 +1743,10 @@ static int smack_mmap_file(struct file *file,
 	isp = file_inode(file)->i_security;
 	if (isp->smk_mmap == NULL)
 		return 0;
+	sbsp = file_inode(file)->i_sb->s_security;
+	if (sbsp->smk_flags & SMK_SB_UNTRUSTED &&
+	    isp->smk_mmap != sbsp->smk_root)
+		return -EACCES;
 	mkp = isp->smk_mmap;
 
 	tsp = current_security();