| Commit message (Collapse) | Author | Age |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security
Pull security layer updates from Serge Hallyn:
"This is a merge of James Morris' security-next tree from 3.14 to
yesterday's master, plus four patches from Paul Moore which are in
linux-next, plus one patch from Mimi"
* 'serge-next-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security:
ima: audit log files opened with O_DIRECT flag
selinux: conditionally reschedule in hashtab_insert while loading selinux policy
selinux: conditionally reschedule in mls_convert_context while loading selinux policy
selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES
selinux: Report permissive mode in avc: denied messages.
Warning in scanf string typing
Smack: Label cgroup files for systemd
Smack: Verify read access on file open - v3
security: Convert use of typedef ctl_table to struct ctl_table
Smack: bidirectional UDS connect check
Smack: Correctly remove SMACK64TRANSMUTE attribute
SMACK: Fix handling value==NULL in post setxattr
bugfix patch for SMACK
Smack: adds smackfs/ptrace interface
Smack: unify all ptrace accesses in the smack
Smack: fix the subject/object order in smack_ptrace_traceme()
Minor improvement of 'smack_sb_kern_mount'
smack: fix key permission verification
KEYS: Move the flags representing required permission to linux/key.h
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Files are measured or appraised based on the IMA policy. When a
file, in policy, is opened with the O_DIRECT flag, a deadlock
occurs.
The first attempt at resolving this lockdep temporarily removed the
O_DIRECT flag and restored it, after calculating the hash. The
second attempt introduced the O_DIRECT_HAVELOCK flag. Based on this
flag, do_blockdev_direct_IO() would skip taking the i_mutex a second
time. The third attempt, by Dmitry Kasatkin, resolves the i_mutex
locking issue, by re-introducing the IMA mutex, but uncovered
another problem. Reading a file with O_DIRECT flag set, writes
directly to userspace pages. A second patch allocates a user-space
like memory. This works for all IMA hooks, except ima_file_free(),
which is called on __fput() to recalculate the file hash.
Until this last issue is addressed, do not 'collect' the
measurement for measuring, appraising, or auditing files opened
with the O_DIRECT flag set. Based on policy, permit or deny file
access. This patch defines a new IMA policy rule option named
'permit_directio'. Policy rules could be defined, based on LSM
or other criteria, to permit specific applications to open files
with the O_DIRECT flag set.
Changelog v1:
- permit or deny file access based IMA policy rules
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Cc: <stable@vger.kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After silencing the sleeping warning in mls_convert_context() I started
seeing similar traces from hashtab_insert. Do a cond_resched there too.
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
selinux policy
On a slow machine (with debugging enabled), upgrading selinux policy may take
a considerable amount of time. Long enough that the softlockup detector
gets triggered.
The backtrace looks like this..
> BUG: soft lockup - CPU#2 stuck for 23s! [load_policy:19045]
> Call Trace:
> [<ffffffff81221ddf>] symcmp+0xf/0x20
> [<ffffffff81221c27>] hashtab_search+0x47/0x80
> [<ffffffff8122e96c>] mls_convert_context+0xdc/0x1c0
> [<ffffffff812294e8>] convert_context+0x378/0x460
> [<ffffffff81229170>] ? security_context_to_sid_core+0x240/0x240
> [<ffffffff812221b5>] sidtab_map+0x45/0x80
> [<ffffffff8122bb9f>] security_load_policy+0x3ff/0x580
> [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
> [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
> [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
> [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
> [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
> [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
> [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
> [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
> [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
> [<ffffffff8109c82d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
> [<ffffffff81279a2e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [<ffffffff810d28a8>] ? rcu_irq_exit+0x68/0xb0
> [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
> [<ffffffff8121e947>] sel_write_load+0xa7/0x770
> [<ffffffff81139633>] ? vfs_write+0x1c3/0x200
> [<ffffffff81210e8e>] ? security_file_permission+0x1e/0xa0
> [<ffffffff8113952b>] vfs_write+0xbb/0x200
> [<ffffffff811581c7>] ? fget_light+0x397/0x4b0
> [<ffffffff81139c27>] SyS_write+0x47/0xa0
> [<ffffffff8153bde4>] tracesys+0xdd/0xe2
Stephen Smalley suggested:
> Maybe put a cond_resched() within the ebitmap_for_each_positive_bit()
> loop in mls_convert_context()?
That seems to do the trick. Tested by downgrading and re-upgrading selinux-policy-targeted.
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We presently prevent processes from using setexecon() to set the
security label of exec()'d processes when NO_NEW_PRIVS is enabled by
returning an error; however, we silently ignore setexeccon() when
exec()'ing from a nosuid mounted filesystem. This patch makes things
a bit more consistent by returning an error in the setexeccon()/nosuid
case.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We cannot presently tell from an avc: denied message whether access was in
fact denied or was allowed due to global or per-domain permissive mode.
Add a permissive= field to the avc message to reflect this information.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
|
| |\
| | |
| | |
| | | |
into next
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This fixes a warning about the mismatch of types between
the declared unsigned and integer.
Signed-off-by: Toralf Förster <toralf.foerster@gmx.de>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The cgroup filesystem isn't ready for an LSM to
properly use extented attributes. This patch makes
files created in the cgroup filesystem usable by
a system running Smack and systemd.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Smack believes that many of the operatons that can
be performed on an open file descriptor are read operations.
The fstat and lseek system calls are examples.
An implication of this is that files shouldn't be open
if the task doesn't have read access even if it has
write access and the file is being opened write only.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Smack IPC policy requires that the sender have write access
to the receiver. UDS streams don't do per-packet checks. The
only check is done at connect time. The existing code checks
if the connecting process can write to the other, but not the
other way around. This change adds a check that the other end
can write to the connecting process.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schuafler <casey@schaufler-ca.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Sam Henderson points out that removing the SMACK64TRANSMUTE
attribute from a directory does not result in the directory
transmuting. This is because the inode flag indicating that
the directory is transmuting isn't cleared. The fix is a tad
less than trivial because smk_task and smk_mmap should have
been broken out, too.
Targeted for git://git.gitorious.org/smack-next/kernel.git
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The function `smack_inode_post_setxattr` is called each
time that a setxattr is done, for any value of name.
The kernel allow to put value==NULL when size==0
to set an empty attribute value. The systematic
call to smk_import_entry was causing the dereference
of a NULL pointer hence a KERNEL PANIC!
The problem can be produced easily by issuing the
command `setfattr -n user.data file` under bash prompt
when SMACK is active.
Moving the call to smk_import_entry as proposed by this
patch is correcting the behaviour because the function
smack_inode_post_setxattr is called for the SMACK's
attributes only if the function smack_inode_setxattr validated
the value and its size (what will not be the case when size==0).
It also has a benefical effect to not fill the smack hash
with garbage values coming from any extended attribute
write.
Change-Id: Iaf0039c2be9bccb6cee11c24a3b44d209101fe47
Signed-off-by: José Bollo <jose.bollo@open.eurogiciel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
1. In order to remove any SMACK extended attribute from a file, a user
should have CAP_MAC_ADMIN capability. But user without having this
capability is able to remove SMACK64MMAP security attribute.
2. While validating size and value of smack extended attribute in
smack_inode_setsecurity hook, wrong error code is returned.
Signed-off-by: Pankaj Kumar <pamkaj.k2@samsung.com>
Signed-off-by: Himanshu Shukla <himanshu.sh@samsung.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This allows to limit ptrace beyond the regular smack access rules.
It adds a smackfs/ptrace interface that allows smack to be configured
to require equal smack labels for PTRACE_MODE_ATTACH access.
See the changes in Documentation/security/Smack.txt below for details.
Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@partner.samsung.com>
Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The decision whether we can trace a process is made in the following
functions:
smack_ptrace_traceme()
smack_ptrace_access_check()
smack_bprm_set_creds() (in case the proces is traced)
This patch unifies all those decisions by introducing one function that
checks whether ptrace is allowed: smk_ptrace_rule_check().
This makes possible to actually trace with TRACEME where first the
TRACEME itself must be allowed and then exec() on a traced process.
Additional bugs fixed:
- The decision is made according to the mode parameter that is now correctly
translated from PTRACE_MODE_* to MAY_* instead of being treated 1:1.
PTRACE_MODE_READ requires MAY_READ.
PTRACE_MODE_ATTACH requires MAY_READWRITE.
- Add a smack audit log in case of exec() refused by bprm_set_creds().
- Honor the PTRACE_MODE_NOAUDIT flag and don't put smack audit info
in case this flag is set.
Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@partner.samsung.com>
Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The order of subject/object is currently reversed in
smack_ptrace_traceme(). It is currently checked if the tracee has a
capability to trace tracer and according to this rule a decision is made
whether the tracer will be allowed to trace tracee.
Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@partner.samsung.com>
Signed-off-by: Rafal Krypa <r.krypa@samsung.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Fix a possible memory access fault when transmute is true and isp is NULL.
Signed-off-by: José Bollo <jose.bollo@open.eurogiciel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This typedef is unnecessary and should just be removed.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
| |\ \
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
For any keyring access type SMACK always used MAY_READWRITE access check.
It prevents reading the key with label "_", which should be allowed for anyone.
This patch changes default access check to MAY_READ and use MAY_READWRITE in only
appropriate cases.
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Move the flags representing required permission to linux/key.h as the perm
parameter of security_key_permission() is in terms of them - and not the
permissions mask flags used in key->perm.
Whilst we're at it:
(1) Rename them to be KEY_NEED_xxx rather than KEY_xxx to avoid collisions
with symbols in uapi/linux/input.h.
(2) Don't use key_perm_t for a mask of required permissions, but rather limit
it to the permissions mask attached to the key and arguments related
directly to that.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
|
| |\ \ \
| | |_|/
| |/| | |
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Pull UBIFS updates from Artem Bityutskiy:
"This contains several UBIFS fixes. One of them fixes a race condition
between the mmap page fault path and fsync. Another just removes a
bogus assertion from the UBIFS memory shrinker.
UBIFS also started honoring the MS_SILENT mount flag, so now it won't
print many I/O errors when user-space just tries to probe for the FS.
Rest of the changes are rather minor UBI/UBIFS fixes, improvements,
and clean-ups"
* tag 'upstream-3.16-rc1-v2' of git://git.infradead.org/linux-ubifs:
UBIFS: Add an assertion for clean_zn_cnt
UBIFS: respect MS_SILENT mount flag
UBIFS: Remove incorrect assertion in shrink_tnc()
UBIFS: fix debugging check
UBIFS: add missing ui pointer in debugging code
UBI: block: Fix error path on alloc_workqueue failure
UBIFS: Fix dump messages in ubifs_dump_lprops
UBI: fix rb_tree node comparison in add_map
UBIFS: Remove unused variables in ubifs_budget_space
UBI: weaken the 'exclusive' constraint when opening volumes to rename
UBIFS: fix an mmap and fsync race condition
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This patch adds a new ubifs_assert() in ubifs_tnc_close() to check
if there are any leaks of per-filesystem @clean_zn_cnt. This new
assert inspects whether the return value of ubifs_destroy_tnc_subtree()
is equal to @clean_zn_cnt or not while umount.
Artem: a minor amendment
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
When attempting to mount a non-ubifs formatted volume, lots of error
messages (including a stack dump) are thrown to the kernel log even if
the MS_SILENT mount flag is set.
Fix this by introducing adding an additional state-variable in
struct ubifs_info and suppress error messages in ubifs_read_node if
MS_SILENT is set.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
I hit the same assert failed as Dolev Raviv reported in Kernel v3.10
shows like this:
[ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297)
[ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G O 3.10.40 #1
[ 9641.234116] [<c0011a6c>] (unwind_backtrace+0x0/0x12c) from [<c000d0b0>] (show_stack+0x20/0x24)
[ 9641.234137] [<c000d0b0>] (show_stack+0x20/0x24) from [<c0311134>] (dump_stack+0x20/0x28)
[ 9641.234188] [<c0311134>] (dump_stack+0x20/0x28) from [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs])
[ 9641.234265] [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs])
[ 9641.234307] [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [<c00cdad8>] (shrink_slab+0x1d4/0x2f8)
[ 9641.234327] [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) from [<c00d03d0>] (do_try_to_free_pages+0x300/0x544)
[ 9641.234344] [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) from [<c00d0a44>] (try_to_free_pages+0x2d0/0x398)
[ 9641.234363] [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) from [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8)
[ 9641.234382] [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) from [<c00f62d8>] (new_slab+0x78/0x238)
[ 9641.234400] [<c00f62d8>] (new_slab+0x78/0x238) from [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c)
[ 9641.234419] [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) from [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188)
[ 9641.234459] [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) from [<bf227908>] (do_readpage+0x168/0x468 [ubifs])
[ 9641.234553] [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) from [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs])
[ 9641.234606] [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) from [<c00c17c0>] (filemap_fault+0x304/0x418)
[ 9641.234638] [<c00c17c0>] (filemap_fault+0x304/0x418) from [<c00de694>] (__do_fault+0xd4/0x530)
[ 9641.234665] [<c00de694>] (__do_fault+0xd4/0x530) from [<c00e10c0>] (handle_pte_fault+0x480/0xf54)
[ 9641.234690] [<c00e10c0>] (handle_pte_fault+0x480/0xf54) from [<c00e2bf8>] (handle_mm_fault+0x140/0x184)
[ 9641.234716] [<c00e2bf8>] (handle_mm_fault+0x140/0x184) from [<c0316688>] (do_page_fault+0x150/0x3ac)
[ 9641.234737] [<c0316688>] (do_page_fault+0x150/0x3ac) from [<c000842c>] (do_DataAbort+0x3c/0xa0)
[ 9641.234759] [<c000842c>] (do_DataAbort+0x3c/0xa0) from [<c0314e38>] (__dabt_usr+0x38/0x40)
After analyzing the code, I found a condition that may cause this failed
in correct operations. Thus, I think this assertion is wrong and should be
removed.
Suppose there are two clean znodes and one dirty znode in TNC. So the
per-filesystem atomic_t @clean_zn_cnt is (2). If commit start, dirty_znode
is set to COW_ZNODE in get_znodes_to_commit() in case of potentially ops
on this znode. We clear COW bit and DIRTY bit in write_index() without
@tnc_mutex locked. We don't increase @clean_zn_cnt in this place. As the
comments in write_index() shows, if another process hold @tnc_mutex and
dirty this znode after we clean it, @clean_zn_cnt would be decreased to (1).
We will increase @clean_zn_cnt to (2) with @tnc_mutex locked in
free_obsolete_znodes() to keep it right.
If shrink_tnc() performs between decrease and increase, it will release
other 2 clean znodes it holds and found @clean_zn_cnt is less than zero
(1 - 2 = -1), then hit the assertion. Because free_obsolete_znodes() will
soon correct @clean_zn_cnt and no harm to fs in this case, I think this
assertion could be removed.
2 clean zondes and 1 dirty znode, @clean_zn_cnt == 2
Thread A (commit) Thread B (write or others) Thread C (shrinker)
->write_index
->clear_bit(DIRTY_NODE)
->clear_bit(COW_ZNODE)
@clean_zn_cnt == 2
->mutex_locked(&tnc_mutex)
->dirty_cow_znode
->!ubifs_zn_cow(znode)
->!test_and_set_bit(DIRTY_NODE)
->atomic_dec(&clean_zn_cnt)
->mutex_unlocked(&tnc_mutex)
@clean_zn_cnt == 1
->mutex_locked(&tnc_mutex)
->shrink_tnc
->destroy_tnc_subtree
->atomic_sub(&clean_zn_cnt, 2)
->ubifs_assert <- hit
->mutex_unlocked(&tnc_mutex)
@clean_zn_cnt == -1
->mutex_lock(&tnc_mutex)
->free_obsolete_znodes
->atomic_inc(&clean_zn_cnt)
->mutux_unlock(&tnc_mutex)
@clean_zn_cnt == 0 (correct after shrink)
Signed-off-by: hujianyang <hujianyang@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The debugging check which verifies that we never write outside of the file
length was incorrect, since it was multiplying file length by the page size,
instead of dividing. Fix this.
Spotted-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
If UBIFS_DEBUG is defined an additional assertion of the ui_lock
spinlock in do_writepage cannot compile because the ui pointer has not
been previously declared.
Fix this by declaring and initializing the ui pointer in case
UBIFS_DEBUG is defined.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Otherwise we'd return a random value if allocation of the workqueue fails.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Function ubifs_read_one_lp will not set @lp and returns
an error when ubifs_read_one_lp failed. We should not
perform ubifs_dump_lprop in this case because @lp is not
initialized as we wanted.
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The comparisons used in add_vol() shouldn't be identical. Pretty sure
the following is correct but it is completely untested.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
I found two variables in ubifs_budget_space declared but not
use. This state remains since the first commit 1e5176. So just
remove them.
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The UBI volume rename ioctl (UBI_IOCRNVOL) open the volumes in exclusive
mode. The volumes are opened for two reasons: to build a volume rename list,
and a volume remove list.
However, the first open constraint is excessive and can be replaced by
a 'read-write' open mode. The second open constraint is properly set as
'exclusive' given the volume is opened for removal and we don't want any
users around.
By weakening the former 'exclusive' mode, we allow 'read-only' users to keep
the volume open, while a rename is taking place. This is useful to perform
an atomic rename, in a firmware upgrade scenario, while keeping the volume
in read-only use (for instance, if a ubiblock is mounted as rootfs).
It's worth mention this is not the case of UBIFS, which keeps the volume
opened as 'read-write' despite mounted as read-write or read-only mode.
This change was suggested at least twice by Artem:
http://lists.infradead.org/pipermail/linux-mtd/2012-September/044175.html
http://permalink.gmane.org/gmane.linux.drivers.mtd/39866
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
There is a race condition in UBIFS:
Thread A (mmap) Thread B (fsync)
->__do_fault ->write_cache_pages
-> ubifs_vm_page_mkwrite
-> budget_space
-> lock_page
-> release/convert_page_budget
-> SetPagePrivate
-> TestSetPageDirty
-> unlock_page
-> lock_page
-> TestClearPageDirty
-> ubifs_writepage
-> do_writepage
-> release_budget
-> ClearPagePrivate
-> unlock_page
-> !(ret & VM_FAULT_LOCKED)
-> lock_page
-> set_page_dirty
-> ubifs_set_page_dirty
-> TestSetPageDirty (set page dirty without budgeting)
-> unlock_page
This leads to situation where we have a diry page but no budget allocated for
this page, so further write-back may fail with -ENOSPC.
In this fix we return from page_mkwrite without performing unlock_page. We
return VM_FAULT_LOCKED instead. After doing this, the race above will not
happen.
Signed-off-by: hujianyang <hujianyang@huawei.com>
Tested-by: Laurence Withers <lwithers@guralp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fixes an easy DoS and possible information disclosure.
This does nothing about the broken state of x32 auditing.
eparis: If the admin has enabled auditd and has specifically loaded
audit rules. This bug has been around since before git. Wow...
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, there is no special interesting feature, but we've
investigated a couple of tuning points with respect to the I/O flow.
Several major bug fixes and a bunch of clean-ups also have been made.
This patch-set includes the following major enhancement patches:
- enhance wait_on_page_writeback
- support SEEK_DATA and SEEK_HOLE
- enhance readahead flows
- enhance IO flushes
- support fiemap
- add some tracepoints
The other bug fixes are as follows:
- fix to support a large volume > 2TB correctly
- recovery bug fix wrt fallocated space
- fix recursive lock on xattr operations
- fix some cases on the remount flow
And, there are a bunch of cleanups"
* tag 'for-f2fs-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (52 commits)
f2fs: support f2fs_fiemap
f2fs: avoid not to call remove_dirty_inode
f2fs: recover fallocated space
f2fs: fix to recover data written by dio
f2fs: large volume support
f2fs: avoid crash when trace f2fs_submit_page_mbio event in ra_sum_pages
f2fs: avoid overflow when large directory feathure is enabled
f2fs: fix recursive lock by f2fs_setxattr
MAINTAINERS: add a co-maintainer from samsung for F2FS
MAINTAINERS: change the email address for f2fs
f2fs: use inode_init_owner() to simplify codes
f2fs: avoid to use slab memory in f2fs_issue_flush for efficiency
f2fs: add a tracepoint for f2fs_read_data_page
f2fs: add a tracepoint for f2fs_write_{meta,node,data}_pages
f2fs: add a tracepoint for f2fs_write_{meta,node,data}_page
f2fs: add a tracepoint for f2fs_write_end
f2fs: add a tracepoint for f2fs_write_begin
f2fs: fix checkpatch warning
f2fs: deactivate inode page if the inode is evicted
f2fs: decrease the lock granularity during write_begin
...
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This patch links f2fs_fiemap with generic function with get_block.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
There is an errorneous case during the recovery like below.
In recovery_dentry,
1) dir = f2fs_iget();
2) mark the dir with FI_DELAY_IPUT
3) goto unmap_out
After the end of recovery routine, there is no dirty dentries so the dir cannot
be released by iput in remove_dirty_dir_inode.
This patch fixes such the bug case by handling the iget and iput in the
recovery_dentry procedure.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If a fallocated file is fsynced, we should recover the i_size after sudden
power cut.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If data are overwritten through dio, previous f2fs doesn't remain the fsync mark
due to no additional node writes.
Note that this patch should resolve the xfstests:311.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
f2fs's cp has one page which consists of struct f2fs_checkpoint and
version bitmap of sit and nat. To support lots of segments, we need more
blocks for sit bitmap. So let's arrange sit bitmap as following:
+-----------------+------------+
| f2fs_checkpoint | sit bitmap |
| + nat bitmap | |
+-----------------+------------+
0 4k N blocks
Signed-off-by: Changman Lee <cm224.lee@samsung.com>
[Jaegeuk Kim: simple code change for readability]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Previously we allocate pages with no mapping in ra_sum_pages(), so we may
encounter a crash in event trace of f2fs_submit_page_mbio where we access
mapping data of the page.
We'd better allocate pages in bd_inode mapping and invalidate these pages after
we restore data from pages. It could avoid crash in above scenario.
Changes from V1
o remove redundant code in ra_sum_pages() suggested by Jaegeuk Kim.
Call Trace:
[<f1031630>] ? ftrace_raw_event_f2fs_write_checkpoint+0x80/0x80 [f2fs]
[<f10377bb>] f2fs_submit_page_mbio+0x1cb/0x200 [f2fs]
[<f103c5da>] restore_node_summary+0x13a/0x280 [f2fs]
[<f103e22d>] build_curseg+0x2bd/0x620 [f2fs]
[<f104043b>] build_segment_manager+0x1cb/0x920 [f2fs]
[<f1032c85>] f2fs_fill_super+0x535/0x8e0 [f2fs]
[<c115b66a>] mount_bdev+0x16a/0x1a0
[<f102f63f>] f2fs_mount+0x1f/0x30 [f2fs]
[<c115c096>] mount_fs+0x36/0x170
[<c1173635>] vfs_kern_mount+0x55/0xe0
[<c1175388>] do_mount+0x1e8/0x900
[<c1175d72>] SyS_mount+0x82/0xc0
[<c16059cc>] sysenter_do_call+0x12/0x22
Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
When large directory feathure is enable, We have one case which could cause
overflow in dir_buckets() as following:
special case: level + dir_level >= 32 and level < MAX_DIR_HASH_DEPTH / 2.
Here we define MAX_DIR_BUCKETS to limit the return value when the condition
could trigger potential overflow.
Changes from V1
o modify description of calculation in f2fs.txt suggested by Changman Lee.
Suggested-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This patch should resolve the following recursive lock.
[<ffffffff8135a9c3>] call_rwsem_down_write_failed+0x13/0x20
[<ffffffffa01749dc>] f2fs_setxattr+0x5c/0xa0 [f2fs]
[<ffffffffa0174c99>] __f2fs_set_acl+0x1b9/0x340 [f2fs]
[<ffffffffa017515a>] f2fs_init_acl+0x4a/0xcb [f2fs]
[<ffffffffa0159abe>] __f2fs_add_link+0x26e/0x780 [f2fs]
[<ffffffffa015d4d8>] f2fs_mkdir+0xb8/0x150 [f2fs]
[<ffffffff811cebd7>] vfs_mkdir+0xb7/0x160
[<ffffffff811cf89b>] SyS_mkdir+0xab/0xe0
[<ffffffff817244bf>] tracesys+0xe1/0xe6
[<ffffffffffffffff>] 0xffffffffffffffff
The call path indicates:
- f2fs_add_link
: down_write(&fi->i_sem);
- init_inode_metadata
- f2fs_init_acl
- __f2fs_set_acl
- f2fs_setxattr
: down_write(&fi->i_sem);
Here we should not call f2fs_setxattr, but __f2fs_setxattr.
But __f2fs_setxattr is a static function in xattr.c, so that I found the other
generic approach to use f2fs_setxattr.
In f2fs_setxattr, the page pointer is only given from init_inode_metadata.
So, this patch adds this condition to avoid this in f2fs_setxattr.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This patch adds a samsung guy for an F2FS maintainer.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This patch changes the valid email address to maintain the f2fs file system.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This patch uses exported inode_init_owner() to simplify codes in
f2fs_new_inode().
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If we use slab memory in f2fs_issue_flush(), we will face memory pressure and
latency time caused by racing of kmem_cache_{alloc,free}.
Let's alloc memory in stack instead of slab.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This patch adds a tracepoint for f2fs_read_data_page to trace when page is
readed by user.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|