| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
|
| |
Assigning a nice value to LITMUS^RT tasks is meaningless. Bail out
early.
|
|
|
|
| |
This patch fixes a BUG_ON() in litmus/preempt.c.
|
|
|
|
|
| |
SCHED_DEADLINE should not preempt LITMUS^RT tasks, as the LITMUS^RT
scheduling class is positioned above the SCHED_DEADLINE policy.
|
|
|
|
|
|
| |
The rt scheduling class thinks it's the highest-priority scheduling
class around, next to SCHED_DEADLINE. It is not in LITMUS^RT. Don't go
preempting remote cores that run SCHED_LITMUS tasks.
|
| |
|
|
|
|
| |
To keep track of stack usage and to notify plugin, if necessary.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Global plugins require that the plugin be called even if there
currently is no real-time task executing on the local core.
|
| |
|
|
|
|
|
|
| |
Needs to be above stop_machine_class for legacy reasons; the main
plugins were developed before stop_machine_class was introduced and
assume that they are the highest-priority scheduling class.
|
| |
|
| |
|
|
|
|
| |
Check whether a suspension is required at end of schedule().
|
|
|
|
| |
This patch adds context switch tracing to the main Linux scheduler.
|
|
|
|
|
|
|
|
|
|
|
| |
Whenever the kernel checks for rt_task() to avoid delaying real-time
tasks, we want it to also not delay LITMUS^RT tasks. Hence, most
calls to rt_task() should be matched by an equivalent call to
is_realtime().
Notably, this affects the implementations of select() and nanosleep(),
which use timer_slack_ns when setting up timers for non-real-time
tasks.
|
|
|
|
|
| |
Allow LITMUS^RT to do some work when a process is created or
terminated.
|
|
|
|
|
| |
Otherwise, the scheduler state machine becomes confused (and goes into
a rescheduling loop) when stop-machine is triggered.
|
|
|
|
| |
Track when a processor is going to schedule "soon".
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the core of LITMUS^RT:
- library functionality (heaps, rt_domain, prioritization, etc.)
- budget enforcement logic
- job management
- system call backends
- virtual devices (control page, etc.)
- scheduler plugin API (and dummy plugin)
This code compiles, but is not yet integrated with the rest of Linux.
|
|
|
|
|
|
| |
This patch adds hrtimer_start_on(), which allows arming timers on
remote CPUs. This is needed to avoided timer interrupts on "shielded"
CPUs and is also useful for implementing semi-partitioned schedulers.
|
|
|
|
|
|
|
| |
This patch adds the infrastructure for the TRACE() debug macro.
Conflicts:
kernel/printk.c
|
|
|
|
|
|
|
| |
This patch integrates the overhead tracepoints into the Linux
scheduler that are compatible with plain vanilla Linux (i.e., not
specific to LITMUS^RT plugins). This can be used to measure the
overheads of an otherwise unmodified kernel.
|
|
|
|
|
| |
This patch hooks up Feather-Trace's ft_irq_fired() handler with
Linux's interrupt handling infrastructure.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 63781394c540dd9e666a6b21d70b64dd52bce76e upstream.
request_any_context_irq() returns a negative value on failure.
It returns either IRQC_IS_HARDIRQ or IRQC_IS_NESTED on success.
So fix testing return value of request_any_context_irq().
Also fixup the return value of devm_request_any_context_irq() to make it
consistent with request_any_context_irq().
Fixes: 0668d3065128 ("genirq: Add devm_request_any_context_irq()")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Link: http://lkml.kernel.org/r/1431334978.17783.4.camel@ingics.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 9a1bd63cdae4b623494c4ebaf723a91c35ec49fb upstream.
The list of loaded modules is walked through in
module_kallsyms_on_each_symbol (called by kallsyms_on_each_symbol). The
module_mutex lock should be acquired to prevent potential corruptions
in the list.
This was uncovered with new lockdep asserts in module code introduced by
the commit 0be964be0d45 ("module: Sanitize RCU usage and locking") in
recent next- trees.
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 6e91f8cb138625be96070b778d9ba71ce520ea7e upstream.
If, at the time __rcu_process_callbacks() is invoked, there are callbacks
in Tiny RCU's callback list, but none of them are ready to be invoked,
the current list-management code will knit the non-ready callbacks out
of the list. This can result in hangs and possibly worse. This commit
therefore inserts a check for there being no callbacks that can be
invoked immediately.
This bug is unlikely to occur -- you have to get a new callback between
the time rcu_sched_qs() or rcu_bh_qs() was called, but before we get to
__rcu_process_callbacks(). It was detected by the addition of RCU-bh
testing to rcutorture, which in turn was instigated by Iftekhar Ahmed's
mutation testing. Although this bug was made much more likely by
915e8a4fe45e (rcu: Remove fastpath from __rcu_process_callbacks()), this
did not cause the bug, but rather made it much more probable. That
said, it takes more than 40 hours of rcutorture testing, on average,
for this bug to appear, so this fix cannot be considered an emergency.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f9bb48825a6b5d02f4cabcc78967c75db903dcdc upstream.
This allows for better documentation in the code and
it allows for a simpler and fully correct version of
fs_fully_visible to be written.
The mount points converted and their filesystems are:
/sys/hypervisor/s390/ s390_hypfs
/sys/kernel/config/ configfs
/sys/kernel/debug/ debugfs
/sys/firmware/efi/efivars/ efivarfs
/sys/fs/fuse/connections/ fusectl
/sys/fs/pstore/ pstore
/sys/kernel/tracing/ tracefs
/sys/fs/cgroup/ cgroup
/sys/kernel/security/ securityfs
/sys/fs/selinux/ selinuxfs
/sys/fs/smackfs/ smackfs
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f9bd6733d3f11e24f3949becf277507d422ee1eb upstream.
Add a magic sysctl table sysctl_mount_point that when used to
create a directory forces that directory to be permanently empty.
Update the code to use make_empty_dir_inode when accessing permanently
empty directories.
Update the code to not allow adding to permanently empty directories.
Update /proc/sys/fs/binfmt_misc to be a permanently empty directory.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 2f993cf093643b98477c421fa2b9a98dcc940323 upstream.
While looking for other users of get_state/cond_sync. I Found
ring_buffer_attach() and it looks obviously buggy?
Don't we need to ensure that we have "synchronize" _between_
list_del() and list_add() ?
IOW. Suppose that ring_buffer_attach() preempts right_after
get_state_synchronize_rcu() and gp completes before spin_lock().
In this case cond_synchronize_rcu() does nothing and we reuse
->rb_entry without waiting for gp in between?
It also moves the ->rcu_pending check under "if (rb)", to make it
more readable imo.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: josh@joshtriplett.org
Cc: tj@kernel.org
Fixes: b69cf53640da ("perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()")
Link: http://lkml.kernel.org/r/20150530200425.GA15748@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing filter fix from Steven Rostedt:
"Vince Weaver reported a warning when he added perf event filters into
his fuzzer tests. There's a missing check of balanced operations when
parenthesis are used, and this triggers a WARN_ON() and when reading
the failure, the filter reports no failure occurred.
The operands were not being checked if they match, this adds that"
* tag 'trace-fix-filter-4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Have filter check for balanced ops
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When the following filter is used it causes a warning to trigger:
# cd /sys/kernel/debug/tracing
# echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
# cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: No error
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990()
Modules linked in: bnep lockd grace bluetooth ...
CPU: 3 PID: 1223 Comm: bash Tainted: G W 4.1.0-rc3-test+ #450
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0
0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c
ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea
Call Trace:
[<ffffffff816ed4f9>] dump_stack+0x4c/0x6e
[<ffffffff8107fb07>] warn_slowpath_common+0x97/0xe0
[<ffffffff8136b46c>] ? _kstrtoull+0x2c/0x80
[<ffffffff8107fb6a>] warn_slowpath_null+0x1a/0x20
[<ffffffff81159065>] replace_preds+0x3c5/0x990
[<ffffffff811596b2>] create_filter+0x82/0xb0
[<ffffffff81159944>] apply_event_filter+0xd4/0x180
[<ffffffff81152bbf>] event_filter_write+0x8f/0x120
[<ffffffff811db2a8>] __vfs_write+0x28/0xe0
[<ffffffff811dda43>] ? __sb_start_write+0x53/0xf0
[<ffffffff812e51e0>] ? security_file_permission+0x30/0xc0
[<ffffffff811dc408>] vfs_write+0xb8/0x1b0
[<ffffffff811dc72f>] SyS_write+0x4f/0xb0
[<ffffffff816f5217>] system_call_fastpath+0x12/0x6a
---[ end trace e11028bd95818dcd ]---
Worse yet, reading the error message (the filter again) it says that
there was no error, when there clearly was. The issue is that the
code that checks the input does not check for balanced ops. That is,
having an op between a closed parenthesis and the next token.
This would only cause a warning, and fail out before doing any real
harm, but it should still not caues a warning, and the error reported
should work:
# cd /sys/kernel/debug/tracing
# echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
# cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: Meaningless filter expression
And give no kernel warning.
Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: stable@vger.kernel.org # 2.6.31+
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull lockdep fix from Ingo Molnar:
"A lockdep/modules unload race fix that can oops"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lockdep: Fix a race between /proc/lock_stat and module unload
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The lock_class iteration of /proc/lock_stat is not serialized against
the lockdep_free_key_range() call from module unload.
Therefore it can happen that we find a class of which ->name/->key are
no longer valid.
There is a further bug in zap_class() that left ->name dangling. Cure
this. Use RCU_INIT_POINTER() because NULL.
Since lockdep_free_key_range() is rcu_sched serialized, we can read
both ->name and ->key under rcu_read_lock_sched() (preempt-disable)
and be assured that if we observe a !NULL value it stays safe to use
for as long as we hold that lock.
If we observe both NULL, skip the entry.
Reported-by: Jerome Marchand <jmarchan@redhat.com>
Tested-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150602105013.GS3644@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \ \
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ring buffer benchmark buglet fix from Steven Rostedt:
"Wang Long fixed a minor bug in the module parameter for the ring
buffer benchmark, where the produce_fifo was being ignored and the
producer thread's priority was being set with the consumer_fifo
parameter"
* tag 'trace-rb-bm-fix-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ring-buffer-benchmark: Fix the wrong sched_priority of producer
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The producer should be used producer_fifo as its sched_priority,
so correct it.
Link: http://lkml.kernel.org/r/1433923957-67842-1-git-send-email-long.wanglong@huawei.com
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Wang Long <long.wanglong@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Jovi Zhangwei reported the following problem
Below kernel vm bug can be triggered by tcpdump which mmaped a lot of pages
with GFP_COMP flag.
[Mon May 25 05:29:33 2015] page:ffffea0015414000 count:66 mapcount:1 mapping: (null) index:0x0
[Mon May 25 05:29:33 2015] flags: 0x20047580004000(head)
[Mon May 25 05:29:33 2015] page dumped because: VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page))
[Mon May 25 05:29:33 2015] ------------[ cut here ]------------
[Mon May 25 05:29:33 2015] kernel BUG at mm/migrate.c:1661!
[Mon May 25 05:29:33 2015] invalid opcode: 0000 [#1] SMP
In this case it was triggered by running tcpdump but it's not necessary
reproducible on all systems.
sudo tcpdump -i bond0.100 'tcp port 4242' -c 100000000000 -w 4242.pcap
Compound pages cannot be migrated and it was not expected that such pages
be marked for NUMA balancing. This did not take into account that drivers
such as net/packet/af_packet.c may insert compound pages into userspace
with vm_insert_page. This patch tells the NUMA balancing protection
scanner to skip all VM_MIXEDMAP mappings which avoids the possibility that
compound pages are marked for migration.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Jovi Zhangwei <jovi@cloudflare.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"The biggest chunk of the changes are two regression fixes: a HT
workaround fix and an event-group scheduling fix. It's been verified
with 5 days of fuzzer testing.
Other fixes:
- eBPF fix
- a BIOS breakage detection fix
- PMU driver fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/pt: Fix a refactoring bug
perf/x86: Tweak broken BIOS rules during check_hw_exists()
perf/x86/intel/pt: Untangle pt_buffer_reset_markers()
perf: Disallow sparse AUX allocations for non-SG PMUs in overwrite mode
perf/x86: Improve HT workaround GP counter constraint
perf/x86: Fix event/group validation
perf: Fix race in BPF program unregister
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
PMUs that don't support hardware scatter tables require big contiguous
chunks of memory and a PMI to switch between them. However, in overwrite
using a PMI for this purpose adds extra overhead that the users would
like to avoid. Thus, in overwrite mode for such PMUs we can only allow
one contiguous chunk for the entire requested buffer.
This patch changes the behavior accordingly, so that if the buddy allocator
fails to come up with a single high-order chunk for the entire requested
buffer, the allocation will fail.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1432308626-18845-2-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
there is a race between perf_event_free_bpf_prog() and free_trace_kprobe():
__free_event()
event->destroy(event)
tp_perf_event_destroy()
perf_trace_destroy()
perf_trace_event_unreg()
which is dropping event->tp_event->perf_refcount and allows to proceed in:
unregister_trace_kprobe()
unregister_kprobe_event()
trace_remove_event_call()
probe_remove_event_call()
free_trace_kprobe()
while __free_event does:
call_rcu(&event->rcu_head, free_event_rcu);
free_event_rcu()
perf_event_free_bpf_prog()
To fix the race simply move perf_event_free_bpf_prog() before
event->destroy(), since event->tp_event is still valid at that point.
Note, perf_trace_destroy() is not racing with trace_remove_event_call()
since they both grab event_mutex.
Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: lizefan@huawei.com
Cc: pi3orama@163.com
Fixes: 2541517c32be ("tracing, perf: Implement BPF programs attached to kprobes")
Link: http://lkml.kernel.org/r/1431717321-28772-1-git-send-email-ast@plumgrid.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In the functions compat_get_bitmap() and compat_put_bitmap() the
variable nr_compat_longs stores how many compat_ulong_t words should be
copied in a loop.
The copy loop itself is this:
if (nr_compat_longs-- > 0) {
if (__get_user(um, umask)) return -EFAULT;
} else {
um = 0;
}
Since nr_compat_longs gets unconditionally decremented in each loop and
since it's type is unsigned this could theoretically lead to out of
bounds accesses to userspace if nr_compat_longs wraps around to
(unsigned)(-1).
Although the callers currently do not trigger out-of-bounds accesses, we
should better implement the loop in a safe way to completely avoid such
warp-arounds.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
|
|\ \ \
| |/ /
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull fixes for cpumask and modules from Rusty Russell:
"** NOW WITH TESTING! **
Two fixes which got lost in my recent distraction. One is a weird
cpumask function which needed to be rewritten, the other is a module
bug which is cc:stable"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
cpumask_set_cpu_local_first => cpumask_local_spread, lament
module: Call module notifier on failure after complete_formation()
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The module notifier call chain for MODULE_STATE_COMING was moved up before
the parsing of args, into the complete_formation() call. But if the module failed
to load after that, the notifier call chain for MODULE_STATE_GOING was
never called and that prevented the users of those call chains from
cleaning up anything that was allocated.
Link: http://lkml.kernel.org/r/554C52B9.9060700@gmail.com
Reported-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Fixes: 4982223e51e8 "module: set nx before marking module MODULE_STATE_COMING"
Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"One more fix from the timer departement:
- Handle division of negative nanosecond values proper on 32bit.
A recent cleanup wrecked the sign handling of the dividend and
dropped the check for negative divisors"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ktime: Fix ktime_divns to do signed division
|
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It was noted that the 32bit implementation of ktime_divns()
was doing unsigned division and didn't properly handle
negative values.
And when a ktime helper was changed to utilize
ktime_divns, it caused a regression on some IR blasters.
See the following bugzilla for details:
https://bugzilla.redhat.com/show_bug.cgi?id=1200353
This patch fixes the problem in ktime_divns by checking
and preserving the sign bit, and then reapplying it if
appropriate after the division, it also changes the return
type to a s64 to make it more obvious this is expected.
Nicolas also pointed out that negative dividers would
cause infinite loops on 32bit systems, negative dividers
is unlikely for users of this function, but out of caution
this patch adds checks for negative dividers for both
32-bit (BUG_ON) and 64-bit(WARN_ON) versions to make sure
no such use cases creep in.
[ tglx: Hand an u64 to do_div() to avoid the compiler warning ]
Fixes: 166afb64511e 'ktime: Sanitize ktime_to_us/ms conversion'
Reported-and-tested-by: Trevor Cordes <trevor@tecnopolis.ca>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1431118043-23452-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull block fixes from Jens Axboe:
"Three small fixes that have been picked up the last few weeks.
Specifically:
- Fix a memory corruption issue in NVMe with malignant user
constructed request. From Christoph.
- Kill (now) unused blk_queue_bio(), dm was changed to not need this
anymore. From Mike Snitzer.
- Always use blk_schedule_flush_plug() from the io_schedule() path
when flushing a plug, fixing a !TASK_RUNNING warning with md. From
Shaohua"
* 'for-linus' of git://git.kernel.dk/linux-block:
sched: always use blk_schedule_flush_plug in io_schedule_out
nvme: fix kernel memory corruption with short INQUIRY buffers
block: remove export for blk_queue_bio
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
block plug callback could sleep, so we introduce a parameter
'from_schedule' and corresponding drivers can use it to destinguish a
schedule plug flush or a plug finish. Unfortunately io_schedule_out
still uses blk_flush_plug(). This causes below output (Note, I added a
might_sleep() in raid1_unplug to make it trigger faster, but the whole
thing doesn't matter if I add might_sleep). In raid1/10, this can cause
deadlock.
This patch makes io_schedule_out always uses blk_schedule_flush_plug.
This should only impact drivers (as far as I know, raid 1/10) which are
sensitive to the 'from_schedule' parameter.
[ 370.817949] ------------[ cut here ]------------
[ 370.817960] WARNING: CPU: 7 PID: 145 at ../kernel/sched/core.c:7306 __might_sleep+0x7f/0x90()
[ 370.817969] do not call blocking ops when !TASK_RUNNING; state=2 set at [<ffffffff81092fcf>] prepare_to_wait+0x2f/0x90
[ 370.817971] Modules linked in: raid1
[ 370.817976] CPU: 7 PID: 145 Comm: kworker/u16:9 Tainted: G W 4.0.0+ #361
[ 370.817977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
[ 370.817983] Workqueue: writeback bdi_writeback_workfn (flush-9:1)
[ 370.817985] ffffffff81cd83be ffff8800ba8cb298 ffffffff819dd7af 0000000000000001
[ 370.817988] ffff8800ba8cb2e8 ffff8800ba8cb2d8 ffffffff81051afc ffff8800ba8cb2c8
[ 370.817990] ffffffffa00061a8 000000000000041e 0000000000000000 ffff8800ba8cba28
[ 370.817993] Call Trace:
[ 370.817999] [<ffffffff819dd7af>] dump_stack+0x4f/0x7b
[ 370.818002] [<ffffffff81051afc>] warn_slowpath_common+0x8c/0xd0
[ 370.818004] [<ffffffff81051b86>] warn_slowpath_fmt+0x46/0x50
[ 370.818006] [<ffffffff81092fcf>] ? prepare_to_wait+0x2f/0x90
[ 370.818008] [<ffffffff81092fcf>] ? prepare_to_wait+0x2f/0x90
[ 370.818010] [<ffffffff810776ef>] __might_sleep+0x7f/0x90
[ 370.818014] [<ffffffffa0000c03>] raid1_unplug+0xd3/0x170 [raid1]
[ 370.818024] [<ffffffff81421d9a>] blk_flush_plug_list+0x8a/0x1e0
[ 370.818028] [<ffffffff819e3550>] ? bit_wait+0x50/0x50
[ 370.818031] [<ffffffff819e21b0>] io_schedule_timeout+0x130/0x140
[ 370.818033] [<ffffffff819e3586>] bit_wait_io+0x36/0x50
[ 370.818034] [<ffffffff819e31b5>] __wait_on_bit+0x65/0x90
[ 370.818041] [<ffffffff8125b67c>] ? ext4_read_block_bitmap_nowait+0xbc/0x630
[ 370.818043] [<ffffffff819e3550>] ? bit_wait+0x50/0x50
[ 370.818045] [<ffffffff819e3302>] out_of_line_wait_on_bit+0x72/0x80
[ 370.818047] [<ffffffff810935e0>] ? autoremove_wake_function+0x40/0x40
[ 370.818050] [<ffffffff811de744>] __wait_on_buffer+0x44/0x50
[ 370.818053] [<ffffffff8125ae80>] ext4_wait_block_bitmap+0xe0/0xf0
[ 370.818058] [<ffffffff812975d6>] ext4_mb_init_cache+0x206/0x790
[ 370.818062] [<ffffffff8114bc6c>] ? lru_cache_add+0x1c/0x50
[ 370.818064] [<ffffffff81297c7e>] ext4_mb_init_group+0x11e/0x200
[ 370.818066] [<ffffffff81298231>] ext4_mb_load_buddy+0x341/0x360
[ 370.818068] [<ffffffff8129a1a3>] ext4_mb_find_by_goal+0x93/0x2f0
[ 370.818070] [<ffffffff81295b54>] ? ext4_mb_normalize_request+0x1e4/0x5b0
[ 370.818072] [<ffffffff8129ab67>] ext4_mb_regular_allocator+0x67/0x460
[ 370.818074] [<ffffffff81295b54>] ? ext4_mb_normalize_request+0x1e4/0x5b0
[ 370.818076] [<ffffffff8129ca4b>] ext4_mb_new_blocks+0x4cb/0x620
[ 370.818079] [<ffffffff81290956>] ext4_ext_map_blocks+0x4c6/0x14d0
[ 370.818081] [<ffffffff812a4d4e>] ? ext4_es_lookup_extent+0x4e/0x290
[ 370.818085] [<ffffffff8126399d>] ext4_map_blocks+0x14d/0x4f0
[ 370.818088] [<ffffffff81266fbd>] ext4_writepages+0x76d/0xe50
[ 370.818094] [<ffffffff81149691>] do_writepages+0x21/0x50
[ 370.818097] [<ffffffff811d5c00>] __writeback_single_inode+0x60/0x490
[ 370.818099] [<ffffffff811d630a>] writeback_sb_inodes+0x2da/0x590
[ 370.818103] [<ffffffff811abf4b>] ? trylock_super+0x1b/0x50
[ 370.818105] [<ffffffff811abf4b>] ? trylock_super+0x1b/0x50
[ 370.818107] [<ffffffff811d665f>] __writeback_inodes_wb+0x9f/0xd0
[ 370.818109] [<ffffffff811d69db>] wb_writeback+0x34b/0x3c0
[ 370.818111] [<ffffffff811d70df>] bdi_writeback_workfn+0x23f/0x550
[ 370.818116] [<ffffffff8106bbd8>] process_one_work+0x1c8/0x570
[ 370.818117] [<ffffffff8106bb5b>] ? process_one_work+0x14b/0x570
[ 370.818119] [<ffffffff8106c09b>] worker_thread+0x11b/0x470
[ 370.818121] [<ffffffff8106bf80>] ? process_one_work+0x570/0x570
[ 370.818124] [<ffffffff81071868>] kthread+0xf8/0x110
[ 370.818126] [<ffffffff81071770>] ? kthread_create_on_node+0x210/0x210
[ 370.818129] [<ffffffff819e9322>] ret_from_fork+0x42/0x70
[ 370.818131] [<ffffffff81071770>] ? kthread_create_on_node+0x210/0x210
[ 370.818132] ---[ end trace 7b4deb71e68b6605 ]---
V2: don't change ->in_iowait
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Shaohua Li <shli@fb.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit ab992dc38f9a ("watchdog: Fix merge 'conflict'") has introduced an
obvious deadlock because of a typo. watchdog_proc_mutex should be
unlocked on exit.
Thanks to Miroslav Benes who was staring at the code with me and noticed
this.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Duh-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|