| Commit message (Collapse) | Author | Age |
... | |
| |/ / / / / / /
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
When a timer base is idle, it is forwarded when a new timer is added
to ensure that granularity does not become excessive. When not idle,
the timer tick is expected to increment the base.
However there are several problems:
- If an existing timer is modified, the base is forwarded only after
the index is calculated.
- The base is not forwarded by add_timer_on.
- There is a window after a timer is restarted from a nohz idle, after
it is marked not-idle and before the timer tick on this CPU, where a
timer may be added but the ancient base does not get forwarded.
These result in excessive granularity (a 1 jiffy timeout can blow out
to 100s of jiffies), which cause the rcu lockup detector to trigger,
among other things.
Fix this by keeping track of whether the timer base has been idle
since it was last run or forwarded, and if so then forward it before
adding a new timer.
There is still a case where mod_timer optimises the case of a pending
timer mod with the same expiry time, where the timer can see excessive
granularity relative to the new, shorter interval. A comment is added,
but it's not changed because it is an important fastpath for
networking.
This has been tested and found to fix the RCU softlockup messages.
Testing was also done with tracing to measure requested versus
achieved wakeup latencies for all non-deferrable timers in an idle
system (with no lockup watchdogs running). Wakeup latency relative to
absolute latency is calculated (note this suffers from round-up skew
at low absolute times) and analysed:
max avg std
upstream 506.0 1.20 4.68
patched 2.0 1.08 0.15
The bug was noticed due to the lockup detector Kconfig changes
dropping it out of people's .configs and resulting in larger base
clk skew When the lockup detectors are enabled, no CPU can go idle for
longer than 4 seconds, which limits the granularity errors.
Sub-optimal timer behaviour is observable on a smaller scale in that
case:
max avg std
upstream 9.0 1.05 0.19
patched 2.0 1.04 0.11
Fixes: Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: David Miller <davem@davemloft.net>
Cc: dzickus@redhat.com
Cc: sfr@canb.auug.org.au
Cc: mpe@ellerman.id.au
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: linuxarm@huawei.com
Cc: abdhalee@linux.vnet.ibm.com
Cc: John Stultz <john.stultz@linaro.org>
Cc: akpm@linux-foundation.org
Cc: paulmck@linux.vnet.ibm.com
Cc: torvalds@linux-foundation.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
"A single fix to not allow nonsensical event groups that result in
kernel warnings"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix group {cpu,task} validation
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Regardless of which events form a group, it does not make sense for the
events to target different tasks and/or CPUs, as this leaves the group
inconsistent and impossible to schedule. The core perf code assumes that
these are consistent across (successfully intialised) groups.
Core perf code only verifies this when moving SW events into a HW
context. Thus, we can violate this requirement for pure SW groups and
pure HW groups, unless the relevant PMU driver happens to perform this
verification itself. These mismatched groups subsequently wreak havoc
elsewhere.
For example, we handle watchpoints as SW events, and reserve watchpoint
HW on a per-CPU basis at pmu::event_init() time to ensure that any event
that is initialised is guaranteed to have a slot at pmu::add() time.
However, the core code only checks the group leader's cpu filter (via
event_filter_match()), and can thus install follower events onto CPUs
violating thier (mismatched) CPU filters, potentially installing them
into a CPU without sufficient reserved slots.
This can be triggered with the below test case, resulting in warnings
from arch backends.
#define _GNU_SOURCE
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <sched.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>
static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
int group_fd, unsigned long flags)
{
return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}
char watched_char;
struct perf_event_attr wp_attr = {
.type = PERF_TYPE_BREAKPOINT,
.bp_type = HW_BREAKPOINT_RW,
.bp_addr = (unsigned long)&watched_char,
.bp_len = 1,
.size = sizeof(wp_attr),
};
int main(int argc, char *argv[])
{
int leader, ret;
cpu_set_t cpus;
/*
* Force use of CPU0 to ensure our CPU0-bound events get scheduled.
*/
CPU_ZERO(&cpus);
CPU_SET(0, &cpus);
ret = sched_setaffinity(0, sizeof(cpus), &cpus);
if (ret) {
printf("Unable to set cpu affinity\n");
return 1;
}
/* open leader event, bound to this task, CPU0 only */
leader = perf_event_open(&wp_attr, 0, 0, -1, 0);
if (leader < 0) {
printf("Couldn't open leader: %d\n", leader);
return 1;
}
/*
* Open a follower event that is bound to the same task, but a
* different CPU. This means that the group should never be possible to
* schedule.
*/
ret = perf_event_open(&wp_attr, 0, 1, leader, 0);
if (ret < 0) {
printf("Couldn't open mismatched follower: %d\n", ret);
return 1;
} else {
printf("Opened leader/follower with mismastched CPUs\n");
}
/*
* Open as many independent events as we can, all bound to the same
* task, CPU0 only.
*/
do {
ret = perf_event_open(&wp_attr, 0, 0, -1, 0);
} while (ret >= 0);
/*
* Force enable/disble all events to trigger the erronoeous
* installation of the follower event.
*/
printf("Opened all events. Toggling..\n");
for (;;) {
prctl(PR_TASK_PERF_EVENTS_DISABLE, 0, 0, 0, 0);
prctl(PR_TASK_PERF_EVENTS_ENABLE, 0, 0, 0, 0);
}
return 0;
}
Fix this by validating this requirement regardless of whether we're
moving events.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhou Chengming <zhouchengming1@huawei.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1498142498-15758-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \ \ \ \ \ \ \ \
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Merge misc fixes from Andrew Morton:
"6 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/memblock.c: reversed logic in memblock_discard()
fork: fix incorrect fput of ->exe_file causing use-after-free
mm/madvise.c: fix freeing of locked page with MADV_FREE
dax: fix deadlock due to misaligned PMD faults
mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled
PM/hibernate: touch NMI watchdog when creating snapshot
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
In recently introduced memblock_discard() there is a reversed logic bug.
Memory is freed of static array instead of dynamically allocated one.
Link: http://lkml.kernel.org/r/1503511441-95478-2-git-send-email-pasha.tatashin@oracle.com
Fixes: 3010f876500f ("mm: discard memblock data later")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reported-by: Woody Suwalski <terraluna977@gmail.com>
Tested-by: Woody Suwalski <terraluna977@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Commit 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for
write killable") made it possible to kill a forking task while it is
waiting to acquire its ->mmap_sem for write, in dup_mmap().
However, it was overlooked that this introduced an new error path before
a reference is taken on the mm_struct's ->exe_file. Since the
->exe_file of the new mm_struct was already set to the old ->exe_file by
the memcpy() in dup_mm(), it was possible for the mmput() in the error
path of dup_mm() to drop a reference to ->exe_file which was never
taken.
This caused the struct file to later be freed prematurely.
Fix it by updating mm_init() to NULL out the ->exe_file, in the same
place it clears other things like the list of mmaps.
This bug was found by syzkaller. It can be reproduced using the
following C program:
#define _GNU_SOURCE
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>
static void *mmap_thread(void *_arg)
{
for (;;) {
mmap(NULL, 0x1000000, PROT_READ,
MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
}
}
static void *fork_thread(void *_arg)
{
usleep(rand() % 10000);
fork();
}
int main(void)
{
fork();
fork();
fork();
for (;;) {
if (fork() == 0) {
pthread_t t;
pthread_create(&t, NULL, mmap_thread, NULL);
pthread_create(&t, NULL, fork_thread, NULL);
usleep(rand() % 10000);
syscall(__NR_exit_group, 0);
}
wait(NULL);
}
}
No special kernel config options are needed. It usually causes a NULL
pointer dereference in __remove_shared_vm_struct() during exit, or in
dup_mmap() (which is usually inlined into copy_process()) during fork.
Both are due to a vm_area_struct's ->vm_file being used after it's
already been freed.
Google Bug Id: 64772007
Link: http://lkml.kernel.org/r/20170823211408.31198-1-ebiggers3@gmail.com
Fixes: 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for write killable")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> [v4.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
If madvise(..., MADV_FREE) split a transparent hugepage, it called
put_page() before unlock_page().
This was wrong because put_page() can free the page, e.g. if a
concurrent madvise(..., MADV_DONTNEED) has removed it from the memory
mapping. put_page() then rightfully complained about freeing a locked
page.
Fix this by moving the unlock_page() before put_page().
This bug was found by syzkaller, which encountered the following splat:
BUG: Bad page state in process syzkaller412798 pfn:1bd800
page:ffffea0006f60000 count:0 mapcount:0 mapping: (null) index:0x20a00
flags: 0x200000000040019(locked|uptodate|dirty|swapbacked)
raw: 0200000000040019 0000000000000000 0000000000020a00 00000000ffffffff
raw: ffffea0006f60020 ffffea0006f60020 0000000000000000 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
bad because of flags: 0x1(locked)
Modules linked in:
CPU: 1 PID: 3037 Comm: syzkaller412798 Not tainted 4.13.0-rc5+ #35
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
bad_page+0x230/0x2b0 mm/page_alloc.c:565
free_pages_check_bad+0x1f0/0x2e0 mm/page_alloc.c:943
free_pages_check mm/page_alloc.c:952 [inline]
free_pages_prepare mm/page_alloc.c:1043 [inline]
free_pcp_prepare mm/page_alloc.c:1068 [inline]
free_hot_cold_page+0x8cf/0x12b0 mm/page_alloc.c:2584
__put_single_page mm/swap.c:79 [inline]
__put_page+0xfb/0x160 mm/swap.c:113
put_page include/linux/mm.h:814 [inline]
madvise_free_pte_range+0x137a/0x1ec0 mm/madvise.c:371
walk_pmd_range mm/pagewalk.c:50 [inline]
walk_pud_range mm/pagewalk.c:108 [inline]
walk_p4d_range mm/pagewalk.c:134 [inline]
walk_pgd_range mm/pagewalk.c:160 [inline]
__walk_page_range+0xc3a/0x1450 mm/pagewalk.c:249
walk_page_range+0x200/0x470 mm/pagewalk.c:326
madvise_free_page_range.isra.9+0x17d/0x230 mm/madvise.c:444
madvise_free_single_vma+0x353/0x580 mm/madvise.c:471
madvise_dontneed_free mm/madvise.c:555 [inline]
madvise_vma mm/madvise.c:664 [inline]
SYSC_madvise mm/madvise.c:832 [inline]
SyS_madvise+0x7d3/0x13c0 mm/madvise.c:760
entry_SYSCALL_64_fastpath+0x1f/0xbe
Here is a C reproducer:
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>
#define MADV_FREE 8
#define PAGE_SIZE 4096
static void *mapping;
static const size_t mapping_size = 0x1000000;
static void *madvise_thrproc(void *arg)
{
madvise(mapping, mapping_size, (long)arg);
}
int main(void)
{
pthread_t t[2];
for (;;) {
mapping = mmap(NULL, mapping_size, PROT_WRITE,
MAP_POPULATE|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
munmap(mapping + mapping_size / 2, PAGE_SIZE);
pthread_create(&t[0], 0, madvise_thrproc, (void*)MADV_DONTNEED);
pthread_create(&t[1], 0, madvise_thrproc, (void*)MADV_FREE);
pthread_join(t[0], NULL);
pthread_join(t[1], NULL);
munmap(mapping, mapping_size);
}
}
Note: to see the splat, CONFIG_TRANSPARENT_HUGEPAGE=y and
CONFIG_DEBUG_VM=y are needed.
Google Bug Id: 64696096
Link: http://lkml.kernel.org/r/20170823205235.132061-1-ebiggers3@gmail.com
Fixes: 854e9ed09ded ("mm: support madvise(MADV_FREE)")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [v4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
In DAX there are two separate places where the 2MiB range of a PMD is
defined.
The first is in the page tables, where a PMD mapping inserted for a
given address spans from (vmf->address & PMD_MASK) to ((vmf->address &
PMD_MASK) + PMD_SIZE - 1). That is, from the 2MiB boundary below the
address to the 2MiB boundary above the address.
So, for example, a fault at address 3MiB (0x30 0000) falls within the
PMD that ranges from 2MiB (0x20 0000) to 4MiB (0x40 0000).
The second PMD range is in the mapping->page_tree, where a given file
offset is covered by a radix tree entry that spans from one 2MiB aligned
file offset to another 2MiB aligned file offset.
So, for example, the file offset for 3MiB (pgoff 768) falls within the
PMD range for the order 9 radix tree entry that ranges from 2MiB (pgoff
512) to 4MiB (pgoff 1024).
This system works so long as the addresses and file offsets for a given
mapping both have the same offsets relative to the start of each PMD.
Consider the case where the starting address for a given file isn't 2MiB
aligned - say our faulting address is 3 MiB (0x30 0000), but that
corresponds to the beginning of our file (pgoff 0). Now all the PMDs in
the mapping are misaligned so that the 2MiB range defined in the page
tables never matches up with the 2MiB range defined in the radix tree.
The current code notices this case for DAX faults to storage with the
following test in dax_pmd_insert_mapping():
if (pfn_t_to_pfn(pfn) & PG_PMD_COLOUR)
goto unlock_fallback;
This test makes sure that the pfn we get from the driver is 2MiB
aligned, and relies on the assumption that the 2MiB alignment of the pfn
we get back from the driver matches the 2MiB alignment of the faulting
address.
However, faults to holes were not checked and we could hit the problem
described above.
This was reported in response to the NVML nvml/src/test/pmempool_sync
TEST5:
$ cd nvml/src/test/pmempool_sync
$ make TEST5
You can grab NVML here:
https://github.com/pmem/nvml/
The dmesg warning you see when you hit this error is:
WARNING: CPU: 13 PID: 2900 at fs/dax.c:641 dax_insert_mapping_entry+0x2df/0x310
Where we notice in dax_insert_mapping_entry() that the radix tree entry
we are about to replace doesn't match the locked entry that we had
previously inserted into the tree. This happens because the initial
insertion was done in grab_mapping_entry() using a pgoff calculated from
the faulting address (vmf->address), and the replacement in
dax_pmd_load_hole() => dax_insert_mapping_entry() is done using
vmf->pgoff.
In our failure case those two page offsets (one calculated from
vmf->address, one using vmf->pgoff) point to different order 9 radix
tree entries.
This failure case can result in a deadlock because the radix tree unlock
also happens on the pgoff calculated from vmf->address. This means that
the locked radix tree entry that we swapped in to the tree in
dax_insert_mapping_entry() using vmf->pgoff is never unlocked, so all
future faults to that 2MiB range will block forever.
Fix this by validating that the faulting address's PMD offset matches
the PMD offset from the start of the file. This check is done at the
very beginning of the fault and covers faults that would have mapped to
storage as well as faults to holes. I left the COLOUR check in
dax_pmd_insert_mapping() in place in case we ever hit the insanity
condition where the alignment of the pfn we get from the driver doesn't
match the alignment of the userspace address.
Link: http://lkml.kernel.org/r/20170822222436.18926-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: "Slusarz, Marcin" <marcin.slusarz@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
/sys/kernel/mm/transparent_hugepage/shmem_enabled controls if we want
to allocate huge pages when allocate pages for private in-kernel shmem
mount.
Unfortunately, as Dan noticed, I've screwed it up and the only way to
make kernel allocate huge page for the mount is to use "force" there.
All other values will be effectively ignored.
Link: http://lkml.kernel.org/r/20170822144254.66431-1-kirill.shutemov@linux.intel.com
Fixes: 5a6e75f8110c ("shmem: prepare huge= mount option and sysfs knob")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |/ / / / / / / /
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
There is a problem that when counting the pages for creating the
hibernation snapshot will take significant amount of time, especially on
system with large memory. Since the counting job is performed with irq
disabled, this might lead to NMI lockup. The following warning were
found on a system with 1.5TB DRAM:
Freezing user space processes ... (elapsed 0.002 seconds) done.
OOM killer disabled.
PM: Preallocating image memory...
NMI watchdog: Watchdog detected hard LOCKUP on cpu 27
CPU: 27 PID: 3128 Comm: systemd-sleep Not tainted 4.13.0-0.rc2.git0.1.fc27.x86_64 #1
task: ffff9f01971ac000 task.stack: ffffb1a3f325c000
RIP: 0010:memory_bm_find_bit+0xf4/0x100
Call Trace:
swsusp_set_page_free+0x2b/0x30
mark_free_pages+0x147/0x1c0
count_data_pages+0x41/0xa0
hibernate_preallocate_memory+0x80/0x450
hibernation_snapshot+0x58/0x410
hibernate+0x17c/0x310
state_store+0xdf/0xf0
kobj_attr_store+0xf/0x20
sysfs_kf_write+0x37/0x40
kernfs_fop_write+0x11c/0x1a0
__vfs_write+0x37/0x170
vfs_write+0xb1/0x1a0
SyS_write+0x55/0xc0
entry_SYSCALL_64_fastpath+0x1a/0xa5
...
done (allocated 6590003 pages)
PM: Allocated 26360012 kbytes in 19.89 seconds (1325.28 MB/s)
It has taken nearly 20 seconds(2.10GHz CPU) thus the NMI lockup was
triggered. In case the timeout of the NMI watch dog has been set to 1
second, a safe interval should be 6590003/20 = 320k pages in theory.
However there might also be some platforms running at a lower frequency,
so feed the watchdog every 100k pages.
[yu.c.chen@intel.com: simplification]
Link: http://lkml.kernel.org/r/1503460079-29721-1-git-send-email-yu.c.chen@intel.com
[yu.c.chen@intel.com: use interval of 128k instead of 100k to avoid modulus]
Link: http://lkml.kernel.org/r/1503328098-5120-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reported-by: Jan Filipcewicz <jan.filipcewicz@intel.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\ \ \ \ \ \ \ \ \
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Pull Paolo Bonzini:
"Bugfixes for x86, PPC and s390"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state
KVM: x86: simplify handling of PKRU
KVM: x86: block guest protection keys unless the host has them enabled
KVM: PPC: Book3S HV: Add missing barriers to XIVE code and document them
KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss
KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9
KVM: s390: sthyi: fix specification exception detection
KVM: s390: sthyi: fix sthyi inline assembly
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Nixiaoming pointed out that there is a memory leak in
kvm_vm_ioctl_create_spapr_tce() if the call to anon_inode_getfd()
fails; the memory allocated for the kvmppc_spapr_tce_table struct
is not freed, and nor are the pages allocated for the iommu
tables. In addition, we have already incremented the process's
count of locked memory pages, and this doesn't get restored on
error.
David Hildenbrand pointed out that there is a race in that the
function checks early on that there is not already an entry in the
stt->iommu_tables list with the same LIOBN, but an entry with the
same LIOBN could get added between then and when the new entry is
added to the list.
This fixes all three problems. To simplify things, we now call
anon_inode_getfd() before placing the new entry in the list. The
check for an existing entry is done while holding the kvm->lock
mutex, immediately before adding the new entry to the list.
Finally, on failure we now call kvmppc_account_memlimit to
decrement the process's count of locked memory pages.
Reported-by: Nixiaoming <nixiaoming@huawei.com>
Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
The host pkru is restored right after vcpu exit (commit 1be0e61), so
KVM_GET_XSAVE will return the host PKRU value instead. Fix this by
using the guest PKRU explicitly in fill_xsave and load_xsave. This
part is based on a patch by Junkang Fu.
The host PKRU data may also not match the value in vcpu->arch.guest_fpu.state,
because it could have been changed by userspace since the last time
it was saved, so skip loading it in kvm_load_guest_fpu.
Reported-by: Junkang Fu <junkang.fjk@alibaba-inc.com>
Cc: Yang Zhang <zy107165@alibaba-inc.com>
Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Move it to struct kvm_arch_vcpu, replacing guest_pkru_valid with a
simple comparison against the host value of the register. The write of
PKRU in addition can be skipped if the guest has not enabled the feature.
Once we do this, we need not test OSPKE in the host anymore, because
guest_CR4.PKE=1 implies host_CR4.PKE=1.
The static PKU test is kept to elide the code on older CPUs.
Suggested-by: Yang Zhang <zy107165@alibaba-inc.com>
Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
If the host has protection keys disabled, we cannot read and write the
guest PKRU---RDPKRU and WRPKRU fail with #GP(0) if CR4.PKE=0. Block
the PKU cpuid bit in that case.
This ensures that guest_CR4.PKE=1 implies host_CR4.PKE=1.
Fixes: 1be0e61c1f255faaeab04a390e00c8b9b9042870
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
This adds missing memory barriers to order updates/tests of
the virtual CPPR and MFRR, thus fixing a lost IPI problem.
While at it also document all barriers in this file.
This fixes a bug causing guest IPIs to occasionally get lost. The
symptom then is hangs or stalls in the guest.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
This adds a workaround for a bug in POWER9 DD1 chips where changing
the CPPR (Current Processor Priority Register) can cause bits in the
IPB (Interrupt Pending Buffer) to get lost. Thankfully it only
happens when manually manipulating CPPR which is quite rare. When it
does happen it can cause interrupts to be delayed or lost.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
When msgsnd is used for IPIs to other cores, msgsync must be executed by
the target to order stores performed on the source before its msgsnd
(provided the source executes the appropriate sync).
Fixes: 1704a81ccebc ("KVM: PPC: Book3S HV: Use msgsnd for IPIs to other cores on POWER9")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
|
| |\ \ \ \ \ \ \ \ \
| | |_|/ / / / / / /
| |/| | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: two fixes for sthyi emulation
- missing inline assembly constraint
- wrong exception handling
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
sthyi should only generate a specification exception if the function
code is zero and the response buffer is not on a 4k boundary.
The current code would also test for unknown function codes if the
response buffer, that is currently only defined for function code 0,
is not on a 4k boundary and incorrectly inject a specification
exception instead of returning with condition code 3 and return code 4
(unsupported function code).
Fix this by moving the boundary check.
Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation")
Cc: <stable@vger.kernel.org> # 4.8+
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
The sthyi inline assembly misses register r3 within the clobber
list. The sthyi instruction will always write a return code to
register "R2+1", which in this case would be r3. Due to that we may
have register corruption and see host crashes or data corruption
depending on how gcc decided to allocate and use registers during
compile time.
Fixes: 95ca2cb57985 ("KVM: s390: Add sthyi emulation")
Cc: <stable@vger.kernel.org> # 4.8+
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|\ \ \ \ \ \ \ \ \ \
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
Pull virtio fixes from Michael Tsirkin:
"Fixes two obvious bugs in virtio pci"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_pci: fix cpu affinity support
virtio_blk: fix incorrect message when disk is resized
|
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
Commit 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for
virtqueues"") removed the adjustment of the pre_vectors for the virtio
MSI-X vector allocation which was added in commit fb5e31d9 ("virtio:
allow drivers to request IRQ affinity when creating VQs"). This will
lead to an incorrect assignment of MSI-X vectors, and potential
deadlocks when offlining cpus.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: 0b0f9dc5 ("Revert "virtio_pci: use shared interrupts for virtqueues")
Reported-by: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
| |/ / / / / / / / /
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
The message printed on disk resize is incorrect. The following is
printed when resizing to 2 GiB:
$ truncate -s 1G test.img
$ qemu -device virtio-blk-pci,logical_block_size=4096,...
(qemu) block_resize drive1 2G
virtio_blk virtio0: new size: 4194304 4096-byte logical blocks (17.2 GB/16.0 GiB)
The virtio_blk capacity config field is in 512-byte sector units
regardless of logical_block_size as per the VIRTIO specification.
Therefore the message should read:
virtio_blk virtio0: new size: 524288 4096-byte logical blocks (2.15 GB/2.0 GiB)
Note that this only affects the printed message. Thankfully the actual
block device has the correct size because the block layer expects
capacity in sectors.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|\ \ \ \ \ \ \ \ \ \
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Just one fix, to add a barrier in the switch_mm() code to make sure
the mm cpumask update is ordered vs the MMU starting to load
translations. As far as we know no one's actually hit the bug, but
that's just luck.
Thanks to Benjamin Herrenschmidt, Nicholas Piggin"
* tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Ensure cpumask update is ordered
|
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
There is no guarantee that the various isync's involved with
the context switch will order the update of the CPU mask with
the first TLB entry for the new context being loaded by the HW.
Be safe here and add a memory barrier to order any subsequent
load/store which may bring entries into the TLB.
The corresponding barrier on the other side already exists as
pte updates use pte_xchg() which uses __cmpxchg_u64 which has
a sync after the atomic operation.
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add comments in the code]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|\ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
Pull nfsd fixes from Bruce Fields:
"Two nfsd bugfixes, neither 4.13 regressions, but both potentially
serious"
* tag 'nfsd-4.13-2' of git://linux-nfs.org/~bfields/linux:
net: sunrpc: svcsock: fix NULL-pointer exception
nfsd: Limit end of page list when decoding NFSv4 WRITE
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
While running nfs/connectathon tests kernel NULL-pointer exception
has been observed due to races in svcsock.c.
Race is appear when kernel accepts connection by kernel_accept
(which creates new socket) and start queuing ingress packets
to new socket. This happens in ksoftirq context which could run
concurrently on a different core while new socket setup is not done yet.
The fix is to re-order socket user data init sequence and add
write/read barrier calls to be sure that we got proper values
for callback pointers before actually calling them.
Test results: nfs/connectathon reports '0' failed tests for about 200+ iterations.
Crash log:
---<-snip->---
[ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 6708.647093] pgd = ffff0000094e0000
[ 6708.650497] [00000000] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000010ffff80003, *pte=0000000000000000
[ 6708.660761] Internal error: Oops: 86000005 [#1] SMP
[ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
[ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_3c300909c5b3f46dcacd49aab3334af_87021]
[ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: G W OE 4.11.0-4.el7.aarch64 #1
[ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 2017
[ 6708.767910] task: ffff810006842e80 task.stack: ffff81000689c000
[ 6708.773822] PC is at 0x0
[ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc]
[ 6708.781866] pc : [<0000000000000000>] lr : [<ffff0000029d7378>] pstate: 60000145
[ 6708.789248] sp : ffff810ffbad3900
[ 6708.792551] x29: ffff810ffbad3900 x28: ffff000008c73d58
[ 6708.797853] x27: 0000000000000000 x26: ffff81000bbe1e00
[ 6708.803156] x25: 0000000000000020 x24: ffff800f7410bf28
[ 6708.808458] x23: ffff000008c63000 x22: ffff000008c63000
[ 6708.813760] x21: ffff800f7410bf28 x20: ffff81000bbe1e00
[ 6708.819063] x19: ffff810012412400 x18: 00000000d82a9df2
[ 6708.824365] x17: 0000000000000000 x16: 0000000000000000
[ 6708.829667] x15: 0000000000000000 x14: 0000000000000001
[ 6708.834969] x13: 0000000000000000 x12: 722e736f622e676e
[ 6708.840271] x11: 00000000f814dd99 x10: 0000000000000000
[ 6708.845573] x9 : 7374687225000000 x8 : 0000000000000000
[ 6708.850875] x7 : 0000000000000000 x6 : 0000000000000000
[ 6708.856177] x5 : 0000000000000028 x4 : 0000000000000000
[ 6708.861479] x3 : 0000000000000000 x2 : 00000000e5000000
[ 6708.866781] x1 : 0000000000000000 x0 : ffff81000bbe1e00
[ 6708.872084]
[ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0xffff81000689c000)
[ 6708.880341] Stack: (0xffff810ffbad3900 to 0xffff8100068a0000)
[ 6708.886075] Call trace:
[ 6708.888513] Exception stack(0xffff810ffbad3710 to 0xffff810ffbad3840)
[ 6708.894942] 3700: ffff810012412400 0001000000000000
[ 6708.902759] 3720: ffff810ffbad3900 0000000000000000 0000000060000145 ffff800f79300000
[ 6708.910577] 3740: ffff000009274d00 00000000000003ea 0000000000000015 ffff000008c63000
[ 6708.918395] 3760: ffff810ffbad3830 ffff800f79300000 000000000000004d 0000000000000000
[ 6708.926212] 3780: ffff810ffbad3890 ffff0000080f88dc ffff800f79300000 000000000000004d
[ 6708.934030] 37a0: ffff800f7930093c ffff000008c63000 0000000000000000 0000000000000140
[ 6708.941848] 37c0: ffff000008c2c000 0000000000040b00 ffff81000bbe1e00 0000000000000000
[ 6708.949665] 37e0: 00000000e5000000 0000000000000000 0000000000000000 0000000000000028
[ 6708.957483] 3800: 0000000000000000 0000000000000000 0000000000000000 7374687225000000
[ 6708.965300] 3820: 0000000000000000 00000000f814dd99 722e736f622e676e 0000000000000000
[ 6708.973117] [< (null)>] (null)
[ 6708.977824] [<ffff0000086f9fa4>] tcp_data_queue+0x754/0xc5c
[ 6708.983386] [<ffff0000086fa64c>] tcp_rcv_established+0x1a0/0x67c
[ 6708.989384] [<ffff000008704120>] tcp_v4_do_rcv+0x15c/0x22c
[ 6708.994858] [<ffff000008707418>] tcp_v4_rcv+0xaf0/0xb58
[ 6709.000077] [<ffff0000086df784>] ip_local_deliver_finish+0x10c/0x254
[ 6709.006419] [<ffff0000086dfea4>] ip_local_deliver+0xf0/0xfc
[ 6709.011980] [<ffff0000086dfad4>] ip_rcv_finish+0x208/0x3a4
[ 6709.017454] [<ffff0000086e018c>] ip_rcv+0x2dc/0x3c8
[ 6709.022328] [<ffff000008692fc8>] __netif_receive_skb_core+0x2f8/0xa0c
[ 6709.028758] [<ffff000008696068>] __netif_receive_skb+0x38/0x84
[ 6709.034580] [<ffff00000869611c>] netif_receive_skb_internal+0x68/0xdc
[ 6709.041010] [<ffff000008696bc0>] napi_gro_receive+0xcc/0x1a8
[ 6709.046690] [<ffff0000014b0fc4>] nicvf_cq_intr_handler+0x59c/0x730 [nicvf]
[ 6709.053559] [<ffff0000014b1380>] nicvf_poll+0x38/0xb8 [nicvf]
[ 6709.059295] [<ffff000008697a6c>] net_rx_action+0x2f8/0x464
[ 6709.064771] [<ffff000008081824>] __do_softirq+0x11c/0x308
[ 6709.070164] [<ffff0000080d14e4>] irq_exit+0x12c/0x174
[ 6709.075206] [<ffff00000813101c>] __handle_domain_irq+0x78/0xc4
[ 6709.081027] [<ffff000008081608>] gic_handle_irq+0x94/0x190
[ 6709.086501] Exception stack(0xffff81000689fdf0 to 0xffff81000689ff20)
[ 6709.092929] fde0: 0000810ff2ec0000 ffff000008c10000
[ 6709.100747] fe00: ffff000008c70ef4 0000000000000001 0000000000000000 ffff810ffbad9b18
[ 6709.108565] fe20: ffff810ffbad9c70 ffff8100169d3800 ffff810006843ab0 ffff81000689fe80
[ 6709.116382] fe40: 0000000000000bd0 0000ffffdf979cd0 183f5913da192500 0000ffff8a254ce4
[ 6709.124200] fe60: 0000ffff8a254b78 0000aaab10339808 0000000000000000 0000ffff8a0c2a50
[ 6709.132018] fe80: 0000ffffdf979b10 ffff000008d6d450 ffff000008c10000 ffff000008d6d000
[ 6709.139836] fea0: 0000000000000054 ffff000008cd3dbc 0000000000000000 0000000000000000
[ 6709.147653] fec0: 0000000000000000 0000000000000000 0000000000000000 ffff81000689ff20
[ 6709.155471] fee0: ffff000008085240 ffff81000689ff20 ffff000008085244 0000000060000145
[ 6709.163289] ff00: ffff81000689ff10 ffff00000813f1e4 ffffffffffffffff ffff00000813f238
[ 6709.171107] [<ffff000008082eb4>] el1_irq+0xb4/0x140
[ 6709.175976] [<ffff000008085244>] arch_cpu_idle+0x44/0x11c
[ 6709.181368] [<ffff0000087bf3b8>] default_idle_call+0x20/0x30
[ 6709.187020] [<ffff000008116d50>] do_idle+0x158/0x1e4
[ 6709.191973] [<ffff000008116ff4>] cpu_startup_entry+0x2c/0x30
[ 6709.197624] [<ffff00000808e7cc>] secondary_start_kernel+0x13c/0x160
[ 6709.203878] [<0000000001bc71c4>] 0x1bc71c4
[ 6709.207967] Code: bad PC value
[ 6709.211061] SMP: stopping secondary CPUs
[ 6709.218830] Starting crashdump kernel...
[ 6709.222749] Bye!
---<-snip>---
Signed-off-by: Vadim Lomovtsev <vlomovts@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | |
| | | | | | | | | | | | |
When processing an NFSv4 WRITE operation, argp->end should never
point past the end of the data in the final page of the page list.
Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
More critical, nfsd4_decode_write is failing to increment argp->pagelen
when it increments argp->pagelist. This can cause later xdr decoders
to assume more data is available than really is, which can cause server
crashes on malformed requests.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|\ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | | |
git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Some bug fixes for stable for cifs"
* tag 'cifs-fixes-for-4.13-rc6-and-stable' of git://git.samba.org/sfrench/cifs-2.6:
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
cifs: Fix df output for users with quota limits
|
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | | |
Add checking for the path component length and verify it is <= the maximum
that the server advertizes via FileFsAttributeInformation.
With this patch cifs.ko will now return ENAMETOOLONG instead of ENOENT
when users to access an overlong path.
To test this, try to cd into a (non-existing) directory on a CIFS share
that has a too long name:
cd /mnt/aaaaaaaaaaaaaaa...
and it now should show a good error message from the shell:
bash: cd: /mnt/aaaaaaaaaaaaaaaa...aaaaaa: File name too long
rh bz 1153996
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Cc: <stable@vger.kernel.org>
|
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | |
| | | | | | | | | | | | | |
The df for a SMB2 share triggers a GetInfo call for
FS_FULL_SIZE_INFORMATION. The values returned are used to populate
struct statfs.
The problem is that none of the information returned by the call
contains the total blocks available on the filesystem. Instead we use
the blocks available to the user ie. quota limitation when filling out
statfs.f_blocks. The information returned does contain Actual free units
on the filesystem and is used to populate statfs.f_bfree. For users with
quota enabled, it can lead to situations where the total free space
reported is more than the total blocks on the system ending up with df
reports like the following
# df -h /mnt/a
Filesystem Size Used Avail Use% Mounted on
//192.168.22.10/a 2.5G -2.3G 2.5G - /mnt/a
To fix this problem, we instead populate both statfs.f_bfree with the
same value as statfs.f_bavail ie. CallerAvailableAllocationUnits. This
is similar to what is done already in the code for cifs and df now
reports the quota information for the user used to mount the share.
# df --si /mnt/a
Filesystem Size Used Avail Use% Mounted on
//192.168.22.10/a 2.7G 101M 2.6G 4% /mnt/a
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Pierguido Lambri <plambri@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Cc: <stable@vger.kernel.org>
|
|\ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Pull MTD fixes from Brian Norris:
"Two fixes - one for a 4.13 regression, and the other for an older one:
- Atmel NAND: since we started utilizing ONFI timings, we found that
we were being too restrict at rejecting them, partly due to
discrepancies in ONFI 4.0 and earlier versions. Relax the
restriction to keep these platforms booting. This is a 4.13-rc1
regression.
- nandsim: repeated probe/removal may not work after a failed init,
because we didn't free up our debugfs files properly on the failure
path. This has been around since 3.8, but it's nice to get this
fixed now in a nice easy patch that can target -stable, since
there's already refactoring work (that also fixes the issue)
targeted for the next merge window"
* tag 'for-linus-20170825' of git://git.infradead.org/linux-mtd:
mtd: nand: atmel: Relax tADL_min constraint
mtd: nandsim: remove debugfs entries in error path
|
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
Version 4 of the ONFI spec mandates that tADL be at least 400 nanoseconds,
but, depending on the master clock rate, 400 ns may not fit in the tADL
field of the SMC reg. We need to relax the check and accept the -ERANGE
return code.
Note that previous versions of the ONFI spec had a lower tADL_min (100 or
200 ns). It's not clear why this timing constraint got increased but it
seems most NANDs are fine with values lower than 400ns, so we should be
safe.
Fixes: f9ce2eddf176 ("mtd: nand: atmel: Add ->setup_data_interface() hooks")
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Tested-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
|
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
The debugfs entries must be removed before an error is returned in the
probe function. Otherwise another try to load the module fails and when
the debugfs files are accessed without the module loaded, the kernel
still tries to call a function in that module.
Fixes: 5346c27c5fed ("mtd: nandsim: Introduce debugfs infrastructure")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
|
|\ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
Pull block fixes from Jens Axboe:
"A small batch of fixes that should be included for the 4.13 release.
This contains:
- Revert of the 4k loop blocksize support. Even with a recent batch
of 4 fixes, we're still not really happy with it. Rather than be
stuck with an API issue, let's revert it and get it right for 4.14.
- Trivial patch from Bart, adding a few flags to the blk-mq debugfs
exports that were added in this release, but not to the debugfs
parts.
- Regression fix for bsg, fixing a potential kernel panic. From
Benjamin.
- Tweak for the blk throttling, improving how we account discards.
From Shaohua"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq-debugfs: Add names for recently added flags
bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer
Revert "loop: support 4k physical blocksize"
blk-throttle: cap discard request size
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
The symbolic constants QUEUE_FLAG_SCSI_PASSTHROUGH, QUEUE_FLAG_QUIESCED
and REQ_NOWAIT are missing from blk-mq-debugfs.c. Add these to
blk-mq-debugfs.c such that these appear as names in debugfs instead of
as numbers.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
Since we split the scsi_request out of struct request bsg fails to
provide a reply-buffer for the drivers. This was done via the pointer
for sense-data, that is not preallocated anymore.
Failing to allocate/assign it results in illegal dereferences because
LLDs use this pointer unquestioned.
An example panic on s390x, using the zFCP driver, looks like this (I had
debugging on, otherwise NULL-pointer dereferences wouldn't even panic on
s390x):
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6403
Fault in home space mode while using kernel ASCE.
AS:0000000001590007 R3:0000000000000024
Oops: 0038 ilc:2 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: <Long List>
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-bsg-regression+ #3
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
task: 0000000065cb0100 task.stack: 0000000065cb4000
Krnl PSW : 0704e00180000000 000003ff801e4156 (zfcp_fc_ct_els_job_handler+0x16/0x58 [zfcp])
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000001 000000005fa9d0d0 000000005fa9d078 0000000000e16866
000003ff00000290 6b6b6b6b6b6b6b6b 0000000059f78f00 000000000000000f
00000000593a0958 00000000593a0958 0000000060d88800 000000005ddd4c38
0000000058b50100 07000000659cba08 000003ff801e8556 00000000659cb9a8
Krnl Code: 000003ff801e4146: e31020500004 lg %r1,80(%r2)
000003ff801e414c: 58402040 l %r4,64(%r2)
#000003ff801e4150: e35020200004 lg %r5,32(%r2)
>000003ff801e4156: 50405004 st %r4,4(%r5)
000003ff801e415a: e54c50080000 mvhi 8(%r5),0
000003ff801e4160: e33010280012 lt %r3,40(%r1)
000003ff801e4166: a718fffb lhi %r1,-5
000003ff801e416a: 1803 lr %r0,%r3
Call Trace:
([<000003ff801e8556>] zfcp_fsf_req_complete+0x726/0x768 [zfcp])
[<000003ff801ea82a>] zfcp_fsf_reqid_check+0x102/0x180 [zfcp]
[<000003ff801eb980>] zfcp_qdio_int_resp+0x230/0x278 [zfcp]
[<00000000009b91b6>] qdio_kick_handler+0x2ae/0x2c8
[<00000000009b9e3e>] __tiqdio_inbound_processing+0x406/0xc10
[<00000000001684c2>] tasklet_action+0x15a/0x1d8
[<0000000000bd28ec>] __do_softirq+0x3ec/0x848
[<00000000001675a4>] irq_exit+0x74/0xf8
[<000000000010dd6a>] do_IRQ+0xba/0xf0
[<0000000000bd19e8>] io_int_handler+0x104/0x2d4
[<00000000001033b6>] enabled_wait+0xb6/0x188
([<000000000010339e>] enabled_wait+0x9e/0x188)
[<000000000010396a>] arch_cpu_idle+0x32/0x50
[<0000000000bd0112>] default_idle_call+0x52/0x68
[<00000000001cd0fa>] do_idle+0x102/0x188
[<00000000001cd41e>] cpu_startup_entry+0x3e/0x48
[<0000000000118c64>] smp_start_secondary+0x11c/0x130
[<0000000000bd2016>] restart_int_handler+0x62/0x78
[<0000000000000000>] (null)
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<000003ff801e41d6>] zfcp_fc_ct_job_handler+0x3e/0x48 [zfcp]
Kernel panic - not syncing: Fatal exception in interrupt
This patch moves bsg-lib to allocate and setup struct bsg_job ahead of
time, including the allocation of a buffer for the reply-data.
This means, struct bsg_job is not allocated separately anymore, but as part
of struct request allocation - similar to struct scsi_cmd. Reflect this in
the function names that used to handle creation/destruction of struct
bsg_job.
Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
Cc: <stable@vger.kernel.org> #4.11+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
There's some stuff still up in the air, let's not get stuck with a
subpar ABI. I'll follow up with something better for 4.14.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
discard request usually is very big and easily use all bandwidth budget
of a cgroup. discard request size doesn't really mean the size of data
written, so it doesn't make sense to account it into bandwidth budget.
Jens pointed out treating the size 0 doesn't make sense too, because
discard request does have cost. But it's not easy to find the actual
cost. This patch simply makes the size one sector.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C has some bugfixes for you: mainly Jarkko fixed up a few things in
the designware driver regarding the new slave mode. But Ulf also fixed
a long-standing and now agreed suspend problem. Plus, some simple
stuff which nonetheless needs fixing"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: Fix runtime PM for I2C slave mode
i2c: designware: Remove needless pm_runtime_put_noidle() call
i2c: aspeed: fixed potential null pointer dereference
i2c: simtec: use release_mem_region instead of release_resource
i2c: core: Make comment about I2C table requirement to reflect the code
i2c: designware: Fix standard mode speed when configuring the slave mode
i2c: designware: Fix oops from i2c_dw_irq_handler_slave
i2c: designware: Fix system suspend
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
I2C slave controller must be powered and active all the time when I2C
slave backend is registered in order to let master address and
communicate with us.
Now if the controller is runtime PM capable it will be suspended after
probe and cannot ever respond to the master or generate interrupts.
Fix this by resuming the controller when I2C slave backend is registered
and let it suspend after unregistering.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
I guess pm_runtime_put_noidle() call in i2c_dw_probe_slave() was copied
by accident from similar master mode adapter registration code. It is
unbalanced due missing pm_runtime_get_noresume() but harmless since it
doesn't decrease dev->power.usage_count below zero.
In theory we can hit similar needless runtime suspend/resume cycle
during slave mode adapter registration that was happening when
registering the master mode adapter. See commit cd998ded5c12 ("i2c:
designware: Prevent runtime suspend during adapter registration").
However, since we are slave, we can consider it as a wrong configuration
if we have other slaves attached under this adapter and can omit the
pm_runtime_get_noresume()/pm_runtime_put_noidle() calls for simplicity.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Before I skipped null checks when the master is in the STOP state; this
fixes that.
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Acked-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Fixes: f327c686d3ba ("i2c: aspeed: added driver for Aspeed I2C")
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Use api pair of request_mem_region and release_mem_region
instead of release_resource.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
I2C drivers were required to have an I2C device ID table even if were for
devices that would only be registered using a specific firmware interface
(e.g: OF or ACPI).
But commit da10c06a044b ("i2c: Make I2C ID tables non-mandatory for DT'ed
devices") changed the I2C core to relax the requirement and allow drivers
to avoid defining this table.
Unfortunately it only took into account drivers for OF-only devices and
forgot about ACPI-only ones, and this was fixed by commit c64ffff7a9d1
("i2c: core: Allow empty id_table in ACPI case as well").
But the latter didn't update the original comment, so it doesn't reflect
what the code does now.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
Code sets bit DW_IC_CON_SPEED_FAST (0x4) always when configuring the slave
mode. This results incorrect register value DW_IC_CON_SPEED_HIGH (0x6)
when OR'ed together with DW_IC_CON_SPEED_STD (0x2).
Remove this and let the code set the speed mode bits according to clock
frequency or default to fast mode.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
When i2c-designware is initialized in slave mode the
i2c-designware-slave.c: i2c_dw_irq_handler_slave() can hit a NULL
pointer dereference when I2C slave backend is not registered but code is
accessing the struct dw_i2c_dev.slave without testing is it NULL.
We might get spurious interrupts from other devices or from IRQ core
during unloading the driver when CONFIG_DEBUG_SHIRQ is set. Existing
check for enable and IRQ status is not enough since device can be power
gated and those bits may read 1.
Fix this by handling the interrupt only when also struct dw_i2c_dev.slave
is set.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
| | |_|_|_|_|_|_|_|_|_|_|/ / /
| |/| | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
The commit 8503ff166504 ("i2c: designware: Avoid unnecessary resuming
during system suspend"), may suggest to the PM core to try out the so
called direct_complete path for system sleep. In this path, the PM core
treats a runtime suspended device as it's already in a proper low power
state for system sleep, which makes it skip calling the system sleep
callbacks for the device, except for the ->prepare() and the ->complete()
callbacks.
However, the PM core may unset the direct_complete flag for a parent
device, in case its child device are being system suspended before. In this
scenario, the PM core invokes the system sleep callbacks, no matter if the
device is runtime suspended or not.
Particularly in cases of an existing i2c slave device, the above path is
triggered, which breaks the assumption that the i2c device is always
runtime resumed whenever the dw_i2c_plat_suspend() is being called.
More precisely, dw_i2c_plat_suspend() calls clk_core_disable() and
clk_core_unprepare(), for an already disabled/unprepared clock, leading to
a splat in the log about clocks calls being wrongly balanced and breaking
system sleep.
To still allow the direct_complete path in cases when it's possible, but
also to keep the fix simple, let's runtime resume the i2c device in the
->suspend() callback, before continuing to put the device into low power
state.
Note, in cases when the i2c device is attached to the ACPI PM domain, this
problem doesn't occur, because ACPI's ->suspend() callback, assigned to
acpi_subsys_suspend(), already calls pm_runtime_resume() for the device.
It should also be noted that this change does not fix commit 8503ff166504
("i2c: designware: Avoid unnecessary resuming during system suspend").
Because for the non-ACPI case, the system sleep support was already broken
prior that point.
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|\ \ \ \ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
"MMC core: don't return error code R1_OUT_OF_RANGE for open-ending mode"
* tag 'mmc-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: block: prevent propagating R1_OUT_OF_RANGE for open-ending mode
|