| Commit message (Collapse) | Author | Age |
... | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/890952
commit 9bab0b7fbaceec47d32db51cd9e59c82fb071f5a upstream.
This adds a mechanism to resume selected IRQs during syscore_resume
instead of dpm_resume_noirq.
Under Xen we need to resume IRQs associated with IPIs early enough
that the resched IPI is unmasked and we can therefore schedule
ourselves out of the stop_machine where the suspend/resume takes
place.
This issue was introduced by 676dc3cf5bc3 "xen: Use IRQF_FORCE_RESUME".
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/890952
commit cbbc719fccdb8cbd87350a05c0d33167c9b79365 upstream.
The parameter's origin type is long. On an i386 architecture, it can
easily be larger than 0x80000000, causing this function to convert it
to a sign-extended u64 type.
Change the type to unsigned long so we get the correct result.
Signed-off-by: hank <pyu@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
[ build fix ]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/890952
commit da92b194cc36b5dc1fbd85206aeeffd80bee0c39 upstream.
The pair of functions,
* skb_clone_tx_timestamp()
* skb_complete_tx_timestamp()
were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.
As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.
These functions first appeared in v2.6.36.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/890952
commit 276532ba9666b36974cbe16f303fc8be99c9da17 upstream.
The Kirkwood gave an unaligned memory access error on
line 742 of drivers/usb/host/echi-hcd.c:
"ehci->last_periodic_enable = ktime_get_real();"
Signed-off-by: Harro Haan <hrhaan@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/890952
commit fa90e1c935472281de314e6d7c9a37db9cbc2e4e upstream.
If tty_add_file fails at the point it is now, we have to revert all
the changes we did to the tty. It means either decrease all refcounts
if this was a tty reopen or delete the tty if it was newly allocated.
There was a try to fix this in v3.0-rc2 using tty_release in 0259894c7
(TTY: fix fail path in tty_open). But instead it introduced a NULL
dereference. It's because tty_release dereferences
filp->private_data, but that one is set even in our tty_add_file. And
when tty_add_file fails, it's still NULL/garbage. Hence tty_release
cannot be called there.
To circumvent the original leak (and the current NULL deref) we split
tty_add_file into two functions, making the latter non-failing. In
that case we may do the former early in open, where handling failures
is easy. The latter stays as it is now. So there is no change in
functionality.
The original bug (leak) was introduced by f573bd176 (tty: Remove
__GFP_NOFAIL from tty_add_file()). Thanks Dan for reporting this.
Later, we may split tty_release into more functions and call only some
of them in this fail path instead. (If at all possible.)
Introduced-in: v2.6.37-rc2
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
freezer_do_not_count/freezer_count
BugLink: http://bugs.launchpad.net/bugs/24330
Commit 27920651fe "PM / Freezer: Make fake_signal_wake_up() wake
TASK_KILLABLE tasks too" updated fake_signal_wake_up() used by freezer
to wake up KILLABLE tasks. Sending unsolicited wakeups to tasks in
killable sleep is dangerous as there are code paths which depend on
tasks not waking up spuriously from KILLABLE sleep.
For example. sys_read() or page can sleep in TASK_KILLABLE assuming
that wait/down/whatever _killable can only fail if we can not return
to the usermode. TASK_TRACED is another obvious example.
The offending commit was to resolve freezer hang during system PM
operations caused by KILLABLE sleeps in network filesystems.
wait_event_freezekillable(), which depends on the spurious KILLABLE
wakeup, was added by f06ac72e92 "cifs, freezer: add
wait_event_freezekillable and have cifs use it" to be used to
implement killable & freezable sleeps in network filesystems.
To prepare for reverting of 27920651fe, this patch reimplements
wait_event_freezekillable() using freezer_do_not_count/freezer_count()
so that it doesn't depend on the spurious KILLABLE wakeup. This isn't
very nice but should do for now.
[tj: Refreshed patch to apply to linus/master and updated commit
description on Rafael's request.]
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
(cherry picked from commit 6f35c4abd7f0294166a5e0ab0401fe7949b33034)
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/24330
fs/cifs/transport.c: In function 'wait_for_response':
fs/cifs/transport.c:328: error: implicit declaration of function 'wait_event_freezekillable'
Caused by commit f06ac72e9291 ("cifs, freezer: add
wait_event_freezekillable and have cifs use it"). In this config,
CONFIG_FREEZER is not set.
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
CC: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
(cherry picked from commit e0c8ea1a69410ef44043646938a6a4175f5307e4)
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/24330
Signed-off-by: Steve French <smfrench@gmail.com>
(cherry picked from commit b957ae9c53d5715a07f8bac644d8ff0a407c7e07)
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/24330
CIFS currently uses wait_event_killable to put tasks to sleep while
they await replies from the server. That function though does not
allow the freezer to run. In many cases, the network interface may
be going down anyway, in which case the reply will never come. The
client then ends up blocking the computer from suspending.
Fix this by adding a new wait_event_freezable variant --
wait_event_freezekillable. The idea is to combine the behavior of
wait_event_killable and wait_event_freezable -- put the task to
sleep and only allow it to be awoken by fatal signals, but also
allow the freezer to do its job.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
(cherry picked from commit f06ac72e929115f2772c29727152ba0832d641e4)
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/881420
commit 04da85b86188f224cc9b391b5bdd92a3ba20ffcf upstream.
The struct ftrace_hash was declared within CONFIG_FUNCTION_TRACER
but was referenced outside of it.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/881420
commit 43dd61c9a09bd413e837df829e6bfb42159be52a upstream.
The new code that allows different utilities to pick and choose
what functions they trace broke the :mod: hook that allows users
to trace only functions of a particular module.
The reason is that the :mod: hook bypasses the hash that is setup
to allow individual users to trace their own functions and uses
the global hash directly. But if the global hash has not been
set up, it will cause a bug:
echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter
produces:
[drm:drm_mode_getfb] *ERROR* invalid framebuffer id
[drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip
BUG: unable to handle kernel paging request at ffffffff8160ec90
IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
PGD 1a05067 PUD 1a09063 PMD 80000000016001e1
Oops: 0003 [#1] SMP Jul 7 04:02:28 phyllis kernel: [55303.858604] CPU 1
Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt
Pid: 10344, comm: bash Tainted: G WC 3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ
RIP: 0010:[<ffffffff810d9136>] [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
RSP: 0018:ffff88003a96bda8 EFLAGS: 00010246
RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940
RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78
R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470)
Stack:
0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5
ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80
ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00
Call Trace:
[<ffffffff810d92f5>] match_records+0x155/0x1b0
[<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100
[<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210
[<ffffffff810db09f>] ftrace_filter_write+0xf/0x20
[<ffffffff81166e48>] vfs_write+0xc8/0x190
[<ffffffff81167001>] sys_write+0x51/0x90
[<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b
Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2
RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
RSP <ffff88003a96bda8>
CR2: ffffffff8160ec90
---[ end trace a5d031828efdd88e ]---
Reported-by: Brian Marete <marete@toshnix.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/881420
This patch fixes the issue caused by ef81bb40bf15f350fe865f31fa42f1082772a576
which is a backport of upstream 87c48fa3b4630905f98268dde838ee43626a060c. The
problem does not exist in upstream.
We do not check whether route is attached before trying to assign ip
identification through route dest which lead NULL pointer dereference. This
happens when host bridge transmit a packet from guest.
This patch changes ipv6_select_ident() to accept in6_addr as its paramter and
fix the issue by using the destination address in ipv6 header when no route is
attached.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/881420
commit f75159e9936143177b442afc78150b7a7ad8aa07 upstream.
The IEEE 1588 standard defines two kinds of messages, event and general
messages. Event messages require time stamping, and general do not. When
using UDP transport, two separate ports are used for the two message
types.
The BPF designed to recognize event messages incorrectly classifies L2
general messages as event messages. This commit fixes the issue by
extending the filter to check the message type field for L2 PTP packets.
Event messages are be distinguished from general messages by testing
the "general" bit.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/881420
commit d670ec13178d0fd8680e6742a2bc6e04f28f87d8 upstream.
David reported:
Attached below is a watered-down version of rt/tst-cpuclock2.c from
GLIBC. Just build it with "gcc -o test test.c -lpthread -lrt" or
similar.
Run it several times, and you will see cases where the main thread
will measure a process clock difference before and after the nanosleep
which is smaller than the cpu-burner thread's individual thread clock
difference. This doesn't make any sense since the cpu-burner thread
is part of the top-level process's thread group.
I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
64-bit binaries).
For example:
[davem@boricha build-x86_64-linux]$ ./test
process: before(0.001221967) after(0.498624371) diff(497402404)
thread: before(0.000081692) after(0.498316431) diff(498234739)
self: before(0.001223521) after(0.001240219) diff(16698)
[davem@boricha build-x86_64-linux]$
The diff of 'process' should always be >= the diff of 'thread'.
I make sure to wrap the 'thread' clock measurements the most tightly
around the nanosleep() call, and that the 'process' clock measurements
are the outer-most ones.
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <pthread.h>
static pthread_barrier_t barrier;
static void *chew_cpu(void *arg)
{
pthread_barrier_wait(&barrier);
while (1)
__asm__ __volatile__("" : : : "memory");
return NULL;
}
int main(void)
{
clockid_t process_clock, my_thread_clock, th_clock;
struct timespec process_before, process_after;
struct timespec me_before, me_after;
struct timespec th_before, th_after;
struct timespec sleeptime;
unsigned long diff;
pthread_t th;
int err;
err = clock_getcpuclockid(0, &process_clock);
if (err)
return 1;
err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
if (err)
return 1;
pthread_barrier_init(&barrier, NULL, 2);
err = pthread_create(&th, NULL, chew_cpu, NULL);
if (err)
return 1;
err = pthread_getcpuclockid(th, &th_clock);
if (err)
return 1;
pthread_barrier_wait(&barrier);
err = clock_gettime(process_clock, &process_before);
if (err)
return 1;
err = clock_gettime(my_thread_clock, &me_before);
if (err)
return 1;
err = clock_gettime(th_clock, &th_before);
if (err)
return 1;
sleeptime.tv_sec = 0;
sleeptime.tv_nsec = 500000000;
nanosleep(&sleeptime, NULL);
err = clock_gettime(th_clock, &th_after);
if (err)
return 1;
err = clock_gettime(my_thread_clock, &me_after);
if (err)
return 1;
err = clock_gettime(process_clock, &process_after);
if (err)
return 1;
diff = process_after.tv_nsec - process_before.tv_nsec;
printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
process_before.tv_sec, process_before.tv_nsec,
process_after.tv_sec, process_after.tv_nsec, diff);
diff = th_after.tv_nsec - th_before.tv_nsec;
printf("thread: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
th_before.tv_sec, th_before.tv_nsec,
th_after.tv_sec, th_after.tv_nsec, diff);
diff = me_after.tv_nsec - me_before.tv_nsec;
printf("self: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
me_before.tv_sec, me_before.tv_nsec,
me_after.tv_sec, me_after.tv_nsec, diff);
return 0;
}
This is due to us using p->se.sum_exec_runtime in
thread_group_cputime() where we iterate the thread group and sum all
data. This does not take time since the last schedule operation (tick
or otherwise) into account. We can cure this by using
task_sched_runtime() at the cost of having to take locks.
This also means we can (and must) do away with
thread_group_sched_runtime() since the modified thread_group_cputime()
is now more accurate and would deadlock when called from
thread_group_sched_runtime().
Aside of that it makes the function safe on 32 bit systems. The old
code added t->se.sum_exec_runtime unprotected. sum_exec_runtime is a
64bit value and could be changed on another cpu at the same time.
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
Tested-by: David Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/868628
commit 6e6938b6d3130305a5960c86b1a9b21e58cf6144 upstream.
sync(2) is performed in two stages: the WB_SYNC_NONE sync and the
WB_SYNC_ALL sync. Identify the first stage with .tagged_writepages and
do livelock prevention for it, too.
Jan's commit f446daaea9 ("mm: implement writeback livelock avoidance
using page tagging") is a partial fix in that it only fixed the
WB_SYNC_ALL phase livelock.
Although ext4 is tested to no longer livelock with commit f446daaea9,
it may due to some "redirty_tail() after pages_skipped" effect which
is by no means a guarantee for _all_ the file systems.
Note that writeback_inodes_sb() is called by not only sync(), they are
treated the same because the other callers also need livelock prevention.
Impact: It changes the order in which pages/inodes are synced to disk.
Now in the WB_SYNC_NONE stage, it won't proceed to write the next inode
until finished with the current inode.
Acked-by: Jan Kara <jack@suse.cz>
CC: Dave Chinner <david@fromorbit.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/868628
commit 8efcc57dedfebc99c3cd39564e3fc47cd1a24b75 upstream.
This needs to be an out of band value for the register and on this device
registers are 16 bit so we must shift left one to the 17th bit.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/868628
commit 51b8b4fb32271d39fbdd760397406177b2b0fd36 upstream.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/868628
commit f88657ce3f9713a0c62101dffb0e972a979e77b9 upstream.
Some of the flags are OS/arch dependent we add a 9p
protocol value which maps to asm-generic/fcntl.h values in Linux
Based on the original patch from Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
[extra comments from author as to why this needs to go to stable:
Earlier for different operation such as open we used the values of open
flag as defined by the OS. But some of these flags such as O_DIRECT are
arch dependent. So if we have the 9p client and server running on
different architectures, we end up with client sending client
architecture value of these open flag and server will try to map these
values to what its architecture states. For ex: O_DIRECT on a x86 client
maps to
#define O_DIRECT 00040000
Where as on sparc server it will maps to
#define O_DIRECT 0x100000
Hence we need to map these open flags to OS/arch independent flag
values. Getting these changes to an early version of kernel ensures us
that we work with different combination of client and server. We should
ideally backport this patch to all possible kernel version.]
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/868628
commit 938f97bcf1bdd1b681d5d14d1d7117a2e22d4434 upstream.
Thomas earlier submitted a fix to limit the RTC PIE freq, but
picked 5000Hz out of the air. Willy noticed that we should
instead use the 8192Hz max from the rtc man documentation.
Cc: Willy Tarreau <w@1wt.eu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/868628
commit 24d406a6bf736f7aebdc8fa0f0ec86e0890c6d24 upstream.
tty_operations->remove is normally called like:
queue_release_one_tty
->tty_shutdown
->tty_driver_remove_tty
->tty_operations->remove
However tty_shutdown() is called from queue_release_one_tty() only if
tty_operations->shutdown is NULL. But for pty, it is not.
pty_unix98_shutdown() is used there as ->shutdown.
So tty_operations->remove of pty (i.e. pty_unix98_remove()) is never
called. This results in invalid pty_count. I.e. what can be seen in
/proc/sys/kernel/pty/nr.
I see this was already reported at:
https://lkml.org/lkml/2009/11/5/370
But it was not fixed since then.
This patch is kind of a hackish way. The problem lies in ->install. We
allocate there another tty (so-called tty->link). So ->install is
called once, but ->remove twice, for both tty and tty->link. The fix
here is to count both tty and tty->link and divide the count by 2 for
user.
And to have ->remove called, let's make tty_driver_remove_tty() global
and call that from pty_unix98_shutdown() (tty_operations->shutdown).
While at it, let's document that when ->shutdown is defined,
tty_shutdown() is not called.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/868628
commit 284fb68d00c56e971ed01e0b4bac5ddd4d1b74ab upstream.
Replace/remove use of RIO v.1.2 registers/bits that are not
forward-compatible with newer versions of RapidIO specification.
RapidIO specification v.1.3 removed Write Port CSR, Doorbell CSR,
Mailbox CSR and Mailbox and Doorbell bits of the PEF CAR.
Use of removed (since RIO v.1.3) register bits affects users of
currently available 1.3 and 2.x compliant devices who may use not so
recent kernel versions.
Removing checks for unsupported bits makes corresponding routines
compatible with all versions of RapidIO specification. Therefore,
backporting makes stable kernel versions compliant with RIO v.1.3 and
later as well.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Chul Kim <chul.kim@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
x2apic optout
BugLink: http://bugs.launchpad.net/bugs/797548
On the platforms which are x2apic and interrupt-remapping capable, Linux
kernel is enabling x2apic even if the BIOS doesn't. This is to take
advantage of the features that x2apic brings in.
Some of the OEM platforms are running into issues because of this, as their
bios is not x2apic aware. For example, this was resulting in interrupt migration
issues on one of the platforms. Also if the BIOS SMI handling uses APIC
interface to send SMI's, then the BIOS need to be aware of x2apic mode
that OS has enabled.
On some of these platforms, BIOS doesn't have a HW mechanism to turnoff
the x2apic feature to prevent OS from enabling it.
To resolve this mess, recent changes to the VT-d2 specification
(http://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf)
includes a mechanism that provides BIOS a way to request system software
to opt out of enabling x2apic mode.
Look at the x2apic optout flag in the DMAR tables before enabling the x2apic
mode in the platform. Also print a warning that we have disabled x2apic
based on the BIOS request.
Kernel boot parameter "intremap=no_x2apic_optout" can be used to override
the BIOS x2apic optout request.
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This change adds a new seccomp mode which specifies the allowed system
calls dynamically. When in the new mode (13), all system calls are
checked against process-defined filters - first by system call number,
then by a filter string. If an entry exists for a given system call and
all filter predicates evaluate to true, then the task may proceed.
Otherwise, the task is killed.
Filter string parsing and evaluation is handled by the ftrace filter
engine. Related patches tweak to the perf filter trace and free
allowing the calls to be shared. Filters inherit their understanding of
types and arguments for each system call from the CONFIG_FTRACE_SYSCALLS
subsystem which already populates this information in syscall_metadata
associated enter_event (and exit_event) structures. If
CONFIG_FTRACE_SYSCALLS is not compiled in, only filter strings of "1"
will be allowed.
The net result is a process may have its system calls filtered using the
ftrace filter engine's inherent understanding of systems calls. The set
of filters is specified through the PR_SET_SECCOMP_FILTER argument in
prctl(). For example, a filterset for a process, like pdftotext, that
should only process read-only input could (roughly) look like:
sprintf(rdonly, "flags == %u", O_RDONLY|O_LARGEFILE);
type = PR_SECCOMP_FILTER_SYSCALL;
prctl(PR_SET_SECCOMP_FILTER, type, __NR_open, rdonly);
prctl(PR_SET_SECCOMP_FILTER, type, __NR__llseek, "1");
prctl(PR_SET_SECCOMP_FILTER, type, __NR_brk, "1");
prctl(PR_SET_SECCOMP_FILTER, type, __NR_close, "1");
prctl(PR_SET_SECCOMP_FILTER, type, __NR_exit_group, "1");
prctl(PR_SET_SECCOMP_FILTER, type, __NR_fstat64, "1");
prctl(PR_SET_SECCOMP_FILTER, type, __NR_mmap2, "1");
prctl(PR_SET_SECCOMP_FILTER, type, __NR_munmap, "1");
prctl(PR_SET_SECCOMP_FILTER, type, __NR_read, "1");
prctl(PR_SET_SECCOMP_FILTER, type, __NR_write, "fd == 1 || fd == 2");
prctl(PR_SET_SECCOMP, 13);
Subsequent calls to PR_SET_SECCOMP_FILTER for the same system call will
be &&'d together to ensure that attack surface may only be reduced:
prctl(PR_SET_SECCOMP_FILTER, __NR_write, "fd != 2");
With the earlier example, the active filter becomes:
"(fd == 1 || fd == 2) && (fd != 2)"
The patch also adds PR_CLEAR_SECCOMP_FILTER and PR_GET_SECCOMP_FILTER.
The latter returns the current filter for a system call to userspace:
prctl(PR_GET_SECCOMP_FILTER, type, __NR_write, buf, bufsize);
while the former clears any filters for a given system call changing it
back to a defaulty deny:
prctl(PR_CLEAR_SECCOMP_FILTER, type, __NR_write);
Note, type may be either PR_SECCOMP_FILTER_EVENT or
PR_SECCOMP_FILTER_SYSCALL. This allows for ftrace event ids to be used
in lieu of system call numbers. At present, only syscalls:sys_enter_*
event id are supported, but this allows for potential future extension
of the backend.
v11: - Use mode "13" to avoid future overlap; with comment update
- Use kref; extra memset; other clean up from msb@chromium.org
- Cleaned up Makefile object merging since locally shared symbols are gone
v10: - Note that PERF_EVENTS are also needed for ftrace filter engine support.
- Removed dependency on ftrace code changes for event_filters
(wrapping with perf_events and violating opaqueness for the filter str)
- pulled in all the hacks to get access to syscall_metadata and build
call objects for filter evaluation.
v9: - rebase on to de505e709ffb09a7382ca8e0d8c7dbb171ba5
- disallow PR_SECCOMP_FILTER_EVENT when a compat task is calling
as ftrace has no compat_syscalls awareness yet.
- return -ENOSYS when filter engine strings are used on a compat call
as there are no compat_syscalls events to reference yet.
v8: - expand parenthical use during SET_SECCOMP_FILTER to avoid operator
precedence undermining attack surface reduction (caught by
segoon@openwall.com). Opted to waste bytes on () than reparse to
avoid OP_OR precedence overriding extend_filter's intentions.
- remove more lingering references to @state
- fix incorrect compat mismatch check (anyone up for a Tested-By?)
v7: - disallow seccomp_filter inheritance across fork except when seccomp
is active. This avoids filters leaking across processes when they
are not actively in use but ensure an allowed fork/clone doesn't drop
filters.
- remove the Mode: print from show as it reflected current and not the
filters holder.
v6: - clean up minor unnecessary changes (empty lines, ordering, etc)
- fix one overly long line
- add refcount overflow BUG_ON
v5: - drop mutex usage when the task_struct is safe to access directly
v4: - move off of RCU to a read/write guarding mutex after
paulmck@linux.vnet.ibm.com's feedback (mem leak, rcu fail)
- stopped inc/dec refcounts in mutex guard sections
- added required changes to init the mutex in INIT_TASK and safely
lock around fork inheritance.
- added id_type support to the prctl interface to support using
ftrace event ids as an alternative to syscall numbers. Behavior
is identical otherwise (as per discussion with mingo@elte.hu)
v3: - always block execve calls (as per torvalds@linux-foundation.org)
- add __NR_seccomp_execve(_32) to seccomp-supporting arches
- ensure compat tasks can't reach ftrace:syscalls
- dropped new defines for seccomp modes.
- two level array instead of hlists (sugg. by olofj@chromium.org)
- added generic Kconfig entry that is not connected.
- dropped internal seccomp.h
- move prctl helpers to seccomp_filter
- killed seccomp_t typedef (as per checkpatch)
v2: - changed to use the existing syscall number ABI.
- prctl changes to minimize parsing in the kernel:
prctl(PR_SET_SECCOMP, {0 | 1 | 2 }, { 0 | ON_EXEC });
prctl(PR_SET_SECCOMP_FILTER, __NR_read, "fd == 5");
prctl(PR_CLEAR_SECCOMP_FILTER, __NR_read);
prctl(PR_GET_SECCOMP_FILTER, __NR_read, buf, bufsize);
- defined PR_SECCOMP_MODE_STRICT and ..._FILTER
- added flags
- provide a default fail syscall_nr_to_meta in ftrace
- provides fallback for unhooked system calls
- use -ENOSYS and ERR_PTR(-ENOSYS) for stubbed functionality
- added kernel/seccomp.h to share seccomp.c/seccomp_filter.c
- moved to a hlist and 4 bit hash of linked lists
- added support to operate without CONFIG_FTRACE_SYSCALLS
- moved Kconfig support next to SECCOMP
- made Kconfig entries dependent on EXPERIMENTAL
- added macros to avoid ifdefs from kernel/fork.c
- added compat task/filter matching
- drop seccomp.h inclusion in sched.h and drop seccomp_t
- added Filtering to "show" output
- added on_exec state dup'ing when enabling after a fast-path accept.
Signed-off-by: Will Drewry <wad@chromium.org>
BUG=chromium-os:14496
TEST=built in x86-alex. Out of tree commandline helper test confirms functionality works. Will check in a test into the minijail repo which can be used from autotest.
Change-Id: I901595e3399914783739d113a058d83550ddf8e2
Reviewed-on: http://gerrit.chromium.org/gerrit/4814
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
Tested-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Grub may be able to select a graphics mode and paint a splash screen
for us. If so it needs to be able to tell us it has done so. Add
support for detecting a new graphics mode selected bit in the
screen_info passed over at boot. Use this to automatically enable
vt_handoff mode.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Introduce a new VT mode KD_TRANSPARENT which endevours to leave the current
content of the framebuffer untouched. This allows the bootloader to insert
a graphical splash and have the kernel maintain it until the OS splash
can take over. When we finally switch away (either through programs like
plymouth or manually) the content is lost and the VT reverts to text mode.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[apw@canonical.com: This has no upstream traction but is used by powertop,
so its worth carrying.]
PowerTOP would like to be able to show who is keeping the disk
busy by dirtying data. The most logical spot for this is in the vfs
in the mark_inode_dirty() function. Doing this on the block level
is not possible because by the time the IO hits the block layer the
guilty party can no longer be found ("kjournald" and "pdflush" are not
useful answers to "who caused this file to be dirty).
The trace point follows the same logic/style as the block_dump code
and pretty much dumps the same data, just not to dmesg (and thus to
/var/log/messages) but via the trace events streams.
Note: This patch was posted to lkml and might potentially go into 2.6.33 but I
have not seen which maintainer will take it.
Signed-of-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Amit Kucheria <amit.kucheria@canonical.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Track pages which undergo readahead and for each record which were
actually consumed, via either read or faulted into a map. This allows
userspace readahead applications (such as ureadahead) to track which
pages in core at the end of a boot are actually required and generate an
optimal readahead pack. It also allows pack adjustment and optimisation
in parallel with readahead, allowing the pack to evolve to be accurate
as userspace paths change. The status of the pages are reported back via
the mincore() call using a newly allocated bit.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add a simple read-only counter to super_block that indicates deep this
is in the stack of filesystems. Previously ecryptfs was the only
stackable filesystem and it explicitly disallowed multiple layers of
itself.
Overlayfs, however, can be stacked recursively and also may be stacked
on top of ecryptfs or vice versa.
To limit the kernel stack usage we must limit the depth of the
filesystem stack. Initially the limit is set to 2.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Overlayfs needs a private clone of the mount, so create a function for
this and export to modules.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add a new inode operation i_op->open(). This is for stacked
filesystems that want to return a struct file from a different
filesystem.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This code is originally from Ingo Molnar, with some later rebasing and
fixes to respect all the randomization-disabling knobs. It provides
address randomization algorithm when NX emulation is in use in 32-bit
processes. Kees Cook pushed the brk area further away in the case of PIE
binaries landing their brk inside the CS limit.
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is old code with some cruft, all originally by Ingo Molnar with
much later rebasing by Fedora folks and at least one arcane fix by
Roland McGrath a few years ago. No longer uses exec-shield sysctl,
merged with disable_nx. Kees Cook fixed boottime NX reporting for various
corner cases.
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some application suites have external crash handlers that depend
on being able to use ptrace to generate crash reports (KDE, Wine,
Chromium, Firefox, etc). Since the inferior process has a defined
application-specific relationship with the debugger, allow the inferior
to express that relationship by declaring who can call PTRACE_ATTACH
against it. The inferior can use prctl() with PR_SET_PTRACER to allow a
specific PID and its descendants to perform the ptrace instead of only
a direct ancestor.
Signed-off-by: Kees Cook <kees.cook@canonical.com>
---
v2:
- kmalloc, spinlock init, and doc typo corrections from Tetsuo Handa.
- make sure to replace if possible on add, thanks to Eric Paris.
v3:
- make sure to use thread group leader when searching for exceptions.
v4:
- make sure to use thread group leader when creating exceptions.
v5:
- make sure to use thread group leader when deleting exceptions.
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The current LSM interface to cred_free is not sufficient for allowing
an LSM to track the life and death of a task. This patch adds the
task_free hook so that an LSM can clean up resources on task death.
Signed-off-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| | |
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| | |
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| |
| |
| |
| | |
Add compatibility for v5 network rules.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
OriginalAuthor: Amazona from Ben Howard <behoward@amazon.com>
BugLink: http://bugs.launchpad.net/bugs/634316
The pv-ops kernel suffers from poor performance when using Amazon's
Elastic block storage (EBS). This patch from Amazon improves pv-ops
kernel performance, and has not exhibited any regressions.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/586386
Some init packages such as upstart find having all of the kernel parameters
passed in useful. Currently they have to open up /proc/cmdline and
reparse that to obtain this information. Add a kernel configuration
option to enable passing of all options.
Note, enabling this option will reduce the chances that a fallback from
/sbin/init to /bin/bash or /bin/sh will succeed. Though it should be
noted that there are commonly unknown options present which would already
break this fallback. init=/bin/foo provides explicit control over options
which is unaffected by this change.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
| |
| |
| |
| |
| |
| |
| | |
Check to see if the machine has more than one active CPU, if it does
then it is worth starting the decode of the rootfs earlier.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The expansion of the initramfs is completely independant of other
boot activities. The original data is already present at boot and the
filesystem is not required until we are ready to start init. It is
therefore reasonable to populate the rootfs asynchronously. Move this
processing to an async call.
This reduces kernel initialisation time (the time from bootloader to
starting userspace) by several 10ths of a second on a selection of test
hardware particularly SMP systems, although UP system also benefit.
Signed-off-by: Surbhi Palande <surbhi.palande@canonical.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BugLink: http://bugs.launchpad.net/bugs/462111
This patch uses TRACE_EVENT to add tracepoints for the open(),
exec() and uselib() syscalls so that ureadahead can cheaply trace
the boot sequence to determine what to read to speed up the next.
It's not upstream because it will need to be rebased onto the syscall
trace events whenever that gets merged, and is a stop-gap.
Signed-off-by: Scott James Remnant <scott@ubuntu.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Currently scsi headers are generated by the kernel and by libc6-dev.
We need to coordinate any switch over to the kernel. Temporarily disabled
these headers in the kernel package.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
| |
| |
| |
| |
| |
| |
| | |
Code is required for ubuntu/compcache
Signed-off-by: Ben Collins <ben.collins@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
commit be27425dcc516fd08245b047ea57f83b8f6f0903 upstream.
I ran into a couple of programs which broke with the new Linux 3.0
version. Some of those were binary only. I tried to use LD_PRELOAD to
work around it, but it was quite difficult and in one case impossible
because of a mix of 32bit and 64bit executables.
For example, all kind of management software from HP doesnt work, unless
we pretend to run a 2.6 kernel.
$ uname -a
Linux svivoipvnx001 3.0.0-08107-g97cd98f #1062 SMP Fri Aug 12 18:11:45 CEST 2011 i686 i686 i386 GNU/Linux
$ hpacucli ctrl all show
Error: No controllers detected.
$ rpm -qf /usr/sbin/hpacucli
hpacucli-8.75-12.0
Another notable case is that Python now reports "linux3" from
sys.platform(); which in turn can break things that were checking
sys.platform() == "linux2":
https://bugzilla.mozilla.org/show_bug.cgi?id=664564
It seems pretty clear to me though it's a bug in the apps that are using
'==' instead of .startswith(), but this allows us to unbreak broken
programs.
This patch adds a UNAME26 personality that makes the kernel report a
2.6.40+x version number instead. The x is the x in 3.x.
I know this is somewhat ugly, but I didn't find a better workaround, and
compatibility to existing programs is important.
Some programs also read /proc/sys/kernel/osrelease. This can be worked
around in user space with mount --bind (and a mount namespace)
To use:
wget ftp://ftp.kernel.org/pub/linux/kernel/people/ak/uname26/uname26.c
gcc -o uname26 uname26.c
./uname26 program
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
commit 6d3321e8e2b3bf6a5892e2ef673c7bf536e3f904 upstream.
MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).
MTRR rendezvous sequence is not implemented using stop_machine() as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).
stop_machine() works with only online cpus.
For now, take the stop_machine mutex in the MTRR rendezvous sequence that
gets called from an online cpu (here we are in the process context
and can potentially sleep while taking the mutex). And the MTRR rendezvous
that gets triggered during cpu online doesn't need to take this stop_machine
lock (as the stop_machine() already ensures that there is no cpu hotplug
going on in parallel by doing get_online_cpus())
TBD: Pursue a cleaner solution of extending the stop_machine()
infrastructure to handle the case where the calling cpu is
still not online and use this for MTRR rendezvous sequence.
fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008
Reported-by: Vadim Kotelnikov <vadimuzzz@inbox.ru>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
commit 5c723ba5b7886909b2e430f2eae454c33f7fe5c6 upstream.
In commit 2efaca927f5c ("mm/futex: fix futex writes on archs with SW
tracking of dirty & young") we forgot about MMU=n. This patch fixes
that.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/1311761831.24752.413.camel@twins
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[ Upstream commit d8873315065f1f527c7c380402cf59b1e1d0ae36 ]
Pktgen attempts to transmit shared skbs to net devices, which can't be used by
some drivers as they keep state information in skbs. This patch adds a flag
marking drivers as being able to handle shared skbs in their tx path. Drivers
are defaulted to being unable to do so, but calling ether_setup enables this
flag, as 90% of the drivers calling ether_setup touch real hardware and can
handle shared skbs. A subsequent patch will audit drivers to ensure that the
flag is set properly
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Jiri Pirko <jpirko@redhat.com>
CC: Robert Olsson <robert.olsson@its.uu.se>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[ Backport of upstream commit 87c48fa3b4630905f98268dde838ee43626a060c ]
Fernando Gont reported current IPv6 fragment identification generation
was not secure, because using a very predictable system-wide generator,
allowing various attacks.
IPv4 uses inetpeer cache to address this problem and to get good
performance. We'll use this mechanism when IPv6 inetpeer is stable
enough in linux-3.1
For the time being, we use jhash on destination address to provide less
predictable identifications. Also remove a spinlock and use cmpxchg() to
get better SMP performance.
Reported-by: Fernando Gont <fernando@gont.com.ar>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.
MD5 is a much safer choice, and is inline with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)
Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation. So the periodic
regeneration and 8-bit counter have been removed. We compute and
use a full 32-bit sequence number.
For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 43-bits) and that is fixed here as well.
Reported-by: Dan Kaminsky <dan@doxpara.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|