| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
| |
Originally, the PCI sensitive OF node is tracing the eeh device
through struct device_node->edev. However, it was regarded as
bad idea.
The patch removes struct device_node->edev and uses PCI_DN to
trace the corresponding eeh device according to BenH's comments.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull <linux/device.h> avoidance patches from Paul Gortmaker:
"Nearly every subsystem has some kind of header with a proto like:
void foo(struct device *dev);
and yet there is no reason for most of these guys to care about the
sub fields within the device struct. This allows us to significantly
reduce the scope of headers including headers. For this instance, a
reduction of about 40% is achieved by replacing the include with the
simple fact that the device is some kind of a struct.
Unlike the much larger module.h cleanup, this one is simply two
commits. One to fix the implicit <linux/device.h> users, and then one
to delete the device.h includes from the linux/include/ dir wherever
possible."
* tag 'device-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
device.h: audit and cleanup users in main include dir
device.h: cleanup users outside of linux/include (C files)
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The <linux/device.h> header includes a lot of stuff, and
it in turn gets a lot of use just for the basic "struct device"
which appears so often.
Clean up the users as follows:
1) For those headers only needing "struct device" as a pointer
in fcn args, replace the include with exactly that.
2) For headers not really using anything from device.h, simply
delete the include altogether.
3) For headers relying on getting device.h implicitly before
being included themselves, now explicitly include device.h
4) For files in which doing #1 or #2 uncovers an implicit
dependency on some other header, fix by explicitly adding
the required header(s).
Any C files that were implicitly relying on device.h to be
present have already been dealt with in advance.
Total removals from #1 and #2: 51. Total additions coming
from #3: 9. Total other implicit dependencies from #4: 7.
As of 3.3-rc1, there were 110, so a net removal of 42 gives
about a 38% reduction in device.h presence in include/*
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull <linux/bug.h> cleanup from Paul Gortmaker:
"The changes shown here are to unify linux's BUG support under the one
<linux/bug.h> file. Due to historical reasons, we have some BUG code
in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in
linux/kernel.h predates the addition of linux/bug.h, but old code in
kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h
was including <asm/bug.h> to pseudo link them.
This has caused confusion[1] and general yuck/WTF[2] reactions. Here
is an example that violates the principle of least surprise:
CC lib/string.o
lib/string.c: In function 'strlcat':
lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
make[2]: *** [lib/string.o] Error 1
$
$ grep linux/bug.h lib/string.c
#include <linux/bug.h>
$
We've included <linux/bug.h> for the BUG infrastructure and yet we
still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh -
very confusing for someone who is new to kernel development.
With the above in mind, the goals of this changeset are:
1) find and fix any include/*.h files that were relying on the
implicit presence of BUG code.
2) find and fix any C files that were consuming kernel.h and hence
relying on implicitly getting some/all BUG code.
3) Move the BUG related code living in kernel.h to <linux/bug.h>
4) remove the asm/bug.h from kernel.h to finally break the chain.
During development, the order was more like 3-4, build-test, 1-2. But
to ensure that git history for bisect doesn't get needless build
failures introduced, the commits have been reorderd to fix the problem
areas in advance.
[1] https://lkml.org/lkml/2012/1/3/90
[2] https://lkml.org/lkml/2012/1/17/414"
Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul
and linux-next.
* tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
kernel.h: doesn't explicitly use bug.h, so don't include it.
bug: consolidate BUILD_BUG_ON with other bug code
BUG: headers with BUG/BUG_ON etc. need linux/bug.h
bug.h: add include of it to various implicit C users
lib: fix implicit users of kernel.h for TAINT_WARN
spinlock: macroize assert_spin_locked to avoid bug.h dependency
x86: relocate get/set debugreg fcns to include/asm/debugreg.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This header isn't using bug.h infrastructure, but due to historical
reasons, it was including it. Removing it revealed several implicit
dependencies (since kernel.h is everywhere) so we've fixed those 1st
before deploying this change.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The support for BUILD_BUG in linux/kernel.h predates the
addition of linux/bug.h -- with this chunk off separate,
you can run into situations where a person gets a compile
fail even when they've included linux/bug.h, like this:
CC lib/string.o
lib/string.c: In function 'strlcat':
lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
make[2]: *** [lib/string.o] Error 1
$
$ grep linux/bug.h lib/string.c
#include <linux/bug.h>
$
Since the above violates the principle of least surprise, move
the BUG chunks from kernel.h to bug.h so it is all together.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
other BUG variant in a static inline (i.e. not in a #define) then
that header really should be including <linux/bug.h> and not just
expecting it to be implicitly present.
We can make this change risk-free, since if the files using these
headers didn't have exposure to linux/bug.h already, they would have
been causing compile failures/warnings.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In spinlock_api_smp.h we find a define for assert_raw_spin_locked
[which uses BUG_ON]. Then assert_spin_locked (as an inline) uses
it, meaning we need bug.h But rather than put linux/bug.h in such
a highly used file like spinlock.h, we can just make the un-raw
version also a macro. Then the required bug.h presence is limited
just to those few files who are actually doing the assert testing.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
CC: Thomas Gleixner <tglx@linutronix.de>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull sysctl updates from Eric Biederman:
- Rewrite of sysctl for speed and clarity.
Insert/remove/Lookup in sysctl are all now O(NlogN) operations, and
are no longer bottlenecks in the process of adding and removing
network devices.
sysctl is now focused on being a filesystem instead of system call
and the code can all be found in fs/proc/proc_sysctl.c. Hopefully
this means the code is now approachable.
Much thanks is owed to Lucian Grinjincu for keeping at this until
something was found that was usable.
- The recent proc_sys_poll oops found by the fuzzer during hibernation
is fixed.
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl: (36 commits)
sysctl: protect poll() in entries that may go away
sysctl: Don't call sysctl_follow_link unless we are a link.
sysctl: Comments to make the code clearer.
sysctl: Correct error return from get_subdir
sysctl: An easier to read version of find_subdir
sysctl: fix memset parameters in setup_sysctl_set()
sysctl: remove an unused variable
sysctl: Add register_sysctl for normal sysctl users
sysctl: Index sysctl directories with rbtrees.
sysctl: Make the header lists per directory.
sysctl: Move sysctl_check_dups into insert_header
sysctl: Modify __register_sysctl_paths to take a set instead of a root and an nsproxy
sysctl: Replace root_list with links between sysctl_table_sets.
sysctl: Add sysctl_print_dir and use it in get_subdir
sysctl: Stop requiring explicit management of sysctl directories
sysctl: Add a root pointer to ctl_table_set
sysctl: Rewrite proc_sys_readdir in terms of first_entry and next_entry
sysctl: Rewrite proc_sys_lookup introducing find_entry and lookup_entry.
sysctl: Normalize the root_table data structure.
sysctl: Factor out insert_header and erase_header
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The plan is to convert all callers of register_sysctl_table
and register_sysctl_paths to register_sysctl. The interface
to register_sysctl is enough nicer this should make the callers
a bit more readable. Additionally after the conversion the
230 lines of backwards compatibility can be removed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
One of the most important jobs of sysctl is to export network stack
tunables. Several of those tunables are per network device. In
several instances people are running with 1000+ network devices in
there network stacks, which makes the simple per directory linked list
in sysctl a scaling bottleneck. Replace O(N^2) sysctl insertion and
lookup times with O(NlogN) by using an rbtree to index the sysctl
directories.
Benchmark before:
make-dummies 0 999 -> 0.32s
rmmod dummy -> 0.12s
make-dummies 0 9999 -> 1m17s
rmmod dummy -> 17s
Benchmark after:
make-dummies 0 999 -> 0.074s
rmmod dummy -> 0.070s
make-dummies 0 9999 -> 3.4s
rmmod dummy -> 0.44s
Benchmark after (without dev_snmp6):
make-dummies 0 9999 -> 0.75s
rmmod dummy -> 0.44s
make-dummies 0 99999 -> 11s
rmmod dummy -> 4.3s
At 10,000 dummy devices the bottleneck becomes the time to add and
remove the files under /proc/sys/net/dev_snmp6. I have commented
out the code that adds and removes files under /proc/sys/net/dev_snmp6
and taken measurments of creating and destroying 100,000 dummies to
verify the sysctl continues to scale.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Slightly enhance efficiency and clarity of the code by making the
header list per directory instead of per set.
Benchmark before:
make-dummies 0 999 -> 0.63s
rmmod dummy -> 0.12s
make-dummies 0 9999 -> 2m35s
rmmod dummy -> 18s
Benchmark after:
make-dummies 0 999 -> 0.32s
rmmod dummy -> 0.12s
make-dummies 0 9999 -> 1m17s
rmmod dummy -> 17s
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
an nsproxy
An nsproxy argument here has always been awkard and now the nsproxy argument
is completely unnecessary so remove it, replacing it with the set we want
the registered tables to show up in.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Piecing together directories by looking first in one directory
tree, than in another directory tree and finally in a third
directory tree makes it hard to verify that some directory
entries are not multiply defined and makes it hard to create
efficient implementations the sysctl filesystem.
Replace the sysctl wide list of roots with autogenerated
links from the core sysctl directory tree to the other
sysctl directory trees.
This simplifies sysctl directory reading and lookups as now
only entries in a single sysctl directory tree need to be
considered.
Benchmark before:
make-dummies 0 999 -> 0.44s
rmmod dummy -> 0.065s
make-dummies 0 9999 -> 1m36s
rmmod dummy -> 0.4s
Benchmark after:
make-dummies 0 999 -> 0.63s
rmmod dummy -> 0.12s
make-dummies 0 9999 -> 2m35s
rmmod dummy -> 18s
The slowdown is caused by the lookups used in insert_headers
and put_links to see if we need to add links or remove links.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Simplify the code and the sysctl semantics by autogenerating
sysctl directories when a sysctl table is registered that needs
the directories and autodeleting the directories when there are
no more sysctl tables registered that need them.
Autogenerating directories keeps sysctl tables from depending
on each other, removing all of the arcane register/unregister
ordering constraints and makes it impossible to get the order
wrong when reigsering and unregistering sysctl tables.
Autogenerating directories yields one unique entity that dentries
can point to, retaining the current effective use of the dcache.
Add struct ctl_dir as the type of these new autogenerated
directories.
The attached_by and attached_to fields in ctl_table_header are
removed as they are no longer needed.
The child field in ctl_table is no longer needed by the core of
the sysctl code. ctl_table.child can be removed once all of the
existing users have been updated.
Benchmark before:
make-dummies 0 999 -> 0.7s
rmmod dummy -> 0.07s
make-dummies 0 9999 -> 1m10s
rmmod dummy -> 0.4s
Benchmark after:
make-dummies 0 999 -> 0.44s
rmmod dummy -> 0.065s
make-dummies 0 9999 -> 1m36s
rmmod dummy -> 0.4s
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add a ctl_table_root pointer to ctl_table set so it is easy to
go from a ctl_table_set to a ctl_table_root.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add nreg to ctl_table_header. When nreg drops to 0 the ctl_table_header
will be unregistered.
Factor out drop_sysctl_table from unregister_sysctl_table, and add
the logic for decrementing nreg.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
While useful at one time for selinux and the sysctl sanity
checks those users no longer use the parent field and we can
safely remove it.
Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmil.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Split the registration of a complex ctl_table array which may have
arbitrary numbers of directories (->child != NULL) and tables of files
into a series of simpler registrations that only register tables of files.
Graphically:
register('dir', { + file-a
+ file-b
+ subdir1
+ file-c
+ subdir2
+ file-d
+ file-e })
is transformed into:
wrapper->subheaders[0] = register('dir', {file1-a, file1-b})
wrapper->subheaders[1] = register('dir/subdir1', {file-c})
wrapper->subheaders[2] = register('dir/subdir2', {file-d, file-e})
return wrapper
This guarantees that __register_sysctl_table will only see a simple
ctl_table array with all entries having (->child == NULL).
Care was taken to pass the original simple ctl_table arrays to
__register_sysctl_table whenever possible.
This change is derived from a similar patch written
by Lucrian Grijincu.
Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Make __register_sysctl_table the core sysctl registration operation and
make it take a char * string as path.
Now that binary paths have been banished into the real of backwards
compatibility in kernel/binary_sysctl.c where they can be safely
ignored there is no longer a need to use struct ctl_path to represent
path names when registering ctl_tables.
Start the transition to using normal char * strings to represent
pathnames when registering sysctl tables. Normal strings are easier
to deal with both in the internal sysctl implementation and for
programmers registering sysctl tables.
__register_sysctl_paths is turned into a backwards compatibility wrapper
that converts a ctl_path array into a normal char * string.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In sysctl_net register the two networking roots in the proper order.
In register_sysctl walk the sysctl sets in the reverse order of the
sysctl roots.
Remove parent from ctl_table_set and setup_sysctl_set as it is no
longer needed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This adds a small helper retire_sysctl_set to remove the intimate knowledge about
the how a sysctl_set is implemented from net/sysct_net.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Move the core sysctl code from kernel/sysctl.c and kernel/sysctl_check.c
into fs/proc/proc_sysctl.c.
Currently sysctl maintenance is hampered by the sysctl implementation
being split across 3 files with artificial layering between them.
Consolidate the entire sysctl implementation into 1 file so that
it is easier to see what is going on and hopefully allowing for
simpler maintenance.
For functions that are now only used in fs/proc/proc_sysctl.c remove
their declarations from sysctl.h and make them static in fs/proc/proc_sysctl.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Simplify the code by treating the base sysctl table like any other
sysctl table and register it with register_sysctl_table.
To ensure this table is registered early enough to avoid problems
call sysctl_init from proc_sys_init.
Rename sysctl_net.c:sysctl_init() to net_sysctl_init() to avoid
name conflicts now that kernel/sysctl.c:sysctl_init() is no longer
static.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
- In sysctl.h move functions only available if CONFIG_SYSCL
is defined inside of #ifdef CONFIG_SYSCTL
- Move the stub function definitions for !CONFIG_SYSCTL
into sysctl.h and make them static inlines.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
Pull cpufreq updates for 3.4 from Dave Jones: new drivers and some fixes.
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
provide disable_cpufreq() function to disable the API.
EXYNOS5250: Add support cpufreq for EXYNOS5250
EXYNOS4X12: Add support cpufreq for EXYNOS4X12
[CPUFREQ] CPUfreq ondemand: update sampling rate without waiting for next sampling
[CPUFREQ] Add S3C2416/S3C2450 cpufreq driver
[CPUFREQ] Fix exposure of ARM_EXYNOS4210_CPUFREQ
[CPUFREQ] EXYNOS4210: update the name of EXYNOS clock register
[CPUFREQ] EXYNOS: Initialize locking_frequency with initial frequency
[CPUFREQ] s3c64xx: Fix mis-cherry pick of VDDINT
Fix up trivial conflicts in Kconfig and Makefile due to just changes
next to each other (OMAP2PLUS changes vs some new EXYNOS cpufreq
drivers).
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
useful for disabling cpufreq altogether. The cpu frequency
scaling drivers and cpu frequency governors will fail to register.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Jones <davej@redhat.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Pull #2 ARM updates from Russell King:
"Further ARM AMBA primecell updates which aren't included directly in
the previous commit. I wanted to keep these separate as they're
touching stuff outside arch/arm/."
* 'amba' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7362/1: AMBA: Add module_amba_driver() helper macro for amba_driver
ARM: 7335/1: mach-u300: do away with MMC config files
ARM: 7280/1: mmc: mmci: Cache MMCICLOCK and MMCIPOWER register
ARM: 7309/1: realview: fix unconnected interrupts on EB11MP
ARM: 7230/1: mmc: mmci: Fix PIO read for small SDIO packets
ARM: 7227/1: mmc: mmci: Prepare for SDIO before setting up DMA job
ARM: 7223/1: mmc: mmci: Fixup use of runtime PM and use autosuspend
ARM: 7221/1: mmc: mmci: Change from using legacy suspend
ARM: 7219/1: mmc: mmci: Change vdd_handler to a generic ios_handler
ARM: 7218/1: mmc: mmci: Provide option to configure bus signal direction
ARM: 7217/1: mmc: mmci: Put power register deviations in variant data
ARM: 7216/1: mmc: mmci: Do not release spinlock in request_end
ARM: 7215/1: mmc: mmci: Increase max_segs from 16 to 128
|
| |\ \ \ \ |
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The purpose of the vdd_handler does not make sense. We remove it
and use a generic approach instead. A new ios_handler is added, the
purpose of which e.g. can be to control GPIO pins to a levelshifter.
Previously the vdd_handler was also used for making additional
changes to the power register bits. This option is superfluous and is
therefore removed.
Adaptaptions from the old vdd_handler to the new ios_handler is done for
mach-ux500 board, which was the only one using the vdd_handler.
This patch is based upon a patch from Sebastian Rasmussen.
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sebastian Rasmussen <sebastian.rasmussen@stericsson.com>
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
| | | |/ /
| | |/| |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The ST Micro variant supports bus signal direction indication. A new
member in the variant struct is added for this.
Moreover the actual signal direction configuration is board specific,
thus the amba mmci platform data is extended with a new member to be
able provide mmci with these specific board configurations.
This patch is based upon a patch from Sebastian Rasmussen.
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sebastian Rasmussen <sebastian.rasmussen@stericsson.com>
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
For simple modules that contain a single amba_driver without any
additional setup code then ends up being a block of duplicated
boilerplate. This patch adds a new macro, module_amba_driver(),
which replaces the module_init()/module_exit() registrations with
template functions.
Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Pull #1 ARM updates from Russell King:
"This one covers stuff which Arnd is waiting for me to push, as this is
shared between both our trees and probably other trees elsewhere.
Essentially, this contains:
- AMBA primecell device initializer updates - mostly shrinking the
size of the device declarations in platform code to something more
reasonable.
- Getting rid of the NO_IRQ crap from AMBA primecell stuff.
- Nicolas' idle cleanups. This in combination with the restart
cleanups from the last merge window results in a great many
mach/system.h files being deleted."
Yay: ~80 files, ~2000 lines deleted.
* 'for-armsoc' of git://git.linaro.org/people/rmk/linux-arm: (60 commits)
ARM: remove disable_fiq and arch_ret_to_user macros
ARM: make entry-macro.S depend on !MULTI_IRQ_HANDLER
ARM: rpc: make default fiq handler run-time installed
ARM: make arch_ret_to_user macro optional
ARM: amba: samsung: use common amba device initializers
ARM: amba: spear: use common amba device initializers
ARM: amba: nomadik: use common amba device initializers
ARM: amba: u300: use common amba device initializers
ARM: amba: lpc32xx: use common amba device initializers
ARM: amba: netx: use common amba device initializers
ARM: amba: bcmring: use common amba device initializers
ARM: amba: ep93xx: use common amba device initializers
ARM: amba: omap2: use common amba device initializers
ARM: amba: integrator: use common amba device initializers
ARM: amba: realview: get rid of private platform amba_device initializer
ARM: amba: versatile: get rid of private platform amba_device initializer
ARM: amba: vexpress: get rid of private platform amba_device initializer
ARM: amba: provide common initializers for static amba devices
ARM: amba: make use of -1 IRQs warn
ARM: amba: u300: get rid of NO_IRQ initializers
...
|
| |\ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | | |
into for-armsoc
|
| |\ \ \ \ \ \
| | |_|/ / / /
| |/| | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
There were conflicts between fixes going in after 3.3-rc1 and
Russell's stable arm-soc base branch. Resolving it in the dependency
branch so that each topic branch shares the same resolution.
Conflicts:
arch/arm/mach-at91/at91cap9.c
arch/arm/mach-at91/at91sam9g45.c
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
| | |_|/ / /
| |/| | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Add functions to allocate and initialize AMBA device structures, and
add them to the Linux device manager. This allows us to kill this
type of operation from individual platforms, moving it to core code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Merge second batch of patches from Andrew Morton:
- various misc things
- core kernel changes to prctl, exit, exec, init, etc.
- kernel/watchdog.c updates
- get_maintainer
- MAINTAINERS
- the backlight driver queue
- core bitops code cleanups
- the led driver queue
- some core prio_tree work
- checkpatch udpates
- largeish crc32 update
- a new poll() feature for the v4l guys
- the rtc driver queue
- fatfs
- ptrace
- signals
- kmod/usermodehelper updates
- coredump
- procfs updates
* emailed from Andrew Morton <akpm@linux-foundation.org>: (141 commits)
seq_file: add seq_set_overflow(), seq_overflow()
proc-ns: use d_set_d_op() API to set dentry ops in proc_ns_instantiate().
procfs: speed up /proc/pid/stat, statm
procfs: add num_to_str() to speed up /proc/stat
proc: speed up /proc/stat handling
fs/proc/kcore.c: make get_sparsemem_vmemmap_info() static
coredump: add VM_NODUMP, MADV_NODUMP, MADV_CLEAR_NODUMP
coredump: remove VM_ALWAYSDUMP flag
kmod: make __request_module() killable
kmod: introduce call_modprobe() helper
usermodehelper: ____call_usermodehelper() doesn't need do_exit()
usermodehelper: kill umh_wait, renumber UMH_* constants
usermodehelper: implement UMH_KILLABLE
usermodehelper: introduce umh_complete(sub_info)
usermodehelper: use UMH_WAIT_PROC consistently
signal: zap_pid_ns_processes: s/SEND_SIG_NOINFO/SEND_SIG_FORCED/
signal: oom_kill_task: use SEND_SIG_FORCED instead of force_sig()
signal: cosmetic, s/from_ancestor_ns/force/ in prepare_signal() paths
signal: give SEND_SIG_FORCED more power to beat SIGNAL_UNKILLABLE
Hexagon: use set_current_blocked() and block_sigmask()
...
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Process accounting applications as top, ps visit some files under
/proc/<pid>. With seq_put_decimal_ull(), we can optimize /proc/<pid>/stat
and /proc/<pid>/statm files.
This patch adds
- seq_put_decimal_ll() for signed values.
- allow delimiter == 0.
- convert seq_printf() to seq_put_decimal_ull/ll in /proc/stat, statm.
Test result on a system with 2000+ procs.
Before patch:
[kamezawa@bluextal test]$ top -b -n 1 | wc -l
2223
[kamezawa@bluextal test]$ time top -b -n 1 > /dev/null
real 0m0.675s
user 0m0.044s
sys 0m0.121s
[kamezawa@bluextal test]$ time ps -elf > /dev/null
real 0m0.236s
user 0m0.056s
sys 0m0.176s
After patch:
kamezawa@bluextal ~]$ time top -b -n 1 > /dev/null
real 0m0.657s
user 0m0.052s
sys 0m0.100s
[kamezawa@bluextal ~]$ time ps -elf > /dev/null
real 0m0.198s
user 0m0.050s
sys 0m0.145s
Considering top, ps tend to scan /proc periodically, this will reduce cpu
consumption by top/ps to some extent.
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
== stat_check.py
num = 0
with open("/proc/stat") as f:
while num < 1000 :
data = f.read()
f.seek(0, 0)
num = num + 1
==
perf shows
20.39% stat_check.py [kernel.kallsyms] [k] format_decode
13.41% stat_check.py [kernel.kallsyms] [k] number
12.61% stat_check.py [kernel.kallsyms] [k] vsnprintf
10.85% stat_check.py [kernel.kallsyms] [k] memcpy
4.85% stat_check.py [kernel.kallsyms] [k] radix_tree_lookup
4.43% stat_check.py [kernel.kallsyms] [k] seq_printf
This patch removes most of calls to vsnprintf() by adding num_to_str()
and seq_print_decimal_ull(), which prints decimal numbers without rich
functions provided by printf().
On my 8cpu box.
== Before patch ==
[root@bluextal test]# time ./stat_check.py
real 0m0.150s
user 0m0.026s
sys 0m0.121s
== After patch ==
[root@bluextal test]# time ./stat_check.py
real 0m0.055s
user 0m0.022s
sys 0m0.030s
[akpm@linux-foundation.org: remove incorrect comment, use less statck in num_to_str(), move comment from .h to .c, simplify seq_put_decimal_ull()]
[andrea@betterlinux.com: avoid breaking the ABI in /proc/stat]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrea Righi <andrea@betterlinux.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Turner <pjt@google.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Since we no longer need the VM_ALWAYSDUMP flag, let's use the freed bit
for 'VM_NODUMP' flag. The idea is is to add a new madvise() flag:
MADV_DONTDUMP, which can be set by applications to specifically request
memory regions which should not dump core.
The specific application I have in mind is qemu: we can add a flag there
that wouldn't dump all of guest memory when qemu dumps core. This flag
might also be useful for security sensitive apps that want to absolutely
make sure that parts of memory are not dumped. To clear the flag use:
MADV_DODUMP.
[akpm@linux-foundation.org: s/MADV_NODUMP/MADV_DONTDUMP/, s/MADV_CLEAR_NODUMP/MADV_DODUMP/, per Roland]
[akpm@linux-foundation.org: fix up the architectures which broke]
Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Roland McGrath <roland@hack.frob.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The motivation for this patchset was that I was looking at a way for a
qemu-kvm process, to exclude the guest memory from its core dump, which
can be quite large. There are already a number of filter flags in
/proc/<pid>/coredump_filter, however, these allow one to specify 'types'
of kernel memory, not specific address ranges (which is needed in this
case).
Since there are no more vma flags available, the first patch eliminates
the need for the 'VM_ALWAYSDUMP' flag. The flag is used internally by
the kernel to mark vdso and vsyscall pages. However, it is simple
enough to check if a vma covers a vdso or vsyscall page without the need
for this flag.
The second patch then replaces the 'VM_ALWAYSDUMP' flag with a new
'VM_NODUMP' flag, which can be set by userspace using new madvise flags:
'MADV_DONTDUMP', and unset via 'MADV_DODUMP'. The core dump filters
continue to work the same as before unless 'MADV_DONTDUMP' is set on the
region.
The qemu code which implements this features is at:
http://people.redhat.com/~jbaron/qemu-dump/qemu-dump.patch
In my testing the qemu core dump shrunk from 383MB -> 13MB with this
patch.
I also believe that the 'MADV_DONTDUMP' flag might be useful for
security sensitive apps, which might want to select which areas are
dumped.
This patch:
The VM_ALWAYSDUMP flag is currently used by the coredump code to
indicate that a vma is part of a vsyscall or vdso section. However, we
can determine if a vma is in one these sections by checking it against
the gate_vma and checking for a non-NULL return value from
arch_vma_name(). Thus, freeing a valuable vma bit.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Roland McGrath <roland@hack.frob.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
No functional changes. It is not sane to use UMH_KILLABLE with enum
umh_wait, but obviously we do not want another argument in
call_usermodehelper_* helpers. Kill this enum, use the plain int.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Implement UMH_KILLABLE, should be used along with UMH_WAIT_EXEC/PROC.
The caller must ensure that subprocess_info->path/etc can not go away
until call_usermodehelper_freeinfo().
call_usermodehelper_exec(UMH_KILLABLE) does
wait_for_completion_killable. If it fails, it uses
xchg(&sub_info->complete, NULL) to serialize with umh_complete() which
does the same xhcg() to access sub_info->complete.
If call_usermodehelper_exec wins, it can safely return. umh_complete()
should get NULL and call call_usermodehelper_freeinfo().
Otherwise we know that umh_complete() was already called, in this case
call_usermodehelper_exec() falls back to wait_for_completion() which
should succeed "very soon".
Note: UMH_NO_WAIT == -1 but it obviously should not be used with
UMH_KILLABLE. We delay the neccessary cleanup to simplify the back
porting.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
PTRACE_SEIZE code is tested and ready for production use, remove the
code which requires special bit in data argument to make PTRACE_SEIZE
work.
Strace team prepares for a new release of strace, and we would like to
ship the code which uses PTRACE_SEIZE, preferably after this change goes
into released kernel.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
match
PTRACE_EVENT_foo and PTRACE_O_TRACEfoo used to match.
New PTRACE_EVENT_STOP is the first event which has no corresponding
PTRACE_O_TRACE option. If we will ever want to add another such option,
its PTRACE_EVENT's value will collide with PTRACE_EVENT_STOP's value.
This patch changes PTRACE_EVENT_STOP value to prevent this.
While at it, added a comment - the one atop PTRACE_EVENT block, saying
"Wait extended result codes for the above trace options", is not true
for PTRACE_EVENT_STOP.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Exchange PT_TRACESYSGOOD and PT_PTRACE_CAP bit positions, which makes
PT_option bits contiguous and therefore makes code in
ptrace_setoptions() much simpler.
Every PTRACE_O_TRACEevent is defined to (1 << PTRACE_EVENT_event)
instead of using explicit numeric constants, to ensure we don't mess up
relationship between bit positions and event ids.
PT_EVENT_FLAG_SHIFT was not particularly useful, PT_OPT_FLAG_SHIFT with
value of PT_EVENT_FLAG_SHIFT-1 is easier to use.
PT_TRACE_MASK constant is nuked, the only its use is replaced by
(PTRACE_O_MASK << PT_OPT_FLAG_SHIFT).
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
ptrace_event(PTRACE_EVENT_EXEC) sends SIGTRAP if PT_TRACE_EXEC is not
set. This is because this SIGTRAP predates PTRACE_O_TRACEEXEC option,
we do not need/want this with PT_SEIZED which can set the options during
attach.
Suggested-by: Pedro Alves <palves@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Chris Evans <scarybeasts@gmail.com>
Cc: Indan Zupancic <indan@nul.nu>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Pedro Alves <palves@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Another old/known problem. If the tracee is killed after it reports
syscall_entry, it starts the syscall and debugger can't control this.
This confuses the users and this creates the security problems for
ptrace jailers.
Change tracehook_report_syscall_entry() to return non-zero if killed,
this instructs syscall_trace_enter() to abort the syscall.
Reported-by: Chris Evans <scarybeasts@gmail.com>
Tested-by: Indan Zupancic <indan@nul.nu>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Pedro Alves <palves@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
In some cases the poll() implementation in a driver has to do different
things depending on the events the caller wants to poll for. An example
is when a driver needs to start a DMA engine if the caller polls for
POLLIN, but doesn't want to do that if POLLIN is not requested but instead
only POLLOUT or POLLPRI is requested. This is something that can happen
in the video4linux subsystem among others.
Unfortunately, the current epoll/poll/select implementation doesn't
provide that information reliably. The poll_table_struct does have it: it
has a key field with the event mask. But once a poll() call matches one
or more bits of that mask any following poll() calls are passed a NULL
poll_table pointer.
Also, the eventpoll implementation always left the key field at ~0 instead
of using the requested events mask.
This was changed in eventpoll.c so the key field now contains the actual
events that should be polled for as set by the caller.
The solution to the NULL poll_table pointer is to set the qproc field to
NULL in poll_table once poll() matches the events, not the poll_table
pointer itself. That way drivers can obtain the mask through a new
poll_requested_events inline.
The poll_table_struct can still be NULL since some kernel code calls it
internally (netfs_state_poll() in ./drivers/staging/pohmelfs/netfs.h). In
that case poll_requested_events() returns ~0 (i.e. all events).
Very rarely drivers might want to know whether poll_wait will actually
wait. If another earlier file descriptor in the set already matched the
events the caller wanted to wait for, then the kernel will return from the
select() call without waiting. This might be useful information in order
to avoid doing expensive work.
A new helper function poll_does_not_wait() is added that drivers can use
to detect this situation. This is now used in sock_poll_wait() in
include/net/sock.h. This was the only place in the kernel that needed
this information.
Drivers should no longer access any of the poll_table internals, but use
the poll_requested_events() and poll_does_not_wait() access functions
instead. In order to enforce that the poll_table fields are now prepended
with an underscore and a comment was added warning against using them
directly.
This required a change in unix_dgram_poll() in unix/af_unix.c which used
the key field to get the requested events. It's been replaced by a call
to poll_requested_events().
For qproc it was especially important to change its name since the
behavior of that field changes with this patch since this function pointer
can now be NULL when that wasn't possible in the past.
Any driver accessing the qproc or key fields directly will now fail to compile.
Some notes regarding the correctness of this patch: the driver's poll()
function is called with a 'struct poll_table_struct *wait' argument. This
pointer may or may not be NULL, drivers can never rely on it being one or
the other as that depends on whether or not an earlier file descriptor in
the select()'s fdset matched the requested events.
There are only three things a driver can do with the wait argument:
1) obtain the key field:
events = wait ? wait->key : ~0;
This will still work although it should be replaced with the new
poll_requested_events() function (which does exactly the same).
This will now even work better, since wait is no longer set to NULL
unnecessarily.
2) use the qproc callback. This could be deadly since qproc can now be
NULL. Renaming qproc should prevent this from happening. There are no
kernel drivers that actually access this callback directly, BTW.
3) test whether wait == NULL to determine whether poll would return without
waiting. This is no longer sufficient as the correct test is now
wait == NULL || wait->_qproc == NULL.
However, the worst that can happen here is a slight performance hit in
the case where wait != NULL and wait->_qproc == NULL. In that case the
driver will assume that poll_wait() will actually add the fd to the set
of waiting file descriptors. Of course, poll_wait() will not do that
since it tests for wait->_qproc. This will not break anything, though.
There is only one place in the whole kernel where this happens
(sock_poll_wait() in include/net/sock.h) and that code will be replaced
by a call to poll_does_not_wait() in the next patch.
Note that even if wait->_qproc != NULL drivers cannot rely on poll_wait()
actually waiting. The next file descriptor from the set might match the
event mask and thus any possible waits will never happen.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|