| Commit message (Collapse) | Author | Age |
| |\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"Miscellaneous improvements:
- clean up debugfs globals
- remove dead code in sysfs
- reorganize duplicated sysfs attribute structs
- consolidate sysfs show and store functions
- remove duplicated sysfs_ops structures
- describe organization of sysfs
- make devreq_mutex static
- g_orangefs_stats -> orangefs_stats for consistency
- rename most remaining global variables
Feature negotiation:
- enable Orangefs userspace and kernel module to negotiate mutually
supported features"
* tag 'for-linus-4.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
Revert "orangefs: bump minimum userspace version"
orangefs: bump minimum userspace version
orangefs: rename most remaining global variables
orangefs: g_orangefs_stats -> orangefs_stats for consistency
orangefs: make devreq_mutex static
orangefs: describe organization of sysfs
orangefs: remove duplicated sysfs_ops structures
orangefs: consolidate sysfs show and store functions
orangefs: reorganize duplicated sysfs attribute structs
orangefs: remove dead code in sysfs
orangefs: clean up debugfs globals
orangefs: do not allow client readahead cache without feature bit
orangefs: add features op
orangefs: record userspace version for feature compatbility
orangefs: add readahead count and size to sysfs
orangefs: re-add flush_racache from out-of-tree
orangefs: turn param response value into union
orangefs: add missing param request ops
orangefs: rename remaining bits of mmap readahead cache
|
| | |
| |
| |
| |
| |
| | |
The features op did make it into OrangeFS 2.9.6 after all.
This reverts commit 0c95ad76361f1d75a1ffdf82deafbcec44d19c42.
|
| | |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
orangefs: miscellaneous improvements and feature negotiation
Two OrangeFS updates:
"Pull in an OrangeFS branch containing miscellaneous improvements.
- clean up debugfs globals
- remove dead code in sysfs
- reorganize duplicated sysfs attribute structs
- consolidate sysfs show and store functions
- remove duplicated sysfs_ops structures
- describe organization of sysfs
- make devreq_mutex static
- g_orangefs_stats -> orangefs_stats for consistency
- rename most remaining global variables"
"Pull in an OrangeFS branch containing improvements which the userspace
component and the kernel to negotiate mutually supported features."
|
| | | |\
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull in an OrangeFS branch containing miscellaneous improvements.
- clean up debugfs globals
- remove dead code in sysfs
- reorganize duplicated sysfs attribute structs
- consolidate sysfs show and store functions
- remove duplicated sysfs_ops structures
- describe organization of sysfs
- make devreq_mutex static
- g_orangefs_stats -> orangefs_stats for consistency
- rename most remaining global variables
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Only op_timeout_secs, slot_timeout_secs, and hash_table_size are left
because they are exposed as module parameters. All other global
variables have the orangefs_ prefix.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | | |
| | | |
| | | |
| | | | |
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | | |
| | | |
| | | |
| | | | |
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | | |
| | | |
| | | |
| | | | |
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | | |
| | | |
| | | |
| | | | |
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Remove a good bit of obfuscated and duplicated code.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We had a separate struct type for each type of attribute, but they all
did the exact same thing. Consolidate them into one
struct orangefs_attribute type.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We had a pageful of structures containing kobjects and variables to store
sysfs entries. However only the kobjects were in use. Replace them with
kobjects.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Mostly this is moving code into orangefs-debugfs.c so that globals turn
into static globals.
Then gossip_debug_mask is renamed orangefs_gossip_debug_mask but keeps
global visibility, so it can be used from a macro.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | |/|
| | |
| | |
| | |
| | | |
Pull in an OrangeFS branch containing improvements which the userspace
component and the kernel to negotiate mutually supported features.
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | | |
OrangeFS 2.9.6 was released without support for the features op. Thus
OrangeFS 2.9.7 will be required to use it.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | |
| | |
| | |
| | | |
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This is a new userspace operation, which will be done if the client-core
version is greater than or equal to 2.9.6. This will provide a way to
implement optional features and to determine which features are
supported by the client-core. If the client-core version is older than
2.9.6, no optional features are supported and the op will not be done.
The intent is to allow protocol extensions without relying on the
client-core's current behavior of ignoring what it doesn't understand.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | |/
| |
| |
| |
| |
| |
| |
| |
| | |
The client reports its version to the kernel on startup. We already test
that it is above the minimum version. Now we record it in a global
variable so code elsewhere can consult it before making a request the
client may not understand.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://github.com/martinbrandenburg/linux
orangefs: integrate readahead cache changes from out-of-tree
The readahead cache has long been present as a compile time option to
OrangeFS. As it had not been tested in some time, it was disabled by
default. Recently, Walt Ligon started work reviving it, which eventually
culminated in the commit below to OrangeFS SVN.
r12519 | walt | 2016-07-13 14:32:42 -0400 (Wed, 13 Jul 2016) | 1 line
reintegrating readahead cache code
This cache is implemented almost entirely in userspace. There are a few
parameters exposed via sysfs, and the cache must be flushed after an
inode is released.
|
| | | |
| | |
| | |
| | | |
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | |
| | |
| | |
| | | |
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This will support a upcoming request where two related values need to be
updated atomically.
This was done without a union in the OrangeFS server source already. Since
that will break the kernel protocol, it has been fixed there and done here
in a way that does not break the kernel protocol.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | |
| | |
| | |
| | | |
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| | | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This has been dormant code for many years. Parts of it were removed from
the OrangeFS kernel code when it went into mainline. These bits were missed.
Now the readahead cache has been resurrected in the OrangeFS userspace
portions. It was renamed there, since it doesn't really have anything to do
with mmap specifically, so it will be renamed here.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
| |\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"This release cycle is rather small. Just a few fixes to tracing.
The big change is the addition of the hwlat tracer. It not only
detects SMIs, but also other latency that's caused by the hardware. I
have detected some latency from large boxes having bus contention"
* tag 'trace-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Call traceoff trigger after event is recorded
ftrace/scripts: Add helper script to bisect function tracing problem functions
tracing: Have max_latency be defined for HWLAT_TRACER as well
tracing: Add NMI tracing in hwlat detector
tracing: Have hwlat trace migrate across tracing_cpumask CPUs
tracing: Add documentation for hwlat_detector tracer
tracing: Added hardware latency tracer
ftrace: Access ret_stack->subtime only in the function profiler
function_graph: Handle TRACE_BPUTS in print_graph_comment
tracing/uprobe: Drop isdigit() check in create_trace_uprobe
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Call traceoff trigger after the event is recorded.
Since current traceoff trigger is called before recording
the event, we can not know what event stopped tracing.
Typical usecase of traceoff/traceon trigger is tracing
function calls and trace events between a pair of events.
For example, trace function calls between syscall entry/exit.
In that case, it is useful if we can see the return code
of the target syscall.
Link: http://lkml.kernel.org/r/147335074530.12462.4526186083406015005.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Every so often, with a special config or a architecture change, running
function or function_graph tracing can cause the machien to hard reboot,
crash, or simply hard lockup. There's some functions in the function graph
tracer that can not be traced otherwise it causes the function tracer to
recurse before the recursion protection mechanisms are in place.
When this occurs, using the dynamic ftrace featuer that allows limiting what
actually gets traced can be used to bisect down to the problem function.
This adds a script that helps with this process in the scripts/tracing
directory, called ftrace-bisect.sh
The set up is to read all the functions that can be traced from
available_filter_functions into a file (full_file). Then run this script
passing it the full_file and a "test_file" and "non_test_file", where the
test_file will be add to set_ftrace_filter. What ftarce_bisect.sh does, is
to copy half of the functions in full_file into the test_file and the other
half into the non_test_file. This way, one can cat the test_file into the
set_ftrace_filter functions and only test the functions that are in that
file. If it works, then we run the process again after copying non_test_file
to full_file and repeating the process. If the system crashed, then the bad
function is in the test_file and after a reboot, the test_file becomes the
new full_file in the next iteration.
When we get down to a single function in the full_file, then
ftrace_bisect.sh will report that as the bad function.
Full documentation of how to use this simple script is within the script
file itself.
Link: http://lkml.kernel.org/r/20160920100716.131d3647@gandalf.local.home
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The hwlat tracer uses tr->max_latency, and if it's the only tracer enabled
that uses it, the build will fail. Add max_latency and its file when the
hwlat tracer is enabled.
Link: http://lkml.kernel.org/r/d6c3b7eb-ba95-1ffa-0453-464e1e24262a@infradead.org
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
As NMIs can also cause latency when interrupts are disabled, the hwlat
detectory has no way to know if the latency it detects is from an NMI or an
SMI or some other hardware glitch.
As ftrace_nmi_enter/exit() funtions are no longer used (except for sh, which
isn't supported anymore), I converted those to "arch_ftrace_nmi_enter/exit"
and use ftrace_nmi_enter/exit() to check if hwlat detector is tracing or
not, and if so, it calls into the hwlat utility.
Since the hwlat detector only has a single kthread that is spinning with
interrupts disabled, it marks what CPU it is on, and if the NMI callback
happens on that CPU, it records the time spent in that NMI. This is added to
the output that is generated by the hwlat detector as:
#3 inner/outer(us): 9/9 ts:1470836488.206734548
#4 inner/outer(us): 0/8 ts:1470836497.140808588
#5 inner/outer(us): 0/6 ts:1470836499.140825168 nmi-total:5 nmi-count:1
#6 inner/outer(us): 9/9 ts:1470836501.140841748
All time is still tracked in microseconds.
The NMI information is only shown when an NMI occurred during the sample.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Instead of having the hwlat detector thread stay on one CPU, have it migrate
across all the CPUs specified by tracing_cpumask. If the user modifies the
thread's CPU affinity, the migration will stop until the next instance that
the tracer is instantiated. The migration happens at the end of each window
(period).
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Added the documentation on how to use th hwlat_detector.
Signed-off-by: Jon Masters <jcm@redhat.com>
[ Various updates and modified to show hwlat as a tracer ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The hardware latency tracer has been in the PREEMPT_RT patch for some time.
It is used to detect possible SMIs or any other hardware interruptions that
the kernel is unaware of. Note, NMIs may also be detected, but that may be
good to note as well.
The logic is pretty simple. It simply creates a thread that spins on a
single CPU for a specified amount of time (width) within a periodic window
(window). These numbers may be adjusted by their cooresponding names in
/sys/kernel/tracing/hwlat_detector/
The defaults are window = 1000000 us (1 second)
width = 500000 us (1/2 second)
The loop consists of:
t1 = trace_clock_local();
t2 = trace_clock_local();
Where trace_clock_local() is a variant of sched_clock().
The difference of t2 - t1 is recorded as the "inner" timestamp and also the
timestamp t1 - prev_t2 is recorded as the "outer" timestamp. If either of
these differences are greater than the time denoted in
/sys/kernel/tracing/tracing_thresh then it records the event.
When this tracer is started, and tracing_thresh is zero, it changes to the
default threshold of 10 us.
The hwlat tracer in the PREEMPT_RT patch was originally written by
Jon Masters. I have modified it quite a bit and turned it into a
tracer.
Based-on-code-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The subtime is used only for function profiler with function graph
tracer enabled. Move the definition of subtime under
CONFIG_FUNCTION_PROFILER to reduce the memory usage. Also move the
initialization of subtime into the graph entry callback.
Link: http://lkml.kernel.org/r/20160831025529.24018-1-namhyung@kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
It missed to handle TRACE_BPUTS so messages recorded by trace_bputs()
will be shown with symbol info unnecessarily.
You can see it with the trace_printk sample code:
# cd /sys/kernel/tracing/
# echo sys_sync > set_graph_function
# echo 1 > options/sym-offset
# echo function_graph > current_tracer
Note that the sys_sync filter was there to prevent recording other
functions and the sym-offset option was needed since the first message
was called from a module init function so kallsyms doesn't have the
symbol and omitted in the output.
# cd ~/build/kernel
# insmod samples/trace_printk/trace-printk.ko
# cd -
# head trace
Before:
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
1) | /* 0xffffffffa0002000: This is a static string that will use trace_bputs */
1) | /* This is a dynamic string that will use trace_puts */
1) | /* trace_printk_irq_work+0x5/0x7b [trace_printk]: (irq) This is a static string that will use trace_bputs */
1) | /* (irq) This is a dynamic string that will use trace_puts */
1) | /* (irq) This is a static string that will use trace_bprintk() */
1) | /* (irq) This is a dynamic string that will use trace_printk */
After:
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
1) | /* This is a static string that will use trace_bputs */
1) | /* This is a dynamic string that will use trace_puts */
1) | /* (irq) This is a static string that will use trace_bputs */
1) | /* (irq) This is a dynamic string that will use trace_puts */
1) | /* (irq) This is a static string that will use trace_bprintk() */
1) | /* (irq) This is a dynamic string that will use trace_printk */
Link: http://lkml.kernel.org/r/20160901024354.13720-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
It's useless. Before:
[tracing]# echo 'p:test /a:0x0' >> uprobe_events
[tracing]# echo 'p:test a:0x0' >> uprobe_events
-bash: echo: write error: No such file or directory
[tracing]# echo 'p:test 1:0x0' >> uprobe_events
-bash: echo: write error: Invalid argument
After:
[tracing]# echo 'p:test 1:0x0' >> uprobe_events
-bash: echo: write error: No such file or directory
Link: http://lkml.kernel.org/r/20160825152110.25663-3-dsafonov@virtuozzo.com
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
| |\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from David Vrabel:
"xen features and fixes for 4.9:
- switch to new CPU hotplug mechanism
- support driver_override in pciback
- require vector callback for HVM guests (the alternate mechanism via
the platform device has been broken for ages)"
* tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/x86: Update topology map for PV VCPUs
xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier
xen/pciback: support driver_override
xen/pciback: avoid multiple entries in slot list
xen/pciback: simplify pcistub device handling
xen: Remove event channel notification through Xen PCI platform device
xen/events: Convert to hotplug state machine
xen/x86: Convert to hotplug state machine
x86/xen: add missing \n at end of printk warning message
xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()
xen: Make VPMU init message look less scary
xen: rename xen_pmu_init() in sys-hypervisor.c
hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again)
xen/x86: Move irq allocation from Xen smp_op.cpu_up()
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Early during boot topology_update_package_map() computes
logical_pkg_ids for all present processors.
Later, when processors are brought up, identify_cpu() updates
these values based on phys_pkg_id which is a function of
initial_apicid. On PV guests the latter may point to a
non-existing node, causing logical_pkg_ids to be set to -1.
Intel's RAPL uses logical_pkg_id (as topology_logical_package_id())
to index its arrays and therefore in this case will point to index
65535 (since logical_pkg_id is a u16). This could lead to either a
crash or may actually access random memory location.
As a workaround, we recompute topology during CPU bringup to reset
logical_pkg_id to a valid value.
(The reason for initial_apicid being bogus is because it is
initial_apicid of the processor from which the guest is launched.
This value is CPUID(1).EBX[31:24])
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
xen_cpuhp_setup() calls mutex_lock() which, when CONFIG_DEBUG_MUTEXES
is defined, ends up calling xen_save_fl(). That routine expects
per_cpu(xen_vcpu, 0) to be already initialized.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Support the driver_override scheme introduced with commit 782a985d7af2
("PCI: Introduce new device binding path using pci_dev.driver_override")
As pcistub_probe() is called for all devices (it has to check for a
match based on the slot address rather than device type) it has to
check for driver_override set to "pciback" itself.
Up to now for assigning a pci device to pciback you need something like:
echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind
echo 0000:07:10.0 > /sys/bus/pci/drivers/pciback/new_slot
echo 0000:07:10.0 > /sys/bus/pci/drivers_probe
while with the patch you can use the same mechanism as for similar
drivers like pci-stub and vfio-pci:
echo pciback > /sys/bus/pci/devices/0000\:07\:10.0/driver_override
echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind
echo 0000:07:10.0 > /sys/bus/pci/drivers_probe
So e.g. libvirt doesn't need special handling for pciback.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The Xen pciback driver has a list of all pci devices it is ready to
seize. There is no check whether a to be added entry already exists.
While this might be no problem in the common case it might confuse
those which consume the list via sysfs.
Modify the handling of this list by not adding an entry which already
exists. As this will be needed later split out the list handling into
a separate function.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The Xen pciback driver maintains a list of all its seized devices.
There are two functions searching the list for a specific device with
basically the same semantics just returning different structures in
case of a match.
Split out the search function.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches
from old kernel") using the INTx interrupt from Xen PCI platform
device for event channel notification would just lockup the guest
during bootup. postcore_initcall now calls xs_reset_watches which
will eventually try to read a value from XenStore and will get stuck
on read_reply at XenBus forever since the platform driver is not
probed yet and its INTx interrupt handler is not registered yet. That
means that the guest can not be notified at this moment of any pending
event channels and none of the per-event handlers will ever be invoked
(including the XenStore one) and the reply will never be picked up by
the kernel.
The exact stack where things get stuck during xenbus_init:
-xenbus_init
-xs_init
-xs_reset_watches
-xenbus_scanf
-xenbus_read
-xs_single
-xs_single
-xs_talkv
Vector callbacks have always been the favourite event notification
mechanism since their introduction in commit 38e20b07efd5 ("x86/xen:
event channels delivery on HVM.") and the vector callback feature has
always been advertised for quite some time by Xen that's why INTx was
broken for several years now without impacting anyone.
Luckily this also means that event channel notification through INTx
is basically dead-code which can be safely removed without impacting
anybody since it has been effectively disabled for more than 4 years
with nobody complaining about it (at least as far as I'm aware of).
This commit removes event channel notification through Xen PCI
platform device.
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Julien Grall <julien.grall@citrix.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Install the callbacks via the state machine.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Switch to new CPU hotplug infrastructure.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The message is missing a \n, add it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
* A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus reuse the corresponding function "kmalloc_array".
This issue was detected by using the Coccinelle software.
* Replace the specification of a data type by a pointer dereference
to make the corresponding size determination a bit safer according to
the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The default for the Xen hypervisor is to not enable VPMU in order to
avoid security issues. In this case the Linux kernel will issue the
message "Could not initialize VPMU for cpu 0, error -95" which looks
more like an error than a normal state.
Change the message to something less scary in case the hypervisor
returns EOPNOTSUPP or ENOSYS when trying to activate VPMU.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
There are two functions with name xen_pmu_init() in the kernel. Rename
the one in drivers/xen/sys-hypervisor.c to avoid shadowing the one in
arch/x86/xen/pmu.c
To avoid the same problem in future rename some more functions.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Now that Xen no longer allocates irqs in _cpu_up() we can restore
commit a89941816726 ("hotplug: Prevent alloc/free of irq descriptors
during cpu up/down")
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
CC: x86@kernel.org
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|
| | | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Commit ce0d3c0a6fb1 ("genirq: Revert sparse irq locking around
__cpu_up() and move it to x86 for now") reverted irq locking
introduced by commit a89941816726 ("hotplug: Prevent alloc/free
of irq descriptors during cpu up/down") because of Xen allocating
irqs in both of its cpu_up ops.
We can move those allocations into CPU notifiers so that original
patch can be reinstated.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
|