aboutsummaryrefslogtreecommitdiffstats
path: root/include/trace/events
Commit message (Collapse)AuthorAge
* Add tracepoint supportBjoern Brandenburg2015-08-09
| | | | | | | This patch integrates LITMUS^RT's sched_trace_XXX() macros with Linux's notion of tracepoints. This is useful to visualize schedules in kernel shark and similar tools. Historically, LITMUS^RT's sched_trace predates Linux's tracepoint infrastructure.
* Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds2015-06-03
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer fixes from Jens Axboe: "Sending this off now, as I'm not aware of other current bugs, nor do I expect further fixes before 4.1 final. This contains two fixes: - a fix for a bdi unregister warning that gets spewed on md, due to a regression introduced earlier in this cycle. From Neil Brown. - a fix for a compile warning for NVMe on 32-bit platforms, also a regression introduced in this cycle. From Arnd Bergmann" * 'for-linus' of git://git.kernel.dk/linux-block: NVMe: fix type warning on 32-bit block: discard bdi_unregister() in favour of bdi_destroy()
| * block: discard bdi_unregister() in favour of bdi_destroy()NeilBrown2015-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bdi_unregister() now contains very little functionality. It contains a "WARN_ON" if bdi->dev is NULL. This warning is of no real consequence as bdi->dev isn't needed by anything else in the function, and it triggers if blk_cleanup_queue() -> bdi_destroy() is called before bdi_unregister, which happens since Commit: 6cd18e711dd8 ("block: destroy bdi before blockdev is unregistered.") So this isn't wanted. It also calls bdi_set_min_ratio(). This needs to be called after writes through the bdi have all been flushed, and before the bdi is destroyed. Calling it early is better than calling it late as it frees up a global resource. Calling it immediately after bdi_wb_shutdown() in bdi_destroy() perfectly fits these requirements. So bdi_unregister() can be discarded with the important content moved to bdi_destroy(), as can the writeback_bdi_unregister event which is already not used. Reported-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org (v4.0) Fixes: c4db59d31e39 ("fs: don't reassign dirty inodes to default_backing_dev_info") Fixes: 6cd18e711dd8 ("block: destroy bdi before blockdev is unregistered.") Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com> Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* | tracing/mm: don't trace mm_page_pcpu_drain on offline cpusShreyas B. Prabhu2015-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since tracepoints use RCU for protection, they must not be called on offline cpus. trace_mm_page_pcpu_drain can be called on an offline cpu in this scenario caught by LOCKDEP: =============================== [ INFO: suspicious RCU usage. ] 4.1.0-rc1+ #9 Not tainted ------------------------------- include/trace/events/kmem.h:265 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 1 lock held by swapper/5/0: #0: (&(&zone->lock)->rlock){..-...}, at: [<c0000000002073b0>] .free_pcppages_bulk+0x70/0x920 stack backtrace: CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.1.0-rc1+ #9 Call Trace: .dump_stack+0x98/0xd4 (unreliable) .lockdep_rcu_suspicious+0x108/0x170 .free_pcppages_bulk+0x60c/0x920 .free_hot_cold_page+0x208/0x280 .destroy_context+0x90/0xd0 .__mmdrop+0x58/0x160 .idle_task_exit+0xf0/0x100 .pnv_smp_cpu_kill_self+0x58/0x2c0 .cpu_die+0x34/0x50 .arch_cpu_idle_dead+0x20/0x40 .cpu_startup_entry+0x708/0x7a0 .start_secondary+0x36c/0x3a0 start_secondary_prolog+0x10/0x14 Fix this by converting mm_page_pcpu_drain trace point into TRACE_EVENT_CONDITION where condition is cpu_online(smp_processor_id()) Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | tracing/mm: don't trace mm_page_free on offline cpusShreyas B. Prabhu2015-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since tracepoints use RCU for protection, they must not be called on offline cpus. trace_mm_page_free can be called on an offline cpu in this scenario caught by LOCKDEP: =============================== [ INFO: suspicious RCU usage. ] 4.1.0-rc1+ #9 Not tainted ------------------------------- include/trace/events/kmem.h:170 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/1/0. stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.1.0-rc1+ #9 Call Trace: .dump_stack+0x98/0xd4 (unreliable) .lockdep_rcu_suspicious+0x108/0x170 .free_pages_prepare+0x494/0x680 .free_hot_cold_page+0x50/0x280 .destroy_context+0x90/0xd0 .__mmdrop+0x58/0x160 .idle_task_exit+0xf0/0x100 .pnv_smp_cpu_kill_self+0x58/0x2c0 .cpu_die+0x34/0x50 .arch_cpu_idle_dead+0x20/0x40 .cpu_startup_entry+0x708/0x7a0 .start_secondary+0x36c/0x3a0 start_secondary_prolog+0x10/0x14 Fix this by converting mm_page_free trace point into TRACE_EVENT_CONDITION where condition is cpu_online(smp_processor_id()) Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | tracing/mm: don't trace kmem_cache_free on offline cpusShreyas B. Prabhu2015-05-28
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since tracepoints use RCU for protection, they must not be called on offline cpus. trace_kmem_cache_free can be called on an offline cpu in this scenario caught by LOCKDEP: =============================== [ INFO: suspicious RCU usage. ] 4.1.0-rc1+ #9 Not tainted ------------------------------- include/trace/events/kmem.h:148 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/1/0. stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.1.0-rc1+ #9 Call Trace: .dump_stack+0x98/0xd4 (unreliable) .lockdep_rcu_suspicious+0x108/0x170 .kmem_cache_free+0x344/0x4b0 .__mmdrop+0x4c/0x160 .idle_task_exit+0xf0/0x100 .pnv_smp_cpu_kill_self+0x58/0x2c0 .cpu_die+0x34/0x50 .arch_cpu_idle_dead+0x20/0x40 .cpu_startup_entry+0x708/0x7a0 .start_secondary+0x36c/0x3a0 start_secondary_prolog+0x10/0x14 Fix this by converting kmem_cache_free trace point into TRACE_EVENT_CONDITION where condition is cpu_online(smp_processor_id()) Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2015-04-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only
| * VFS: normal filesystems (and lustre): d_inode() annotationsDavid Howells2015-04-15
| | | | | | | | | | | | | | that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge tag 'clk-for-linus-4.1' of ↵Linus Torvalds2015-04-21
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clock framework updates from Michael Turquette: "The changes to the common clock framework for 4.0 are mostly new clock drivers and updates to existing ones for feature enhancements and bug fixes. There is more churn than usual in the framework core due to the change to introduce per-user unique struct clk pointers in 4.0. This caused several regressions to surface, some of which were sent as fixes to 4.0. New generic clock drivers were added for GPIO- and PWM-based clock controllers. Additionally the common clk-divider code recieved several fixes to the way it rounds rates" * tag 'clk-for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (91 commits) clk: check ->determine/round_rate() return value in clk_calc_new_rates clk: at91: usb: propagate rate modification to the parent clk clk: samsung: exynos4: Disable ARMCLK down feature on Exynos4210 SoC clk: don't use __initconst for non-const arrays clk: at91: change to using endian agnositc IO clk: clk-gpio-gate: Fix active low clk: Add PWM clock driver clk: Add clock driver for mb86s7x clk: pxa: pxa3xx: add missing os timer clock clk: tegra: Use the proper parent for plld_dsi clk: tegra: Use generic tegra_osc_clk_init() on Tegra114 clk: tegra: Model oscillator as clock clk: tegra: Add peripheral registers for bank Y clk: tegra: Register the proper number of resets clk: tegra: Remove needless initializations clk: tegra: Use consistent indentation clk: tegra: Various whitespace cleanups clk: tegra: Enable HDA to HDMI clocks on Tegra124 clk: tegra: Fix a bunch of sparse warnings clk: tegra: Fix typo tabel -> table ...
| * | clk: Add tracepoints for hardware operationsStephen Boyd2015-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's useful to have tracepoints around operations that change the hardware state so that we can debug clock hardware performance and operations. Four basic types of events are supported: on/off events for enable, disable, prepare, unprepare that only record an event and a clock name, rate changing events for clk_set_{min_,max_}rate{_range}(), phase changing events for clk_set_phase() and parent changing events for clk_set_parent(). Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Michael Turquette <mturquette@linaro.org>
* | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2015-04-18
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "This update has mostly fixes, but also other bits: - perf tooling fixes - PMU driver fixes - Intel Broadwell PMU driver HW-enablement for LBR callstacks - a late coming 'perf kmem' tool update that enables it to also analyze page allocation data. Note, this comes with MM tracepoint changes that we believe to not break anything: because it changes the formerly opaque 'struct page *' field that uniquely identifies pages to 'pfn' which identifies pages uniquely too, but isn't as opaque and can be used for other purposes as well" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/pt: Fix and clean up error handling in pt_event_add() perf/x86/intel: Add Broadwell support for the LBR callstack perf/x86/intel/rapl: Fix energy counter measurements but supporing per domain energy units perf/x86/intel: Fix Core2,Atom,NHM,WSM cycles:pp events perf/x86: Fix hw_perf_event::flags collision perf probe: Fix segfault when probe with lazy_line to file perf probe: Find compilation directory path for lazy matching perf probe: Set retprobe flag when probe in address-based alternative mode perf kmem: Analyze page allocator events also tracing, mm: Record pfn instead of pointer to struct page
| * | | tracing, mm: Record pfn instead of pointer to struct pageNamhyung Kim2015-04-13
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct page is opaque for userspace tools, so it'd be better to save pfn in order to identify page frames. The textual output of $debugfs/tracing/trace file remains unchanged and only raw (binary) data format is changed - but thanks to libtraceevent, userspace tools which deal with the raw data (like perf and trace-cmd) can parse the format easily. So impact on the userspace will also be minimal. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Based-on-patch-by: Joonsoo Kim <js1304@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1428298576-9785-3-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | Merge tag 'for-f2fs-4.1' of ↵Linus Torvalds2015-04-18
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "New features: - in-memory extent_cache - fs_shutdown to test power-off-recovery - use inline_data to store symlink path - show f2fs as a non-misc filesystem Major fixes: - avoid CPU stalls on sync_dirty_dir_inodes - fix some power-off-recovery procedure - fix handling of broken symlink correctly - fix missing dot and dotdot made by sudden power cuts - handle wrong data index during roll-forward recovery - preallocate data blocks for direct_io ... and a bunch of minor bug fixes and cleanups" * tag 'for-f2fs-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (71 commits) f2fs: pass checkpoint reason on roll-forward recovery f2fs: avoid abnormal behavior on broken symlink f2fs: flush symlink path to avoid broken symlink after POR f2fs: change 0 to false for bool type f2fs: do not recover wrong data index f2fs: do not increase link count during recovery f2fs: assign parent's i_mode for empty dir f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries f2fs: fix mismatching lock and unlock pages for roll-forward recovery f2fs: fix sparse warnings f2fs: limit b_size of mapped bh in f2fs_map_bh f2fs: persist system.advise into on-disk inode f2fs: avoid NULL pointer dereference in f2fs_xattr_advise_get f2fs: preallocate fallocated blocks for direct IO f2fs: enable inline data by default f2fs: preserve extent info for extent cache f2fs: initialize extent tree with on-disk extent info of inode f2fs: introduce __{find,grab}_extent_tree f2fs: split set_data_blkaddr from f2fs_update_extent_cache f2fs: enable fast symlink by utilizing inline data ...
| * | | f2fs: pass checkpoint reason on roll-forward recoveryJaegeuk Kim2015-04-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds CP_RECOVERY to remain recovery information for checkpoint. And, it makes sure writing checkpoint in this case. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: add some tracepoints to debug volatile and atomic writesJaegeuk Kim2015-04-10
| | | | | | | | | | | | | | | | Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: add trace for rb-tree extent cache opsChao Yu2015-03-03
| | |/ | |/| | | | | | | | | | | | | | | | This patch adds trace for lookup/update/shrink/destroy ops in rb-tree extent cache. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* | | mm: cma: add trace events for CMA allocations and freeingsStefan Strogin2015-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add trace events for cma_alloc() and cma_release(). The cma_alloc tracepoint is used both for successful and failed allocations, in case of allocation failure pfn=-1UL is stored and printed. Signed-off-by: Stefan Strogin <stefan.strogin@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Nazarewicz <mpn@google.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Cc: Thierry Reding <treding@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2015-04-14
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge first patchbomb from Andrew Morton: - arch/sh updates - ocfs2 updates - kernel/watchdog feature - about half of mm/ * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits) Documentation: update arch list in the 'memtest' entry Kconfig: memtest: update number of test patterns up to 17 arm: add support for memtest arm64: add support for memtest memtest: use phys_addr_t for physical addresses mm: move memtest under mm mm, hugetlb: abort __get_user_pages if current has been oom killed mm, mempool: do not allow atomic resizing memcg: print cgroup information when system panics due to panic_on_oom mm: numa: remove migrate_ratelimited mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE mm: split ET_DYN ASLR from mmap ASLR s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE mm: expose arch_mmap_rnd when available s390: standardize mmap_rnd() usage powerpc: standardize mmap_rnd() usage mips: extract logic for mmap_rnd() arm64: standardize mmap_rnd() usage x86: standardize mmap_rnd() usage arm: factor out mmap ASLR into mmap_rnd ...
| * | | x86: expose number of page table levels on Kconfig levelKirill A. Shutemov2015-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We would want to use number of page table level to define mm_struct. Let's expose it as CONFIG_PGTABLE_LEVELS. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'trace-v4.1' of ↵Linus Torvalds2015-04-14
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "Some clean ups and small fixes, but the biggest change is the addition of the TRACE_DEFINE_ENUM() macro that can be used by tracepoints. Tracepoints have helper functions for the TP_printk() called __print_symbolic() and __print_flags() that lets a numeric number be displayed as a a human comprehensible text. What is placed in the TP_printk() is also shown in the tracepoint format file such that user space tools like perf and trace-cmd can parse the binary data and express the values too. Unfortunately, the way the TRACE_EVENT() macro works, anything placed in the TP_printk() will be shown pretty much exactly as is. The problem arises when enums are used. That's because unlike macros, enums will not be changed into their values by the C pre-processor. Thus, the enum string is exported to the format file, and this makes it useless for user space tools. The TRACE_DEFINE_ENUM() solves this by converting the enum strings in the TP_printk() format into their number, and that is what is shown to user space. For example, the tracepoint tlb_flush currently has this in its format file: __print_symbolic(REC->reason, { TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" }, { TLB_REMOTE_SHOOTDOWN, "remote shootdown" }, { TLB_LOCAL_SHOOTDOWN, "local shootdown" }, { TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" }) After adding: TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH); TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN); TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN); TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN); Its format file will contain this: __print_symbolic(REC->reason, { 0, "flush on task switch" }, { 1, "remote shootdown" }, { 2, "local shootdown" }, { 3, "local mm shootdown" })" * tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits) tracing: Add enum_map file to show enums that have been mapped writeback: Export enums used by tracepoint to user space v4l: Export enums used by tracepoints to user space SUNRPC: Export enums in tracepoints to user space mm: tracing: Export enums in tracepoints to user space irq/tracing: Export enums in tracepoints to user space f2fs: Export the enums in the tracepoints to userspace net/9p/tracing: Export enums in tracepoints to userspace x86/tlb/trace: Export enums in used by tlb_flush tracepoint tracing/samples: Update the trace-event-sample.h with TRACE_DEFINE_ENUM() tracing: Allow for modules to convert their enums to values tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values tracing: Update trace-event-sample with TRACE_SYSTEM_VAR documentation tracing: Give system name a pointer brcmsmac: Move each system tracepoints to their own header iwlwifi: Move each system tracepoints to their own header mac80211: Move message tracepoints to their own header tracing: Add TRACE_SYSTEM_VAR to xhci-hcd tracing: Add TRACE_SYSTEM_VAR to kvm-s390 tracing: Add TRACE_SYSTEM_VAR to intel-sst ...
| * | | writeback: Export enums used by tracepoint to user spaceSteven Rostedt (Red Hat)2015-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The enums used in tracepoints for __print_symbolic() do not have their values shown in the tracepoint format files and this makes it difficult for user space tools to convert the binary values to the strings they are to represent. Add TRACE_DEFINE_ENUM(x) macros to export the enum names to their values to make the tracing output from user space tools more robust. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Cc: Dave Chinner <dchinner@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | v4l: Export enums used by tracepoints to user spaceSteven Rostedt (Red Hat)2015-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enums used by tracepoints for __print_symbolic() are shown in the tracepoint format files with just their names and not their values. This makes it difficult for user space tools to know how to convert the binary data into their string representations. By adding the use of TRACE_DEFINE_ENUM(), the enum names will be mapped to their values and shown in the tracing file system to let tools convert the data as necessary. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | SUNRPC: Export enums in tracepoints to user spaceSteven Rostedt (Red Hat)2015-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The enums used in the tracepoints for __print_symbolic() have their names shown in the tracepoint format files. User space tools do not know how to convert those names into their values to be able to convert the binary data. Use TRACE_DEFINE_ENUM() to export the enum names to their values for userspace to do the parsing correctly. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Acked-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | mm: tracing: Export enums in tracepoints to user spaceSteven Rostedt (Red Hat)2015-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The enums used in tracepoints with __print_symbolic() have their names shown in the tracepoint format files and not their values. This makes it difficult for user space tools to convert the binary data to the strings as user space does not know what those enums are about. By having them use TRACE_DEFINE_ENUM(), the names of the enums will be mapped to the values and shown to user space. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | irq/tracing: Export enums in tracepoints to user spaceSteven Rostedt (Red Hat)2015-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The enums used by the softirq mapping is what is shown in the output of the __print_symbolic() and not their values, that are needed to map them to their strings. Export them to userspace with the TRACE_DEFINE_ENUM() macro so that user space tools can map the enums with their values. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | f2fs: Export the enums in the tracepoints to userspaceSteven Rostedt (Red Hat)2015-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tracepoints that use __print_symbolic() use enums as the value to convert to strings. Unfortunately, the format files for these tracepoints show the enum name and not their value. This causes some userspace tools not to know how to convert __print_symbolic() to their strings. Add TRACE_DEFINE_ENUM() macros to export the enums used to userspace to let those tools know what those enum values are. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Pankaj Kumar <pankaj.km@samsung.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | net/9p/tracing: Export enums in tracepoints to userspaceSteven Rostedt (Red Hat)2015-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tracepoints in the 9p code use a lot of enums for the __print_symbolic() function. These enums are shown in the tracepoint format files, and user space tools such as trace-cmd does not have the information to parse it. Add helper macros to export the enums with TRACE_DEFINE_ENUM(). Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Eric Van Hensbergen <ericvh@gmail.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | x86/tlb/trace: Export enums in used by tlb_flush tracepointSteven Rostedt (Red Hat)2015-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Have the enums used in __print_symbolic() by the trace_tlb_flush() tracepoint exported to userpace such that they can be parsed by userspace tools. Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org Cc: Dave Hansen <dave@sr71.net> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: Add TRACE_SYSTEM_VAR to intel-sstSteven Rostedt (Red Hat)2015-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | New code will require TRACE_SYSTEM to be a valid C variable name, but some tracepoints have TRACE_SYSTEM with '-' and not '_', so it can not be used. Instead, add a TRACE_SYSTEM_VAR that can give the tracing infrastructure a unique name for the trace system. Link: http://lkml.kernel.org/r/20150402142831.GT6023@sirena.org.uk Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | | tracing: %pF is only for function pointersScott Wood2015-03-25
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | Use %pS for actual addresses, otherwise you'll get bad output on arches like ppc64 where %pF expects a function descriptor. Link: http://lkml.kernel.org/r/1426130037-17956-22-git-send-email-scottwood@freescale.com Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | Merge branch 'for-4.1' of ↵Linus Torvalds2015-04-13
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata updates from Tejun Heo: - Hannes's patchset implements support for better error reporting introduced by the new ATA command spec. - the deperecated pci_ dma API usages have been replaced by dma_ ones. - a bunch of hardware specific updates and some cleanups. * 'for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ata: remove deprecated use of pci api ahci: st: st_configure_oob must be called after IP is clocked. ahci: st: Update the ahci_st DT documentation ahci: st: Update the DT example for how to obtain the PHY. sata_dwc_460ex: indent an if statement libata: Add tracepoints libata-eh: Set 'information' field for autosense libata: Implement support for sense data reporting libata: Implement NCQ autosense libata: use status bit definitions in ata_dump_status() ide,ata: Rename ATA_IDX to ATA_SENSE libata: whitespace fixes in ata_to_sense_error() libata: whitespace cleanup in ata_get_cmd_descript() libata: use READ_LOG_DMA_EXT libata: remove ATA_FLAG_LOWTAG sata_dwc_460ex: re-use hsdev->dev instead of dwc_dev sata_dwc_460ex: move to generic DMA driver sata_dwc_460ex: join messages back sata: xgene: add ACPI support for APM X-Gene SATA ports ata: sata_mv: add proper definitions for LP_PHY_CTL register values
| * | | libata: Add tracepointsHannes Reinecke2015-03-27
| |/ / | | | | | | | | | | | | | | | | | | | | | Add some tracepoints for ata_qc_issue, ata_qc_complete, and ata_eh_link_autopsy. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>
* | / regmap: Move tracing header into drivers/base/regmapSteven Rostedt2015-03-19
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | The tracing events for regmap are confined to the regmap subsystem. It also requires accessing an internal header. Instead of including the internal header from a generic file location, move the tracing file into the regmap directory. Also rename the regmap tracing header to trace.h, as it is redundant to keep the regmap.h name when it is in the regmap directory. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Mark Brown <broonie@kernel.org>
* | regmap: introduce regmap_name to fix syscon regmap trace eventsPhilipp Zabel2015-03-19
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a NULL pointer dereference when enabling regmap event tracing in the presence of a syscon regmap, introduced by commit bdb0066df96e ("mfd: syscon: Decouple syscon interface from platform devices"). That patch introduced syscon regmaps that have their dev field set to NULL. The regmap trace events expect it to point to a valid struct device and feed it to dev_name(): $ echo 1 > /sys/kernel/debug/tracing/events/regmap/enable Unable to handle kernel NULL pointer dereference at virtual address 0000002c pgd = 80004000 [0000002c] *pgd=00000000 Internal error: Oops: 17 [#1] SMP ARM Modules linked in: coda videobuf2_vmalloc CPU: 0 PID: 304 Comm: kworker/0:2 Not tainted 4.0.0-rc2+ #9197 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Workqueue: events_freezable thermal_zone_device_check task: 9f25a200 ti: 9f1ee000 task.ti: 9f1ee000 PC is at ftrace_raw_event_regmap_block+0x3c/0xe4 LR is at _regmap_raw_read+0x1bc/0x1cc pc : [<803636e8>] lr : [<80365f2c>] psr: 600f0093 sp : 9f1efd78 ip : 9f1efdb8 fp : 9f1efdb4 r10: 00000004 r9 : 00000001 r8 : 00000001 r7 : 00000180 r6 : 00000000 r5 : 9f00e3c0 r4 : 00000003 r3 : 00000001 r2 : 00000180 r1 : 00000000 r0 : 9f00e3c0 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 2d91004a DAC: 00000015 Process kworker/0:2 (pid: 304, stack limit = 0x9f1ee210) Stack: (0x9f1efd78 to 0x9f1f0000) fd60: 9f1efda4 9f1efd88 fd80: 800708c0 805f9510 80927140 800f0013 9f1fc800 9eb2f490 00000000 00000180 fda0: 808e3840 00000001 9f1efdfc 9f1efdb8 80365f2c 803636b8 805f8958 800708e0 fdc0: a00f0013 803636ac 9f16de00 00000180 80927140 9f1fc800 9f1fc800 9f1efe6c fde0: 9f1efe6c 9f732400 00000000 00000000 9f1efe1c 9f1efe00 80365f70 80365d7c fe00: 80365f3c 9f1fc800 9f1fc800 00000180 9f1efe44 9f1efe20 803656a4 80365f48 fe20: 9f1fc800 00000180 9f1efe6c 9f1efe6c 9f732400 00000000 9f1efe64 9f1efe48 fe40: 803657bc 80365634 00000001 9e95f910 9f1fc800 9f1efeb4 9f1efe8c 9f1efe68 fe60: 80452ac0 80365778 9f1efe8c 9f1efe78 9e93d400 9e93d5e8 9f1efeb4 9f72ef40 fe80: 9f1efeac 9f1efe90 8044e11c 80452998 8045298c 9e93d608 9e93d400 808e1978 fea0: 9f1efecc 9f1efeb0 8044fd14 8044e0d0 ffffffff 9f25a200 9e93d608 9e481380 fec0: 9f1efedc 9f1efed0 8044fde8 8044fcec 9f1eff1c 9f1efee0 80038d50 8044fdd8 fee0: 9f1ee020 9f72ef40 9e481398 00000000 00000008 9f72ef54 9f1ee020 9f72ef40 ff00: 9e481398 9e481380 00000008 9f72ef40 9f1eff5c 9f1eff20 80039754 80038bfc ff20: 00000000 9e481380 80894100 808e1662 00000000 9e4f2ec0 00000000 9e481380 ff40: 800396f8 00000000 00000000 00000000 9f1effac 9f1eff60 8003e020 80039704 ff60: ffffffff 00000000 ffffffff 9e481380 00000000 00000000 9f1eff78 9f1eff78 ff80: 00000000 00000000 9f1eff88 9f1eff88 9e4f2ec0 8003df30 00000000 00000000 ffa0: 00000000 9f1effb0 8000eb60 8003df3c 00000000 00000000 00000000 00000000 ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff ffffffff Backtrace: [<803636ac>] (ftrace_raw_event_regmap_block) from [<80365f2c>] (_regmap_raw_read+0x1bc/0x1cc) r9:00000001 r8:808e3840 r7:00000180 r6:00000000 r5:9eb2f490 r4:9f1fc800 [<80365d70>] (_regmap_raw_read) from [<80365f70>] (_regmap_bus_read+0x34/0x6c) r10:00000000 r9:00000000 r8:9f732400 r7:9f1efe6c r6:9f1efe6c r5:9f1fc800 r4:9f1fc800 [<80365f3c>] (_regmap_bus_read) from [<803656a4>] (_regmap_read+0x7c/0x144) r6:00000180 r5:9f1fc800 r4:9f1fc800 r3:80365f3c [<80365628>] (_regmap_read) from [<803657bc>] (regmap_read+0x50/0x70) r9:00000000 r8:9f732400 r7:9f1efe6c r6:9f1efe6c r5:00000180 r4:9f1fc800 [<8036576c>] (regmap_read) from [<80452ac0>] (imx_get_temp+0x134/0x1a4) r6:9f1efeb4 r5:9f1fc800 r4:9e95f910 r3:00000001 [<8045298c>] (imx_get_temp) from [<8044e11c>] (thermal_zone_get_temp+0x58/0x74) r7:9f72ef40 r6:9f1efeb4 r5:9e93d5e8 r4:9e93d400 [<8044e0c4>] (thermal_zone_get_temp) from [<8044fd14>] (thermal_zone_device_update+0x34/0xec) r6:808e1978 r5:9e93d400 r4:9e93d608 r3:8045298c [<8044fce0>] (thermal_zone_device_update) from [<8044fde8>] (thermal_zone_device_check+0x1c/0x20) r5:9e481380 r4:9e93d608 [<8044fdcc>] (thermal_zone_device_check) from [<80038d50>] (process_one_work+0x160/0x3d4) [<80038bf0>] (process_one_work) from [<80039754>] (worker_thread+0x5c/0x4f4) r10:9f72ef40 r9:00000008 r8:9e481380 r7:9e481398 r6:9f72ef40 r5:9f1ee020 r4:9f72ef54 [<800396f8>] (worker_thread) from [<8003e020>] (kthread+0xf0/0x108) r10:00000000 r9:00000000 r8:00000000 r7:800396f8 r6:9e481380 r5:00000000 r4:9e4f2ec0 [<8003df30>] (kthread) from [<8000eb60>] (ret_from_fork+0x14/0x34) r7:00000000 r6:00000000 r5:8003df30 r4:9e4f2ec0 Code: e3140040 1a00001a e3140020 1a000016 (e596002c) ---[ end trace 193c15c2494ec960 ]--- Fixes: bdb0066df96e (mfd: syscon: Decouple syscon interface from platform devices) Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
* Merge branch 'lazytime' of ↵Linus Torvalds2015-02-17
|\ | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull lazytime mount option support from Al Viro: "Lazytime stuff from tytso" * 'lazytime' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ext4: add optimization for the lazytime mount option vfs: add find_inode_nowait() function vfs: add support for a lazytime mount option
| * ext4: add optimization for the lazytime mount optionTheodore Ts'o2015-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an optimization for the MS_LAZYTIME mount option so that we will opportunistically write out any inodes with the I_DIRTY_TIME flag set in a particular inode table block when we need to update some inode in that inode table block anyway. Also add some temporary code so that we can set the lazytime mount option without needing a modified /sbin/mount program which can set MS_LAZYTIME. We can eventually make this go away once util-linux has added support. Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * vfs: add support for a lazytime mount optionTheodore Ts'o2015-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new mount option which enables a new "lazytime" mode. This mode causes atime, mtime, and ctime updates to only be made to the in-memory version of the inode. The on-disk times will only get updated when (a) if the inode needs to be updated for some non-time related change, (b) if userspace calls fsync(), syncfs() or sync(), or (c) just before an undeleted inode is evicted from memory. This is OK according to POSIX because there are no guarantees after a crash unless userspace explicitly requests via a fsync(2) call. For workloads which feature a large number of random write to a preallocated file, the lazytime mount option significantly reduces writes to the inode table. The repeated 4k writes to a single block will result in undesirable stress on flash devices and SMR disk drives. Even on conventional HDD's, the repeated writes to the inode table block will trigger Adjacent Track Interference (ATI) remediation latencies, which very negatively impact long tail latencies --- which is a very big deal for web serving tiers (for example). Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2015-02-13
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM update from Paolo Bonzini: "Fairly small update, but there are some interesting new features. Common: Optional support for adding a small amount of polling on each HLT instruction executed in the guest (or equivalent for other architectures). This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This also has to be enabled manually for now, but the plan is to auto-tune this in the future. ARM/ARM64: The highlights are support for GICv3 emulation and dirty page tracking s390: Several optimizations and bugfixes. Also a first: a feature exposed by KVM (UUID and long guest name in /proc/sysinfo) before it is available in IBM's hypervisor! :) MIPS: Bugfixes. x86: Support for PML (page modification logging, a new feature in Broadwell Xeons that speeds up dirty page tracking), nested virtualization improvements (nested APICv---a nice optimization), usual round of emulation fixes. There is also a new option to reduce latency of the TSC deadline timer in the guest; this needs to be tuned manually. Some commits are common between this pull and Catalin's; I see you have already included his tree. Powerpc: Nothing yet. The KVM/PPC changes will come in through the PPC maintainers, because I haven't received them yet and I might end up being offline for some part of next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits) KVM: ia64: drop kvm.h from installed user headers KVM: x86: fix build with !CONFIG_SMP KVM: x86: emulate: correct page fault error code for NoWrite instructions KVM: Disable compat ioctl for s390 KVM: s390: add cpu model support KVM: s390: use facilities and cpu_id per KVM KVM: s390/CPACF: Choose crypto control block format s390/kernel: Update /proc/sysinfo file with Extended Name and UUID KVM: s390: reenable LPP facility KVM: s390: floating irqs: fix user triggerable endless loop kvm: add halt_poll_ns module parameter kvm: remove KVM_MMIO_SIZE KVM: MIPS: Don't leak FPU/DSP to guest KVM: MIPS: Disable HTW while in guest KVM: nVMX: Enable nested posted interrupt processing KVM: nVMX: Enable nested virtual interrupt delivery KVM: nVMX: Enable nested apic register virtualization KVM: nVMX: Make nested control MSRs per-cpu KVM: nVMX: Enable nested virtualize x2apic mode KVM: nVMX: Prepare for using hardware MSR bitmap ...
| * | kvm: add halt_poll_ns module parameterPaolo Bonzini2015-02-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new module parameter for the KVM module; when it is present, KVM attempts a bit of polling on every HLT before scheduling itself out via kvm_vcpu_block. This parameter helps a lot for latency-bound workloads---in particular I tested it with O_DSYNC writes with a battery-backed disk in the host. In this case, writes are fast (because the data doesn't have to go all the way to the platters) but they cannot be merged by either the host or the guest. KVM's performance here is usually around 30% of bare metal, or 50% if you use cache=directsync or cache=writethrough (these parameters avoid that the guest sends pointless flush requests, and at the same time they are not slow because of the battery-backed cache). The bad performance happens because on every halt the host CPU decides to halt itself too. When the interrupt comes, the vCPU thread is then migrated to a new physical CPU, and in general the latency is horrible because the vCPU thread has to be scheduled back in. With this patch performance reaches 60-65% of bare metal and, more important, 99% of what you get if you use idle=poll in the guest. This means that the tunable gets rid of this particular bottleneck, and more work can be done to improve performance in the kernel or QEMU. Of course there is some price to pay; every time an otherwise idle vCPUs is interrupted by an interrupt, it will poll unnecessarily and thus impose a little load on the host. The above results were obtained with a mostly random value of the parameter (500000), and the load was around 1.5-2.5% CPU usage on one of the host's core for each idle guest vCPU. The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll, that can be used to tune the parameter. It counts how many HLT instructions received an interrupt during the polling period; each successful poll avoids that Linux schedules the VCPU thread out and back in, and may also avoid a likely trip to C1 and back for the physical CPU. While the VM is idle, a Linux 4 VCPU VM halts around 10 times per second. Of these halts, almost all are failed polls. During the benchmark, instead, basically all halts end within the polling period, except a more or less constant stream of 50 per second coming from vCPUs that are not running the benchmark. The wasted time is thus very low. Things may be slightly different for Windows VMs, which have a ~10 ms timer tick. The effect is also visible on Marcelo's recently-introduced latency test for the TSC deadline timer. Though of course a non-RT kernel has awful latency bounds, the latency of the timer is around 8000-10000 clock cycles compared to 20000-120000 without setting halt_poll_ns. For the TSC deadline timer, thus, the effect is both a smaller average latency and a smaller variance. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | Merge tag 'for-f2fs-3.20' of ↵Linus Torvalds2015-02-12
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "Major changes are to: - add f2fs_io_tracer and F2FS_IOC_GETVERSION - fix wrong acl assignment from parent - fix accessing wrong data blocks - fix wrong condition check for f2fs_sync_fs - align start block address for direct_io - add and refactor the readahead flows of FS metadata - refactor atomic and volatile write policies But most of patches are for clean-ups and minor bug fixes. Some of them refactor old code too" * tag 'for-f2fs-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (64 commits) f2fs: use spinlock for segmap_lock instead of rwlock f2fs: fix accessing wrong indexed data blocks f2fs: avoid variable length array f2fs: fix sparse warnings f2fs: allocate data blocks in advance for f2fs_direct_IO f2fs: introduce macros to convert bytes and blocks in f2fs f2fs: call set_buffer_new for get_block f2fs: check node page contents all the time f2fs: avoid data offset overflow when lseeking huge file f2fs: fix to use highmem for pages of newly created directory f2fs: introduce a batched trim f2fs: merge {invalidate,release}page for meta/node/data pages f2fs: show the number of writeback pages in stat f2fs: keep PagePrivate during releasepage f2fs: should fail mount when trying to recover data on read-only dev f2fs: split UMOUNT and FASTBOOT flags f2fs: avoid write_checkpoint if f2fs is mounted readonly f2fs: support norecovery mount option f2fs: fix not to drop mount options when retrying fill_super f2fs: merge flags in struct f2fs_sb_info ...
| * | | f2fs: fix sparse warningsJaegeuk Kim2015-02-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch resolves the following warnings. include/trace/events/f2fs.h:150:1: warning: expression using sizeof bool include/trace/events/f2fs.h:180:1: warning: expression using sizeof bool include/trace/events/f2fs.h:990:1: warning: expression using sizeof bool include/trace/events/f2fs.h:990:1: warning: expression using sizeof bool include/trace/events/f2fs.h:150:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1) include/trace/events/f2fs.h:180:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1) include/trace/events/f2fs.h:990:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1) include/trace/events/f2fs.h:990:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1) fs/f2fs/checkpoint.c:27:19: warning: symbol 'inode_entry_slab' was not declared. Should it be static? fs/f2fs/checkpoint.c:577:15: warning: cast to restricted __le32 fs/f2fs/checkpoint.c:592:15: warning: cast to restricted __le32 fs/f2fs/trace.c:19:1: warning: symbol 'pids' was not declared. Should it be static? fs/f2fs/trace.c:21:21: warning: symbol 'last_io' was not declared. Should it be static? Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: split UMOUNT and FASTBOOT flagsJaegeuk Kim2015-02-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds FASTBOOT flag into checkpoint as follows. - CP_UMOUNT_FLAG is set when system is umounted. - CP_FASTBOOT_FLAG is set when intermediate checkpoint having node summaries was done. So, if you get CP_UMOUNT_FLAG from checkpoint, the system was umounted cleanly. Instead, if there was sudden-power-off, you can get CP_FASTBOOT_FLAG or nothing. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: merge flags in struct f2fs_sb_infoChao Yu2015-02-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, there are several variables with Boolean type as below: struct f2fs_sb_info { ... int s_dirty; bool need_fsck; bool s_closing; ... bool por_doing; ... } For this there are some issues: 1. there are some space of f2fs_sb_info is wasted due to aligning after Boolean type variables by compiler. 2. if we continuously add new flag into f2fs_sb_info, structure will be messed up. So in this patch, we try to: 1. switch s_dirty to Boolean type variable since it has two status 0/1. 2. merge s_dirty/need_fsck/s_closing/por_doing variables into s_flag. 3. introduce an enum type which can indicate different states of sbi. 4. use new introduced universal interfaces is_sbi_flag_set/{set,clear}_sbi_flag to operate flags for sbi. After that, above issues will be fixed. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: cleanup parameters for ↵Chao Yu2015-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | trace_f2fs_submit_{read_,write_,page_,page_m}bio with fio Cleanup parameters for trace_f2fs_submit_{read_,write_,page_,page_m}bio with fio as one parameter. Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | | f2fs: cleanup trace event of f2fs_submit_page_{m,}bio with DECLARE_EVENT_CLASSChao Yu2015-01-09
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds missing parameter _type_ for trace_f2fs_submit_page_bio, then use DECLARE_EVENT_CLASS/DEFINE_EVENT_CONDITION pair to cleanup some trace event code related to f2fs_submit_page_{m,}bio. Additionally, after we remove redundant code, size of code can be reduced: text data bss dec hex filename 176787 8712 56 185555 2d4d3 f2fs.ko.org 174408 8648 56 183112 2cb48 f2fs.ko Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* | | Merge branch 'for-3.20/bdi' of git://git.kernel.dk/linux-blockLinus Torvalds2015-02-12
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull backing device changes from Jens Axboe: "This contains a cleanup of how the backing device is handled, in preparation for a rework of the life time rules. In this part, the most important change is to split the unrelated nommu mmap flags from it, but also removing a backing_dev_info pointer from the address_space (and inode), and a cleanup of other various minor bits. Christoph did all the work here, I just fixed an oops with pages that have a swap backing. Arnd fixed a missing export, and Oleg killed the lustre backing_dev_info from staging. Last patch was from Al, unexporting parts that are now no longer needed outside" * 'for-3.20/bdi' of git://git.kernel.dk/linux-block: Make super_blocks and sb_lock static mtd: export new mtd_mmap_capabilities fs: make inode_to_bdi() handle NULL inode staging/lustre/llite: get rid of backing_dev_info fs: remove default_backing_dev_info fs: don't reassign dirty inodes to default_backing_dev_info nfs: don't call bdi_unregister ceph: remove call to bdi_unregister fs: remove mapping->backing_dev_info fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info nilfs2: set up s_bdi like the generic mount_bdev code block_dev: get bdev inode bdi directly from the block device block_dev: only write bdev inode on close fs: introduce f_op->mmap_capabilities for nommu mmap support fs: kill BDI_CAP_SWAP_BACKED fs: deduplicate noop_backing_dev_info
| * | | fs: remove default_backing_dev_infoChristoph Hellwig2015-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that default_backing_dev_info is not used for writeback purposes we can git rid of it easily: - instead of using it's name for tracing unregistered bdi we just use "unknown" - btrfs and ceph can just assign the default read ahead window themselves like several other filesystems already do. - we can assign noop_backing_dev_info as the default one in alloc_super. All filesystems already either assigned their own or noop_backing_dev_info. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | fs: export inode_to_bdi and use it in favor of mapping->backing_dev_infoChristoph Hellwig2015-01-20
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we got rid of the bdi abuse on character devices we can always use sb->s_bdi to get at the backing_dev_info for a file, except for the block device special case. Export inode_to_bdi and replace uses of mapping->backing_dev_info with it to prepare for the removal of mapping->backing_dev_info. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* | | Merge tag 'iommu-updates-v3.20' of ↵Linus Torvalds2015-02-12
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This time with: - Generic page-table framework for ARM IOMMUs using the LPAE page-table format, ARM-SMMU and Renesas IPMMU make use of it already. - Break out the IO virtual address allocator from the Intel IOMMU so that it can be used by other DMA-API implementations too. The first user will be the ARM64 common DMA-API implementation for IOMMUs - Device tree support for Renesas IPMMU - Various fixes and cleanups all over the place" * tag 'iommu-updates-v3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (36 commits) iommu/amd: Convert non-returned local variable to boolean when relevant iommu: Update my email address iommu/amd: Use wait_event in put_pasid_state_wait iommu/amd: Fix amd_iommu_free_device() iommu/arm-smmu: Avoid build warning iommu/fsl: Various cleanups iommu/fsl: Use %pa to print phys_addr_t iommu/omap: Print phys_addr_t using %pa iommu: Make more drivers depend on COMPILE_TEST iommu/ipmmu-vmsa: Fix IOMMU lookup when multiple IOMMUs are registered iommu: Disable on !MMU builds iommu/fsl: Remove unused fsl_of_pamu_ids[] iommu/fsl: Fix section mismatch iommu/ipmmu-vmsa: Use the ARM LPAE page table allocator iommu: Fix trace_map() to report original iova and original size iommu/arm-smmu: add support for iova_to_phys through ATS1PR iopoll: Introduce memory-mapped IO polling macros iommu/arm-smmu: don't touch the secure STLBIALL register iommu/arm-smmu: make use of generic LPAE allocator iommu: io-pgtable-arm: add non-secure quirk ...
| * \ \ Merge branches 'arm/renesas', 'arm/smmu', 'arm/omap', 'ppc/pamu', 'x86/amd' ↵Joerg Roedel2015-02-04
| |\ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | and 'core' into next Conflicts: drivers/iommu/Kconfig drivers/iommu/Makefile