aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* perf trace: Support MAP_FIXED_NOREPLACEArnaldo Carvalho de Melo2018-04-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduced in a4ff8e8620d3 ("mm: introduce MAP_FIXED_NOREPLACE"), and now that we have that define in the just syncronized tools/arch/*/include/uapi/asm/mman.h files, add support for it. This should really transition to autogeneration of string tables as done for various other things: $ ls /tmp/build/perf/trace/beauty/generated/*.c arch_errno_name_array.c kcmp_type_array.c madvise_behavior_array.c pkey_alloc_access_rights_array.c prctl_option_array.c $ head /tmp/build/perf/trace/beauty/generated/madvise_behavior_array.c static const char *madvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [8] = "FREE", [9] = "REMOVE", [10] = "DONTFORK", [11] = "DOFORK", $ Till then, add support for this the old way. Also it has to be ifdef'ed, because arches like mips still don't define it. The proper solution will be to have per-arch tables for these values to support cross-analysis. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-td9t5vhjltqnlzaurkkgq8cn@git.kernel.org Signef-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf: Remove superfluous allocation error checkJiri Olsa2018-04-17
| | | | | | | | | | | | | | | | | | | | | If the get_callchain_buffers fails to allocate the buffer it will decrease the nr_callchain_events right away. There's no point of checking the allocation error for nr_callchain_events > 1. Removing that check. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20180415092352.12403-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf: Fix sample_max_stack maximum checkJiri Olsa2018-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The syzbot hit KASAN bug in perf_callchain_store having the entry stored behind the allocated bounds [1]. We miss the sample_max_stack check for the initial event that allocates callchain buffers. This missing check allows to create an event with sample_max_stack value bigger than the global sysctl maximum: # sysctl -a | grep perf_event_max_stack kernel.perf_event_max_stack = 127 # perf record -vv -C 1 -e cycles/max-stack=256/ kill ... perf_event_attr: size 112 ... sample_max_stack 256 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4 Note the '-C 1', which forces perf record to create just single event. Otherwise it opens event for every cpu, then the sample_max_stack check fails on the second event and all's fine. The fix is to run the sample_max_stack check also for the first event with callchains. [1] https://marc.info/?l=linux-kernel&m=152352732920874&w=2 Reported-by: syzbot+7c449856228b63ac951e@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 97c79a38cd45 ("perf core: Per event callchain limit") Link: http://lkml.kernel.org/r/20180415092352.12403-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf: Return proper values for user stack errorsJiri Olsa2018-04-17
| | | | | | | | | | | | | | | | | | | | Return immediately when we find issue in the user stack checks. The error value could get overwritten by following check for PERF_SAMPLE_REGS_INTR. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 60e2364e60e8 ("perf: Add ability to sample machine state on interrupt") Link: http://lkml.kernel.org/r/20180415092352.12403-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf list: Add s390 support for detailed/verbose PMU event descriptionThomas Richter2018-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'perf list' with flags -d and -v print a description (-d) or a very verbose explanation (-v) of CPU specific counter events. These descriptions are provided with the json files in directory pmu-events/arch/s390/*.json. Display of these descriptions on s390 requires the corresponding json files. On s390 this does not work because function is_pmu_core() does not detect the s390 directory name where the CPU specific events are listed. On x86 it is: /sys/bus/event_source/devices/cpu whereas on s390 it is: /sys/bus/event_source/devices/cpum_cf /sys/bus/event_source/devices/cpum_sf Fix this by adding s390 directory name testing to function is_pmu_core(). This is the same approach as taken for the ARM platform. Output before: [root@s35lp76 perf]# ./perf list -d pmu List of pre-defined events (to be used in -e): cpum_cf/AES_BLOCKED_CYCLES/ [Kernel PMU event] cpum_cf/AES_BLOCKED_FUNCTIONS/ [Kernel PMU event] cpum_cf/AES_CYCLES/ [Kernel PMU event] cpum_cf/AES_FUNCTIONS/ [Kernel PMU event] .... cpum_cf/TX_NC_TEND/ [Kernel PMU event] cpum_cf/VX_BCD_EXECUTION_SLOTS/ [Kernel PMU event] cpum_sf/SF_CYCLES_BASIC/ [Kernel PMU event] Output after: [root@s35lp76 perf]# ./perf list -d pmu List of pre-defined events (to be used in -e): cpum_cf/AES_BLOCKED_CYCLES/ [Kernel PMU event] cpum_cf/AES_BLOCKED_FUNCTIONS/ [Kernel PMU event] cpum_cf/AES_CYCLES/ [Kernel PMU event] cpum_cf/AES_FUNCTIONS/ [Kernel PMU event] .... cpum_cf/TX_NC_TEND/ [Kernel PMU event] cpum_cf/VX_BCD_EXECUTION_SLOTS/ [Kernel PMU event] cpum_sf/SF_CYCLES_BASIC/ [Kernel PMU event] 3906: bcd_dfp_execution_slots [BCD DFP Execution Slots] decimal_instructions [Decimal Instructions] dtlb2_gpage_writes [DTLB2 GPAGE Writes] dtlb2_hpage_writes [DTLB2 HPAGE Writes] dtlb2_misses [DTLB2 Misses] dtlb2_writes [DTLB2 Writes] itlb2_misses [ITLB2 Misses] itlb2_writes [ITLB2 Writes] l1c_tlb2_misses [L1C TLB2 Misses] ..... cfvn 3: cpu_cycles [CPU Cycles] instructions [Instructions] l1d_dir_writes [L1D Directory Writes] l1d_penalty_cycles [L1D Penalty Cycles] l1i_dir_writes [L1I Directory Writes] l1i_penalty_cycles [L1I Penalty Cycles] problem_state_cpu_cycles [Problem State CPU Cycles] problem_state_instructions [Problem State Instructions] .... csvn generic: aes_blocked_cycles [AES Blocked Cycles] aes_blocked_functions [AES Blocked Functions] aes_cycles [AES Cycles] aes_functions [AES Functions] dea_blocked_cycles [DEA Blocked Cycles] dea_blocked_functions [DEA Blocked Functions] .... Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180416132314.33249-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Extend misc field decoding with switch out event typeAlexey Budankov2018-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Append 'p' sign to 'S' tag designating the type of context switch out event so 'Sp' means preemption context switch. Documentation is extended to cover new presentation changes. $ perf script --show-switch-events -F +misc -I -i perf.data: hdparm 4073 [004] U 762.198265: 380194 cycles:ppp: 7faf727f5a23 strchr (/usr/lib64/ld-2.26.so) hdparm 4073 [004] K 762.198366: 441572 cycles:ppp: ffffffffb9218435 alloc_set_pte (/lib/modules/4.16.0-rc6+/build/vmlinux) hdparm 4073 [004] S 762.198391: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [004] 762.198392: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4073/4073 swapper 0 [004] Sp 762.198477: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 4073/4073 hdparm 4073 [004] 762.198478: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 swapper 0 [007] K 762.198514: 2303073 cycles:ppp: ffffffffb98b0c66 intel_idle (/lib/modules/4.16.0-rc6+/build/vmlinux) swapper 0 [007] Sp 762.198561: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 1134/1134 kworker/u16:18 1134 [007] 762.198562: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 kworker/u16:18 1134 [007] S 762.198567: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5fc65ce7-8ca5-53ae-8858-8ddd27290575@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf report: Extend raw dump (-D) out with switch out event typeAlexey Budankov2018-04-17
| | | | | | | | | | | | | | | | | | | | | | | | Print additional 'preempt' tag for PERF_RECORD_SWITCH[_CPU_WIDE] OUT records when event header misc field contains PERF_RECORD_MISC_SWITCH_OUT_PREEMPT bit set designating preemption context switch out event: tools/perf/perf report -D -i perf.data | grep _SWITCH 0 768361415226 0x27f076 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 8/8 4 768362216813 0x28f45e [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 4 768362217824 0x28f486 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4073/4073 0 768362414027 0x27f0ce [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 8/8 0 768362414367 0x27f0f6 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/6f5aebb9-b96c-f304-f08f-8f046d38de4f@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf/core: Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE]Alexey Budankov2018-04-17
| | | | | | | | | | | | | | | | | | | | | | | Store preempting context switch out event into Perf trace as a part of PERF_RECORD_SWITCH[_CPU_WIDE] record. Percentage of preempting and non-preempting context switches help understanding the nature of workloads (CPU or IO bound) that are running on a machine; The event is treated as preemption one when task->state value of the thread being switched out is TASK_RUNNING. Event type encoding is implemented using PERF_RECORD_MISC_SWITCH_OUT_PREEMPT bit; Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/9ff84e83-a0ca-dd82-a6d0-cb951689be74@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* tools/headers: Synchronize kernel ABI headers, v4.17-rc1Ingo Molnar2018-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sync the following tooling headers with the latest kernel version: tools/arch/arm/include/uapi/asm/kvm.h - New ABI: KVM_REG_ARM_* tools/arch/x86/include/asm/required-features.h - Removal of NEED_LA57 dependency tools/arch/x86/include/uapi/asm/kvm.h - New KVM ABI: KVM_SYNC_X86_* tools/include/uapi/asm-generic/mman-common.h - New ABI: MAP_FIXED_NOREPLACE flag tools/include/uapi/linux/bpf.h - New ABI: BPF_F_SEQ_NUMBER functions tools/include/uapi/linux/if_link.h - New ABI: IFLA tun and rmnet support tools/include/uapi/linux/kvm.h - New ABI: hyperv eventfd and CONN_ID_MASK support plus header cleanups tools/include/uapi/sound/asound.h - New ABI: SNDRV_PCM_FORMAT_FIRST PCM format specifier tools/perf/arch/x86/entry/syscalls/syscall_64.tbl - The x86 system call table description changed due to the ptregs changes and the renames, in: d5a00528b58c: syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*() 5ac9efa3c50d: syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention ebeb8c82ffaf: syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32 Also fix the x86 syscall table warning: -Warning: Kernel ABI header at 'tools/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' +Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' None of these changes impact existing tooling code, so we only have to copy the kernel version. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Brian Robbins <brianrob@microsoft.com> Cc: Clark Williams <williams@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Li Zhijian <lizhijian@cn.fujitsu.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Takuya Yamamoto <tkydevel@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: William Cohen <wcohen@redhat.com> Cc: Yonghong Song <yhs@fb.com> Link: http://lkml.kernel.org/r/20180416064024.ofjtrz5yuu3ykhvl@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* trace_kprobe: Remove warning message "Could not insert probe at..."Song Liu2018-04-17
| | | | | | | | | | | | | | | | | | | | This warning message is not very helpful, as the return value should already show information about the error. Also, this message will spam dmesg if the user space does testing in a loop, like: for x in {0..5} do echo p:xx xx+$x >> /sys/kernel/debug/tracing/kprobe_events done Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Song Liu <songliubraving@fb.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20180413185513.3626052-1-songliubraving@fb.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge tag 'perf-core-for-mingo-4.17-20180413' of ↵Ingo Molnar2018-04-16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull tooling improvements and fixes from Arnaldo Carvalho de Melo: perf annotate fixes and improvements: - Allow showing offsets in more than just jump targets, use the new 'O' hotkey in the TUI, config ~/.perfconfig annotate.offset_level for it and for --stdio2 (Arnaldo Carvalho de Melo) - Use the resolved variable names from objdump disassembled lines to make them more compact, just like was already done for some instructions, like "mov", this eventually will be done more generally, but lets now add some more to the existing mechanism (Arnaldo Carvalho de Melo) perf record fixes: - Change warning for missing topology sysfs entry to debug, as not all architectures have those files, s390 being one of those (Thomas Richter) perf sched fixes: - Fix -g/--call-graph documentation (Takuya Yamamoto) perf stat: - Enable 1ms interval for printing event counters values in (Alexey Budankov) perf test fixes: - Run dwarf unwind on arm32 (Kim Phillips) - Remove unused ptrace.h include from LLVM test, sidesteping older clang's lack of support for some asm constructs (Arnaldo Carvalho de Melo) perf version fixes: - Do not print info about HAVE_LIBAUDIT_SUPPORT in 'perf version --build-options' when HAVE_SYSCALL_TABLE_SUPPORT is true, as libaudit won't be used in that case, print info about syscall_table support instead (Jin Yao) Build system fixes: - Use HAVE_..._SUPPORT used consistently (Jin Yao) - Restore READ_ONCE() C++ compatibility in tools/include (Mark Rutland) - Give hints about package names needed to build jvmti (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf annotate: Handle variables in 'sub', 'or' and many other instructionsArnaldo Carvalho de Melo2018-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like is done for 'mov' and others that can have as source or targets variables resolved by objdump, to make them more compact: - orb $0x4,0x224d71(%rip) # 226ca4 <_rtld_global+0xca4> + orb $0x4,_rtld_global+0xca4 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-efex7746id4w4wa03nqxvh3m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf annotate: Allow setting the offset level in .perfconfigArnaldo Carvalho de Melo2018-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default is 1 (jump_target): # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574 _raw_spin_lock_irqsave() /proc/kcore 0.26 nop 4.61 push %rbx 19.33 pushfq 7.97 pop %rax 0.32 nop 0.06 mov %rax,%rbx 14.63 cli 0.06 nop xor %eax,%eax mov $0x1,%edx 49.94 lock cmpxchg %edx,(%rdi) 0.16 test %eax,%eax ↓ jne 2b 2.66 mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi → callq *ffffffffb30eaed0 mov %rbx,%rax pop %rbx ← retq # But one can ask for showing offsets for call instructions by setting this: # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574 _raw_spin_lock_irqsave() /proc/kcore 0.26 nop 4.61 push %rbx 19.33 pushfq 7.97 pop %rax 0.32 nop 0.06 mov %rax,%rbx 14.63 cli 0.06 nop xor %eax,%eax mov $0x1,%edx 49.94 lock cmpxchg %edx,(%rdi) 0.16 test %eax,%eax ↓ jne 2b 2.66 mov %rbx,%rax pop %rbx ← retq 2b: mov %eax,%esi 2d: → callq *ffffffffb30eaed0 mov %rbx,%rax pop %rbx ← retq # Or using a big value to ask for all offsets to be shown: # cat ~/.perfconfig [annotate] offset_level = 100 hide_src_code = true # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574 _raw_spin_lock_irqsave() /proc/kcore 0.26 0: nop 4.61 5: push %rbx 19.33 6: pushfq 7.97 7: pop %rax 0.32 8: nop 0.06 d: mov %rax,%rbx 14.63 10: cli 0.06 11: nop 17: xor %eax,%eax 19: mov $0x1,%edx 49.94 1e: lock cmpxchg %edx,(%rdi) 0.16 22: test %eax,%eax 24: ↓ jne 2b 2.66 26: mov %rbx,%rax 29: pop %rbx 2a: ← retq 2b: mov %eax,%esi 2d: → callq *ffffffffb30eaed0 32: mov %rbx,%rax 35: pop %rbx 36: ← retq # This also affects the TUI, i.e. the default 'perf annotate' and 'perf top/report' -> A hotkey -> annotate interfaces, when slang-devel is present in the build, i.e.: # perf version --build-options | grep slang libslang: [ on ] # HAVE_SLANG_SUPPORT # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-venm6x5zrt40eu8hxdsmqxz6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf report: Fix switching to another perf.data fileArnaldo Carvalho de Melo2018-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the TUI the 's' hotkey can be used to switch to another perf.data file in the current directory, but that got broken in Fixes: b01141f4f59c ("perf annotate: Initialize the priv are in symbol__new()"), that would show this once another file was chosen: ┌─Fatal Error─────────────────────────────────────┐ │Annotation needs to be init before symbol__init()│ │ │ │ │ │Press any key... │ └─────────────────────────────────────────────────┘ Fix it by just silently bailing out if symbol__annotation_init() was already called, just like is done with symbol__init(), i.e. they are done just once at session start, not when switching to a new perf.data file. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: b01141f4f59c ("perf annotate: Initialize the priv are in symbol__new()") Link: https://lkml.kernel.org/n/tip-ogppdtpzfax7y1h6gjdv5s6u@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf record: Change warning for missing sysfs entry to debugThomas Richter2018-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using perf on 4.16.0 kernel on s390 shows this warning: failed: can't open node sysfs data each time I run command perf record ... for example: [root@s35lp76 perf]# ./perf record -e rB0000 -- sleep 1 [ perf record: Woken up 1 times to write data ] failed: can't open node sysfs data [ perf record: Captured and wrote 0.001 MB perf.data (4 samples) ] [root@s35lp76 perf]# It turns out commit e2091cedd51bf ("perf tools: Add MEM_TOPOLOGY feature to perf data file") tries to open directory named /sys/devices/system/node/ which does not exist on s390. This is the call stack: __cmd_record +---> perf_session__write_header +---> perf_header__adds_write +---> do_write_feat +---> write_mem_topology +---> build_mem_topology prints warning The issue starts in do_write_feat() which unconditionally loops over all features and now includes HEADER_MEM_TOPOLOGY and calls write_mem_topology(). Function record__init_features() at the beginning of __cmd_record() sets all features and then turns off some of them. Fix this by changing the warning to a level 2 debug output statement. So it is only shown when debug level 2 or higher is set. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180412133246.92801-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tests: Disable breakpoint accounting test for powerpcSandipan Das2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We disable this test as instruction breakpoints (HW_BREAKPOINT_X) are not available for powerpc. Before applying patch: 21: Breakpoint accounting : --- start --- test child forked, pid 3635 failed opening event 0 failed opening event 0 watchpoints count 1, breakpoints count 0, has_ioctl 1, share 0 test child finished with -2 ---- end ---- Breakpoint accounting: Skip After applying patch: 21: Breakpoint accounting : Disabled Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20180412162140.2992-1-sandipan@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf sched: Fix documentation for timehistTakuya Yamamoto2018-04-12
| | | | | | | | | | | | | | | | | | | | Fixed a incorrect option and usage to those shown by "perf sched timehist -h", i.e. the default is really --call-graph, which is equivalent to -g. Signed-off-by: Takuya Yamamoto <tkydevel@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-8fzo0dlsi1mku5aqx8brep5s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf version: Print status for syscall_tableJin Yao2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch doesn't print "libaudit" line if HAVE_SYSCALL_TABLE_SUPPORT is available and add a line for HAVE_SYSCALL_TABLE_SUPPORT. For example, $ ./perf -vv perf version 4.13.rc5.gc2f8af9 dwarf: [ on ] # HAVE_DWARF_SUPPORT dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT glibc: [ on ] # HAVE_GLIBC_SUPPORT gtk2: [ on ] # HAVE_GTK2_SUPPORT syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT libbfd: [ on ] # HAVE_LIBBFD_SUPPORT libelf: [ on ] # HAVE_LIBELF_SUPPORT libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT libperl: [ on ] # HAVE_LIBPERL_SUPPORT libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT libslang: [ on ] # HAVE_SLANG_SUPPORT libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT zlib: [ on ] # HAVE_ZLIB_SUPPORT lzma: [ on ] # HAVE_LZMA_SUPPORT get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT bpf: [ on ] # HAVE_LIBBPF_SUPPORT The line "syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT" is new created. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1523269609-28824-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORTJin Yao2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To be consistent with other HAVE_XXX_SUPPORT uses in Makefile.config, this patch renames HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT and updates the C code accordingly. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1523269609-28824-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf script: Use HAVE_LIBXXX_SUPPORT to replace NO_LIBXXXJin Yao2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In Makefile.config, we define the conditional compilation variables HAVE_LIBPERL_SUPPORT and HAVE_LIBPYTHON_SUPPORT. To make the C code more consistent, this patch replaces NO_LIBPERL/NO_LIBPYTHON in C code with HAVE_LIBPERL_SUPPORT/ HAVE_LIBPYTHON_SUPPORT. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1523269609-28824-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * Revert "x86/asm: Allow again using asm.h when building for the 'bpf' clang ↵Arnaldo Carvalho de Melo2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | target" This reverts commit ca26cffa4e4aaeb09bb9e308f95c7835cb149248. Newer clang versions accept that asm(_ASM_SP) construct, and now that the bpf-script-test-kbuild.c script, used in one of the 'perf test LLVM' subtests doesn't include ptrace.h, which ended up including arch/x86/include/asm/asm.h, we can revert this patch. Suggested-by: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/r/613f0a0d-c433-8f4d-dcc1-c9889deae39e@fb.com Acked-by: Yonghong Song <yhs@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-nqozcv8loq40tkqpfw997993@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tests bpf: Remove unused ptrace.h include from LLVM testArnaldo Carvalho de Melo2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bpf-script-test-kbuild.c script, used in one of the LLVM subtests, includes ptrace.h unnecessarily, and that ends up making it include a header that uses asm(_ASM_SP), a feature that is not supported by clang <= 4.0, breaking that 'perf test' entry. This ended up leading to the ca26cffa4e4a ("x86/asm: Allow again using asm.h when building for the 'bpf' clang target"), adding an ifndef __BPF__ to the arch/x86/include/asm/asm.h file. Newer clang versions accept that asm(_ASM_SP) construct, so just remove the ptrace.h include, which paves the way for reverting ca26cffa4e4a ("x86/asm: Allow again using asm.h when building for the 'bpf' clang target"). Suggested-by: Yonghong Song <yhs@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lkml.kernel.org/r/613f0a0d-c433-8f4d-dcc1-c9889deae39e@fb.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-clbcnzbakdp18ibme4wt43ib@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf jvmti: Give hints about package names needed to buildArnaldo Carvalho de Melo2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Give as examples of package names to install to have this built for fedora and debian, to help the user a bit. The part from 'e.g.:' onwards: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: William Cohen <wcohen@redhat.com> Link: https://lkml.kernel.org/n/tip-edbi4r2pvzn7no6ebxbtczng@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf annotate browser: Allow showing offsets in more than just jump targetsArnaldo Carvalho de Melo2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jesper wanted to see offsets at callq sites when doing some performance investigation related to retpolines, so save him some time by providing a 'O' hotkey to allow showing offsets from function start at call instructions or in all instructions, just go on pressing 'O' till the offsets you need appear. Example: Starts with: Samples: 64 of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963 ixgbe_read_reg /proc/kcore Percent│ ↑ je 2a │ ┌──cmp $0xffffffff,%r13d │ ├──je d0 │ │ mov $0x53e3,%edi │ │→ callq __const_udelay │ │ sub $0x1,%r15d │ │↑ jne 83 │ │ mov 0x8(%rbp),%rax │ │ testb $0x20,0x1799(%rax) │ │↑ je 2a │ │ mov 0x200(%rax),%rdi │ │ mov %r13d,%edx │ │ mov $0xffffffffc02595d8,%rsi │ │→ callq netdev_warn │ │↑ jmpq 2a │d0:└─→mov 0x8(%rbp),%rsi │ mov %rbp,%rdi │ mov %eax,0x4(%rsp) │ → callq ixgbe_remove_adapter.isra.77 │ mov 0x4(%rsp),%eax Press 'h' for help on key bindings ============================================================================ Pess 'O': Samples: 64 of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963 ixgbe_read_reg /proc/kcore Percent│ ↑ je 2a │ ┌──cmp $0xffffffff,%r13d │ ├──je d0 │ │ mov $0x53e3,%edi │99:│→ callq __const_udelay │ │ sub $0x1,%r15d │ │↑ jne 83 │ │ mov 0x8(%rbp),%rax │ │ testb $0x20,0x1799(%rax) │ │↑ je 2a │ │ mov 0x200(%rax),%rdi │ │ mov %r13d,%edx │ │ mov $0xffffffffc02595d8,%rsi │c6:│→ callq netdev_warn │ │↑ jmpq 2a │d0:└─→mov 0x8(%rbp),%rsi │ mov %rbp,%rdi │ mov %eax,0x4(%rsp) │db: → callq ixgbe_remove_adapter.isra.77 │ mov 0x4(%rsp),%eax Press 'h' for help on key bindings ============================================================================ Press 'O' again: Samples: 64 of event 'cycles:ppp', 100000 Hz, Event count (approx.): 318963 ixgbe_read_reg /proc/kcore Percent│8c: ↑ je 2a │8e:┌──cmp $0xffffffff,%r13d │92:├──je d0 │94:│ mov $0x53e3,%edi │99:│→ callq __const_udelay │9e:│ sub $0x1,%r15d │a2:│↑ jne 83 │a4:│ mov 0x8(%rbp),%rax │a8:│ testb $0x20,0x1799(%rax) │af:│↑ je 2a │b5:│ mov 0x200(%rax),%rdi │bc:│ mov %r13d,%edx │bf:│ mov $0xffffffffc02595d8,%rsi │c6:│→ callq netdev_warn │cb:│↑ jmpq 2a │d0:└─→mov 0x8(%rbp),%rsi │d4: mov %rbp,%rdi │d7: mov %eax,0x4(%rsp) │db: → callq ixgbe_remove_adapter.isra.77 │e0: mov 0x4(%rsp),%eax Press 'h' for help on key bindings ============================================================================ Press 'O' again and it will show just jump target offsets. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-upp6pfdetwlsx18ec2uf1od4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf annotate: Allow showing offsets in more than just jump targetsArnaldo Carvalho de Melo2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jesper wanted to see offsets at callq sites when doing some performance investigation related to retpolines, so save him some time by providing an 'struct annotation_options' to control where offsets should appear: just on jump targets? That + call instructions? All? This puts in place the logic to show the offsets, now we need to wire this up in the TUI browser (next patch) and on the 'perf annotate --stdio2" interface, where we need a more general mechanism to setup the 'annotation_options' struct from the command line. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-m3jc9c3swobye9tj08gnh5i7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf tests: Run dwarf unwind test on arm32Kim Phillips2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable the unwind test on arm32: $ perf test unwind 58: DWARF unwind : Ok Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Brian Robbins <brianrob@microsoft.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180410191624.a3a468670dd4548c66d3d094@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * tools headers: Restore READ_ONCE() C++ compatibilityMark Rutland2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our userspace <linux/compiler.h> defines READ_ONCE() in a way that clang doesn't like, as we have an anonymous union in which neither field is initialized. WRITE_ONCE() is fine since it initializes the __val field. For READ_ONCE() we can keep clang and GCC happy with a dummy initialization of the __c field, so let's do that. At the same time, let's split READ_ONCE() and WRITE_ONCE() over several lines for legibility, as we do in the in-kernel <linux/compiler.h>. Reported-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reported-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: 6aa7de059173a986 ("locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()") Link: http://lkml.kernel.org/r/20180404163445.16492-1-mark.rutland@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * perf stat: Enable 1ms interval for printing event counters valuesAlexey Budankov2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently print count interval for performance counters values is limited by 10ms so reading the values at frequencies higher than 100Hz is restricted by the tool. This change makes perf stat -I possible on frequencies up to 1KHz and, to some extent, makes perf stat -I to be on-par with perf record sampling profiling. When running perf stat -I for monitoring e.g. PCIe uncore counters and at the same time profiling some I/O workload by perf record e.g. for cpu-cycles and context switches, it is then possible to observe consolidated CPU/OS/IO(Uncore) performance picture for that workload. Tool overhead warning printed when specifying -v option can be missed due to screen scrolling in case you have output to the console so message is moved into help available by running perf stat -h. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/b842ad6a-d606-32e4-afe5-974071b5198e@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | Linux 4.17-rc1Linus Torvalds2018-04-15
| |
* | Merge tag 'for-4.17-part2-tag' of ↵Linus Torvalds2018-04-15
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull more btrfs updates from David Sterba: "We have queued a few more fixes (error handling, log replay, softlockup) and the rest is SPDX updates that touche almost all files so the diffstat is long" * tag 'for-4.17-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: Only check first key for committed tree blocks btrfs: add SPDX header to Kconfig btrfs: replace GPL boilerplate by SPDX -- sources btrfs: replace GPL boilerplate by SPDX -- headers Btrfs: fix loss of prealloc extents past i_size after fsync log replay Btrfs: clean up resources during umount after trans is aborted btrfs: Fix possible softlock on single core machines Btrfs: bail out on error during replay_dir_deletes Btrfs: fix NULL pointer dereference in log_dir_items
| * | btrfs: Only check first key for committed tree blocksQu Wenruo2018-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When looping btrfs/074 with many cpus (>= 8), it's possible to trigger kernel warning due to first key verification: [ 4239.523446] WARNING: CPU: 5 PID: 2381 at fs/btrfs/disk-io.c:460 btree_read_extent_buffer_pages+0x1ad/0x210 [ 4239.523830] Modules linked in: [ 4239.524630] RIP: 0010:btree_read_extent_buffer_pages+0x1ad/0x210 [ 4239.527101] Call Trace: [ 4239.527251] read_tree_block+0x42/0x70 [ 4239.527434] read_node_slot+0xd2/0x110 [ 4239.527632] push_leaf_right+0xad/0x1b0 [ 4239.527809] split_leaf+0x4ea/0x700 [ 4239.527988] ? leaf_space_used+0xbc/0xe0 [ 4239.528192] ? btrfs_set_lock_blocking_rw+0x99/0xb0 [ 4239.528416] btrfs_search_slot+0x8cc/0xa40 [ 4239.528605] btrfs_insert_empty_items+0x71/0xc0 [ 4239.528798] __btrfs_run_delayed_refs+0xa98/0x1680 [ 4239.529013] btrfs_run_delayed_refs+0x10b/0x1b0 [ 4239.529205] btrfs_commit_transaction+0x33/0xaf0 [ 4239.529445] ? start_transaction+0xa8/0x4f0 [ 4239.529630] btrfs_alloc_data_chunk_ondemand+0x1b0/0x4e0 [ 4239.529833] btrfs_check_data_free_space+0x54/0xa0 [ 4239.530045] btrfs_delalloc_reserve_space+0x25/0x70 [ 4239.531907] btrfs_direct_IO+0x233/0x3d0 [ 4239.532098] generic_file_direct_write+0xcb/0x170 [ 4239.532296] btrfs_file_write_iter+0x2bb/0x5f4 [ 4239.532491] aio_write+0xe2/0x180 [ 4239.532669] ? lock_acquire+0xac/0x1e0 [ 4239.532839] ? __might_fault+0x3e/0x90 [ 4239.533032] do_io_submit+0x594/0x860 [ 4239.533223] ? do_io_submit+0x594/0x860 [ 4239.533398] SyS_io_submit+0x10/0x20 [ 4239.533560] ? SyS_io_submit+0x10/0x20 [ 4239.533729] do_syscall_64+0x75/0x1d0 [ 4239.533979] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 4239.534182] RIP: 0033:0x7f8519741697 The problem here is, at btree_read_extent_buffer_pages() we don't have acquired read/write lock on that extent buffer, only basic info like level/bytenr is reliable. So race condition leads to such false alert. However in current call site, it's impossible to acquire proper lock without race window. To fix the problem, we only verify first key for committed tree blocks (whose generation is no larger than fs_info->last_trans_committed), so the content of such tree blocks will not change and there is no need to get read/write lock. Reported-by: Nikolay Borisov <nborisov@suse.com> Fixes: 581c1760415c ("btrfs: Validate child tree block's level and first key") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: add SPDX header to KconfigDavid Sterba2018-04-12
| | | | | | | | | | | | Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: replace GPL boilerplate by SPDX -- sourcesDavid Sterba2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | Remove GPL boilerplate text (long, short, one-line) and keep the rest, ie. personal, company or original source copyright statements. Add the SPDX header. Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: replace GPL boilerplate by SPDX -- headersDavid Sterba2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove GPL boilerplate text (long, short, one-line) and keep the rest, ie. personal, company or original source copyright statements. Add the SPDX header. Unify the include protection macros to match the file names. Signed-off-by: David Sterba <dsterba@suse.com>
| * | Btrfs: fix loss of prealloc extents past i_size after fsync log replayFilipe Manana2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently if we allocate extents beyond an inode's i_size (through the fallocate system call) and then fsync the file, we log the extents but after a power failure we replay them and then immediately drop them. This behaviour happens since about 2009, commit c71bf099abdd ("Btrfs: Avoid orphan inodes cleanup while replaying log"), because it marks the inode as an orphan instead of dropping any extents beyond i_size before replaying logged extents, so after the log replay, and while the mount operation is still ongoing, we find the inode marked as an orphan and then perform a truncation (drop extents beyond the inode's i_size). Because the processing of orphan inodes is still done right after replaying the log and before the mount operation finishes, the intention of that commit does not make any sense (at least as of today). However reverting that behaviour is not enough, because we can not simply discard all extents beyond i_size and then replay logged extents, because we risk dropping extents beyond i_size created in past transactions, for example: add prealloc extent beyond i_size fsync - clears the flag BTRFS_INODE_NEEDS_FULL_SYNC from the inode transaction commit add another prealloc extent beyond i_size fsync - triggers the fast fsync path power failure In that scenario, we would drop the first extent and then replay the second one. To fix this just make sure that all prealloc extents beyond i_size are logged, and if we find too many (which is far from a common case), fallback to a full transaction commit (like we do when logging regular extents in the fast fsync path). Trivial reproducer: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ xfs_io -f -c "pwrite -S 0xab 0 256K" /mnt/foo $ sync $ xfs_io -c "falloc -k 256K 1M" /mnt/foo $ xfs_io -c "fsync" /mnt/foo <power failure> # mount to replay log $ mount /dev/sdb /mnt # at this point the file only has one extent, at offset 0, size 256K A test case for fstests follows soon, covering multiple scenarios that involve adding prealloc extents with previous shrinking truncates and without such truncates. Fixes: c71bf099abdd ("Btrfs: Avoid orphan inodes cleanup while replaying log") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | Btrfs: clean up resources during umount after trans is abortedLiu Bo2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently if some fatal errors occur, like all IO get -EIO, resources would be cleaned up when a) transaction is being committed or b) BTRFS_FS_STATE_ERROR is set However, in some rare cases, resources may be left alone after transaction gets aborted and umount may run into some ASSERT(), e.g. ASSERT(list_empty(&block_group->dirty_list)); For case a), in btrfs_commit_transaciton(), there're several places at the beginning where we just call btrfs_end_transaction() without cleaning up resources. For case b), it is possible that the trans handle doesn't have any dirty stuff, then only trans hanlde is marked as aborted while BTRFS_FS_STATE_ERROR is not set, so resources remain in memory. This makes btrfs also check BTRFS_FS_STATE_TRANS_ABORTED to make sure that all resources won't stay in memory after umount. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: Fix possible softlock on single core machinesNikolay Borisov2018-04-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | do_chunk_alloc implements a loop checking whether there is a pending chunk allocation and if so causes the caller do loop. Generally this loop is executed only once, however testing with btrfs/072 on a single core vm machines uncovered an extreme case where the system could loop indefinitely. This is due to a missing cond_resched when loop which doesn't give a chance to the previous chunk allocator finish its job. The fix is to simply add the missing cond_resched. Fixes: 6d74119f1a3e ("Btrfs: avoid taking the chunk_mutex in do_chunk_alloc") Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | Btrfs: bail out on error during replay_dir_deletesLiu Bo2018-04-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If errors were returned by btrfs_next_leaf(), replay_dir_deletes needs to bail out, otherwise @ret would be forced to be 0 after 'break;' and the caller won't be aware of it. Fixes: e02119d5a7b4 ("Btrfs: Add a write ahead tree log to optimize synchronous operations") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | Btrfs: fix NULL pointer dereference in log_dir_itemsLiu Bo2018-04-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 0, 1 and <0 can be returned by btrfs_next_leaf(), and when <0 is returned, path->nodes[0] could be NULL, log_dir_items lacks such a check for <0 and we may run into a null pointer dereference panic. Fixes: e02119d5a7b4 ("Btrfs: Add a write ahead tree log to optimize synchronous operations") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com>
* | | Merge tag '4.17-rc1SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2018-04-15
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes from Steve French: "SMB3 fixes, a few for stable, and some important cleanup work from Ronnie of the smb3 transport code" * tag '4.17-rc1SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: change validate_buf to validate_iov cifs: remove rfc1002 hardcoded constants from cifs_discard_remaining_data() cifs: Change SMB2_open to return an iov for the error parameter cifs: add resp_buf_size to the mid_q_entry structure smb3.11: replace a 4 with server->vals->header_preamble_size cifs: replace a 4 with server->vals->header_preamble_size cifs: add pdu_size to the TCP_Server_Info structure SMB311: Improve checking of negotiate security contexts SMB3: Fix length checking of SMB3.11 negotiate request CIFS: add ONCE flag for cifs_dbg type cifs: Use ULL suffix for 64-bit constant SMB3: Log at least once if tree connect fails during reconnect cifs: smb2pdu: Fix potential NULL pointer dereference
| * | | cifs: change validate_buf to validate_iovRonnie Sahlberg2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | cifs: remove rfc1002 hardcoded constants from cifs_discard_remaining_data()Ronnie Sahlberg2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | cifs: Change SMB2_open to return an iov for the error parameterRonnie Sahlberg2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | cifs: add resp_buf_size to the mid_q_entry structureRonnie Sahlberg2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and get rid of some more calls to get_rfc1002_length() Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | smb3.11: replace a 4 with server->vals->header_preamble_sizeSteve French2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | More cleanup of use of hardcoded 4 byte RFC1001 field size Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
| * | | cifs: replace a 4 with server->vals->header_preamble_sizeRonnie Sahlberg2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | cifs: add pdu_size to the TCP_Server_Info structureRonnie Sahlberg2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and get rid of some get_rfc1002_length() in smb2 Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | SMB311: Improve checking of negotiate security contextsSteve French2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SMB3.11 crypto and hash contexts were not being checked strictly enough. Add parsing and validity checking for the security contexts in the SMB3.11 negotiate response. Signed-off-by: Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | SMB3: Fix length checking of SMB3.11 negotiate requestSteve French2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The length checking for SMB3.11 negotiate request includes "negotiate contexts" which caused a buffer validation problem and a confusing warning message on SMB3.11 mount e.g.: SMB2 server sent bad RFC1001 len 236 not 170 Fix the length checking for SMB3.11 negotiate to account for the new negotiate context so that we don't log a warning on SMB3.11 mount by default but do log warnings if lengths returned by the server are incorrect. CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
| * | | CIFS: add ONCE flag for cifs_dbg typeAurelien Aptel2018-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Since cifs_vfs_error was just using pr_debug_ratelimited like the rest of cifs_dbg, move it there too * Add a ONCE type flag to call the pr_xxx_once() debug function instead of the ratelimited ones. To convert existing printk_once() calls to this we can run: perl -i -pE \ 's/printk_once\s*\(([^" \n]+)(.*)/cifs_dbg(VFS|ONCE,$2/g' \ fs/cifs/*.c Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>