path: root/tools
Commit message (Author, Age)
...
* perf string: Simplify ltrim() implementation (Arnaldo Carvalho de Melo, 2017-04-11)

    We don't need to use strlen(), a var, or check for the end explicitly,
    isspace('\0') is false:

      [acme@jouet c]$ cat ltrim.c
      #include <ctype.h>
      #include <stdio.h>

      static char *ltrim(char *s)
      {
              while (isspace(*s))
                      ++s;

              return s;
      }

      int main(void)
      {
              printf("ltrim(\"\")='%s'\n", ltrim(""));
              return 0;
      }
      [acme@jouet c]$ ./ltrim
      ltrim("")=''
      [acme@jouet c]$

    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Taeung Song <treeze.taeung@gmail.com>
    Link: http://lkml.kernel.org/n/tip-w3nk0x3pai2vojk2ab6kdvaw@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Refactor the code to strip command name with {l,r}trim() (Taeung Song, 2017-04-11)

    After reading the command name from /proc/<pid>/status, use ltrim() and
    rtrim() to strip it, instead of an open-coded while loop with isspace().

    Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
    Acked-by: David Ahern <dsahern@gmail.com>
    Cc: Don Zickus <dzickus@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/r/1491575061-704-6-git-send-email-treeze.taeung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
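The stripping described in the commit above can be sketched as follows. This is a minimal illustration, not perf's code: ltrim() and rtrim() here are simplified stand-ins, comm_from_status_line() is a hypothetical helper, and the "Name:" prefix is the field label used in /proc/<pid>/status.

```c
#include <ctype.h>
#include <string.h>

/* Skip leading whitespace; isspace('\0') is false, so the loop stops at the end. */
static char *ltrim(char *s)
{
	while (isspace((unsigned char)*s))
		++s;
	return s;
}

/* Cut trailing whitespace in place and return the same pointer. */
static char *rtrim(char *s)
{
	size_t len = strlen(s);

	while (len > 0 && isspace((unsigned char)s[len - 1]))
		s[--len] = '\0';
	return s;
}

/*
 * Hypothetical helper: given a writable "Name:\t<comm>\n" line, return a
 * pointer to the stripped command name, or NULL if the line does not start
 * with "Name:".
 */
static char *comm_from_status_line(char *line)
{
	static const char prefix[] = "Name:";

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return NULL;
	return rtrim(ltrim(line + sizeof(prefix) - 1));
}
```

Calling comm_from_status_line() on a buffer holding "Name:\t  perf \n" should yield "perf". The buffer must be writable because rtrim() terminates the string in place.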
* perf pmu: Refactor wordwrap() with ltrim() (Taeung Song, 2017-04-11)

    Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/r/1491575061-704-5-git-send-email-treeze.taeung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf ui browser: Refactor the code to parse color configs with ltrim() (Taeung Song, 2017-04-11)

    When parsing {fore,back}ground color configs, use ltrim() instead of an
    open-coded while loop with isspace().

    Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/r/1491575061-704-4-git-send-email-treeze.taeung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf stat: Refactor the code to strip csv output with ltrim() (Taeung Song, 2017-04-11)

    To strip csv output, use ltrim() instead of an open-coded while loop with
    isspace() at print_metric_{only}_csv().

    Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/r/1491575061-704-3-git-send-email-treeze.taeung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf evsel: Return exact sub event which failed with EPERM for wildcards (Jin Yao, 2017-04-11)

    The kernel has a special check for a specific irq_vectors trace event.

      TRACE_EVENT_PERF_PERM(irq_work_exit,
              is_sampling_event(p_event) ? -EPERM : 0);

    The perf-record fails for this irq_vectors event when it is present,
    like when using a wildcard:

      root@skl:/tmp# perf record -a -e irq_vectors:* sleep 2
      Error:
      You may not have permission to collect system-wide stats.

      Consider tweaking /proc/sys/kernel/perf_event_paranoid,
      which controls use of the performance events system by
      unprivileged users (without CAP_SYS_ADMIN).

      The current value is 2:

        -1: Allow use of (almost) all events by all users
      >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
      >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
      >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN

      To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

        kernel.perf_event_paranoid = -1

    This patch prints out the exact sub event that failed with EPERM for
    wildcards to help in understanding what went wrong when this event is
    present:

    After the patch:

      root@skl:/tmp# perf record -a -e irq_vectors:* sleep 2
      Error:
      No permission to enable irq_vectors:irq_work_exit event.

      You may not have permission to collect system-wide stats.
      ......

    Committer notes:

    So we have a lot of irq_vectors events:

      [root@jouet ~]# perf list irq_vectors:*

      List of pre-defined events (to be used in -e):

        irq_vectors:call_function_entry          [Tracepoint event]
        irq_vectors:call_function_exit           [Tracepoint event]
        irq_vectors:call_function_single_entry   [Tracepoint event]
        irq_vectors:call_function_single_exit    [Tracepoint event]
        irq_vectors:deferred_error_apic_entry    [Tracepoint event]
        irq_vectors:deferred_error_apic_exit     [Tracepoint event]
        irq_vectors:error_apic_entry             [Tracepoint event]
        irq_vectors:error_apic_exit              [Tracepoint event]
        irq_vectors:irq_work_entry               [Tracepoint event]
        irq_vectors:irq_work_exit                [Tracepoint event]
        irq_vectors:local_timer_entry            [Tracepoint event]
        irq_vectors:local_timer_exit             [Tracepoint event]
        irq_vectors:reschedule_entry             [Tracepoint event]
        irq_vectors:reschedule_exit              [Tracepoint event]
        irq_vectors:spurious_apic_entry          [Tracepoint event]
        irq_vectors:spurious_apic_exit           [Tracepoint event]
        irq_vectors:thermal_apic_entry           [Tracepoint event]
        irq_vectors:thermal_apic_exit            [Tracepoint event]
        irq_vectors:threshold_apic_entry         [Tracepoint event]
        irq_vectors:threshold_apic_exit          [Tracepoint event]
        irq_vectors:x86_platform_ipi_entry       [Tracepoint event]
        irq_vectors:x86_platform_ipi_exit        [Tracepoint event]
      #

    And some may be sampled:

      [root@jouet ~]# perf record -e irq_vectors:local* sleep 20s
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.020 MB perf.data (2 samples) ]
      [root@jouet ~]# perf report -D | egrep 'stats:|events:'
      Aggregated stats:
                 TOTAL events: 155
                  MMAP events: 144
                  COMM events: 2
                  EXIT events: 1
                SAMPLE events: 2
                 MMAP2 events: 4
        FINISHED_ROUND events: 1
             TIME_CONV events: 1
      irq_vectors:local_timer_entry stats:
                 TOTAL events: 1
                SAMPLE events: 1
      irq_vectors:local_timer_exit stats:
                 TOTAL events: 1
                SAMPLE events: 1
      [root@jouet ~]#

    But, as shown in the tracepoint definition at the start of this message,
    some, like "irq_vectors:irq_work_exit", may not be sampled, just counted,
    i.e. if we try to sample, as when using 'perf record', we get an error:

      [root@jouet ~]# perf record -e irq_vectors:irq_work_exit
      Error:
      You may not have permission to collect system-wide stats.

      Consider tweaking /proc/sys/kernel/perf_event_paranoid,
      <SNIP>

    The error message is misleading, this patch will help in pointing out
    what is the event causing such an error, but the error message needs
    improvement, i.e. we need to figure out a way to check if a tracepoint
    is counting only, like this one, when all we can do is to count it with
    'perf stat', at most printing the delta using interval printing, as in:

      [root@jouet ~]# perf stat -I 5000 -e irq_vectors:irq_work_*
      #        time  counts unit events
        5.000168871       0      irq_vectors:irq_work_entry
        5.000168871       0      irq_vectors:irq_work_exit
       10.000676730       0      irq_vectors:irq_work_entry
       10.000676730       0      irq_vectors:irq_work_exit
       15.001122415       0      irq_vectors:irq_work_entry
       15.001122415       0      irq_vectors:irq_work_exit
       20.001298051       0      irq_vectors:irq_work_entry
       20.001298051       0      irq_vectors:irq_work_exit
       25.001485020       1      irq_vectors:irq_work_entry
       25.001485020       1      irq_vectors:irq_work_exit
       30.001658706       0      irq_vectors:irq_work_entry
       30.001658706       0      irq_vectors:irq_work_exit
      ^C
       32.045711878       0      irq_vectors:irq_work_entry
       32.045711878       0      irq_vectors:irq_work_exit

      [root@jouet ~]#

    But at least, when we use a wildcard, this patch helps a bit.

    Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Kan Liang <kan.liang@intel.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/1491566932-503-1-git-send-email-yao.jin@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Use strtok_r() when parsing output field list (Arnaldo Carvalho de Melo, 2017-04-11)

    Just avoiding non-reentrant functions.

    Cc: David Ahern <dsahern@gmail.com>
    Link: http://lkml.kernel.org/n/tip-eqytykipd74epzl9aexvppcg@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf callchains: Switch from strtok() to strtok_r() when parsing options (Arnaldo Carvalho de Melo, 2017-04-11)

    Trying to keep everything reentrant.

    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: http://lkml.kernel.org/n/tip-rdce0p2k9e1b4qnrb8ki9mtf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
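The move from strtok() to strtok_r() in the two commits above can be illustrated with a small sketch. It is not taken from perf: count_fields() and count_all_fields() are hypothetical helpers, and the ';' and ',' separators are chosen only for the example. The point is that strtok() keeps its scan position in hidden static state, so nesting two tokenizers breaks, while strtok_r() keeps that position in a caller-owned pointer.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/*
 * Count the tokens in a comma-separated list. 'saveptr' holds the scan
 * position that strtok() would keep in static storage. 'list' is modified
 * in place.
 */
static int count_fields(char *list)
{
	char *saveptr = NULL;
	char *tok;
	int n = 0;

	for (tok = strtok_r(list, ",", &saveptr); tok;
	     tok = strtok_r(NULL, ",", &saveptr))
		n++;
	return n;
}

/*
 * Nested tokenizing: split on ';', then count the ','-separated fields of
 * each group. This only works because each level owns its own saveptr.
 */
static int count_all_fields(char *groups)
{
	char *saveptr = NULL;
	char *group;
	int total = 0;

	for (group = strtok_r(groups, ";", &saveptr); group;
	     group = strtok_r(NULL, ";", &saveptr))
		total += count_fields(group);
	return total;
}
```

With plain strtok(), the inner loop would overwrite the outer loop's position and the outer loop would stop after the first group.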
* Merge tag 'v4.11-rc6' into perf/core, to pick up fixes (Ingo Molnar, 2017-04-11)
|\
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * Merge tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux (Linus Torvalds, 2017-04-08)
| |\
    Pull powerpc fixes from Michael Ellerman:
     "Some more powerpc fixes for 4.11:

      Headed to stable:

       - disable HFSCR[TM] if TM is not supported, fixes a potential host
         kernel crash triggered by a hostile guest, but only in
         configurations that no one uses

       - don't try to fix up misaligned load-with-reservation instructions

       - fix flush_(d|i)cache_range() called from modules on little endian
         kernels

       - add missing global TLB invalidate if cxl is active

       - fix missing preempt_disable() in crc32c-vpmsum

      And a fix for selftests build changes that went in this release:

       - selftests/powerpc: Fix standalone powerpc build

      Thanks to: Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran,
      Paul Mackerras"

    * tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable()
      powerpc/mm: Add missing global TLB invalidate if cxl is active
      powerpc/64: Fix flush_(d|i)cache_range() called from modules
      powerpc: Don't try to fix up misaligned load-with-reservation instructions
      powerpc: Disable HFSCR[TM] if TM is not supported
      selftests/powerpc: Fix standalone powerpc build
| | * selftests/powerpc: Fix standalone powerpc build (Michael Ellerman, 2017-03-27)

    The changes to enable building with a separate output directory, in
    commit a8ba798bc8ec ("selftests: enable O and KBUILD_OUTPUT") broke
    building the powerpc selftests on their own, eg:

      $ cd tools/testing/selftests/powerpc; make

    It was partially fixed in commit e53aff45c490 ("selftests: lib.mk Fix
    individual test builds"), which defined OUTPUT for standalone tests. But
    that only defines OUTPUT within the Makefile, the value is not exported
    so sub-shells can't see it.

    We could export OUTPUT, but it's actually cleaner to just expand the
    value of OUTPUT before we invoke the shell.

    Fixes: a8ba798bc8ec ("selftests: enable O and KBUILD_OUTPUT")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | bpf: add various verifier test cases for self-tests (Daniel Borkmann, 2017-04-01)

    Add a couple of test cases, for example, probing for xadd on a spilled
    pointer to packet and map_value_adj register, various other
    map_value_adj tests including the unaligned load/store, and trying out
    pointer arithmetic on map_value_adj register itself. For the unaligned
    load/store, we need to figure out whether the architecture has efficient
    unaligned access and need to mark affected tests accordingly.

    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: improve verifier packet range checks (Alexei Starovoitov, 2017-03-24)

    llvm can optimize the 'if (ptr > data_end)' checks into an order slightly
    different from the original C code, which will confuse the verifier.
    Like:

      if (ptr + 16 > data_end)
              return TC_ACT_SHOT;
      // may be followed by
      if (ptr + 14 > data_end)
              return TC_ACT_SHOT;

    While llvm can see that 'ptr' is valid for all 16 bytes, the verifier
    could not. Fix the verifier logic to account for such a case and add a
    test.

    Reported-by: Huapeng Zhou <hzhou@fb.com>
    Fixes: 969bf05eb3ce ("bpf: direct packet access")
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
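The check ordering described in the commit above can be shown with a plain userspace analogue. This is only a sketch: check_packet() is a hypothetical function, ACT_OK and ACT_SHOT are local stand-ins for the TC_ACT_* return codes, and nothing here runs through the BPF verifier. It shows why the second, narrower check is implied by the first, which is the fact the verifier had to learn.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the TC_ACT_* return codes named in the commit text. */
enum { ACT_OK = 0, ACT_SHOT = 2 };

/*
 * Hypothetical packet check. On success, stores byte 15 of the packet in
 * '*last' to show that all 16 bytes are readable.
 */
static int check_packet(const uint8_t *data, const uint8_t *data_end,
			uint8_t *last)
{
	const uint8_t *ptr = data;

	/* Wider check first: proves ptr[0..15] lie before data_end. */
	if (ptr + 16 > data_end)
		return ACT_SHOT;

	/*
	 * Narrower check second: it can never fail once the check above has
	 * passed. A compiler sees that; a verifier that only remembers the
	 * most recent range would wrongly shrink the valid range to 14 bytes.
	 */
	if (ptr + 14 > data_end)
		return ACT_SHOT;

	*last = ptr[15];
	return ACT_OK;
}
```

In this userspace form the pointers must stay inside one array for the arithmetic to be defined, so callers should pass a data_end that lies within the same buffer as data.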
* | | perf annotate: Fix missing number of samples for source_line_samples (Taeung Song, 2017-04-04)

    The option 'show-total-period' works fine without the '-l' option. But
    when running 'perf annotate --stdio -l --show-total-period', the number
    of samples is shown as zero '0' on every line.

    Before:

      $ perf annotate --stdio -l --show-total-period
      ...
             0 :        400816:       push   %rbp
             0 :        400817:       mov    %rsp,%rbp
             0 :        40081a:       mov    %edi,-0x24(%rbp)
             0 :        40081d:       mov    %rsi,-0x30(%rbp)
             0 :        400821:       mov    -0x24(%rbp),%eax
             0 :        400824:       mov    -0x30(%rbp),%rdx
             0 :        400828:       mov    (%rdx),%esi
             0 :        40082a:       mov    $0x0,%edx
      ...

    The reason is that the number of samples of source_line_samples was
    never set, so set it properly.

    After:

      $ perf annotate --stdio -l --show-total-period
      ...
             3 :        400816:       push   %rbp
             4 :        400817:       mov    %rsp,%rbp
             0 :        40081a:       mov    %edi,-0x24(%rbp)
             0 :        40081d:       mov    %rsi,-0x30(%rbp)
             1 :        400821:       mov    -0x24(%rbp),%eax
             2 :        400824:       mov    -0x30(%rbp),%rdx
             0 :        400828:       mov    (%rdx),%esi
             1 :        40082a:       mov    $0x0,%edx
      ...

    Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Martin Liska <mliska@suse.cz>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Fixes: 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period")
    Link: http://lkml.kernel.org/r/1490703125-13643-1-git-send-email-treeze.taeung@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | perf tools: Don't die on a print function (Arnaldo Carvalho de Melo, 2017-04-04)

    Trying to remove die() calls from library functions, postponing exiting
    to the tool main code.

    Link: http://lkml.kernel.org/n/tip-ackxq5nqe39gunln3tkczs42@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | perf tools: Handle allocation failures gracefully (Arnaldo Carvalho de Melo, 2017-04-04)

    The callers of perf_read_values__enlarge_counters() already propagate
    errors, so just print some debug diagnostics and handle allocation
    failures gracefully, not trying to do silly things like 'a = realloc(a)'.

    Link: http://lkml.kernel.org/n/tip-nsmmh7uzpg35rzcl9nq7yztp@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
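The 'a = realloc(a)' problem named in the commit above is a general C pitfall: if realloc() fails it returns NULL and leaves the old block allocated, so assigning the result straight back loses the only pointer to it. A minimal sketch of the safer pattern follows. The struct and the counters__enlarge() function are hypothetical and are not the perf_read_values code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical growable array of counter values. */
struct counters {
	unsigned long long *vals;
	size_t nr;		/* number of allocated slots */
};

/*
 * Grow the array to at least 'want' slots. On failure return -ENOMEM and
 * leave the old array and its contents untouched, so the caller can
 * propagate the error instead of exiting.
 */
static int counters__enlarge(struct counters *c, size_t want)
{
	unsigned long long *tmp;
	size_t i;

	if (want <= c->nr)
		return 0;

	/* Keep the result in a temporary: never 'c->vals = realloc(c->vals, ...)'. */
	tmp = realloc(c->vals, want * sizeof(*tmp));
	if (!tmp)
		return -ENOMEM;

	/* Zero only the newly added slots. */
	for (i = c->nr; i < want; i++)
		tmp[i] = 0;

	c->vals = tmp;
	c->nr = want;
	return 0;
}
```

A caller would start from a zeroed struct, check the return value of each call, and free(c.vals) when done.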
* | | perf tools: Remove die() call (Arnaldo Carvalho de Melo, 2017-04-04)

    We can just use the exit() right after the branch calling die().

    Link: http://lkml.kernel.org/n/tip-90athn06d7atf2jkpfvq1iic@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | Merge branch 'perf/uncore-json-updates-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc into perf/core (Arnaldo Carvalho de Melo, 2017-04-04)
|\ \ \
    Pull perf/core improvements from Andi Kleen:

    This pull request contains updates to the Intel PMU events JSON files,
    plus two one-liner code fixes for the JSON files (also appended as
    patch).

    The most remarkable change is Sandy Bridge to Skylake client uncore
    event list support.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf vendor events intel: Add missing space in json descriptions (Andi Kleen, 2017-03-30)

    Add a missing space in the JSON description after the uncore unit.

    Before:

      perf list
      ...
        unc_arb_coh_trk_requests.all
             [Unit: uncore_arbNumber of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc]
      ...

    After:

        unc_arb_coh_trk_requests.all
             [Unit: uncore_arb Number of entries allocated. Account for Any type: e.g. Snoop, Core aperture, etc]

    Cc: jolsa@kernel.org
    Link: http://lkml.kernel.org/n/tip-p989c7x9kaiy2bnkmgpo6cvt@git.kernel.org
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
| * | | perf vendor events intel: Add uncore_arb JSON support (Andi Kleen, 2017-03-30)

    The JSON lists call the box iMPH-U, while perf calls it arb. Add
    conversion support to json to convert the unit properly.

    Cc: jolsa@kernel.org
    Link: http://lkml.kernel.org/n/tip-stq5ly95z2qioggp9bfaqe0h@git.kernel.org
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
| * | | perf vendor events intel: Add uncore events for Skylake client (Andi Kleen, 2017-03-30)

    Add V25 of Skylake uncore events

    Cc: jolsa@kernel.org
    Link: http://lkml.kernel.org/n/tip-00qmcrmq183x2qrj59g92fma@git.kernel.org
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
| * | | perf vendor events intel: Add uncore events for Broadwell client (Andi Kleen, 2017-03-30)

    Add V18 of Broadwell uncore events

    Cc: jolsa@kernel.org
    Link: http://lkml.kernel.org/n/tip-xlbguqdzho7l3qn7di40a7av@git.kernel.org
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
| * | | perf vendor events intel: Add uncore events for Haswell client (Andi Kleen, 2017-03-30)

    Add V25 of Haswell uncore events

    Cc: jolsa@kernel.org
    Link: http://lkml.kernel.org/n/tip-133r1do7vvssoyszxgx174hj@git.kernel.org
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
| * | | perf vendor events intel: Add uncore events for Ivy Bridge client (Andi Kleen, 2017-03-30)

    Add V18 of Ivy Bridge uncore events

    Cc: jolsa@kernel.org
    Link: http://lkml.kernel.org/n/tip-299k76asec5rwp0i86qygnnt@git.kernel.org
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
| * | | perf vendor events intel: Add uncore events for Sandy Bridge client (Andi Kleen, 2017-03-30)

    Add V15 of Sandy Bridge uncore events

    Cc: jolsa@kernel.org
    Link: http://lkml.kernel.org/n/tip-2qkwutpwljdue8jmwk3xqdbl@git.kernel.org
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
| * | | perf vendor events intel: Add missing UNC_M_DCLOCKTICKS for Broadwell DE uncore (Andi Kleen, 2017-03-30)

    An earlier update removed the UNC_M_CLOCKTICKS event for Broadwell DE.
    But Metric events were still referring to it. This adds it back under a
    different name from the event list, and also fixes up the Metric events
    to use the new name.

    Cc: jolsa@kernel.org
    Link: http://lkml.kernel.org/n/tip-zxxzg4g5nr93o7np00vgqqwm@git.kernel.org
    Signed-off-by: Andi Kleen <ak@linux.intel.com>
* | | | perf sdt powerpc: Add argument support (Ravi Bangoria, 2017-04-04)

    An SDT marker argument is in N@OP format. Here OP is an arch dependent
    component. Add powerpc logic to parse OP and convert it to a uprobe
    compatible format.

    Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
    Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
    Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/20170328094754.3156-4-ravi.bangoria@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | Merge tag 'perf-core-for-mingo-4.12-20170331' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (Ingo Molnar, 2017-04-01)
|\ \ \ \
    Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

    New features:

    - Beautify the statx syscall arguments in 'perf trace' (Arnaldo Carvalho de Melo)

      e.g.:

      System wide strace like session:

      # trace -e statx
      16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
      36050.891 ( 0.007 ms): statx/4576 statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
      ^C#

    User visible changes:

    - Handle unpaired raw_syscalls:sys_exit events in 'perf trace', i.e. we
      shouldn't try to calculate duration or print the timestamp for a
      missing matching raw_syscalls:sys_enter (Arnaldo Carvalho de Melo)

    - Do not print "cycles: 0" in perf report LBR lines in platforms not
      supporting 'cycles', such as Intel's Broadwell (Jin Yao)

    - Handle missing $HOME env var (Jiri Olsa)

    - Map 8-bit registers (al, bl, etc), not supported in uprobes_events, to
      the next best thing (ax, bx, etc) supported (Ravi Bangoria)

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | perf trace: Beautify statx syscall 'flag' and 'mask' arguments (Arnaldo Carvalho de Melo, 2017-03-31)

    To test it, build samples/statx/test_statx, which I did as:

      $ make headers_install
      $ cc -I ~/git/linux/usr/include samples/statx/test-statx.c -o /tmp/statx

    And then use perf trace on it:

      # perf trace -e statx /tmp/statx /etc/passwd
      statx(/etc/passwd) = 0
      results=7ff
        Size: 3496 Blocks: 8 IO Block: 4096 regular file
      Device: fd:00 Inode: 280156 Links: 1
      Access: (0644/-rw-r--r--) Uid: 0 Gid: 0
      Access: 2017-03-29 16:01:01.650073438-0300
      Modify: 2017-03-10 16:25:14.156479354-0300
      Change: 2017-03-10 16:25:14.171479328-0300
           0.000 ( 0.007 ms): statx/30648 statx(dfd: CWD, filename: 0x7ef503f4, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff7ef4eb10) = 0
      #

    Using the test-statx.c options to change the mask:

      # perf trace -e statx /tmp/statx -O /etc/passwd > /dev/null
           0.000 ( 0.008 ms): statx/30745 statx(dfd: CWD, filename: 0x3a0753f4, flags: SYMLINK_NOFOLLOW, mask: BTIME, buffer: 0x7ffd3a0735c0) = 0
      #
      # perf trace -e statx /tmp/statx -A /etc/passwd > /dev/null
           0.000 ( 0.010 ms): statx/30757 statx(dfd: CWD, filename: 0xa94e63f4, flags: SYMLINK_NOFOLLOW|NO_AUTOMOUNT, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffea94e49d0) = 0
      #
      # trace --no-inherit -e statx /tmp/statx -F /etc/passwd > /dev/null
           0.000 ( 0.011 ms): statx(dfd: CWD, filename: 0x3b02d3f3, flags: SYMLINK_NOFOLLOW|STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffd3b02c850) = 0
      #
      # trace --no-inherit -e statx /tmp/statx -F -L /etc/passwd > /dev/null
           0.000 ( 0.008 ms): statx(dfd: CWD, filename: 0x15cff3f3, flags: STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff15cfdda0) = 0
      #
      # trace --no-inherit -e statx /tmp/statx -D -O /etc/passwd > /dev/null
           0.000 ( 0.009 ms): statx(dfd: CWD, filename: 0xfa37f3f3, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffffa37da20) = 0
      #

    Adding a probe to get the filename collected as well:

      # perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
      Added new event:
        probe:vfs_getname (on getname_flags:72 with pathname=result->name:string)

      You can now use it in all perf tools, such as:

              perf record -e probe:vfs_getname -aR sleep 1

      # trace --no-inherit -e statx /tmp/statx -D -O /etc/passwd > /dev/null
           0.169 ( 0.007 ms): statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
      #

    The same technique could be used to collect and beautify the result put
    in the 'buffer' argument.

    Finally do a system wide 'perf trace' session looking for any use of
    statx, then run the test proggie with various flags:

      # trace -e statx
      16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
      33064.447 ( 0.011 ms): statx/4569 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW|STATX_FORCE_SYNC, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffc5484c790) = 0
      36050.891 ( 0.023 ms): statx/4576 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: BTIME, buffer: 0x7ffeb18b66e0) = 0
      38039.889 ( 0.023 ms): statx/4584 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7fff1db0ea90) = 0
      ^C#

    This one also starts moving the beautifiers from files directly included
    in builtin-trace.c to separate objects + a beauty.h header with
    prototypes, so that we can add test cases in tools/perf/tests/ to fire
    syscalls with various arguments and then get them intercepted as
    syscalls:sys_enter_foo or raw_syscalls:sys_enter + sys_exit to then
    format and check that the formatted output is the one we expect.

    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: David Howells <dhowells@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/n/tip-xvzw8eynffvez5czyzidhrno@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
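The kind of 'mask' beautifying shown in the trace output above can be sketched as a bit-to-name table walk. This is an illustration only, not the code in tools/perf/trace/beauty: beautify_mask() and mask_names are hypothetical names, and the bit values are local copies of the STATX_* mask bits from the uapi linux/stat.h header, with the STATX_ prefix dropped as in the output above.

```c
#include <stdio.h>
#include <string.h>

/* Local copies of the STATX_* mask bits, named without the prefix. */
static const struct {
	unsigned int bit;
	const char *name;
} mask_names[] = {
	{ 0x001, "TYPE"  }, { 0x002, "MODE"   }, { 0x004, "NLINK" },
	{ 0x008, "UID"   }, { 0x010, "GID"    }, { 0x020, "ATIME" },
	{ 0x040, "MTIME" }, { 0x080, "CTIME"  }, { 0x100, "INO"   },
	{ 0x200, "SIZE"  }, { 0x400, "BLOCKS" }, { 0x800, "BTIME" },
};

/*
 * Format 'mask' as "A|B|C" into 'buf'. Bits without a known name are
 * appended in hex so that nothing is silently dropped. Returns 'buf'.
 */
static char *beautify_mask(unsigned int mask, char *buf, size_t size)
{
	size_t i, len = 0;

	if (size == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(mask_names) / sizeof(mask_names[0]); i++) {
		if (!(mask & mask_names[i].bit))
			continue;
		len += snprintf(buf + len, size - len, "%s%s",
				len ? "|" : "", mask_names[i].name);
		if (len >= size)	/* output was truncated */
			return buf;
		mask &= ~mask_names[i].bit;
	}

	if (mask)
		snprintf(buf + len, size - len, "%s%#x", len ? "|" : "", mask);
	return buf;
}
```

Printing leftover bits in hex is a design choice of this sketch, so that a flag newer than the table still shows up in the output.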
| * | | | perf tools: Do not fail in case of empty HOME env variable (Jiri Olsa, 2017-03-31)

    Currently we fail in the following case:

      $ unset HOME
      $ ./perf record ls
      $ echo $?
      255

    It's because the config code init fails due to a missing HOME variable
    value. Fix this by skipping the user config init if there's no HOME
    variable value.

    Reported-by: Jan Stancek <jstancek@redhat.com>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/20170330144637.7468-1-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
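The "skip instead of fail" behaviour described in the commit above might look roughly like the sketch below. It is not the perf config code: user_config_path() is a hypothetical helper, and the ".perfconfig" file name is assumed here as the per-user config file.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build "<home>/.perfconfig" into 'buf'.
 *
 * Returns  0 when the path was built,
 *          1 when there is no usable HOME value (caller should skip the
 *            user config and carry on, this is not an error),
 *         -1 when the path does not fit in 'buf'.
 */
static int user_config_path(const char *home, char *buf, size_t size)
{
	int n;

	/* Unset or empty HOME: nothing to read, but nothing to fail on. */
	if (!home || !*home)
		return 1;

	n = snprintf(buf, size, "%s/.perfconfig", home);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

A caller would pass getenv("HOME") as 'home' and only treat a negative return value as a failure.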
| * | | | tools include uapi: Grab copies of stat.h and fcntl.h (Arnaldo Carvalho de Melo, 2017-03-31)

    We will need it to build tools/perf/trace/beauty/statx.h.

    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/n/tip-nin41ve2fa63lrfbdr6x57yr@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | perf utils: Fix spelling mistake: "Invalud" -> "Invalid" (Colin Ian King, 2017-03-30)

    Trivial fix to spelling mistake in pr_debug message.

    Signed-off-by: Colin King <colin.king@canonical.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Krister Johansen <kjlx@templeofstupid.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: kernel-janitors@vger.kernel.org
    Link: http://lkml.kernel.org/r/20170330095440.19444-1-colin.king@canonical.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | perf trace: Handle unpaired raw_syscalls:sys_exit event (Arnaldo Carvalho de Melo, 2017-03-29)

    Which may happen when we start a tracing session and a thread is waiting
    for something like "poll" to return, in which case we better print "?"
    both for the syscall entry timestamp and for the duration.

    E.g.:

    Tracing existing mutt session:

      # perf trace -p `pidof mutt`
               ? (     ?   ): mutt/17135  ... [continued]: poll()) = 1
           0.027 ( 0.013 ms): mutt/17135 read(buf: 0x7ffcb3c42cef, count: 1) = 1
           0.047 ( 0.008 ms): mutt/17135 poll(ufds: 0x7ffcb3c42c50, nfds: 1, timeout_msecs: 1000) = 1
           0.059 ( 0.008 ms): mutt/17135 read(buf: 0x7ffcb3c42cef, count: 1) = 1
      <SNIP>

    Before it would print a large number because we'd do:

      ttrace->entry_time - trace->base_time

    And entry_time would be 0, while base_time would be the timestamp for
    the first event 'perf trace' reads, oops.

    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Luis Claudio Gonçalves <lclaudio@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/n/tip-wbcb93ofva2qdjd5ltn5eeqq@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
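The unpaired-exit handling described in the commit above comes down to not subtracting from an entry time that was never recorded. A minimal sketch, assuming nanosecond timestamps and an entry_time of 0 meaning "no sys_enter seen"; format_exit_prefix() is a hypothetical helper and its output format is simplified compared to the real 'perf trace' columns.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format the "<timestamp> (<duration> ms)" prefix of a syscall exit line.
 * All times are in nanoseconds. entry_time == 0 means the matching
 * sys_enter was never seen, e.g. the thread was already blocked in the
 * syscall when tracing started.
 */
static int format_exit_prefix(char *buf, size_t size, uint64_t base_time,
			      uint64_t entry_time, uint64_t exit_time)
{
	/*
	 * Without this check, 'entry_time - base_time' would wrap around in
	 * unsigned arithmetic and print as a huge timestamp.
	 */
	if (entry_time == 0)
		return snprintf(buf, size, "? (?)");

	return snprintf(buf, size, "%.3f (%.3f ms)",
			(double)(entry_time - base_time) / 1e6,
			(double)(exit_time - entry_time) / 1e6);
}
```

For a base time of 1000000 ns, an entry at 1027000 ns and an exit at 1040000 ns, this produces a 0.027 ms timestamp and a 0.013 ms duration.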
| * | | | perf report: Drop cycles 0 for LBR print  Jin Yao  2017-03-28
| | | | |
| | | | | Some platforms, for example Broadwell, do not support cycles for LBR,
| | | | | yet perf always prints cycles:0, which is unnecessary.
| | | | |
| | | | | The patch refactors the LBR info print code and drops the cycles:0.
| | | | |
| | | | | For example:
| | | | |
| | | | |   perf report --branch-history --no-children --stdio
| | | | |
| | | | | On Broadwell:
| | | | |
| | | | |   --0.91%--__random_r random_r.c:394 (iterations:2)
| | | | |   __random_r random_r.c:360 (predicted:0.0%)
| | | | |   __random_r random_r.c:380 (predicted:0.0%)
| | | | |   __random_r random_r.c:357
| | | | |
| | | | | On Skylake:
| | | | |
| | | | |   --1.07%--main div.c:39 (predicted:52.4% cycles:1 iterations:17)
| | | | |   main div.c:44 (predicted:52.4% cycles:1)
| | | | |   main div.c:42 (cycles:2)
| | | | |   compute_flag div.c:28 (cycles:2)
| | | | |   compute_flag div.c:27 (cycles:1)
| | | | |   rand rand.c:28 (cycles:1)
| | | | |   rand rand.c:28 (cycles:1)
| | | | |   __random random.c:298 (cycles:1)
| | | | |   __random random.c:297 (cycles:1)
| | | | |   __random random.c:295 (cycles:1)
| | | | |   __random random.c:295 (cycles:1)
| | | | |   __random random.c:295 (cycles:1)
| | | | |
| | | | | Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
| | | | | Reviewed-by: Andi Kleen <ak@linux.intel.com>
| | | | | Cc: Jiri Olsa <jolsa@kernel.org>
| | | | | Cc: Kan Liang <kan.liang@intel.com>
| | | | | Link: http://lkml.kernel.org/r/1489046786-10061-1-git-send-email-yao.jin@linux.intel.com
| | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | perf/sdt/x86: Move OP parser to tools/perf/arch/x86/  Ravi Bangoria  2017-03-28
| | | | |
| | | | | SDT marker argument is in N@OP format. N is the size of the argument
| | | | | and OP is the actual assembly operand. OP is an arch-dependent
| | | | | component, and hence its parsing logic should also be placed under
| | | | | tools/perf/arch/.
| | | | |
| | | | | Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
| | | | | Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
| | | | | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | | | | Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
| | | | | Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
| | | | | Cc: Michael Ellerman <mpe@ellerman.id.au>
| | | | | Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
| | | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | | Link: http://lkml.kernel.org/r/20170328094754.3156-3-ravi.bangoria@linux.vnet.ibm.com
| | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
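To make the N@OP format concrete, here is a minimal sketch of splitting one argument into its size and operand parts. It is not the parser moved by this commit: `struct sdt_arg` and `sdt_arg_split()` are hypothetical names, and the sketch deliberately stops at the arch-independent split, leaving OP as an opaque string for arch code to interpret.

```c
#include <stdlib.h>

/* Hypothetical container for one SDT argument in N@OP form. */
struct sdt_arg {
	int size;	/* N; may be negative, as in "-8@%r14" */
	const char *op;	/* OP; points into the caller's string */
};

/*
 * Split "N@OP" into its two parts. Returns 0 on success, or -1 if the
 * string has no leading number, no '@', or an empty operand.
 */
static int sdt_arg_split(const char *arg, struct sdt_arg *out)
{
	char *end;
	long n = strtol(arg, &end, 10);

	if (end == arg || *end != '@' || end[1] == '\0')
		return -1;

	out->size = (int)n;
	out->op = end + 1;
	return 0;
}
```

For "-8@%r14" this yields a size of -8 and the operand "%r14"; a bare "%rbx" is rejected because it has no size prefix.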
| * | | | perf/sdt/x86: Add renaming logic for (missing) 8 bit registers  Ravi Bangoria  2017-03-28
| | | | |
| | | | | I found a couple of events using the al, bl, cl and dl registers for
| | | | | arguments. These are not directly accepted by uprobe_events and thus
| | | | | need to be mapped to ax, bx, cx and dx respectively.
| | | | |
| | | | | A few examples:
| | | | |
| | | | |   /usr/bin/qemu-system-s390x
| | | | |     css_adapter_interrupt: 1@%bl
| | | | |     css_chpid_add: 1@%cl 1@%sil 1@%dl
| | | | |     dma_bdrv_io: 8@%rbx 8@%rbp -8@%r14 1@%al
| | | | |
| | | | |   /usr/bin/postgres
| | | | |     buffer__read__done: ... -1@-bash -1@%al
| | | | |     buffer__read__start: ... -1@%al
| | | | |
| | | | | I don't find any sdt events using ah, bh,... registers. But I also
| | | | | don't see any reason to not use them, so there might be rare events
| | | | | using these registers, and if so, perf should have a renaming logic
| | | | | for them too.
| | | | |
| | | | | Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
| | | | | Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
| | | | | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | | | | Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
| | | | | Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
| | | | | Cc: Michael Ellerman <mpe@ellerman.id.au>
| | | | | Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
| | | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | | Link: http://lkml.kernel.org/r/20170328094754.3156-2-ravi.bangoria@linux.vnet.ibm.com
| | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
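The renaming described above amounts to a small lookup table. The following sketch shows the idea only; the table layout and the name `sdt_rename_register()` are assumptions made for illustration, and it covers just the four mappings the message states (al, bl, cl, dl to ax, bx, cx, dx).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: SDT register name and its uprobe_events name. */
struct reg_rename {
	const char *sdt_name;
	const char *uprobe_name;
};

/* Only the four mappings named in the commit message are listed. */
static const struct reg_rename reg_renames[] = {
	{ "%al", "%ax" },
	{ "%bl", "%bx" },
	{ "%cl", "%cx" },
	{ "%dl", "%dx" },
};

/* Return the renamed register, or the input unchanged if no rule applies. */
static const char *sdt_rename_register(const char *reg)
{
	size_t i;

	for (i = 0; i < sizeof(reg_renames) / sizeof(reg_renames[0]); i++) {
		if (strcmp(reg, reg_renames[i].sdt_name) == 0)
			return reg_renames[i].uprobe_name;
	}
	return reg;
}
```

With this table, "%bl" from the css_adapter_interrupt example becomes "%bx", while "%rbx" passes through untouched.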
| * | | | perf tools: Remove support for command aliases  Arnaldo Carvalho de Melo  2017-03-28
| |/ / /
| | | | This came from 'git', but isn't documented anywhere in
| | | | tools/perf/Documentation/, looks like baggage we can do without,
| | | | ditch it.
| | | |
| | | | Cc: Adrian Hunter <adrian.hunter@intel.com>
| | | | Cc: David Ahern <dsahern@gmail.com>
| | | | Cc: Jiri Olsa <jolsa@kernel.org>
| | | | Cc: Namhyung Kim <namhyung@kernel.org>
| | | | Cc: Wang Nan <wangnan0@huawei.com>
| | | | Link: http://lkml.kernel.org/n/tip-e7uwkn60t4hmlnwj99ba4t2s@git.kernel.org
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | Merge branch 'linus' into perf/core, to pick up fixes  Ingo Molnar  2017-03-30
|\ \ \ \
| |/ / /
|/| / /
| |/ / Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  Linus Torvalds  2017-03-23
| |\ \
| | |/
| |/|
| | | Pull networking fixes from David Miller:
| | |
| | |  1) Several netfilter fixes from Pablo and the crew:
| | |      - Handle fragmented packets properly in netfilter conntrack,
| | |        from Florian Westphal.
| | |      - Fix SCTP ICMP packet handling, from Ying Xue.
| | |      - Fix big-endian bug in nftables, from Liping Zhang.
| | |      - Fix alignment of fake conntrack entry, from Steven Rostedt.
| | |  2) Fix feature flags setting in fjes driver, from Taku Izumi.
| | |  3) Openvswitch ipv6 tunnel source address not set properly, from Or
| | |     Gerlitz.
| | |  4) Fix jumbo MTU handling in amd-xgbe driver, from Thomas Lendacky.
| | |  5) sk->sk_frag.page not released properly in some cases, from Eric
| | |     Dumazet.
| | |  6) Fix RTNL deadlocks in nl80211, from Johannes Berg.
| | |  7) Fix erroneous RTNL lockdep splat in crypto, from Herbert Xu.
| | |  8) Cure improper inflight handling during AF_UNIX GC, from Andrey
| | |     Ulanov.
| | |  9) sch_dsmark doesn't write to packet headers properly, from Eric
| | |     Dumazet.
| | | 10) Fix SCM_TIMESTAMPING_OPT_STATS handling in TCP, from Soheil Hassas
| | |     Yeganeh.
| | | 11) Add some IDs for Motorola qmi_wwan chips, from Tony Lindgren.
| | | 12) Fix nametbl deadlock in tipc, from Ying Xue.
| | | 13) GRO and LRO packets not counted correctly in mlx5 driver, from Gal
| | |     Pressman.
| | | 14) Fix reset of internal PHYs in bcmgenet, from Doug Berger.
| | | 15) Fix hashmap allocation handling, from Alexei Starovoitov.
| | | 16) nl_fib_input() needs stronger netlink message length checking,
| | |     from Eric Dumazet.
| | | 17) Fix double-free of sk->sk_filter during sock clone, from Daniel
| | |     Borkmann.
| | | 18) Fix RX checksum offloading in aquantia driver, from Pavel Belous.
| | |
| | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits)
| | |   net:ethernet:aquantia: Fix for RX checksum offload.
| | |   amd-xgbe: Fix the ECC-related bit position definitions
| | |   sfc: cleanup a condition in efx_udp_tunnel_del()
| | |   Bluetooth: btqcomsmd: fix compile-test dependency
| | |   inet: frag: release spinlock before calling icmp_send()
| | |   tcp: initialize icsk_ack.lrcvtime at session start time
| | |   genetlink: fix counting regression on ctrl_dumpfamily()
| | |   socket, bpf: fix sk_filter use after free in sk_clone_lock
| | |   ipv4: provide stronger user input validation in nl_fib_input()
| | |   bpf: fix hashmap extra_elems logic
| | |   enic: update enic maintainers
| | |   net: bcmgenet: remove bcmgenet_internal_phy_setup()
| | |   ipv6: make sure to initialize sockc.tsflags before first use
| | |   fjes: Do not load fjes driver if extended socket device is not power on.
| | |   fjes: Do not load fjes driver if system does not have extended socket device.
| | |   net/mlx5e: Count LRO packets correctly
| | |   net/mlx5e: Count GSO packets correctly
| | |   net/mlx5: Increase number of max QPs in default profile
| | |   net/mlx5e: Avoid supporting udp tunnel port ndo for VF reps
| | |   net/mlx5e: Use the proper UAPI values when offloading TC vlan actions
| | |   ...
| | * bpf: fix hashmap extra_elems logic  Alexei Starovoitov  2017-03-22
| | |
| | | In both kmalloc and prealloc mode the bpf_map_update_elem() is using
| | | per-cpu extra_elems to do atomic update when the map is full.
| | | There are two issues with it. The logic can be misused, since it allows
| | | max_entries+num_cpus elements to be present in the map. And
| | | alloc_extra_elems() at map creation time can fail percpu alloc for
| | | large map values with a warn:
| | |
| | |   WARNING: CPU: 3 PID: 2752 at ../mm/percpu.c:892 pcpu_alloc+0x119/0xa60
| | |   illegal size (32824) or align (8) for percpu allocation
| | |
| | | The fixes for both of these issues are different for kmalloc and
| | | prealloc modes. For prealloc mode allocate extra num_possible_cpus
| | | elements and store their pointers into extra_elems array instead of
| | | actual elements. Hence we can use these hidden(spare) elements not only
| | | when the map is full but during bpf_map_update_elem() that replaces
| | | existing element too. That also improves performance, since
| | | pcpu_freelist_pop/push is avoided.
| | |
| | | Unfortunately this approach cannot be used for kmalloc mode which needs
| | | to kfree elements after rcu grace period. Therefore switch it back to
| | | normal kmalloc even when full and old element exists like it was prior
| | | to commit 6c9059817432 ("bpf: pre-allocate hash map elements").
| | |
| | | Add tests to check for over max_entries and large map values.
| | |
| | | Reported-by: Dave Jones <davej@codemonkey.org.uk>
| | | Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
| | | Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | | Acked-by: Daniel Borkmann <daniel@iogearbox.net>
| | | Acked-by: Martin KaFai Lau <kafai@fb.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * selftests/bpf: fix broken build, take 2  Zi Shen Lim  2017-03-21
| | |
| | | Merge of 'linux-kselftest-4.11-rc1':
| | |
| | | 1. Partially removed use of 'test_objs' target, breaking force rebuild
| | |    of BPFOBJ, introduced in commit d498f8719a09 ("bpf: Rebuild bpf.o
| | |    for any dependency update").
| | |
| | |    Update target so dependency on BPFOBJ is restored.
| | |
| | | 2. Introduced commit 2047f1d8ba28 ("selftests: Fix the .c linking
| | |    rule") which fixes order of LDLIBS.
| | |
| | |    Commit d02d8986a768 ("bpf: Always test unprivileged programs") added
| | |    libcap dependency into CFLAGS. Use LDLIBS instead to fix linking of
| | |    test_verifier.
| | |
| | | 3. Introduced commit d83c3ba0b926 ("selftests: Fix selftests build to
| | |    just build, not run tests").
| | |
| | |    Reordering the Makefile allows us to remove the 'all' target.
| | |
| | | Tested both:
| | |   selftests/bpf$ make
| | | and
| | |   selftests$ make TARGETS=bpf
| | | on Ubuntu 16.04.2.
| | |
| | | Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com>
| | | Acked-by: Daniel Borkmann <daniel@iogearbox.net>
| | | Tested-by: Daniel Borkmann <daniel@iogearbox.net>
| | | Acked-by: Alexei Starovoitov <ast@kernel.org>
| | | Tested-by: Alexei Starovoitov <ast@kernel.org>
| | | Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  Linus Torvalds  2017-03-17
| |\ \
| | |/
| |/|
| | | Pull perf fixes from Thomas Gleixner:
| | |  "A set of perf related fixes:
| | |
| | |    - fix a CR4.PCE propagation issue caused by usage of mm instead of
| | |      active_mm, which therefore propagated the wrong value.
| | |
| | |    - perf core fixes, which plug a use-after-free issue and make the
| | |      event inheritance on fork more robust.
| | |
| | |    - a tooling fix for symbol handling"
| | |
| | | * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
| | |   perf symbols: Fix symbols__fixup_end heuristic for corner cases
| | |   x86/perf: Clarify why x86_pmu_event_mapped() isn't racy
| | |   x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm
| | |   perf/core: Better explain the inherit magic
| | |   perf/core: Simplify perf_event_free_task()
| | |   perf/core: Fix event inheritance on fork()
| | |   perf/core: Fix use-after-free in perf_release()
* | | Merge tag 'perf-core-for-mingo-4.12-20170327' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core  Ingo Molnar  2017-03-28
|\ \ \
| | | | Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
| | | |
| | | | New features:
| | | |
| | | | - Handle inline functions in callchains (Jin Yao)
| | | |
| | | | - Enable sorting by srcline as key (Milian Wolff)
| | | |
| | | | Fixes:
| | | |
| | | | - Fix no_size logic in addr_filter__resolve_kernel_syms() in the
| | | |   auxtrace code (Adrian Hunter)
| | | |
| | | | - Fix some thread refcount leaks in 'perf trace' (Arnaldo Carvalho de Melo)
| | | |
| | | | - Fix divide by zero when calculating percent for an event in a group
| | | |   in the annotate by source line code (Taeung Song)
| | | |
| | | | - build-id files are no longer symlinks, their parent directories are,
| | | |   so readlink the latter (Taeung Song)
| | | |
| | | | - Assorted fixes for null termination problems, mostly related to
| | | |   readlink, detected by valgrind (Tommi Rantala)
| | | |
| | | | Infrastructure changes:
| | | |
| | | | - Make vfs_getname probe point logic in 'perf trace' more robust wrt
| | | |   length of pathname (Arnaldo Carvalho de Melo)
| | | |
| | | | - Remove unused 'prefix' parameter from builtins main functions
| | | |   (Arnaldo Carvalho de Melo)
| | | |
| | | | - Show 'perf list sdt' option in man page (Ravi Bangoria)
| | | |
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | | Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | perf utils: Readlink /proc/self/exe to find the perf binary  Tommi Rantala  2017-03-27
| | | |
| | | | Simplification: it is easier to open /proc/self/exe than /proc/$pid/exe.
| | | |
| | | | Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
| | | | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | Link: http://lkml.kernel.org/r/20170322130624.21881-7-tommi.t.rantala@nokia.com
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf utils: Null terminate buf in read_ftrace_printk()  Tommi Rantala  2017-03-27
| | | |
| | | | Ensure that the string that we read from the data file is null terminated.
| | | |
| | | | Valgrind was complaining:
| | | |
| | | |   ==31357== Invalid read of size 1
| | | |   ==31357==    at 0x4EC8C1: __strtok_r_1c (string2.h:200)
| | | |   ==31357==    by 0x4EC8C1: parse_ftrace_printk (trace-event-parse.c:161)
| | | |   ==31357==    by 0x4F82A8: read_ftrace_printk (trace-event-read.c:204)
| | | |   ==31357==    by 0x4F82A8: trace_report (trace-event-read.c:468)
| | | |   ==31357==    by 0x4CD552: process_tracing_data (header.c:1576)
| | | |   ==31357==    by 0x4D3397: perf_file_section__process (header.c:2705)
| | | |   ==31357==    by 0x4D3397: perf_header__process_sections (header.c:2488)
| | | |   ==31357==    by 0x4D3397: perf_session__read_header (header.c:2925)
| | | |   ==31357==    by 0x4E71E2: perf_session__open (session.c:32)
| | | |   ==31357==    by 0x4E71E2: perf_session__new (session.c:139)
| | | |   ==31357==    by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
| | | |   ==31357==    by 0x497150: run_builtin (perf.c:359)
| | | |   ==31357==    by 0x428CE0: handle_internal_command (perf.c:421)
| | | |   ==31357==    by 0x428CE0: run_argv (perf.c:467)
| | | |   ==31357==    by 0x428CE0: main (perf.c:614)
| | | |   ==31357==  Address 0x8ac0efb is 0 bytes after a block of size 1,963 alloc'd
| | | |   ==31357==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
| | | |   ==31357==    by 0x4F827B: read_ftrace_printk (trace-event-read.c:195)
| | | |   ==31357==    by 0x4F827B: trace_report (trace-event-read.c:468)
| | | |   ==31357==    by 0x4CD552: process_tracing_data (header.c:1576)
| | | |   ==31357==    by 0x4D3397: perf_file_section__process (header.c:2705)
| | | |   ==31357==    by 0x4D3397: perf_header__process_sections (header.c:2488)
| | | |   ==31357==    by 0x4D3397: perf_session__read_header (header.c:2925)
| | | |   ==31357==    by 0x4E71E2: perf_session__open (session.c:32)
| | | |   ==31357==    by 0x4E71E2: perf_session__new (session.c:139)
| | | |   ==31357==    by 0x429F5D: cmd_annotate (builtin-annotate.c:472)
| | | |   ==31357==    by 0x497150: run_builtin (perf.c:359)
| | | |   ==31357==    by 0x428CE0: handle_internal_command (perf.c:421)
| | | |   ==31357==    by 0x428CE0: run_argv (perf.c:467)
| | | |   ==31357==    by 0x428CE0: main (perf.c:614)
| | | |
| | | | Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
| | | | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | Link: http://lkml.kernel.org/r/20170322130624.21881-6-tommi.t.rantala@nokia.com
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf utils: use sizeof(buf) - 1 in readlink() call  Tommi Rantala  2017-03-27
| | | |
| | | | Ensure that we have space for the null byte in buf.
| | | |
| | | | Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
| | | | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | Link: http://lkml.kernel.org/r/20170322130624.21881-5-tommi.t.rantala@nokia.com
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf tests: Do not assume that readlink() returns a null terminated string  Tommi Rantala  2017-03-27
| | | |
| | | | Ensure that the string in buf is null terminated.
| | | |
| | | | Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
| | | | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | Link: http://lkml.kernel.org/r/20170322130624.21881-4-tommi.t.rantala@nokia.com
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf buildid: Do not assume that readlink() returns a null terminated string  Tommi Rantala  2017-03-27
| | | |
| | | | Valgrind was complaining:
| | | |
| | | |   $ valgrind ./perf list >/dev/null
| | | |   ==11643== Memcheck, a memory error detector
| | | |   ==11643== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
| | | |   ==11643== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
| | | |   ==11643== Command: ./perf list
| | | |   ==11643==
| | | |   ==11643== Conditional jump or move depends on uninitialised value(s)
| | | |   ==11643==    at 0x4C30620: rindex (vg_replace_strmem.c:199)
| | | |   ==11643==    by 0x49DAA9: build_id_cache__origname (build-id.c:198)
| | | |   ==11643==    by 0x49E1C7: build_id_cache__valid_id (build-id.c:222)
| | | |   ==11643==    by 0x49E1C7: build_id_cache__list_all (build-id.c:507)
| | | |   ==11643==    by 0x4B9C8F: print_sdt_events (parse-events.c:2067)
| | | |   ==11643==    by 0x4BB0B3: print_events (parse-events.c:2313)
| | | |   ==11643==    by 0x439501: cmd_list (builtin-list.c:53)
| | | |   ==11643==    by 0x497150: run_builtin (perf.c:359)
| | | |   ==11643==    by 0x428CE0: handle_internal_command (perf.c:421)
| | | |   ==11643==    by 0x428CE0: run_argv (perf.c:467)
| | | |   ==11643==    by 0x428CE0: main (perf.c:614)
| | | |   [...]
| | | |
| | | | Additionally, a zero length result from readlink() is not very interesting.
| | | |
| | | | Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
| | | | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | Link: http://lkml.kernel.org/r/20170322130624.21881-3-tommi.t.rantala@nokia.com
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
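The readlink() commits in this series share one pattern: readlink() does not append a terminating null byte, so the caller must reserve one byte and terminate the buffer itself. The sketch below shows that pattern in isolation. The wrapper name `readlink_str()` is hypothetical and does not exist in perf; treating a zero-length result as a failure is a choice made here, following the remark that such a result "is not very interesting".

```c
#define _POSIX_C_SOURCE 200809L	/* for readlink() under strict -std= modes */

#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: read a symlink target into buf as a C string.
 *
 * readlink() never writes a terminating '\0', so only size - 1 bytes are
 * offered to it and the terminator is added by hand. Errors and empty
 * results are both reported as -1.
 */
static ssize_t readlink_str(const char *path, char *buf, size_t size)
{
	ssize_t len;

	if (size == 0)
		return -1;

	len = readlink(path, buf, size - 1);
	if (len <= 0)
		return -1;

	buf[len] = '\0';
	return len;
}
```

A caller can then pass the buffer straight to string functions such as strrchr() without tripping over uninitialised bytes; a target longer than the buffer is silently truncated, as with plain readlink().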
| * | | perf buildid: Do not update SDT cache with null filename  Tommi Rantala  2017-03-27
| | | |
| | | | Valgrind was complaining:
| | | |
| | | |   ==2633== Syscall param open(filename) points to unaddressable byte(s)
| | | |   ==2633==    at 0x5281CC0: __open_nocancel (syscall-template.S:84)
| | | |   ==2633==    by 0x537D38: open (fcntl2.h:53)
| | | |   ==2633==    by 0x537D38: get_sdt_note_list (symbol-elf.c:2017)
| | | |   ==2633==    by 0x5396FD: probe_cache__scan_sdt (probe-file.c:700)
| | | |   ==2633==    by 0x49EA2C: build_id_cache__add_sdt_cache (build-id.c:625)
| | | |   ==2633==    by 0x49EA2C: build_id_cache__add_s (build-id.c:697)
| | | |   ==2633==    by 0x49EE72: build_id_cache__add_b (build-id.c:717)
| | | |   ==2633==    by 0x49EE72: dso__cache_build_id (build-id.c:782)
| | | |   ==2633==    by 0x49F190: __dsos__cache_build_ids (build-id.c:793)
| | | |   ==2633==    by 0x49F190: machine__cache_build_ids (build-id.c:801)
| | | |   ==2633==    by 0x49F190: perf_session__cache_build_ids (build-id.c:815)
| | | |   ==2633==    by 0x4CD4F2: write_build_id (header.c:165)
| | | |   ==2633==    by 0x4D26F7: do_write_feat (header.c:2296)
| | | |   ==2633==    by 0x4D26F7: perf_header__adds_write (header.c:2335)
| | | |   ==2633==    by 0x4D26F7: perf_session__write_header (header.c:2414)
| | | |   ==2633==    by 0x43B324: __cmd_record (builtin-record.c:1154)
| | | |   ==2633==    by 0x43B324: cmd_record (builtin-record.c:1839)
| | | |   ==2633==    by 0x455A07: __cmd_record (builtin-kmem.c:1868)
| | | |   ==2633==    by 0x455A07: cmd_kmem (builtin-kmem.c:1944)
| | | |   ==2633==    by 0x497150: run_builtin (perf.c:359)
| | | |   ==2633==    by 0x428CE0: handle_internal_command (perf.c:421)
| | | |   ==2633==    by 0x428CE0: run_argv (perf.c:467)
| | | |   ==2633==    by 0x428CE0: main (perf.c:614)
| | | |   ==2633==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
| | | |
| | | | Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
| | | | Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
| | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
| | | | Link: http://lkml.kernel.org/r/20170322130624.21881-2-tommi.t.rantala@nokia.com
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf annotate: Fix a bug of division by zero when calculating percent  Taeung Song  2017-03-27
| | | |
| | | | Currently perf-annotate with --print-line can print
| | | | -nan(0x8000000000000) because of division by zero when calculating
| | | | percent.
| | | |
| | | | The division by zero happens when a sum of samples is zero in
| | | | symbol__get_source_line(), so fix it.
| | | |
| | | | For example:
| | | |
| | | | After running 'perf record' like below,
| | | |
| | | |   $ perf record -e "{cycles,page-faults,branch-misses}" ./a.out
| | | |
| | | | Before:
| | | |
| | | |   $ perf annotate --stdio -l
| | | |
| | | |   Sorted summary for file /home/taeung/workspace/a.out
| | | |   ----------------------------------------------
| | | |
| | | |   32.89 -nan 7.04 a.c:38
| | | |   25.14 -nan 0.00 a.c:34
| | | |   16.26 -nan 56.34 a.c:31
| | | |   15.88 -nan 1.41 a.c:37
| | | |   5.67 -nan 0.00 a.c:39
| | | |   1.13 -nan 35.21 a.c:26
| | | |   0.95 -nan 0.00 a.c:44
| | | |   0.57 -nan 0.00 a.c:32
| | | |   Percent | Source code & Disassembly of a.out for cycles (529 samples)
| | | |   -----------------------------------------------------------------------------------------
| | | |   :
| | | |   ...
| | | |   a.c:26 0.57 -nan 4.23 : 40081a: mov %edi,-0x24(%rbp)
| | | |   a.c:26 0.00 -nan 9.86 : 40081d: mov %rsi,-0x30(%rbp)
| | | |   ...
| | | |
| | | | However, if a sum of samples is zero (e.g. 'page-faults'), skip
| | | | calculating percent.
| | | |
| | | | After:
| | | |
| | | |   $ perf annotate --stdio -l
| | | |
| | | |   Sorted summary for file /home/taeung/workspace/a.out
| | | |   ----------------------------------------------
| | | |
| | | |   32.89 0.00 7.04 a.c:38
| | | |   25.14 0.00 0.00 a.c:34
| | | |   16.26 0.00 56.34 a.c:31
| | | |   15.88 0.00 1.41 a.c:37
| | | |   5.67 0.00 0.00 a.c:39
| | | |   1.13 0.00 35.21 a.c:26
| | | |   0.95 0.00 0.00 a.c:44
| | | |   0.57 0.00 0.00 a.c:32
| | | |   Percent | Source code & Disassembly of old for cycles (529 samples)
| | | |   -----------------------------------------------------------------------------------------
| | | |   :
| | | |   ...
| | | |   a.c:26 0.57 0.00 4.23 : 40081a: mov %edi,-0x24(%rbp)
| | | |   a.c:26 0.00 0.00 9.86 : 40081d: mov %rsi,-0x30(%rbp)
| | | |   ...
| | | |
| | | | Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
| | | | Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | | Cc: Jiri Olsa <jolsa@redhat.com>
| | | | Cc: Masami Hiramatsu <mhiramat@kernel.org>
| | | | Cc: Namhyung Kim <namhyung@kernel.org>
| | | | Cc: Peter Zijlstra <peterz@infradead.org>
| | | | Cc: Wang Nan <wangnan0@huawei.com>
| | | | Link: http://lkml.kernel.org/r/1490598638-13947-3-git-send-email-treeze.taeung@gmail.com
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
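The guard described in this message can be illustrated with a tiny helper. This is a sketch under assumptions rather than the code in symbol__get_source_line(): the name `percent_of()` and its parameters are made up for the example. In floating point, 0.0 / 0.0 yields NaN instead of trapping, which is consistent with the "-nan" column in the "Before" output for an event with no samples.

```c
/*
 * Hypothetical helper: percentage of nr_samples relative to sum.
 *
 * When the event has no samples at all (sum == 0), the division is
 * skipped and 0.0 is returned, so a group member such as 'page-faults'
 * prints 0.00 rather than -nan.
 */
static double percent_of(unsigned long long nr_samples, unsigned long long sum)
{
	if (sum == 0)
		return 0.0;

	return 100.0 * (double)nr_samples / (double)sum;
}
```

Here `percent_of(0, 0)` returns 0.0 and `percent_of(1, 4)` returns 25.0.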