author | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 14:15:14 -0500 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 14:15:14 -0500 |
commit | d8b91dde38f4c43bd0bbbf17a90f735b16aaff2c (patch) | |
tree | bd72dabf6e4b23e060fce429c87e60504f69de54 /tools/perf/util/pmu.c | |
parent | 5e7481a25e90b661d1dbbba18be3fd3dfe12ec6f (diff) | |
parent | e4c1091cb495d9cbec8956d642644a71a1689958 (diff) |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Clean up the x86 instruction decoder (Masami Hiramatsu)
- Add new uprobes optimization for PUSH instructions on x86 (Yonghong
Song)
- Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian)
- Fix misc bugs, update documentation, plus various cleanups (Jiri
Olsa)
There's a large number of tooling side improvements:
- Intel-PT/BTS improvements (Adrian Hunter)
- Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo)
- Introduce an errno code to string facility (Hendrik Brueckner)
- Various build system improvements (Jiri Olsa)
- Add support for CoreSight trace decoding by making the perf tools
use the external openCSD (Mathieu Poirier, Tor Jeremiassen)
- Add ARM Statistical Profiling Extensions (SPE) support (Kim
Phillips)
- libtraceevent updates (Steven Rostedt)
- Intel vendor event JSON updates (Andi Kleen)
- Introduce 'perf report --mmaps' and 'perf report --tasks' to show
info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)
- Add infrastructure to record the first and last sample times in the
perf.data file header. They are recorded when all samples in a
'perf record' session are processed anyway, such as when doing
build-id processing, or when that info is specifically requested.
'perf report --time' uses them, and it also got support for percent
slices in addition to absolute ones.
I.e. now it is possible to ask for the samples in the 10%-20% time
slice of a perf.data file (Jin Yao)
- Allow system wide 'perf stat --per-thread', sorting the result (Jin
Yao)
E.g.:
[root@jouet ~]# perf stat --per-thread --metrics IPC
^C
Performance counter stats for 'system wide':
make-22229 23,012,094,032 inst_retired.any # 0.8 IPC
cc1-22419 692,027,497 inst_retired.any # 0.8 IPC
gcc-22418 328,231,855 inst_retired.any # 0.9 IPC
cc1-22509 220,853,647 inst_retired.any # 0.8 IPC
gcc-22486 199,874,810 inst_retired.any # 1.0 IPC
as-22466 177,896,365 inst_retired.any # 0.9 IPC
cc1-22465 150,732,374 inst_retired.any # 0.8 IPC
gcc-22508 112,555,593 inst_retired.any # 0.9 IPC
cc1-22487 108,964,079 inst_retired.any # 0.7 IPC
qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC
systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC
docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC
dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC
make-22153 12,504,194,380 inst_retired.any # 0.8 IPC
python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC
<SNIP>
python2-22429 15,026,328,103 cpu_clk_unhalted.thread
cc1-22419 826,660,193 cpu_clk_unhalted.thread
gcc-22418 365,321,295 cpu_clk_unhalted.thread
cc1-22509 279,169,362 cpu_clk_unhalted.thread
gcc-22486 210,156,950 cpu_clk_unhalted.thread
<SNIP>
5.638075538 seconds time elapsed
[root@jouet ~]#
- Improve shell auto-completion of perf events (Jin Yao)
- 'perf probe' improvements (Masami Hiramatsu)
- Improve PMU infrastructure to support arm64's ThunderX2
implementation defined core events (Ganapatrao Kulkarni)
- Various annotation related improvements and fixes (Thomas Richter)
- Clarify usage of 'overwrite' and 'backward' in the evlist/mmap
code, removing the 'overwrite' parameter from several functions as
it was always used as 'false' (Wang Nan)
- Fix/improve 'perf record' reverse recording support (Wang Nan)
- Improve command line options documentation (Sihyeon Jang)
- Optimize sample parsing for ordering events, where we don't need to
parse all the PERF_SAMPLE_ bits, just the ones leading to the
timestamp needed to reorder events (Jiri Olsa)
- Generalize the annotation code to support other source information
besides objdump/DWARF obtained ones, starting with python scripts,
that is slated to be merged soon (Jiri Olsa)
- ... and a lot more that I failed to list, see the shortlog and
changelog for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
perf trace beauty flock: Move to separate object file
perf evlist: Remove fcntl.h from evlist.h
perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY
perf trace: Do not print from time delta for interrupted syscall lines
perf trace: Add --print-sample
perf bpf: Remove misplaced __maybe_unused attribute
MAINTAINERS: Adding entry for CoreSight trace decoding
perf tools: Add mechanic to synthesise CoreSight trace packets
perf tools: Add full support for CoreSight trace decoding
pert tools: Add queue management functionality
perf tools: Add functionality to communicate with the openCSD decoder
perf tools: Add support for decoding CoreSight trace data
perf tools: Add decoder mechanic to support dumping trace data
perf tools: Add processing of coresight metadata
perf tools: Add initial entry point for decoder CoreSight traces
perf tools: Integrating the CoreSight decoding library
perf vendor events intel: Update IvyTown files to V20
perf vendor events intel: Update IvyBridge files to V20
perf vendor events intel: Update BroadwellDE events to V7
perf vendor events intel: Update SkylakeX events to V1.06
...
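The percent slices mentioned above for 'perf report --time' can be illustrated with a little arithmetic on the first and last sample times. What follows is only a sketch of the idea, not the code perf uses: `struct time_slice` and `percent_slice()` are hypothetical names, timestamps are assumed to be unsigned 64-bit nanosecond values, and how perf rounds or treats the slice boundaries is not covered here.

```c
#include <stdint.h>

/* Hypothetical type for illustration: an absolute [start, end] window. */
struct time_slice {
	uint64_t start;
	uint64_t end;
};

/* Scale span by pct/100 without overflowing on large nanosecond spans. */
static uint64_t scale_percent(uint64_t span, unsigned int pct)
{
	return span / 100 * pct + span % 100 * pct / 100;
}

/*
 * Convert a percent slice (e.g. 10%-20%) into absolute timestamps,
 * given the first and last sample times of the session.
 * Assumes first <= last and start_pct <= end_pct <= 100.
 */
static struct time_slice percent_slice(uint64_t first, uint64_t last,
				       unsigned int start_pct,
				       unsigned int end_pct)
{
	uint64_t span = last - first;
	struct time_slice s;

	s.start = first + scale_percent(span, start_pct);
	s.end = first + scale_percent(span, end_pct);
	return s;
}
```

With a first sample at 1000 and a last sample at 2000, the 10%-20% slice computed this way covers 1100 through 1200.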
Diffstat (limited to 'tools/perf/util/pmu.c')
-rw-r--r-- | tools/perf/util/pmu.c | 87 |
1 files changed, 71 insertions, 16 deletions
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index 80fb1593913a..57e38fdf0b34 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c | |||
@@ -12,6 +12,7 @@ | |||
12 | #include <dirent.h> | 12 | #include <dirent.h> |
13 | #include <api/fs/fs.h> | 13 | #include <api/fs/fs.h> |
14 | #include <locale.h> | 14 | #include <locale.h> |
15 | #include <regex.h> | ||
15 | #include "util.h" | 16 | #include "util.h" |
16 | #include "pmu.h" | 17 | #include "pmu.h" |
17 | #include "parse-events.h" | 18 | #include "parse-events.h" |
@@ -537,17 +538,45 @@ static bool pmu_is_uncore(const char *name) | |||
537 | } | 538 | } |
538 | 539 | ||
539 | /* | 540 | /* |
541 | * PMU CORE devices have different name other than cpu in sysfs on some | ||
542 | * platforms. looking for possible sysfs files to identify as core device. | ||
543 | */ | ||
544 | static int is_pmu_core(const char *name) | ||
545 | { | ||
546 | struct stat st; | ||
547 | char path[PATH_MAX]; | ||
548 | const char *sysfs = sysfs__mountpoint(); | ||
549 | |||
550 | if (!sysfs) | ||
551 | return 0; | ||
552 | |||
553 | /* Look for cpu sysfs (x86 and others) */ | ||
554 | scnprintf(path, PATH_MAX, "%s/bus/event_source/devices/cpu", sysfs); | ||
555 | if ((stat(path, &st) == 0) && | ||
556 | (strncmp(name, "cpu", strlen("cpu")) == 0)) | ||
557 | return 1; | ||
558 | |||
559 | /* Look for cpu sysfs (specific to arm) */ | ||
560 | scnprintf(path, PATH_MAX, "%s/bus/event_source/devices/%s/cpus", | ||
561 | sysfs, name); | ||
562 | if (stat(path, &st) == 0) | ||
563 | return 1; | ||
564 | |||
565 | return 0; | ||
566 | } | ||
567 | |||
568 | /* | ||
540 | * Return the CPU id as a raw string. | 569 | * Return the CPU id as a raw string. |
541 | * | 570 | * |
542 | * Each architecture should provide a more precise id string that | 571 | * Each architecture should provide a more precise id string that |
543 | * can be use to match the architecture's "mapfile". | 572 | * can be use to match the architecture's "mapfile". |
544 | */ | 573 | */ |
545 | char * __weak get_cpuid_str(void) | 574 | char * __weak get_cpuid_str(struct perf_pmu *pmu __maybe_unused) |
546 | { | 575 | { |
547 | return NULL; | 576 | return NULL; |
548 | } | 577 | } |
549 | 578 | ||
550 | static char *perf_pmu__getcpuid(void) | 579 | static char *perf_pmu__getcpuid(struct perf_pmu *pmu) |
551 | { | 580 | { |
552 | char *cpuid; | 581 | char *cpuid; |
553 | static bool printed; | 582 | static bool printed; |
@@ -556,7 +585,7 @@ static char *perf_pmu__getcpuid(void) | |||
556 | if (cpuid) | 585 | if (cpuid) |
557 | cpuid = strdup(cpuid); | 586 | cpuid = strdup(cpuid); |
558 | if (!cpuid) | 587 | if (!cpuid) |
559 | cpuid = get_cpuid_str(); | 588 | cpuid = get_cpuid_str(pmu); |
560 | if (!cpuid) | 589 | if (!cpuid) |
561 | return NULL; | 590 | return NULL; |
562 | 591 | ||
@@ -567,22 +596,45 @@ static char *perf_pmu__getcpuid(void) | |||
567 | return cpuid; | 596 | return cpuid; |
568 | } | 597 | } |
569 | 598 | ||
570 | struct pmu_events_map *perf_pmu__find_map(void) | 599 | struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu) |
571 | { | 600 | { |
572 | struct pmu_events_map *map; | 601 | struct pmu_events_map *map; |
573 | char *cpuid = perf_pmu__getcpuid(); | 602 | char *cpuid = perf_pmu__getcpuid(pmu); |
574 | int i; | 603 | int i; |
575 | 604 | ||
605 | /* on some platforms which uses cpus map, cpuid can be NULL for | ||
606 | * PMUs other than CORE PMUs. | ||
607 | */ | ||
608 | if (!cpuid) | ||
609 | return NULL; | ||
610 | |||
576 | i = 0; | 611 | i = 0; |
577 | for (;;) { | 612 | for (;;) { |
613 | regex_t re; | ||
614 | regmatch_t pmatch[1]; | ||
615 | int match; | ||
616 | |||
578 | map = &pmu_events_map[i++]; | 617 | map = &pmu_events_map[i++]; |
579 | if (!map->table) { | 618 | if (!map->table) { |
580 | map = NULL; | 619 | map = NULL; |
581 | break; | 620 | break; |
582 | } | 621 | } |
583 | 622 | ||
584 | if (!strcmp(map->cpuid, cpuid)) | 623 | if (regcomp(&re, map->cpuid, REG_EXTENDED) != 0) { |
624 | /* Warn unable to generate match particular string. */ | ||
625 | pr_info("Invalid regular expression %s\n", map->cpuid); | ||
585 | break; | 626 | break; |
627 | } | ||
628 | |||
629 | match = !regexec(&re, cpuid, 1, pmatch, 0); | ||
630 | regfree(&re); | ||
631 | if (match) { | ||
632 | size_t match_len = (pmatch[0].rm_eo - pmatch[0].rm_so); | ||
633 | |||
634 | /* Verify the entire string matched. */ | ||
635 | if (match_len == strlen(cpuid)) | ||
636 | break; | ||
637 | } | ||
586 | } | 638 | } |
587 | free(cpuid); | 639 | free(cpuid); |
588 | return map; | 640 | return map; |
@@ -593,13 +645,14 @@ struct pmu_events_map *perf_pmu__find_map(void) | |||
593 | * to the current running CPU. Then, add all PMU events from that table | 645 | * to the current running CPU. Then, add all PMU events from that table |
594 | * as aliases. | 646 | * as aliases. |
595 | */ | 647 | */ |
596 | static void pmu_add_cpu_aliases(struct list_head *head, const char *name) | 648 | static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu) |
597 | { | 649 | { |
598 | int i; | 650 | int i; |
599 | struct pmu_events_map *map; | 651 | struct pmu_events_map *map; |
600 | struct pmu_event *pe; | 652 | struct pmu_event *pe; |
653 | const char *name = pmu->name; | ||
601 | 654 | ||
602 | map = perf_pmu__find_map(); | 655 | map = perf_pmu__find_map(pmu); |
603 | if (!map) | 656 | if (!map) |
604 | return; | 657 | return; |
605 | 658 | ||
@@ -608,7 +661,6 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name) | |||
608 | */ | 661 | */ |
609 | i = 0; | 662 | i = 0; |
610 | while (1) { | 663 | while (1) { |
611 | const char *pname; | ||
612 | 664 | ||
613 | pe = &map->table[i++]; | 665 | pe = &map->table[i++]; |
614 | if (!pe->name) { | 666 | if (!pe->name) { |
@@ -617,9 +669,13 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name) | |||
617 | break; | 669 | break; |
618 | } | 670 | } |
619 | 671 | ||
620 | pname = pe->pmu ? pe->pmu : "cpu"; | 672 | if (!is_pmu_core(name)) { |
621 | if (strncmp(pname, name, strlen(pname))) | 673 | /* check for uncore devices */ |
622 | continue; | 674 | if (pe->pmu == NULL) |
675 | continue; | ||
676 | if (strncmp(pe->pmu, name, strlen(pe->pmu))) | ||
677 | continue; | ||
678 | } | ||
623 | 679 | ||
624 | /* need type casts to override 'const' */ | 680 | /* need type casts to override 'const' */ |
625 | __perf_pmu__new_alias(head, NULL, (char *)pe->name, | 681 | __perf_pmu__new_alias(head, NULL, (char *)pe->name, |
@@ -661,21 +717,20 @@ static struct perf_pmu *pmu_lookup(const char *name) | |||
661 | if (pmu_aliases(name, &aliases)) | 717 | if (pmu_aliases(name, &aliases)) |
662 | return NULL; | 718 | return NULL; |
663 | 719 | ||
664 | pmu_add_cpu_aliases(&aliases, name); | ||
665 | pmu = zalloc(sizeof(*pmu)); | 720 | pmu = zalloc(sizeof(*pmu)); |
666 | if (!pmu) | 721 | if (!pmu) |
667 | return NULL; | 722 | return NULL; |
668 | 723 | ||
669 | pmu->cpus = pmu_cpumask(name); | 724 | pmu->cpus = pmu_cpumask(name); |
670 | 725 | pmu->name = strdup(name); | |
726 | pmu->type = type; | ||
671 | pmu->is_uncore = pmu_is_uncore(name); | 727 | pmu->is_uncore = pmu_is_uncore(name); |
728 | pmu_add_cpu_aliases(&aliases, pmu); | ||
672 | 729 | ||
673 | INIT_LIST_HEAD(&pmu->format); | 730 | INIT_LIST_HEAD(&pmu->format); |
674 | INIT_LIST_HEAD(&pmu->aliases); | 731 | INIT_LIST_HEAD(&pmu->aliases); |
675 | list_splice(&format, &pmu->format); | 732 | list_splice(&format, &pmu->format); |
676 | list_splice(&aliases, &pmu->aliases); | 733 | list_splice(&aliases, &pmu->aliases); |
677 | pmu->name = strdup(name); | ||
678 | pmu->type = type; | ||
679 | list_add_tail(&pmu->list, &pmus); | 734 | list_add_tail(&pmu->list, &pmus); |
680 | 735 | ||
681 | pmu->default_config = perf_pmu__get_default_config(pmu); | 736 | pmu->default_config = perf_pmu__get_default_config(pmu); |
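The change to perf_pmu__find_map() above replaces strcmp() with a POSIX extended regular expression that must cover the whole cpuid string. A minimal standalone sketch of that whole-string check follows. `cpuid_full_match()` is a hypothetical helper, not a function in perf, and the cpuid strings in the usage note are made-up examples.

```c
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of perf: returns true only when the
 * POSIX extended regex 'pattern' matches 'str' from its first to its
 * last character. An invalid pattern is treated as "no match".
 */
static bool cpuid_full_match(const char *pattern, const char *str)
{
	regex_t re;
	regmatch_t pmatch[1];
	bool matched;

	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return false;

	/* regexec() returns 0 on a match; pmatch[0] spans the whole match. */
	matched = regexec(&re, str, 1, pmatch, 0) == 0;
	regfree(&re);
	if (!matched)
		return false;

	/* Reject matches that cover only part of the string. */
	return pmatch[0].rm_so == 0 &&
	       (size_t)pmatch[0].rm_eo == strlen(str);
}
```

For example, cpuid_full_match("GenuineIntel-6-(3C|45)", "GenuineIntel-6-45") is true, while a pattern that matches only a prefix or a suffix of the string is rejected. Checking the match length, as the diff does, lets map entries stay unanchored while still requiring that the entire cpuid is covered.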