author    Linus Torvalds <torvalds@linux-foundation.org>  2016-03-14 20:58:53 -0400
committer Linus Torvalds <torvalds@linux-foundation.org>  2016-03-14 20:58:53 -0400
commit    e71c2c1eeb8de7a083a728c5b7e0b83ed1faf047 (patch)
tree      722ff062c2ee32d6b80d1271ac70767043dceb9d /arch/x86
parent    d09e356ad06a8b6f5cceabf7c6cf05fdb62b46e5 (diff)
parent    ced30bc9129777d715057d06fc8dbdfd3b81e94d (diff)
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Main kernel side changes:

   - Big reorganization of the x86 perf support code. The old code grew
     organically deep inside arch/x86/kernel/cpu/perf* and its naming
     became somewhat messy.

     The new location is under arch/x86/events/, using the following
     cleaner hierarchy of source code files:

       perf/x86: Move perf_event.c .................. => x86/events/core.c
       perf/x86: Move perf_event_amd.c .............. => x86/events/amd/core.c
       perf/x86: Move perf_event_amd_ibs.c .......... => x86/events/amd/ibs.c
       perf/x86: Move perf_event_amd_iommu.[ch] ..... => x86/events/amd/iommu.[ch]
       perf/x86: Move perf_event_amd_uncore.c ....... => x86/events/amd/uncore.c
       perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c
       perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c
       perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c
       perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c
       perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c
       perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c
       perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch]
       perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.c
       perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch]
       perf/x86: Move perf_event_intel_uncore_nhmex.c => x86/events/intel/uncore_nhmex.c
       perf/x86: Move perf_event_intel_uncore_snb.c   => x86/events/intel/uncore_snb.c
       perf/x86: Move perf_event_intel_uncore_snbep.c => x86/events/intel/uncore_snbep.c
       perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.c
       perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.c
       perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.c
       perf/x86: Move perf_event_msr.c .............. => x86/events/msr.c

     (Borislav Petkov)

   - Update various x86 PMU constraint and hw support details (Stephane
     Eranian)

   - Optimize kprobes for BPF execution (Martin KaFai Lau)

   - Rewrite, refactor and fix the Intel uncore PMU driver code (Thomas
     Gleixner)

   - Rewrite, refactor and fix the Intel RAPL PMU code (Thomas Gleixner)

   - Various fixes and smaller cleanups.

  There are lots of perf tooling updates as well. A few highlights:

  perf report/top:

   - Hierarchy histogram mode for 'perf top' and 'perf report', showing
     multiple levels, one per --sort entry: (Namhyung Kim)

     On a mostly idle system:

       # perf top --hierarchy -s comm,dso

     Then expand some levels and use 'P' to take a snapshot:

       # cat perf.hist.0
       -   92.32%     perf
              58.20%     perf
              22.29%     libc-2.22.so
               5.97%     [kernel]
               4.18%     libelf-0.165.so
               1.69%     [unknown]
       -    4.71%     qemu-system-x86
               3.10%     [kernel]
               1.60%     qemu-system-x86_64 (deleted)
       +    2.97%     swapper
       #

   - Add 'L' hotkey to dynamically set the percent threshold for
     histogram entries and callchains, i.e. dynamically do what the
     --percent-limit command line option to 'top' and 'report' does.
     (Namhyung Kim)

  perf mem:

   - Allow specifying events via -e in 'perf mem record', also listing
     what events can be specified via 'perf mem record -e list' (Jiri
     Olsa)

  perf record:

   - Add 'perf record' --all-user/--all-kernel options, so that one can
     tell that all the events in the command line should be restricted
     to the user or kernel levels (Jiri Olsa), i.e.:

       perf record -e cycles:u,instructions:u

     is equivalent to:

       perf record --all-user -e cycles,instructions

   - Make 'perf record' collect CPU cache info in the perf.data file header:

       $ perf record usleep 1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
       $ perf report --header-only -I | tail -10 | head -8
       # CPU cache info:
       #  L1 Data                 32K [0-1]
       #  L1 Instruction          32K [0-1]
       #  L1 Data                 32K [2-3]
       #  L1 Instruction          32K [2-3]
       #  L2 Unified             256K [0-1]
       #  L2 Unified             256K [2-3]
       #  L3 Unified            4096K [0-3]

     Will be used in 'perf c2c' and eventually in 'perf diff' to allow,
     for instance, running the same workload on multiple machines and
     then, when using 'diff', showing the hardware difference. (Jiri
     Olsa)

   - Improved support for Java, using the JVMTI agent library to do
     jitdumps that then will be inserted in synthesized
     PERF_RECORD_MMAP2 events via 'perf inject' pointed to synthesized
     ELF files stored in ~/.debug and keyed with build-ids, to allow
     symbol resolution and even annotation with source line info, see
     the changeset comments to see how to use it (Stephane Eranian)

  perf script/trace:

   - Decode data_src values (e.g. perf.data files generated by 'perf
     mem record') in 'perf script': (Jiri Olsa)

       # perf script
         perf  693 [1]  4.088652:  1  cpu/mem-loads,ldlat=30/P: ffff88007d0b0f40 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No
                                                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       <SNIP>

   - Improve support for the 'data_src', 'weight' and 'addr' fields in
     'perf script' (Jiri Olsa)

   - Handle empty print fmts in 'perf script -s', i.e. when running
     python or perl scripts (Taeung Song)

  perf stat:

   - 'perf stat' now shows shadow metrics (insn per cycle, etc) in
     interval mode too. E.g.:

       # perf stat -I 1000 -e instructions,cycles sleep 1
       #          time        counts unit events
        1.000215928         519,620      instructions   #  0.69 insn per cycle
        1.000215928         752,003      cycles
       <SNIP>

   - Port 'perf kvm stat' to PowerPC (Hemant Kumar)

   - Implement CSV metrics output in 'perf stat' (Andi Kleen)

  perf BPF support:

   - Support converting data from bpf events in 'perf data' (Wang Nan)

   - Print bpf-output events in 'perf script' (Wang Nan):

       # perf record -e bpf-output/no-inherit,name=evt/ -e ./test_bpf_output_3.c/map:channel.event=evt/ usleep 1000
       # perf script
          usleep  4882 21384.532523:   evt:  ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms])
           BPF output: 0000: 52 61 69 73 65 20 61 20  Raise a
                       0008: 42 50 46 20 65 76 65 6e  BPF even
                       0010: 74 21 00 00              t!..
           BPF string: "Raise a BPF event!"
       #

   - Add API to set values of map entries in a BPF object, be it
     individual map slots or ranges (Wang Nan)

   - Introduce support for the 'bpf-output' event (Wang Nan)

   - Add glue to read perf events in a BPF program (Wang Nan)

   - Improve support for bpf-output events in 'perf trace' (Wang Nan)

  ... and tons of other changes as well - see the shortlog and git log
  for details!"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (342 commits)
  perf stat: Add --metric-only support for -A
  perf stat: Implement --metric-only mode
  perf stat: Document CSV format in manpage
  perf hists browser: Check sort keys before hot key actions
  perf hists browser: Allow thread filtering for comm sort key
  perf tools: Add sort__has_comm variable
  perf tools: Recalc total periods using top-level entries in hierarchy
  perf tools: Remove nr_sort_keys field
  perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry()
  perf tools: Remove hist_entry->fmt field
  perf tools: Fix command line filters in hierarchy mode
  perf tools: Add more sort entry check functions
  perf tools: Fix hist_entry__filter() for hierarchy
  perf jitdump: Build only on supported archs
  tools lib traceevent: Add '~' operation within arg_num_eval()
  perf tools: Omit unnecessary cast in perf_pmu__parse_scale
  perf tools: Pass perf_hpp_list all the way through setup_sort_list
  perf tools: Fix perf script python database export crash
  perf jitdump: DWARF is also needed
  perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changes
  ...
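To give a feel for the new tooling options called out above, here is a minimal illustrative session; it only uses flags named in this pull request ('perf mem record -e list', --all-user, --metric-only), but the event selections and workloads are arbitrary examples rather than anything taken from the merge itself:

    $ perf mem record -e list                                   # list the events 'perf mem record' accepts
    $ perf record --all-user -e cycles,instructions usleep 1    # user-level-only equivalent of cycles:u,instructions:u
    $ perf stat --metric-only -e instructions,cycles usleep 1   # condensed, metrics-only counter output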
Diffstat (limited to 'arch/x86')
-rw-r--r--  arch/x86/Kbuild | 3
-rw-r--r--  arch/x86/events/Makefile | 13
-rw-r--r--  arch/x86/events/amd/core.c (renamed from arch/x86/kernel/cpu/perf_event_amd.c) | 2
-rw-r--r--  arch/x86/events/amd/ibs.c (renamed from arch/x86/kernel/cpu/perf_event_amd_ibs.c) | 12
-rw-r--r--  arch/x86/events/amd/iommu.c (renamed from arch/x86/kernel/cpu/perf_event_amd_iommu.c) | 4
-rw-r--r--  arch/x86/events/amd/iommu.h (renamed from arch/x86/kernel/cpu/perf_event_amd_iommu.h) | 0
-rw-r--r--  arch/x86/events/amd/uncore.c (renamed from arch/x86/kernel/cpu/perf_event_amd_uncore.c) | 4
-rw-r--r--  arch/x86/events/core.c (renamed from arch/x86/kernel/cpu/perf_event.c) | 22
-rw-r--r--  arch/x86/events/intel/bts.c (renamed from arch/x86/kernel/cpu/perf_event_intel_bts.c) | 2
-rw-r--r--  arch/x86/events/intel/core.c (renamed from arch/x86/kernel/cpu/perf_event_intel.c) | 31
-rw-r--r--  arch/x86/events/intel/cqm.c (renamed from arch/x86/kernel/cpu/perf_event_intel_cqm.c) | 34
-rw-r--r--  arch/x86/events/intel/cstate.c (renamed from arch/x86/kernel/cpu/perf_event_intel_cstate.c) | 2
-rw-r--r--  arch/x86/events/intel/ds.c (renamed from arch/x86/kernel/cpu/perf_event_intel_ds.c) | 56
-rw-r--r--  arch/x86/events/intel/knc.c (renamed from arch/x86/kernel/cpu/perf_event_knc.c) | 6
-rw-r--r--  arch/x86/events/intel/lbr.c (renamed from arch/x86/kernel/cpu/perf_event_intel_lbr.c) | 2
-rw-r--r--  arch/x86/events/intel/p4.c (renamed from arch/x86/kernel/cpu/perf_event_p4.c) | 2
-rw-r--r--  arch/x86/events/intel/p6.c (renamed from arch/x86/kernel/cpu/perf_event_p6.c) | 2
-rw-r--r--  arch/x86/events/intel/pt.c (renamed from arch/x86/kernel/cpu/perf_event_intel_pt.c) | 4
-rw-r--r--  arch/x86/events/intel/pt.h (renamed from arch/x86/kernel/cpu/intel_pt.h) | 0
-rw-r--r--  arch/x86/events/intel/rapl.c (renamed from arch/x86/kernel/cpu/perf_event_intel_rapl.c) | 412
-rw-r--r--  arch/x86/events/intel/uncore.c (renamed from arch/x86/kernel/cpu/perf_event_intel_uncore.c) | 677
-rw-r--r--  arch/x86/events/intel/uncore.h (renamed from arch/x86/kernel/cpu/perf_event_intel_uncore.h) | 55
-rw-r--r--  arch/x86/events/intel/uncore_nhmex.c (renamed from arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c) | 8
-rw-r--r--  arch/x86/events/intel/uncore_snb.c (renamed from arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c) | 16
-rw-r--r--  arch/x86/events/intel/uncore_snbep.c (renamed from arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c) | 21
-rw-r--r--  arch/x86/events/msr.c (renamed from arch/x86/kernel/cpu/perf_event_msr.c) | 0
-rw-r--r--  arch/x86/events/perf_event.h (renamed from arch/x86/kernel/cpu/perf_event.h) | 5
-rw-r--r--  arch/x86/include/asm/elf.h | 2
-rw-r--r--  arch/x86/include/asm/perf_event.h | 1
-rw-r--r--  arch/x86/include/asm/processor.h | 2
-rw-r--r--  arch/x86/include/asm/topology.h | 11
-rw-r--r--  arch/x86/kernel/apic/apic.c | 14
-rw-r--r--  arch/x86/kernel/cpu/Makefile | 24
-rw-r--r--  arch/x86/kernel/cpu/amd.c | 23
-rw-r--r--  arch/x86/kernel/cpu/bugs_64.c | 2
-rw-r--r--  arch/x86/kernel/cpu/centaur.c | 10
-rw-r--r--  arch/x86/kernel/cpu/common.c | 44
-rw-r--r--  arch/x86/kernel/cpu/cyrix.c | 10
-rw-r--r--  arch/x86/kernel/cpu/hypervisor.c | 2
-rw-r--r--  arch/x86/kernel/cpu/intel.c | 23
-rw-r--r--  arch/x86/kernel/cpu/intel_cacheinfo.c | 2
-rw-r--r--  arch/x86/kernel/cpu/mcheck/mce-inject.c | 15
-rw-r--r--  arch/x86/kernel/cpu/mcheck/p5.c | 18
-rw-r--r--  arch/x86/kernel/cpu/mcheck/therm_throt.c | 15
-rw-r--r--  arch/x86/kernel/cpu/mcheck/threshold.c | 4
-rw-r--r--  arch/x86/kernel/cpu/mcheck/winchip.c | 5
-rw-r--r--  arch/x86/kernel/cpu/microcode/amd.c | 2
-rw-r--r--  arch/x86/kernel/cpu/mshyperv.c | 8
-rw-r--r--  arch/x86/kernel/cpu/mtrr/centaur.c | 2
-rw-r--r--  arch/x86/kernel/cpu/mtrr/cleanup.c | 44
-rw-r--r--  arch/x86/kernel/cpu/mtrr/generic.c | 23
-rw-r--r--  arch/x86/kernel/cpu/mtrr/main.c | 20
-rw-r--r--  arch/x86/kernel/cpu/rdrand.c | 2
-rw-r--r--  arch/x86/kernel/cpu/topology.c | 4
-rw-r--r--  arch/x86/kernel/cpu/transmeta.c | 8
-rw-r--r--  arch/x86/kernel/cpu/vmware.c | 5
-rw-r--r--  arch/x86/kernel/mpparse.c | 2
-rw-r--r--  arch/x86/kernel/nmi.c | 3
-rw-r--r--  arch/x86/kernel/smpboot.c | 100
-rw-r--r--  arch/x86/lguest/boot.c | 2
-rw-r--r--  arch/x86/xen/enlighten.c | 2
-rw-r--r--  arch/x86/xen/pmu.c | 2
62 files changed, 1047 insertions, 804 deletions
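For reference, the per-file breakdown above is an ordinary git diffstat of the merge against its first (mainline) parent, restricted to arch/x86. Assuming the commit IDs from the header are available in a local kernel tree, something along these lines should reproduce it:

    $ git diff --stat d09e356ad06a..e71c2c1eeb8d -- arch/x86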
diff --git a/arch/x86/Kbuild b/arch/x86/Kbuild
index 1538562cc720..eb3abf8ac44e 100644
--- a/arch/x86/Kbuild
+++ b/arch/x86/Kbuild
@@ -1,6 +1,7 @@
1
2obj-y += entry/ 1obj-y += entry/
3 2
3obj-$(CONFIG_PERF_EVENTS) += events/
4
4obj-$(CONFIG_KVM) += kvm/ 5obj-$(CONFIG_KVM) += kvm/
5 6
6# Xen paravirtualization support 7# Xen paravirtualization support
diff --git a/arch/x86/events/Makefile b/arch/x86/events/Makefile
new file mode 100644
index 000000000000..fdfea1511cc0
--- /dev/null
+++ b/arch/x86/events/Makefile
@@ -0,0 +1,13 @@
1obj-y += core.o
2
3obj-$(CONFIG_CPU_SUP_AMD) += amd/core.o amd/uncore.o
4obj-$(CONFIG_X86_LOCAL_APIC) += amd/ibs.o msr.o
5ifdef CONFIG_AMD_IOMMU
6obj-$(CONFIG_CPU_SUP_AMD) += amd/iommu.o
7endif
8obj-$(CONFIG_CPU_SUP_INTEL) += intel/core.o intel/bts.o intel/cqm.o
9obj-$(CONFIG_CPU_SUP_INTEL) += intel/cstate.o intel/ds.o intel/knc.o
10obj-$(CONFIG_CPU_SUP_INTEL) += intel/lbr.o intel/p4.o intel/p6.o intel/pt.o
11obj-$(CONFIG_CPU_SUP_INTEL) += intel/rapl.o msr.o
12obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE) += intel/uncore.o intel/uncore_nhmex.o
13obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE) += intel/uncore_snb.o intel/uncore_snbep.o
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/events/amd/core.c
index 58610539b048..049ada8d4e9c 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/events/amd/core.c
@@ -5,7 +5,7 @@
5#include <linux/slab.h> 5#include <linux/slab.h>
6#include <asm/apicdef.h> 6#include <asm/apicdef.h>
7 7
8#include "perf_event.h" 8#include "../perf_event.h"
9 9
10static __initconst const u64 amd_hw_cache_event_ids 10static __initconst const u64 amd_hw_cache_event_ids
11 [PERF_COUNT_HW_CACHE_MAX] 11 [PERF_COUNT_HW_CACHE_MAX]
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/events/amd/ibs.c
index 989d3c215d2b..51087c29b2c2 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -14,7 +14,7 @@
14 14
15#include <asm/apic.h> 15#include <asm/apic.h>
16 16
17#include "perf_event.h" 17#include "../perf_event.h"
18 18
19static u32 ibs_caps; 19static u32 ibs_caps;
20 20
@@ -670,7 +670,7 @@ static __init int perf_event_ibs_init(void)
670 perf_ibs_pmu_init(&perf_ibs_op, "ibs_op"); 670 perf_ibs_pmu_init(&perf_ibs_op, "ibs_op");
671 671
672 register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs"); 672 register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs");
673 printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps); 673 pr_info("perf: AMD IBS detected (0x%08x)\n", ibs_caps);
674 674
675 return 0; 675 return 0;
676} 676}
@@ -774,14 +774,14 @@ static int setup_ibs_ctl(int ibs_eilvt_off)
774 pci_read_config_dword(cpu_cfg, IBSCTL, &value); 774 pci_read_config_dword(cpu_cfg, IBSCTL, &value);
775 if (value != (ibs_eilvt_off | IBSCTL_LVT_OFFSET_VALID)) { 775 if (value != (ibs_eilvt_off | IBSCTL_LVT_OFFSET_VALID)) {
776 pci_dev_put(cpu_cfg); 776 pci_dev_put(cpu_cfg);
777 printk(KERN_DEBUG "Failed to setup IBS LVT offset, " 777 pr_debug("Failed to setup IBS LVT offset, IBSCTL = 0x%08x\n",
778 "IBSCTL = 0x%08x\n", value); 778 value);
779 return -EINVAL; 779 return -EINVAL;
780 } 780 }
781 } while (1); 781 } while (1);
782 782
783 if (!nodes) { 783 if (!nodes) {
784 printk(KERN_DEBUG "No CPU node configured for IBS\n"); 784 pr_debug("No CPU node configured for IBS\n");
785 return -ENODEV; 785 return -ENODEV;
786 } 786 }
787 787
@@ -810,7 +810,7 @@ static void force_ibs_eilvt_setup(void)
810 preempt_enable(); 810 preempt_enable();
811 811
812 if (offset == APIC_EILVT_NR_MAX) { 812 if (offset == APIC_EILVT_NR_MAX) {
813 printk(KERN_DEBUG "No EILVT entry available\n"); 813 pr_debug("No EILVT entry available\n");
814 return; 814 return;
815 } 815 }
816 816
diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/events/amd/iommu.c
index 97242a9242bd..635e5eba0caf 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/events/amd/iommu.c
@@ -16,8 +16,8 @@
16#include <linux/cpumask.h> 16#include <linux/cpumask.h>
17#include <linux/slab.h> 17#include <linux/slab.h>
18 18
19#include "perf_event.h" 19#include "../perf_event.h"
20#include "perf_event_amd_iommu.h" 20#include "iommu.h"
21 21
22#define COUNTER_SHIFT 16 22#define COUNTER_SHIFT 16
23 23
diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.h b/arch/x86/events/amd/iommu.h
index 845d173278e3..845d173278e3 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.h
+++ b/arch/x86/events/amd/iommu.h
diff --git a/arch/x86/kernel/cpu/perf_event_amd_uncore.c b/arch/x86/events/amd/uncore.c
index 8836fc9fa84b..3db9569e658c 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -538,7 +538,7 @@ static int __init amd_uncore_init(void)
538 if (ret) 538 if (ret)
539 goto fail_nb; 539 goto fail_nb;
540 540
541 printk(KERN_INFO "perf: AMD NB counters detected\n"); 541 pr_info("perf: AMD NB counters detected\n");
542 ret = 0; 542 ret = 0;
543 } 543 }
544 544
@@ -552,7 +552,7 @@ static int __init amd_uncore_init(void)
552 if (ret) 552 if (ret)
553 goto fail_l2; 553 goto fail_l2;
554 554
555 printk(KERN_INFO "perf: AMD L2I counters detected\n"); 555 pr_info("perf: AMD L2I counters detected\n");
556 ret = 0; 556 ret = 0;
557 } 557 }
558 558
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/events/core.c
index 1b443db2db50..5e830d0c95c9 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/events/core.c
@@ -254,15 +254,16 @@ static bool check_hw_exists(void)
254 * We still allow the PMU driver to operate: 254 * We still allow the PMU driver to operate:
255 */ 255 */
256 if (bios_fail) { 256 if (bios_fail) {
257 printk(KERN_CONT "Broken BIOS detected, complain to your hardware vendor.\n"); 257 pr_cont("Broken BIOS detected, complain to your hardware vendor.\n");
258 printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg_fail, val_fail); 258 pr_err(FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n",
259 reg_fail, val_fail);
259 } 260 }
260 261
261 return true; 262 return true;
262 263
263msr_fail: 264msr_fail:
264 printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n"); 265 pr_cont("Broken PMU hardware detected, using software events only.\n");
265 printk("%sFailed to access perfctr msr (MSR %x is %Lx)\n", 266 pr_info("%sFailed to access perfctr msr (MSR %x is %Lx)\n",
266 boot_cpu_has(X86_FEATURE_HYPERVISOR) ? KERN_INFO : KERN_ERR, 267 boot_cpu_has(X86_FEATURE_HYPERVISOR) ? KERN_INFO : KERN_ERR,
267 reg, val_new); 268 reg, val_new);
268 269
@@ -596,6 +597,19 @@ void x86_pmu_disable_all(void)
596 } 597 }
597} 598}
598 599
600/*
601 * There may be PMI landing after enabled=0. The PMI hitting could be before or
602 * after disable_all.
603 *
604 * If PMI hits before disable_all, the PMU will be disabled in the NMI handler.
605 * It will not be re-enabled in the NMI handler again, because enabled=0. After
606 * handling the NMI, disable_all will be called, which will not change the
607 * state either. If PMI hits after disable_all, the PMU is already disabled
608 * before entering NMI handler. The NMI handler will not change the state
609 * either.
610 *
611 * So either situation is harmless.
612 */
599static void x86_pmu_disable(struct pmu *pmu) 613static void x86_pmu_disable(struct pmu *pmu)
600{ 614{
601 struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); 615 struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
diff --git a/arch/x86/kernel/cpu/perf_event_intel_bts.c b/arch/x86/events/intel/bts.c
index 2cad71d1b14c..b99dc9258c0f 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -26,7 +26,7 @@
26#include <asm-generic/sizes.h> 26#include <asm-generic/sizes.h>
27#include <asm/perf_event.h> 27#include <asm/perf_event.h>
28 28
29#include "perf_event.h" 29#include "../perf_event.h"
30 30
31struct bts_ctx { 31struct bts_ctx {
32 struct perf_output_handle handle; 32 struct perf_output_handle handle;
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/events/intel/core.c
index fed2ab1f1065..68fa55b4d42e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/events/intel/core.c
@@ -18,7 +18,7 @@
18#include <asm/hardirq.h> 18#include <asm/hardirq.h>
19#include <asm/apic.h> 19#include <asm/apic.h>
20 20
21#include "perf_event.h" 21#include "../perf_event.h"
22 22
23/* 23/*
24 * Intel PerfMon, used on Core and later. 24 * Intel PerfMon, used on Core and later.
@@ -1502,7 +1502,15 @@ static __initconst const u64 knl_hw_cache_extra_regs
1502}; 1502};
1503 1503
1504/* 1504/*
1505 * Use from PMIs where the LBRs are already disabled. 1505 * Used from PMIs where the LBRs are already disabled.
1506 *
1507 * This function could be called consecutively. It is required to remain in
1508 * disabled state if called consecutively.
1509 *
1510 * During consecutive calls, the same disable value will be written to related
1511 * registers, so the PMU state remains unchanged. hw.state in
1512 * intel_bts_disable_local will remain PERF_HES_STOPPED too in consecutive
1513 * calls.
1506 */ 1514 */
1507static void __intel_pmu_disable_all(void) 1515static void __intel_pmu_disable_all(void)
1508{ 1516{
@@ -1884,6 +1892,16 @@ again:
1884 if (__test_and_clear_bit(62, (unsigned long *)&status)) { 1892 if (__test_and_clear_bit(62, (unsigned long *)&status)) {
1885 handled++; 1893 handled++;
1886 x86_pmu.drain_pebs(regs); 1894 x86_pmu.drain_pebs(regs);
1895 /*
1896 * There are cases where, even though, the PEBS ovfl bit is set
1897 * in GLOBAL_OVF_STATUS, the PEBS events may also have their
1898 * overflow bits set for their counters. We must clear them
1899 * here because they have been processed as exact samples in
1900 * the drain_pebs() routine. They must not be processed again
1901 * in the for_each_bit_set() loop for regular samples below.
1902 */
1903 status &= ~cpuc->pebs_enabled;
1904 status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
1887 } 1905 }
1888 1906
1889 /* 1907 /*
@@ -1929,7 +1947,10 @@ again:
1929 goto again; 1947 goto again;
1930 1948
1931done: 1949done:
1932 __intel_pmu_enable_all(0, true); 1950 /* Only restore PMU state when it's active. See x86_pmu_disable(). */
1951 if (cpuc->enabled)
1952 __intel_pmu_enable_all(0, true);
1953
1933 /* 1954 /*
1934 * Only unmask the NMI after the overflow counters 1955 * Only unmask the NMI after the overflow counters
1935 * have been reset. This avoids spurious NMIs on 1956 * have been reset. This avoids spurious NMIs on
@@ -3396,6 +3417,7 @@ __init int intel_pmu_init(void)
3396 intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 3417 intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
3397 X86_CONFIG(.event=0xb1, .umask=0x3f, .inv=1, .cmask=1); 3418 X86_CONFIG(.event=0xb1, .umask=0x3f, .inv=1, .cmask=1);
3398 3419
3420 intel_pmu_pebs_data_source_nhm();
3399 x86_add_quirk(intel_nehalem_quirk); 3421 x86_add_quirk(intel_nehalem_quirk);
3400 3422
3401 pr_cont("Nehalem events, "); 3423 pr_cont("Nehalem events, ");
@@ -3459,6 +3481,7 @@ __init int intel_pmu_init(void)
3459 intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 3481 intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
3460 X86_CONFIG(.event=0xb1, .umask=0x3f, .inv=1, .cmask=1); 3482 X86_CONFIG(.event=0xb1, .umask=0x3f, .inv=1, .cmask=1);
3461 3483
3484 intel_pmu_pebs_data_source_nhm();
3462 pr_cont("Westmere events, "); 3485 pr_cont("Westmere events, ");
3463 break; 3486 break;
3464 3487
@@ -3581,7 +3604,7 @@ __init int intel_pmu_init(void)
3581 intel_pmu_lbr_init_hsw(); 3604 intel_pmu_lbr_init_hsw();
3582 3605
3583 x86_pmu.event_constraints = intel_bdw_event_constraints; 3606 x86_pmu.event_constraints = intel_bdw_event_constraints;
3584 x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints; 3607 x86_pmu.pebs_constraints = intel_bdw_pebs_event_constraints;
3585 x86_pmu.extra_regs = intel_snbep_extra_regs; 3608 x86_pmu.extra_regs = intel_snbep_extra_regs;
3586 x86_pmu.pebs_aliases = intel_pebs_aliases_ivb; 3609 x86_pmu.pebs_aliases = intel_pebs_aliases_ivb;
3587 x86_pmu.pebs_prec_dist = true; 3610 x86_pmu.pebs_prec_dist = true;
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/events/intel/cqm.c
index a316ca96f1b6..93cb412a5579 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/events/intel/cqm.c
@@ -7,7 +7,7 @@
7#include <linux/perf_event.h> 7#include <linux/perf_event.h>
8#include <linux/slab.h> 8#include <linux/slab.h>
9#include <asm/cpu_device_id.h> 9#include <asm/cpu_device_id.h>
10#include "perf_event.h" 10#include "../perf_event.h"
11 11
12#define MSR_IA32_PQR_ASSOC 0x0c8f 12#define MSR_IA32_PQR_ASSOC 0x0c8f
13#define MSR_IA32_QM_CTR 0x0c8e 13#define MSR_IA32_QM_CTR 0x0c8e
@@ -1244,15 +1244,12 @@ static struct pmu intel_cqm_pmu = {
1244 1244
1245static inline void cqm_pick_event_reader(int cpu) 1245static inline void cqm_pick_event_reader(int cpu)
1246{ 1246{
1247 int phys_id = topology_physical_package_id(cpu); 1247 int reader;
1248 int i;
1249 1248
1250 for_each_cpu(i, &cqm_cpumask) { 1249 /* First online cpu in package becomes the reader */
1251 if (phys_id == topology_physical_package_id(i)) 1250 reader = cpumask_any_and(&cqm_cpumask, topology_core_cpumask(cpu));
1252 return; /* already got reader for this socket */ 1251 if (reader >= nr_cpu_ids)
1253 } 1252 cpumask_set_cpu(cpu, &cqm_cpumask);
1254
1255 cpumask_set_cpu(cpu, &cqm_cpumask);
1256} 1253}
1257 1254
1258static void intel_cqm_cpu_starting(unsigned int cpu) 1255static void intel_cqm_cpu_starting(unsigned int cpu)
@@ -1270,24 +1267,17 @@ static void intel_cqm_cpu_starting(unsigned int cpu)
1270 1267
1271static void intel_cqm_cpu_exit(unsigned int cpu) 1268static void intel_cqm_cpu_exit(unsigned int cpu)
1272{ 1269{
1273 int phys_id = topology_physical_package_id(cpu); 1270 int target;
1274 int i;
1275 1271
1276 /* 1272 /* Is @cpu the current cqm reader for this package ? */
1277 * Is @cpu a designated cqm reader?
1278 */
1279 if (!cpumask_test_and_clear_cpu(cpu, &cqm_cpumask)) 1273 if (!cpumask_test_and_clear_cpu(cpu, &cqm_cpumask))
1280 return; 1274 return;
1281 1275
1282 for_each_online_cpu(i) { 1276 /* Find another online reader in this package */
1283 if (i == cpu) 1277 target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
1284 continue;
1285 1278
1286 if (phys_id == topology_physical_package_id(i)) { 1279 if (target < nr_cpu_ids)
1287 cpumask_set_cpu(i, &cqm_cpumask); 1280 cpumask_set_cpu(target, &cqm_cpumask);
1288 break;
1289 }
1290 }
1291} 1281}
1292 1282
1293static int intel_cqm_cpu_notifier(struct notifier_block *nb, 1283static int intel_cqm_cpu_notifier(struct notifier_block *nb,
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cstate.c b/arch/x86/events/intel/cstate.c
index 75a38b5a2e26..7946c4231169 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -89,7 +89,7 @@
89#include <linux/slab.h> 89#include <linux/slab.h>
90#include <linux/perf_event.h> 90#include <linux/perf_event.h>
91#include <asm/cpu_device_id.h> 91#include <asm/cpu_device_id.h>
92#include "perf_event.h" 92#include "../perf_event.h"
93 93
94#define DEFINE_CSTATE_FORMAT_ATTR(_var, _name, _format) \ 94#define DEFINE_CSTATE_FORMAT_ATTR(_var, _name, _format) \
95static ssize_t __cstate_##_var##_show(struct kobject *kobj, \ 95static ssize_t __cstate_##_var##_show(struct kobject *kobj, \
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/events/intel/ds.c
index 10602f0a438f..ce7211a07c0b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -5,7 +5,7 @@
5#include <asm/perf_event.h> 5#include <asm/perf_event.h>
6#include <asm/insn.h> 6#include <asm/insn.h>
7 7
8#include "perf_event.h" 8#include "../perf_event.h"
9 9
10/* The size of a BTS record in bytes: */ 10/* The size of a BTS record in bytes: */
11#define BTS_RECORD_SIZE 24 11#define BTS_RECORD_SIZE 24
@@ -51,7 +51,8 @@ union intel_x86_pebs_dse {
51#define OP_LH (P(OP, LOAD) | P(LVL, HIT)) 51#define OP_LH (P(OP, LOAD) | P(LVL, HIT))
52#define SNOOP_NONE_MISS (P(SNOOP, NONE) | P(SNOOP, MISS)) 52#define SNOOP_NONE_MISS (P(SNOOP, NONE) | P(SNOOP, MISS))
53 53
54static const u64 pebs_data_source[] = { 54/* Version for Sandy Bridge and later */
55static u64 pebs_data_source[] = {
55 P(OP, LOAD) | P(LVL, MISS) | P(LVL, L3) | P(SNOOP, NA),/* 0x00:ukn L3 */ 56 P(OP, LOAD) | P(LVL, MISS) | P(LVL, L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
56 OP_LH | P(LVL, L1) | P(SNOOP, NONE), /* 0x01: L1 local */ 57 OP_LH | P(LVL, L1) | P(SNOOP, NONE), /* 0x01: L1 local */
57 OP_LH | P(LVL, LFB) | P(SNOOP, NONE), /* 0x02: LFB hit */ 58 OP_LH | P(LVL, LFB) | P(SNOOP, NONE), /* 0x02: LFB hit */
@@ -70,6 +71,14 @@ static const u64 pebs_data_source[] = {
70 OP_LH | P(LVL, UNC) | P(SNOOP, NONE), /* 0x0f: uncached */ 71 OP_LH | P(LVL, UNC) | P(SNOOP, NONE), /* 0x0f: uncached */
71}; 72};
72 73
74/* Patch up minor differences in the bits */
75void __init intel_pmu_pebs_data_source_nhm(void)
76{
77 pebs_data_source[0x05] = OP_LH | P(LVL, L3) | P(SNOOP, HIT);
78 pebs_data_source[0x06] = OP_LH | P(LVL, L3) | P(SNOOP, HITM);
79 pebs_data_source[0x07] = OP_LH | P(LVL, L3) | P(SNOOP, HITM);
80}
81
73static u64 precise_store_data(u64 status) 82static u64 precise_store_data(u64 status)
74{ 83{
75 union intel_x86_pebs_dse dse; 84 union intel_x86_pebs_dse dse;
@@ -269,7 +278,7 @@ static int alloc_pebs_buffer(int cpu)
269 if (!x86_pmu.pebs) 278 if (!x86_pmu.pebs)
270 return 0; 279 return 0;
271 280
272 buffer = kzalloc_node(PEBS_BUFFER_SIZE, GFP_KERNEL, node); 281 buffer = kzalloc_node(x86_pmu.pebs_buffer_size, GFP_KERNEL, node);
273 if (unlikely(!buffer)) 282 if (unlikely(!buffer))
274 return -ENOMEM; 283 return -ENOMEM;
275 284
@@ -286,7 +295,7 @@ static int alloc_pebs_buffer(int cpu)
286 per_cpu(insn_buffer, cpu) = ibuffer; 295 per_cpu(insn_buffer, cpu) = ibuffer;
287 } 296 }
288 297
289 max = PEBS_BUFFER_SIZE / x86_pmu.pebs_record_size; 298 max = x86_pmu.pebs_buffer_size / x86_pmu.pebs_record_size;
290 299
291 ds->pebs_buffer_base = (u64)(unsigned long)buffer; 300 ds->pebs_buffer_base = (u64)(unsigned long)buffer;
292 ds->pebs_index = ds->pebs_buffer_base; 301 ds->pebs_index = ds->pebs_buffer_base;
@@ -722,6 +731,30 @@ struct event_constraint intel_hsw_pebs_event_constraints[] = {
722 EVENT_CONSTRAINT_END 731 EVENT_CONSTRAINT_END
723}; 732};
724 733
734struct event_constraint intel_bdw_pebs_event_constraints[] = {
735 INTEL_FLAGS_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
736 INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */
737 /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
738 INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
739 /* INST_RETIRED.PREC_DIST, inv=1, cmask=16 (cycles:ppp). */
740 INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c0, 0x2),
741 INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
742 INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */
743 INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */
744 INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */
745 INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */
746 INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf), /* MEM_UOPS_RETIRED.STLB_MISS_STORES */
747 INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x42d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_STORES */
748 INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */
749 INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
750 INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD(0xd2, 0xf), /* MEM_LOAD_UOPS_L3_HIT_RETIRED.* */
751 INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD(0xd3, 0xf), /* MEM_LOAD_UOPS_L3_MISS_RETIRED.* */
752 /* Allow all events as PEBS with no flags */
753 INTEL_ALL_EVENT_CONSTRAINT(0, 0xf),
754 EVENT_CONSTRAINT_END
755};
756
757
725struct event_constraint intel_skl_pebs_event_constraints[] = { 758struct event_constraint intel_skl_pebs_event_constraints[] = {
726 INTEL_FLAGS_UEVENT_CONSTRAINT(0x1c0, 0x2), /* INST_RETIRED.PREC_DIST */ 759 INTEL_FLAGS_UEVENT_CONSTRAINT(0x1c0, 0x2), /* INST_RETIRED.PREC_DIST */
727 /* INST_RETIRED.PREC_DIST, inv=1, cmask=16 (cycles:ppp). */ 760 /* INST_RETIRED.PREC_DIST, inv=1, cmask=16 (cycles:ppp). */
@@ -1319,19 +1352,28 @@ void __init intel_ds_init(void)
1319 1352
1320 x86_pmu.bts = boot_cpu_has(X86_FEATURE_BTS); 1353 x86_pmu.bts = boot_cpu_has(X86_FEATURE_BTS);
1321 x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS); 1354 x86_pmu.pebs = boot_cpu_has(X86_FEATURE_PEBS);
1355 x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
1322 if (x86_pmu.pebs) { 1356 if (x86_pmu.pebs) {
1323 char pebs_type = x86_pmu.intel_cap.pebs_trap ? '+' : '-'; 1357 char pebs_type = x86_pmu.intel_cap.pebs_trap ? '+' : '-';
1324 int format = x86_pmu.intel_cap.pebs_format; 1358 int format = x86_pmu.intel_cap.pebs_format;
1325 1359
1326 switch (format) { 1360 switch (format) {
1327 case 0: 1361 case 0:
1328 printk(KERN_CONT "PEBS fmt0%c, ", pebs_type); 1362 pr_cont("PEBS fmt0%c, ", pebs_type);
1329 x86_pmu.pebs_record_size = sizeof(struct pebs_record_core); 1363 x86_pmu.pebs_record_size = sizeof(struct pebs_record_core);
1364 /*
1365 * Using >PAGE_SIZE buffers makes the WRMSR to
1366 * PERF_GLOBAL_CTRL in intel_pmu_enable_all()
1367 * mysteriously hang on Core2.
1368 *
1369 * As a workaround, we don't do this.
1370 */
1371 x86_pmu.pebs_buffer_size = PAGE_SIZE;
1330 x86_pmu.drain_pebs = intel_pmu_drain_pebs_core; 1372 x86_pmu.drain_pebs = intel_pmu_drain_pebs_core;
1331 break; 1373 break;
1332 1374
1333 case 1: 1375 case 1:
1334 printk(KERN_CONT "PEBS fmt1%c, ", pebs_type); 1376 pr_cont("PEBS fmt1%c, ", pebs_type);
1335 x86_pmu.pebs_record_size = sizeof(struct pebs_record_nhm); 1377 x86_pmu.pebs_record_size = sizeof(struct pebs_record_nhm);
1336 x86_pmu.drain_pebs = intel_pmu_drain_pebs_nhm; 1378 x86_pmu.drain_pebs = intel_pmu_drain_pebs_nhm;
1337 break; 1379 break;
@@ -1351,7 +1393,7 @@ void __init intel_ds_init(void)
1351 break; 1393 break;
1352 1394
1353 default: 1395 default:
1354 printk(KERN_CONT "no PEBS fmt%d%c, ", format, pebs_type); 1396 pr_cont("no PEBS fmt%d%c, ", format, pebs_type);
1355 x86_pmu.pebs = 0; 1397 x86_pmu.pebs = 0;
1356 } 1398 }
1357 } 1399 }
diff --git a/arch/x86/kernel/cpu/perf_event_knc.c b/arch/x86/events/intel/knc.c
index 5b0c232d1ee6..548d5f774b07 100644
--- a/arch/x86/kernel/cpu/perf_event_knc.c
+++ b/arch/x86/events/intel/knc.c
@@ -5,7 +5,7 @@
5 5
6#include <asm/hardirq.h> 6#include <asm/hardirq.h>
7 7
8#include "perf_event.h" 8#include "../perf_event.h"
9 9
10static const u64 knc_perfmon_event_map[] = 10static const u64 knc_perfmon_event_map[] =
11{ 11{
@@ -263,7 +263,9 @@ again:
263 goto again; 263 goto again;
264 264
265done: 265done:
266 knc_pmu_enable_all(0); 266 /* Only restore PMU state when it's active. See x86_pmu_disable(). */
267 if (cpuc->enabled)
268 knc_pmu_enable_all(0);
267 269
268 return handled; 270 return handled;
269} 271}
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/events/intel/lbr.c
index 653f88d25987..69dd11887dd1 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -5,7 +5,7 @@
5#include <asm/msr.h> 5#include <asm/msr.h>
6#include <asm/insn.h> 6#include <asm/insn.h>
7 7
8#include "perf_event.h" 8#include "../perf_event.h"
9 9
10enum { 10enum {
11 LBR_FORMAT_32 = 0x00, 11 LBR_FORMAT_32 = 0x00,
diff --git a/arch/x86/kernel/cpu/perf_event_p4.c b/arch/x86/events/intel/p4.c
index f2e56783af3d..0a5ede187d9c 100644
--- a/arch/x86/kernel/cpu/perf_event_p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -13,7 +13,7 @@
13#include <asm/hardirq.h> 13#include <asm/hardirq.h>
14#include <asm/apic.h> 14#include <asm/apic.h>
15 15
16#include "perf_event.h" 16#include "../perf_event.h"
17 17
18#define P4_CNTR_LIMIT 3 18#define P4_CNTR_LIMIT 3
19/* 19/*
diff --git a/arch/x86/kernel/cpu/perf_event_p6.c b/arch/x86/events/intel/p6.c
index 7c1a0c07b607..1f5c47ab4c65 100644
--- a/arch/x86/kernel/cpu/perf_event_p6.c
+++ b/arch/x86/events/intel/p6.c
@@ -1,7 +1,7 @@
1#include <linux/perf_event.h> 1#include <linux/perf_event.h>
2#include <linux/types.h> 2#include <linux/types.h>
3 3
4#include "perf_event.h" 4#include "../perf_event.h"
5 5
6/* 6/*
7 * Not sure about some of these 7 * Not sure about some of these
diff --git a/arch/x86/kernel/cpu/perf_event_intel_pt.c b/arch/x86/events/intel/pt.c
index c0bbd1033b7c..6af7cf71d6b2 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -29,8 +29,8 @@
29#include <asm/io.h> 29#include <asm/io.h>
30#include <asm/intel_pt.h> 30#include <asm/intel_pt.h>
31 31
32#include "perf_event.h" 32#include "../perf_event.h"
33#include "intel_pt.h" 33#include "pt.h"
34 34
35static DEFINE_PER_CPU(struct pt, pt_ctx); 35static DEFINE_PER_CPU(struct pt, pt_ctx);
36 36
diff --git a/arch/x86/kernel/cpu/intel_pt.h b/arch/x86/events/intel/pt.h
index 336878a5d205..336878a5d205 100644
--- a/arch/x86/kernel/cpu/intel_pt.h
+++ b/arch/x86/events/intel/pt.h
diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/events/intel/rapl.c
index 24a351ad628d..b834a3f55a01 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -44,11 +44,14 @@
44 * the duration of the measurement. Tools may use a function such as 44 * the duration of the measurement. Tools may use a function such as
45 * ldexp(raw_count, -32); 45 * ldexp(raw_count, -32);
46 */ 46 */
47
48#define pr_fmt(fmt) "RAPL PMU: " fmt
49
47#include <linux/module.h> 50#include <linux/module.h>
48#include <linux/slab.h> 51#include <linux/slab.h>
49#include <linux/perf_event.h> 52#include <linux/perf_event.h>
50#include <asm/cpu_device_id.h> 53#include <asm/cpu_device_id.h>
51#include "perf_event.h" 54#include "../perf_event.h"
52 55
53/* 56/*
54 * RAPL energy status counters 57 * RAPL energy status counters
@@ -107,7 +110,7 @@ static ssize_t __rapl_##_var##_show(struct kobject *kobj, \
107static struct kobj_attribute format_attr_##_var = \ 110static struct kobj_attribute format_attr_##_var = \
108 __ATTR(_name, 0444, __rapl_##_var##_show, NULL) 111 __ATTR(_name, 0444, __rapl_##_var##_show, NULL)
109 112
110#define RAPL_CNTR_WIDTH 32 /* 32-bit rapl counters */ 113#define RAPL_CNTR_WIDTH 32
111 114
112#define RAPL_EVENT_ATTR_STR(_name, v, str) \ 115#define RAPL_EVENT_ATTR_STR(_name, v, str) \
113static struct perf_pmu_events_attr event_attr_##v = { \ 116static struct perf_pmu_events_attr event_attr_##v = { \
@@ -117,23 +120,33 @@ static struct perf_pmu_events_attr event_attr_##v = { \
117}; 120};
118 121
119struct rapl_pmu { 122struct rapl_pmu {
120 spinlock_t lock; 123 raw_spinlock_t lock;
121 int n_active; /* number of active events */ 124 int n_active;
122 struct list_head active_list; 125 int cpu;
123 struct pmu *pmu; /* pointer to rapl_pmu_class */ 126 struct list_head active_list;
124 ktime_t timer_interval; /* in ktime_t unit */ 127 struct pmu *pmu;
125 struct hrtimer hrtimer; 128 ktime_t timer_interval;
129 struct hrtimer hrtimer;
126}; 130};
127 131
128static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly; /* 1/2^hw_unit Joule */ 132struct rapl_pmus {
129static struct pmu rapl_pmu_class; 133 struct pmu pmu;
134 unsigned int maxpkg;
135 struct rapl_pmu *pmus[];
136};
137
138 /* 1/2^hw_unit Joule */
139static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;
140static struct rapl_pmus *rapl_pmus;
130static cpumask_t rapl_cpu_mask; 141static cpumask_t rapl_cpu_mask;
131static int rapl_cntr_mask; 142static unsigned int rapl_cntr_mask;
143static u64 rapl_timer_ms;
132 144
133static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu); 145static inline struct rapl_pmu *cpu_to_rapl_pmu(unsigned int cpu)
134static DEFINE_PER_CPU(struct rapl_pmu *, rapl_pmu_to_free); 146{
147 return rapl_pmus->pmus[topology_logical_package_id(cpu)];
148}
135 149
136static struct x86_pmu_quirk *rapl_quirks;
137static inline u64 rapl_read_counter(struct perf_event *event) 150static inline u64 rapl_read_counter(struct perf_event *event)
138{ 151{
139 u64 raw; 152 u64 raw;
@@ -141,19 +154,10 @@ static inline u64 rapl_read_counter(struct perf_event *event)
141 return raw; 154 return raw;
142} 155}
143 156
144#define rapl_add_quirk(func_) \
145do { \
146 static struct x86_pmu_quirk __quirk __initdata = { \
147 .func = func_, \
148 }; \
149 __quirk.next = rapl_quirks; \
150 rapl_quirks = &__quirk; \
151} while (0)
152
153static inline u64 rapl_scale(u64 v, int cfg) 157static inline u64 rapl_scale(u64 v, int cfg)
154{ 158{
155 if (cfg > NR_RAPL_DOMAINS) { 159 if (cfg > NR_RAPL_DOMAINS) {
156 pr_warn("invalid domain %d, failed to scale data\n", cfg); 160 pr_warn("Invalid domain %d, failed to scale data\n", cfg);
157 return v; 161 return v;
158 } 162 }
159 /* 163 /*
@@ -206,27 +210,21 @@ static void rapl_start_hrtimer(struct rapl_pmu *pmu)
206 HRTIMER_MODE_REL_PINNED); 210 HRTIMER_MODE_REL_PINNED);
207} 211}
208 212
209static void rapl_stop_hrtimer(struct rapl_pmu *pmu)
210{
211 hrtimer_cancel(&pmu->hrtimer);
212}
213
214static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer) 213static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
215{ 214{
216 struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu); 215 struct rapl_pmu *pmu = container_of(hrtimer, struct rapl_pmu, hrtimer);
217 struct perf_event *event; 216 struct perf_event *event;
218 unsigned long flags; 217 unsigned long flags;
219 218
220 if (!pmu->n_active) 219 if (!pmu->n_active)
221 return HRTIMER_NORESTART; 220 return HRTIMER_NORESTART;
222 221
223 spin_lock_irqsave(&pmu->lock, flags); 222 raw_spin_lock_irqsave(&pmu->lock, flags);
224 223
225 list_for_each_entry(event, &pmu->active_list, active_entry) { 224 list_for_each_entry(event, &pmu->active_list, active_entry)
226 rapl_event_update(event); 225 rapl_event_update(event);
227 }
228 226
229 spin_unlock_irqrestore(&pmu->lock, flags); 227 raw_spin_unlock_irqrestore(&pmu->lock, flags);
230 228
231 hrtimer_forward_now(hrtimer, pmu->timer_interval); 229 hrtimer_forward_now(hrtimer, pmu->timer_interval);
232 230
@@ -260,28 +258,28 @@ static void __rapl_pmu_event_start(struct rapl_pmu *pmu,
260 258
261static void rapl_pmu_event_start(struct perf_event *event, int mode) 259static void rapl_pmu_event_start(struct perf_event *event, int mode)
262{ 260{
263 struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu); 261 struct rapl_pmu *pmu = event->pmu_private;
264 unsigned long flags; 262 unsigned long flags;
265 263
266 spin_lock_irqsave(&pmu->lock, flags); 264 raw_spin_lock_irqsave(&pmu->lock, flags);
267 __rapl_pmu_event_start(pmu, event); 265 __rapl_pmu_event_start(pmu, event);
268 spin_unlock_irqrestore(&pmu->lock, flags); 266 raw_spin_unlock_irqrestore(&pmu->lock, flags);
269} 267}
270 268
271static void rapl_pmu_event_stop(struct perf_event *event, int mode) 269static void rapl_pmu_event_stop(struct perf_event *event, int mode)
272{ 270{
273 struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu); 271 struct rapl_pmu *pmu = event->pmu_private;
274 struct hw_perf_event *hwc = &event->hw; 272 struct hw_perf_event *hwc = &event->hw;
275 unsigned long flags; 273 unsigned long flags;
276 274
277 spin_lock_irqsave(&pmu->lock, flags); 275 raw_spin_lock_irqsave(&pmu->lock, flags);
278 276
279 /* mark event as deactivated and stopped */ 277 /* mark event as deactivated and stopped */
280 if (!(hwc->state & PERF_HES_STOPPED)) { 278 if (!(hwc->state & PERF_HES_STOPPED)) {
281 WARN_ON_ONCE(pmu->n_active <= 0); 279 WARN_ON_ONCE(pmu->n_active <= 0);
282 pmu->n_active--; 280 pmu->n_active--;
283 if (pmu->n_active == 0) 281 if (pmu->n_active == 0)
284 rapl_stop_hrtimer(pmu); 282 hrtimer_cancel(&pmu->hrtimer);
285 283
286 list_del(&event->active_entry); 284 list_del(&event->active_entry);
287 285
@@ -299,23 +297,23 @@ static void rapl_pmu_event_stop(struct perf_event *event, int mode)
299 hwc->state |= PERF_HES_UPTODATE; 297 hwc->state |= PERF_HES_UPTODATE;
300 } 298 }
301 299
302 spin_unlock_irqrestore(&pmu->lock, flags); 300 raw_spin_unlock_irqrestore(&pmu->lock, flags);
303} 301}
304 302
305static int rapl_pmu_event_add(struct perf_event *event, int mode) 303static int rapl_pmu_event_add(struct perf_event *event, int mode)
306{ 304{
307 struct rapl_pmu *pmu = __this_cpu_read(rapl_pmu); 305 struct rapl_pmu *pmu = event->pmu_private;
308 struct hw_perf_event *hwc = &event->hw; 306 struct hw_perf_event *hwc = &event->hw;
309 unsigned long flags; 307 unsigned long flags;
310 308
311 spin_lock_irqsave(&pmu->lock, flags); 309 raw_spin_lock_irqsave(&pmu->lock, flags);
312 310
313 hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED; 311 hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
314 312
315 if (mode & PERF_EF_START) 313 if (mode & PERF_EF_START)
316 __rapl_pmu_event_start(pmu, event); 314 __rapl_pmu_event_start(pmu, event);
317 315
318 spin_unlock_irqrestore(&pmu->lock, flags); 316 raw_spin_unlock_irqrestore(&pmu->lock, flags);
319 317
320 return 0; 318 return 0;
321} 319}
@@ -329,15 +327,19 @@ static int rapl_pmu_event_init(struct perf_event *event)
329{ 327{
330 u64 cfg = event->attr.config & RAPL_EVENT_MASK; 328 u64 cfg = event->attr.config & RAPL_EVENT_MASK;
331 int bit, msr, ret = 0; 329 int bit, msr, ret = 0;
330 struct rapl_pmu *pmu;
332 331
333 /* only look at RAPL events */ 332 /* only look at RAPL events */
334 if (event->attr.type != rapl_pmu_class.type) 333 if (event->attr.type != rapl_pmus->pmu.type)
335 return -ENOENT; 334 return -ENOENT;
336 335
337 /* check only supported bits are set */ 336 /* check only supported bits are set */
338 if (event->attr.config & ~RAPL_EVENT_MASK) 337 if (event->attr.config & ~RAPL_EVENT_MASK)
339 return -EINVAL; 338 return -EINVAL;
340 339
340 if (event->cpu < 0)
341 return -EINVAL;
342
341 /* 343 /*
342 * check event is known (determines counter) 344 * check event is known (determines counter)
343 */ 345 */
@@ -376,6 +378,9 @@ static int rapl_pmu_event_init(struct perf_event *event)
376 return -EINVAL; 378 return -EINVAL;
377 379
378 /* must be done before validate_group */ 380 /* must be done before validate_group */
381 pmu = cpu_to_rapl_pmu(event->cpu);
382 event->cpu = pmu->cpu;
383 event->pmu_private = pmu;
379 event->hw.event_base = msr; 384 event->hw.event_base = msr;
380 event->hw.config = cfg; 385 event->hw.config = cfg;
381 event->hw.idx = bit; 386 event->hw.idx = bit;
@@ -506,139 +511,62 @@ const struct attribute_group *rapl_attr_groups[] = {
506 NULL, 511 NULL,
507}; 512};
508 513
509static struct pmu rapl_pmu_class = {
510 .attr_groups = rapl_attr_groups,
511 .task_ctx_nr = perf_invalid_context, /* system-wide only */
512 .event_init = rapl_pmu_event_init,
513 .add = rapl_pmu_event_add, /* must have */
514 .del = rapl_pmu_event_del, /* must have */
515 .start = rapl_pmu_event_start,
516 .stop = rapl_pmu_event_stop,
517 .read = rapl_pmu_event_read,
518};
519
520static void rapl_cpu_exit(int cpu) 514static void rapl_cpu_exit(int cpu)
521{ 515{
522 struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu); 516 struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
523 int i, phys_id = topology_physical_package_id(cpu); 517 int target;
524 int target = -1;
525 518
526 /* find a new cpu on same package */ 519 /* Check if exiting cpu is used for collecting rapl events */
527 for_each_online_cpu(i) { 520 if (!cpumask_test_and_clear_cpu(cpu, &rapl_cpu_mask))
528 if (i == cpu) 521 return;
529 continue;
530 if (phys_id == topology_physical_package_id(i)) {
531 target = i;
532 break;
533 }
534 }
535 /*
536 * clear cpu from cpumask
537 * if was set in cpumask and still some cpu on package,
538 * then move to new cpu
539 */
540 if (cpumask_test_and_clear_cpu(cpu, &rapl_cpu_mask) && target >= 0)
541 cpumask_set_cpu(target, &rapl_cpu_mask);
542 522
543 WARN_ON(cpumask_empty(&rapl_cpu_mask)); 523 pmu->cpu = -1;
544 /* 524 /* Find a new cpu to collect rapl events */
545 * migrate events and context to new cpu 525 target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
546 */
547 if (target >= 0)
548 perf_pmu_migrate_context(pmu->pmu, cpu, target);
549 526
550 /* cancel overflow polling timer for CPU */ 527 /* Migrate rapl events to the new target */
551 rapl_stop_hrtimer(pmu); 528 if (target < nr_cpu_ids) {
529 cpumask_set_cpu(target, &rapl_cpu_mask);
530 pmu->cpu = target;
531 perf_pmu_migrate_context(pmu->pmu, cpu, target);
532 }
552} 533}
553 534
554static void rapl_cpu_init(int cpu) 535static void rapl_cpu_init(int cpu)
555{ 536{
556 int i, phys_id = topology_physical_package_id(cpu); 537 struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
538 int target;
557 539
558 /* check if phys_is is already covered */
559 for_each_cpu(i, &rapl_cpu_mask) {
560 if (phys_id == topology_physical_package_id(i))
561 return;
562 }
563 /* was not found, so add it */
564 cpumask_set_cpu(cpu, &rapl_cpu_mask);
565}
566
567static __init void rapl_hsw_server_quirk(void)
568{
569 /* 540 /*
570 * DRAM domain on HSW server has fixed energy unit which can be 541 * Check if there is an online cpu in the package which collects rapl
571 * different than the unit from power unit MSR. 542 * events already.
572 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
573 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
574 */ 543 */
575 rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16; 544 target = cpumask_any_and(&rapl_cpu_mask, topology_core_cpumask(cpu));
545 if (target < nr_cpu_ids)
546 return;
547
548 cpumask_set_cpu(cpu, &rapl_cpu_mask);
549 pmu->cpu = cpu;
576} 550}
577 551
578static int rapl_cpu_prepare(int cpu) 552static int rapl_cpu_prepare(int cpu)
579{ 553{
580 struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu); 554 struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
581 int phys_id = topology_physical_package_id(cpu);
582 u64 ms;
583 555
584 if (pmu) 556 if (pmu)
585 return 0; 557 return 0;
586 558
587 if (phys_id < 0)
588 return -1;
589
590 pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu)); 559 pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
591 if (!pmu) 560 if (!pmu)
592 return -1; 561 return -ENOMEM;
593 spin_lock_init(&pmu->lock);
594 562
563 raw_spin_lock_init(&pmu->lock);
595 INIT_LIST_HEAD(&pmu->active_list); 564 INIT_LIST_HEAD(&pmu->active_list);
596 565 pmu->pmu = &rapl_pmus->pmu;
597 pmu->pmu = &rapl_pmu_class; 566 pmu->timer_interval = ms_to_ktime(rapl_timer_ms);
598 567 pmu->cpu = -1;
599 /*
600 * use reference of 200W for scaling the timeout
601 * to avoid missing counter overflows.
602 * 200W = 200 Joules/sec
603 * divide interval by 2 to avoid lockstep (2 * 100)
604 * if hw unit is 32, then we use 2 ms 1/200/2
605 */
606 if (rapl_hw_unit[0] < 32)
607 ms = (1000 / (2 * 100)) * (1ULL << (32 - rapl_hw_unit[0] - 1));
608 else
609 ms = 2;
610
611 pmu->timer_interval = ms_to_ktime(ms);
612
613 rapl_hrtimer_init(pmu); 568 rapl_hrtimer_init(pmu);
614 569 rapl_pmus->pmus[topology_logical_package_id(cpu)] = pmu;
615 /* set RAPL pmu for this cpu for now */
616 per_cpu(rapl_pmu, cpu) = pmu;
617 per_cpu(rapl_pmu_to_free, cpu) = NULL;
618
619 return 0;
620}
621
622static void rapl_cpu_kfree(int cpu)
623{
624 struct rapl_pmu *pmu = per_cpu(rapl_pmu_to_free, cpu);
625
626 kfree(pmu);
627
628 per_cpu(rapl_pmu_to_free, cpu) = NULL;
629}
630
631static int rapl_cpu_dying(int cpu)
632{
633 struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
634
635 if (!pmu)
636 return 0;
637
638 per_cpu(rapl_pmu, cpu) = NULL;
639
640 per_cpu(rapl_pmu_to_free, cpu) = pmu;
641
642 return 0; 570 return 0;
643} 571}
644 572
@@ -651,28 +579,20 @@ static int rapl_cpu_notifier(struct notifier_block *self,
651 case CPU_UP_PREPARE: 579 case CPU_UP_PREPARE:
652 rapl_cpu_prepare(cpu); 580 rapl_cpu_prepare(cpu);
653 break; 581 break;
654 case CPU_STARTING: 582
655 rapl_cpu_init(cpu); 583 case CPU_DOWN_FAILED:
656 break;
657 case CPU_UP_CANCELED:
658 case CPU_DYING:
659 rapl_cpu_dying(cpu);
660 break;
661 case CPU_ONLINE: 584 case CPU_ONLINE:
662 case CPU_DEAD: 585 rapl_cpu_init(cpu);
663 rapl_cpu_kfree(cpu);
664 break; 586 break;
587
665 case CPU_DOWN_PREPARE: 588 case CPU_DOWN_PREPARE:
666 rapl_cpu_exit(cpu); 589 rapl_cpu_exit(cpu);
667 break; 590 break;
668 default:
669 break;
670 } 591 }
671
672 return NOTIFY_OK; 592 return NOTIFY_OK;
673} 593}
674 594
675static int rapl_check_hw_unit(void) 595static int rapl_check_hw_unit(bool apply_quirk)
676{ 596{
677 u64 msr_rapl_power_unit_bits; 597 u64 msr_rapl_power_unit_bits;
678 int i; 598 int i;
@@ -683,28 +603,107 @@ static int rapl_check_hw_unit(void)
683 for (i = 0; i < NR_RAPL_DOMAINS; i++) 603 for (i = 0; i < NR_RAPL_DOMAINS; i++)
684 rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL; 604 rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
685 605
606 /*
607 * DRAM domain on HSW server and KNL has fixed energy unit which can be
608 * different than the unit from power unit MSR. See
609 * "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
610 * of 2. Datasheet, September 2014, Reference Number: 330784-001 "
611 */
612 if (apply_quirk)
613 rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
614
615 /*
616 * Calculate the timer rate:
617 * Use reference of 200W for scaling the timeout to avoid counter
618 * overflows. 200W = 200 Joules/sec
619 * Divide interval by 2 to avoid lockstep (2 * 100)
620 * if hw unit is 32, then we use 2 ms 1/200/2
621 */
622 rapl_timer_ms = 2;
623 if (rapl_hw_unit[0] < 32) {
624 rapl_timer_ms = (1000 / (2 * 100));
625 rapl_timer_ms *= (1ULL << (32 - rapl_hw_unit[0] - 1));
626 }
627 return 0;
628}
629
630static void __init rapl_advertise(void)
631{
632 int i;
633
634 pr_info("API unit is 2^-32 Joules, %d fixed counters, %llu ms ovfl timer\n",
635 hweight32(rapl_cntr_mask), rapl_timer_ms);
636
637 for (i = 0; i < NR_RAPL_DOMAINS; i++) {
638 if (rapl_cntr_mask & (1 << i)) {
639 pr_info("hw unit of domain %s 2^-%d Joules\n",
640 rapl_domain_names[i], rapl_hw_unit[i]);
641 }
642 }
643}
644
645static int __init rapl_prepare_cpus(void)
646{
647 unsigned int cpu, pkg;
648 int ret;
649
650 for_each_online_cpu(cpu) {
651 pkg = topology_logical_package_id(cpu);
652 if (rapl_pmus->pmus[pkg])
653 continue;
654
655 ret = rapl_cpu_prepare(cpu);
656 if (ret)
657 return ret;
658 rapl_cpu_init(cpu);
659 }
660 return 0;
661}
662
663static void __init cleanup_rapl_pmus(void)
664{
665 int i;
666
667 for (i = 0; i < rapl_pmus->maxpkg; i++)
668 kfree(rapl_pmus->pmus + i);
669 kfree(rapl_pmus);
670}
671
672static int __init init_rapl_pmus(void)
673{
674 int maxpkg = topology_max_packages();
675 size_t size;
676
677 size = sizeof(*rapl_pmus) + maxpkg * sizeof(struct rapl_pmu *);
678 rapl_pmus = kzalloc(size, GFP_KERNEL);
679 if (!rapl_pmus)
680 return -ENOMEM;
681
682 rapl_pmus->maxpkg = maxpkg;
683 rapl_pmus->pmu.attr_groups = rapl_attr_groups;
684 rapl_pmus->pmu.task_ctx_nr = perf_invalid_context;
685 rapl_pmus->pmu.event_init = rapl_pmu_event_init;
686 rapl_pmus->pmu.add = rapl_pmu_event_add;
687 rapl_pmus->pmu.del = rapl_pmu_event_del;
688 rapl_pmus->pmu.start = rapl_pmu_event_start;
689 rapl_pmus->pmu.stop = rapl_pmu_event_stop;
690 rapl_pmus->pmu.read = rapl_pmu_event_read;
686 return 0; 691 return 0;
687} 692}
688 693
689static const struct x86_cpu_id rapl_cpu_match[] = { 694static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
690 [0] = { .vendor = X86_VENDOR_INTEL, .family = 6 }, 695 [0] = { .vendor = X86_VENDOR_INTEL, .family = 6 },
691 [1] = {}, 696 [1] = {},
692}; 697};
693 698
694static int __init rapl_pmu_init(void) 699static int __init rapl_pmu_init(void)
695{ 700{
696 struct rapl_pmu *pmu; 701 bool apply_quirk = false;
697 int cpu, ret; 702 int ret;
698 struct x86_pmu_quirk *quirk;
699 int i;
700 703
701 /*
702 * check for Intel processor family 6
703 */
704 if (!x86_match_cpu(rapl_cpu_match)) 704 if (!x86_match_cpu(rapl_cpu_match))
705 return 0; 705 return -ENODEV;
706 706
707 /* check supported CPU */
708 switch (boot_cpu_data.x86_model) { 707 switch (boot_cpu_data.x86_model) {
709 case 42: /* Sandy Bridge */ 708 case 42: /* Sandy Bridge */
710 case 58: /* Ivy Bridge */ 709 case 58: /* Ivy Bridge */
@@ -712,7 +711,7 @@ static int __init rapl_pmu_init(void)
712 rapl_pmu_events_group.attrs = rapl_events_cln_attr; 711 rapl_pmu_events_group.attrs = rapl_events_cln_attr;
713 break; 712 break;
714 case 63: /* Haswell-Server */ 713 case 63: /* Haswell-Server */
715 rapl_add_quirk(rapl_hsw_server_quirk); 714 apply_quirk = true;
716 rapl_cntr_mask = RAPL_IDX_SRV; 715 rapl_cntr_mask = RAPL_IDX_SRV;
717 rapl_pmu_events_group.attrs = rapl_events_srv_attr; 716 rapl_pmu_events_group.attrs = rapl_events_srv_attr;
718 break; 717 break;
@@ -728,56 +727,41 @@ static int __init rapl_pmu_init(void)
728 rapl_pmu_events_group.attrs = rapl_events_srv_attr; 727 rapl_pmu_events_group.attrs = rapl_events_srv_attr;
729 break; 728 break;
730 case 87: /* Knights Landing */ 729 case 87: /* Knights Landing */
731 rapl_add_quirk(rapl_hsw_server_quirk); 730 apply_quirk = true;
732 rapl_cntr_mask = RAPL_IDX_KNL; 731 rapl_cntr_mask = RAPL_IDX_KNL;
733 rapl_pmu_events_group.attrs = rapl_events_knl_attr; 732 rapl_pmu_events_group.attrs = rapl_events_knl_attr;
734 733 break;
735 default: 734 default:
736 /* unsupported */ 735 return -ENODEV;
737 return 0;
738 } 736 }
739 ret = rapl_check_hw_unit(); 737
738 ret = rapl_check_hw_unit(apply_quirk);
740 if (ret) 739 if (ret)
741 return ret; 740 return ret;
742 741
743 /* run cpu model quirks */ 742 ret = init_rapl_pmus();
744 for (quirk = rapl_quirks; quirk; quirk = quirk->next) 743 if (ret)
745 quirk->func(); 744 return ret;
746 cpu_notifier_register_begin();
747 745
748 for_each_online_cpu(cpu) { 746 cpu_notifier_register_begin();
749 ret = rapl_cpu_prepare(cpu);
750 if (ret)
751 goto out;
752 rapl_cpu_init(cpu);
753 }
754 747
755 __perf_cpu_notifier(rapl_cpu_notifier); 748 ret = rapl_prepare_cpus();
749 if (ret)
750 goto out;
756 751
757 ret = perf_pmu_register(&rapl_pmu_class, "power", -1); 752 ret = perf_pmu_register(&rapl_pmus->pmu, "power", -1);
758 if (WARN_ON(ret)) { 753 if (ret)
759 pr_info("RAPL PMU detected, registration failed (%d), RAPL PMU disabled\n", ret); 754 goto out;
760 cpu_notifier_register_done();
761 return -1;
762 }
763 755
764 pmu = __this_cpu_read(rapl_pmu); 756 __perf_cpu_notifier(rapl_cpu_notifier);
757 cpu_notifier_register_done();
758 rapl_advertise();
759 return 0;
765 760
766 pr_info("RAPL PMU detected,"
767 " API unit is 2^-32 Joules,"
768 " %d fixed counters"
769 " %llu ms ovfl timer\n",
770 hweight32(rapl_cntr_mask),
771 ktime_to_ms(pmu->timer_interval));
772 for (i = 0; i < NR_RAPL_DOMAINS; i++) {
773 if (rapl_cntr_mask & (1 << i)) {
774 pr_info("hw unit of domain %s 2^-%d Joules\n",
775 rapl_domain_names[i], rapl_hw_unit[i]);
776 }
777 }
778out: 761out:
762 pr_warn("Initialization failed (%d), disabled\n", ret);
763 cleanup_rapl_pmus();
779 cpu_notifier_register_done(); 764 cpu_notifier_register_done();
780 765 return ret;
781 return 0;
782} 766}
783device_initcall(rapl_pmu_init); 767device_initcall(rapl_pmu_init);
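
The rapl.c hunks above replace the old per-CPU RAPL state with a single rapl_pmus container holding one rapl_pmu pointer per logical package (see init_rapl_pmus() and rapl_prepare_cpus()). The standalone C sketch below is not part of the patch; all names are illustrative stand-ins, not kernel API. It only shows the shape of the pattern: a container with a flexible array member sized by the maximum package count, filled in lazily by the first CPU seen in each package.

/* Sketch of a per-package container with a flexible array member. */
#include <stdio.h>
#include <stdlib.h>

struct pkg_pmu {
    int pkg_id;                     /* which logical package this covers */
};

struct pmu_container {
    unsigned int maxpkg;
    struct pkg_pmu *pmus[];         /* one slot per logical package */
};

static struct pmu_container *alloc_container(unsigned int maxpkg)
{
    struct pmu_container *c;

    c = calloc(1, sizeof(*c) + maxpkg * sizeof(struct pkg_pmu *));
    if (!c)
        return NULL;
    c->maxpkg = maxpkg;
    return c;
}

/* The first CPU seen in a package allocates that package's state. */
static int prepare_cpu(struct pmu_container *c, unsigned int pkg)
{
    if (pkg >= c->maxpkg)
        return -1;
    if (c->pmus[pkg])               /* package already prepared */
        return 0;
    c->pmus[pkg] = calloc(1, sizeof(*c->pmus[pkg]));
    if (!c->pmus[pkg])
        return -1;
    c->pmus[pkg]->pkg_id = (int)pkg;
    return 0;
}

int main(void)
{
    int cpu_to_pkg[] = { 0, 0, 1, 1 };          /* fake topology */
    struct pmu_container *c = alloc_container(2);

    if (!c)
        return 1;
    for (int cpu = 0; cpu < 4; cpu++)
        if (prepare_cpu(c, cpu_to_pkg[cpu]))
            return 1;
    for (unsigned int p = 0; p < c->maxpkg; p++)
        printf("package %u prepared: %s\n", p, c->pmus[p] ? "yes" : "no");
    return 0;
}
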
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/events/intel/uncore.c
index 3bf41d413775..7012d18bb293 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1,4 +1,4 @@
1#include "perf_event_intel_uncore.h" 1#include "uncore.h"
2 2
3static struct intel_uncore_type *empty_uncore[] = { NULL, }; 3static struct intel_uncore_type *empty_uncore[] = { NULL, };
4struct intel_uncore_type **uncore_msr_uncores = empty_uncore; 4struct intel_uncore_type **uncore_msr_uncores = empty_uncore;
@@ -9,9 +9,9 @@ struct pci_driver *uncore_pci_driver;
9/* pci bus to socket mapping */ 9/* pci bus to socket mapping */
10DEFINE_RAW_SPINLOCK(pci2phy_map_lock); 10DEFINE_RAW_SPINLOCK(pci2phy_map_lock);
11struct list_head pci2phy_map_head = LIST_HEAD_INIT(pci2phy_map_head); 11struct list_head pci2phy_map_head = LIST_HEAD_INIT(pci2phy_map_head);
12struct pci_dev *uncore_extra_pci_dev[UNCORE_SOCKET_MAX][UNCORE_EXTRA_PCI_DEV_MAX]; 12struct pci_extra_dev *uncore_extra_pci_dev;
13static int max_packages;
13 14
14static DEFINE_RAW_SPINLOCK(uncore_box_lock);
15/* mask of cpus that collect uncore events */ 15/* mask of cpus that collect uncore events */
16static cpumask_t uncore_cpu_mask; 16static cpumask_t uncore_cpu_mask;
17 17
@@ -21,7 +21,7 @@ static struct event_constraint uncore_constraint_fixed =
21struct event_constraint uncore_constraint_empty = 21struct event_constraint uncore_constraint_empty =
22 EVENT_CONSTRAINT(0, 0, 0); 22 EVENT_CONSTRAINT(0, 0, 0);
23 23
24int uncore_pcibus_to_physid(struct pci_bus *bus) 24static int uncore_pcibus_to_physid(struct pci_bus *bus)
25{ 25{
26 struct pci2phy_map *map; 26 struct pci2phy_map *map;
27 int phys_id = -1; 27 int phys_id = -1;
@@ -38,6 +38,16 @@ int uncore_pcibus_to_physid(struct pci_bus *bus)
38 return phys_id; 38 return phys_id;
39} 39}
40 40
41static void uncore_free_pcibus_map(void)
42{
43 struct pci2phy_map *map, *tmp;
44
45 list_for_each_entry_safe(map, tmp, &pci2phy_map_head, list) {
46 list_del(&map->list);
47 kfree(map);
48 }
49}
50
41struct pci2phy_map *__find_pci2phy_map(int segment) 51struct pci2phy_map *__find_pci2phy_map(int segment)
42{ 52{
43 struct pci2phy_map *map, *alloc = NULL; 53 struct pci2phy_map *map, *alloc = NULL;
@@ -82,43 +92,9 @@ ssize_t uncore_event_show(struct kobject *kobj,
82 return sprintf(buf, "%s", event->config); 92 return sprintf(buf, "%s", event->config);
83} 93}
84 94
85struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
86{
87 return container_of(event->pmu, struct intel_uncore_pmu, pmu);
88}
89
90struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu) 95struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
91{ 96{
92 struct intel_uncore_box *box; 97 return pmu->boxes[topology_logical_package_id(cpu)];
93
94 box = *per_cpu_ptr(pmu->box, cpu);
95 if (box)
96 return box;
97
98 raw_spin_lock(&uncore_box_lock);
99 /* Recheck in lock to handle races. */
100 if (*per_cpu_ptr(pmu->box, cpu))
101 goto out;
102 list_for_each_entry(box, &pmu->box_list, list) {
103 if (box->phys_id == topology_physical_package_id(cpu)) {
104 atomic_inc(&box->refcnt);
105 *per_cpu_ptr(pmu->box, cpu) = box;
106 break;
107 }
108 }
109out:
110 raw_spin_unlock(&uncore_box_lock);
111
112 return *per_cpu_ptr(pmu->box, cpu);
113}
114
115struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
116{
117 /*
118 * perf core schedules event on the basis of cpu, uncore events are
119 * collected by one of the cpus inside a physical package.
120 */
121 return uncore_pmu_to_box(uncore_event_to_pmu(event), smp_processor_id());
122} 98}
123 99
124u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event) 100u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
@@ -207,7 +183,8 @@ u64 uncore_shared_reg_config(struct intel_uncore_box *box, int idx)
207 return config; 183 return config;
208} 184}
209 185
210static void uncore_assign_hw_event(struct intel_uncore_box *box, struct perf_event *event, int idx) 186static void uncore_assign_hw_event(struct intel_uncore_box *box,
187 struct perf_event *event, int idx)
211{ 188{
212 struct hw_perf_event *hwc = &event->hw; 189 struct hw_perf_event *hwc = &event->hw;
213 190
@@ -302,24 +279,25 @@ static void uncore_pmu_init_hrtimer(struct intel_uncore_box *box)
302 box->hrtimer.function = uncore_pmu_hrtimer; 279 box->hrtimer.function = uncore_pmu_hrtimer;
303} 280}
304 281
305static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type, int node) 282static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
283 int node)
306{ 284{
285 int i, size, numshared = type->num_shared_regs ;
307 struct intel_uncore_box *box; 286 struct intel_uncore_box *box;
308 int i, size;
309 287
310 size = sizeof(*box) + type->num_shared_regs * sizeof(struct intel_uncore_extra_reg); 288 size = sizeof(*box) + numshared * sizeof(struct intel_uncore_extra_reg);
311 289
312 box = kzalloc_node(size, GFP_KERNEL, node); 290 box = kzalloc_node(size, GFP_KERNEL, node);
313 if (!box) 291 if (!box)
314 return NULL; 292 return NULL;
315 293
316 for (i = 0; i < type->num_shared_regs; i++) 294 for (i = 0; i < numshared; i++)
317 raw_spin_lock_init(&box->shared_regs[i].lock); 295 raw_spin_lock_init(&box->shared_regs[i].lock);
318 296
319 uncore_pmu_init_hrtimer(box); 297 uncore_pmu_init_hrtimer(box);
320 atomic_set(&box->refcnt, 1);
321 box->cpu = -1; 298 box->cpu = -1;
322 box->phys_id = -1; 299 box->pci_phys_id = -1;
300 box->pkgid = -1;
323 301
324 /* set default hrtimer timeout */ 302 /* set default hrtimer timeout */
325 box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL; 303 box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL;
@@ -341,7 +319,8 @@ static bool is_uncore_event(struct perf_event *event)
341} 319}
342 320
343static int 321static int
344uncore_collect_events(struct intel_uncore_box *box, struct perf_event *leader, bool dogrp) 322uncore_collect_events(struct intel_uncore_box *box, struct perf_event *leader,
323 bool dogrp)
345{ 324{
346 struct perf_event *event; 325 struct perf_event *event;
347 int n, max_count; 326 int n, max_count;
@@ -402,7 +381,8 @@ uncore_get_event_constraint(struct intel_uncore_box *box, struct perf_event *eve
402 return &type->unconstrainted; 381 return &type->unconstrainted;
403} 382}
404 383
405static void uncore_put_event_constraint(struct intel_uncore_box *box, struct perf_event *event) 384static void uncore_put_event_constraint(struct intel_uncore_box *box,
385 struct perf_event *event)
406{ 386{
407 if (box->pmu->type->ops->put_constraint) 387 if (box->pmu->type->ops->put_constraint)
408 box->pmu->type->ops->put_constraint(box, event); 388 box->pmu->type->ops->put_constraint(box, event);
@@ -582,7 +562,7 @@ static void uncore_pmu_event_del(struct perf_event *event, int flags)
582 if (event == box->event_list[i]) { 562 if (event == box->event_list[i]) {
583 uncore_put_event_constraint(box, event); 563 uncore_put_event_constraint(box, event);
584 564
585 while (++i < box->n_events) 565 for (++i; i < box->n_events; i++)
586 box->event_list[i - 1] = box->event_list[i]; 566 box->event_list[i - 1] = box->event_list[i];
587 567
588 --box->n_events; 568 --box->n_events;
@@ -676,6 +656,7 @@ static int uncore_pmu_event_init(struct perf_event *event)
676 if (!box || box->cpu < 0) 656 if (!box || box->cpu < 0)
677 return -EINVAL; 657 return -EINVAL;
678 event->cpu = box->cpu; 658 event->cpu = box->cpu;
659 event->pmu_private = box;
679 660
680 event->hw.idx = -1; 661 event->hw.idx = -1;
681 event->hw.last_tag = ~0ULL; 662 event->hw.last_tag = ~0ULL;
@@ -760,64 +741,110 @@ static int uncore_pmu_register(struct intel_uncore_pmu *pmu)
760 } 741 }
761 742
762 ret = perf_pmu_register(&pmu->pmu, pmu->name, -1); 743 ret = perf_pmu_register(&pmu->pmu, pmu->name, -1);
744 if (!ret)
745 pmu->registered = true;
763 return ret; 746 return ret;
764} 747}
765 748
749static void uncore_pmu_unregister(struct intel_uncore_pmu *pmu)
750{
751 if (!pmu->registered)
752 return;
753 perf_pmu_unregister(&pmu->pmu);
754 pmu->registered = false;
755}
756
757static void __init __uncore_exit_boxes(struct intel_uncore_type *type, int cpu)
758{
759 struct intel_uncore_pmu *pmu = type->pmus;
760 struct intel_uncore_box *box;
761 int i, pkg;
762
763 if (pmu) {
764 pkg = topology_physical_package_id(cpu);
765 for (i = 0; i < type->num_boxes; i++, pmu++) {
766 box = pmu->boxes[pkg];
767 if (box)
768 uncore_box_exit(box);
769 }
770 }
771}
772
773static void __init uncore_exit_boxes(void *dummy)
774{
775 struct intel_uncore_type **types;
776
777 for (types = uncore_msr_uncores; *types; types++)
778 __uncore_exit_boxes(*types++, smp_processor_id());
779}
780
781static void uncore_free_boxes(struct intel_uncore_pmu *pmu)
782{
783 int pkg;
784
785 for (pkg = 0; pkg < max_packages; pkg++)
786 kfree(pmu->boxes[pkg]);
787 kfree(pmu->boxes);
788}
789
766static void __init uncore_type_exit(struct intel_uncore_type *type) 790static void __init uncore_type_exit(struct intel_uncore_type *type)
767{ 791{
792 struct intel_uncore_pmu *pmu = type->pmus;
768 int i; 793 int i;
769 794
770 for (i = 0; i < type->num_boxes; i++) 795 if (pmu) {
771 free_percpu(type->pmus[i].box); 796 for (i = 0; i < type->num_boxes; i++, pmu++) {
772 kfree(type->pmus); 797 uncore_pmu_unregister(pmu);
773 type->pmus = NULL; 798 uncore_free_boxes(pmu);
799 }
800 kfree(type->pmus);
801 type->pmus = NULL;
802 }
774 kfree(type->events_group); 803 kfree(type->events_group);
775 type->events_group = NULL; 804 type->events_group = NULL;
776} 805}
777 806
778static void __init uncore_types_exit(struct intel_uncore_type **types) 807static void __init uncore_types_exit(struct intel_uncore_type **types)
779{ 808{
780 int i; 809 for (; *types; types++)
781 for (i = 0; types[i]; i++) 810 uncore_type_exit(*types);
782 uncore_type_exit(types[i]);
783} 811}
784 812
785static int __init uncore_type_init(struct intel_uncore_type *type) 813static int __init uncore_type_init(struct intel_uncore_type *type, bool setid)
786{ 814{
787 struct intel_uncore_pmu *pmus; 815 struct intel_uncore_pmu *pmus;
788 struct attribute_group *attr_group; 816 struct attribute_group *attr_group;
789 struct attribute **attrs; 817 struct attribute **attrs;
818 size_t size;
790 int i, j; 819 int i, j;
791 820
792 pmus = kzalloc(sizeof(*pmus) * type->num_boxes, GFP_KERNEL); 821 pmus = kzalloc(sizeof(*pmus) * type->num_boxes, GFP_KERNEL);
793 if (!pmus) 822 if (!pmus)
794 return -ENOMEM; 823 return -ENOMEM;
795 824
796 type->pmus = pmus; 825 size = max_packages * sizeof(struct intel_uncore_box *);
797 826
827 for (i = 0; i < type->num_boxes; i++) {
828 pmus[i].func_id = setid ? i : -1;
829 pmus[i].pmu_idx = i;
830 pmus[i].type = type;
831 pmus[i].boxes = kzalloc(size, GFP_KERNEL);
832 if (!pmus[i].boxes)
833 return -ENOMEM;
834 }
835
836 type->pmus = pmus;
798 type->unconstrainted = (struct event_constraint) 837 type->unconstrainted = (struct event_constraint)
799 __EVENT_CONSTRAINT(0, (1ULL << type->num_counters) - 1, 838 __EVENT_CONSTRAINT(0, (1ULL << type->num_counters) - 1,
800 0, type->num_counters, 0, 0); 839 0, type->num_counters, 0, 0);
801 840
802 for (i = 0; i < type->num_boxes; i++) {
803 pmus[i].func_id = -1;
804 pmus[i].pmu_idx = i;
805 pmus[i].type = type;
806 INIT_LIST_HEAD(&pmus[i].box_list);
807 pmus[i].box = alloc_percpu(struct intel_uncore_box *);
808 if (!pmus[i].box)
809 goto fail;
810 }
811
812 if (type->event_descs) { 841 if (type->event_descs) {
813 i = 0; 842 for (i = 0; type->event_descs[i].attr.attr.name; i++);
814 while (type->event_descs[i].attr.attr.name)
815 i++;
816 843
817 attr_group = kzalloc(sizeof(struct attribute *) * (i + 1) + 844 attr_group = kzalloc(sizeof(struct attribute *) * (i + 1) +
818 sizeof(*attr_group), GFP_KERNEL); 845 sizeof(*attr_group), GFP_KERNEL);
819 if (!attr_group) 846 if (!attr_group)
820 goto fail; 847 return -ENOMEM;
821 848
822 attrs = (struct attribute **)(attr_group + 1); 849 attrs = (struct attribute **)(attr_group + 1);
823 attr_group->name = "events"; 850 attr_group->name = "events";
@@ -831,25 +858,19 @@ static int __init uncore_type_init(struct intel_uncore_type *type)
831 858
832 type->pmu_group = &uncore_pmu_attr_group; 859 type->pmu_group = &uncore_pmu_attr_group;
833 return 0; 860 return 0;
834fail:
835 uncore_type_exit(type);
836 return -ENOMEM;
837} 861}
838 862
839static int __init uncore_types_init(struct intel_uncore_type **types) 863static int __init
864uncore_types_init(struct intel_uncore_type **types, bool setid)
840{ 865{
841 int i, ret; 866 int ret;
842 867
843 for (i = 0; types[i]; i++) { 868 for (; *types; types++) {
844 ret = uncore_type_init(types[i]); 869 ret = uncore_type_init(*types, setid);
845 if (ret) 870 if (ret)
846 goto fail; 871 return ret;
847 } 872 }
848 return 0; 873 return 0;
849fail:
850 while (--i >= 0)
851 uncore_type_exit(types[i]);
852 return ret;
853} 874}
854 875
855/* 876/*
@@ -857,28 +878,28 @@ fail:
857 */ 878 */
858static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) 879static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
859{ 880{
881 struct intel_uncore_type *type;
860 struct intel_uncore_pmu *pmu; 882 struct intel_uncore_pmu *pmu;
861 struct intel_uncore_box *box; 883 struct intel_uncore_box *box;
862 struct intel_uncore_type *type; 884 int phys_id, pkg, ret;
863 int phys_id;
864 bool first_box = false;
865 885
866 phys_id = uncore_pcibus_to_physid(pdev->bus); 886 phys_id = uncore_pcibus_to_physid(pdev->bus);
867 if (phys_id < 0) 887 if (phys_id < 0)
868 return -ENODEV; 888 return -ENODEV;
869 889
890 pkg = topology_phys_to_logical_pkg(phys_id);
891 if (WARN_ON_ONCE(pkg < 0))
892 return -EINVAL;
893
870 if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) { 894 if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) {
871 int idx = UNCORE_PCI_DEV_IDX(id->driver_data); 895 int idx = UNCORE_PCI_DEV_IDX(id->driver_data);
872 uncore_extra_pci_dev[phys_id][idx] = pdev; 896
897 uncore_extra_pci_dev[pkg].dev[idx] = pdev;
873 pci_set_drvdata(pdev, NULL); 898 pci_set_drvdata(pdev, NULL);
874 return 0; 899 return 0;
875 } 900 }
876 901
877 type = uncore_pci_uncores[UNCORE_PCI_DEV_TYPE(id->driver_data)]; 902 type = uncore_pci_uncores[UNCORE_PCI_DEV_TYPE(id->driver_data)];
878 box = uncore_alloc_box(type, NUMA_NO_NODE);
879 if (!box)
880 return -ENOMEM;
881
882 /* 903 /*
883 * for performance monitoring unit with multiple boxes, 904 * for performance monitoring unit with multiple boxes,
884 * each box has a different function id. 905 * each box has a different function id.
@@ -890,44 +911,60 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
890 * some device types. Hence PCI device idx would be 0 for all devices. 911 * some device types. Hence PCI device idx would be 0 for all devices.
891 * So increment pmu pointer to point to an unused array element. 912 * So increment pmu pointer to point to an unused array element.
892 */ 913 */
893 if (boot_cpu_data.x86_model == 87) 914 if (boot_cpu_data.x86_model == 87) {
894 while (pmu->func_id >= 0) 915 while (pmu->func_id >= 0)
895 pmu++; 916 pmu++;
917 }
918
919 if (WARN_ON_ONCE(pmu->boxes[pkg] != NULL))
920 return -EINVAL;
921
922 box = uncore_alloc_box(type, NUMA_NO_NODE);
923 if (!box)
924 return -ENOMEM;
925
896 if (pmu->func_id < 0) 926 if (pmu->func_id < 0)
897 pmu->func_id = pdev->devfn; 927 pmu->func_id = pdev->devfn;
898 else 928 else
899 WARN_ON_ONCE(pmu->func_id != pdev->devfn); 929 WARN_ON_ONCE(pmu->func_id != pdev->devfn);
900 930
901 box->phys_id = phys_id; 931 atomic_inc(&box->refcnt);
932 box->pci_phys_id = phys_id;
933 box->pkgid = pkg;
902 box->pci_dev = pdev; 934 box->pci_dev = pdev;
903 box->pmu = pmu; 935 box->pmu = pmu;
904 uncore_box_init(box); 936 uncore_box_init(box);
905 pci_set_drvdata(pdev, box); 937 pci_set_drvdata(pdev, box);
906 938
907 raw_spin_lock(&uncore_box_lock); 939 pmu->boxes[pkg] = box;
908 if (list_empty(&pmu->box_list)) 940 if (atomic_inc_return(&pmu->activeboxes) > 1)
909 first_box = true; 941 return 0;
910 list_add_tail(&box->list, &pmu->box_list);
911 raw_spin_unlock(&uncore_box_lock);
912 942
913 if (first_box) 943 /* First active box registers the pmu */
914 uncore_pmu_register(pmu); 944 ret = uncore_pmu_register(pmu);
915 return 0; 945 if (ret) {
946 pci_set_drvdata(pdev, NULL);
947 pmu->boxes[pkg] = NULL;
948 uncore_box_exit(box);
949 kfree(box);
950 }
951 return ret;
916} 952}
917 953
918static void uncore_pci_remove(struct pci_dev *pdev) 954static void uncore_pci_remove(struct pci_dev *pdev)
919{ 955{
920 struct intel_uncore_box *box = pci_get_drvdata(pdev); 956 struct intel_uncore_box *box = pci_get_drvdata(pdev);
921 struct intel_uncore_pmu *pmu; 957 struct intel_uncore_pmu *pmu;
922 int i, cpu, phys_id; 958 int i, phys_id, pkg;
923 bool last_box = false;
924 959
925 phys_id = uncore_pcibus_to_physid(pdev->bus); 960 phys_id = uncore_pcibus_to_physid(pdev->bus);
961 pkg = topology_phys_to_logical_pkg(phys_id);
962
926 box = pci_get_drvdata(pdev); 963 box = pci_get_drvdata(pdev);
927 if (!box) { 964 if (!box) {
928 for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) { 965 for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
929 if (uncore_extra_pci_dev[phys_id][i] == pdev) { 966 if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
930 uncore_extra_pci_dev[phys_id][i] = NULL; 967 uncore_extra_pci_dev[pkg].dev[i] = NULL;
931 break; 968 break;
932 } 969 }
933 } 970 }
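
In the rewritten uncore_pci_probe()/uncore_pci_remove() above, PMU registration is tied to an atomic active-box count: the box that takes the count from 0 to 1 registers the PMU, and the one that drops it back to 0 unregisters it. The standalone C11 sketch below is not part of the patch; the C11 atomics merely stand in for the kernel's atomic_inc_return()/atomic_dec_return().

/* First/last active box accounting, reduced to plain C11 atomics. */
#include <stdatomic.h>
#include <stdio.h>

static atomic_int activeboxes;

static void box_added(void)
{
    if (atomic_fetch_add(&activeboxes, 1) + 1 > 1)
        return;                     /* PMU is already registered */
    printf("first active box: register PMU\n");
}

static void box_removed(void)
{
    if (atomic_fetch_sub(&activeboxes, 1) - 1 > 0)
        return;                     /* other boxes still active */
    printf("last active box gone: unregister PMU\n");
}

int main(void)
{
    box_added();        /* registers */
    box_added();        /* no-op */
    box_removed();      /* no-op */
    box_removed();      /* unregisters */
    return 0;
}
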
@@ -936,33 +973,20 @@ static void uncore_pci_remove(struct pci_dev *pdev)
936 } 973 }
937 974
938 pmu = box->pmu; 975 pmu = box->pmu;
939 if (WARN_ON_ONCE(phys_id != box->phys_id)) 976 if (WARN_ON_ONCE(phys_id != box->pci_phys_id))
940 return; 977 return;
941 978
942 pci_set_drvdata(pdev, NULL); 979 pci_set_drvdata(pdev, NULL);
943 980 pmu->boxes[pkg] = NULL;
944 raw_spin_lock(&uncore_box_lock); 981 if (atomic_dec_return(&pmu->activeboxes) == 0)
945 list_del(&box->list); 982 uncore_pmu_unregister(pmu);
946 if (list_empty(&pmu->box_list)) 983 uncore_box_exit(box);
947 last_box = true;
948 raw_spin_unlock(&uncore_box_lock);
949
950 for_each_possible_cpu(cpu) {
951 if (*per_cpu_ptr(pmu->box, cpu) == box) {
952 *per_cpu_ptr(pmu->box, cpu) = NULL;
953 atomic_dec(&box->refcnt);
954 }
955 }
956
957 WARN_ON_ONCE(atomic_read(&box->refcnt) != 1);
958 kfree(box); 984 kfree(box);
959
960 if (last_box)
961 perf_pmu_unregister(&pmu->pmu);
962} 985}
963 986
964static int __init uncore_pci_init(void) 987static int __init uncore_pci_init(void)
965{ 988{
989 size_t size;
966 int ret; 990 int ret;
967 991
968 switch (boot_cpu_data.x86_model) { 992 switch (boot_cpu_data.x86_model) {
@@ -999,25 +1023,40 @@ static int __init uncore_pci_init(void)
999 ret = skl_uncore_pci_init(); 1023 ret = skl_uncore_pci_init();
1000 break; 1024 break;
1001 default: 1025 default:
1002 return 0; 1026 return -ENODEV;
1003 } 1027 }
1004 1028
1005 if (ret) 1029 if (ret)
1006 return ret; 1030 return ret;
1007 1031
1008 ret = uncore_types_init(uncore_pci_uncores); 1032 size = max_packages * sizeof(struct pci_extra_dev);
1033 uncore_extra_pci_dev = kzalloc(size, GFP_KERNEL);
1034 if (!uncore_extra_pci_dev) {
1035 ret = -ENOMEM;
1036 goto err;
1037 }
1038
1039 ret = uncore_types_init(uncore_pci_uncores, false);
1009 if (ret) 1040 if (ret)
1010 return ret; 1041 goto errtype;
1011 1042
1012 uncore_pci_driver->probe = uncore_pci_probe; 1043 uncore_pci_driver->probe = uncore_pci_probe;
1013 uncore_pci_driver->remove = uncore_pci_remove; 1044 uncore_pci_driver->remove = uncore_pci_remove;
1014 1045
1015 ret = pci_register_driver(uncore_pci_driver); 1046 ret = pci_register_driver(uncore_pci_driver);
1016 if (ret == 0) 1047 if (ret)
1017 pcidrv_registered = true; 1048 goto errtype;
1018 else 1049
1019 uncore_types_exit(uncore_pci_uncores); 1050 pcidrv_registered = true;
1051 return 0;
1020 1052
1053errtype:
1054 uncore_types_exit(uncore_pci_uncores);
1055 kfree(uncore_extra_pci_dev);
1056 uncore_extra_pci_dev = NULL;
1057 uncore_free_pcibus_map();
1058err:
1059 uncore_pci_uncores = empty_uncore;
1021 return ret; 1060 return ret;
1022} 1061}
1023 1062
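
uncore_pci_init() above gains a goto-based unwind path: when a later step fails, the per-package extra-device table, the type initialization and the pcibus map are all undone. The minimal sketch below shows the pattern in isolation; it is standalone C with hypothetical names, and the failing step is simulated.

/* Goto-based error unwind: later failures undo earlier setup steps. */
#include <stdio.h>
#include <stdlib.h>

static int register_driver(void)
{
    return -1;                          /* simulate registration failure */
}

static int pci_init(void)
{
    void *extra_dev, *types;
    int ret;

    extra_dev = calloc(4, sizeof(void *));      /* per-package table */
    if (!extra_dev)
        return -1;

    types = calloc(1, 64);                      /* per-type state */
    if (!types) {
        ret = -1;
        goto err_extra;
    }

    ret = register_driver();
    if (ret)
        goto err_types;
    return 0;

err_types:
    free(types);                                /* undo type init */
err_extra:
    free(extra_dev);                            /* undo table allocation */
    printf("pci init failed (%d), cleaned up\n", ret);
    return ret;
}

int main(void)
{
    return pci_init() ? 1 : 0;
}
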
@@ -1027,173 +1066,139 @@ static void __init uncore_pci_exit(void)
1027 pcidrv_registered = false; 1066 pcidrv_registered = false;
1028 pci_unregister_driver(uncore_pci_driver); 1067 pci_unregister_driver(uncore_pci_driver);
1029 uncore_types_exit(uncore_pci_uncores); 1068 uncore_types_exit(uncore_pci_uncores);
1030 } 1069 kfree(uncore_extra_pci_dev);
1031} 1070 uncore_free_pcibus_map();
1032
1033/* CPU hot plug/unplug are serialized by cpu_add_remove_lock mutex */
1034static LIST_HEAD(boxes_to_free);
1035
1036static void uncore_kfree_boxes(void)
1037{
1038 struct intel_uncore_box *box;
1039
1040 while (!list_empty(&boxes_to_free)) {
1041 box = list_entry(boxes_to_free.next,
1042 struct intel_uncore_box, list);
1043 list_del(&box->list);
1044 kfree(box);
1045 } 1071 }
1046} 1072}
1047 1073
1048static void uncore_cpu_dying(int cpu) 1074static void uncore_cpu_dying(int cpu)
1049{ 1075{
1050 struct intel_uncore_type *type; 1076 struct intel_uncore_type *type, **types = uncore_msr_uncores;
1051 struct intel_uncore_pmu *pmu; 1077 struct intel_uncore_pmu *pmu;
1052 struct intel_uncore_box *box; 1078 struct intel_uncore_box *box;
1053 int i, j; 1079 int i, pkg;
1054 1080
1055 for (i = 0; uncore_msr_uncores[i]; i++) { 1081 pkg = topology_logical_package_id(cpu);
1056 type = uncore_msr_uncores[i]; 1082 for (; *types; types++) {
1057 for (j = 0; j < type->num_boxes; j++) { 1083 type = *types;
1058 pmu = &type->pmus[j]; 1084 pmu = type->pmus;
1059 box = *per_cpu_ptr(pmu->box, cpu); 1085 for (i = 0; i < type->num_boxes; i++, pmu++) {
1060 *per_cpu_ptr(pmu->box, cpu) = NULL; 1086 box = pmu->boxes[pkg];
1061 if (box && atomic_dec_and_test(&box->refcnt)) 1087 if (box && atomic_dec_return(&box->refcnt) == 0)
1062 list_add(&box->list, &boxes_to_free); 1088 uncore_box_exit(box);
1063 } 1089 }
1064 } 1090 }
1065} 1091}
1066 1092
1067static int uncore_cpu_starting(int cpu) 1093static void uncore_cpu_starting(int cpu, bool init)
1068{ 1094{
1069 struct intel_uncore_type *type; 1095 struct intel_uncore_type *type, **types = uncore_msr_uncores;
1070 struct intel_uncore_pmu *pmu; 1096 struct intel_uncore_pmu *pmu;
1071 struct intel_uncore_box *box, *exist; 1097 struct intel_uncore_box *box;
1072 int i, j, k, phys_id; 1098 int i, pkg, ncpus = 1;
1073
1074 phys_id = topology_physical_package_id(cpu);
1075
1076 for (i = 0; uncore_msr_uncores[i]; i++) {
1077 type = uncore_msr_uncores[i];
1078 for (j = 0; j < type->num_boxes; j++) {
1079 pmu = &type->pmus[j];
1080 box = *per_cpu_ptr(pmu->box, cpu);
1081 /* called by uncore_cpu_init? */
1082 if (box && box->phys_id >= 0) {
1083 uncore_box_init(box);
1084 continue;
1085 }
1086 1099
1087 for_each_online_cpu(k) { 1100 if (init) {
1088 exist = *per_cpu_ptr(pmu->box, k); 1101 /*
1089 if (exist && exist->phys_id == phys_id) { 1102 * On init we get the number of online cpus in the package
1090 atomic_inc(&exist->refcnt); 1103 * and set refcount for all of them.
1091 *per_cpu_ptr(pmu->box, cpu) = exist; 1104 */
1092 if (box) { 1105 ncpus = cpumask_weight(topology_core_cpumask(cpu));
1093 list_add(&box->list, 1106 }
1094 &boxes_to_free);
1095 box = NULL;
1096 }
1097 break;
1098 }
1099 }
1100 1107
1101 if (box) { 1108 pkg = topology_logical_package_id(cpu);
1102 box->phys_id = phys_id; 1109 for (; *types; types++) {
1110 type = *types;
1111 pmu = type->pmus;
1112 for (i = 0; i < type->num_boxes; i++, pmu++) {
1113 box = pmu->boxes[pkg];
1114 if (!box)
1115 continue;
1116 /* The first cpu on a package activates the box */
1117 if (atomic_add_return(ncpus, &box->refcnt) == ncpus)
1103 uncore_box_init(box); 1118 uncore_box_init(box);
1104 }
1105 } 1119 }
1106 } 1120 }
1107 return 0;
1108} 1121}
1109 1122
1110static int uncore_cpu_prepare(int cpu, int phys_id) 1123static int uncore_cpu_prepare(int cpu)
1111{ 1124{
1112 struct intel_uncore_type *type; 1125 struct intel_uncore_type *type, **types = uncore_msr_uncores;
1113 struct intel_uncore_pmu *pmu; 1126 struct intel_uncore_pmu *pmu;
1114 struct intel_uncore_box *box; 1127 struct intel_uncore_box *box;
1115 int i, j; 1128 int i, pkg;
1116 1129
1117 for (i = 0; uncore_msr_uncores[i]; i++) { 1130 pkg = topology_logical_package_id(cpu);
1118 type = uncore_msr_uncores[i]; 1131 for (; *types; types++) {
1119 for (j = 0; j < type->num_boxes; j++) { 1132 type = *types;
1120 pmu = &type->pmus[j]; 1133 pmu = type->pmus;
1121 if (pmu->func_id < 0) 1134 for (i = 0; i < type->num_boxes; i++, pmu++) {
1122 pmu->func_id = j; 1135 if (pmu->boxes[pkg])
1123 1136 continue;
1137 /* First cpu of a package allocates the box */
1124 box = uncore_alloc_box(type, cpu_to_node(cpu)); 1138 box = uncore_alloc_box(type, cpu_to_node(cpu));
1125 if (!box) 1139 if (!box)
1126 return -ENOMEM; 1140 return -ENOMEM;
1127
1128 box->pmu = pmu; 1141 box->pmu = pmu;
1129 box->phys_id = phys_id; 1142 box->pkgid = pkg;
1130 *per_cpu_ptr(pmu->box, cpu) = box; 1143 pmu->boxes[pkg] = box;
1131 } 1144 }
1132 } 1145 }
1133 return 0; 1146 return 0;
1134} 1147}
1135 1148
1136static void 1149static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_cpu,
1137uncore_change_context(struct intel_uncore_type **uncores, int old_cpu, int new_cpu) 1150 int new_cpu)
1138{ 1151{
1139 struct intel_uncore_type *type; 1152 struct intel_uncore_pmu *pmu = type->pmus;
1140 struct intel_uncore_pmu *pmu;
1141 struct intel_uncore_box *box; 1153 struct intel_uncore_box *box;
1142 int i, j; 1154 int i, pkg;
1143 1155
1144 for (i = 0; uncores[i]; i++) { 1156 pkg = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu);
1145 type = uncores[i]; 1157 for (i = 0; i < type->num_boxes; i++, pmu++) {
1146 for (j = 0; j < type->num_boxes; j++) { 1158 box = pmu->boxes[pkg];
1147 pmu = &type->pmus[j]; 1159 if (!box)
1148 if (old_cpu < 0) 1160 continue;
1149 box = uncore_pmu_to_box(pmu, new_cpu);
1150 else
1151 box = uncore_pmu_to_box(pmu, old_cpu);
1152 if (!box)
1153 continue;
1154
1155 if (old_cpu < 0) {
1156 WARN_ON_ONCE(box->cpu != -1);
1157 box->cpu = new_cpu;
1158 continue;
1159 }
1160 1161
1161 WARN_ON_ONCE(box->cpu != old_cpu); 1162 if (old_cpu < 0) {
1162 if (new_cpu >= 0) { 1163 WARN_ON_ONCE(box->cpu != -1);
1163 uncore_pmu_cancel_hrtimer(box); 1164 box->cpu = new_cpu;
1164 perf_pmu_migrate_context(&pmu->pmu, 1165 continue;
1165 old_cpu, new_cpu);
1166 box->cpu = new_cpu;
1167 } else {
1168 box->cpu = -1;
1169 }
1170 } 1166 }
1167
1168 WARN_ON_ONCE(box->cpu != old_cpu);
1169 box->cpu = -1;
1170 if (new_cpu < 0)
1171 continue;
1172
1173 uncore_pmu_cancel_hrtimer(box);
1174 perf_pmu_migrate_context(&pmu->pmu, old_cpu, new_cpu);
1175 box->cpu = new_cpu;
1171 } 1176 }
1172} 1177}
1173 1178
1179static void uncore_change_context(struct intel_uncore_type **uncores,
1180 int old_cpu, int new_cpu)
1181{
1182 for (; *uncores; uncores++)
1183 uncore_change_type_ctx(*uncores, old_cpu, new_cpu);
1184}
1185
1174static void uncore_event_exit_cpu(int cpu) 1186static void uncore_event_exit_cpu(int cpu)
1175{ 1187{
1176 int i, phys_id, target; 1188 int target;
1177 1189
1178 /* if exiting cpu is used for collecting uncore events */ 1190 /* Check if exiting cpu is used for collecting uncore events */
1179 if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask)) 1191 if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask))
1180 return; 1192 return;
1181 1193
1182 /* find a new cpu to collect uncore events */ 1194 /* Find a new cpu to collect uncore events */
1183 phys_id = topology_physical_package_id(cpu); 1195 target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
1184 target = -1;
1185 for_each_online_cpu(i) {
1186 if (i == cpu)
1187 continue;
1188 if (phys_id == topology_physical_package_id(i)) {
1189 target = i;
1190 break;
1191 }
1192 }
1193 1196
1194 /* migrate uncore events to the new cpu */ 1197 /* Migrate uncore events to the new target */
1195 if (target >= 0) 1198 if (target < nr_cpu_ids)
1196 cpumask_set_cpu(target, &uncore_cpu_mask); 1199 cpumask_set_cpu(target, &uncore_cpu_mask);
1200 else
1201 target = -1;
1197 1202
1198 uncore_change_context(uncore_msr_uncores, cpu, target); 1203 uncore_change_context(uncore_msr_uncores, cpu, target);
1199 uncore_change_context(uncore_pci_uncores, cpu, target); 1204 uncore_change_context(uncore_pci_uncores, cpu, target);
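
uncore_cpu_starting() above now charges the box refcount for every online CPU of the package in one step on the init path (ncpus = cpumask_weight(topology_core_cpumask(cpu))), so each later CPU_DYING event simply decrements and the last decrement tears the box down. The standalone sketch below reduces that accounting to plain C with illustrative names; it is not kernel code.

/* Package-wide refcount: charge all siblings at once, drop one by one. */
#include <stdio.h>

struct box {
    int refcnt;
};

static void cpu_starting(struct box *b, int ncpus_charged)
{
    b->refcnt += ncpus_charged;
    if (b->refcnt == ncpus_charged)     /* first activation of the box */
        printf("init box\n");
}

static void cpu_dying(struct box *b)
{
    if (--b->refcnt == 0)               /* last CPU of the package */
        printf("exit box\n");
}

int main(void)
{
    struct box b = { 0 };

    cpu_starting(&b, 4);    /* boot path: 4 CPUs online in this package */
    for (int i = 0; i < 4; i++)
        cpu_dying(&b);      /* the 4th offline tears the box down */
    return 0;
}
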
@@ -1201,13 +1206,15 @@ static void uncore_event_exit_cpu(int cpu)
1201 1206
1202static void uncore_event_init_cpu(int cpu) 1207static void uncore_event_init_cpu(int cpu)
1203{ 1208{
1204 int i, phys_id; 1209 int target;
1205 1210
1206 phys_id = topology_physical_package_id(cpu); 1211 /*
1207 for_each_cpu(i, &uncore_cpu_mask) { 1212 * Check if there is an online cpu in the package
1208 if (phys_id == topology_physical_package_id(i)) 1213 * which collects uncore events already.
1209 return; 1214 */
1210 } 1215 target = cpumask_any_and(&uncore_cpu_mask, topology_core_cpumask(cpu));
1216 if (target < nr_cpu_ids)
1217 return;
1211 1218
1212 cpumask_set_cpu(cpu, &uncore_cpu_mask); 1219 cpumask_set_cpu(cpu, &uncore_cpu_mask);
1213 1220
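
uncore_event_exit_cpu()/uncore_event_init_cpu() above now pick the event-collecting CPU with cpumask_any_but()/cpumask_any_and() over the package's core mask instead of an open-coded scan. Below is a standalone stand-in for that selection; any_but() and the NCPUS sentinel are made-up analogues of cpumask_any_but()/nr_cpu_ids, not the real kernel API.

/* Collector hand-over: pick any other CPU in the package, or give up. */
#include <stdio.h>

#define NCPUS 8

static int any_but(const int *mask, int avoid)
{
    for (int cpu = 0; cpu < NCPUS; cpu++)
        if (mask[cpu] && cpu != avoid)
            return cpu;
    return NCPUS;               /* nothing left in this package */
}

int main(void)
{
    int package_mask[NCPUS] = { 1, 1, 0, 0, 0, 0, 0, 0 };
    int leaving = 0;
    int target = any_but(package_mask, leaving);

    if (target < NCPUS)
        printf("migrate uncore events to CPU %d\n", target);
    else
        printf("no CPU left in the package, target becomes -1\n");
    return 0;
}
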
@@ -1220,39 +1227,25 @@ static int uncore_cpu_notifier(struct notifier_block *self,
1220{ 1227{
1221 unsigned int cpu = (long)hcpu; 1228 unsigned int cpu = (long)hcpu;
1222 1229
1223 /* allocate/free data structure for uncore box */
1224 switch (action & ~CPU_TASKS_FROZEN) { 1230 switch (action & ~CPU_TASKS_FROZEN) {
1225 case CPU_UP_PREPARE: 1231 case CPU_UP_PREPARE:
1226 uncore_cpu_prepare(cpu, -1); 1232 return notifier_from_errno(uncore_cpu_prepare(cpu));
1227 break; 1233
1228 case CPU_STARTING: 1234 case CPU_STARTING:
1229 uncore_cpu_starting(cpu); 1235 uncore_cpu_starting(cpu, false);
1236 case CPU_DOWN_FAILED:
1237 uncore_event_init_cpu(cpu);
1230 break; 1238 break;
1239
1231 case CPU_UP_CANCELED: 1240 case CPU_UP_CANCELED:
1232 case CPU_DYING: 1241 case CPU_DYING:
1233 uncore_cpu_dying(cpu); 1242 uncore_cpu_dying(cpu);
1234 break; 1243 break;
1235 case CPU_ONLINE:
1236 case CPU_DEAD:
1237 uncore_kfree_boxes();
1238 break;
1239 default:
1240 break;
1241 }
1242 1244
1243 /* select the cpu that collects uncore events */
1244 switch (action & ~CPU_TASKS_FROZEN) {
1245 case CPU_DOWN_FAILED:
1246 case CPU_STARTING:
1247 uncore_event_init_cpu(cpu);
1248 break;
1249 case CPU_DOWN_PREPARE: 1245 case CPU_DOWN_PREPARE:
1250 uncore_event_exit_cpu(cpu); 1246 uncore_event_exit_cpu(cpu);
1251 break; 1247 break;
1252 default:
1253 break;
1254 } 1248 }
1255
1256 return NOTIFY_OK; 1249 return NOTIFY_OK;
1257} 1250}
1258 1251
@@ -1265,9 +1258,29 @@ static struct notifier_block uncore_cpu_nb = {
1265 .priority = CPU_PRI_PERF + 1, 1258 .priority = CPU_PRI_PERF + 1,
1266}; 1259};
1267 1260
1268static void __init uncore_cpu_setup(void *dummy) 1261static int __init type_pmu_register(struct intel_uncore_type *type)
1269{ 1262{
1270 uncore_cpu_starting(smp_processor_id()); 1263 int i, ret;
1264
1265 for (i = 0; i < type->num_boxes; i++) {
1266 ret = uncore_pmu_register(&type->pmus[i]);
1267 if (ret)
1268 return ret;
1269 }
1270 return 0;
1271}
1272
1273static int __init uncore_msr_pmus_register(void)
1274{
1275 struct intel_uncore_type **types = uncore_msr_uncores;
1276 int ret;
1277
1278 for (; *types; types++) {
1279 ret = type_pmu_register(*types);
1280 if (ret)
1281 return ret;
1282 }
1283 return 0;
1271} 1284}
1272 1285
1273static int __init uncore_cpu_init(void) 1286static int __init uncore_cpu_init(void)
@@ -1311,71 +1324,61 @@ static int __init uncore_cpu_init(void)
1311 knl_uncore_cpu_init(); 1324 knl_uncore_cpu_init();
1312 break; 1325 break;
1313 default: 1326 default:
1314 return 0; 1327 return -ENODEV;
1315 } 1328 }
1316 1329
1317 ret = uncore_types_init(uncore_msr_uncores); 1330 ret = uncore_types_init(uncore_msr_uncores, true);
1318 if (ret) 1331 if (ret)
1319 return ret; 1332 goto err;
1320 1333
1334 ret = uncore_msr_pmus_register();
1335 if (ret)
1336 goto err;
1321 return 0; 1337 return 0;
1338err:
1339 uncore_types_exit(uncore_msr_uncores);
1340 uncore_msr_uncores = empty_uncore;
1341 return ret;
1322} 1342}
1323 1343
1324static int __init uncore_pmus_register(void) 1344static void __init uncore_cpu_setup(void *dummy)
1325{ 1345{
1326 struct intel_uncore_pmu *pmu; 1346 uncore_cpu_starting(smp_processor_id(), true);
1327 struct intel_uncore_type *type;
1328 int i, j;
1329
1330 for (i = 0; uncore_msr_uncores[i]; i++) {
1331 type = uncore_msr_uncores[i];
1332 for (j = 0; j < type->num_boxes; j++) {
1333 pmu = &type->pmus[j];
1334 uncore_pmu_register(pmu);
1335 }
1336 }
1337
1338 return 0;
1339} 1347}
1340 1348
1341static void __init uncore_cpumask_init(void) 1349/* Lazy to avoid allocation of a few bytes for the normal case */
1342{ 1350static __initdata DECLARE_BITMAP(packages, MAX_LOCAL_APIC);
1343 int cpu;
1344
1345 /*
1346 * ony invoke once from msr or pci init code
1347 */
1348 if (!cpumask_empty(&uncore_cpu_mask))
1349 return;
1350 1351
1351 cpu_notifier_register_begin(); 1352static int __init uncore_cpumask_init(bool msr)
1353{
1354 unsigned int cpu;
1352 1355
1353 for_each_online_cpu(cpu) { 1356 for_each_online_cpu(cpu) {
1354 int i, phys_id = topology_physical_package_id(cpu); 1357 unsigned int pkg = topology_logical_package_id(cpu);
1358 int ret;
1355 1359
1356 for_each_cpu(i, &uncore_cpu_mask) { 1360 if (test_and_set_bit(pkg, packages))
1357 if (phys_id == topology_physical_package_id(i)) {
1358 phys_id = -1;
1359 break;
1360 }
1361 }
1362 if (phys_id < 0)
1363 continue; 1361 continue;
1364 1362 /*
1365 uncore_cpu_prepare(cpu, phys_id); 1363 * The first online cpu of each package allocates and takes
1364 * the refcounts for all other online cpus in that package.
1365 * If msrs are not enabled no allocation is required.
1366 */
1367 if (msr) {
1368 ret = uncore_cpu_prepare(cpu);
1369 if (ret)
1370 return ret;
1371 }
1366 uncore_event_init_cpu(cpu); 1372 uncore_event_init_cpu(cpu);
1373 smp_call_function_single(cpu, uncore_cpu_setup, NULL, 1);
1367 } 1374 }
1368 on_each_cpu(uncore_cpu_setup, NULL, 1);
1369
1370 __register_cpu_notifier(&uncore_cpu_nb); 1375 __register_cpu_notifier(&uncore_cpu_nb);
1371 1376 return 0;
1372 cpu_notifier_register_done();
1373} 1377}
1374 1378
1375
1376static int __init intel_uncore_init(void) 1379static int __init intel_uncore_init(void)
1377{ 1380{
1378 int ret; 1381 int pret, cret, ret;
1379 1382
1380 if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) 1383 if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
1381 return -ENODEV; 1384 return -ENODEV;
@@ -1383,19 +1386,27 @@ static int __init intel_uncore_init(void)
1383 if (cpu_has_hypervisor) 1386 if (cpu_has_hypervisor)
1384 return -ENODEV; 1387 return -ENODEV;
1385 1388
1386 ret = uncore_pci_init(); 1389 max_packages = topology_max_packages();
1387 if (ret) 1390
1388 goto fail; 1391 pret = uncore_pci_init();
1389 ret = uncore_cpu_init(); 1392 cret = uncore_cpu_init();
1390 if (ret) {
1391 uncore_pci_exit();
1392 goto fail;
1393 }
1394 uncore_cpumask_init();
1395 1393
1396 uncore_pmus_register(); 1394 if (cret && pret)
1395 return -ENODEV;
1396
1397 cpu_notifier_register_begin();
1398 ret = uncore_cpumask_init(!cret);
1399 if (ret)
1400 goto err;
1401 cpu_notifier_register_done();
1397 return 0; 1402 return 0;
1398fail: 1403
1404err:
1405 /* Undo box->init_box() */
1406 on_each_cpu_mask(&uncore_cpu_mask, uncore_exit_boxes, NULL, 1);
1407 uncore_types_exit(uncore_msr_uncores);
1408 uncore_pci_exit();
1409 cpu_notifier_register_done();
1399 return ret; 1410 return ret;
1400} 1411}
1401device_initcall(intel_uncore_init); 1412device_initcall(intel_uncore_init);
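
The new uncore_cpumask_init() above walks the online CPUs once and lets only the first CPU of each logical package do the per-package prepare and setup, using a package bitmap (DECLARE_BITMAP()/test_and_set_bit()) to skip the rest. A minimal standalone version of that loop follows; the bool array and the fake topology are stand-ins, not the kernel bitmap API.

/* One reader CPU per package, tracked with a seen-package table. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_PKGS 4

int main(void)
{
    int cpu_to_pkg[] = { 0, 0, 1, 1, 1, 2 };    /* fake topology */
    int ncpus = sizeof(cpu_to_pkg) / sizeof(cpu_to_pkg[0]);
    bool seen[MAX_PKGS] = { false };

    for (int cpu = 0; cpu < ncpus; cpu++) {
        int pkg = cpu_to_pkg[cpu];

        if (seen[pkg])          /* package already has a reader CPU */
            continue;
        seen[pkg] = true;
        printf("CPU %d prepares and reads uncore events for package %d\n",
               cpu, pkg);
    }
    return 0;
}
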
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/events/intel/uncore.h
index a7086b862156..79766b9a3580 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -1,8 +1,10 @@
1#include <linux/module.h> 1#include <linux/module.h>
2#include <linux/slab.h> 2#include <linux/slab.h>
3#include <linux/pci.h> 3#include <linux/pci.h>
4#include <asm/apicdef.h>
5
4#include <linux/perf_event.h> 6#include <linux/perf_event.h>
5#include "perf_event.h" 7#include "../perf_event.h"
6 8
7#define UNCORE_PMU_NAME_LEN 32 9#define UNCORE_PMU_NAME_LEN 32
8#define UNCORE_PMU_HRTIMER_INTERVAL (60LL * NSEC_PER_SEC) 10#define UNCORE_PMU_HRTIMER_INTERVAL (60LL * NSEC_PER_SEC)
@@ -19,11 +21,12 @@
19#define UNCORE_EXTRA_PCI_DEV 0xff 21#define UNCORE_EXTRA_PCI_DEV 0xff
20#define UNCORE_EXTRA_PCI_DEV_MAX 3 22#define UNCORE_EXTRA_PCI_DEV_MAX 3
21 23
22/* support up to 8 sockets */
23#define UNCORE_SOCKET_MAX 8
24
25#define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff) 24#define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
26 25
26struct pci_extra_dev {
27 struct pci_dev *dev[UNCORE_EXTRA_PCI_DEV_MAX];
28};
29
27struct intel_uncore_ops; 30struct intel_uncore_ops;
28struct intel_uncore_pmu; 31struct intel_uncore_pmu;
29struct intel_uncore_box; 32struct intel_uncore_box;
@@ -61,6 +64,7 @@ struct intel_uncore_type {
61 64
62struct intel_uncore_ops { 65struct intel_uncore_ops {
63 void (*init_box)(struct intel_uncore_box *); 66 void (*init_box)(struct intel_uncore_box *);
67 void (*exit_box)(struct intel_uncore_box *);
64 void (*disable_box)(struct intel_uncore_box *); 68 void (*disable_box)(struct intel_uncore_box *);
65 void (*enable_box)(struct intel_uncore_box *); 69 void (*enable_box)(struct intel_uncore_box *);
66 void (*disable_event)(struct intel_uncore_box *, struct perf_event *); 70 void (*disable_event)(struct intel_uncore_box *, struct perf_event *);
@@ -73,13 +77,14 @@ struct intel_uncore_ops {
73}; 77};
74 78
75struct intel_uncore_pmu { 79struct intel_uncore_pmu {
76 struct pmu pmu; 80 struct pmu pmu;
77 char name[UNCORE_PMU_NAME_LEN]; 81 char name[UNCORE_PMU_NAME_LEN];
78 int pmu_idx; 82 int pmu_idx;
79 int func_id; 83 int func_id;
80 struct intel_uncore_type *type; 84 bool registered;
81 struct intel_uncore_box ** __percpu box; 85 atomic_t activeboxes;
82 struct list_head box_list; 86 struct intel_uncore_type *type;
87 struct intel_uncore_box **boxes;
83}; 88};
84 89
85struct intel_uncore_extra_reg { 90struct intel_uncore_extra_reg {
@@ -89,7 +94,8 @@ struct intel_uncore_extra_reg {
89}; 94};
90 95
91struct intel_uncore_box { 96struct intel_uncore_box {
92 int phys_id; 97 int pci_phys_id;
98 int pkgid;
93 int n_active; /* number of active events */ 99 int n_active; /* number of active events */
94 int n_events; 100 int n_events;
95 int cpu; /* cpu to collect events */ 101 int cpu; /* cpu to collect events */
@@ -123,7 +129,6 @@ struct pci2phy_map {
123 int pbus_to_physid[256]; 129 int pbus_to_physid[256];
124}; 130};
125 131
126int uncore_pcibus_to_physid(struct pci_bus *bus);
127struct pci2phy_map *__find_pci2phy_map(int segment); 132struct pci2phy_map *__find_pci2phy_map(int segment);
128 133
129ssize_t uncore_event_show(struct kobject *kobj, 134ssize_t uncore_event_show(struct kobject *kobj,
@@ -305,14 +310,30 @@ static inline void uncore_box_init(struct intel_uncore_box *box)
305 } 310 }
306} 311}
307 312
313static inline void uncore_box_exit(struct intel_uncore_box *box)
314{
315 if (test_and_clear_bit(UNCORE_BOX_FLAG_INITIATED, &box->flags)) {
316 if (box->pmu->type->ops->exit_box)
317 box->pmu->type->ops->exit_box(box);
318 }
319}
320
308static inline bool uncore_box_is_fake(struct intel_uncore_box *box) 321static inline bool uncore_box_is_fake(struct intel_uncore_box *box)
309{ 322{
310 return (box->phys_id < 0); 323 return (box->pkgid < 0);
324}
325
326static inline struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
327{
328 return container_of(event->pmu, struct intel_uncore_pmu, pmu);
329}
330
331static inline struct intel_uncore_box *uncore_event_to_box(struct perf_event *event)
332{
333 return event->pmu_private;
311} 334}
312 335
313struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event);
314struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu); 336struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu);
315struct intel_uncore_box *uncore_event_to_box(struct perf_event *event);
316u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event); 337u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event);
317void uncore_pmu_start_hrtimer(struct intel_uncore_box *box); 338void uncore_pmu_start_hrtimer(struct intel_uncore_box *box);
318void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box); 339void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box);
@@ -328,7 +349,7 @@ extern struct intel_uncore_type **uncore_pci_uncores;
328extern struct pci_driver *uncore_pci_driver; 349extern struct pci_driver *uncore_pci_driver;
329extern raw_spinlock_t pci2phy_map_lock; 350extern raw_spinlock_t pci2phy_map_lock;
330extern struct list_head pci2phy_map_head; 351extern struct list_head pci2phy_map_head;
331extern struct pci_dev *uncore_extra_pci_dev[UNCORE_SOCKET_MAX][UNCORE_EXTRA_PCI_DEV_MAX]; 352extern struct pci_extra_dev *uncore_extra_pci_dev;
332extern struct event_constraint uncore_constraint_empty; 353extern struct event_constraint uncore_constraint_empty;
333 354
334/* perf_event_intel_uncore_snb.c */ 355/* perf_event_intel_uncore_snb.c */
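
uncore.h above adds uncore_box_exit() as the counterpart to uncore_box_init(), gated on the UNCORE_BOX_FLAG_INITIATED bit so teardown runs at most once per box and only after an init. The standalone sketch below shows that test-and-set/test-and-clear guard with plain, non-atomic stand-ins for the kernel bit helpers; names are illustrative only.

/* Flag-guarded init/exit: each takes effect at most once per box. */
#include <stdio.h>

struct box {
    unsigned long flags;
};

#define BOX_FLAG_INITIATED 0

static int test_and_set(unsigned long *flags, int bit)
{
    int old = (int)((*flags >> bit) & 1);

    *flags |= 1UL << bit;
    return old;
}

static int test_and_clear(unsigned long *flags, int bit)
{
    int old = (int)((*flags >> bit) & 1);

    *flags &= ~(1UL << bit);
    return old;
}

static void box_init(struct box *b)
{
    if (!test_and_set(&b->flags, BOX_FLAG_INITIATED))
        printf("->init_box()\n");
}

static void box_exit(struct box *b)
{
    if (test_and_clear(&b->flags, BOX_FLAG_INITIATED))
        printf("->exit_box()\n");
}

int main(void)
{
    struct box b = { 0 };

    box_init(&b);
    box_init(&b);   /* no-op, already initialized */
    box_exit(&b);
    box_exit(&b);   /* no-op, already torn down */
    return 0;
}
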
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c b/arch/x86/events/intel/uncore_nhmex.c
index 2749965afed0..cda569332005 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_nhmex.c
+++ b/arch/x86/events/intel/uncore_nhmex.c
@@ -1,5 +1,5 @@
1/* Nehalem-EX/Westmere-EX uncore support */ 1/* Nehalem-EX/Westmere-EX uncore support */
2#include "perf_event_intel_uncore.h" 2#include "uncore.h"
3 3
4/* NHM-EX event control */ 4/* NHM-EX event control */
5#define NHMEX_PMON_CTL_EV_SEL_MASK 0x000000ff 5#define NHMEX_PMON_CTL_EV_SEL_MASK 0x000000ff
@@ -201,6 +201,11 @@ static void nhmex_uncore_msr_init_box(struct intel_uncore_box *box)
201 wrmsrl(NHMEX_U_MSR_PMON_GLOBAL_CTL, NHMEX_U_PMON_GLOBAL_EN_ALL); 201 wrmsrl(NHMEX_U_MSR_PMON_GLOBAL_CTL, NHMEX_U_PMON_GLOBAL_EN_ALL);
202} 202}
203 203
204static void nhmex_uncore_msr_exit_box(struct intel_uncore_box *box)
205{
206 wrmsrl(NHMEX_U_MSR_PMON_GLOBAL_CTL, 0);
207}
208
204static void nhmex_uncore_msr_disable_box(struct intel_uncore_box *box) 209static void nhmex_uncore_msr_disable_box(struct intel_uncore_box *box)
205{ 210{
206 unsigned msr = uncore_msr_box_ctl(box); 211 unsigned msr = uncore_msr_box_ctl(box);
@@ -250,6 +255,7 @@ static void nhmex_uncore_msr_enable_event(struct intel_uncore_box *box, struct p
250 255
251#define NHMEX_UNCORE_OPS_COMMON_INIT() \ 256#define NHMEX_UNCORE_OPS_COMMON_INIT() \
252 .init_box = nhmex_uncore_msr_init_box, \ 257 .init_box = nhmex_uncore_msr_init_box, \
258 .exit_box = nhmex_uncore_msr_exit_box, \
253 .disable_box = nhmex_uncore_msr_disable_box, \ 259 .disable_box = nhmex_uncore_msr_disable_box, \
254 .enable_box = nhmex_uncore_msr_enable_box, \ 260 .enable_box = nhmex_uncore_msr_enable_box, \
255 .disable_event = nhmex_uncore_msr_disable_event, \ 261 .disable_event = nhmex_uncore_msr_disable_event, \
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index 2bd030ddd0db..96531d2b843f 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -1,5 +1,5 @@
1/* Nehalem/SandBridge/Haswell uncore support */ 1/* Nehalem/SandBridge/Haswell uncore support */
2#include "perf_event_intel_uncore.h" 2#include "uncore.h"
3 3
4/* Uncore IMC PCI IDs */ 4/* Uncore IMC PCI IDs */
5#define PCI_DEVICE_ID_INTEL_SNB_IMC 0x0100 5#define PCI_DEVICE_ID_INTEL_SNB_IMC 0x0100
@@ -95,6 +95,12 @@ static void snb_uncore_msr_init_box(struct intel_uncore_box *box)
95 } 95 }
96} 96}
97 97
98static void snb_uncore_msr_exit_box(struct intel_uncore_box *box)
99{
100 if (box->pmu->pmu_idx == 0)
101 wrmsrl(SNB_UNC_PERF_GLOBAL_CTL, 0);
102}
103
98static struct uncore_event_desc snb_uncore_events[] = { 104static struct uncore_event_desc snb_uncore_events[] = {
99 INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x00"), 105 INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x00"),
100 { /* end: all zeroes */ }, 106 { /* end: all zeroes */ },
@@ -116,6 +122,7 @@ static struct attribute_group snb_uncore_format_group = {
116 122
117static struct intel_uncore_ops snb_uncore_msr_ops = { 123static struct intel_uncore_ops snb_uncore_msr_ops = {
118 .init_box = snb_uncore_msr_init_box, 124 .init_box = snb_uncore_msr_init_box,
125 .exit_box = snb_uncore_msr_exit_box,
119 .disable_event = snb_uncore_msr_disable_event, 126 .disable_event = snb_uncore_msr_disable_event,
120 .enable_event = snb_uncore_msr_enable_event, 127 .enable_event = snb_uncore_msr_enable_event,
121 .read_counter = uncore_msr_read_counter, 128 .read_counter = uncore_msr_read_counter,
@@ -231,6 +238,11 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
231 box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL; 238 box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
232} 239}
233 240
241static void snb_uncore_imc_exit_box(struct intel_uncore_box *box)
242{
243 iounmap(box->io_addr);
244}
245
234static void snb_uncore_imc_enable_box(struct intel_uncore_box *box) 246static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
235{} 247{}
236 248
@@ -301,6 +313,7 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
301 return -EINVAL; 313 return -EINVAL;
302 314
303 event->cpu = box->cpu; 315 event->cpu = box->cpu;
316 event->pmu_private = box;
304 317
305 event->hw.idx = -1; 318 event->hw.idx = -1;
306 event->hw.last_tag = ~0ULL; 319 event->hw.last_tag = ~0ULL;
@@ -458,6 +471,7 @@ static struct pmu snb_uncore_imc_pmu = {
458 471
459static struct intel_uncore_ops snb_uncore_imc_ops = { 472static struct intel_uncore_ops snb_uncore_imc_ops = {
460 .init_box = snb_uncore_imc_init_box, 473 .init_box = snb_uncore_imc_init_box,
474 .exit_box = snb_uncore_imc_exit_box,
461 .enable_box = snb_uncore_imc_enable_box, 475 .enable_box = snb_uncore_imc_enable_box,
462 .disable_box = snb_uncore_imc_disable_box, 476 .disable_box = snb_uncore_imc_disable_box,
463 .disable_event = snb_uncore_imc_disable_event, 477 .disable_event = snb_uncore_imc_disable_event,
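
The SNB hunks above wire an exit_box() callback into the uncore ops so that teardown (for example snb_uncore_imc_exit_box() unmapping the counter window) mirrors init_box(). The sketch below shows the optional-callback shape in standalone C with hypothetical names; it is not the kernel's structures.

/* Ops table with an optional exit callback mirroring init. */
#include <stdio.h>

struct box;

struct uncore_ops {
    void (*init_box)(struct box *);
    void (*exit_box)(struct box *);     /* optional */
};

struct box {
    const struct uncore_ops *ops;
};

static void imc_init(struct box *b)
{
    printf("map counter window for box %p\n", (void *)b);
}

static void imc_exit(struct box *b)
{
    printf("unmap counter window for box %p\n", (void *)b);
}

static void box_init(struct box *b)
{
    if (b->ops->init_box)
        b->ops->init_box(b);
}

static void box_exit(struct box *b)
{
    if (b->ops->exit_box)
        b->ops->exit_box(b);
}

int main(void)
{
    static const struct uncore_ops imc_ops = {
        .init_box = imc_init,
        .exit_box = imc_exit,
    };
    struct box b = { .ops = &imc_ops };

    box_init(&b);
    box_exit(&b);
    return 0;
}
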
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 33acb884ccf1..93f6bd9bf761 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1,6 +1,5 @@
1/* SandyBridge-EP/IvyTown uncore support */ 1/* SandyBridge-EP/IvyTown uncore support */
2#include "perf_event_intel_uncore.h" 2#include "uncore.h"
3
4 3
5/* SNB-EP Box level control */ 4/* SNB-EP Box level control */
6#define SNBEP_PMON_BOX_CTL_RST_CTRL (1 << 0) 5#define SNBEP_PMON_BOX_CTL_RST_CTRL (1 << 0)
@@ -987,7 +986,9 @@ static void snbep_qpi_enable_event(struct intel_uncore_box *box, struct perf_eve
987 986
988 if (reg1->idx != EXTRA_REG_NONE) { 987 if (reg1->idx != EXTRA_REG_NONE) {
989 int idx = box->pmu->pmu_idx + SNBEP_PCI_QPI_PORT0_FILTER; 988 int idx = box->pmu->pmu_idx + SNBEP_PCI_QPI_PORT0_FILTER;
990 struct pci_dev *filter_pdev = uncore_extra_pci_dev[box->phys_id][idx]; 989 int pkg = topology_phys_to_logical_pkg(box->pci_phys_id);
990 struct pci_dev *filter_pdev = uncore_extra_pci_dev[pkg].dev[idx];
991
991 if (filter_pdev) { 992 if (filter_pdev) {
992 pci_write_config_dword(filter_pdev, reg1->reg, 993 pci_write_config_dword(filter_pdev, reg1->reg,
993 (u32)reg1->config); 994 (u32)reg1->config);
@@ -2521,14 +2522,16 @@ static struct intel_uncore_type *hswep_msr_uncores[] = {
2521 2522
2522void hswep_uncore_cpu_init(void) 2523void hswep_uncore_cpu_init(void)
2523{ 2524{
2525 int pkg = topology_phys_to_logical_pkg(0);
2526
2524 if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores) 2527 if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
2525 hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores; 2528 hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
2526 2529
2527 /* Detect 6-8 core systems with only two SBOXes */ 2530 /* Detect 6-8 core systems with only two SBOXes */
2528 if (uncore_extra_pci_dev[0][HSWEP_PCI_PCU_3]) { 2531 if (uncore_extra_pci_dev[pkg].dev[HSWEP_PCI_PCU_3]) {
2529 u32 capid4; 2532 u32 capid4;
2530 2533
2531 pci_read_config_dword(uncore_extra_pci_dev[0][HSWEP_PCI_PCU_3], 2534 pci_read_config_dword(uncore_extra_pci_dev[pkg].dev[HSWEP_PCI_PCU_3],
2532 0x94, &capid4); 2535 0x94, &capid4);
2533 if (((capid4 >> 6) & 0x3) == 0) 2536 if (((capid4 >> 6) & 0x3) == 0)
2534 hswep_uncore_sbox.num_boxes = 2; 2537 hswep_uncore_sbox.num_boxes = 2;
@@ -2875,11 +2878,13 @@ static struct intel_uncore_type bdx_uncore_sbox = {
2875 .format_group = &hswep_uncore_sbox_format_group, 2878 .format_group = &hswep_uncore_sbox_format_group,
2876}; 2879};
2877 2880
2881#define BDX_MSR_UNCORE_SBOX 3
2882
2878static struct intel_uncore_type *bdx_msr_uncores[] = { 2883static struct intel_uncore_type *bdx_msr_uncores[] = {
2879 &bdx_uncore_ubox, 2884 &bdx_uncore_ubox,
2880 &bdx_uncore_cbox, 2885 &bdx_uncore_cbox,
2881 &bdx_uncore_sbox,
2882 &hswep_uncore_pcu, 2886 &hswep_uncore_pcu,
2887 &bdx_uncore_sbox,
2883 NULL, 2888 NULL,
2884}; 2889};
2885 2890
@@ -2888,6 +2893,10 @@ void bdx_uncore_cpu_init(void)
2888 if (bdx_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores) 2893 if (bdx_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
2889 bdx_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores; 2894 bdx_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
2890 uncore_msr_uncores = bdx_msr_uncores; 2895 uncore_msr_uncores = bdx_msr_uncores;
2896
2897 /* BDX-DE doesn't have SBOX */
2898 if (boot_cpu_data.x86_model == 86)
2899 uncore_msr_uncores[BDX_MSR_UNCORE_SBOX] = NULL;
2891} 2900}
2892 2901
2893static struct intel_uncore_type bdx_uncore_ha = { 2902static struct intel_uncore_type bdx_uncore_ha = {
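
bdx_uncore_sbox is moved to the end of the NULL-terminated bdx_msr_uncores[] array above so that BDX-DE (model 86), which has no SBOX, can drop it by simply writing NULL into that slot. The standalone illustration below uses stand-in types, not kernel code, to show the terminator trick.

/* Truncating a NULL-terminated type list by overwriting the last slot. */
#include <stdio.h>

struct uncore_type { const char *name; };

static struct uncore_type ubox = { "ubox" };
static struct uncore_type cbox = { "cbox" };
static struct uncore_type pcu  = { "pcu"  };
static struct uncore_type sbox = { "sbox" };

#define MSR_UNCORE_SBOX 3               /* index of the sbox slot */

static struct uncore_type *msr_uncores[] = {
    &ubox, &cbox, &pcu, &sbox, NULL,
};

int main(void)
{
    int no_sbox = 1;                    /* pretend this is a BDX-DE */

    if (no_sbox)
        msr_uncores[MSR_UNCORE_SBOX] = NULL;    /* truncate the list */

    for (struct uncore_type **t = msr_uncores; *t; t++)
        printf("registering %s uncore PMU\n", (*t)->name);
    return 0;
}
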
diff --git a/arch/x86/kernel/cpu/perf_event_msr.c b/arch/x86/events/msr.c
index ec863b9a9f78..ec863b9a9f78 100644
--- a/arch/x86/kernel/cpu/perf_event_msr.c
+++ b/arch/x86/events/msr.c
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/events/perf_event.h
index 7bb61e32fb29..68155cafa8a1 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -586,6 +586,7 @@ struct x86_pmu {
586 pebs_broken :1, 586 pebs_broken :1,
587 pebs_prec_dist :1; 587 pebs_prec_dist :1;
588 int pebs_record_size; 588 int pebs_record_size;
589 int pebs_buffer_size;
589 void (*drain_pebs)(struct pt_regs *regs); 590 void (*drain_pebs)(struct pt_regs *regs);
590 struct event_constraint *pebs_constraints; 591 struct event_constraint *pebs_constraints;
591 void (*pebs_aliases)(struct perf_event *event); 592 void (*pebs_aliases)(struct perf_event *event);
@@ -860,6 +861,8 @@ extern struct event_constraint intel_ivb_pebs_event_constraints[];
860 861
861extern struct event_constraint intel_hsw_pebs_event_constraints[]; 862extern struct event_constraint intel_hsw_pebs_event_constraints[];
862 863
864extern struct event_constraint intel_bdw_pebs_event_constraints[];
865
863extern struct event_constraint intel_skl_pebs_event_constraints[]; 866extern struct event_constraint intel_skl_pebs_event_constraints[];
864 867
865struct event_constraint *intel_pebs_constraints(struct perf_event *event); 868struct event_constraint *intel_pebs_constraints(struct perf_event *event);
@@ -904,6 +907,8 @@ void intel_pmu_lbr_init_skl(void);
904 907
905void intel_pmu_lbr_init_knl(void); 908void intel_pmu_lbr_init_knl(void);
906 909
910void intel_pmu_pebs_data_source_nhm(void);
911
907int intel_pmu_setup_lbr_filter(struct perf_event *event); 912int intel_pmu_setup_lbr_filter(struct perf_event *event);
908 913
909void intel_pt_interrupt(void); 914void intel_pt_interrupt(void);
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 1514753fd435..15340e36ddcb 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -256,7 +256,7 @@ extern int force_personality32;
256 instruction set this CPU supports. This could be done in user space, 256 instruction set this CPU supports. This could be done in user space,
257 but it's not easy, and we've already done it here. */ 257 but it's not easy, and we've already done it here. */
258 258
259#define ELF_HWCAP (boot_cpu_data.x86_capability[0]) 259#define ELF_HWCAP (boot_cpu_data.x86_capability[CPUID_1_EDX])
260 260
261/* This yields a string that ld.so will use to load implementation 261/* This yields a string that ld.so will use to load implementation
262 specific libraries for optimization. This is more specific in 262 specific libraries for optimization. This is more specific in
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 7bcb861a04e5..5a2ed3ed2f26 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -165,6 +165,7 @@ struct x86_pmu_capability {
165#define GLOBAL_STATUS_ASIF BIT_ULL(60) 165#define GLOBAL_STATUS_ASIF BIT_ULL(60)
166#define GLOBAL_STATUS_COUNTERS_FROZEN BIT_ULL(59) 166#define GLOBAL_STATUS_COUNTERS_FROZEN BIT_ULL(59)
167#define GLOBAL_STATUS_LBRS_FROZEN BIT_ULL(58) 167#define GLOBAL_STATUS_LBRS_FROZEN BIT_ULL(58)
168#define GLOBAL_STATUS_TRACE_TOPAPMI BIT_ULL(55)
168 169
169/* 170/*
170 * IBS cpuid feature detection 171 * IBS cpuid feature detection
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 20c11d1aa4cc..813384ef811a 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -129,6 +129,8 @@ struct cpuinfo_x86 {
129 u16 booted_cores; 129 u16 booted_cores;
130 /* Physical processor id: */ 130 /* Physical processor id: */
131 u16 phys_proc_id; 131 u16 phys_proc_id;
132 /* Logical processor id: */
133 u16 logical_proc_id;
132 /* Core id: */ 134 /* Core id: */
133 u16 cpu_core_id; 135 u16 cpu_core_id;
134 /* Compute unit id */ 136 /* Compute unit id */
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 0fb46482dfde..7f991bd5031b 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -119,12 +119,23 @@ static inline void setup_node_to_cpumask_map(void) { }
119 119
120extern const struct cpumask *cpu_coregroup_mask(int cpu); 120extern const struct cpumask *cpu_coregroup_mask(int cpu);
121 121
122#define topology_logical_package_id(cpu) (cpu_data(cpu).logical_proc_id)
122#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id) 123#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
123#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id) 124#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
124 125
125#ifdef ENABLE_TOPO_DEFINES 126#ifdef ENABLE_TOPO_DEFINES
126#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu)) 127#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
127#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu)) 128#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
129
130extern unsigned int __max_logical_packages;
131#define topology_max_packages() (__max_logical_packages)
132int topology_update_package_map(unsigned int apicid, unsigned int cpu);
133extern int topology_phys_to_logical_pkg(unsigned int pkg);
134#else
135#define topology_max_packages() (1)
136static inline int
137topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
138static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
128#endif 139#endif
129 140
130static inline void arch_fix_phys_package_id(int num, u32 slot) 141static inline void arch_fix_phys_package_id(int num, u32 slot)
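The topology.h additions expose topology_logical_package_id(), topology_max_packages() and the physical-to-logical conversion. The underlying idea is that physical (APIC-derived) package ids can be sparse, while logical ids form a dense 0..__max_logical_packages-1 space assigned in first-seen order. A minimal user-space sketch of that assignment scheme, under those assumptions; the array size and demo_* names are invented for illustration, and the kernel itself uses bitmaps and test_and_set_bit() as shown later in the smpboot.c hunk:

#include <stdio.h>

#define DEMO_MAX_PHYS_PKGS 64

static int demo_phys_to_logical[DEMO_MAX_PHYS_PKGS];
static int demo_next_logical_pkg;

/* Assign a dense logical id the first time a physical package id shows up. */
static int demo_update_package_map(int phys_pkg)
{
	if (phys_pkg < 0 || phys_pkg >= DEMO_MAX_PHYS_PKGS)
		return -1;
	if (demo_phys_to_logical[phys_pkg] == 0)	/* 0 means "unseen" in this sketch */
		demo_phys_to_logical[phys_pkg] = ++demo_next_logical_pkg;
	return demo_phys_to_logical[phys_pkg] - 1;	/* logical ids start at 0 */
}

int main(void)
{
	int sparse_phys_ids[] = { 0, 4, 4, 12 };	/* e.g. multi-socket system with gaps */

	for (unsigned i = 0; i < sizeof(sparse_phys_ids) / sizeof(sparse_phys_ids[0]); i++)
		printf("phys %2d -> logical %d\n", sparse_phys_ids[i],
		       demo_update_package_map(sparse_phys_ids[i]));
	return 0;
}
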
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 8a5cddac7d44..531b9611c51d 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2078,6 +2078,20 @@ int generic_processor_info(int apicid, int version)
2078 cpu = cpumask_next_zero(-1, cpu_present_mask); 2078 cpu = cpumask_next_zero(-1, cpu_present_mask);
2079 2079
2080 /* 2080 /*
2081 * This can happen on physical hotplug. The sanity check at boot time
2082 * is done from native_smp_prepare_cpus() after num_possible_cpus() is
2083 * established.
2084 */
2085 if (topology_update_package_map(apicid, cpu) < 0) {
2086 int thiscpu = max + disabled_cpus;
2087
2088 pr_warning("ACPI: Package limit reached. Processor %d/0x%x ignored.\n",
2089 thiscpu, apicid);
2090 disabled_cpus++;
2091 return -ENOSPC;
2092 }
2093
2094 /*
2081 * Validate version 2095 * Validate version
2082 */ 2096 */
2083 if (version == 0x0) { 2097 if (version == 0x0) {
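The generic_processor_info() hunk refuses to register a hot-plugged CPU whose package cannot be mapped, returning -ENOSPC instead of silently exceeding the boot-time sizing. A condensed sketch of that guard pattern; the DEMO_* names and the fixed limit are hypothetical, not the kernel's actual bookkeeping:

#include <stdio.h>
#include <errno.h>

#define DEMO_MAX_PACKAGES 2	/* sized at boot, like __max_logical_packages */

static int demo_registered;

/* Reject late (hotplug) registrations that exceed the boot-time sizing. */
static int demo_register_package(int phys_pkg)
{
	if (demo_registered >= DEMO_MAX_PACKAGES) {
		fprintf(stderr, "package limit reached, package %d ignored\n", phys_pkg);
		return -ENOSPC;
	}
	demo_registered++;
	return 0;
}

int main(void)
{
	for (int pkg = 0; pkg < 4; pkg++)
		printf("register pkg %d -> %d\n", pkg, demo_register_package(pkg));
	return 0;
}
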
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 58031303e304..7a60424d63fa 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -30,33 +30,11 @@ obj-$(CONFIG_CPU_SUP_CENTAUR) += centaur.o
30obj-$(CONFIG_CPU_SUP_TRANSMETA_32) += transmeta.o 30obj-$(CONFIG_CPU_SUP_TRANSMETA_32) += transmeta.o
31obj-$(CONFIG_CPU_SUP_UMC_32) += umc.o 31obj-$(CONFIG_CPU_SUP_UMC_32) += umc.o
32 32
33obj-$(CONFIG_PERF_EVENTS) += perf_event.o
34
35ifdef CONFIG_PERF_EVENTS
36obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd.o perf_event_amd_uncore.o
37ifdef CONFIG_AMD_IOMMU
38obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd_iommu.o
39endif
40obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_knc.o perf_event_p4.o
41obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o
42obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_rapl.o perf_event_intel_cqm.o
43obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_pt.o perf_event_intel_bts.o
44obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_cstate.o
45
46obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE) += perf_event_intel_uncore.o \
47 perf_event_intel_uncore_snb.o \
48 perf_event_intel_uncore_snbep.o \
49 perf_event_intel_uncore_nhmex.o
50obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_msr.o
51obj-$(CONFIG_CPU_SUP_AMD) += perf_event_msr.o
52endif
53
54
55obj-$(CONFIG_X86_MCE) += mcheck/ 33obj-$(CONFIG_X86_MCE) += mcheck/
56obj-$(CONFIG_MTRR) += mtrr/ 34obj-$(CONFIG_MTRR) += mtrr/
57obj-$(CONFIG_MICROCODE) += microcode/ 35obj-$(CONFIG_MICROCODE) += microcode/
58 36
59obj-$(CONFIG_X86_LOCAL_APIC) += perfctr-watchdog.o perf_event_amd_ibs.o 37obj-$(CONFIG_X86_LOCAL_APIC) += perfctr-watchdog.o
60 38
61obj-$(CONFIG_HYPERVISOR_GUEST) += vmware.o hypervisor.o mshyperv.o 39obj-$(CONFIG_HYPERVISOR_GUEST) += vmware.o hypervisor.o mshyperv.o
62 40
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index a07956a08936..97c59fd60702 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -117,7 +117,7 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
117 void (*f_vide)(void); 117 void (*f_vide)(void);
118 u64 d, d2; 118 u64 d, d2;
119 119
120 printk(KERN_INFO "AMD K6 stepping B detected - "); 120 pr_info("AMD K6 stepping B detected - ");
121 121
122 /* 122 /*
123 * It looks like AMD fixed the 2.6.2 bug and improved indirect 123 * It looks like AMD fixed the 2.6.2 bug and improved indirect
@@ -133,10 +133,9 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
133 d = d2-d; 133 d = d2-d;
134 134
135 if (d > 20*K6_BUG_LOOP) 135 if (d > 20*K6_BUG_LOOP)
136 printk(KERN_CONT 136 pr_cont("system stability may be impaired when more than 32 MB are used.\n");
137 "system stability may be impaired when more than 32 MB are used.\n");
138 else 137 else
139 printk(KERN_CONT "probably OK (after B9730xxxx).\n"); 138 pr_cont("probably OK (after B9730xxxx).\n");
140 } 139 }
141 140
142 /* K6 with old style WHCR */ 141 /* K6 with old style WHCR */
@@ -154,7 +153,7 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
154 wbinvd(); 153 wbinvd();
155 wrmsr(MSR_K6_WHCR, l, h); 154 wrmsr(MSR_K6_WHCR, l, h);
156 local_irq_restore(flags); 155 local_irq_restore(flags);
157 printk(KERN_INFO "Enabling old style K6 write allocation for %d Mb\n", 156 pr_info("Enabling old style K6 write allocation for %d Mb\n",
158 mbytes); 157 mbytes);
159 } 158 }
160 return; 159 return;
@@ -175,7 +174,7 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
175 wbinvd(); 174 wbinvd();
176 wrmsr(MSR_K6_WHCR, l, h); 175 wrmsr(MSR_K6_WHCR, l, h);
177 local_irq_restore(flags); 176 local_irq_restore(flags);
178 printk(KERN_INFO "Enabling new style K6 write allocation for %d Mb\n", 177 pr_info("Enabling new style K6 write allocation for %d Mb\n",
179 mbytes); 178 mbytes);
180 } 179 }
181 180
@@ -202,7 +201,7 @@ static void init_amd_k7(struct cpuinfo_x86 *c)
202 */ 201 */
203 if (c->x86_model >= 6 && c->x86_model <= 10) { 202 if (c->x86_model >= 6 && c->x86_model <= 10) {
204 if (!cpu_has(c, X86_FEATURE_XMM)) { 203 if (!cpu_has(c, X86_FEATURE_XMM)) {
205 printk(KERN_INFO "Enabling disabled K7/SSE Support.\n"); 204 pr_info("Enabling disabled K7/SSE Support.\n");
206 msr_clear_bit(MSR_K7_HWCR, 15); 205 msr_clear_bit(MSR_K7_HWCR, 15);
207 set_cpu_cap(c, X86_FEATURE_XMM); 206 set_cpu_cap(c, X86_FEATURE_XMM);
208 } 207 }
@@ -216,9 +215,8 @@ static void init_amd_k7(struct cpuinfo_x86 *c)
216 if ((c->x86_model == 8 && c->x86_mask >= 1) || (c->x86_model > 8)) { 215 if ((c->x86_model == 8 && c->x86_mask >= 1) || (c->x86_model > 8)) {
217 rdmsr(MSR_K7_CLK_CTL, l, h); 216 rdmsr(MSR_K7_CLK_CTL, l, h);
218 if ((l & 0xfff00000) != 0x20000000) { 217 if ((l & 0xfff00000) != 0x20000000) {
219 printk(KERN_INFO 218 pr_info("CPU: CLK_CTL MSR was %x. Reprogramming to %x\n",
220 "CPU: CLK_CTL MSR was %x. Reprogramming to %x\n", 219 l, ((l & 0x000fffff)|0x20000000));
221 l, ((l & 0x000fffff)|0x20000000));
222 wrmsr(MSR_K7_CLK_CTL, (l & 0x000fffff)|0x20000000, h); 220 wrmsr(MSR_K7_CLK_CTL, (l & 0x000fffff)|0x20000000, h);
223 } 221 }
224 } 222 }
@@ -485,7 +483,7 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
485 if (!rdmsrl_safe(MSR_K8_TSEG_ADDR, &tseg)) { 483 if (!rdmsrl_safe(MSR_K8_TSEG_ADDR, &tseg)) {
486 unsigned long pfn = tseg >> PAGE_SHIFT; 484 unsigned long pfn = tseg >> PAGE_SHIFT;
487 485
488 printk(KERN_DEBUG "tseg: %010llx\n", tseg); 486 pr_debug("tseg: %010llx\n", tseg);
489 if (pfn_range_is_mapped(pfn, pfn + 1)) 487 if (pfn_range_is_mapped(pfn, pfn + 1))
490 set_memory_4k((unsigned long)__va(tseg), 1); 488 set_memory_4k((unsigned long)__va(tseg), 1);
491 } 489 }
@@ -500,8 +498,7 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
500 498
501 rdmsrl(MSR_K7_HWCR, val); 499 rdmsrl(MSR_K7_HWCR, val);
502 if (!(val & BIT(24))) 500 if (!(val & BIT(24)))
503 printk(KERN_WARNING FW_BUG "TSC doesn't count " 501 pr_warn(FW_BUG "TSC doesn't count with P0 frequency!\n");
504 "with P0 frequency!\n");
505 } 502 }
506 } 503 }
507 504
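From here on, most hunks are mechanical conversions from printk(KERN_LEVEL ...) to the pr_info()/pr_warn()/pr_cont() family, which shortens the call sites and lets a file prepend a common prefix via pr_fmt(). A simplified user-space sketch of how such wrappers relate to a raw print call; the demo_* macros are local stand-ins, not the real definitions from include/linux/printk.h:

#include <stdio.h>

/* Simplified stand-ins for the kernel's printk wrappers (illustration only). */
#define demo_pr_fmt(fmt)	"amd_demo: " fmt	/* per-file prefix, like pr_fmt() */
#define DEMO_KERN_INFO		"<6>"			/* loglevel prefix, like KERN_INFO */
#define DEMO_KERN_WARNING	"<4>"
#define demo_pr_info(fmt, ...)	printf(DEMO_KERN_INFO demo_pr_fmt(fmt), ##__VA_ARGS__)
#define demo_pr_warn(fmt, ...)	printf(DEMO_KERN_WARNING demo_pr_fmt(fmt), ##__VA_ARGS__)

int main(void)
{
	/* One call instead of printk(KERN_INFO "...") and the prefix comes for free. */
	demo_pr_info("Enabling old style K6 write allocation for %d Mb\n", 64);
	demo_pr_warn("TSC doesn't count with P0 frequency!\n");
	return 0;
}
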
diff --git a/arch/x86/kernel/cpu/bugs_64.c b/arch/x86/kernel/cpu/bugs_64.c
index 04f0fe5af83e..a972ac4c7e7d 100644
--- a/arch/x86/kernel/cpu/bugs_64.c
+++ b/arch/x86/kernel/cpu/bugs_64.c
@@ -15,7 +15,7 @@ void __init check_bugs(void)
15{ 15{
16 identify_boot_cpu(); 16 identify_boot_cpu();
17#if !defined(CONFIG_SMP) 17#if !defined(CONFIG_SMP)
18 printk(KERN_INFO "CPU: "); 18 pr_info("CPU: ");
19 print_cpu_info(&boot_cpu_data); 19 print_cpu_info(&boot_cpu_data);
20#endif 20#endif
21 alternative_instructions(); 21 alternative_instructions();
diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c
index ae20be6e483c..ce197bb7c129 100644
--- a/arch/x86/kernel/cpu/centaur.c
+++ b/arch/x86/kernel/cpu/centaur.c
@@ -29,7 +29,7 @@ static void init_c3(struct cpuinfo_x86 *c)
29 rdmsr(MSR_VIA_FCR, lo, hi); 29 rdmsr(MSR_VIA_FCR, lo, hi);
30 lo |= ACE_FCR; /* enable ACE unit */ 30 lo |= ACE_FCR; /* enable ACE unit */
31 wrmsr(MSR_VIA_FCR, lo, hi); 31 wrmsr(MSR_VIA_FCR, lo, hi);
32 printk(KERN_INFO "CPU: Enabled ACE h/w crypto\n"); 32 pr_info("CPU: Enabled ACE h/w crypto\n");
33 } 33 }
34 34
35 /* enable RNG unit, if present and disabled */ 35 /* enable RNG unit, if present and disabled */
@@ -37,7 +37,7 @@ static void init_c3(struct cpuinfo_x86 *c)
37 rdmsr(MSR_VIA_RNG, lo, hi); 37 rdmsr(MSR_VIA_RNG, lo, hi);
38 lo |= RNG_ENABLE; /* enable RNG unit */ 38 lo |= RNG_ENABLE; /* enable RNG unit */
39 wrmsr(MSR_VIA_RNG, lo, hi); 39 wrmsr(MSR_VIA_RNG, lo, hi);
40 printk(KERN_INFO "CPU: Enabled h/w RNG\n"); 40 pr_info("CPU: Enabled h/w RNG\n");
41 } 41 }
42 42
43 /* store Centaur Extended Feature Flags as 43 /* store Centaur Extended Feature Flags as
@@ -130,7 +130,7 @@ static void init_centaur(struct cpuinfo_x86 *c)
130 name = "C6"; 130 name = "C6";
131 fcr_set = ECX8|DSMC|EDCTLB|EMMX|ERETSTK; 131 fcr_set = ECX8|DSMC|EDCTLB|EMMX|ERETSTK;
132 fcr_clr = DPDC; 132 fcr_clr = DPDC;
133 printk(KERN_NOTICE "Disabling bugged TSC.\n"); 133 pr_notice("Disabling bugged TSC.\n");
134 clear_cpu_cap(c, X86_FEATURE_TSC); 134 clear_cpu_cap(c, X86_FEATURE_TSC);
135 break; 135 break;
136 case 8: 136 case 8:
@@ -163,11 +163,11 @@ static void init_centaur(struct cpuinfo_x86 *c)
163 newlo = (lo|fcr_set) & (~fcr_clr); 163 newlo = (lo|fcr_set) & (~fcr_clr);
164 164
165 if (newlo != lo) { 165 if (newlo != lo) {
166 printk(KERN_INFO "Centaur FCR was 0x%X now 0x%X\n", 166 pr_info("Centaur FCR was 0x%X now 0x%X\n",
167 lo, newlo); 167 lo, newlo);
168 wrmsr(MSR_IDT_FCR1, newlo, hi); 168 wrmsr(MSR_IDT_FCR1, newlo, hi);
169 } else { 169 } else {
170 printk(KERN_INFO "Centaur FCR is 0x%X\n", lo); 170 pr_info("Centaur FCR is 0x%X\n", lo);
171 } 171 }
172 /* Emulate MTRRs using Centaur's MCR. */ 172 /* Emulate MTRRs using Centaur's MCR. */
173 set_cpu_cap(c, X86_FEATURE_CENTAUR_MCR); 173 set_cpu_cap(c, X86_FEATURE_CENTAUR_MCR);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 37830de8f60a..81cf716f6f97 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -228,7 +228,7 @@ static void squash_the_stupid_serial_number(struct cpuinfo_x86 *c)
228 lo |= 0x200000; 228 lo |= 0x200000;
229 wrmsr(MSR_IA32_BBL_CR_CTL, lo, hi); 229 wrmsr(MSR_IA32_BBL_CR_CTL, lo, hi);
230 230
231 printk(KERN_NOTICE "CPU serial number disabled.\n"); 231 pr_notice("CPU serial number disabled.\n");
232 clear_cpu_cap(c, X86_FEATURE_PN); 232 clear_cpu_cap(c, X86_FEATURE_PN);
233 233
234 /* Disabling the serial number may affect the cpuid level */ 234 /* Disabling the serial number may affect the cpuid level */
@@ -329,9 +329,8 @@ static void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn)
329 if (!warn) 329 if (!warn)
330 continue; 330 continue;
331 331
332 printk(KERN_WARNING 332 pr_warn("CPU: CPU feature " X86_CAP_FMT " disabled, no CPUID level 0x%x\n",
333 "CPU: CPU feature " X86_CAP_FMT " disabled, no CPUID level 0x%x\n", 333 x86_cap_flag(df->feature), df->level);
334 x86_cap_flag(df->feature), df->level);
335 } 334 }
336} 335}
337 336
@@ -510,7 +509,7 @@ void detect_ht(struct cpuinfo_x86 *c)
510 smp_num_siblings = (ebx & 0xff0000) >> 16; 509 smp_num_siblings = (ebx & 0xff0000) >> 16;
511 510
512 if (smp_num_siblings == 1) { 511 if (smp_num_siblings == 1) {
513 printk_once(KERN_INFO "CPU0: Hyper-Threading is disabled\n"); 512 pr_info_once("CPU0: Hyper-Threading is disabled\n");
514 goto out; 513 goto out;
515 } 514 }
516 515
@@ -531,10 +530,10 @@ void detect_ht(struct cpuinfo_x86 *c)
531 530
532out: 531out:
533 if (!printed && (c->x86_max_cores * smp_num_siblings) > 1) { 532 if (!printed && (c->x86_max_cores * smp_num_siblings) > 1) {
534 printk(KERN_INFO "CPU: Physical Processor ID: %d\n", 533 pr_info("CPU: Physical Processor ID: %d\n",
535 c->phys_proc_id); 534 c->phys_proc_id);
536 printk(KERN_INFO "CPU: Processor Core ID: %d\n", 535 pr_info("CPU: Processor Core ID: %d\n",
537 c->cpu_core_id); 536 c->cpu_core_id);
538 printed = 1; 537 printed = 1;
539 } 538 }
540#endif 539#endif
@@ -559,9 +558,8 @@ static void get_cpu_vendor(struct cpuinfo_x86 *c)
559 } 558 }
560 } 559 }
561 560
562 printk_once(KERN_ERR 561 pr_err_once("CPU: vendor_id '%s' unknown, using generic init.\n" \
563 "CPU: vendor_id '%s' unknown, using generic init.\n" \ 562 "CPU: Your system may be unstable.\n", v);
564 "CPU: Your system may be unstable.\n", v);
565 563
566 c->x86_vendor = X86_VENDOR_UNKNOWN; 564 c->x86_vendor = X86_VENDOR_UNKNOWN;
567 this_cpu = &default_cpu; 565 this_cpu = &default_cpu;
@@ -760,7 +758,7 @@ void __init early_cpu_init(void)
760 int count = 0; 758 int count = 0;
761 759
762#ifdef CONFIG_PROCESSOR_SELECT 760#ifdef CONFIG_PROCESSOR_SELECT
763 printk(KERN_INFO "KERNEL supported cpus:\n"); 761 pr_info("KERNEL supported cpus:\n");
764#endif 762#endif
765 763
766 for (cdev = __x86_cpu_dev_start; cdev < __x86_cpu_dev_end; cdev++) { 764 for (cdev = __x86_cpu_dev_start; cdev < __x86_cpu_dev_end; cdev++) {
@@ -778,7 +776,7 @@ void __init early_cpu_init(void)
778 for (j = 0; j < 2; j++) { 776 for (j = 0; j < 2; j++) {
779 if (!cpudev->c_ident[j]) 777 if (!cpudev->c_ident[j])
780 continue; 778 continue;
781 printk(KERN_INFO " %s %s\n", cpudev->c_vendor, 779 pr_info(" %s %s\n", cpudev->c_vendor,
782 cpudev->c_ident[j]); 780 cpudev->c_ident[j]);
783 } 781 }
784 } 782 }
@@ -977,6 +975,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
977#ifdef CONFIG_NUMA 975#ifdef CONFIG_NUMA
978 numa_add_cpu(smp_processor_id()); 976 numa_add_cpu(smp_processor_id());
979#endif 977#endif
 978 /* The boot/hotplug time assignment got cleared, restore it */
979 c->logical_proc_id = topology_phys_to_logical_pkg(c->phys_proc_id);
980} 980}
981 981
982/* 982/*
@@ -1061,7 +1061,7 @@ static void __print_cpu_msr(void)
1061 for (index = index_min; index < index_max; index++) { 1061 for (index = index_min; index < index_max; index++) {
1062 if (rdmsrl_safe(index, &val)) 1062 if (rdmsrl_safe(index, &val))
1063 continue; 1063 continue;
1064 printk(KERN_INFO " MSR%08x: %016llx\n", index, val); 1064 pr_info(" MSR%08x: %016llx\n", index, val);
1065 } 1065 }
1066 } 1066 }
1067} 1067}
@@ -1100,19 +1100,19 @@ void print_cpu_info(struct cpuinfo_x86 *c)
1100 } 1100 }
1101 1101
1102 if (vendor && !strstr(c->x86_model_id, vendor)) 1102 if (vendor && !strstr(c->x86_model_id, vendor))
1103 printk(KERN_CONT "%s ", vendor); 1103 pr_cont("%s ", vendor);
1104 1104
1105 if (c->x86_model_id[0]) 1105 if (c->x86_model_id[0])
1106 printk(KERN_CONT "%s", c->x86_model_id); 1106 pr_cont("%s", c->x86_model_id);
1107 else 1107 else
1108 printk(KERN_CONT "%d86", c->x86); 1108 pr_cont("%d86", c->x86);
1109 1109
1110 printk(KERN_CONT " (family: 0x%x, model: 0x%x", c->x86, c->x86_model); 1110 pr_cont(" (family: 0x%x, model: 0x%x", c->x86, c->x86_model);
1111 1111
1112 if (c->x86_mask || c->cpuid_level >= 0) 1112 if (c->x86_mask || c->cpuid_level >= 0)
1113 printk(KERN_CONT ", stepping: 0x%x)\n", c->x86_mask); 1113 pr_cont(", stepping: 0x%x)\n", c->x86_mask);
1114 else 1114 else
1115 printk(KERN_CONT ")\n"); 1115 pr_cont(")\n");
1116 1116
1117 print_cpu_msr(c); 1117 print_cpu_msr(c);
1118} 1118}
@@ -1438,7 +1438,7 @@ void cpu_init(void)
1438 1438
1439 show_ucode_info_early(); 1439 show_ucode_info_early();
1440 1440
1441 printk(KERN_INFO "Initializing CPU#%d\n", cpu); 1441 pr_info("Initializing CPU#%d\n", cpu);
1442 1442
1443 if (cpu_feature_enabled(X86_FEATURE_VME) || 1443 if (cpu_feature_enabled(X86_FEATURE_VME) ||
1444 cpu_has_tsc || 1444 cpu_has_tsc ||
diff --git a/arch/x86/kernel/cpu/cyrix.c b/arch/x86/kernel/cpu/cyrix.c
index aaf152e79637..187bb583d0df 100644
--- a/arch/x86/kernel/cpu/cyrix.c
+++ b/arch/x86/kernel/cpu/cyrix.c
@@ -103,7 +103,7 @@ static void check_cx686_slop(struct cpuinfo_x86 *c)
103 local_irq_restore(flags); 103 local_irq_restore(flags);
104 104
105 if (ccr5 & 2) { /* possible wrong calibration done */ 105 if (ccr5 & 2) { /* possible wrong calibration done */
106 printk(KERN_INFO "Recalibrating delay loop with SLOP bit reset\n"); 106 pr_info("Recalibrating delay loop with SLOP bit reset\n");
107 calibrate_delay(); 107 calibrate_delay();
108 c->loops_per_jiffy = loops_per_jiffy; 108 c->loops_per_jiffy = loops_per_jiffy;
109 } 109 }
@@ -115,7 +115,7 @@ static void set_cx86_reorder(void)
115{ 115{
116 u8 ccr3; 116 u8 ccr3;
117 117
118 printk(KERN_INFO "Enable Memory access reorder on Cyrix/NSC processor.\n"); 118 pr_info("Enable Memory access reorder on Cyrix/NSC processor.\n");
119 ccr3 = getCx86(CX86_CCR3); 119 ccr3 = getCx86(CX86_CCR3);
120 setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10); /* enable MAPEN */ 120 setCx86(CX86_CCR3, (ccr3 & 0x0f) | 0x10); /* enable MAPEN */
121 121
@@ -128,7 +128,7 @@ static void set_cx86_reorder(void)
128 128
129static void set_cx86_memwb(void) 129static void set_cx86_memwb(void)
130{ 130{
131 printk(KERN_INFO "Enable Memory-Write-back mode on Cyrix/NSC processor.\n"); 131 pr_info("Enable Memory-Write-back mode on Cyrix/NSC processor.\n");
132 132
133 /* CCR2 bit 2: unlock NW bit */ 133 /* CCR2 bit 2: unlock NW bit */
134 setCx86_old(CX86_CCR2, getCx86_old(CX86_CCR2) & ~0x04); 134 setCx86_old(CX86_CCR2, getCx86_old(CX86_CCR2) & ~0x04);
@@ -268,7 +268,7 @@ static void init_cyrix(struct cpuinfo_x86 *c)
268 * VSA1 we work around however. 268 * VSA1 we work around however.
269 */ 269 */
270 270
271 printk(KERN_INFO "Working around Cyrix MediaGX virtual DMA bugs.\n"); 271 pr_info("Working around Cyrix MediaGX virtual DMA bugs.\n");
272 isa_dma_bridge_buggy = 2; 272 isa_dma_bridge_buggy = 2;
273 273
274 /* We do this before the PCI layer is running. However we 274 /* We do this before the PCI layer is running. However we
@@ -426,7 +426,7 @@ static void cyrix_identify(struct cpuinfo_x86 *c)
426 if (dir0 == 5 || dir0 == 3) { 426 if (dir0 == 5 || dir0 == 3) {
427 unsigned char ccr3; 427 unsigned char ccr3;
428 unsigned long flags; 428 unsigned long flags;
429 printk(KERN_INFO "Enabling CPUID on Cyrix processor.\n"); 429 pr_info("Enabling CPUID on Cyrix processor.\n");
430 local_irq_save(flags); 430 local_irq_save(flags);
431 ccr3 = getCx86(CX86_CCR3); 431 ccr3 = getCx86(CX86_CCR3);
432 /* enable MAPEN */ 432 /* enable MAPEN */
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index d820d8eae96b..73d391ae452f 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -56,7 +56,7 @@ detect_hypervisor_vendor(void)
56 } 56 }
57 57
58 if (max_pri) 58 if (max_pri)
59 printk(KERN_INFO "Hypervisor detected: %s\n", x86_hyper->name); 59 pr_info("Hypervisor detected: %s\n", x86_hyper->name);
60} 60}
61 61
62void init_hypervisor(struct cpuinfo_x86 *c) 62void init_hypervisor(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 565648bc1a0a..38766c2b5b00 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -61,7 +61,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
61 */ 61 */
62 if (c->x86 == 6 && c->x86_model == 0x1c && c->x86_mask <= 2 && 62 if (c->x86 == 6 && c->x86_model == 0x1c && c->x86_mask <= 2 &&
63 c->microcode < 0x20e) { 63 c->microcode < 0x20e) {
64 printk(KERN_WARNING "Atom PSE erratum detected, BIOS microcode update recommended\n"); 64 pr_warn("Atom PSE erratum detected, BIOS microcode update recommended\n");
65 clear_cpu_cap(c, X86_FEATURE_PSE); 65 clear_cpu_cap(c, X86_FEATURE_PSE);
66 } 66 }
67 67
@@ -140,7 +140,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
140 if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) { 140 if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
141 rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable); 141 rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
142 if (!(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING)) { 142 if (!(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING)) {
143 printk(KERN_INFO "Disabled fast string operations\n"); 143 pr_info("Disabled fast string operations\n");
144 setup_clear_cpu_cap(X86_FEATURE_REP_GOOD); 144 setup_clear_cpu_cap(X86_FEATURE_REP_GOOD);
145 setup_clear_cpu_cap(X86_FEATURE_ERMS); 145 setup_clear_cpu_cap(X86_FEATURE_ERMS);
146 } 146 }
@@ -160,6 +160,19 @@ static void early_init_intel(struct cpuinfo_x86 *c)
160 pr_info("Disabling PGE capability bit\n"); 160 pr_info("Disabling PGE capability bit\n");
161 setup_clear_cpu_cap(X86_FEATURE_PGE); 161 setup_clear_cpu_cap(X86_FEATURE_PGE);
162 } 162 }
163
164 if (c->cpuid_level >= 0x00000001) {
165 u32 eax, ebx, ecx, edx;
166
167 cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
168 /*
169 * If HTT (EDX[28]) is set EBX[16:23] contain the number of
170 * apicids which are reserved per package. Store the resulting
171 * shift value for the package management code.
172 */
173 if (edx & (1U << 28))
174 c->x86_coreid_bits = get_count_order((ebx >> 16) & 0xff);
175 }
163} 176}
164 177
165#ifdef CONFIG_X86_32 178#ifdef CONFIG_X86_32
@@ -176,7 +189,7 @@ int ppro_with_ram_bug(void)
176 boot_cpu_data.x86 == 6 && 189 boot_cpu_data.x86 == 6 &&
177 boot_cpu_data.x86_model == 1 && 190 boot_cpu_data.x86_model == 1 &&
178 boot_cpu_data.x86_mask < 8) { 191 boot_cpu_data.x86_mask < 8) {
179 printk(KERN_INFO "Pentium Pro with Errata#50 detected. Taking evasive action.\n"); 192 pr_info("Pentium Pro with Errata#50 detected. Taking evasive action.\n");
180 return 1; 193 return 1;
181 } 194 }
182 return 0; 195 return 0;
@@ -225,7 +238,7 @@ static void intel_workarounds(struct cpuinfo_x86 *c)
225 238
226 set_cpu_bug(c, X86_BUG_F00F); 239 set_cpu_bug(c, X86_BUG_F00F);
227 if (!f00f_workaround_enabled) { 240 if (!f00f_workaround_enabled) {
228 printk(KERN_NOTICE "Intel Pentium with F0 0F bug - workaround enabled.\n"); 241 pr_notice("Intel Pentium with F0 0F bug - workaround enabled.\n");
229 f00f_workaround_enabled = 1; 242 f00f_workaround_enabled = 1;
230 } 243 }
231 } 244 }
@@ -244,7 +257,7 @@ static void intel_workarounds(struct cpuinfo_x86 *c)
244 * Forcefully enable PAE if kernel parameter "forcepae" is present. 257 * Forcefully enable PAE if kernel parameter "forcepae" is present.
245 */ 258 */
246 if (forcepae) { 259 if (forcepae) {
247 printk(KERN_WARNING "PAE forced!\n"); 260 pr_warn("PAE forced!\n");
248 set_cpu_cap(c, X86_FEATURE_PAE); 261 set_cpu_cap(c, X86_FEATURE_PAE);
249 add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_NOW_UNRELIABLE); 262 add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_NOW_UNRELIABLE);
250 } 263 }
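The early_init_intel() addition reads CPUID leaf 1 and, when the HTT bit (EDX[28]) is set, derives x86_coreid_bits as the shift covering the "reserved APIC ids per package" count in EBX[23:16]; that shift is what the package-map code divides the APIC id by. A rough user-space sketch of the same computation; the __get_cpuid() intrinsic use and the demo_count_order() stand-in are illustrative and skip the corner cases the kernel handles:

#include <stdio.h>
#include <cpuid.h>	/* GCC/clang __get_cpuid() */

/* Smallest n such that (1 << n) >= x, similar in spirit to get_count_order(). */
static int demo_count_order(unsigned int x)
{
	int order = 0;

	while ((1u << order) < x)
		order++;
	return order;
}

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 1;

	if (edx & (1u << 28)) {		/* HTT: EBX[23:16] = APIC ids reserved per package */
		unsigned int ids = (ebx >> 16) & 0xff;

		printf("apic ids per package: %u, coreid shift: %d\n",
		       ids, demo_count_order(ids));
	} else {
		printf("HTT not reported; single logical processor per package\n");
	}
	return 0;
}
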
diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 0b6c52388cf4..6ed779efff26 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -444,7 +444,7 @@ static ssize_t store_cache_disable(struct cacheinfo *this_leaf,
444 err = amd_set_l3_disable_slot(nb, cpu, slot, val); 444 err = amd_set_l3_disable_slot(nb, cpu, slot, val);
445 if (err) { 445 if (err) {
446 if (err == -EEXIST) 446 if (err == -EEXIST)
447 pr_warning("L3 slot %d in use/index already disabled!\n", 447 pr_warn("L3 slot %d in use/index already disabled!\n",
448 slot); 448 slot);
449 return err; 449 return err;
450 } 450 }
diff --git a/arch/x86/kernel/cpu/mcheck/mce-inject.c b/arch/x86/kernel/cpu/mcheck/mce-inject.c
index 4cfba4371a71..517619ea6498 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -115,7 +115,7 @@ static int raise_local(void)
115 int cpu = m->extcpu; 115 int cpu = m->extcpu;
116 116
117 if (m->inject_flags & MCJ_EXCEPTION) { 117 if (m->inject_flags & MCJ_EXCEPTION) {
118 printk(KERN_INFO "Triggering MCE exception on CPU %d\n", cpu); 118 pr_info("Triggering MCE exception on CPU %d\n", cpu);
119 switch (context) { 119 switch (context) {
120 case MCJ_CTX_IRQ: 120 case MCJ_CTX_IRQ:
121 /* 121 /*
@@ -128,15 +128,15 @@ static int raise_local(void)
128 raise_exception(m, NULL); 128 raise_exception(m, NULL);
129 break; 129 break;
130 default: 130 default:
131 printk(KERN_INFO "Invalid MCE context\n"); 131 pr_info("Invalid MCE context\n");
132 ret = -EINVAL; 132 ret = -EINVAL;
133 } 133 }
134 printk(KERN_INFO "MCE exception done on CPU %d\n", cpu); 134 pr_info("MCE exception done on CPU %d\n", cpu);
135 } else if (m->status) { 135 } else if (m->status) {
136 printk(KERN_INFO "Starting machine check poll CPU %d\n", cpu); 136 pr_info("Starting machine check poll CPU %d\n", cpu);
137 raise_poll(m); 137 raise_poll(m);
138 mce_notify_irq(); 138 mce_notify_irq();
139 printk(KERN_INFO "Machine check poll done on CPU %d\n", cpu); 139 pr_info("Machine check poll done on CPU %d\n", cpu);
140 } else 140 } else
141 m->finished = 0; 141 m->finished = 0;
142 142
@@ -183,8 +183,7 @@ static void raise_mce(struct mce *m)
183 start = jiffies; 183 start = jiffies;
184 while (!cpumask_empty(mce_inject_cpumask)) { 184 while (!cpumask_empty(mce_inject_cpumask)) {
185 if (!time_before(jiffies, start + 2*HZ)) { 185 if (!time_before(jiffies, start + 2*HZ)) {
186 printk(KERN_ERR 186 pr_err("Timeout waiting for mce inject %lx\n",
187 "Timeout waiting for mce inject %lx\n",
188 *cpumask_bits(mce_inject_cpumask)); 187 *cpumask_bits(mce_inject_cpumask));
189 break; 188 break;
190 } 189 }
@@ -241,7 +240,7 @@ static int inject_init(void)
241{ 240{
242 if (!alloc_cpumask_var(&mce_inject_cpumask, GFP_KERNEL)) 241 if (!alloc_cpumask_var(&mce_inject_cpumask, GFP_KERNEL))
243 return -ENOMEM; 242 return -ENOMEM;
244 printk(KERN_INFO "Machine check injector initialized\n"); 243 pr_info("Machine check injector initialized\n");
245 register_mce_write_callback(mce_write); 244 register_mce_write_callback(mce_write);
246 register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0, 245 register_nmi_handler(NMI_LOCAL, mce_raise_notify, 0,
247 "mce_notify"); 246 "mce_notify");
diff --git a/arch/x86/kernel/cpu/mcheck/p5.c b/arch/x86/kernel/cpu/mcheck/p5.c
index 12402e10aeff..2a0717bf8033 100644
--- a/arch/x86/kernel/cpu/mcheck/p5.c
+++ b/arch/x86/kernel/cpu/mcheck/p5.c
@@ -26,14 +26,12 @@ static void pentium_machine_check(struct pt_regs *regs, long error_code)
26 rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi); 26 rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi);
27 rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi); 27 rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi);
28 28
29 printk(KERN_EMERG 29 pr_emerg("CPU#%d: Machine Check Exception: 0x%8X (type 0x%8X).\n",
30 "CPU#%d: Machine Check Exception: 0x%8X (type 0x%8X).\n", 30 smp_processor_id(), loaddr, lotype);
31 smp_processor_id(), loaddr, lotype);
32 31
33 if (lotype & (1<<5)) { 32 if (lotype & (1<<5)) {
34 printk(KERN_EMERG 33 pr_emerg("CPU#%d: Possible thermal failure (CPU on fire ?).\n",
35 "CPU#%d: Possible thermal failure (CPU on fire ?).\n", 34 smp_processor_id());
36 smp_processor_id());
37 } 35 }
38 36
39 add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE); 37 add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
@@ -61,12 +59,10 @@ void intel_p5_mcheck_init(struct cpuinfo_x86 *c)
61 /* Read registers before enabling: */ 59 /* Read registers before enabling: */
62 rdmsr(MSR_IA32_P5_MC_ADDR, l, h); 60 rdmsr(MSR_IA32_P5_MC_ADDR, l, h);
63 rdmsr(MSR_IA32_P5_MC_TYPE, l, h); 61 rdmsr(MSR_IA32_P5_MC_TYPE, l, h);
64 printk(KERN_INFO 62 pr_info("Intel old style machine check architecture supported.\n");
65 "Intel old style machine check architecture supported.\n");
66 63
67 /* Enable MCE: */ 64 /* Enable MCE: */
68 cr4_set_bits(X86_CR4_MCE); 65 cr4_set_bits(X86_CR4_MCE);
69 printk(KERN_INFO 66 pr_info("Intel old style machine check reporting enabled on CPU#%d.\n",
70 "Intel old style machine check reporting enabled on CPU#%d.\n", 67 smp_processor_id());
71 smp_processor_id());
72} 68}
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 2c5aaf8c2e2f..0b445c2ff735 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -190,7 +190,7 @@ static int therm_throt_process(bool new_event, int event, int level)
190 /* if we just entered the thermal event */ 190 /* if we just entered the thermal event */
191 if (new_event) { 191 if (new_event) {
192 if (event == THERMAL_THROTTLING_EVENT) 192 if (event == THERMAL_THROTTLING_EVENT)
193 printk(KERN_CRIT "CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n", 193 pr_crit("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
194 this_cpu, 194 this_cpu,
195 level == CORE_LEVEL ? "Core" : "Package", 195 level == CORE_LEVEL ? "Core" : "Package",
196 state->count); 196 state->count);
@@ -198,8 +198,7 @@ static int therm_throt_process(bool new_event, int event, int level)
198 } 198 }
199 if (old_event) { 199 if (old_event) {
200 if (event == THERMAL_THROTTLING_EVENT) 200 if (event == THERMAL_THROTTLING_EVENT)
201 printk(KERN_INFO "CPU%d: %s temperature/speed normal\n", 201 pr_info("CPU%d: %s temperature/speed normal\n", this_cpu,
202 this_cpu,
203 level == CORE_LEVEL ? "Core" : "Package"); 202 level == CORE_LEVEL ? "Core" : "Package");
204 return 1; 203 return 1;
205 } 204 }
@@ -417,8 +416,8 @@ static void intel_thermal_interrupt(void)
417 416
418static void unexpected_thermal_interrupt(void) 417static void unexpected_thermal_interrupt(void)
419{ 418{
420 printk(KERN_ERR "CPU%d: Unexpected LVT thermal interrupt!\n", 419 pr_err("CPU%d: Unexpected LVT thermal interrupt!\n",
421 smp_processor_id()); 420 smp_processor_id());
422} 421}
423 422
424static void (*smp_thermal_vector)(void) = unexpected_thermal_interrupt; 423static void (*smp_thermal_vector)(void) = unexpected_thermal_interrupt;
@@ -499,7 +498,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
499 498
500 if ((l & MSR_IA32_MISC_ENABLE_TM1) && (h & APIC_DM_SMI)) { 499 if ((l & MSR_IA32_MISC_ENABLE_TM1) && (h & APIC_DM_SMI)) {
501 if (system_state == SYSTEM_BOOTING) 500 if (system_state == SYSTEM_BOOTING)
502 printk(KERN_DEBUG "CPU%d: Thermal monitoring handled by SMI\n", cpu); 501 pr_debug("CPU%d: Thermal monitoring handled by SMI\n", cpu);
503 return; 502 return;
504 } 503 }
505 504
@@ -557,8 +556,8 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
557 l = apic_read(APIC_LVTTHMR); 556 l = apic_read(APIC_LVTTHMR);
558 apic_write(APIC_LVTTHMR, l & ~APIC_LVT_MASKED); 557 apic_write(APIC_LVTTHMR, l & ~APIC_LVT_MASKED);
559 558
560 printk_once(KERN_INFO "CPU0: Thermal monitoring enabled (%s)\n", 559 pr_info_once("CPU0: Thermal monitoring enabled (%s)\n",
561 tm2 ? "TM2" : "TM1"); 560 tm2 ? "TM2" : "TM1");
562 561
563 /* enable thermal throttle processing */ 562 /* enable thermal throttle processing */
564 atomic_set(&therm_throt_en, 1); 563 atomic_set(&therm_throt_en, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/threshold.c b/arch/x86/kernel/cpu/mcheck/threshold.c
index 7245980186ee..fcf9ae9384f4 100644
--- a/arch/x86/kernel/cpu/mcheck/threshold.c
+++ b/arch/x86/kernel/cpu/mcheck/threshold.c
@@ -12,8 +12,8 @@
12 12
13static void default_threshold_interrupt(void) 13static void default_threshold_interrupt(void)
14{ 14{
15 printk(KERN_ERR "Unexpected threshold interrupt at vector %x\n", 15 pr_err("Unexpected threshold interrupt at vector %x\n",
16 THRESHOLD_APIC_VECTOR); 16 THRESHOLD_APIC_VECTOR);
17} 17}
18 18
19void (*mce_threshold_vector)(void) = default_threshold_interrupt; 19void (*mce_threshold_vector)(void) = default_threshold_interrupt;
diff --git a/arch/x86/kernel/cpu/mcheck/winchip.c b/arch/x86/kernel/cpu/mcheck/winchip.c
index 01dd8702880b..c6a722e1d011 100644
--- a/arch/x86/kernel/cpu/mcheck/winchip.c
+++ b/arch/x86/kernel/cpu/mcheck/winchip.c
@@ -17,7 +17,7 @@ static void winchip_machine_check(struct pt_regs *regs, long error_code)
17{ 17{
18 ist_enter(regs); 18 ist_enter(regs);
19 19
20 printk(KERN_EMERG "CPU0: Machine Check Exception.\n"); 20 pr_emerg("CPU0: Machine Check Exception.\n");
21 add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE); 21 add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
22 22
23 ist_exit(regs); 23 ist_exit(regs);
@@ -39,6 +39,5 @@ void winchip_mcheck_init(struct cpuinfo_x86 *c)
39 39
40 cr4_set_bits(X86_CR4_MCE); 40 cr4_set_bits(X86_CR4_MCE);
41 41
42 printk(KERN_INFO 42 pr_info("Winchip machine check reporting enabled on CPU#0.\n");
43 "Winchip machine check reporting enabled on CPU#0.\n");
44} 43}
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 2233f8a76615..75d3aab5f7b2 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -953,7 +953,7 @@ struct microcode_ops * __init init_amd_microcode(void)
953 struct cpuinfo_x86 *c = &boot_cpu_data; 953 struct cpuinfo_x86 *c = &boot_cpu_data;
954 954
955 if (c->x86_vendor != X86_VENDOR_AMD || c->x86 < 0x10) { 955 if (c->x86_vendor != X86_VENDOR_AMD || c->x86 < 0x10) {
956 pr_warning("AMD CPU family 0x%x not supported\n", c->x86); 956 pr_warn("AMD CPU family 0x%x not supported\n", c->x86);
957 return NULL; 957 return NULL;
958 } 958 }
959 959
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 20e242ea1bc4..4e7c6933691c 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -161,8 +161,8 @@ static void __init ms_hyperv_init_platform(void)
161 ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES); 161 ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES);
162 ms_hyperv.hints = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO); 162 ms_hyperv.hints = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO);
163 163
164 printk(KERN_INFO "HyperV: features 0x%x, hints 0x%x\n", 164 pr_info("HyperV: features 0x%x, hints 0x%x\n",
165 ms_hyperv.features, ms_hyperv.hints); 165 ms_hyperv.features, ms_hyperv.hints);
166 166
167#ifdef CONFIG_X86_LOCAL_APIC 167#ifdef CONFIG_X86_LOCAL_APIC
168 if (ms_hyperv.features & HV_X64_MSR_APIC_FREQUENCY_AVAILABLE) { 168 if (ms_hyperv.features & HV_X64_MSR_APIC_FREQUENCY_AVAILABLE) {
@@ -174,8 +174,8 @@ static void __init ms_hyperv_init_platform(void)
174 rdmsrl(HV_X64_MSR_APIC_FREQUENCY, hv_lapic_frequency); 174 rdmsrl(HV_X64_MSR_APIC_FREQUENCY, hv_lapic_frequency);
175 hv_lapic_frequency = div_u64(hv_lapic_frequency, HZ); 175 hv_lapic_frequency = div_u64(hv_lapic_frequency, HZ);
176 lapic_timer_frequency = hv_lapic_frequency; 176 lapic_timer_frequency = hv_lapic_frequency;
177 printk(KERN_INFO "HyperV: LAPIC Timer Frequency: %#x\n", 177 pr_info("HyperV: LAPIC Timer Frequency: %#x\n",
178 lapic_timer_frequency); 178 lapic_timer_frequency);
179 } 179 }
180#endif 180#endif
181 181
diff --git a/arch/x86/kernel/cpu/mtrr/centaur.c b/arch/x86/kernel/cpu/mtrr/centaur.c
index 316fe3e60a97..3d689937fc1b 100644
--- a/arch/x86/kernel/cpu/mtrr/centaur.c
+++ b/arch/x86/kernel/cpu/mtrr/centaur.c
@@ -103,7 +103,7 @@ centaur_validate_add_page(unsigned long base, unsigned long size, unsigned int t
103 */ 103 */
104 if (type != MTRR_TYPE_WRCOMB && 104 if (type != MTRR_TYPE_WRCOMB &&
105 (centaur_mcr_type == 0 || type != MTRR_TYPE_UNCACHABLE)) { 105 (centaur_mcr_type == 0 || type != MTRR_TYPE_UNCACHABLE)) {
106 pr_warning("mtrr: only write-combining%s supported\n", 106 pr_warn("mtrr: only write-combining%s supported\n",
107 centaur_mcr_type ? " and uncacheable are" : " is"); 107 centaur_mcr_type ? " and uncacheable are" : " is");
108 return -EINVAL; 108 return -EINVAL;
109 } 109 }
diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c b/arch/x86/kernel/cpu/mtrr/cleanup.c
index 0d98503c2245..31e951ce6dff 100644
--- a/arch/x86/kernel/cpu/mtrr/cleanup.c
+++ b/arch/x86/kernel/cpu/mtrr/cleanup.c
@@ -57,9 +57,9 @@ static int __initdata nr_range;
57static struct var_mtrr_range_state __initdata range_state[RANGE_NUM]; 57static struct var_mtrr_range_state __initdata range_state[RANGE_NUM];
58 58
59static int __initdata debug_print; 59static int __initdata debug_print;
60#define Dprintk(x...) do { if (debug_print) printk(KERN_DEBUG x); } while (0) 60#define Dprintk(x...) do { if (debug_print) pr_debug(x); } while (0)
61 61
62#define BIOS_BUG_MSG KERN_WARNING \ 62#define BIOS_BUG_MSG \
63 "WARNING: BIOS bug: VAR MTRR %d contains strange UC entry under 1M, check with your system vendor!\n" 63 "WARNING: BIOS bug: VAR MTRR %d contains strange UC entry under 1M, check with your system vendor!\n"
64 64
65static int __init 65static int __init
@@ -81,9 +81,9 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range,
81 base, base + size); 81 base, base + size);
82 } 82 }
83 if (debug_print) { 83 if (debug_print) {
84 printk(KERN_DEBUG "After WB checking\n"); 84 pr_debug("After WB checking\n");
85 for (i = 0; i < nr_range; i++) 85 for (i = 0; i < nr_range; i++)
86 printk(KERN_DEBUG "MTRR MAP PFN: %016llx - %016llx\n", 86 pr_debug("MTRR MAP PFN: %016llx - %016llx\n",
87 range[i].start, range[i].end); 87 range[i].start, range[i].end);
88 } 88 }
89 89
@@ -101,7 +101,7 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range,
101 (mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) && 101 (mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) &&
102 (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { 102 (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
103 /* Var MTRR contains UC entry below 1M? Skip it: */ 103 /* Var MTRR contains UC entry below 1M? Skip it: */
104 printk(BIOS_BUG_MSG, i); 104 pr_warn(BIOS_BUG_MSG, i);
105 if (base + size <= (1<<(20-PAGE_SHIFT))) 105 if (base + size <= (1<<(20-PAGE_SHIFT)))
106 continue; 106 continue;
107 size -= (1<<(20-PAGE_SHIFT)) - base; 107 size -= (1<<(20-PAGE_SHIFT)) - base;
@@ -114,11 +114,11 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range,
114 extra_remove_base + extra_remove_size); 114 extra_remove_base + extra_remove_size);
115 115
116 if (debug_print) { 116 if (debug_print) {
117 printk(KERN_DEBUG "After UC checking\n"); 117 pr_debug("After UC checking\n");
118 for (i = 0; i < RANGE_NUM; i++) { 118 for (i = 0; i < RANGE_NUM; i++) {
119 if (!range[i].end) 119 if (!range[i].end)
120 continue; 120 continue;
121 printk(KERN_DEBUG "MTRR MAP PFN: %016llx - %016llx\n", 121 pr_debug("MTRR MAP PFN: %016llx - %016llx\n",
122 range[i].start, range[i].end); 122 range[i].start, range[i].end);
123 } 123 }
124 } 124 }
@@ -126,9 +126,9 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range,
126 /* sort the ranges */ 126 /* sort the ranges */
127 nr_range = clean_sort_range(range, RANGE_NUM); 127 nr_range = clean_sort_range(range, RANGE_NUM);
128 if (debug_print) { 128 if (debug_print) {
129 printk(KERN_DEBUG "After sorting\n"); 129 pr_debug("After sorting\n");
130 for (i = 0; i < nr_range; i++) 130 for (i = 0; i < nr_range; i++)
131 printk(KERN_DEBUG "MTRR MAP PFN: %016llx - %016llx\n", 131 pr_debug("MTRR MAP PFN: %016llx - %016llx\n",
132 range[i].start, range[i].end); 132 range[i].start, range[i].end);
133 } 133 }
134 134
@@ -544,7 +544,7 @@ static void __init print_out_mtrr_range_state(void)
544 start_base = to_size_factor(start_base, &start_factor), 544 start_base = to_size_factor(start_base, &start_factor),
545 type = range_state[i].type; 545 type = range_state[i].type;
546 546
547 printk(KERN_DEBUG "reg %d, base: %ld%cB, range: %ld%cB, type %s\n", 547 pr_debug("reg %d, base: %ld%cB, range: %ld%cB, type %s\n",
548 i, start_base, start_factor, 548 i, start_base, start_factor,
549 size_base, size_factor, 549 size_base, size_factor,
550 (type == MTRR_TYPE_UNCACHABLE) ? "UC" : 550 (type == MTRR_TYPE_UNCACHABLE) ? "UC" :
@@ -713,7 +713,7 @@ int __init mtrr_cleanup(unsigned address_bits)
713 return 0; 713 return 0;
714 714
715 /* Print original var MTRRs at first, for debugging: */ 715 /* Print original var MTRRs at first, for debugging: */
716 printk(KERN_DEBUG "original variable MTRRs\n"); 716 pr_debug("original variable MTRRs\n");
717 print_out_mtrr_range_state(); 717 print_out_mtrr_range_state();
718 718
719 memset(range, 0, sizeof(range)); 719 memset(range, 0, sizeof(range));
@@ -733,7 +733,7 @@ int __init mtrr_cleanup(unsigned address_bits)
733 x_remove_base, x_remove_size); 733 x_remove_base, x_remove_size);
734 734
735 range_sums = sum_ranges(range, nr_range); 735 range_sums = sum_ranges(range, nr_range);
736 printk(KERN_INFO "total RAM covered: %ldM\n", 736 pr_info("total RAM covered: %ldM\n",
737 range_sums >> (20 - PAGE_SHIFT)); 737 range_sums >> (20 - PAGE_SHIFT));
738 738
739 if (mtrr_chunk_size && mtrr_gran_size) { 739 if (mtrr_chunk_size && mtrr_gran_size) {
@@ -745,12 +745,11 @@ int __init mtrr_cleanup(unsigned address_bits)
745 745
746 if (!result[i].bad) { 746 if (!result[i].bad) {
747 set_var_mtrr_all(address_bits); 747 set_var_mtrr_all(address_bits);
748 printk(KERN_DEBUG "New variable MTRRs\n"); 748 pr_debug("New variable MTRRs\n");
749 print_out_mtrr_range_state(); 749 print_out_mtrr_range_state();
750 return 1; 750 return 1;
751 } 751 }
752 printk(KERN_INFO "invalid mtrr_gran_size or mtrr_chunk_size, " 752 pr_info("invalid mtrr_gran_size or mtrr_chunk_size, will find optimal one\n");
753 "will find optimal one\n");
754 } 753 }
755 754
756 i = 0; 755 i = 0;
@@ -768,7 +767,7 @@ int __init mtrr_cleanup(unsigned address_bits)
768 x_remove_base, x_remove_size, i); 767 x_remove_base, x_remove_size, i);
769 if (debug_print) { 768 if (debug_print) {
770 mtrr_print_out_one_result(i); 769 mtrr_print_out_one_result(i);
771 printk(KERN_INFO "\n"); 770 pr_info("\n");
772 } 771 }
773 772
774 i++; 773 i++;
@@ -779,7 +778,7 @@ int __init mtrr_cleanup(unsigned address_bits)
779 index_good = mtrr_search_optimal_index(); 778 index_good = mtrr_search_optimal_index();
780 779
781 if (index_good != -1) { 780 if (index_good != -1) {
782 printk(KERN_INFO "Found optimal setting for mtrr clean up\n"); 781 pr_info("Found optimal setting for mtrr clean up\n");
783 i = index_good; 782 i = index_good;
784 mtrr_print_out_one_result(i); 783 mtrr_print_out_one_result(i);
785 784
@@ -790,7 +789,7 @@ int __init mtrr_cleanup(unsigned address_bits)
790 gran_size <<= 10; 789 gran_size <<= 10;
791 x86_setup_var_mtrrs(range, nr_range, chunk_size, gran_size); 790 x86_setup_var_mtrrs(range, nr_range, chunk_size, gran_size);
792 set_var_mtrr_all(address_bits); 791 set_var_mtrr_all(address_bits);
793 printk(KERN_DEBUG "New variable MTRRs\n"); 792 pr_debug("New variable MTRRs\n");
794 print_out_mtrr_range_state(); 793 print_out_mtrr_range_state();
795 return 1; 794 return 1;
796 } else { 795 } else {
@@ -799,8 +798,8 @@ int __init mtrr_cleanup(unsigned address_bits)
799 mtrr_print_out_one_result(i); 798 mtrr_print_out_one_result(i);
800 } 799 }
801 800
802 printk(KERN_INFO "mtrr_cleanup: can not find optimal value\n"); 801 pr_info("mtrr_cleanup: can not find optimal value\n");
803 printk(KERN_INFO "please specify mtrr_gran_size/mtrr_chunk_size\n"); 802 pr_info("please specify mtrr_gran_size/mtrr_chunk_size\n");
804 803
805 return 0; 804 return 0;
806} 805}
@@ -918,7 +917,7 @@ int __init mtrr_trim_uncached_memory(unsigned long end_pfn)
918 917
919 /* kvm/qemu doesn't have mtrr set right, don't trim them all: */ 918 /* kvm/qemu doesn't have mtrr set right, don't trim them all: */
920 if (!highest_pfn) { 919 if (!highest_pfn) {
921 printk(KERN_INFO "CPU MTRRs all blank - virtualized system.\n"); 920 pr_info("CPU MTRRs all blank - virtualized system.\n");
922 return 0; 921 return 0;
923 } 922 }
924 923
@@ -973,7 +972,8 @@ int __init mtrr_trim_uncached_memory(unsigned long end_pfn)
973 end_pfn); 972 end_pfn);
974 973
975 if (total_trim_size) { 974 if (total_trim_size) {
976 pr_warning("WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing %lluMB of RAM.\n", total_trim_size >> 20); 975 pr_warn("WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing %lluMB of RAM.\n",
976 total_trim_size >> 20);
977 977
978 if (!changed_by_mtrr_cleanup) 978 if (!changed_by_mtrr_cleanup)
979 WARN_ON(1); 979 WARN_ON(1);
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index c870af161008..fcbcb2f678ca 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -55,7 +55,7 @@ static inline void k8_check_syscfg_dram_mod_en(void)
55 55
56 rdmsr(MSR_K8_SYSCFG, lo, hi); 56 rdmsr(MSR_K8_SYSCFG, lo, hi);
57 if (lo & K8_MTRRFIXRANGE_DRAM_MODIFY) { 57 if (lo & K8_MTRRFIXRANGE_DRAM_MODIFY) {
58 printk(KERN_ERR FW_WARN "MTRR: CPU %u: SYSCFG[MtrrFixDramModEn]" 58 pr_err(FW_WARN "MTRR: CPU %u: SYSCFG[MtrrFixDramModEn]"
59 " not cleared by BIOS, clearing this bit\n", 59 " not cleared by BIOS, clearing this bit\n",
60 smp_processor_id()); 60 smp_processor_id());
61 lo &= ~K8_MTRRFIXRANGE_DRAM_MODIFY; 61 lo &= ~K8_MTRRFIXRANGE_DRAM_MODIFY;
@@ -501,14 +501,14 @@ void __init mtrr_state_warn(void)
501 if (!mask) 501 if (!mask)
502 return; 502 return;
503 if (mask & MTRR_CHANGE_MASK_FIXED) 503 if (mask & MTRR_CHANGE_MASK_FIXED)
504 pr_warning("mtrr: your CPUs had inconsistent fixed MTRR settings\n"); 504 pr_warn("mtrr: your CPUs had inconsistent fixed MTRR settings\n");
505 if (mask & MTRR_CHANGE_MASK_VARIABLE) 505 if (mask & MTRR_CHANGE_MASK_VARIABLE)
506 pr_warning("mtrr: your CPUs had inconsistent variable MTRR settings\n"); 506 pr_warn("mtrr: your CPUs had inconsistent variable MTRR settings\n");
507 if (mask & MTRR_CHANGE_MASK_DEFTYPE) 507 if (mask & MTRR_CHANGE_MASK_DEFTYPE)
508 pr_warning("mtrr: your CPUs had inconsistent MTRRdefType settings\n"); 508 pr_warn("mtrr: your CPUs had inconsistent MTRRdefType settings\n");
509 509
510 printk(KERN_INFO "mtrr: probably your BIOS does not setup all CPUs.\n"); 510 pr_info("mtrr: probably your BIOS does not setup all CPUs.\n");
511 printk(KERN_INFO "mtrr: corrected configuration.\n"); 511 pr_info("mtrr: corrected configuration.\n");
512} 512}
513 513
514/* 514/*
@@ -519,8 +519,7 @@ void __init mtrr_state_warn(void)
519void mtrr_wrmsr(unsigned msr, unsigned a, unsigned b) 519void mtrr_wrmsr(unsigned msr, unsigned a, unsigned b)
520{ 520{
521 if (wrmsr_safe(msr, a, b) < 0) { 521 if (wrmsr_safe(msr, a, b) < 0) {
522 printk(KERN_ERR 522 pr_err("MTRR: CPU %u: Writing MSR %x to %x:%x failed\n",
523 "MTRR: CPU %u: Writing MSR %x to %x:%x failed\n",
524 smp_processor_id(), msr, a, b); 523 smp_processor_id(), msr, a, b);
525 } 524 }
526} 525}
@@ -607,7 +606,7 @@ static void generic_get_mtrr(unsigned int reg, unsigned long *base,
607 tmp |= ~((1ULL<<(hi - 1)) - 1); 606 tmp |= ~((1ULL<<(hi - 1)) - 1);
608 607
609 if (tmp != mask) { 608 if (tmp != mask) {
610 printk(KERN_WARNING "mtrr: your BIOS has configured an incorrect mask, fixing it.\n"); 609 pr_warn("mtrr: your BIOS has configured an incorrect mask, fixing it.\n");
611 add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK); 610 add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
612 mask = tmp; 611 mask = tmp;
613 } 612 }
@@ -858,13 +857,13 @@ int generic_validate_add_page(unsigned long base, unsigned long size,
858 boot_cpu_data.x86_model == 1 && 857 boot_cpu_data.x86_model == 1 &&
859 boot_cpu_data.x86_mask <= 7) { 858 boot_cpu_data.x86_mask <= 7) {
860 if (base & ((1 << (22 - PAGE_SHIFT)) - 1)) { 859 if (base & ((1 << (22 - PAGE_SHIFT)) - 1)) {
861 pr_warning("mtrr: base(0x%lx000) is not 4 MiB aligned\n", base); 860 pr_warn("mtrr: base(0x%lx000) is not 4 MiB aligned\n", base);
862 return -EINVAL; 861 return -EINVAL;
863 } 862 }
864 if (!(base + size < 0x70000 || base > 0x7003F) && 863 if (!(base + size < 0x70000 || base > 0x7003F) &&
865 (type == MTRR_TYPE_WRCOMB 864 (type == MTRR_TYPE_WRCOMB
866 || type == MTRR_TYPE_WRBACK)) { 865 || type == MTRR_TYPE_WRBACK)) {
867 pr_warning("mtrr: writable mtrr between 0x70000000 and 0x7003FFFF may hang the CPU.\n"); 866 pr_warn("mtrr: writable mtrr between 0x70000000 and 0x7003FFFF may hang the CPU.\n");
868 return -EINVAL; 867 return -EINVAL;
869 } 868 }
870 } 869 }
@@ -878,7 +877,7 @@ int generic_validate_add_page(unsigned long base, unsigned long size,
878 lbase = lbase >> 1, last = last >> 1) 877 lbase = lbase >> 1, last = last >> 1)
879 ; 878 ;
880 if (lbase != last) { 879 if (lbase != last) {
881 pr_warning("mtrr: base(0x%lx000) is not aligned on a size(0x%lx000) boundary\n", base, size); 880 pr_warn("mtrr: base(0x%lx000) is not aligned on a size(0x%lx000) boundary\n", base, size);
882 return -EINVAL; 881 return -EINVAL;
883 } 882 }
884 return 0; 883 return 0;
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 5c3d149ee91c..ba80d68f683e 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -300,24 +300,24 @@ int mtrr_add_page(unsigned long base, unsigned long size,
300 return error; 300 return error;
301 301
302 if (type >= MTRR_NUM_TYPES) { 302 if (type >= MTRR_NUM_TYPES) {
303 pr_warning("mtrr: type: %u invalid\n", type); 303 pr_warn("mtrr: type: %u invalid\n", type);
304 return -EINVAL; 304 return -EINVAL;
305 } 305 }
306 306
307 /* If the type is WC, check that this processor supports it */ 307 /* If the type is WC, check that this processor supports it */
308 if ((type == MTRR_TYPE_WRCOMB) && !have_wrcomb()) { 308 if ((type == MTRR_TYPE_WRCOMB) && !have_wrcomb()) {
309 pr_warning("mtrr: your processor doesn't support write-combining\n"); 309 pr_warn("mtrr: your processor doesn't support write-combining\n");
310 return -ENOSYS; 310 return -ENOSYS;
311 } 311 }
312 312
313 if (!size) { 313 if (!size) {
314 pr_warning("mtrr: zero sized request\n"); 314 pr_warn("mtrr: zero sized request\n");
315 return -EINVAL; 315 return -EINVAL;
316 } 316 }
317 317
318 if ((base | (base + size - 1)) >> 318 if ((base | (base + size - 1)) >>
319 (boot_cpu_data.x86_phys_bits - PAGE_SHIFT)) { 319 (boot_cpu_data.x86_phys_bits - PAGE_SHIFT)) {
320 pr_warning("mtrr: base or size exceeds the MTRR width\n"); 320 pr_warn("mtrr: base or size exceeds the MTRR width\n");
321 return -EINVAL; 321 return -EINVAL;
322 } 322 }
323 323
@@ -348,7 +348,7 @@ int mtrr_add_page(unsigned long base, unsigned long size,
348 } else if (types_compatible(type, ltype)) 348 } else if (types_compatible(type, ltype))
349 continue; 349 continue;
350 } 350 }
351 pr_warning("mtrr: 0x%lx000,0x%lx000 overlaps existing" 351 pr_warn("mtrr: 0x%lx000,0x%lx000 overlaps existing"
352 " 0x%lx000,0x%lx000\n", base, size, lbase, 352 " 0x%lx000,0x%lx000\n", base, size, lbase,
353 lsize); 353 lsize);
354 goto out; 354 goto out;
@@ -357,7 +357,7 @@ int mtrr_add_page(unsigned long base, unsigned long size,
357 if (ltype != type) { 357 if (ltype != type) {
358 if (types_compatible(type, ltype)) 358 if (types_compatible(type, ltype))
359 continue; 359 continue;
360 pr_warning("mtrr: type mismatch for %lx000,%lx000 old: %s new: %s\n", 360 pr_warn("mtrr: type mismatch for %lx000,%lx000 old: %s new: %s\n",
361 base, size, mtrr_attrib_to_str(ltype), 361 base, size, mtrr_attrib_to_str(ltype),
362 mtrr_attrib_to_str(type)); 362 mtrr_attrib_to_str(type));
363 goto out; 363 goto out;
@@ -395,7 +395,7 @@ int mtrr_add_page(unsigned long base, unsigned long size,
395static int mtrr_check(unsigned long base, unsigned long size) 395static int mtrr_check(unsigned long base, unsigned long size)
396{ 396{
397 if ((base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1))) { 397 if ((base & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1))) {
398 pr_warning("mtrr: size and base must be multiples of 4 kiB\n"); 398 pr_warn("mtrr: size and base must be multiples of 4 kiB\n");
399 pr_debug("mtrr: size: 0x%lx base: 0x%lx\n", size, base); 399 pr_debug("mtrr: size: 0x%lx base: 0x%lx\n", size, base);
400 dump_stack(); 400 dump_stack();
401 return -1; 401 return -1;
@@ -493,16 +493,16 @@ int mtrr_del_page(int reg, unsigned long base, unsigned long size)
493 } 493 }
494 } 494 }
495 if (reg >= max) { 495 if (reg >= max) {
496 pr_warning("mtrr: register: %d too big\n", reg); 496 pr_warn("mtrr: register: %d too big\n", reg);
497 goto out; 497 goto out;
498 } 498 }
499 mtrr_if->get(reg, &lbase, &lsize, &ltype); 499 mtrr_if->get(reg, &lbase, &lsize, &ltype);
500 if (lsize < 1) { 500 if (lsize < 1) {
501 pr_warning("mtrr: MTRR %d not used\n", reg); 501 pr_warn("mtrr: MTRR %d not used\n", reg);
502 goto out; 502 goto out;
503 } 503 }
504 if (mtrr_usage_table[reg] < 1) { 504 if (mtrr_usage_table[reg] < 1) {
505 pr_warning("mtrr: reg: %d has count=0\n", reg); 505 pr_warn("mtrr: reg: %d has count=0\n", reg);
506 goto out; 506 goto out;
507 } 507 }
508 if (--mtrr_usage_table[reg] < 1) 508 if (--mtrr_usage_table[reg] < 1)
diff --git a/arch/x86/kernel/cpu/rdrand.c b/arch/x86/kernel/cpu/rdrand.c
index 819d94982e07..f6f50c4ceaec 100644
--- a/arch/x86/kernel/cpu/rdrand.c
+++ b/arch/x86/kernel/cpu/rdrand.c
@@ -51,7 +51,7 @@ void x86_init_rdrand(struct cpuinfo_x86 *c)
51 for (i = 0; i < SANITY_CHECK_LOOPS; i++) { 51 for (i = 0; i < SANITY_CHECK_LOOPS; i++) {
52 if (!rdrand_long(&tmp)) { 52 if (!rdrand_long(&tmp)) {
53 clear_cpu_cap(c, X86_FEATURE_RDRAND); 53 clear_cpu_cap(c, X86_FEATURE_RDRAND);
54 printk_once(KERN_WARNING "rdrand: disabled\n"); 54 pr_warn_once("rdrand: disabled\n");
55 return; 55 return;
56 } 56 }
57 } 57 }
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 4c60eaf0571c..cd531355e838 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -87,10 +87,10 @@ void detect_extended_topology(struct cpuinfo_x86 *c)
87 c->x86_max_cores = (core_level_siblings / smp_num_siblings); 87 c->x86_max_cores = (core_level_siblings / smp_num_siblings);
88 88
89 if (!printed) { 89 if (!printed) {
90 printk(KERN_INFO "CPU: Physical Processor ID: %d\n", 90 pr_info("CPU: Physical Processor ID: %d\n",
91 c->phys_proc_id); 91 c->phys_proc_id);
92 if (c->x86_max_cores > 1) 92 if (c->x86_max_cores > 1)
93 printk(KERN_INFO "CPU: Processor Core ID: %d\n", 93 pr_info("CPU: Processor Core ID: %d\n",
94 c->cpu_core_id); 94 c->cpu_core_id);
95 printed = 1; 95 printed = 1;
96 } 96 }
diff --git a/arch/x86/kernel/cpu/transmeta.c b/arch/x86/kernel/cpu/transmeta.c
index 252da7aceca6..e3b4d1841175 100644
--- a/arch/x86/kernel/cpu/transmeta.c
+++ b/arch/x86/kernel/cpu/transmeta.c
@@ -33,7 +33,7 @@ static void init_transmeta(struct cpuinfo_x86 *c)
33 if (max >= 0x80860001) { 33 if (max >= 0x80860001) {
34 cpuid(0x80860001, &dummy, &cpu_rev, &cpu_freq, &cpu_flags); 34 cpuid(0x80860001, &dummy, &cpu_rev, &cpu_freq, &cpu_flags);
35 if (cpu_rev != 0x02000000) { 35 if (cpu_rev != 0x02000000) {
36 printk(KERN_INFO "CPU: Processor revision %u.%u.%u.%u, %u MHz\n", 36 pr_info("CPU: Processor revision %u.%u.%u.%u, %u MHz\n",
37 (cpu_rev >> 24) & 0xff, 37 (cpu_rev >> 24) & 0xff,
38 (cpu_rev >> 16) & 0xff, 38 (cpu_rev >> 16) & 0xff,
39 (cpu_rev >> 8) & 0xff, 39 (cpu_rev >> 8) & 0xff,
@@ -44,10 +44,10 @@ static void init_transmeta(struct cpuinfo_x86 *c)
44 if (max >= 0x80860002) { 44 if (max >= 0x80860002) {
45 cpuid(0x80860002, &new_cpu_rev, &cms_rev1, &cms_rev2, &dummy); 45 cpuid(0x80860002, &new_cpu_rev, &cms_rev1, &cms_rev2, &dummy);
46 if (cpu_rev == 0x02000000) { 46 if (cpu_rev == 0x02000000) {
47 printk(KERN_INFO "CPU: Processor revision %08X, %u MHz\n", 47 pr_info("CPU: Processor revision %08X, %u MHz\n",
48 new_cpu_rev, cpu_freq); 48 new_cpu_rev, cpu_freq);
49 } 49 }
50 printk(KERN_INFO "CPU: Code Morphing Software revision %u.%u.%u-%u-%u\n", 50 pr_info("CPU: Code Morphing Software revision %u.%u.%u-%u-%u\n",
51 (cms_rev1 >> 24) & 0xff, 51 (cms_rev1 >> 24) & 0xff,
52 (cms_rev1 >> 16) & 0xff, 52 (cms_rev1 >> 16) & 0xff,
53 (cms_rev1 >> 8) & 0xff, 53 (cms_rev1 >> 8) & 0xff,
@@ -76,7 +76,7 @@ static void init_transmeta(struct cpuinfo_x86 *c)
76 (void *)&cpu_info[56], 76 (void *)&cpu_info[56],
77 (void *)&cpu_info[60]); 77 (void *)&cpu_info[60]);
78 cpu_info[64] = '\0'; 78 cpu_info[64] = '\0';
79 printk(KERN_INFO "CPU: %s\n", cpu_info); 79 pr_info("CPU: %s\n", cpu_info);
80 } 80 }
81 81
82 /* Unhide possibly hidden capability flags */ 82 /* Unhide possibly hidden capability flags */
diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index 628a059a9a06..364e58346897 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -62,7 +62,7 @@ static unsigned long vmware_get_tsc_khz(void)
62 tsc_hz = eax | (((uint64_t)ebx) << 32); 62 tsc_hz = eax | (((uint64_t)ebx) << 32);
63 do_div(tsc_hz, 1000); 63 do_div(tsc_hz, 1000);
64 BUG_ON(tsc_hz >> 32); 64 BUG_ON(tsc_hz >> 32);
65 printk(KERN_INFO "TSC freq read from hypervisor : %lu.%03lu MHz\n", 65 pr_info("TSC freq read from hypervisor : %lu.%03lu MHz\n",
66 (unsigned long) tsc_hz / 1000, 66 (unsigned long) tsc_hz / 1000,
67 (unsigned long) tsc_hz % 1000); 67 (unsigned long) tsc_hz % 1000);
68 68
@@ -84,8 +84,7 @@ static void __init vmware_platform_setup(void)
 	if (ebx != UINT_MAX)
 		x86_platform.calibrate_tsc = vmware_get_tsc_khz;
 	else
-		printk(KERN_WARNING
-		       "Failed to get TSC freq from the hypervisor\n");
+		pr_warn("Failed to get TSC freq from the hypervisor\n");
 }
 
 /*
diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
index 30ca7607cbbb..97340f2c437c 100644
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -408,7 +408,7 @@ static inline void __init construct_default_ISA_mptable(int mpc_default_type)
 	processor.cpuflag = CPU_ENABLED;
 	processor.cpufeature = (boot_cpu_data.x86 << 8) |
 	    (boot_cpu_data.x86_model << 4) | boot_cpu_data.x86_mask;
-	processor.featureflag = boot_cpu_data.x86_capability[0];
+	processor.featureflag = boot_cpu_data.x86_capability[CPUID_1_EDX];
 	processor.reserved[0] = 0;
 	processor.reserved[1] = 0;
 	for (i = 0; i < 2; i++) {
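
Note on the x86_capability[0] -> x86_capability[CPUID_1_EDX] conversions in this hunk and in the lguest/Xen hunks further down: capability word 0 holds the feature bits that CPUID leaf 1 returns in EDX, and the cpufeature rework names that index CPUID_1_EDX (enum cpuid_leafs in asm/cpufeature.h) rather than relying on the magic constant 0. A minimal sketch of an access using the named index (illustration only, not part of this merge; cpuid1_edx_word() is a made-up helper):

    #include <asm/cpufeature.h>	/* enum cpuid_leafs: CPUID_1_EDX, ... */
    #include <asm/processor.h>	/* struct cpuinfo_x86 */

    /* Hypothetical helper: fetch a CPU's CPUID(1).EDX feature word. */
    static inline u32 cpuid1_edx_word(struct cpuinfo_x86 *c)
    {
            return c->x86_capability[CPUID_1_EDX];	/* same slot as the old [0] */
    }

The conversions themselves are mechanical: CPUID_1_EDX is word 0, so the stored value does not change, only the intent becomes explicit.
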
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 8a2cdd736fa4..04b132a767f1 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -30,6 +30,7 @@
 #include <asm/nmi.h>
 #include <asm/x86_init.h>
 #include <asm/reboot.h>
+#include <asm/cache.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/nmi.h>
@@ -69,7 +70,7 @@ struct nmi_stats {
 
 static DEFINE_PER_CPU(struct nmi_stats, nmi_stats);
 
-static int ignore_nmis;
+static int ignore_nmis __read_mostly;
 
 int unknown_nmi_panic;
 /*
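
The <asm/cache.h> include added above is what provides __read_mostly on x86; annotating ignore_nmis with it keeps this rarely-written flag out of cache lines shared with write-hot data on the NMI path. Roughly, the annotation is a section attribute along these lines (sketch only; see arch/x86/include/asm/cache.h for the authoritative definition):

    /* Rough sketch: variables so marked are grouped into a separate
     * .data..read_mostly section at link time, away from frequently
     * written data. */
    #define __read_mostly __attribute__((__section__(".data..read_mostly")))
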
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 24d57f77b3c1..3bf1e0b5f827 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -97,6 +97,14 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
 DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 EXPORT_PER_CPU_SYMBOL(cpu_info);
 
+/* Logical package management. We might want to allocate that dynamically */
+static int *physical_to_logical_pkg __read_mostly;
+static unsigned long *physical_package_map __read_mostly;;
+static unsigned long *logical_package_map __read_mostly;
+static unsigned int max_physical_pkg_id __read_mostly;
+unsigned int __max_logical_packages __read_mostly;
+EXPORT_SYMBOL(__max_logical_packages);
+
 static inline void smpboot_setup_warm_reset_vector(unsigned long start_eip)
 {
 	unsigned long flags;
@@ -251,6 +259,97 @@ static void notrace start_secondary(void *unused)
 	cpu_startup_entry(CPUHP_ONLINE);
 }
 
+int topology_update_package_map(unsigned int apicid, unsigned int cpu)
+{
+	unsigned int new, pkg = apicid >> boot_cpu_data.x86_coreid_bits;
+
+	/* Called from early boot ? */
+	if (!physical_package_map)
+		return 0;
+
+	if (pkg >= max_physical_pkg_id)
+		return -EINVAL;
+
+	/* Set the logical package id */
+	if (test_and_set_bit(pkg, physical_package_map))
+		goto found;
+
+	if (pkg < __max_logical_packages) {
+		set_bit(pkg, logical_package_map);
+		physical_to_logical_pkg[pkg] = pkg;
+		goto found;
+	}
+	new = find_first_zero_bit(logical_package_map, __max_logical_packages);
+	if (new >= __max_logical_packages) {
+		physical_to_logical_pkg[pkg] = -1;
+		pr_warn("APIC(%x) Package %u exceeds logical package map\n",
+			apicid, pkg);
+		return -ENOSPC;
+	}
+	set_bit(new, logical_package_map);
+	pr_info("APIC(%x) Converting physical %u to logical package %u\n",
+		apicid, pkg, new);
+	physical_to_logical_pkg[pkg] = new;
+
+found:
+	cpu_data(cpu).logical_proc_id = physical_to_logical_pkg[pkg];
+	return 0;
+}
+
+/**
+ * topology_phys_to_logical_pkg - Map a physical package id to a logical
+ *
+ * Returns logical package id or -1 if not found
+ */
+int topology_phys_to_logical_pkg(unsigned int phys_pkg)
+{
+	if (phys_pkg >= max_physical_pkg_id)
+		return -1;
+	return physical_to_logical_pkg[phys_pkg];
+}
+EXPORT_SYMBOL(topology_phys_to_logical_pkg);
+
+static void __init smp_init_package_map(void)
+{
+	unsigned int ncpus, cpu;
+	size_t size;
+
+	/*
+	 * Today neither Intel nor AMD support heterogenous systems. That
+	 * might change in the future....
+	 */
+	ncpus = boot_cpu_data.x86_max_cores * smp_num_siblings;
+	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
+
+	/*
+	 * Possibly larger than what we need as the number of apic ids per
+	 * package can be smaller than the actual used apic ids.
+	 */
+	max_physical_pkg_id = DIV_ROUND_UP(MAX_LOCAL_APIC, ncpus);
+	size = max_physical_pkg_id * sizeof(unsigned int);
+	physical_to_logical_pkg = kmalloc(size, GFP_KERNEL);
+	memset(physical_to_logical_pkg, 0xff, size);
+	size = BITS_TO_LONGS(max_physical_pkg_id) * sizeof(unsigned long);
+	physical_package_map = kzalloc(size, GFP_KERNEL);
+	size = BITS_TO_LONGS(__max_logical_packages) * sizeof(unsigned long);
+	logical_package_map = kzalloc(size, GFP_KERNEL);
+
+	pr_info("Max logical packages: %u\n", __max_logical_packages);
+
+	for_each_present_cpu(cpu) {
+		unsigned int apicid = apic->cpu_present_to_apicid(cpu);
+
+		if (apicid == BAD_APICID || !apic->apic_id_valid(apicid))
+			continue;
+		if (!topology_update_package_map(apicid, cpu))
+			continue;
+		pr_warn("CPU %u APICId %x disabled\n", cpu, apicid);
+		per_cpu(x86_bios_cpu_apicid, cpu) = BAD_APICID;
+		set_cpu_possible(cpu, false);
+		set_cpu_present(cpu, false);
+	}
+}
+
 void __init smp_store_boot_cpu_info(void)
 {
 	int id = 0; /* CPU 0 */
@@ -258,6 +357,7 @@ void __init smp_store_boot_cpu_info(void)
 
 	*c = boot_cpu_data;
 	c->cpu_index = id;
+	smp_init_package_map();
 }
 
 /*
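
The logical package bookkeeping added above exists because firmware can hand out sparse, non-contiguous physical package ids, while per-package users (uncore, RAPL and similar PMU code) want a dense index in the range 0..__max_logical_packages-1. topology_update_package_map() assigns that dense id as CPUs are enumerated, and topology_phys_to_logical_pkg() translates a physical id on demand; the hunk below wires the setup into smp_store_boot_cpu_info(), so the map is ready before secondary CPUs come up. A hypothetical consumer could look like the following sketch (not from this merge; struct pkg_state, pkg_state_init() and pkg_state_for_cpu() are made-up names, and the extern declarations are assumed to come from <asm/topology.h> after this change):

    #include <linux/slab.h>
    #include <linux/topology.h>

    struct pkg_state {
            unsigned long events;	/* hypothetical per-package data */
    };

    static struct pkg_state **pkg_state;

    static int __init pkg_state_init(void)
    {
            /* One slot per logical package, sized from the new global. */
            pkg_state = kcalloc(__max_logical_packages, sizeof(*pkg_state),
                                GFP_KERNEL);
            return pkg_state ? 0 : -ENOMEM;
    }

    static struct pkg_state *pkg_state_for_cpu(int cpu)
    {
            /* Translate the sparse physical id into the dense logical one. */
            int pkg = topology_phys_to_logical_pkg(topology_physical_package_id(cpu));

            return pkg < 0 ? NULL : pkg_state[pkg];
    }

The point of the indirection is sizing: an array indexed by physical package id would have to cover MAX_LOCAL_APIC-derived ids, whereas the logical space is bounded by the number of packages that can actually be present.
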
diff --git a/arch/x86/lguest/boot.c b/arch/x86/lguest/boot.c
index f56cc418c87d..fd57d3ae7e16 100644
--- a/arch/x86/lguest/boot.c
+++ b/arch/x86/lguest/boot.c
@@ -1529,7 +1529,7 @@ __init void lguest_init(void)
 	 */
 	cpu_detect(&new_cpu_data);
 	/* head.S usually sets up the first capability word, so do it here. */
-	new_cpu_data.x86_capability[0] = cpuid_edx(1);
+	new_cpu_data.x86_capability[CPUID_1_EDX] = cpuid_edx(1);
 
 	/* Math is always hard! */
 	set_cpu_cap(&new_cpu_data, X86_FEATURE_FPU);
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index d09e4c9d7cc5..2c261082eadf 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1654,7 +1654,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
 	cpu_detect(&new_cpu_data);
 	set_cpu_cap(&new_cpu_data, X86_FEATURE_FPU);
 	new_cpu_data.wp_works_ok = 1;
-	new_cpu_data.x86_capability[0] = cpuid_edx(1);
+	new_cpu_data.x86_capability[CPUID_1_EDX] = cpuid_edx(1);
 #endif
 
 	if (xen_start_info->mod_start) {
diff --git a/arch/x86/xen/pmu.c b/arch/x86/xen/pmu.c
index 724a08740a04..9466354d3e49 100644
--- a/arch/x86/xen/pmu.c
+++ b/arch/x86/xen/pmu.c
@@ -11,7 +11,7 @@
 #include "pmu.h"
 
 /* x86_pmu.handle_irq definition */
-#include "../kernel/cpu/perf_event.h"
+#include "../events/perf_event.h"
 
 #define XENPMU_IRQ_PROCESSING 1
 struct xenpmu {