aboutsummaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAge
* Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2009-09-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (64 commits) sched: Fix sched::sched_stat_wait tracepoint field sched: Disable NEW_FAIR_SLEEPERS for now sched: Keep kthreads at default priority sched: Re-tune the scheduler latency defaults to decrease worst-case latencies sched: Turn off child_runs_first sched: Ensure that a child can't gain time over it's parent after fork() sched: enable SD_WAKE_IDLE sched: Deal with low-load in wake_affine() sched: Remove short cut from select_task_rq_fair() sched: Turn on SD_BALANCE_NEWIDLE sched: Clean up topology.h sched: Fix dynamic power-balancing crash sched: Remove reciprocal for cpu_power sched: Try to deal with low capacity, fix update_sd_power_savings_stats() sched: Try to deal with low capacity sched: Scale down cpu_power due to RT tasks sched: Implement dynamic cpu_power sched: Add smt_gain sched: Update the cpu_power sum during load-balance sched: Add SD_PREFER_SIBLING ...
| * sched: enable SD_WAKE_IDLEPeter Zijlstra2009-09-07
| | | | | | | | | | | | | | | | | | Now that SD_WAKE_IDLE doesn't make pipe-test suck anymore, enable it by default for MC, CPU and NUMA domains. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * sched: Turn on SD_BALANCE_NEWIDLEIngo Molnar2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Start the re-tuning of the balancer by turning on newidle. It improves hackbench performance and parallelism on a 4x4 box. The "perf stat --repeat 10" measurements give us: domain0 domain1 ....................................... -SD_BALANCE_NEWIDLE -SD_BALANCE_NEWIDLE: 2041.273208 task-clock-msecs # 9.354 CPUs ( +- 0.363% ) +SD_BALANCE_NEWIDLE -SD_BALANCE_NEWIDLE: 2086.326925 task-clock-msecs # 11.934 CPUs ( +- 0.301% ) +SD_BALANCE_NEWIDLE +SD_BALANCE_NEWIDLE: 2115.289791 task-clock-msecs # 12.158 CPUs ( +- 0.263% ) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * sched: Clean up topology.hIngo Molnar2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-organize the flag settings so that it's visible at a glance which sched-domains flags are set and which not. With the new balancer code we'll need to re-tune these details anyway, so make it cleaner to make fewer mistakes down the road ;-) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'perfcounters-core-for-linus' of ↵Linus Torvalds2009-09-11
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits) perf tools: Avoid unnecessary work in directory lookups perf stat: Clean up statistics calculations a bit more perf stat: More advanced variance computation perf stat: Use stddev_mean in stead of stddev perf stat: Remove the limit on repeat perf stat: Change noise calculation to use stddev x86, perf_counter, bts: Do not allow kernel BTS tracing for now x86, perf_counter, bts: Correct pointer-to-u64 casts x86, perf_counter, bts: Fail if BTS is not available perf_counter: Fix output-sharing error path perf trace: Fix read_string() perf trace: Print out in nanoseconds perf tools: Seek to the end of the header area perf trace: Fix parsing of perf.data perf trace: Sample timestamps as well perf_counter: Introduce new (non-)paranoia level to allow raw tracepoint access perf trace: Sample the CPU too perf tools: Work around strict aliasing related warnings perf tools: Clean up warnings list in the Makefile perf tools: Complete support for dynamic strings ...
| * | x86, perf_counter, bts: Do not allow kernel BTS tracing for nowmarkus.t.metzger@intel.com2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kernel BTS tracing generates too much data too fast for us to handle, causing the kernel to hang. Fail for BTS requests for kernel code. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Acked-by: Peter Zijlstra <a.p.zjilstra@chello.nl> LKML-Reference: <20090902140616.901253000@intel.com> [ This is really a workaround - but we want BTS tracing in .32 so make sure we dont regress. The lockup should be fixed ASAP. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86, perf_counter, bts: Correct pointer-to-u64 castsmarkus.t.metzger@intel.com2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32bit, pointers in the DS AREA configuration are cast to u64. The current (long) cast to avoid compiler warnings results in a signed 64bit address. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090902140615.305889000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | x86, perf_counter, bts: Fail if BTS is not availablemarkus.t.metzger@intel.com2009-09-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reserve PERF_COUNT_HW_BRANCH_INSTRUCTIONS with sample_period == 1 for BTS tracing and fail, if BTS is not available. Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090902140612.943801000@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | Merge branch 'perfcounters/urgent' into perfcounters/coreIngo Molnar2009-09-02
| |\| | | | | | | | | | | | | | | | | | | Merge reason: We are going to modify a place modified by perfcounters/urgent. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | Merge branch 'master' of ↵Ingo Molnar2009-08-18
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters into perfcounters/core
| | * | perf_counter: powerpc: Add callchain supportPaul Mackerras2009-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for tracing callchains for powerpc, both 32-bit and 64-bit, and both in the kernel and userspace, from PMU interrupt context. The first three entries stored for each callchain are the NIP (next instruction pointer), LR (link register), and the contents of the LR save area in the second stack frame (the first is ignored because the ABI convention on powerpc is that functions save their return address in their caller's stack frame). Because leaf functions don't have to save their return address (LR value) and don't have to establish a stack frame, it's possible for either or both of LR and the second stack frame's LR save area to have valid return addresses in them. This is basically impossible to disambiguate without either reading the code or looking at auxiliary information such as CFI tables. Since we don't want to do either of those things at interrupt time, we store both LR and the second stack frame's LR save area. Once we get past the second stack frame, there is no ambiguity; all return addresses we get are reliable. For kernel traces, we check whether they are valid kernel instruction addresses and store zero instead if they are not (rather than omitting them, which would make it impossible for userspace to know which was which). We also store zero instead of the second stack frame's LR save area value if it is the same as LR. For kernel traces, we check for interrupt frames, and for user traces, we check for signal frames. In each case, since we're starting a new trace, we store a PERF_CONTEXT_KERNEL/USER marker so that userspace knows that the next three entries are NIP, LR and the second stack frame for the interrupted context. We read user memory with __get_user_inatomic. On 64-bit, if this PMU interrupt occurred while interrupts are soft-disabled, and there is no MMU hash table entry for the page, we will get an -EFAULT return from __get_user_inatomic even if there is a valid Linux PTE for the page, since hash_page isn't reentrant. Thus we have code here to read the Linux PTE and access the page via the kernel linear mapping. Since 64-bit doesn't use (or need) highmem there is no need to do kmap_atomic. On 32-bit, we don't do soft interrupt disabling, so this complication doesn't occur and there is no need to fall back to reading the Linux PTE, since hash_page (or the TLB miss handler) will get called automatically if necessary. Note that we cannot get PMU interrupts in the interval during context switch between switch_mm (which switches the user address space) and switch_to (which actually changes current to the new process). On 64-bit this is because interrupts are hard-disabled in switch_mm and stay hard-disabled until they are soft-enabled later, after switch_to has returned. So there is no possibility of trying to do a user stack trace when the user address space is not current's address space. Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| | * | powerpc: Allow perf_counters to access user memory at interrupt timePaul Mackerras2009-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides a mechanism to allow the perf_counters code to access user memory in a PMU interrupt routine. Such an access can cause various kinds of interrupt: SLB miss, MMU hash table miss, segment table miss, or TLB miss, depending on the processor. This commit only deals with 64-bit classic/server processors, which use an MMU hash table. 32-bit processors are already able to access user memory at interrupt time. Since we don't soft-disable on 32-bit, we avoid the possibility of reentering hash_page or the TLB miss handlers, since they run with interrupts disabled. On 64-bit processors, an SLB miss interrupt on a user address will update the slb_cache and slb_cache_ptr fields in the paca. This is OK except in the case where a PMU interrupt occurs in switch_slb, which also accesses those fields. To prevent this, we hard-disable interrupts in switch_slb. Interrupts are already soft-disabled at this point, and will get hard-enabled when they get soft-enabled later. This also reworks slb_flush_and_rebolt: to avoid hard-disabling twice, and to make sure that it clears the slb_cache_ptr when called from other callers than switch_slb, the existing routine is renamed to __slb_flush_and_rebolt, which is called by switch_slb and the new version of slb_flush_and_rebolt. Similarly, switch_stab (used on POWER3 and RS64 processors) gets a hard_irq_disable() to protect the per-cpu variables used there and in ste_allocate. If a MMU hashtable miss interrupt occurs, normally we would call hash_page to look up the Linux PTE for the address and create a HPTE. However, hash_page is fairly complex and takes some locks, so to avoid the possibility of deadlock, we check the preemption count to see if we are in a (pseudo-)NMI handler, and if so, we don't call hash_page but instead treat it like a bad access that will get reported up through the exception table mechanism. An interrupt whose handler runs even though the interrupt occurred when soft-disabled (such as the PMU interrupt) is considered a pseudo-NMI handler, which should use nmi_enter()/nmi_exit() rather than irq_enter()/irq_exit(). Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| | * | powerpc/32: Always order writes to halves of 64-bit PTEsPaul Mackerras2009-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two 32-bit halves. On SMP we write the higher-order half and then the lower-order half, with a write barrier between the two halves, but on UP there was no particular ordering of the writes to the two halves. This extends the ordering that we already do on SMP to the UP case as well. The reason is that with the perf_counter subsystem potentially accessing user memory at interrupt time to get stack traces, we have to be careful not to create an incorrect but apparently valid PTE even on UP. Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | Merge branch 'perfcounters/urgent' into perfcounters/coreIngo Molnar2009-08-15
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: kernel/perf_counter.c Merge reason: update to latest upstream (-rc6) and resolve the conflict with urgent fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, perf_counter, bts: Add BTS support to perfcountersMarkus Metzger2009-08-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement a performance counter with: attr.type = PERF_TYPE_HARDWARE attr.config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS attr.sample_period = 1 Using branch trace store (BTS) on x86 hardware, if available. The from and to address for each branch can be sampled using: PERF_SAMPLE_IP for the from address PERF_SAMPLE_ADDR for the to address [ v2: address review feedback, fix bugs ] Signed-off-by: Markus Metzger <markus.t.metzger@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'oprofile-for-linus' of ↵Linus Torvalds2009-09-11
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (55 commits) arch/x86/oprofile/op_model_amd.c: fix op_amd_handle_ibs() return type Revert "x86: oprofile/op_model_amd.c set return values for op_amd_handle_ibs()" x86/oprofile: Small coding style fixes x86/oprofile: Add counter reservation check for virtual counters x86/oprofile: Implement op_x86_virt_to_phys() oprofile: Adding switch counter to oprofile statistic variables x86/oprofile: Implement mux_clone() x86/oprofile: Enable multiplexing only if the model supports it x86/oprofile: Add function has_mux() to check multiplexing support x86/oprofile: Modify initialization of num_virt_counters x86/oprofile: Remove unused num_virt_controls from struct op_x86_model_spec x86/oprofile: Remove const qualifier from struct op_x86_model_spec x86/oprofile: Moving nmi_cpu_switch() in nmi_int.c x86/oprofile: Moving nmi_cpu_save/restore_mpx_registers() in nmi_int.c x86/oprofile: Moving nmi_setup_cpu_mux() in nmi_int.c x86/oprofile: Implement multiplexing setup/shutdown functions oprofile: Grouping multiplexing code in op_model_amd.c oprofile: Introduce op_x86_phys_to_virt() oprofile: Grouping multiplexing code in oprof.c oprofile: Remove oprofile_multiplexing_init() ...
| * | | | arch/x86/oprofile/op_model_amd.c: fix op_amd_handle_ibs() return typeAndrew Morton2009-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arch/x86/oprofile/op_model_amd.c: In function 'op_amd_handle_ibs': arch/x86/oprofile/op_model_amd.c:217: warning: no return statement in function returning non-void Fix this by making op_amd_handle_ibs() return void. Cc: Robert Richter <robert.richter@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | Revert "x86: oprofile/op_model_amd.c set return values for op_amd_handle_ibs()"Robert Richter2009-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 21e70878215f620fe99ea7d7c74bc641aeec932f. Instead Andrew's patch will be applied he posted at the same time. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Small coding style fixesRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some small coding style fixes. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Add counter reservation check for virtual countersRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a check for the availability of a counter. A virtual counter is used only if its physical counter is not reserved. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Implement op_x86_virt_to_phys()Robert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements a common x86 function to convert virtual counter numbers to physical. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | oprofile: Adding switch counter to oprofile statistic variablesRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the multiplexing switch counter from x86 code to common oprofile statistic variables. Now the value will be available and usable for all architectures. The initialization and incrementation also moved to common code. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Implement mux_clone()Robert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To setup a counter for all cpus, its structure is cloned from cpu 0. This patch implements mux_clone() to do this part for multiplexing data. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Enable multiplexing only if the model supports itRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch checks if the model supports multiplexing. Only then multiplexing will be enabled. The code is added to the common x86 initialization. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Add function has_mux() to check multiplexing supportRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The check is used to prevent running multiplexing code for models not supporting multiplexing. Before, the code was running but without effect. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Modify initialization of num_virt_countersRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Models that do not yet support counter multiplexing have to setup num_virt_counters. This patch implements the setup from num_counters if num_virt_counters is not set. Thus, num_virt_counters must be setup only for multiplexing support. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Remove unused num_virt_controls from struct op_x86_model_specRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The member num_virt_controls of struct op_x86_model_spec is not used. This patch removes it. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Remove const qualifier from struct op_x86_model_specRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the const qualifier from struct op_x86_model_spec to make model parameters changable. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Moving nmi_cpu_switch() in nmi_int.cRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves some code in nmi_int.c to get a single separate multiplexing code section. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Moving nmi_cpu_save/restore_mpx_registers() in nmi_int.cRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves some code in nmi_int.c to get a single separate multiplexing code section. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Moving nmi_setup_cpu_mux() in nmi_int.cRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves some code in nmi_int.c to get a single separate multiplexing code section. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Implement multiplexing setup/shutdown functionsRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements nmi_setup_mux() and nmi_shutdown_mux() functions to setup/shutdown multiplexing. Multiplexing code in nmi_int.c is now much more separated. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | oprofile: Grouping multiplexing code in op_model_amd.cRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves some multiplexing code to the new function op_mux_fill_in_addresses(). Also, the whole multiplexing code is now at a single location. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | oprofile: Introduce op_x86_phys_to_virt()Robert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new function translates physical to virtual counter numbers. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Fix initialization of switch_indexRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Variable switch_index must be initialized for each cpu. This patch fixes the initialization by moving it to the per-cpu init function nmi_cpu_setup(). Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Use per_cpu() instead of __get_cpu_var()Robert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __get_cpu_var() calls smp_processor_id(). When the cpu id is already known, instead use per_cpu() to avoid generating the id again. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Fix usage of NUM_CONTROLS/NUM_COUNTERS macrosRobert Richter2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the corresponding macros when iterating over counter and control registers. Since NUM_CONTROLS and NUM_COUNTERS are equal for AMD cpus the fix is more a cosmetical change. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | oprofile: Implement performance counter multiplexingJason Yeh2009-07-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The number of hardware counters is limited. The multiplexing feature enables OProfile to gather more events than counters are provided by the hardware. This is realized by switching between events at an user specified time interval. A new file (/dev/oprofile/time_slice) is added for the user to specify the timer interval in ms. If the number of events to profile is higher than the number of hardware counters available, the patch will schedule a work queue that switches the event counter and re-writes the different sets of values into it. The switching mechanism needs to be implemented for each architecture to support multiplexing. This patch only implements AMD CPU support, but multiplexing can be easily extended for other models and architectures. There are follow-on patches that rework parts of this patch. Signed-off-by: Jason Yeh <jason.yeh@amd.com> Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Whitespaces changes onlyRobert Richter2009-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes whitespace changes of code that will be touched in follow-on patches. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Rework and simplify nmi_cpu_setup()Robert Richter2009-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the function nmi_save_registers(). Per-cpu code is now executed only in the function nmi_cpu_setup(). Also, it renames the per-cpu function nmi_restore_registers() to nmi_cpu_restore_registers(). Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | x86/oprofile: Fix cast of counter valueRobert Richter2009-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When casting the counter value to a 64 bit value in 32 bit mode, sign extension may lead to broken counter values. This patch fixes this by casting to (u64) instead of (s64). Signed-off-by: Robert Richter <robert.richter@amd.com>
| | | | |
| | \ \ \
| *-. \ \ \ Merge commit 'v2.6.31-rc3'; commit 'tip/oprofile' into oprofile/coreRobert Richter2009-07-14
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/oprofile/oprofile_stats.c drivers/usb/otg/Kconfig drivers/usb/otg/Makefile Signed-off-by: Robert Richter <robert.richter@amd.com>
| | | * \ \ \ Merge branch 'auto' of ↵Ingo Molnar2009-06-17
| | | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into oprofile
| * | | | | | | x86: oprofile/op_model_amd.c set return values for op_amd_handle_ibs()Jaswinder Singh Rajput2009-06-18
| | |_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | op_amd_handle_ibs() should return 0 when IBS is not present or not defined. Fix compilation warning: CC [M] arch/x86/oprofile/op_model_amd.o arch/x86/oprofile/op_model_amd.c: In function ‘op_amd_handle_ibs’: arch/x86/oprofile/op_model_amd.c:217: warning: no return statement in function returning non-void Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | | | x86/oprofile: fix initialization of arch_perfmon for core_i7Robert Richter2009-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit: e419294 x86/oprofile: moving arch_perfmon counter setup to op_x86_model_spec.init introduced a bug in the initialization of core_i7 leading to the incorrect model setup to &op_ppro_spec. This patch fixes this. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | | | Merge commit 'tip/perfcounters-for-linus' into oprofile/masterRobert Richter2009-06-12
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/oprofile/op_model_ppro.c Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | | | | x86/oprofile: introduce oprofile_add_data64()Robert Richter2009-06-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IBS implemention writes 64 bit register values to the cpu buffer by writing two 32 values using oprofile_add_data(). This patch introduces oprofile_add_data64() to write a single 64 bit value to the buffer. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | | | | x86/oprofile: use 64 bit values in IBS functionsRobert Richter2009-06-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IBS code internally uses 32 bit values (a low and a high value) to represent a 64 bit value. This patch changes this and now 64 bit values are used instead. 64 bit MSR functions can be used now. No functional changes. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | | | | x86/oprofile: remove some local variables in MSR save/restore functionsRobert Richter2009-06-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch removes some local variables in these functions. Signed-off-by: Robert Richter <robert.richter@amd.com>
| * | | | | | | x86/oprofile: use 64 bit values to save MSR statesRobert Richter2009-06-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes struct op_saved_msr and replaces it by an u64 variable. This makes code easier and it is possible to use 64 bit MSR functions. Signed-off-by: Robert Richter <robert.richter@amd.com>