| Commit message | Author | Age |
|
|
|
|
|
| |
Impact: add defines to make iommu stats collection configurable
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: adds new Kconfig entry
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: cleanup
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: cleanup
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: cleanup
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: use bool instead of int for iommu->need_sync
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: use generic dev_name instead of own function
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: hotplugged devices also benefit from device isolation
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: adds a new protection domain flag
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
| |
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a generic function to look up addresses in protection domains
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a generic function to unmap pages from protection domains
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a generic function to map pages into protection domains
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a generic function to attach devices to protection domains
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a generic function to detach devices from protection domains
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a generic function for releasing protection domains
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a generic function for allocating protection domains
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a function to remove all devices from a domain
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: inform IOMMU about state change of a device in the driver core
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add helper functions to detach a device from a domain
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: rename set_device_domain() to attach_device()
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: know how many devices are assigned to a domain
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: detect when a driver uses a device assigned otherwise
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
Impact: add a new struct member to 'struct protection_domain'
When using protection domains for dma_ops and KVM it's better to know for
which subsystem a domain was allocated. Add a flags member to struct
protection_domain for that purpose.
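A minimal sketch of the idea, not the driver's actual definition: the struct layout, the PD_FLAG_* names and their bit values below are assumptions made up for illustration. The point is only that a flags word lets code ask which subsystem owns a domain.

```c
#include <stdbool.h>

/* Hypothetical flag bits; names and values are illustrative only. */
#define PD_FLAG_DMA_OPS (1UL << 0)   /* domain allocated for dma_ops */
#define PD_FLAG_KVM     (1UL << 1)   /* domain allocated for KVM */

/* Simplified stand-in for 'struct protection_domain'. */
struct protection_domain_sketch {
	unsigned short id;     /* domain id */
	unsigned long flags;   /* which subsystem allocated the domain */
};

/* True if the domain was allocated for the dma_ops path. */
static bool domain_is_dma_ops(const struct protection_domain_sketch *dom)
{
	return (dom->flags & PD_FLAG_DMA_OPS) != 0;
}
```

A caller that must only touch dma_ops domains can then check domain_is_dma_ops() before acting on the domain.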
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add a function to flush a domain id on every IOMMU
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
Impact: save unneeded logic to add and remove domains to the list
The removal of a protection domain from the iommu_pd_list is not
necessary. Another benefit is that we save complexity because we don't
have to re-add it later when the device no longer uses the domain.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: refactoring of iommu_queue_inv_iommu_pages
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
|
|
|
| |
Impact: split one function into three
The separate functions are required to synchronize commands across all
hardware IOMMUs in the system.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
| |
Impact: add code to release a domain id
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
Impact: change code to free pagetables from protection domains
The dma_ops_free_pagetable function can only free pagetables from
dma_ops domains. Change that to free pagetables of pure protection
domains.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
|
|
|
| |
Impact: function rename
The iommu_map function maps only one page. Make this clear in the
function name.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
GCC 3.0 and 3.1 are too old to build a working kernel.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ This check got dropped as obsolete when I simplified the gcc header
inclusion mess in f153b82121b0366fe0e5f9553545cce237335175, but Willy
Tarreau reports actually having those old versions still.. -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits)
x86: export vector_used_by_percpu_irq
x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
sched: nominate preferred wakeup cpu, fix
x86: fix lguest used_vectors breakage, -v2
x86: fix warning in arch/x86/kernel/io_apic.c
sched: fix warning in kernel/sched.c
sched: move test_sd_parent() to an SMP section of sched.h
sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
sched: activate active load balancing in new idle cpus
sched: bias task wakeups to preferred semi-idle packages
sched: nominate preferred wakeup cpu
sched: favour lower logical cpu number for sched_mc balance
sched: framework for sched_mc/smt_power_savings=N
sched: convert BALANCE_FOR_xx_POWER to inline functions
x86: use possible_cpus=NUM to extend the possible cpus allowed
x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
x86: update io_apic.c to the new cpumask code
x86: Introduce topology_core_cpumask()/topology_thread_cpumask()
x86: xen: use smp_call_function_many()
x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
...
Fixed up trivial conflict in kernel/time/tick-sched.c manually
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: build fix
lguest can be built as a module and makes use of this new symbol:
ERROR: "vector_used_by_percpu_irq" [drivers/lguest/lg.ko] undefined!
export it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
These commits:
commit 95d313cf1c1ecedc8bec5727b09bdacbf67dfc45
Author: Mike Travis <travis@sgi.com>
Date: Tue Dec 16 17:33:54 2008 -0800
x86: Add cpu_mask_to_apicid_and
and
commit 6eeb7c5a99434596c5953a95baa17d2f085664e3
Author: Mike Travis <travis@sgi.com>
Date: Tue Dec 16 17:33:55 2008 -0800
x86: update add-cpu_mask_to_apicid_and to use struct cpumask*
broke interrupt delivery on x2apic platforms. As x2apic cluster mode uses
logical delivery mode, we need to use logical apicid instead of physical apicid
in x2apic_cpu_mask_to_apicid_and()
Impact: fixes the broken interrupt delivery issue on generic x2apic platforms.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Mike Travis <travis@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Andrew Morton reported:
> kernel/sched.c: In function 'schedule':
> kernel/sched.c:3679: warning: 'active_balance' may be used uninitialized in this function
>
> This warning is correct - the code is buggy.
In sched.c load_balance_newidle, there is a real potential use of an
uninitialised variable - fix it.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: fix lguest, clean up
32-bit lguest used used_vectors to record vectors, but that model of
allocating vectors changed and got broken, after we changed vector
allocation to a per_cpu array.
Try to enable that for 64-bit as well; the array is used for all vectors
that are not managed by the vector_irq per_cpu array.
Also kill system_vectors[], that is now a duplication of the
used_vectors bitmap.
[ merged in cpus4096 due to io_apic.c cpumask changes. ]
[ -v2, fix build failure ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
this warning:
arch/x86/kernel/io_apic.c: In function ‘ir_set_msi_irq_affinity’:
arch/x86/kernel/io_apic.c:3373: warning: ‘cfg’ may be used uninitialized in this function
triggers because the variable was truly uninitialized. We'd crash on
entering this code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: fix cpumask conversion bug
this warning:
kernel/sched.c: In function ‘find_busiest_group’:
kernel/sched.c:3429: warning: passing argument 1 of ‘__first_cpu’ from incompatible pointer type
shows that we forgot to convert a new patch to the new cpumask APIs.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| | |
Impact: build fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: change task balancing to save power more aggressively
Add SD_BALANCE_NEWIDLE flag at MC level and CPU level
if sched_mc is set. This helps power savings and
will not affect performance when sched_mc=0
Ingo and Mike Galbraith have optimised the SD flags by
removing SD_BALANCE_NEWIDLE at MC and CPU level. This
helps performance but hurts power savings since this
slows down task consolidation by reducing the number
of times load_balance is run.
sched: fine-tune SD_MC_INIT
commit 14800984706bf6936bbec5187f736e928be5c218
Author: Mike Galbraith <efault@gmx.de>
Date: Fri Nov 7 15:26:50 2008 +0100
sched: re-tune balancing -- revert
commit 9fcd18c9e63e325dbd2b4c726623f760788d5aa8
Author: Ingo Molnar <mingo@elte.hu>
Date: Wed Nov 5 16:52:08 2008 +0100
This patch selectively enables SD_BALANCE_NEWIDLE flag
only when sched_mc is set to 1 or 2. This helps power savings
by task consolidation and also does not hurt performance at
sched_mc=0 where all power saving optimisations are turned off.
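The selective enabling can be sketched roughly as below. This is an illustration only: the helper name is hypothetical, and the flag value is a placeholder rather than a claim about the real definition.

```c
/* Placeholder value for illustration; not taken from the kernel headers. */
#define SD_BALANCE_NEWIDLE 0x2

/*
 * Hypothetical helper: return the newidle flag only when the sched_mc
 * tunable asks for power savings (1 or 2), and nothing at sched_mc=0.
 */
static int newidle_flag_for(int sched_mc_level)
{
	return sched_mc_level > 0 ? SD_BALANCE_NEWIDLE : 0;
}
```

ORing the result into the MC and CPU level domain flags leaves the sched_mc=0 configuration unchanged.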
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: tweak task balancing to save power more aggressively
Active load balancing is a process by which the migration thread
is woken up on the target CPU in order to pull the currently
running task on another package into this newly idle
package.
This method is already in use with normal load_balance();
this patch introduces this method to new idle cpus when
sched_mc is set to POWERSAVINGS_BALANCE_WAKEUP.
This logic provides effective consolidation of short running
daemon jobs in an almost idle system.
The side effect of this patch may be ping-ponging of tasks
if the system is moderately utilised. May need to adjust the
iterations before triggering.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: tweak task wakeup to save power more aggressively
Preferred wakeup cpu (from a semi idle package) has been
nominated in find_busiest_group() in the previous patch. Use
this information in sched_mc_preferred_wakeup_cpu in function
wake_idle() to bias task wakeups if the following conditions
are satisfied:
- The present cpu that is trying to wakeup the process is
idle and waking the target process on this cpu will
potentially wakeup a completely idle package
- The previous cpu on which the target process ran is
also idle and hence selecting the previous cpu may
wakeup a semi idle cpu package
- The task being woken up is allowed to run in the
nominated cpu (cpu affinity and restrictions)
Basically if both the current cpu and the previous cpu on
which the task ran are idle, select the nominated cpu from the semi
idle cpu package for running the new task that is waking up.
Cache hotness is considered since the actual biasing happens
in wake_idle() only if the application is cache cold.
This technique will effectively move short running bursty jobs in
a mostly idle system.
Wakeup biasing for power savings gets automatically disabled if
system utilisation increases due to the fact that the probability
of finding both this_cpu and prev_cpu idle decreases.
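The conditions above can be sketched as a single predicate. This is a simplified model, not the wake_idle() code: the struct, its field names and the fallback to the previous cpu are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state consulted at wakeup time. */
struct wakeup_state {
	int  prev_cpu;              /* cpu the task last ran on */
	int  nominated_cpu;         /* preferred wakeup cpu, -1 if none */
	bool this_cpu_idle;         /* waking cpu is idle */
	bool prev_cpu_idle;         /* task's previous cpu is idle */
	bool task_cache_cold;       /* biasing only applies to cache-cold tasks */
	bool nominated_cpu_allowed; /* task affinity permits the nominated cpu */
};

/*
 * Bias the wakeup to the nominated cpu only when every condition holds;
 * otherwise keep the previous cpu (fallback chosen for this sketch).
 */
static int pick_wakeup_cpu(const struct wakeup_state *s)
{
	if (s->task_cache_cold &&
	    s->this_cpu_idle &&
	    s->prev_cpu_idle &&
	    s->nominated_cpu >= 0 &&
	    s->nominated_cpu_allowed)
		return s->nominated_cpu;

	return s->prev_cpu;
}
```

As utilisation rises, this_cpu_idle and prev_cpu_idle are rarely both true, so the bias switches itself off without any extra check.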
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: extend load-balancing code (no change in behavior yet)
When the system utilisation is low and more cpus are idle,
then the process waking up from sleep should prefer to
wakeup an idle cpu from semi-idle cpu package (multi core
package) rather than a completely idle cpu package which
would waste power.
Use the sched_mc balance logic in find_busiest_group() to
nominate a preferred wakeup cpu.
This info can be stored in appropriate sched_domain, but
updating this info in all copies of sched_domain is not
practical. Hence this information is stored in root_domain
struct which is one copy per partitioned sched domain.
The root_domain can be accessed from each cpu's runqueue
and there is one copy per partitioned sched domain.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: change load-balancing direction to match that of irqbalanced
Just in case two groups have identical load, prefer to move load to lower
logical cpu number rather than the present logic of moving to higher logical
number.
find_busiest_group() tries to look for a group_leader that has spare capacity
to take more tasks and free up an appropriate least loaded group. Just in case
there is a tie and the load is equal, then the group with higher logical number
is favoured. This conflicts with user space irqbalance daemon that will move
interrupts to lower logical number if the system utilisation is very low.
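A rough sketch of the tie-break, under simplifying assumptions: the struct and function below are hypothetical and only model "pick the fullest candidate, and on equal load prefer the group whose first cpu number is lower". The real find_busiest_group() also checks capacity and other state.

```c
/* Hypothetical summary of one sched group. */
struct group_info {
	int first_cpu;            /* lowest logical cpu number in the group */
	unsigned long nr_running; /* tasks currently running in the group */
};

/*
 * Return the index of the group to consolidate tasks onto, or -1 if
 * there are no groups. Ties go to the lower logical cpu number.
 */
static int pick_group_leader(const struct group_info *groups, int n)
{
	int leader = -1;

	for (int i = 0; i < n; i++) {
		if (leader < 0 ||
		    groups[i].nr_running > groups[leader].nr_running ||
		    (groups[i].nr_running == groups[leader].nr_running &&
		     groups[i].first_cpu < groups[leader].first_cpu))
			leader = i;
	}
	return leader;
}
```

With this ordering, tasks and interrupts drift toward the same low-numbered cpus instead of opposite ends of the machine.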
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: extend range of /sys/devices/system/cpu/sched_mc_power_savings
Currently the sched_mc/smt_power_savings variable is a boolean,
which either enables or disables topology based power savings.
This patch extends the behaviour of the variable from boolean to
multivalued, so that its value decides how aggressively we
perform the powersavings balance at the appropriate sched domain,
based on topology.
A tunable with several power saving levels lets the end user
match the required power savings vs performance trade-off to the
system configuration and workloads.
This version lets the sched_mc_power_savings global variable
take more values (0,1,2). Later versions can have a single
tunable called sched_power_savings instead of
sched_{mc,smt}_power_savings.
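One way to picture the multivalued tunable is an enum with a bounds-checked setter. This is a sketch under assumptions: only POWERSAVINGS_BALANCE_WAKEUP is named elsewhere in this series; the other enumerator names, the setter and its return convention are illustrative.

```c
/* Levels 0, 1 and 2; names other than ..._WAKEUP are assumed here. */
enum powersavings_balance_level {
	POWERSAVINGS_BALANCE_NONE = 0,   /* no power saving load balance */
	POWERSAVINGS_BALANCE_BASIC,      /* basic power saving balance */
	POWERSAVINGS_BALANCE_WAKEUP,     /* also bias wakeups for power savings */
	MAX_POWERSAVINGS_BALANCE_LEVELS  /* one past the highest valid level */
};

static int sched_mc_power_savings = POWERSAVINGS_BALANCE_NONE;

/*
 * Hypothetical setter: accept only 0..2 and leave the current value
 * untouched on bad input. Returns 0 on success, -1 on rejection.
 */
static int set_sched_mc_power_savings(int level)
{
	if (level < 0 || level >= MAX_POWERSAVINGS_BALANCE_LEVELS)
		return -1;

	sched_mc_power_savings = level;
	return 0;
}
```

Comparing against a named level, rather than testing the variable as a boolean, keeps each behaviour tied to an explicit threshold.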
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: cleanup
BALANCE_FOR_MC_POWER and similar macros defined in sched.h are
not constants and have various condition checks and significant
amount of code that is not suitable to be contained in a macro.
Also there could be side effects on the expressions passed to
some of them like test_sd_parent().
This patch converts all complex macros related to power savings
balance to inline functions.
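The side-effect problem can be illustrated with a small stand-alone sketch. The struct, the flag value and the counting helper are hypothetical; the macro is written in the style of a function-like test and is not copied from sched.h.

```c
#include <stdbool.h>

#define SD_FLAG_A 0x1  /* illustrative flag value */

/* Simplified stand-in for struct sched_domain. */
struct domain {
	int flags;
	struct domain *parent;
};

/* Macro form: 'sd' appears twice, so its expression may run twice. */
#define TEST_SD_PARENT_MACRO(sd, flag) \
	(((sd)->parent && ((sd)->parent->flags & (flag))) ? 1 : 0)

/* Inline form: the argument is evaluated once and is type-checked. */
static inline bool test_sd_parent_inline(const struct domain *sd, int flag)
{
	return sd->parent && (sd->parent->flags & flag);
}

/* Helper with a visible side effect, used to count evaluations. */
static int lookups;

static struct domain *lookup_domain(struct domain *sd)
{
	lookups++;
	return sd;
}
```

Passing lookup_domain(&child) to the macro bumps the counter twice when the parent exists, while the inline function bumps it once.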
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: add new boot parameter
Use possible_cpus=NUM kernel parameter to extend the number of possible
cpus.
The ability to HOTPLUG ON cpus that are "possible" but not "present" is
dealt with in a later patch.
Signed-off-by: Mike Travis <travis@sgi.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Impact: fix potential APIC crash
In determining the destination apicid, there are usually three cpumasks
that are considered: the incoming cpumask arg, cfg->domain and the
cpu_online_mask. Since we are just introducing the cpu_mask_to_apicid_and
function, make sure it includes the cpu_online_mask in its evaluation.
[Added with this patch.]
There are two io_apic.c functions that did not previously use the
cpu_online_mask: setup_IO_APIC_irq and msi_compose_msg. Both of these
simply used cpu_mask_to_apicid(cfg->domain & TARGET_CPUS), and all but
one arch (NUMAQ[*]) returns only online cpus in the TARGET_CPUS mask,
so the behavior is identical for all cases.
[*: NUMAQ bug?]
Note that alloc_cpumask_var is only used for the 32-bit cases where
it's highly likely that the cpumask set size will be small and therefore
CPUMASK_OFFSTACK=n. But if that's not the case, failing the allocate
will cause the same return value as the default.
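The three-mask evaluation can be sketched with plain 64-bit words standing in for cpumasks. Everything here is an assumption for illustration: the function name, the NO_CPU sentinel and the 64-cpu limit are not the kernel's API, and the real code maps the chosen cpu to an apicid.

```c
#include <stdint.h>

#define NO_CPU (-1)  /* hypothetical "no eligible cpu" result */

/*
 * Return the first cpu set in all three masks: the caller's cpumask,
 * the irq's domain and the online mask. Offline cpus can never be
 * picked because the online mask is always part of the intersection.
 */
static int first_cpu_in_and(uint64_t cpumask, uint64_t domain, uint64_t online)
{
	uint64_t eligible = cpumask & domain & online;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (eligible & (UINT64_C(1) << cpu))
			return cpu;
	}
	return NO_CPU;
}
```

For example, masks 0xE, 0x6 and 0xC intersect to 0x4, so cpu 2 is chosen; if the only candidate cpu is offline the result is NO_CPU.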
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |\
| | |
| | |
| | |
| | |
| | | |
This is done for conflict prevention: we merge it into the cpus4096 tree
because upcoming cpumask changes will touch apic.c that would collide
with x86/apic otherwise.
|