aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
...
| | * | | | | | | | | x86: Simplify flush_write_buffers()Brian Gerst2010-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Always make it an inline instead of using a macro for the no-op case. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1265380629-3212-7-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | x86: Clean up mem*io functions.Brian Gerst2010-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Iomem has no special significance on x86. Use the standard mem* functions instead of trying to call other versions. Some fixups are needed to match the function prototypes. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1265380629-3212-6-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | x86-64: Use BUILDIO in io_64.hBrian Gerst2010-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Copied from io_32.h. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1265380629-3212-5-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | x86-64: Reorganize io_64.hBrian Gerst2010-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make it more similar to io_32.h. No real code changes. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1265380629-3212-4-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | x86-32: Remove _local variants of in/out from io_32.hBrian Gerst2010-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These were leftover from the numaq support that was removed in commit 1fba38703d0ce8a5ff0fad9df3eccc6b55cf2cfb. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1265380629-3212-3-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | x86-32: Move XQUAD definitions to numaq.hBrian Gerst2010-02-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The XQUAD stuff is part of the NUMAQ architecture, so move it there. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1265380629-3212-2-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | | | | Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds2010-02-28
| |\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Enable L3 CID only on AMD x86, cacheinfo: Remove NUMA dependency, fix for AMD Fam10h rev D1 x86, cpu: Print AMD virtualization features in /proc/cpuinfo x86, cacheinfo: Calculate L3 indices x86, cacheinfo: Add cache index disable sysfs attrs only to L3 caches x86, cacheinfo: Fix disabling of L3 cache indices intel-agp: Switch to wbinvd_on_all_cpus x86, lib: Add wbinvd smp helpers
| | * | | | | | | | | | x86, cacheinfo: Enable L3 CID only on AMDBorislav Petkov2010-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Final stage linking can fail with arch/x86/built-in.o: In function `store_cache_disable': intel_cacheinfo.c:(.text+0xc509): undefined reference to `amd_get_nb_id' arch/x86/built-in.o: In function `show_cache_disable': intel_cacheinfo.c:(.text+0xc7d3): undefined reference to `amd_get_nb_id' when CONFIG_CPU_SUP_AMD is not enabled because the amd_get_nb_id helper is defined in AMD-specific code but also used in generic code (intel_cacheinfo.c). Reorganize the L3 cache index disable code under CONFIG_CPU_SUP_AMD since it is AMD-only anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100218184210.GF20473@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | x86, cacheinfo: Remove NUMA dependency, fix for AMD Fam10h rev D1Borislav Petkov2010-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The show/store_cache_disable routines depend unnecessarily on NUMA's cpu_to_node and the disabling of cache indices broke when !CONFIG_NUMA. Remove that dependency by using a helper which is always correct. While at it, enable L3 Cache Index disable on rev D1 Istanbuls which sport the feature too. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100218184339.GG20473@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | x86, cpu: Print AMD virtualization features in /proc/cpuinfoJoerg Roedel2010-02-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds code to cpu initialization path to detect the extended virtualization features of AMD cpus to show them in /proc/cpuinfo. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> LKML-Reference: <1260792521-15212-1-git-send-email-joerg.roedel@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | x86, cacheinfo: Calculate L3 indicesBorislav Petkov2010-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to know the valid L3 indices interval when disabling them over /sysfs. Do that when the core is brought online and add boundary checks to the sysfs .store attribute. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-6-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | x86, cacheinfo: Add cache index disable sysfs attrs only to L3 cachesBorislav Petkov2010-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cache_disable_[01] attribute in /sys/devices/system/cpu/cpu?/cache/index[0-3]/ is enabled on all cache levels although only L3 supports it. Add it only to the cache level that actually supports it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-5-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | x86, cacheinfo: Fix disabling of L3 cache indicesBorislav Petkov2010-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Correct the masks used for writing the cache index disable indices. * Do not turn off L3 scrubber - it is not necessary. * Make sure wbinvd is executed on the same node where the L3 is. * Check for out-of-bounds values written to the registers. * Make show_cache_disable hex values unambiguous * Check for Erratum #388 Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-4-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | intel-agp: Switch to wbinvd_on_all_cpusBorislav Petkov2010-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify if-statement while at it. [ hpa: we need to #include <asm/smp.h> ] Cc: Dave Jones <davej@redhat.com> Cc: David Airlie <airlied@linux.ie> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-3-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | x86, lib: Add wbinvd smp helpersBorislav Petkov2010-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add wbinvd_on_cpu and wbinvd_on_all_cpus stubs for executing wbinvd on a particular CPU. [ hpa: renamed lib/smp.c to lib/cache-smp.c ] [ hpa: wbinvd_on_all_cpus() returns int, but wbinvd() returns void. Thus, the former cannot be a macro for the latter, replace with an inline function. ] Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1264172467-25155-2-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | | | | | Merge branch 'x86-cleanups-for-linus' of ↵Linus Torvalds2010-02-28
| |\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Remove trailing spaces in messages x86, mtrr: Remove unused mtrr/state.c x86, trivial: Fix grammo in tsc comment about Geode TSC reliability
| | * | | | | | | | | | | x86: Remove trailing spaces in messagesFrans Pop2010-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Frans Pop <elendil@planet.nl> Cc: Avi Kivity <avi@redhat.com> Cc: x86@kernel.org LKML-Reference: <1265478443-31072-10-git-send-email-elendil@planet.nl> [ Left out the KVM bits. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | x86, mtrr: Remove unused mtrr/state.cBorislav Petkov2010-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The last reference to the helpers in <arch/x86/kernel/cpu/mtrr/state.c> went away with 9a6b344ea967efa0bb5ca4cb5405f840652b66c4 leaving unused code. Remove it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100204085128.GA513@liondog.tnic> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | x86, trivial: Fix grammo in tsc comment about Geode TSC reliabilityThadeu Lima de Souza Cascardo2010-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Cc: marcelo@kvack.org Cc: dilinger@collabora.co.uk Cc: trivial@kernel.org LKML-Reference: <1263764685-9871-1-git-send-email-cascardo@holoscopio.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds2010-02-28
| |\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Mark atomic irq ops raw for 32bit legacy x86: Merge show_regs() x86: Macroise x86 cache descriptors x86-32: clean up rwsem inline asm statements x86: Merge asm/atomic_{32,64}.h x86: Sync asm/atomic_32.h and asm/atomic_64.h x86: Split atomic64_t functions into seperate headers x86-64: Modify memcpy()/memset() alternatives mechanism x86-64: Modify copy_user_generic() alternatives mechanism x86: Lift restriction on the location of FIX_BTMAP_* x86, core: Optimize hweight32()
| | * | | | | | | | | | | | x86: Mark atomic irq ops raw for 32bit legacyIngo Molnar2010-02-16
| | | |_|_|_|_|_|_|/ / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The atomic ops emulation for 32bit legacy CPUs floods the tracer with irq off/on entries. The irq disabled regions are short and therefor not interesting when chasing long irq disabled latencies. Mark them raw and keep them out of the trace. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | | | | | | | | x86: Merge show_regs()Brian Gerst2010-01-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using kernel_stack_pointer() allows 32-bit and 64-bit versions to be merged. This is more correct for 64-bit, since the old %rsp is always saved on the stack. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1263397555-27695-1-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | | x86: Macroise x86 cache descriptorsDave Jones2010-01-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use a macro to define the cache sizes when cachesize > 1 MB. This is less typing, and less prone to introducing bugs like we saw in e02e0e1a130b9ca37c5186d38ad4b3aaf58bb149, and means we don't have to do maths when adding new non-power-of-2 updates like those seen recently. Signed-off-by: Dave Jones <davej@redhat.com> LKML-Reference: <20100104144735.GA18390@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | x86-32: clean up rwsem inline asm statementsLinus Torvalds2010-01-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes gcc use the right register names and instruction operand sizes automatically for the rwsem inline asm statements. So instead of using "(%%eax)" to specify the memory address that is the semaphore, we use "(%1)" or similar. And instead of forcing the operation to always be 32-bit, we use "%z0", taking the size from the actual semaphore data structure itself. This doesn't actually matter on x86-32, but if we want to use the same inline asm for x86-64, we'll need to have the compiler generate the proper 64-bit names for the registers (%rax instead of %eax), and if we want to use a 64-bit counter too (in order to avoid the 15-bit limit on the write counter that limits concurrent users to 32767 threads), we'll need to be able to generate instructions with "q" accesses rather than "l". Since this header currently isn't enabled on x86-64, none of that matters, but we do want to use the xadd version of the semaphores rather than have to take spinlocks to do a rwsem. The mm->mmap_sem can be heavily contended when you have lots of threads all taking page faults, and the fallback rwsem code that uses a spinlock performs abysmally badly in that case. [ hpa: modified the patch to skip size suffixes entirely when they are redundant due to register operands. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.00.1001121613560.17145@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | | x86: Merge asm/atomic_{32,64}.hBrian Gerst2010-01-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge the now identical code from asm/atomic_32.h and asm/atomic_64.h into asm/atomic.h. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1262883215-4034-4-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | | x86: Sync asm/atomic_32.h and asm/atomic_64.hBrian Gerst2010-01-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for merging into asm/atomic.h. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1262883215-4034-3-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | | x86: Split atomic64_t functions into seperate headersBrian Gerst2010-01-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split atomic64_t functions out into separate headers, since they will not be practical to merge between 32 and 64 bits. Signed-off-by: Brian Gerst <brgerst@gmail.com> LKML-Reference: <1262883215-4034-2-git-send-email-brgerst@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | | | | | | | | | | x86-64: Modify memcpy()/memset() alternatives mechanismJan Beulich2009-12-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to avoid unnecessary chains of branches, rather than implementing memcpy()/memset()'s access to their alternative implementations via a jump, patch the (larger) original function directly. The memcpy() part of this is slightly subtle: while alternative instruction patching does itself use memcpy(), with the replacement block being less than 64-bytes in size the main loop of the original function doesn't get used for copying memcpy_c() over memcpy(), and hence we can safely write over its beginning. Also note that the CFI annotations are fine for both variants of each of the functions. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B2BB8D30200007800026AF2@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | x86-64: Modify copy_user_generic() alternatives mechanismJan Beulich2009-12-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to avoid unnecessary chains of branches, rather than implementing copy_user_generic() as a function consisting of just a single (possibly patched) branch, instead properly deal with patching call instructions in the alternative instructions framework, and move the patching into the callers. As a follow-on, one could also introduce something like __EXPORT_SYMBOL_ALT() to avoid patching call sites in modules. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B2BB8180200007800026AE7@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | x86: Lift restriction on the location of FIX_BTMAP_*Jan Beulich2009-12-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The early ioremap fixmap entries cover half (or for 32-bit non-PAE, a quarter) of a page table, yet they got uncondtitionally aligned so far to a 256-entry boundary. This is not necessary if the range of page table entries anyway falls into a single page table. This buys back, for (theoretically) 50% of all configurations (25% of all non-PAE ones), at least some of the lowmem necessarily lost with commit e621bd18958ef5dbace3129ebe17a0a475e127d9. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4B2BB66F0200007800026AD6@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | x86, core: Optimize hweight32()Akinobu Mita2009-12-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Optimize hweight32 by using the same technique in hweight64. The proof of this technique can be found in the commit log for f9b4192923fa6e38331e88214b1fe5fc21583fcc ("bitops: hweight() speedup"). The userspace benchmark on x86_32 showed 20% speedup with bitmap_weight() which uses hweight32 to count bits for each unsigned long on 32bit architectures. int main(void) { #define SZ (1024 * 1024 * 512) static DECLARE_BITMAP(bitmap, SZ) = { [0 ... 100] = 1, }; return bitmap_weight(bitmap, SZ); } Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1258603932-4590-1-git-send-email-akinobu.mita@gmail.com> [ only x86 sets ARCH_HAS_FAST_MULTIPLIER so we do this via the x86 tree] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | | | | | | Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2010-02-28
| |\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) sched: Fix SCHED_MC regression caused by change in sched cpu_power sched: Don't use possibly stale sched_class kthread, sched: Remove reference to kthread_create_on_cpu sched: cpuacct: Use bigger percpu counter batch values for stats counters percpu_counter: Make __percpu_counter_add an inline function on UP sched: Remove member rt_se from struct rt_rq sched: Change usage of rt_rq->rt_se to rt_rq->tg->rt_se[cpu] sched: Remove unused update_shares_locked() sched: Use for_each_bit sched: Queue a deboosted task to the head of the RT prio queue sched: Implement head queueing for sched_rt sched: Extend enqueue_task to allow head queueing sched: Remove USER_SCHED sched: Fix the place where group powers are updated sched: Assume *balance is valid sched: Remove load_balance_newidle() sched: Unify load_balance{,_newidle}() sched: Add a lock break for PREEMPT=y sched: Remove from fwd decls sched: Remove rq_iterator from move_one_task ... Fix up trivial conflicts in kernel/sched.c
| | * | | | | | | | | | | | sched: Fix SCHED_MC regression caused by change in sched cpu_powerSuresh Siddha2010-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On platforms like dual socket quad-core platform, the scheduler load balancer is not detecting the load imbalances in certain scenarios. This is leading to scenarios like where one socket is completely busy (with all the 4 cores running with 4 tasks) and leaving another socket completely idle. This causes performance issues as those 4 tasks share the memory controller, last-level cache bandwidth etc. Also we won't be taking advantage of turbo-mode as much as we would like, etc. Some of the comparisons in the scheduler load balancing code are comparing the "weighted cpu load that is scaled wrt sched_group's cpu_power" with the "weighted average load per task that is not scaled wrt sched_group's cpu_power". While this has probably been broken for a longer time (for multi socket numa nodes etc), the problem got aggrevated via this recent change: | | commit f93e65c186ab3c05ce2068733ca10e34fd00125e | Author: Peter Zijlstra <a.p.zijlstra@chello.nl> | Date: Tue Sep 1 10:34:32 2009 +0200 | | sched: Restore __cpu_power to a straight sum of power | Also with this change, the sched group cpu power alone no longer reflects the group capacity that is needed to implement MC, MT performance (default) and power-savings (user-selectable) policies. We need to use the computed group capacity (sgs.group_capacity, that is computed using the SD_PREFER_SIBLING logic in update_sd_lb_stats()) to find out if the group with the max load is above its capacity and how much load to move etc. Reported-by: Ma Ling <ling.ma@intel.com> Initial-Analysis-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> [ -v2: build fix ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@kernel.org> # [2.6.32.x, 2.6.33.x] LKML-Reference: <1266970432.11588.22.camel@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | sched: Don't use possibly stale sched_classThomas Gleixner2010-02-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | setscheduler() saves task->sched_class outside of the rq->lock held region for a check after the setscheduler changes have become effective. That might result in checking a stale value. rtmutex_setprio() has the same problem, though it is protected by p->pi_lock against setscheduler(), but for correctness sake (and to avoid bad examples) it needs to be fixed as well. Retrieve task->sched_class inside of the rq->lock held region. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: stable@kernel.org
| | * | | | | | | | | | | | Merge branch 'sched/urgent' into sched/coreThomas Gleixner2010-02-16
| | |\ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: kernel/sched.c Necessary due to the urgent fixes which conflict with the code move from sched.c to sched_fair.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | | | | | | | | | | | | kthread, sched: Remove reference to kthread_create_on_cpuAnton Blanchard2010-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kthread_create_on_cpu doesn't exist so update a comment in kthread.c to reflect this. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100209040740.GB3702@kryten> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | sched: cpuacct: Use bigger percpu counter batch values for stats countersAnton Blanchard2010-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_VIRT_CPU_ACCOUNTING and CONFIG_CGROUP_CPUACCT are enabled we can call cpuacct_update_stats with values much larger than percpu_counter_batch. This means the call to percpu_counter_add will always add to the global count which is protected by a spinlock and we end up with a global spinlock in the scheduler. Based on an idea by KOSAKI Motohiro, this patch scales the batch value by cputime_one_jiffy such that we have the same batch limit as we would if CONFIG_VIRT_CPU_ACCOUNTING was disabled. His patch did this once at boot but that initialisation happened too early on PowerPC (before time_init) and it was never updated at runtime as a result of a hotplug cpu add/remove. This patch instead scales percpu_counter_batch by cputime_one_jiffy at runtime, which keeps the batch correct even after cpu hotplug operations. We cap it at INT_MAX in case of overflow. For architectures that do not support CONFIG_VIRT_CPU_ACCOUNTING, cputime_one_jiffy is the constant 1 and gcc is smart enough to optimise min(s32 percpu_counter_batch, INT_MAX) to just percpu_counter_batch at least on x86 and PowerPC. So there is no need to add an #ifdef. On a 64 thread PowerPC box with CONFIG_VIRT_CPU_ACCOUNTING and CONFIG_CGROUP_CPUACCT enabled, a context switch microbenchmark is 234x faster and almost matches a CONFIG_CGROUP_CPUACCT disabled kernel: CONFIG_CGROUP_CPUACCT disabled: 16906698 ctx switches/sec CONFIG_CGROUP_CPUACCT enabled: 61720 ctx switches/sec CONFIG_CGROUP_CPUACCT + patch: 16663217 ctx switches/sec Tested with: wget http://ozlabs.org/~anton/junkcode/context_switch.c make context_switch for i in `seq 0 63`; do taskset -c $i ./context_switch & done vmstat 1 Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Tested-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | percpu_counter: Make __percpu_counter_add an inline function on UPAnton Blanchard2010-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even though batch isn't used on UP, we may want to pass one in to keep the SMP and UP code paths similar. Convert __percpu_counter_add to an inline function so we wont get variable unused warnings if we do. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | Merge branch 'sched/urgent' into sched/coreIngo Molnar2010-02-08
| | |\ \ \ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: Merge dependent fix, update to latest -rc. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | | sched: Remove member rt_se from struct rt_rqYong Zhang2010-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's a duplicate of tg->rt_se[cpu] and the only usage is sched_rt_rq_dequeue() and sched_rt_rq_enqueue(). After the first patch to those two function. rt_se can be removed. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <2674af741001282258q38781619u653ca4a7dd267347@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | | sched: Change usage of rt_rq->rt_se to rt_rq->tg->rt_se[cpu]Yong Zhang2010-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first step to remove rt_rq member rt_se because it have the same meaning with tg->rt_se[cpu]. And the latter style is also used by the fair scheduling class. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <2674af741001282257r28c97a92o9f90cf16fe8d3d84@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | | sched: Remove unused update_shares_locked()Peter Zijlstra2010-02-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f492e12ef050e02bf0185b6b57874992591b9be1 ("sched: Remove load_balance_newidle()") removed the only user of this function, so remove it too. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1265019219.24455.128.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | | sched: Use for_each_bitAkinobu Mita2010-02-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No change in functionality. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1264938810-4173-1-git-send-email-akinobu.mita@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | | sched: Queue a deboosted task to the head of the RT prio queueThomas Gleixner2010-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rtmutex_set_prio() is used to implement priority inheritance for futexes. When a task is deboosted it gets enqueued at the tail of its RT priority list. This is violating the POSIX scheduling semantics: rt priority list X contains two runnable tasks A and B task A runs with priority X and holds mutex M task C preempts A and is blocked on mutex M -> task A is boosted to priority of task C (Y) task A unlocks the mutex M and deboosts itself -> A is dequeued from rt priority list Y -> A is enqueued to the tail of rt priority list X task C schedules away task B runs This is wrong as task A did not schedule away and therefor violates the POSIX scheduling semantics. Enqueue the task to the head of the priority list instead. Reported-by: Mathias Weber <mathias.weber.mw1@roche.com> Reported-by: Carsten Emde <cbe@osadl.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Carsten Emde <cbe@osadl.org> Tested-by: Mathias Weber <mathias.weber.mw1@roche.com> LKML-Reference: <20100120171629.809074113@linutronix.de>
| | * | | | | | | | | | | | | | sched: Implement head queueing for sched_rtThomas Gleixner2010-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ability of enqueueing a task to the head of a SCHED_FIFO priority list is required to fix some violations of POSIX scheduling policy. Implement the functionality in sched_rt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Carsten Emde <cbe@osadl.org> Tested-by: Mathias Weber <mathias.weber.mw1@roche.com> LKML-Reference: <20100120171629.772169931@linutronix.de>
| | * | | | | | | | | | | | | | sched: Extend enqueue_task to allow head queueingThomas Gleixner2010-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ability of enqueueing a task to the head of a SCHED_FIFO priority list is required to fix some violations of POSIX scheduling policy. Extend the related functions with a "head" argument. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Carsten Emde <cbe@osadl.org> Tested-by: Mathias Weber <mathias.weber.mw1@roche.com> LKML-Reference: <20100120171629.734886007@linutronix.de>
| | * | | | | | | | | | | | | | sched: Remove USER_SCHEDDhaval Giani2010-01-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the USER_SCHED feature. It has been scheduled to be removed in 2.6.34 as per http://marc.info/?l=linux-kernel&m=125728479022976&w=2 Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1263990378.24844.3.camel@localhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | | sched: Fix the place where group powers are updatedGautham R Shenoy2010-01-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to update the sched_group_powers when balance_cpu == this_cpu. Currently the group powers are updated only if the balance_cpu is the first CPU in the local group. But balance_cpu = this_cpu could also be the first idle cpu in the group. Hence fix the place where the group powers are updated. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1264017764.5717.127.camel@jschopp-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | | sched: Assume *balance is validPeter Zijlstra2010-01-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since all load_balance() callers will have !NULL balance parameters we can now assume so and remove a few checks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | | | | sched: Remove load_balance_newidle()Peter Zijlstra2010-01-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The two functions: load_balance{,_newidle}() are very similar, with the following differences: - rq->lock usage - sb->balance_interval updates - *balance check So remove the load_balance_newidle() call with load_balance(.idle = CPU_NEWLY_IDLE), explicitly unlock the rq->lock before calling (would be done by double_lock_balance() anyway), and ignore the other differences for now. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>