aboutsummaryrefslogtreecommitdiffstats
path: root/arch/m68k/include/asm/bitops.h
Commit message (Collapse)AuthorAge
* Merge branch 'locking-core-for-linus' of ↵Linus Torvalds2018-08-13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking/atomics update from Thomas Gleixner: "The locking, atomics and memory model brains delivered: - A larger update to the atomics code which reworks the ordering barriers, consolidates the atomic primitives, provides the new atomic64_fetch_add_unless() primitive and cleans up the include hell. - Simplify cmpxchg() instrumentation and add instrumentation for xchg() and cmpxchg_double(). - Updates to the memory model and documentation" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits) locking/atomics: Rework ordering barriers locking/atomics: Instrument cmpxchg_double*() locking/atomics: Instrument xchg() locking/atomics: Simplify cmpxchg() instrumentation locking/atomics/x86: Reduce arch_cmpxchg64*() instrumentation tools/memory-model: Rename litmus tests to comply to norm7 tools/memory-model/Documentation: Fix typo, smb->smp sched/Documentation: Update wake_up() & co. memory-barrier guarantees locking/spinlock, sched/core: Clarify requirements for smp_mb__after_spinlock() sched/core: Use smp_mb() in wake_woken_function() tools/memory-model: Add informal LKMM documentation to MAINTAINERS locking/atomics/Documentation: Describe atomic_set() as a write operation tools/memory-model: Make scripts executable tools/memory-model: Remove ACCESS_ONCE() from model tools/memory-model: Remove ACCESS_ONCE() from recipes locking/memory-barriers.txt/kokr: Update Korean translation to fix broken DMA vs. MMIO ordering example MAINTAINERS: Add Daniel Lustig as an LKMM reviewer tools/memory-model: Fix ISA2+pooncelock+pooncelock+pombonce name tools/memory-model: Add litmus test for full multicopy atomicity locking/refcount: Always allow checked forms ...
| * locking/atomics/m68k: Don't use <asm-generic/bitops/lock.h>Will Deacon2018-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | <asm-generic/bitops/lock.h> is shortly going to be built on top of the atomic_long_*() API, which introduces a nasty circular dependency for m68k where <linux/atomic.h> pulls in <linux/bitops.h> via: linux/atomic.h asm/atomic.h linux/irqflags.h asm/irqflags.h linux/preempt.h asm/preempt.h asm-generic/preempt.h linux/thread_info.h asm/thread_info.h asm/page.h asm-generic/getorder.h linux/log2.h linux/bitops.h Since m68k isn't SMP and doesn't support ACQUIRE/RELEASE barriers, we can just define the lock bitops in terms of the atomic bitops in the <asm/bitops.h> header. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arm-kernel@lists.infradead.org Cc: yamada.masahiro@socionext.com Link: https://lore.kernel.org/lkml/1529412794-17720-3-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | m68k/bitops: convert __ffs to match generic declarationMike Rapoport2018-07-29
|/ | | | | | | | | | | | The generic bitops declare __ffs as static inline unsigned long __ffs(unsigned long word); Convert the m68k version to match the generic declaration. Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
* lib: optimize cpumask_next_and()Clement Courbet2018-02-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and(). It's essentially a joined iteration in search for a non-zero bit, which is currently implemented as a lookup join (find a nonzero bit on the lhs, lookup the rhs to see if it's set there). Implement a direct join (find a nonzero bit on the incrementally built join). Also add generic bitmap benchmarks in the new `test_find_bit` module for new function (see `find_next_and_bit` in [2] and [3] below). For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x faster with a geometric mean of 2.1 on 32 CPUs [1]. No impact on memory usage. Note that on Arm, the new pure-C implementation still outperforms the old one that uses a mix of C and asm (`find_next_bit`) [3]. [1] Approximate benchmark code: ``` unsigned long src1p[nr_cpumask_longs] = {pattern1}; unsigned long src2p[nr_cpumask_longs] = {pattern2}; for (/*a bunch of repetitions*/) { for (int n = -1; n <= nr_cpu_ids; ++n) { asm volatile("" : "+rm"(src1p)); // prevent any optimization asm volatile("" : "+rm"(src2p)); unsigned long result = cpumask_next_and(n, src1p, src2p); asm volatile("" : "+rm"(result)); } } ``` Results: pattern1 pattern2 time_before/time_after 0x0000ffff 0x0000ffff 1.65 0x0000ffff 0x00005555 2.24 0x0000ffff 0x00001111 2.94 0x0000ffff 0x00000000 14.0 0x00005555 0x0000ffff 1.67 0x00005555 0x00005555 1.71 0x00005555 0x00001111 1.90 0x00005555 0x00000000 6.58 0x00001111 0x0000ffff 1.46 0x00001111 0x00005555 1.49 0x00001111 0x00001111 1.45 0x00001111 0x00000000 3.10 0x00000000 0x0000ffff 1.18 0x00000000 0x00005555 1.18 0x00000000 0x00001111 1.17 0x00000000 0x00000000 1.25 ----------------------------- geo.mean 2.06 [2] test_find_next_bit, X86 (skylake) [ 3913.477422] Start testing find_bit() with random-filled bitmap [ 3913.477847] find_next_bit: 160868 cycles, 16484 iterations [ 3913.477933] find_next_zero_bit: 169542 cycles, 16285 iterations [ 3913.478036] find_last_bit: 201638 cycles, 16483 iterations [ 3913.480214] find_first_bit: 4353244 cycles, 16484 iterations [ 3913.480216] Start testing find_next_and_bit() with random-filled bitmap [ 3913.481074] find_next_and_bit: 89604 cycles, 8216 iterations [ 3913.481075] Start testing find_bit() with sparse bitmap [ 3913.481078] find_next_bit: 2536 cycles, 66 iterations [ 3913.481252] find_next_zero_bit: 344404 cycles, 32703 iterations [ 3913.481255] find_last_bit: 2006 cycles, 66 iterations [ 3913.481265] find_first_bit: 17488 cycles, 66 iterations [ 3913.481266] Start testing find_next_and_bit() with sparse bitmap [ 3913.481272] find_next_and_bit: 764 cycles, 1 iterations [3] test_find_next_bit, arm (v7 odroid XU3). [ 267.206928] Start testing find_bit() with random-filled bitmap [ 267.214752] find_next_bit: 4474 cycles, 16419 iterations [ 267.221850] find_next_zero_bit: 5976 cycles, 16350 iterations [ 267.229294] find_last_bit: 4209 cycles, 16419 iterations [ 267.279131] find_first_bit: 1032991 cycles, 16420 iterations [ 267.286265] Start testing find_next_and_bit() with random-filled bitmap [ 267.302386] find_next_and_bit: 2290 cycles, 8140 iterations [ 267.309422] Start testing find_bit() with sparse bitmap [ 267.316054] find_next_bit: 191 cycles, 66 iterations [ 267.322726] find_next_zero_bit: 8758 cycles, 32703 iterations [ 267.329803] find_last_bit: 84 cycles, 66 iterations [ 267.336169] find_first_bit: 4118 cycles, 66 iterations [ 267.342627] Start testing find_next_and_bit() with sparse bitmap [ 267.356919] find_next_and_bit: 91 cycles, 1 iterations [courbet@google.com: v6] Link: http://lkml.kernel.org/r/20171129095715.23430-1-courbet@google.com [geert@linux-m68k.org: m68k/bitops: always include <asm-generic/bitops/find.h>] Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com Signed-off-by: Clement Courbet <courbet@google.com> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yury Norov <ynorov@caviumnetworks.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* m68k/bitops: Correct signature of test_bit()Geert Uytterhoeven2017-03-20
| | | | | | | | | | mm/filemap.c: In function ‘clear_bit_unlock_is_negative_byte’: mm/filemap.c:933: warning: passing argument 2 of ‘test_bit’ discards qualifiers from pointer target type Make the bitmask pointed to by the "vaddr" parameter volatile to fix this, like is done on other architectures. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
* arch,m68k: Convert smp_mb__*()Peter Zijlstra2014-04-18
| | | | | | | | | | | | | | m68k uses asm-generic/barrier.h and its smp_mb() is barrier(), therefore we can use the generic versions that use smp_mb(). Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-s5dvosrb7qhvpmtaffwfn0zg@git.kernel.org Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Cc: linux-m68k@lists.linux-m68k.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* m68k: merge mmu and non-mmu bitops.hGreg Ungerer2011-07-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following patch merges the mmu and non-mmu versions of the m68k bitops.h files. Now there is a good deal of difference between the two files, but none of it is actually an mmu specific difference. It is all about the specific m68k/coldfire varient we are targeting. So it makes an awful lot of sense to merge these into a single bitops.h. There is a number of ways I can see to factor this code. The approach I have taken here is to keep the various versions of each macro/function type together. This means that there is some ifdefery with each to handle each CPU type. I have added some comments in a couple of appropriate places to try and make it clear what the differences we are dealing with are. Specifically the instruction and addressing mode differences we have to deal with. The merged form keeps the same underlying optimizations for each CPU type for all the general bit clear/set/change and find bit operations. It does switch to using the generic le operations though, instead of any local varients. Build tested on ColdFire, 68328, 68360 (which is cpu32) and 68020+. Run tested on ColdFire and ARAnyM. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
* m68k,m68knommu: merge header filesSam Ravnborg2009-01-16
Merge header files for m68k and m68knommu to the single location: arch/m68k/include/asm The majority of this patch was the result of the script that is included in the changelog below. The script was originally written by Arnd Bergman and exten by me to cover a few more files. When the header files differed the script uses the following: The original m68k file is named <file>_mm.h [mm for memory manager] The m68knommu file is named <file>_no.h [no for no memory manager] The files uses the following include guard: This include gaurd works as the m68knommu toolchain set the __uClinux__ symbol - so this should work in userspace too. Merging the header files for m68k and m68knommu exposes the (unexpected?) ABI differences thus it is easier to actually identify these and thus to fix them. The commit has been build tested with both a m68k and a m68knommu toolchain - with success. The commit has also been tested with "make headers_check" and this patch fixes make headers_check for m68knommu. The script used: TARGET=arch/m68k/include/asm SOURCE=arch/m68knommu/include/asm INCLUDE="cachectl.h errno.h fcntl.h hwtest.h ioctls.h ipcbuf.h \ linkage.h math-emu.h md.h mman.h movs.h msgbuf.h openprom.h \ oplib.h poll.h posix_types.h resource.h rtc.h sembuf.h shmbuf.h \ shm.h shmparam.h socket.h sockios.h spinlock.h statfs.h stat.h \ termbits.h termios.h tlb.h types.h user.h" EQUAL="auxvec.h cputime.h device.h emergency-restart.h futex.h \ ioctl.h irq_regs.h kdebug.h local.h mutex.h percpu.h \ sections.h topology.h" NOMUUFILES="anchor.h bootstd.h coldfire.h commproc.h dbg.h \ elia.h flat.h m5206sim.h m520xsim.h m523xsim.h m5249sim.h \ m5272sim.h m527xsim.h m528xsim.h m5307sim.h m532xsim.h \ m5407sim.h m68360_enet.h m68360.h m68360_pram.h m68360_quicc.h \ m68360_regs.h MC68328.h MC68332.h MC68EZ328.h MC68VZ328.h \ mcfcache.h mcfdma.h mcfmbus.h mcfne.h mcfpci.h mcfpit.h \ mcfsim.h mcfsmc.h mcftimer.h mcfuart.h mcfwdebug.h \ nettel.h quicc_simple.h smp.h" FILES="atomic.h bitops.h bootinfo.h bug.h bugs.h byteorder.h cache.h \ cacheflush.h checksum.h current.h delay.h div64.h \ dma-mapping.h dma.h elf.h entry.h fb.h fpu.h hardirq.h hw_irq.h io.h \ irq.h kmap_types.h machdep.h mc146818rtc.h mmu.h mmu_context.h \ module.h page.h page_offset.h param.h pci.h pgalloc.h \ pgtable.h processor.h ptrace.h scatterlist.h segment.h \ setup.h sigcontext.h siginfo.h signal.h string.h system.h swab.h \ thread_info.h timex.h tlbflush.h traps.h uaccess.h ucontext.h \ unaligned.h unistd.h" mergefile() { BASE=${1%.h} git mv ${SOURCE}/$1 ${TARGET}/${BASE}_no.h git mv ${TARGET}/$1 ${TARGET}/${BASE}_mm.h cat << EOF > ${TARGET}/$1 EOF git add ${TARGET}/$1 } set -e mkdir -p ${TARGET} git mv include/asm-m68k/* ${TARGET} rmdir include/asm-m68k git rm ${SOURCE}/Kbuild for F in $INCLUDE $EQUAL; do git rm ${SOURCE}/$F done for F in $NOMUUFILES; do git mv ${SOURCE}/$F ${TARGET}/$F done for F in $FILES ; do mergefile $F done rmdir arch/m68knommu/include/asm rmdir arch/m68knommu/include Cc: Arnd Bergmann <arnd@arndb.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Greg Ungerer <gerg@uclinux.org>