aboutsummaryrefslogtreecommitdiffstats
path: root/arch/s390/include/asm
Commit message (Collapse)AuthorAge
...
* | | | | | s390: remove runtime instrumentation interruptsMartin Schwidefsky2015-11-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The external interrupts for runtime instrumentation buffer-full and runtime instrumentation halted are unused and have no current user. Remove the support and ignore the second parameter of the s390_runtime_instr system call from now on. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/nmi: remove castsHeiko Carstens2015-10-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove all the casts to and from the machine check interruption code. This patch changes struct mci to a union, which contains an anonymous structure with the already known bits and in addition an unsigned long field, which contains the raw machine check interruption code. This allows to simply assign and decoce the interruption code value without the need for all those casts we had all the time. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390: don't store registers on disabled wait anymoreHeiko Carstens2015-10-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current disabled wait code stores register contents into their save areas, however it is (at least) missing the new vector registers. Given the fact that the whole exercise seems to be rather pointless simply don't save any registers anymore. In a "live" system it is always possible to inspect register contents, and in case of a dump the register contents will be stored by the dump mechanism. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390: get rid of __set_psw_mask()Heiko Carstens2015-10-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the removal of 31 bit code we can always assume that the epsw instruction is available. Therefore use the __extract_psw() function to disable and enable machine checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/fpu: split fpu-internal.h into fpu internals, api, and type headersHendrik Brueckner2015-10-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split the API and FPU type definitions into separate header files similar to "x86/fpu: Rename fpu-internal.h to fpu/internal.h" (78f7f1e54b). The new header files and their meaning are: asm/fpu/types.h: FPU related data types, needed for 'struct thread_struct' and 'struct task_struct'. asm/fpu/api.h: FPU related 'public' functions for other subsystems and device drivers. asm/fpu/internal.h: FPU internal functions mainly used to convert FPU register contents in signal handling. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/spinlock: remove unneeded serializations at unlockChristian Borntraeger2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the kernel locks have aqcuire/release semantics. No operation done after the lock can be "moved" before the lock and no operation before the unlock can be moved after the unlock. But it is perfectly fine that memory accesses which happen code wise after unlock are performed within the critical section. On s390x, reads are in-order with other reads (PoP section "Storage-Operand Fetch References") and writes are in-order with other writes (PoP section "Storage-Operand Store References"). Writes are also in-order with reads to the same memory location (PoP section "Storage-Operand Store References"). To other CPUs (and the channel subsystem), reads additionally appear to be performed prior to reads or writes that happen after them in the conceptual sequence (PoP section "Relation between Operand Accesses"). So at least as observed by other CPUs and the channel subsystem, reads inside the critical sections will not happen after unlock (and writes are in-order anyway). That's exactly what we need for "RELEASE operations" (memory-barriers.txt): "It guarantees that all memory operations before the RELEASE operation will appear to happen before the RELEASE operation with respect to the other components of the system." Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com> [cross-reading and lot of improvements for the patch description] Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/etr,stp: fix possible deadlock on machine checkHeiko Carstens2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The first level machine check handler for etr and stp machine checks may call queue_work() while in nmi context. This may deadlock e.g. if the machine check happened when the interrupted context did hold a lock, that also will be acquired by queue_work(). Therefore split etr and stp machine check handling into first and second level handling. The second level handling will then issue the queue_work() call in process context which avoids the potential deadlock. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/bitops: remove 31 bit related commentsHeiko Carstens2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/cmpxchg: remove dead codeHeiko Carstens2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the removal of 31 bit support a couple of defines became unused. Remove them. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/nmi: change type of mcck_interruption_code lowcore fieldHeiko Carstens2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For some unknown reason the mcck_interruption_code field is defined as array of two 32 bit values. Given that this actually is a 64 bit field according to the architecture, change the type to u64. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/flags: use _BITUL macroHeiko Carstens2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The defines that are used in entry.S have been partially converted to use the _BITUL macro (setup.h). This patch converts the rest. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/flags: fix flag handlingHeiko Carstens2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cpu flags and pt_regs flags fields are each 64 bits in size. A flag can be set with helper functions like set_cpu_flags(). These functions create a mask using "1U << flag". This doesn't work if flag is larger than 31, since 1U << 32 == 0. So fix this in case we ever will have such flag numbers. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/udelay: make udelay have busy loop semanticsHeiko Carstens2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using systemtap it was observed that our udelay implementation is rather suboptimal if being called from a kprobe handler installed by systemtap. The problem observed when a kprobe was installed on lock_acquired(). When the probe was hit the kprobe handler did call udelay, which set up an (internal) timer and reenabled interrupts (only the clock comparator interrupt) and waited for the interrupt. This is an optimization to avoid that the cpu is busy looping while waiting that enough time passes. The problem is that the interrupt handler still does call irq_enter()/irq_exit() which then again can lead to a deadlock, since some accounting functions may take locks as well. If one of these locks is the same, which caused lock_acquired() to be called, we have a nice deadlock. This patch reworks the udelay code for the interrupts disabled case to immediately leave the low level interrupt handler when the clock comparator interrupt happens. That way no C code is being called and the deadlock cannot happen anymore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/cpumf: rework program parameter setting to detect guest samplesChristian Borntraeger2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The program parameter can be used to mark hardware samples with some token. Previously, it was used to mark guest samples only. Improve the program parameter doubleword by combining two parts, the leftmost LPP part and the rightmost PID part. Set the PID part for processes by using the task PID. To distinguish host and guest samples for the kernel (PID part is zero), the guest must always set the program paramater to a non-zero value. Use the leftmost bit in the LPP part of the program parameter to be able to detect guest kernel samples. [brueckner@linux.vnet.ibm.com]: Split __LC_CURRENT and introduced __LC_LPP. Corrected __LC_CURRENT users and adjusted assembler parts. And updated the commit message accordingly. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/entry: add assembler macro to conveniently tests under maskHendrik Brueckner2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Various functions in entry.S perform test-under-mask instructions to test for particular bits in memory. Because test-under-mask uses a mask value of one byte, the mask value and the offset into the memory must be calculated manually. This easily introduces errors and is hard to review and read. Introduce the TSTMSK assembler macro to specify a mask constant and let the macro calculate the offset and the byte mask to generate a test-under-mask instruction. The benefit is that existing symbolic constants can now be used for tests. Also the macro checks for zero mask values and mask values that consist of multiple bytes. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/fpu: add static FPU save area for init_taskHendrik Brueckner2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the init task did not have an allocated FPU save area and saving an FPU state was not possible. Now if the vector extension is always enabled, provide a static FPU save area to save FPU states of vector instructions that can be executed quite early. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/fpu: always enable the vector facility if it is availableHendrik Brueckner2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the kernel detects that the s390 hardware supports the vector facility, it is enabled by default at an early stage. To force it off, use the novx kernel parameter. Note that there is a small time window, where the vector facility is enabled before it is forced to be off. With enabling the vector facility by default, the FPU save and restore functions can be improved. They do not longer require to manage expensive control register updates to enable or disable the vector enablement control for particular processes. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/mm: try to avoid storage key operation in ptep_set_access_flagsMartin Schwidefsky2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The call to pgste_set_key in ptep_set_access_flags can be avoided if the old pte is found to be valid at the time the new access rights are set. The function that created the old, valid pte already completed the required storage key operation. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/barrier: remove unnecessary serialization in atomics and bitopsMartin Schwidefsky2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The principles of operation states reads are in order, writes are in order, writes can be reordered after reads, but no reads can be reordered after writes. The atomic and bitops variantes for z196 use the interlocked-access facility instructions with a memory barrier before and after the instruction. Because of the memory ordering the first barrier is unnecessary and can be removed. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/diag: add tracepoint for diagnose callsMartin Schwidefsky2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To be able to analyse problems in regard to hypervisor overhead add a tracepoing for diagnose calls. It reports the number of the diagnose issued, e.g. sshd-1385 [002] .... 42.701431: diagnose: nr=0x9c <idle>-0 [001] ..s. 43.587528: diagnose: nr=0x9c Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/diag: add a statistic for diagnose callsMartin Schwidefsky2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce /sys/debug/kernel/diag_stat with a statistic how many diagnose calls have been done by each CPU in the system. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/bitops: implement cache friendly test_and_set_bit_lockMartin Schwidefsky2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The generic implementation for test_and_set_bit_lock in include/asm-generic uses the standard test_and_set_bit operation. This is done with either a 'csg' or a 'loag' instruction. For both version the cache line is fetched exclusively, even if the bit is already set. The result is an increase in cache traffic, for a contented lock this is a bad idea. Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/mm: implement soft-dirty bits for user memory change trackingMartin Schwidefsky2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use bit 2**1 of the pte and bit 2**14 of the pmd for the soft dirty bit. The fault mechanism to do dirty tracking is already in place. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/cio: introduce pathmask_to_posSebastian Ott2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We often need to correlate an 8 bit path mask with the position in a channel path array. Introduce and use pathmask_to_pos for that task. Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/cio: use device_lock during cmb activationSebastian Ott2015-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hold the device_lock during [de]activation of the channel measurement block to synchronize concurrent usage of these functions. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | s390/barrier: avoid serialization in [smp_]rmb and [smp_]wmbChristian Borntraeger2015-10-14
|/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The principles of operation says: The storage-operand fetch references of one instruction occur after those of all preceding instructions and before those of subsequent instructions, as observed by other CPUs and by channel programs. [...] The CPU may fetch the operands of instructions before the instructions are executed. [...] The CPU may delay placing results in storage. [...] the results of one instruction are placed in storage after the results of all preceding instructions have been placed in storage and before any results of the succeeding instructions are stored, as observed by other CPUs and by the channel subsystem. which boils down to: - reads are in order - writes are in order - reads can happen earlier - writes can happen later By definition (see memory-barrier.txt) read barriers orders reads vs reads and write barriers orders writes agains writes. but neither of these orders reads vs. writes. That means we can implement smp_wmb,smp_rmb,wmb and rmb as simple compiler barriers. To avoid reviewing all driver code for correct barrier usage we keep dma_[rw]mb as serialization for now. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | Merge branch 'for-linus' of ↵Linus Torvalds2015-10-06
|\ \ \ \ \ | |_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Three bug fixes and an update to the default configuration" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/defconfig: set SCSI_DH=y s390/vtime: correct scaled cputime of partially idle CPUs s390/boot/decompression: disable floating point in decompressor s390/numa: use correct type for node_to_cpumask_map
| * | | | s390/numa: use correct type for node_to_cpumask_mapMartin Schwidefsky2015-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With CONFIG_CPUMASK_OFFSTACK=y cpumask_var_t is a pointer to a CPU mask. Replace the incorrect type for node_to_cpumask_map with cpumask_t. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | Merge branch 'strscpy' of ↵Linus Torvalds2015-10-04
|\ \ \ \ \ | |_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile Pull strscpy string copy function implementation from Chris Metcalf. Chris sent this during the merge window, but I waffled back and forth on the pull request, which is why it's going in only now. The new "strscpy()" function is definitely easier to use and more secure than either strncpy() or strlcpy(), both of which are horrible nasty interfaces that have serious and irredeemable problems. strncpy() has a useless return value, and doesn't NUL-terminate an overlong result. To make matters worse, it pads a short result with zeroes, which is a performance disaster if you have big buffers. strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking the insane NUL padding, but having a differently broken return value which returns the original length of the source string. Which means that it will read characters past the count from the source buffer, and you have to trust the source to be properly terminated. It also makes error handling fragile, since the test for overflow is unnecessarily subtle. strscpy() avoids both these problems, guaranteeing the NUL termination (but not excessive padding) if the destination size wasn't zero, and making the overflow condition very obvious by returning -E2BIG. It also doesn't read past the size of the source, and can thus be used for untrusted source data too. So why did I waffle about this for so long? Every time we introduce a new-and-improved interface, people start doing these interminable series of trivial conversion patches. And every time that happens, somebody does some silly mistake, and the conversion patch to the improved interface actually makes things worse. Because the patch is mindnumbing and trivial, nobody has the attention span to look at it carefully, and it's usually done over large swatches of source code which means that not every conversion gets tested. So I'm pulling the strscpy() support because it *is* a better interface. But I will refuse to pull mindless conversion patches. Use this in places where it makes sense, but don't do trivial patches to fix things that aren't actually known to be broken. * 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: tile: use global strscpy() rather than private copy string: provide strscpy() Make asm/word-at-a-time.h available on all architectures
| * | | | Make asm/word-at-a-time.h available on all architecturesChris Metcalf2015-07-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added the x86 implementation of word-at-a-time to the generic version, which previously only supported big-endian. Omitted the x86-specific load_unaligned_zeropad(), which in any case is also not present for the existing BE-only implementation of a word-at-a-time, and is only used under CONFIG_DCACHE_WORD_ACCESS. Added as a "generic-y" to the Kbuilds of all architectures that didn't previously have it. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
* | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2015-09-25
|\ \ \ \ \ | |_|/ / / |/| | | / | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM fixes from Paolo Bonzini: "AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC bug fixes too" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: disable halt_poll_ns as default for s390x KVM: x86: fix off-by-one in reserved bits check KVM: x86: use correct page table format to check nested page table reserved bits KVM: svm: do not call kvm_set_cr0 from init_vmcb KVM: x86: trap AMD MSRs for the TSeg base and mask KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store() KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs kvm: svm: reset mmu on VCPU reset
| * | | KVM: disable halt_poll_ns as default for s390xDavid Hildenbrand2015-09-25
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We observed some performance degradation on s390x with dynamic halt polling. Until we can provide a proper fix, let's enable halt_poll_ns as default only for supported architectures. Architectures are now free to set their own halt_poll_ns default value. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2015-09-21
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "A couple of system call updates. The two new system calls userfaultfd and membarrier have been added, as well as the 17 direct calls for the multiplexed socket system calls. In addition the system call compat wrappers have been flagged as notrace functions and a few wrappers could be removed. And bug fixes for the vector register handling, cpu_mf, suspend/resume, compat signals, SMT cputime accounting and the zfcp dumper" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: wire up separate socketcalls system calls s390/compat: remove superfluous compat wrappers s390/compat: do not trace compat wrapper functions s390/s390x: allocate sys_membarrier system call number s390/configs//zfcpdump_defconfig: Remove CONFIG_MEMSTICK s390: wire up userfaultfd system call s390/vtime: correct scaled cputime for SMT s390/cpum_cf: Corrected return code for unauthorized counter sets s390/compat: correct uc_sigmask of the compat signal frame s390: fix floating point register corruption s390/hibernate: fix save and restore of vector registers
| * | s390: wire up separate socketcalls system callsHeiko Carstens2015-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed on linux-arch all architectures should wire up the separate system calls that are hidden behind the socketcall multiplexer system call. It's just a couple more system calls and gives us a very small performance improvement. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | KVM: add halt_attempted_poll to VCPU statsPaolo Bonzini2015-09-16
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new statistic can help diagnosing VCPUs that, for any reason, trigger bad behavior of halt_poll_ns autotuning. For example, say halt_poll_ns = 480000, and wakeups are spaced exactly like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes 10+20+40+80+160+320+480 = 1110 microseconds out of every 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then is consuming about 30% more CPU than it would use without polling. This would show as an abnormally high number of attempted polling compared to the successful polls. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com< Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | dma-mapping: consolidate dma_set_maskChristoph Hellwig2015-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Almost everyone implements dma_set_mask the same way, although some time that's hidden in ->set_dma_mask methods. This patch consolidates those into a common implementation that either calls ->set_dma_mask if present or otherwise uses the default implementation. Some architectures used to only call ->set_dma_mask after the initial checks, and those instance have been fixed to do the full work. h8300 implemented dma_set_mask bogusly as a no-ops and has been fixed. Unfortunately some architectures overload unrelated semantics like changing the dma_ops into it so we still need to allow for an architecture override for now. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | dma-mapping: consolidate dma_supportedChristoph Hellwig2015-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most architectures just call into ->dma_supported, but some also return 1 if the method is not present, or 0 if no dma ops are present (although that should never happeb). Consolidate this more broad version into common code. Also fix h8300 which inorrectly always returned 0, which would have been a problem if it's dma_set_mask implementation wasn't a similarly buggy noop. As a few architectures have much more elaborate implementations, we still allow for arch overrides. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | dma-mapping: cosolidate dma_mapping_errorChristoph Hellwig2015-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently there are three valid implementations of dma_mapping_error: (1) call ->mapping_error (2) check for a hardcoded error code (3) always return 0 This patch provides a common implementation that calls ->mapping_error if present, then checks for DMA_ERROR_CODE if defined or otherwise returns 0. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | dma-mapping: consolidate dma_{alloc,free}_noncoherentChristoph Hellwig2015-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most architectures do not support non-coherent allocations and either define dma_{alloc,free}_noncoherent to their coherent versions or stub them out. Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips implements them directly. This patch moves the Openrisc version to common code, and handles the DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance. Note that actual non-coherent allocations require a dma_cache_sync implementation, so if non-coherent allocations didn't work on an architecture before this patch they still won't work after it. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}Christoph Hellwig2015-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt outs for two of them as a few architectures have very non-standard implementations. This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences: - the debug_dma helpers are now called from all architectures, including those that were previously missing them - dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops dma-mapping-common.h always includes dma-coherent.h to get the defintions for them, or the stubs if the architecture doesn't support this feature - checks for ->alloc / ->free presence are removed. There is only one magic instead of dma_map_ops without them (mic_dma_ops) and that one is x86 only anyway. Besides that only x86 needs special treatment to replace a default devices if none is passed and tweak the gfp_flags. An optional arch hook is provided for that. [linux@roeck-us.net: fix build] [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'locking-core-for-linus' of ↵Linus Torvalds2015-09-03
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking and atomic updates from Ingo Molnar: "Main changes in this cycle are: - Extend atomic primitives with coherent logic op primitives (atomic_{or,and,xor}()) and deprecate the old partial APIs (atomic_{set,clear}_mask()) The old ops were incoherent with incompatible signatures across architectures and with incomplete support. Now every architecture supports the primitives consistently (by Peter Zijlstra) - Generic support for 'relaxed atomics': - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return() - atomic_read_acquire() - atomic_set_release() This came out of porting qwrlock code to arm64 (by Will Deacon) - Clean up the fragile static_key APIs that were causing repeat bugs, by introducing a new one: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. To be able to know the 'type' of the static key we encode it in the jump entry (by Peter Zijlstra) - Static key self-tests (by Jason Baron) - qrwlock optimizations (by Waiman Long) - small futex enhancements (by Davidlohr Bueso) - ... and misc other changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) jump_label/x86: Work around asm build bug on older/backported GCCs locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics locking/qrwlock: Implement queue_write_unlock() using smp_store_release() locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t' locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic locking/static_keys: Make verify_keys() static jump label, locking/static_keys: Update docs locking/static_keys: Provide a selftest jump_label: Provide a self-test s390/uaccess, locking/static_keys: employ static_branch_likely() x86, tsc, locking/static_keys: Employ static_branch_likely() locking/static_keys: Add selftest locking/static_keys: Add a new static_key interface locking/static_keys: Rework update logic locking/static_keys: Add static_key_{en,dis}able() helpers ...
| * \ Merge branch 'locking/arch-atomic' into locking/core, because it's ready for ↵Ingo Molnar2015-08-12
| |\ \ | | | | | | | | | | | | | | | | | | | | upstream Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | atomic: Collapse all atomic_{set,clear}_mask definitionsPeter Zijlstra2015-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the now generic definitions of atomic_{set,clear}_mask() into linux/atomic.h to avoid endless and pointless repetition. Also, provide an atomic_andnot() wrapper for those few archs that can implement that. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | atomic: Provide atomic_{or,xor,and}Peter Zijlstra2015-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement atomic logic ops -- atomic_{or,xor,and}. These will replace the atomic_{set,clear}_mask functions that are available on some archs. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | s390: Provide atomic_{or,xor,and}Peter Zijlstra2015-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement atomic logic ops -- atomic_{or,xor,and}. These will replace the atomic_{set,clear}_mask functions that are available on some archs. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | | locking/static_keys: Add a new static_key interfacePeter Zijlstra2015-08-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are various problems and short-comings with the current static_key interface: - static_key_{true,false}() read like a branch depending on the key value, instead of the actual likely/unlikely branch depending on init value. - static_key_{true,false}() are, as stated above, tied to the static_key init values STATIC_KEY_INIT_{TRUE,FALSE}. - we're limited to the 2 (out of 4) possible options that compile to a default NOP because that's what our arch_static_branch() assembly emits. So provide a new static_key interface: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); Which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. This means adding a second arch_static_branch_jump() assembly helper which emits a JMP per default. In order to determine the right instruction for the right state, encode the branch type in the LSB of jump_entry::key. This is the final step in removing the naming confusion that has led to a stream of avoidable bugs such as: a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()") ... but it also allows new static key combinations that will give us performance enhancements in the subsequent patches. Tested-by: Rabin Vincent <rabin@rab.in> # arm Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | locking, arch: use WRITE_ONCE()/READ_ONCE() in ↵Andrey Konovalov2015-08-03
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | smp_store_release()/smp_load_acquire() Replace ACCESS_ONCE() macro in smp_store_release() and smp_load_acquire() with WRITE_ONCE() and READ_ONCE() on x86, arm, arm64, ia64, metag, mips, powerpc, s390, sparc and asm-generic since ACCESS_ONCE() does not work reliably on non-scalar types. WRITE_ONCE() and READ_ONCE() were introduced in the following commits: 230fa253df63 ("kernel: Provide READ_ONCE and ASSIGN_ONCE") 43239cbe79fc ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Alexander Duyck <alexander.h.duyck@redhat.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@suse.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/1438528264-714-1-git-send-email-andreyknvl@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2015-09-01
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: "The dominant change in this cycle was the continued work to isolate kernel drivers from MTRR legacies: this tree gets rid of all kernel internal driver interfaces to MTRRs (mostly by rewriting it to proper PAT interfaces), the only access left is the /proc/mtrr ABI. This work was done by Luis R Rodriguez. There's also some related PCI interface additions for which I've Cc:-ed Bjorn" * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del() s390/io: Add pci_iomap_wc() and pci_iomap_wc_range() drivers/dma/iop-adma: Use dma_alloc_writecombine() kernel-style drivers/video/fbdev/vt8623fb: Use arch_phys_wc_add() and pci_iomap_wc() drivers/video/fbdev/s3fb: Use arch_phys_wc_add() and pci_iomap_wc() drivers/video/fbdev/arkfb.c: Use arch_phys_wc_add() and pci_iomap_wc() PCI: Add pci_iomap_wc() variants drivers/video/fbdev/gxt4500: Use pci_ioremap_wc_bar() to map framebuffer drivers/video/fbdev/kyrofb: Use arch_phys_wc_add() and pci_ioremap_wc_bar() drivers/video/fbdev/i740fb: Use arch_phys_wc_add() and pci_ioremap_wc_bar() PCI: Add pci_ioremap_wc_bar() x86/mm: Make kernel/check.c explicitly non-modular x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular x86/mm/pat: Add comments to cachemode translation tables arch/*/io.h: Add ioremap_uc() to all architectures drivers/video/fbdev/atyfb: Use arch_phys_wc_add() and ioremap_wc() drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC drivers/video/fbdev/atyfb: Clarify ioremap() base and length used drivers/video/fbdev/atyfb: Carve out framebuffer length fudging into a helper x86/mm, asm-generic: Add IOMMU ioremap_uc() variant default ...
| * | | s390/io: Add pci_iomap_wc() and pci_iomap_wc_range()Luis R. Rodriguez2015-08-28
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following commit: 1b3d4200c1e0 ("PCI: Add pci_iomap_wc() variants") Introduced pci_iomap_wc() variants but broke the s390 build, because s390 requires its own implementation of pcio_iomap*() calls. The reason for that is that: "BAR spaces are not disjunctive on s390 so we need the bar parameter of pci_iomap to find the corresponding device and create the mapping cookie" so it has its own lookup/lock solution and it does not include asm-generic/pci_iomap.h. Since it currenty maps ioremap_wc() to ioremap_nocache() and that's the architecture default we can easily just map the wc calls to the default calls as well. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Frank Blaschka <frank.blaschka@de.ibm.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thierry Reding <treding@nvidia.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-fbdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux390@de.ibm.com Link: http://lkml.kernel.org/r/1440632050-23648-1-git-send-email-mcgrof@do-not-panic.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2015-08-31
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "The big one is support for fake NUMA, splitting a really large machine in more manageable piece improves performance in some cases, e.g. for a KVM host. The FICON Link Incident handling has been improved, this helps the operator to identify degraded or non-operational FICON connections. The save and restore of floating point and vector registers has been overhauled to allow the future use of vector registers in the kernel. A few small enhancement, magic sys-requests for the vt220 console via SCLP, some more assembler code has been converted to C, the PCI error handling is improved. And the usual cleanup and bug fixing" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (59 commits) s390/jump_label: Use %*ph to print small buffers s390/sclp_vt220: support magic sysrequests s390/ctrlchar: improve handling of magic sysrequests s390/numa: remove superfluous ARCH_WANT defines s390/3270: redraw screen on unsolicited device end s390/dcssblk: correct out of bounds array indexes s390/mm: simplify page table alloc/free code s390/pci: move debug messages to debugfs s390/nmi: initialize control register 0 earlier s390/zcrypt: use msleep() instead of mdelay() s390/hmcdrv: fix interrupt registration s390/setup: fix novx parameter s390/uaccess: remove uaccess_primary kernel parameter s390: remove unneeded sizeof(void *) comparisons s390/facilities: remove transactional-execution bits s390/numa: re-add DIE sched_domain_topology_level s390/dasd: enhance CUIR scope detection s390/dasd: fix failing path verification s390/vdso: emit a GNU hash s390/numa: make core to node mapping data dynamic ...