aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* ext4: delete "set but not used" variablesjon ernst2014-01-11
| | | | | | Signed-off-by: Jon Ernst <jonernst07@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
* ext4: don't pass freed handle to ext4_walk_page_buffersTheodore Ts'o2014-01-07
| | | | | | | | | | | | This is harmless, since ext4_walk_page_buffers only passes the handle onto the callback function, and in this call site the function in question, bput_one(), doesn't actually use the handle. But there's no point passing in an invalid handle, and it creates a Coverity warning, so let's just clean it up. Addresses-Coverity-Id: #1091168 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: avoid clearing beyond i_blocks when truncating an inline data fileTheodore Ts'o2014-01-07
| | | | | | | | | | | | A missing cast means that when we are truncating a file which is less than 60 bytes, we don't clear the correct area of memory, and in fact we can end up truncating the next inode in the inode table, or worse yet, some other kernel data structure. Addresses-Coverity-Id: #751987 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: ext4_inode_is_fast_symlink should use EXT4_CLUSTER_SIZEYongqiang Yang2014-01-06
| | | | | | | | Can be reproduced by xfstests 62 with bigalloc and 128bit size inode. Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
* ext4: fix a typo in extents.cYongqiang Yang2014-01-06
| | | | | Signed-off-by: Yongqiang Yang <yangyongqiang01@baidu.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
* ext4: use %pd printk specificerDavid Howells2014-01-06
| | | | | | | | | | | Use the new %pd printk() specifier in Ext4 to replace passing of dentry name or dentry name and name length * 2 with just passing the dentry. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> cc: Andreas Dilger <adilger.kernel@dilger.ca> cc: linux-ext4@vger.kernel.org
* ext4: standardize error handling in ext4_da_write_inline_data_begin()Jan Kara2014-01-06
| | | | | | | | | The function has a bit non-standard (for ext4) error recovery in that it used a mix of 'out' labels and testing for 'handle' being NULL. There isn't a good reason for that in the function so clean it up a bit. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: retry allocation when inline->extent conversion failedJan Kara2014-01-06
| | | | | | | | | | Similarly as other ->write_begin functions in ext4, also ext4_da_write_inline_data_begin() should retry allocation if the conversion failed because of ENOSPC. This avoids returning ENOSPC prematurely because of uncommitted block deletions. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: enable punch hole for bigallocZheng Liu2014-01-06
| | | | | | | | | | After applied this commit (d23142c6), ext4 has supported punch hole for a file system with bigalloc feature. But we forgot to enable it. This commit fixes it. Cc: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix bigalloc regressionEric Whitney2014-01-06
| | | | | | | | | | | | | Commit f5a44db5d2 introduced a regression on filesystems created with the bigalloc feature (cluster size > blocksize). It causes xfstests generic/006 and /013 to fail with an unexpected JBD2 failure and transaction abort that leaves the test file system in a read only state. Other xfstests run on bigalloc file systems are likely to fail as well. The cause is the accidental use of a cluster mask where a cluster offset was needed in ext4_ext_map_blocks(). Signed-off-by: Eric Whitney <enwlinux@gmail.com>
* ext4: add explicit casts when masking cluster sizesTheodore Ts'o2013-12-20
| | | | | | | | | | | | | | | The missing casts can cause the high 64-bits of the physical blocks to be lost. Set up new macros which allows us to make sure the right thing happen, even if at some point we end up supporting larger logical block numbers. Thanks to the Emese Revfy and the PaX security team for reporting this issue. Reported-by: PaX Team <pageexec@freemail.hu> Reported-by: Emese Revfy <re.emese@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: fix deadlock when writing in ENOSPC conditionsJan Kara2013-12-18
| | | | | | | | | | | | | | | | | | | Akira-san has been reporting rare deadlocks of his machine when running xfstests test 269 on ext4 filesystem. The problem turned out to be in ext4_da_reserve_metadata() and ext4_da_reserve_space() which called ext4_should_retry_alloc() while holding i_data_sem. Since ext4_should_retry_alloc() can force a transaction commit, this is a lock ordering violation and leads to deadlocks. Fix the problem by just removing the retry loops. These functions should just report ENOSPC to the caller (e.g. ext4_da_write_begin()) and that function must take care of retrying after dropping all necessary locks. Reported-and-tested-by: Akira Fujita <a-fujita@rs.jp.nec.com> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* jbd2: rename obsoleted msg JBD->JBD2Dmitry Monakhov2013-12-08
| | | | | | | | Rename performed via: perl -pi -e 's/JBD:/JBD2:/g' fs/jbd2/*.c Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
* jbd2: revise KERN_EMERG error messagesJan Kara2013-12-08
| | | | | | | | | | | Some of KERN_EMERG printk messages do not really deserve this log level and the one in log_wait_commit() is even rather useless (the journal has been previously aborted and *that* is where we should have been complaining). So make some messages just KERN_ERR and remove the useless message. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* jbd2: don't BUG but return ENOSPC if a handle runs out of spaceTheodore Ts'o2013-12-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | If a handle runs out of space, we currently stop the kernel with a BUG in jbd2_journal_dirty_metadata(). This makes it hard to figure out what might be going on. So return an error of ENOSPC, so we can let the file system layer figure out what is going on, to make it more likely we can get useful debugging information). This should make it easier to debug problems such as the one which was reported by: https://bugzilla.kernel.org/show_bug.cgi?id=44731 The only two callers of this function are ext4_handle_dirty_metadata() and ocfs2_journal_dirty(). The ocfs2 function will trigger a BUG_ON(), which means there will be no change in behavior. The ext4 function will call ext4_error_inode() which will print the useful debugging information and then handle the situation using ext4's error handling mechanisms (i.e., which might mean halting the kernel or remounting the file system read-only). Also, since both file systems already call WARN_ON(), drop the WARN_ON from jbd2_journal_dirty_metadata() to avoid two stack traces from being displayed. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: ocfs2-devel@oss.oracle.com Acked-by: Joel Becker <jlbec@evilplan.org>
* ext4: Do not reserve clusters when fs doesn't support extentsJan Kara2013-12-08
| | | | | | | | | | | | | | | | | | When the filesystem doesn't support extents (like in ext2/3 compatibility modes), there is no need to reserve any clusters. Space estimates for writing are exact, hole punching doesn't need new metadata, and there are no unwritten extents to convert. This fixes a problem when filesystem still having some free space when accessed with a native ext2/3 driver suddently reports ENOSPC when accessed with ext4 driver. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: fix del_timer() misuse for ->s_err_reportAl Viro2013-12-08
| | | | | | | | | | | | | | That thing should be del_timer_sync(); consider what happens if ext4_put_super() call of del_timer() happens to come just as it's getting run on another CPU. Since that timer reschedules itself to run next day, you are pretty much guaranteed that you'll end up with kfree'd scheduled timer, with usual fun consequences. AFAICS, that's -stable fodder all way back to 2010... [the second del_timer_sync() is almost certainly not needed, but it doesn't hurt either] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: check for overlapping extents in ext4_valid_extent_entries()Eryu Guan2013-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A corrupted ext4 may have out of order leaf extents, i.e. extent: lblk 0--1023, len 1024, pblk 9217, flags: LEAF UNINIT extent: lblk 1000--2047, len 1024, pblk 10241, flags: LEAF UNINIT ^^^^ overlap with previous extent Reading such extent could hit BUG_ON() in ext4_es_cache_extent(). BUG_ON(end < lblk); The problem is that __read_extent_tree_block() tries to cache holes as well but assumes 'lblk' is greater than 'prev' and passes underflowed length to ext4_es_cache_extent(). Fix it by checking for overlapping extents in ext4_valid_extent_entries(). I hit this when fuzz testing ext4, and am able to reproduce it by modifying the on-disk extent by hand. Also add the check for (ee_block + len - 1) in ext4_valid_extent() to make sure the value is not overflow. Ran xfstests on patched ext4 and no regression. Cc: Lukáš Czerner <lczerner@redhat.com> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: fix use-after-free in ext4_mb_new_blocksJunho Ryu2013-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ext4_mb_put_pa should hold pa->pa_lock before accessing pa->pa_count. While ext4_mb_use_preallocated checks pa->pa_deleted first and then increments pa->count later, ext4_mb_put_pa decrements pa->pa_count before holding pa->pa_lock and then sets pa->pa_deleted. * Free sequence ext4_mb_put_pa (1): atomic_dec_and_test pa->pa_count ext4_mb_put_pa (2): lock pa->pa_lock ext4_mb_put_pa (3): check pa->pa_deleted ext4_mb_put_pa (4): set pa->pa_deleted=1 ext4_mb_put_pa (5): unlock pa->pa_lock ext4_mb_put_pa (6): remove pa from a list ext4_mb_pa_callback: free pa * Use sequence ext4_mb_use_preallocated (1): iterate over preallocation ext4_mb_use_preallocated (2): lock pa->pa_lock ext4_mb_use_preallocated (3): check pa->pa_deleted ext4_mb_use_preallocated (4): increase pa->pa_count ext4_mb_use_preallocated (5): unlock pa->pa_lock ext4_mb_release_context: access pa * Use-after-free sequence [initial status] <pa->pa_deleted = 0, pa_count = 1> ext4_mb_use_preallocated (1): iterate over preallocation ext4_mb_use_preallocated (2): lock pa->pa_lock ext4_mb_use_preallocated (3): check pa->pa_deleted ext4_mb_put_pa (1): atomic_dec_and_test pa->pa_count [pa_count decremented] <pa->pa_deleted = 0, pa_count = 0> ext4_mb_use_preallocated (4): increase pa->pa_count [pa_count incremented] <pa->pa_deleted = 0, pa_count = 1> ext4_mb_use_preallocated (5): unlock pa->pa_lock ext4_mb_put_pa (2): lock pa->pa_lock ext4_mb_put_pa (3): check pa->pa_deleted ext4_mb_put_pa (4): set pa->pa_deleted=1 [race condition!] <pa->pa_deleted = 1, pa_count = 1> ext4_mb_put_pa (5): unlock pa->pa_lock ext4_mb_put_pa (6): remove pa from a list ext4_mb_pa_callback: free pa ext4_mb_release_context: access pa AddressSanitizer has detected use-after-free in ext4_mb_new_blocks Bug report: http://goo.gl/rG1On3 Signed-off-by: Junho Ryu <jayr@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: call ext4_error_inode() if jbd2_journal_dirty_metadata() failsTheodore Ts'o2013-12-02
| | | | | | | | | | | | | | | | | | | | | | | | While it's true that errors can only happen if there is a bug in jbd2_journal_dirty_metadata(), if a bug does happen, we need to halt the kernel or remount the file system read-only in order to avoid further data loss. The ext4_journal_abort_handle() function doesn't do any of this, and while it's likely that this call (since it doesn't adjust refcounts) will likely result in the file system eventually deadlocking since the current transaction will never be able to close, it's much cleaner to call let ext4's error handling system deal with this situation. There's a separate bug here which is that if certain jbd2 errors errors occur and file system is mounted errors=continue, the file system will probably eventually end grind to a halt as described above. But things have been this way in a long time, and usually when we have these sorts of errors it's pretty much a disaster --- and that's why the jbd2 layer aggressively retries memory allocations, which is the most likely cause of these jbd2 errors. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
* Linux 3.13-rc2Linus Torvalds2013-11-29
|
* Merge tag 'arm64-stable' of ↵Linus Torvalds2013-11-29
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull ARM64 fixes from Catalin Marinas: - Remove preempt_count modifications in the arm64 IRQ handling code since that's already dealt with in generic irq_enter/irq_exit - PTE_PROT_NONE bit moved higher up to avoid overlapping with the hardware bits (for PROT_NONE mappings which are pte_present) - Big-endian fixes for ptrace support - Asynchronous aborts unmasking while in the kernel - pgprot_writecombine() change to create Normal NonCacheable memory rather than Device GRE * tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: arm64: Move PTE_PROT_NONE higher up arm64: Use Normal NonCacheable memory for writecombine arm64: debug: make aarch32 bkpt checking endian clean arm64: ptrace: fix compat registes get/set to be endian clean arm64: Unmask asynchronous aborts when in kernel mode arm64: dts: Reserve the memory used for secondary CPU release address arm64: let the core code deal with preempt_count
| * arm64: Move PTE_PROT_NONE higher upCatalin Marinas2013-11-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PTE_PROT_NONE means that a pte is present but does not have any read/write attributes. However, setting the memory type like pgprot_writecombine() is allowed and such bits overlap with PTE_PROT_NONE. This causes mmap/munmap issues in drivers that change the vma->vm_pg_prot on PROT_NONE mappings. This patch reverts the PTE_FILE/PTE_PROT_NONE shift in commit 59911ca4325d (ARM64: mm: Move PTE_PROT_NONE bit) and moves PTE_PROT_NONE together with the other software bits. Signed-off-by: Steve Capper <steve.capper@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> # 3.11+
| * arm64: Use Normal NonCacheable memory for writecombineCatalin Marinas2013-11-29
| | | | | | | | | | | | | | | | This provides better performance compared to Device GRE and also allows unaligned accesses. Such memory is intended to be used with standard RAM (e.g. framebuffers) and not I/O. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * arm64: debug: make aarch32 bkpt checking endian cleanMatthew Leach2013-11-28
| | | | | | | | | | | | | | | | | | | | The current breakpoint instruction checking code for A32 is not endian clean. Fix this with appropriate byte-swapping when retrieving instructions. Signed-off-by: Matthew Leach <matthew.leach@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * arm64: ptrace: fix compat registes get/set to be endian cleanMatthew Leach2013-11-28
| | | | | | | | | | | | | | | | | | | | | | | | | | On a BE system the wrong half of the X registers is retrieved/written when attempting to get/set the value of aarch32 registers through ptrace. Ensure that types are the correct width so that the relevant casting occurs. Signed-off-by: Matthew Leach <matthew.leach@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * arm64: Unmask asynchronous aborts when in kernel modeCatalin Marinas2013-11-25
| | | | | | | | | | | | | | | | | | The asynchronous aborts are generally fatal for the kernel but they can be masked via the pstate A bit. If a system error happens while in kernel mode, it won't be visible until returning to user space. This patch enables this kind of abort early to help identifying the cause. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * arm64: dts: Reserve the memory used for secondary CPU release addressCatalin Marinas2013-11-25
| | | | | | | | | | | | | | | | | | With the spin-table SMP booting method, secondary CPUs poll a location passed in the DT. The foundation-v8.dts file doesn't have this memory reserved and there is a risk of Linux using it before secondary CPUs are started. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * arm64: let the core code deal with preempt_countMarc Zyngier2013-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f27dde8deef3 (sched: Add NEED_RESCHED to the preempt_count) introduced the use of bit 31 in preempt_count for obscure scheduling purposes. This causes interrupts taken from EL0 to hit the (open coded) BUG when this flag is flipped while handling the interrupt (we compare the values before and after, and kill the kernel if they are different). The fix is to stop messing with the preempt count entirely, as this is already being dealt with in the generic code (irq_enter/irq_exit). Tested on a dual A53 FPGA running cyclictest. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2013-11-29
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "One performance improvement and a few bug fixes. Two of the fixes deal with the clock related problems we have seen on recent kernels" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: handle asce-type exceptions as normal page fault s390,time: revert direct ktime path for s390 clockevent device s390/time,vdso: convert to the new update_vsyscall interface s390/uaccess: add missing page table walk range check s390/mm: optimize copy_page s390/dasd: validate request size before building CCW/TCW request s390/signal: always restore saved runtime instrumentation psw bit
| * | s390/mm: handle asce-type exceptions as normal page faultMartin Schwidefsky2013-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Git commit 9e34f2686bb088b211b6cac8772e1f644c6180f8 "s390/mm,tlb: tlb flush on page table upgrade fixup" removed the exception handler for the asce-type exception. This is incorrect as the user-copy with MVCOS can cause asce-type exceptions in the kernel if a user pointer is too large. Those need to be handled with do_no_context to branch to the fixup in the user-copy code. The simplest fix for this problem is to call do_dat_exception for asce-type excpetions, as there is no vma for the address the code will handle the exception correctly. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390,time: revert direct ktime path for s390 clockevent deviceMartin Schwidefsky2013-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Git commit 4f37a68cdaf6dea833cfdded2a3e0c47c0f006da "s390: Use direct ktime path for s390 clockevent device" makes use of the CLOCK_EVT_FEAT_KTIME clockevent option to avoid the delta calculation with ktime_get() in clockevents_program_event and the get_tod_clock() in s390_next_event. This is based on the assumption that the difference between the internal ktime and the hardware clock is reflected in the wall_to_monotonic delta. But this is not true, the ntp corrections are applied via changes to the tk->mult multiplier and this is not reflected in wall_to_monotonic. In theory this could be solved by using the raw monotonic clock but it is simpler to switch back to the standard clock delta calculation. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/time,vdso: convert to the new update_vsyscall interfaceMartin Schwidefsky2013-11-25
| | | | | | | | | | | | | | | | | | | | | Switch to the improved update_vsyscall interface that provides sub-nanosecond precision for gettimeofday and clock_gettime. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/uaccess: add missing page table walk range checkHeiko Carstens2013-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When translating a user space address, the address must be checked against the ASCE limit of the process. If the address is larger than the maximum address that is reachable with the ASCE, an ASCE type exception must be generated. The current code simply ignored the higher order bits. This resulted in an address wrap around in user space instead of an exception in user space. Cc: stable@vger.kernel.org # v3.9+ Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | s390/mm: optimize copy_pageHeiko Carstens2013-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Always use the mvcl instruction to copy a page instead of mvpg or a couple of mvc instructions. Copying a huge page is 25% faster this way. Also bypass caches when copying pages since only parts of a page will be used afterwards. Especially when copying a huge page this would kick everything out of the L1 and L2 data caches on a zEC12 machine. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
| * | s390/dasd: validate request size before building CCW/TCW requestStefan Weinhuber2013-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An I/O request that does not read or write full blocks cannot be translated into a correct CCW or TCW program and should be rejected right away. In particular the code that creates TCW requests will not notice this problem and create broken TCWs that will be rejected by the hardware. Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Reference-ID: RQM1956
| * | s390/signal: always restore saved runtime instrumentation psw bitHendrik Brueckner2013-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit "s390: fix handling of runtime instrumentation psw bit" (5ebf250dab) changed the behavior of setting the runtime instrumentation psw bit. This commit restores the original logic: 1. When returning from the signal handler, the runtime instrumentation psw bit is restored to its saved state. 2. If the runtime instrumentation psw bit is enabled during the signal handler, it is always turned off when leaving the signal handler. The saved state is restored as described in 1. That also implies that turning on runtime instrumentation in the signal handler is only effective while running in the signal context. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
* | | Merge branch 'i2c/for-current' of ↵Linus Torvalds2013-11-29
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Some easy but needed fixes for i2c drivers since rc1" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: bcm2835: Linking platform nodes to adapter nodes i2c: omap: raw read and write endian fix i2c: i2c-bcm-kona: Fix module build i2c: i2c-diolan-u2c: different usb endpoints for DLN-2-U2C i2c: bcm-kona: remove duplicated include i2c: davinci: raw read and write endian fix
| * | | i2c: bcm2835: Linking platform nodes to adapter nodesFlorian Meier2013-11-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to find I2C devices in the device tree, the platform nodes have to be known by the I2C core. This requires setting the dev.of_node parameter of the adapter. Signed-off-by: Florian Meier <florian.meier@koalo.de> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * | | i2c: omap: raw read and write endian fixVictor Kamensky2013-11-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All OMAP IP blocks expect LE data, but CPU may operate in BE mode. Need to use endian neutral functions to read/write h/w registers. I.e instead of __raw_read[lw] and __raw_write[lw] functions code need to use read[lw]_relaxed and write[lw]_relaxed functions. If the first simply reads/writes register, the second will byteswap it if host operates in BE mode. Changes are trivial sed like replacement of __raw_xxx functions with xxx_relaxed variant. Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org> Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * | | i2c: i2c-bcm-kona: Fix module buildTim Kryger2013-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct a typo that prevented the driver from being built as a module. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Tim Kryger <tim.kryger@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * | | i2c: i2c-diolan-u2c: different usb endpoints for DLN-2-U2CMartin Vogt2013-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous diolan adapter uses other out/in endpoints than the current DLN-2-U2C in compatibility mode. They changed from 0x2/0x84 to 0x3/0x83. This patch gets the endpoints from the usb interface, instead of hardcode them in the driver. This was tested on a current DLN-2-U2C board. Signed-off-by: Martin Vogt <mvogt1@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * | | i2c: bcm-kona: remove duplicated includeWei Yongjun2013-11-26
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Markus Mayer <markus.mayer@linaro.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
| * | | i2c: davinci: raw read and write endian fixTaras Kondratiuk2013-11-26
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I2C IP block expect LE data, but CPU may operate in BE mode. Need to use endian neutral functions to read/write h/w registers. I.e instead of __raw_read[lw] and __raw_write[lw] functions code need to use read[lw]_relaxed and write[lw]_relaxed functions. If the first simply reads/writes register, the second will byteswap it if host operates in BE mode. Changes are trivial sed like replacement of __raw_xxx functions with xxx_relaxed variant. Signed-off-by: Taras Kondratiuk <taras.kondratiuk@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
* | | Merge branch 'for-3.13-fixes' of ↵Linus Torvalds2013-11-29
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: "This contains one important fix. The NUMA support added a while back broke ordering guarantees on ordered workqueues. It was enforced by having single frontend interface with @max_active == 1 but the NUMA support puts multiple interfaces on unbound workqueues on NUMA machines thus breaking the ordered guarantee. This is fixed by disabling NUMA support on ordered workqueues. The above and a couple other patches were sitting in for-3.12-fixes but I forgot to push that out, so they ended up waiting a bit too long. My aplogies. Other fixes are minor" * 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues workqueue: fix comment typo for __queue_work() workqueue: fix ordered workqueues in NUMA setups workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY
| * | | workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in ↵Li Bin2013-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | init_workqueues When one work starts execution, the high bits of work's data contain pool ID. It can represent a maximum of WORK_OFFQ_POOL_NONE. Pool ID is assigned WORK_OFFQ_POOL_NONE when the work being initialized indicating that no pool is associated and get_work_pool() uses it to check the associated pool. So if worker_pool_assign_id() assigns a ID greater than or equal WORK_OFFQ_POOL_NONE to a pool, it triggers leakage, and it may break the non-reentrance guarantee. This patch fix this issue by modifying the worker_pool_assign_id() function calling idr_alloc() by setting @end param WORK_OFFQ_POOL_NONE. Furthermore, in the current implementation, the BUILD_BUG_ON() in init_workqueues makes no sense. The number of worker pools needed cannot be determined at compile time, because the number of backing pools for UNBOUND workqueues is dynamic based on the assigned custom attributes. So remove it. tj: Minor comment and indentation updates. Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | workqueue: fix comment typo for __queue_work()Li Bin2013-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems the "dying" should be "draining" here. Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | workqueue: fix ordered workqueues in NUMA setupsTejun Heo2013-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An ordered workqueue implements execution ordering by using single pool_workqueue with max_active == 1. On a given pool_workqueue, work items are processed in FIFO order and limiting max_active to 1 enforces the queued work items to be processed one by one. Unfortunately, 4c16bd327c ("workqueue: implement NUMA affinity for unbound workqueues") accidentally broke this guarantee by applying NUMA affinity to ordered workqueues too. On NUMA setups, an ordered workqueue would end up with separate pool_workqueues for different nodes. Each pool_workqueue still limits max_active to 1 but multiple work items may be executed concurrently and out of order depending on which node they are queued to. Fix it by using dedicated ordered_wq_attrs[] when creating ordered workqueues. The new attrs match the unbound ones except that no_numa is always set thus forcing all NUMA nodes to share the default pool_workqueue. While at it, add sanity check in workqueue creation path which verifies that an ordered workqueues has only the default pool_workqueue. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Libin <huawei.libin@huawei.com> Cc: stable@vger.kernel.org Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
| * | | workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITYOleg Nesterov2013-11-22
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | Move the setting of PF_NO_SETAFFINITY up before set_cpus_allowed() in create_worker(). Otherwise userland can change ->cpus_allowed in between. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | | Merge branch 'for-3.13-fixes' of ↵Linus Torvalds2013-11-29
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "libata device removal path was removing parent device node before its child, which is mostly harmless but triggers warning after recent sysfs changes. Rafael's patch fixes the order. Other than that, minor controller-specific fixes and device ID additions" * 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ATA: Fix port removal ordering ahci: add Marvell 9230 to the AHCI PCI device list ata: fix acpi_bus_get_device() return value check pata_arasan_cf: add missing clk_disable_unprepare() on error path ahci: add support for IBM Akebono platform device