aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* f2fs: reuse the locked dnode page and its inodeJaegeuk Kim2013-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following deadlock bug during the recovery. INFO: task mount:1322 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mount D ffffffff81125870 0 1322 1266 0x00000000 ffff8801207e39d8 0000000000000046 ffff88012ab1dee0 0000000000000046 ffff8801207e3a08 ffff880115903f40 ffff8801207e3fd8 ffff8801207e3fd8 ffff8801207e3fd8 ffff880115903f40 ffff8801207e39d8 ffff88012fc94520 Call Trace: [<ffffffff81125870>] ? __lock_page+0x70/0x70 [<ffffffff816a92d9>] schedule+0x29/0x70 [<ffffffff816a93af>] io_schedule+0x8f/0xd0 [<ffffffff8112587e>] sleep_on_page+0xe/0x20 [<ffffffff816a649a>] __wait_on_bit_lock+0x5a/0xc0 [<ffffffff81125867>] __lock_page+0x67/0x70 [<ffffffff8106c7b0>] ? autoremove_wake_function+0x40/0x40 [<ffffffff81126857>] find_lock_page+0x67/0x80 [<ffffffff8112698f>] find_or_create_page+0x3f/0xb0 [<ffffffffa03901a8>] ? sync_inode_page+0xa8/0xd0 [f2fs] [<ffffffffa038fdf7>] get_node_page+0x67/0x180 [f2fs] [<ffffffffa039818b>] recover_fsync_data+0xacb/0xff0 [f2fs] [<ffffffff816aaa1e>] ? _raw_spin_unlock+0x3e/0x40 [<ffffffffa0389634>] f2fs_fill_super+0x7d4/0x850 [f2fs] [<ffffffff81184cf9>] mount_bdev+0x1c9/0x210 [<ffffffffa0388e60>] ? validate_superblock+0x180/0x180 [f2fs] [<ffffffffa0387635>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffff81185a13>] mount_fs+0x43/0x1b0 [<ffffffff81145ba0>] ? __alloc_percpu+0x10/0x20 [<ffffffff811a0796>] vfs_kern_mount+0x76/0x120 [<ffffffff811a2cb7>] do_mount+0x237/0xa10 [<ffffffff81140b9b>] ? strndup_user+0x5b/0x80 [<ffffffff811a3520>] SyS_mount+0x90/0xe0 [<ffffffff816b3502>] system_call_fastpath+0x16/0x1b The bug is triggered when check_index_in_prev_nodes tries to get the direct node page by calling get_node_page. At this point, if the direct node page is already locked by get_dnode_of_data, its caller, we got a deadlock condition. This patch adds additional condition check for the reuse of locked direct node pages prior to the get_node_page call. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix wrong condition checkJaegeuk Kim2013-05-28
| | | | | | | | | | | | | | | | While an orphan inode has zero link_count, f2fs_gc is able to select the inode for foreground gc. - f2fs_gc - do_garbage_collect - gc_data_segment : f2fs_iget is failed : get_valid_blocks() != 0, so that retry --> here we got the infinite loop. This patch resolved this issue. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add f2fs_readonly()Jaegeuk Kim2013-05-28
| | | | | | Introduce a simple macro function for readability. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid RECLAIM_FS-ON-W: deadlockJaegeuk Kim2013-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch tries to avoid the following deadlock condition of which the reclaim path can trigger f2fs_balance_fs again. ================================= [ INFO: inconsistent lock state ] --------------------------------- inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. kswapd0/41 [HC0[0]:SC0[0]:HE1:SE1] takes: (&sbi->gc_mutex){+.+.?.}, at: f2fs_balance_fs+0xe6/0x100 [f2fs] {RECLAIM_FS-ON-W} state was registered at: [<ffffffff810aa5a9>] mark_held_locks+0xb9/0x140 [<ffffffff810aae85>] lockdep_trace_alloc+0x85/0xf0 [<ffffffff8113ab2c>] __alloc_pages_nodemask+0x7c/0x9b0 [<ffffffff81175aa8>] alloc_pages_current+0xb8/0x180 [<ffffffff811319cf>] __page_cache_alloc+0xaf/0xd0 [<ffffffff8113225c>] find_or_create_page+0x4c/0xb0 [<ffffffffa021359e>] find_data_page+0x14e/0x210 [f2fs] [<ffffffffa021161b>] f2fs_gc+0x9eb/0xd90 [f2fs] [<ffffffffa0218fae>] f2fs_balance_fs+0xee/0x100 [f2fs] [<ffffffffa020848c>] f2fs_setattr+0x6c/0x200 [f2fs] [<ffffffff811ae51b>] notify_change+0x1db/0x3a0 [<ffffffff8118fbd0>] do_truncate+0x60/0xa0 [<ffffffff8118fd95>] vfs_truncate+0x185/0x1b0 [<ffffffff8118fe1c>] do_sys_truncate+0x5c/0xa0 [<ffffffff8118ffee>] SyS_truncate+0xe/0x10 [<ffffffff816e2b42>] system_call_fastpath+0x16/0x1b Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: don't do checkpoint if error is occurredJaegeuk Kim2013-05-28
| | | | | | | | If we met an error during the dentry recovery, we should not conduct checkpoint. Otherwise, some errorneous dentry blocks overwrites the existing blocks that contain the remaining recovery information. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix to unlock page before exitJaegeuk Kim2013-05-28
| | | | | | If we got an error after lock_page, we should unlock it before exit. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove unnecessary kmap/kunmap operationsJaegeuk Kim2013-05-28
| | | | | | | The allocated page used by the recovery is not on HIGHMEM, so that we don't need to use kmap/kunmap. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: reorganize f2fs_vm_page_mkwriteNamjae Jeon2013-05-28
| | | | | | | | | | | | Few things can be changed in the default mkwrite function 1) Make file_update_time at the start before acquiring any lock 2) the condition page_offset(page) >= i_size_read(inode) should be changed to page_offset(page) > i_size_read 3) Move wait_on_page_writeback. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: use list_for_each_entry rather than list_for_each_entry_safemajianpeng2013-05-28
| | | | | | | | | We can do this, since now we use a global mutex, f2fs_stat_mutex to protect its list operations. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> [Jaegeuk Kim: add description] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove unecessary variable and codeHaicheng Li2013-05-28
| | | | | | | Code cleanup without behavior changed. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs, lockdep: annotate mutex_lock_all()Peter Zijlstra2013-05-28
| | | | | | | | | | | | | Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all() acquires an array of locks (in global/local lock style). Any such operation is always serialized using cp_mutex, therefore there is no fs_lock[] lock-order issue; tell lockdep about this using the mutex_lock_nest_lock() primitive. Reported-by: majianpeng <majianpeng@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add debug msgs in the recovery routineJaegeuk Kim2013-05-28
| | | | | | This patch adds some trivial debugging messages in the recovery process. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: update inode page after creationJaegeuk Kim2013-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I found a bug when testing power-off-recovery as follows. [Bug Scenario] 1. create a file 2. fsync the file 3. reboot w/o any sync 4. try to recover the file - found its fsync mark - found its dentry mark : try to recover its dentry - get its file name - get its parent inode number : here we got zero value The reason why we get the wrong parent inode number is that we didn't synchronize the inode page with its newly created inode information perfectly. Especially, previous f2fs stores fi->i_pino and writes it to the cached node page in a wrong order, which incurs the zero-valued i_pino during the recovery. So, this patch modifies the creation flow to fix the synchronization order of inode page with its inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: change get_new_data_page to pass a locked node pageJaegeuk Kim2013-05-28
| | | | | | This patch is for passing a locked node page to get_dnode_of_data. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: skip get_node_page if locked node page is passedJaegeuk Kim2013-05-28
| | | | | | | | If get_dnode_of_data gets a locked node page, let's skip redundant get_node_page calls. This is for the futher enhancement. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove unnecessary por_doing checkJaegeuk Kim2013-05-28
| | | | | | This por_doing check is totally not related to the recovery process. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix BUG_ON during f2fs_evict_inode(dir)Jaegeuk Kim2013-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the dentry recovery routine, recover_inode() triggers __f2fs_add_link with its directory inode. In the following scenario, a bug is captured. 1. dir = f2fs_iget(pino) 2. __f2fs_add_link(dir, name) 3. iput(dir) -> f2fs_evict_inode() faces with BUG_ON(atomic_read(fi->dirty_dents)) Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable] [<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs] Call Trace: [<ffffffff8118ea00>] evict+0xb0/0x1b0 [<ffffffff8118f1c5>] iput+0x105/0x190 [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs] [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0 [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0 [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70 [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140 [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0 [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30 [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70 This means that we should flush all the dentry pages between iget and iput(). But, during the recovery routine, it is unallowed due to consistency, so we have to wait the whole recovery process. And then, write_checkpoint flushes all the dirty dentry blocks, and nicely we can put the stale dir inodes from the dirty_dir_inode_list. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix por_doing variable coverageJaegeuk Kim2013-05-28
| | | | | | | | | The reason of using sbi->por_doing is to alleviate data writes during the recovery. The find_fsync_dnodes() produces some dirty dentry pages, so we should cover it too with sbi->por_doing. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove redundant assignmentJaegeuk Kim2013-05-28
| | | | | | We don't need to assign a value redundantly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix the inconsistent state of data pagesJaegeuk Kim2013-05-28
| | | | | | | | | | | | | | | | | | | | | | | In get_lock_data_page, if there is a data race between get_dnode_of_data for node and grab_cache_page for data, f2fs is able to face with the following BUG_ON(dn.data_blkaddr == NEW_ADDR). kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/data.c:251! [<ffffffffa044966c>] get_lock_data_page+0x1ec/0x210 [f2fs] Call Trace: [<ffffffffa043b089>] f2fs_readdir+0x89/0x210 [f2fs] [<ffffffff811a0920>] ? fillonedir+0x100/0x100 [<ffffffff811a0920>] ? fillonedir+0x100/0x100 [<ffffffff811a07f8>] vfs_readdir+0xb8/0xe0 [<ffffffff811a0b4f>] sys_getdents+0x8f/0x110 [<ffffffff816d7999>] system_call_fastpath+0x16/0x1b This bug is able to be occurred when the block address of the data block is changed after f2fs_put_dnode(). In order to avoid that, this patch fixes the lock order of node and data blocks in which the node block lock is covered by the data block lock. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix inconsistency of block count during recoveryJaegeuk Kim2013-05-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently f2fs recovers the dentry of fsynced files. When power-off-recovery is conducted, this newly recovered inode should increase node block count as well as inode block count. This patch resolves this inconsistency that results in: 1. create a file 2. write data 3. fsync 4. reboot without sync 5. mount and recover the file 6. node block count is 1 and inode block count is 2 : fall into the inconsistent state 7. unlink the file : trigger the following BUG_ON ------------[ cut here ]------------ kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/f2fs.h:716! Call Trace: [<ffffffffa0344100>] ? get_node_page+0x50/0x1a0 [f2fs] [<ffffffffa0344bfc>] remove_inode_page+0x8c/0x100 [f2fs] [<ffffffffa03380f0>] ? f2fs_evict_inode+0x180/0x2d0 [f2fs] [<ffffffffa033812e>] f2fs_evict_inode+0x1be/0x2d0 [f2fs] [<ffffffff811c7a67>] evict+0xa7/0x1a0 [<ffffffff811c82b5>] iput+0x105/0x190 [<ffffffff811c2b30>] d_kill+0xe0/0x120 [<ffffffff811c2c57>] dput+0xe7/0x1e0 [<ffffffff811acc3d>] __fput+0x19d/0x2d0 [<ffffffff811acd7e>] ____fput+0xe/0x10 [<ffffffff81070645>] task_work_run+0xb5/0xe0 [<ffffffff81002941>] do_notify_resume+0x71/0xb0 [<ffffffff8175f14a>] int_signal+0x12/0x17 Reported-and-Tested-by: Chris Fries <C.Fries@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* Linux 3.10-rc3Linus Torvalds2013-05-26
|
* ipc/sem.c: Fix missing wakeups in do_smart_update_queue()Manfred Spraul2013-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | do_smart_update_queue() is called when an operation (semop, semctl(SETVAL), semctl(SETALL), ...) modified the array. It must check which of the sleeping tasks can proceed. do_smart_update_queue() missed a few wakeups: - if a sleeping complex op was completed, then all per-semaphore queues must be scanned - not only those that were modified by *sops - if a sleeping simple op proceeded, then the global queue must be scanned again And: - the test for "|sops == NULL) before scanning the global queue is not required: If the global queue is empty, then it doesn't need to be scanned - regardless of the reason for calling do_smart_update_queue() The patch is not optimized, i.e. even completing a wait-for-zero operation causes a rescan. This is done to keep the patch as simple as possible. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2013-05-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: - Stable fix to prevent an rpc_task wakeup race - Fix a NFSv4.1 session drain deadlock - Fix a NFSv4/v4.1 mount regression when not running rpc.gssd - Ensure auth_gss pipe detection works in namespaces - Fix SETCLIENTID fallback if rpcsec_gss is not available * tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: Fix SETCLIENTID fallback if GSS is not available SUNRPC: Prevent an rpc_task wakeup race NFSv4.1 Fix a pNFS session draining deadlock SUNRPC: Convert auth_gss pipe detection to work in namespaces SUNRPC: Faster detection if gssd is actually running SUNRPC: Fix a bug in gss_create_upcall
| * NFS: Fix SETCLIENTID fallback if GSS is not availableChuck Lever2013-05-23
| | | | | | | | | | | | | | | | | | Commit 79d852bf "NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE" did not take into account commit 23631227 "NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available". Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * SUNRPC: Prevent an rpc_task wakeup raceTrond Myklebust2013-05-22
| | | | | | | | | | | | | | | | | | | | | | The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to be careful about ordering the calls to rpc_test_and_set_running(task) and rpc_clear_queued(task). If we get the order wrong, then we may end up testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped and changed the state of the rpc_task. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
| * NFSv4.1 Fix a pNFS session draining deadlockAndy Adamson2013-05-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a CB_RECALL the callback service thread flushes the inode using filemap_flush prior to scheduling the state manager thread to return the delegation. When pNFS is used and I/O has not yet gone to the data server servicing the inode, a LAYOUTGET can preceed the I/O. Unlike the async filemap_flush call, the LAYOUTGET must proceed to completion. If the state manager starts to recover data while the inode flush is sending the LAYOUTGET, a deadlock occurs as the callback service thread holds the single callback session slot until the flushing is done which blocks the state manager thread, and the state manager thread has set the session draining bit which puts the inode flush LAYOUTGET RPC to sleep on the forechannel slot table waitq. Separate the draining of the back channel from the draining of the fore channel by moving the NFS4_SESSION_DRAINING bit from session scope into the fore and back slot tables. Drain the back channel first allowing the LAYOUTGET call to proceed (and fail) so the callback service thread frees the callback slot. Then proceed with draining the forechannel. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * SUNRPC: Convert auth_gss pipe detection to work in namespacesTrond Myklebust2013-05-16
| | | | | | | | | | | | | | | | | | This seems to have been overlooked when we did the namespace conversion. If a container is running a legacy version of rpc.gssd then it will be disrupted if the global 'pipe_version' is set by a container running the new version of rpc.gssd. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * SUNRPC: Faster detection if gssd is actually runningTrond Myklebust2013-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | Recent changes to the NFS security flavour negotiation mean that we have a stronger dependency on rpc.gssd. If the latter is not running, because the user failed to start it, then we time out and mark the container as not having an instance. We then use that information to time out faster the next time. If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe, then we mark the container as having an rpc.gssd instance. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * SUNRPC: Fix a bug in gss_create_upcallTrond Myklebust2013-05-15
| | | | | | | | | | | | | | | | If wait_event_interruptible_timeout() is successful, it returns the number of seconds remaining until the timeout. In that case, we should be retrying the upcall. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* | Merge tag 'edac_fixes_for_3.10' of ↵Linus Torvalds2013-05-26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull amd64 edac fix from Borislav Petkov: "A sysfs file permissions correction" * tag 'edac_fixes_for_3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: amd64_edac: Fix bogus sysfs file permissions
| * | amd64_edac: Fix bogus sysfs file permissionsBorislav Petkov2013-05-21
| |/ | | | | | | | | | | | | Fix yet another issue caught by 8f46baaa7ec6c ("base: core: WARN() about bogus permissions on device attributes"). Signed-off-by: Borislav Petkov <bp@suse.de>
* | Merge branch 'parisc-for-3.10' of ↵Linus Torvalds2013-05-26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "This time we made the kernel- and interruption stack allocation reentrant which fixed some strange kernel crashes (specifically protection ID traps). Furthemore this patchset fixes the interrupt stack in UP and SMP configurations by using native locking instructions. And finally usage of floating point calculations on parisc were disabled in the MPILIB." * 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: fix irq stack on UP and SMP parisc/superio: Use module_pci_driver to register driver parisc: make interrupt and interruption stack allocation reentrant parisc: show number of FPE and unaligned access handler calls in /proc/interrupts parisc: add additional parisc git tree to MAINTAINERS file parisc: use PAGE_SHIFT instead of hardcoded value 12 in pacache.S parisc: add rp5470 entry to machine database MPILIB: disable usage of floating point registers on parisc
| * | parisc: fix irq stack on UP and SMPHelge Deller2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The logic to detect if the irq stack was already in use with raw_spin_trylock() is wrong, because it will generate a "trylock failure on UP" error message with CONFIG_SMP=n and CONFIG_DEBUG_SPINLOCK=y. arch_spin_trylock() can't be used either since in the CONFIG_SMP=n case no atomic protection is given and we are reentrant here. A mutex didn't worked either and brings more overhead by turning off interrupts. So, let's use the fastest path for parisc which is the ldcw instruction. Counting how often the irq stack was used is pretty useless, so just drop this piece of code. Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc/superio: Use module_pci_driver to register driverPeter Huewe2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | Removing some boilerplate by using module_pci_driver instead of calling register and unregister in the otherwise empty init/exit functions. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc: make interrupt and interruption stack allocation reentrantJohn David Anglin2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The get_stack_use_cr30 and get_stack_use_r30 macros allocate a stack frame for external interrupts and interruptions requiring a stack frame. They are currently not reentrant in that they save register context before the stack is set or adjusted. I have observed a number of system crashes where there was clear evidence of stack corruption during interrupt processing, and as a result register corruption. Some interruptions can still occur during interruption processing, however external interrupts are disabled and data TLB misses don't occur for absolute accesses. So, it's not entirely clear what triggers this issue. Also, if an interruption occurs when Q=0, it is generally not possible to recover as the shadowed registers are not copied. The attached patch reworks the get_stack_use_cr30 and get_stack_use_r30 macros to allocate stack before doing register saves. The new code is a couple of instructions shorter than the old implementation. Thus, it's an improvement even if it doesn't fully resolve the stack corruption issue. Based on limited testing, it improves SMP system stability. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc: show number of FPE and unaligned access handler calls in ↵Helge Deller2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | /proc/interrupts Show number of floating point assistant and unaligned access fixup handler in /proc/interrupts file. Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc: add additional parisc git tree to MAINTAINERS fileHelge Deller2013-05-24
| | | | | | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc: use PAGE_SHIFT instead of hardcoded value 12 in pacache.SHelge Deller2013-05-24
| | | | | | | | | | | | | | | | | | additionally clean up some whitespaces & tabs. Signed-off-by: Helge Deller <deller@gmx.de>
| * | parisc: add rp5470 entry to machine databaseHelge Deller2013-05-24
| | | | | | | | | | | | Signed-off-by: Helge Deller <deller@gmx.de>
| * | MPILIB: disable usage of floating point registers on pariscHelge Deller2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The umul_ppmm() macro for parisc uses the xmpyu assembler statement which does calculation via a floating point register. But usage of floating point registers inside the Linux kernel are not allowed and gcc will stop compilation due to the -mdisable-fpregs compiler option. Fix this by disabling the umul_ppmm() and udiv_qrnnd() macros. The mpilib will then use the generic built-in implementations instead. Signed-off-by: Helge Deller <deller@gmx.de>
* | | Merge tag 'for-linus-v3.10-rc3' of git://oss.sgi.com/xfs/xfsLinus Torvalds2013-05-26
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull xfs fixes from Ben Myers: "Here are fixes for corruption on 512 byte filesystems, a rounding error, a use-after-free, some flags to fix lockdep reports, and several fixes related to CRCs. We have a somewhat larger post -rc1 queue than usual due to fixes related to the CRC feature we merged for 3.10: - Fix for corruption with FSX on 512 byte blocksize filesystems - Fix rounding error in xfs_free_file_space - Fix use-after-free with extent free intents - Add several missing KM_NOFS flags to fix lockdep reports - Several fixes for CRC related code" * tag 'for-linus-v3.10-rc3' of git://oss.sgi.com/xfs/xfs: xfs: remote attribute lookups require the value length xfs: xfs_attr_shortform_allfit() does not handle attr3 format. xfs: xfs_da3_node_read_verify() doesn't handle XFS_ATTR3_LEAF_MAGIC xfs: fix missing KM_NOFS tags to keep lockdep happy xfs: Don't reference the EFI after it is freed xfs: fix rounding in xfs_free_file_space xfs: fix sub-page blocksize data integrity writes
| * | | xfs: remote attribute lookups require the value lengthDave Chinner2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading a remote attribute, to correctly calculate the length of the data buffer for CRC enable filesystems, we need to know the length of the attribute data. We get this information when we look up the attribute, but we don't store it in the args structure along with the other remote attr information we get from the lookup. Add this information to the args structure so we can use it appropriately. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit e461fcb194172b3f709e0b478d2ac1bdac7ab9a3)
| * | | xfs: xfs_attr_shortform_allfit() does not handle attr3 format.Dave Chinner2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfstests generic/117 fails with: XFS: Assertion failed: leaf->hdr.info.magic == cpu_to_be16(XFS_ATTR_LEAF_MAGIC) indicating a function that does not handle the attr3 format correctly. Fix it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit b38958d715316031fe9ea0cc6c22043072a55f49)
| * | | xfs: xfs_da3_node_read_verify() doesn't handle XFS_ATTR3_LEAF_MAGICDave Chinner2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 72916fb8cbcf0c2928f56cdc2fbe8c7bf5517758)
| * | | xfs: fix missing KM_NOFS tags to keep lockdep happyDave Chinner2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are several places where we use KM_SLEEP allocation contexts and use the fact that they are called from transaction context to add KM_NOFS where appropriate. Unfortunately, there are several places where the code makes this assumption but can be called from outside transaction context but with filesystem locks held. These places need explicit KM_NOFS annotations to avoid lockdep complaining about reclaim contexts. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit ac14876cf9255175bf3bdad645bf8aa2b8fb2d7c)
| * | | xfs: Don't reference the EFI after it is freedDave Chinner2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checking the EFI for whether it is being released from recovery after we've already released the known active reference is a mistake worthy of a brown paper bag. Fix the (now) obvious use after free that it can cause. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 52c24ad39ff02d7bd73c92eb0c926fb44984a41d)
| * | | xfs: fix rounding in xfs_free_file_spaceDave Chinner2013-05-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The offset passed into xfs_free_file_space() needs to be rounded down to a certain size, but the rounding mask is built by a 32 bit variable. Hence the mask will always mask off the upper 32 bits of the offset and lead to incorrect writeback and invalidation ranges. This is not actually exposed as a bug because we writeback and invalidate from the rounded offset to the end of the file, and hence the offset we are actually punching a hole out of will always be covered by the code. This needs fixing, however, if we ever want to use exact ranges for writeback/invalidation here... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 28ca489c63e9aceed8801d2f82d731b3c9aa50f5)
| * | | xfs: fix sub-page blocksize data integrity writesDave Chinner2013-05-24
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | FSX on 512 byte block size filesystems has been failing for some time with corrupted data. The fault dates back to the change in the writeback data integrity algorithm that uses a mark-and-sweep approach to avoid data writeback livelocks. Unfortunately, a side effect of this mark-and-sweep approach is that each page will only be written once for a data integrity sync, and there is a condition in writeback in XFS where a page may require two writeback attempts to be fully written. As a result of the high level change, we now only get a partial page writeback during the integrity sync because the first pass through writeback clears the mark left on the page index to tell writeback that the page needs writeback.... The cause is writing a partial page in the clustering code. This can happen when a mapping boundary falls in the middle of a page - we end up writing back the first part of the page that the mapping covers, but then never revisit the page to have the remainder mapped and written. The fix is simple - if the mapping boundary falls inside a page, then simple abort clustering without touching the page. This means that the next ->writepage entry that write_cache_pages() will make is the page we aborted on, and xfs_vm_writepage() will map all sections of the page correctly. This behaviour is also optimal for non-data integrity writes, as it results in contiguous sequential writeback of the file rather than missing small holes and having to write them a "random" writes in a future pass. With this fix, all the fsx tests in xfstests now pass on a 512 byte block size filesystem on a 4k page machine. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com> (cherry picked from commit 49b137cbbcc836ef231866c137d24f42c42bb483)
* | | Merge tag 'for-v3.10-fixes' of git://git.infradead.org/battery-2.6Linus Torvalds2013-05-25
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull bettery fixes from Anton Vorontsov: "Last minute one-liners: wrong kfree usage fix, module alias fixup and kconfig adjustments" * tag 'for-v3.10-fixes' of git://git.infradead.org/battery-2.6: pm2301_charger: Fix module alias prefix wm831x_backup: Fix wrong kfree call for devdata->backup.name bq27x00: Fix I2C dependency in KConfig lp8788-charger: Fix kconfig dependency