litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	ceph: fix file mode calculation	Sage Weil	2011-07-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	open(2) must always include one of O_RDONLY, O_WRONLY, or O_RDWR. No need for any O_APPEND special case. Passing O_WRONLY\|O_RDWR is undefined according to the man page, but the Linux VFS interprets this as O_RDWR, so we'll do the same. This fixes open(2) with flags O_RDWR\|O_APPEND, which was incorrectly being translated to readonly. Reported-by: Fyodor Ustinov <ufm@ufm.su> Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: fix sync and dio writes across stripe boundaries	Sage Weil	2011-06-13
\| \| \| \| \| \| \| \| \|	We were iterating across stripe boundaries properly, but not moving the write buffer pointer forward. This caused us to rewrite the same data after the break. Fix by adjusting the data pointer forward, and recalculating the io and buffer alignment after the break. Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: fix page calculation for non-page-aligned io	Sage Weil	2011-06-13
\| \| \| \| \| \| \| \|	Set the page count correctly for non-page-aligned IO. We were already doing this correctly for alignment, but not the page count. Fixes DIRECT_IO writes from unaligned pages. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: fix page alignment corrections	Sage Weil	2011-06-13
\| \| \| \| \| \| \| \|	dd if=/dev/urandom of=/mnt/fs_depot/dd10 bs=500 seek=8388 count=1 dd if=/mnt/fs_depot/dd10 of=/root/dd10out bs=500 skip=8388 count=1 Reported-by: Henry C Chang <henry.cy.chang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: unwind canceled flock state	Sage Weil	2011-06-08
\| \| \| \| \| \| \| \| \|	If we request a lock and then abort (e.g., ^C), we need to send a matching unlock request to the MDS to unwind our lock attempt to avoid indefinitely blocking other clients. Reported-by: Brian Chrisman <brchrisman@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: fix ENOENT logic in striped_read	Sage Weil	2011-06-08
\| \| \| \| \| \| \| \| \| \| \| \|	Getting ENOENT is equivalent to reading 0 bytes. Make that correction before setting up the hit_stripe and was_short flags. Fixes the following case: dd if=/dev/zero of=/mnt/fs_depot/dd3 bs=1 seek=1048576 count=0 dd if=/mnt/fs_depot/dd3 of=/root/ddout1 skip=8 bs=500 count=2 iflag=direct Reported-by: Henry C Chang <henry.cy.chang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: fix short sync reads from the OSD	Sage Weil	2011-06-08
\| \| \| \| \| \| \| \| \| \| \| \| \|	If we get a short read from the OSD because the object is small, we need to zero the remainder of the buffer. For O_DIRECT reads, the attempted range is not trimmed to i_size by the VFS, so we were actually looping indefinitely. Fix by trimming by i_size, and the unconditionally zeroing the trailing range. Reported-by: Jeff Wu <cpwu@tnsoft.com.cn> Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: fix sync vs canceled write	Sage Weil	2011-06-08
\| \| \| \| \| \| \|	If we cancel a write, trigger the safe completions to prevent a sync from blocking indefinitely in ceph_osdc_sync(). Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: use ihold when we already have an inode ref	Sage Weil	2011-06-08
\| \| \| \| \| \| \| \|	We should use ihold whenever we already have a stable inode ref, even when we aren't holding i_lock. This avoids adding new and unnecessary locking dependencies. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: fix cap flush race reentrancy	Sage Weil	2011-05-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In e9964c10 we change cap flushing to do a delicate dance because some inodes on the cap_dirty list could be in a migrating state (got EXPORT but not IMPORT) in which we couldn't actually flush and move from dirty->flushing, breaking the while (!empty) { process first } loop structure. It worked for a single sync thread, but was not reentrant and triggered infinite loops when multiple syncers came along. Instead, move inodes with dirty to a separate cap_dirty_migrating list when in the limbo export-but-no-import state, allowing us to go back to the simple loop structure (which was reentrant). This is cleaner and more robust. Audited the cap_dirty users and this looks fine: list_empty(&ci->i_dirty_item) is still a reliable indicator of whether we have dirty caps (which list we're on is irrelevant) and list_del_init() calls still do the right thing. Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: subscribe to osdmap when cluster is full	Sage Weil	2011-05-24
\| \| \| \| \| \| \| \|	When the cluster is marked full, subscribe to subsequent map updates to ensure we find out promptly when it is no longer full. This will prevent us from spewing ENOSPC for (much) longer than necessary. Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: handle new osdmap down/state change encoding	Sage Weil	2011-05-24
\| \| \| \| \| \| \| \|	Old incrementals encode a 0 value (nearly always) when an osd goes down. Change that to allow any state bit(s) to be flipped. Special case 0 to mean flip the CEPH_OSD_UP bit to mimic the old behavior. Signed-off-by: Sage Weil <sage@newdream.net>
*	rbd: handle online resize of underlying rbd image	Sage Weil	2011-05-24
\| \| \| \| \| \| \|	If we get a notification that the image header has changed, check for a change in the image size. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: avoid inode lookup on nfs fh reconnect	Sage Weil	2011-05-24
\| \| \| \| \| \| \|	If we get the inode from the MDS, we have a reference in req; don't do a fresh lookup. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: use LOOKUPINO to make unconnected nfs fh more reliable	Sage Weil	2011-05-24
\| \| \| \| \| \| \|	If we are unable to locate an inode by ino, ask the MDS using the new LOOKUPINO command. Signed-off-by: Sage Weil <sage@newdream.net>
*	rbd: use snprintf for disk->disk_name	Sage Weil	2011-05-24
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	rbd: cleanup: make kfree match kmalloc	Sage Weil	2011-05-24
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	rbd: warn on update_snaps failure on notify	Sage Weil	2011-05-19
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: check return value for start_request in writepages	Sage Weil	2011-05-19
\| \| \| \| \| \| \|	Since we pass the nofail arg, we should never get an error; BUG if we do. (And fix the function to not return an error if __map_request fails.) Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: remove useless check	Sage Weil	2011-05-19
\| \| \| \| \| \|	rc is only ever 0 or negative in this method. Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: add missing breaks in addr_set_port	Sage Weil	2011-05-19
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: fix TAG_WAIT case	Sage Weil	2011-05-19
\| \| \| \| \| \| \|	If we get a WAIT as a client something went wrong; error out. And don't fall through to an unrelated case. Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: fix broken comparison in readdir loop	Sage Weil	2011-05-19
\| \| \| \| \| \| \|	Both off and fi->offset are unsigned, so the difference is always >= 0. Compare them directly instead of the sign of the difference. Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: fix osdmap timestamp assignment	Sage Weil	2011-05-19
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: fix rare potential cap leak	Sage Weil	2011-05-19
\| \| \| \| \| \| \|	If we grab new_cap, retake the lock, and find we already have a cap now for the given mds, release new_cap. Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: use snprintf for unknown addrs	Sage Weil	2011-05-19
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: use snprintf for formatting object name	Sage Weil	2011-05-19
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: use snprintf for dirstat content	Sage Weil	2011-05-19
\| \| \| \| \| \| \|	We allocate a buffer for rstats if the dirstat option is enabled. Use snprintf. Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: fix uninitialized value when no get_authorizer method is set	Sage Weil	2011-05-19
\| \| \| \| \| \| \| \|	If there is no get_authorizer method we set the out_kvec to a bogus pointer. The length is also zero in that case, so it doesn't much matter, but it's better not to add the empty item in the first place. Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: remove unused variable	Sage Weil	2011-05-19
\| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
*	libceph: handle connection reopen race with callbacks	Sage Weil	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a connection is closed and/or reopened (ceph_con_close, ceph_con_open) it can race with a callback. con_work does various state checks for closed or reopened sockets at the beginning, but drops con->mutex before making callbacks. We need to check for state bit changes after retaking the lock to ensure we restart con_work and execute those CLOSED/OPENING tests or else we may end up operating under stale assumptions. In Jim's case, this was causing 'bad tag' errors. There are four cases where we re-take the con->mutex inside con_work: catch them all and return EAGAIN from try_{read,write} so that we can restart con_work. Reported-by: Jim Schutt <jaschut@sandia.gov> Tested-by: Jim Schutt <jaschut@sandia.gov> Signed-off-by: Sage Weil <sage@newdream.net>
*	ceph: take reference on mds request r_unsafe_dir	Sage Weil	2011-05-19
\| \| \| \| \| \| \| \| \| \|	We put ourselves on an inode list for the parent directory of metadata operations so that an fsync on the directory will wait for metadata updates to commit to disk. We weren't holding a reference to that directory, however, and under certain workloads (fsstress in this case) the directory can go away. Signed-off-by: Sage Weil <sage@newdream.net>
*	Linux 2.6.39	Linus Torvalds	2011-05-19
\|
*	Merge branch 'fixes' of ↵	Linus Torvalds	2011-05-18
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: configfs: Fix race between configfs_readdir() and configfs_d_iput() configfs: Don't try to d_delete() negative dentries. ocfs2/dlm: Target node death during resource migration leads to thread spin ocfs2: Skip mount recovery for hard-ro mounts ocfs2/cluster: Heartbeat mismatch message improved ocfs2/cluster: Increase the live threshold for global heartbeat ocfs2/dlm: Use negotiated o2dlm protocol version ocfs2: skip existing hole when removing the last extent_rec in punching-hole codes. ocfs2: Initialize data_ac (might be used uninitialized)
\| *	configfs: Fix race between configfs_readdir() and configfs_d_iput()	Joel Becker	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	configfs_readdir() will use the existing inode numbers of inodes in the dcache, but it makes them up for attribute files that aren't currently instantiated. There is a race where a closing attribute file can be tearing down at the same time as configfs_readdir() is trying to get its inode number. We want to get the inode number of open attribute files, because they should match while instantiated. We can't lock down the transition where dentry->d_inode is set to NULL, so we just check for NULL there. We can, however, ensure that an inode we find isn't iput() in configfs_d_iput() until after we've accessed it. Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	configfs: Don't try to d_delete() negative dentries.	Joel Becker	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When configfs is faking mkdir() on its subsystem or default group objects, it starts by adding a negative dentry. It then tries to instantiate the group. If that should fail, it must clean up after itself. I was using d_delete() here, but configfs_attach_group() promises to return an empty dentry on error. d_delete() explodes with the entry dentry. Let's try d_drop() instead. The unhashing is what we want for our dentry. Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2/dlm: Target node death during resource migration leads to thread spin	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During resource migration, if the target node were to die, the thread doing the migration spins until the target node is not removed from the domain map. This patch slows the spin by making the thread wait for the recovery to kick in. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2: Skip mount recovery for hard-ro mounts	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch skips mount recovery for hard-ro mounts which otherwise leads to an oops. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2/cluster: Heartbeat mismatch message improved	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If o2hb finds unexpected values in the heartbeat slot, it prints a message "ERROR: Device "dm-6": another node is heartbeating in our slot!" This message could be misleading. This patch adds two more messages to help users better diagnose the problem. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2/cluster: Increase the live threshold for global heartbeat	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have seen isolated cases (very few, I might add) of o2hb not detecting all live nodes on startup. One plausible reasoning for it is that other node had a hb io delay at the same time. The live threshold set at 2 (as low as it can be) could be increased to ameliorate the situation. But increasing the threshold directly affects mount time. Currently it takes around 5 secs to mount a volume in o2cb cluster with local heartbeat. Increasing the threshold will make mounts even slower. As the issue itself is rare, we have left things as they are for the local heartbeat mode. However we can improve the situation for global heartbeat mode as in that mode, we start the heartbeat much before the mount. This patch doubles the live threshold for the start of the first region in global heartbeat mode. Addresses internal Oracle bug#10635585. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2/dlm: Use negotiated o2dlm protocol version	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch fixes a bug in the o2dlm protocol negotiation in that it is using the builtin version rather than the negotiated version during the domain join. This causes join errors when a node having kernel >= 2.6.37 joins a cluster with nodes having kernels < 2.6.37. This only affects the o2cb cluster stack. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Reported-by: Jacek Stepniewski <Jacek.Stepniewski@agora.pl> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2: skip existing hole when removing the last extent_rec in punching-hole ↵	Tristan Ye	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	codes. In the case of removing a partial extent record which covers a hole, current punching-hole logic will try to remove more than the length of whole extent record, which leads to the failure of following assert(fs/ocfs2/alloc.c): 5507 BUG_ON(cpos < le32_to_cpu(rec->e_cpos) \|\| trunc_range > rec_range); This patch tries to skip existing hole at the last attempt of removing a partial extent record, what's more, it also adds some necessary comments for better understanding of punching-hole codes. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2: Initialize data_ac (might be used uninitialized)	Marcus Meissner	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CLANG found that there is a path that has data_ac uninitialized, this place 2917 /* This gets us the dx_root */ 2918 ret = ocfs2_reserve_new_metadata_blocks(osb, 1, &meta_ac); 2919 if (ret) { 3 Taking true branch 2920 mlog_errno(ret); 2921 goto out; 4 Control jumps to line 3168 2922 } Goes to the out: label without data_ac being initialized. Ciao, Marcus Signed-Off-By: Marcus Meissner <meissner@suse.de> Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
* \|	Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6	Linus Torvalds	2011-05-18
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6: drivercore: revert addition of of_match to struct device of: fix race when matching drivers
\| * \|	drivercore: revert addition of of_match to struct device	Grant Likely	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit b826291c, "drivercore/dt: add a match table pointer to struct device" added an of_match pointer to struct device to cache the of_match_table entry discovered at driver match time. This was unsafe because matching is not an atomic operation with probing a driver. If two or more drivers are attempted to be matched to a driver at the same time, then the cached matching entry pointer could get overwritten. This patch reverts the of_match cache pointer and reworks all users to call of_match_device() directly instead. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
\| * \|	of: fix race when matching drivers	Milton Miller	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If two drivers are probing devices at the same time, both will write their match table result to the dev->of_match cache at the same time. Only write the result if the device matches. In a thread titled "SBus devices sometimes detected, sometimes not", Meelis reported his SBus hme was not detected about 50% of the time. From the debug suggested by Grant it was obvious another driver matched some devices between the call to match the hme and the hme discovery failling. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Milton Miller <miltonm@bga.com> [grant.likely: modified to only call of_match_device() once] Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* \| \|	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds	2011-05-18
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: MIPS: Kludge IP27 build for 2.6.39. MIPS: AR7: Fix GPIO register size for Titan variant. MIPS: Fix duplicate invocation of notify_die. MIPS: RB532: Fix iomap resource size miscalculation.
\| * \| \|	MIPS: Kludge IP27 build for 2.6.39.	Ralf Baechle	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \| \|	MIPS: AR7: Fix GPIO register size for Titan variant.	Florian Fainelli	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 'size' variable contains the correct register size for both AR7 and Titan, but we never used it to ioremap the correct register size. This problem only shows up on Titan. [ralf@linux-mips.org: Fixed the fix. The original patch as in patchwork recognizes the problem correctly then fails to fix it ...] Reported-by: Alexander Clouter <alex@digriz.org.uk> Signed-off-by: Florian Fainelli <florian@openwrt.org> Patchwork: https://patchwork.linux-mips.org/patch/2380/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \| \|	MIPS: Fix duplicate invocation of notify_die.	Ralf Baechle	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initial patch by Yury Polyanskiy <ypolyans@princeton.edu>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/2373/