litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	cifs: Add support for mounting Windows 2008 DFS shares	Sean Finney	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Windows 2008 CIFS servers do not always return PATH_NOT_COVERED when attempting to access a DFS share. Therefore, when checking for remote shares, unconditionally ask for a DFS referral for the UNC (w/out prepath) before continuing with previous behavior of attempting to access the UNC + prepath and checking for PATH_NOT_COVERED. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=31092 Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Sean Finney <seanius@seanius.net> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: Extract DFS referral expansion logic to separate function	Sean Finney	2011-05-19
\| \| \| \| \| \| \| \| \| \| \|	The logic behind the expansion of DFS referrals is now extracted from cifs_mount into a new static function, expand_dfs_referral. This will reduce duplicate code in upcoming commits. Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Sean Finney <seanius@seanius.net> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: turn BCC into a static inlined function	Jeff Layton	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's a bad idea to have macro functions that reference variables more than once, as the arguments could have side effects. Turn BCC() into a static inlined function instead. While we're at it, make it return a void * to discourage anyone from dereferencing it as-is. Reported-and-acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: keep BCC in little-endian format	Jeff Layton	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the same patch as originally posted, just with some merge conflicts fixed up... Currently, the ByteCount is usually converted to host-endian on receive. This is confusing however, as we need to keep two sets of routines for accessing it, and keep track of when to use each routine. Munging received packets like this also limits when the signature can be calulated. Simplify the code by keeping the received ByteCount in little-endian format. This allows us to eliminate a set of routines for accessing it and we can now drop the *_le suffixes from the accessor functions since that's now implied. While we're at it, switch all of the places that read the ByteCount directly to use the get_bcc inline which should also clean up some unaligned accesses. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: fix some unused variable warnings in id_rb_search	Jeff Layton	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \|	fs/cifs/cifsacl.c: In function ‘id_rb_search’: fs/cifs/cifsacl.c:215:19: warning: variable ‘linkto’ set but not used [-Wunused-but-set-variable] fs/cifs/cifsacl.c:214:18: warning: variable ‘parent’ set but not used [-Wunused-but-set-variable] Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	CIFS: Simplify invalidate part (try #5)	Pavel Shilovsky	2011-05-19
\| \| \| \| \| \| \| \| \|	Simplify many places when we call cifs_revalidate/invalidate to make it do what it exactly needs. Reviewed-by: Jeff Layton <jlayton@samba.org> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	CIFS: directio read/write cleanups	Pavel Shilovsky	2011-05-19
\| \| \| \| \| \| \| \| \| \|	Recently introduced strictcache mode brought a new code that can be efficiently used by directio part. That's let us add vectored operations and break unnecessary cifs_user_read and cifs_user_write. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	consistently use smb_buf_length as be32 for cifs (try 3)	Steve French	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is one big endian field in the cifs protocol, the RFC1001 length, which cifs code (unlike in the smb2 code) had been handling as u32 until the last possible moment, when it was converted to be32 (its native form) before sending on the wire. To remove the last sparse endian warning, and to make this consistent with the smb2 implementation (which always treats the fields in their native size and endianness), convert all uses of smb_buf_length to be32. This version incorporates Christoph's comment about using be32_add_cpu, and fixes a typo in the second version of the patch. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: Invoke id mapping functions (try #17 repost)	Shirish Pargaonkar	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rb tree search and insertion routines. A SID which needs to be mapped, is looked up in one of the rb trees depending on whether SID is either owner or group SID. If found in the tree, a (mapped) id from that node is assigned to uid or gid as appropriate. If unmapped, an upcall is attempted to map the SID to an id. If upcall is successful, node is marked as mapped. If upcall fails, node stays marked as unmapped and a mapping is attempted again only after an arbitrary time period has passed. To map a SID, which can be either a Owner SID or a Group SID, key description starts with the string "os" or "gs" followed by SID converted to a string. Without "os" or "gs", cifs.upcall does not know whether SID needs to be mapped to either an uid or a gid. Nodes in rb tree have fields to prevent multiple upcalls for a SID. Searching, adding, and removing nodes is done within global locks. Whenever a node is either found or inserted in a tree, a reference is taken on that node. Shrinker routine prunes a node if it has expired but does not prune an expired node if its refcount is not zero (i.e. sid/id of that node is_being/will_be accessed). Thus a node, if its SID needs to be mapped by making an upcall, can safely stay and its fields accessed without shrinker pruning it. A reference (refcount) is put on the node without holding the spinlock but a reference is get on the node by holding the spinlock. Every time an existing mapped node is accessed or mapping is attempted, its timestamp is updated to prevent it from getting erased or a to prevent multiple unnecessary repeat mapping retries respectively. For now, cifs.upcall is only used to map a SID to an id (uid or gid) but it would be used to obtain an SID for an id. Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: Add idmap key and related data structures and functions (try #17 repost)	Shirish Pargaonkar	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Define (global) data structures to store ids, uids and gids, to which a SID maps. There are two separate trees, one for SID/uid and another one for SID/gid. A new type of key, cifs_idmap_key_type, is used. Keys are instantiated and searched using credential of the root by overriding and restoring the credentials of the caller requesting the key. Id mapping functions are invoked under config option of cifs acl. Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	CIFS: Add launder_page operation (try #3)	Pavel Shilovsky	2011-05-19
\| \| \| \| \| \| \| \| \|	Add this let us drop filemap_write_and_wait from cifs_invalidate_mapping and simplify the code to properly process invalidate logic. Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	Introduce smb2 mounts as vers=2	Steve French	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As with Linux nfs client, which uses "nfsvers=" or "vers=" to indicate which protocol to use for mount, specifying "vers=smb2" or "vers=2" will force an smb2 mount. When vers is not specified cifs is used ie "vers=cifs" or "vers=1" We can eventually autonegotiate down from smb2 to cifs when smb2 is stable enough to make it the default, but this is for the future. At that time we could also implement a "maxprotocol" mount option as smbclient and Samba have today, but that would be premature until smb2 is stable. Intially the smb2 Kconfig option will depend on "BROKEN" until the merge is complete, and then be "EXPERIMENTAL" When it is no longer experimental we can consider changing the default protocol to attempt first. Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	CIFS: Use invalidate_inode_pages2 instead of invalidate_remote_inode (try #4)	Pavel Shilovsky	2011-05-19
\| \| \| \| \| \| \| \| \| \|	Use invalidate_inode_pages2 that don't leave pages even if shrink_page_list() has a temp ref on them. It prevents a data coherency problem when cifs_invalidate_mapping didn't invalidate pages but the client thinks that a data from the cache is uptodate according to an oplock level (exclusive or II). Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: fix comment in validate_t2	Jeff Layton	2011-05-19
\| \| \| \| \| \| \| \| \|	The comment about checking the bcc is in the wrong place. Also make it match kernel coding style. Reported-and-acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	VFS: trivial: fix comment on s_maxbytes value warning check	Jeff Layton	2011-05-19
\| \| \| \| \| \| \| \| \|	I originally intended to remove this warning in 2.6.34, but it's not in a high performance codepath and might help us to catch bugs later. Let's keep it, but fix the comment to allay confusion about its removal. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	[CIFS] Allow to set extended attribute cifs_acl (try #2)	Steve French	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \|	Allow setting cifs_acl on the server. Pass on to the server the ACL blob generated by an application. cifs is just a pass-through, it does not monitor or inspect the contents of the blob, server decides whether to enforce/apply the ACL blob composed by an application. If setting of ACL is succeessful, mark the inode for revalidation. Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	[CIFS] Use ecb des kernel crypto APIs instead of	Steve French	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	local cifs functions (repost) Using kernel crypto APIs for DES encryption during LM and NT hash generation instead of local functions within cifs. Source file smbdes.c is deleted sans four functions, one of which uses ecb des functionality provided by kernel crypto APIs. Remove function SMBOWFencrypt. Add return codes to various functions such as calc_lanman_hash, SMBencrypt, and SMBNTencrypt. Includes fix noticed by Dan Carpenter. Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> CC: Dan Carpenter <error27@gmail.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: cleanup: Rename and remove config flags	Shirish Pargaonkar	2011-05-19
\| \| \| \| \| \| \| \| \|	Remove config flag CIFS_EXPERIMENTAL. Do export operations under new config flag CIFS_NFSD_EXPORT Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	Introduce SMB2 Kconfig option	Steve French	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SMB2 is the followon to the CIFS (and SMB) protocols and the default for Windows since Windows Vista, and also now implemented by various non-Windows servers. SMB2 is more secure, has various performance advantages, including larger i/o sizes, flow control, better caching model and more. SMB2 also resolves some scalability limits in the cifs protocol and adds many new features while being much simpler (only a few dozen commands instead of hundreds) and since the protocol is clearer it is also more consistently implemented across servers and thus easier to optimize. After much discussion with Jeff Layton, Jeremy Allison and others at Connectathon, we decided to move the smb2 code from a distinct .ko and fstype into distinct C files that optionally build in cifs.ko. As a result the Kconfig gets simpler. To avoid destabilizing cifs, the smb2 code is going to be moved into its own experimental CONFIG_CIFS_SMB2 ifdef as it is merged and rereviewed. The changes to stable cifs (builds with the smb2 ifdef off) are expected to be fairly small. Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	Shrink stack space usage in cifs_construct_tcon	Steve French	2011-05-19
\| \| \| \| \| \| \| \|	We were reserving MAX_USERNAME (now 256) on stack for something which only needs to fit about 24 bytes ie string krb50x + printf version of uid Signed-off-by: Steve French <sfrench@us.ibm.com>
*	fs:cifs:connect.c remove one to many l's in the word.	Justin P. Mattock	2011-05-19
\| \| \| \| \| \| \|	The patch below removes an extra "l" in the word. Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	Don't compile in unused reparse point symlink code	Steve French	2011-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recent Windows versions now create symlinks more frequently and they do use this "reparse point" symlink mechanism. We can of course do symlinks nicely to Samba and other servers which support the CIFS Unix Extensions and we can also do SFU symlinks and "client only" "MF" symlinks optionally, but for recent Windows we currently can not handle the common "reparse point" symlinks fully, removing the caller for this. We will need to extend and reenable this "reparse point" worker code in cifs and fix cifs_symlink to call this. In the interim this code has been moved to its own config option so it is not compiled in by default until cifs_symlink fixed up (and tested) to use this. CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	Remove unused CIFSSMBNotify worker function	Steve French	2011-05-19
\| \| \| \| \| \| \| \| \| \| \|	The CIFSSMBNotify worker is unused, pending changes to allow it to be called via inotify, so move it into its own experimental config option so it does not get built in, until the necessary VFS support is fixed. It used to be used in dnotify, but according to Jeff, inotify needs minor changes before we can reenable this. CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	cifs: Remove unused inode number while fetching root inode	Shirish Pargaonkar	2011-05-19
\| \| \| \| \| \| \|	ino is unused in function cifs_root_iget(). Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
*	Linux 2.6.39	Linus Torvalds	2011-05-19
\|
*	Merge branch 'fixes' of ↵	Linus Torvalds	2011-05-18
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: configfs: Fix race between configfs_readdir() and configfs_d_iput() configfs: Don't try to d_delete() negative dentries. ocfs2/dlm: Target node death during resource migration leads to thread spin ocfs2: Skip mount recovery for hard-ro mounts ocfs2/cluster: Heartbeat mismatch message improved ocfs2/cluster: Increase the live threshold for global heartbeat ocfs2/dlm: Use negotiated o2dlm protocol version ocfs2: skip existing hole when removing the last extent_rec in punching-hole codes. ocfs2: Initialize data_ac (might be used uninitialized)
\| *	configfs: Fix race between configfs_readdir() and configfs_d_iput()	Joel Becker	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	configfs_readdir() will use the existing inode numbers of inodes in the dcache, but it makes them up for attribute files that aren't currently instantiated. There is a race where a closing attribute file can be tearing down at the same time as configfs_readdir() is trying to get its inode number. We want to get the inode number of open attribute files, because they should match while instantiated. We can't lock down the transition where dentry->d_inode is set to NULL, so we just check for NULL there. We can, however, ensure that an inode we find isn't iput() in configfs_d_iput() until after we've accessed it. Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	configfs: Don't try to d_delete() negative dentries.	Joel Becker	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When configfs is faking mkdir() on its subsystem or default group objects, it starts by adding a negative dentry. It then tries to instantiate the group. If that should fail, it must clean up after itself. I was using d_delete() here, but configfs_attach_group() promises to return an empty dentry on error. d_delete() explodes with the entry dentry. Let's try d_drop() instead. The unhashing is what we want for our dentry. Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2/dlm: Target node death during resource migration leads to thread spin	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During resource migration, if the target node were to die, the thread doing the migration spins until the target node is not removed from the domain map. This patch slows the spin by making the thread wait for the recovery to kick in. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2: Skip mount recovery for hard-ro mounts	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch skips mount recovery for hard-ro mounts which otherwise leads to an oops. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2/cluster: Heartbeat mismatch message improved	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If o2hb finds unexpected values in the heartbeat slot, it prints a message "ERROR: Device "dm-6": another node is heartbeating in our slot!" This message could be misleading. This patch adds two more messages to help users better diagnose the problem. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2/cluster: Increase the live threshold for global heartbeat	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have seen isolated cases (very few, I might add) of o2hb not detecting all live nodes on startup. One plausible reasoning for it is that other node had a hb io delay at the same time. The live threshold set at 2 (as low as it can be) could be increased to ameliorate the situation. But increasing the threshold directly affects mount time. Currently it takes around 5 secs to mount a volume in o2cb cluster with local heartbeat. Increasing the threshold will make mounts even slower. As the issue itself is rare, we have left things as they are for the local heartbeat mode. However we can improve the situation for global heartbeat mode as in that mode, we start the heartbeat much before the mount. This patch doubles the live threshold for the start of the first region in global heartbeat mode. Addresses internal Oracle bug#10635585. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2/dlm: Use negotiated o2dlm protocol version	Sunil Mushran	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch fixes a bug in the o2dlm protocol negotiation in that it is using the builtin version rather than the negotiated version during the domain join. This causes join errors when a node having kernel >= 2.6.37 joins a cluster with nodes having kernels < 2.6.37. This only affects the o2cb cluster stack. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Reported-by: Jacek Stepniewski <Jacek.Stepniewski@agora.pl> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2: skip existing hole when removing the last extent_rec in punching-hole ↵	Tristan Ye	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	codes. In the case of removing a partial extent record which covers a hole, current punching-hole logic will try to remove more than the length of whole extent record, which leads to the failure of following assert(fs/ocfs2/alloc.c): 5507 BUG_ON(cpos < le32_to_cpu(rec->e_cpos) \|\| trunc_range > rec_range); This patch tries to skip existing hole at the last attempt of removing a partial extent record, what's more, it also adds some necessary comments for better understanding of punching-hole codes. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
\| *	ocfs2: Initialize data_ac (might be used uninitialized)	Marcus Meissner	2011-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CLANG found that there is a path that has data_ac uninitialized, this place 2917 /* This gets us the dx_root */ 2918 ret = ocfs2_reserve_new_metadata_blocks(osb, 1, &meta_ac); 2919 if (ret) { 3 Taking true branch 2920 mlog_errno(ret); 2921 goto out; 4 Control jumps to line 3168 2922 } Goes to the out: label without data_ac being initialized. Ciao, Marcus Signed-Off-By: Marcus Meissner <meissner@suse.de> Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <jlbec@evilplan.org>
* \|	Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6	Linus Torvalds	2011-05-18
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6: drivercore: revert addition of of_match to struct device of: fix race when matching drivers
\| * \|	drivercore: revert addition of of_match to struct device	Grant Likely	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit b826291c, "drivercore/dt: add a match table pointer to struct device" added an of_match pointer to struct device to cache the of_match_table entry discovered at driver match time. This was unsafe because matching is not an atomic operation with probing a driver. If two or more drivers are attempted to be matched to a driver at the same time, then the cached matching entry pointer could get overwritten. This patch reverts the of_match cache pointer and reworks all users to call of_match_device() directly instead. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
\| * \|	of: fix race when matching drivers	Milton Miller	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If two drivers are probing devices at the same time, both will write their match table result to the dev->of_match cache at the same time. Only write the result if the device matches. In a thread titled "SBus devices sometimes detected, sometimes not", Meelis reported his SBus hme was not detected about 50% of the time. From the debug suggested by Grant it was obvious another driver matched some devices between the call to match the hme and the hme discovery failling. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Milton Miller <miltonm@bga.com> [grant.likely: modified to only call of_match_device() once] Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* \| \|	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds	2011-05-18
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: MIPS: Kludge IP27 build for 2.6.39. MIPS: AR7: Fix GPIO register size for Titan variant. MIPS: Fix duplicate invocation of notify_die. MIPS: RB532: Fix iomap resource size miscalculation.
\| * \| \|	MIPS: Kludge IP27 build for 2.6.39.	Ralf Baechle	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \| \|	MIPS: AR7: Fix GPIO register size for Titan variant.	Florian Fainelli	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 'size' variable contains the correct register size for both AR7 and Titan, but we never used it to ioremap the correct register size. This problem only shows up on Titan. [ralf@linux-mips.org: Fixed the fix. The original patch as in patchwork recognizes the problem correctly then fails to fix it ...] Reported-by: Alexander Clouter <alex@digriz.org.uk> Signed-off-by: Florian Fainelli <florian@openwrt.org> Patchwork: https://patchwork.linux-mips.org/patch/2380/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| * \| \|	MIPS: Fix duplicate invocation of notify_die.	Ralf Baechle	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initial patch by Yury Polyanskiy <ypolyans@princeton.edu>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/2373/
\| * \| \|	MIPS: RB532: Fix iomap resource size miscalculation.	Ralf Baechle	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the MIPS portion of Joe Perches <joe@perches.com>'s https://patchwork.linux-mips.org/patch/2172/ which seems to have been lost in time and space. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* \| \| \|	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2011-05-18
\|\ \ \ \ \| \|_\|/ / \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: don't delay blk_run_queue_async scsi: remove performance regression due to async queue run blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup block: rescan partitions on invalidated devices on -ENOMEDIA too cdrom: always check_disk_change() on open block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers
\| * \| \|	block: don't delay blk_run_queue_async	Shaohua Li	2011-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Let's check a scenario: 1. blk_delay_queue(q, SCSI_QUEUE_DELAY); 2. blk_run_queue_async(); the second one will became a noop, because q->delay_work already has WORK_STRUCT_PENDING_BIT set, so the delayed work will still run after SCSI_QUEUE_DELAY. But blk_run_queue_async actually hopes the delayed work runs immediately. Fix this by doing a cancel on potentially pending delayed work before queuing an immediate run of the workqueue. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
\| * \| \|	scsi: remove performance regression due to async queue run	Jens Axboe	2011-05-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit c21e6beb removed our queue request_fn re-enter protection, and defaulted to always running the queues from kblockd to be safe. This was a known potential slow down, but should be safe. Unfortunately this is causing big performance regressions for some, so we need to improve this logic. Looking into the details of the re-enter, the real issue is on requeue of requests. Requeue of requests upon seeing a BUSY condition from the device ends up re-running the queue, causing traces like this: scsi_request_fn() scsi_dispatch_cmd() scsi_queue_insert() __scsi_queue_insert() scsi_run_queue() scsi_request_fn() ... potentially causing the issue we want to avoid. So special case the requeue re-run of the queue, but improve it to offload the entire run of local queue and starved queue from a single workqueue callback. This is a lot better than potentially kicking off a workqueue run for each device seen. This also fixes the issue of the local device going into recursion, since the above mentioned commit never moved that queue run out of line. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
\| * \| \|	blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup	Vivek Goyal	2011-05-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currentlly we first map the task to cgroup and then cgroup to blkio_cgroup. There is a more direct way to get to blkio_cgroup from task using task_subsys_state(). Use that. The real reason for the fix is that it also avoids a race in generic cgroup code. During remount/umount rebind_subsystems() is called and it can do following with and rcu protection. cgrp->subsys[i] = NULL; That means if somebody got hold of cgroup under rcu and then it tried to do cgroup->subsys[] to get to blkio_cgroup, it would get NULL which is wrong. I was running into this race condition with ltp running on a upstream derived kernel and that lead to crash. So ideally we should also fix cgroup generic code to wait for rcu grace period before setting pointer to NULL. Li Zefan is not very keen on introducing synchronize_wait() as he thinks it will slow down moun/remount/umount operations. So for the time being atleast fix the kernel crash by taking a more direct route to blkio_cgroup. One tester had reported a crash while running LTP on a derived kernel and with this fix crash is no more seen while the test has been running for over 6 days. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
\| * \| \|	block: rescan partitions on invalidated devices on -ENOMEDIA too	Tejun Heo	2011-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	__blkdev_get() doesn't rescan partitions if disk->fops->open() fails, which leads to ghost partition devices lingering after medimum removal is known to both the kernel and userland. The behavior also creates a subtle inconsistency where O_NONBLOCK open, which doesn't fail even if there's no medium, clears the ghots partitions, which is exploited to work around the problem from userland. Fix it by updating __blkdev_get() to issue partition rescan after -ENOMEDIA too. This was reported in the following bz. https://bugzilla.kernel.org/show_bug.cgi?id=13029 Stable: 2.6.38 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Zeuthen <zeuthen@gmail.com> Reported-by: Martin Pitt <martin.pitt@ubuntu.com> Reported-by: Kay Sievers <kay.sievers@vrfy.org> Tested-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
\| * \| \|	cdrom: always check_disk_change() on open	Tejun Heo	2011-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	cdrom_open() called check_disk_change() after the rest of open path succeeded which leads to the following bizarre behavior. * After media change, if the device opened without O_NONBLOCK, open_for_data() naturally fails with -ENOMEDIA and check_disk_change() is never called. The media is known to be gone and the open failure makes it obvious to the userland but device invalidation never happens. * But if the device is opened with O_NONBLOCK, all the checks are bypassed and cdrom_open() doesn't notice that the media is not there and check_disk_change() is called and invalidation happens. There's nothing to be gained by avoiding calling check_disk_change() on open failure. Common cases end up calling check_disk_change() anyway. All we get is inconsistent behavior. Fix it by moving check_disk_change() invocation to the top of cdrom_open() so that it always gets called regardless of how the rest of open proceeds. Stable: 2.6.38 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Amit Shah <amit.shah@redhat.com> Tested-by: Amit Shah <amit.shah@redhat.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
\| * \| \|	block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers	Tejun Heo	2011-04-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In-kernel disk event polling doesn't matter for legacy/fringe drivers and may lead to infinite event loop if ->check_events() implementation generates events on level condition instead of edge. Now that block layer supports suppressing exporting unlisted events, simply leaving disk->events cleared allows these drivers to keep the internal revalidation behavior intact while avoiding weird interactions with userland event handler. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>