litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	Merge branch 'next-evm' of ↵	James Morris	2011-08-08
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/zohar/ima-2.6 into next Conflicts: fs/attr.c Resolve conflict manually. Signed-off-by: James Morris <jmorris@namei.org>
\| *	evm: add evm_inode_setattr to prevent updating an invalid security.evm	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Permit changing of security.evm only when valid, unless in fixmode. Reported-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
\| *	evm: permit only valid security.evm xattrs to be updated	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In addition to requiring CAP_SYS_ADMIN permission to modify/delete security.evm, prohibit invalid security.evm xattrs from changing, unless in fixmode. This patch prevents inadvertent 'fixing' of security.evm to reflect offline modifications. Changelog v7: - rename boot paramater 'evm_mode' to 'evm' Reported-by: Roberto Sassu <roberto.sassu@polito.it> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
\| *	evm: replace hmac_status with evm_status	Dmitry Kasatkin	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We will use digital signatures in addtion to hmac. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
\| *	evm: evm_verify_hmac must not return INTEGRITY_UNKNOWN	Dmitry Kasatkin	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If EVM is not supported or enabled, evm_verify_hmac() returns INTEGRITY_UNKNOWN, which ima_appraise_measurement() ignores and sets the appraisal status based solely on the security.ima verification. evm_verify_hmac() also returns INTEGRITY_UNKNOWN for other failures, such as temporary failures like -ENOMEM, resulting in possible attack vectors. This patch changes the default return code for temporary/unexpected failures, like -ENOMEM, from INTEGRITY_UNKNOWN to INTEGRITY_FAIL, making evm_verify_hmac() fail safe. As a result, failures need to be re-evaluated in order to catch both temporary errors, such as the -ENOMEM, as well as errors that have been resolved in fix mode. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
\| *	evm: additional parameter to pass integrity cache entry 'iint'	Dmitry Kasatkin	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Additional iint parameter allows to skip lookup in the cache. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
\| *	evm: crypto hash replaced by shash	Dmitry Kasatkin	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using shash is more efficient, because the algorithm is allocated only once. Only the descriptor to store the hash state needs to be allocated for every operation. Changelog v6: - check for crypto_shash_setkey failure Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
\| *	evm: call evm_inode_init_security from security_inode_init_security	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Changelog v7: - moved the initialization call to security_inode_init_security, renaming evm_inode_post_init_security to evm_inode_init_security - increase size of xattr array for EVM xattr Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
\| *	evm: add evm_inode_init_security to initialize new files	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initialize 'security.evm' for new files. Changelog v7: - renamed evm_inode_post_init_security to evm_inode_init_security - moved struct xattr definition to earlier patch - allocate xattr name Changelog v6: - Use 'struct evm_ima_xattr_data' Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
\| *	evm: imbed evm_inode_post_setattr	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Changing the inode's metadata may require the 'security.evm' extended attribute to be re-calculated and updated. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
\| *	evm: evm_inode_post_removexattr	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an EVM protected extended attribute is removed, update 'security.evm'. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
\| *	security: imbed evm calls in security hooks	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Imbed the evm calls evm_inode_setxattr(), evm_inode_post_setxattr(), evm_inode_removexattr() in the security hooks. evm_inode_setxattr() protects security.evm xattr. evm_inode_post_setxattr() and evm_inode_removexattr() updates the hmac associated with an inode. (Assumes an LSM module protects the setting/removing of xattr.) Changelog: - Don't define evm_verifyxattr(), unless CONFIG_INTEGRITY is enabled. - xattr_name is a 'const', value is 'void *' Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
\| *	evm: add support for different security.evm data types	Dmitry Kasatkin	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EVM protects a file's security extended attributes(xattrs) against integrity attacks. The current patchset maintains an HMAC-sha1 value across the security xattrs, storing the value as the extended attribute 'security.evm'. We anticipate other methods for protecting the security extended attributes. This patch reserves the first byte of 'security.evm' as a place holder for the type of method. Changelog v6: - move evm_ima_xattr_type definition to security/integrity/integrity.h - defined a structure for the EVM xattr called evm_ima_xattr_data (based on Serge Hallyn's suggestion) - removed unnecessary memset Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com> Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
\| *	evm: re-release	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EVM protects a file's security extended attributes(xattrs) against integrity attacks. This patchset provides the framework and an initial method. The initial method maintains an HMAC-sha1 value across the security extended attributes, storing the HMAC value as the extended attribute 'security.evm'. Other methods of validating the integrity of a file's metadata will be posted separately (eg. EVM-digital-signatures). While this patchset does authenticate the security xattrs, and cryptographically binds them to the inode, coming extensions will bind other directory and inode metadata for more complete protection. To help simplify the review and upstreaming process, each extension will be posted separately (eg. IMA-appraisal, IMA-appraisal-directory). For a general overview of the proposed Linux integrity subsystem, refer to Dave Safford's whitepaper: http://downloads.sf.net/project/linux-ima/linux-ima/Integrity_overview.pdf. EVM depends on the Kernel Key Retention System to provide it with a trusted/encrypted key for the HMAC-sha1 operation. The key is loaded onto the root's keyring using keyctl. Until EVM receives notification that the key has been successfully loaded onto the keyring (echo 1 > <securityfs>/evm), EVM can not create or validate the 'security.evm' xattr, but returns INTEGRITY_UNKNOWN. Loading the key and signaling EVM should be done as early as possible. Normally this is done in the initramfs, which has already been measured as part of the trusted boot. For more information on creating and loading existing trusted/encrypted keys, refer to Documentation/keys-trusted-encrypted.txt. A sample dracut patch, which loads the trusted/encrypted key and enables EVM, is available from http://linux-ima.sourceforge.net/#EVM. Based on the LSMs enabled, the set of EVM protected security xattrs is defined at compile. EVM adds the following three calls to the existing security hooks: evm_inode_setxattr(), evm_inode_post_setxattr(), and evm_inode_removexattr. To initialize and update the 'security.evm' extended attribute, EVM defines three calls: evm_inode_post_init(), evm_inode_post_setattr() and evm_inode_post_removexattr() hooks. To verify the integrity of a security xattr, EVM exports evm_verifyxattr(). Changelog v7: - Fixed URL in EVM ABI documentation Changelog v6: (based on Serge Hallyn's review) - fix URL in patch description - remove evm_hmac_size definition - use SHA1_DIGEST_SIZE (removed both MAX_DIGEST_SIZE and evm_hmac_size) - moved linux include before other includes - test for crypto_hash_setkey failure - fail earlier for invalid key - clear entire encrypted key, even on failure - check xattr name length before comparing xattr names Changelog: - locking based on i_mutex, remove evm_mutex - using trusted/encrypted keys for storing the EVM key used in the HMAC-sha1 operation. - replaced crypto hash with shash (Dmitry Kasatkin) - support for additional methods of verifying the security xattrs (Dmitry Kasatkin) - iint not allocated for all regular files, but only for those appraised - Use cap_sys_admin in lieu of cap_mac_admin - Use __vfs_setxattr_noperm(), without permission checks, from EVM Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
\| *	xattr: define vfs_getxattr_alloc and vfs_xattr_cmp	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vfs_getxattr_alloc() and vfs_xattr_cmp() are two new kernel xattr helper functions. vfs_getxattr_alloc() first allocates memory for the requested xattr and then retrieves it. vfs_xattr_cmp() compares a given value with the contents of an extended attribute. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
\| *	integrity: move ima inode integrity data management	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the inode integrity data(iint) management up to the integrity directory in order to share the iint among the different integrity models. Changelog: - don't define MAX_DIGEST_SIZE - rename several globally visible 'ima_' prefixed functions, structs, locks, etc to 'integrity_' - replace '20' with SHA1_DIGEST_SIZE - reflect location change in appropriate Kconfig and Makefiles - remove unnecessary initialization of iint_initialized to 0 - rebased on current ima_iint.c - define integrity_iint_store/lock as static There should be no other functional changes. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
\| *	security: new security_inode_init_security API adds function callback	Mimi Zohar	2011-07-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch changes the security_inode_init_security API by adding a filesystem specific callback to write security extended attributes. This change is in preparation for supporting the initialization of multiple LSM xattrs and the EVM xattr. Initially the callback function walks an array of xattrs, writing each xattr separately, but could be optimized to write multiple xattrs at once. For existing security_inode_init_security() calls, which have not yet been converted to use the new callback function, such as those in reiserfs and ocfs2, this patch defines security_old_inode_init_security(). Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
* \|	Merge branch 'next-queue' into next	James Morris	2011-08-08
\|\ \
\| * \|	doc: Update the email address for Paul Moore in various source files	Paul Moore	2011-08-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	My @hp.com will no longer be valid starting August 5, 2011 so an update is necessary. My new email address is employer independent so we don't have to worry about doing this again any time soon. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <jmorris@namei.org>
\| * \|	doc: Update the MAINTAINERS info for Paul Moore	Paul Moore	2011-08-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	My @hp.com will no longer be valid starting August 5, 2011 so an update is necessary. My new email address is employer independent so we don't have to worry about doing this again any time soon. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <jmorris@namei.org>
* \| \|	Merge branch 'linus'; commit 'v3.1-rc1' into next	James Morris	2011-08-07
\| \| \|
* \| \|	Linux 3.1-rc1	Linus Torvalds	2011-08-07
\| \| \|
* \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc	Linus Torvalds	2011-08-07
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: Fix build with DEBUG_PAGEALLOC enabled.
\| * \| \|	sparc: Fix build with DEBUG_PAGEALLOC enabled.	David S. Miller	2011-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	arch/sparc/mm/init_64.c:1622:22: error: unused variable '__swapper_4m_tsb_phys_patch_end' [-Werror=unused-variable] arch/sparc/mm/init_64.c:1621:22: error: unused variable '__swapper_4m_tsb_phys_patch' [-Werror=unused-variable] Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \| \|	sh: Fix boot crash related to SCI	Rafael J. Wysocki	2011-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit d006199e72a9 ("serial: sh-sci: Regtype probing doesn't need to be fatal.") made sci_init_single() return when sci_probe_regmap() succeeds, although it should return when sci_probe_regmap() fails. This causes systems using the serial sh-sci driver to crash during boot. Fix the problem by using the right return condition. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	arm: remove stale export of 'sha_transform'	Linus Torvalds	2011-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The generic library code already exports the generic function, this was left-over from the ARM-specific version that just got removed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	arm: remove "optimized" SHA1 routines	Linus Torvalds	2011-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 1eb19a12bd22 ("lib/sha1: use the git implementation of SHA-1"), the ARM SHA1 routines no longer work. The reason? They depended on the larger 320-byte workspace, and now the sha1 workspace is just 16 words (64 bytes). So the assembly version would overwrite the stack randomly. The optimized asm version is also probably slower than the new improved C version, so there's no reason to keep it around. At least that was the case in git, where what appears to be the same assembly language version was removed two years ago because the optimized C BLK_SHA1 code was faster. Reported-and-tested-by: Joachim Eastwood <manabian@gmail.com> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	fix rcu annotations noise in cred.h	Al Viro	2011-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	task->cred is declared as __rcu, and access to other tasks' ->cred is, indeed, protected. Access to current->cred does not need rcu_dereference() at all, since only the task itself can change its ->cred. sparse, of course, has no way of knowing that... Add force-cast in current_cred(), make current_fsuid() et.al. use it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	vfs: rename 'do_follow_link' to 'should_follow_link'	Linus Torvalds	2011-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Al points out that the do_follow_link() helper function really is misnamed - it's about whether we should try to follow a symlink or not, not about actually doing the following. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	Fix POSIX ACL permission check	Ari Savolainen	2011-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After commit 3567866bf261: "RCUify freeing acls, let check_acl() go ahead in RCU mode if acl is cached" posix_acl_permission is being called with an unsupported flag and the permission check fails. This patch fixes the issue. Signed-off-by: Ari Savolainen <ari.m.savolainen@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \| \| \|	Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd	Linus Torvalds	2011-08-07
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-linus' of git://git.open-osd.org/linux-open-osd: ore: Make ore its own module exofs: Rename raid engine from exofs/ios.c => ore exofs: ios: Move to a per inode components & device-table exofs: Move exofs specific osd operations out of ios.c exofs: Add offset/length to exofs_get_io_state exofs: Fix truncate for the raid-groups case exofs: Small cleanup of exofs_fill_super exofs: BUG: Avoid sbi realloc exofs: Remove pnfs-osd private definitions nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
\| * \| \| \|	ore: Make ore its own module	Boaz Harrosh	2011-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Export everything from ore need exporting. Change Kbuild and Kconfig to build ore.ko as an independent module. Import ore from exofs Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	exofs: Rename raid engine from exofs/ios.c => ore	Boaz Harrosh	2011-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ORE stands for "Objects Raid Engine" This patch is a mechanical rename of everything that was in ios.c and its API declaration to an ore.c and an osd_ore.h header. The ore engine will later be used by the pnfs objects layout driver. * File ios.c => ore.c * Declaration of types and API are moved from exofs.h to a new osd_ore.h * All used types are prefixed by ore_ from their exofs_ name. * Shift includes from exofs.h to osd_ore.h so osd_ore.h is independent, include it from exofs.h. Other than a pure rename there are no other changes. Next patch will move the ore into it's own module and will export the API to be used by exofs and later the layout driver Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	exofs: ios: Move to a per inode components & device-table	Boaz Harrosh	2011-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Exofs raid engine was saving on memory space by having a single layout-info, single pid, and a single device-table, global to the filesystem. Then passing a credential and object_id info at the io_state level, private for each inode. It would also devise this contraption of rotating the device table view for each inode->ino to spread out the device usage. This is not compatible with the pnfs-objects standard, demanding that each inode can have it's own layout-info, device-table, and each object component it's own pid, oid and creds. So: Bring exofs raid engine to be usable for generic pnfs-objects use by: * Define an exofs_comp structure that holds obj_id and credential info. * Break up exofs_layout struct to an exofs_components structure that holds a possible array of exofs_comp and the array of devices + the size of the arrays. * Add a "comps" parameter to get_io_state() that specifies the ids creds and device array to use for each IO. This enables to keep the layout global, but the device-table view, creds and IDs at the inode level. It only adds two 64bit to each inode, since some of these members already existed in another form. * ios raid engine now access layout-info and comps-info through the passed pointers. Everything is pre-prepared by caller for generic access of these structures and arrays. At the exofs Level: * Super block holds an exofs_components struct that holds the device array, previously in layout. The devices there are in device-table order. The device-array is twice bigger and repeats the device-table twice so now each inode's device array can point to a random device and have a round-robin view of the table, making it compatible to previous exofs versions. * Each inode has an exofs_components struct that is initialized at load time, with it's own view of the device table IDs and creds. When doing IO this gets passed to the io_state together with the layout. While preforming this change. Bugs where found where credentials with the wrong IDs where used to access the different SB objects (super.c). As well as some dead code. It was never noticed because the target we use does not check the credentials. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	exofs: Move exofs specific osd operations out of ios.c	Boaz Harrosh	2011-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ios.c will be moving to an external library, for use by the objects-layout-driver. Remove from it some exofs specific functions. Also g_attr_logical_length is used both by inode.c and ios.c move definition to the later, to keep it independent Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	exofs: Add offset/length to exofs_get_io_state	Boaz Harrosh	2011-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In future raid code we will need to know the IO offset/length and if it's a read or write to determine some of the array sizes we'll need. So add a new exofs_get_rw_state() API for use when writeing/reading. All other simple cases are left using the old way. The major change to this is that now we need to call exofs_get_io_state later at inode.c::read_exec and inode.c::write_exec when we actually know these things. So this patch is kept separate so I can test things apart from other changes. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	exofs: Fix truncate for the raid-groups case	Boaz Harrosh	2011-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the general raid-group case the truncate was wrong in that it did not also fix the object length of the neighboring groups. There are two bad cases in the old code: 1. Space that should be freed was not. 2. If a file That was big is truncated small, then made bigger again, the holes would not contain zeros but could expose old data. (If the growing of the file expands to more than a full groups cycle + group size (> S + T)) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	exofs: Small cleanup of exofs_fill_super	Boaz Harrosh	2011-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Small cleanup that unifies duplicated code used in both the error and success cases Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	exofs: BUG: Avoid sbi realloc	Boaz Harrosh	2011-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the beginning we realloced the sbi structure when a bigger then one device table was specified. (I know that was really stupid). Then much later when "register bdi" was added (By Jens) it was registering the pointer to sbi->bdi before the realloc. We never saw this problem because up till now the realloc did not do anything since the device table was small enough to fit in the original allocation. But once we starting testing with large device tables (Bigger then 28) we noticed the crash of writeback operating on a deallocated pointer. * Avoid the all mess by allocating the device-table as a second array and get rid of the variable-sized structure and the rest of this mess. * Take the chance to clean near by structures and comments. * Add a needed dprint on startup to indicate the loaded layout. * Also move the bdi registration to the very end because it will only fail in a low memory, which will probably fail before hand. There are many more likely causes to not load before that. This way the error handling is made simpler. (Just doing this would be enough to fix the BUG) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	exofs: Remove pnfs-osd private definitions	Boaz Harrosh	2011-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that pnfs-osd has hit mainline we can remove exofs's private header. (And the FIXME comment) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
\| * \| \| \|	nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4	Boaz Harrosh	2011-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	exofs file system wants to use pnfs_osd_xdr.h file instead of redefining pnfs-objects types in it's private "pnfs.h" headr. Before we do the switch we must make sure pnfs_osd_xdr.h is compilable also under NFS versions smaller than 4.1. Since now it is needed regardless of version, by the exofs code. nfs4_string is not the only nfs4 type out in the global scope. Ack-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* \| \| \| \|	vfs: optimize inode cache access patterns	Linus Torvalds	2011-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The inode structure layout is largely random, and some of the vfs paths really do care. The path lookup in particular is already quite D$ intensive, and profiles show that accessing the 'inode->i_op->xyz' fields is quite costly. We already optimized the dcache to not unnecessarily load the d_op structure for members that are often NULL using the DCACHE_OP_xyz bits in dentry->d_flags, and this does something very similar for the inode ops that are used during pathname lookup. It also re-orders the fields so that the fields accessed by 'stat' are together at the beginning of the inode structure, and roughly in the order accessed. The effect of this seems to be in the 1-2% range for an empty kernel "make -j" run (which is fairly kernel-intensive, mostly in filename lookup), so it's visible. The numbers are fairly noisy, though, and likely depend a lot on exact microarchitecture. So there's more tuning to be done. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \| \|	vfs: renumber DCACHE_xyz flags, remove some stale ones	Linus Torvalds	2011-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Gcc tends to generate better code with small integers, including the DCACHE_xyz flag tests - so move the common ones to be first in the list. Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and DCACHE_AUTOFS_PENDING values, their users no longer exists in the source tree. And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the common case to be a nice straight-line fall-through. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	2011-08-07
\|\ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: Compute protocol sequence numbers and fragment IDs using MD5. crypto: Move md5_transform to lib/md5.c
\| * \| \| \| \|	net: Compute protocol sequence numbers and fragment IDs using MD5.	David S. Miller	2011-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Computers have become a lot faster since we compromised on the partial MD4 hash which we use currently for performance reasons. MD5 is a much safer choice, and is inline with both RFC1948 and other ISS generators (OpenBSD, Solaris, etc.) Furthermore, only having 24-bits of the sequence number be truly unpredictable is a very serious limitation. So the periodic regeneration and 8-bit counter have been removed. We compute and use a full 32-bit sequence number. For ipv6, DCCP was found to use a 32-bit truncated initial sequence number (it needs 43-bits) and that is fixed here as well. Reported-by: Dan Kaminsky <dan@doxpara.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \| \| \|	crypto: Move md5_transform to lib/md5.c	David S. Miller	2011-08-06
\| \| \|/ / / \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We are going to use this for TCP/IP sequence number and fragment ID generation. Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \| \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6	Linus Torvalds	2011-08-06
\|\ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: cope with negative dentries in cifs_get_root cifs: convert prefixpath delimiters in cifs_build_path_to_root CIFS: Fix missing a decrement of inFlight value cifs: demote DFS referral lookup errors to cFYI Revert "cifs: advertise the right receive buffer size to the server"
\| * \| \| \| \|	cifs: cope with negative dentries in cifs_get_root	Jeff Layton	2011-08-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The loop around lookup_one_len doesn't handle the case where it might return a negative dentry, which can cause an oops on the next pass through the loop. Check for that and break out of the loop with an error of -ENOENT if there is one. Fixes the panic reported here: https://bugzilla.redhat.com/show_bug.cgi?id=727927 Reported-by: TR Bentley <home@trarbentley.net> Reported-by: Iain Arnell <iarnell@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \| \| \| \|	cifs: convert prefixpath delimiters in cifs_build_path_to_root	Jeff Layton	2011-08-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Regression from 2.6.39... The delimiters in the prefixpath are not being converted based on whether posix paths are in effect. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=727834 Reported-and-Tested-by: Iain Arnell <iarnell@gmail.com> Reported-by: Patrick Oltmann <patrick.oltmann@gmx.net> Cc: Pavel Shilovsky <piastryyy@gmail.com> Cc: stable@kernel.org Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \| \| \| \|	CIFS: Fix missing a decrement of inFlight value	Pavel Shilovsky	2011-08-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	if we failed on getting mid entry in cifs_call_async. Cc: stable@kernel.org Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>