aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge tag 'stable/for-linus-3.18-rc0-tag' of ↵Linus Torvalds2014-10-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen updates from David Vrabel: "Features and fixes: - Add pvscsi frontend and backend drivers. - Remove _PAGE_IOMAP PTE flag, freeing it for alternate uses. - Try and keep memory contiguous during PV memory setup (reduces SWIOTLB usage). - Allow front/back drivers to use threaded irqs. - Support large initrds in PV guests. - Fix PVH guests in preparation for Xen 4.5" * tag 'stable/for-linus-3.18-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (22 commits) xen: remove DEFINE_XENBUS_DRIVER() macro xen/xenbus: Remove BUG_ON() when error string trucated xen/xenbus: Correct the comments for xenbus_grant_ring() x86/xen: Set EFER.NX and EFER.SCE in PVH guests xen: eliminate scalability issues from initrd handling xen: sync some headers with xen tree xen: make pvscsi frontend dependant on xenbus frontend arm{,64}/xen: Remove "EXPERIMENTAL" in the description of the Xen options xen-scsifront: don't deadlock if the ring becomes full x86: remove the Xen-specific _PAGE_IOMAP PTE flag x86/xen: do not use _PAGE_IOMAP PTE flag for I/O mappings x86: skip check for spurious faults for non-present faults xen/efi: Directly include needed headers xen-scsiback: clean up a type issue in scsiback_make_tpg() xen-scsifront: use GFP_ATOMIC under spin_lock MAINTAINERS: Add xen pvscsi maintainer xen-scsiback: Add Xen PV SCSI backend driver xen-scsifront: Add Xen PV SCSI frontend driver xen: Add Xen pvSCSI protocol description xen/events: support threaded irqs for interdomain event channels ...
| * xen: remove DEFINE_XENBUS_DRIVER() macroDavid Vrabel2014-10-06
| | | | | | | | | | | | | | | | | | | | The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse errors. Replace the uses with standard structure definitions instead. This is similar to pci and usb device registration. Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/xenbus: Remove BUG_ON() when error string trucatedChen Gang2014-10-06
| | | | | | | | | | | | | | | | | | xenbus_va_dev_error() is for printing error, so when error string is too long to be truncated, need not BUG_ON(), still return truncation string is OK. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/xenbus: Correct the comments for xenbus_grant_ring()Chen Gang2014-10-06
| | | | | | | | | | | | | | | | A grant reference (which is a positive number) can indicate success, so the original comments need be improved. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * x86/xen: Set EFER.NX and EFER.SCE in PVH guestsMukesh Rathor2014-10-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes two bugs in PVH guests: - Not setting EFER.NX means the NX bit in page table entries is ignored on Intel processors and causes reserved bit page faults on AMD processors. - After the Xen commit 7645640d6ff1 ("x86/PVH: don't set EFER_SCE for pvh guest") PVH guests are required to set EFER.SCE to enable the SYSCALL instruction. Secondary VCPUs are started with pagetables with the NX bit set so EFER.NX must be set before using any stack or data segment. xen_pvh_cpu_early_init() is the new secondary VCPU entry point that sets EFER before jumping to cpu_bringup_and_idle(). Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen: eliminate scalability issues from initrd handlingJuergen Gross2014-10-03
| | | | | | | | | | | | | | | | | | | | Size restrictions native kernels wouldn't have resulted from the initrd getting mapped into the initial mapping. The kernel doesn't really need the initrd to be mapped, so use infrastructure available in Xen to avoid the mapping and hence the restriction. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen: sync some headers with xen treeJuergen Gross2014-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | To be able to use an initially unmapped initrd with xen the following header files must be synced to a newer version from the xen tree: include/xen/interface/elfnote.h include/xen/interface/xen.h As the KEXEC and DUMPCORE related ELFNOTES are not relevant for the kernel they are omitted from elfnote.h. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen: make pvscsi frontend dependant on xenbus frontendJuergen Gross2014-10-03
| | | | | | | | | | | | | | | | The pvscsi frontend driver requires the xenbus frontend driver. Reflect this in Kconfig. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
| * arm{,64}/xen: Remove "EXPERIMENTAL" in the description of the Xen optionsJulien Grall2014-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | The Xen ARM API is stable since Xen 4.4 and everything has been upstreamed in Linux for ARM and ARM64. Therefore we can drop "EXPERIMENTAL" from the Xen option in the both Kconfig. Signed-off-by: Julien Grall <julien.grall@linaro.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org
| * xen-scsifront: don't deadlock if the ring becomes fullDavid Vrabel2014-10-03
| | | | | | | | | | | | | | | | scsifront_action_handler() will deadlock on host->host_lock, if the ring is full and it has to wait for entries to become available. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>
| * x86: remove the Xen-specific _PAGE_IOMAP PTE flagDavid Vrabel2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The _PAGE_IO_MAP PTE flag was only used by Xen PV guests to mark PTEs that were used to map I/O regions that are 1:1 in the p2m. This allowed Xen to obtain the correct PFN when converting the MFNs read from a PTE back to their PFN. Xen guests no longer use _PAGE_IOMAP for this. Instead mfn_to_pfn() returns the correct PFN by using a combination of the m2p and p2m to determine if an MFN corresponds to a 1:1 mapping in the the p2m. Remove _PAGE_IOMAP, replacing it with _PAGE_UNUSED2 to allow for future uses of the PTE flag. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: "H. Peter Anvin" <hpa@zytor.com>
| * x86/xen: do not use _PAGE_IOMAP PTE flag for I/O mappingsDavid Vrabel2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since mfn_to_pfn() returns the correct PFN for identity mappings (as used for MMIO regions), the use of _PAGE_IOMAP is not required in pte_mfn_to_pfn(). Do not set the _PAGE_IOMAP flag in pte_pfn_to_mfn() and do not use it in pte_mfn_to_pfn(). This will allow _PAGE_IOMAP to be removed, making it available for future use. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * x86: skip check for spurious faults for non-present faultsDavid Vrabel2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a fault on a kernel address is due to a non-present page, then it cannot be the result of stale TLB entry from a protection change (RO to RW or NX to X). Thus the pagetable walk in spurious_fault() can be skipped. See the initial if in spurious_fault() and the tests in spurious_fault_check()) for the set of possible error codes checked for spurious faults. These are: IRUWP Before x00xx && ( 1xxxx || xxx1x ) After ( 10001 || 00011 ) && ( 1xxxx || xxx1x ) Thus the new condition is a subset of the previous one, excluding only non-present faults (I == 1 and W == 1 are mutually exclusive). This avoids spurious_fault() oopsing in some cases if the pagetables it attempts to walk are not accessible. This obscures the location of the original fault. This also fixes a crash with Xen PV guests when they access entries in the M2P corresponding to device MMIO regions. The M2P is mapped (read-only) by Xen into the kernel address space of the guest and this mapping may contains holes for non-RAM regions. Read faults will result in calls to spurious_fault(), but because the page tables for the M2P mappings are not accessible by the guest the pagetable walk would fault. This was not normally a problem as MMIO mappings would not normally result in a M2P lookup because of the use of the _PAGE_IOMAP bit the PTE. However, removing the _PAGE_IOMAP bit requires M2P lookups for MMIO mappings as well. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Dave Hansen <dave.hansen@intel.com>
| * xen/efi: Directly include needed headersDaniel Kiper2014-09-23
| | | | | | | | | | | | | | | | | | | | I discovered that some needed stuff is defined/declared in headers which are not included directly. Currently it works but if somebody remove required headers from currently included headers then build will break. So, just in case directly include all needed headers. Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen-scsiback: clean up a type issue in scsiback_make_tpg()Dan Carpenter2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | This code was confusing because we had an unsigned long and then we compared it to UINT_MAX and then we stored it in a u16. How many bytes is this supposed to have: 2, 4 or 16??? I've made it a u16 throughout. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen-scsifront: use GFP_ATOMIC under spin_lockDan Carpenter2014-09-23
| | | | | | | | | | | | | | | | | | | | This function is only called with a spin_lock held and IRQs disabled. The allocation is not allowed to sleep and NOIO is not sufficient, it has to be ATOMIC. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * MAINTAINERS: Add xen pvscsi maintainerJuergen Gross2014-09-23
| | | | | | | | | | | | | | | | Add myself as maintainer for the Xen pvSCSI drivers. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen-scsiback: Add Xen PV SCSI backend driverJuergen Gross2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduces the Xen pvSCSI backend. With pvSCSI it is possible for a Xen domU to issue SCSI commands to a SCSI LUN assigned to that domU. The SCSI commands are passed to the pvSCSI backend in a driver domain (usually Dom0) which is owner of the physical device. This allows e.g. to use SCSI tape drives in a Xen domU. The code is taken from the pvSCSI implementation in Xen done by Fujitsu based on Linux kernel 2.6.18. Changes from the original version are: - port to upstream kernel - put all code in just one source file - adapt to Linux style guide - use target core infrastructure instead doing pure pass-through - enable module unloading - support SG-list in grant page(s) - support task abort - remove redundant struct backend - allocate resources dynamically - correct minor error in scsiback_fast_flush_area - free allocated resources in case of error during I/O preparation - remove CDB emulation, now handled by target core infrastructure Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen-scsifront: Add Xen PV SCSI frontend driverJuergen Gross2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduces the Xen pvSCSI frontend. With pvSCSI it is possible for a Xen domU to issue SCSI commands to a SCSI LUN assigned to that domU. The SCSI commands are passed to the pvSCSI backend in a driver domain (usually Dom0) which is owner of the physical device. This allows e.g. to use SCSI tape drives in a Xen domU. The code is taken from the pvSCSI implementation in Xen done by Fujitsu based on Linux kernel 2.6.18. Changes from the original version are: - port to upstream kernel - put all code in just one source file - move module to appropriate location in kernel tree - adapt to Linux style guide - some minor code simplifications - replace constants with defines - remove not used defines - add support for larger SG lists by putting them in a granted page Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen: Add Xen pvSCSI protocol descriptionJuergen Gross2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the definition of pvSCSI protocol used between the pvSCSI frontend in a XEN domU and the pvSCSI backend in a XEN driver domain (usually Dom0). This header was originally provided by Fujitsu for Xen based on Linux 2.6.18. Changes are: - Added comments. - Adapt to Linux style guide. - Add support for larger SG-lists by putting them in an own granted page. - Remove stale definitions. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/events: support threaded irqs for interdomain event channelsJuergen Gross2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Export bind_interdomain_evtchn_to_irq() so drivers can use threaded interrupt handlers with: irq = bind_interdomain_evtchn_to_irq(remote_dom, remote_port); if (irq < 0) /* error */ ret = request_threaded_irq(...); Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/grant-table: refactor error cleanup in grow_gnttab_list()Chen Gang2014-09-23
| | | | | | | | | | | | | | | | | | The cleanup loop in grow_gnttab_list() is safe from the underflow of the unsigned 'i' since nr_glist_frames is >= 1, but refactor it anyway. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
| * xen/setup: Remap Xen Identity Mapped RAMMatt Rushton2014-09-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of ballooning up and down dom0 memory this remaps the existing mfns that were replaced by the identity map. The reason for this is that the existing implementation ballooned memory up and and down which caused dom0 to have discontiguous pages. In some cases this resulted in the use of bounce buffers which reduced network I/O performance significantly. This change will honor the existing order of the pages with the exception of some boundary conditions. To do this we need to update both the Linux p2m table and the Xen m2p table. Particular care must be taken when updating the p2m table since it's important to limit table memory consumption and reuse the existing leaf pages which get freed when an entire leaf page is set to the identity map. To implement this, mapping updates are grouped into blocks with table entries getting cached temporarily and then released. On my test system before: Total pages: 2105014 Total contiguous: 1640635 After: Total pages: 2105014 Total contiguous: 2098904 Signed-off-by: Matthew Rushton <mrushton@amazon.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
* | Merge tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linuxLinus Torvalds2014-10-11
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull file locking related changes from Jeff Layton: "This release is a little more busy for file locking changes than the last: - a set of patches from Kinglong Mee to fix the lockowner handling in knfsd - a pile of cleanups to the internal file lease API. This should get us a bit closer to allowing for setlease methods that can block. There are some dependencies between mine and Bruce's trees this cycle, and I based my tree on top of the requisite patches in Bruce's tree" * tag 'locks-v3.18-1' of git://git.samba.org/jlayton/linux: (26 commits) locks: fix fcntl_setlease/getlease return when !CONFIG_FILE_LOCKING locks: flock_make_lock should return a struct file_lock (or PTR_ERR) locks: set fl_owner for leases to filp instead of current->files locks: give lm_break a return value locks: __break_lease cleanup in preparation of allowing direct removal of leases locks: remove i_have_this_lease check from __break_lease locks: move freeing of leases outside of i_lock locks: move i_lock acquisition into generic_*_lease handlers locks: define a lm_setup handler for leases locks: plumb a "priv" pointer into the setlease routines nfsd: don't keep a pointer to the lease in nfs4_file locks: clean up vfs_setlease kerneldoc comments locks: generic_delete_lease doesn't need a file_lock at all nfsd: fix potential lease memory leak in nfs4_setlease locks: close potential race in lease_get_mtime security: make security_file_set_fowner, f_setown and __f_setown void return locks: consolidate "nolease" routines locks: remove lock_may_read and lock_may_write lockd: rip out deferred lock handling from testlock codepath NFSD: Get reference of lockowner when coping file_lock ...
| * | locks: fix fcntl_setlease/getlease return when !CONFIG_FILE_LOCKINGJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently they both just return 0. Fix them to return more appropriate values instead. For better or worse, most places in the kernel return -EINVAL when leases aren't available. -ENOLCK would probably have been better, but let's follow suit here in the case of F_SETLEASE. In the F_GETLEASE case, just return F_UNLCK since we know that no lease will have been set. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: flock_make_lock should return a struct file_lock (or PTR_ERR)Jeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | Eliminate the need for a return pointer. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: set fl_owner for leases to filp instead of current->filesJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like flock locks, leases are owned by the file description. Now that the i_have_this_lease check in __break_lease is gone, we don't actually use the fl_owner for leases for anything. So, it's now safe to set this more appropriately to the same value as the fl_file. While we're at it, fix up the comments over the fl_owner_t definition since they're rather out of date. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | locks: give lm_break a return valueJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Christoph suggests: "Add a return value to lm_break so that the lock manager can tell the core code "you can delete this lease right now". That gets rid of the games with the timeout which require all kinds of race avoidance code in the users." Do that here and have the nfsd lease break routine use it when it detects that there was a race between setting up the lease and it being broken. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: __break_lease cleanup in preparation of allowing direct removal of leasesJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | Eliminate an unneeded "flock" variable. We can use "fl" as a loop cursor everywhere. Add a any_leases_conflict helper function as well to consolidate a bit of code. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: remove i_have_this_lease check from __break_leaseJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I think that the intent of this code was to ensure that a process won't deadlock if it has one fd open with a lease on it and then breaks that lease by opening another fd. In that case it'll treat the __break_lease call as if it were non-blocking. This seems wrong -- the process could (for instance) be multithreaded and managing different fds via different threads. I also don't see any mention of this limitation in the (somewhat sketchy) documentation. Remove the check and the non-blocking behavior when i_have_this_lease is true. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | locks: move freeing of leases outside of i_lockJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There was only one place where we still could free a file_lock while holding the i_lock -- lease_modify. Add a new list_head argument to the lm_change operation, pass in a private list when calling it, and fix those callers to dispose of the list once the lock has been dropped. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: move i_lock acquisition into generic_*_lease handlersJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have a saner internal API for managing leases, we no longer need to mandate that the inode->i_lock be held over most of the lease code. Push it down into generic_add_lease and generic_delete_lease. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: define a lm_setup handler for leasesJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ...and move the fasync setup into it for fcntl lease calls. At the same time, change the semantics of how the file_lock double-pointer is handled. Up until now, on a successful lease return you got a pointer to the lock on the list. This is bad, since that pointer can no longer be relied on as valid once the inode->i_lock has been released. Change the code to instead just zero out the pointer if the lease we passed in ended up being used. Then the callers can just check to see if it's NULL after the call and free it if it isn't. The priv argument has the same semantics. The lm_setup function can zero the pointer out to signal to the caller that it should not be freed after the function returns. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: plumb a "priv" pointer into the setlease routinesJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In later patches, we're going to add a new lock_manager_operation to finish setting up the lease while still holding the i_lock. To do this, we'll need to pass a little bit of info in the fcntl setlease case (primarily an fasync structure). Plumb the extra pointer into there in advance of that. We declare this pointer as a void ** to make it clear that this is private info, and that the caller isn't required to set this unless the lm_setup specifically requires it. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | nfsd: don't keep a pointer to the lease in nfs4_fileJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we don't need to pass in an actual lease pointer to vfs_setlease on unlock, we can stop tracking a pointer to the lease in the nfs4_file. Switch all of the places that check the fi_lease to check fi_deleg_file instead. We always set that at the same time so it will have the same semantics. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: clean up vfs_setlease kerneldoc commentsJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some of the latter paragraphs seem ambiguous and just plain wrong. In particular the break_lease comment makes no sense. We call break_lease (and break_deleg) from all sorts of vfs-layer functions, so there is clearly such a method. Also get rid of some of the other comments about what's needed for a full implementation. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: generic_delete_lease doesn't need a file_lock at allJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that it's OK to pass in a NULL file_lock double pointer on a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to do just that. Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE with an error return. That's a problem we can handle without crashing the box if it occurs. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | nfsd: fix potential lease memory leak in nfs4_setleaseJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's unlikely to ever occur, but if there were already a lease set on the file then we could end up getting back a different pointer on a successful setlease attempt than the one we allocated. If that happens, the one we allocated could leak. In practice, I don't think this will happen due to the fact that we only try to set up the lease once per nfs4_file, but this error handling is a bit more correct given the current lease API. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: close potential race in lease_get_mtimeJeff Layton2014-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lease_get_mtime is called without the i_lock held, so there's no guarantee about the stability of the list. Between the time when we assign "flock" and then dereference it to check whether it's a lease and for write, the lease could be freed. Ensure that that doesn't occur by taking the i_lock before trying to check the lease. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | security: make security_file_set_fowner, f_setown and __f_setown void returnJeff Layton2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | security_file_set_fowner always returns 0, so make it f_setown and __f_setown void return functions and fix up the error handling in the callers. Cc: linux-security-module@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: consolidate "nolease" routinesJeff Layton2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GFS2 and NFS have setlease routines that always just return -EINVAL. Turn that into a generic routine that can live in fs/libfs.c. Cc: <linux-nfs@vger.kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: <cluster-devel@redhat.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | locks: remove lock_may_read and lock_may_writeJeff Layton2014-09-09
| | | | | | | | | | | | | | | | | | There are no callers of these functions. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | lockd: rip out deferred lock handling from testlock codepathJeff Layton2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As Kinglong points out, the nlm_block->b_fl field is no longer used at all. Also, vfs_test_lock in the generic locking code will only return FILE_LOCK_DEFERRED if FL_SLEEP is set, and it isn't here. The only other place that returns that value is the DLM lock code, but it only does that in dlm_posix_lock, never in dlm_posix_get. Remove all of the deferred locking code from the testlock codepath since it doesn't appear to ever be used anyway. I do have a small concern that this might cause a behavior change in the case where you have a block already sitting on the list when the testlock request comes in, but that looks like it doesn't really work properly anyway. I think it's best to just pass that down to vfs_test_lock and let the filesystem report that instead of trying to infer what's going on with the lock by looking at an existing block. Cc: cluster-devel@redhat.com Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
| * | NFSD: Get reference of lockowner when coping file_lockKinglong Mee2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v5: using nfs4_get_stateowner() instead of an inline function v3: Update based on Jeff's comments v2: Fix bad using of struct file_lock_operations for handle the owner Acked-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | NFSD: New helper nfs4_get_stateowner() for atomic_inc sop referenceKinglong Mee2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | v5: same as the first version Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | locks: Copy fl_lmops information for conflock in locks_copy_conflock()Kinglong Mee2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d5b9026a67 ([PATCH] knfsd: locks: flag NFSv4-owned locks) using fl_lmops field in file_lock for checking nfsd4 lockowner. But, commit 1a747ee0cc (locks: don't call ->copy_lock methods on return of conflicting locks) causes the fl_lmops of conflock always be NULL. Also, commit 0996905f93 (lockd: posix_test_lock() should not call locks_copy_lock()) caused the fl_lmops of conflock always be NULL too. Make sure copy the private information by fl_copy_lock() in struct file_lock_operations, merge __locks_copy_lock() to fl_copy_lock(). Jeff advice, "Set fl_lmops on conflocks, but don't set fl_ops. fl_ops are superfluous, since they are callbacks into the filesystem. There should be no need to bother the filesystem at all with info in a conflock. But, lock _ownership_ matters for conflocks and that's indicated by the fl_lmops. So you really do want to copy the fl_lmops for conflocks I think." v5: add missing calling of locks_release_private() in nlmsvc_testlock() v4: only copy fl_lmops for conflock, don't copy fl_ops Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | locks: New ops in lock_manager_operations for get/put ownerKinglong Mee2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFSD or other lockmanager may increase the owner's reference, so adds two new options for copying and releasing owner. v5: change order from 2/6 to 3/6 v4: rename lm_copy_owner/lm_release_owner to lm_get_owner/lm_put_owner Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | locks: Rename __locks_copy_lock() to locks_copy_conflock()Kinglong Mee2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jeff advice, " Right now __locks_copy_lock is only used to copy conflocks. It would be good to rename that to something more distinct (i.e.locks_copy_conflock), to make it clear that we're generating a conflock there." v5: change order from 3/6 to 2/6 v4: new patch only renaming function name Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | locks: Remove unused conf argument from lm_grantJoe Perches2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | This argument is always NULL so don't pass it around. [jlayton: remove dependencies on previous patches in series] Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
| * | locks: pass correct "before" pointer to locks_unlink_lock in generic_add_leaseJeff Layton2014-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The argument to locks_unlink_lock can't be just any pointer to a pointer. It must be a pointer to the fl_next field in the previous lock in the list. Cc: <stable@vger.kernel.org> # v3.15+ Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>