litmus-rt.git - The LITMUS^RT kernel.

	Commit message	Author	Age
*	dcache: d_obtain_alias callers don't all want DISCONNECTED	J. Bruce Fields	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are a few d_obtain_alias callers that are using it to get the root of a filesystem which may already have an alias somewhere else. This is not the same as the filehandle-lookup case, and none of them actually need DCACHE_DISCONNECTED set. It isn't really a serious problem, but it would really be clearer if we reserved DCACHE_DISCONNECTED for those cases where it's actually needed. In the btrfs case this was causing a spurious printk from nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED dentry. Josef worked around this by unsetting DCACHE_DISCONNECTED manually in 3a0dfa6a12e "Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol", and this replaces that workaround. Cc: Josef Bacik <jbacik@fb.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	dcache: d_splice_alias should ignore DCACHE_DISCONNECTED	J. Bruce Fields	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Any IS_ROOT() alias should be safe to use; there's nothing special about DCACHE_DISCONNECTED dentries. Note that this is in fact useful for filesystems such as btrfs which can legitimately encounter a directory with a preexisting IS_ROOT alias on a lookup that crosses into a subvolume. (Those aliases are currently marked DCACHE_DISCONNECTED--but not really for any good reason, and we'll change that soon.) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	dcache: d_splice_alias mustn't create directory aliases	J. Bruce Fields	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently if d_splice_alias finds a directory with an alias that is not IS_ROOT or not DCACHE_DISCONNECTED, it creates a duplicate directory. Duplicate directory dentries are unacceptable; it is better just to error out. (In the case of a local filesystem the most likely case is filesystem corruption: for example, perhaps two directories point to the same child directory, and the other parent has already been found and cached.) Note that distributed filesystems may encounter this case in normal operation if a remote host moves a directory to a location different from the one we last cached in the dcache. For that reason, such filesystems should instead use d_materialise_unique, which tries to move the old directory alias to the right place instead of erroring out. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	dcache: close d_move race in d_splice_alias	J. Bruce Fields	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \|	d_splice_alias will d_move an IS_ROOT() directory dentry into place if one exists. This should be safe as long as the dentry remains IS_ROOT, but I can't see what guarantees that: once we drop the i_lock all we hold here is the i_mutex on an unrelated parent directory. Instead copy the logic of d_materialise_unique. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	dcache: move d_splice_alias	J. Bruce Fields	2014-08-07
\| \| \| \| \| \| \| \| \|	Just a trivial move to locate it near (similar) d_materialise_unique code and save some forward references in a following patch. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	namei: trivial fix to vfs_rename_dir comment	J. Bruce Fields	2014-08-07
\| \| \| \| \| \| \| \| \|	Looks like the directory loop check is actually done in renameat? Whatever, leave this out rather than trying to keep it up to date with the code. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	VFS: allow ->d_manage() to declare -EISDIR in rcu_walk mode.	NeilBrown	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In REF-walk mode, ->d_manage can return -EISDIR to indicate that the dentry is not really a mount trap (or even a mount point) and that any mounts or any DCACHE_NEED_AUTOMOUNT flag should be ignored. RCU-walk mode doesn't currently support this, so if there is a dentry with DCACHE_NEED_AUTOMOUNT set but which shouldn't be a mount-trap, lookup_fast() will always drop into REF-walk mode. With this patch, an -EISDIR from ->d_manage will always cause mounts and automounts to be ignored, both in REF-walk and RCU-walk. Bug-fixed-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	cifs: support RENAME_NOREPLACE	Miklos Szeredi	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \|	This flag gives CIFS the ability to support its native rename semantics. Implementation is simple: just bail out before trying to hack around the noreplace semantics. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Steve French <smfrench@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	hostfs: support rename flags	Miklos Szeredi	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \|	Support RENAME_NOREPLACE and RENAME_EXCHANGE flags on hostfs if the underlying filesystem supports it. Since renameat2(2) is not yet in any libc, use syscall(2) to invoke the renameat2 syscall. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
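The hostfs message above mentions calling renameat2 through syscall(2) because no libc wrapper existed at the time. A minimal userspace sketch of that approach follows. The helper name `rename_with_flags` is hypothetical and is not taken from the hostfs patch; the flag values are assumed to match the kernel UAPI definitions and are only defined here if the installed headers lack them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD */
#include <unistd.h>       /* syscall() */
#include <sys/syscall.h>  /* SYS_renameat2, when the headers know about it */

/* Fallback definitions, assumed to match the kernel UAPI values. */
#ifndef RENAME_NOREPLACE
#define RENAME_NOREPLACE (1 << 0)
#endif
#ifndef RENAME_EXCHANGE
#define RENAME_EXCHANGE (1 << 1)
#endif

/*
 * Hypothetical helper: rename 'from' to 'to' relative to the current
 * directory, passing 'flags' straight to the renameat2 syscall.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int rename_with_flags(const char *from, const char *to,
			     unsigned int flags)
{
#ifdef SYS_renameat2
	return (int)syscall(SYS_renameat2, AT_FDCWD, from, AT_FDCWD, to, flags);
#else
	/* Headers too old to know the syscall number. */
	errno = ENOSYS;
	return -1;
#endif
}
```

A caller would typically pass RENAME_NOREPLACE and treat an EEXIST failure as "target already present"; whether a given filesystem accepts the flag at all depends on its ->rename2 support.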
*	shmem: support RENAME_EXCHANGE	Miklos Szeredi	2014-08-07
\| \| \| \| \| \| \| \| \| \| \|	This is really simple in tmpfs since the VFS already takes care of shuffling the dentries. Just adjust nlink on parent directories and touch c & mtimes. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	shmem: support RENAME_NOREPLACE	Miklos Szeredi	2014-08-07
\| \| \| \| \| \| \| \| \|	Implement ->rename2 instead of ->rename. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	btrfs: add RENAME_NOREPLACE	Miklos Szeredi	2014-08-07
\| \| \| \| \| \| \| \| \| \| \|	RENAME_NOREPLACE is trivial to implement for most filesystems: switch over to ->rename2() and check for the supported flags. The rest is done by the VFS. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Chris Mason <clm@fb.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	bad_inode: add ->rename2()	Miklos Szeredi	2014-08-07
\| \| \| \| \| \| \| \|	so we return -EIO instead of -EINVAL. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fs: call rename2 if exists	Miklos Szeredi	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Christoph Hellwig suggests: 1) make vfs_rename call ->rename2 if it exists instead of ->rename 2) switch all filesystems that you're adding NOREPLACE support for to use ->rename2 3) see how many ->rename instances we'll have left after a few iterations of 2. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
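The dispatch rule in step 1 above (prefer ->rename2 when present, fall back to ->rename) can be sketched in plain C. This is an illustration only: `struct inode_ops`, `dispatch_rename` and the two demo callbacks are simplified, hypothetical stand-ins, and the error codes chosen for the fallback paths are an assumption rather than a quote from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for an inode operations table (illustrative names). */
struct inode_ops {
	int (*rename)(const char *from, const char *to);
	int (*rename2)(const char *from, const char *to, unsigned int flags);
};

/* Prefer the flag-aware method; fall back to the legacy one. */
static int dispatch_rename(const struct inode_ops *ops, const char *from,
			   const char *to, unsigned int flags)
{
	if (!ops->rename && !ops->rename2)
		return -EPERM;   /* no rename method at all */
	if (ops->rename2)
		return ops->rename2(from, to, flags);
	if (flags)
		return -EINVAL;  /* legacy method cannot honour flags */
	return ops->rename(from, to);
}

/* Demo callbacks that only report which path was taken. */
static int demo_rename(const char *from, const char *to)
{
	(void)from;
	(void)to;
	return 1;
}

static int demo_rename2(const char *from, const char *to, unsigned int flags)
{
	(void)from;
	(void)to;
	(void)flags;
	return 2;
}
```

With this shape, a filesystem gains flag support simply by filling in the second pointer, which matches step 2 of the suggestion.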
*	kernel/acct.c: fix coding style warnings and errors	Ionut Alexa	2014-08-07
\| \| \| \| \| \|	Signed-off-by: Ionut Alexa <ionut.m.alexa@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	death to mnt_pinned	Al Viro	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than playing silly buggers with vfsmount refcounts, just have acct_on() ask fs/namespace.c for internal clone of file->f_path.mnt and replace it with said clone. Then attach the pin to original vfsmount. Voila - the clone will be alive until the file gets closed, making sure that underlying superblock remains active, etc., and we can drop the original vfsmount, so that it's not kept busy. If the file lives until the final mntput of the original vfsmount, we'll notice that there's an fs_pin (one in bsd_acct_struct that holds that file) and mnt_pin_kill() will take it out. Since ->kill() is synchronous, we won't proceed past that point until these files are closed (and private clones of our vfsmount are gone), so we get the same ordering warranties we used to get. mnt_pin()/mnt_unpin()/->mnt_pinned is gone now, and good riddance - it never became usable outside of kernel/acct.c (and racy wrt umount even there). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	make fs/{namespace,super}.c forget about acct.h	Al Viro	2014-08-07
\| \| \| \| \| \| \|	These externs belong in fs/internal.h. Rename (they are not acct-specific anymore) and move them over there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	take fs_pin stuff to fs/*	Al Viro	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a new field to fs_pin - kill(pin). That's what umount and r/o remount will be calling for all pins attached to vfsmount and superblock resp. Called after bumping the refcount, so it won't go away under us. Dropping the refcount is responsibility of the instance. All generic stuff moved to fs/fs_pin.c; the next step will rip all the knowledge of kernel/acct.c from fs/super.c and fs/namespace.c. After that - death to mnt_pin(); it was intended to be usable as generic mechanism for code that wants to attach objects to vfsmount, so that they would not make the sucker busy and would get killed on umount. Never got it right; it remained acct.c-specific all along. Now it's very close to being killable. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	start carving bsd_acct_struct up	Al Viro	2014-08-07
\| \| \| \| \| \| \| \|	pull generic parts into struct fs_pin. Eventually we want those to replace mnt_pin()/mnt_unpin() mess; that stuff will move to fs/*. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	acct: move mnt_pin() upwards.	Al Viro	2014-08-07
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	make acct_kill() wait for file closing.	Al Viro	2014-08-07
\| \| \| \| \| \| \|	Do actual closing of file via schedule_work(). And use __fput_sync() there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	drop ->s_umount around acct_auto_close()	Al Viro	2014-08-07
\| \| \| \| \| \| \| \| \| \|	just repeat the frozen check after regaining it, and check that sb is still alive. If several threads hit acct_auto_close() at the same time, acct_auto_close() will survive that just fine. And we really don't want to play with writes and closing the file with ->s_umount held exclusive - it's a deadlock country. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	acct: get rid of acct_lock for acct->count	Al Viro	2014-08-07
\| \| \| \| \| \| \| \| \|	* make acct->count atomic and acct freeing - rcu-delayed. * instead of grabbing acct_lock around the places where we take a reference, do that under rcu_read_lock() with atomic_long_inc_not_zero(). * have the new acct locked before making ns->bacct point to it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
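The "increment unless zero" reference grab described above can be illustrated in userspace with C11 atomics. This sketch covers only the counter logic; the RCU read-side protection and delayed freeing that make the pattern safe in the kernel are not modelled, and `struct counted`, `get_unless_zero` and `put_ref` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct counted {
	atomic_long count;  /* 0 means the object is already on its way out */
};

/* Take a reference only if the count has not already dropped to zero. */
static bool get_unless_zero(struct counted *obj)
{
	long old = atomic_load(&obj->count);

	while (old != 0) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&obj->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller released the last one. */
static bool put_ref(struct counted *obj)
{
	return atomic_fetch_sub(&obj->count, 1) == 1;
}
```

The point of the compare-and-swap loop is that a reader can never resurrect an object whose count has reached zero, which is what lets the lookup run without a global lock.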
*	acct: get rid of acct_list	Al Viro	2014-08-07
\| \| \| \| \| \| \| \|	Put these suckers on per-vfsmount and per-superblock lists instead. Note: right now it's still acct_lock for everything, but that's going to change. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	acct: simplify check_free_space()	Al Viro	2014-08-07
\| \| \| \| \| \| \| \| \|	a) file can't be NULL b) file can't be changed under us c) all writes are serialized by acct->lock; no need to mess with spinlock there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	acct: new lifetime rules	Al Viro	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do not reuse bsd_acct_struct after closing the damn thing. Structure lifetime is controlled by refcount now. We also have a mutex in there, held over closing and writing (the file is O_APPEND, so we are not losing any concurrency). As the result, we do not need to bother with get_file()/fput() on log write anymore. Moreover, do_acct_process() only needs acct itself; file and pidns are picked from it. Killed instances are distinguished by having NULL ->ns. Refcount is protected by acct_lock; anybody taking the mutex needs to grab a reference first. The things will get a lot simpler in the next commits - this is just the minimal chunk switching to the new lifetime rules. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	acct: serialize acct_on()	Al Viro	2014-08-07
\| \| \| \| \| \|	brute-force - on a global mutex that isn't nested into anything. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	acct() should honour the limits from the very beginning	Al Viro	2014-08-07
\| \| \| \| \| \|	We need to check free space on the first write to freshly opened log. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	split the slow path in acct_process() off	Al Viro	2014-08-07
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	separate namespace-independent parts of filling acct_t	Al Viro	2014-08-07
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	acct: switch to __kernel_write()	Al Viro	2014-08-07
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	acct: encode_comp_t(0) is 0, fortunately...	Al Viro	2014-08-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There was an amusing bogosity in ac_rw calculation - it tried to do encode_comp_t(encode_comp_t(0) / 1024). Seeing that comp_t is a 3-bit exponent + 13-bit mantissa... it's a good thing that 0 is represented by all-bits-clear. The history of that one is interesting - it was introduced in 2.1.68pre1, when acct.c had been reworked and moved to separate file. Two months later (2.1.86) somebody has noticed that the sucker won't compile - there was no task_struct::io_usage. At which point the ac_io calculation had changed from encode_comp_t(current->io_usage) to encode_comp_t(0) and the bug in the next line (absolutely real back then, had it ever managed to compile) became a harmless bogosity. Looks like nobody has ever noticed until now. Anyway, let's bury that idiocy now that it got noticed. 17 years is long enough... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
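To see why encoding an already-encoded value is wrong for anything except zero, here is a userspace sketch of a 3-bit base-8 exponent plus 13-bit mantissa format as described in the message above. It is modelled on the general shape of the kernel's encode_comp_t but should be read as an approximation: the rounding details are an assumption, `decode_comp_t` is added purely for illustration, and inputs too large for a 3-bit exponent are not handled.

```c
#include <stdint.h>

#define MANTSIZE 13                       /* 13-bit mantissa */
#define EXPSIZE  3                        /* 3-bit exponent, base 8 */
#define MAXFRACT ((1UL << MANTSIZE) - 1)  /* largest mantissa value */

typedef uint16_t comp_t;

/* Pack 'value' as exponent:mantissa, rounding on the last dropped digit. */
static comp_t encode_comp_t(unsigned long value)
{
	unsigned int exp = 0;
	unsigned long rnd = 0;

	while (value > MAXFRACT) {
		rnd = value & (1UL << (EXPSIZE - 1)); /* top bit being dropped */
		value >>= EXPSIZE;                    /* divide by 8 */
		exp++;
	}
	if (rnd && ++value > MAXFRACT) {          /* rounding overflowed */
		value >>= EXPSIZE;
		exp++;
	}
	return (comp_t)((exp << MANTSIZE) + value);
}

/* Illustrative inverse: mantissa scaled by 8^exponent. */
static unsigned long decode_comp_t(comp_t c)
{
	unsigned int exp = c >> MANTSIZE;

	return (unsigned long)(c & MAXFRACT) << (EXPSIZE * exp);
}
```

Under this sketch, zero encodes to zero, so the double encoding is harmless there, while a nonzero input such as 1 << 20 gives a different result when encoded twice than when encoded once after the division.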
*	Merge commit 'ccbf62d8a284cf181ac28c8e8407dd077d90dd4b' into for-next	Al Viro	2014-08-07
\|\ \| \| \| \| \| \|	backmerge to avoid kernel/acct.c conflict
\| *	sched: Make task->start_time nanoseconds based	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simplify the timespec to nsec/usec conversions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	sched: Make task->real_start_time nanoseconds based	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simplify the only user of this data by removing the timespec conversion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	time: Export nsecs_to_jiffies()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Required for moving drivers to the nanosecond based interfaces. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping: Provide ktime_get[*]_ns() helpers	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A lot of code converts either timespecs or ktime_t to nanoseconds. Provide helper functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
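The conversion these helpers save callers from writing by hand is small but easy to get wrong (overflow on 32-bit time fields, mixed units). A userspace analogue, assuming POSIX clock_gettime() is available; `timespec_to_ns` and `monotonic_ns` are hypothetical names and are not the kernel helpers themselves:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Collapse a seconds/nanoseconds pair into one 64-bit nanosecond count. */
static int64_t timespec_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/* Read the monotonic clock directly as nanoseconds; -1 on failure. */
static int64_t monotonic_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return -1;
	return timespec_to_ns(&ts);
}
```

Callers that only need a scalar timestamp can then compare or subtract values directly instead of carrying a two-field struct around.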
\| *	timekeeping: Remove ktime_get_monotonic_offset()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	No more users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	drm: Use ktime_mono_to_real()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert the monotonic timestamp with ktime_mono_to_real() in drm_calc_vbltimestamp_from_scanoutpos(). In get_drm_timestamp we can call either ktime_get() or ktime_get_real() depending on drm_timestamp_monotonic. No point in having two calls into the core for CLOCK_REALTIME. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	input: evdev: Use ktime_mono_to_real()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert the monotonic timestamp with ktime_mono_to_real() in evdev_events(). In evdev_queue_syn_dropped() we can call either ktime_get() or ktime_get_real() depending on the clkid. No point in having two calls for CLOCK_REALTIME. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timerfd: Use ktime_mono_to_real()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have a few other use cases of ktime_get_monotonic_offset() which can be optimized with ktime_mono_to_real(). The timerfd code uses the offset only for comparison, so we can use ktime_mono_to_real(0) for this as well. Funny enough text size shrinks with that on ARM and x86_64 !? Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping: Provide ktime_mono_to_any()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ktime based conversion function to map a monotonic time stamp to a different CLOCK. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping; Use ktime based data for ktime_get_update_offsets_tick()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	No need to juggle with timespecs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping: Use ktime_t data for ktime_get_update_offsets_now()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	No need to juggle with timespecs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping: Use ktime_t based data for ktime_get_clocktai()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping; Use ktime_t based data for ktime_get_boottime()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping: Use ktime_t based data for ktime_get_real()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Speed up the readout. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping: Provide ktime_get_with_offset()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Provide a helper function which lets us implement ktime_t based interfaces for real, boot and tai clocks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping: Use ktime_t based data for ktime_get()	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Speed up ktime_get() by using ktime_t based data. Text size shrinks by 64 bytes on x86_64. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>
\| *	timekeeping: Provide internal ktime_t based data	Thomas Gleixner	2014-07-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ktime_t based interfaces are used a lot in performance critical code paths. Add ktime_t based data so the interfaces don't have to convert from the xtime/timespec based data. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>