aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
...
| * ceph: remove useless BUG_ONYan, Zheng2016-03-25
| | | | | | | | | | | | ceph_osdc_start_request() never return -EOLDSNAP Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: don't enable rbytes mount option by defaultYan, Zheng2016-03-25
| | | | | | | | | | | | | | | | When rbytes mount option is enabled, directory size is recursive size. Recursive size is not updated instantly. This can cause directory size to change between successive stat(1) Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * ceph: encode ctime in cap messageYan, Zheng2016-03-25
| | | | | | | | Signed-off-by: Yan, Zheng <zyan@redhat.com>
| * libceph: behave in mon_fault() if cur_mon < 0Ilya Dryomov2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | This can happen if __close_session() in ceph_monc_stop() races with a connection reset. We need to ignore such faults, otherwise it's likely we would take !hunting, call __schedule_delayed() and end up with delayed_work() executing on invalid memory, among other things. The (two!) con->private tests are useless, as nothing ever clears con->private. Nuke them. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * libceph: reschedule tick in mon_fault()Ilya Dryomov2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | Doing __schedule_delayed() in the hunting branch is pointless, as the tick will have already been scheduled by then. What we need to do instead is *reschedule* it in the !hunting branch, after reopen_session() changes hunt_mult, which affects the delay. This helps with spacing out connection attempts and avoiding things like two back-to-back attempts followed by a longer period of waiting around. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * libceph: introduce and switch to reopen_session()Ilya Dryomov2016-03-25
| | | | | | | | | | | | | | | | hunting is now set in __open_session() and cleared in finish_hunting(), instead of all around. The "session lost" message is printed not only on connection resets, but also on keepalive timeouts. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * libceph: monc hunt rate is 3s with backoff up to 30sIlya Dryomov2016-03-25
| | | | | | | | | | | | | | | | | | | | | | Unless we are in the process of setting up a client (i.e. connecting to the monitor cluster for the first time), apply a backoff: every time we want to reopen a session, increase our timeout by a multiple (currently 2); when we complete the connection, reduce that multipler by 50%. Mirrors ceph.git commit 794c86fd289bd62a35ed14368fa096c46736e9a2. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * libceph: monc ping rate is 10sIlya Dryomov2016-03-25
| | | | | | | | | | | | | | | | | | | | | | Split ping interval and ping timeout: ping interval is 10s; keepalive timeout is 30s. Make monc_ping_timeout a constant while at it - it's not actually exported as a mount option (and the rest of tick-related settings won't be either), so it's got no place in ceph_options. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * libceph: pick a different monitor when reconnectingIlya Dryomov2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | Don't try to reconnect to the same monitor when we fail to establish a session within a timeout or it's lost. For that, pick_new_mon() needs to see the old value of cur_mon, so don't clear it in __close_session() - all calls to __close_session() but one are followed by __open_session() anyway. __open_session() is only called when a new session needs to be established, so the "already open?" branch, which is now in the way, is simply dropped. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * libceph: revamp subs code, switch to SUBSCRIBE2 protocolIlya Dryomov2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is currently hard-coded in the mon_client that mdsmap and monmap subs are continuous, while osdmap sub is always "onetime". To better handle full clusters/pools in the osd_client, we need to be able to issue continuous osdmap subs. Revamp subs code to allow us to specify for each sub whether it should be continuous or not. Although not strictly required for the above, switch to SUBSCRIBE2 protocol while at it, eliminating the ambiguity between a request for "every map since X" and a request for "just the latest" when we don't have a map yet (i.e. have epoch 0). SUBSCRIBE2 feature bit is now required - it's been supported since pre-argonaut (2010). Move "got mdsmap" call to the end of ceph_mdsc_handle_map() - calling in before we validate the epoch and successfully install the new map can mess up mon_client sub state. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * libceph: decouple hunting and subs managementIlya Dryomov2016-03-25
| | | | | | | | | | | | | | Coupling hunting state with subscribe state is not a good idea. Clear hunting when we complete the authentication handshake. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
| * libceph: move debugfs initialization into __ceph_open_session()Ilya Dryomov2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our debugfs dir name is a concatenation of cluster fsid and client unique ID ("global_id"). It used to be the case that we learned global_id first, nowadays we always learn fsid first - the monmap is sent before any auth replies are. ceph_debugfs_client_init() call in ceph_monc_handle_map() is therefore never executed and can be removed. Its counterpart in handle_auth_reply() doesn't really belong there either: having to do monc->client and unlocking early to work around lockdep is a testament to that. Move it into __ceph_open_session(), where it can be called unconditionally. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
* | Merge tag 'ofs-pull-tag-1' of ↵Linus Torvalds2016-03-26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux Pull orangefs filesystem from Mike Marshall. This finally merges the long-pending orangefs filesystem, which has been much cleaned up with input from Al Viro over the last six months. From the documentation file: "OrangeFS is an LGPL userspace scale-out parallel storage system. It is ideal for large storage problems faced by HPC, BigData, Streaming Video, Genomics, Bioinformatics. Orangefs, originally called PVFS, was first developed in 1993 by Walt Ligon and Eric Blumer as a parallel file system for Parallel Virtual Machine (PVM) as part of a NASA grant to study the I/O patterns of parallel programs. Orangefs features include: - Distributes file data among multiple file servers - Supports simultaneous access by multiple clients - Stores file data and metadata on servers using local file system and access methods - Userspace implementation is easy to install and maintain - Direct MPI support - Stateless" see Documentation/filesystems/orangefs.txt for more in-depth details. * tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: (174 commits) orangefs: fix orangefs_superblock locking orangefs: fix do_readv_writev() handling of error halfway through orangefs: have ->kill_sb() evict the VFS side of things first orangefs: sanitize ->llseek() orangefs-bufmap.h: trim unused junk orangefs: saner calling conventions for getting a slot orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer orangefs: get rid of readdir_handle_s ornagefs: ensure that truncate has an up to date inode size orangefs: move code which sets i_link to orangefs_inode_getattr orangefs: remove needless wrapper around GFP_KERNEL orangefs: remove wrapper around mutex_lock(&inode->i_mutex) orangefs: refactor inode type or link_target change detection orangefs: use new getattr for revalidate and remove old getattr orangefs: use new getattr in inode getattr and permission orangefs: use new orangefs_inode_getattr to get size in write and llseek orangefs: use new orangefs_inode_getattr to create new inodes orangefs: rename orangefs_inode_getattr to orangefs_inode_old_getattr orangefs: remove inode->i_lock wrapper orangefs: put register_chrdev immediately before register_filesystem ...
| * | orangefs: fix orangefs_superblock lockingAl Viro2016-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * switch orangefs_remount() to taking ORANGEFS_SB(sb) instead of sb * remove from the list _before_ orangefs_unmount() - request_mutex in the latter will make sure that nothing observed in the loop in ORANGEFS_DEV_REMOUNT_ALL handling will get freed until the end of loop * on removal, keep the forward pointer and zero the back one. That way we can drop and regain the spinlock in the loop body (again, ORANGEFS_DEV_REMOUNT_ALL one) and still be able to get to the rest of the list. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: fix do_readv_writev() handling of error halfway throughAl Viro2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | Error should only be returned if nothing had been read/written. Otherwise we need to report a short read/write instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: have ->kill_sb() evict the VFS side of things firstAl Viro2016-03-25
| | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: sanitize ->llseek()Al Viro2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | a) open files can't have NULL inodes b) it's SEEK_END, not ORANGEFS_SEEK_END; no need to get cute. c) make_bad_inode() on lseek()? Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs-bufmap.h: trim unused junkAl Viro2016-03-25
| | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: saner calling conventions for getting a slotAl Viro2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | just have it return the slot number or -E... - the caller checks the sign anyway Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointerAl Viro2016-03-25
| | | | | | | | | | | | | | | | | | | | | it's always __orangefs_bufmap Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: get rid of readdir_handle_sAl Viro2016-03-25
| | | | | | | | | | | | | | | | | | | | | | | | no point, really - we couldn't keep those across the calls of getdents(); it would be too easy to DoS, having all slots exhausted. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | ornagefs: ensure that truncate has an up to date inode sizeMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: move code which sets i_link to orangefs_inode_getattrMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | | | | | | | Everything else setting inode->i_ values is in there. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove needless wrapper around GFP_KERNELMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove wrapper around mutex_lock(&inode->i_mutex)Martin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: refactor inode type or link_target change detectionMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: use new getattr for revalidate and remove old getattrMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: use new getattr in inode getattr and permissionMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: use new orangefs_inode_getattr to get size in write and llseekMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: use new orangefs_inode_getattr to create new inodesMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: rename orangefs_inode_getattr to orangefs_inode_old_getattrMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | | | | | | | | | | This is motivated by orangefs_inode_old_getattr's habit of writing over live inodes. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove inode->i_lock wrapperMartin Brandenburg2016-03-23
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: put register_chrdev immediately before register_filesystemMartin Brandenburg2016-03-17
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove paranoia in orangefs_set_inodeMartin Brandenburg2016-03-17
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: sanitize listxattr and return EIO on impossible valuesMartin Brandenburg2016-03-17
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | orangefs: remove unused reference to xattr key lengthMartin Brandenburg2016-03-17
| | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: adjust unwind on module init failure.Mike Marshall2016-03-17
| | | | | | | | | | | | Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: fix sloppy cleanups of debugfs and sysfs init failures.Mike Marshall2016-03-14
| | | | | | | | | | | | Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: follow_link -> get_link changeMike Marshall2016-03-14
| | | | | | | | | | | | Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: Extra sanity insurance on buffer before using string functions on it.Mike Marshall2016-03-14
| | | | | | | | | | | | Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | Orangefs: merge to v4.5Mike Marshall2016-03-14
| |\ \ | | | | | | | | | | | | | | | | | | | | Merge tag 'v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into current Linux 4.5
| * | | orangefs: make fs_mount_pending staticMartin Brandenburg2016-03-09
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | orangefs: Avoid symlink upcall if target is too long.Martin Brandenburg2016-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the client-core detected this condition by sheer luck! Since we used strncpy, no NUL byte would be included on the name. The client-core would call strlen, which would read past the end of its buffer, but return a number large enough that the client-core would return ENAMETOOLONG. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | Orangefs: improve the POSIXness of interrupted writes...Mike Marshall2016-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't return EINTR on interrupted writes if some data has already been written. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | Orangefs: add a new gossip statementMike Marshall2016-03-09
| | | | | | | | | | | | | | | | Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | Orangefs: improve gossip statementsMike Marshall2016-03-03
| | | | | | | | | | | | | | | | Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | Orangefs: update orangefs.txtMike Marshall2016-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Al Viro has cleaned up the way ops are processed and waited for, now orangefs.txt has an overview of how it works. Several recent related commits have added to the comments in the code as well. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | Orangefs: code sanitation.Mike Marshall2016-02-26
| | | | | | | | | | | | | | | | Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | orangefs: remove unused 'diff' functionArnd Bergmann2016-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | orangefs contains a helper function to calculate the difference between two timeval structures. We are trying to remove all instances of timespec from the kernel, and this one is not used at all, so let's remove it now. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mike Marshall <hubcap@omnibond.com>
| * | | orangefs: avoid time conversion functionArnd Bergmann2016-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new orangefs code uses a helper function to read a time field to its private structures from struct iattr. This will conflict with the move to 64-bit timestamps in the kernel and is generally not necessary. This replaces the conversion with a simple cast to time64_t that shows what is going on. As the orangefs-internal representation already uses 64-bit timestamps, there should be no ambiguity to negative values, and the cast ensures that we treat them as times before 1970 on both 32-bit and 64-bit architectures, rather than times after 2038. This patch keeps that behavior. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mike Marshall <hubcap@omnibond.com>