afs: Overhaul volume and server record caching and fileserver rotation

The current code assumes that volumes and servers are per-cell and are
never shared, but this is not enforced, and, indeed, public cells do exist
that are aliases of each other.  Further, an organisation can, say, set up
a public cell and a private cell with overlapping, but not identical, sets
of servers.  The difference is purely in the database attached to the VL
servers.

The current code will malfunction if it sees a server in two cells as it
assumes global address -> server record mappings and that each server is in
just one cell.

Further, each server may have multiple addresses - and may have addresses
of different families (IPv4 and IPv6, say).

To this end, the following structural changes are made:

 (1) Server record management is overhauled:

     (a) Server records are made independent of cell.  The namespace keeps
         track of them, volume records have lists of them and each vnode
         has a server on which its callback interest currently resides.

     (b) The cell record no longer keeps a list of servers known to be in
         that cell.

     (c) The server records are now kept in a flat list because there's no
         single address to sort on.

     (d) Server records are now keyed by their UUID within the namespace.

     (e) The addresses for a server are obtained with VL.GetAddrsU rather
         than with VL.GetEntryByName, using the server's UUID as a
         parameter.

     (f) Cached server records are garbage collected after a period of
         non-use and are counted out of existence before purging is
         allowed to complete.  This protects the work functions against
         rmmod.

     (g) The servers list is now in /proc/fs/afs/servers.

 (2) Volume record management is overhauled:

     (a) An RCU-replaceable server list is introduced.  This tracks both
         servers and their corresponding callback interests.

     (b) The superblock is now keyed on cell record and numeric volume ID.

     (c) The volume record is now tied to the superblock which mounts it,
         and is activated when mounted and deactivated when unmounted.
         This makes it easier to handle the cache cookie without causing a
         double-use in fscache.

     (d) The volume record is loaded from the VLDB using
         VL.GetEntryByNameU to get the server UUID list.

     (e) The volume name is updated if it is seen to have changed when the
         volume is updated (the update is keyed on the volume ID).

 (3) The vlocation record is got rid of and VLDB records are no longer
     cached.  Sufficient information is stored in the volume record,
     though an update to a volume record is now no longer shared between
     related volumes (volumes come in bundles of three: R/W, R/O and
     backup).

And the following procedural changes are made:

 (1) The fileserver cursor introduced previously is now fleshed out and
     used to iterate over fileservers and their addresses.

 (2) Volume status is checked during iteration, and the server list is
     replaced if a change is detected.

 (3) Server status is checked during iteration, and the address list is
     replaced if a change is detected.

 (4) The abort code is saved into the address list cursor and
     -ECONNABORTED returned in afs_make_call() if a remote abort happened
     rather than translating the abort into an error message.  This allows
     actions to be taken depending on the abort code more easily.

     (a) If a VMOVED abort is seen then this is handled by rechecking the
         volume and restarting the iteration.

     (b) If a VBUSY, VRESTARTING or VSALVAGING abort is seen then this is
         handled by sleeping for a short period and retrying and/or trying
         other servers that might serve that volume.  A message is also
         displayed once until the condition has cleared.

     (c) If a VOFFLINE abort is seen, then this is handled as VBUSY for
         the moment.

     (d) If a VNOVOL abort is seen, the volume is rechecked in the VLDB to
         see if it has been deleted; if not, the fileserver is probably
         indicating that the volume couldn't be attached and needs
         salvaging.

     (e) If statfs() sees one of these aborts, it does not sleep, but
         rather returns an error, so as not to block the umount program.

 (5) The fileserver iteration functions in vnode.c are now merged into
     their callers and more heavily macroised around the cursor.  vnode.c
     is removed.

 (6) Operations on a particular vnode are serialised on that vnode because
     the server will lock that vnode whilst it operates on it, so a second
     op sent will just have to wait.

 (7) Fileservers are probed with FS.GetCapabilities before being used.
     This is where service upgrade will be done.

 (8) A callback interest on a fileserver is set up before an FS operation
     is performed and passed through to afs_make_call() so that it can be
     set on the vnode if the operation returns a callback.  The callback
     interest is passed through to afs_iget() also so that it can be set
     there too.

In general, record updating is done on an as-needed basis when we try to
access servers, volumes or vnodes rather than offloading it to work items
and special threads.

Notes:

 (1) Pre AFS-3.4 servers are no longer supported, though this can be added
     back if necessary (AFS-3.4 was released in 1998).

 (2) VBUSY is retried forever for the moment at intervals of 1s.

 (3) /proc/fs/afs/<cell>/servers no longer exists.

Signed-off-by: David Howells <dhowells@redhat.com>
author: David Howells <dhowells@redhat.com> 2017-11-02 11:27:50 -0400
committer: David Howells <dhowells@redhat.com> 2017-11-13 10:38:19 -0500
commit: d2ddc776a4581d900fc3bdc7803b403daae64d88
tree: 6c5f6bac9e9e9a326ed0a9248914b53c421558e5
parent: 9cc6fc50f7bc69ac28bee45eed13cbc65a86210f
26 files changed, 2795 insertions, 2575 deletions
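The abort handling in procedural change (4) of the commit message amounts to a small decision table. The following is a minimal userspace sketch of that table, not the kernel code: the enum names, `choose_next_step()` and the `may_sleep` flag are hypothetical, the enum values are labels and not AFS wire values, and it assumes that point (e) (statfs() must not sleep) only affects the aborts that would otherwise sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the abort conditions named in the commit message.
 * These are NOT the AFS wire values; they only tag the cases.
 */
enum abort_kind {
	ABORT_VMOVED,
	ABORT_VBUSY,
	ABORT_VRESTARTING,
	ABORT_VSALVAGING,
	ABORT_VOFFLINE,
	ABORT_VNOVOL,
	ABORT_OTHER,
};

/* Hypothetical actions the rotation loop could take next. */
enum next_step {
	STEP_RECHECK_VOLUME_AND_RESTART, /* VMOVED: recheck volume, restart iteration */
	STEP_SLEEP_AND_RETRY,            /* busy-type aborts: wait, then retry or rotate */
	STEP_RECHECK_VLDB,               /* VNOVOL: see whether the volume was deleted */
	STEP_FAIL,                       /* hand an error back to the caller */
};

/* Map an abort condition to the next step.  may_sleep is false for callers
 * such as statfs() that must return an error instead of sleeping.
 */
enum next_step choose_next_step(enum abort_kind kind, bool may_sleep)
{
	switch (kind) {
	case ABORT_VMOVED:
		return STEP_RECHECK_VOLUME_AND_RESTART;
	case ABORT_VBUSY:
	case ABORT_VRESTARTING:
	case ABORT_VSALVAGING:
	case ABORT_VOFFLINE: /* handled as VBUSY for the moment */
		return may_sleep ? STEP_SLEEP_AND_RETRY : STEP_FAIL;
	case ABORT_VNOVOL:
		return STEP_RECHECK_VLDB;
	default:
		return STEP_FAIL;
	}
}
```

Keeping the policy in one function is the point of saving the raw abort code in the cursor: the caller decides what to do with it, rather than receiving an already-translated errno.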
diff --git a/fs/afs/Makefile b/fs/afs/Makefile
index 192d476d7c76..45b7fc405fa6 100644
--- a/fs/afs/Makefile
+++ b/fs/afs/Makefile
@@ -24,11 +24,10 @@ kafs-objs := \
        rxrpc.o \
        security.o \
        server.o \
+        server_list.o \
        super.o \
        netdevices.o \
        vlclient.o \
-        vlocation.o \
-        vnode.o \
        volume.o \
        write.o \
        xattr.o
diff --git a/fs/afs/addr_list.c b/fs/afs/addr_list.c
index ecb9c72aebd2..b91e59a77f0e 100644
--- a/fs/afs/addr_list.c
+++ b/fs/afs/addr_list.c
@@ -228,6 +228,37 @@ struct afs_addr_list *afs_dns_query(struct afs_cell *cell, time64_t *_expiry)
 }
 /*
+ * Merge an IPv4 entry into a fileserver address list.
+ */
+void afs_merge_fs_addr4(struct afs_addr_list *alist, __be32 xdr)
+{
+        struct sockaddr_in6 *a;
+        int i;
+        for (i = 0; i < alist->nr_ipv4; i++) {
+                a = &alist->addrs[i].transport.sin6;
+                if (xdr == a->sin6_addr.s6_addr32[3])
+                        return;
+                if (xdr < a->sin6_addr.s6_addr32[3])
+                        break;
+        }
+        if (i < alist->nr_addrs)
+                memmove(alist->addrs + i + 1,
+                        alist->addrs + i,
+                        sizeof(alist->addrs[0]) * (alist->nr_addrs - i));
+        a = &alist->addrs[i].transport.sin6;
+        a->sin6_port              = htons(AFS_FS_PORT);
+        a->sin6_addr.s6_addr32[0] = 0;
+        a->sin6_addr.s6_addr32[1] = 0;
+        a->sin6_addr.s6_addr32[2] = htonl(0xffff);
+        a->sin6_addr.s6_addr32[3] = xdr;
+        alist->nr_ipv4++;
+        alist->nr_addrs++;
+}
+/*
 * Get an address to try.
 */
 bool afs_iterate_addresses(struct afs_addr_cursor *ac)
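The afs_merge_fs_addr4() hunk above keeps the IPv4 part of the address list sorted and free of duplicates by scanning for the insertion point and shifting the tail with memmove(). The same technique can be sketched in userspace as follows. This is an illustration under assumptions: `struct demo_addr_list`, `DEMO_MAX_ADDRS` and `demo_merge_addr()` are hypothetical names, plain `uint32_t` values stand in for the v4-mapped `sockaddr_in6` entries, and the capacity check is an addition that the hunk does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_ADDRS 8 /* hypothetical capacity for this sketch */

struct demo_addr_list {
	size_t   nr_addrs;
	uint32_t addrs[DEMO_MAX_ADDRS];
};

/* Insert addr so that the list stays in ascending order.
 * Returns 1 if inserted, 0 if it was already present or the list is full.
 */
int demo_merge_addr(struct demo_addr_list *alist, uint32_t addr)
{
	size_t i;

	/* Find the first entry that is not smaller than addr. */
	for (i = 0; i < alist->nr_addrs; i++) {
		if (addr == alist->addrs[i])
			return 0; /* duplicate: nothing to do */
		if (addr < alist->addrs[i])
			break;
	}

	if (alist->nr_addrs >= DEMO_MAX_ADDRS)
		return 0;

	/* Open a gap at position i by shifting the tail up one slot. */
	if (i < alist->nr_addrs)
		memmove(&alist->addrs[i + 1], &alist->addrs[i],
			sizeof(alist->addrs[0]) * (alist->nr_addrs - i));

	alist->addrs[i] = addr;
	alist->nr_addrs++;
	return 1;
}
```

A caller would start from a zeroed list and call demo_merge_addr() once per address received; repeated addresses are dropped silently.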
diff --git a/fs/afs/afs_fs.h b/fs/afs/afs_fs.h
index 05395d0f1941..d47b6d01e4c0 100644
--- a/fs/afs/afs_fs.h
+++ b/fs/afs/afs_fs.h
@@ -38,6 +38,7 @@ enum AFS_FS_Operations {
        FSFETCHDATA64           = 65537, /* AFS Fetch file data */
        FSSTOREDATA64           = 65538, /* AFS Store file data */
        FSGIVEUPALLCALLBACKS    = 65539, /* AFS Give up all outstanding callbacks on a server */
+        FSGETCAPABILITIES       = 65540, /* Probe and get the capabilities of a fileserver */
 };
 enum AFS_FS_Errors {
diff --git a/fs/afs/afs_vl.h b/fs/afs/afs_vl.h
index 4eaa620992c8..6350b417aee9 100644
--- a/fs/afs/afs_vl.h
+++ b/fs/afs/afs_vl.h
@@ -88,4 +88,46 @@ struct afs_vldbentry {
 #define AFS_VLDB_MAXNAMELEN 65
+struct afs_ListAddrByAttributes__xdr {
+        __be32                  Mask;
+#define AFS_VLADDR_IPADDR       0x1     /* Match by ->ipaddr */
+#define AFS_VLADDR_INDEX        0x2     /* Match by ->index */
+#define AFS_VLADDR_UUID         0x4     /* Match by ->uuid */
+        __be32                  ipaddr;
+        __be32                  index;
+        __be32                  spare;
+        struct afs_uuid__xdr    uuid;
+};
+struct afs_uvldbentry__xdr {
+        __be32                  name[AFS_VLDB_MAXNAMELEN];
+        __be32                  nServers;
+        struct afs_uuid__xdr    serverNumber[AFS_NMAXNSERVERS];
+        __be32                  serverUnique[AFS_NMAXNSERVERS];
+        __be32                  serverPartition[AFS_NMAXNSERVERS];
+        __be32                  serverFlags[AFS_NMAXNSERVERS];
+        __be32                  volumeId[AFS_MAXTYPES];
+        __be32                  cloneId;
+        __be32                  flags;
+        __be32                  spares1;
+        __be32                  spares2;
+        __be32                  spares3;
+        __be32                  spares4;
+        __be32                  spares5;
+        __be32                  spares6;
+        __be32                  spares7;
+        __be32                  spares8;
+        __be32                  spares9;
+};
+struct afs_address_list {
+        refcount_t              usage;
+        unsigned int            version;
+        unsigned int            nr_addrs;
+        struct sockaddr_rxrpc   addrs[];
+};
+extern void afs_put_address_list(struct afs_address_list *alist);
 #endif /* AFS_VL_H */
diff --git a/fs/afs/callback.c b/fs/afs/callback.c
index 82f4c7a3b7b6..f4291b576054 100644
--- a/fs/afs/callback.c
+++ b/fs/afs/callback.c
@@ -26,10 +26,10 @@
 * - Called with volume->server_sem held.
 */
 int afs_register_server_cb_interest(struct afs_vnode *vnode,
-                                    struct afs_cb_interest **ppcbi,
+                                    struct afs_server_entry *entry)
-                                    struct afs_server *server)
 {
-        struct afs_cb_interest *cbi = *ppcbi, *vcbi, *new, *x;
+        struct afs_cb_interest *cbi = entry->cb_interest, *vcbi, *new, *x;
+        struct afs_server *server = entry->server;
 again:
        vcbi = vnode->cb_interest;
@@ -47,7 +47,7 @@ again:
                if (!cbi && vcbi->server == server) {
                        afs_get_cb_interest(vcbi);
-                        x = cmpxchg(ppcbi, cbi, vcbi);
+                        x = cmpxchg(&entry->cb_interest, cbi, vcbi);
                        if (x != cbi) {
                                cbi = x;
                                afs_put_cb_interest(afs_v2net(vnode), vcbi);
@@ -72,7 +72,7 @@ again:
                list_add_tail(&new->cb_link, &server->cb_interests);
                write_unlock(&server->cb_break_lock);
-                x = cmpxchg(ppcbi, cbi, new);
+                x = cmpxchg(&entry->cb_interest, cbi, new);
                if (x == cbi) {
                        cbi = new;
                } else {
@@ -137,7 +137,7 @@ void afs_put_cb_interest(struct afs_net *net, struct afs_cb_interest *cbi)
 */
 void afs_init_callback_state(struct afs_server *server)
 {
-        if (!test_and_clear_bit(AFS_SERVER_NEW, &server->flags))
+        if (!test_and_clear_bit(AFS_SERVER_FL_NEW, &server->flags))
                server->cb_s_break++;
 }
@@ -233,12 +233,12 @@ void afs_break_callbacks(struct afs_server *server, size_t count,
 /*
 * Clear the callback interests in a server list.
 */
-void afs_clear_callback_interests(struct afs_net *net, struct afs_volume *volume)
+void afs_clear_callback_interests(struct afs_net *net, struct afs_server_list *slist)
 {
        int i;
-        for (i = 0; i < ARRAY_SIZE(volume->cb_interests); i++) {
+        for (i = 0; i < slist->nr_servers; i++) {
-                afs_put_cb_interest(net, volume->cb_interests[i]);
+                afs_put_cb_interest(net, slist->servers[i].cb_interest);
-                volume->cb_interests[i] = NULL;
+                slist->servers[i].cb_interest = NULL;
        }
 }
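The cmpxchg() calls in afs_register_server_cb_interest() publish a callback interest into a server-list slot only if nobody else filled that slot first. A rough userspace analogue using C11 atomics is sketched below. All `demo_*` names are hypothetical, the sketch assumes the slot is expected to be empty (NULL) when the exchange is attempted, and it leaves out the reference counting that the kernel code does around the exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_interest {
	int id;
};

struct demo_entry {
	_Atomic(struct demo_interest *) cb_interest; /* shared slot */
};

/* Try to publish candidate into an empty slot.
 * Returns the pointer that ends up in the slot: candidate if this caller
 * won the race, otherwise whatever another caller installed first.
 */
struct demo_interest *demo_install_interest(struct demo_entry *entry,
					    struct demo_interest *candidate)
{
	struct demo_interest *expected = NULL;

	if (atomic_compare_exchange_strong(&entry->cb_interest,
					   &expected, candidate))
		return candidate;

	/* On failure, expected has been updated to the slot's current value. */
	return expected;
}
```

The loser of the race is expected to drop its own candidate and use the returned pointer, which mirrors the put-and-retry branch visible in the hunk.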
diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index a0e08d3a108c..1858c91169e4 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -146,13 +146,10 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net,
        atomic_set(&cell->usage, 2);
        INIT_WORK(&cell->manager, afs_manage_cell);
-        rwlock_init(&cell->servers_lock);
-        INIT_LIST_HEAD(&cell->servers);
-        init_rwsem(&cell->vl_sem);
-        INIT_LIST_HEAD(&cell->vl_list);
-        spin_lock_init(&cell->vl_lock);
        cell->flags = ((1 << AFS_CELL_FL_NOT_READY) |
                       (1 << AFS_CELL_FL_NO_LOOKUP_YET));
+        INIT_LIST_HEAD(&cell->proc_volumes);
+        rwlock_init(&cell->proc_lock);
        rwlock_init(&cell->vl_addrs_lock);
        /* Fill in the VL server list if we were given a list of addresses to
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 37083699a0df..53f3917440e7 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -553,7 +553,7 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
        dentry->d_fsdata = (void *)(unsigned long) vnode->status.data_version;
        /* instantiate the dentry */
-        inode = afs_iget(dir->i_sb, key, &fid, NULL, NULL);
+        inode = afs_iget(dir->i_sb, key, &fid, NULL, NULL, NULL);
        key_put(key);
        if (IS_ERR(inode)) {
                _leave(" = %ld", PTR_ERR(inode));
@@ -741,20 +741,48 @@ static void afs_d_release(struct dentry *dentry)
 }
 /*
+ * Create a new inode for create/mkdir/symlink
+ */
+static void afs_vnode_new_inode(struct afs_fs_cursor *fc,
+                                struct dentry *new_dentry,
+                                struct afs_fid *newfid,
+                                struct afs_file_status *newstatus,
+                                struct afs_callback *newcb)
+{
+        struct inode *inode;
+        if (fc->ac.error < 0)
+                return;
+        inode = afs_iget(fc->vnode->vfs_inode.i_sb, fc->key,
+                         newfid, newstatus, newcb, fc->cbi);
+        if (IS_ERR(inode)) {
+                /* ENOMEM or EINTR at a really inconvenient time - just abandon
+                 * the new directory on the server.
+                 */
+                fc->ac.error = PTR_ERR(inode);
+                return;
+        }
+        d_instantiate(new_dentry, inode);
+        if (d_unhashed(new_dentry))
+                d_rehash(new_dentry);
+}
+/*
 * create a directory on an AFS filesystem
 */
 static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
-        struct afs_file_status status;
+        struct afs_file_status newstatus;
-        struct afs_callback cb;
+        struct afs_fs_cursor fc;
-        struct afs_server *server;
+        struct afs_callback newcb;
-        struct afs_vnode *dvnode, *vnode;
+        struct afs_vnode *dvnode = AFS_FS_I(dir);
-        struct afs_fid fid;
+        struct afs_fid newfid;
-        struct inode *inode;
        struct key *key;
        int ret;
-        dvnode = AFS_FS_I(dir);
+        mode |= S_IFDIR;
        _enter("{%x:%u},{%pd},%ho",
               dvnode->fid.vid, dvnode->fid.vnode, dentry, mode);
@@ -765,40 +793,27 @@ static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
                goto error;
        }
-        mode |= S_IFDIR;
+        ret = -ERESTARTSYS;
-        ret = afs_vnode_create(dvnode, key, dentry->d_name.name,
+        if (afs_begin_vnode_operation(&fc, dvnode, key)) {
-                               mode, &fid, &status, &cb, &server);
+                while (afs_select_fileserver(&fc)) {
-        if (ret < 0)
+                        fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
-                goto mkdir_error;
+                        afs_fs_create(&fc, dentry->d_name.name, mode,
+                                      &newfid, &newstatus, &newcb);
+                }
-        inode = afs_iget(dir->i_sb, key, &fid, &status, &cb);
+                afs_check_for_remote_deletion(&fc, fc.vnode);
-        if (IS_ERR(inode)) {
+                afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
-                /* ENOMEM at a really inconvenient time - just abandon the new
+                afs_vnode_new_inode(&fc, dentry, &newfid, &newstatus, &newcb);
-                 * directory on the server */
+                ret = afs_end_vnode_operation(&fc);
-                ret = PTR_ERR(inode);
+                if (ret < 0)
-                goto iget_error;
+                        goto error_key;
        }
-        /* apply the status report we've got for the new vnode */
-        vnode = AFS_FS_I(inode);
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        afs_vnode_finalise_status_update(vnode, server);
-        afs_put_server(afs_i2net(dir), server);
-        d_instantiate(dentry, inode);
-        if (d_unhashed(dentry)) {
-                _debug("not hashed");
-                d_rehash(dentry);
-        }
        key_put(key);
        _leave(" = 0");
        return 0;
-iget_error:
+error_key:
-        afs_put_server(afs_i2net(dir), server);
-mkdir_error:
        key_put(key);
 error:
        d_drop(dentry);
@@ -807,16 +822,29 @@ error:
 }
 /*
+ * Remove a subdir from a directory.
+ */
+static void afs_dir_remove_subdir(struct dentry *dentry)
+{
+        if (d_really_is_positive(dentry)) {
+                struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
+                clear_nlink(&vnode->vfs_inode);
+                set_bit(AFS_VNODE_DELETED, &vnode->flags);
+                clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
+        }
+}
+/*
 * remove a directory from an AFS filesystem
 */
 static int afs_rmdir(struct inode *dir, struct dentry *dentry)
 {
-        struct afs_vnode *dvnode, *vnode;
+        struct afs_fs_cursor fc;
+        struct afs_vnode *dvnode = AFS_FS_I(dir);
        struct key *key;
        int ret;
-        dvnode = AFS_FS_I(dir);
        _enter("{%x:%u},{%pd}",
               dvnode->fid.vid, dvnode->fid.vnode, dentry);
@@ -826,45 +854,69 @@ static int afs_rmdir(struct inode *dir, struct dentry *dentry)
                goto error;
        }
-        ret = afs_vnode_remove(dvnode, key, dentry->d_name.name, true);
+        ret = -ERESTARTSYS;
-        if (ret < 0)
+        if (afs_begin_vnode_operation(&fc, dvnode, key)) {
-                goto rmdir_error;
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
+                        afs_fs_remove(&fc, dentry->d_name.name, true);
+                }
-        if (d_really_is_positive(dentry)) {
+                afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
-                vnode = AFS_FS_I(d_inode(dentry));
+                ret = afs_end_vnode_operation(&fc);
-                clear_nlink(&vnode->vfs_inode);
+                if (ret == 0)
-                set_bit(AFS_VNODE_DELETED, &vnode->flags);
+                        afs_dir_remove_subdir(dentry);
-                clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
        }
        key_put(key);
-        _leave(" = 0");
-        return 0;
-rmdir_error:
-        key_put(key);
 error:
-        _leave(" = %d", ret);
        return ret;
 }
 /*
- * remove a file from an AFS filesystem
+ * Remove a link to a file or symlink from a directory.
+ *
+ * If the file was not deleted due to excess hard links, the fileserver will
+ * break the callback promise on the file - if it had one - before it returns
+ * to us, and if it was deleted, it won't
+ *
+ * However, if we didn't have a callback promise outstanding, or it was
+ * outstanding on a different server, then it won't break it either...
+ */
+static int afs_dir_remove_link(struct dentry *dentry, struct key *key)
+{
+        int ret = 0;
+        if (d_really_is_positive(dentry)) {
+                struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
+                if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
+                        kdebug("AFS_VNODE_DELETED");
+                clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
+                ret = afs_validate(vnode, key);
+                if (ret == -ESTALE)
+                        ret = 0;
+                _debug("nlink %d [val %d]", vnode->vfs_inode.i_nlink, ret);
+        }
+        return ret;
+}
+/*
+ * Remove a file or symlink from an AFS filesystem.
 */
 static int afs_unlink(struct inode *dir, struct dentry *dentry)
 {
-        struct afs_vnode *dvnode, *vnode;
+        struct afs_fs_cursor fc;
+        struct afs_vnode *dvnode = AFS_FS_I(dir), *vnode;
        struct key *key;
        int ret;
-        dvnode = AFS_FS_I(dir);
        _enter("{%x:%u},{%pd}",
               dvnode->fid.vid, dvnode->fid.vnode, dentry);
-        ret = -ENAMETOOLONG;
        if (dentry->d_name.len >= AFSNAMEMAX)
-                goto error;
+                return -ENAMETOOLONG;
        key = afs_request_key(dvnode->volume->cell);
        if (IS_ERR(key)) {
@@ -872,42 +924,28 @@ static int afs_unlink(struct inode *dir, struct dentry *dentry)
                goto error;
        }
+        /* Try to make sure we have a callback promise on the victim. */
        if (d_really_is_positive(dentry)) {
                vnode = AFS_FS_I(d_inode(dentry));
-                /* make sure we have a callback promise on the victim */
                ret = afs_validate(vnode, key);
                if (ret < 0)
-                        goto error;
+                        goto error_key;
        }
-        ret = afs_vnode_remove(dvnode, key, dentry->d_name.name, false);
+        ret = -ERESTARTSYS;
-        if (ret < 0)
+        if (afs_begin_vnode_operation(&fc, dvnode, key)) {
-                goto remove_error;
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
+                        afs_fs_remove(&fc, dentry->d_name.name, false);
+                }
-        if (d_really_is_positive(dentry)) {
+                afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
-                /* if the file wasn't deleted due to excess hard links, the
+                ret = afs_end_vnode_operation(&fc);
-                 * fileserver will break the callback promise on the file - if
+                if (ret == 0)
-                 * it had one - before it returns to us, and if it was deleted,
+                        ret = afs_dir_remove_link(dentry, key);
-                 * it won't
-                 *
-                 * however, if we didn't have a callback promise outstanding,
-                 * or it was outstanding on a different server, then it won't
-                 * break it either...
-                 */
-                vnode = AFS_FS_I(d_inode(dentry));
-                if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
-                        _debug("AFS_VNODE_DELETED");
-                clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
-                ret = afs_validate(vnode, key);
-                _debug("nlink %d [val %d]", vnode->vfs_inode.i_nlink, ret);
        }
-        key_put(key);
+error_key:
-        _leave(" = 0");
-        return 0;
-remove_error:
        key_put(key);
 error:
        _leave(" = %d", ret);
@@ -920,60 +958,50 @@ error:
 static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
                      bool excl)
 {
-        struct afs_file_status status;
+        struct afs_fs_cursor fc;
-        struct afs_callback cb;
+        struct afs_file_status newstatus;
-        struct afs_server *server;
+        struct afs_callback newcb;
-        struct afs_vnode *dvnode, *vnode;
+        struct afs_vnode *dvnode = dvnode = AFS_FS_I(dir);
-        struct afs_fid fid;
+        struct afs_fid newfid;
-        struct inode *inode;
        struct key *key;
        int ret;
-        dvnode = AFS_FS_I(dir);
+        mode |= S_IFREG;
        _enter("{%x:%u},{%pd},%ho,",
               dvnode->fid.vid, dvnode->fid.vnode, dentry, mode);
+        ret = -ENAMETOOLONG;
+        if (dentry->d_name.len >= AFSNAMEMAX)
+                goto error;
        key = afs_request_key(dvnode->volume->cell);
        if (IS_ERR(key)) {
                ret = PTR_ERR(key);
                goto error;
        }
-        mode |= S_IFREG;
+        ret = -ERESTARTSYS;
-        ret = afs_vnode_create(dvnode, key, dentry->d_name.name,
+        if (afs_begin_vnode_operation(&fc, dvnode, key)) {
-                               mode, &fid, &status, &cb, &server);
+                while (afs_select_fileserver(&fc)) {
-        if (ret < 0)
+                        fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
-                goto create_error;
+                        afs_fs_create(&fc, dentry->d_name.name, mode,
+                                      &newfid, &newstatus, &newcb);
+                }
-        inode = afs_iget(dir->i_sb, key, &fid, &status, &cb);
+                afs_check_for_remote_deletion(&fc, fc.vnode);
-        if (IS_ERR(inode)) {
+                afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
-                /* ENOMEM at a really inconvenient time - just abandon the new
+                afs_vnode_new_inode(&fc, dentry, &newfid, &newstatus, &newcb);
-                 * directory on the server */
+                ret = afs_end_vnode_operation(&fc);
-                ret = PTR_ERR(inode);
+                if (ret < 0)
-                goto iget_error;
+                        goto error_key;
        }
-        /* apply the status report we've got for the new vnode */
-        vnode = AFS_FS_I(inode);
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        afs_vnode_finalise_status_update(vnode, server);
-        afs_put_server(afs_i2net(dir), server);
-        d_instantiate(dentry, inode);
-        if (d_unhashed(dentry)) {
-                _debug("not hashed");
-                d_rehash(dentry);
-        }
        key_put(key);
        _leave(" = 0");
        return 0;
-iget_error:
+error_key:
-        afs_put_server(afs_i2net(dir), server);
-create_error:
        key_put(key);
 error:
        d_drop(dentry);
@@ -987,6 +1015,7 @@ error:
 static int afs_link(struct dentry *from, struct inode *dir,
                    struct dentry *dentry)
 {
+        struct afs_fs_cursor fc;
        struct afs_vnode *dvnode, *vnode;
        struct key *key;
        int ret;
@@ -999,23 +1028,45 @@ static int afs_link(struct dentry *from, struct inode *dir,
               dvnode->fid.vid, dvnode->fid.vnode,
               dentry);
+        ret = -ENAMETOOLONG;
+        if (dentry->d_name.len >= AFSNAMEMAX)
+                goto error;
        key = afs_request_key(dvnode->volume->cell);
        if (IS_ERR(key)) {
                ret = PTR_ERR(key);
                goto error;
        }
-        ret = afs_vnode_link(dvnode, vnode, key, dentry->d_name.name);
+        ret = -ERESTARTSYS;
-        if (ret < 0)
+        if (afs_begin_vnode_operation(&fc, dvnode, key)) {
-                goto link_error;
+                if (mutex_lock_interruptible_nested(&vnode->io_lock, 1) < 0) {
+                        afs_end_vnode_operation(&fc);
+                        return -ERESTARTSYS;
+                }
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
+                        fc.cb_break_2 = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_link(&fc, vnode, dentry->d_name.name);
+                }
+                afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break_2);
+                ihold(&vnode->vfs_inode);
+                d_instantiate(dentry, &vnode->vfs_inode);
+                mutex_unlock(&vnode->io_lock);
+                ret = afs_end_vnode_operation(&fc);
+                if (ret < 0)
+                        goto error_key;
+        }
-        ihold(&vnode->vfs_inode);
-        d_instantiate(dentry, &vnode->vfs_inode);
        key_put(key);
        _leave(" = 0");
        return 0;
-link_error:
+error_key:
        key_put(key);
 error:
        d_drop(dentry);
@@ -1029,20 +1080,21 @@ error:
 static int afs_symlink(struct inode *dir, struct dentry *dentry,
                       const char *content)
 {
-        struct afs_file_status status;
+        struct afs_fs_cursor fc;
-        struct afs_server *server;
+        struct afs_file_status newstatus;
-        struct afs_vnode *dvnode, *vnode;
+        struct afs_vnode *dvnode = AFS_FS_I(dir);
-        struct afs_fid fid;
+        struct afs_fid newfid;
-        struct inode *inode;
        struct key *key;
        int ret;
-        dvnode = AFS_FS_I(dir);
        _enter("{%x:%u},{%pd},%s",
               dvnode->fid.vid, dvnode->fid.vnode, dentry,
               content);
+        ret = -ENAMETOOLONG;
+        if (dentry->d_name.len >= AFSNAMEMAX)
+                goto error;
        ret = -EINVAL;
        if (strlen(content) >= AFSPATHMAX)
                goto error;
@@ -1053,39 +1105,27 @@ static int afs_symlink(struct inode *dir, struct dentry *dentry,
                goto error;
        }
-        ret = afs_vnode_symlink(dvnode, key, dentry->d_name.name, content,
+        ret = -ERESTARTSYS;
-                                &fid, &status, &server);
+        if (afs_begin_vnode_operation(&fc, dvnode, key)) {
-        if (ret < 0)
+                while (afs_select_fileserver(&fc)) {
-                goto create_error;
+                        fc.cb_break = dvnode->cb_break + dvnode->cb_s_break;
+                        afs_fs_symlink(&fc, dentry->d_name.name, content,
+                                       &newfid, &newstatus);
+                }
-        inode = afs_iget(dir->i_sb, key, &fid, &status, NULL);
+                afs_check_for_remote_deletion(&fc, fc.vnode);
-        if (IS_ERR(inode)) {
+                afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
-                /* ENOMEM at a really inconvenient time - just abandon the new
+                afs_vnode_new_inode(&fc, dentry, &newfid, &newstatus, NULL);
-                 * directory on the server */
+                ret = afs_end_vnode_operation(&fc);
-                ret = PTR_ERR(inode);
+                if (ret < 0)
-                goto iget_error;
+                        goto error_key;
        }
-        /* apply the status report we've got for the new vnode */
-        vnode = AFS_FS_I(inode);
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        afs_vnode_finalise_status_update(vnode, server);
-        afs_put_server(afs_i2net(dir), server);
-        d_instantiate(dentry, inode);
-        if (d_unhashed(dentry)) {
-                _debug("not hashed");
-                d_rehash(dentry);
-        }
        key_put(key);
        _leave(" = 0");
        return 0;
-iget_error:
+error_key:
-        afs_put_server(afs_i2net(dir), server);
-create_error:
        key_put(key);
 error:
        d_drop(dentry);
@@ -1100,6 +1140,7 @@ static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
                      struct inode *new_dir, struct dentry *new_dentry,
                      unsigned int flags)
 {
+        struct afs_fs_cursor fc;
        struct afs_vnode *orig_dvnode, *new_dvnode, *vnode;
        struct key *key;
        int ret;
@@ -1123,16 +1164,35 @@ static int afs_rename(struct inode *old_dir, struct dentry *old_dentry,
                goto error;
        }
-        ret = afs_vnode_rename(orig_dvnode, new_dvnode, key,
-                               old_dentry->d_name.name,
-                               new_dentry->d_name.name);
-        if (ret < 0)
-                goto rename_error;
+        ret = -ERESTARTSYS;
+        if (afs_begin_vnode_operation(&fc, orig_dvnode, key)) {
+                if (orig_dvnode != new_dvnode) {
+                        if (mutex_lock_interruptible_nested(&new_dvnode->io_lock, 1) < 0) {
+                                afs_end_vnode_operation(&fc);
+                                goto error_key;
+                        }
+                }
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = orig_dvnode->cb_break + orig_dvnode->cb_s_break;
+                        fc.cb_break_2 = new_dvnode->cb_break + new_dvnode->cb_s_break;
+                        afs_fs_rename(&fc, old_dentry->d_name.name,
+                                      new_dvnode, new_dentry->d_name.name);
+                }
+                afs_vnode_commit_status(&fc, orig_dvnode, fc.cb_break);
+                afs_vnode_commit_status(&fc, new_dvnode, fc.cb_break_2);
+                if (orig_dvnode != new_dvnode)
+                        mutex_unlock(&new_dvnode->io_lock);
+                ret = afs_end_vnode_operation(&fc);
+                if (ret < 0)
+                        goto error_key;
+        }
        key_put(key);
        _leave(" = 0");
        return 0;
-rename_error:
+error_key:
        key_put(key);
 error:
        d_drop(new_dentry);
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 08f9f0c5dfac..1f26ac9f816d 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -138,6 +138,37 @@ static void afs_file_readpage_read_complete(struct page *page,
 #endif
 /*
+ * Fetch file data from the volume.
+ */
+int afs_fetch_data(struct afs_vnode *vnode, struct key *key, struct afs_read *desc)
+{
+        struct afs_fs_cursor fc;
+        int ret;
+        _enter("%s{%x:%u.%u},%x,,,",
+               vnode->volume->name,
+               vnode->fid.vid,
+               vnode->fid.vnode,
+               vnode->fid.unique,
+               key_serial(key));
+        ret = -ERESTARTSYS;
+        if (afs_begin_vnode_operation(&fc, vnode, key)) {
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_fetch_data(&fc, desc);
+                }
+                afs_check_for_remote_deletion(&fc, fc.vnode);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break);
+                ret = afs_end_vnode_operation(&fc);
+        }
+        _leave(" = %d", ret);
+        return ret;
+}
+/*
 * read page from file, directory or symlink, given a key to use
 */
 int afs_page_filler(void *data, struct page *page)
@@ -199,7 +230,7 @@ int afs_page_filler(void *data, struct page *page)
                /* read the contents of the file from the server into the
                 * page */
-                ret = afs_vnode_fetch_data(vnode, key, req);
+                ret = afs_fetch_data(vnode, key, req);
                afs_put_read(req);
                if (ret < 0) {
                        if (ret == -ENOENT) {
@@ -264,7 +295,7 @@ static int afs_readpage(struct file *file, struct page *page)
                ret = afs_page_filler(key, page);
        } else {
                struct inode *inode = page->mapping->host;
-                key = afs_request_key(AFS_FS_S(inode->i_sb)->volume->cell);
+                key = afs_request_key(AFS_FS_S(inode->i_sb)->cell);
                if (IS_ERR(key)) {
                        ret = PTR_ERR(key);
                } else {
@@ -369,7 +400,7 @@ static int afs_readpages_one(struct file *file, struct address_space *mapping,
                return 0;
        }
-        ret = afs_vnode_fetch_data(vnode, key, req);
+        ret = afs_fetch_data(vnode, key, req);
        if (ret < 0)
                goto error;
diff --git a/fs/afs/flock.c b/fs/afs/flock.c
index 2b31ea58c50c..77b0a4606efd 100644
--- a/fs/afs/flock.c
+++ b/fs/afs/flock.c
@@ -68,6 +68,100 @@ static void afs_grant_locks(struct afs_vnode *vnode, struct file_lock *fl)
 }
 /*
+ * Get a lock on a file
+ */
+static int afs_set_lock(struct afs_vnode *vnode, struct key *key,
+                        afs_lock_type_t type)
+{
+        struct afs_fs_cursor fc;
+        int ret;
+        _enter("%s{%x:%u.%u},%x,%u",
+               vnode->volume->name,
+               vnode->fid.vid,
+               vnode->fid.vnode,
+               vnode->fid.unique,
+               key_serial(key), type);
+        ret = -ERESTARTSYS;
+        if (afs_begin_vnode_operation(&fc, vnode, key)) {
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_set_lock(&fc, type);
+                }
+                afs_check_for_remote_deletion(&fc, fc.vnode);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break);
+                ret = afs_end_vnode_operation(&fc);
+        }
+        _leave(" = %d", ret);
+        return ret;
+}
+/*
+ * Extend a lock on a file
+ */
+static int afs_extend_lock(struct afs_vnode *vnode, struct key *key)
+{
+        struct afs_fs_cursor fc;
+        int ret;
+        _enter("%s{%x:%u.%u},%x",
+               vnode->volume->name,
+               vnode->fid.vid,
+               vnode->fid.vnode,
+               vnode->fid.unique,
+               key_serial(key));
+        ret = -ERESTARTSYS;
+        if (afs_begin_vnode_operation(&fc, vnode, key)) {
+                while (afs_select_current_fileserver(&fc)) {
+                        fc.cb_break = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_extend_lock(&fc);
+                }
+                afs_check_for_remote_deletion(&fc, fc.vnode);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break);
+                ret = afs_end_vnode_operation(&fc);
+        }
+        _leave(" = %d", ret);
+        return ret;
+}
+/*
+ * Release a lock on a file
+ */
+static int afs_release_lock(struct afs_vnode *vnode, struct key *key)
+{
+        struct afs_fs_cursor fc;
+        int ret;
+        _enter("%s{%x:%u.%u},%x",
+               vnode->volume->name,
+               vnode->fid.vid,
+               vnode->fid.vnode,
+               vnode->fid.unique,
+               key_serial(key));
+        ret = -ERESTARTSYS;
+        if (afs_begin_vnode_operation(&fc, vnode, key)) {
+                while (afs_select_current_fileserver(&fc)) {
+                        fc.cb_break = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_release_lock(&fc);
+                }
+                afs_check_for_remote_deletion(&fc, fc.vnode);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break);
+                ret = afs_end_vnode_operation(&fc);
+        }
+        _leave(" = %d", ret);
+        return ret;
+}
+/*
 * do work for a lock, including:
 * - probing for a lock we're waiting on but didn't get immediately
 * - extending a lock that's close to timing out
@@ -91,7 +185,7 @@ void afs_lock_work(struct work_struct *work)
                /* attempt to release the server lock; if it fails, we just
                 * wait 5 minutes and it'll time out anyway */
-                ret = afs_vnode_release_lock(vnode, vnode->unlock_key);
+                ret = afs_release_lock(vnode, vnode->unlock_key);
                if (ret < 0)
                        printk(KERN_WARNING "AFS:"
                               " Failed to release lock on {%x:%x} error %d\n",
@@ -115,7 +209,7 @@ void afs_lock_work(struct work_struct *work)
                key = key_get(fl->fl_file->private_data);
                spin_unlock(&vnode->lock);
-                ret = afs_vnode_extend_lock(vnode, key);
+                ret = afs_extend_lock(vnode, key);
                clear_bit(AFS_VNODE_LOCKING, &vnode->flags);
                key_put(key);
                switch (ret) {
@@ -151,7 +245,7 @@ void afs_lock_work(struct work_struct *work)
                        AFS_LOCK_READ : AFS_LOCK_WRITE;
                spin_unlock(&vnode->lock);
-                ret = afs_vnode_set_lock(vnode, key, type);
+                ret = afs_set_lock(vnode, key, type);
                clear_bit(AFS_VNODE_LOCKING, &vnode->flags);
                switch (ret) {
                case -EWOULDBLOCK:
@@ -182,7 +276,7 @@ void afs_lock_work(struct work_struct *work)
                                clear_bit(AFS_VNODE_READLOCKED, &vnode->flags);
                                clear_bit(AFS_VNODE_WRITELOCKED, &vnode->flags);
                                spin_unlock(&vnode->lock);
-                                afs_vnode_release_lock(vnode, key);
+                                afs_release_lock(vnode, key);
                                if (!list_empty(&vnode->pending_locks))
                                        afs_lock_may_be_available(vnode);
                        }
@@ -280,7 +374,7 @@ static int afs_do_setlk(struct file *file, struct file_lock *fl)
                set_bit(AFS_VNODE_LOCKING, &vnode->flags);
                spin_unlock(&vnode->lock);
-                ret = afs_vnode_set_lock(vnode, key, type);
+                ret = afs_set_lock(vnode, key, type);
                clear_bit(AFS_VNODE_LOCKING, &vnode->flags);
                switch (ret) {
                case 0:
@@ -383,7 +477,7 @@ given_lock:
        /* again, make sure we've got a callback on this file and, again, make
         * sure that our view of the data version is up to date (we ignore
         * errors incurred here and deal with the consequences elsewhere) */
-        afs_vnode_fetch_status(vnode, key, false);
+        afs_validate(vnode, key);
 error:
        spin_unlock(&inode->i_lock);
@@ -455,7 +549,7 @@ static int afs_do_getlk(struct file *file, struct file_lock *fl)
        posix_test_lock(file, fl);
        if (fl->fl_type == F_UNLCK) {
                /* no local locks; consult the server */
-                ret = afs_vnode_fetch_status(vnode, key, true);
+                ret = afs_fetch_status(vnode, key);
                if (ret < 0)
                        goto error;
                lock_count = vnode->status.lock_count;
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index 6614d0a78daa..72ff3679fa2a 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -22,9 +22,9 @@
 */
 static u8 afs_discard_buffer[64];
-static inline void afs_use_fs_server(struct afs_call *call, struct afs_server *server)
+static inline void afs_use_fs_server(struct afs_call *call, struct afs_cb_interest *cbi)
 {
-        call->server = afs_get_server(server);
+        call->cbi = afs_get_cb_interest(cbi);
 }
 /*
@@ -56,7 +56,8 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
        kuid_t owner;
        kgid_t group;
-        write_seqlock(&vnode->cb_lock);
+        if (vnode)
+                write_seqlock(&vnode->cb_lock);
 #define EXTRACT(DST)                            \
        do {                                    \
@@ -141,7 +142,8 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
                status->data_version = data_version;
        }
-        write_sequnlock(&vnode->cb_lock);
+        if (vnode)
+                write_sequnlock(&vnode->cb_lock);
 }
 /*
@@ -151,22 +153,29 @@ static void xdr_decode_AFSCallBack(struct afs_call *call,
                                   struct afs_vnode *vnode,
                                   const __be32 **_bp)
 {
+        struct afs_cb_interest *old, *cbi = call->cbi;
        const __be32 *bp = *_bp;
        u32 cb_expiry;
        write_seqlock(&vnode->cb_lock);
-        if (call->cb_break == (vnode->cb_break + call->server->cb_s_break)) {
+        if (call->cb_break == (vnode->cb_break + cbi->server->cb_s_break)) {
                vnode->cb_version       = ntohl(*bp++);
                cb_expiry               = ntohl(*bp++);
                vnode->cb_type          = ntohl(*bp++);
                vnode->cb_expires_at    = cb_expiry + ktime_get_real_seconds();
+                old = vnode->cb_interest;
+                if (old != call->cbi) {
+                        vnode->cb_interest = cbi;
+                        cbi = old;
+                }
                set_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
        } else {
                bp += 3;
        }
        write_sequnlock(&vnode->cb_lock);
+        call->cbi = cbi;
        *_bp = bp;
 }
@@ -297,24 +306,23 @@ static const struct afs_call_type afs_RXFSFetchStatus = {
 /*
 * fetch the status information for a file
 */
-int afs_fs_fetch_file_status(struct afs_fs_cursor *fc,
-                             struct key *key,
-                             struct afs_vnode *vnode,
-                             struct afs_volsync *volsync,
-                             bool async)
+int afs_fs_fetch_file_status(struct afs_fs_cursor *fc, struct afs_volsync *volsync)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
        _enter(",%x,{%x:%u},,",
-               key_serial(key), vnode->fid.vid, vnode->fid.vnode);
+               key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
        call = afs_alloc_flat_call(net, &afs_RXFSFetchStatus, 16, (21 + 3 + 6) * 4);
-        if (!call)
+        if (!call) {
+                fc->ac.error = -ENOMEM;
                return -ENOMEM;
+        }
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->reply[1] = volsync;
@@ -325,9 +333,9 @@ int afs_fs_fetch_file_status(struct afs_fs_cursor *fc,
        bp[2] = htonl(vnode->fid.vnode);
        bp[3] = htonl(vnode->fid.unique);
-        call->cb_break = vnode->cb_break + fc->server->cb_s_break;
+        call->cb_break = fc->cb_break;
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -502,12 +510,9 @@ static const struct afs_call_type afs_RXFSFetchData64 = {
 /*
 * fetch data from a very large file
 */
-static int afs_fs_fetch_data64(struct afs_fs_cursor *fc,
-                               struct key *key,
-                               struct afs_vnode *vnode,
-                               struct afs_read *req,
-                               bool async)
+static int afs_fs_fetch_data64(struct afs_fs_cursor *fc, struct afs_read *req)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
@@ -518,7 +523,7 @@ static int afs_fs_fetch_data64(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->reply[1] = NULL; /* volsync */
        call->reply[2] = req;
@@ -536,20 +541,17 @@ static int afs_fs_fetch_data64(struct afs_fs_cursor *fc,
        bp[7] = htonl(lower_32_bits(req->len));
        atomic_inc(&req->usage);
-        call->cb_break = vnode->cb_break + fc->server->cb_s_break;
+        call->cb_break = fc->cb_break;
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
 * fetch data from a file
 */
-int afs_fs_fetch_data(struct afs_fs_cursor *fc,
-                      struct key *key,
-                      struct afs_vnode *vnode,
-                      struct afs_read *req,
-                      bool async)
+int afs_fs_fetch_data(struct afs_fs_cursor *fc, struct afs_read *req)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
@@ -557,7 +559,7 @@ int afs_fs_fetch_data(struct afs_fs_cursor *fc,
        if (upper_32_bits(req->pos) ||
            upper_32_bits(req->len) ||
            upper_32_bits(req->pos + req->len))
-                return afs_fs_fetch_data64(fc, key, vnode, req, async);
+                return afs_fs_fetch_data64(fc, req);
        _enter("");
@@ -565,7 +567,7 @@ int afs_fs_fetch_data(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->reply[1] = NULL; /* volsync */
        call->reply[2] = req;
@@ -581,9 +583,9 @@ int afs_fs_fetch_data(struct afs_fs_cursor *fc,
        bp[5] = htonl(lower_32_bits(req->len));
        atomic_inc(&req->usage);
-        call->cb_break = vnode->cb_break + fc->server->cb_s_break;
+        call->cb_break = fc->cb_break;
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -626,15 +628,13 @@ static const struct afs_call_type afs_RXFSCreateXXXX = {
 * create a file or make a directory
 */
 int afs_fs_create(struct afs_fs_cursor *fc,
-                  struct key *key,
-                  struct afs_vnode *vnode,
                  const char *name,
                  umode_t mode,
                  struct afs_fid *newfid,
                  struct afs_file_status *newstatus,
-                  struct afs_callback *newcb,
-                  bool async)
+                  struct afs_callback *newcb)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        size_t namesz, reqsz, padsz;
@@ -651,7 +651,7 @@ int afs_fs_create(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->reply[1] = newfid;
        call->reply[2] = newstatus;
@@ -677,8 +677,8 @@ int afs_fs_create(struct afs_fs_cursor *fc,
        *bp++ = htonl(mode & S_IALLUGO); /* unix mode */
        *bp++ = 0; /* segment size */
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -717,13 +717,9 @@ static const struct afs_call_type afs_RXFSRemoveXXXX = {
 /*
 * remove a file or directory
 */
-int afs_fs_remove(struct afs_fs_cursor *fc,
-                  struct key *key,
-                  struct afs_vnode *vnode,
-                  const char *name,
-                  bool isdir,
-                  bool async)
+int afs_fs_remove(struct afs_fs_cursor *fc, const char *name, bool isdir)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        size_t namesz, reqsz, padsz;
@@ -739,7 +735,7 @@ int afs_fs_remove(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        /* marshall the parameters */
@@ -756,8 +752,8 @@ int afs_fs_remove(struct afs_fs_cursor *fc,
                bp = (void *) bp + padsz;
        }
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -797,13 +793,10 @@ static const struct afs_call_type afs_RXFSLink = {
 /*
 * make a hard link
 */
-int afs_fs_link(struct afs_fs_cursor *fc,
-                struct key *key,
-                struct afs_vnode *dvnode,
-                struct afs_vnode *vnode,
-                const char *name,
-                bool async)
+int afs_fs_link(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
+                const char *name)
 {
+        struct afs_vnode *dvnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        size_t namesz, reqsz, padsz;
@@ -819,7 +812,7 @@ int afs_fs_link(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = dvnode;
        call->reply[1] = vnode;
@@ -840,8 +833,8 @@ int afs_fs_link(struct afs_fs_cursor *fc,
        *bp++ = htonl(vnode->fid.vnode);
        *bp++ = htonl(vnode->fid.unique);
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -883,14 +876,12 @@ static const struct afs_call_type afs_RXFSSymlink = {
 * create a symbolic link
 */
 int afs_fs_symlink(struct afs_fs_cursor *fc,
-                   struct key *key,
-                   struct afs_vnode *vnode,
                   const char *name,
                   const char *contents,
                   struct afs_fid *newfid,
-                   struct afs_file_status *newstatus,
-                   bool async)
+                   struct afs_file_status *newstatus)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        size_t namesz, reqsz, padsz, c_namesz, c_padsz;
@@ -911,7 +902,7 @@ int afs_fs_symlink(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->reply[1] = newfid;
        call->reply[2] = newstatus;
@@ -943,8 +934,8 @@ int afs_fs_symlink(struct afs_fs_cursor *fc,
        *bp++ = htonl(S_IRWXUGO); /* unix mode */
        *bp++ = 0; /* segment size */
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -987,13 +978,11 @@ static const struct afs_call_type afs_RXFSRename = {
 * create a symbolic link
 */
 int afs_fs_rename(struct afs_fs_cursor *fc,
-                  struct key *key,
-                  struct afs_vnode *orig_dvnode,
                  const char *orig_name,
                  struct afs_vnode *new_dvnode,
-                  const char *new_name,
-                  bool async)
+                  const char *new_name)
 {
+        struct afs_vnode *orig_dvnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(orig_dvnode);
        size_t reqsz, o_namesz, o_padsz, n_namesz, n_padsz;
@@ -1016,7 +1005,7 @@ int afs_fs_rename(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = orig_dvnode;
        call->reply[1] = new_dvnode;
@@ -1045,8 +1034,8 @@ int afs_fs_rename(struct afs_fs_cursor *fc,
                bp = (void *) bp + n_padsz;
        }
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -1098,8 +1087,7 @@ static int afs_fs_store_data64(struct afs_fs_cursor *fc,
                               struct afs_writeback *wb,
                               pgoff_t first, pgoff_t last,
                               unsigned offset, unsigned to,
-                               loff_t size, loff_t pos, loff_t i_size,
-                               bool async)
+                               loff_t size, loff_t pos, loff_t i_size)
 {
        struct afs_vnode *vnode = wb->vnode;
        struct afs_call *call;
@@ -1147,8 +1135,7 @@ static int afs_fs_store_data64(struct afs_fs_cursor *fc,
        *bp++ = htonl(i_size >> 32);
        *bp++ = htonl((u32) i_size);
-        afs_use_fs_server(call, fc->server);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -1156,8 +1143,7 @@ static int afs_fs_store_data64(struct afs_fs_cursor *fc,
 */
 int afs_fs_store_data(struct afs_fs_cursor *fc, struct afs_writeback *wb,
                      pgoff_t first, pgoff_t last,
-                      unsigned offset, unsigned to,
-                      bool async)
+                      unsigned offset, unsigned to)
 {
        struct afs_vnode *vnode = wb->vnode;
        struct afs_call *call;
@@ -1184,7 +1170,7 @@ int afs_fs_store_data(struct afs_fs_cursor *fc, struct afs_writeback *wb,
        if (pos >> 32 || i_size >> 32 || size >> 32 || (pos + size) >> 32)
                return afs_fs_store_data64(fc, wb, first, last, offset, to,
-                                           size, pos, i_size, async);
+                                           size, pos, i_size);
        call = afs_alloc_flat_call(net, &afs_RXFSStoreData,
                                   (4 + 6 + 3) * 4,
@@ -1221,8 +1207,8 @@ int afs_fs_store_data(struct afs_fs_cursor *fc, struct afs_writeback *wb,
        *bp++ = htonl(size);
        *bp++ = htonl(i_size);
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -1279,16 +1265,15 @@ static const struct afs_call_type afs_RXFSStoreData64_as_Status = {
 * set the attributes on a very large file, using FS.StoreData rather than
 * FS.StoreStatus so as to alter the file size also
 */
-static int afs_fs_setattr_size64(struct afs_fs_cursor *fc, struct key *key,
-                                 struct afs_vnode *vnode, struct iattr *attr,
-                                 bool async)
+static int afs_fs_setattr_size64(struct afs_fs_cursor *fc, struct iattr *attr)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
        _enter(",%x,{%x:%u},,",
-               key_serial(key), vnode->fid.vid, vnode->fid.vnode);
+               key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
        ASSERT(attr->ia_valid & ATTR_SIZE);
@@ -1298,7 +1283,7 @@ static int afs_fs_setattr_size64(struct afs_fs_cursor *fc, struct key *key,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->store_version = vnode->status.data_version + 1;
        call->operation_ID = FSSTOREDATA;
@@ -1319,28 +1304,27 @@ static int afs_fs_setattr_size64(struct afs_fs_cursor *fc, struct key *key,
        *bp++ = htonl(attr->ia_size >> 32);     /* new file length */
        *bp++ = htonl((u32) attr->ia_size);
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
 * set the attributes on a file, using FS.StoreData rather than FS.StoreStatus
 * so as to alter the file size also
 */
-static int afs_fs_setattr_size(struct afs_fs_cursor *fc, struct key *key,
-                               struct afs_vnode *vnode, struct iattr *attr,
-                               bool async)
+static int afs_fs_setattr_size(struct afs_fs_cursor *fc, struct iattr *attr)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
        _enter(",%x,{%x:%u},,",
-               key_serial(key), vnode->fid.vid, vnode->fid.vnode);
+               key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
        ASSERT(attr->ia_valid & ATTR_SIZE);
        if (attr->ia_size >> 32)
-                return afs_fs_setattr_size64(fc, key, vnode, attr, async);
+                return afs_fs_setattr_size64(fc, attr);
        call = afs_alloc_flat_call(net, &afs_RXFSStoreData_as_Status,
                                   (4 + 6 + 3) * 4,
@@ -1348,7 +1332,7 @@ static int afs_fs_setattr_size(struct afs_fs_cursor *fc, struct key *key,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->store_version = vnode->status.data_version + 1;
        call->operation_ID = FSSTOREDATA;
@@ -1366,27 +1350,26 @@ static int afs_fs_setattr_size(struct afs_fs_cursor *fc, struct key *key,
        *bp++ = 0;                              /* size of write */
        *bp++ = htonl(attr->ia_size);           /* new file length */
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
 * set the attributes on a file, using FS.StoreData if there's a change in file
 * size, and FS.StoreStatus otherwise
 */
-int afs_fs_setattr(struct afs_fs_cursor *fc, struct key *key,
-                   struct afs_vnode *vnode, struct iattr *attr,
-                   bool async)
+int afs_fs_setattr(struct afs_fs_cursor *fc, struct iattr *attr)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
        if (attr->ia_valid & ATTR_SIZE)
-                return afs_fs_setattr_size(fc, key, vnode, attr, async);
+                return afs_fs_setattr_size(fc, attr);
        _enter(",%x,{%x:%u},,",
-               key_serial(key), vnode->fid.vid, vnode->fid.vnode);
+               key_serial(fc->key), vnode->fid.vid, vnode->fid.vnode);
        call = afs_alloc_flat_call(net, &afs_RXFSStoreStatus,
                                   (4 + 6) * 4,
@@ -1394,7 +1377,7 @@ int afs_fs_setattr(struct afs_fs_cursor *fc, struct key *key,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->operation_ID = FSSTORESTATUS;
@@ -1407,8 +1390,8 @@ int afs_fs_setattr(struct afs_fs_cursor *fc, struct key *key,
        xdr_encode_AFS_StoreStatus(&bp, attr);
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -1606,11 +1589,9 @@ static const struct afs_call_type afs_RXFSGetVolumeStatus = {
 * fetch the status of a volume
 */
 int afs_fs_get_volume_status(struct afs_fs_cursor *fc,
-                             struct key *key,
-                             struct afs_vnode *vnode,
-                             struct afs_volume_status *vs,
-                             bool async)
+                             struct afs_volume_status *vs)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
@@ -1628,7 +1609,7 @@ int afs_fs_get_volume_status(struct afs_fs_cursor *fc,
                return -ENOMEM;
        }
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        call->reply[1] = vs;
        call->reply[2] = tmpbuf;
@@ -1638,8 +1619,8 @@ int afs_fs_get_volume_status(struct afs_fs_cursor *fc,
        bp[0] = htonl(FSGETVOLUMESTATUS);
        bp[1] = htonl(vnode->fid.vid);
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -1692,14 +1673,11 @@ static const struct afs_call_type afs_RXFSReleaseLock = {
 };
 /*
- * get a lock on a file
+ * Set a lock on a file
 */
-int afs_fs_set_lock(struct afs_fs_cursor *fc,
-                    struct key *key,
-                    struct afs_vnode *vnode,
-                    afs_lock_type_t type,
-                    bool async)
+int afs_fs_set_lock(struct afs_fs_cursor *fc, afs_lock_type_t type)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
@@ -1710,7 +1688,7 @@ int afs_fs_set_lock(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        /* marshall the parameters */
@@ -1721,18 +1699,16 @@ int afs_fs_set_lock(struct afs_fs_cursor *fc,
        *bp++ = htonl(vnode->fid.unique);
        *bp++ = htonl(type);
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
 * extend a lock on a file
 */
-int afs_fs_extend_lock(struct afs_fs_cursor *fc,
+int afs_fs_extend_lock(struct afs_fs_cursor *fc)
-                       struct key *key,
-                       struct afs_vnode *vnode,
-                       bool async)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
@@ -1743,7 +1719,7 @@ int afs_fs_extend_lock(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        /* marshall the parameters */
@@ -1753,18 +1729,16 @@ int afs_fs_extend_lock(struct afs_fs_cursor *fc,
        *bp++ = htonl(vnode->fid.vnode);
        *bp++ = htonl(vnode->fid.unique);
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
 * release a lock on a file
 */
-int afs_fs_release_lock(struct afs_fs_cursor *fc,
+int afs_fs_release_lock(struct afs_fs_cursor *fc)
-                        struct key *key,
-                        struct afs_vnode *vnode,
-                        bool async)
 {
+        struct afs_vnode *vnode = fc->vnode;
        struct afs_call *call;
        struct afs_net *net = afs_v2net(vnode);
        __be32 *bp;
@@ -1775,7 +1749,7 @@ int afs_fs_release_lock(struct afs_fs_cursor *fc,
        if (!call)
                return -ENOMEM;
-        call->key = key;
+        call->key = fc->key;
        call->reply[0] = vnode;
        /* marshall the parameters */
@@ -1785,8 +1759,8 @@ int afs_fs_release_lock(struct afs_fs_cursor *fc,
        *bp++ = htonl(vnode->fid.vnode);
        *bp++ = htonl(vnode->fid.unique);
-        afs_use_fs_server(call, fc->server);
+        afs_use_fs_server(call, fc->cbi);
-        return afs_make_call(&fc->ac, call, GFP_NOFS, async);
+        return afs_make_call(&fc->ac, call, GFP_NOFS, false);
 }
 /*
@@ -1809,17 +1783,17 @@ static const struct afs_call_type afs_RXFSGiveUpAllCallBacks = {
 /*
 * Flush all the callbacks we have on a server.
 */
-int afs_fs_give_up_all_callbacks(struct afs_server *server,
+int afs_fs_give_up_all_callbacks(struct afs_net *net,
+                                 struct afs_server *server,
                                 struct afs_addr_cursor *ac,
-                                 struct key *key,
+                                 struct key *key)
-                                 bool async)
 {
        struct afs_call *call;
        __be32 *bp;
        _enter("");
-        call = afs_alloc_flat_call(server->net, &afs_RXFSGiveUpAllCallBacks, 2 * 4, 0);
+        call = afs_alloc_flat_call(net, &afs_RXFSGiveUpAllCallBacks, 1 * 4, 0);
        if (!call)
                return -ENOMEM;
@@ -1830,5 +1804,96 @@ int afs_fs_give_up_all_callbacks(struct afs_server *server,
        *bp++ = htonl(FSGIVEUPALLCALLBACKS);
        /* Can't take a ref on server */
-        return afs_make_call(ac, call, GFP_NOFS, async);
+        return afs_make_call(ac, call, GFP_NOFS, false);
+}
+/*
+ * Deliver reply data to an FS.GetCapabilities operation.
+ */
+static int afs_deliver_fs_get_capabilities(struct afs_call *call)
+{
+        u32 count;
+        int ret;
+        _enter("{%u,%zu/%u}", call->unmarshall, call->offset, call->count);
+again:
+        switch (call->unmarshall) {
+        case 0:
+                call->offset = 0;
+                call->unmarshall++;
+                /* Extract the capabilities word count */
+        case 1:
+                ret = afs_extract_data(call, &call->tmp,
+                                       1 * sizeof(__be32),
+                                       true);
+                if (ret < 0)
+                        return ret;
+                count = ntohl(call->tmp);
+                call->count = count;
+                call->count2 = count;
+                call->offset = 0;
+                call->unmarshall++;
+                /* Extract capabilities words */
+        case 2:
+                count = min(call->count, 16U);
+                ret = afs_extract_data(call, call->buffer,
+                                       count * sizeof(__be32),
+                                       call->count > 16);
+                if (ret < 0)
+                        return ret;
+                /* TODO: Examine capabilities */
+                call->count -= count;
+                if (call->count > 0)
+                        goto again;
+                call->offset = 0;
+                call->unmarshall++;
+                break;
+        }
+        _leave(" = 0 [done]");
+        return 0;
+}
+/*
+ * FS.GetCapabilities operation type
+ */
+static const struct afs_call_type afs_RXFSGetCapabilities = {
+        .name           = "FS.GetCapabilities",
+        .deliver        = afs_deliver_fs_get_capabilities,
+        .destructor     = afs_flat_call_destructor,
+};
+/*
+ * Probe a fileserver for the capabilities that it supports.  This can
+ * return up to 196 words.
+ */
+int afs_fs_get_capabilities(struct afs_net *net,
+                            struct afs_server *server,
+                            struct afs_addr_cursor *ac,
+                            struct key *key)
+{
+        struct afs_call *call;
+        __be32 *bp;
+        _enter("");
+        call = afs_alloc_flat_call(net, &afs_RXFSGetCapabilities, 1 * 4, 16 * 4);
+        if (!call)
+                return -ENOMEM;
+        call->key = key;
+        /* marshall the parameters */
+        bp = call->request;
+        *bp++ = htonl(FSGETCAPABILITIES);
+        /* Can't take a ref on server */
+        return afs_make_call(ac, call, GFP_NOFS, false);
 }
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index ee86d5ad22d1..5a2f5854f349 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -102,6 +102,35 @@ static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
 }
 /*
+ * Fetch file status from the volume.
+ */
+int afs_fetch_status(struct afs_vnode *vnode, struct key *key)
+{
+        struct afs_fs_cursor fc;
+        int ret;
+        _enter("%s,{%x:%u.%u,S=%lx}",
+               vnode->volume->name,
+               vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique,
+               vnode->flags);
+        ret = -ERESTARTSYS;
+        if (afs_begin_vnode_operation(&fc, vnode, key)) {
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_fetch_file_status(&fc, NULL);
+                }
+                afs_check_for_remote_deletion(&fc, fc.vnode);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break);
+                ret = afs_end_vnode_operation(&fc);
+        }
+        _leave(" = %d", ret);
+        return ret;
+}
+/*
 * iget5() comparator
 */
 int afs_iget5_test(struct inode *inode, void *opaque)
@@ -205,7 +234,7 @@ struct inode *afs_iget_autocell(struct inode *dir, const char *dev_name,
 */
 struct inode *afs_iget(struct super_block *sb, struct key *key,
                       struct afs_fid *fid, struct afs_file_status *status,
-                       struct afs_callback *cb)
+                       struct afs_callback *cb, struct afs_cb_interest *cbi)
 {
        struct afs_iget_data data = { .fid = *fid };
        struct afs_super_info *as;
@@ -238,7 +267,7 @@ struct inode *afs_iget(struct super_block *sb, struct key *key,
        if (!status) {
                /* it's a remotely extant inode */
-                ret = afs_vnode_fetch_status(vnode, key, true);
+                ret = afs_fetch_status(vnode, key);
                if (ret < 0)
                        goto bad_inode;
        } else {
@@ -255,6 +284,7 @@ struct inode *afs_iget(struct super_block *sb, struct key *key,
                        vnode->cb_version = cb->version;
                        vnode->cb_type = cb->type;
                        vnode->cb_expires_at = cb->expiry;
+                        vnode->cb_interest = afs_get_cb_interest(cbi);
                        set_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
                }
@@ -358,7 +388,7 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
         * access */
        if (!test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
                _debug("not promised");
-                ret = afs_vnode_fetch_status(vnode, key, false);
+                ret = afs_fetch_status(vnode, key);
                if (ret < 0) {
                        if (ret == -ENOENT) {
                                set_bit(AFS_VNODE_DELETED, &vnode->flags);
@@ -468,6 +498,7 @@ void afs_evict_inode(struct inode *inode)
 */
 int afs_setattr(struct dentry *dentry, struct iattr *attr)
 {
+        struct afs_fs_cursor fc;
        struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
        struct key *key;
        int ret;
@@ -498,7 +529,18 @@ int afs_setattr(struct dentry *dentry, struct iattr *attr)
                }
        }
-        ret = afs_vnode_setattr(vnode, key, attr);
+        ret = -ERESTARTSYS;
+        if (afs_begin_vnode_operation(&fc, vnode, key)) {
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_setattr(&fc, attr);
+                }
+                afs_check_for_remote_deletion(&fc, fc.vnode);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break);
+                ret = afs_end_vnode_operation(&fc);
+        }
        if (!(attr->ia_valid & ATTR_FILE))
                key_put(key);
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index df52bf18a263..1fadf40551fd 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -32,16 +32,6 @@
 struct pagevec;
 struct afs_call;
-typedef enum {
-        AFS_VL_NEW,                     /* new, uninitialised record */
-        AFS_VL_CREATING,                /* creating record */
-        AFS_VL_VALID,                   /* record is pending */
-        AFS_VL_NO_VOLUME,               /* no such volume available */
-        AFS_VL_UPDATING,                /* update in progress */
-        AFS_VL_VOLUME_DELETED,          /* volume was deleted */
-        AFS_VL_UNCERTAIN,               /* uncertain state (update failed) */
-} __attribute__((packed)) afs_vlocation_state_t;
 struct afs_mount_params {
        bool                    rwpath;         /* T if the parent should be considered R/W */
        bool                    force;          /* T to force cell type */
@@ -76,8 +66,10 @@ enum afs_call_state {
 struct afs_addr_list {
        struct rcu_head         rcu;            /* Must be first */
        refcount_t              usage;
+        u32                     version;        /* Version */
        unsigned short          nr_addrs;
        unsigned short          index;          /* Address currently in use */
+        unsigned short          nr_ipv4;        /* Number of IPv4 addresses */
        struct sockaddr_rxrpc   addrs[];
 };
@@ -93,7 +85,7 @@ struct afs_call {
        struct key              *key;           /* security for this call */
        struct afs_net          *net;           /* The network namespace */
        struct afs_server       *cm_server;     /* Server affected by incoming CM call */
-        struct afs_server       *server;        /* Server used by client call */
+        struct afs_cb_interest  *cbi;           /* Callback interest for server used */
        void                    *request;       /* request data (first part) */
        struct address_space    *mapping;       /* page set */
        struct afs_writeback    *wb;            /* writeback being performed */
@@ -192,7 +184,6 @@ struct afs_super_info {
        struct afs_net          *net;           /* Network namespace */
        struct afs_cell         *cell;          /* The cell in which the volume resides */
        struct afs_volume       *volume;        /* volume record */
-        char                    rwparent;       /* T if parent is R/W AFS volume */
 };
 static inline struct afs_super_info *AFS_FS_S(struct super_block *sb)
@@ -228,26 +219,26 @@ struct afs_net {
        spinlock_t              proc_cells_lock;
        struct list_head        proc_cells;
-        /* Volume location database */
-        struct list_head        vl_updates;             /* VL records in need-update order */
-        struct list_head        vl_graveyard;           /* Inactive VL records */
-        struct delayed_work     vl_reaper;
-        struct delayed_work     vl_updater;
-        spinlock_t              vl_updates_lock;
-        spinlock_t              vl_graveyard_lock;
-        /* File locking renewal management */
-        struct mutex            lock_manager_mutex;
-        /* Server database */
-        struct rb_root          servers;                /* Active servers */
-        rwlock_t                servers_lock;
-        struct list_head        server_graveyard;       /* Inactive server LRU list */
-        spinlock_t              server_graveyard_lock;
-        struct timer_list       server_timer;
-        struct work_struct      server_reaper;
+        /* Known servers.  Theoretically each fileserver can only be in one
+         * cell, but in practice, people create aliases and subsets and there's
+         * no easy way to distinguish them.
+         */
+        seqlock_t               fs_lock;        /* For fs_servers */
+        struct rb_root          fs_servers;     /* afs_server (by server UUID or address) */
+        struct list_head        fs_updates;     /* afs_server (by update_at) */
+        struct hlist_head       fs_proc;        /* procfs servers list */
+        struct hlist_head       fs_addresses4;  /* afs_server (by lowest IPv4 addr) */
+        struct hlist_head       fs_addresses6;  /* afs_server (by lowest IPv6 addr) */
+        seqlock_t               fs_addr_lock;   /* For fs_addresses[46] */
+        struct work_struct      fs_manager;
+        struct timer_list       fs_timer;
        atomic_t                servers_outstanding;
+        /* File locking renewal management */
+        struct mutex            lock_manager_mutex;
        /* Misc */
        struct proc_dir_entry   *proc_afs;              /* /proc/net/afs directory */
 };
@@ -264,7 +255,21 @@ enum afs_cell_state {
 };
 /*
- * AFS cell record
+ * AFS cell record.
+ *
+ * This is a tricky concept to get right as it is possible to create aliases
+ * simply by pointing AFSDB/SRV records for two names at the same set of VL
+ * servers; it is also possible to do things like setting up two sets of VL
+ * servers, one of which provides a superset of the volumes provided by the
+ * other (for internal/external division, for example).
+ *
+ * Cells only exist in the sense that (a) a cell's name maps to a set of VL
+ * servers and (b) a cell's name is used by the client to select the key to use
+ * for authentication and encryption.  The cell name is not typically used in
+ * the protocol.
+ *
+ * There is no easy way to determine if two cells are aliases or one is a
+ * subset of another.
 */
 struct afs_cell {
        union {
@@ -278,14 +283,6 @@ struct afs_cell {
 #ifdef CONFIG_AFS_FSCACHE
        struct fscache_cookie   *cache;         /* caching cookie */
 #endif
-        /* server record management */
-        rwlock_t                servers_lock;   /* active server list lock */
-        struct list_head        servers;        /* active server list */
-        /* volume location record management */
-        struct rw_semaphore     vl_sem;         /* volume management serialisation semaphore */
-        struct list_head        vl_list;        /* cell's active VL record list */
        time64_t                dns_expiry;     /* Time AFSDB/SRV record expires */
        time64_t                last_inactive;  /* Time of last drop of usage count */
        atomic_t                usage;
@@ -298,9 +295,11 @@ struct afs_cell {
        enum afs_cell_state     state;
        short                   error;
-        spinlock_t              vl_lock;        /* vl_list lock */
+        /* Active fileserver interaction state. */
+        struct list_head        proc_volumes;   /* procfs volume list */
+        rwlock_t                proc_lock;
-        /* VLDB server list. */
+        /* VL server list. */
        rwlock_t                vl_addrs_lock;  /* Lock on vl_addrs */
        struct afs_addr_list    __rcu *vl_addrs; /* List of VL servers */
        u8                      name_len;       /* Length of name */
@@ -308,65 +307,62 @@ struct afs_cell {
 };
 /*
- * entry in the cached volume location catalogue
+ * Cached VLDB entry.
+ *
+ * This is pointed to by cell->vldb_entries, indexed by name.
 */
-struct afs_cache_vlocation {
+struct afs_vldb_entry {
-        /* volume name (lowercase, padded with NULs) */
+        afs_volid_t             vid[3];         /* Volume IDs for R/W, R/O and Bak volumes */
-        uint8_t                 name[AFS_MAXVOLNAME + 1];
-        uint8_t                 nservers;       /* number of entries used in servers[] */
+        unsigned long           flags;
-        uint8_t                 vidmask;        /* voltype mask for vid[] */
+#define AFS_VLDB_HAS_RW         0               /* - R/W volume exists */
-        uint8_t                 srvtmask[8];    /* voltype masks for servers[] */
+#define AFS_VLDB_HAS_RO         1               /* - R/O volume exists */
+#define AFS_VLDB_HAS_BAK        2               /* - Backup volume exists */
+#define AFS_VLDB_QUERY_VALID    3               /* - Record is valid */
+#define AFS_VLDB_QUERY_ERROR    4               /* - VL server returned error */
+        uuid_t                  fs_server[AFS_NMAXNSERVERS];
+        u8                      fs_mask[AFS_NMAXNSERVERS];
 #define AFS_VOL_VTM_RW  0x01 /* R/W version of the volume is available (on this server) */
 #define AFS_VOL_VTM_RO  0x02 /* R/O version of the volume is available (on this server) */
 #define AFS_VOL_VTM_BAK 0x04 /* backup version of the volume is available (on this server) */
+        short                   error;
-        afs_volid_t             vid[3];         /* volume IDs for R/W, R/O and Bak volumes */
+        u8                      nr_servers;     /* Number of server records */
-        struct sockaddr_rxrpc   servers[8];     /* fileserver addresses */
+        u8                      name_len;
-        time_t                  rtime;          /* last retrieval time */
+        u8                      name[AFS_MAXVOLNAME + 1]; /* NUL-padded volume name */
-};
-/*
- * AFS volume location record
- */
-struct afs_vlocation {
-        atomic_t                usage;
-        time64_t                time_of_death;  /* time at which put reduced usage to 0 */
-        struct list_head        link;           /* link in cell volume location list */
-        struct list_head        grave;          /* link in master graveyard list */
-        struct list_head        update;         /* link in master update list */
-        struct afs_cell         *cell;          /* cell to which volume belongs */
-        struct afs_cache_vlocation vldb;        /* volume information DB record */
-        struct afs_volume       *vols[3];       /* volume access record pointer (index by type) */
-        wait_queue_head_t       waitq;          /* status change waitqueue */
-        time64_t                update_at;      /* time at which record should be updated */
-        spinlock_t              lock;           /* access lock */
-        afs_vlocation_state_t   state;          /* volume location state */
-        unsigned short          upd_rej_cnt;    /* ENOMEDIUM count during update */
-        unsigned short          upd_busy_cnt;   /* EBUSY count during update */
-        bool                    valid;          /* T if valid */
 };
 /*
- * AFS fileserver record
+ * Record of fileserver with which we're actively communicating.
 */
 struct afs_server {
-        atomic_t                usage;
+        struct rcu_head         rcu;
-        time64_t                time_of_death;  /* time at which put reduced usage to 0 */
+        union {
-        struct afs_addr_list    __rcu *addrs;   /* List of addresses for this server */
+                uuid_t          uuid;           /* Server ID */
-        struct afs_net          *net;           /* Network namespace in which the server resides */
+                struct afs_uuid _uuid;
-        struct afs_cell         *cell;          /* cell in which server resides */
+        };
-        struct list_head        link;           /* link in cell's server list */
-        struct list_head        grave;          /* link in master graveyard list */
+        struct afs_addr_list    __rcu *addresses;
+        struct rb_node          uuid_rb;        /* Link in net->fs_servers */
-        struct rb_node          master_rb;      /* link in master by-addr tree */
+        struct hlist_node       addr4_link;     /* Link in net->fs_addresses4 */
-        struct rw_semaphore     sem;            /* access lock */
+        struct hlist_node       addr6_link;     /* Link in net->fs_addresses6 */
+        struct hlist_node       proc_link;      /* Link in net->fs_proc */
+        struct afs_server       *gc_next;       /* Next server in manager's list */
+        time64_t                put_time;       /* Time at which last put */
+        time64_t                update_at;      /* Time at which to next update the record */
        unsigned long           flags;
-#define AFS_SERVER_NEW          0               /* New server, don't inc cb_s_break */
+#define AFS_SERVER_FL_NEW       0               /* New server, don't inc cb_s_break */
+#define AFS_SERVER_FL_NOT_READY 1               /* The record is not ready for use */
+#define AFS_SERVER_FL_NOT_FOUND 2               /* VL server says no such server */
+#define AFS_SERVER_FL_VL_FAIL   3               /* Failed to access VL server */
+#define AFS_SERVER_FL_UPDATING  4
+#define AFS_SERVER_FL_PROBED    5               /* The fileserver has been probed */
+#define AFS_SERVER_FL_PROBING   6               /* Fileserver is being probed */
+        atomic_t                usage;
+        u32                     addr_version;   /* Address list version */
        /* file service access */
-        int                     fs_state;       /* 0 or reason FS currently marked dead (-errno) */
+        rwlock_t                fs_lock;        /* access lock */
-        spinlock_t              fs_lock;        /* access lock */
        /* callback promise management */
        struct list_head        cb_interests;   /* List of superblocks using this server */
@@ -386,32 +382,50 @@ struct afs_cb_interest {
 };
 /*
- * AFS volume access record
+ * Replaceable server list.
+ */
+struct afs_server_entry {
+        struct afs_server       *server;
+        struct afs_cb_interest  *cb_interest;
+};
+struct afs_server_list {
+        refcount_t              usage;
+        unsigned short          nr_servers;
+        unsigned short          index;          /* Server currently in use */
+        unsigned short          vnovol_mask;    /* Servers to be skipped due to VNOVOL */
+        unsigned int            seq;            /* Set to ->servers_seq when installed */
+        struct afs_server_entry servers[];
+};
+/*
+ * Live AFS volume management.
 */
 struct afs_volume {
+        afs_volid_t             vid;            /* volume ID */
        atomic_t                usage;
-        struct afs_cell         *cell;          /* cell to which belongs (unrefd ptr) */
+        time64_t                update_at;      /* Time at which to next update */
-        struct afs_vlocation    *vlocation;     /* volume location */
+        struct afs_cell         *cell;          /* Cell to which belongs (pins ref) */
+        struct list_head        proc_link;      /* Link in cell->proc_volumes */
+        unsigned long           flags;
+#define AFS_VOLUME_NEEDS_UPDATE 0       /* - T if an update needs performing */
+#define AFS_VOLUME_UPDATING     1       /* - T if an update is in progress */
+#define AFS_VOLUME_WAIT         2       /* - T if users must wait for update */
+#define AFS_VOLUME_DELETED      3       /* - T if volume appears deleted */
+#define AFS_VOLUME_OFFLINE      4       /* - T if volume offline notice given */
+#define AFS_VOLUME_BUSY         5       /* - T if volume busy notice given */
 #ifdef CONFIG_AFS_FSCACHE
        struct fscache_cookie   *cache;         /* caching cookie */
 #endif
-        afs_volid_t             vid;            /* volume ID */
+        struct afs_server_list  *servers;       /* List of servers on which volume resides */
+        rwlock_t                servers_lock;   /* Lock for ->servers */
+        unsigned int            servers_seq;    /* Incremented each time ->servers changes */
        afs_voltype_t           type;           /* type of volume */
+        short                   error;
        char                    type_force;     /* force volume type (suppress R/O -> R/W) */
-        unsigned short          nservers;       /* number of server slots filled */
+        u8                      name_len;
-        unsigned short          rjservers;      /* number of servers discarded due to -ENOMEDIUM */
+        u8                      name[AFS_MAXVOLNAME + 1]; /* NUL-padded volume name */
-        struct afs_server       *servers[8];    /* servers on which volume resides (ordered) */
-        struct afs_cb_interest  *cb_interests[8]; /* Interests on servers for callbacks */
-        struct rw_semaphore     server_sem;     /* lock for accessing current server */
-};
-/*
- * vnode catalogue entry
- */
-struct afs_cache_vnode {
-        afs_vnodeid_t           vnode_id;       /* vnode ID */
-        unsigned                vnode_unique;   /* vnode ID uniquifier */
-        afs_dataversion_t       data_version;   /* data version */
 };
 /*
@@ -427,10 +441,8 @@ struct afs_vnode {
        struct fscache_cookie   *cache;         /* caching cookie */
 #endif
        struct afs_permits      *permit_cache;  /* cache of permits so far obtained */
+        struct mutex            io_lock;        /* Lock for serialising I/O on this vnode */
        struct mutex            validate_lock;  /* lock for validating this vnode */
-        wait_queue_head_t       update_waitq;   /* status fetch waitqueue */
-        int                     update_cnt;     /* number of outstanding ops that will update the
-                                                 * status */
        spinlock_t              writeback_lock; /* lock for writebacks */
        spinlock_t              lock;           /* waitqueue/flags lock */
        unsigned long           flags;
@@ -501,6 +513,7 @@ struct afs_interface {
 struct afs_addr_cursor {
        struct afs_addr_list    *alist;         /* Current address list (pins ref) */
        struct sockaddr_rxrpc   *addr;
+        u32                     abort_code;
        unsigned short          start;          /* Starting point in alist->addrs[] */
        unsigned short          index;          /* Wrapping offset from start to current addr */
        short                   error;
@@ -513,7 +526,21 @@ struct afs_addr_cursor {
 */
 struct afs_fs_cursor {
        struct afs_addr_cursor  ac;
-        struct afs_server       *server;        /* Current server (pins ref) */
+        struct afs_vnode        *vnode;
+        struct afs_server_list  *server_list;   /* Current server list (pins ref) */
+        struct afs_cb_interest  *cbi;           /* Server on which this resides (pins ref) */
+        struct key              *key;           /* Key for the server */
+        unsigned int            cb_break;       /* cb_break + cb_s_break before the call */
+        unsigned int            cb_break_2;     /* cb_break + cb_s_break (2nd vnode) */
+        unsigned char           start;          /* Initial index in server list */
+        unsigned char           index;          /* Number of servers tried beyond start */
+        unsigned short          flags;
+#define AFS_FS_CURSOR_STOP      0x0001          /* Set to cease iteration */
+#define AFS_FS_CURSOR_VBUSY     0x0002          /* Set if seen VBUSY */
+#define AFS_FS_CURSOR_VMOVED    0x0004          /* Set if seen VMOVED */
+#define AFS_FS_CURSOR_VNOVOL    0x0008          /* Set if seen VNOVOL */
+#define AFS_FS_CURSOR_CUR_ONLY  0x0010          /* Set if current server only (file lock held) */
+#define AFS_FS_CURSOR_NO_VSLEEP 0x0020          /* Set to prevent sleep on VBUSY, VOFFLINE, ... */
 };
 /*****************************************************************************/
@@ -537,6 +564,8 @@ extern bool afs_iterate_addresses(struct afs_addr_cursor *);
 extern int afs_end_cursor(struct afs_addr_cursor *);
 extern int afs_set_vl_cursor(struct afs_addr_cursor *, struct afs_cell *);
+extern void afs_merge_fs_addr4(struct afs_addr_list *, __be32);
 /*
 * cache.c
 */
@@ -558,10 +587,9 @@ extern void afs_init_callback_state(struct afs_server *);
 extern void afs_break_callback(struct afs_vnode *);
 extern void afs_break_callbacks(struct afs_server *, size_t,struct afs_callback[]);
-extern int afs_register_server_cb_interest(struct afs_vnode *, struct afs_cb_interest **,
+extern int afs_register_server_cb_interest(struct afs_vnode *, struct afs_server_entry *);
-                                           struct afs_server *);
 extern void afs_put_cb_interest(struct afs_net *, struct afs_cb_interest *);
-extern void afs_clear_callback_interests(struct afs_net *, struct afs_volume *);
+extern void afs_clear_callback_interests(struct afs_net *, struct afs_server_list *);
 static inline struct afs_cb_interest *afs_get_cb_interest(struct afs_cb_interest *cbi)
 {
@@ -603,6 +631,7 @@ extern const struct file_operations afs_file_operations;
 extern int afs_open(struct inode *, struct file *);
 extern int afs_release(struct inode *, struct file *);
+extern int afs_fetch_data(struct afs_vnode *, struct key *, struct afs_read *);
 extern int afs_page_filler(void *, struct page *);
 extern void afs_put_read(struct afs_read *);
@@ -619,51 +648,40 @@ extern int afs_flock(struct file *, int, struct file_lock *);
 /*
 * fsclient.c
 */
-extern int afs_fs_fetch_file_status(struct afs_fs_cursor *, struct key *,
+extern int afs_fs_fetch_file_status(struct afs_fs_cursor *, struct afs_volsync *);
-                                    struct afs_vnode *, struct afs_volsync *,
+extern int afs_fs_give_up_callbacks(struct afs_net *, struct afs_server *);
-                                    bool);
+extern int afs_fs_fetch_data(struct afs_fs_cursor *, struct afs_read *);
-extern int afs_fs_give_up_callbacks(struct afs_net *, struct afs_server *, bool);
+extern int afs_fs_create(struct afs_fs_cursor *, const char *, umode_t,
-extern int afs_fs_fetch_data(struct afs_fs_cursor *, struct key *,
+                         struct afs_fid *, struct afs_file_status *, struct afs_callback *);
-                             struct afs_vnode *, struct afs_read *, bool);
+extern int afs_fs_remove(struct afs_fs_cursor *, const char *, bool);
-extern int afs_fs_create(struct afs_fs_cursor *, struct key *,
+extern int afs_fs_link(struct afs_fs_cursor *, struct afs_vnode *, const char *);
-                         struct afs_vnode *, const char *, umode_t,
+extern int afs_fs_symlink(struct afs_fs_cursor *, const char *, const char *,
-                         struct afs_fid *, struct afs_file_status *,
+                          struct afs_fid *, struct afs_file_status *);
-                         struct afs_callback *, bool);
+extern int afs_fs_rename(struct afs_fs_cursor *, const char *,
-extern int afs_fs_remove(struct afs_fs_cursor *, struct key *,
+                         struct afs_vnode *, const char *);
-                         struct afs_vnode *, const char *, bool, bool);
-extern int afs_fs_link(struct afs_fs_cursor *, struct key *, struct afs_vnode *,
-                       struct afs_vnode *, const char *, bool);
-extern int afs_fs_symlink(struct afs_fs_cursor *, struct key *,
-                          struct afs_vnode *, const char *, const char *,
-                          struct afs_fid *, struct afs_file_status *, bool);
-extern int afs_fs_rename(struct afs_fs_cursor *, struct key *,
-                         struct afs_vnode *, const char *,
-                         struct afs_vnode *, const char *, bool);
 extern int afs_fs_store_data(struct afs_fs_cursor *, struct afs_writeback *,
-                             pgoff_t, pgoff_t, unsigned, unsigned, bool);
+                             pgoff_t, pgoff_t, unsigned, unsigned);
-extern int afs_fs_setattr(struct afs_fs_cursor *, struct key *,
+extern int afs_fs_setattr(struct afs_fs_cursor *, struct iattr *);
-                          struct afs_vnode *, struct iattr *, bool);
+extern int afs_fs_get_volume_status(struct afs_fs_cursor *, struct afs_volume_status *);
-extern int afs_fs_get_volume_status(struct afs_fs_cursor *, struct key *,
+extern int afs_fs_set_lock(struct afs_fs_cursor *, afs_lock_type_t);
-                                    struct afs_vnode *,
+extern int afs_fs_extend_lock(struct afs_fs_cursor *);
-                                    struct afs_volume_status *, bool);
+extern int afs_fs_release_lock(struct afs_fs_cursor *);
-extern int afs_fs_set_lock(struct afs_fs_cursor *, struct key *,
+extern int afs_fs_give_up_all_callbacks(struct afs_net *, struct afs_server *,
-                           struct afs_vnode *, afs_lock_type_t, bool);
+                                        struct afs_addr_cursor *, struct key *);
-extern int afs_fs_extend_lock(struct afs_fs_cursor *, struct key *,
+extern int afs_fs_get_capabilities(struct afs_net *, struct afs_server *,
-                              struct afs_vnode *, bool);
+                                   struct afs_addr_cursor *, struct key *);
-extern int afs_fs_release_lock(struct afs_fs_cursor *, struct key *,
-                               struct afs_vnode *, bool);
-extern int afs_fs_give_up_all_callbacks(struct afs_server *, struct afs_addr_cursor *,
-                                        struct key *, bool);
 /*
 * inode.c
 */
+extern int afs_fetch_status(struct afs_vnode *, struct key *);
 extern int afs_iget5_test(struct inode *, void *);
 extern struct inode *afs_iget_autocell(struct inode *, const char *, int,
                                       struct key *);
 extern struct inode *afs_iget(struct super_block *, struct key *,
                              struct afs_fid *, struct afs_file_status *,
-                              struct afs_callback *);
+                              struct afs_callback *,
+                              struct afs_cb_interest *);
 extern void afs_zap_data(struct afs_vnode *);
 extern int afs_validate(struct afs_vnode *, struct key *);
 extern int afs_getattr(const struct path *, struct kstat *, u32, unsigned int);
@@ -734,6 +752,15 @@ extern int afs_proc_cell_setup(struct afs_net *, struct afs_cell *);
 extern void afs_proc_cell_remove(struct afs_net *, struct afs_cell *);
 /*
+ * rotate.c
+ */
+extern bool afs_begin_vnode_operation(struct afs_fs_cursor *, struct afs_vnode *,
+                                      struct key *);
+extern bool afs_select_fileserver(struct afs_fs_cursor *);
+extern bool afs_select_current_fileserver(struct afs_fs_cursor *);
+extern int afs_end_vnode_operation(struct afs_fs_cursor *);
+/*
 * rxrpc.c
 */
 extern struct workqueue_struct *afs_async_calls;
@@ -779,104 +806,62 @@ static inline struct afs_server *afs_get_server(struct afs_server *server)
        return server;
 }
-extern void afs_server_timer(struct timer_list *);
-extern struct afs_server *afs_lookup_server(struct afs_cell *,
-                                            struct sockaddr_rxrpc *);
 extern struct afs_server *afs_find_server(struct afs_net *,
                                          const struct sockaddr_rxrpc *);
+extern struct afs_server *afs_find_server_by_uuid(struct afs_net *, const uuid_t *);
+extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *);
 extern void afs_put_server(struct afs_net *, struct afs_server *);
-extern void afs_reap_server(struct work_struct *);
+extern void afs_manage_servers(struct work_struct *);
+extern void afs_servers_timer(struct timer_list *);
 extern void __net_exit afs_purge_servers(struct afs_net *);
+extern bool afs_probe_fileserver(struct afs_fs_cursor *);
+extern bool afs_check_server_record(struct afs_fs_cursor *, struct afs_server *);
 /*
- * super.c
+ * server_list.c
 */
-extern int __init afs_fs_init(void);
+static inline struct afs_server_list *afs_get_serverlist(struct afs_server_list *slist)
-extern void __exit afs_fs_exit(void);
+{
+        refcount_inc(&slist->usage);
+        return slist;
+}
-/*
+extern void afs_put_serverlist(struct afs_net *, struct afs_server_list *);
- * vlclient.c
+extern struct afs_server_list *afs_alloc_server_list(struct afs_cell *, struct key *,
- */
+                                                     struct afs_vldb_entry *,
-extern int afs_vl_get_entry_by_name(struct afs_net *, struct afs_addr_cursor *,
+                                                     u8);
-                                    struct key *, const char *,
+extern bool afs_annotate_server_list(struct afs_server_list *, struct afs_server_list *);
-                                    struct afs_cache_vlocation *, bool);
-extern int afs_vl_get_entry_by_id(struct afs_net *, struct afs_addr_cursor *,
-                                  struct key *, afs_volid_t, afs_voltype_t,
-                                  struct afs_cache_vlocation *, bool);
 /*
- * vlocation.c
+ * super.c
 */
-extern struct workqueue_struct *afs_vlocation_update_worker;
+extern int __init afs_fs_init(void);
+extern void __exit afs_fs_exit(void);
-#define afs_get_vlocation(V) do { atomic_inc(&(V)->usage); } while(0)
-extern struct afs_vlocation *afs_vlocation_lookup(struct afs_net *,
-                                                  struct afs_cell *,
-                                                  struct key *,
-                                                  const char *, size_t);
-extern void afs_put_vlocation(struct afs_net *, struct afs_vlocation *);
-extern void afs_vlocation_updater(struct work_struct *);
-extern void afs_vlocation_reaper(struct work_struct *);
-extern void __net_exit afs_vlocation_purge(struct afs_net *);
 /*
- * vnode.c
+ * vlclient.c
 */
-static inline struct afs_vnode *AFS_FS_I(struct inode *inode)
+extern struct afs_vldb_entry *afs_vl_get_entry_by_name_u(struct afs_net *,
-{
+                                                         struct afs_addr_cursor *,
-        return container_of(inode, struct afs_vnode, vfs_inode);
+                                                         struct key *, const char *, int);
-}
+extern struct afs_addr_list *afs_vl_get_addrs_u(struct afs_net *, struct afs_addr_cursor *,
+                                                struct key *, const uuid_t *);
-static inline struct inode *AFS_VNODE_TO_I(struct afs_vnode *vnode)
-{
-        return &vnode->vfs_inode;
-}
-extern void afs_vnode_finalise_status_update(struct afs_vnode *,
-                                             struct afs_server *);
-extern int afs_vnode_fetch_status(struct afs_vnode *, struct key *, bool);
-extern int afs_vnode_fetch_data(struct afs_vnode *, struct key *,
-                                struct afs_read *);
-extern int afs_vnode_create(struct afs_vnode *, struct key *, const char *,
-                            umode_t, struct afs_fid *, struct afs_file_status *,
-                            struct afs_callback *, struct afs_server **);
-extern int afs_vnode_remove(struct afs_vnode *, struct key *, const char *,
-                            bool);
-extern int afs_vnode_link(struct afs_vnode *, struct afs_vnode *, struct key *,
-                          const char *);
-extern int afs_vnode_symlink(struct afs_vnode *, struct key *, const char *,
-                             const char *, struct afs_fid *,
-                             struct afs_file_status *, struct afs_server **);
-extern int afs_vnode_rename(struct afs_vnode *, struct afs_vnode *,
-                            struct key *, const char *, const char *);
-extern int afs_vnode_store_data(struct afs_writeback *, pgoff_t, pgoff_t,
-                                unsigned, unsigned);
-extern int afs_vnode_setattr(struct afs_vnode *, struct key *, struct iattr *);
-extern int afs_vnode_get_volume_status(struct afs_vnode *, struct key *,
-                                       struct afs_volume_status *);
-extern int afs_vnode_set_lock(struct afs_vnode *, struct key *,
-                              afs_lock_type_t);
-extern int afs_vnode_extend_lock(struct afs_vnode *, struct key *);
-extern int afs_vnode_release_lock(struct afs_vnode *, struct key *);
 /*
 * volume.c
 */
-static inline struct afs_volume *afs_get_volume(struct afs_volume *volume)
+static inline struct afs_volume *__afs_get_volume(struct afs_volume *volume)
 {
        if (volume)
                atomic_inc(&volume->usage);
        return volume;
 }
+extern struct afs_volume *afs_create_volume(struct afs_mount_params *);
+extern void afs_activate_volume(struct afs_volume *);
+extern void afs_deactivate_volume(struct afs_volume *);
 extern void afs_put_volume(struct afs_cell *, struct afs_volume *);
-extern struct afs_volume *afs_volume_lookup(struct afs_mount_params *);
+extern int afs_check_volume_status(struct afs_volume *, struct key *);
-extern void afs_init_fs_cursor(struct afs_fs_cursor *, struct afs_vnode *);
-extern int afs_set_fs_cursor(struct afs_fs_cursor *, struct afs_vnode *);
-extern bool afs_volume_pick_fileserver(struct afs_fs_cursor *, struct afs_vnode *);
-extern bool afs_iterate_fs_cursor(struct afs_fs_cursor *, struct afs_vnode *);
-extern int afs_end_fs_cursor(struct afs_fs_cursor *, struct afs_net *);
 /*
 * write.c
@@ -903,6 +888,38 @@ extern int afs_fsync(struct file *, loff_t, loff_t, int);
 extern const struct xattr_handler *afs_xattr_handlers[];
 extern ssize_t afs_listxattr(struct dentry *, char *, size_t);
+/*
+ * Miscellaneous inline functions.
+ */
+static inline struct afs_vnode *AFS_FS_I(struct inode *inode)
+{
+        return container_of(inode, struct afs_vnode, vfs_inode);
+}
+static inline struct inode *AFS_VNODE_TO_I(struct afs_vnode *vnode)
+{
+        return &vnode->vfs_inode;
+}
+static inline void afs_vnode_commit_status(struct afs_fs_cursor *fc,
+                                           struct afs_vnode *vnode,
+                                           unsigned int cb_break)
+{
+        if (fc->ac.error == 0)
+                afs_cache_permit(vnode, fc->key, cb_break);
+}
+static inline void afs_check_for_remote_deletion(struct afs_fs_cursor *fc,
+                                                 struct afs_vnode *vnode)
+{
+        if (fc->ac.error == -ENOENT) {
+                set_bit(AFS_VNODE_DELETED, &vnode->flags);
+                afs_break_callback(vnode);
+        }
+}
 /*****************************************************************************/
 /*
 * debug tracing
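
The rotate.c declarations in this hunk (afs_begin_vnode_operation(), afs_select_fileserver(), afs_end_vnode_operation()) suggest a begin/select/end loop driven by the cursor's error field. The block below is a minimal userspace sketch of that loop shape, not the kernel implementation. Everything prefixed model_ is hypothetical. Treating -ETIMEDOUT and -EHOSTUNREACH as the "try the next server" class is an assumption made for illustration; the SHRT_MAX "no attempt yet" sentinel and the "success or local failure stops the rotation" rule are taken from the patch. The sketch requires nr_servers > 0.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct afs_fs_cursor, reduced to the fields
 * needed to model rotation over a list of servers.
 */
struct model_cursor {
	int error;			/* SHRT_MAX until an attempt has been made */
	unsigned int index;		/* Server currently selected */
	unsigned int start;		/* Server the rotation began at */
	unsigned int nr_servers;	/* Must be greater than zero */
	bool stop;			/* Set once the rotation is over */
};

static void model_begin_operation(struct model_cursor *fc,
				  unsigned int nr_servers, unsigned int start)
{
	fc->error = SHRT_MAX;
	fc->nr_servers = nr_servers;
	fc->start = start % nr_servers;
	fc->index = fc->start;
	fc->stop = false;
}

/* Return true if the caller should issue the operation to fc->index. */
static bool model_select_server(struct model_cursor *fc)
{
	if (fc->stop)
		return false;

	switch (fc->error) {
	case SHRT_MAX:
		/* No attempt made yet: use the starting server. */
		return true;
	case -ETIMEDOUT:
	case -EHOSTUNREACH:
		/* Assumed "server unreachable" class: rotate to the next
		 * server and give up once we are back where we started.
		 */
		fc->index = (fc->index + 1) % fc->nr_servers;
		if (fc->index == fc->start) {
			fc->stop = true;
			return false;
		}
		return true;
	case 0:
	default:
		/* Success or local failure.  Stop. */
		fc->stop = true;
		return false;
	}
}

static int model_end_operation(struct model_cursor *fc)
{
	return fc->error;
}

/* Run one operation against a table of canned per-server results and
 * count how many servers were tried.
 */
static int model_run(const int *results, unsigned int nr_servers,
		     unsigned int start, unsigned int *attempts)
{
	struct model_cursor fc;

	*attempts = 0;
	model_begin_operation(&fc, nr_servers, start);
	while (model_select_server(&fc)) {
		fc.error = results[fc.index];
		(*attempts)++;
	}
	return model_end_operation(&fc);
}
```

In this model the caller never inspects the server list directly: it only stores the result of each attempt in the cursor and lets the select step decide whether another attempt is worthwhile.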
diff --git a/fs/afs/main.c b/fs/afs/main.c
index e7f87d723761..15a02a05ff40 100644
--- a/fs/afs/main.c
+++ b/fs/afs/main.c
@@ -55,18 +55,17 @@ static int __net_init afs_net_init(struct afs_net *net)
        spin_lock_init(&net->proc_cells_lock);
        INIT_LIST_HEAD(&net->proc_cells);
-        INIT_LIST_HEAD(&net->vl_updates);
+        seqlock_init(&net->fs_lock);
-        INIT_LIST_HEAD(&net->vl_graveyard);
+        net->fs_servers = RB_ROOT;
-        INIT_DELAYED_WORK(&net->vl_reaper, afs_vlocation_reaper);
+        INIT_LIST_HEAD(&net->fs_updates);
-        INIT_DELAYED_WORK(&net->vl_updater, afs_vlocation_updater);
+        INIT_HLIST_HEAD(&net->fs_proc);
-        spin_lock_init(&net->vl_updates_lock);
-        spin_lock_init(&net->vl_graveyard_lock);
+        INIT_HLIST_HEAD(&net->fs_addresses4);
-        net->servers = RB_ROOT;
+        INIT_HLIST_HEAD(&net->fs_addresses6);
-        rwlock_init(&net->servers_lock);
+        seqlock_init(&net->fs_addr_lock);
-        INIT_LIST_HEAD(&net->server_graveyard);
-        spin_lock_init(&net->server_graveyard_lock);
+        INIT_WORK(&net->fs_manager, afs_manage_servers);
-        INIT_WORK(&net->server_reaper, afs_reap_server);
+        timer_setup(&net->fs_timer, afs_servers_timer, 0);
-        timer_setup(&net->server_timer, afs_server_timer, 0);
        /* Register the /proc stuff */
        ret = afs_proc_init(net);
@@ -87,8 +86,8 @@ static int __net_init afs_net_init(struct afs_net *net)
 error_open_socket:
        net->live = false;
-        afs_vlocation_purge(net);
        afs_cell_purge(net);
+        afs_purge_servers(net);
 error_cell_init:
        net->live = false;
        afs_proc_cleanup(net);
@@ -103,9 +102,8 @@ error_proc:
 static void __net_exit afs_net_exit(struct afs_net *net)
 {
        net->live = false;
-        afs_purge_servers(net);
-        afs_vlocation_purge(net);
        afs_cell_purge(net);
+        afs_purge_servers(net);
        afs_close_socket(net);
        afs_proc_cleanup(net);
 }
@@ -125,10 +123,6 @@ static int __init afs_init(void)
        afs_async_calls = alloc_workqueue("kafsd", WQ_MEM_RECLAIM, 0);
        if (!afs_async_calls)
                goto error_async;
-        afs_vlocation_update_worker =
-                alloc_workqueue("kafs_vlupdated", WQ_MEM_RECLAIM, 0);
-        if (!afs_vlocation_update_worker)
-                goto error_vl_up;
        afs_lock_manager = alloc_workqueue("kafs_lockd", WQ_MEM_RECLAIM, 0);
        if (!afs_lock_manager)
                goto error_lockmgr;
@@ -160,8 +154,6 @@ error_cache:
 #endif
        destroy_workqueue(afs_lock_manager);
 error_lockmgr:
-        destroy_workqueue(afs_vlocation_update_worker);
-error_vl_up:
        destroy_workqueue(afs_async_calls);
 error_async:
        destroy_workqueue(afs_wq);
@@ -189,7 +181,6 @@ static void __exit afs_exit(void)
        fscache_unregister_netfs(&afs_cache_netfs);
 #endif
        destroy_workqueue(afs_lock_manager);
-        destroy_workqueue(afs_vlocation_update_worker);
        destroy_workqueue(afs_async_calls);
        destroy_workqueue(afs_wq);
        afs_clean_up_permit_cache();
diff --git a/fs/afs/proc.c b/fs/afs/proc.c
index 9cf9ce88a8dd..4508dd54f789 100644
--- a/fs/afs/proc.c
+++ b/fs/afs/proc.c
@@ -105,22 +105,22 @@ static const struct file_operations afs_proc_cell_vlservers_fops = {
        .release        = seq_release,
 };
-static int afs_proc_cell_servers_open(struct inode *inode, struct file *file);
+static int afs_proc_servers_open(struct inode *inode, struct file *file);
-static void *afs_proc_cell_servers_start(struct seq_file *p, loff_t *pos);
+static void *afs_proc_servers_start(struct seq_file *p, loff_t *pos);
-static void *afs_proc_cell_servers_next(struct seq_file *p, void *v,
+static void *afs_proc_servers_next(struct seq_file *p, void *v,
                                        loff_t *pos);
-static void afs_proc_cell_servers_stop(struct seq_file *p, void *v);
+static void afs_proc_servers_stop(struct seq_file *p, void *v);
-static int afs_proc_cell_servers_show(struct seq_file *m, void *v);
+static int afs_proc_servers_show(struct seq_file *m, void *v);
-static const struct seq_operations afs_proc_cell_servers_ops = {
+static const struct seq_operations afs_proc_servers_ops = {
-        .start  = afs_proc_cell_servers_start,
+        .start  = afs_proc_servers_start,
-        .next   = afs_proc_cell_servers_next,
+        .next   = afs_proc_servers_next,
-        .stop   = afs_proc_cell_servers_stop,
+        .stop   = afs_proc_servers_stop,
-        .show   = afs_proc_cell_servers_show,
+        .show   = afs_proc_servers_show,
 };
-static const struct file_operations afs_proc_cell_servers_fops = {
+static const struct file_operations afs_proc_servers_fops = {
-        .open           = afs_proc_cell_servers_open,
+        .open           = afs_proc_servers_open,
        .read           = seq_read,
        .llseek         = seq_lseek,
        .release        = seq_release,
@@ -138,7 +138,8 @@ int afs_proc_init(struct afs_net *net)
                goto error_dir;
        if (!proc_create("cells", 0644, net->proc_afs, &afs_proc_cells_fops) ||
-            !proc_create("rootcell", 0644, net->proc_afs, &afs_proc_rootcell_fops))
+            !proc_create("rootcell", 0644, net->proc_afs, &afs_proc_rootcell_fops) ||
+            !proc_create("servers", 0644, net->proc_afs, &afs_proc_servers_fops))
                goto error_tree;
        _leave(" = 0");
@@ -174,7 +175,6 @@ static int afs_proc_cells_open(struct inode *inode, struct file *file)
        m = file->private_data;
        m->private = PDE_DATA(inode);
        return 0;
 }
@@ -357,12 +357,10 @@ int afs_proc_cell_setup(struct afs_net *net, struct afs_cell *cell)
        if (!dir)
                goto error_dir;
-        if (!proc_create_data("servers", 0, dir,
+        if (!proc_create_data("vlservers", 0, dir,
-                             &afs_proc_cell_servers_fops, cell) ||
+                              &afs_proc_cell_vlservers_fops, cell) ||
-            !proc_create_data("vlservers", 0, dir,
-                             &afs_proc_cell_vlservers_fops, cell) ||
            !proc_create_data("volumes", 0, dir,
-                             &afs_proc_cell_volumes_fops, cell))
+                              &afs_proc_cell_volumes_fops, cell))
                goto error_tree;
        _leave(" = 0");
@@ -420,9 +418,8 @@ static void *afs_proc_cell_volumes_start(struct seq_file *m, loff_t *_pos)
        _enter("cell=%p pos=%Ld", cell, *_pos);
-        /* lock the list against modification */
+        read_lock(&cell->proc_lock);
-        down_read(&cell->vl_sem);
+        return seq_list_start_head(&cell->proc_volumes, *_pos);
-        return seq_list_start_head(&cell->vl_list, *_pos);
 }
 /*
@@ -434,7 +431,7 @@ static void *afs_proc_cell_volumes_next(struct seq_file *p, void *v,
        struct afs_cell *cell = p->private;
        _enter("cell=%p pos=%Ld", cell, *_pos);
-        return seq_list_next(v, &cell->vl_list, _pos);
+        return seq_list_next(v, &cell->proc_volumes, _pos);
 }
 /*
@@ -444,17 +441,13 @@ static void afs_proc_cell_volumes_stop(struct seq_file *p, void *v)
 {
        struct afs_cell *cell = p->private;
-        up_read(&cell->vl_sem);
+        read_unlock(&cell->proc_lock);
 }
-static const char afs_vlocation_states[][4] = {
+static const char afs_vol_types[3][3] = {
-        [AFS_VL_NEW]                    = "New",
+        [AFSVL_RWVOL]   = "RW",
-        [AFS_VL_CREATING]               = "Crt",
+        [AFSVL_ROVOL]   = "RO",
-        [AFS_VL_VALID]                  = "Val",
+        [AFSVL_BACKVOL] = "BK",
-        [AFS_VL_NO_VOLUME]              = "NoV",
-        [AFS_VL_UPDATING]               = "Upd",
-        [AFS_VL_VOLUME_DELETED]         = "Del",
-        [AFS_VL_UNCERTAIN]              = "Unc",
 };
 /*
@@ -463,23 +456,17 @@ static const char afs_vlocation_states[][4] = {
 static int afs_proc_cell_volumes_show(struct seq_file *m, void *v)
 {
        struct afs_cell *cell = m->private;
-        struct afs_vlocation *vlocation =
+        struct afs_volume *vol = list_entry(v, struct afs_volume, proc_link);
-                list_entry(v, struct afs_vlocation, link);
-        /* display header on line 1 */
+        /* Display header on line 1 */
-        if (v == &cell->vl_list) {
+        if (v == &cell->proc_volumes) {
-                seq_puts(m, "USE STT VLID[0]  VLID[1]  VLID[2]  NAME\n");
+                seq_puts(m, "USE VID      TY\n");
                return 0;
        }
-        /* display one cell per line on subsequent lines */
+        seq_printf(m, "%3d %08x %s\n",
-        seq_printf(m, "%3d %s %08x %08x %08x %s\n",
+                   atomic_read(&vol->usage), vol->vid,
-                   atomic_read(&vlocation->usage),
+                   afs_vol_types[vol->type]);
-                   afs_vlocation_states[vlocation->state],
-                   vlocation->vldb.vid[0],
-                   vlocation->vldb.vid[1],
-                   vlocation->vldb.vid[2],
-                   vlocation->vldb.name);
        return 0;
 }
@@ -580,86 +567,62 @@ static int afs_proc_cell_vlservers_show(struct seq_file *m, void *v)
 }
 /*
- * open "/proc/fs/afs/<cell>/servers" which provides a summary of active
+ * open "/proc/fs/afs/servers" which provides a summary of active
 * servers
 */
-static int afs_proc_cell_servers_open(struct inode *inode, struct file *file)
+static int afs_proc_servers_open(struct inode *inode, struct file *file)
 {
-        struct afs_cell *cell;
+        return seq_open(file, &afs_proc_servers_ops);
-        struct seq_file *m;
-        int ret;
-        cell = PDE_DATA(inode);
-        if (!cell)
-                return -ENOENT;
-        ret = seq_open(file, &afs_proc_cell_servers_ops);
-        if (ret < 0)
-                return ret;
-        m = file->private_data;
-        m->private = cell;
-        return 0;
 }
 /*
- * set up the iterator to start reading from the cells list and return the
+ * Set up the iterator to start reading from the server list and return the
- * first item
+ * first item.
 */
-static void *afs_proc_cell_servers_start(struct seq_file *m, loff_t *_pos)
+static void *afs_proc_servers_start(struct seq_file *m, loff_t *_pos)
-        __acquires(m->private->servers_lock)
 {
-        struct afs_cell *cell = m->private;
+        struct afs_net *net = afs_seq2net(m);
-        _enter("cell=%p pos=%Ld", cell, *_pos);
-        /* lock the list against modification */
+        rcu_read_lock();
-        read_lock(&cell->servers_lock);
+        return seq_hlist_start_head_rcu(&net->fs_proc, *_pos);
-        return seq_list_start_head(&cell->servers, *_pos);
 }
 /*
 * move to next cell in cells list
 */
-static void *afs_proc_cell_servers_next(struct seq_file *p, void *v,
+static void *afs_proc_servers_next(struct seq_file *m, void *v, loff_t *_pos)
-                                        loff_t *_pos)
 {
-        struct afs_cell *cell = p->private;
+        struct afs_net *net = afs_seq2net(m);
-        _enter("cell=%p pos=%Ld", cell, *_pos);
+        return seq_hlist_next_rcu(v, &net->fs_proc, _pos);
-        return seq_list_next(v, &cell->servers, _pos);
 }
 /*
 * clean up after reading from the cells list
 */
-static void afs_proc_cell_servers_stop(struct seq_file *p, void *v)
+static void afs_proc_servers_stop(struct seq_file *p, void *v)
-        __releases(p->private->servers_lock)
 {
-        struct afs_cell *cell = p->private;
+        rcu_read_unlock();
-        read_unlock(&cell->servers_lock);
 }
 /*
 * display a header line followed by a load of volume lines
 */
-static int afs_proc_cell_servers_show(struct seq_file *m, void *v)
+static int afs_proc_servers_show(struct seq_file *m, void *v)
 {
-        struct afs_cell *cell = m->private;
+        struct afs_server *server;
-        struct afs_server *server = list_entry(v, struct afs_server, link);
+        struct afs_addr_list *alist;
-        char ipaddr[64];
-        /* display header on line 1 */
+        if (v == SEQ_START_TOKEN) {
-        if (v == &cell->servers) {
+                seq_puts(m, "UUID                                 USE ADDR\n");
-                seq_puts(m, "USE ADDR            STATE\n");
                return 0;
        }
-        /* display one cell per line on subsequent lines */
+        server = list_entry(v, struct afs_server, proc_link);
-        sprintf(ipaddr, "%pISp", &server->addrs->addrs[0].transport);
+        alist = rcu_dereference(server->addresses);
-        seq_printf(m, "%3d %-15s %5d\n",
+        seq_printf(m, "%pU %3d %pISp\n",
-                   atomic_read(&server->usage), ipaddr, server->fs_state);
+                   &server->uuid,
+                   atomic_read(&server->usage),
+                   &alist->addrs[alist->index].transport);
        return 0;
 }
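
The per-cell volumes listing changed above now prints one line per volume using the "%3d %08x %s" format and a three-entry type table. The following userspace sketch shows how such a line lines up under the "USE VID      TY" header. The model_ names and the enum values are hypothetical stand-ins, not the kernel's AFSVL_* definitions, and the sample volume ID is made up.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's volume type constants. */
enum model_voltype { MODEL_RWVOL, MODEL_ROVOL, MODEL_BACKVOL };

/* Two characters plus the terminating NUL per entry, as in the patch. */
static const char model_vol_types[3][3] = {
	[MODEL_RWVOL]   = "RW",
	[MODEL_ROVOL]   = "RO",
	[MODEL_BACKVOL] = "BK",
};

/* Format one volume line: usage count, hexadecimal volume ID, type tag.
 * Returns what snprintf() returns.
 */
static int model_format_volume(char *buf, size_t size, int usage,
			       unsigned int vid, enum model_voltype type)
{
	return snprintf(buf, size, "%3d %08x %s\n",
			usage, vid, model_vol_types[type]);
}
```

Note that the table is indexed directly by the type value, so the sketch, like the patch, relies on the type always being one of the three known values.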
diff --git a/fs/afs/rotate.c b/fs/afs/rotate.c
index c7975b3ba59a..e728ca1776c9 100644
--- a/fs/afs/rotate.c
+++ b/fs/afs/rotate.c
@@ -11,7 +11,12 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/sched.h>
+#include <linux/delay.h>
+#include <linux/sched/signal.h>
 #include "internal.h"
+#include "afs_fs.h"
 /*
 * Initialise a filesystem server cursor for iterating over FS servers.
@@ -22,6 +27,460 @@ void afs_init_fs_cursor(struct afs_fs_cursor *fc, struct afs_vnode *vnode)
 }
 /*
+ * Begin an operation on the fileserver.
+ *
+ * Fileserver operations are serialised on the server by vnode, so we serialise
+ * them here also using the io_lock.
+ */
+bool afs_begin_vnode_operation(struct afs_fs_cursor *fc, struct afs_vnode *vnode,
+                               struct key *key)
+{
+        afs_init_fs_cursor(fc, vnode);
+        fc->vnode = vnode;
+        fc->key = key;
+        fc->ac.error = SHRT_MAX;
+        if (mutex_lock_interruptible(&vnode->io_lock) < 0) {
+                fc->ac.error = -EINTR;
+                fc->flags |= AFS_FS_CURSOR_STOP;
+                return false;
+        }
+        if (test_bit(AFS_VNODE_READLOCKED, &vnode->flags) ||
+            test_bit(AFS_VNODE_WRITELOCKED, &vnode->flags))
+                fc->flags |= AFS_FS_CURSOR_CUR_ONLY;
+        return true;
+}
+/*
+ * Begin iteration through a server list, starting with the vnode's last used
+ * server if possible, or the last recorded good server if not.
+ */
+static bool afs_start_fs_iteration(struct afs_fs_cursor *fc,
+                                   struct afs_vnode *vnode)
+{
+        struct afs_cb_interest *cbi;
+        int i;
+        read_lock(&vnode->volume->servers_lock);
+        fc->server_list = afs_get_serverlist(vnode->volume->servers);
+        read_unlock(&vnode->volume->servers_lock);
+        cbi = vnode->cb_interest;
+        if (cbi) {
+                /* See if the vnode's preferred record is still available */
+                for (i = 0; i < fc->server_list->nr_servers; i++) {
+                        if (fc->server_list->servers[i].cb_interest == cbi) {
+                                fc->start = i;
+                                goto found_interest;
+                        }
+                }
+                /* If we have a lock outstanding on a server that's no longer
+                 * serving this vnode, then we can't switch to another server
+                 * and have to return an error.
+                 */
+                if (fc->flags & AFS_FS_CURSOR_CUR_ONLY) {
+                        fc->ac.error = -ESTALE;
+                        return false;
+                }
+                /* Note that the callback promise is effectively broken */
+                write_seqlock(&vnode->cb_lock);
+                ASSERTCMP(cbi, ==, vnode->cb_interest);
+                vnode->cb_interest = NULL;
+                if (test_and_clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags))
+                        vnode->cb_break++;
+                write_sequnlock(&vnode->cb_lock);
+                afs_put_cb_interest(afs_v2net(vnode), cbi);
+                cbi = NULL;
+        } else {
+                fc->start = READ_ONCE(fc->server_list->index);
+        }
+found_interest:
+        fc->index = fc->start;
+        return true;
+}
+/*
+ * Post volume busy note.
+ */
+static void afs_busy(struct afs_volume *volume, u32 abort_code)
+{
+        const char *m;
+        switch (abort_code) {
+        case VOFFLINE:          m = "offline";          break;
+        case VRESTARTING:       m = "restarting";       break;
+        case VSALVAGING:        m = "being salvaged";   break;
+        default:                m = "busy";             break;
+        }
+
+        pr_notice("kAFS: Volume %u '%s' is %s\n", volume->vid, volume->name, m);
+}
+/*
+ * Sleep and retry the operation to the same fileserver.
+ */
+static bool afs_sleep_and_retry(struct afs_fs_cursor *fc)
+{
+        msleep_interruptible(1000);
+        if (signal_pending(current)) {
+                fc->ac.error = -ERESTARTSYS;
+                return false;
+        }
+        return true;
+}
+/*
+ * Select the fileserver to use.  May be called multiple times to rotate
+ * through the fileservers.
+ */
+bool afs_select_fileserver(struct afs_fs_cursor *fc)
+{
+        struct afs_addr_list *alist;
+        struct afs_server *server;
+        struct afs_vnode *vnode = fc->vnode;
+        _enter("%u/%u,%u/%u,%d,%d",
+               fc->index, fc->start,
+               fc->ac.index, fc->ac.start,
+               fc->ac.error, fc->ac.abort_code);
+        if (fc->flags & AFS_FS_CURSOR_STOP) {
+                _leave(" = f [stopped]");
+                return false;
+        }
+        /* Evaluate the result of the previous operation, if there was one. */
+        switch (fc->ac.error) {
+        case SHRT_MAX:
+                goto start;
+        case 0:
+        default:
+                /* Success or local failure.  Stop. */
+                fc->flags |= AFS_FS_CURSOR_STOP;
+                _leave(" = f [okay/local %d]", fc->ac.error);
+                return false;
+        case -ECONNABORTED:
+                /* The far side rejected the operation on some grounds.  This
+                 * might involve the server being busy or the volume having been moved.
+                 */
+                switch (fc->ac.abort_code) {
+                case VNOVOL:
+                        /* This fileserver doesn't know about the volume.
+                         * - May indicate that the VL is wrong - retry once and compare
+                         *   the results.
+                         * - May indicate that the fileserver couldn't attach to the vol.
+                         */
+                        if (fc->flags & AFS_FS_CURSOR_VNOVOL) {
+                                fc->ac.error = -EREMOTEIO;
+                                goto failed;
+                        }
+                        write_lock(&vnode->volume->servers_lock);
+                        fc->server_list->vnovol_mask |= 1 << fc->index;
+                        write_unlock(&vnode->volume->servers_lock);
+                        set_bit(AFS_VOLUME_NEEDS_UPDATE, &vnode->volume->flags);
+                        fc->ac.error = afs_check_volume_status(vnode->volume, fc->key);
+                        if (fc->ac.error < 0)
+                                goto failed;
+                        if (test_bit(AFS_VOLUME_DELETED, &vnode->volume->flags)) {
+                                fc->ac.error = -ENOMEDIUM;
+                                goto failed;
+                        }
+                        /* If the server list didn't change, then assume that
+                         * it's the fileserver having trouble.
+                         */
+                        if (vnode->volume->servers == fc->server_list) {
+                                fc->ac.error = -EREMOTEIO;
+                                goto failed;
+                        }
+                        /* Try again */
+                        fc->flags |= AFS_FS_CURSOR_VNOVOL;
+                        _leave(" = t [vnovol]");
+                        return true;
+                case VSALVAGE: /* TODO: Should this return an error or iterate? */
+                case VVOLEXISTS:
+                case VNOSERVICE:
+                case VONLINE:
+                case VDISKFULL:
+                case VOVERQUOTA:
+                        fc->ac.error = afs_abort_to_error(fc->ac.abort_code);
+                        goto next_server;
+                case VOFFLINE:
+                        if (!test_and_set_bit(AFS_VOLUME_OFFLINE, &vnode->volume->flags)) {
+                                afs_busy(vnode->volume, fc->ac.abort_code);
+                                clear_bit(AFS_VOLUME_BUSY, &vnode->volume->flags);
+                        }
+                        if (fc->flags & AFS_FS_CURSOR_NO_VSLEEP) {
+                                fc->ac.error = -EADV;
+                                goto failed;
+                        }
+                        if (fc->flags & AFS_FS_CURSOR_CUR_ONLY) {
+                                fc->ac.error = -ESTALE;
+                                goto failed;
+                        }
+                        goto busy;
+                case VSALVAGING:
+                case VRESTARTING:
+                case VBUSY:
+                        /* Retry after going round all the servers unless we
+                         * have a file lock we need to maintain.
+                         */
+                        if (fc->flags & AFS_FS_CURSOR_NO_VSLEEP) {
+                                fc->ac.error = -EBUSY;
+                                goto failed;
+                        }
+                        if (!test_and_set_bit(AFS_VOLUME_BUSY, &vnode->volume->flags)) {
+                                afs_busy(vnode->volume, fc->ac.abort_code);
+                                clear_bit(AFS_VOLUME_OFFLINE, &vnode->volume->flags);
+                        }
+                busy:
+                        if (fc->flags & AFS_FS_CURSOR_CUR_ONLY) {
+                                if (!afs_sleep_and_retry(fc))
+                                        goto failed;
+                                /* Retry with same server & address */
+                                _leave(" = t [vbusy]");
+                                return true;
+                        }
+                        fc->flags |= AFS_FS_CURSOR_VBUSY;
+                        goto next_server;
+                case VMOVED:
+                        /* The volume migrated to another server.  We
+                         * consider all locks and callbacks broken and request
+                         * an update from the VLDB.
+                         *
+                         * We also limit the number of VMOVED hops we will
+                         * honour, just in case someone sets up a loop.
+                         */
+                        if (fc->flags & AFS_FS_CURSOR_VMOVED) {
+                                fc->ac.error = -EREMOTEIO;
+                                goto failed;
+                        }
+                        fc->flags |= AFS_FS_CURSOR_VMOVED;
+                        set_bit(AFS_VOLUME_WAIT, &vnode->volume->flags);
+                        set_bit(AFS_VOLUME_NEEDS_UPDATE, &vnode->volume->flags);
+                        fc->ac.error = afs_check_volume_status(vnode->volume, fc->key);
+                        if (fc->ac.error < 0)
+                                goto failed;
+                        /* If the server list didn't change, then the VLDB is
+                         * out of sync with the fileservers.  This is hopefully
+                         * a temporary condition, however, so we don't want to
+                         * permanently block access to the file.
+                         *
+                         * TODO: Try other fileservers if we can.
+                         *
+                         * TODO: Retry a few times with sleeps.
+                         */
+                        if (vnode->volume->servers == fc->server_list) {
+                                fc->ac.error = -ENOMEDIUM;
+                                goto failed;
+                        }
+                        goto restart_from_beginning;
+                default:
+                        clear_bit(AFS_VOLUME_OFFLINE, &vnode->volume->flags);
+                        clear_bit(AFS_VOLUME_BUSY, &vnode->volume->flags);
+                        fc->ac.error = afs_abort_to_error(fc->ac.abort_code);
+                        goto failed;
+                }
+        case -ENETUNREACH:
+        case -EHOSTUNREACH:
+        case -ECONNREFUSED:
+        case -ETIMEDOUT:
+        case -ETIME:
+                _debug("no conn");
+                goto iterate_address;
+        }
+restart_from_beginning:
+        _debug("restart");
+        afs_end_cursor(&fc->ac);
+        afs_put_cb_interest(afs_v2net(vnode), fc->cbi);
+        fc->cbi = NULL;
+        afs_put_serverlist(afs_v2net(vnode), fc->server_list);
+        fc->server_list = NULL;
+start:
+        _debug("start");
+        /* See if we need to do an update of the volume record.  Note that the
+         * volume may have moved or even have been deleted.
+         */
+        fc->ac.error = afs_check_volume_status(vnode->volume, fc->key);
+        if (fc->ac.error < 0)
+                goto failed;
+        if (!afs_start_fs_iteration(fc, vnode))
+                goto failed;
+        goto use_server;
+next_server:
+        _debug("next");
+        afs_put_cb_interest(afs_v2net(vnode), fc->cbi);
+        fc->cbi = NULL;
+        fc->index++;
+        if (fc->index >= fc->server_list->nr_servers)
+                fc->index = 0;
+        if (fc->index != fc->start)
+                goto use_server;
+        /* That's all the servers poked to no good effect.  Try again if some
+         * of them were busy.
+         */
+        if (fc->flags & AFS_FS_CURSOR_VBUSY)
+                goto restart_from_beginning;
+        fc->ac.error = -EDESTADDRREQ;
+        goto failed;
+use_server:
+        _debug("use");
+        /* We're starting on a different fileserver from the list.  We need to
+         * check it, create a callback interest, find its address list and
+         * probe its capabilities before we use it.
+         */
+        ASSERTCMP(fc->ac.alist, ==, NULL);
+        server = fc->server_list->servers[fc->index].server;
+        if (!afs_check_server_record(fc, server))
+                goto failed;
+        _debug("USING SERVER: %pU", &server->uuid);
+        /* Make sure we've got a callback interest record for this server.  We
+         * have to link it in before we send the request as we can be sent a
+         * break request before we've finished decoding the reply and
+         * installing the vnode.
+         */
+        fc->ac.error = afs_register_server_cb_interest(
+                vnode, &fc->server_list->servers[fc->index]);
+        if (fc->ac.error < 0)
+                goto failed;
+        fc->cbi = afs_get_cb_interest(vnode->cb_interest);
+        read_lock(&server->fs_lock);
+        alist = rcu_dereference_protected(server->addresses,
+                                          lockdep_is_held(&server->fs_lock));
+        afs_get_addrlist(alist);
+        read_unlock(&server->fs_lock);
+        /* Probe the current fileserver if we haven't done so yet. */
+        if (!test_bit(AFS_SERVER_FL_PROBED, &server->flags)) {
+                fc->ac.alist = afs_get_addrlist(alist);
+                if (!afs_probe_fileserver(fc))
+                        goto failed;
+        }
+        if (!fc->ac.alist)
+                fc->ac.alist = alist;
+        else
+                afs_put_addrlist(alist);
+        fc->ac.addr  = NULL;
+        fc->ac.start = READ_ONCE(alist->index);
+        fc->ac.index = fc->ac.start;
+        fc->ac.error = 0;
+        fc->ac.begun = false;
+        goto iterate_address;
+iterate_address:
+        ASSERT(fc->ac.alist);
+        _debug("iterate %d/%d", fc->ac.index, fc->ac.alist->nr_addrs);
+        /* Iterate over the current server's address list to try and find an
+         * address on which it will respond to us.
+         */
+        if (afs_iterate_addresses(&fc->ac)) {
+                _leave(" = t");
+                return true;
+        }
+        afs_end_cursor(&fc->ac);
+        goto next_server;
+failed:
+        fc->flags |= AFS_FS_CURSOR_STOP;
+        _leave(" = f [failed %d]", fc->ac.error);
+        return false;
+}
+/*
+ * Select the same fileserver we used for a vnode before and only that
+ * fileserver.  We use this when we have a lock on that file, which is backed
+ * only by the fileserver we obtained it from.
+ */
+bool afs_select_current_fileserver(struct afs_fs_cursor *fc)
+{
+        struct afs_vnode *vnode = fc->vnode;
+        struct afs_cb_interest *cbi = vnode->cb_interest;
+        struct afs_addr_list *alist;
+        _enter("");
+        if (!cbi) {
+                fc->ac.error = -ESTALE;
+                fc->flags |= AFS_FS_CURSOR_STOP;
+                return false;
+        }
+        read_lock(&cbi->server->fs_lock);
+        alist = afs_get_addrlist(cbi->server->addresses);
+        read_unlock(&cbi->server->fs_lock);
+        if (!alist) {
+                fc->ac.error = -ESTALE;
+                fc->flags |= AFS_FS_CURSOR_STOP;
+                return false;
+        }
+        fc->ac.alist = alist;
+        fc->ac.error = 0;
+        return true;
+}
+/*
+ * Tidy up a filesystem cursor and unlock the vnode.
+ */
+int afs_end_vnode_operation(struct afs_fs_cursor *fc)
+{
+        struct afs_net *net = afs_v2net(fc->vnode);
+        int ret;
+        mutex_unlock(&fc->vnode->io_lock);
+        afs_end_cursor(&fc->ac);
+        afs_put_cb_interest(net, fc->cbi);
+        afs_put_serverlist(net, fc->server_list);
+        ret = fc->ac.error;
+        if (ret == -ECONNABORTED)
+                ret = afs_abort_to_error(fc->ac.abort_code);
+        return ret;
+}
+#if 0
+/*
 * Set a filesystem server cursor for using a specific FS server.
 */
 int afs_set_fs_cursor(struct afs_fs_cursor *fc, struct afs_vnode *vnode)
@@ -252,3 +711,5 @@ int afs_end_fs_cursor(struct afs_fs_cursor *fc, struct afs_net *net)
        afs_put_server(net, fc->server);
        return fc->ac.error;
 }
+#endif
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 5ddfb7c4cf78..1bbd5854507d 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -20,7 +20,7 @@
 struct workqueue_struct *afs_async_calls;
 static void afs_wake_up_call_waiter(struct sock *, struct rxrpc_call *, unsigned long);
-static long afs_wait_for_call_to_complete(struct afs_call *);
+static long afs_wait_for_call_to_complete(struct afs_call *, struct afs_addr_cursor *);
 static void afs_wake_up_async_call(struct sock *, struct rxrpc_call *, unsigned long);
 static void afs_process_async_call(struct work_struct *);
 static void afs_rx_new_call(struct sock *, struct rxrpc_call *, unsigned long);
@@ -162,6 +162,7 @@ void afs_put_call(struct afs_call *call)
                        call->type->destructor(call);
                afs_put_server(call->net, call->cm_server);
+                afs_put_cb_interest(call->net, call->cbi);
                kfree(call->request);
                kfree(call);
@@ -330,7 +331,6 @@ long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
        struct kvec iov[1];
        size_t offset;
        s64 tx_total_len;
-        u32 abort_code;
        int ret;
        _enter(",{%pISp},", &srx->transport);
@@ -362,7 +362,6 @@ long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
                                          afs_wake_up_async_call :
                                          afs_wake_up_call_waiter),
                                         call->upgrade);
-        call->key = NULL;
        if (IS_ERR(rxcall)) {
                ret = PTR_ERR(rxcall);
                goto error_kill_call;
@@ -406,7 +405,7 @@ long afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call,
        if (call->async)
                return -EINPROGRESS;
-        return afs_wait_for_call_to_complete(call);
+        return afs_wait_for_call_to_complete(call, ac);
 error_do_abort:
        call->state = AFS_CALL_COMPLETE;
@@ -414,15 +413,16 @@ error_do_abort:
                rxrpc_kernel_abort_call(call->net->socket, rxcall,
                                        RX_USER_ABORT, ret, "KSD");
        } else {
-                abort_code = 0;
                offset = 0;
                rxrpc_kernel_recv_data(call->net->socket, rxcall, NULL,
                                       0, &offset, false, &call->abort_code,
                                       &call->service_id);
-                ret = afs_abort_to_error(call->abort_code);
+                ac->abort_code = call->abort_code;
+                ac->responded = true;
        }
 error_kill_call:
        afs_put_call(call);
+        ac->error = ret;
        _leave(" = %d", ret);
        return ret;
 }
@@ -510,7 +510,8 @@ save_error:
 /*
 * wait synchronously for a call to complete
 */
-static long afs_wait_for_call_to_complete(struct afs_call *call)
+static long afs_wait_for_call_to_complete(struct afs_call *call,
+                                          struct afs_addr_cursor *ac)
 {
        signed long rtt2, timeout;
        long ret;
@@ -563,16 +564,25 @@ static long afs_wait_for_call_to_complete(struct afs_call *call)
        /* Kill off the call if it's still live. */
        if (call->state < AFS_CALL_COMPLETE) {
                _debug("call interrupted");
-                rxrpc_kernel_abort_call(call->net->socket, call->rxcall,
+                if (rxrpc_kernel_abort_call(call->net->socket, call->rxcall,
-                                        RX_USER_ABORT, -EINTR, "KWI");
+                                            RX_USER_ABORT, -EINTR, "KWI"))
+                        call->error = -ERESTARTSYS;
        }
-        ret = call->error;
+        ac->abort_code = call->abort_code;
-        if (ret < 0) {
+        ac->error = call->error;
-                ret = afs_abort_to_error(call->abort_code);
-        } else if (ret == 0 && call->ret_reply0) {
+        ret = ac->error;
-                ret = (long)call->reply[0];
+        switch (ret) {
-                call->reply[0] = NULL;
+        case 0:
+                if (call->ret_reply0) {
+                        ret = (long)call->reply[0];
+                        call->reply[0] = NULL;
+                }
+                /* Fall through */
+        case -ECONNABORTED:
+                ac->responded = true;
+                break;
        }
        _debug("call complete");
@@ -882,10 +892,7 @@ int afs_extract_data(struct afs_call *call, void *buf, size_t count,
                return 0;
        }
-        if (ret == -ECONNABORTED)
+        call->error = ret;
-                call->error = afs_abort_to_error(call->abort_code);
-        else
-                call->error = ret;
        call->state = AFS_CALL_COMPLETE;
        return ret;
 }
diff --git a/fs/afs/security.c b/fs/afs/security.c
index 1b5198fc1657..46a881a4d08f 100644
--- a/fs/afs/security.c
+++ b/fs/afs/security.c
@@ -325,7 +325,7 @@ static int afs_check_permit(struct afs_vnode *vnode, struct key *key,
                 */
                _debug("no valid permit");
-                ret = afs_vnode_fetch_status(vnode, key, true);
+                ret = afs_fetch_status(vnode, key);
                if (ret < 0) {
                        *_access = 0;
                        _leave(" = %d", ret);
diff --git a/fs/afs/server.c b/fs/afs/server.c
index 9ca174b24f5b..a6c860bcf391 100644
--- a/fs/afs/server.c
+++ b/fs/afs/server.c
@@ -14,7 +14,8 @@
 #include "afs_fs.h"
 #include "internal.h"
-static unsigned afs_server_timeout = 10;        /* server timeout in seconds */
+static unsigned afs_server_gc_delay = 10;       /* Server record timeout in seconds */
+static unsigned afs_server_update_delay = 30;   /* Time till VLDB recheck in secs */
 static void afs_inc_servers_outstanding(struct afs_net *net)
 {
@@ -27,60 +28,201 @@ static void afs_dec_servers_outstanding(struct afs_net *net)
                wake_up_atomic_t(&net->servers_outstanding);
 }
-void afs_server_timer(struct timer_list *timer)
+/*
+ * Find a server by one of its addresses.
+ */
+struct afs_server *afs_find_server(struct afs_net *net,
+                                   const struct sockaddr_rxrpc *srx)
 {
-        struct afs_net *net = container_of(timer, struct afs_net, server_timer);
+        const struct sockaddr_in6 *a = &srx->transport.sin6, *b;
+        const struct afs_addr_list *alist;
+        struct afs_server *server = NULL;
+        unsigned int i;
+        bool ipv6 = true;
+        int seq = 0, diff;
+        if (srx->transport.sin6.sin6_addr.s6_addr32[0] == 0 &&
+            srx->transport.sin6.sin6_addr.s6_addr32[1] == 0 &&
+            srx->transport.sin6.sin6_addr.s6_addr32[2] == htonl(0xffff))
+                ipv6 = false;
+        rcu_read_lock();
+        do {
+                if (server)
+                        afs_put_server(net, server);
+                server = NULL;
+                read_seqbegin_or_lock(&net->fs_addr_lock, &seq);
+                if (ipv6) {
+                        hlist_for_each_entry_rcu(server, &net->fs_addresses6, addr6_link) {
+                                alist = rcu_dereference(server->addresses);
+                                for (i = alist->nr_ipv4; i < alist->nr_addrs; i++) {
+                                        b = &alist->addrs[i].transport.sin6;
+                                        diff = (u16)a->sin6_port - (u16)b->sin6_port;
+                                        if (diff == 0)
+                                                diff = memcmp(&a->sin6_addr,
+                                                              &b->sin6_addr,
+                                                              sizeof(struct in6_addr));
+                                        if (diff == 0)
+                                                goto found;
+                                        if (diff < 0) {
+                                                // TODO: Sort the list
+                                                //if (i == alist->nr_ipv4)
+                                                //      goto not_found;
+                                                break;
+                                        }
+                                }
+                        }
+                } else {
+                        hlist_for_each_entry_rcu(server, &net->fs_addresses4, addr4_link) {
+                                alist = rcu_dereference(server->addresses);
+                                for (i = 0; i < alist->nr_ipv4; i++) {
+                                        b = &alist->addrs[i].transport.sin6;
+                                        diff = (u16)a->sin6_port - (u16)b->sin6_port;
+                                        if (diff == 0)
+                                                diff = ((u32)a->sin6_addr.s6_addr32[3] -
+                                                        (u32)b->sin6_addr.s6_addr32[3]);
+                                        if (diff == 0)
+                                                goto found;
+                                        if (diff < 0) {
+                                                // TODO: Sort the list
+                                                //if (i == 0)
+                                                //      goto not_found;
+                                                break;
+                                        }
+                                }
+                        }
+                }
-        if (!queue_work(afs_wq, &net->server_reaper))
+        //not_found:
-                afs_dec_servers_outstanding(net);
+                server = NULL;
+        found:
+                if (server && !atomic_inc_not_zero(&server->usage))
+                        server = NULL;
+        } while (need_seqretry(&net->fs_addr_lock, seq));
+        done_seqretry(&net->fs_addr_lock, seq);
+        rcu_read_unlock();
+        return server;
 }
 /*
- * install a server record in the master tree
+ * Look up a server by its UUID.
 */
-static int afs_install_server(struct afs_server *server)
+struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uuid)
 {
-        struct afs_server *xserver;
+        struct afs_server *server = NULL;
-        struct afs_net *net = server->cell->net;
+        struct rb_node *p;
+        int diff, seq = 0;
+        _enter("%pU", uuid);
+        do {
+                /* Unfortunately, rbtree walking doesn't give reliable results
+                 * under just the RCU read lock, so we have to check for
+                 * changes.
+                 */
+                if (server)
+                        afs_put_server(net, server);
+                server = NULL;
+                read_seqbegin_or_lock(&net->fs_lock, &seq);
+                p = net->fs_servers.rb_node;
+                while (p) {
+                        server = rb_entry(p, struct afs_server, uuid_rb);
+                        diff = memcmp(uuid, &server->uuid, sizeof(*uuid));
+                        if (diff < 0) {
+                                p = p->rb_left;
+                        } else if (diff > 0) {
+                                p = p->rb_right;
+                        } else {
+                                afs_get_server(server);
+                                break;
+                        }
+                        server = NULL;
+                }
+        } while (need_seqretry(&net->fs_lock, seq));
+        done_seqretry(&net->fs_lock, seq);
+        _leave(" = %p", server);
+        return server;
+}
+/*
+ * Install a server record in the namespace tree
+ */
+static struct afs_server *afs_install_server(struct afs_net *net,
+                                             struct afs_server *candidate)
+{
+        const struct afs_addr_list *alist;
+        struct afs_server *server;
        struct rb_node **pp, *p;
-        int ret, diff;
+        int ret = -EEXIST, diff;
-        _enter("%p", server);
+        _enter("%p", candidate);
-        write_lock(&net->servers_lock);
+        write_seqlock(&net->fs_lock);
-        ret = -EEXIST;
+        /* Firstly install the server in the UUID lookup tree */
-        pp = &net->servers.rb_node;
+        pp = &net->fs_servers.rb_node;
        p = NULL;
        while (*pp) {
                p = *pp;
                _debug("- consider %p", p);
-                xserver = rb_entry(p, struct afs_server, master_rb);
+                server = rb_entry(p, struct afs_server, uuid_rb);
-                diff = memcmp(&server->addrs->addrs[0],
+                diff = memcmp(&candidate->uuid, &server->uuid, sizeof(uuid_t));
-                              &xserver->addrs->addrs[0],
-                              sizeof(sizeof(server->addrs->addrs[0])));
                if (diff < 0)
                        pp = &(*pp)->rb_left;
                else if (diff > 0)
                        pp = &(*pp)->rb_right;
                else
-                        goto error;
+                        goto exists;
        }
-        rb_link_node(&server->master_rb, p, pp);
+        server = candidate;
-        rb_insert_color(&server->master_rb, &net->servers);
+        rb_link_node(&server->uuid_rb, p, pp);
+        rb_insert_color(&server->uuid_rb, &net->fs_servers);
+        hlist_add_head_rcu(&server->proc_link, &net->fs_proc);
+        write_seqlock(&net->fs_addr_lock);
+        alist = rcu_dereference_protected(server->addresses,
+                                          lockdep_is_held(&net->fs_addr_lock.lock));
+        /* Secondly, if the server has any IPv4 and/or IPv6 addresses, install
+         * it in the IPv4 and/or IPv6 reverse-map lists.
+         *
+         * TODO: For speed we want to use something other than a flat list
+         * here; even sorting the list in terms of lowest address would help a
+         * bit, but anything we might want to do gets messy and memory
+         * intensive.
+         */
+        if (alist->nr_ipv4 > 0)
+                hlist_add_head_rcu(&server->addr4_link, &net->fs_addresses4);
+        if (alist->nr_addrs > alist->nr_ipv4)
+                hlist_add_head_rcu(&server->addr6_link, &net->fs_addresses6);
+        write_sequnlock(&net->fs_addr_lock);
        ret = 0;
-error:
+exists:
-        write_unlock(&net->servers_lock);
+        afs_get_server(server);
-        return ret;
+        write_sequnlock(&net->fs_lock);
+        return server;
 }
 /*
 * allocate a new server record
 */
-static struct afs_server *afs_alloc_server(struct afs_cell *cell,
+static struct afs_server *afs_alloc_server(struct afs_net *net,
-                                           const struct sockaddr_rxrpc *addr)
+                                           const uuid_t *uuid,
+                                           struct afs_addr_list *alist)
 {
        struct afs_server *server;
@@ -89,194 +231,155 @@ static struct afs_server *afs_alloc_server(struct afs_cell *cell,
        server = kzalloc(sizeof(struct afs_server), GFP_KERNEL);
        if (!server)
                goto enomem;
-        server->addrs = kzalloc(sizeof(struct afs_addr_list) +
-                                sizeof(struct sockaddr_rxrpc),
-                                GFP_KERNEL);
-        if (!server->addrs)
-                goto enomem_server;
        atomic_set(&server->usage, 1);
-        server->net = cell->net;
+        RCU_INIT_POINTER(server->addresses, alist);
-        server->cell = cell;
+        server->addr_version = alist->version;
+        server->uuid = *uuid;
-        INIT_LIST_HEAD(&server->link);
+        server->flags = (1UL << AFS_SERVER_FL_NEW);
-        INIT_LIST_HEAD(&server->grave);
+        server->update_at = ktime_get_real_seconds() + afs_server_update_delay;
-        init_rwsem(&server->sem);
+        rwlock_init(&server->fs_lock);
-        spin_lock_init(&server->fs_lock);
        INIT_LIST_HEAD(&server->cb_interests);
        rwlock_init(&server->cb_break_lock);
-        refcount_set(&server->addrs->usage, 1);
+        afs_inc_servers_outstanding(net);
-        server->addrs->nr_addrs = 1;
+        _leave(" = %p", server);
-        server->addrs->addrs[0] = *addr;
-        afs_inc_servers_outstanding(cell->net);
-        _leave(" = %p{%d}", server, atomic_read(&server->usage));
        return server;
-enomem_server:
-        kfree(server);
 enomem:
        _leave(" = NULL [nomem]");
        return NULL;
 }
 /*
- * get an FS-server record for a cell
+ * Look up an address record for a server
 */
-struct afs_server *afs_lookup_server(struct afs_cell *cell,
+static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell,
-                                     struct sockaddr_rxrpc *addr)
+                                                 struct key *key, const uuid_t *uuid)
 {
-        struct afs_server *server, *candidate;
+        struct afs_addr_cursor ac;
+        struct afs_addr_list *alist;
-        _enter("%p,%pIS", cell, &addr->transport);
+        int ret;
-        /* quick scan of the list to see if we already have the server */
+        ret = afs_set_vl_cursor(&ac, cell);
-        read_lock(&cell->servers_lock);
+        if (ret < 0)
+                return ERR_PTR(ret);
-        list_for_each_entry(server, &cell->servers, link) {
-                if (memcmp(&server->addrs->addrs[0], addr, sizeof(*addr)) == 0)
+        while (afs_iterate_addresses(&ac)) {
-                        goto found_server_quickly;
+                alist = afs_vl_get_addrs_u(cell->net, &ac, key, uuid);
-        }
+                switch (ac.error) {
-        read_unlock(&cell->servers_lock);
+                case 0:
+                        afs_end_cursor(&ac);
-        candidate = afs_alloc_server(cell, addr);
+                        return alist;
-        if (!candidate) {
+                case -ECONNABORTED:
-                _leave(" = -ENOMEM");
+                        ac.error = afs_abort_to_error(ac.abort_code);
-                return ERR_PTR(-ENOMEM);
+                        goto error;
-        }
+                case -ENOMEM:
+                case -ENONET:
-        write_lock(&cell->servers_lock);
+                        goto error;
+                case -ENETUNREACH:
-        /* check the cell's server list again */
+                case -EHOSTUNREACH:
-        list_for_each_entry(server, &cell->servers, link) {
+                case -ECONNREFUSED:
-                if (memcmp(&server->addrs->addrs[0], addr, sizeof(*addr)) == 0)
+                        break;
-                        goto found_server;
+                default:
-        }
+                        ac.error = -EIO;
+                        goto error;
-        _debug("new");
+                }
-        server = candidate;
-        if (afs_install_server(server) < 0)
-                goto server_in_two_cells;
-        afs_get_cell(cell);
-        list_add_tail(&server->link, &cell->servers);
-        write_unlock(&cell->servers_lock);
-        _leave(" = %p{%d}", server, atomic_read(&server->usage));
-        return server;
-        /* found a matching server quickly */
-found_server_quickly:
-        _debug("found quickly");
-        afs_get_server(server);
-        read_unlock(&cell->servers_lock);
-no_longer_unused:
-        if (!list_empty(&server->grave)) {
-                spin_lock(&cell->net->server_graveyard_lock);
-                list_del_init(&server->grave);
-                spin_unlock(&cell->net->server_graveyard_lock);
        }
-        _leave(" = %p{%d}", server, atomic_read(&server->usage));
-        return server;
-        /* found a matching server on the second pass */
+error:
-found_server:
+        return ERR_PTR(afs_end_cursor(&ac));
-        _debug("found");
-        afs_get_server(server);
-        write_unlock(&cell->servers_lock);
-        kfree(candidate);
-        goto no_longer_unused;
-        /* found a server that seems to be in two cells */
-server_in_two_cells:
-        write_unlock(&cell->servers_lock);
-        kfree(candidate);
-        afs_dec_servers_outstanding(cell->net);
-        printk(KERN_NOTICE "kAFS: Server %pI4 appears to be in two cells\n",
-               addr);
-        _leave(" = -EEXIST");
-        return ERR_PTR(-EEXIST);
 }
 /*
- * look up a server by its IP address
+ * Get or create a fileserver record.
 */
-struct afs_server *afs_find_server(struct afs_net *net,
+struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key,
-                                   const struct sockaddr_rxrpc *srx)
+                                     const uuid_t *uuid)
 {
-        struct afs_server *server = NULL;
+        struct afs_addr_list *alist;
-        struct rb_node *p;
+        struct afs_server *server, *candidate;
-        int diff;
-        _enter("{%d,%pIS}", srx->transport.family, &srx->transport);
+        _enter("%p,%pU", cell->net, uuid);
-        read_lock(&net->servers_lock);
+        server = afs_find_server_by_uuid(cell->net, uuid);
+        if (server)
+                return server;
-        p = net->servers.rb_node;
+        alist = afs_vl_lookup_addrs(cell, key, uuid);
-        while (p) {
+        if (IS_ERR(alist))
-                server = rb_entry(p, struct afs_server, master_rb);
+                return ERR_CAST(alist);
-                _debug("- consider %p", p);
+        candidate = afs_alloc_server(cell->net, uuid, alist);
+        if (!candidate) {
+                afs_put_addrlist(alist);
+                return ERR_PTR(-ENOMEM);
+        }
-                diff = memcmp(srx, &server->addrs->addrs[0], sizeof(*srx));
+        server = afs_install_server(cell->net, candidate);
-                if (diff < 0) {
+        if (server != candidate) {
-                        p = p->rb_left;
+                afs_put_addrlist(alist);
-                } else if (diff > 0) {
+                kfree(candidate);
-                        p = p->rb_right;
-                } else {
-                        afs_get_server(server);
-                        goto found;
-                }
        }
-        server = NULL;
+        _leave(" = %p{%d}", server, atomic_read(&server->usage));
-found:
-        read_unlock(&net->servers_lock);
-        _leave(" = %p", server);
        return server;
 }
+/*
+ * Set the server timer to fire after a given delay, assuming it's not already
+ * set for an earlier time.
+ */
 static void afs_set_server_timer(struct afs_net *net, time64_t delay)
 {
-        afs_inc_servers_outstanding(net);
        if (net->live) {
-                if (timer_reduce(&net->server_timer, jiffies + delay * HZ))
+                afs_inc_servers_outstanding(net);
-                        afs_dec_servers_outstanding(net);
+                if (timer_reduce(&net->fs_timer, jiffies + delay * HZ))
-        } else {
-                if (!queue_work(afs_wq, &net->server_reaper))
                        afs_dec_servers_outstanding(net);
        }
 }
 /*
- * destroy a server record
+ * Server management timer.  We hold an increment on servers_outstanding that
- * - removes from the cell list
+ * we need to pass along to the work item.
+ */
+void afs_servers_timer(struct timer_list *timer)
+{
+        struct afs_net *net = container_of(timer, struct afs_net, fs_timer);
+        _enter("");
+        if (!queue_work(afs_wq, &net->fs_manager))
+                afs_dec_servers_outstanding(net);
+}
+/*
+ * Release a reference on a server record.
 */
 void afs_put_server(struct afs_net *net, struct afs_server *server)
 {
+        unsigned int usage;
        if (!server)
                return;
-        _enter("%p{%d}", server, atomic_read(&server->usage));
+        server->put_time = ktime_get_real_seconds();
-        _debug("PUT SERVER %d", atomic_read(&server->usage));
+        usage = atomic_dec_return(&server->usage);
-        ASSERTCMP(atomic_read(&server->usage), >, 0);
+        _enter("{%u}", usage);
-        if (likely(!atomic_dec_and_test(&server->usage))) {
+        if (likely(usage > 0))
-                _leave("");
                return;
-        }
-        spin_lock(&net->server_graveyard_lock);
+        afs_set_server_timer(net, afs_server_gc_delay);
-        if (atomic_read(&server->usage) == 0) {
+}
-                list_move_tail(&server->grave, &net->server_graveyard);
-                server->time_of_death = ktime_get_real_seconds();
+static void afs_server_rcu(struct rcu_head *rcu)
-                afs_set_server_timer(net, afs_server_timeout);
+{
-        }
+        struct afs_server *server = container_of(rcu, struct afs_server, rcu);
-        spin_unlock(&net->server_graveyard_lock);
-        _leave(" [dead]");
+        afs_put_addrlist(server->addresses);
+        kfree(server);
 }
 /*
@@ -284,7 +387,7 @@ void afs_put_server(struct afs_net *net, struct afs_server *server)
 */
 static void afs_destroy_server(struct afs_net *net, struct afs_server *server)
 {
-        struct afs_addr_list *alist = server->addrs;
+        struct afs_addr_list *alist = server->addresses;
        struct afs_addr_cursor ac = {
                .alist  = alist,
                .addr   = &alist->addrs[0],
@@ -294,79 +397,300 @@ static void afs_destroy_server(struct afs_net *net, struct afs_server *server)
        };
        _enter("%p", server);
-        afs_fs_give_up_all_callbacks(server, &ac, NULL, false);
+        afs_fs_give_up_all_callbacks(net, server, &ac, NULL);
-        afs_put_cell(net, server->cell);
+        call_rcu(&server->rcu, afs_server_rcu);
-        afs_put_addrlist(server->addrs);
-        kfree(server);
        afs_dec_servers_outstanding(net);
 }
 /*
- * reap dead server records
+ * Garbage collect any expired servers.
 */
-void afs_reap_server(struct work_struct *work)
+static void afs_gc_servers(struct afs_net *net, struct afs_server *gc_list)
 {
-        LIST_HEAD(corpses);
        struct afs_server *server;
-        struct afs_net *net = container_of(work, struct afs_net, server_reaper);
+        bool deleted;
-        unsigned long delay, expiry;
+        int usage;
-        time64_t now;
+        while ((server = gc_list)) {
-        now = ktime_get_real_seconds();
+                gc_list = server->gc_next;
-        spin_lock(&net->server_graveyard_lock);
+                write_seqlock(&net->fs_lock);
-        while (!list_empty(&net->server_graveyard)) {
+                usage = 1;
-                server = list_entry(net->server_graveyard.next,
+                deleted = atomic_try_cmpxchg(&server->usage, &usage, 0);
-                                    struct afs_server, grave);
+                if (deleted) {
+                        rb_erase(&server->uuid_rb, &net->fs_servers);
-                /* the queue is ordered most dead first */
+                        hlist_del_rcu(&server->proc_link);
-                if (net->live) {
-                        expiry = server->time_of_death + afs_server_timeout;
-                        if (expiry > now) {
-                                delay = (expiry - now);
-                                afs_set_server_timer(net, delay);
-                                break;
-                        }
                }
+                write_sequnlock(&net->fs_lock);
-                write_lock(&server->cell->servers_lock);
+                if (deleted)
-                write_lock(&net->servers_lock);
+                        afs_destroy_server(net, server);
-                if (atomic_read(&server->usage) > 0) {
+        }
-                        list_del_init(&server->grave);
+}
-                } else {
-                        list_move_tail(&server->grave, &corpses);
+/*
-                        list_del_init(&server->link);
+ * Manage the records of servers known to be within a network namespace.  This
-                        rb_erase(&server->master_rb, &net->servers);
+ * includes garbage collecting unused servers.
+ *
+ * Note also that we were given an increment on net->servers_outstanding by
+ * whoever queued us that we need to deal with before returning.
+ */
+void afs_manage_servers(struct work_struct *work)
+{
+        struct afs_net *net = container_of(work, struct afs_net, fs_manager);
+        struct afs_server *gc_list = NULL;
+        struct rb_node *cursor;
+        time64_t now = ktime_get_real_seconds(), next_manage = TIME64_MAX;
+        bool purging = !net->live;
+        _enter("");
+        /* Trawl the server list looking for servers that have expired from
+         * lack of use.
+         */
+        read_seqlock_excl(&net->fs_lock);
+        for (cursor = rb_first(&net->fs_servers); cursor; cursor = rb_next(cursor)) {
+                struct afs_server *server =
+                        rb_entry(cursor, struct afs_server, uuid_rb);
+                int usage = atomic_read(&server->usage);
+                _debug("manage %pU %d", &server->uuid, usage);
+                ASSERTCMP(usage, >=, 1);
+                ASSERTIFCMP(purging, usage, ==, 1);
+                if (usage == 1) {
+                        time64_t expire_at = server->put_time;
+                        if (!test_bit(AFS_SERVER_FL_VL_FAIL, &server->flags) &&
+                            !test_bit(AFS_SERVER_FL_NOT_FOUND, &server->flags))
+                                expire_at += afs_server_gc_delay;
+                        if (purging || expire_at <= now) {
+                                server->gc_next = gc_list;
+                                gc_list = server;
+                        } else if (expire_at < next_manage) {
+                                next_manage = expire_at;
+                        }
                }
-                write_unlock(&net->servers_lock);
-                write_unlock(&server->cell->servers_lock);
        }
-        spin_unlock(&net->server_graveyard_lock);
+        read_sequnlock_excl(&net->fs_lock);
-        /* now reap the corpses we've extracted */
+        /* Update the timer on the way out.  We have to pass an increment on
-        while (!list_empty(&corpses)) {
+         * servers_outstanding in the namespace that we are in to the timer or
-                server = list_entry(corpses.next, struct afs_server, grave);
+         * the work scheduler.
-                list_del(&server->grave);
+         */
-                afs_destroy_server(net, server);
+        if (!purging && next_manage < TIME64_MAX) {
+                now = ktime_get_real_seconds();
+                if (next_manage - now <= 0) {
+                        if (queue_work(afs_wq, &net->fs_manager))
+                                afs_inc_servers_outstanding(net);
+                } else {
+                        afs_set_server_timer(net, next_manage - now);
+                }
        }
+        afs_gc_servers(net, gc_list);
        afs_dec_servers_outstanding(net);
+        _leave(" [%d]", atomic_read(&net->servers_outstanding));
+}
+static void afs_queue_server_manager(struct afs_net *net)
+{
+        afs_inc_servers_outstanding(net);
+        if (!queue_work(afs_wq, &net->fs_manager))
+                afs_dec_servers_outstanding(net);
 }
 /*
- * Discard all the server records from a net namespace when it is destroyed or
+ * Purge list of servers.
- * the afs module is removed.
 */
-void __net_exit afs_purge_servers(struct afs_net *net)
+void afs_purge_servers(struct afs_net *net)
 {
-        if (del_timer_sync(&net->server_timer))
+        _enter("");
+        if (del_timer_sync(&net->fs_timer))
                atomic_dec(&net->servers_outstanding);
-        afs_inc_servers_outstanding(net);
+        afs_queue_server_manager(net);
-        if (!queue_work(afs_wq, &net->server_reaper))
-                afs_dec_servers_outstanding(net);
+        _debug("wait");
        wait_on_atomic_t(&net->servers_outstanding, atomic_t_wait,
                         TASK_UNINTERRUPTIBLE);
+        _leave("");
+}
+/*
+ * Probe a fileserver to find its capabilities.
+ *
+ * TODO: Try service upgrade.
+ */
+static bool afs_do_probe_fileserver(struct afs_fs_cursor *fc)
+{
+        _enter("");
+        fc->ac.addr = NULL;
+        fc->ac.start = READ_ONCE(fc->ac.alist->index);
+        fc->ac.index = fc->ac.start;
+        fc->ac.error = 0;
+        fc->ac.begun = false;
+        while (afs_iterate_addresses(&fc->ac)) {
+                afs_fs_get_capabilities(afs_v2net(fc->vnode), fc->cbi->server,
+                                        &fc->ac, fc->key);
+                switch (fc->ac.error) {
+                case 0:
+                        afs_end_cursor(&fc->ac);
+                        set_bit(AFS_SERVER_FL_PROBED, &fc->cbi->server->flags);
+                        return true;
+                case -ECONNABORTED:
+                        fc->ac.error = afs_abort_to_error(fc->ac.abort_code);
+                        goto error;
+                case -ENOMEM:
+                case -ENONET:
+                        goto error;
+                case -ENETUNREACH:
+                case -EHOSTUNREACH:
+                case -ECONNREFUSED:
+                case -ETIMEDOUT:
+                case -ETIME:
+                        break;
+                default:
+                        fc->ac.error = -EIO;
+                        goto error;
+                }
+        }
+error:
+        afs_end_cursor(&fc->ac);
+        return false;
+}
+/*
+ * If we haven't already, try probing the fileserver to get its capabilities.
+ * We try not to instigate parallel probes, but it's possible that the parallel
+ * probes will fail due to authentication failure when ours would succeed.
+ *
+ * TODO: Try sending an anonymous probe if an authenticated probe fails.
+ */
+bool afs_probe_fileserver(struct afs_fs_cursor *fc)
+{
+        bool success;
+        int ret, retries = 0;
+        _enter("");
+retry:
+        if (test_bit(AFS_SERVER_FL_PROBED, &fc->cbi->server->flags)) {
+                _leave(" = t");
+                return true;
+        }
+        if (!test_and_set_bit_lock(AFS_SERVER_FL_PROBING, &fc->cbi->server->flags)) {
+                success = afs_do_probe_fileserver(fc);
+                clear_bit_unlock(AFS_SERVER_FL_PROBING, &fc->cbi->server->flags);
+                wake_up_bit(&fc->cbi->server->flags, AFS_SERVER_FL_PROBING);
+                _leave(" = %d", success);
+                return success;
+        }
+        _debug("wait");
+        ret = wait_on_bit(&fc->cbi->server->flags, AFS_SERVER_FL_PROBING,
+                          TASK_INTERRUPTIBLE);
+        if (ret == -ERESTARTSYS) {
+                fc->ac.error = ret;
+                _leave(" = f [%d]", ret);
+                return false;
+        }
+        retries++;
+        if (retries == 4) {
+                fc->ac.error = -ESTALE;
+                _leave(" = f [stale]");
+                return false;
+        }
+        _debug("retry");
+        goto retry;
+}
+/*
+ * Get an update for a server's address list.
+ */
+static noinline bool afs_update_server_record(struct afs_fs_cursor *fc, struct afs_server *server)
+{
+        struct afs_addr_list *alist, *discard;
+        _enter("");
+        alist = afs_vl_lookup_addrs(fc->vnode->volume->cell, fc->key,
+                                    &server->uuid);
+        if (IS_ERR(alist)) {
+                fc->ac.error = PTR_ERR(alist);
+                _leave(" = f [%d]", fc->ac.error);
+                return false;
+        }
+        discard = alist;
+        if (server->addr_version != alist->version) {
+                write_lock(&server->fs_lock);
+                discard = rcu_dereference_protected(server->addresses,
+                                                    lockdep_is_held(&server->fs_lock));
+                rcu_assign_pointer(server->addresses, alist);
+                server->addr_version = alist->version;
+                write_unlock(&server->fs_lock);
+        }
+        server->update_at = ktime_get_real_seconds() + afs_server_update_delay;
+        afs_put_addrlist(discard);
+        _leave(" = t");
+        return true;
+}
+/*
+ * See if a server's address list needs updating.
+ */
+bool afs_check_server_record(struct afs_fs_cursor *fc, struct afs_server *server)
+{
+        time64_t now = ktime_get_real_seconds();
+        long diff;
+        bool success;
+        int ret, retries = 0;
+        _enter("");
+        ASSERT(server);
+retry:
+        diff = READ_ONCE(server->update_at) - now;
+        if (diff > 0) {
+                _leave(" = t [not now %ld]", diff);
+                return true;
+        }
+        if (!test_and_set_bit_lock(AFS_SERVER_FL_UPDATING, &server->flags)) {
+                success = afs_update_server_record(fc, server);
+                clear_bit_unlock(AFS_SERVER_FL_UPDATING, &server->flags);
+                wake_up_bit(&server->flags, AFS_SERVER_FL_UPDATING);
+                _leave(" = %d", success);
+                return success;
+        }
+        ret = wait_on_bit(&server->flags, AFS_SERVER_FL_UPDATING,
+                          TASK_INTERRUPTIBLE);
+        if (ret == -ERESTARTSYS) {
+                fc->ac.error = ret;
+                _leave(" = f [intr]");
+                return false;
+        }
+        retries++;
+        if (retries == 4) {
+                fc->ac.error = -ESTALE;
+                _leave(" = f [stale]");
+                return false;
+        }
+        goto retry;
 }
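Review note on the garbage-collection path above: afs_manage_servers() treats a usage count of exactly 1 as "idle", computes an expiry time from put_time, and afs_gc_servers() then claims each idle record with a compare-and-swap from 1 to 0 so that a concurrent lookup that raised the count wins the race. The following is a minimal userspace sketch of just those two steps, using C11 atomics in place of the kernel's atomic_t. The names demo_server, demo_expire_at and demo_try_reap are hypothetical and do not appear in the patch, and the failed_lookup field is an assumed stand-in for the AFS_SERVER_FL_VL_FAIL and AFS_SERVER_FL_NOT_FOUND flag tests.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct afs_server; only what the sketch needs. */
struct demo_server {
	atomic_int usage;	/* 1 == idle, as in the manager loop */
	int64_t put_time;	/* seconds, recorded at the last put */
	bool failed_lookup;	/* assumed stand-in for VL_FAIL / NOT_FOUND */
};

/*
 * Work out when an idle record becomes collectable.  Records whose lookup
 * failed get no grace period, mirroring the flag tests in the patch.
 */
static int64_t demo_expire_at(const struct demo_server *s, int64_t gc_delay)
{
	int64_t expire_at = s->put_time;

	assert(gc_delay >= 0);
	if (!s->failed_lookup)
		expire_at += gc_delay;
	return expire_at;
}

/*
 * Try to claim the record for destruction.  This succeeds only if the count
 * is still exactly 1; if someone took a reference in the meantime, the
 * count is left untouched and the record survives.
 */
static bool demo_try_reap(struct demo_server *s)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&s->usage, &expected, 0);
}
```

In the patch the claim happens under write_seqlock(&net->fs_lock), so removal from the UUID tree and the transition to zero are seen together by lookups. The sketch leaves the locking out and shows only the counting rule.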
diff --git a/fs/afs/server_list.c b/fs/afs/server_list.c
new file mode 100644
index 000000000000..26bad7032bba
--- /dev/null
+++ b/fs/afs/server_list.c
@@ -0,0 +1,153 @@
+/* AFS fileserver list management.
+ *
+ * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include "internal.h"
+void afs_put_serverlist(struct afs_net *net, struct afs_server_list *slist)
+{
+        int i;
+        if (refcount_dec_and_test(&slist->usage)) {
+                for (i = 0; i < slist->nr_servers; i++) {
+                        afs_put_cb_interest(net, slist->servers[i].cb_interest);
+                        afs_put_server(net, slist->servers[i].server);
+                }
+                kfree(slist);
+        }
+}
+/*
+ * Build a server list from a VLDB record.
+ */
+struct afs_server_list *afs_alloc_server_list(struct afs_cell *cell,
+                                              struct key *key,
+                                              struct afs_vldb_entry *vldb,
+                                              u8 type_mask)
+{
+        struct afs_server_list *slist;
+        struct afs_server *server;
+        int ret = -ENOMEM, nr_servers = 0, i, j;
+        for (i = 0; i < vldb->nr_servers; i++)
+                if (vldb->fs_mask[i] & type_mask)
+                        nr_servers++;
+        slist = kzalloc(sizeof(struct afs_server_list) +
+                        sizeof(struct afs_server_entry) * nr_servers,
+                        GFP_KERNEL);
+        if (!slist)
+                goto error;
+        refcount_set(&slist->usage, 1);
+        /* Make sure a record exists for each server in the list. */
+        for (i = 0; i < vldb->nr_servers; i++) {
+                if (!(vldb->fs_mask[i] & type_mask))
+                        continue;
+                server = afs_lookup_server(cell, key, &vldb->fs_server[i]);
+                if (IS_ERR(server)) {
+                        ret = PTR_ERR(server);
+                        if (ret == -ENOENT)
+                                continue;
+                        goto error_2;
+                }
+                /* Insertion-sort by server pointer */
+                for (j = 0; j < slist->nr_servers; j++)
+                        if (slist->servers[j].server >= server)
+                                break;
+                if (j < slist->nr_servers) {
+                        if (slist->servers[j].server == server) {
+                                afs_put_server(cell->net, server);
+                                continue;
+                        }
+                        memmove(slist->servers + j + 1,
+                                slist->servers + j,
+                                (slist->nr_servers - j) * sizeof(struct afs_server_entry));
+                }
+                slist->servers[j].server = server;
+                slist->nr_servers++;
+        }
+        if (slist->nr_servers == 0) {
+                ret = -EDESTADDRREQ;
+                goto error_2;
+        }
+        return slist;
+error_2:
+        afs_put_serverlist(cell->net, slist);
+error:
+        return ERR_PTR(ret);
+}
+/*
+ * Copy the annotations from an old server list to its potential replacement.
+ */
+bool afs_annotate_server_list(struct afs_server_list *new,
+                              struct afs_server_list *old)
+{
+        struct afs_server *cur;
+        int i, j;
+        if (old->nr_servers != new->nr_servers)
+                goto changed;
+        for (i = 0; i < old->nr_servers; i++)
+                if (old->servers[i].server != new->servers[i].server)
+                        goto changed;
+        return false;
+changed:
+        /* Maintain the same current server as before if possible. */
+        cur = old->servers[old->index].server;
+        for (j = 0; j < new->nr_servers; j++) {
+                if (new->servers[j].server == cur) {
+                        new->index = j;
+                        break;
+                }
+        }
+        /* Keep the old callback interest records where possible so that we
+         * maintain callback interception.
+         */
+        i = 0;
+        j = 0;
+        while (i < old->nr_servers && j < new->nr_servers) {
+                if (new->servers[j].server == old->servers[i].server) {
+                        struct afs_cb_interest *cbi = old->servers[i].cb_interest;
+                        if (cbi) {
+                                new->servers[j].cb_interest = cbi;
+                                refcount_inc(&cbi->usage);
+                        }
+                        i++;
+                        j++;
+                        continue;
+                }
+                if (new->servers[j].server < old->servers[i].server) {
+                        j++;
+                        continue;
+                }
+                i++;
+                continue;
+        }
+        return true;
+}
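Review note on afs_annotate_server_list(): because afs_alloc_server_list() keeps each list sorted by server pointer, the callback-interest carry-over can be done as a single two-cursor walk over both lists rather than a nested search. A minimal userspace sketch of that walk follows. The type demo_entry and the function demo_carry_notes are hypothetical; an integer key stands in for the server pointer and an integer note stands in for the cb_interest pointer, with 0 meaning "none". Reference counting is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct afs_server_entry. */
struct demo_entry {
	uintptr_t key;	/* stands in for the server pointer */
	int note;	/* stands in for cb_interest; 0 == none */
};

/*
 * Copy notes from prev[] into repl[] for every key present in both.
 * Both arrays must be sorted in ascending key order with no duplicates.
 * Returns the number of notes carried over.
 */
static size_t demo_carry_notes(struct demo_entry *repl, size_t nr_repl,
			       const struct demo_entry *prev, size_t nr_prev)
{
	size_t i = 0, j = 0, carried = 0;

	assert(repl && prev);
	while (i < nr_prev && j < nr_repl) {
		if (repl[j].key == prev[i].key) {
			if (prev[i].note) {
				repl[j].note = prev[i].note;
				carried++;
			}
			i++;
			j++;
		} else if (repl[j].key < prev[i].key) {
			j++;	/* server only in the replacement list */
		} else {
			i++;	/* server dropped from the replacement list */
		}
	}
	return carried;
}
```

The walk visits each element of either list at most once, so the cost is linear in the combined length. This is the property the insertion sort in afs_alloc_server_list() buys.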
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 3d53b78b350d..af1e769aaebf 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -142,9 +142,9 @@ void __exit afs_fs_exit(void)
 */
 static int afs_show_devname(struct seq_file *m, struct dentry *root)
 {
-        struct afs_super_info *as = root->d_sb->s_fs_info;
+        struct afs_super_info *as = AFS_FS_S(root->d_sb);
        struct afs_volume *volume = as->volume;
-        struct afs_cell *cell = volume->cell;
+        struct afs_cell *cell = as->cell;
        const char *suf = "";
        char pref = '%';
@@ -162,7 +162,7 @@ static int afs_show_devname(struct seq_file *m, struct dentry *root)
                break;
        }
-        seq_printf(m, "%c%s:%s%s", pref, cell->name, volume->vlocation->vldb.name, suf);
+        seq_printf(m, "%c%s:%s%s", pref, cell->name, volume->name, suf);
        return 0;
 }
@@ -334,14 +334,16 @@ static int afs_parse_device_name(struct afs_mount_params *params,
 static int afs_test_super(struct super_block *sb, void *data)
 {
        struct afs_super_info *as1 = data;
-        struct afs_super_info *as = sb->s_fs_info;
+        struct afs_super_info *as = AFS_FS_S(sb);
-        return as->net == as1->net && as->volume == as1->volume;
+        return as->net == as1->net &&
+                as->cell == as1->cell &&
+                as->volume->vid == as1->volume->vid;
 }
 static int afs_set_super(struct super_block *sb, void *data)
 {
-        sb->s_fs_info = data;
+        struct afs_super_info *as = data;
+        sb->s_fs_info = as;
        return set_anon_super(sb, NULL);
 }
@@ -351,7 +353,7 @@ static int afs_set_super(struct super_block *sb, void *data)
 static int afs_fill_super(struct super_block *sb,
                          struct afs_mount_params *params)
 {
-        struct afs_super_info *as = sb->s_fs_info;
+        struct afs_super_info *as = AFS_FS_S(sb);
        struct afs_fid fid;
        struct inode *inode = NULL;
        int ret;
@@ -368,13 +370,15 @@ static int afs_fill_super(struct super_block *sb,
        if (ret)
                return ret;
        sb->s_bdi->ra_pages     = VM_MAX_READAHEAD * 1024 / PAGE_SIZE;
-        strlcpy(sb->s_id, as->volume->vlocation->vldb.name, sizeof(sb->s_id));
+        sprintf(sb->s_id, "%u", as->volume->vid);
+        afs_activate_volume(as->volume);
        /* allocate the root inode and dentry */
        fid.vid         = as->volume->vid;
        fid.vnode       = 1;
        fid.unique      = 1;
-        inode = afs_iget(sb, params->key, &fid, NULL, NULL);
+        inode = afs_iget(sb, params->key, &fid, NULL, NULL, NULL);
        if (IS_ERR(inode))
                return PTR_ERR(inode);
@@ -426,7 +430,7 @@ static struct dentry *afs_mount(struct file_system_type *fs_type,
 {
        struct afs_mount_params params;
        struct super_block *sb;
-        struct afs_volume *vol;
+        struct afs_volume *candidate;
        struct key *key;
        struct afs_super_info *as;
        int ret;
@@ -464,15 +468,19 @@ static struct dentry *afs_mount(struct file_system_type *fs_type,
        ret = -ENOMEM;
        as = afs_alloc_sbi(&params);
        if (!as)
-                goto error;
+                goto error_key;
-        /* parse the device name */
+        /* Assume we're going to need a volume record; at the very least we can
-        vol = afs_volume_lookup(&params);
+         * use it to update the volume record if we have one already.  This
-        if (IS_ERR(vol)) {
+         * checks that the volume exists within the cell.
-                ret = PTR_ERR(vol);
+         */
-                goto error;
+        candidate = afs_create_volume(&params);
+        if (IS_ERR(candidate)) {
+                ret = PTR_ERR(candidate);
+                goto error_as;
        }
-        as->volume = vol;
+        as->volume = candidate;
        /* allocate a deviceless superblock */
        sb = sget(fs_type, afs_test_super, afs_set_super, flags, as);
@@ -503,11 +511,13 @@ static struct dentry *afs_mount(struct file_system_type *fs_type,
 error_sb:
        deactivate_locked_super(sb);
+        goto error_key;
 error_as:
        afs_destroy_sbi(as);
+error_key:
+        key_put(params.key);
 error:
        afs_put_cell(params.net, params.cell);
-        key_put(params.key);
        _leave(" = %d", ret);
        return ERR_PTR(ret);
 }
@@ -519,8 +529,9 @@ static void afs_kill_super(struct super_block *sb)
        /* Clear the callback interests (which will do ilookup5) before
         * deactivating the superblock.
         */
-        afs_clear_callback_interests(as->net, as->volume);
+        afs_clear_callback_interests(as->net, as->volume->servers);
        kill_anon_super(sb);
+        afs_deactivate_volume(as->volume);
        afs_destroy_sbi(as);
 }
@@ -533,7 +544,7 @@ static void afs_i_init_once(void *_vnode)
        memset(vnode, 0, sizeof(*vnode));
        inode_init_once(&vnode->vfs_inode);
-        init_waitqueue_head(&vnode->update_waitq);
+        mutex_init(&vnode->io_lock);
        mutex_init(&vnode->validate_lock);
        spin_lock_init(&vnode->writeback_lock);
        spin_lock_init(&vnode->lock);
@@ -561,7 +572,6 @@ static struct inode *afs_alloc_inode(struct super_block *sb)
        memset(&vnode->status, 0, sizeof(vnode->status));
        vnode->volume           = NULL;
-        vnode->update_cnt       = 0;
        vnode->flags            = 1 << AFS_VNODE_UNSET;
        _leave(" = %p", &vnode->vfs_inode);
@@ -597,6 +607,7 @@ static void afs_destroy_inode(struct inode *inode)
 */
 static int afs_statfs(struct dentry *dentry, struct kstatfs *buf)
 {
+        struct afs_fs_cursor fc;
        struct afs_volume_status vs;
        struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
        struct key *key;
@@ -606,21 +617,32 @@ static int afs_statfs(struct dentry *dentry, struct kstatfs *buf)
        if (IS_ERR(key))
                return PTR_ERR(key);
-        ret = afs_vnode_get_volume_status(vnode, key, &vs);
+        ret = -ERESTARTSYS;
-        key_put(key);
+        if (afs_begin_vnode_operation(&fc, vnode, key)) {
-        if (ret < 0) {
+                fc.flags |= AFS_FS_CURSOR_NO_VSLEEP;
-                _leave(" = %d", ret);
+                while (afs_select_fileserver(&fc)) {
-                return ret;
+                        fc.cb_break = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_get_volume_status(&fc, &vs);
+                }
+                afs_check_for_remote_deletion(&fc, fc.vnode);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break);
+                ret = afs_end_vnode_operation(&fc);
        }
-        buf->f_type     = dentry->d_sb->s_magic;
+        key_put(key);
-        buf->f_bsize    = AFS_BLOCK_SIZE;
-        buf->f_namelen  = AFSNAMEMAX - 1;
-        if (vs.max_quota == 0)
+        if (ret == 0) {
-                buf->f_blocks = vs.part_max_blocks;
+                buf->f_type     = dentry->d_sb->s_magic;
-        else
+                buf->f_bsize    = AFS_BLOCK_SIZE;
-                buf->f_blocks = vs.max_quota;
+                buf->f_namelen  = AFSNAMEMAX - 1;
-        buf->f_bavail = buf->f_bfree = buf->f_blocks - vs.blocks_in_use;
-        return 0;
+                if (vs.max_quota == 0)
+                        buf->f_blocks = vs.part_max_blocks;
+                else
+                        buf->f_blocks = vs.max_quota;
+                buf->f_bavail = buf->f_bfree = buf->f_blocks - vs.blocks_in_use;
+        }
+        return ret;
 }
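Review note on the afs_statfs() result mapping: the block count reported to userspace is the volume quota when one is set and the partition size otherwise, and the free count is that figure minus the blocks in use. A minimal userspace sketch of that mapping follows. The types demo_volume_status and demo_statfs and the function demo_fill_statfs are hypothetical, and the field widths are assumptions. Note one deliberate difference from the patch: the sketch clamps the free count at zero when usage exceeds the quota, whereas the patch subtracts directly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct afs_volume_status and struct kstatfs. */
struct demo_volume_status {
	uint32_t max_quota;		/* 0 == no quota set */
	uint32_t part_max_blocks;	/* size of the hosting partition */
	uint32_t blocks_in_use;
};

struct demo_statfs {
	uint64_t f_blocks;
	uint64_t f_bfree;
	uint64_t f_bavail;
};

static void demo_fill_statfs(struct demo_statfs *buf,
			     const struct demo_volume_status *vs)
{
	assert(buf && vs);

	/* Prefer the quota; fall back to the partition size. */
	buf->f_blocks = vs->max_quota ? vs->max_quota : vs->part_max_blocks;

	/* Assumption: clamp at zero.  The patch does a plain subtraction. */
	if (vs->blocks_in_use >= buf->f_blocks)
		buf->f_bfree = 0;
	else
		buf->f_bfree = buf->f_blocks - vs->blocks_in_use;
	buf->f_bavail = buf->f_bfree;
}
```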
diff --git a/fs/afs/vlclient.c b/fs/afs/vlclient.c
index 1d1e7df77dd5..173c652fe875 100644
--- a/fs/afs/vlclient.c
+++ b/fs/afs/vlclient.c
@@ -16,14 +16,15 @@
 #include "internal.h"
 /*
- * deliver reply data to a VL.GetEntryByXXX call
+ * Deliver reply data to a VL.GetEntryByNameU call.
 */
-static int afs_deliver_vl_get_entry_by_xxx(struct afs_call *call)
+static int afs_deliver_vl_get_entry_by_name_u(struct afs_call *call)
 {
-        struct afs_cache_vlocation *entry;
+        struct afs_uvldbentry__xdr *uvldb;
-        __be32 *bp;
+        struct afs_vldb_entry *entry;
+        bool new_only = false;
        u32 tmp;
-        int loop, ret;
+        int i, ret;
        _enter("");
@@ -32,152 +33,270 @@ static int afs_deliver_vl_get_entry_by_xxx(struct afs_call *call)
                return ret;
        /* unmarshall the reply once we've received all of it */
+        uvldb = call->buffer;
        entry = call->reply[0];
-        bp = call->buffer;
+        for (i = 0; i < ARRAY_SIZE(uvldb->name) - 1; i++)
-        for (loop = 0; loop < 64; loop++)
+                entry->name[i] = (u8)ntohl(uvldb->name[i]);
-                entry->name[loop] = ntohl(*bp++);
+        entry->name[i] = 0;
-        entry->name[loop] = 0;
+        entry->name_len = strlen(entry->name);
-        bp++; /* final NUL */
+        /* If there is a new replication site that we can use, ignore all the
-        bp++; /* type */
+         * sites that aren't marked as new.
-        entry->nservers = ntohl(*bp++);
+         */
+        for (i = 0; i < AFS_NMAXNSERVERS; i++) {
-        for (loop = 0; loop < 8; loop++) {
+                tmp = ntohl(uvldb->serverFlags[i]);
-                entry->servers[loop].srx_family = AF_RXRPC;
+                if (!(tmp & AFS_VLSF_DONTUSE) &&
-                entry->servers[loop].srx_service = FS_SERVICE;
+                    (tmp & AFS_VLSF_NEWREPSITE))
-                entry->servers[loop].transport_type = SOCK_DGRAM;
+                        new_only = true;
-                entry->servers[loop].transport_len = sizeof(entry->servers[loop].transport.sin6);
-                entry->servers[loop].transport.sin6.sin6_family = AF_INET6;
-                entry->servers[loop].transport.sin6.sin6_port = htons(AFS_FS_PORT);
-                entry->servers[loop].transport.sin6.sin6_flowinfo = 0;
-                entry->servers[loop].transport.sin6.sin6_scope_id = 0;
-                entry->servers[loop].transport.sin6.sin6_addr.s6_addr32[0] = 0;
-                entry->servers[loop].transport.sin6.sin6_addr.s6_addr32[1] = 0;
-                entry->servers[loop].transport.sin6.sin6_addr.s6_addr32[2] = htonl(0xffff);
-                entry->servers[loop].transport.sin6.sin6_addr.s6_addr32[3] = *bp++;
        }
-        bp += 8; /* partition IDs */
+        for (i = 0; i < AFS_NMAXNSERVERS; i++) {
+                struct afs_uuid__xdr *xdr;
+                struct afs_uuid *uuid;
+                int j;
-        for (loop = 0; loop < 8; loop++) {
+                tmp = ntohl(uvldb->serverFlags[i]);
-                tmp = ntohl(*bp++);
+                if (tmp & AFS_VLSF_DONTUSE ||
-                entry->srvtmask[loop] = 0;
+                    (new_only && !(tmp & AFS_VLSF_NEWREPSITE)))
+                        continue;
                if (tmp & AFS_VLSF_RWVOL)
-                        entry->srvtmask[loop] |= AFS_VOL_VTM_RW;
+                        entry->fs_mask[i] |= AFS_VOL_VTM_RW;
                if (tmp & AFS_VLSF_ROVOL)
-                        entry->srvtmask[loop] |= AFS_VOL_VTM_RO;
+                        entry->fs_mask[i] |= AFS_VOL_VTM_RO;
                if (tmp & AFS_VLSF_BACKVOL)
-                        entry->srvtmask[loop] |= AFS_VOL_VTM_BAK;
+                        entry->fs_mask[i] |= AFS_VOL_VTM_BAK;
-        }
+                if (!entry->fs_mask[i])
+                        continue;
+                xdr = &uvldb->serverNumber[i];
+                uuid = (struct afs_uuid *)&entry->fs_server[i];
+                uuid->time_low                  = xdr->time_low;
+                uuid->time_mid                  = htons(ntohl(xdr->time_mid));
+                uuid->time_hi_and_version       = htons(ntohl(xdr->time_hi_and_version));
+                uuid->clock_seq_hi_and_reserved = (u8)ntohl(xdr->clock_seq_hi_and_reserved);
+                uuid->clock_seq_low             = (u8)ntohl(xdr->clock_seq_low);
+                for (j = 0; j < 6; j++)
+                        uuid->node[j] = (u8)ntohl(xdr->node[j]);
-        entry->vid[0] = ntohl(*bp++);
+                entry->nr_servers++;
-        entry->vid[1] = ntohl(*bp++);
+        }
-        entry->vid[2] = ntohl(*bp++);
-        bp++; /* clone ID */
+        for (i = 0; i < AFS_MAXTYPES; i++)
+                entry->vid[i] = ntohl(uvldb->volumeId[i]);
-        tmp = ntohl(*bp++); /* flags */
+        tmp = ntohl(uvldb->flags);
-        entry->vidmask = 0;
        if (tmp & AFS_VLF_RWEXISTS)
-                entry->vidmask |= AFS_VOL_VTM_RW;
+                __set_bit(AFS_VLDB_HAS_RW, &entry->flags);
        if (tmp & AFS_VLF_ROEXISTS)
-                entry->vidmask |= AFS_VOL_VTM_RO;
+                __set_bit(AFS_VLDB_HAS_RO, &entry->flags);
        if (tmp & AFS_VLF_BACKEXISTS)
-                entry->vidmask |= AFS_VOL_VTM_BAK;
+                __set_bit(AFS_VLDB_HAS_BAK, &entry->flags);
-        if (!entry->vidmask)
-                return -EBADMSG;
+        if (!(tmp & (AFS_VLF_RWEXISTS | AFS_VLF_ROEXISTS | AFS_VLF_BACKEXISTS))) {
+                entry->error = -ENOMEDIUM;
+                __set_bit(AFS_VLDB_QUERY_ERROR, &entry->flags);
+        }
+        __set_bit(AFS_VLDB_QUERY_VALID, &entry->flags);
        _leave(" = 0 [done]");
        return 0;
 }
-/*
+static void afs_destroy_vl_get_entry_by_name_u(struct afs_call *call)
- * VL.GetEntryByName operation type
+{
- */
+        kfree(call->reply[0]);
-static const struct afs_call_type afs_RXVLGetEntryByName = {
+        afs_flat_call_destructor(call);
-        .name           = "VL.GetEntryByName",
+}
-        .deliver        = afs_deliver_vl_get_entry_by_xxx,
-        .destructor     = afs_flat_call_destructor,
-};
 /*
- * VL.GetEntryById operation type
+ * VL.GetEntryByNameU operation type.
 */
-static const struct afs_call_type afs_RXVLGetEntryById = {
+static const struct afs_call_type afs_RXVLGetEntryByNameU = {
-        .name           = "VL.GetEntryById",
+        .name           = "VL.GetEntryByNameU",
-        .deliver        = afs_deliver_vl_get_entry_by_xxx,
+        .deliver        = afs_deliver_vl_get_entry_by_name_u,
-        .destructor     = afs_flat_call_destructor,
+        .destructor     = afs_destroy_vl_get_entry_by_name_u,
 };
 /*
- * dispatch a get volume entry by name operation
+ * Dispatch a get volume entry by name or ID operation (uuid variant).  If the
+ * volname is a decimal number then it's a volume ID not a volume name.
 */
-int afs_vl_get_entry_by_name(struct afs_net *net,
+struct afs_vldb_entry *afs_vl_get_entry_by_name_u(struct afs_net *net,
-                             struct afs_addr_cursor *ac,
+                                                  struct afs_addr_cursor *ac,
-                             struct key *key,
+                                                  struct key *key,
-                             const char *volname,
+                                                  const char *volname,
-                             struct afs_cache_vlocation *entry,
+                                                  int volnamesz)
-                             bool async)
 {
+        struct afs_vldb_entry *entry;
        struct afs_call *call;
-        size_t volnamesz, reqsz, padsz;
+        size_t reqsz, padsz;
        __be32 *bp;
        _enter("");
-        volnamesz = strlen(volname);
        padsz = (4 - (volnamesz & 3)) & 3;
        reqsz = 8 + volnamesz + padsz;
-        call = afs_alloc_flat_call(net, &afs_RXVLGetEntryByName, reqsz, 384);
+        entry = kzalloc(sizeof(struct afs_vldb_entry), GFP_KERNEL);
-        if (!call)
+        if (!entry)
-                return -ENOMEM;
+                return ERR_PTR(-ENOMEM);
+        call = afs_alloc_flat_call(net, &afs_RXVLGetEntryByNameU, reqsz,
+                                   sizeof(struct afs_uvldbentry__xdr));
+        if (!call) {
+                kfree(entry);
+                return ERR_PTR(-ENOMEM);
+        }
        call->key = key;
        call->reply[0] = entry;
+        call->ret_reply0 = true;
-        /* marshall the parameters */
+        /* Marshall the parameters */
        bp = call->request;
-        *bp++ = htonl(VLGETENTRYBYNAME);
+        *bp++ = htonl(VLGETENTRYBYNAMEU);
        *bp++ = htonl(volnamesz);
        memcpy(bp, volname, volnamesz);
        if (padsz > 0)
-                memset((void *) bp + volnamesz, 0, padsz);
+                memset((void *)bp + volnamesz, 0, padsz);
-        /* initiate the call */
+        return (struct afs_vldb_entry *)afs_make_call(ac, call, GFP_KERNEL, false);
-        return afs_make_call(ac, call, GFP_KERNEL, async);
 }
 /*
- * dispatch a get volume entry by ID operation
+ * Deliver reply data to a VL.GetAddrsU call.
+ *
+ *      GetAddrsU(IN ListAddrByAttributes *inaddr,
+ *                OUT afsUUID *uuidp1,
+ *                OUT uint32_t *uniquifier,
+ *                OUT uint32_t *nentries,
+ *                OUT bulkaddrs *blkaddrs);
 */
-int afs_vl_get_entry_by_id(struct afs_net *net,
+static int afs_deliver_vl_get_addrs_u(struct afs_call *call)
-                           struct afs_addr_cursor *ac,
-                           struct key *key,
-                           afs_volid_t volid,
-                           afs_voltype_t voltype,
-                           struct afs_cache_vlocation *entry,
-                           bool async)
 {
+        struct afs_addr_list *alist;
+        __be32 *bp;
+        u32 uniquifier, nentries, count;
+        int i, ret;
+        _enter("{%u,%zu/%u}", call->unmarshall, call->offset, call->count);
+again:
+        switch (call->unmarshall) {
+        case 0:
+                call->offset = 0;
+                call->unmarshall++;
+                /* Extract the returned uuid, uniquifier, nentries and blkaddrs size */
+        case 1:
+                ret = afs_extract_data(call, call->buffer,
+                                       sizeof(struct afs_uuid__xdr) + 3 * sizeof(__be32),
+                                       true);
+                if (ret < 0)
+                        return ret;
+                bp = call->buffer + sizeof(struct afs_uuid__xdr);
+                uniquifier      = ntohl(*bp++);
+                nentries        = ntohl(*bp++);
+                count           = ntohl(*bp);
+                nentries = min(nentries, count);
+                alist = afs_alloc_addrlist(nentries, FS_SERVICE, AFS_FS_PORT);
+                if (!alist)
+                        return -ENOMEM;
+                alist->version = uniquifier;
+                call->reply[0] = alist;
+                call->count = count;
+                call->count2 = nentries;
+                call->offset = 0;
+                call->unmarshall++;
+                /* Extract entries */
+        case 2:
+                count = min(call->count, 4U);
+                ret = afs_extract_data(call, call->buffer,
+                                       count * sizeof(__be32),
+                                       call->count > 4);
+                if (ret < 0)
+                        return ret;
+                alist = call->reply[0];
+                bp = call->buffer;
+                for (i = 0; i < count; i++)
+                        if (alist->nr_addrs < call->count2)
+                                afs_merge_fs_addr4(alist, *bp++);
+                call->count -= count;
+                if (call->count > 0)
+                        goto again;
+                call->offset = 0;
+                call->unmarshall++;
+                break;
+        }
+        _leave(" = 0 [done]");
+        return 0;
+}
+static void afs_vl_get_addrs_u_destructor(struct afs_call *call)
+{
+        afs_put_addrlist(call->reply[0]);
+        kfree(call->reply[1]);
+        return afs_flat_call_destructor(call);
+}
+/*
+ * VL.GetAddrsU operation type.
+ */
+static const struct afs_call_type afs_RXVLGetAddrsU = {
+        .name           = "VL.GetAddrsU",
+        .deliver        = afs_deliver_vl_get_addrs_u,
+        .destructor     = afs_vl_get_addrs_u_destructor,
+};
+/*
+ * Dispatch an operation to get the addresses for a server, where the server is
+ * nominated by UUID.
+ */
+struct afs_addr_list *afs_vl_get_addrs_u(struct afs_net *net,
+                                         struct afs_addr_cursor *ac,
+                                         struct key *key,
+                                         const uuid_t *uuid)
+{
+        struct afs_ListAddrByAttributes__xdr *r;
+        const struct afs_uuid *u = (const struct afs_uuid *)uuid;
        struct afs_call *call;
        __be32 *bp;
+        int i;
        _enter("");
-        call = afs_alloc_flat_call(net, &afs_RXVLGetEntryById, 12, 384);
+        call = afs_alloc_flat_call(net, &afs_RXVLGetAddrsU,
+                                   sizeof(__be32) + sizeof(struct afs_ListAddrByAttributes__xdr),
+                                   sizeof(struct afs_uuid__xdr) + 3 * sizeof(__be32));
        if (!call)
-                return -ENOMEM;
+                return ERR_PTR(-ENOMEM);
        call->key = key;
-        call->reply[0] = entry;
+        call->reply[0] = NULL;
+        call->ret_reply0 = true;
-        /* marshall the parameters */
+        /* Marshall the parameters */
        bp = call->request;
-        *bp++ = htonl(VLGETENTRYBYID);
+        *bp++ = htonl(VLGETADDRSU);
-        *bp++ = htonl(volid);
+        r = (struct afs_ListAddrByAttributes__xdr *)bp;
-        *bp   = htonl(voltype);
+        r->Mask         = htonl(AFS_VLADDR_UUID);
+        r->ipaddr       = 0;
+        r->index        = 0;
+        r->spare        = 0;
+        r->uuid.time_low                        = u->time_low;
+        r->uuid.time_mid                        = htonl(ntohs(u->time_mid));
+        r->uuid.time_hi_and_version             = htonl(ntohs(u->time_hi_and_version));
+        r->uuid.clock_seq_hi_and_reserved       = htonl(u->clock_seq_hi_and_reserved);
+        r->uuid.clock_seq_low                   = htonl(u->clock_seq_low);
+        for (i = 0; i < 6; i++)
+                r->uuid.node[i] = htonl(u->node[i]);
-        /* initiate the call */
+        return (struct afs_addr_list *)afs_make_call(ac, call, GFP_KERNEL, false);
-        return afs_make_call(ac, call, GFP_KERNEL, async);
 }
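Note on the UUID conversions in the vlclient.c hunks above: the XDR form sent on the wire widens every UUID field to a 32-bit big-endian word, while the in-memory form keeps the natural field widths. The following is a minimal userspace sketch of that round trip, not the kernel code itself. The demo_uuid and demo_uuid_xdr structures and both helper functions are hypothetical stand-ins; their field names follow the patch, but their layouts are assumed here rather than taken from the kernel headers.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the in-memory UUID: multi-byte fields are
 * assumed to be held in network byte order, as the conversions in the
 * patch imply.
 */
struct demo_uuid {
	uint32_t time_low;
	uint16_t time_mid;
	uint16_t time_hi_and_version;
	uint8_t  clock_seq_hi_and_reserved;
	uint8_t  clock_seq_low;
	uint8_t  node[6];
};

/* Hypothetical stand-in for the XDR form: every field, including each
 * single byte, occupies one 32-bit big-endian word (11 words in total).
 */
struct demo_uuid_xdr {
	uint32_t time_low;
	uint32_t time_mid;
	uint32_t time_hi_and_version;
	uint32_t clock_seq_hi_and_reserved;
	uint32_t clock_seq_low;
	uint32_t node[6];
};

/* Widen each field to a 32-bit word in network order (request side). */
static void demo_uuid_to_xdr(const struct demo_uuid *u,
			     struct demo_uuid_xdr *x)
{
	int i;

	x->time_low = u->time_low;	/* already 32 bits, network order */
	x->time_mid = htonl(ntohs(u->time_mid));
	x->time_hi_and_version = htonl(ntohs(u->time_hi_and_version));
	x->clock_seq_hi_and_reserved = htonl(u->clock_seq_hi_and_reserved);
	x->clock_seq_low = htonl(u->clock_seq_low);
	for (i = 0; i < 6; i++)
		x->node[i] = htonl(u->node[i]);
}

/* Narrow each 32-bit word back to its natural width (reply side). */
static void demo_xdr_to_uuid(const struct demo_uuid_xdr *x,
			     struct demo_uuid *u)
{
	int i;

	u->time_low = x->time_low;
	u->time_mid = htons((uint16_t)ntohl(x->time_mid));
	u->time_hi_and_version = htons((uint16_t)ntohl(x->time_hi_and_version));
	u->clock_seq_hi_and_reserved = (uint8_t)ntohl(x->clock_seq_hi_and_reserved);
	u->clock_seq_low = (uint8_t)ntohl(x->clock_seq_low);
	for (i = 0; i < 6; i++)
		u->node[i] = (uint8_t)ntohl(x->node[i]);
}
```

Converting a UUID to the wire form and back with these helpers should give back the original bytes. On the request side the widening uses htonl() throughout. For a single byte htonl() and ntohl() produce the same value, so the choice mainly matters for endianness annotation checking.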
diff --git a/fs/afs/vlocation.c b/fs/afs/vlocation.c
deleted file mode 100644
index 52c31ad0ef60..000000000000
--- a/fs/afs/vlocation.c
+++ /dev/null
@@ -1,669 +0,0 @@
-/* AFS volume location management
- *
- * Copyright (C) 2002, 2007 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/slab.h>
-#include <linux/init.h>
-#include <linux/sched.h>
-#include "internal.h"
-struct workqueue_struct *afs_vlocation_update_worker;
-static unsigned afs_vlocation_timeout = 10;     /* volume location timeout in seconds */
-static unsigned afs_vlocation_update_timeout = 10 * 60;
-/*
- * iterate through the VL servers in a cell until one of them admits knowing
- * about the volume in question
- */
-static int afs_vlocation_access_vl_by_name(struct afs_vlocation *vl,
-                                           struct key *key,
-                                           struct afs_cache_vlocation *vldb)
-{
-        struct afs_addr_cursor ac;
-        int ret;
-        _enter("%s,%s", vl->cell->name, vl->vldb.name);
-        ret = afs_set_vl_cursor(&ac, vl->cell);
-        if (ret < 0)
-                return ret;
-        down_write(&vl->cell->vl_sem);
-        ret = -ENOMEDIUM;
-        while (afs_iterate_addresses(&ac)) {
-                _debug("CellServ[%hu]: %pIS", ac.index, &ac.addr->transport);
-                /* attempt to access the VL server */
-                ac.error = afs_vl_get_entry_by_name(vl->cell->net, &ac, key,
-                                                    vl->vldb.name, vldb, false);
-                switch (ac.error) {
-                case 0:
-                        goto out;
-                case -ENOMEM:
-                case -ENONET:
-                case -ENETUNREACH:
-                case -EHOSTUNREACH:
-                case -ECONNREFUSED:
-                        if (ac.error == -ENOMEM || ac.error == -ENONET)
-                                goto out;
-                        break;
-                case -ENOMEDIUM:
-                case -EKEYREJECTED:
-                case -EKEYEXPIRED:
-                        ac.responded = true;
-                        goto out;
-                default:
-                        ac.responded = true;
-                        ac.error = -EIO;
-                        break;
-                }
-        }
-out:
-        up_write(&vl->cell->vl_sem);
-        ret = afs_end_cursor(&ac);
-        _leave(" = %d", ret);
-        return ret;
-}
-/*
- * iterate through the VL servers in a cell until one of them admits knowing
- * about the volume in question
- */
-static int afs_vlocation_access_vl_by_id(struct afs_vlocation *vl,
-                                         struct key *key,
-                                         afs_volid_t volid,
-                                         afs_voltype_t voltype,
-                                         struct afs_cache_vlocation *vldb)
-{
-        struct afs_addr_cursor ac;
-        int ret;
-        _enter("%s,%x,%d,", vl->cell->name, volid, voltype);
-        ret = afs_set_vl_cursor(&ac, vl->cell);
-        if (ret < 0)
-                return ret;
-        down_write(&vl->cell->vl_sem);
-        ret = -ENOMEDIUM;
-        while (afs_iterate_addresses(&ac)) {
-                _debug("CellServ[%hu]: %pIS", ac.index, &ac.addr->transport);
-                /* attempt to access the VL server */
-                ac.error = afs_vl_get_entry_by_id(vl->cell->net, &ac, key, volid,
-                                                  voltype, vldb, false);
-                switch (ac.error) {
-                case 0:
-                        goto out;
-                case -ENOMEM:
-                case -ENONET:
-                case -ENETUNREACH:
-                case -EHOSTUNREACH:
-                case -ECONNREFUSED:
-                        if (ac.error == -ENOMEM || ac.error == -ENONET)
-                                goto out;
-                        goto rotate;
-                case -EBUSY:
-                        ac.responded = true;
-                        vl->upd_busy_cnt++;
-                        if (vl->upd_busy_cnt <= 3) {
-                                if (vl->upd_busy_cnt > 1) {
-                                        /* second+ BUSY - sleep a little bit */
-                                        set_current_state(TASK_UNINTERRUPTIBLE);
-                                        schedule_timeout(1);
-                                }
-                                continue;
-                        }
-                        break;
-                case -ENOMEDIUM:
-                        ac.responded = true;
-                        vl->upd_rej_cnt++;
-                        goto rotate;
-                default:
-                        ac.responded = true;
-                        ac.error = -EIO;
-                        goto rotate;
-                }
-                /* rotate the server records upon lookup failure */
-        rotate:
-                vl->upd_busy_cnt = 0;
-        }
-out:
-        if (ac.error < 0 && vl->upd_rej_cnt > 0) {
-                printk(KERN_NOTICE "kAFS:"
-                       " Active volume no longer valid '%s'\n",
-                       vl->vldb.name);
-                vl->valid = 0;
-                ac.error = -ENOMEDIUM;
-        }
-        up_write(&vl->cell->vl_sem);
-        ret = afs_end_cursor(&ac);
-        _leave(" = %d", ret);
-        return ret;
-}
-/*
- * allocate a volume location record
- */
-static struct afs_vlocation *afs_vlocation_alloc(struct afs_cell *cell,
-                                                 const char *name,
-                                                 size_t namesz)
-{
-        struct afs_vlocation *vl;
-        vl = kzalloc(sizeof(struct afs_vlocation), GFP_KERNEL);
-        if (vl) {
-                vl->cell = cell;
-                vl->state = AFS_VL_NEW;
-                atomic_set(&vl->usage, 1);
-                INIT_LIST_HEAD(&vl->link);
-                INIT_LIST_HEAD(&vl->grave);
-                INIT_LIST_HEAD(&vl->update);
-                init_waitqueue_head(&vl->waitq);
-                spin_lock_init(&vl->lock);
-                memcpy(vl->vldb.name, name, namesz);
-        }
-        _leave(" = %p", vl);
-        return vl;
-}
-/*
- * update record if we found it in the cache
- */
-static int afs_vlocation_update_record(struct afs_vlocation *vl,
-                                       struct key *key,
-                                       struct afs_cache_vlocation *vldb)
-{
-        afs_voltype_t voltype;
-        afs_volid_t vid;
-        int ret;
-        /* try to look up a cached volume in the cell VL databases by ID */
-        _debug("Locally Cached: %s %02x", vl->vldb.name, vl->vldb.vidmask);
-        _debug("Vids: %08x %08x %08x",
-               vl->vldb.vid[0],
-               vl->vldb.vid[1],
-               vl->vldb.vid[2]);
-        if (vl->vldb.vidmask & AFS_VOL_VTM_RW) {
-                vid = vl->vldb.vid[0];
-                voltype = AFSVL_RWVOL;
-        } else if (vl->vldb.vidmask & AFS_VOL_VTM_RO) {
-                vid = vl->vldb.vid[1];
-                voltype = AFSVL_ROVOL;
-        } else if (vl->vldb.vidmask & AFS_VOL_VTM_BAK) {
-                vid = vl->vldb.vid[2];
-                voltype = AFSVL_BACKVOL;
-        } else {
-                BUG();
-                vid = 0;
-                voltype = 0;
-        }
-        /* contact the server to make sure the volume is still available
-         * - TODO: need to handle disconnected operation here
-         */
-        ret = afs_vlocation_access_vl_by_id(vl, key, vid, voltype, vldb);
-        switch (ret) {
-                /* net error */
-        default:
-                printk(KERN_WARNING "kAFS:"
-                       " failed to update volume '%s' (%x) up in '%s': %d\n",
-                       vl->vldb.name, vid, vl->cell->name, ret);
-                _leave(" = %d", ret);
-                return ret;
-                /* pulled from local cache into memory */
-        case 0:
-                _leave(" = 0");
-                return 0;
-                /* uh oh... looks like the volume got deleted */
-        case -ENOMEDIUM:
-                printk(KERN_ERR "kAFS:"
-                       " volume '%s' (%x) does not exist '%s'\n",
-                       vl->vldb.name, vid, vl->cell->name);
-                /* TODO: make existing record unavailable */
-                _leave(" = %d", ret);
-                return ret;
-        }
-}
-/*
- * apply the update to a VL record
- */
-static void afs_vlocation_apply_update(struct afs_vlocation *vl,
-                                       struct afs_cache_vlocation *vldb)
-{
-        _debug("Done VL Lookup: %s %02x", vldb->name, vldb->vidmask);
-        _debug("Vids: %08x %08x %08x",
-               vldb->vid[0], vldb->vid[1], vldb->vid[2]);
-        if (strcmp(vldb->name, vl->vldb.name) != 0)
-                printk(KERN_NOTICE "kAFS:"
-                       " name of volume '%s' changed to '%s' on server\n",
-                       vl->vldb.name, vldb->name);
-        vl->vldb = *vldb;
-}
-/*
- * fill in a volume location record, consulting the cache and the VL server
- * both
- */
-static int afs_vlocation_fill_in_record(struct afs_vlocation *vl,
-                                        struct key *key)
-{
-        struct afs_cache_vlocation vldb;
-        int ret;
-        _enter("");
-        ASSERTCMP(vl->valid, ==, 0);
-        memset(&vldb, 0, sizeof(vldb));
-        /* Try to look up an unknown volume in the cell VL databases by name */
-        ret = afs_vlocation_access_vl_by_name(vl, key, &vldb);
-        if (ret < 0) {
-                printk("kAFS: failed to locate '%s' in cell '%s'\n",
-                       vl->vldb.name, vl->cell->name);
-                return ret;
-        }
-        afs_vlocation_apply_update(vl, &vldb);
-        _leave(" = 0");
-        return 0;
-}
-/*
- * queue a vlocation record for updates
- */
-static void afs_vlocation_queue_for_updates(struct afs_net *net,
-                                            struct afs_vlocation *vl)
-{
-        struct afs_vlocation *xvl;
-        /* wait at least 10 minutes before updating... */
-        vl->update_at = ktime_get_real_seconds() +
-                        afs_vlocation_update_timeout;
-        spin_lock(&net->vl_updates_lock);
-        if (!list_empty(&net->vl_updates)) {
-                /* ... but wait at least 1 second more than the newest record
-                 * already queued so that we don't spam the VL server suddenly
-                 * with lots of requests
-                 */
-                xvl = list_entry(net->vl_updates.prev,
-                                 struct afs_vlocation, update);
-                if (vl->update_at <= xvl->update_at)
-                        vl->update_at = xvl->update_at + 1;
-        } else if (net->live) {
-                queue_delayed_work(afs_vlocation_update_worker,
-                                   &net->vl_updater,
-                                   afs_vlocation_update_timeout * HZ);
-        }
-        list_add_tail(&vl->update, &net->vl_updates);
-        spin_unlock(&net->vl_updates_lock);
-}
-/*
- * lookup volume location
- * - iterate through the VL servers in a cell until one of them admits knowing
- *   about the volume in question
- * - lookup in the local cache if not able to find on the VL server
- * - insert/update in the local cache if did get a VL response
- */
-struct afs_vlocation *afs_vlocation_lookup(struct afs_net *net,
-                                           struct afs_cell *cell,
-                                           struct key *key,
-                                           const char *name,
-                                           size_t namesz)
-{
-        struct afs_vlocation *vl;
-        int ret;
-        _enter("{%s},{%x},%*.*s,%zu",
-               cell->name, key_serial(key),
-               (int) namesz, (int) namesz, name, namesz);
-        if (namesz >= sizeof(vl->vldb.name)) {
-                _leave(" = -ENAMETOOLONG");
-                return ERR_PTR(-ENAMETOOLONG);
-        }
-        /* see if we have an in-memory copy first */
-        down_write(&cell->vl_sem);
-        spin_lock(&cell->vl_lock);
-        list_for_each_entry(vl, &cell->vl_list, link) {
-                if (vl->vldb.name[namesz] != '\0')
-                        continue;
-                if (memcmp(vl->vldb.name, name, namesz) == 0)
-                        goto found_in_memory;
-        }
-        spin_unlock(&cell->vl_lock);
-        /* not in the cell's in-memory lists - create a new record */
-        vl = afs_vlocation_alloc(cell, name, namesz);
-        if (!vl) {
-                up_write(&cell->vl_sem);
-                return ERR_PTR(-ENOMEM);
-        }
-        afs_get_cell(cell);
-        list_add_tail(&vl->link, &cell->vl_list);
-        vl->state = AFS_VL_CREATING;
-        up_write(&cell->vl_sem);
-fill_in_record:
-        ret = afs_vlocation_fill_in_record(vl, key);
-        if (ret < 0)
-                goto error_abandon;
-        spin_lock(&vl->lock);
-        vl->state = AFS_VL_VALID;
-        spin_unlock(&vl->lock);
-        wake_up(&vl->waitq);
-        /* schedule for regular updates */
-        afs_vlocation_queue_for_updates(net, vl);
-        goto success;
-found_in_memory:
-        /* found in memory */
-        _debug("found in memory");
-        atomic_inc(&vl->usage);
-        spin_unlock(&cell->vl_lock);
-        if (!list_empty(&vl->grave)) {
-                spin_lock(&net->vl_graveyard_lock);
-                list_del_init(&vl->grave);
-                spin_unlock(&net->vl_graveyard_lock);
-        }
-        up_write(&cell->vl_sem);
-        /* see if it was an abandoned record that we might try filling in */
-        spin_lock(&vl->lock);
-        while (vl->state != AFS_VL_VALID) {
-                afs_vlocation_state_t state = vl->state;
-                _debug("invalid [state %d]", state);
-                if (state == AFS_VL_NEW || state == AFS_VL_NO_VOLUME) {
-                        vl->state = AFS_VL_CREATING;
-                        spin_unlock(&vl->lock);
-                        goto fill_in_record;
-                }
-                /* must now wait for creation or update by someone else to
-                 * complete */
-                _debug("wait");
-                spin_unlock(&vl->lock);
-                ret = wait_event_interruptible(vl->waitq,
-                                               vl->state == AFS_VL_NEW ||
-                                               vl->state == AFS_VL_VALID ||
-                                               vl->state == AFS_VL_NO_VOLUME);
-                if (ret < 0)
-                        goto error;
-                spin_lock(&vl->lock);
-        }
-        spin_unlock(&vl->lock);
-success:
-        _leave(" = %p", vl);
-        return vl;
-error_abandon:
-        spin_lock(&vl->lock);
-        vl->state = AFS_VL_NEW;
-        spin_unlock(&vl->lock);
-        wake_up(&vl->waitq);
-error:
-        ASSERT(vl != NULL);
-        afs_put_vlocation(net, vl);
-        _leave(" = %d", ret);
-        return ERR_PTR(ret);
-}
-/*
- * finish using a volume location record
- */
-void afs_put_vlocation(struct afs_net *net, struct afs_vlocation *vl)
-{
-        if (!vl)
-                return;
-        _enter("%s", vl->vldb.name);
-        ASSERTCMP(atomic_read(&vl->usage), >, 0);
-        if (likely(!atomic_dec_and_test(&vl->usage))) {
-                _leave("");
-                return;
-        }
-        spin_lock(&net->vl_graveyard_lock);
-        if (atomic_read(&vl->usage) == 0) {
-                _debug("buried");
-                list_move_tail(&vl->grave, &net->vl_graveyard);
-                vl->time_of_death = ktime_get_real_seconds();
-                queue_delayed_work(afs_wq, &net->vl_reaper,
-                                   afs_vlocation_timeout * HZ);
-                /* suspend updates on this record */
-                if (!list_empty(&vl->update)) {
-                        spin_lock(&net->vl_updates_lock);
-                        list_del_init(&vl->update);
-                        spin_unlock(&net->vl_updates_lock);
-                }
-        }
-        spin_unlock(&net->vl_graveyard_lock);
-        _leave(" [killed?]");
-}
-/*
- * destroy a dead volume location record
- */
-static void afs_vlocation_destroy(struct afs_net *net, struct afs_vlocation *vl)
-{
-        _enter("%p", vl);
-        afs_put_cell(net, vl->cell);
-        kfree(vl);
-}
-/*
- * reap dead volume location records
- */
-void afs_vlocation_reaper(struct work_struct *work)
-{
-        LIST_HEAD(corpses);
-        struct afs_vlocation *vl;
-        struct afs_net *net = container_of(work, struct afs_net, vl_reaper.work);
-        unsigned long delay, expiry;
-        time64_t now;
-        _enter("");
-        now = ktime_get_real_seconds();
-        spin_lock(&net->vl_graveyard_lock);
-        while (!list_empty(&net->vl_graveyard)) {
-                vl = list_entry(net->vl_graveyard.next,
-                                struct afs_vlocation, grave);
-                _debug("check %p", vl);
-                /* the queue is ordered most dead first */
-                if (net->live) {
-                        expiry = vl->time_of_death + afs_vlocation_timeout;
-                        if (expiry > now) {
-                                delay = (expiry - now) * HZ;
-                                _debug("delay %lu", delay);
-                                mod_delayed_work(afs_wq, &net->vl_reaper, delay);
-                                break;
-                        }
-                }
-                spin_lock(&vl->cell->vl_lock);
-                if (atomic_read(&vl->usage) > 0) {
-                        _debug("no reap");
-                        list_del_init(&vl->grave);
-                } else {
-                        _debug("reap");
-                        list_move_tail(&vl->grave, &corpses);
-                        list_del_init(&vl->link);
-                }
-                spin_unlock(&vl->cell->vl_lock);
-        }
-        spin_unlock(&net->vl_graveyard_lock);
-        /* now reap the corpses we've extracted */
-        while (!list_empty(&corpses)) {
-                vl = list_entry(corpses.next, struct afs_vlocation, grave);
-                list_del(&vl->grave);
-                afs_vlocation_destroy(net, vl);
-        }
-        _leave("");
-}
-/*
- * discard all the volume location records for rmmod
- */
-void __net_exit afs_vlocation_purge(struct afs_net *net)
-{
-        spin_lock(&net->vl_updates_lock);
-        list_del_init(&net->vl_updates);
-        spin_unlock(&net->vl_updates_lock);
-        mod_delayed_work(afs_vlocation_update_worker, &net->vl_updater, 0);
-        mod_delayed_work(afs_wq, &net->vl_reaper, 0);
-}
-/*
- * update a volume location
- */
-void afs_vlocation_updater(struct work_struct *work)
-{
-        struct afs_cache_vlocation vldb;
-        struct afs_vlocation *vl, *xvl;
-        struct afs_net *net = container_of(work, struct afs_net, vl_updater.work);
-        time64_t now;
-        long timeout;
-        int ret;
-        if (!net->live)
-                return;
-        _enter("");
-        now = ktime_get_real_seconds();
-        /* find a record to update */
-        spin_lock(&net->vl_updates_lock);
-        for (;;) {
-                if (list_empty(&net->vl_updates) || !net->live) {
-                        spin_unlock(&net->vl_updates_lock);
-                        _leave(" [nothing]");
-                        return;
-                }
-                vl = list_entry(net->vl_updates.next,
-                                struct afs_vlocation, update);
-                if (atomic_read(&vl->usage) > 0)
-                        break;
-                list_del_init(&vl->update);
-        }
-        timeout = vl->update_at - now;
-        if (timeout > 0) {
-                queue_delayed_work(afs_vlocation_update_worker,
-                                   &net->vl_updater, timeout * HZ);
-                spin_unlock(&net->vl_updates_lock);
-                _leave(" [nothing]");
-                return;
-        }
-        list_del_init(&vl->update);
-        atomic_inc(&vl->usage);
-        spin_unlock(&net->vl_updates_lock);
-        /* we can now perform the update */
-        _debug("update %s", vl->vldb.name);
-        vl->state = AFS_VL_UPDATING;
-        vl->upd_rej_cnt = 0;
-        vl->upd_busy_cnt = 0;
-        ret = afs_vlocation_update_record(vl, NULL, &vldb);
-        spin_lock(&vl->lock);
-        switch (ret) {
-        case 0:
-                afs_vlocation_apply_update(vl, &vldb);
-                vl->state = AFS_VL_VALID;
-                break;
-        case -ENOMEDIUM:
-                vl->state = AFS_VL_VOLUME_DELETED;
-                break;
-        default:
-                vl->state = AFS_VL_UNCERTAIN;
-                break;
-        }
-        spin_unlock(&vl->lock);
-        wake_up(&vl->waitq);
-        /* and then reschedule */
-        _debug("reschedule");
-        vl->update_at = ktime_get_real_seconds() +
-                        afs_vlocation_update_timeout;
-        spin_lock(&net->vl_updates_lock);
-        if (!list_empty(&net->vl_updates)) {
-                /* next update in 10 minutes, but wait at least 1 second more
-                 * than the newest record already queued so that we don't spam
-                 * the VL server suddenly with lots of requests
-                 */
-                xvl = list_entry(net->vl_updates.prev,
-                                 struct afs_vlocation, update);
-                if (vl->update_at <= xvl->update_at)
-                        vl->update_at = xvl->update_at + 1;
-                xvl = list_entry(net->vl_updates.next,
-                                 struct afs_vlocation, update);
-                timeout = xvl->update_at - now;
-                if (timeout < 0)
-                        timeout = 0;
-        } else {
-                timeout = afs_vlocation_update_timeout;
-        }
-        ASSERT(list_empty(&vl->update));
-        list_add_tail(&vl->update, &net->vl_updates);
-        _debug("timeout %ld", timeout);
-        queue_delayed_work(afs_vlocation_update_worker, &net->vl_updater, timeout * HZ);
-        spin_unlock(&net->vl_updates_lock);
-        afs_put_vlocation(net, vl);
-}
diff --git a/fs/afs/vnode.c b/fs/afs/vnode.c
deleted file mode 100644
index 9c7333eb01c2..000000000000
--- a/fs/afs/vnode.c
+++ /dev/null
@@ -1,750 +0,0 @@
-/* AFS vnode management
- *
- * Copyright (C) 2002, 2007 Red Hat, Inc. All Rights Reserved.
- * Written by David Howells (dhowells@redhat.com)
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version
- * 2 of the License, or (at your option) any later version.
- */
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/fs.h>
-#include <linux/sched.h>
-#include "internal.h"
-/*
- * Handle remote file deletion.
- */
-static void afs_vnode_deleted_remotely(struct afs_vnode *vnode)
-{
-        struct afs_cb_interest *cbi = vnode->cb_interest;
-        _enter("{%p}", cbi);
-        set_bit(AFS_VNODE_DELETED, &vnode->flags);
-        if (cbi) {
-                vnode->cb_interest = NULL;
-                afs_put_cb_interest(afs_v2net(vnode), cbi);
-        }
-        _leave("");
-}
-/*
- * finish off updating the recorded status of a file after a successful
- * operation completion
- * - starts callback expiry timer
- * - adds to server's callback list
- */
-void afs_vnode_finalise_status_update(struct afs_vnode *vnode,
-                                      struct afs_server *server)
-{
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        wake_up_all(&vnode->update_waitq);
-        _leave("");
-}
-/*
- * finish off updating the recorded status of a file after an operation failed
- */
-static void afs_vnode_status_update_failed(struct afs_fs_cursor *fc,
-                                           struct afs_vnode *vnode)
-{
-        _enter("{%x:%u},%d", vnode->fid.vid, vnode->fid.vnode, fc->ac.error);
-        spin_lock(&vnode->lock);
-        if (fc->ac.error == -ENOENT) {
-                /* the file was deleted on the server */
-                _debug("got NOENT from server - marking file deleted");
-                afs_vnode_deleted_remotely(vnode);
-        }
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        wake_up_all(&vnode->update_waitq);
-        _leave("");
-}
-/*
- * fetch file status from the volume
- * - don't issue a fetch if:
- *   - the changed bit is not set and there's a valid callback
- *   - there are any outstanding ops that will fetch the status
- * - TODO implement local caching
- */
-int afs_vnode_fetch_status(struct afs_vnode *vnode, struct key *key, bool force)
-{
-        struct afs_fs_cursor fc;
-        unsigned int cb_break = 0;
-        DECLARE_WAITQUEUE(myself, current);
-        _enter("%s,{%x:%u.%u,S=%lx},%u",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique,
-               vnode->flags,
-               force);
-        if (!force && test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
-                _leave(" [unchanged]");
-                return 0;
-        }
-        if (test_bit(AFS_VNODE_DELETED, &vnode->flags)) {
-                _leave(" [deleted]");
-                return -ENOENT;
-        }
-        cb_break = vnode->cb_break + vnode->cb_s_break;
-        spin_lock(&vnode->lock);
-        if (!force && test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
-                spin_unlock(&vnode->lock);
-                _leave(" [unchanged]");
-                return 0;
-        }
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        if (vnode->update_cnt > 0) {
-                /* someone else started a fetch */
-                _debug("wait on fetch %d", vnode->update_cnt);
-                set_current_state(TASK_UNINTERRUPTIBLE);
-                ASSERT(myself.func != NULL);
-                add_wait_queue(&vnode->update_waitq, &myself);
-                /* wait for the status to be updated */
-                for (;;) {
-                        if (test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags))
-                                break;
-                        if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
-                                break;
-                        /* check to see if it got updated and invalidated all
-                         * before we saw it */
-                        if (vnode->update_cnt == 0) {
-                                remove_wait_queue(&vnode->update_waitq,
-                                                  &myself);
-                                set_current_state(TASK_RUNNING);
-                                goto get_anyway;
-                        }
-                        spin_unlock(&vnode->lock);
-                        schedule();
-                        set_current_state(TASK_UNINTERRUPTIBLE);
-                        spin_lock(&vnode->lock);
-                }
-                remove_wait_queue(&vnode->update_waitq, &myself);
-                spin_unlock(&vnode->lock);
-                set_current_state(TASK_RUNNING);
-                return test_bit(AFS_VNODE_DELETED, &vnode->flags) ?
-                        -ENOENT : 0;
-        }
-get_anyway:
-        /* okay... we're going to have to initiate the op */
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        /* merge AFS status fetches and clear outstanding callback on this
-         * vnode */
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_fetch_file_status(&fc, key, vnode, NULL, false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0) {
-                _debug("adjust");
-                afs_cache_permit(vnode, key, cb_break);
-                afs_vnode_finalise_status_update(vnode, fc.server);
-        } else {
-                _debug("failed [%d]", fc.ac.error);
-                afs_vnode_status_update_failed(&fc, vnode);
-        }
-out:
-        afs_end_fs_cursor(&fc, afs_v2net(vnode));
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        _leave(" = %d [cnt %d]", fc.ac.error, vnode->update_cnt);
-        return fc.ac.error;
-no_server:
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        spin_unlock(&vnode->lock);
-        goto out;
-}
-/*
- * fetch file data from the volume
- * - TODO implement caching
- */
-int afs_vnode_fetch_data(struct afs_vnode *vnode, struct key *key,
-                         struct afs_read *desc)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%x,,,",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key));
-        /* this op will fetch the status */
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        /* merge in AFS status fetches and clear outstanding callback on this
-         * vnode */
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_fetch_data(&fc, key, vnode, desc, false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0)
-                afs_vnode_finalise_status_update(vnode, fc.server);
-        else
-                afs_vnode_status_update_failed(&fc, vnode);
-out:
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-no_server:
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        goto out;
-}
-/*
- * make a file or a directory
- */
-int afs_vnode_create(struct afs_vnode *vnode, struct key *key,
-                     const char *name, umode_t mode, struct afs_fid *newfid,
-                     struct afs_file_status *newstatus,
-                     struct afs_callback *newcb, struct afs_server **_server)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%x,%s,,",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key),
-               name);
-        /* this op will fetch the status on the directory we're creating in */
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_create(&fc, key, vnode, name, mode, newfid,
-                                            newstatus, newcb, false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0) {
-                afs_vnode_finalise_status_update(vnode, fc.server);
-                *_server = fc.server;
-                fc.server = NULL;
-        } else {
-                afs_vnode_status_update_failed(&fc, vnode);
-                *_server = NULL;
-        }
-out:
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-no_server:
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        goto out;
-}
-/*
- * remove a file or directory
- */
-int afs_vnode_remove(struct afs_vnode *vnode, struct key *key, const char *name,
-                     bool isdir)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%x,%s",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key),
-               name);
-        /* this op will fetch the status on the directory we're removing from */
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_remove(&fc, key, vnode, name, isdir, false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0)
-                afs_vnode_finalise_status_update(vnode, fc.server);
-        else
-                afs_vnode_status_update_failed(&fc, vnode);
-out:
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-no_server:
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        goto out;
-}
-/*
- * create a hard link
- */
-int afs_vnode_link(struct afs_vnode *dvnode, struct afs_vnode *vnode,
-                          struct key *key, const char *name)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%s{%x:%u.%u},%x,%s",
-               dvnode->volume->vlocation->vldb.name,
-               dvnode->fid.vid,
-               dvnode->fid.vnode,
-               dvnode->fid.unique,
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key),
-               name);
-        /* this op will fetch the status on the directory we're removing from */
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        spin_lock(&dvnode->lock);
-        dvnode->update_cnt++;
-        spin_unlock(&dvnode->lock);
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, dvnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_link(&fc, key, dvnode, vnode, name, false);
-        } while (afs_iterate_fs_cursor(&fc, dvnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0) {
-                afs_vnode_finalise_status_update(vnode, fc.server);
-                afs_vnode_finalise_status_update(dvnode, fc.server);
-        } else {
-                afs_vnode_status_update_failed(&fc, vnode);
-                afs_vnode_status_update_failed(&fc, dvnode);
-        }
-out:
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-no_server:
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        spin_lock(&dvnode->lock);
-        dvnode->update_cnt--;
-        ASSERTCMP(dvnode->update_cnt, >=, 0);
-        spin_unlock(&dvnode->lock);
-        goto out;
-}
-/*
- * create a symbolic link
- */
-int afs_vnode_symlink(struct afs_vnode *vnode, struct key *key,
-                      const char *name, const char *content,
-                      struct afs_fid *newfid,
-                      struct afs_file_status *newstatus,
-                      struct afs_server **_server)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%x,%s,%s,,,",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key),
-               name, content);
-        /* this op will fetch the status on the directory we're creating in */
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_symlink(&fc, key, vnode, name, content,
-                                             newfid, newstatus, false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0) {
-                afs_vnode_finalise_status_update(vnode, fc.server);
-                *_server = fc.server;
-                fc.server = NULL;
-        } else {
-                afs_vnode_status_update_failed(&fc, vnode);
-                *_server = NULL;
-        }
-out:
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-no_server:
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        *_server = NULL;
-        goto out;
-}
-/*
- * rename a file
- */
-int afs_vnode_rename(struct afs_vnode *orig_dvnode,
-                     struct afs_vnode *new_dvnode,
-                     struct key *key,
-                     const char *orig_name,
-                     const char *new_name)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%s{%u,%u,%u},%x,%s,%s",
-               orig_dvnode->volume->vlocation->vldb.name,
-               orig_dvnode->fid.vid,
-               orig_dvnode->fid.vnode,
-               orig_dvnode->fid.unique,
-               new_dvnode->volume->vlocation->vldb.name,
-               new_dvnode->fid.vid,
-               new_dvnode->fid.vnode,
-               new_dvnode->fid.unique,
-               key_serial(key),
-               orig_name,
-               new_name);
-        /* this op will fetch the status on both the directories we're dealing
-         * with */
-        spin_lock(&orig_dvnode->lock);
-        orig_dvnode->update_cnt++;
-        spin_unlock(&orig_dvnode->lock);
-        if (new_dvnode != orig_dvnode) {
-                spin_lock(&new_dvnode->lock);
-                new_dvnode->update_cnt++;
-                spin_unlock(&new_dvnode->lock);
-        }
-        afs_init_fs_cursor(&fc, orig_dvnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, orig_dvnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_rename(&fc, key, orig_dvnode, orig_name,
-                                            new_dvnode, new_name, false);
-        } while (afs_iterate_fs_cursor(&fc, orig_dvnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0) {
-                afs_vnode_finalise_status_update(orig_dvnode, fc.server);
-                if (new_dvnode != orig_dvnode)
-                        afs_vnode_finalise_status_update(new_dvnode, fc.server);
-        } else {
-                afs_vnode_status_update_failed(&fc, orig_dvnode);
-                if (new_dvnode != orig_dvnode)
-                        afs_vnode_status_update_failed(&fc, new_dvnode);
-        }
-out:
-        return afs_end_fs_cursor(&fc, afs_v2net(orig_dvnode));
-no_server:
-        spin_lock(&orig_dvnode->lock);
-        orig_dvnode->update_cnt--;
-        ASSERTCMP(orig_dvnode->update_cnt, >=, 0);
-        spin_unlock(&orig_dvnode->lock);
-        if (new_dvnode != orig_dvnode) {
-                spin_lock(&new_dvnode->lock);
-                new_dvnode->update_cnt--;
-                ASSERTCMP(new_dvnode->update_cnt, >=, 0);
-                spin_unlock(&new_dvnode->lock);
-        }
-        goto out;
-}
-/*
- * write to a file
- */
-int afs_vnode_store_data(struct afs_writeback *wb, pgoff_t first, pgoff_t last,
-                         unsigned offset, unsigned to)
-{
-        struct afs_fs_cursor fc;
-        struct afs_vnode *vnode = wb->vnode;
-        _enter("%s{%x:%u.%u},%x,%lx,%lx,%x,%x",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(wb->key),
-               first, last, offset, to);
-        /* this op will fetch the status */
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_store_data(&fc, wb, first, last, offset, to,
-                                                false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0) {
-                afs_vnode_finalise_status_update(vnode, fc.server);
-        } else {
-                afs_vnode_status_update_failed(&fc, vnode);
-        }
-out:
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-no_server:
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        goto out;
-}
-/*
- * set the attributes on a file
- */
-int afs_vnode_setattr(struct afs_vnode *vnode, struct key *key,
-                      struct iattr *attr)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%x",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key));
-        /* this op will fetch the status */
-        spin_lock(&vnode->lock);
-        vnode->update_cnt++;
-        spin_unlock(&vnode->lock);
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        goto no_server;
-                fc.ac.error = afs_fs_setattr(&fc, key, vnode, attr, false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        /* adjust the flags */
-        if (fc.ac.error == 0) {
-                afs_vnode_finalise_status_update(vnode, fc.server);
-        } else {
-                afs_vnode_status_update_failed(&fc, vnode);
-        }
-out:
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-no_server:
-        spin_lock(&vnode->lock);
-        vnode->update_cnt--;
-        ASSERTCMP(vnode->update_cnt, >=, 0);
-        spin_unlock(&vnode->lock);
-        goto out;
-}
-/*
- * get the status of a volume
- */
-int afs_vnode_get_volume_status(struct afs_vnode *vnode, struct key *key,
-                                struct afs_volume_status *vs)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%x,",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key));
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        break;
-                fc.ac.error = afs_fs_get_volume_status(&fc, key, vnode, vs, false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-}
-/*
- * get a lock on a file
- */
-int afs_vnode_set_lock(struct afs_vnode *vnode, struct key *key,
-                       afs_lock_type_t type)
-{
-        struct afs_fs_cursor fc;
-        _enter("%s{%x:%u.%u},%x,%u",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key), type);
-        afs_init_fs_cursor(&fc, vnode);
-        do {
-                /* pick a server to query */
-                if (!afs_volume_pick_fileserver(&fc, vnode))
-                        break;
-                fc.ac.error = afs_fs_set_lock(&fc, key, vnode, type, false);
-        } while (afs_iterate_fs_cursor(&fc, vnode));
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-}
-/*
- * extend a lock on a file
- */
-int afs_vnode_extend_lock(struct afs_vnode *vnode, struct key *key)
-{
-        struct afs_fs_cursor fc;
-        int ret;
-        _enter("%s{%x:%u.%u},%x",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key));
-        ret = afs_set_fs_cursor(&fc, vnode);
-        if (ret < 0)
-                return ret;
-        fc.ac.error = afs_fs_extend_lock(&fc, key, vnode, false);
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-}
-/*
- * release a lock on a file
- */
-int afs_vnode_release_lock(struct afs_vnode *vnode, struct key *key)
-{
-        struct afs_fs_cursor fc;
-        int ret;
-        _enter("%s{%x:%u.%u},%x",
-               vnode->volume->vlocation->vldb.name,
-               vnode->fid.vid,
-               vnode->fid.vnode,
-               vnode->fid.unique,
-               key_serial(key));
-        ret = afs_set_fs_cursor(&fc, vnode);
-        if (ret < 0)
-                return ret;
-        fc.ac.error = afs_fs_release_lock(&fc, key, vnode, false);
-        return afs_end_fs_cursor(&fc, afs_v2net(vnode));
-}
diff --git a/fs/afs/volume.c b/fs/afs/volume.c
index 3c5ad1cc50f3..2295dd4f9b15 100644
--- a/fs/afs/volume.c
+++ b/fs/afs/volume.c
@@ -13,11 +13,148 @@
 #include <linux/slab.h>
 #include "internal.h"
-static const char *afs_voltypes[] = { "R/W", "R/O", "BAK" };
+unsigned __read_mostly afs_volume_gc_delay = 10;
+unsigned __read_mostly afs_volume_record_life = 60 * 60;
+static const char *const afs_voltypes[] = { "R/W", "R/O", "BAK" };
 /*
- * lookup a volume by name
+ * Allocate a volume record and load it up from a vldb record.
- * - this can be one of the following:
+ */
+static struct afs_volume *afs_alloc_volume(struct afs_mount_params *params,
+                                           struct afs_vldb_entry *vldb,
+                                           unsigned long type_mask)
+{
+        struct afs_server_list *slist;
+        struct afs_server *server;
+        struct afs_volume *volume;
+        int ret = -ENOMEM, nr_servers = 0, i, j;
+        for (i = 0; i < vldb->nr_servers; i++)
+                if (vldb->fs_mask[i] & type_mask)
+                        nr_servers++;
+        volume = kzalloc(sizeof(struct afs_volume), GFP_KERNEL);
+        if (!volume)
+                goto error_0;
+        volume->vid             = vldb->vid[params->type];
+        volume->update_at       = ktime_get_real_seconds() + afs_volume_record_life;
+        volume->cell            = afs_get_cell(params->cell);
+        volume->type            = params->type;
+        volume->type_force      = params->force;
+        volume->name_len        = vldb->name_len;
+        atomic_set(&volume->usage, 1);
+        INIT_LIST_HEAD(&volume->proc_link);
+        rwlock_init(&volume->servers_lock);
+        memcpy(volume->name, vldb->name, vldb->name_len + 1);
+        slist = afs_alloc_server_list(params->cell, params->key, vldb, type_mask);
+        if (IS_ERR(slist)) {
+                ret = PTR_ERR(slist);
+                goto error_1;
+        }
+        refcount_set(&slist->usage, 1);
+        volume->servers = slist;
+        /* Make sure a record exists for each server this volume occupies. */
+        for (i = 0; i < vldb->nr_servers; i++) {
+                if (!(vldb->fs_mask[i] & type_mask))
+                        continue;
+                server = afs_lookup_server(params->cell, params->key,
+                                           &vldb->fs_server[i]);
+                if (IS_ERR(server)) {
+                        ret = PTR_ERR(server);
+                        if (ret == -ENOENT)
+                                continue;
+                        goto error_2;
+                }
+                /* Insertion-sort by server pointer */
+                for (j = 0; j < slist->nr_servers; j++)
+                        if (slist->servers[j].server >= server)
+                                break;
+                if (j < slist->nr_servers) {
+                        if (slist->servers[j].server == server) {
+                                afs_put_server(params->net, server);
+                                continue;
+                        }
+                        memmove(slist->servers + j + 1,
+                                slist->servers + j,
+                                (slist->nr_servers - j) * sizeof(struct afs_server_entry));
+                }
+                slist->servers[j].server = server;
+                slist->nr_servers++;
+        }
+        if (slist->nr_servers == 0) {
+                ret = -EDESTADDRREQ;
+                goto error_2;
+        }
+        return volume;
+error_2:
+        afs_put_serverlist(params->net, slist);
+error_1:
+        kfree(volume);
+error_0:
+        return ERR_PTR(ret);
+}
+/*
+ * Look up a VLDB record for a volume.
+ */
+static struct afs_vldb_entry *afs_vl_lookup_vldb(struct afs_cell *cell,
+                                                 struct key *key,
+                                                 const char *volname,
+                                                 size_t volnamesz)
+{
+        struct afs_addr_cursor ac;
+        struct afs_vldb_entry *vldb;
+        int ret;
+        ret = afs_set_vl_cursor(&ac, cell);
+        if (ret < 0)
+                return ERR_PTR(ret);
+        while (afs_iterate_addresses(&ac)) {
+                vldb = afs_vl_get_entry_by_name_u(cell->net, &ac, key,
+                                                  volname, volnamesz);
+                switch (ac.error) {
+                case 0:
+                        afs_end_cursor(&ac);
+                        return vldb;
+                case -ECONNABORTED:
+                        ac.error = afs_abort_to_error(ac.abort_code);
+                        goto error;
+                case -ENOMEM:
+                case -ENONET:
+                        goto error;
+                case -ENETUNREACH:
+                case -EHOSTUNREACH:
+                case -ECONNREFUSED:
+                        break;
+                default:
+                        ac.error = -EIO;
+                        goto error;
+                }
+        }
+error:
+        return ERR_PTR(afs_end_cursor(&ac));
+}
+/*
+ * Look up a volume in the VL server and create a candidate volume record for
+ * it.
+ *
+ * The volume name can be one of the following:
 *      "%[cell:]volume[.]"             R/W volume
 *      "#[cell:]volume[.]"             R/O or R/W volume (rwparent=0),
 *                                       or R/W (rwparent=1) volume
@@ -37,169 +174,218 @@ static const char *afs_voltypes[] = { "R/W", "R/O", "BAK" };
 * - Rule 3: If parent volume is R/W, then only mount R/W volume unless
 *           explicitly told otherwise
 */
-struct afs_volume *afs_volume_lookup(struct afs_mount_params *params)
+struct afs_volume *afs_create_volume(struct afs_mount_params *params)
 {
-        struct afs_vlocation *vlocation = NULL;
+        struct afs_vldb_entry *vldb;
-        struct afs_volume *volume = NULL;
+        struct afs_volume *volume;
-        struct afs_server *server = NULL;
+        unsigned long type_mask = 1UL << params->type;
-        char srvtmask;
-        int ret, loop;
-        _enter("{%*.*s,%d}",
-               params->volnamesz, params->volnamesz, params->volname, params->rwpath);
-        /* lookup the volume location record */
-        vlocation = afs_vlocation_lookup(params->net, params->cell, params->key,
-                                         params->volname, params->volnamesz);
-        if (IS_ERR(vlocation)) {
-                ret = PTR_ERR(vlocation);
-                vlocation = NULL;
-                goto error;
-        }
-        /* make the final decision on the type we want */
+        vldb = afs_vl_lookup_vldb(params->cell, params->key,
-        ret = -ENOMEDIUM;
+                                  params->volname, params->volnamesz);
-        if (params->force && !(vlocation->vldb.vidmask & (1 << params->type)))
+        if (IS_ERR(vldb))
-                goto error;
+                return ERR_CAST(vldb);
-        srvtmask = 0;
+        if (test_bit(AFS_VLDB_QUERY_ERROR, &vldb->flags)) {
-        for (loop = 0; loop < vlocation->vldb.nservers; loop++)
+                volume = ERR_PTR(vldb->error);
-                srvtmask |= vlocation->vldb.srvtmask[loop];
+                goto error;
+        }
+        /* Make the final decision on the type we want */
+        volume = ERR_PTR(-ENOMEDIUM);
        if (params->force) {
-                if (!(srvtmask & (1 << params->type)))
+                if (!(vldb->flags & type_mask))
                        goto error;
-        } else if (srvtmask & AFS_VOL_VTM_RO) {
+        } else if (test_bit(AFS_VLDB_HAS_RO, &vldb->flags)) {
                params->type = AFSVL_ROVOL;
-        } else if (srvtmask & AFS_VOL_VTM_RW) {
+        } else if (test_bit(AFS_VLDB_HAS_RW, &vldb->flags)) {
                params->type = AFSVL_RWVOL;
        } else {
                goto error;
        }
-        down_write(&params->cell->vl_sem);
+        type_mask = 1UL << params->type;
+        volume = afs_alloc_volume(params, vldb, type_mask);
-        /* is the volume already active? */
+error:
-        if (vlocation->vols[params->type]) {
+        kfree(vldb);
-                /* yes - re-use it */
+        return volume;
-                volume = vlocation->vols[params->type];
+}
-                afs_get_volume(volume);
-                goto success;
-        }
-        /* create a new volume record */
+/*
-        _debug("creating new volume record");
+ * Destroy a volume record
+ */
+static void afs_destroy_volume(struct afs_net *net, struct afs_volume *volume)
+{
+        _enter("%p", volume);
-        ret = -ENOMEM;
+#ifdef CONFIG_AFS_FSCACHE
-        volume = kzalloc(sizeof(struct afs_volume), GFP_KERNEL);
+        ASSERTCMP(volume->cache, ==, NULL);
-        if (!volume)
+#endif
-                goto error_up;
-        atomic_set(&volume->usage, 1);
+        afs_put_serverlist(net, volume->servers);
-        volume->type            = params->type;
+        afs_put_cell(net, volume->cell);
-        volume->type_force      = params->force;
+        kfree(volume);
-        volume->cell            = params->cell;
-        volume->vid             = vlocation->vldb.vid[params->type];
-        init_rwsem(&volume->server_sem);
-        /* look up all the applicable server records */
-        for (loop = 0; loop < 8; loop++) {
-                if (vlocation->vldb.srvtmask[loop] & (1 << volume->type)) {
-                        server = afs_lookup_server(
-                               volume->cell, &vlocation->vldb.servers[loop]);
-                        if (IS_ERR(server)) {
-                                ret = PTR_ERR(server);
-                                goto error_discard;
-                        }
-                        volume->servers[volume->nservers] = server;
+        _leave(" [destroyed]");
-                        volume->nservers++;
+}
-                }
+/*
+ * Drop a reference on a volume record.
+ */
+void afs_put_volume(struct afs_cell *cell, struct afs_volume *volume)
+{
+        if (volume) {
+                _enter("%s", volume->name);
+                if (atomic_dec_and_test(&volume->usage))
+                        afs_destroy_volume(cell->net, volume);
        }
+}
-        /* attach the cache and volume location */
+/*
+ * Activate a volume.
+ */
+void afs_activate_volume(struct afs_volume *volume)
+{
 #ifdef CONFIG_AFS_FSCACHE
        volume->cache = fscache_acquire_cookie(volume->cell->cache,
                                               &afs_volume_cache_index_def,
                                               volume, true);
 #endif
-        afs_get_vlocation(vlocation);
-        volume->vlocation = vlocation;
-        vlocation->vols[volume->type] = volume;
-success:
+        write_lock(&volume->cell->proc_lock);
-        _debug("kAFS selected %s volume %08x",
+        list_add_tail(&volume->proc_link, &volume->cell->proc_volumes);
-               afs_voltypes[volume->type], volume->vid);
+        write_unlock(&volume->cell->proc_lock);
-        up_write(&params->cell->vl_sem);
+}
-        afs_put_vlocation(params->net, vlocation);
-        _leave(" = %p", volume);
-        return volume;
-        /* clean up */
+/*
-error_up:
+ * Deactivate a volume.
-        up_write(&params->cell->vl_sem);
+ */
-error:
+void afs_deactivate_volume(struct afs_volume *volume)
-        afs_put_vlocation(params->net, vlocation);
+{
-        _leave(" = %d", ret);
+        _enter("%s", volume->name);
-        return ERR_PTR(ret);
-error_discard:
+        write_lock(&volume->cell->proc_lock);
-        up_write(&params->cell->vl_sem);
+        list_del_init(&volume->proc_link);
+        write_unlock(&volume->cell->proc_lock);
-        for (loop = volume->nservers - 1; loop >= 0; loop--) {
+#ifdef CONFIG_AFS_FSCACHE
-                afs_put_cb_interest(params->net, volume->cb_interests[loop]);
+        fscache_relinquish_cookie(volume->cache,
-                afs_put_server(params->net, volume->servers[loop]);
+                                  test_bit(AFS_VOLUME_DELETED, &volume->flags));
-        }
+        volume->cache = NULL;
+#endif
-        kfree(volume);
+        _leave("");
-        goto error;
 }
 /*
- * destroy a volume record
+ * Query the VL service to update the volume status.
 */
-void afs_put_volume(struct afs_cell *cell, struct afs_volume *volume)
+static int afs_update_volume_status(struct afs_volume *volume, struct key *key)
 {
-        struct afs_vlocation *vlocation;
+        struct afs_server_list *new, *old, *discard;
-        int loop;
+        struct afs_vldb_entry *vldb;
+        char idbuf[16];
+        int ret, idsz;
-        if (!volume)
+        _enter("");
-                return;
-        _enter("%p", volume);
+        /* We look up an ID by passing it as a decimal string in the
+         * operation's name parameter.
+         */
+        idsz = sprintf(idbuf, "%u", volume->vid);
-        ASSERTCMP(atomic_read(&volume->usage), >, 0);
+        vldb = afs_vl_lookup_vldb(volume->cell, key, idbuf, idsz);
+        if (IS_ERR(vldb)) {
+                ret = PTR_ERR(vldb);
+                goto error;
+        }
-        vlocation = volume->vlocation;
+        /* See if the volume got renamed. */
+        if (vldb->name_len != volume->name_len ||
+            memcmp(vldb->name, volume->name, vldb->name_len) != 0) {
+                /* TODO: Use RCU'd string. */
+                memcpy(volume->name, vldb->name, AFS_MAXVOLNAME);
+                volume->name_len = vldb->name_len;
+        }
+        /* See if the volume's server list got updated. */
+        new = afs_alloc_server_list(volume->cell, key,
+                                      vldb, (1 << volume->type));
+        if (IS_ERR(new)) {
+                ret = PTR_ERR(new);
+                goto error_vldb;
+        }
-        /* to prevent a race, the decrement and the dequeue must be effectively
+        write_lock(&volume->servers_lock);
-         * atomic */
-        down_write(&cell->vl_sem);
-        if (likely(!atomic_dec_and_test(&volume->usage))) {
+        discard = new;
-                up_write(&vlocation->cell->vl_sem);
+        old = volume->servers;
-                _leave("");
+        if (afs_annotate_server_list(new, old)) {
-                return;
+                new->seq = volume->servers_seq + 1;
+                volume->servers = new;
+                smp_wmb();
+                volume->servers_seq++;
+                discard = old;
        }
-        vlocation->vols[volume->type] = NULL;
+        volume->update_at = ktime_get_real_seconds() + afs_volume_record_life;
+        clear_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags);
+        write_unlock(&volume->servers_lock);
+        ret = 0;
-        up_write(&cell->vl_sem);
+        afs_put_serverlist(volume->cell->net, discard);
+error_vldb:
+        kfree(vldb);
+error:
+        _leave(" = %d", ret);
+        return ret;
+}
-        /* finish cleaning up the volume */
+/*
-#ifdef CONFIG_AFS_FSCACHE
+ * Make sure the volume record is up to date.
-        fscache_relinquish_cookie(volume->cache, 0);
+ */
-#endif
+int afs_check_volume_status(struct afs_volume *volume, struct key *key)
-        afs_put_vlocation(cell->net, vlocation);
+{
+        time64_t now = ktime_get_real_seconds();
+        int ret, retries = 0;
-        for (loop = volume->nservers - 1; loop >= 0; loop--) {
+        _enter("");
-                afs_put_cb_interest(cell->net, volume->cb_interests[loop]);
-                afs_put_server(cell->net, volume->servers[loop]);
+        if (volume->update_at <= now)
+                set_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags);
+retry:
+        if (!test_bit(AFS_VOLUME_NEEDS_UPDATE, &volume->flags) &&
+            !test_bit(AFS_VOLUME_WAIT, &volume->flags)) {
+                _leave(" = 0");
+                return 0;
        }
-        kfree(volume);
+        if (!test_and_set_bit_lock(AFS_VOLUME_UPDATING, &volume->flags)) {
+                ret = afs_update_volume_status(volume, key);
+                clear_bit_unlock(AFS_VOLUME_WAIT, &volume->flags);
+                clear_bit_unlock(AFS_VOLUME_UPDATING, &volume->flags);
+                wake_up_bit(&volume->flags, AFS_VOLUME_WAIT);
+                _leave(" = %d", ret);
+                return ret;
+        }
-        _leave(" [destroyed]");
+        if (!test_bit(AFS_VOLUME_WAIT, &volume->flags)) {
+                _leave(" = 0 [no wait]");
+                return 0;
+        }
+        ret = wait_on_bit(&volume->flags, AFS_VOLUME_WAIT, TASK_INTERRUPTIBLE);
+        if (ret == -ERESTARTSYS) {
+                _leave(" = %d", ret);
+                return ret;
+        }
+        retries++;
+        if (retries == 4) {
+                _leave(" = -ESTALE");
+                return -ESTALE;
+        }
+        goto retry;
 }
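Note on the server-list replacement in afs_update_volume_status() above: the freshly built list is installed only when afs_annotate_server_list() reports a change, the sequence counter is bumped after the pointer is published, and whichever list lost out is the one that gets put. The following is a minimal single-threaded userspace sketch of that pattern, not the kernel code. `struct srv_list`, `struct vol`, `lists_differ()` and `publish_list()` are hypothetical names introduced for illustration, and the write lock and `smp_wmb()` of the real function are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SERVERS 8

/* Hypothetical stand-in for a server list: just an array of server IDs. */
struct srv_list {
        unsigned int seq;       /* sequence number this list was published at */
        size_t nr;              /* number of valid entries in ids[] */
        int ids[MAX_SERVERS];
};

/* Hypothetical stand-in for the volume record. */
struct vol {
        struct srv_list *servers;
        unsigned int servers_seq;
};

/* Assumed comparison: lists differ if their length or any entry differs. */
static bool lists_differ(const struct srv_list *a, const struct srv_list *b)
{
        if (a->nr != b->nr)
                return true;
        for (size_t i = 0; i < a->nr; i++)
                if (a->ids[i] != b->ids[i])
                        return true;
        return false;
}

/*
 * Install the candidate list if it differs from the current one.
 * Returns the list the caller should release: the old list when a swap
 * happened, otherwise the unused candidate.
 */
static struct srv_list *publish_list(struct vol *v, struct srv_list *candidate)
{
        struct srv_list *discard = candidate;
        struct srv_list *old = v->servers;

        if (lists_differ(candidate, old)) {
                candidate->seq = v->servers_seq + 1;
                v->servers = candidate;         /* publish the new list first */
                v->servers_seq++;               /* then advance the counter */
                discard = old;
        }
        return discard;
}
```

Returning the list to discard keeps the release outside the critical section, which mirrors how the function above calls afs_put_serverlist() only after dropping servers_lock.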
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 106e43db1115..1377a40ecdbb 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -103,7 +103,7 @@ static int afs_fill_page(struct afs_vnode *vnode, struct key *key,
        req->pages[0] = page;
        get_page(page);
-        ret = afs_vnode_fetch_data(vnode, key, req);
+        ret = afs_fetch_data(vnode, key, req);
        afs_put_read(req);
        if (ret < 0) {
                if (ret == -ENOENT) {
@@ -338,6 +338,40 @@ static void afs_kill_pages(struct afs_vnode *vnode, bool error,
 }
 /*
+ * write to a file
+ */
+static int afs_store_data(struct afs_writeback *wb, pgoff_t first, pgoff_t last,
+                          unsigned offset, unsigned to)
+{
+        struct afs_fs_cursor fc;
+        struct afs_vnode *vnode = wb->vnode;
+        int ret;
+        _enter("%s{%x:%u.%u},%x,%lx,%lx,%x,%x",
+               vnode->volume->name,
+               vnode->fid.vid,
+               vnode->fid.vnode,
+               vnode->fid.unique,
+               key_serial(wb->key),
+               first, last, offset, to);
+        ret = -ERESTARTSYS;
+        if (afs_begin_vnode_operation(&fc, vnode, wb->key)) {
+                while (afs_select_fileserver(&fc)) {
+                        fc.cb_break = vnode->cb_break + vnode->cb_s_break;
+                        afs_fs_store_data(&fc, wb, first, last, offset, to);
+                }
+                afs_check_for_remote_deletion(&fc, fc.vnode);
+                afs_vnode_commit_status(&fc, vnode, fc.cb_break);
+                ret = afs_end_vnode_operation(&fc);
+        }
+        _leave(" = %d", ret);
+        return ret;
+}
+/*
 * synchronously write back the locked page and any subsequent non-locked dirty
 * pages also covered by the same writeback record
 */
@@ -420,7 +454,7 @@ no_more:
        _debug("write back %lx[%u..] to %lx[..%u]", first, offset, last, to);
-        ret = afs_vnode_store_data(wb, first, last, offset, to);
+        ret = afs_store_data(wb, first, last, offset, to);
        if (ret < 0) {
                switch (ret) {
                case -EDQUOT:
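Note on the afs_store_data() hunk above: it shows the shape a rotated fileserver operation takes in this patch, namely begin the operation, loop while the selector yields a server, issue the call, then collect the result from the cursor. The following is a rough userspace model of that loop only. `struct cursor`, `select_server()`, `try_store()` and `store_with_rotation()` are made-up names, the per-server outcomes are simulated from an array, and the failover policy (move on only when a server is unreachable, stop on any other result) is an assumption for illustration rather than the actual logic of afs_select_fileserver().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cursor tracking progress through a list of servers. */
struct cursor {
        const int *results;     /* simulated outcome of a call to each server */
        size_t nr_servers;
        size_t index;           /* next server to try */
        int error;              /* outcome of the most recent attempt */
        bool started;           /* true once at least one attempt was made */
};

/* Decide whether there is another server worth trying. */
static bool select_server(struct cursor *c)
{
        if (c->started &&
            c->error != -ENETUNREACH &&
            c->error != -EHOSTUNREACH &&
            c->error != -ECONNREFUSED)
                return false;   /* success or a non-retryable error: stop */
        if (c->index >= c->nr_servers)
                return false;   /* ran out of servers */
        c->started = true;
        return true;
}

/* Simulate issuing the call to the currently selected server. */
static void try_store(struct cursor *c)
{
        c->error = c->results[c->index++];
}

/* Run the rotation loop and return the final result. */
static int store_with_rotation(const int *results, size_t nr)
{
        /* -EDESTADDRREQ is an arbitrary choice here for "no server tried". */
        struct cursor c = {
                .results = results,
                .nr_servers = nr,
                .error = -EDESTADDRREQ,
        };

        while (select_server(&c))
                try_store(&c);
        return c.error;
}
```

Keeping all retry state in the cursor means the caller's loop body stays a single call, which is the property the afs_store_data() loop relies on.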
diff --git a/fs/afs/xattr.c b/fs/afs/xattr.c
index e58e00ee9747..cfcc674e64a5 100644
--- a/fs/afs/xattr.c
+++ b/fs/afs/xattr.c
@@ -96,7 +96,7 @@ static int afs_xattr_get_volume(const struct xattr_handler *handler,
                              void *buffer, size_t size)
 {
        struct afs_vnode *vnode = AFS_FS_I(inode);
-        const char *volname = vnode->volume->vlocation->vldb.name;
+        const char *volname = vnode->volume->name;
        size_t namelen;
        namelen = strlen(volname);
author	David Howells <dhowells@redhat.com>	2017-11-02 11:27:50 -0400
committer	David Howells <dhowells@redhat.com>	2017-11-13 10:38:19 -0500
commit	d2ddc776a4581d900fc3bdc7803b403daae64d88 (patch)
tree	6c5f6bac9e9e9a326ed0a9248914b53c421558e5
parent	9cc6fc50f7bc69ac28bee45eed13cbc65a86210f (diff)
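Note on the volume type decision in afs_create_volume() in the fs/afs/volume.c diff above: a forced type must be present in the VLDB entry, otherwise R/O is preferred over R/W, and -ENOMEDIUM is returned when nothing suitable exists. The following is a small userspace sketch of that decision under stated assumptions. `pick_volume_type()` and `enum vol_type` are hypothetical, and the bit layout (bit N set means a volume of type N is available, with R/W as 0, R/O as 1 and backup as 2) is assumed for the example rather than taken from the kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef ENOMEDIUM
#define ENOMEDIUM 123   /* Linux value, defined here only for portability */
#endif

/* Assumed numbering of volume types for this sketch. */
enum vol_type { VOL_RW = 0, VOL_RO = 1, VOL_BAK = 2 };

/*
 * Choose the volume type to mount.
 * avail:     bitmask of available types (bit N set => type N exists)
 * requested: the type asked for by the mount parameters
 * force:     true if the caller insists on exactly the requested type
 * Returns the chosen type, or -ENOMEDIUM if no acceptable type exists.
 */
static int pick_volume_type(unsigned long avail, enum vol_type requested,
                            bool force)
{
        if (force)
                return (avail & (1UL << requested)) ? (int)requested
                                                    : -ENOMEDIUM;
        if (avail & (1UL << VOL_RO))
                return VOL_RO;  /* prefer the read-only replica */
        if (avail & (1UL << VOL_RW))
                return VOL_RW;  /* fall back to the read-write volume */
        return -ENOMEDIUM;
}
```

For instance, with both R/W and R/O available and no forcing, the sketch selects R/O, and a forced R/O request against an R/W-only entry fails with -ENOMEDIUM.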