aboutsummaryrefslogtreecommitdiffstats
path: root/fs/ceph/caps.c
diff options
context:
space:
mode:
authorYan, Zheng <zheng.z.yan@intel.com>2013-08-13 00:42:15 -0400
committerSage Weil <sage@inktank.com>2013-08-15 14:12:06 -0400
commitb0d7c2231015b331b942746610a05b6ea72977ab (patch)
treea13e3f015fc3144371550b4f5c25363bd66bdf1f /fs/ceph/caps.c
parentb150f5c1c759d551da9146435d3dc9df5f7e15ef (diff)
ceph: introduce i_truncate_mutex
I encountered below deadlock when running fsstress wmtruncate work truncate MDS --------------- ------------------ -------------------------- lock i_mutex <- truncate file lock i_mutex (blocked) <- revoking Fcb (filelock to MIX) send request -> handle request (xlock filelock) At the initial time, there are some dirty pages in the page cache. When the kclient receives the truncate message, it reduces inode size and creates some 'out of i_size' dirty pages. wmtruncate work can't truncate these dirty pages because it's blocked by the i_mutex. Later when the kclient receives the cap message that revokes Fcb caps, It can't flush all dirty pages because writepages() only flushes dirty pages within the inode size. When the MDS handles the 'truncate' request from kclient, it waits for the filelock to become stable. But the filelock is stuck in unstable state because it can't finish revoking kclient's Fcb caps. The truncate pagecache locking has already caused lots of trouble for use. I think it's time simplify it by introducing a new mutex. We use the new mutex to prevent concurrent truncate_inode_pages(). There is no need to worry about race between buffered write and truncate_inode_pages(), because our "get caps" mechanism prevents them from concurrent execution. Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Diffstat (limited to 'fs/ceph/caps.c')
-rw-r--r--fs/ceph/caps.c4
1 files changed, 0 insertions, 4 deletions
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 430121a795bd..0e94d27fa284 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2072,11 +2072,7 @@ static int try_get_cap_refs(struct ceph_inode_info *ci, int need, int want,
2072 /* finish pending truncate */ 2072 /* finish pending truncate */
2073 while (ci->i_truncate_pending) { 2073 while (ci->i_truncate_pending) {
2074 spin_unlock(&ci->i_ceph_lock); 2074 spin_unlock(&ci->i_ceph_lock);
2075 if (!(need & CEPH_CAP_FILE_WR))
2076 mutex_lock(&inode->i_mutex);
2077 __ceph_do_pending_vmtruncate(inode); 2075 __ceph_do_pending_vmtruncate(inode);
2078 if (!(need & CEPH_CAP_FILE_WR))
2079 mutex_unlock(&inode->i_mutex);
2080 spin_lock(&ci->i_ceph_lock); 2076 spin_lock(&ci->i_ceph_lock);
2081 } 2077 }
2082 2078