author | David Teigland <teigland@redhat.com> | 2012-04-26 16:54:29 -0400 |
---|---|---|
committer | David Teigland <teigland@redhat.com> | 2012-05-02 15:15:27 -0400 |
commit | 4875647a08e35f77274838d97ca8fa44158d50e2 (patch) | |
tree | bf8a39eaf3219af5d661ed3e347545306fd84bda /fs/dlm/recoverd.c | |
parent | 6d40c4a708e0e996fd9c60d4093aebba5fe1f749 (diff) |
dlm: fixes for nodir mode
The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used. This commit
fixes a number of problems, making nodir much more usable.
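As a rough illustration of what "statically assign master nodes" can mean, the sketch below hashes a resource name and uses the result to index the current member list. It is not the kernel's code: `name_hash()`, `static_master()`, and the choice of FNV-1a are assumptions made for the example, and the real dlm uses its own hash and node array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical FNV-1a hash; stands in for whatever hash the dlm really uses. */
static uint32_t name_hash(const char *name, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= (unsigned char)name[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Hypothetical static master lookup: every node computes the same
 * answer from the resource name and the current member list, so no
 * directory lookup is needed. Returns -1 if there are no members.
 */
static int static_master(const char *name, const int *nodeids, size_t count)
{
	assert(name != NULL);

	if (count == 0)
		return -1;
	assert(nodeids != NULL);
	return nodeids[name_hash(name, strlen(name)) % count];
}
```

Because the result depends on the length of the member list, a membership change can move the master of many resources at once, which is consistent with the remark in the first item below that rehashing changes most master nodes.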
- Major change to recovery: recover all locks and restart
all in-progress operations after recovery. In some
cases it's not possible to know which in-progress locks
to recover, so recover all. (Most require recovery
in nodir mode anyway since rehashing changes most
master nodes.)
- Change the way nodir mode is enabled, from a command
line mount arg passed through gfs2, into a sysfs
file managed by dlm_controld, consistent with the
other config settings.
- Allow recovering MSTCPY locks on an rsb that has not
yet been turned into a master copy.
- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
from a previous, aborted recovery cycle. Base this
on the local recovery status not being in the state
where any nodes should be sending LOCK messages for the
current recovery cycle.
- Hold rsb lock around dlm_purge_mstcpy_locks() because it
may run concurrently with dlm_recover_master_copy().
- Maintain highbast on process-copy lkb's (in addition to
the master as is usual), because the lkb can switch
back and forth between being a master and being a
process copy as the master node changes in recovery.
- When recovering MSTCPY locks, flag rsb's that have
non-empty convert or waiting queues for granting
at the end of recovery. (Rename flag from LOCKS_PURGED
to RECOVER_GRANT and similar for the recovery function,
because it's not only resources with purged locks
that need a grant attempt.)
- Replace a couple of unnecessary assertion panics with
error messages.
Signed-off-by: David Teigland <teigland@redhat.com>
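The item about ignoring RCOM_LOCK and RCOM_LOCK_REPLY messages from an aborted cycle can be pictured as a gate on the local recovery status. The sketch below is only an illustration of that idea: the status bits, the message enum, and `accept_recovery_msg()` are hypothetical names, and the exact condition used by the patch is not visible in the recoverd.c diff shown on this page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits, set in order as a recovery cycle advances. */
enum {
	RS_NODES = 1u << 0,
	RS_DIR   = 1u << 1,	/* assumed: stage reached before lock messages flow */
	RS_LOCKS = 1u << 2,
	RS_DONE  = 1u << 3,
};

/* Hypothetical message types; only the LOCK ones are filtered. */
enum msg_type { MSG_STATUS, MSG_LOCK, MSG_LOCK_REPLY };

/*
 * Return true if the message should be processed. Lock messages are
 * accepted only while the local status says peers could legitimately
 * be sending them for the current cycle; anything else is treated as
 * a leftover from an earlier, aborted cycle and dropped.
 */
static bool accept_recovery_msg(uint32_t local_status, enum msg_type type)
{
	if (type != MSG_LOCK && type != MSG_LOCK_REPLY)
		return true;

	return (local_status & RS_DIR) && !(local_status & RS_DONE);
}
```

The point of keying on local status rather than on message contents is that a node restarting recovery resets its own status first, so a late message from the old cycle arrives while the gate is closed.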
Diffstat (limited to 'fs/dlm/recoverd.c')
-rw-r--r-- | fs/dlm/recoverd.c | 9 |
1 files changed, 7 insertions, 2 deletions
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 11351b57c781..f1a9073c0835 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -84,6 +84,8 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 		goto fail;
 	}
 
+	ls->ls_recover_locks_in = 0;
+
 	dlm_set_recover_status(ls, DLM_RS_NODES);
 
 	error = dlm_recover_members_wait(ls);
@@ -130,7 +132,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 	 * Clear lkb's for departed nodes.
 	 */
 
-	dlm_purge_locks(ls);
+	dlm_recover_purge(ls);
 
 	/*
 	 * Get new master nodeid's for rsb's that were mastered on
@@ -161,6 +163,9 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 		goto fail;
 	}
 
+	log_debug(ls, "dlm_recover_locks %u in",
+		  ls->ls_recover_locks_in);
+
 	/*
 	 * Finalize state in master rsb's now that all locks can be
 	 * checked. This includes conversion resolution and lvb
@@ -225,7 +230,7 @@ static int ls_recover(struct dlm_ls *ls, struct dlm_recover *rv)
 		goto fail;
 	}
 
-	dlm_grant_after_purge(ls);
+	dlm_recover_grant(ls);
 
 	log_debug(ls, "dlm_recover %llu generation %u done: %u ms",
 		  (unsigned long long)rv->seq, ls->ls_generation,
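The last hunk only renames the end-of-recovery grant call, so the flagging described in the commit message is not visible here. The following is a minimal sketch of that idea under simplifying assumptions: `struct res`, `mark_for_grant()`, and `recover_grant_pass()` are hypothetical, queue lengths stand in for real lock queues, and the boolean field stands in for the RECOVER_GRANT flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified resource; not the kernel's struct dlm_rsb. */
struct res {
	size_t n_convert;	/* locks on the convert queue */
	size_t n_waiting;	/* locks on the waiting queue */
	bool recover_grant;	/* stands in for the RECOVER_GRANT flag */
};

/*
 * Flag a resource during recovery if locks were purged from it, or if
 * it still has queued conversions or waiters that may now be grantable.
 */
static void mark_for_grant(struct res *r, bool purged)
{
	if (purged || r->n_convert > 0 || r->n_waiting > 0)
		r->recover_grant = true;
}

/*
 * End-of-recovery pass: visit only flagged resources, clear the flag,
 * and call try_grant (if given) on each. Returns the number of
 * resources for which a grant attempt was made.
 */
static size_t recover_grant_pass(struct res *table, size_t n,
				 void (*try_grant)(struct res *))
{
	size_t attempts = 0;

	for (size_t i = 0; i < n; i++) {
		if (!table[i].recover_grant)
			continue;
		table[i].recover_grant = false;
		if (try_grant)
			try_grant(&table[i]);
		attempts++;
	}
	return attempts;
}
```

In this sketch the flag is cleared as each resource is visited, so a second pass over the same table makes no attempts until recovery flags something again.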