author | Alex Elder <elder@inktank.com> | 2012-10-08 23:37:30 -0400 |
---|---|---|
committer | Alex Elder <elder@inktank.com> | 2012-10-10 00:59:52 -0400 |
commit | 588377d6199034c36d335e7df5818b731fea072c (patch) | |
tree | 95efb0eac7f7376395ca3f282e79a5123138c79d | |
parent | 6285bc231277419255f3498d3eb5ddc9f8e7fe79 (diff) |
rbd: reset BACKOFF if unable to re-queue

If ceph_fault() is unable to queue work after a delay, it sets the
BACKOFF connection flag so con_work() will attempt to do so.

In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
result in newly-queued work, it simply ignores this condition and
proceeds as if no backoff delay were desired. There are two
problems with this--one of which is a bug.

The first problem is simply that the intended behavior is to back
off, and if we aren't able to queue the work item to run after a delay
we're not doing that.

The only reason queue_delayed_work() won't queue work is if the
provided work item is already queued. In the messenger, this
means that con_work() is already scheduled to be run again. So
if we simply set the BACKOFF flag again when this occurs, we know
the next con_work() call will again attempt to hold off activity
on the connection until after the delay.

The second problem--the bug--is a leak of a reference count. If
queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
the connection reference held on entry to con_work(). However,
processing is (was) allowed to continue, and at the end of the
function a second con->ops->put() is called.

This patch fixes both problems.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
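
The reference-count problem described in the commit message can be illustrated with a small user-space model. This is only a sketch: `struct model_con`, `model_put()`, and the two `failed_backoff_*()` functions are hypothetical names invented for illustration and are not part of the kernel. It assumes con_work() holds exactly one reference on entry and models only the path where queue_delayed_work() fails.

```c
#include <assert.h>

/* Hypothetical stand-in for a ceph connection; only the refcount is modeled. */
struct model_con {
	int refs;
};

/* Stand-in for con->ops->put(con). */
static void model_put(struct model_con *con)
{
	assert(con->refs > 0);	/* dropping a reference we do not hold is a bug */
	con->refs--;
}

/* Pre-patch failed-backoff path: one put in the else branch, one at exit. */
static int failed_backoff_before(struct model_con *con)
{
	model_put(con);		/* put in the else branch (removed by the patch) */
	/* ... processing continued as if no backoff were requested ... */
	model_put(con);		/* put at the end of the function */
	return con->refs;
}

/* Post-patch failed-backoff path: jump to the exit, which does the only put. */
static int failed_backoff_after(struct model_con *con)
{
	model_put(con);		/* the single put at the end of the function */
	return con->refs;
}
```

Starting from two references (say, one held by the connection's owner and one taken for the queued work), the pre-patch path ends at zero while the owner still expects to hold a reference; the post-patch path ends at one.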
-rw-r--r-- | net/ceph/messenger.c | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 159aa8bef9e7..cad0d17ec45e 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2300,10 +2300,11 @@ restart:
 			mutex_unlock(&con->mutex);
 			return;
 		} else {
-			con->ops->put(con);
 			dout("con_work %p FAILED to back off %lu\n", con,
 			     con->delay);
+			set_bit(CON_FLAG_BACKOFF, &con->flags);
 		}
+		goto done;
 	}
 
 	if (con->state == CON_STATE_STANDBY) {
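
The patched backoff branch can be sketched as a user-space model to show why re-setting the flag is enough. This is an illustration under stated assumptions, not kernel code: `struct model_con`, `model_queue_delayed()`, `model_con_work()`, and the `outcome` enum are hypothetical names, and the model assumes, as the commit message states, that queuing fails only when the work item is already queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_con {
	bool backoff;		/* stands in for CON_FLAG_BACKOFF */
	bool work_pending;	/* true while the work item is already queued */
	int delayed_runs;	/* how many delayed runs were scheduled */
};

/* Mimics queue_delayed_work(): refuses if the item is already queued. */
static bool model_queue_delayed(struct model_con *con)
{
	if (con->work_pending)
		return false;
	con->work_pending = true;
	con->delayed_runs++;
	return true;
}

enum outcome { BACKED_OFF, RETRY_LATER, PROCESSED };

/* One pass through the patched backoff branch of con_work(). */
static enum outcome model_con_work(struct model_con *con)
{
	if (con->backoff) {
		con->backoff = false;		/* like test_and_clear_bit() */
		if (model_queue_delayed(con))
			return BACKED_OFF;	/* delayed run scheduled */
		con->backoff = true;		/* re-arm for the next pass */
		return RETRY_LATER;		/* like "goto done": skip processing */
	}
	return PROCESSED;
}
```

In this model, a pass that finds the work item already queued leaves `backoff` set and does no processing; the next pass, once the item is no longer pending, schedules the delayed run.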