author | Wu Fengguang <fengguang.wu@intel.com> | 2010-08-11 17:17:43 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2010-08-12 11:43:30 -0400 |
commit | a50aeb40144982eb766053309b6fc33e14ca46f0 (patch) | |
tree | 15837ddb24c356b910a9af3e06f47937c0716027 | |
parent | 4ea879b96d437693485d21f4b7e1eb72f7615fc2 (diff) |
writeback: merge for_kupdate and !for_kupdate cases
Unify the logic for kupdate and non-kupdate cases. There won't be
starvation because the inodes requeued into b_more_io will later be
spliced _after_ the remaining inodes in b_io, hence won't stand in the way
of other inodes in the next run.
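
The ordering argument above can be checked with a small toy model. This is only a sketch: the array-based queue and every toy_* name are hypothetical, and it models service order only, not the kernel's list_head layout.

```c
#include <assert.h>

#define TOY_MAX_INODES 8

/* Toy queue in service order: ids[0] is written back first. */
struct toy_queue {
	int ids[TOY_MAX_INODES];
	int len;
};

/* Add one inode id behind everything already queued. */
static void toy_push_back(struct toy_queue *q, int id)
{
	assert(q->len < TOY_MAX_INODES);
	q->ids[q->len++] = id;
}

/*
 * Move every entry of src behind the entries already in dst and
 * leave src empty. This stands in for splicing b_more_io _after_
 * the inodes that remain in b_io.
 */
static void toy_splice_after(struct toy_queue *dst, struct toy_queue *src)
{
	for (int i = 0; i < src->len; i++)
		toy_push_back(dst, src->ids[i]);
	src->len = 0;
}
```

If inode 1 used up its slice and sits in the toy b_more_io while inodes 2 and 3 remain in the toy b_io, the splice yields the service order 2, 3, 1, so the requeued inode does not get ahead of the others.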
It avoids unnecessary redirty_tail() calls, and hence the accompanying update
of i_dirtied_when. The timestamp update is undesirable because it could
later delay the inode's periodic writeback, or may exclude the inode from
the data integrity sync operation (which checks timestamp to avoid extra
work and livelock).
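
A rough illustration of why the timestamp update matters, under loud assumptions: the toy_* types and helpers are hypothetical, time is a plain tick counter, and toy_redirty_tail() always refreshes the timestamp, which simplifies whatever conditions the real redirty_tail() applies.

```c
#include <stdbool.h>

struct toy_inode {
	unsigned long dirtied_when;	/* tick at which the inode was (re)dirtied */
};

/* Requeue for more I/O: the dirtied timestamp is left untouched. */
static void toy_requeue_io(struct toy_inode *inode)
{
	(void)inode;
}

/* Simplified redirty: refresh the timestamp to the current tick. */
static void toy_redirty_tail(struct toy_inode *inode, unsigned long now)
{
	inode->dirtied_when = now;
}

/* Periodic writeback only picks inodes dirtied at least max_age ticks ago. */
static bool toy_expired(const struct toy_inode *inode, unsigned long now,
			unsigned long max_age)
{
	return now - inode->dirtied_when >= max_age;
}

/* A sync that started at sync_start only covers inodes dirtied by then. */
static bool toy_sync_covers(const struct toy_inode *inode,
			    unsigned long sync_start)
{
	return inode->dirtied_when <= sync_start;
}
```

In this model an inode dirtied at tick 100 and then requeued is still expired at tick 3100 with max_age 3000 and is still covered by a sync started at tick 2000. The same inode passed through toy_redirty_tail() at tick 3000 is neither.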
===
How the redirty_tail() comes about:
It is a long story. This redirty_tail() was introduced with
wbc.more_io. The initial patch for more_io did not have the
redirty_tail(), and when it was merged, several 100% iowait bug reports
arose:
reiserfs:
http://lkml.org/lkml/2007/10/23/93
jfs:
commit 29a424f28390752a4ca2349633aaacc6be494db5
JFS: clear PAGECACHE_TAG_DIRTY for no-write pages
ext2:
http://www.spinics.net/linux/lists/linux-ext4/msg04762.html
They were all old bugs hidden in various filesystems that became "visible"
with the more_io patch. At the time, the ext2 bug was thought to be
"trivial", so it was not fixed. Instead the following updated more_io patch
with redirty_tail() was merged:
http://www.spinics.net/linux/lists/linux-ext4/msg04507.html
This will in general prevent 100% iowait on ext2 and possibly other unknown FS bugs.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r-- | fs/fs-writeback.c | 43 |
1 files changed, 10 insertions, 33 deletions
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 863bfb0eb492..8a5807d2fb9d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -374,45 +374,22 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
 		if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
 			/*
 			 * We didn't write back all the pages. nfs_writepages()
-			 * sometimes bales out without doing anything. Redirty
-			 * the inode; Move it from b_io onto b_more_io/b_dirty.
+			 * sometimes bales out without doing anything.
 			 */
-			/*
-			 * akpm: if the caller was the kupdate function we put
-			 * this inode at the head of b_dirty so it gets first
-			 * consideration. Otherwise, move it to the tail, for
-			 * the reasons described there. I'm not really sure
-			 * how much sense this makes. Presumably I had a good
-			 * reasons for doing it this way, and I'd rather not
-			 * muck with it at present.
-			 */
-			if (wbc->for_kupdate) {
+			inode->i_state |= I_DIRTY_PAGES;
+			if (wbc->nr_to_write <= 0) {
 				/*
-				 * For the kupdate function we move the inode
-				 * to b_more_io so it will get more writeout as
-				 * soon as the queue becomes uncongested.
+				 * slice used up: queue for next turn
 				 */
-				inode->i_state |= I_DIRTY_PAGES;
-				if (wbc->nr_to_write <= 0) {
-					/*
-					 * slice used up: queue for next turn
-					 */
-					requeue_io(inode);
-				} else {
-					/*
-					 * somehow blocked: retry later
-					 */
-					redirty_tail(inode);
-				}
+				requeue_io(inode);
 			} else {
 				/*
-				 * Otherwise fully redirty the inode so that
-				 * other inodes on this superblock will get some
-				 * writeout. Otherwise heavy writing to one
-				 * file would indefinitely suspend writeout of
-				 * all the other files.
+				 * Writeback blocked by something other than
+				 * congestion. Delay the inode for some time to
+				 * avoid spinning on the CPU (100% iowait)
+				 * retrying writeback of the dirty page/inode
+				 * that cannot be performed immediately.
 				 */
-				inode->i_state |= I_DIRTY_PAGES;
 				redirty_tail(inode);
 			}
 		} else if (inode->i_state & I_DIRTY) {
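
As a reading aid, the branch structure of the hunk above can be reduced to two small decision functions. This is a sketch, not kernel code: the toy_* names are hypothetical, and both functions only cover the case where the mapping still has dirty pages after writeback.

```c
#include <stdbool.h>

enum toy_action { TOY_REQUEUE_IO, TOY_REDIRTY_TAIL };

/* Before the patch: only kupdate writeback could requeue the inode. */
static enum toy_action toy_old_policy(bool for_kupdate, long nr_to_write)
{
	if (for_kupdate && nr_to_write <= 0)
		return TOY_REQUEUE_IO;	/* slice used up: queue for next turn */
	return TOY_REDIRTY_TAIL;	/* blocked, or not kupdate at all */
}

/* After the patch: the decision no longer depends on for_kupdate. */
static enum toy_action toy_new_policy(long nr_to_write)
{
	if (nr_to_write <= 0)
		return TOY_REQUEUE_IO;	/* slice used up: queue for next turn */
	return TOY_REDIRTY_TAIL;	/* blocked by something else: delay */
}
```

The two policies differ in one case only: non-kupdate writeback that has used up its slice (nr_to_write <= 0) now goes through requeue_io() instead of redirty_tail().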