author Wu Fengguang <fengguang.wu@intel.com> 2010-08-11 17:17:43 -0400
committer Linus Torvalds <torvalds@linux-foundation.org> 2010-08-12 11:43:30 -0400
commit a50aeb40144982eb766053309b6fc33e14ca46f0 (patch)
tree 15837ddb24c356b910a9af3e06f47937c0716027
parent 4ea879b96d437693485d21f4b7e1eb72f7615fc2 (diff)
writeback: merge for_kupdate and !for_kupdate cases

Unify the logic for the kupdate and non-kupdate cases. There won't be
starvation because the inodes requeued into b_more_io will later be
spliced _after_ the remaining inodes in b_io, and hence won't stand in the
way of other inodes in the next run.

It avoids unnecessary redirty_tail() calls, and hence the update of
i_dirtied_when that they cause. The timestamp update is undesirable
because it could later delay the inode's periodic writeback, or may
exclude the inode from the data integrity sync operation (which checks
the timestamp to avoid extra work and livelock).

===
How the redirty_tail() came about:

It is a long story. This redirty_tail() was introduced with wbc.more_io.
The initial patch for more_io did not have the redirty_tail(), and when it
was merged, several 100% iowait bug reports arose:

reiserfs:
	http://lkml.org/lkml/2007/10/23/93

jfs:
	commit 29a424f28390752a4ca2349633aaacc6be494db5
	JFS: clear PAGECACHE_TAG_DIRTY for no-write pages

ext2:
	http://www.spinics.net/linux/lists/linux-ext4/msg04762.html

They are all old bugs hidden in various filesystems that became "visible"
with the more_io patch. At the time, the ext2 bug was thought to be
"trivial", so it was not fixed. Instead, the following updated more_io
patch with redirty_tail() was merged:

	http://www.spinics.net/linux/lists/linux-ext4/msg04507.html

This will in general prevent 100% iowait on ext2 and possibly other
unknown FS bugs.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
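
The requeue policy and the no-starvation argument in the commit message can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: `pick_queue()`, `splice_next_run()`, the `requeue_target` enum, and the use of plain `int` arrays in place of the kernel's inode lists are all hypothetical names and simplifications introduced here. It assumes that a non-positive `nr_to_write` means the writeback slice was used up, as the patched branch does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model (assumption): where a still-dirty inode is queued next. */
enum requeue_target {
	TO_MORE_IO,	/* stands in for requeue_io(): retry in the next run */
	TO_DIRTY_TAIL	/* stands in for redirty_tail(): delay, new timestamp */
};

/*
 * Mirrors the merged branch: the decision depends only on whether the
 * slice was used up, no longer on whether the caller was kupdate.
 */
static enum requeue_target pick_queue(long nr_to_write)
{
	return nr_to_write <= 0 ? TO_MORE_IO : TO_DIRTY_TAIL;
}

/*
 * Hypothetical helper: build the queue for the next run by placing the
 * inodes still waiting in b_io first and the b_more_io inodes after them,
 * so requeued inodes do not get ahead of inodes not yet serviced.
 * Returns the number of entries written to out.
 */
static size_t splice_next_run(const int *b_io, size_t n_io,
			      const int *b_more_io, size_t n_more,
			      int *out)
{
	assert(b_io != NULL && b_more_io != NULL && out != NULL);
	memcpy(out, b_io, n_io * sizeof(*out));
	memcpy(out + n_io, b_more_io, n_more * sizeof(*out));
	return n_io + n_more;
}
```

In this model, an inode whose slice ran out keeps its place in the rotation without having its dirty timestamp touched, while an inode that could not make progress for some other reason is pushed back so that writeback does not spin on it.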

-rw-r--r-- fs/fs-writeback.c 43
1 file changed, 10 insertions, 33 deletions

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 863bfb0eb492..8a5807d2fb9d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -374,45 +374,22 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
 		if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
 			/*
 			 * We didn't write back all the pages. nfs_writepages()
-			 * sometimes bales out without doing anything. Redirty
-			 * the inode; Move it from b_io onto b_more_io/b_dirty.
+			 * sometimes bales out without doing anything.
 			 */
-			/*
-			 * akpm: if the caller was the kupdate function we put
-			 * this inode at the head of b_dirty so it gets first
-			 * consideration. Otherwise, move it to the tail, for
-			 * the reasons described there. I'm not really sure
-			 * how much sense this makes. Presumably I had a good
-			 * reasons for doing it this way, and I'd rather not
-			 * muck with it at present.
-			 */
-			if (wbc->for_kupdate) {
+			inode->i_state |= I_DIRTY_PAGES;
+			if (wbc->nr_to_write <= 0) {
 				/*
-				 * For the kupdate function we move the inode
-				 * to b_more_io so it will get more writeout as
-				 * soon as the queue becomes uncongested.
+				 * slice used up: queue for next turn
 				 */
-				inode->i_state |= I_DIRTY_PAGES;
-				if (wbc->nr_to_write <= 0) {
-					/*
-					 * slice used up: queue for next turn
-					 */
-					requeue_io(inode);
-				} else {
-					/*
-					 * somehow blocked: retry later
-					 */
-					redirty_tail(inode);
-				}
+				requeue_io(inode);
 			} else {
 				/*
-				 * Otherwise fully redirty the inode so that
-				 * other inodes on this superblock will get some
-				 * writeout. Otherwise heavy writing to one
-				 * file would indefinitely suspend writeout of
-				 * all the other files.
+				 * Writeback blocked by something other than
+				 * congestion. Delay the inode for some time to
+				 * avoid spinning on the CPU (100% iowait)
+				 * retrying writeback of the dirty page/inode
+				 * that cannot be performed immediately.
 				 */
-				inode->i_state |= I_DIRTY_PAGES;
 				redirty_tail(inode);
 			}
 		} else if (inode->i_state & I_DIRTY) {