author: Brian Foster <bfoster@redhat.com> 2018-12-28 03:37:20 -0500
committer: Linus Torvalds <torvalds@linux-foundation.org> 2018-12-28 15:11:49 -0500
commit: 3fa750dcf29e8606e3969d13d8e188cc1c0f511d
tree: d8fbf2cbb7f3d4d92e791fb2388db170afa0c83f
parent: c3a5c77afefa697bf87f15272c7257e1574cad56
mm/page-writeback.c: don't break integrity writeback on ->writepage() error
write_cache_pages() is used in both background and integrity writeback scenarios by various filesystems. Background writeback is mostly concerned with cleaning a certain number of dirty pages based on various mm heuristics. It may not write the full set of dirty pages or wait for I/O to complete. Integrity writeback is responsible for persisting a set of dirty pages before the writeback job completes. For example, an fsync() call must perform integrity writeback to ensure data is on disk before the call returns.

write_cache_pages() unconditionally breaks out of its processing loop in the event of a ->writepage() error. This is fine for background writeback, which has no strict requirements and will eventually come around again. It can cause problems for integrity writeback on filesystems that might need to clean up state associated with failed page writeouts. For example, XFS performs internal delayed allocation accounting before returning a ->writepage() error, where applicable. If the current writeback happens to be associated with an unmount and write_cache_pages() completes the writeback prematurely due to error, the filesystem is unmounted in an inconsistent state if dirty+delalloc pages still exist.

To handle this problem, update write_cache_pages() to always process the full set of pages for integrity writeback regardless of ->writepage() errors. Save the first encountered error and return it to the caller once complete. This allows XFS (or any other fs that expects integrity writeback to process the entire set of dirty pages) to clean up its internal state completely in the event of persistent mapping errors. Background writeback continues to exit on the first error encountered.

[akpm@linux-foundation.org: fix typo in comment]
Link: http://lkml.kernel.org/r/20181116134304.32440-1-bfoster@redhat.com
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r-- mm/page-writeback.c 35
1 file changed, 21 insertions, 14 deletions
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f690bae6b78..7d1010453fb9 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2154,6 +2154,7 @@ int write_cache_pages(struct address_space *mapping,
 {
 	int ret = 0;
 	int done = 0;
+	int error;
 	struct pagevec pvec;
 	int nr_pages;
 	pgoff_t uninitialized_var(writeback_index);
@@ -2227,25 +2228,31 @@ continue_unlock:
 					goto continue_unlock;
 
 			trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
-			ret = (*writepage)(page, wbc, data);
-			if (unlikely(ret)) {
-				if (ret == AOP_WRITEPAGE_ACTIVATE) {
+			error = (*writepage)(page, wbc, data);
+			if (unlikely(error)) {
+				/*
+				 * Handle errors according to the type of
+				 * writeback. There's no need to continue for
+				 * background writeback. Just push done_index
+				 * past this page so media errors won't choke
+				 * writeout for the entire file. For integrity
+				 * writeback, we must process the entire dirty
+				 * set regardless of errors because the fs may
+				 * still have state to clear for each page. In
+				 * that case we continue processing and return
+				 * the first error.
+				 */
+				if (error == AOP_WRITEPAGE_ACTIVATE) {
 					unlock_page(page);
-					ret = 0;
-				} else {
-					/*
-					 * done_index is set past this page,
-					 * so media errors will not choke
-					 * background writeout for the entire
-					 * file. This has consequences for
-					 * range_cyclic semantics (ie. it may
-					 * not be suitable for data integrity
-					 * writeout).
-					 */
+					error = 0;
+				} else if (wbc->sync_mode != WB_SYNC_ALL) {
+					ret = error;
 					done_index = page->index + 1;
 					done = 1;
 					break;
 				}
+				if (!ret)
+					ret = error;
 			}
 
 			/*