author     Mel Gorman <mgorman@techsingularity.net>  2016-03-15 17:55:39 -0400
committer  Linus Torvalds <torvalds@linux-foundation.org>  2016-03-15 19:55:16 -0400
commit     ebded02788b5d7c7600f8cff26ae07896d568649
tree       155d79f674e6a3eafbc43bdb6893f0688bcf482d /mm
parent     32b635298ff4e991d8d8f64dc23782b02eec29c3
mm: filemap: avoid unnecessary calls to lock_page when waiting for IO to complete during a read
In the generic read paths the kernel looks up a page in the page cache and
if it's up to date, it is used.  If not, the page lock is acquired to wait
for IO to complete and then check the page.  If multiple processes are
waiting on IO, they all serialise against the lock and duplicate the
checks.  This is unnecessary.

The page lock in itself does not give any guarantees to the callers about
the page state as it can be immediately truncated or reclaimed after the
page is unlocked.  It's sufficient to wait_on_page_locked and then continue
if the page is up to date on wakeup.

It is possible that a truncated but up-to-date page is returned but the
reference taken during read prevents it disappearing underneath the caller
and the data is still valid if PageUptodate.

The overall impact is small as even if processes serialise on the lock,
the lock section is tiny once the IO is complete.  Profiles indicated that
unlock_page and friends are generally a tiny portion of a read-intensive
workload.  An artificial test was created that had instances of dd access a
cache-cold file on an ext4 filesystem and measure how long the read took.

paralleldd
                                    4.4.0                 4.4.0
                                  vanilla             avoidlock
Amean    Elapsd-1          5.28 (  0.00%)        5.15 (  2.50%)
Amean    Elapsd-4          5.29 (  0.00%)        5.17 (  2.12%)
Amean    Elapsd-7          5.28 (  0.00%)        5.18 (  1.78%)
Amean    Elapsd-12         5.20 (  0.00%)        5.33 ( -2.50%)
Amean    Elapsd-21         5.14 (  0.00%)        5.21 ( -1.41%)
Amean    Elapsd-30         5.30 (  0.00%)        5.12 (  3.38%)
Amean    Elapsd-48         5.78 (  0.00%)        5.42 (  6.21%)
Amean    Elapsd-79         6.78 (  0.00%)        6.62 (  2.46%)
Amean    Elapsd-110        9.09 (  0.00%)        8.99 (  1.15%)
Amean    Elapsd-128       10.60 (  0.00%)       10.43 (  1.66%)

The impact is small but intuitively, it makes sense to avoid unnecessary
calls to lock_page.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
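
The difference between waiting for the page to be unlocked and taking the
page lock can be illustrated outside the kernel. What follows is a minimal
userspace sketch, not kernel code: struct fake_page and every fake_*
function are hypothetical stand-ins built on a pthread mutex and condition
variable, and they only approximate the semantics of wait_on_page_locked,
lock_page and unlock_page. The point it tries to show is that a reader who
finds the page up to date after the unlock never becomes a lock owner, so
several such readers do not queue behind one another.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a page cache page, not struct page. */
struct fake_page {
	pthread_mutex_t mtx;		/* protects the two flags below */
	pthread_cond_t unlocked;	/* broadcast whenever 'locked' clears */
	bool locked;			/* models PG_locked */
	bool uptodate;			/* models PG_uptodate */
};

/* A page that is locked because a read is in flight. */
#define FAKE_PAGE_UNDER_IO \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, true, false }

/* Models wait_on_page_locked(): sleep until unlocked, never own the lock. */
void fake_wait_on_page_locked(struct fake_page *p)
{
	pthread_mutex_lock(&p->mtx);
	while (p->locked)
		pthread_cond_wait(&p->unlocked, &p->mtx);
	pthread_mutex_unlock(&p->mtx);
}

/* Models lock_page(): sleep until unlocked, then become the owner. */
void fake_lock_page(struct fake_page *p)
{
	pthread_mutex_lock(&p->mtx);
	while (p->locked)
		pthread_cond_wait(&p->unlocked, &p->mtx);
	p->locked = true;
	pthread_mutex_unlock(&p->mtx);
}

/* Models unlock_page(): release ownership and wake every waiter. */
void fake_unlock_page(struct fake_page *p)
{
	pthread_mutex_lock(&p->mtx);
	assert(p->locked);	/* unlocking an unlocked page is a bug */
	p->locked = false;
	pthread_cond_broadcast(&p->unlocked);
	pthread_mutex_unlock(&p->mtx);
}

/* Models IO completion: publish the result, then unlock the page. */
void fake_end_read(struct fake_page *p, bool io_ok)
{
	pthread_mutex_lock(&p->mtx);
	assert(p->locked);	/* the page stays locked while IO is in flight */
	p->uptodate = io_ok;
	p->locked = false;
	pthread_cond_broadcast(&p->unlocked);
	pthread_mutex_unlock(&p->mtx);
}

bool fake_page_uptodate(struct fake_page *p)
{
	pthread_mutex_lock(&p->mtx);
	bool ret = p->uptodate;
	pthread_mutex_unlock(&p->mtx);
	return ret;
}

/*
 * Reader side: wait for the unlock and use the page if it is up to date.
 * Only a reader that still sees a stale page falls back to the lock.
 * Returns true if the data may be used; *took_lock reports which path ran.
 */
bool fake_read_page(struct fake_page *p, bool *took_lock)
{
	*took_lock = false;
	fake_wait_on_page_locked(p);
	if (fake_page_uptodate(p))
		return true;

	fake_lock_page(p);
	*took_lock = true;
	bool ok = fake_page_uptodate(p);
	fake_unlock_page(p);
	return ok;
}
```

Calling fake_read_page from several threads while another thread calls
fake_end_read(&page, true) should let every reader return through the first
branch with *took_lock left false. The fallback path runs only when the IO
failed or the page was otherwise left not up to date.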
Diffstat (limited to 'mm')
-rw-r--r--  mm/filemap.c  49
1 file changed, 49 insertions, 0 deletions
diff --git a/mm/filemap.c b/mm/filemap.c
index deae0b9ad90b..4a0f5fa79dbd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1668,6 +1668,15 @@ find_page:
 					index, last_index - index);
 		}
 		if (!PageUptodate(page)) {
+			/*
+			 * See comment in do_read_cache_page on why
+			 * wait_on_page_locked is used to avoid unnecessarily
+			 * serialisations and why it's safe.
+			 */
+			wait_on_page_locked_killable(page);
+			if (PageUptodate(page))
+				goto page_ok;
+
 			if (inode->i_blkbits == PAGE_CACHE_SHIFT ||
 					!mapping->a_ops->is_partially_uptodate)
 				goto page_not_up_to_date;
@@ -2341,12 +2350,52 @@ filler:
 	if (PageUptodate(page))
 		goto out;
 
+	/*
+	 * Page is not up to date and may be locked due one of the following
+	 * case a: Page is being filled and the page lock is held
+	 * case b: Read/write error clearing the page uptodate status
+	 * case c: Truncation in progress (page locked)
+	 * case d: Reclaim in progress
+	 *
+	 * Case a, the page will be up to date when the page is unlocked.
+	 *    There is no need to serialise on the page lock here as the page
+	 *    is pinned so the lock gives no additional protection. Even if the
+	 *    the page is truncated, the data is still valid if PageUptodate as
+	 *    it's a race vs truncate race.
+	 * Case b, the page will not be up to date
+	 * Case c, the page may be truncated but in itself, the data may still
+	 *    be valid after IO completes as it's a read vs truncate race. The
+	 *    operation must restart if the page is not uptodate on unlock but
+	 *    otherwise serialising on page lock to stabilise the mapping gives
+	 *    no additional guarantees to the caller as the page lock is
+	 *    released before return.
+	 * Case d, similar to truncation. If reclaim holds the page lock, it
+	 *    will be a race with remove_mapping that determines if the mapping
+	 *    is valid on unlock but otherwise the data is valid and there is
+	 *    no need to serialise with page lock.
+	 *
+	 * As the page lock gives no additional guarantee, we optimistically
+	 * wait on the page to be unlocked and check if it's up to date and
+	 * use the page if it is. Otherwise, the page lock is required to
+	 * distinguish between the different cases. The motivation is that we
+	 * avoid spurious serialisations and wakeups when multiple processes
+	 * wait on the same page for IO to complete.
+	 */
+	wait_on_page_locked(page);
+	if (PageUptodate(page))
+		goto out;
+
+	/* Distinguish between all the cases under the safety of the lock */
 	lock_page(page);
+
+	/* Case c or d, restart the operation */
 	if (!page->mapping) {
 		unlock_page(page);
 		page_cache_release(page);
 		goto repeat;
 	}
+
+	/* Someone else locked and filled the page in a very small window */
 	if (PageUptodate(page)) {
 		unlock_page(page);
 		goto out;
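
The order of checks in the second hunk can be summarised as a small decision
function. This is only a sketch for reading the patch: struct page_snapshot,
enum read_action and classify_page are hypothetical names that do not exist
in the kernel, and the snapshot fields stand for what PageUptodate(page) and
page->mapping would report at each step. The final outcome, retrying the
read, lies beyond the end of the hunk shown and is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of the checks in the hunk, in the order they run. */
enum read_action {
	ACT_USE_PAGE_UNLOCKED,	/* up to date after wait_on_page_locked() */
	ACT_RESTART_LOOKUP,	/* case c or d: mapping gone, goto repeat */
	ACT_USE_PAGE_LOCKED,	/* filled by someone else in a small window */
	ACT_RETRY_READ,		/* assumed: not shown in the hunk above */
};

/* Hypothetical snapshot of what each check would observe. */
struct page_snapshot {
	bool uptodate_after_wait;	/* PageUptodate() once the page unlocks */
	bool has_mapping;		/* page->mapping != NULL under the lock */
	bool uptodate_under_lock;	/* PageUptodate() under the lock */
};

enum read_action classify_page(const struct page_snapshot *s)
{
	assert(s != NULL);

	/* Optimistic path: no lock_page() call at all. */
	if (s->uptodate_after_wait)
		return ACT_USE_PAGE_UNLOCKED;

	/* From here on the real code holds the page lock. */
	if (!s->has_mapping)
		return ACT_RESTART_LOOKUP;
	if (s->uptodate_under_lock)
		return ACT_USE_PAGE_LOCKED;
	return ACT_RETRY_READ;
}
```

Only the first branch avoids the page lock. It is the one the commit message
expects to be taken by processes waiting on the same page for IO to complete.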