author     Johannes Weiner <hannes@cmpxchg.org>    2016-07-28 18:45:10 -0400
committer  Linus Torvalds <torvalds@linux-foundation.org>    2016-07-28 19:07:41 -0400
commit     55779ec759ccc3c12b917b3712a7716e1140c652 (patch)
tree       d119d51e0c82b2535f0a6519799f5387b94192f6    /mm/workingset.c
parent     400bc7fd4fa7d33c96d836e6b65eeed246f1959a (diff)
mm: fix vm-scalability regression in cgroup-aware workingset code
Commit 23047a96d7cf ("mm: workingset: per-cgroup cache thrash detection")
added a page->mem_cgroup lookup to the cache eviction, refault, and
activation paths, as well as locking to the activation path, and the
vm-scalability tests showed a regression of -23%.

While the test in question is an artificial worst-case scenario that
doesn't occur in real workloads - reading two sparse files in parallel
at full CPU speed just to hammer the LRU paths - there are still some
optimizations that can be done in those paths.

Inline the lookup functions to eliminate calls.  Also, page->mem_cgroup
doesn't need to be stabilized when counting an activation; we merely
need to hold the RCU lock to prevent the memcg from being freed.

This cuts down on overhead quite a bit:

23047a96d7cfcfca  063f6715e77a7be5770d6081fe
----------------  --------------------------
         %stddev     %change         %stddev
             \          |                \
  21621405 +-  0%     +11.3%   24069657 +-  2%  vm-scalability.throughput

[linux@roeck-us.net: drop unnecessary include file]
[hannes@cmpxchg.org: add WARN_ON_ONCE()s]
Link: http://lkml.kernel.org/r/20160707194024.GA26580@cmpxchg.org
Link: http://lkml.kernel.org/r/20160624175101.GA3024@cmpxchg.org
Reported-by: Ye Xiaolong <xiaolong.ye@intel.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
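As context for the diff below: the activation path no longer takes
lock_page_memcg() to stabilize page->mem_cgroup; it only holds the RCU
read lock so the memcg cannot be freed while the counter is bumped. A
minimal sketch of the kind of inlined RCU-side lookup helper this
relies on follows (assumed shape only; the actual page_memcg_rcu()
definition lives in include/linux/memcontrol.h for this kernel and
carries the WARN_ON_ONCE() mentioned above):

    /* Sketch (assumed definition), not copied from the patch:
     * an inlined RCU-side page->mem_cgroup lookup that avoids
     * lock_page_memcg() and the call overhead of an out-of-line helper. */
    static inline struct mem_cgroup *page_memcg_rcu(struct page *page)
    {
            /* rcu_read_lock() in the caller keeps the memcg alive. */
            WARN_ON_ONCE(!rcu_read_lock_held());
            return READ_ONCE(page->mem_cgroup);
    }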
Diffstat (limited to 'mm/workingset.c')
-rw-r--r--  mm/workingset.c  10
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/mm/workingset.c b/mm/workingset.c
index 577277546d98..d7cc4bbd7e1b 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -305,9 +305,10 @@ bool workingset_refault(void *shadow)
  */
 void workingset_activation(struct page *page)
 {
+        struct mem_cgroup *memcg;
         struct lruvec *lruvec;
 
-        lock_page_memcg(page);
+        rcu_read_lock();
         /*
          * Filter non-memcg pages here, e.g. unmap can call
          * mark_page_accessed() on VDSO pages.
@@ -315,12 +316,13 @@ void workingset_activation(struct page *page)
          * XXX: See workingset_refault() - this should return
          * root_mem_cgroup even for !CONFIG_MEMCG.
          */
-        if (!mem_cgroup_disabled() && !page_memcg(page))
+        memcg = page_memcg_rcu(page);
+        if (!mem_cgroup_disabled() && !memcg)
                 goto out;
-        lruvec = mem_cgroup_zone_lruvec(page_zone(page), page_memcg(page));
+        lruvec = mem_cgroup_zone_lruvec(page_zone(page), memcg);
         atomic_long_inc(&lruvec->inactive_age);
 out:
-        unlock_page_memcg(page);
+        rcu_read_unlock();
 }
 
 /*