path: root/mm/swap_state.c
author		Huang Ying <ying.huang@intel.com>	2017-07-06 18:37:18 -0400
committer	Linus Torvalds <torvalds@linux-foundation.org>	2017-07-06 19:24:31 -0400
commit		38d8b4e6bdc872f07a3149309ab01719c96f3894 (patch)
tree		a4bdf8e41a90f49465829b98a46645af64b0103d	/mm/swap_state.c
parent		9d85e15f1d552653c989dbecf051d8eea5937be8 (diff)
mm, THP, swap: delay splitting THP during swap out
Patch series "THP swap: Delay splitting THP during swapping out", v11.

This patchset optimizes the performance of Transparent Huge Page (THP) swap.

Recently, the performance of storage devices has improved so much that we cannot saturate the disk bandwidth with a single logical CPU during page swap-out, even on a high-end server machine, because storage device performance has improved faster than that of a single logical CPU. It seems this trend will not change in the near future. On the other hand, THP is becoming more and more popular because of increased memory sizes. So it has become necessary to optimize THP swap performance.

The advantages of THP swap support include:

- Batching the swap operations for a THP to reduce lock acquiring/releasing, including allocating/freeing the swap space, adding/deleting to/from the swap cache, and writing/reading the swap space, etc. This will help improve the performance of THP swap.

- The THP swap space read/write will be 2M sequential IO. This is particularly helpful for swap read, which is usually 4k random IO. This will improve the performance of THP swap too.

- It will help memory fragmentation, especially when the THP is heavily used by applications. The 2M continuous pages will be freed up after the THP is swapped out.

- It will improve THP utilization on systems with swap turned on, because the speed at which khugepaged collapses normal pages into a THP is quite slow. After a THP is split during swap-out, it takes quite a long time for the normal pages to collapse back into a THP after being swapped in. High THP utilization also helps the efficiency of page-based memory management.

There are some concerns regarding THP swap-in, mainly because the possibly enlarged read/write IO size (for swap in/out) may put more overhead on the storage device. To deal with that, THP swap-in should be turned on only when necessary.
For example, it could be selected via "always/never/madvise" logic, to be turned on globally, turned off globally, or turned on only for VMAs with MADV_HUGEPAGE, etc.

This patchset is the first step of THP swap support. The plan is to delay splitting the THP step by step, finally avoiding splitting the THP during swap-out and swapping the THP out/in as a whole.

As the first step, in this patchset, splitting the huge page is delayed from almost the first step of swap-out to after allocating the swap space for the THP and adding the THP into the swap cache. This will reduce lock acquiring/releasing for the locks used for swap cache management.

With the patchset, the swap-out throughput improves 15.5% (from about 3.73GB/s to about 4.31GB/s) in the vm-scalability swap-w-seq test case with 8 processes. The test is done on a Xeon E5 v3 system. The swap device used is a RAM-simulated PMEM (persistent memory) device. To test sequential swap-out, the test case creates 8 processes, which sequentially allocate and write to the anonymous pages until the RAM and part of the swap device are used up.

This patch (of 5):

In this patch, splitting the huge page is delayed from almost the first step of swap-out to after allocating the swap space for the THP (Transparent Huge Page) and adding the THP into the swap cache. This will batch the corresponding operations, thus improving THP swap-out throughput.

This is the first step of the THP swap optimization. The plan is to delay splitting the THP step by step and finally avoid splitting the THP at all.

In this patch, one swap cluster is used to hold the contents of each THP swapped out. So the size of the swap cluster is changed to that of the THP on the x86_64 architecture (512 pages). Other architectures that want this THP swap optimization need to select ARCH_USES_THP_SWAP_CLUSTER in the Kconfig file for the architecture. In effect, this enlarges the swap cluster size by 2 times on x86_64, which may make it harder to find a free cluster when the swap space becomes fragmented. So this may reduce continuous swap space allocation and sequential write in theory. The performance test in 0day shows no regressions caused by this.

In the future of THP swap optimization, some information about the swapped-out THP (such as the compound map count) will be recorded in the swap_cluster_info data structure.

The mem cgroup swap accounting functions are enhanced to support charging or uncharging a swap cluster backing a THP as a whole.

The swap cluster allocate/free functions are added to allocate/free a swap cluster for a THP. A fairly simple algorithm is used for swap cluster allocation: only the first swap device in the priority list is tried for the swap cluster allocation. The function fails if the attempt is not successful, and the caller falls back to allocating a single swap slot instead. This works well enough for normal cases. If the difference in the number of free swap clusters among multiple swap devices is significant, it is possible that some THPs are split earlier than necessary; for example, this could be caused by a big size difference among multiple swap devices.

The swap cache functions are enhanced to support adding/deleting a THP to/from the swap cache as a set of (HPAGE_PMD_NR) sub-pages. This may be enhanced in the future with a multi-order radix tree, but because we will split the THP soon during swap-out anyway, that optimization doesn't make much sense for this first step.

The THP splitting functions are enhanced to support splitting a THP in the swap cache during swap-out.

The page lock will be held during allocating the swap cluster, adding the THP into the swap cache and splitting the THP. So in code paths other than swap-out, if the THP needs to be split, PageSwapCache(THP) will always be false.

The swap cluster is only available for SSD, so the THP swap optimization in this patchset has no effect for HDD.
[ying.huang@intel.com: fix two issues in THP optimize patch]
  Link: http://lkml.kernel.org/r/87k25ed8zo.fsf@yhuang-dev.intel.com
[hannes@cmpxchg.org: extensive cleanups and simplifications, reduce code size]
Link: http://lkml.kernel.org/r/20170515112522.32457-2-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Andrew Morton <akpm@linux-foundation.org> [for config option]
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> [for changes in huge_memory.c and huge_mm.h]
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/swap_state.c')
-rw-r--r--	mm/swap_state.c	114
1 file changed, 70 insertions(+), 44 deletions(-)
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 539b8885e3d1..16ff89d058f4 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -19,6 +19,7 @@
 #include <linux/migrate.h>
 #include <linux/vmalloc.h>
 #include <linux/swap_slots.h>
+#include <linux/huge_mm.h>
 
 #include <asm/pgtable.h>
 
@@ -38,6 +39,7 @@ struct address_space *swapper_spaces[MAX_SWAPFILES];
 static unsigned int nr_swapper_spaces[MAX_SWAPFILES];
 
 #define INC_CACHE_INFO(x)	do { swap_cache_info.x++; } while (0)
+#define ADD_CACHE_INFO(x, nr)	do { swap_cache_info.x += (nr); } while (0)
 
 static struct {
 	unsigned long add_total;
@@ -90,39 +92,46 @@ void show_swap_cache_info(void)
  */
 int __add_to_swap_cache(struct page *page, swp_entry_t entry)
 {
-	int error;
+	int error, i, nr = hpage_nr_pages(page);
 	struct address_space *address_space;
+	pgoff_t idx = swp_offset(entry);
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(PageSwapCache(page), page);
 	VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
 
-	get_page(page);
+	page_ref_add(page, nr);
 	SetPageSwapCache(page);
-	set_page_private(page, entry.val);
 
 	address_space = swap_address_space(entry);
 	spin_lock_irq(&address_space->tree_lock);
-	error = radix_tree_insert(&address_space->page_tree,
-				  swp_offset(entry), page);
-	if (likely(!error)) {
-		address_space->nrpages++;
-		__inc_node_page_state(page, NR_FILE_PAGES);
-		INC_CACHE_INFO(add_total);
+	for (i = 0; i < nr; i++) {
+		set_page_private(page + i, entry.val + i);
+		error = radix_tree_insert(&address_space->page_tree,
+					  idx + i, page + i);
+		if (unlikely(error))
+			break;
 	}
-	spin_unlock_irq(&address_space->tree_lock);
-
-	if (unlikely(error)) {
+	if (likely(!error)) {
+		address_space->nrpages += nr;
+		__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr);
+		ADD_CACHE_INFO(add_total, nr);
+	} else {
 		/*
 		 * Only the context which have set SWAP_HAS_CACHE flag
 		 * would call add_to_swap_cache().
 		 * So add_to_swap_cache() doesn't returns -EEXIST.
 		 */
 		VM_BUG_ON(error == -EEXIST);
-		set_page_private(page, 0UL);
+		set_page_private(page + i, 0UL);
+		while (i--) {
+			radix_tree_delete(&address_space->page_tree, idx + i);
+			set_page_private(page + i, 0UL);
+		}
 		ClearPageSwapCache(page);
-		put_page(page);
+		page_ref_sub(page, nr);
 	}
+	spin_unlock_irq(&address_space->tree_lock);
 
 	return error;
 }
@@ -132,7 +141,7 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp_mask)
 {
 	int error;
 
-	error = radix_tree_maybe_preload(gfp_mask);
+	error = radix_tree_maybe_preload_order(gfp_mask, compound_order(page));
 	if (!error) {
 		error = __add_to_swap_cache(page, entry);
 		radix_tree_preload_end();
@@ -146,8 +155,10 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp_mask)
  */
 void __delete_from_swap_cache(struct page *page)
 {
-	swp_entry_t entry;
 	struct address_space *address_space;
+	int i, nr = hpage_nr_pages(page);
+	swp_entry_t entry;
+	pgoff_t idx;
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
@@ -155,12 +166,15 @@ void __delete_from_swap_cache(struct page *page)
 
 	entry.val = page_private(page);
 	address_space = swap_address_space(entry);
-	radix_tree_delete(&address_space->page_tree, swp_offset(entry));
-	set_page_private(page, 0);
+	idx = swp_offset(entry);
+	for (i = 0; i < nr; i++) {
+		radix_tree_delete(&address_space->page_tree, idx + i);
+		set_page_private(page + i, 0);
+	}
 	ClearPageSwapCache(page);
-	address_space->nrpages--;
-	__dec_node_page_state(page, NR_FILE_PAGES);
-	INC_CACHE_INFO(del_total);
+	address_space->nrpages -= nr;
+	__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, -nr);
+	ADD_CACHE_INFO(del_total, nr);
 }
 
 /**
@@ -178,20 +192,12 @@ int add_to_swap(struct page *page, struct list_head *list)
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(!PageUptodate(page), page);
 
-	entry = get_swap_page();
+retry:
+	entry = get_swap_page(page);
 	if (!entry.val)
-		return 0;
-
-	if (mem_cgroup_try_charge_swap(page, entry)) {
-		swapcache_free(entry);
-		return 0;
-	}
-
-	if (unlikely(PageTransHuge(page)))
-		if (unlikely(split_huge_page_to_list(page, list))) {
-			swapcache_free(entry);
-			return 0;
-		}
+		goto fail;
+	if (mem_cgroup_try_charge_swap(page, entry))
+		goto fail_free;
 
 	/*
 	 * Radix-tree node allocations from PF_MEMALLOC contexts could
@@ -206,17 +212,33 @@ int add_to_swap(struct page *page, struct list_head *list)
 	 */
 	err = add_to_swap_cache(page, entry,
 			__GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);
-
-	if (!err) {
-		return 1;
-	} else {	/* -ENOMEM radix-tree allocation failure */
+	/* -ENOMEM radix-tree allocation failure */
+	if (err)
 		/*
 		 * add_to_swap_cache() doesn't return -EEXIST, so we can safely
 		 * clear SWAP_HAS_CACHE flag.
 		 */
-		swapcache_free(entry);
-		return 0;
+		goto fail_free;
+
+	if (PageTransHuge(page)) {
+		err = split_huge_page_to_list(page, list);
+		if (err) {
+			delete_from_swap_cache(page);
+			return 0;
+		}
 	}
+
+	return 1;
+
+fail_free:
+	if (PageTransHuge(page))
+		swapcache_free_cluster(entry);
+	else
+		swapcache_free(entry);
+fail:
+	if (PageTransHuge(page) && !split_huge_page_to_list(page, list))
+		goto retry;
+	return 0;
 }
 
 /*
@@ -237,8 +259,12 @@ void delete_from_swap_cache(struct page *page)
 	__delete_from_swap_cache(page);
 	spin_unlock_irq(&address_space->tree_lock);
 
-	swapcache_free(entry);
-	put_page(page);
+	if (PageTransHuge(page))
+		swapcache_free_cluster(entry);
+	else
+		swapcache_free(entry);
+
+	page_ref_sub(page, hpage_nr_pages(page));
 }
 
 /*
@@ -295,7 +321,7 @@ struct page * lookup_swap_cache(swp_entry_t entry)
 
 	page = find_get_page(swap_address_space(entry), swp_offset(entry));
 
-	if (page) {
+	if (page && likely(!PageTransCompound(page))) {
 		INC_CACHE_INFO(find_success);
 		if (TestClearPageReadahead(page))
 			atomic_inc(&swapin_readahead_hits);
@@ -506,7 +532,7 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
 			gfp_mask, vma, addr);
 		if (!page)
 			continue;
-		if (offset != entry_offset)
+		if (offset != entry_offset && likely(!PageTransCompound(page)))
 			SetPageReadahead(page);
 		put_page(page);
 	}