 Documentation/vm/cleancache.txt | 37
 fs/block_dev.c                  |  2
 fs/super.c                      |  2
 include/linux/cleancache.h      | 23
 mm/cleancache.c                 | 19
 mm/filemap.c                    |  2
 mm/truncate.c                   | 10
 7 files changed, 51 insertions(+), 44 deletions(-)
diff --git a/Documentation/vm/cleancache.txt b/Documentation/vm/cleancache.txt
index 36c367c73084..e0a535677b7b 100644
--- a/Documentation/vm/cleancache.txt
+++ b/Documentation/vm/cleancache.txt
@@ -46,10 +46,11 @@ a negative return value indicates failure. A "put_page" will copy a
 the pool id, a file key, and a page index into the file. (The combination
 of a pool id, a file key, and an index is sometimes called a "handle".)
 A "get_page" will copy the page, if found, from cleancache into kernel memory.
-A "flush_page" will ensure the page no longer is present in cleancache;
-a "flush_inode" will flush all pages associated with the specified file;
-and, when a filesystem is unmounted, a "flush_fs" will flush all pages in
-all files specified by the given pool id and also surrender the pool id.
+An "invalidate_page" will ensure the page no longer is present in cleancache;
+an "invalidate_inode" will invalidate all pages associated with the specified
+file; and, when a filesystem is unmounted, an "invalidate_fs" will invalidate
+all pages in all files specified by the given pool id and also surrender
+the pool id.
 
 An "init_shared_fs", like init_fs, obtains a pool id but tells cleancache
 to treat the pool as shared using a 128-bit UUID as a key. On systems
@@ -62,12 +63,12 @@ of the kernel (e.g. by "tools" that control cleancache). Or a
 cleancache implementation can simply disable shared_init by always
 returning a negative value.
 
-If a get_page is successful on a non-shared pool, the page is flushed (thus
-making cleancache an "exclusive" cache). On a shared pool, the page
-is NOT flushed on a successful get_page so that it remains accessible to
+If a get_page is successful on a non-shared pool, the page is invalidated
+(thus making cleancache an "exclusive" cache). On a shared pool, the page
+is NOT invalidated on a successful get_page so that it remains accessible to
 other sharers. The kernel is responsible for ensuring coherency between
 cleancache (shared or not), the page cache, and the filesystem, using
-cleancache flush operations as required.
+cleancache invalidate operations as required.
 
 Note that cleancache must enforce put-put-get coherency and get-get
 coherency. For the former, if two puts are made to the same handle but
@@ -77,7 +78,7 @@ if a get for a given handle fails, subsequent gets for that handle will
 never succeed unless preceded by a successful put with that handle.
 
 Last, cleancache provides no SMP serialization guarantees; if two
-different Linux threads are simultaneously putting and flushing a page
+different Linux threads are simultaneously putting and invalidating a page
 with the same handle, the results are indeterminate. Callers must
 lock the page to ensure serial behavior.
 
@@ -90,7 +91,7 @@ can be measured (across all filesystems) with:
 succ_gets - number of gets that were successful
 failed_gets - number of gets that failed
 puts - number of puts attempted (all "succeed")
-flushes - number of flushes attempted
+invalidates - number of invalidates attempted
 
 A backend implementatation may provide additional metrics.
 
@@ -143,7 +144,7 @@ systems.
 
 The core hooks for cleancache in VFS are in most cases a single line
 and the minimum set are placed precisely where needed to maintain
-coherency (via cleancache_flush operations) between cleancache,
+coherency (via cleancache_invalidate operations) between cleancache,
 the page cache, and disk. All hooks compile into nothingness if
 cleancache is config'ed off and turn into a function-pointer-
 compare-to-NULL if config'ed on but no backend claims the ops
@@ -184,15 +185,15 @@ or for real kernel-addressable RAM, it makes perfect sense for
 transcendent memory.
 
 4) Why is non-shared cleancache "exclusive"? And where is the
-   page "flushed" after a "get"? (Minchan Kim)
+   page "invalidated" after a "get"? (Minchan Kim)
 
 The main reason is to free up space in transcendent memory and
-to avoid unnecessary cleancache_flush calls. If you want inclusive,
+to avoid unnecessary cleancache_invalidate calls. If you want inclusive,
 the page can be "put" immediately following the "get". If
 put-after-get for inclusive becomes common, the interface could
-be easily extended to add a "get_no_flush" call.
+be easily extended to add a "get_no_invalidate" call.
 
-The flush is done by the cleancache backend implementation.
+The invalidate is done by the cleancache backend implementation.
 
 5) What's the performance impact?
 
@@ -222,7 +223,7 @@ Some points for a filesystem to consider:
   as tmpfs should not enable cleancache)
 - To ensure coherency/correctness, the FS must ensure that all
   file removal or truncation operations either go through VFS or
-  add hooks to do the equivalent cleancache "flush" operations
+  add hooks to do the equivalent cleancache "invalidate" operations
 - To ensure coherency/correctness, either inode numbers must
   be unique across the lifetime of the on-disk file OR the
   FS must provide an "encode_fh" function.
@@ -243,11 +244,11 @@ If cleancache would use the inode virtual address instead of
 inode/filehandle, the pool id could be eliminated. But, this
 won't work because cleancache retains pagecache data pages
 persistently even when the inode has been pruned from the
-inode unused list, and only flushes the data page if the file
+inode unused list, and only invalidates the data page if the file
 gets removed/truncated. So if cleancache used the inode kva,
 there would be potential coherency issues if/when the inode
 kva is reused for a different file. Alternately, if cleancache
-flushed the pages when the inode kva was freed, much of the value
+invalidated the pages when the inode kva was freed, much of the value
 of cleancache would be lost because the cache of pages in cleanache
 is potentially much larger than the kernel pagecache and is most
 useful if the pages survive inode cache removal.
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 69a5b6fbee2b..d6d5f29463cd 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -110,7 +110,7 @@ void invalidate_bdev(struct block_device *bdev)
 	/* 99% of the time, we don't need to flush the cleancache on the bdev.
 	 * But, for the strange corners, lets be cautious
 	 */
-	cleancache_flush_inode(mapping);
+	cleancache_invalidate_inode(mapping);
 }
 EXPORT_SYMBOL(invalidate_bdev);
 
diff --git a/fs/super.c b/fs/super.c
index de41e1e46f09..e5d9765ff5f4 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -250,7 +250,7 @@ void deactivate_locked_super(struct super_block *s)
 {
 	struct file_system_type *fs = s->s_type;
 	if (atomic_dec_and_test(&s->s_active)) {
-		cleancache_flush_fs(s);
+		cleancache_invalidate_fs(s);
 		fs->kill_sb(s);
 
 		/* caches are now gone, we can safely kill the shrinker now */
diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h
index 04ffb2e6c9d0..66fb63b243a8 100644
--- a/include/linux/cleancache.h
+++ b/include/linux/cleancache.h
@@ -28,6 +28,11 @@ struct cleancache_ops {
 			pgoff_t, struct page *);
 	void (*put_page)(int, struct cleancache_filekey,
 			pgoff_t, struct page *);
+	/*
+	 * NOTE: per akpm, flush_page, flush_inode and flush_fs will be
+	 * renamed to invalidate_* in a later commit in which all
+	 * dependencies (i.e Xen, zcache) will be renamed simultaneously
+	 */
 	void (*flush_page)(int, struct cleancache_filekey, pgoff_t);
 	void (*flush_inode)(int, struct cleancache_filekey);
 	void (*flush_fs)(int);
@@ -39,9 +44,9 @@ extern void __cleancache_init_fs(struct super_block *);
 extern void __cleancache_init_shared_fs(char *, struct super_block *);
 extern int __cleancache_get_page(struct page *);
 extern void __cleancache_put_page(struct page *);
-extern void __cleancache_flush_page(struct address_space *, struct page *);
-extern void __cleancache_flush_inode(struct address_space *);
-extern void __cleancache_flush_fs(struct super_block *);
+extern void __cleancache_invalidate_page(struct address_space *, struct page *);
+extern void __cleancache_invalidate_inode(struct address_space *);
+extern void __cleancache_invalidate_fs(struct super_block *);
 extern int cleancache_enabled;
 
 #ifdef CONFIG_CLEANCACHE
@@ -99,24 +104,24 @@ static inline void cleancache_put_page(struct page *page)
 	__cleancache_put_page(page);
 }
 
-static inline void cleancache_flush_page(struct address_space *mapping,
-					struct page *page)
+static inline void cleancache_invalidate_page(struct address_space *mapping,
+					struct page *page)
 {
 	/* careful... page->mapping is NULL sometimes when this is called */
 	if (cleancache_enabled && cleancache_fs_enabled_mapping(mapping))
-		__cleancache_flush_page(mapping, page);
+		__cleancache_invalidate_page(mapping, page);
 }
 
-static inline void cleancache_flush_inode(struct address_space *mapping)
+static inline void cleancache_invalidate_inode(struct address_space *mapping)
 {
 	if (cleancache_enabled && cleancache_fs_enabled_mapping(mapping))
-		__cleancache_flush_inode(mapping);
+		__cleancache_invalidate_inode(mapping);
 }
 
-static inline void cleancache_flush_fs(struct super_block *sb)
+static inline void cleancache_invalidate_fs(struct super_block *sb)
 {
 	if (cleancache_enabled)
-		__cleancache_flush_fs(sb);
+		__cleancache_invalidate_fs(sb);
 }
 
 #endif /* _LINUX_CLEANCACHE_H */
diff --git a/mm/cleancache.c b/mm/cleancache.c
index bcaae4c2a770..237c6e0feea0 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -19,7 +19,7 @@
 
 /*
  * This global enablement flag may be read thousands of times per second
- * by cleancache_get/put/flush even on systems where cleancache_ops
+ * by cleancache_get/put/invalidate even on systems where cleancache_ops
  * is not claimed (e.g. cleancache is config'ed on but remains
  * disabled), so is preferred to the slower alternative: a function
  * call that checks a non-global.
@@ -148,10 +148,11 @@ void __cleancache_put_page(struct page *page)
 EXPORT_SYMBOL(__cleancache_put_page);
 
 /*
- * Flush any data from cleancache associated with the poolid and the
+ * Invalidate any data from cleancache associated with the poolid and the
  * page's inode and page index so that a subsequent "get" will fail.
  */
-void __cleancache_flush_page(struct address_space *mapping, struct page *page)
+void __cleancache_invalidate_page(struct address_space *mapping,
+					struct page *page)
 {
 	/* careful... page->mapping is NULL sometimes when this is called */
 	int pool_id = mapping->host->i_sb->cleancache_poolid;
@@ -165,14 +166,14 @@ void __cleancache_flush_page(struct address_space *mapping, struct page *page)
 		}
 	}
 }
-EXPORT_SYMBOL(__cleancache_flush_page);
+EXPORT_SYMBOL(__cleancache_invalidate_page);
 
 /*
- * Flush all data from cleancache associated with the poolid and the
+ * Invalidate all data from cleancache associated with the poolid and the
  * mappings's inode so that all subsequent gets to this poolid/inode
  * will fail.
  */
-void __cleancache_flush_inode(struct address_space *mapping)
+void __cleancache_invalidate_inode(struct address_space *mapping)
 {
 	int pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
@@ -180,14 +181,14 @@ void __cleancache_flush_inode(struct address_space *mapping)
 	if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
 		(*cleancache_ops.flush_inode)(pool_id, key);
 }
-EXPORT_SYMBOL(__cleancache_flush_inode);
+EXPORT_SYMBOL(__cleancache_invalidate_inode);
 
 /*
  * Called by any cleancache-enabled filesystem at time of unmount;
  * note that pool_id is surrendered and may be reutrned by a subsequent
  * cleancache_init_fs or cleancache_init_shared_fs
  */
-void __cleancache_flush_fs(struct super_block *sb)
+void __cleancache_invalidate_fs(struct super_block *sb)
 {
 	if (sb->cleancache_poolid >= 0) {
 		int old_poolid = sb->cleancache_poolid;
@@ -195,7 +196,7 @@ void __cleancache_flush_fs(struct super_block *sb)
 		(*cleancache_ops.flush_fs)(old_poolid);
 	}
 }
-EXPORT_SYMBOL(__cleancache_flush_fs);
+EXPORT_SYMBOL(__cleancache_invalidate_fs);
 
 #ifdef CONFIG_SYSFS
 
diff --git a/mm/filemap.c b/mm/filemap.c
index a0701e6eec10..0aa3faa48219 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -123,7 +123,7 @@ void __delete_from_page_cache(struct page *page)
 	if (PageUptodate(page) && PageMappedToDisk(page))
 		cleancache_put_page(page);
 	else
-		cleancache_flush_page(mapping, page);
+		cleancache_invalidate_page(mapping, page);
 
 	radix_tree_delete(&mapping->page_tree, page->index);
 	page->mapping = NULL;
diff --git a/mm/truncate.c b/mm/truncate.c
index 632b15e29f74..b4d575c9a0ee 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -52,7 +52,7 @@ void do_invalidatepage(struct page *page, unsigned long offset)
 static inline void truncate_partial_page(struct page *page, unsigned partial)
 {
 	zero_user_segment(page, partial, PAGE_CACHE_SIZE);
-	cleancache_flush_page(page->mapping, page);
+	cleancache_invalidate_page(page->mapping, page);
 	if (page_has_private(page))
 		do_invalidatepage(page, partial);
 }
@@ -213,7 +213,7 @@ void truncate_inode_pages_range(struct address_space *mapping,
 	pgoff_t end;
 	int i;
 
-	cleancache_flush_inode(mapping);
+	cleancache_invalidate_inode(mapping);
 	if (mapping->nrpages == 0)
 		return;
 
@@ -292,7 +292,7 @@ void truncate_inode_pages_range(struct address_space *mapping,
 		mem_cgroup_uncharge_end();
 		index++;
 	}
-	cleancache_flush_inode(mapping);
+	cleancache_invalidate_inode(mapping);
 }
 EXPORT_SYMBOL(truncate_inode_pages_range);
 
@@ -444,7 +444,7 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
 	int ret2 = 0;
 	int did_range_unmap = 0;
 
-	cleancache_flush_inode(mapping);
+	cleancache_invalidate_inode(mapping);
 	pagevec_init(&pvec, 0);
 	index = start;
 	while (index <= end && pagevec_lookup(&pvec, mapping, index,
@@ -500,7 +500,7 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
 		cond_resched();
 		index++;
 	}
-	cleancache_flush_inode(mapping);
+	cleancache_invalidate_inode(mapping);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(invalidate_inode_pages2_range);