author     Vladimir Davydov <vdavydov@parallels.com>       2014-06-04 19:07:20 -0400
committer  Linus Torvalds <torvalds@linux-foundation.org>  2014-06-04 19:53:59 -0400
commit     03afc0e25f7fc03537014a770f4c54ebbe63a24c (patch)
tree       520cdb32e6d35cd5b4e61fc5254a151fb03fc24a /mm/slub.c
parent     bfc8c90139ebd049b9801a951db3b9a4a00bed9c (diff)
slab: get_online_mems for kmem_cache_{create,destroy,shrink}
When we create a sl[au]b cache, we allocate kmem_cache_node structures
for each online NUMA node. To handle nodes being taken online/offline, we
register a memory hotplug notifier and, for each kmem cache, allocate/free
the kmem_cache_node corresponding to the node that changes its state.
To synchronize the two paths we hold the slab_mutex during both the cache
creation/destruction path and while tuning the per-node parts of kmem
caches in the memory hotplug handler, but that is not enough: it does not
guarantee that a newly created cache will have all of its kmem_cache_nodes
initialized if it races with memory hotplug. For instance, in the case of
slub:
  CPU0                                  CPU1
  ----                                  ----
  kmem_cache_create:                    online_pages:
   __kmem_cache_create:                  slab_memory_callback:
                                          slab_mem_going_online_callback:
                                           lock slab_mutex
                                           for each slab_caches list entry
                                               allocate kmem_cache node
                                           unlock slab_mutex
    lock slab_mutex
    init_kmem_cache_nodes:
     for_each_node_state(node, N_NORMAL_MEMORY)
      allocate kmem_cache node
    add kmem_cache to slab_caches list
    unlock slab_mutex
                                        online_pages (continued):
                                         node_states_set_node
As a result we end up with a kmem cache that does not have all of its
kmem_cache_nodes allocated.
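
For reference, a condensed sketch of the slub notifier path that CPU1 runs
above (not the verbatim kernel source; names follow mm/slub.c): it only walks
the caches already on slab_caches, which is why a cache that races past this
walk before being added to the list never gets a kmem_cache_node for the
newly onlined node.

    /* sketch of the pre-patch online callback, simplified */
    static int slab_mem_going_online_callback(void *arg)
    {
            struct memory_notify *marg = arg;
            int nid = marg->status_change_nid_normal;
            struct kmem_cache *s;

            if (nid < 0)            /* no new normal memory on this node */
                    return 0;

            mutex_lock(&slab_mutex);
            list_for_each_entry(s, &slab_caches, list) {
                    struct kmem_cache_node *n;

                    /* per-node structure for the freshly onlined node */
                    n = kmem_cache_alloc(kmem_cache_node, GFP_KERNEL);
                    if (!n) {
                            mutex_unlock(&slab_mutex);
                            return -ENOMEM;
                    }
                    init_kmem_cache_node(n);
                    s->node[nid] = n;
            }
            mutex_unlock(&slab_mutex);

            return 0;
    }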
To avoid issues like that, we should hold get/put_online_mems() across the
whole kmem cache creation/destruction/shrink paths, just as we already do
for cpu hotplug. This patch does the trick.

Note that after it is applied there is no longer any need to take the
slab_mutex for kmem_cache_shrink, so it is removed from there.
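
The mm/slub.c hunks below only rename kmem_cache_shrink() to
__kmem_cache_shrink() and drop its EXPORT_SYMBOL; the exported wrapper that
actually pins cpu/memory hotplug presumably lands in mm/slab_common.c, which
this diffstat does not cover. A minimal sketch of that wrapper pattern,
assuming the get/put_online_mems() helpers introduced by the parent commit:

    /* sketch: generic wrapper outside the mm/slub.c diff shown below */
    int kmem_cache_shrink(struct kmem_cache *cachep)
    {
            int ret;

            get_online_cpus();      /* keep cpu hotplug out of the shrink */
            get_online_mems();      /* keep memory hotplug out as well */
            ret = __kmem_cache_shrink(cachep);
            put_online_mems();
            put_online_cpus();

            return ret;
    }
    EXPORT_SYMBOL(kmem_cache_shrink);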
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'mm/slub.c')
-rw-r--r--  mm/slub.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
@@ -3398,7 +3398,7 @@ EXPORT_SYMBOL(kfree);
  * being allocated from last increasing the chance that the last objects
  * are freed in them.
  */
-int kmem_cache_shrink(struct kmem_cache *s)
+int __kmem_cache_shrink(struct kmem_cache *s)
 {
         int node;
         int i;
@@ -3454,7 +3454,6 @@ int kmem_cache_shrink(struct kmem_cache *s)
         kfree(slabs_by_inuse);
         return 0;
 }
-EXPORT_SYMBOL(kmem_cache_shrink);
 
 static int slab_mem_going_offline_callback(void *arg)
 {
@@ -3462,7 +3461,7 @@ static int slab_mem_going_offline_callback(void *arg)
 
         mutex_lock(&slab_mutex);
         list_for_each_entry(s, &slab_caches, list)
-                kmem_cache_shrink(s);
+                __kmem_cache_shrink(s);
         mutex_unlock(&slab_mutex);
 
         return 0;