author	Christoph Lameter <clameter@engr.sgi.com>	2006-03-09 20:33:54 -0500
committer	Linus Torvalds <torvalds@g5.osdl.org>	2006-03-09 22:47:38 -0500
commit	8fce4d8e3b9e3cf47cc8afeb6077e22ab795d989 (patch)
tree	4930be5756f7a3893717d38f443f6261f11a1f60 /mm/page_alloc.c
parent	7b61fcda8a640bb87be23f9f09c1f24357b5c6e1 (diff)
[PATCH] slab: Node rotor for freeing alien caches and remote per cpu pages.
The cache reaper currently tries to free all alien caches and all remote per cpu pages in each pass of cache_reap. For machines with a large number of nodes (such as Altix) this may lead to sporadic delays of around ~10ms. Interrupts are disabled while reclaiming, creating unacceptable delays.

This patch changes that behavior by adding a per cpu reap_node variable. Instead of attempting to free all caches, we free only one alien cache and the per cpu pages from one remote node. That reduces the time spent in cache_reap. However, doing so lengthens the time it takes to completely drain all remote per cpu pagesets and all alien caches, and the time needed grows with the number of nodes in the system. All caches are drained when they overflow their respective capacity, so the only drawback is that a bit of memory may be wasted for a while longer.

Details:

1. Rename drain_remote_pages to drain_node_pages to allow the specification of the node to drain of pcp pages.

2. Add additional functions init_reap_node and next_reap_node for NUMA that manage a per cpu reap_node counter.

3. Add a reap_alien function that reaps only from the current reap_node.

For us this seems to be a critical issue. Holdoffs of an average of ~7ms cause some HPC benchmarks to slow down significantly. For example, NAS parallel slows down dramatically: it has a 12-16 second runtime without the rotor compared to 5.8 secs with the rotor patches. It gets down to 5.05 secs with the additional interrupt holdoff reductions.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
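For illustration, a minimal sketch of the reap_node rotor described above. This is not a copy of the patch's mm/slab.c code: the helpers used (DEFINE_PER_CPU, per_cpu, __get_cpu_var, next_node, node_online_map, cpu_to_node) are assumed from the 2.6-era kernel APIs, and cache_reap_pass is a hypothetical stand-in for the relevant part of cache_reap().

static DEFINE_PER_CPU(int, reap_node);	/* node this cpu will reap next */

static void init_reap_node(int cpu)
{
	int node;

	/* Start each cpu's rotor at the node after its own. */
	node = next_node(cpu_to_node(cpu), node_online_map);
	if (node == MAX_NUMNODES)
		node = 0;

	per_cpu(reap_node, cpu) = node;
}

static void next_reap_node(void)
{
	int node = __get_cpu_var(reap_node);

	/* Advance to the next online node, wrapping around. */
	node = next_node(node, node_online_map);
	if (node == MAX_NUMNODES)
		node = 0;
	__get_cpu_var(reap_node) = node;
}

/*
 * Hypothetical reaper pass (sketch only): drain the pcp pages of one
 * remote node per pass (and, analogously, one alien cache via
 * reap_alien) instead of all of them, then rotate to the next node.
 */
static void cache_reap_pass(void)
{
	int node = __get_cpu_var(reap_node);

	if (node != numa_node_id())
		drain_node_pages(node);

	next_reap_node();
}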
Diffstat (limited to 'mm/page_alloc.c')
-rw-r--r--	mm/page_alloc.c	17
1 file changed, 8 insertions, 9 deletions
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 791690d7d3fa..234bd4895d14 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -590,21 +590,20 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 }
 
 #ifdef CONFIG_NUMA
-/* Called from the slab reaper to drain remote pagesets */
-void drain_remote_pages(void)
+/*
+ * Called from the slab reaper to drain pagesets on a particular node that
+ * belong to the currently executing processor.
+ */
+void drain_node_pages(int nodeid)
 {
-	struct zone *zone;
-	int i;
+	int i, z;
 	unsigned long flags;
 
 	local_irq_save(flags);
-	for_each_zone(zone) {
+	for (z = 0; z < MAX_NR_ZONES; z++) {
+		struct zone *zone = NODE_DATA(nodeid)->node_zones + z;
 		struct per_cpu_pageset *pset;
 
-		/* Do not drain local pagesets */
-		if (zone->zone_pgdat->node_id == numa_node_id())
-			continue;
-
 		pset = zone_pcp(zone, smp_processor_id());
 		for (i = 0; i < ARRAY_SIZE(pset->pcp); i++) {
 			struct per_cpu_pages *pcp;