Diffstat (limited to 'Documentation/vm/numa')
-rw-r--r-- | Documentation/vm/numa | 41 |
1 files changed, 41 insertions, 0 deletions
diff --git a/Documentation/vm/numa b/Documentation/vm/numa
new file mode 100644
index 000000000000..4b8db1bd3b78
--- /dev/null
+++ b/Documentation/vm/numa
@@ -0,0 +1,41 @@
1 | Started Nov 1999 by Kanoj Sarcar <kanoj@sgi.com> | ||
2 | |||
3 | The intent of this file is to have an up-to-date, running commentary | ||
4 | from different people about NUMA specific code in the Linux vm. | ||
5 | |||
6 | What is NUMA? It is an architecture where the memory access times | ||
7 | for different regions of memory from a given processor vary | ||
8 | according to the "distance" of the memory region from the processor. | ||
9 | Each region of memory to which access times are the same from any | ||
10 | CPU is called a node. On such architectures, it is beneficial if | ||
11 | the kernel tries to minimize inter-node communication. Schemes | ||
12 | for this range from replicating kernel text and read-only data | ||
13 | across nodes to housing all the data structures that key components | ||
14 | of the kernel need in memory on the node where those components run. | ||
15 | |||
16 | Currently, the NUMA support only provides efficient handling | ||
17 | of widely discontiguous physical memory, so architectures which | ||
18 | are not NUMA but can have huge holes in the physical address space | ||
19 | can use the same code. All this code is bracketed by CONFIG_DISCONTIGMEM. | ||
20 | |||
21 | The initial port includes NUMAizing the bootmem allocator code by | ||
22 | encapsulating all the pieces of information into a bootmem_data_t | ||
23 | structure. Node specific calls have been added to the allocator. | ||
24 | In theory, any platform which uses the bootmem allocator should | ||
25 | be able to put the bootmem and mem_map data structures anywhere | ||
26 | it deems best. | ||
27 | |||
28 | Each node's page allocation data structures have also been encapsulated | ||
29 | into a pg_data_t. The bootmem_data_t is just one part of this. To | ||
30 | make the code look uniform between NUMA and regular UMA platforms, | ||
31 | UMA platforms have a statically allocated pg_data_t too (contig_page_data). | ||
32 | For the sake of uniformity, the function num_online_nodes() is also defined | ||
33 | for all platforms. As we run benchmarks, we might decide to NUMAize | ||
34 | more variables like low_on_memory, nr_free_pages etc. into the pg_data_t. | ||
35 | |||
36 | The NUMA-aware page allocation code currently tries to allocate pages | ||
37 | from different nodes in a round-robin manner. This will be changed to | ||
38 | do a concentric circle search, starting from the current node, once the | ||
39 | NUMA port achieves more maturity. The call alloc_pages_node has been | ||
40 | added, so that drivers can make the call and not worry about whether | ||
41 | they are running on a NUMA or UMA platform. | ||
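
The two node-selection policies described in the last paragraph of the
new file can be contrasted with a small userspace model. This is only a
sketch and not the kernel's allocator: struct node_model, the
node_distance table, and both pick_node_* helpers are hypothetical names
made up for illustration, and the distance values are invented. The
first helper rotates through nodes regardless of locality; the second
picks the nearest node that still has free pages, which is one plausible
reading of the planned search outward from the current node.

```c
#include <assert.h>

#define MAX_NODES 4

/* Hypothetical per-node state; the real pg_data_t holds far more. */
struct node_model {
	unsigned long free_pages;
};

/*
 * Invented relative access costs: node_distance[a][b] is the cost for
 * a CPU on node a to touch memory on node b (smaller is closer).
 */
static const int node_distance[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 40, 30, 20, 10 },
};

/*
 * Round robin: start at *cursor, take the first node with free pages,
 * and advance the cursor past it. Locality is ignored.
 * Returns the node id, or -1 if every node is empty.
 */
static int pick_node_round_robin(const struct node_model *nodes,
				 int nr_nodes, int *cursor)
{
	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);

	for (int i = 0; i < nr_nodes; i++) {
		int nid = (*cursor + i) % nr_nodes;

		if (nodes[nid].free_pages > 0) {
			*cursor = (nid + 1) % nr_nodes;
			return nid;
		}
	}
	return -1;
}

/*
 * Nearest first: among nodes with free pages, return the one with the
 * smallest distance from the current node (the current node itself
 * wins when it has memory). Returns -1 if every node is empty.
 */
static int pick_node_nearest(const struct node_model *nodes,
			     int nr_nodes, int current)
{
	int best = -1;

	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);
	assert(current >= 0 && current < nr_nodes);

	for (int nid = 0; nid < nr_nodes; nid++) {
		if (nodes[nid].free_pages == 0)
			continue;
		if (best < 0 ||
		    node_distance[current][nid] < node_distance[current][best])
			best = nid;
	}
	return best;
}
```

In this model a caller on node 3 that uses pick_node_round_robin may be
handed memory on node 1 even though node 3 has free pages, while
pick_node_nearest only leaves node 3 once it is exhausted. A wrapper in
the spirit of alloc_pages_node would hide this choice from drivers, so
the same call works whether one node or several exist.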