author | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
commit | 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (patch) | |
tree | 0bba044c4ce775e45a88a51686b5d9f90697ea9d /Documentation/prio_tree.txt |
Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
Diffstat (limited to 'Documentation/prio_tree.txt')
-rw-r--r-- | Documentation/prio_tree.txt | 107 |
1 files changed, 107 insertions, 0 deletions
diff --git a/Documentation/prio_tree.txt b/Documentation/prio_tree.txt
new file mode 100644
index 000000000000..2fbb0c49bc5b
--- /dev/null
+++ b/Documentation/prio_tree.txt
@@ -0,0 +1,107 @@
1 | The prio_tree.c code indexes vmas using 3 different indexes: | ||
2 | * heap_index = vm_pgoff + vm_size_in_pages : end_vm_pgoff | ||
3 | * radix_index = vm_pgoff : start_vm_pgoff | ||
4 | * size_index = vm_size_in_pages | ||
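The three indices above can be sketched in C as follows. This is only an
illustration in the document's own notation: struct vma_range and the three
helper functions are hypothetical names, not the definitions used in
prio_tree.c.

```c
/*
 * Hypothetical stand-in for the vma fields that matter for indexing.
 * The field names follow the notation of this document, not the kernel.
 */
struct vma_range {
	unsigned long vm_pgoff;          /* start_vm_pgoff */
	unsigned long vm_size_in_pages;  /* size value as used in this document */
};

/* radix_index = vm_pgoff */
static inline unsigned long radix_index(const struct vma_range *v)
{
	return v->vm_pgoff;
}

/* size_index = vm_size_in_pages */
static inline unsigned long size_index(const struct vma_range *v)
{
	return v->vm_size_in_pages;
}

/* heap_index = vm_pgoff + vm_size_in_pages, i.e. end_vm_pgoff */
static inline unsigned long heap_index(const struct vma_range *v)
{
	return v->vm_pgoff + v->vm_size_in_pages;
}
```

With vm_pgoff = 2 and vm_size_in_pages = 5 the helpers give the triple
[2,5,7] in the [radix_index, size_index, heap_index] notation.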
5 | |||
6 | A regular radix-priority-search-tree indexes vmas using only heap_index and | ||
7 | radix_index. The conditions for indexing are: | ||
8 | * ->heap_index >= ->left->heap_index && | ||
9 | ->heap_index >= ->right->heap_index | ||
10 | * if (->heap_index == ->left->heap_index) | ||
11 | then ->radix_index < ->left->radix_index; | ||
12 | * if (->heap_index == ->right->heap_index) | ||
13 | then ->radix_index < ->right->radix_index; | ||
14 | * nodes are hashed to left or right subtree using radix_index | ||
15 | similar to a pure binary radix tree. | ||
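The heap conditions listed above can be written as a small checker. This is
a sketch only: struct pst_node, child_ok() and node_ok() are hypothetical
names made up for this illustration, and the checker covers the heap
conditions, not the radix placement.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node layout, only for illustrating the conditions. */
struct pst_node {
	unsigned long radix_index;
	unsigned long heap_index;
	struct pst_node *left, *right;
};

/*
 * A parent may sit above a child if its heap_index is larger, or if the
 * heap indices are equal and the parent has the smaller radix_index.
 */
static inline bool child_ok(const struct pst_node *parent,
			    const struct pst_node *child)
{
	if (child == NULL)
		return true;
	if (parent->heap_index > child->heap_index)
		return true;
	if (parent->heap_index == child->heap_index)
		return parent->radix_index < child->radix_index;
	return false;
}

/* Check the conditions for one node against both of its children. */
static inline bool node_ok(const struct pst_node *n)
{
	return child_ok(n, n->left) && child_ok(n, n->right);
}
```

For example, a node with radix_index 0 and heap_index 7 may have children
with (radix_index, heap_index) of (1, 7) and (4, 7), because the heap
indices are equal and the parent has the smaller radix_index.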
16 | |||
17 | A regular radix-priority-search-tree helps to store and query | ||
18 | intervals (vmas). However, a regular radix-priority-search-tree is only | ||
19 | suitable for storing vmas with different radix indices (vm_pgoff). | ||
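This document does not spell out the query itself. As a rough sketch, the
generic interval-overlap test below shows why the heap condition helps a
query. It assumes each vma covers the closed range [radix_index, heap_index];
both function names are hypothetical.

```c
#include <stdbool.h>

/*
 * Assumption: a vma covers the closed page range
 * [radix_index, heap_index] and the query is the closed range [lo, hi].
 */
static inline bool vma_overlaps(unsigned long radix_index,
				unsigned long heap_index,
				unsigned long lo, unsigned long hi)
{
	return radix_index <= hi && heap_index >= lo;
}

/*
 * No node below has a larger heap_index than its parent, so a subtree
 * whose top node ends before lo cannot contain an overlapping vma.
 */
static inline bool subtree_may_overlap(unsigned long node_heap_index,
					unsigned long lo)
{
	return node_heap_index >= lo;
}
```

A search can therefore skip a whole subtree as soon as
subtree_may_overlap() is false for its top node.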
20 | |||
21 | Therefore, prio_tree.c extends the regular radix-priority-search-tree | ||
22 | to handle many vmas with the same vm_pgoff. Such vmas are handled in | ||
23 | 2 different ways: 1) All vmas with the same radix _and_ heap indices are | ||
24 | linked using vm_set.list, 2) if there are many vmas with the same radix | ||
25 | index, but different heap indices and if the regular radix-priority-search | ||
26 | tree cannot index them all, we build an overflow-sub-tree that indexes such | ||
27 | vmas using heap and size indices instead of heap and radix indices. For | ||
28 | example, in the figure below some vmas with vm_pgoff = 0 (zero) are | ||
29 | indexed by the regular radix-priority-search-tree whereas others are pushed | ||
30 | into an overflow-sub-tree. Note that all vmas in an overflow-sub-tree have | ||
31 | the same vm_pgoff (radix_index) and if necessary we build different | ||
32 | overflow-sub-trees to handle each possible radix_index. For example, | ||
33 | in the figure we have 3 overflow-sub-trees corresponding to radix indices | ||
34 | 0, 2, and 4. | ||
35 | |||
36 | In the final tree the first few (prio_tree_root->index_bits) levels | ||
37 | are indexed using heap and radix indices whereas the overflow-sub-trees below | ||
38 | those levels (i.e. levels prio_tree_root->index_bits + 1 and higher) are | ||
39 | indexed using heap and size indices. In overflow-sub-trees the size_index | ||
40 | is used for hashing the nodes to appropriate places. | ||
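The choice of index per level can be sketched as below. The exact bit
tested at each level and the convention that a 0 bit goes left and a 1 bit
goes right are assumptions of this sketch; go_right() and
SKETCH_BITS_PER_LONG are hypothetical names, not taken from prio_tree.c.

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG ((unsigned)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Decide whether a node goes to the right child when descending to the
 * given level (root is level 0, so level >= 1).
 *
 * Levels 1..index_bits test the radix_index bits from bit
 * (index_bits - 1) down to bit 0.  Deeper levels belong to an
 * overflow-sub-tree and test the size_index bits from the most
 * significant bit down to bit 0.
 */
static inline bool go_right(unsigned level, unsigned index_bits,
			    unsigned long radix_index,
			    unsigned long size_index)
{
	unsigned depth;

	if (level == 0)
		return false;	/* the root is not a child of anything */

	if (level <= index_bits)
		return (radix_index >> (index_bits - level)) & 1UL;

	depth = level - index_bits;	/* 1..SKETCH_BITS_PER_LONG */
	if (depth > SKETCH_BITS_PER_LONG)
		return false;	/* below the maximum tree height */

	return (size_index >> (SKETCH_BITS_PER_LONG - depth)) & 1UL;
}
```

With index_bits = 3, a vma with radix_index 5 (binary 101) goes right at
level 1, left at level 2 and right at level 3. Small size_index values have
their high bits clear, so in the overflow levels such nodes keep going left.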
41 | |||
42 | Now, an example prio_tree: | ||
43 | |||
44 | vmas are represented as [radix_index, size_index, heap_index] | ||
45 | i.e., [start_vm_pgoff, vm_size_in_pages, end_vm_pgoff] | ||
46 | |||
47 | level  prio_tree_root->index_bits = 3
48 | -----
49 |                                                                                       _
50 |   0                                          [0,7,7]                                  |
51 |                                               /   \                                   |
52 |                                 --------------     --------------                     |  Regular
53 |                                /                                 \                    |  radix priority
54 |   1                        [1,6,7]                             [4,3,7]                |  search tree
55 |                             /   \                               /   \                 |
56 |                        -----     -----                     -----     -----            |  heap-and-radix
57 |                       /               \                   /               \           |  indexed
58 |   2               [0,6,6]           [2,5,7]           [5,2,7]           [6,1,7]       |
59 |                    /   \             /   \             /   \             /   \        |
60 |   3           [0,5,5] [1,5,6]   [2,4,6] [3,4,7]   [4,2,6] [5,1,6]   [6,0,6] [7,0,7]   |
61 |                 /                 /                 /                                 _
62 |                /                 /                 /                                  _
63 |   4         [0,4,4]           [2,3,5]           [4,1,5]                               |
64 |              /                 /                 /                                    |
65 |   5       [0,3,3]           [2,2,4]           [4,0,4]                                 |  Overflow-sub-trees
66 |            /                 /                                                        |
67 |   6     [0,2,2]           [2,1,3]                                                     |  heap-and-size
68 |          /                 /                                                          |  indexed
69 |   7   [0,1,1]           [2,0,2]                                                       |
70 |        /                                                                              |
71 |   8 [0,0,0]                                                                           |
72 |                                                                                       _
73 | |||
74 | Note that we use prio_tree_root->index_bits to optimize the height | ||
75 | of the heap-and-radix indexed tree. Since prio_tree_root->index_bits is | ||
76 | set according to the maximum end_vm_pgoff mapped, we are sure that all | ||
77 | bits (in vm_pgoff) above prio_tree_root->index_bits are 0 (zero). Therefore, | ||
78 | we only use the lowest prio_tree_root->index_bits bits as radix_index. | ||
79 | Whenever index_bits is increased in prio_tree_expand, we shuffle the tree | ||
80 | to make sure that the first prio_tree_root->index_bits levels of the tree | ||
81 | are indexed properly using heap and radix indices. | ||
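The relation between the maximum end_vm_pgoff and index_bits can be
sketched as the smallest number of bits that still represents the maximum
heap_index. The function name index_bits_for() and the minimum of one bit
are assumptions of this sketch, not taken from prio_tree.c.

```c
#include <limits.h>

/* Stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG ((unsigned)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Smallest bit count (at least 1, by assumption) such that every bit of
 * max_heap_index at or above that position is 0 (zero).
 */
static inline unsigned index_bits_for(unsigned long max_heap_index)
{
	unsigned bits = 1;

	/* The bound on bits keeps the shift count below the word size. */
	while (bits < SKETCH_BITS_PER_LONG && (max_heap_index >> bits) != 0)
		bits++;

	return bits;
}
```

A maximum heap_index of 7 needs 3 bits. Once a vma ending at page 8 is
mapped, 4 bits are needed and the heap-and-radix levels have to be
reshuffled.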
82 | |||
83 | We do not optimize the height of overflow-sub-trees using index_bits. | ||
84 | The reason is: there can be many such overflow-sub-trees and all of | ||
85 | them have to be shuffled whenever the index_bits increases. This may involve | ||
86 | walking the whole prio_tree in prio_tree_insert->prio_tree_expand code | ||
87 | path which is not desirable. Hence, we do not optimize the height of the | ||
88 | heap-and-size indexed overflow-sub-trees using prio_tree->index_bits. | ||
89 | Instead, the overflow-sub-trees are indexed using the full BITS_PER_LONG bits | ||
90 | of size_index. This may lead to skewed sub-trees because most of the | ||
91 | higher-order bits of the size_index are likely to be 0 (zero). In | ||
92 | the example above, all 3 overflow-sub-trees are skewed. This may marginally | ||
93 | affect the performance. However, processes rarely map many vmas with the | ||
94 | same start_vm_pgoff but different end_vm_pgoffs. Therefore, we normally | ||
95 | do not require overflow-sub-trees to index all vmas. | ||
96 | |||
97 | From the above discussion it is clear that the maximum height of | ||
98 | a prio_tree can be prio_tree_root->index_bits + BITS_PER_LONG. | ||
99 | However, in most of the common cases we do not need overflow-sub-trees, | ||
100 | so the tree height in the common cases will be prio_tree_root->index_bits. | ||
101 | |||
102 | It is fair to mention here that prio_tree_root->index_bits is | ||
103 | increased on demand; however, index_bits is not decreased when | ||
104 | vmas are removed from the prio_tree. That's tricky to do. Hence, it's | ||
105 | left as a homework problem. | ||
107 | |||