author | Mike Rapoport <rppt@linux.vnet.ibm.com> | 2018-03-21 15:22:42 -0400 |
---|---|---|
committer | Jonathan Corbet <corbet@lwn.net> | 2018-04-16 16:18:14 -0400 |
commit | a5e4da91e024677cc72d4fd8ea2bbc82217d2443 (patch) | |
tree | 57999813560c41b2286a936d2e2c12e901a32e5d /Documentation/vm/unevictable-lru.txt | |
parent | 44f380fe901c8390df4f7576a3176efe65e2653c (diff) |
docs/vm: unevictable-lru.txt: convert to ReST format
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Diffstat (limited to 'Documentation/vm/unevictable-lru.txt')
-rw-r--r-- | Documentation/vm/unevictable-lru.txt | 117 |
1 files changed, 49 insertions, 68 deletions
diff --git a/Documentation/vm/unevictable-lru.txt b/Documentation/vm/unevictable-lru.txt index e14718572476..fdd84cb8d511 100644 --- a/Documentation/vm/unevictable-lru.txt +++ b/Documentation/vm/unevictable-lru.txt | |||
@@ -1,37 +1,13 @@ | |||
1 | ============================== | 1 | .. _unevictable_lru: |
2 | UNEVICTABLE LRU INFRASTRUCTURE | ||
3 | ============================== | ||
4 | |||
5 | ======== | ||
6 | CONTENTS | ||
7 | ======== | ||
8 | |||
9 | (*) The Unevictable LRU | ||
10 | |||
11 | - The unevictable page list. | ||
12 | - Memory control group interaction. | ||
13 | - Marking address spaces unevictable. | ||
14 | - Detecting Unevictable Pages. | ||
15 | - vmscan's handling of unevictable pages. | ||
16 | |||
17 | (*) mlock()'d pages. | ||
18 | |||
19 | - History. | ||
20 | - Basic management. | ||
21 | - mlock()/mlockall() system call handling. | ||
22 | - Filtering special vmas. | ||
23 | - munlock()/munlockall() system call handling. | ||
24 | - Migrating mlocked pages. | ||
25 | - Compacting mlocked pages. | ||
26 | - mmap(MAP_LOCKED) system call handling. | ||
27 | - munmap()/exit()/exec() system call handling. | ||
28 | - try_to_unmap(). | ||
29 | - try_to_munlock() reverse map scan. | ||
30 | - Page reclaim in shrink_*_list(). | ||
31 | 2 | ||
3 | ============================== | ||
4 | Unevictable LRU Infrastructure | ||
5 | ============================== | ||
32 | 6 | ||
33 | ============ | 7 | .. contents:: :local: |
34 | INTRODUCTION | 8 | |
9 | |||
10 | Introduction | ||
35 | ============ | 11 | ============ |
36 | 12 | ||
37 | This document describes the Linux memory manager's "Unevictable LRU" | 13 | This document describes the Linux memory manager's "Unevictable LRU" |
@@ -46,8 +22,8 @@ details - the "what does it do?" - by reading the code. One hopes that the | |||
46 | descriptions below add value by provide the answer to "why does it do that?". | 22 | descriptions below add value by provide the answer to "why does it do that?". |
47 | 23 | ||
48 | 24 | ||
49 | =================== | 25 | |
50 | THE UNEVICTABLE LRU | 26 | The Unevictable LRU |
51 | =================== | 27 | =================== |
52 | 28 | ||
53 | The Unevictable LRU facility adds an additional LRU list to track unevictable | 29 | The Unevictable LRU facility adds an additional LRU list to track unevictable |
@@ -66,17 +42,17 @@ completely unresponsive. | |||
66 | 42 | ||
67 | The unevictable list addresses the following classes of unevictable pages: | 43 | The unevictable list addresses the following classes of unevictable pages: |
68 | 44 | ||
69 | (*) Those owned by ramfs. | 45 | * Those owned by ramfs. |
70 | 46 | ||
71 | (*) Those mapped into SHM_LOCK'd shared memory regions. | 47 | * Those mapped into SHM_LOCK'd shared memory regions. |
72 | 48 | ||
73 | (*) Those mapped into VM_LOCKED [mlock()ed] VMAs. | 49 | * Those mapped into VM_LOCKED [mlock()ed] VMAs. |
74 | 50 | ||
75 | The infrastructure may also be able to handle other conditions that make pages | 51 | The infrastructure may also be able to handle other conditions that make pages |
76 | unevictable, either by definition or by circumstance, in the future. | 52 | unevictable, either by definition or by circumstance, in the future. |
77 | 53 | ||
78 | 54 | ||
79 | THE UNEVICTABLE PAGE LIST | 55 | The Unevictable Page List |
80 | ------------------------- | 56 | ------------------------- |
81 | 57 | ||
82 | The Unevictable LRU infrastructure consists of an additional, per-zone, LRU list | 58 | The Unevictable LRU infrastructure consists of an additional, per-zone, LRU list |
@@ -118,7 +94,7 @@ the unevictable list when one task has the page isolated from the LRU and other | |||
118 | tasks are changing the "evictability" state of the page. | 94 | tasks are changing the "evictability" state of the page. |
119 | 95 | ||
120 | 96 | ||
121 | MEMORY CONTROL GROUP INTERACTION | 97 | Memory Control Group Interaction |
122 | -------------------------------- | 98 | -------------------------------- |
123 | 99 | ||
124 | The unevictable LRU facility interacts with the memory control group [aka | 100 | The unevictable LRU facility interacts with the memory control group [aka |
@@ -144,7 +120,9 @@ effects: | |||
144 | the control group to thrash or to OOM-kill tasks. | 120 | the control group to thrash or to OOM-kill tasks. |
145 | 121 | ||
146 | 122 | ||
147 | MARKING ADDRESS SPACES UNEVICTABLE | 123 | .. _mark_addr_space_unevict: |
124 | |||
125 | Marking Address Spaces Unevictable | ||
148 | ---------------------------------- | 126 | ---------------------------------- |
149 | 127 | ||
150 | For facilities such as ramfs none of the pages attached to the address space | 128 | For facilities such as ramfs none of the pages attached to the address space |
@@ -152,15 +130,15 @@ may be evicted. To prevent eviction of any such pages, the AS_UNEVICTABLE | |||
152 | address space flag is provided, and this can be manipulated by a filesystem | 130 | address space flag is provided, and this can be manipulated by a filesystem |
153 | using a number of wrapper functions: | 131 | using a number of wrapper functions: |
154 | 132 | ||
155 | (*) void mapping_set_unevictable(struct address_space *mapping); | 133 | * ``void mapping_set_unevictable(struct address_space *mapping);`` |
156 | 134 | ||
157 | Mark the address space as being completely unevictable. | 135 | Mark the address space as being completely unevictable. |
158 | 136 | ||
159 | (*) void mapping_clear_unevictable(struct address_space *mapping); | 137 | * ``void mapping_clear_unevictable(struct address_space *mapping);`` |
160 | 138 | ||
161 | Mark the address space as being evictable. | 139 | Mark the address space as being evictable. |
162 | 140 | ||
163 | (*) int mapping_unevictable(struct address_space *mapping); | 141 | * ``int mapping_unevictable(struct address_space *mapping);`` |
164 | 142 | ||
165 | Query the address space, and return true if it is completely | 143 | Query the address space, and return true if it is completely |
166 | unevictable. | 144 | unevictable. |
@@ -177,12 +155,13 @@ These are currently used in two places in the kernel: | |||
177 | ensure they're in memory. | 155 | ensure they're in memory. |
178 | 156 | ||
179 | 157 | ||
180 | DETECTING UNEVICTABLE PAGES | 158 | Detecting Unevictable Pages |
181 | --------------------------- | 159 | --------------------------- |
182 | 160 | ||
183 | The function page_evictable() in vmscan.c determines whether a page is | 161 | The function page_evictable() in vmscan.c determines whether a page is |
184 | evictable or not using the query function outlined above [see section "Marking | 162 | evictable or not using the query function outlined above [see section |
185 | address spaces unevictable"] to check the AS_UNEVICTABLE flag. | 163 | :ref:`Marking address spaces unevictable <mark_addr_space_unevict>`] |
164 | to check the AS_UNEVICTABLE flag. | ||
186 | 165 | ||
187 | For address spaces that are so marked after being populated (as SHM regions | 166 | For address spaces that are so marked after being populated (as SHM regions |
188 | might be), the lock action (eg: SHM_LOCK) can be lazy, and need not populate | 167 | might be), the lock action (eg: SHM_LOCK) can be lazy, and need not populate |
@@ -202,7 +181,7 @@ flag, PG_mlocked (as wrapped by PageMlocked()), which is set when a page is | |||
202 | faulted into a VM_LOCKED vma, or found in a vma being VM_LOCKED. | 181 | faulted into a VM_LOCKED vma, or found in a vma being VM_LOCKED. |
203 | 182 | ||
204 | 183 | ||
205 | VMSCAN'S HANDLING OF UNEVICTABLE PAGES | 184 | Vmscan's Handling of Unevictable Pages |
206 | -------------------------------------- | 185 | -------------------------------------- |
207 | 186 | ||
208 | If unevictable pages are culled in the fault path, or moved to the unevictable | 187 | If unevictable pages are culled in the fault path, or moved to the unevictable |
@@ -233,8 +212,7 @@ extra evictabilty checks should not occur in the majority of calls to | |||
233 | putback_lru_page(). | 212 | putback_lru_page(). |
234 | 213 | ||
235 | 214 | ||
236 | ============= | 215 | MLOCKED Pages |
237 | MLOCKED PAGES | ||
238 | ============= | 216 | ============= |
239 | 217 | ||
240 | The unevictable page list is also useful for mlock(), in addition to ramfs and | 218 | The unevictable page list is also useful for mlock(), in addition to ramfs and |
@@ -242,7 +220,7 @@ SYSV SHM. Note that mlock() is only available in CONFIG_MMU=y situations; in | |||
242 | NOMMU situations, all mappings are effectively mlocked. | 220 | NOMMU situations, all mappings are effectively mlocked. |
243 | 221 | ||
244 | 222 | ||
245 | HISTORY | 223 | History |
246 | ------- | 224 | ------- |
247 | 225 | ||
248 | The "Unevictable mlocked Pages" infrastructure is based on work originally | 226 | The "Unevictable mlocked Pages" infrastructure is based on work originally |
@@ -263,7 +241,7 @@ replaced by walking the reverse map to determine whether any VM_LOCKED VMAs | |||
263 | mapped the page. More on this below. | 241 | mapped the page. More on this below. |
264 | 242 | ||
265 | 243 | ||
266 | BASIC MANAGEMENT | 244 | Basic Management |
267 | ---------------- | 245 | ---------------- |
268 | 246 | ||
269 | mlocked pages - pages mapped into a VM_LOCKED VMA - are a class of unevictable | 247 | mlocked pages - pages mapped into a VM_LOCKED VMA - are a class of unevictable |
@@ -304,10 +282,10 @@ mlocked pages become unlocked and rescued from the unevictable list when: | |||
304 | (4) before a page is COW'd in a VM_LOCKED VMA. | 282 | (4) before a page is COW'd in a VM_LOCKED VMA. |
305 | 283 | ||
306 | 284 | ||
307 | mlock()/mlockall() SYSTEM CALL HANDLING | 285 | mlock()/mlockall() System Call Handling |
308 | --------------------------------------- | 286 | --------------------------------------- |
309 | 287 | ||
310 | Both [do_]mlock() and [do_]mlockall() system call handlers call mlock_fixup() | 288 | Both [do\_]mlock() and [do\_]mlockall() system call handlers call mlock_fixup() |
311 | for each VMA in the range specified by the call. In the case of mlockall(), | 289 | for each VMA in the range specified by the call. In the case of mlockall(), |
312 | this is the entire active address space of the task. Note that mlock_fixup() | 290 | this is the entire active address space of the task. Note that mlock_fixup() |
313 | is used for both mlocking and munlocking a range of memory. A call to mlock() | 291 | is used for both mlocking and munlocking a range of memory. A call to mlock() |
@@ -351,7 +329,7 @@ mlock_vma_page() is unable to isolate the page from the LRU, vmscan will handle | |||
351 | it later if and when it attempts to reclaim the page. | 329 | it later if and when it attempts to reclaim the page. |
352 | 330 | ||
353 | 331 | ||
354 | FILTERING SPECIAL VMAS | 332 | Filtering Special VMAs |
355 | ---------------------- | 333 | ---------------------- |
356 | 334 | ||
357 | mlock_fixup() filters several classes of "special" VMAs: | 335 | mlock_fixup() filters several classes of "special" VMAs: |
@@ -379,8 +357,9 @@ VM_LOCKED flag. Therefore, we won't have to deal with them later during | |||
379 | munlock(), munmap() or task exit. Neither does mlock_fixup() account these | 357 | munlock(), munmap() or task exit. Neither does mlock_fixup() account these |
380 | VMAs against the task's "locked_vm". | 358 | VMAs against the task's "locked_vm". |
381 | 359 | ||
360 | .. _munlock_munlockall_handling: | ||
382 | 361 | ||
383 | munlock()/munlockall() SYSTEM CALL HANDLING | 362 | munlock()/munlockall() System Call Handling |
384 | ------------------------------------------- | 363 | ------------------------------------------- |
385 | 364 | ||
386 | The munlock() and munlockall() system calls are handled by the same functions - | 365 | The munlock() and munlockall() system calls are handled by the same functions - |
@@ -426,7 +405,7 @@ This is fine, because we'll catch it later if and if vmscan tries to reclaim | |||
426 | the page. This should be relatively rare. | 405 | the page. This should be relatively rare. |
427 | 406 | ||
428 | 407 | ||
429 | MIGRATING MLOCKED PAGES | 408 | Migrating MLOCKED Pages |
430 | ----------------------- | 409 | ----------------------- |
431 | 410 | ||
432 | A page that is being migrated has been isolated from the LRU lists and is held | 411 | A page that is being migrated has been isolated from the LRU lists and is held |
@@ -451,7 +430,7 @@ list because of a race between munlock and migration, page migration uses the | |||
451 | putback_lru_page() function to add migrated pages back to the LRU. | 430 | putback_lru_page() function to add migrated pages back to the LRU. |
452 | 431 | ||
453 | 432 | ||
454 | COMPACTING MLOCKED PAGES | 433 | Compacting MLOCKED Pages |
455 | ------------------------ | 434 | ------------------------ |
456 | 435 | ||
457 | The unevictable LRU can be scanned for compactable regions and the default | 436 | The unevictable LRU can be scanned for compactable regions and the default |
@@ -461,7 +440,7 @@ unevictable LRU is enabled, the work of compaction is mostly handled by | |||
461 | the page migration code and the same work flow as described in MIGRATING | 440 | the page migration code and the same work flow as described in MIGRATING |
462 | MLOCKED PAGES will apply. | 441 | MLOCKED PAGES will apply. |
463 | 442 | ||
464 | MLOCKING TRANSPARENT HUGE PAGES | 443 | MLOCKING Transparent Huge Pages |
465 | ------------------------------- | 444 | ------------------------------- |
466 | 445 | ||
467 | A transparent huge page is represented by a single entry on an LRU list. | 446 | A transparent huge page is represented by a single entry on an LRU list. |
@@ -483,7 +462,7 @@ to unevictable LRU and the rest can be reclaimed. | |||
483 | 462 | ||
484 | See also comment in follow_trans_huge_pmd(). | 463 | See also comment in follow_trans_huge_pmd(). |
485 | 464 | ||
486 | mmap(MAP_LOCKED) SYSTEM CALL HANDLING | 465 | mmap(MAP_LOCKED) System Call Handling |
487 | ------------------------------------- | 466 | ------------------------------------- |
488 | 467 | ||
489 | In addition the mlock()/mlockall() system calls, an application can request | 468 | In addition the mlock()/mlockall() system calls, an application can request |
@@ -514,7 +493,7 @@ memory range accounted as locked_vm, as the protections could be changed later | |||
514 | and pages allocated into that region. | 493 | and pages allocated into that region. |
515 | 494 | ||
516 | 495 | ||
517 | munmap()/exit()/exec() SYSTEM CALL HANDLING | 496 | munmap()/exit()/exec() System Call Handling |
518 | ------------------------------------------- | 497 | ------------------------------------------- |
519 | 498 | ||
520 | When unmapping an mlocked region of memory, whether by an explicit call to | 499 | When unmapping an mlocked region of memory, whether by an explicit call to |
@@ -568,16 +547,18 @@ munlock or munmap system calls, mm teardown (munlock_vma_pages_all), reclaim, | |||
568 | holepunching, and truncation of file pages and their anonymous COWed pages. | 547 | holepunching, and truncation of file pages and their anonymous COWed pages. |
569 | 548 | ||
570 | 549 | ||
571 | try_to_munlock() REVERSE MAP SCAN | 550 | try_to_munlock() Reverse Map Scan |
572 | --------------------------------- | 551 | --------------------------------- |
573 | 552 | ||
574 | [!] TODO/FIXME: a better name might be page_mlocked() - analogous to the | 553 | .. warning:: |
575 | page_referenced() reverse map walker. | 554 | [!] TODO/FIXME: a better name might be page_mlocked() - analogous to the |
555 | page_referenced() reverse map walker. | ||
576 | 556 | ||
577 | When munlock_vma_page() [see section "munlock()/munlockall() System Call | 557 | When munlock_vma_page() [see section :ref:`munlock()/munlockall() System Call |
578 | Handling" above] tries to munlock a page, it needs to determine whether or not | 558 | Handling <munlock_munlockall_handling>` above] tries to munlock a |
579 | the page is mapped by any VM_LOCKED VMA without actually attempting to unmap | 559 | page, it needs to determine whether or not the page is mapped by any |
580 | all PTEs from the page. For this purpose, the unevictable/mlock infrastructure | 560 | VM_LOCKED VMA without actually attempting to unmap all PTEs from the |
561 | page. For this purpose, the unevictable/mlock infrastructure | ||
581 | introduced a variant of try_to_unmap() called try_to_munlock(). | 562 | introduced a variant of try_to_unmap() called try_to_munlock(). |
582 | 563 | ||
583 | try_to_munlock() calls the same functions as try_to_unmap() for anonymous and | 564 | try_to_munlock() calls the same functions as try_to_unmap() for anonymous and |
@@ -595,7 +576,7 @@ large region or tearing down a large address space that has been mlocked via | |||
595 | mlockall(), overall this is a fairly rare event. | 576 | mlockall(), overall this is a fairly rare event. |
596 | 577 | ||
597 | 578 | ||
598 | PAGE RECLAIM IN shrink_*_list() | 579 | Page Reclaim in shrink_*_list() |
599 | ------------------------------- | 580 | ------------------------------- |
600 | 581 | ||
601 | shrink_active_list() culls any obviously unevictable pages - i.e. | 582 | shrink_active_list() culls any obviously unevictable pages - i.e. |