author Mike Rapoport <rppt@linux.vnet.ibm.com> 2018-03-21 15:22:42 -0400
committer Jonathan Corbet <corbet@lwn.net> 2018-04-16 16:18:14 -0400
commit a5e4da91e024677cc72d4fd8ea2bbc82217d2443
tree 57999813560c41b2286a936d2e2c12e901a32e5d /Documentation/vm/unevictable-lru.txt
parent 44f380fe901c8390df4f7576a3176efe65e2653c
docs/vm: unevictable-lru.txt: convert to ReST format
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Diffstat (limited to 'Documentation/vm/unevictable-lru.txt')
-rw-r--r-- Documentation/vm/unevictable-lru.txt 117
1 files changed, 49 insertions, 68 deletions
diff --git a/Documentation/vm/unevictable-lru.txt b/Documentation/vm/unevictable-lru.txt
index e14718572476..fdd84cb8d511 100644
--- a/Documentation/vm/unevictable-lru.txt
+++ b/Documentation/vm/unevictable-lru.txt
@@ -1,37 +1,13 @@
1 ============================== 1.. _unevictable_lru:
2 UNEVICTABLE LRU INFRASTRUCTURE
3 ==============================
4
5========
6CONTENTS
7========
8
9 (*) The Unevictable LRU
10
11 - The unevictable page list.
12 - Memory control group interaction.
13 - Marking address spaces unevictable.
14 - Detecting Unevictable Pages.
15 - vmscan's handling of unevictable pages.
16
17 (*) mlock()'d pages.
18
19 - History.
20 - Basic management.
21 - mlock()/mlockall() system call handling.
22 - Filtering special vmas.
23 - munlock()/munlockall() system call handling.
24 - Migrating mlocked pages.
25 - Compacting mlocked pages.
26 - mmap(MAP_LOCKED) system call handling.
27 - munmap()/exit()/exec() system call handling.
28 - try_to_unmap().
29 - try_to_munlock() reverse map scan.
30 - Page reclaim in shrink_*_list().
31 2
3==============================
4Unevictable LRU Infrastructure
5==============================
32 6
33============ 7.. contents:: :local:
34INTRODUCTION 8
9
10Introduction
35============ 11============
36 12
37This document describes the Linux memory manager's "Unevictable LRU" 13This document describes the Linux memory manager's "Unevictable LRU"
@@ -46,8 +22,8 @@ details - the "what does it do?" - by reading the code. One hopes that the
46descriptions below add value by provide the answer to "why does it do that?". 22descriptions below add value by provide the answer to "why does it do that?".
47 23
48 24
49=================== 25
50THE UNEVICTABLE LRU 26The Unevictable LRU
51=================== 27===================
52 28
53The Unevictable LRU facility adds an additional LRU list to track unevictable 29The Unevictable LRU facility adds an additional LRU list to track unevictable
@@ -66,17 +42,17 @@ completely unresponsive.
66 42
67The unevictable list addresses the following classes of unevictable pages: 43The unevictable list addresses the following classes of unevictable pages:
68 44
69 (*) Those owned by ramfs. 45 * Those owned by ramfs.
70 46
71 (*) Those mapped into SHM_LOCK'd shared memory regions. 47 * Those mapped into SHM_LOCK'd shared memory regions.
72 48
73 (*) Those mapped into VM_LOCKED [mlock()ed] VMAs. 49 * Those mapped into VM_LOCKED [mlock()ed] VMAs.
74 50
75The infrastructure may also be able to handle other conditions that make pages 51The infrastructure may also be able to handle other conditions that make pages
76unevictable, either by definition or by circumstance, in the future. 52unevictable, either by definition or by circumstance, in the future.
77 53
78 54
79THE UNEVICTABLE PAGE LIST 55The Unevictable Page List
80------------------------- 56-------------------------
81 57
82The Unevictable LRU infrastructure consists of an additional, per-zone, LRU list 58The Unevictable LRU infrastructure consists of an additional, per-zone, LRU list
@@ -118,7 +94,7 @@ the unevictable list when one task has the page isolated from the LRU and other
118tasks are changing the "evictability" state of the page. 94tasks are changing the "evictability" state of the page.
119 95
120 96
121MEMORY CONTROL GROUP INTERACTION 97Memory Control Group Interaction
122-------------------------------- 98--------------------------------
123 99
124The unevictable LRU facility interacts with the memory control group [aka 100The unevictable LRU facility interacts with the memory control group [aka
@@ -144,7 +120,9 @@ effects:
144 the control group to thrash or to OOM-kill tasks. 120 the control group to thrash or to OOM-kill tasks.
145 121
146 122
147MARKING ADDRESS SPACES UNEVICTABLE 123.. _mark_addr_space_unevict:
124
125Marking Address Spaces Unevictable
148---------------------------------- 126----------------------------------
149 127
150For facilities such as ramfs none of the pages attached to the address space 128For facilities such as ramfs none of the pages attached to the address space
@@ -152,15 +130,15 @@ may be evicted. To prevent eviction of any such pages, the AS_UNEVICTABLE
152address space flag is provided, and this can be manipulated by a filesystem 130address space flag is provided, and this can be manipulated by a filesystem
153using a number of wrapper functions: 131using a number of wrapper functions:
154 132
155 (*) void mapping_set_unevictable(struct address_space *mapping); 133 * ``void mapping_set_unevictable(struct address_space *mapping);``
156 134
157 Mark the address space as being completely unevictable. 135 Mark the address space as being completely unevictable.
158 136
159 (*) void mapping_clear_unevictable(struct address_space *mapping); 137 * ``void mapping_clear_unevictable(struct address_space *mapping);``
160 138
161 Mark the address space as being evictable. 139 Mark the address space as being evictable.
162 140
163 (*) int mapping_unevictable(struct address_space *mapping); 141 * ``int mapping_unevictable(struct address_space *mapping);``
164 142
165 Query the address space, and return true if it is completely 143 Query the address space, and return true if it is completely
166 unevictable. 144 unevictable.
@@ -177,12 +155,13 @@ These are currently used in two places in the kernel:
177 ensure they're in memory. 155 ensure they're in memory.
178 156
179 157
180DETECTING UNEVICTABLE PAGES 158Detecting Unevictable Pages
181--------------------------- 159---------------------------
182 160
183The function page_evictable() in vmscan.c determines whether a page is 161The function page_evictable() in vmscan.c determines whether a page is
184evictable or not using the query function outlined above [see section "Marking 162evictable or not using the query function outlined above [see section
185address spaces unevictable"] to check the AS_UNEVICTABLE flag. 163:ref:`Marking address spaces unevictable <mark_addr_space_unevict>`]
164to check the AS_UNEVICTABLE flag.
186 165
187For address spaces that are so marked after being populated (as SHM regions 166For address spaces that are so marked after being populated (as SHM regions
188might be), the lock action (eg: SHM_LOCK) can be lazy, and need not populate 167might be), the lock action (eg: SHM_LOCK) can be lazy, and need not populate
@@ -202,7 +181,7 @@ flag, PG_mlocked (as wrapped by PageMlocked()), which is set when a page is
202faulted into a VM_LOCKED vma, or found in a vma being VM_LOCKED. 181faulted into a VM_LOCKED vma, or found in a vma being VM_LOCKED.
203 182
204 183
205VMSCAN'S HANDLING OF UNEVICTABLE PAGES 184Vmscan's Handling of Unevictable Pages
206-------------------------------------- 185--------------------------------------
207 186
208If unevictable pages are culled in the fault path, or moved to the unevictable 187If unevictable pages are culled in the fault path, or moved to the unevictable
@@ -233,8 +212,7 @@ extra evictabilty checks should not occur in the majority of calls to
233putback_lru_page(). 212putback_lru_page().
234 213
235 214
236============= 215MLOCKED Pages
237MLOCKED PAGES
238============= 216=============
239 217
240The unevictable page list is also useful for mlock(), in addition to ramfs and 218The unevictable page list is also useful for mlock(), in addition to ramfs and
@@ -242,7 +220,7 @@ SYSV SHM. Note that mlock() is only available in CONFIG_MMU=y situations; in
242NOMMU situations, all mappings are effectively mlocked. 220NOMMU situations, all mappings are effectively mlocked.
243 221
244 222
245HISTORY 223History
246------- 224-------
247 225
248The "Unevictable mlocked Pages" infrastructure is based on work originally 226The "Unevictable mlocked Pages" infrastructure is based on work originally
@@ -263,7 +241,7 @@ replaced by walking the reverse map to determine whether any VM_LOCKED VMAs
263mapped the page. More on this below. 241mapped the page. More on this below.
264 242
265 243
266BASIC MANAGEMENT 244Basic Management
267---------------- 245----------------
268 246
269mlocked pages - pages mapped into a VM_LOCKED VMA - are a class of unevictable 247mlocked pages - pages mapped into a VM_LOCKED VMA - are a class of unevictable
@@ -304,10 +282,10 @@ mlocked pages become unlocked and rescued from the unevictable list when:
304 (4) before a page is COW'd in a VM_LOCKED VMA. 282 (4) before a page is COW'd in a VM_LOCKED VMA.
305 283
306 284
307mlock()/mlockall() SYSTEM CALL HANDLING 285mlock()/mlockall() System Call Handling
308--------------------------------------- 286---------------------------------------
309 287
310Both [do_]mlock() and [do_]mlockall() system call handlers call mlock_fixup() 288Both [do\_]mlock() and [do\_]mlockall() system call handlers call mlock_fixup()
311for each VMA in the range specified by the call. In the case of mlockall(), 289for each VMA in the range specified by the call. In the case of mlockall(),
312this is the entire active address space of the task. Note that mlock_fixup() 290this is the entire active address space of the task. Note that mlock_fixup()
313is used for both mlocking and munlocking a range of memory. A call to mlock() 291is used for both mlocking and munlocking a range of memory. A call to mlock()
@@ -351,7 +329,7 @@ mlock_vma_page() is unable to isolate the page from the LRU, vmscan will handle
351it later if and when it attempts to reclaim the page. 329it later if and when it attempts to reclaim the page.
352 330
353 331
354FILTERING SPECIAL VMAS 332Filtering Special VMAs
355---------------------- 333----------------------
356 334
357mlock_fixup() filters several classes of "special" VMAs: 335mlock_fixup() filters several classes of "special" VMAs:
@@ -379,8 +357,9 @@ VM_LOCKED flag. Therefore, we won't have to deal with them later during
379munlock(), munmap() or task exit. Neither does mlock_fixup() account these 357munlock(), munmap() or task exit. Neither does mlock_fixup() account these
380VMAs against the task's "locked_vm". 358VMAs against the task's "locked_vm".
381 359
360.. _munlock_munlockall_handling:
382 361
383munlock()/munlockall() SYSTEM CALL HANDLING 362munlock()/munlockall() System Call Handling
384------------------------------------------- 363-------------------------------------------
385 364
386The munlock() and munlockall() system calls are handled by the same functions - 365The munlock() and munlockall() system calls are handled by the same functions -
@@ -426,7 +405,7 @@ This is fine, because we'll catch it later if and if vmscan tries to reclaim
426the page. This should be relatively rare. 405the page. This should be relatively rare.
427 406
428 407
429MIGRATING MLOCKED PAGES 408Migrating MLOCKED Pages
430----------------------- 409-----------------------
431 410
432A page that is being migrated has been isolated from the LRU lists and is held 411A page that is being migrated has been isolated from the LRU lists and is held
@@ -451,7 +430,7 @@ list because of a race between munlock and migration, page migration uses the
451putback_lru_page() function to add migrated pages back to the LRU. 430putback_lru_page() function to add migrated pages back to the LRU.
452 431
453 432
454COMPACTING MLOCKED PAGES 433Compacting MLOCKED Pages
455------------------------ 434------------------------
456 435
457The unevictable LRU can be scanned for compactable regions and the default 436The unevictable LRU can be scanned for compactable regions and the default
@@ -461,7 +440,7 @@ unevictable LRU is enabled, the work of compaction is mostly handled by
461the page migration code and the same work flow as described in MIGRATING 440the page migration code and the same work flow as described in MIGRATING
462MLOCKED PAGES will apply. 441MLOCKED PAGES will apply.
463 442
464MLOCKING TRANSPARENT HUGE PAGES 443MLOCKING Transparent Huge Pages
465------------------------------- 444-------------------------------
466 445
467A transparent huge page is represented by a single entry on an LRU list. 446A transparent huge page is represented by a single entry on an LRU list.
@@ -483,7 +462,7 @@ to unevictable LRU and the rest can be reclaimed.
483 462
484See also comment in follow_trans_huge_pmd(). 463See also comment in follow_trans_huge_pmd().
485 464
486mmap(MAP_LOCKED) SYSTEM CALL HANDLING 465mmap(MAP_LOCKED) System Call Handling
487------------------------------------- 466-------------------------------------
488 467
489In addition the mlock()/mlockall() system calls, an application can request 468In addition the mlock()/mlockall() system calls, an application can request
@@ -514,7 +493,7 @@ memory range accounted as locked_vm, as the protections could be changed later
514and pages allocated into that region. 493and pages allocated into that region.
515 494
516 495
517munmap()/exit()/exec() SYSTEM CALL HANDLING 496munmap()/exit()/exec() System Call Handling
518------------------------------------------- 497-------------------------------------------
519 498
520When unmapping an mlocked region of memory, whether by an explicit call to 499When unmapping an mlocked region of memory, whether by an explicit call to
@@ -568,16 +547,18 @@ munlock or munmap system calls, mm teardown (munlock_vma_pages_all), reclaim,
568holepunching, and truncation of file pages and their anonymous COWed pages. 547holepunching, and truncation of file pages and their anonymous COWed pages.
569 548
570 549
571try_to_munlock() REVERSE MAP SCAN 550try_to_munlock() Reverse Map Scan
572--------------------------------- 551---------------------------------
573 552
574 [!] TODO/FIXME: a better name might be page_mlocked() - analogous to the 553.. warning::
575 page_referenced() reverse map walker. 554 [!] TODO/FIXME: a better name might be page_mlocked() - analogous to the
555 page_referenced() reverse map walker.
576 556
577When munlock_vma_page() [see section "munlock()/munlockall() System Call 557When munlock_vma_page() [see section :ref:`munlock()/munlockall() System Call
578Handling" above] tries to munlock a page, it needs to determine whether or not 558Handling <munlock_munlockall_handling>` above] tries to munlock a
579the page is mapped by any VM_LOCKED VMA without actually attempting to unmap 559page, it needs to determine whether or not the page is mapped by any
580all PTEs from the page. For this purpose, the unevictable/mlock infrastructure 560VM_LOCKED VMA without actually attempting to unmap all PTEs from the
561page. For this purpose, the unevictable/mlock infrastructure
581introduced a variant of try_to_unmap() called try_to_munlock(). 562introduced a variant of try_to_unmap() called try_to_munlock().
582 563
583try_to_munlock() calls the same functions as try_to_unmap() for anonymous and 564try_to_munlock() calls the same functions as try_to_unmap() for anonymous and
@@ -595,7 +576,7 @@ large region or tearing down a large address space that has been mlocked via
595mlockall(), overall this is a fairly rare event. 576mlockall(), overall this is a fairly rare event.
596 577
597 578
598PAGE RECLAIM IN shrink_*_list() 579Page Reclaim in shrink_*_list()
599------------------------------- 580-------------------------------
600 581
601shrink_active_list() culls any obviously unevictable pages - i.e. 582shrink_active_list() culls any obviously unevictable pages - i.e.