Diffstat (limited to 'Documentation/vm/frontswap.txt')
-rw-r--r-- Documentation/vm/frontswap.txt 50
1 files changed, 25 insertions, 25 deletions
diff --git a/Documentation/vm/frontswap.txt b/Documentation/vm/frontswap.txt
index a9f731af0fac..37067cf455f4 100644
--- a/Documentation/vm/frontswap.txt
+++ b/Documentation/vm/frontswap.txt
@@ -21,21 +21,21 @@ frontswap_ops funcs appropriately and the functions it provides must
 conform to certain policies as follows:
 
 An "init" prepares the device to receive frontswap pages associated
-with the specified swap device number (aka "type"). A "put_page" will
+with the specified swap device number (aka "type"). A "store" will
 copy the page to transcendent memory and associate it with the type and
-offset associated with the page. A "get_page" will copy the page, if found,
+offset associated with the page. A "load" will copy the page, if found,
 from transcendent memory into kernel memory, but will NOT remove the page
 from from transcendent memory. An "invalidate_page" will remove the page
 from transcendent memory and an "invalidate_area" will remove ALL pages
 associated with the swap type (e.g., like swapoff) and notify the "device"
-to refuse further puts with that swap type.
+to refuse further stores with that swap type.
 
-Once a page is successfully put, a matching get on the page will normally
+Once a page is successfully stored, a matching load on the page will normally
 succeed. So when the kernel finds itself in a situation where it needs
-to swap out a page, it first attempts to use frontswap. If the put returns
+to swap out a page, it first attempts to use frontswap. If the store returns
 success, the data has been successfully saved to transcendent memory and
 a disk write and, if the data is later read back, a disk read are avoided.
-If a put returns failure, transcendent memory has rejected the data, and the
+If a store returns failure, transcendent memory has rejected the data, and the
 page can be written to swap as usual.
 
 If a backend chooses, frontswap can be configured as a "writethrough
@@ -44,18 +44,18 @@ in swap device writes is lost (and also a non-trivial performance advantage)
 in order to allow the backend to arbitrarily "reclaim" space used to
 store frontswap pages to more completely manage its memory usage.
 
-Note that if a page is put and the page already exists in transcendent memory
-(a "duplicate" put), either the put succeeds and the data is overwritten,
-or the put fails AND the page is invalidated. This ensures stale data may
+Note that if a page is stored and the page already exists in transcendent memory
+(a "duplicate" store), either the store succeeds and the data is overwritten,
+or the store fails AND the page is invalidated. This ensures stale data may
 never be obtained from frontswap.
 
 If properly configured, monitoring of frontswap is done via debugfs in
 the /sys/kernel/debug/frontswap directory. The effectiveness of
 frontswap can be measured (across all swap devices) with:
 
-failed_puts - how many put attempts have failed
-gets - how many gets were attempted (all should succeed)
-succ_puts - how many put attempts have succeeded
+failed_stores - how many store attempts have failed
+loads - how many loads were attempted (all should succeed)
+succ_stores - how many store attempts have succeeded
 invalidates - how many invalidates were attempted
 
 A backend implementation may provide additional metrics.
@@ -125,7 +125,7 @@ nothingness and the only overhead is a few extra bytes per swapon'ed
 swap device. If CONFIG_FRONTSWAP is enabled but no frontswap "backend"
 registers, there is one extra global variable compared to zero for
 every swap page read or written. If CONFIG_FRONTSWAP is enabled
-AND a frontswap backend registers AND the backend fails every "put"
+AND a frontswap backend registers AND the backend fails every "store"
 request (i.e. provides no memory despite claiming it might),
 CPU overhead is still negligible -- and since every frontswap fail
 precedes a swap page write-to-disk, the system is highly likely
@@ -159,13 +159,13 @@ entirely dynamic and random.
 
 Whenever a swap-device is swapon'd frontswap_init() is called,
 passing the swap device number (aka "type") as a parameter.
-This notifies frontswap to expect attempts to "put" swap pages
+This notifies frontswap to expect attempts to "store" swap pages
 associated with that number.
 
 Whenever the swap subsystem is readying a page to write to a swap
-device (c.f swap_writepage()), frontswap_put_page is called. Frontswap
+device (c.f swap_writepage()), frontswap_store is called. Frontswap
 consults with the frontswap backend and if the backend says it does NOT
-have room, frontswap_put_page returns -1 and the kernel swaps the page
+have room, frontswap_store returns -1 and the kernel swaps the page
 to the swap device as normal. Note that the response from the frontswap
 backend is unpredictable to the kernel; it may choose to never accept a
 page, it could accept every ninth page, or it might accept every
@@ -177,7 +177,7 @@ corresponding to the page offset on the swap device to which it would
 otherwise have written the data.
 
 When the swap subsystem needs to swap-in a page (swap_readpage()),
-it first calls frontswap_get_page() which checks the frontswap_map to
+it first calls frontswap_load() which checks the frontswap_map to
 see if the page was earlier accepted by the frontswap backend. If
 it was, the page of data is filled from the frontswap backend and
 the swap-in is complete. If not, the normal swap-in code is
@@ -185,7 +185,7 @@ executed to obtain the page of data from the real swap device.
 
 So every time the frontswap backend accepts a page, a swap device read
 and (potentially) a swap device write are replaced by a "frontswap backend
-put" and (possibly) a "frontswap backend get", which are presumably much
+store" and (possibly) a "frontswap backend loads", which are presumably much
 faster.
 
 4) Can't frontswap be configured as a "special" swap device that is
@@ -215,8 +215,8 @@ that are inappropriate for a RAM-oriented device including delaying
 the write of some pages for a significant amount of time. Synchrony is
 required to ensure the dynamicity of the backend and to avoid thorny race
 conditions that would unnecessarily and greatly complicate frontswap
-and/or the block I/O subsystem. That said, only the initial "put"
-and "get" operations need be synchronous. A separate asynchronous thread
+and/or the block I/O subsystem. That said, only the initial "store"
+and "load" operations need be synchronous. A separate asynchronous thread
 is free to manipulate the pages stored by frontswap. For example,
 the "remotification" thread in RAMster uses standard asynchronous
 kernel sockets to move compressed frontswap pages to a remote machine.
@@ -229,7 +229,7 @@ choose to accept pages only until host-swapping might be imminent,
 then force guests to do their own swapping.
 
 There is a downside to the transcendent memory specifications for
-frontswap: Since any "put" might fail, there must always be a real
+frontswap: Since any "store" might fail, there must always be a real
 slot on a real swap device to swap the page. Thus frontswap must be
 implemented as a "shadow" to every swapon'd device with the potential
 capability of holding every page that the swap device might have held
@@ -240,16 +240,16 @@ installation, frontswap is useless. Swapless portable devices
 can still use frontswap but a backend for such devices must configure
 some kind of "ghost" swap device and ensure that it is never used.
 
-5) Why this weird definition about "duplicate puts"? If a page
-   has been previously successfully put, can't it always be
+5) Why this weird definition about "duplicate stores"? If a page
+   has been previously successfully stored, can't it always be
    successfully overwritten?
 
 Nearly always it can, but no, sometimes it cannot. Consider an example
 where data is compressed and the original 4K page has been compressed
 to 1K. Now an attempt is made to overwrite the page with data that
 is non-compressible and so would take the entire 4K. But the backend
-has no more space. In this case, the put must be rejected. Whenever
-frontswap rejects a put that would overwrite, it also must invalidate
+has no more space. In this case, the store must be rejected. Whenever
+frontswap rejects a store that would overwrite, it also must invalidate
 the old data and ensure that it is no longer accessible. Since the
 swap subsystem then writes the new data to the read swap device,
 this is the correct course of action to ensure coherency.
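
The store/load/invalidate policies that the patched text describes can be
illustrated with a small userspace sketch. This is only a toy model under
stated assumptions: struct toy_backend, the toy_*() functions, the "refuse"
flag and the fixed sizes are all hypothetical names invented for this
example, and none of it is the kernel's frontswap_ops interface. It shows
that a load does not remove the page, that a full backend rejects a store,
and that a rejected duplicate store also invalidates the old copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; this is not the kernel's frontswap_ops. */
#define TOY_PAGE_SIZE 4096  /* illustrative page size */
#define TOY_NR_SLOTS  8     /* illustrative number of swap offsets */

struct toy_backend {
	unsigned char data[TOY_NR_SLOTS][TOY_PAGE_SIZE];
	bool present[TOY_NR_SLOTS]; /* which offsets currently hold a page */
	size_t capacity;            /* max pages the backend will hold */
	size_t used;                /* pages currently held */
	bool refuse;                /* when set, every store is rejected */
};

/* "init": prepare the backend to receive pages for one swap type. */
void toy_init(struct toy_backend *b, size_t capacity)
{
	memset(b, 0, sizeof(*b));
	b->capacity = capacity;
}

/* "invalidate_page": drop one page so it can no longer be loaded. */
void toy_invalidate_page(struct toy_backend *b, size_t offset)
{
	assert(offset < TOY_NR_SLOTS);
	if (b->present[offset]) {
		b->present[offset] = false;
		b->used--;
	}
}

/* "invalidate_area": drop ALL pages and refuse further stores. */
void toy_invalidate_area(struct toy_backend *b)
{
	memset(b->present, 0, sizeof(b->present));
	b->used = 0;
	b->refuse = true;
}

/*
 * "store": returns 0 on success, -1 on rejection. A rejected duplicate
 * store invalidates the old copy so stale data can never be loaded.
 */
int toy_store(struct toy_backend *b, size_t offset, const unsigned char *page)
{
	assert(offset < TOY_NR_SLOTS);
	bool dup = b->present[offset];

	if (b->refuse || (!dup && b->used >= b->capacity)) {
		if (dup)
			toy_invalidate_page(b, offset);
		return -1; /* caller writes the page to the real swap device */
	}
	memcpy(b->data[offset], page, TOY_PAGE_SIZE);
	if (!dup) {
		b->present[offset] = true;
		b->used++;
	}
	return 0;
}

/* "load": copy the page out if found; the backend keeps its copy. */
int toy_load(const struct toy_backend *b, size_t offset, unsigned char *page)
{
	assert(offset < TOY_NR_SLOTS);
	if (!b->present[offset])
		return -1; /* caller falls back to the normal swap-in path */
	memcpy(page, b->data[offset], TOY_PAGE_SIZE);
	return 0;
}
```

A caller would treat a -1 from toy_store() as "write the page to the swap
device as usual" and a -1 from toy_load() as "read it from the swap device",
which mirrors the fallback behaviour described in the documentation text.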