aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/ABI
diff options
context:
space:
mode:
authorAndi Kleen <andi@firstfloor.org>2009-12-16 06:20:00 -0500
committerAndi Kleen <ak@linux.intel.com>2009-12-16 06:20:00 -0500
commitfacb6011f3993947283fa15d039dacb4ad140230 (patch)
treec317e401fa7c867e1652879627163331f43085ef /Documentation/ABI
parent2326c467df4ff814dc07cf1bdaa1e6e0a9c9f21c (diff)
HWPOISON: Add soft page offline support
This is a simpler, gentler variant of memory_failure() for soft page offlining controlled from user space. It doesn't kill anything, just tries to invalidate and if that doesn't work migrate the page away. This is useful for predictive failure analysis, where a page has a high rate of corrected errors, but hasn't gone bad yet. Instead it can be offlined early and avoided. The offlining is controlled from sysfs, including a new generic entry point for hard page offlining for symmetry too. We use the page isolate facility to prevent re-allocation race. Normally this is only used by memory hotplug. To avoid races with memory allocation I am using lock_system_sleep(). This avoids the situation where memory hotplug is about to isolate a page range and then hwpoison undoes that work. This is a big hammer currently, but the simplest solution currently. When the page is not free or LRU we try to free pages from slab and other caches. The slab freeing is currently quite dumb and does not try to focus on the specific slab cache which might own the page. This could be potentially improved later. Thanks to Fengguang Wu and Haicheng Li for some fixes. [Added fix from Andrew Morton to adapt to new migrate_pages prototype] Signed-off-by: Andi Kleen <ak@linux.intel.com>
Diffstat (limited to 'Documentation/ABI')
-rw-r--r--Documentation/ABI/testing/sysfs-memory-page-offline44
1 files changed, 44 insertions, 0 deletions
diff --git a/Documentation/ABI/testing/sysfs-memory-page-offline b/Documentation/ABI/testing/sysfs-memory-page-offline
new file mode 100644
index 00000000000..e14703f12fd
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-memory-page-offline
@@ -0,0 +1,44 @@
1What: /sys/devices/system/memory/soft_offline_page
2Date: Sep 2009
3KernelVersion: 2.6.33
4Contact: andi@firstfloor.org
5Description:
6 Soft-offline the memory page containing the physical address
7 written into this file. Input is a hex number specifying the
8 physical address of the page. The kernel will then attempt
9 to soft-offline it, by moving the contents elsewhere or
10 dropping it if possible. The kernel will then be placed
11 on the bad page list and never be reused.
12
13 The offlining is done in kernel specific granuality.
14 Normally it's the base page size of the kernel, but
15 this might change.
16
17 The page must be still accessible, not poisoned. The
18 kernel will never kill anything for this, but rather
19 fail the offline. Return value is the size of the
20 number, or a error when the offlining failed. Reading
21 the file is not allowed.
22
23What: /sys/devices/system/memory/hard_offline_page
24Date: Sep 2009
25KernelVersion: 2.6.33
26Contact: andi@firstfloor.org
27Description:
28 Hard-offline the memory page containing the physical
29 address written into this file. Input is a hex number
30 specifying the physical address of the page. The
31 kernel will then attempt to hard-offline the page, by
32 trying to drop the page or killing any owner or
33 triggering IO errors if needed. Note this may kill
34 any processes owning the page. The kernel will avoid
35 to access this page assuming it's poisoned by the
36 hardware.
37
38 The offlining is done in kernel specific granuality.
39 Normally it's the base page size of the kernel, but
40 this might change.
41
42 Return value is the size of the number, or a error when
43 the offlining failed.
44 Reading the file is not allowed.