-rw-r--r--  Documentation/vm/zsmalloc.txt | 70
-rw-r--r--  MAINTAINERS                   |  1
-rw-r--r--  mm/zsmalloc.c                 | 29
3 files changed, 71 insertions, 29 deletions
diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt
new file mode 100644
index 000000000000..64ed63c4f69d
--- /dev/null
+++ b/Documentation/vm/zsmalloc.txt
@@ -0,0 +1,70 @@
zsmalloc
--------

This allocator is designed for use with zram. Thus, the allocator is
supposed to work well under low memory conditions. In particular, it
never attempts higher order page allocation which is very likely to
fail under memory pressure. On the other hand, if we just use single
(0-order) pages, it would suffer from very high fragmentation --
any object of size PAGE_SIZE/2 or larger would occupy an entire page.
This was one of the major issues with its predecessor (xvmalloc).

To overcome these issues, zsmalloc allocates a bunch of 0-order pages
and links them together using various 'struct page' fields. These linked
pages act as a single higher-order page i.e. an object can span 0-order
page boundaries. The code refers to these linked pages as a single entity
called zspage.

For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
since this satisfies the requirements of all its current users (in the
worst case, a page is incompressible and is thus stored "as-is", i.e. in
uncompressed form). For allocation requests larger than this size, failure
is returned (see zs_malloc).

Additionally, zs_malloc() does not return a dereferenceable pointer.
Instead, it returns an opaque handle (unsigned long) which encodes the
actual location of the allocated object. The reason for this indirection
is that zsmalloc does not keep zspages permanently mapped, since that
would cause issues on 32-bit systems where the VA region for kernel space
mappings is very small. So, before using the allocated memory, the object
has to be mapped using zs_map_object() to get a usable pointer and
subsequently unmapped using zs_unmap_object().

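The typical calling sequence is therefore: allocate a handle, map it for
the duration of the access, then unmap and eventually free it. The sketch
below is illustrative only: store_compressed() is a made-up helper, and
the exact arguments of zs_malloc() (and of zs_create_pool(), not shown)
have varied across kernel versions.

	#include <linux/zsmalloc.h>
	#include <linux/errno.h>
	#include <linux/string.h>

	static int store_compressed(struct zs_pool *pool, const void *src,
				    size_t len)
	{
		unsigned long handle;
		void *dst;

		/* zs_malloc() hands back an opaque handle; 0 means failure. */
		handle = zs_malloc(pool, len);
		if (!handle)
			return -ENOMEM;

		/*
		 * Map the object to get a pointer that is valid only until
		 * the matching zs_unmap_object(), so keep this window short.
		 */
		dst = zs_map_object(pool, handle, ZS_MM_WO);
		memcpy(dst, src, len);
		zs_unmap_object(pool, handle);

		/* The handle (not a pointer) is what gets stored and later zs_free()d. */
		return 0;
	}
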
stat
----

With CONFIG_ZSMALLOC_STAT, we can see zsmalloc internal information via
/sys/kernel/debug/zsmalloc/<user name>. Here is a sample of stat output:

# cat /sys/kernel/debug/zsmalloc/zram0/classes

 class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage
    ..
    ..
     9   176           0            1           186        129          8                4
    10   192           1            0          2880       2872        135                3
    11   208           0            1           819        795         42                2
    12   224           0            1           219        159         12                4
    ..
    ..


class: index
size: object size zspage stores
almost_full: the number of ZS_ALMOST_FULL zspages (see below)
almost_empty: the number of ZS_ALMOST_EMPTY zspages (see below)
obj_allocated: the number of objects allocated
obj_used: the number of objects allocated to the user
pages_used: the number of pages allocated for the class
pages_per_zspage: the number of 0-order pages to make a zspage

We assign a zspage to the ZS_ALMOST_EMPTY fullness group when:
	n <= N / f, where
	n = number of allocated objects
	N = total number of objects the zspage can store
	f = fullness_threshold_frac (i.e. 4 at the moment)

Similarly, we assign a zspage to:
	ZS_ALMOST_FULL  when n > N / f
	ZS_EMPTY        when n == 0
	ZS_FULL         when n == N
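
Expressed as code, the classification above amounts to the following
sketch. It is illustrative only: the in-tree helper is get_fullness_group()
in mm/zsmalloc.c (its arguments have changed over time), and the snippet
assumes the enum fullness_group values and fullness_threshold_frac defined
in that file are in scope.

	/* Classify a zspage holding n objects out of a capacity of N. */
	static enum fullness_group classify_zspage(unsigned int n, unsigned int N)
	{
		if (n == 0)
			return ZS_EMPTY;
		if (n == N)
			return ZS_FULL;
		if (n <= N / fullness_threshold_frac)
			return ZS_ALMOST_EMPTY;
		return ZS_ALMOST_FULL;
	}
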
diff --git a/MAINTAINERS b/MAINTAINERS
index 6ee1e79ea16b..190981382853 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10972,6 +10972,7 @@ L: linux-mm@kvack.org
S:	Maintained
F:	mm/zsmalloc.c
F:	include/linux/zsmalloc.h
F:	Documentation/vm/zsmalloc.txt

ZSWAP COMPRESSED SWAP CACHING
M:	Seth Jennings <sjennings@variantweb.net>
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 461243e14d3e..1833fc9e09cb 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -12,35 +12,6 @@
 */

/*
 * This allocator is designed for use with zram. Thus, the allocator is
 * supposed to work well under low memory conditions. In particular, it
 * never attempts higher order page allocation which is very likely to
 * fail under memory pressure. On the other hand, if we just use single
 * (0-order) pages, it would suffer from very high fragmentation --
 * any object of size PAGE_SIZE/2 or larger would occupy an entire page.
 * This was one of the major issues with its predecessor (xvmalloc).
 *
 * To overcome these issues, zsmalloc allocates a bunch of 0-order pages
 * and links them together using various 'struct page' fields. These linked
 * pages act as a single higher-order page i.e. an object can span 0-order
 * page boundaries. The code refers to these linked pages as a single entity
 * called zspage.
 *
 * For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
 * since this satisfies the requirements of all its current users (in the
 * worst case, page is incompressible and is thus stored "as-is" i.e. in
 * uncompressed form). For allocation requests larger than this size, failure
 * is returned (see zs_malloc).
 *
 * Additionally, zs_malloc() does not return a dereferenceable pointer.
 * Instead, it returns an opaque handle (unsigned long) which encodes actual
 * location of the allocated object. The reason for this indirection is that
 * zsmalloc does not keep zspages permanently mapped since that would cause
 * issues on 32-bit systems where the VA region for kernel space mappings
 * is very small. So, before using the allocating memory, the object has to
 * be mapped using zs_map_object() to get a usable pointer and subsequently
 * unmapped using zs_unmap_object().
 *
 * Following is how we use various fields and flags of underlying
 * struct page(s) to form a zspage.
 *