author | Kees Cook <keescook@chromium.org> | 2017-05-13 07:51:41 -0400 |
---|---|---|
committer | Jonathan Corbet <corbet@lwn.net> | 2017-05-18 12:30:23 -0400 |
commit | c2ed6743434d1d9ef49b044c6bdfd6ac1ce140a2 (patch) | |
tree | 3474f0927898a416ceb19b3ef0a5588ea0b7da11 /Documentation/security | |
parent | af777cd1b83e95138e7285fde87c795ef0ae7c4d (diff) |
doc: ReSTify self-protection.txt
This updates the credentials API documentation to ReST markup and moves
it under the security subsection of kernel API documentation.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Diffstat (limited to 'Documentation/security')
-rw-r--r-- | Documentation/security/index.rst | 1 | ||||
-rw-r--r-- | Documentation/security/self-protection.rst (renamed from Documentation/security/self-protection.txt) | 99 |
2 files changed, 64 insertions, 36 deletions
diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst index 415be8e0b013..4212d7ac58b6 100644 --- a/Documentation/security/index.rst +++ b/Documentation/security/index.rst | |||
@@ -7,4 +7,5 @@ Security Documentation | |||
7 | 7 | ||
8 | credentials | 8 | credentials |
9 | IMA-templates | 9 | IMA-templates |
10 | self-protection | ||
10 | tpm/index | 11 | tpm/index |
diff --git a/Documentation/security/self-protection.txt b/Documentation/security/self-protection.rst index 141acfebe6ef..60c8bd8b77bf 100644 --- a/Documentation/security/self-protection.txt +++ b/Documentation/security/self-protection.rst | |||
@@ -1,4 +1,6 @@ | |||
1 | # Kernel Self-Protection | 1 | ====================== |
2 | Kernel Self-Protection | ||
3 | ====================== | ||
2 | 4 | ||
3 | Kernel self-protection is the design and implementation of systems and | 5 | Kernel self-protection is the design and implementation of systems and |
4 | structures within the Linux kernel to protect against security flaws in | 6 | structures within the Linux kernel to protect against security flaws in |
@@ -26,7 +28,8 @@ mentioning them, since these aspects need to be explored, dealt with, | |||
26 | and/or accepted. | 28 | and/or accepted. |
27 | 29 | ||
28 | 30 | ||
29 | ## Attack Surface Reduction | 31 | Attack Surface Reduction |
32 | ======================== | ||
30 | 33 | ||
31 | The most fundamental defense against security exploits is to reduce the | 34 | The most fundamental defense against security exploits is to reduce the |
32 | areas of the kernel that can be used to redirect execution. This ranges | 35 | areas of the kernel that can be used to redirect execution. This ranges |
@@ -34,13 +37,15 @@ from limiting the exposed APIs available to userspace, making in-kernel | |||
34 | APIs hard to use incorrectly, minimizing the areas of writable kernel | 37 | APIs hard to use incorrectly, minimizing the areas of writable kernel |
35 | memory, etc. | 38 | memory, etc. |
36 | 39 | ||
37 | ### Strict kernel memory permissions | 40 | Strict kernel memory permissions |
41 | -------------------------------- | ||
38 | 42 | ||
39 | When all of kernel memory is writable, it becomes trivial for attacks | 43 | When all of kernel memory is writable, it becomes trivial for attacks |
40 | to redirect execution flow. To reduce the availability of these targets | 44 | to redirect execution flow. To reduce the availability of these targets |
41 | the kernel needs to protect its memory with a tight set of permissions. | 45 | the kernel needs to protect its memory with a tight set of permissions. |
42 | 46 | ||
43 | #### Executable code and read-only data must not be writable | 47 | Executable code and read-only data must not be writable |
48 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
44 | 49 | ||
45 | Any areas of the kernel with executable memory must not be writable. | 50 | Any areas of the kernel with executable memory must not be writable. |
46 | While this obviously includes the kernel text itself, we must consider | 51 | While this obviously includes the kernel text itself, we must consider |
@@ -51,18 +56,19 @@ kernel, they are implemented in a way where the memory is temporarily | |||
51 | made writable during the update, and then returned to the original | 56 | made writable during the update, and then returned to the original |
52 | permissions.) | 57 | permissions.) |
53 | 58 | ||
54 | In support of this are CONFIG_STRICT_KERNEL_RWX and | 59 | In support of this are ``CONFIG_STRICT_KERNEL_RWX`` and |
55 | CONFIG_STRICT_MODULE_RWX, which seek to make sure that code is not | 60 | ``CONFIG_STRICT_MODULE_RWX``, which seek to make sure that code is not |
56 | writable, data is not executable, and read-only data is neither writable | 61 | writable, data is not executable, and read-only data is neither writable |
57 | nor executable. | 62 | nor executable. |
58 | 63 | ||
59 | Most architectures have these options on by default and not user selectable. | 64 | Most architectures have these options on by default and not user selectable. |
60 | For some architectures like arm that wish to have these be selectable, | 65 | For some architectures like arm that wish to have these be selectable, |
61 | the architecture Kconfig can select ARCH_OPTIONAL_KERNEL_RWX to enable | 66 | the architecture Kconfig can select ARCH_OPTIONAL_KERNEL_RWX to enable |
62 | a Kconfig prompt. CONFIG_ARCH_OPTIONAL_KERNEL_RWX_DEFAULT determines | 67 | a Kconfig prompt. ``CONFIG_ARCH_OPTIONAL_KERNEL_RWX_DEFAULT`` determines |
63 | the default setting when ARCH_OPTIONAL_KERNEL_RWX is enabled. | 68 | the default setting when ARCH_OPTIONAL_KERNEL_RWX is enabled. |
64 | 69 | ||
65 | #### Function pointers and sensitive variables must not be writable | 70 | Function pointers and sensitive variables must not be writable |
71 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
66 | 72 | ||
67 | Vast areas of kernel memory contain function pointers that are looked | 73 | Vast areas of kernel memory contain function pointers that are looked |
68 | up by the kernel and used to continue execution (e.g. descriptor/vector | 74 | up by the kernel and used to continue execution (e.g. descriptor/vector |
@@ -74,8 +80,8 @@ so that they live in the .rodata section instead of the .data section | |||
74 | of the kernel, gaining the protection of the kernel's strict memory | 80 | of the kernel, gaining the protection of the kernel's strict memory |
75 | permissions as described above. | 81 | permissions as described above. |
76 | 82 | ||
77 | For variables that are initialized once at __init time, these can | 83 | For variables that are initialized once at ``__init`` time, these can |
78 | be marked with the (new and under development) __ro_after_init | 84 | be marked with the (new and under development) ``__ro_after_init`` |
79 | attribute. | 85 | attribute. |
80 | 86 | ||
81 | What remains are variables that are updated rarely (e.g. GDT). These | 87 | What remains are variables that are updated rarely (e.g. GDT). These |
@@ -85,7 +91,8 @@ of their lifetime read-only. (For example, when being updated, only the | |||
85 | CPU thread performing the update would be given uninterruptible write | 91 | CPU thread performing the update would be given uninterruptible write |
86 | access to the memory.) | 92 | access to the memory.) |
87 | 93 | ||
88 | #### Segregation of kernel memory from userspace memory | 94 | Segregation of kernel memory from userspace memory |
95 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
89 | 96 | ||
90 | The kernel must never execute userspace memory. The kernel must also never | 97 | The kernel must never execute userspace memory. The kernel must also never |
91 | access userspace memory without explicit expectation to do so. These | 98 | access userspace memory without explicit expectation to do so. These |
@@ -95,10 +102,11 @@ By blocking userspace memory in this way, execution and data parsing | |||
95 | cannot be passed to trivially-controlled userspace memory, forcing | 102 | cannot be passed to trivially-controlled userspace memory, forcing |
96 | attacks to operate entirely in kernel memory. | 103 | attacks to operate entirely in kernel memory. |
97 | 104 | ||
98 | ### Reduced access to syscalls | 105 | Reduced access to syscalls |
106 | -------------------------- | ||
99 | 107 | ||
100 | One trivial way to eliminate many syscalls for 64-bit systems is building | 108 | One trivial way to eliminate many syscalls for 64-bit systems is building |
101 | without CONFIG_COMPAT. However, this is rarely a feasible scenario. | 109 | without ``CONFIG_COMPAT``. However, this is rarely a feasible scenario. |
102 | 110 | ||
103 | The "seccomp" system provides an opt-in feature made available to | 111 | The "seccomp" system provides an opt-in feature made available to |
104 | userspace, which provides a way to reduce the number of kernel entry | 112 | userspace, which provides a way to reduce the number of kernel entry |
@@ -112,7 +120,8 @@ to trusted processes. This would keep the scope of kernel entry points | |||
112 | restricted to the more regular set of normally available to unprivileged | 120 | restricted to the more regular set of normally available to unprivileged |
113 | userspace. | 121 | userspace. |
114 | 122 | ||
115 | ### Restricting access to kernel modules | 123 | Restricting access to kernel modules |
124 | ------------------------------------ | ||
116 | 125 | ||
117 | The kernel should never allow an unprivileged user the ability to | 126 | The kernel should never allow an unprivileged user the ability to |
118 | load specific kernel modules, since that would provide a facility to | 127 | load specific kernel modules, since that would provide a facility to |
@@ -127,11 +136,12 @@ for debate in some scenarios.) | |||
127 | To protect against even privileged users, systems may need to either | 136 | To protect against even privileged users, systems may need to either |
128 | disable module loading entirely (e.g. monolithic kernel builds or | 137 | disable module loading entirely (e.g. monolithic kernel builds or |
129 | modules_disabled sysctl), or provide signed modules (e.g. | 138 | modules_disabled sysctl), or provide signed modules (e.g. |
130 | CONFIG_MODULE_SIG_FORCE, or dm-crypt with LoadPin), to keep from having | 139 | ``CONFIG_MODULE_SIG_FORCE``, or dm-crypt with LoadPin), to keep from having |
131 | root load arbitrary kernel code via the module loader interface. | 140 | root load arbitrary kernel code via the module loader interface. |
132 | 141 | ||
133 | 142 | ||
134 | ## Memory integrity | 143 | Memory integrity |
144 | ================ | ||
135 | 145 | ||
136 | There are many memory structures in the kernel that are regularly abused | 146 | There are many memory structures in the kernel that are regularly abused |
137 | to gain execution control during an attack, By far the most commonly | 147 | to gain execution control during an attack, By far the most commonly |
@@ -139,16 +149,18 @@ understood is that of the stack buffer overflow in which the return | |||
139 | address stored on the stack is overwritten. Many other examples of this | 149 | address stored on the stack is overwritten. Many other examples of this |
140 | kind of attack exist, and protections exist to defend against them. | 150 | kind of attack exist, and protections exist to defend against them. |
141 | 151 | ||
142 | ### Stack buffer overflow | 152 | Stack buffer overflow |
153 | --------------------- | ||
143 | 154 | ||
144 | The classic stack buffer overflow involves writing past the expected end | 155 | The classic stack buffer overflow involves writing past the expected end |
145 | of a variable stored on the stack, ultimately writing a controlled value | 156 | of a variable stored on the stack, ultimately writing a controlled value |
146 | to the stack frame's stored return address. The most widely used defense | 157 | to the stack frame's stored return address. The most widely used defense |
147 | is the presence of a stack canary between the stack variables and the | 158 | is the presence of a stack canary between the stack variables and the |
148 | return address (CONFIG_CC_STACKPROTECTOR), which is verified just before | 159 | return address (``CONFIG_CC_STACKPROTECTOR``), which is verified just before |
149 | the function returns. Other defenses include things like shadow stacks. | 160 | the function returns. Other defenses include things like shadow stacks. |
150 | 161 | ||
151 | ### Stack depth overflow | 162 | Stack depth overflow |
163 | -------------------- | ||
152 | 164 | ||
153 | A less well understood attack is using a bug that triggers the | 165 | A less well understood attack is using a bug that triggers the |
154 | kernel to consume stack memory with deep function calls or large stack | 166 | kernel to consume stack memory with deep function calls or large stack |
@@ -158,27 +170,31 @@ important changes need to be made for better protections: moving the | |||
158 | sensitive thread_info structure elsewhere, and adding a faulting memory | 170 | sensitive thread_info structure elsewhere, and adding a faulting memory |
159 | hole at the bottom of the stack to catch these overflows. | 171 | hole at the bottom of the stack to catch these overflows. |
160 | 172 | ||
161 | ### Heap memory integrity | 173 | Heap memory integrity |
174 | --------------------- | ||
162 | 175 | ||
163 | The structures used to track heap free lists can be sanity-checked during | 176 | The structures used to track heap free lists can be sanity-checked during |
164 | allocation and freeing to make sure they aren't being used to manipulate | 177 | allocation and freeing to make sure they aren't being used to manipulate |
165 | other memory areas. | 178 | other memory areas. |
166 | 179 | ||
167 | ### Counter integrity | 180 | Counter integrity |
181 | ----------------- | ||
168 | 182 | ||
169 | Many places in the kernel use atomic counters to track object references | 183 | Many places in the kernel use atomic counters to track object references |
170 | or perform similar lifetime management. When these counters can be made | 184 | or perform similar lifetime management. When these counters can be made |
171 | to wrap (over or under) this traditionally exposes a use-after-free | 185 | to wrap (over or under) this traditionally exposes a use-after-free |
172 | flaw. By trapping atomic wrapping, this class of bug vanishes. | 186 | flaw. By trapping atomic wrapping, this class of bug vanishes. |
173 | 187 | ||
174 | ### Size calculation overflow detection | 188 | Size calculation overflow detection |
189 | ----------------------------------- | ||
175 | 190 | ||
176 | Similar to counter overflow, integer overflows (usually size calculations) | 191 | Similar to counter overflow, integer overflows (usually size calculations) |
177 | need to be detected at runtime to kill this class of bug, which | 192 | need to be detected at runtime to kill this class of bug, which |
178 | traditionally leads to being able to write past the end of kernel buffers. | 193 | traditionally leads to being able to write past the end of kernel buffers. |
179 | 194 | ||
180 | 195 | ||
181 | ## Statistical defenses | 196 | Probabilistic defenses |
197 | ====================== | ||
182 | 198 | ||
183 | While many protections can be considered deterministic (e.g. read-only | 199 | While many protections can be considered deterministic (e.g. read-only |
184 | memory cannot be written to), some protections provide only statistical | 200 | memory cannot be written to), some protections provide only statistical |
@@ -186,7 +202,8 @@ defense, in that an attack must gather enough information about a | |||
186 | running system to overcome the defense. While not perfect, these do | 202 | running system to overcome the defense. While not perfect, these do |
187 | provide meaningful defenses. | 203 | provide meaningful defenses. |
188 | 204 | ||
189 | ### Canaries, blinding, and other secrets | 205 | Canaries, blinding, and other secrets |
206 | ------------------------------------- | ||
190 | 207 | ||
191 | It should be noted that things like the stack canary discussed earlier | 208 | It should be noted that things like the stack canary discussed earlier |
192 | are technically statistical defenses, since they rely on a secret value, | 209 | are technically statistical defenses, since they rely on a secret value, |
@@ -201,7 +218,8 @@ It is critical that the secret values used must be separate (e.g. | |||
201 | different canary per stack) and high entropy (e.g. is the RNG actually | 218 | different canary per stack) and high entropy (e.g. is the RNG actually |
202 | working?) in order to maximize their success. | 219 | working?) in order to maximize their success. |
203 | 220 | ||
204 | ### Kernel Address Space Layout Randomization (KASLR) | 221 | Kernel Address Space Layout Randomization (KASLR) |
222 | ------------------------------------------------- | ||
205 | 223 | ||
206 | Since the location of kernel memory is almost always instrumental in | 224 | Since the location of kernel memory is almost always instrumental in |
207 | mounting a successful attack, making the location non-deterministic | 225 | mounting a successful attack, making the location non-deterministic |
@@ -209,22 +227,25 @@ raises the difficulty of an exploit. (Note that this in turn makes | |||
209 | the value of information exposures higher, since they may be used to | 227 | the value of information exposures higher, since they may be used to |
210 | discover desired memory locations.) | 228 | discover desired memory locations.) |
211 | 229 | ||
212 | #### Text and module base | 230 | Text and module base |
231 | ~~~~~~~~~~~~~~~~~~~~ | ||
213 | 232 | ||
214 | By relocating the physical and virtual base address of the kernel at | 233 | By relocating the physical and virtual base address of the kernel at |
215 | boot-time (CONFIG_RANDOMIZE_BASE), attacks needing kernel code will be | 234 | boot-time (``CONFIG_RANDOMIZE_BASE``), attacks needing kernel code will be |
216 | frustrated. Additionally, offsetting the module loading base address | 235 | frustrated. Additionally, offsetting the module loading base address |
217 | means that even systems that load the same set of modules in the same | 236 | means that even systems that load the same set of modules in the same |
218 | order every boot will not share a common base address with the rest of | 237 | order every boot will not share a common base address with the rest of |
219 | the kernel text. | 238 | the kernel text. |
220 | 239 | ||
221 | #### Stack base | 240 | Stack base |
241 | ~~~~~~~~~~ | ||
222 | 242 | ||
223 | If the base address of the kernel stack is not the same between processes, | 243 | If the base address of the kernel stack is not the same between processes, |
224 | or even not the same between syscalls, targets on or beyond the stack | 244 | or even not the same between syscalls, targets on or beyond the stack |
225 | become more difficult to locate. | 245 | become more difficult to locate. |
226 | 246 | ||
227 | #### Dynamic memory base | 247 | Dynamic memory base |
248 | ~~~~~~~~~~~~~~~~~~~ | ||
228 | 249 | ||
229 | Much of the kernel's dynamic memory (e.g. kmalloc, vmalloc, etc) ends up | 250 | Much of the kernel's dynamic memory (e.g. kmalloc, vmalloc, etc) ends up |
230 | being relatively deterministic in layout due to the order of early-boot | 251 | being relatively deterministic in layout due to the order of early-boot |
@@ -232,7 +253,8 @@ initializations. If the base address of these areas is not the same | |||
232 | between boots, targeting them is frustrated, requiring an information | 253 | between boots, targeting them is frustrated, requiring an information |
233 | exposure specific to the region. | 254 | exposure specific to the region. |
234 | 255 | ||
235 | #### Structure layout | 256 | Structure layout |
257 | ~~~~~~~~~~~~~~~~ | ||
236 | 258 | ||
237 | By performing a per-build randomization of the layout of sensitive | 259 | By performing a per-build randomization of the layout of sensitive |
238 | structures, attacks must either be tuned to known kernel builds or expose | 260 | structures, attacks must either be tuned to known kernel builds or expose |
@@ -240,26 +262,30 @@ enough kernel memory to determine structure layouts before manipulating | |||
240 | them. | 262 | them. |
241 | 263 | ||
242 | 264 | ||
243 | ## Preventing Information Exposures | 265 | Preventing Information Exposures |
266 | ================================ | ||
244 | 267 | ||
245 | Since the locations of sensitive structures are the primary target for | 268 | Since the locations of sensitive structures are the primary target for |
246 | attacks, it is important to defend against exposure of both kernel memory | 269 | attacks, it is important to defend against exposure of both kernel memory |
247 | addresses and kernel memory contents (since they may contain kernel | 270 | addresses and kernel memory contents (since they may contain kernel |
248 | addresses or other sensitive things like canary values). | 271 | addresses or other sensitive things like canary values). |
249 | 272 | ||
250 | ### Unique identifiers | 273 | Unique identifiers |
274 | ------------------ | ||
251 | 275 | ||
252 | Kernel memory addresses must never be used as identifiers exposed to | 276 | Kernel memory addresses must never be used as identifiers exposed to |
253 | userspace. Instead, use an atomic counter, an idr, or similar unique | 277 | userspace. Instead, use an atomic counter, an idr, or similar unique |
254 | identifier. | 278 | identifier. |
255 | 279 | ||
256 | ### Memory initialization | 280 | Memory initialization |
281 | --------------------- | ||
257 | 282 | ||
258 | Memory copied to userspace must always be fully initialized. If not | 283 | Memory copied to userspace must always be fully initialized. If not |
259 | explicitly memset(), this will require changes to the compiler to make | 284 | explicitly memset(), this will require changes to the compiler to make |
260 | sure structure holes are cleared. | 285 | sure structure holes are cleared. |
261 | 286 | ||
262 | ### Memory poisoning | 287 | Memory poisoning |
288 | ---------------- | ||
263 | 289 | ||
264 | When releasing memory, it is best to poison the contents (clear stack on | 290 | When releasing memory, it is best to poison the contents (clear stack on |
265 | syscall return, wipe heap memory on a free), to avoid reuse attacks that | 291 | syscall return, wipe heap memory on a free), to avoid reuse attacks that |
@@ -267,9 +293,10 @@ rely on the old contents of memory. This frustrates many uninitialized | |||
267 | variable attacks, stack content exposures, heap content exposures, and | 293 | variable attacks, stack content exposures, heap content exposures, and |
268 | use-after-free attacks. | 294 | use-after-free attacks. |
269 | 295 | ||
270 | ### Destination tracking | 296 | Destination tracking |
297 | -------------------- | ||
271 | 298 | ||
272 | To help kill classes of bugs that result in kernel addresses being | 299 | To help kill classes of bugs that result in kernel addresses being |
273 | written to userspace, the destination of writes needs to be tracked. If | 300 | written to userspace, the destination of writes needs to be tracked. If |
274 | the buffer is destined for userspace (e.g. seq_file backed /proc files), | 301 | the buffer is destined for userspace (e.g. seq_file backed ``/proc`` files), |
275 | it should automatically censor sensitive values. | 302 | it should automatically censor sensitive values. |
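
For readers following the "Size calculation overflow detection" section of the converted document, here is a minimal userspace sketch of the idea, not code from this patch or from the kernel. It assumes a GCC or Clang toolchain that provides `__builtin_mul_overflow`; the helper name `checked_array_alloc` is hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a kernel API): allocate room for n elements
 * of the given size, refusing the request when n * size would wrap.
 */
static void *checked_array_alloc(size_t n, size_t size)
{
	size_t bytes;

	/* Compiler builtin: returns nonzero if the product overflowed. */
	if (__builtin_mul_overflow(n, size, &bytes))
		return NULL;

	return malloc(bytes);
}
```

Without the check, a wrapped product would yield a buffer far smaller than the caller expects, which is the "write past the end of kernel buffers" outcome the section describes. Rejecting the request at the point of calculation turns that into an ordinary allocation failure.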