author | David Hildenbrand <david@redhat.com> | 2018-10-30 18:10:44 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-10-31 11:54:17 -0400 |
commit | dee6da22efac451d361f5224a60be2796d847b51 (patch) | |
tree | 19ce8a83f2848e43829faf95f3203fc66483fd64 | |
parent | 5666848774ef43d3db5151ec518f1deb63515c20 (diff) |
memory-hotplug.rst: add some details about locking internals
Let's document the magic a bit, especially why device_hotplug_lock is
required when adding/removing memory and how it all plays together with
requests to online/offline memory from user space.
Link: http://lkml.kernel.org/r/20180925091457.28651-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r-- | Documentation/admin-guide/mm/memory-hotplug.rst | 42 |
1 files changed, 41 insertions, 1 deletions
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 25157aec5b31..5c4432c96c4b 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -5,7 +5,7 @@ Memory Hotplug | |||
5 | ============== | 5 | ============== |
6 | 6 | ||
7 | :Created: Jul 28 2007 | 7 | :Created: Jul 28 2007 |
8 | :Updated: Add description of notifier of memory hotplug: Oct 11 2007 | 8 | :Updated: Add some details about locking internals: Aug 20 2018 |
9 | 9 | ||
10 | This document is about memory hotplug including how-to-use and current status. | 10 | This document is about memory hotplug including how-to-use and current status. |
11 | Because Memory Hotplug is still under development, contents of this text will | 11 | Because Memory Hotplug is still under development, contents of this text will |
@@ -392,6 +392,46 @@ Need more implementation yet.... | |||
392 | - Notification completion of remove works by OS to firmware. | 392 | - Notification completion of remove works by OS to firmware. |
393 | - Guard from remove if not yet. | 393 | - Guard from remove if not yet. |
394 | 394 | ||
395 | |||
396 | Locking Internals | ||
397 | ================= | ||
398 | |||
399 | When adding/removing memory that uses memory block devices (i.e. ordinary RAM), | ||
400 | the device_hotplug_lock should be held to: | ||
401 | |||
402 | - synchronize against online/offline requests (e.g. via sysfs). This way, memory | ||
403 | block devices can only be accessed (.online/.state attributes) by user | ||
404 | space once memory has been fully added. And when removing memory, we | ||
405 | know nobody is in critical sections. | ||
406 | - synchronize against CPU hotplug and similar (e.g. relevant for ACPI and PPC) | ||
407 | |||
408 | Especially, there is a possible lock inversion that is avoided using | ||
409 | device_hotplug_lock when adding memory and user space tries to online that | ||
410 | memory faster than expected: | ||
411 | |||
412 | - device_online() will first take the device_lock(), followed by | ||
413 | mem_hotplug_lock | ||
414 | - add_memory_resource() will first take the mem_hotplug_lock, followed by | ||
415 | the device_lock() (while creating the devices, during bus_add_device()). | ||
416 | |||
417 | As the device is visible to user space before taking the device_lock(), this | ||
418 | can result in a lock inversion. | ||
419 | |||
420 | onlining/offlining of memory should be done via device_online()/ | ||
421 | device_offline() - to make sure it is properly synchronized to actions | ||
422 | via sysfs. Holding device_hotplug_lock is advised (to e.g. protect online_type) | ||
423 | |||
424 | When adding/removing/onlining/offlining memory or adding/removing | ||
425 | heterogeneous/device memory, we should always hold the mem_hotplug_lock in | ||
426 | write mode to serialise memory hotplug (e.g. access to global/zone | ||
427 | variables). | ||
428 | |||
429 | In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) in read | ||
430 | mode allows for a quite efficient get_online_mems/put_online_mems | ||
431 | implementation, so code accessing memory can protect from that memory | ||
432 | vanishing. | ||
433 | |||
434 | |||
395 | Future Work | 435 | Future Work |
396 | =========== | 436 | =========== |
397 | 437 | ||