Diffstat (limited to 'Documentation/cgroup-v1/rdma.rst')
| -rw-r--r-- | Documentation/cgroup-v1/rdma.rst | 117 |
1 files changed, 117 insertions, 0 deletions
diff --git a/Documentation/cgroup-v1/rdma.rst b/Documentation/cgroup-v1/rdma.rst
new file mode 100644
index 000000000000..2fcb0a9bf790
--- /dev/null
+++ b/Documentation/cgroup-v1/rdma.rst
| @@ -0,0 +1,117 @@ | |||
| 1 | =============== | ||
| 2 | RDMA Controller | ||
| 3 | =============== | ||
| 4 | |||
| 5 | .. Contents | ||
| 6 | |||
| 7 | 1. Overview | ||
| 8 | 1-1. What is RDMA controller? | ||
| 9 | 1-2. Why is RDMA controller needed? | |
| 10 | 1-3. How is RDMA controller implemented? | ||
| 11 | 2. Usage Examples | ||
| 12 | |||
| 13 | 1. Overview | ||
| 14 | =========== | ||
| 15 | |||
| 16 | 1-1. What is RDMA controller? | ||
| 17 | ----------------------------- | ||
| 18 | |||
| 19 | The RDMA controller lets a user limit the RDMA/IB-specific resources that a | |
| 20 | given set of processes can use. These processes are grouped using the RDMA | |
| 21 | controller. | |
| 22 | The RDMA controller defines two resources, hca_handle and hca_object, which | |
| 23 | can be limited for the processes of a cgroup. | |
| 24 | |||
| 25 | 1-2. Why is RDMA controller needed? | |
| 26 | ----------------------------------- | |
| 27 | |||
| 28 | Currently, user space applications can easily consume all the RDMA verb | |
| 29 | specific resources such as AH, CQ, QP, and MR. As a result, applications | |
| 30 | in other cgroups or kernel space ULPs may not get a chance to allocate any | |
| 31 | RDMA resources at all. This can lead to service unavailability. | |
| 32 | |||
| 33 | The RDMA controller is therefore needed to limit the resource consumption | |
| 34 | of processes. The controller also accounts for the different RDMA | |
| 35 | resources in use. | |
| 36 | |||
| 37 | 1-3. How is RDMA controller implemented? | ||
| 38 | ---------------------------------------- | ||
| 39 | |||
| 40 | The RDMA cgroup allows resource limits to be configured. It maintains | |
| 41 | resource accounting per cgroup, per device, using a resource pool structure. | |
| 42 | Each resource pool is limited to 64 resources by the RDMA cgroup; this limit | |
| 43 | can be extended later if required. | |
| 44 | |||
| 45 | This resource pool object is linked to the cgroup css. Typically there | |
| 46 | are 0 to 4 resource pool instances per cgroup, per device, but nothing | |
| 47 | prevents having more. At present, hundreds of RDMA devices per single | |
| 48 | cgroup may not be handled optimally; however, there is no known use case or | |
| 49 | requirement for such a configuration either. | |
| 50 | |||
| 51 | Since RDMA resources can be allocated by any process and freed by any of the | |
| 52 | child processes that share the address space, RDMA resources are always | |
| 53 | owned by the creator cgroup css. This allows a process to migrate from one | |
| 54 | cgroup to another without the complexity of transferring resource ownership, | |
| 55 | because per-process ownership does not really exist given the shared nature of | |
| 56 | RDMA resources. Linking resources to the css also ensures that cgroups can be | |
| 57 | deleted after their processes have migrated. This allows process migration | |
| 58 | with active resources as well, even though that is not a primary use case. | |
| 59 | |||
| 60 | Whenever an RDMA resource is charged, the owner RDMA cgroup is returned to | |
| 61 | the caller. The same RDMA cgroup should be passed when uncharging the | |
| 62 | resource. This allows a process that migrated with active RDMA resources to | |
| 63 | charge new resources to its new owner cgroup. It also allows the resources of | |
| 64 | a migrated process to be uncharged from the cgroup that was originally charged, | |
| 65 | even though that is not a primary use case. | |
| 66 | |||
| 67 | A resource pool object is created in the following situations. | |
| 68 | (a) The user sets a limit and no resource pool exists yet for the device | |
| 69 | of interest for the cgroup. | |
| 70 | (b) No resource limits were configured, but the IB/RDMA stack tries to | |
| 71 | charge a resource. The pool is needed so that resources charged while | |
| 72 | applications run without limits are uncharged correctly if limits are enforced | |
| 73 | later; otherwise the usage count would drop below zero. | |
| 74 | |||
| 75 | A resource pool is destroyed when all of its resource limits are set to max | |
| 76 | and the last resource is deallocated. | |
| 77 | |||
| 78 | The user should set all limits to max in order to remove/unconfigure | |
| 79 | the resource pool for a particular device. | |
| 80 | |||
| 81 | The IB stack honors the limits enforced by the RDMA controller. When an | |
| 82 | application queries the maximum resource limits of an IB device, the stack | |
| 83 | returns the minimum of what the user configured for the given cgroup and what | |
| 84 | the IB device supports. | |
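
The minimum rule described above can be sketched in shell. This is only an illustration of the arithmetic, not kernel code: `effective_limit` is a hypothetical helper, and the device maximum of 65536 is a made-up value chosen for the example.

```shell
# Hypothetical helper (not part of the kernel or any tool):
#   effective_limit CGROUP_LIMIT DEVICE_MAX
# CGROUP_LIMIT is a number or the literal "max" (no cgroup limit set).
# DEVICE_MAX is the number the device itself supports.
effective_limit() {
    cgroup_limit=$1
    device_max=$2
    # "max" means the cgroup imposes no limit, so the device maximum applies.
    # The numeric comparison only runs when the limit is not "max".
    if [ "$cgroup_limit" = "max" ] || [ "$cgroup_limit" -gt "$device_max" ]; then
        echo "$device_max"
    else
        echo "$cgroup_limit"
    fi
}

effective_limit 2000 65536   # cgroup limit is below the assumed device maximum
effective_limit max 65536    # no cgroup limit configured
```

Under these assumptions the first call reports the cgroup limit and the second reports the device maximum.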
| 85 | |||
| 86 | The following resources can be accounted by the RDMA controller. | |
| 87 | |||
| 88 | ========== ============================= | ||
| 89 | hca_handle Maximum number of HCA Handles | ||
| 90 | hca_object Maximum number of HCA Objects | ||
| 91 | ========== ============================= | ||
| 92 | |||
| 93 | 2. Usage Examples | ||
| 94 | ================= | ||
| 95 | |||
| 96 | (a) Configure resource limit:: | ||
| 97 | |||
| 98 | echo mlx4_0 hca_handle=2 hca_object=2000 > /sys/fs/cgroup/rdma/1/rdma.max | ||
| 99 | echo ocrdma1 hca_handle=3 > /sys/fs/cgroup/rdma/2/rdma.max | ||
| 100 | |||
| 101 | (b) Query resource limit:: | ||
| 102 | |||
| 103 | cat /sys/fs/cgroup/rdma/2/rdma.max | ||
| 104 | #Output: | ||
| 105 | mlx4_0 hca_handle=2 hca_object=2000 | ||
| 106 | ocrdma1 hca_handle=3 hca_object=max | ||
| 107 | |||
| 108 | (c) Query current usage:: | ||
| 109 | |||
| 110 | cat /sys/fs/cgroup/rdma/2/rdma.current | ||
| 111 | #Output: | ||
| 112 | mlx4_0 hca_handle=1 hca_object=20 | ||
| 113 | ocrdma1 hca_handle=1 hca_object=23 | ||
| 114 | |||
| 115 | (d) Delete resource limit:: | ||
| 116 | |||
| 117 | echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max | |
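
A script that consumes these files usually needs a single value rather than the whole line. The following is a minimal sketch of extracting one value from the `rdma.max` / `rdma.current` format shown above. `rdma_get` is a hypothetical helper, not part of any existing tool, and the sample text is copied from example (b) so that the snippet runs without an RDMA cgroup mounted.

```shell
# Hypothetical helper (an assumption for illustration):
#   rdma_get DEVICE KEY
# Reads rdma.max or rdma.current formatted text on stdin and prints the
# value of KEY for DEVICE. Prints nothing if the device or key is absent.
rdma_get() {
    awk -v dev="$1" -v key="$2" '
        $1 == dev {
            # Fields after the device name are key=value pairs.
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == key) print kv[2]
            }
        }'
}

# Sample text copied from example (b). On a live system, redirect stdin
# from a file such as /sys/fs/cgroup/rdma/2/rdma.max instead.
sample='mlx4_0 hca_handle=2 hca_object=2000
ocrdma1 hca_handle=3 hca_object=max'

printf '%s\n' "$sample" | rdma_get ocrdma1 hca_object
printf '%s\n' "$sample" | rdma_get mlx4_0 hca_handle
```

A value of `max` means no limit is configured for that resource, so callers should handle it separately from numeric values.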
