Diffstat (limited to 'Documentation/accounting/taskstats.txt')
-rw-r--r-- Documentation/accounting/taskstats.txt | 64
1 files changed, 52 insertions, 12 deletions
diff --git a/Documentation/accounting/taskstats.txt b/Documentation/accounting/taskstats.txt
index efd8f605bcd5..92ebf29e9041 100644
--- a/Documentation/accounting/taskstats.txt
+++ b/Documentation/accounting/taskstats.txt
@@ -26,20 +26,28 @@ leader - a process is deemed alive as long as it has any task belonging to it.
 Usage
 -----
 
-To get statistics during task's lifetime, userspace opens a unicast netlink
+To get statistics during a task's lifetime, userspace opens a unicast netlink
 socket (NETLINK_GENERIC family) and sends commands specifying a pid or a tgid.
 The response contains statistics for a task (if pid is specified) or the sum of
 statistics for all tasks of the process (if tgid is specified).
 
-To obtain statistics for tasks which are exiting, userspace opens a multicast
-netlink socket. Each time a task exits, its per-pid statistics is always sent
-by the kernel to each listener on the multicast socket. In addition, if it is
-the last thread exiting its thread group, an additional record containing the
-per-tgid stats are also sent. The latter contains the sum of per-pid stats for
-all threads in the thread group, both past and present.
+To obtain statistics for tasks which are exiting, the userspace listener
+sends a register command and specifies a cpumask. Whenever a task exits on
+one of the cpus in the cpumask, its per-pid statistics are sent to the
+registered listener. Using cpumasks allows the data received by one listener
+to be limited and assists in flow control over the netlink interface and is
+explained in more detail below.
+
+If the exiting task is the last thread exiting its thread group,
+an additional record containing the per-tgid stats is also sent to userspace.
+The latter contains the sum of per-pid stats for all threads in the thread
+group, both past and present.
 
 getdelays.c is a simple utility demonstrating usage of the taskstats interface
-for reporting delay accounting statistics.
+for reporting delay accounting statistics. Users can register cpumasks,
+send commands and process responses, listen for per-tid/tgid exit data,
+write the data received to a file and do basic flow control by increasing
+receive buffer sizes.
 
 Interface
 ---------
@@ -66,10 +74,20 @@ The messages are in the format
 
 The taskstats payload is one of the following three kinds:
 
-1. Commands: Sent from user to kernel. The payload is one attribute, of type
-TASKSTATS_CMD_ATTR_PID/TGID, containing a u32 pid or tgid in the attribute
-payload. The pid/tgid denotes the task/process for which userspace wants
-statistics.
+1. Commands: Sent from user to kernel. Commands to get data on
+a pid/tgid consist of one attribute, of type TASKSTATS_CMD_ATTR_PID/TGID,
+containing a u32 pid or tgid in the attribute payload. The pid/tgid denotes
+the task/process for which userspace wants statistics.
+
+Commands to register/deregister interest in exit data from a set of cpus
+consist of one attribute, of type
+TASKSTATS_CMD_ATTR_REGISTER/DEREGISTER_CPUMASK and contain a cpumask in the
+attribute payload. The cpumask is specified as an ascii string of
+comma-separated cpu ranges e.g. to listen to exit data from cpus 1,2,3,5,7,8
+the cpumask would be "1-3,5,7-8". If userspace forgets to deregister interest
+in cpus before closing the listening socket, the kernel cleans up its interest
+set over time. However, for the sake of efficiency, an explicit deregistration
+is advisable.
 
 2. Response for a command: sent from the kernel in response to a userspace
 command. The payload is a series of three attributes of type:
@@ -138,4 +156,26 @@ struct too much, requiring disparate userspace accounting utilities to
 unnecessarily receive large structures whose fields are of no interest, then
 extending the attributes structure would be worthwhile.
 
+Flow control for taskstats
+--------------------------
+
+When the rate of task exits becomes large, a listener may not be able to keep
+up with the kernel's rate of sending per-tid/tgid exit data leading to data
+loss. This possibility gets compounded when the taskstats structure gets
+extended and the number of cpus grows large.
+
+To avoid losing statistics, userspace should do one or more of the following:
+
+- increase the receive buffer sizes for the netlink sockets opened by
+listeners to receive exit data.
+
+- create more listeners and reduce the number of cpus being listened to by
+each listener. In the extreme case, there could be one listener for each cpu.
+Users may also consider setting the cpu affinity of the listener to the subset
+of cpus to which it listens, especially if they are listening to just one cpu.
+
+Despite these measures, if the userspace receives ENOBUFS error messages
+indicated overflow of receive buffers, it should take measures to handle the
+loss of data.
+
 ----
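
Note on the cpumask string format: the patch describes the
TASKSTATS_CMD_ATTR_REGISTER/DEREGISTER_CPUMASK payload as an ascii string of
comma-separated cpu ranges, with "1-3,5,7-8" standing for cpus 1,2,3,5,7,8.
The following is a minimal userspace sketch of building and expanding such a
string. The helper names format_cpumask and parse_cpumask are hypothetical and
are not part of the patch or of getdelays.c. The sketch only handles the
string itself; it does not open a netlink socket or send the attribute, and
it makes no claim about how the kernel validates the string.

```python
def format_cpumask(cpus):
    """Collapse cpu numbers into a range string such as "1-3,5,7-8".

    Hypothetical helper for illustration only.
    """
    ordered = sorted(set(cpus))
    if not ordered:
        return ""
    parts = []
    start = prev = ordered[0]
    for cpu in ordered[1:]:
        if cpu == prev + 1:
            # Still inside the current run of consecutive cpus.
            prev = cpu
            continue
        # Run ended: emit "n" for a single cpu, "a-b" for a range.
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = cpu
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


def parse_cpumask(text):
    """Expand a range string such as "1-3,5,7-8" into a sorted cpu list.

    Hypothetical helper for illustration only.
    """
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        lo, sep, hi = part.partition("-")
        first = int(lo)
        last = int(hi) if sep else first
        if last < first:
            raise ValueError(f"descending range: {part!r}")
        cpus.update(range(first, last + 1))
    return sorted(cpus)
```

A listener that splits the cpus among several sockets, as the flow control
section suggests, could call format_cpumask once per subset to produce the
payload for each register command.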