path: root/include/uapi/linux/devlink.h
author David S. Miller <davem@davemloft.net> 2019-08-17 15:40:09 -0400
committer David S. Miller <davem@davemloft.net> 2019-08-17 15:40:09 -0400
commit 83beee5a3aff0fb159b2fb4d0cac8f18a193417e
tree ce77ccefee1384488408d9b9e49e2148359f30d9 /include/uapi/linux/devlink.h
parent f77508308fa76d0efc60ebf3c906f467feb062cb
parent 95766451bfb82f972bf3fea93fc6e91a904cf624
Merge branch 'drop_monitor-for-offloaded-paths'
Ido Schimmel says:

====================
Add drop monitor for offloaded data paths

Users have several ways to debug the kernel and understand why a packet was dropped. For example, using drop monitor and perf. Both utilities trace kfree_skb(), which is the function called when a packet is freed as part of a failure. The information provided by these tools is invaluable when trying to understand the cause of a packet loss.

In recent years, large portions of the kernel data path were offloaded to capable devices. Today, it is possible to perform L2 and L3 forwarding in hardware, as well as tunneling (IP-in-IP and VXLAN). Different TC classifiers and actions are also offloaded to capable devices, at both ingress and egress.

However, when the data path is offloaded it is not possible to achieve the same level of introspection since packets are dropped by the underlying device and never reach the kernel.

This patchset aims to solve this by allowing users to monitor packets that the underlying device decided to drop along with relevant metadata such as the drop reason and ingress port.

The above is achieved by exposing a fundamental capability of devices capable of data path offloading - packet trapping. In much the same way as drop monitor registers its probe function with the kfree_skb() tracepoint, the device is instructed to pass to the CPU (trap) packets that it decided to drop in various places in the pipeline.

The configuration of the device to pass such packets to the CPU is performed using devlink, as it is not specific to a port, but rather to a device. In the future, we plan to control the policing of such packets using devlink, in order not to overwhelm the CPU.

While devlink is used as the control path, the dropped packets are passed along with metadata to drop monitor, which reports them to userspace as netlink events. This allows users to use the same interface for the monitoring of both software and hardware drops.

Logically, the solution looks as follows:

                                    Netlink event: Packet w/ metadata
                                                   Or a summary of recent drops
                                  ^
                                  |
         Userspace                |
        +---------------------------------------------------+
         Kernel                   |
                                  |
                          +-------+--------+
                          |                |
                          |  drop_monitor  |
                          |                |
                          +-------^--------+
                                  |
                                  |
                                  |
                             +----+----+
                             |         |      Kernel's Rx path
                             | devlink |      (non-drop traps)
                             |         |
                             +----^----+      ^
                                  |           |
                                  +-----------+
                                  |
                          +-------+-------+
                          |               |
                          | Device driver |
                          |               |
                          +-------^-------+
         Kernel                   |
        +---------------------------------------------------+
         Hardware                 |
                                  | Trapped packet
                                  |
                               +--+---+
                               |      |
                               | ASIC |
                               |      |
                               +------+

In order to reduce the patch count, this patchset only includes integration with netdevsim. A follow-up patchset will add devlink-trap support in mlxsw.

Patches #1-#7 extend drop monitor to also monitor hardware originated drops.

Patches #8-#10 add the devlink-trap infrastructure.

Patches #11-#12 add devlink-trap support in netdevsim.

Patches #13-#16 add tests for the generic infrastructure over netdevsim.

Example
=======

Instantiate netdevsim
---------------------

List supported traps
--------------------

netdevsim/netdevsim10:
  name source_mac_is_multicast type drop generic true action drop group l2_drops
  name vlan_tag_mismatch type drop generic true action drop group l2_drops
  name ingress_vlan_filter type drop generic true action drop group l2_drops
  name ingress_spanning_tree_filter type drop generic true action drop group l2_drops
  name port_list_is_empty type drop generic true action drop group l2_drops
  name port_loopback_filter type drop generic true action drop group l2_drops
  name fid_miss type exception generic false action trap group l2_drops
  name blackhole_route type drop generic true action drop group l3_drops
  name ttl_value_is_too_small type exception generic true action trap group l3_drops
  name tail_drop type drop generic true action drop group buffer_drops

Enable a trap
-------------

Query statistics
----------------

netdevsim/netdevsim10:
  name blackhole_route type drop generic true action trap group l3_drops
    stats:
        rx:
          bytes 7384 packets 52

Monitor dropped packets
-----------------------

dropwatch> set alertmode packet
Setting alert mode
Alert mode successfully set
dropwatch> set sw true
setting software drops monitoring to 1
dropwatch> set hw true
setting hardware drops monitoring to 1
dropwatch> start
Enabling monitoring...
Kernel monitoring activated.
Issue Ctrl-C to stop monitoring
drop at: ttl_value_is_too_small (l3_drops)
origin: hardware
input port ifindex: 55
input port name: eth0
timestamp: Mon Aug 12 10:52:20 2019 445911505 nsec
protocol: 0x800
length: 142
original length: 142

drop at: ip6_mc_input+0x8b8/0xef8 (0xffffffff9e2bb0e8)
origin: software
input port ifindex: 4
timestamp: Mon Aug 12 10:53:37 2019 024444587 nsec
protocol: 0x86dd
length: 110
original length: 110

Future plans
============

* Provide more drop reasons as well as more metadata
* Add dropmon support to libpcap, so that tcpdump/tshark could specifically listen on dropmon traffic, instead of capturing all netlink packets via nlmon interface

Changes in v3:
* Place test with the rest of the netdevsim tests
* Fix test to load netdevsim module
* Move devlink helpers from the test to devlink_lib.sh. Will be used by mlxsw tests
* Re-order netdevsim includes in alphabetical order
* Fix reverse xmas tree in netdevsim
* Remove double include in netdevsim

Changes in v2:
* Use drop monitor to report dropped packets instead of devlink
* Add drop monitor patches
* Add test cases
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'include/uapi/linux/devlink.h')
-rw-r--r-- include/uapi/linux/devlink.h 62
1 files changed, 62 insertions, 0 deletions
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index ffc993256527..546e75dd74ac 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -107,6 +107,16 @@ enum devlink_command {
 	DEVLINK_CMD_FLASH_UPDATE_END,	/* notification only */
 	DEVLINK_CMD_FLASH_UPDATE_STATUS,	/* notification only */
 
+	DEVLINK_CMD_TRAP_GET,	/* can dump */
+	DEVLINK_CMD_TRAP_SET,
+	DEVLINK_CMD_TRAP_NEW,
+	DEVLINK_CMD_TRAP_DEL,
+
+	DEVLINK_CMD_TRAP_GROUP_GET,	/* can dump */
+	DEVLINK_CMD_TRAP_GROUP_SET,
+	DEVLINK_CMD_TRAP_GROUP_NEW,
+	DEVLINK_CMD_TRAP_GROUP_DEL,
+
 	/* add new commands above here */
 	__DEVLINK_CMD_MAX,
 	DEVLINK_CMD_MAX = __DEVLINK_CMD_MAX - 1
@@ -194,6 +204,47 @@ enum devlink_param_fw_load_policy_value {
 	DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_FLASH,
 };
 
+enum {
+	DEVLINK_ATTR_STATS_RX_PACKETS,	/* u64 */
+	DEVLINK_ATTR_STATS_RX_BYTES,	/* u64 */
+
+	__DEVLINK_ATTR_STATS_MAX,
+	DEVLINK_ATTR_STATS_MAX = __DEVLINK_ATTR_STATS_MAX - 1
+};
+
+/**
+ * enum devlink_trap_action - Packet trap action.
+ * @DEVLINK_TRAP_ACTION_DROP: Packet is dropped by the device and a copy is not
+ *                            sent to the CPU.
+ * @DEVLINK_TRAP_ACTION_TRAP: The sole copy of the packet is sent to the CPU.
+ */
+enum devlink_trap_action {
+	DEVLINK_TRAP_ACTION_DROP,
+	DEVLINK_TRAP_ACTION_TRAP,
+};
+
+/**
+ * enum devlink_trap_type - Packet trap type.
+ * @DEVLINK_TRAP_TYPE_DROP: Trap reason is a drop. Trapped packets are only
+ *                          processed by devlink and not injected to the
+ *                          kernel's Rx path.
+ * @DEVLINK_TRAP_TYPE_EXCEPTION: Trap reason is an exception. Packet was not
+ *                               forwarded as intended due to an exception
+ *                               (e.g., missing neighbour entry) and trapped to
+ *                               control plane for resolution. Trapped packets
+ *                               are processed by devlink and injected to
+ *                               the kernel's Rx path.
+ */
+enum devlink_trap_type {
+	DEVLINK_TRAP_TYPE_DROP,
+	DEVLINK_TRAP_TYPE_EXCEPTION,
+};
+
+enum {
+	/* Trap can report input port as metadata */
+	DEVLINK_ATTR_TRAP_METADATA_TYPE_IN_PORT,
+};
+
 enum devlink_attr {
 	/* don't change the order or add anything between, this is ABI! */
 	DEVLINK_ATTR_UNSPEC,
@@ -348,6 +399,17 @@ enum devlink_attr {
 	DEVLINK_ATTR_PORT_PCI_PF_NUMBER,	/* u16 */
 	DEVLINK_ATTR_PORT_PCI_VF_NUMBER,	/* u16 */
 
+	DEVLINK_ATTR_STATS,	/* nested */
+
+	DEVLINK_ATTR_TRAP_NAME,	/* string */
+	/* enum devlink_trap_action */
+	DEVLINK_ATTR_TRAP_ACTION,	/* u8 */
+	/* enum devlink_trap_type */
+	DEVLINK_ATTR_TRAP_TYPE,	/* u8 */
+	DEVLINK_ATTR_TRAP_GENERIC,	/* flag */
+	DEVLINK_ATTR_TRAP_METADATA,	/* nested */
+	DEVLINK_ATTR_TRAP_GROUP_NAME,	/* string */
+
 	/* add new attributes above here, update the policy in devlink.c */
 
 	__DEVLINK_ATTR_MAX,