Diffstat (limited to 'Documentation/networking/snmp_counter.rst')
| -rw-r--r-- | Documentation/networking/snmp_counter.rst | 130 |
1 files changed, 125 insertions, 5 deletions
diff --git a/Documentation/networking/snmp_counter.rst b/Documentation/networking/snmp_counter.rst
index b0dfdaaca512..fe8f741193be 100644
--- a/Documentation/networking/snmp_counter.rst
+++ b/Documentation/networking/snmp_counter.rst
| @@ -336,7 +336,26 @@ time client replies ACK, this socket will get another chance to move | |||
| 336 | to the accept queue. | 336 | to the accept queue. |
| 337 | 337 | ||
| 338 | 338 | ||
| 339 | TCP Fast Open | 339 | * TcpEstabResets |
| 340 | Defined in `RFC1213 tcpEstabResets`_. | ||
| 341 | |||
| 342 | .. _RFC1213 tcpEstabResets: https://tools.ietf.org/html/rfc1213#page-48 | ||
| 343 | |||
| 344 | * TcpAttemptFails | ||
| 345 | Defined in `RFC1213 tcpAttemptFails`_. | ||
| 346 | |||
| 347 | .. _RFC1213 tcpAttemptFails: https://tools.ietf.org/html/rfc1213#page-48 | ||
| 348 | |||
| 349 | * TcpOutRsts | ||
| 350 | Defined in `RFC1213 tcpOutRsts`_. The RFC says this counter indicates | ||
| 351 | the 'segments sent containing the RST flag', but in the Linux kernel, this | ||
| 352 | counter indicates the segments the kernel tried to send. The send | ||
| 353 | might fail due to some errors (e.g. memory allocation failure). | ||
| 354 | |||
| 355 | .. _RFC1213 tcpOutRsts: https://tools.ietf.org/html/rfc1213#page-52 | ||
| 356 | |||
| 357 | |||
| 358 | TCP Fast Path | ||
| 340 | ============ | 359 | ============ |
| 341 | When the kernel receives a TCP packet, it has two paths to handle the | 360 | When the kernel receives a TCP packet, it has two paths to handle the |
| 342 | packet, one is fast path, another is slow path. The comment in kernel | 361 | packet, one is fast path, another is slow path. The comment in kernel |
| @@ -383,8 +402,6 @@ increase 1. | |||
| 383 | 402 | ||
| 384 | TCP abort | 403 | TCP abort |
| 385 | ======== | 404 | ======== |
| 386 | |||
| 387 | |||
| 388 | * TcpExtTCPAbortOnData | 405 | * TcpExtTCPAbortOnData |
| 389 | It means the TCP layer has data in flight, but needs to close the | 406 | It means the TCP layer has data in flight, but needs to close the |
| 390 | connection. So the TCP layer sends a RST to the other side, indicating the | 407 | connection. So the TCP layer sends a RST to the other side, indicating the |
| @@ -545,7 +562,6 @@ packet yet, the sender would know packet 4 is out of order. The TCP | |||
| 545 | stack of kernel will increase TcpExtTCPSACKReorder for both of the | 562 | stack of kernel will increase TcpExtTCPSACKReorder for both of the |
| 546 | above scenarios. | 563 | above scenarios. |
| 547 | 564 | ||
| 548 | |||
| 549 | DSACK | 565 | DSACK |
| 550 | ===== | 566 | ===== |
| 551 | The DSACK is defined in `RFC2883`_. The receiver uses DSACK to report | 567 | The DSACK is defined in `RFC2883`_. The receiver uses DSACK to report |
| @@ -566,13 +582,63 @@ The TCP stack receives an out of order duplicate packet, so it sends a | |||
| 566 | DSACK to the sender. | 582 | DSACK to the sender. |
| 567 | 583 | ||
| 568 | * TcpExtTCPDSACKRecv | 584 | * TcpExtTCPDSACKRecv |
| 569 | The TCP stack receives a DSACK, which indicate an acknowledged | 585 | The TCP stack receives a DSACK, which indicates an acknowledged |
| 570 | duplicate packet is received. | 586 | duplicate packet is received. |
| 571 | 587 | ||
| 572 | * TcpExtTCPDSACKOfoRecv | 588 | * TcpExtTCPDSACKOfoRecv |
| 573 | The TCP stack receives a DSACK, which indicates an out of order | 589 | The TCP stack receives a DSACK, which indicates an out of order |
| 574 | duplicate packet is received. | 590 | duplicate packet is received. |
| 575 | 591 | ||
| 592 | invalid SACK and DSACK | ||
| 593 | ====================== | ||
| 594 | When a SACK (or DSACK) block is invalid, a corresponding counter is | ||
| 595 | updated. The validation is based on the start/end sequence | ||
| 596 | numbers of the SACK block. For more details, please refer to the comment | ||
| 597 | of the function tcp_is_sackblock_valid in the kernel source code. A | ||
| 598 | SACK option can have up to 4 blocks, and they are checked | ||
| 599 | individually. E.g., if 3 blocks of a SACK option are invalid, the | ||
| 600 | corresponding counter is updated 3 times. The comment of the | ||
| 601 | `Add counters for discarded SACK blocks`_ patch has additional | ||
| 602 | explanation. | ||
| 603 | |||
| 604 | .. _Add counters for discarded SACK blocks: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18f02545a9a16c9a89778b91a162ad16d510bb32 | ||
| 605 | |||
| 606 | * TcpExtTCPSACKDiscard | ||
| 607 | This counter indicates how many SACK blocks are invalid. If the invalid | ||
| 608 | SACK block is caused by ACK reordering, the TCP stack will only ignore | ||
| 609 | it and won't update this counter. | ||
| 610 | |||
| 611 | * TcpExtTCPDSACKIgnoredOld and TcpExtTCPDSACKIgnoredNoUndo | ||
| 612 | When a DSACK block is invalid, one of these two counters is | ||
| 613 | updated. Which counter is updated depends on the undo_marker flag | ||
| 614 | of the TCP socket. If the undo_marker is not set, the TCP stack isn't | ||
| 615 | likely to have retransmitted any packets. If it still receives an invalid | ||
| 616 | DSACK block, the reason might be that the packet was duplicated in the | ||
| 617 | middle of the network. In such a scenario, TcpExtTCPDSACKIgnoredNoUndo | ||
| 618 | will be updated. If the undo_marker is set, TcpExtTCPDSACKIgnoredOld | ||
| 619 | will be updated. As implied by its name, it might be an old packet. | ||
| 620 | |||
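The per-block counting described above can be sketched in Python. This is a toy model, not kernel code: the `is_valid` predicate is a hypothetical stand-in for `tcp_is_sackblock_valid`, and the `dsack` and `exempt` keys are names invented for this illustration (`exempt` marks an invalid SACK block that the stack ignores without counting).

```python
from collections import Counter


def count_invalid_blocks(blocks, undo_marker_set, is_valid):
    """Return the counters bumped for one SACK option (toy model).

    blocks: list of dicts with 'start', 'end', 'dsack' and optional 'exempt'.
    is_valid: hypothetical predicate standing in for the kernel's check.
    """
    counters = Counter()
    for block in blocks:  # every block is checked individually
        if is_valid(block):
            continue
        if block["dsack"]:
            # Invalid DSACK: the counter depends on the undo_marker flag.
            if undo_marker_set:
                counters["TcpExtTCPDSACKIgnoredOld"] += 1
            else:
                counters["TcpExtTCPDSACKIgnoredNoUndo"] += 1
        elif not block.get("exempt", False):
            # Invalid SACK that is not silently ignored.
            counters["TcpExtTCPSACKDiscard"] += 1
    return counters


# Placeholder validity rule for the demo only: start must precede end.
toy_is_valid = lambda b: b["start"] < b["end"]

sack_blocks = [
    {"start": 100, "end": 200, "dsack": False},  # valid
    {"start": 300, "end": 300, "dsack": False},  # invalid
    {"start": 500, "end": 400, "dsack": False},  # invalid
    {"start": 700, "end": 600, "dsack": False},  # invalid
]
sack_result = count_invalid_blocks(sack_blocks, False, toy_is_valid)

dsack_blocks = [{"start": 50, "end": 40, "dsack": True}]
no_undo_result = count_invalid_blocks(dsack_blocks, False, toy_is_valid)
old_result = count_invalid_blocks(dsack_blocks, True, toy_is_valid)
```

With three invalid blocks out of four, the sketch bumps TcpExtTCPSACKDiscard three times, matching the "3 blocks, 3 updates" example in the text.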
| 621 | SACK shift | ||
| 622 | ========== | ||
| 623 | The Linux networking stack stores data in the sk_buff struct (skb for | ||
| 624 | short). If a SACK block spans multiple skbs, the TCP stack will try | ||
| 625 | to re-arrange data in these skbs. E.g. if a SACK block acknowledges seq | ||
| 626 | 10 to 15, skb1 has seq 10 to 13, and skb2 has seq 14 to 20, then seq 14 and | ||
| 627 | 15 in skb2 are moved to skb1. This operation is a 'shift'. If a | ||
| 628 | SACK block acknowledges seq 10 to 20, skb1 has seq 10 to 13, and skb2 has | ||
| 629 | seq 14 to 20, then all data in skb2 is moved to skb1 and skb2 is | ||
| 630 | discarded. This operation is a 'merge'. | ||
| 631 | |||
| 632 | * TcpExtTCPSackShifted | ||
| 633 | An skb is shifted. | ||
| 634 | |||
| 635 | * TcpExtTCPSackMerged | ||
| 636 | An skb is merged. | ||
| 637 | |||
| 638 | * TcpExtTCPSackShiftFallback | ||
| 639 | An skb should be shifted or merged, but the TCP stack doesn't do it for | ||
| 640 | some reason. | ||
| 641 | |||
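The shift and merge cases from the SACK shift text can be reproduced with a small Python sketch. It is a toy model under stated assumptions: skbs and SACK blocks are plain inclusive `(start, end)` tuples as in the prose example, `apply_sack` is a hypothetical name, and only the case where the block covers skb1 and reaches into the directly following skb2 is handled. The kernel's real logic has many more conditions.

```python
def apply_sack(sack, skb1, skb2):
    """Classify a SACK block against two adjacent skbs (toy model).

    All arguments are inclusive (start, end) sequence ranges.
    Returns (operation, new_skb1, new_skb2); new_skb2 is None after a merge.
    """
    sack_start, sack_end = sack
    s1, e1 = skb1
    s2, e2 = skb2
    # Only the situation from the text: skb2 directly follows skb1 and
    # the SACK block covers skb1 and at least the start of skb2.
    if e1 + 1 != s2 or sack_start > s1 or sack_end < s2:
        return "none", skb1, skb2
    if sack_end >= e2:
        # All of skb2 is acknowledged: fold it into skb1 and drop skb2.
        return "merge", (s1, e2), None
    # Only the head of skb2 is acknowledged: move that part into skb1.
    return "shift", (s1, sack_end), (sack_end + 1, e2)


shift_result = apply_sack((10, 15), (10, 13), (14, 20))
merge_result = apply_sack((10, 20), (10, 13), (14, 20))
```

In the first call seq 14 and 15 move from skb2 to skb1, in the second skb2 disappears entirely, which corresponds to TcpExtTCPSackShifted and TcpExtTCPSackMerged respectively.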
| 576 | TCP out of order | 642 | TCP out of order |
| 577 | =============== | 643 | =============== |
| 578 | * TcpExtTCPOFOQueue | 644 | * TcpExtTCPOFOQueue |
| @@ -662,6 +728,60 @@ unacknowledged number (more strict than `RFC 5961 section 5.2`_). | |||
| 662 | .. _RFC 5961 section 4.2: https://tools.ietf.org/html/rfc5961#page-9 | 728 | .. _RFC 5961 section 4.2: https://tools.ietf.org/html/rfc5961#page-9 |
| 663 | .. _RFC 5961 section 5.2: https://tools.ietf.org/html/rfc5961#page-11 | 729 | .. _RFC 5961 section 5.2: https://tools.ietf.org/html/rfc5961#page-11 |
| 664 | 730 | ||
| 731 | TCP receive window | ||
| 732 | ================== | ||
| 733 | * TcpExtTCPWantZeroWindowAdv | ||
| 734 | Depending on current memory usage, the TCP stack tries to set the receive | ||
| 735 | window to zero. But the receive window might still be a non-zero | ||
| 736 | value. For example, if the previous window size is 10, and the TCP | ||
| 737 | stack receives 3 bytes, the current window size would be 7 even if the | ||
| 738 | window size calculated by the memory usage is zero. | ||
| 739 | |||
| 740 | * TcpExtTCPToZeroWindowAdv | ||
| 741 | The TCP receive window is set to zero from a non-zero value. | ||
| 742 | |||
| 743 | * TcpExtTCPFromZeroWindowAdv | ||
| 744 | The TCP receive window is set to a non-zero value from zero. | ||
| 745 | |||
| 746 | |||
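The interplay of the three receive window counters can be illustrated with a short Python sketch. It is a simplified model, not the kernel's window selection code: `select_window` and its parameters are hypothetical names, and the only rule modeled is that an already advertised window is not shrunk below what remains of it.

```python
def select_window(counters, old_win, received, wanted):
    """Pick the next advertised window and bump counters (toy model).

    old_win:  window advertised previously
    received: bytes received since that advertisement
    wanted:   window the stack would like to advertise (e.g. from memory usage)
    """
    cur_win = max(old_win - received, 0)  # what is left of the old window
    new_win = wanted
    if new_win < cur_win:
        if new_win == 0:
            # The stack wanted zero but cannot shrink the window.
            counters["TcpExtTCPWantZeroWindowAdv"] += 1
        new_win = cur_win
    if new_win == 0:
        if old_win != 0:
            counters["TcpExtTCPToZeroWindowAdv"] += 1
    elif old_win == 0:
        counters["TcpExtTCPFromZeroWindowAdv"] += 1
    return new_win


counters = {
    "TcpExtTCPWantZeroWindowAdv": 0,
    "TcpExtTCPToZeroWindowAdv": 0,
    "TcpExtTCPFromZeroWindowAdv": 0,
}
# Example from the text: window 10, 3 bytes received, memory says zero.
win1 = select_window(counters, old_win=10, received=3, wanted=0)
# The remaining 7 bytes arrive and the stack still wants zero.
win2 = select_window(counters, old_win=win1, received=7, wanted=0)
# Memory is freed and the window opens again.
win3 = select_window(counters, old_win=win2, received=0, wanted=5)
```

The first call returns 7 and only bumps TcpExtTCPWantZeroWindowAdv; the second and third calls bump TcpExtTCPToZeroWindowAdv and TcpExtTCPFromZeroWindowAdv.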
| 747 | Delayed ACK | ||
| 748 | =========== | ||
| 749 | The TCP Delayed ACK is a technique used to reduce the | ||
| 750 | packet count in the network. For more details, please refer to the | ||
| 751 | `Delayed ACK wiki`_. | ||
| 752 | |||
| 753 | .. _Delayed ACK wiki: https://en.wikipedia.org/wiki/TCP_delayed_acknowledgment | ||
| 754 | |||
| 755 | * TcpExtDelayedACKs | ||
| 756 | A delayed ACK timer expires. The TCP stack will send a pure ACK packet | ||
| 757 | and exit the delayed ACK mode. | ||
| 758 | |||
| 759 | * TcpExtDelayedACKLocked | ||
| 760 | A delayed ACK timer expires, but the TCP stack can't send an ACK | ||
| 761 | immediately because the socket is locked by a userspace program. The | ||
| 762 | TCP stack will send a pure ACK later (after the userspace program | ||
| 763 | unlocks the socket). When the TCP stack sends the pure ACK later, it | ||
| 764 | will also update TcpExtDelayedACKs and exit the delayed ACK | ||
| 765 | mode. | ||
| 766 | |||
| 767 | * TcpExtDelayedACKLost | ||
| 768 | It is updated when the TCP stack receives a packet which has already been | ||
| 769 | ACKed. A delayed ACK loss might cause this, but it can also be | ||
| 770 | triggered by other causes, such as a packet being duplicated in the | ||
| 771 | network. | ||
| 772 | |||
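How TcpExtDelayedACKs and TcpExtDelayedACKLocked relate can be sketched as a tiny Python state machine. This is an illustration of the description above only: the class, its method names and the `acks_sent` field are hypothetical and do not correspond to kernel structures.

```python
class DelayedAckSocket:
    """Toy socket that models only the delayed ACK timer and the user lock."""

    def __init__(self):
        self.counters = {"TcpExtDelayedACKs": 0, "TcpExtDelayedACKLocked": 0}
        self.locked_by_user = False
        self.ack_deferred = False
        self.acks_sent = 0

    def _send_delayed_ack(self):
        # Sending the pure ACK is what updates TcpExtDelayedACKs.
        self.counters["TcpExtDelayedACKs"] += 1
        self.acks_sent += 1

    def lock(self):
        self.locked_by_user = True

    def unlock(self):
        self.locked_by_user = False
        if self.ack_deferred:
            # The ACK postponed by the lock goes out now.
            self.ack_deferred = False
            self._send_delayed_ack()

    def timer_expired(self):
        if self.locked_by_user:
            # Cannot send right now: count it and retry after unlock.
            self.counters["TcpExtDelayedACKLocked"] += 1
            self.ack_deferred = True
            return
        self._send_delayed_ack()


sock = DelayedAckSocket()
sock.timer_expired()          # unlocked: ACK sent immediately
sock.lock()
sock.timer_expired()          # locked: ACK deferred
locked_snapshot = dict(sock.counters)
sock.unlock()                 # deferred ACK is sent here
```

After the run, TcpExtDelayedACKs is 2 and TcpExtDelayedACKLocked is 1: the locked expiry is counted once when it is deferred and once more when the ACK is finally sent.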
| 773 | Tail Loss Probe (TLP) | ||
| 774 | ===================== | ||
| 775 | TLP is an algorithm used to detect TCP packet loss. For more | ||
| 776 | details, please refer to the `TLP paper`_. | ||
| 777 | |||
| 778 | .. _TLP paper: https://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01 | ||
| 779 | |||
| 780 | * TcpExtTCPLossProbes | ||
| 781 | A TLP probe packet is sent. | ||
| 782 | |||
| 783 | * TcpExtTCPLossProbeRecovery | ||
| 784 | A packet loss is detected and recovered by TLP. | ||
| 665 | 785 | ||
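The TcpExt counters listed above are exposed by the kernel in /proc/net/netstat, where each group is printed as a line of names followed by a line of values with the same prefix. A minimal Python sketch of a parser follows; `parse_netstat` is a hypothetical helper name and the sample numbers are made up for the demo.

```python
def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {"TcpExtTCPLossProbes": 12, ...}."""
    counters = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: "Prefix: name name ..." then "Prefix: value value ...".
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if prefix != value_prefix:
            raise ValueError("mismatched lines: %r vs %r" % (prefix, value_prefix))
        for name, number in zip(names.split(), numbers.split()):
            counters[prefix + name] = int(number)
    return counters


# Made-up sample in the same layout as /proc/net/netstat.
sample = """TcpExt: TCPLossProbes TCPLossProbeRecovery DelayedACKs
TcpExt: 12 3 456
IpExt: InOctets OutOctets
IpExt: 1000 2000
"""
parsed = parse_netstat(sample)
```

On a Linux machine the same function can be fed the real file, for example `parse_netstat(open("/proc/net/netstat").read())`. Counters without the Ext suffix, such as TcpOutRsts, use the same layout in /proc/net/snmp.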
| 666 | examples | 786 | examples |
| 667 | ======= | 787 | ======= |
