Using trace tcpdrop

    The trace tcpdrop gadget traces TCP packets dropped by the kernel.

    On Kubernetes

    In terminal 1, start the trace tcpdrop gadget:

    $ kubectl gadget trace tcpdrop
    K8S.NODE         K8S.NAMESPACE  K8S.POD K8s.CONTAINER  PID     COMM  IP SRC                    DST                        STATE        TCPFLAGS  REASON
    

    In terminal 2, create an nginx deployment and service, start a pod, configure the network emulator to drop 25% of the packets, and request a page from nginx:

    $ kubectl create service nodeport nginx --tcp=80:80
    $ kubectl create deployment nginx --image=nginx
    $ kubectl run --rm -ti --privileged --image ubuntu shell -- bash
    root@shell:/# apt-get update
    root@shell:/# apt install -y iproute2 curl
    root@shell:/# tc qdisc add dev eth0 root netem drop 25%
    root@shell:/# curl nginx
    

    The results in terminal 1 will show that some packets are dropped by the network emulator qdisc:

    K8S.NODE         K8S.NAMESPACE  K8S.POD K8s.CONTAINER  PID     COMM  IP SRC                    DST                        STATE        TCPFLAGS  REASON
    minikube-docker  default        shell   shell          0             4  p/default/shell:45979  s/kube-system/kube-dns:53  ESTABLISHED  FIN       QDISC_DROP
    minikube-docker  default        shell   shell          406293  curl  4  p/default/shell:34482  s/default/nginx:80         ESTABLISHED  ACK       QDISC_DROP
    

    The network emulator uses a random generator to drop 25% of the packets. The results may vary.
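
    To get a feel for how much the results can vary, the following minimal sketch computes the chance that an exchange of a given number of outgoing packets loses at least one of them. It assumes every packet is dropped independently with the configured probability (the tc command above sets no correlation); the function name and the packet counts are illustrative only.

```python
def prob_at_least_one_drop(drop_rate: float, packets: int) -> float:
    """Probability that at least one of `packets` packets is dropped.

    Assumes each packet is dropped independently with probability `drop_rate`.
    """
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop_rate must be between 0 and 1")
    if packets < 0:
        raise ValueError("packets must be non-negative")
    # Complement of "every packet survives".
    return 1.0 - (1.0 - drop_rate) ** packets


if __name__ == "__main__":
    # 0.25 matches the "netem drop 25%" setting used in this guide.
    for n in (1, 5, 10, 20):
        print(f"{n:2d} packets: {prob_at_least_one_drop(0.25, n):.3f}")
```

    With a 25% drop rate, an exchange of 5 outgoing packets sees at least one drop about 76% of the time, so a short curl request is likely, but not guaranteed, to produce events.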

    The gadget tries its best to link each dropped packet to the process that generated it. In some cases this information is missing, as in the first event above, where PID is 0 and COMM is empty.

    The source and destination addresses are written in condensed form. More detailed information is available by selecting specific columns or by using the json or yaml output:

    $ kubectl gadget trace tcpdrop \
        -o columns=k8s.node,k8s.namespace,k8s.pod,k8s.container,pid,comm,ip,src.addr,src.port,src.kind,src.ns,src.name,dst.addr,dst.port,dst.kind,dst.ns,dst.name,state,tcpflags,reason
    
    $ kubectl gadget trace tcpdrop -o yaml
    ---
    comm: curl
    container: shell
    dst:
      addr: 10.101.116.61
      kind: svc
      namespace: default
      podlabels:
        app: nginx
      podname: nginx
      port: 80
    gid: 0
    ipversion: 4
    mountnsid: 4026533845
    namespace: default
    netnsid: 4026533672
    node: minikube-docker
    pid: 412491
    pod: shell
    reason: QDISC_DROP
    src:
      addr: 10.244.0.91
      kind: pod
      namespace: default
      podlabels:
        run: shell
      podname: shell
      port: 35802
    state: ESTABLISHED
    tcpflags: ACK
    timestamp: 1681911565379499967
    type: normal
    uid: 0
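
    The detailed fields make it straightforward to post-process events. The sketch below tallies drops per reason and destination, and rebuilds the condensed address from the detailed src/dst fields. It rests on two assumptions that are not confirmed by this guide: that the json output prints one event per line using the same field names as the yaml output above, and that the condensed form is the first letter of kind, then namespace, name and port, as inferred from the sample output. The function names and the script name used afterwards are hypothetical.

```python
import json
import sys
from collections import Counter
from typing import Iterable


def condensed(endpoint: dict) -> str:
    """Rebuild a condensed address such as 's/default/nginx:80'.

    The format is inferred from the sample output in this guide.
    Endpoints without pod or service details fall back to 'addr:port'.
    """
    port = endpoint.get("port")
    kind = endpoint.get("kind", "")
    namespace = endpoint.get("namespace")
    name = endpoint.get("podname")
    if kind in ("pod", "svc") and namespace and name:
        return f"{kind[0]}/{namespace}/{name}:{port}"
    return f"{endpoint.get('addr')}:{port}"


def count_drops(lines: Iterable[str]) -> Counter:
    """Count events per (reason, condensed destination).

    Assumes one JSON object per line; anything else is skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not JSON
        if not isinstance(event, dict) or "reason" not in event:
            continue
        counts[(event["reason"], condensed(event.get("dst", {})))] += 1
    return counts


if __name__ == "__main__":
    for (reason, dst), n in count_drops(sys.stdin).most_common():
        print(f"{n:6d}  {reason:<24}  {dst}")
```

    One way to use it is to redirect the gadget's json output to a file, stop the gadget, and then feed the file to the script on standard input, for example python3 count_drops.py < drops.json (both file names are placeholders).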
    

    With ig

    In terminal 1, start the trace tcpdrop gadget:

    $ sudo ig trace tcpdrop -r docker
    CONTAINER  PID     COMM  IP SRC               DST          STATE        TCPFLAGS  REASON
    

    In terminal 2, start a container, configure the network emulator to drop 25% of the packets, and download a web page:

    $ docker run -ti --rm --cap-add NET_ADMIN --name=netem wbitt/network-multitool -- /bin/bash
    # tc qdisc add dev eth0 root netem drop 25%
    # wget 1.1.1.1
    

    The container needs the NET_ADMIN capability to manage network interfaces.

    The results in terminal 1 will show that some packets are dropped by the network emulator qdisc:

    CONTAINER  PID     COMM  IP SRC               DST          STATE        TCPFLAGS  REASON
    netem      456426  wget  4  172.17.0.2:35790  1.1.1.1:443  ESTABLISHED  ACK       QDISC_DROP
    

    The list of drop reasons below shows that QDISC_DROP (listed there as SKB_DROP_REASON_QDISC_DROP) means the packet was “dropped by qdisc when packet outputting (failed to enqueue to current qdisc)”.

    List of drop reasons

    The drop reason enum is not stable and may change between kernel versions. The tcpdrop gadget needs BTF information to decode the drop reason. The following table shows the list of drop reasons for Linux 6.2.

    Name Documentation
    SKB_NOT_DROPPED_YET skb is not dropped yet (used for no-drop case)
    SKB_CONSUMED packet has been consumed
    SKB_DROP_REASON_NOT_SPECIFIED drop reason is not specified
    SKB_DROP_REASON_NO_SOCKET socket not found
    SKB_DROP_REASON_PKT_TOO_SMALL packet size is too small
    SKB_DROP_REASON_TCP_CSUM TCP checksum error
    SKB_DROP_REASON_SOCKET_FILTER dropped by socket filter
    SKB_DROP_REASON_UDP_CSUM UDP checksum error
    SKB_DROP_REASON_NETFILTER_DROP dropped by netfilter
    SKB_DROP_REASON_OTHERHOST packet doesn’t belong to the current host (interface is in promisc mode)
    SKB_DROP_REASON_IP_CSUM IP checksum error
    SKB_DROP_REASON_IP_INHDR there is something wrong with IP header (see IPSTATS_MIB_INHDRERRORS)
    SKB_DROP_REASON_IP_RPFILTER IP rpfilter validation failed; see the documentation for rp_filter in ip-sysctl.rst for more information
    SKB_DROP_REASON_UNICAST_IN_L2_MULTICAST destination address of L2 is multicast, but L3 is unicast.
    SKB_DROP_REASON_XFRM_POLICY xfrm policy check failed
    SKB_DROP_REASON_IP_NOPROTO no support for IP protocol
    SKB_DROP_REASON_SOCKET_RCVBUFF socket receive buff is full
    SKB_DROP_REASON_PROTO_MEM proto memory limitation, such as udp packet drop out of udp_memory_allocated.
    SKB_DROP_REASON_TCP_MD5NOTFOUND no MD5 hash and one expected, corresponding to LINUX_MIB_TCPMD5NOTFOUND
    SKB_DROP_REASON_TCP_MD5UNEXPECTED MD5 hash and we’re not expecting one, corresponding to LINUX_MIB_TCPMD5UNEXPECTED
    SKB_DROP_REASON_TCP_MD5FAILURE MD5 hash and it’s wrong, corresponding to LINUX_MIB_TCPMD5FAILURE
    SKB_DROP_REASON_SOCKET_BACKLOG failed to add skb to socket backlog (see LINUX_MIB_TCPBACKLOGDROP)
    SKB_DROP_REASON_TCP_FLAGS TCP flags invalid
    SKB_DROP_REASON_TCP_ZEROWINDOW TCP receive window size is zero, see LINUX_MIB_TCPZEROWINDOWDROP
    SKB_DROP_REASON_TCP_OLD_DATA the TCP data received was already received before (a spurious retransmission may have happened), see LINUX_MIB_DELAYEDACKLOST
    SKB_DROP_REASON_TCP_OVERWINDOW the TCP data is out of window, the seq of the first byte exceeds the right edge of the receive window
    SKB_DROP_REASON_TCP_OFOMERGE the data of skb is already in the ofo queue, corresponding to LINUX_MIB_TCPOFOMERGE
    SKB_DROP_REASON_TCP_RFC7323_PAWS PAWS check, corresponding to LINUX_MIB_PAWSESTABREJECTED
    SKB_DROP_REASON_TCP_INVALID_SEQUENCE Not acceptable SEQ field
    SKB_DROP_REASON_TCP_RESET Invalid RST packet
    SKB_DROP_REASON_TCP_INVALID_SYN Incoming packet has unexpected SYN flag
    SKB_DROP_REASON_TCP_CLOSE TCP socket in CLOSE state
    SKB_DROP_REASON_TCP_FASTOPEN dropped by FASTOPEN request socket
    SKB_DROP_REASON_TCP_OLD_ACK TCP ACK is old, but in window
    SKB_DROP_REASON_TCP_TOO_OLD_ACK TCP ACK is too old
    SKB_DROP_REASON_TCP_ACK_UNSENT_DATA TCP ACK for data we haven’t sent yet
    SKB_DROP_REASON_TCP_OFO_QUEUE_PRUNE pruned from TCP OFO queue
    SKB_DROP_REASON_TCP_OFO_DROP data already in receive queue
    SKB_DROP_REASON_IP_OUTNOROUTES route lookup failed
    SKB_DROP_REASON_BPF_CGROUP_EGRESS dropped by BPF_PROG_TYPE_CGROUP_SKB eBPF program
    SKB_DROP_REASON_IPV6DISABLED IPv6 is disabled on the device
    SKB_DROP_REASON_NEIGH_CREATEFAIL failed to create neigh entry
    SKB_DROP_REASON_NEIGH_FAILED neigh entry in failed state
    SKB_DROP_REASON_NEIGH_QUEUEFULL arp_queue for neigh entry is full
    SKB_DROP_REASON_NEIGH_DEAD neigh entry is dead
    SKB_DROP_REASON_TC_EGRESS dropped in TC egress HOOK
    SKB_DROP_REASON_QDISC_DROP dropped by qdisc when packet outputting (failed to enqueue to current qdisc)
    SKB_DROP_REASON_CPU_BACKLOG failed to enqueue the skb to the per CPU backlog queue. This can be caused by backlog queue full (see netdev_max_backlog in net.rst) or RPS flow limit
    SKB_DROP_REASON_XDP dropped by XDP in input path
    SKB_DROP_REASON_TC_INGRESS dropped in TC ingress HOOK
    SKB_DROP_REASON_UNHANDLED_PROTO protocol not implemented or not supported
    SKB_DROP_REASON_SKB_CSUM sk_buff checksum computation error
    SKB_DROP_REASON_SKB_GSO_SEG gso segmentation error
    SKB_DROP_REASON_SKB_UCOPY_FAULT failed to copy data from user space, e.g., via zerocopy_sg_from_iter() or skb_orphan_frags_rx()
    SKB_DROP_REASON_DEV_HDR device driver specific header/metadata is invalid
    SKB_DROP_REASON_DEV_READY the device is not ready to xmit/recv due to any of its data structure that is not up/ready/initialized, e.g., the IFF_UP is not set, or driver specific tun->tfiles[txq] is not initialized
    SKB_DROP_REASON_FULL_RING ring buffer is full
    SKB_DROP_REASON_NOMEM error due to OOM
    SKB_DROP_REASON_HDR_TRUNC failed to trunc/extract the header from networking data, e.g., failed to pull the protocol header from frags via pskb_may_pull()
    SKB_DROP_REASON_TAP_FILTER dropped by (ebpf) filter directly attached to tun/tap, e.g., via TUNSETFILTEREBPF
    SKB_DROP_REASON_TAP_TXFILTER dropped by tx filter implemented at tun/tap, e.g., check_filter()
    SKB_DROP_REASON_ICMP_CSUM ICMP checksum error
    SKB_DROP_REASON_INVALID_PROTO the packet doesn’t follow RFC 2211, such as a broadcast ICMP_TIMESTAMP
    SKB_DROP_REASON_IP_INADDRERRORS host unreachable, corresponding to IPSTATS_MIB_INADDRERRORS
    SKB_DROP_REASON_IP_INNOROUTES network unreachable, corresponding to IPSTATS_MIB_INADDRERRORS
    SKB_DROP_REASON_PKT_TOO_BIG packet size is too big (maybe exceed the MTU)
    SKB_DROP_REASON_DUP_FRAG duplicate fragment
    SKB_DROP_REASON_FRAG_REASM_TIMEOUT fragment reassembly timeout
    SKB_DROP_REASON_FRAG_TOO_FAR ipv4 fragment too far. (/proc/sys/net/ipv4/ipfrag_max_dist)
    SKB_DROP_REASON_MAX the maximum of drop reason, which shouldn’t be used as a real ‘reason’

    This table can be generated with:

    $ go run ./pkg/gadgets/trace/tcpdrop/tracer/dropreasongen/...
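
    Because the gadget needs BTF information to decode drop reasons, a quick check of the running kernel can help when the REASON column looks wrong or empty. The sketch below is only a rough first diagnostic: kernels built with CONFIG_DEBUG_INFO_BTF expose their type information at /sys/kernel/btf/vmlinux, but this guide does not state where the gadget reads BTF from, so a missing file does not necessarily mean the gadget cannot work. The function name is hypothetical.

```python
import os
import platform

# Location where kernels built with CONFIG_DEBUG_INFO_BTF expose their BTF.
BTF_PATH = "/sys/kernel/btf/vmlinux"


def kernel_btf_available(path: str = BTF_PATH) -> bool:
    """Return True if a readable BTF file exists at `path`."""
    return os.path.isfile(path) and os.access(path, os.R_OK)


if __name__ == "__main__":
    # The kernel release matters because the drop reason enum
    # differs between kernel versions.
    print(f"kernel release: {platform.release()}")
    status = "found" if kernel_btf_available() else "not found"
    print(f"kernel BTF at {BTF_PATH}: {status}")
```

    Run it on the node (or host) where the gadget traces packets, since that is the kernel whose drop reasons are being decoded.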
    

    Other tools showing dropped packets

    The following tools can be used to show dropped packets but they are not focused on containers or Kubernetes: