Obscure Ethernet for $200 please, Alex: The Ethernet PAUSE frame

This is a bizarre one. It all started when the internet seemed to go out at my house. My desktop, phone, TV, everything stopped working. The usual solution at a time like this is to power cycle the modem and router. This fixed things temporarily, but the problem soon returned. What made me think this was more than just ISP flakiness was that Chrome actually locked up with the good ol’ Windows “this program stopped responding” dialog, so like any enterprising engineer I busted open Wireshark.

Some odd frames

After some clever deductive reasoning, a.k.a. randomly unplugging cables from the router, I determined that my TV was sending these mystery frames (yes, my TV: a Sony X805D Android TV). After power cycling the TV the problem went away, but of course I wanted to figure out what was actually happening. You’d be forgiven if the above frames aren’t immediately recognizable: their definition is buried deep in Annex 31B of the IEEE 802.3 Ethernet standard.

The type of an Ethernet frame is determined by its EtherType, a two-byte identifier that follows the two six-byte MAC addresses (destination first, then source). The mystery frame’s EtherType was 0x8808, which is for Ethernet flow control.
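To make the header layout concrete, here is a minimal Python sketch that splits the first 14 bytes of a frame into destination, source, and EtherType. The function names and the example source MAC are made up for illustration; only the destination address and the 0x8808 EtherType come from the capture described above.

```python
import struct

ETHERTYPE_MAC_CONTROL = 0x8808  # MAC Control, the family PAUSE frames belong to


def parse_ethernet_header(frame: bytes):
    """Split the 14-byte Ethernet header into (destination, source, ethertype)."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    # Network byte order: 6-byte destination, 6-byte source, 2-byte EtherType.
    return struct.unpack("!6s6sH", frame[:14])


def format_mac(mac: bytes) -> str:
    """Render a 6-byte MAC address as colon-separated hex."""
    return ":".join(f"{b:02x}" for b in mac)


# Hypothetical example frame: the source MAC 02:00:00:00:00:01 is invented.
example = bytes.fromhex(
    "0180c2000001"  # destination: reserved PAUSE multicast address
    "020000000001"  # source (assumption, for illustration only)
    "8808"          # EtherType: MAC Control
    "0001ffff"      # start of the MAC Control payload
) + bytes(42)       # zero padding up to the 60-byte minimum (without FCS)

dst, src, ethertype = parse_ethernet_header(example)
is_flow_control = ethertype == ETHERTYPE_MAC_CONTROL
```

In Wireshark the same check is what turns an otherwise anonymous frame into one labelled as MAC Control.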

The very existence of Ethernet flow control may come as a shock, especially since protocols like TCP have explicit flow control mechanisms, presumably to compensate for Ethernet’s lack of one. However, on page 752 of the Ethernet spec we find a section dedicated to (rudimentary) flow control. The frame structure is fairly bare-bones: a two-byte “opcode”, which in this case is 0x0001 for “PAUSE”, and a two-byte “pause_time”, counted in units of 512 bit times (here’s a great diagram of the frame).
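Since pause_time is counted in 512-bit-time units, the real-world duration depends on link speed. The sketch below decodes the two fields and does that arithmetic; the helper names are my own, and the link speeds are example values rather than anything measured on my network.

```python
import struct

OPCODE_PAUSE = 0x0001
PAUSE_QUANTUM_BITS = 512  # one pause_time unit is 512 bit times


def decode_mac_control(payload: bytes):
    """Return (opcode, pause_time) from the first four payload bytes."""
    if len(payload) < 4:
        raise ValueError("MAC Control payload needs at least 4 bytes")
    return struct.unpack("!HH", payload[:4])


def pause_duration_seconds(pause_time: int, link_bps: float) -> float:
    """Convert a pause_time value into seconds for a given link speed."""
    if not 0 <= pause_time <= 0xFFFF:
        raise ValueError("pause_time is a 16-bit field")
    return pause_time * PAUSE_QUANTUM_BITS / link_bps


opcode, pause_time = decode_mac_control(bytes.fromhex("0001ffff"))
longest_at_gigabit = pause_duration_seconds(pause_time, 1e9)    # about 33.6 ms
longest_at_fast_eth = pause_duration_seconds(pause_time, 100e6)  # about 336 ms
```

So a single frame with the maximum pause_time of 0xFFFF silences a gigabit link for roughly 33.6 ms, and a sender only needs to repeat it more often than that to keep the link paused indefinitely.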

To test out the behavior of pause frames more thoroughly I wrote a simple libpcap (or WinPcap) program that transmits a PAUSE frame every ten milliseconds.
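For illustration, here is a rough Python sketch of the same idea. It is not the libpcap program itself: the function names, the default pause_time of 0xFFFF, and the `send` callback are all assumptions made to keep the example self-contained.

```python
import struct
import time

PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved PAUSE multicast address
ETHERTYPE_MAC_CONTROL = 0x8808
OPCODE_PAUSE = 0x0001
MIN_FRAME_NO_FCS = 60  # minimum Ethernet frame size, excluding the 4-byte FCS


def build_pause_frame(src_mac: bytes, pause_time: int = 0xFFFF) -> bytes:
    """Build a PAUSE frame (without FCS) from a 6-byte source MAC."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= pause_time <= 0xFFFF:
        raise ValueError("pause_time is a 16-bit field")
    header = PAUSE_DST + src_mac + struct.pack("!H", ETHERTYPE_MAC_CONTROL)
    payload = struct.pack("!HH", OPCODE_PAUSE, pause_time)
    # Zero-pad to the minimum frame size; the NIC appends the FCS itself.
    return (header + payload).ljust(MIN_FRAME_NO_FCS, b"\x00")


def send_repeatedly(send, frame: bytes, count: int, interval_s: float = 0.010):
    """Call send(frame) `count` times, sleeping interval_s after each call."""
    for _ in range(count):
        send(frame)
        time.sleep(interval_s)
```

The `send` argument is whatever actually puts bytes on the wire. On Linux that could be the `send` method of a `socket.socket(socket.AF_PACKET, socket.SOCK_RAW)` bound to an interface such as `("eth0", 0)`, which needs root privileges; the interface name here is only an example.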

Sure enough, sending this frame repeatedly killed all traffic on my home network. (You can check out the full code on GitHub.) What’s interesting is that this may have arisen from a bug in my home router (TP-Link AC750 Archer C2 running firmware […]). According to the Ethernet spec (31B.1):

The globally assigned 48-bit multicast address 01-80-C2-00-00-01 has been reserved for use in MAC Control PAUSE frames for inhibiting transmission of data frames from a DTE in a full duplex mode IEEE 802.3 LAN. IEEE 802.1D-conformant bridges will not forward frames sent to this multicast destination address, regardless of the state of the bridge’s ports, or whether or not the bridge implements the MAC Control sublayer.

It would appear that there is a clause that specifically attempts to deal with this scenario: nodes sending PAUSE frames to the special multicast address 01:80:C2:00:00:01 are instructing the switch to not send them any more frames, and the switch is supposed to keep those frames to itself. My switch seems to honor the pause request, but it also forwards the frames to the other nodes on the network, in effect telling them to pause sending frames as well, which would explain the observed behavior.
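The difference between the two behaviors can be shown with a toy model. This is only a sketch of the forwarding decision, not my router's firmware: the port numbers, the function name, and the `filter_pause` flag are invented, and everything other than the PAUSE address is simply flooded.

```python
PAUSE_GROUP_ADDR = bytes.fromhex("0180c2000001")


def egress_ports(dst_mac: bytes, ingress_port: int, all_ports, filter_pause: bool = True):
    """Return the ports a toy switch would forward a group-addressed frame to.

    filter_pause=True models a conformant bridge, which never forwards frames
    sent to the reserved PAUSE address. filter_pause=False models the suspected
    misbehavior, where the frame is flooded like any other multicast.
    """
    if dst_mac == PAUSE_GROUP_ADDR and filter_pause:
        return []  # consumed by the switch port, never forwarded
    # Simplification: flood to every port except the one the frame arrived on.
    return [port for port in all_ports if port != ingress_port]


ports = [1, 2, 3, 4]  # hypothetical LAN ports, with the TV on port 1
conformant = egress_ports(PAUSE_GROUP_ADDR, 1, ports)
misbehaving = egress_ports(PAUSE_GROUP_ADDR, 1, ports, filter_pause=False)
```

With filtering on, the TV's PAUSE frame stops at port 1. With it off, every other device on ports 2 to 4 receives a frame telling it to stop transmitting.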

I did some digging — my router uses a MediaTek MT7620A router SoC which relies on a Realtek RTL8367RB to perform switch duties. Unfortunately I couldn’t find the data sheet for this specific chip, although the source code for the router is GPL, so the driver itself is perusable. A data sheet for the RTL8366 (a 6/9 port version of the chip) says on page 22:

Frames with group MAC address 01-80-c2-00-00-01 (802.3x Pause) and 01-80-c2-00-00-02 (802.3ad LCAP) will always be filtered. MAC address 01-80-c2-00-00-03 will always be forwarded. This function is controlled by pin strapping of EN_MLT_FWD (pin 32) upon power reset. After power on, the configuration may be changed by MLTID_ST[15:0][1:0] in Register MCPCR0-1 (0x000F – 0x0010).

Have we reached the end of the road? The above seems to suggest that the forwarding of PAUSE frames is controlled both by a pin and a register on the Ethernet switch chip. It would appear that on my router this specific (standard-conformant) filtering was accidentally disabled, either by a floating pin or a zeroed-out register, leading to a “frame of death” that was forwarded by my switch, killing the network. It’s amazing what you find when you dig!

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

16 thoughts on “Obscure Ethernet for $200 please, Alex: The Ethernet PAUSE frame”

  1. So most routers/switches ignore PAUSE packets by default?
    If not, someone could theoretically DoS a network by sending PAUSE packets?

  2. The PAUSE frame is meant to be sent by a station (host) to the switch (or vice versa) as a flow control mechanism, only for that port. Assuming the switch has at least some egress buffering, it shouldn’t result in propagation away from that switch port, to say the switch’s uplink port, unless the switch finds itself completely congested. Most hosts won’t have flow control configured at layer 2, instead relying on TCP congestion control. It is only useful when you have non-TCP type traffic, for instance Fibre Channel over Ethernet, and you want to avoid packet loss and would rather try to force buffering upstream.

    1. I’ve always seen flow control implemented between switches when congestion becomes a problem. I guess it makes sense on hosts as well if the NIC’s input buffers can’t be expanded.

  3. When you say it’s sent from host to switch, I wonder: Ethernet was originally specified on a coaxial (essentially bus) topology, where the flow control would have been expected to propagate to all the stations on the segment. So when you move to the switched interpretation, maybe duplicating this propagation is in fact the most compatible behaviour.

    Of course ideally, the correct behaviour is for all stations to just not send traffic to the originator, not for the whole network to shut down.


  4. Enno: The original 10base2 and 10base5 coaxial Ethernet wasn’t full duplex, and these flow control messages are specified for use only on a full-duplex segment.

  5. Also, TCP flow control works on established TCP connections only.
    So, if a receiving host is congested by incoming traffic, the PAUSE frame allows it to not accept new TCP connections without silently dropping them.
    This is important in high-traffic iSCSI environments, where regular hard disk commands are transmitted over TCP and hosts need to learn to buffer “disk traffic”. Dropped TCP packets would look like “storage offline”, whereas PAUSE frames will signal “too busy to answer”.
