Unicast Flooding

Continuing with our CCIE R&S blueprint, in this post I’ll try to explain what unicast flooding is, and how to tell the difference between flooding as normal behavior and flooding as something persistent and undesirable in our network, caused by issues such as asymmetric routing and CAM table overflow.

Unicast flooding means that the switch sends a unicast frame out of all ports in a specific VLAN, except the port the frame arrived on.

Switches perform unicast flooding whenever they need to forward a frame to a specific destination and the destination MAC address is not in the CAM table (we will limit ourselves to Ethernet frames). The address can be absent for two main reasons: it has not been learned yet, or it has already aged out. Let’s see exactly why this would happen with an example, and we’ll take the occasion to review basic ARP and MAC address learning.

ARP is defined in RFC 826.
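The learn, age and flood behavior described above can be sketched in a few lines of Python. This is only a toy model of a single VLAN: the LearningSwitch class, the port names and the shortened MAC addresses are made up for illustration, and a real switch does this in hardware.

```python
from dataclasses import dataclass, field


@dataclass
class LearningSwitch:
    """Toy model of one VLAN on a switch (hypothetical, not a Cisco API)."""
    ports: list
    aging_time: float = 300.0                 # seconds
    cam: dict = field(default_factory=dict)   # MAC -> (port, last_seen)

    def receive(self, now, in_port, src_mac, dst_mac):
        """Learn the source MAC, then return the list of egress ports."""
        # Learning: create or refresh the entry for the source address.
        self.cam[src_mac] = (in_port, now)
        # Aging: drop entries that have not been refreshed recently.
        self.cam = {m: (p, t) for m, (p, t) in self.cam.items()
                    if now - t < self.aging_time}
        entry = self.cam.get(dst_mac)
        if entry is None:
            # Unknown unicast: flood out of every port except the ingress port.
            return [p for p in self.ports if p != in_port]
        return [entry[0]]


sw = LearningSwitch(ports=["Fa0/1", "Fa0/2", "Fa0/3"])
print(sw.receive(0, "Fa0/1", "aaaa", "bbbb"))    # bbbb not learned yet: flood
print(sw.receive(1, "Fa0/2", "bbbb", "aaaa"))    # aaaa is known: one port
print(sw.receive(400, "Fa0/1", "aaaa", "bbbb"))  # bbbb aged out: flood again
```

The first and third calls hit the two cases mentioned above: an address that has not been learned yet, and one that has aged out.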

Asymmetric Routing

In simple words, asymmetric routing means that traffic goes out through path A and returns using path B. We will be using the following topology to see how asymmetric routing can cause switches to perform unicast flooding in an undesirable fashion.

This is the topology we will be using

 

Step 1

All boxes have just powered on, and we have disabled CDP, DTP and STP so no control traffic goes across. R1 and R2’s ARP tables start populating because both routers send gratuitous ARP reply messages for their two subinterfaces; any other network equipment in the same subnets will likely send gratuitous ARPs as well, so they can all populate their tables. By now SW1 and SW2 will also have some MAC addresses in their CAM tables, since frames are being sent.

R1 sending gratuitous ARPs

SW1’s CAM table

Step 2

P1 in VLAN 10 tries to communicate with P4 in VLAN 20. P1 sees that P4’s IP address is in a different subnet, so it has to go through its gateway (R1), and to do that it first sends an ARP request to learn R1’s MAC address. Because this is a broadcast frame, SW1 sends it out of all the other ports carrying VLAN 10, and only R1 answers with an ARP reply.

Step 3

Now P1 has a complete entry for R1 in its ARP table and is ready to send traffic. P1 sends the frame to SW1, which forwards it to R1. When R1 gets it, it strips off the layer 2 header and reads the layer 3 IP addresses, compares the destination with its routing table and finds that the subnet 10.0.20.0 is directly connected.

Step 4

R1 is getting ready to send this out on interface Fa0/0.20 to SW1 and from there to SW2, but first it will ARP to find out P4’s MAC address. P4 will respond with a frame destined for R1’s Fa0/0 MAC address and sourced from P4’s own MAC address. At this point SW1 will also learn P4’s MAC address.

Step 5

R1 will finally send the traffic out, and because both SW1 and SW2 have P4’s MAC address it will go straight from SW1 to SW2 to P4. This is where it gets interesting. On the way back from P4 to P1, P4 sends the traffic to its gateway (R2), and R2 sends it to SW2, from there to SW1 and on to P1. Keep in mind that R2 rewrites the MAC addresses, so the frame leaving R2 has R2’s Fa0/0 MAC address as its layer 2 source address, not P4’s. SW1 therefore never sees P4’s MAC address as a source in the return traffic and will eventually age it out, at least until R1 sends a new ARP request and SW1 sees the ARP reply from P4.

So after SW1 ages out P4’s MAC address, SW1 will flood unicast frames out of all VLAN 20 ports every time it needs to forward something from R1’s VLAN 20 subinterface to P4. This is known as unicast flooding due to asymmetric routing. One way to overcome this is to make the ARP timeout the same as the MAC address aging timer; R1 will then re-ARP right around the time SW1 would age out P4’s MAC address, which prevents the unicast flooding.
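To get a feel for how long the flooding can last, here is a rough back-of-the-envelope sketch. It assumes SW1 only sees P4’s source MAC when P4 answers one of R1’s ARP requests, and it uses the commonly cited IOS defaults of a 4 hour ARP timeout and a 300 second MAC aging time. The function name is made up, and the defaults are an assumption worth checking on your own platform.

```python
def flooding_seconds_per_arp_cycle(arp_timeout, mac_aging):
    """Seconds in each ARP cycle during which SW1 has no CAM entry for P4.

    Assumes SW1 only refreshes P4's entry when P4 replies to R1's ARP
    request, because return traffic arrives with R2's MAC as the source.
    """
    return max(0, arp_timeout - mac_aging)


# Assumed defaults: 14400 s ARP timeout, 300 s MAC aging time.
print(flooding_seconds_per_arp_cycle(14400, 300))  # 14100
# Matching the two timers closes the window.
print(flooding_seconds_per_arp_cycle(300, 300))    # 0
```

With those assumed defaults the switch would be flooding for most of every 4 hour cycle, which is why aligning the two timers helps.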

 

Spanning-Tree Topology Change

Another, less common cause of unicast flooding is the TCN (Topology Change Notification) that is generated every time a port transitions to or from the STP Forwarding state. The topology change makes switches shorten the aging time of dynamically learned MAC addresses: with STP the aging time drops from the default 300 sec to 15 sec (the forward delay), while RSTP flushes the affected MAC addresses immediately, so unicast flooding follows until they are learned again. If we have a port that is flapping for whatever reason, unicast flooding will keep occurring because the switch is constantly aging out MAC addresses. A port configured with the portfast keyword does not generate TCNs, so end stations are less likely to be causing this issue.
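As a rough illustration of the difference between the two flavors, the sketch below answers one question: is a CAM entry that has been idle for a given number of seconds still there right after a topology change? The function name and the simplified model are my own assumptions; real switches track this per entry and per port.

```python
def survives_topology_change(idle_seconds, protocol, forward_delay=15):
    """Return True if an idle CAM entry is still present right after a TC.

    Simplified model: STP shortens the aging time to the forward delay,
    RSTP flushes the affected entries immediately.
    """
    if protocol == "stp":
        # Only entries refreshed within the shortened aging time remain.
        return idle_seconds < forward_delay
    if protocol == "rstp":
        # Affected entries are flushed no matter how fresh they are.
        return False
    raise ValueError(f"unknown protocol: {protocol}")


print(survives_topology_change(10, "stp"))    # True
print(survives_topology_change(60, "stp"))    # False
print(survives_topology_change(10, "rstp"))   # False
```

In other words, with STP a host that stays quiet for more than 15 seconds gets its traffic flooded until it sends something again, while with RSTP every affected host has to be relearned.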

We can see the MAC address aging time for a specific VLAN

In the following output we can see, from the root bridge’s perspective, that the last TCN occurred 2 weeks ago and came from FastEthernet0/24. This means that at that moment the switch flushed the CAM entries or reduced their aging time (depending on the STP flavor) for the MAC(s) associated with this port.

Out of order packets

Out of order packets can be a sign of CEF per-packet load balancing. Remember that in this mode CEF sends traffic across the equal-cost paths (ECMP) in a round-robin fashion, without regard to the actual destination of the packet.

This becomes a bigger issue as the gap between misordered packets grows. A couple of packets arriving in the wrong order 1 msec apart would not be service impacting. However, if we start seeing a lot of out of order packets it becomes a problem, because TCP will not pass data up to the application until the segments with the preceding sequence numbers have arrived.
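The reason TCP cares can be shown with a tiny reassembly sketch. This is not how any real TCP stack is implemented, just a minimal model assuming byte-based sequence numbers that start at 0; the function name is made up.

```python
def deliver_in_order(segments, next_seq=0):
    """Return (delivered, held) for a list of (seq, payload) arrivals.

    Payloads are handed to the application only when every earlier byte
    has arrived; anything that shows up early is held back.
    """
    held = {}
    delivered = []
    for seq, payload in segments:
        held[seq] = payload
        # Release as much contiguous data as we now have.
        while next_seq in held:
            data = held.pop(next_seq)
            delivered.append(data)
            next_seq += len(data)
    return delivered, held


# Segment 4 arrives before segment 2, so "EF" waits until "CD" shows up.
print(deliver_in_order([(0, "AB"), (4, "EF"), (2, "CD")]))
# If segment 2 never arrives, "EF" stays stuck in the buffer.
print(deliver_in_order([(0, "AB"), (4, "EF")]))
```

The longer the missing segment takes to show up, the longer everything behind it sits in the buffer, and that waiting is what the application ends up feeling.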

A common cause of packets arriving out of order is a traffic flow that uses paths of different speeds to reach the destination. Another is traffic shaping, when packets are queued up but don’t go out in the same order as they arrived.
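Here is a small simulation of the different speed paths case, assuming packets are sprayed round-robin across the paths as in per-packet load balancing. The delays, the send interval and the function name are invented numbers for illustration only.

```python
from itertools import cycle


def arrival_order(num_packets, path_delays_ms, send_interval_ms=1.0):
    """Return packet sequence numbers in the order they reach the receiver.

    Packets are sent one per interval and assigned to paths round-robin;
    each path adds its own fixed delay.
    """
    paths = cycle(path_delays_ms)
    arrivals = []
    for seq in range(num_packets):
        sent_at = seq * send_interval_ms
        arrivals.append((sent_at + next(paths), seq))
    arrivals.sort()
    return [seq for _, seq in arrivals]


print(arrival_order(6, [2.0, 10.0]))  # [0, 2, 4, 1, 3, 5]
print(arrival_order(6, [2.0, 2.0]))   # [0, 1, 2, 3, 4, 5]
```

With equal delays the order is preserved, and the reordering only appears once one path is noticeably slower than the other.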

Micro burst

Micro bursting is hard to see! It refers to sending a considerably large amount of traffic in a very short period of time, which causes the interface to fill up its buffer and start dropping packets because of oversubscription. This can be a bigger issue if we are doing some sort of QoS shaping; recall that when shaping, we queue up the traffic on the interface before it gets sent. With a full buffer, traffic starts being dropped (policed) instead of shaped.

One way to detect micro bursts is to look for a large number of output drops together with low overall interface utilization. In simple words:

It is okay to see output drops on an interface with high to maximum overall traffic utilization.

It is not okay to see output drops on an interface with low overall traffic utilization; this typically indicates micro bursts.
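The rule of thumb above can be written as a small check over counters you have already collected. The function name, the parameters and the 50% threshold are assumptions for illustration, not values from any Cisco document.

```python
def looks_like_microbursts(avg_output_bps, link_bps, output_drops_delta,
                           low_util_threshold=0.5):
    """Flag an interface that is dropping output packets while mostly idle.

    avg_output_bps:     averaged output rate reported for the interface
    link_bps:           interface speed
    output_drops_delta: increase in output drops since the last sample
    low_util_threshold: assumed cut-off for "low" utilization
    """
    utilization = avg_output_bps / link_bps
    return output_drops_delta > 0 and utilization < low_util_threshold


# 1 Gbps link at 5% average load, yet drops are increasing: suspicious.
print(looks_like_microbursts(50_000_000, 1_000_000_000, 1200))   # True
# Same drops at 95% load: the link is simply congested.
print(looks_like_microbursts(950_000_000, 1_000_000_000, 1200))  # False
```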

By using the load-interval interface-level command we can change the interval over which the interface averages its input/output rate, although the chances of this catching a micro burst are close to none, since the minimum interval we can set is 30 secs. There is a feature for monitoring micro bursts, but it is only available on NX-OS (Nexus switches). It’s called Micro-Burst Monitoring, and it lets you monitor input traffic, output traffic or both for a specific port; a syslog message is generated if there is evidence of micro bursting.
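A quick bit of arithmetic shows why an averaged rate hides micro bursts. The numbers are made up: a 1 Gbps link that sends at line rate for 10 ms once per second and is idle the rest of the time.

```python
def average_rate_bps(link_bps, burst_ms, bursts_per_second):
    """Average rate of a link that sends at line rate only during bursts."""
    # Integer math: bits sent per second = line rate * fraction of time busy.
    return link_bps * burst_ms * bursts_per_second // 1000


avg = average_rate_bps(1_000_000_000, burst_ms=10, bursts_per_second=1)
print(avg)                          # 10000000
print(100 * avg // 1_000_000_000)   # 1
```

The interface would report about 10 Mbps, or 1% utilization, even though it hits line rate during every burst and may be dropping packets each time.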