Paris traceroute is a new version of the well-known network diagnosis and measurement tool. It addresses problems caused by load balancers with the initial implementation of traceroute.
Routers can spread their traffic across multiple paths using a per-packet, per-flow, or per-destination policy. In per-flow load balancing, packet header information ascribes each packet to a flow, and the router forwards all packets belonging to a same flow to the same interface. A natural flow identifier is the classic five-tuple of fields from the IP header and either the TCP or UDP headers: Source IP address, Destination IP address, Protocol, Source port, and Destination port. We found through our experiments that load balancers actually use combinations of these fields, as well as three other fields: the IP Type of Service (or TOS), and the ICMP Code and ICMP header checksum fields.
Per-flow load balancing ensures that packets from the same flow are delivered in order. Per-packet load balancing makes no attempt to keep packets from the same flow together, and focuses purely on maintaining an even load. Per-destination load balancing could be seen as a coarse form of per-flow load balancing, as it directs packets based upon the destination IP address. But, as it disregards source information, there is no notion of a flow per se. As seen from the measurement point of view, per-destination load balancing is equivalent to classic routing which is also per destination, and so we will not explore it here. Whether a router balances load per-flow or per-packet depends on the router manufacturer, the OS version, and how the network operator configures it.
Where there is load balancing, there is no longer a single route from a source to a destination. In the case of per-packet load balancing, a given packet might take any one of a number of possible routes. With per-flow load balancing, the notion of a single route persists for packets belonging to a given flow, but different flows for the same source-destination pair can follow different routes. In this regard, designing a new traceroute able to uncover all routes from a source to a given destination would be a significant improvement.
Indeed, the current traceroute is not adequate to that task, as it cannot definitively identify one single route from among many. It suffers from two systematic problems: it fails to discover true nodes and links, and it may report false links. These problems are due to the random manner in which its probe packets are directed by a load-balancing router, or load balancer. Our explanation of these two problems draws on the example in Figure 1. In this example, L is a load balancer at hop 1 from the traceroute source. The true router topology from hops 1 through 4 is shown on the left. Routers are represented as circles and each of their interfaces is numbered. We also show the probe packets sent with TTL 1 to 4. The packets are depicted as yellow arrows, either above the topology, if L directs them to A, or below, if L directs them to B. At the right side, we present the topology that would be inferred given these probe packets.
Figure 1 Traceroute's deficiencies under load balancing
Missing nodes and links
By default, traceroute sends three probes per hop. We can imagine a scenario where all three probes to distance 2 are directed to A. Device B is not discovered, and the links (L,B) and (B,D) can thus not be inferred. With purely random load balancing, there is a significant probability, of 0.53 x 2 = 0.25, that one of the two devices, A or B, would be left undiscovered by a traceroute through this topology. In the general case, there may be more than two next-hop interfaces following a load balancer. For instance, the newer Juniper routers permit up to sixteen equal-cost paths. It is certain that classic traceroute misses many nodes, and the possibility to infer many links.
In our example, L directs the probe dedicated to hop 2 to A and the one to hop 3 to B, which leads to the mistaken inference of a link between A and D. Many topology inference tools send one probe packet per hop to reduce the probing overhead. Consequently, these tools cannot easily detect that a single hop responds with multiple interfaces. In the default usage of traceroute (i.e., three probes per hop), it is easier to identify instances of hops that respond with multiple interfaces. In our example, let us assume that by sending three probes we find both A and B at hop 2, and C and D at hop 3. A liberal approach may infer the existence of links (A,C), (A,D), (B,C), and (B,D), which would lead to two false links. On the other hand, a more conservative approach would require ignoring much data, which would lead to more missing nodes and links. There is no satisfactory solution with the existing traceroute.
False links may be caused by either per-packet or per-flow load balancing. In the latter case, false links arise because traceroute relies upon the ICMP replies to its probe packets in order to identify devices along a route. A router that sends a Time Exceeded message encapsulates the IP header of the packet that it discarded, plus the first 8 octets of data which, in the case of UDP, TCP, or ICMP probes, means the first 8 octets of the relevant transport-layer header.
Traceroute considers these 28 octets in order to match replies to probes. In the UDP probes that it sends out, traceroute sets a high destination port number, and systematically varies this number with each probe so that there will be a unique identifier in the returning ICMP packet. For ICMP probes, traceroute uses the sequence number field as the probe identifier, which in turn impacts the checksum of the ICMP header. Unfortunately, as we have found, varying any field in the first four octets of the transport-layer header amounts to systematically varying the flow identifier with each probe.
How we correct it
We introduce Paris traceroute, a new traceroute to respond to load balancing routers. Its key innovation is to control the probe packet header fields in a manner that allows all probes towards a destination to follow the same path in the presence of per-flow load balancing. It also allows a user to distinguish between the presence of per-flow load balancing and per-packet load balancing. Unfortunately, due to the random nature of per-packet load balancing, Paris traceroute cannot perfectly enumerate all paths in all situations. But it can do considerably better than the classic traceroute, and it can flag those instances where there are doubts. The problem, if one wishes to maintain certain header fields constant, is that traceroute still needs to be able to match response packets to their corresponding probe packets.
Figure 2 The IP, UDP and ICMP headers. Per-flow load balancers
use the grey fields to identify a flow. Red arrows
show the fields incremented by classic traceroute.
Paris traceroute uses the green fields to identify probes
Paris traceroute does this by varying header fields that are within the first 28 octets, but are not used for load balancing. In the case of TCP probes, Paris traceroute varies the sequence number. In UDP probes, it is the checksum field. This requires manipulating the payload to yield the desired checksum, as packets with an incorrect checksum are liable to be discarded. In ICMP probes, it is a combination of the ICMP identifier and the sequence number. We carefully set the value of the ICMP identifier and sequence number to keep constant the header checksum of all probes to a destination. Figure 2 summarizes the IP, UDP, and ICMP header fields that are used by load balancers, classic traceroute, and Paris traceroute. We include the ICMP type field as used for load balancing, even if we cannot verify it experimentally (routers only repond to one type of probes, which is the ICMP echo request). Our experiments with UDP, TCP, and IPSec probes do give strong evidence that routers blindly use the first four octets after the IP header combined with the IP fields to do load balancing.