The Math Proves It—Network Congestion Is Inevitable

Just as highway networks may suffer from snarls of traffic, so too may computer networks face congestion. Now a new study finds that many key algorithms designed to control these delays on computer networks may prove deeply unfair, letting some users hog all the bandwidth while others get essentially nothing.

Computers and other devices that send data over the internet break it down into smaller packets and then use special algorithms to decide how fast to send these packets. These congestion-control algorithms aim to discover and exploit all the available network capacity while sharing it with other users on the same network.

Over the past decade, researchers have developed several congestion-control algorithms that seek to achieve high rates of data transmission while minimizing the delays resulting from data waiting in queues in the network. Some of these, such as Google’s BBR algorithm, are now widely used by many websites and applications.

“Extreme unfairness happens even when everybody cooperates, and it is nobody’s fault.”
—Venkat Arun, MIT

However, although hundreds of congestion-control algorithms have been proposed in the last roughly 40 years, “there is no clear winner,” says study lead author Venkat Arun, a computer scientist at MIT. “I was frustrated by how little we knew about where these algorithms would and would not work. This motivated me to create a mathematical model that could make more systematic predictions.”

Unexpectedly, Arun and his colleagues now find many congestion-control algorithms may prove highly unfair. Their new study finds that given the real-world complexity of network paths, there will always be a scenario where a problem known as “starvation” cannot be avoided—where at least one sender on a network receives almost no bandwidth compared to that of other users.

A user’s computer does not know how fast to send data packets because it lacks knowledge about the network, such as how many other senders are on it or the quality of the connection. Sending packets too slowly makes poor use of the available bandwidth. However, sending packets too quickly may overwhelm a network, resulting in packets getting dropped. These packets then need to be sent again, resulting in delays. Delays may also result from packets waiting in queues for a long time.

Congestion-control algorithms rely on packet losses and delays as details to infer congestion and decide how fast to send data. However, packets can get lost and delayed for reasons other than network congestion. For example, data may be held up and then released in a burst with other packets, or a receiver’s acknowledgement that it received packets might get delayed. The researchers called delays that do not result from congestion “jitter.”

Congestion-control algorithms cannot distinguish the difference between delays caused by congestion and jitter. This can lead to problems, as delays caused by jitter are unpredictable. This ambiguity confuses senders, which can make each of them estimate delay differently and send packets at unequal rates. The researchers found this eventually leads to situations where starvation occurs and some users get shut out completely.

In the new study, the researchers analyzed whether every congestion-control algorithm of which they were aware, as well as some new ones they devised, could avoid starvation. The scientists were surprised to find there were always scenarios with each algorithm where some people got all the bandwidth, and at least one person got basically nothing.

“Some users could be experiencing very poor performance, and we didn’t know about it sooner,” Arun says. “Extreme unfairness happens even when everybody cooperates, and it is nobody’s fault.”

The researchers found that all existing congestion-control algorithms that seek to curb delays are what they call “delay-convergent algorithms” that will always suffer from starvation. The fact that this weakness in these widely used algorithms remained unknown for so long is likely due to how empirical testing alone “could attribute poor performance to insufficient network capacity rather than poor algorithmic decision-making,” Arun says.

Although existing approaches toward congestion control may not be able to avoid starvation, the aim now is to develop a new strategy that does, Arun says. “Better algorithms can enable predictable performance at a reduced cost,” he says.

Arun notes that this research may have applications beyond analyzing network congestion. “We are currently using our method of modeling computer systems to reason about other algorithms that allocate resources in computer systems,” he says. “The goal is to help build systems with predictable performance, which is important since we rely on computers for increasingly critical things. For instance, lives could depend on self-driving cars making timely decisions.”  [READ MORE]