Algorithms designed to ensure that multiple users share a network fairly cannot prevent some users from monopolizing all the bandwidth. –ScienceDaily
When users want to send data over the Internet faster than the network can, traffic jams can occur, much like traffic jams disrupt the morning commute to a big city.
Computers and devices that transmit data over the Internet break the data into smaller packets and use a special algorithm to decide how fast to send those packets. These congestion control algorithms seek to discover and fully utilize available network capacity while sharing it fairly with other users who may share the same network. These algorithms attempt to minimize the delay caused by data waiting in network queues.
Over the past decade, researchers in industry and academia have developed several algorithms that attempt to achieve high rates while controlling delays. Some of them, like the BBR algorithm developed by Google, are now widely used by many websites and applications.
But a team of MIT researchers has discovered that these algorithms can be deeply unfair. In a new study, they show that there will always be a network scenario where at least one sender receives almost no bandwidth compared to other senders; that is, a problem known as starvation cannot be avoided.
“What’s really surprising about this article and the results is that when you consider the real complexity of network paths and all they can do to data packets, it’s basically impossible for congestion control algorithms to control delays to avoid starvation using current methods,” says Mohammad Alizadeh, Associate Professor of Electrical Engineering and Computer Science (EECS).
While Alizadeh and her co-authors were unable to find a traditional congestion control algorithm that could prevent starvation, there may be algorithms in a different class that could prevent this problem. Their analysis also suggests that modifying the way these algorithms work, so that they allow for greater variations in delay, could help prevent starvation in certain network situations.
Alizadeh wrote the article with first author and EECS graduate student Venkat Arun and lead author Hari Balakrishnan, Fujitsu Professor of Computer Science and Artificial Intelligence. The research will be presented at the ACM Special Interest Group on Data Communications (SIGCOMM) conference.
Congestion control is a fundamental problem in networks that researchers have been trying to solve since the 1980s.
A user’s computer does not know how fast to send data packets over the network because it lacks information, such as the quality of the network connection or the number of other senders using the network. Sending packets too slow makes poor use of the available bandwidth. But sending them too quickly can overwhelm the network, and in doing so, packets will start dropping. These packets must be resent, which results in longer delays. Delays can also be caused by packets waiting in queues for a long time.
Congestion control algorithms use packet losses and delays as signals to infer congestion and decide how fast to send data. But the Internet is complicated and packets can be delayed and lost for reasons unrelated to network congestion. For example, data may be stuck in a queue en route and then released with a burst of other packets, or recipient acknowledgment may be delayed. The authors call delays that are not caused by congestion “jitter”.
Even though a congestion control algorithm measures delay perfectly, it cannot tell the difference between delay caused by congestion and delay caused by jitter. The delay caused by jitter is unpredictable and confuses the sender. Because of this ambiguity, users start to estimate delay differently, causing them to send packets at unequal rates. Eventually, this leads to a situation where starvation occurs and someone is completely left out, Arun explains.
“We started the project because we lacked a theoretical understanding of how congestion control behaves in the presence of jitter. some of the complexities of the internet. It’s been very gratifying to see the math tell us things that we didn’t know that have practical relevance,” he says.
The researchers fed their mathematical model to a computer, gave it a series of commonly used congestion control algorithms, and asked the computer to find an algorithm that could avoid starvation, using their model.
“We couldn’t do it. We tried every algorithm we knew, and a few new ones we came up with. Nothing worked. The computer always found a situation where some people get all the bandwidth and at least one person gets practically nothing,” says Arun.
The researchers were surprised by this result, especially since these algorithms are widely considered to be reasonably accurate. They began to suspect that it might not be possible to avoid starvation, an extreme form of injustice. This motivated them to define a class of algorithms they call “delayed convergence algorithms” which they proved would always suffer from starvation under their network model. All existing congestion control algorithms that control delay (of which researchers are aware) are delay convergent.
The fact that such simple failure modes of these widely used algorithms have remained unknown for so long illustrates how difficult it is to understand the algorithms solely through empirical testing, adds Arun. It emphasizes the importance of a solid theoretical base.
But all hope is not lost. Although all of the algorithms they tested failed, there may be other non-delay-convergent algorithms that might be able to avoid starvation. This suggests that one way to solve the problem might be to design congestion control algorithms that vary the delay range more widely, so the range is larger than any delay that may occur due to jitter in the network.
“To control delays, the algorithms have also tried to limit delay variations around a desired balance, but there is nothing wrong with potentially creating greater delay variation to get better measures of congestive delays. It’s just a new design philosophy that you would have to adopt,” adds Balakrishnan.
Now researchers want to keep pushing to see if they can find or build an algorithm that will eliminate starvation. They also want to apply this mathematical modeling and computational proof approach to other unsolved thorny problems in networked systems.
“We increasingly depend on computer systems for very critical things, and we need to establish their reliability on a more solid conceptual basis. We have shown the surprising things you can discover when you take the time to develop these formal specifications. what the problem really is,” says Alizadeh.