Congestion Control

We argue, in a slightly informal manner, that queueing networks implicitly optimize a utility function subject to constraints on network capacity. We start with the simple example of a closed queueing network and, as we shall discuss, a motivating example is the Transmission Control Protocol which controls the number of packets in transfer on an Internet connection.

We consider a network of single-server queues which processes packets of equal size along different routes:

${\mathcal J}$ indexes the set of queues. A route is an ordered set of queues, $r=(j_1,...,j_k)\subset {\mathcal J}$ . ${\mathcal R}$ is the set of routes. ¹
$C=(C_j:j\in{\mathcal J})$ is the service rate of each queue; $\bar{m}=(\bar{m}_r: r\in{\mathcal R}$ is the number of packets on each route; $m=(m_{jr}: r\in{\mathcal R}, j\in{\mathcal J})$ is the number of packets from each route at each queue; $p=(p_j:j\in{\mathcal J})$ is the probability a packet is lost at each queue, if packets are lost; $q=(q_j:j\in{\mathcal J})$ is the sojourn time of a packet at each queue and, most importantly, $\Lambda=(\Lambda_r:r\in{\mathcal R})$ is the rate packets are transferred on each route.

Closed Queueing Networks

We consider a closed multi-class queueing network that is $\bar{m}$ the number of packet in transfer on each route is fixed and no packets are lost. By Little’s law

$\label{KT0} \Lambda_rq_j={m}_{jr}, \quad \text{for}\quad j\in r.$

Summing over $j\in r$ and rearranging gives

(1) $\label{KT1} \frac{\bar{m}_r}{\Lambda_r}-\sum_{j\in r} q_j =0,\quad \forall r\in\mR.$

Since queues are stable, we know

(2) $\label{KT2} \sum_{r: j\in r} \Lambda_r \leq C_j,\quad \forall j\in\mJ.$

One can imagine if the above equality is strict then $q_j\approx 0$ . Thus approximately

(3) $\label{KT3} q_j\Big( C_j - \sum_{r: j\in r} \Lambda_r \Big) = 0 ,\quad \forall j\in\mJ.$

Also

(4) $\label{KT4} q_j\geq 0,\quad \forall j\in\mJ \quad\text{and}\quad \Lambda_r\geq 0, \quad \forall r\in\mR.$

Interpreting $(q_j:j\in{\mathcal J})$ as Lagrange multipliers, then (1)-(4) are precisely the Kuhn-Tucker conditions for the so-called proportionally fair optimization problem:

$\label{NUM:propfair} \max_{\Lambda\in \bR_+^{|\mR|}}\quad \sum_{r\in\mR} \bar{m}_r\log \Lambda_r \quad\text{subject to}\quad \sum_{r: j\in r} \Lambda_r \leq C_j\quad \forall j\in\mJ.$

To make this argument, we assumed that the sojourn times of packets did not depend on the route used and that complementary slackness condition ([KT3]) held. Neither of these conditions need be true in general.
Even so, for a network of FIFO queues with $\bar{m}$ large, we would expect sojourn times at congested queues to depend on their lengths rather than on the route type of the packets that just arrived. And, also, we would expect a congested queue to be fully utilized. So, it would seem reasonable that that this optimization should provide a good approximation for a congested network.

TCP Reno

The Transmission Control Protocol (TCP) is the main protocol used for data transmission on the Internet. In 1988, after a series of congestion collapses on the early Internet, van Jacobson designed TCP Reno, a congestion avoidance protocol. Modern TCP congestion control implementations are variants of this original design. For a sender and receiver transmitting across route $r$ though routers $j\in r$ , we apply the following rationale:

For each packet sent by the sender, the receiver must send an acknowledgement packet, often called an ACK.
Each router $j\in{\mathcal J}$ queues packets in a finite buffer. If the rate packets arrive at the router is too large then the router’s buffer will fill, and packets will be lost. The sender can detect this because an ACK will not be received.
So, as just described, packet losses are seen as an indication of congestion. On the other hand, a packet not being lost is an indication that the network is not being fully utilized.
So, when a packet loss occurs, in order to decrease congestion, the sender should decrease its rate of sending packets. On the other hand, when a packet is successfully sent, in order to utilize the network capacity, the sender should increase its rate of sending. In this way, the congestion control protocol continually probes for the capacity of its route.

Now more specifically, TCP Reno controls congestion as follows:²

The sender records the round trip time, $T_r$ , – the time for a packet to be sent and an acknowledgement returned.
The sender maintains a congestion window, $\bar{m}_r$ , – the number of packets sent per round trip time.
If a packet loss occurs then TCP will decrease its congestion window by half. In effect, halving the transfer rate.
For each congestion window sent, we increase the congestion window by one packet. In effect, linearly increasing the transfer rate.³

Recalling our notations above, the rate of transfer is the number of packets per round trip time

$\label{Reno:Lambda} \Lambda_r=\frac{\bar{m}_r}{T_r}.$

In equilibrium, for each packet sent, the average increase in the congestion window should equal the average decrease in the congestion window

$\label{Reno:equ} 0= \underbrace{\frac{1}{\bar{m}_r}}_{\substack{\text{sending } \bar{m}_r \\ \text{packets increases}\\ \text{window by 1.}}} -\underbrace{p_r \frac{\bar{m}_r}{2}.}_{\substack{\text{one packet}\\ \text{loss halves} \\ \text{the window.}}}$

Solving for $\bar{m}_r$ in and substituting for $\Lambda_r$ in , we gain the so-called TCP square root formula:

(a) $\label{Reno:TCPSqrt} \Lambda_r=\frac{1}{T_r}\sqrt{\frac{2}{p_r}},\qquad\qquad r\in\mR.$

The route loss probability $p_r$ gives the probability that a packet is lost on a router $j\in r$ . A packet being lost at router $j$ is mutually exclusive to being lost at another router $j'$ . Thus, given $p_j$ the probability of loss on router $j$ ,

(b) $p_r=\sum_{j\in r} p_j.$

We have just described for one route. We now imagine a number of TCP connections interacting on different routes. Our equation is analogous to the equation derived for closed queueing networks. Now similar to (1)-(4), since queues are stable in equilibrium, we know

One can imagine if the above equality is strict then the number of packets lost should be small $p_j\approx 0$ . Thus approximately

(d) $\label{Reno3} p_j\Big( C_j - \sum_{r: j\in r} \Lambda_r \Big) = 0 ,\quad \forall j\in\mJ.$

Also

(e) $\label{Reno4} p_j\geq 0,\quad \forall j\in\mJ \quad\text{and}\quad \Lambda_r\geq 0, \quad \forall r\in\mR.$

Interpreting loss probabilities $(p_j:j\in{\mathcal J})$ as Lagrange multipliers, (a)-(e) are precisely the Kuhn-Tucker conditions for the so-called TCP-fair optimization problem:

$\label{NUM:TCP-fair} \max_{\Lambda\in \bR_+^{|\mR|}}\quad \sum_{r\in\mR} -\frac{2}{\Lambda_rT^2_r} \quad\text{subject to}\quad \sum_{r: j\in r} \Lambda_r \leq C_j\quad \forall j\in\mJ.$

Once again, the network flow maximizes a function subject to the networks capacity constraints. This utility function which was the $\log \Lambda_r$ function is now $-2/(\Lambda_r T^2_r)$ . So the congestion control mechanism has changed the function maximized.
Note in the TCP square root formula , give the loss probability is fixed, the transfer rate is inversely proportional to the round-trip-time. So short routes are favored over long routes, which need not be a bad thing. This is property of TCP Reno is called route trip time bias.
A striking feature of these optimization solutions is that a function is maximized implicitly. The congestion control protocols are not explicitly attempting to solve a global network optimization. Never-the-less, their simple operation which attempts to exploit network capacity results in such a solution.

Note, we slightly abuse notation here by choosing $r$ as both a vector and a subset. However, it is conceptually convenient to consider packets to visit queues in order $j_1,...,j_k$ , and it is also convenient to consider a queue to belong to a route $j\in r$ .↩
We remark that we ignore TCP Reno’s slow start phase. This is the initial phase where the protocol increases its rate of sending exponentially until a loss occurs. This initial phase puts the TCP transfer rate roughly at the correct rate. Here we just consider the congestion avoidance phase, this is where the protocol begins to probe for capacity.↩
Perhaps we should say here “For each successfully congestion window sent,…”. This would then have the effect of …!!↩

Closed Queueing Networks

TCP Reno

Share this:

Leave a comment Cancel reply