# Max-Weight Scheduling

We define a discrete-time queueing network where there are restrictions on which queues can be served simultaneously. We give a policy for serving queues which is stable whenever it is possible to stabilize the queueing network.

• Let ${\mathcal J}$ index the set of queues. We let ${\mathcal S}$ be the set of schedules. Each $\sigma=(\sigma_j:j\in{\mathcal J})\in{\mathcal S}$ is a $0-1$ vector where $\sigma_j=1$ indicates that queue $j$ will be served under schedule $\sigma$ and $\sigma_j=0$ indicates that the queue will not be served by schedule $\sigma$. We let $< {\mathcal S} >$ be the convex combination of schedules in ${\mathcal S}$.
• We assume that we can always choose not to serve a queue. That is we can always set a component zero: $\sigma\in{\mathcal S}\implies(\sigma_{-j},0)\in{\mathcal S},\; \forall j\in{\mathcal J}$. This avoids the issue of accidentally serving an empty queue and ensures $<{\mathcal S} >^\circ \neq \emptyset$.
• We let $a(t)=(a_j(t):j\in{\mathcal J})\in{\mathbb Z}_+^{\mathcal J}$ be the number of arrivals occurring at each queue at time $t$. Also, we define $\bar{a}(t)={\mathbb E}[a(t)|a(t-1),...,a(1)]$.
• We let $Q(t)=(Q_j(t):j\in{\mathcal J})$ be the size of each queue at time $t$, and use the shorthand $Q^{\Sigma}(t)=\sum_{j\in{\mathcal J}} Q_j(t)$.
• We define the MaxWeight scheduling policy to be the policy that solves the optimization
• We could define the above optimization as $\max_{s\in<{\mathcal S}>} \sigma\cdot Q(t-1)$, i.e. maximizing over convex set $<{\mathcal S}>$ instead of the discrete set ${\mathcal S}$. This is because any linear program has a solution at an extreme point which must, for these problems, be a subset of the set ${\mathcal S}$.
• We assume our network is a single-hop network: once a job is served at its queue it leaves the network.

The following result states that provided the expected arrivals $\bar{a}(t)$ are within the interior of the set of schedules $< {\mathcal S} >^\circ$ then we can expect the long run queue size to be bounded.

Theorem: If there exists $\epsilon >0$ such that for all $t\in{\mathbb Z}_+$, $\bar{a}(t)+\epsilon\mathbf{1} \in < {\mathcal S} >^\circ$ and $\sup_{t\in{\mathbb Z}_+}{\mathbb E} ||a(t)-\bar{a}(t)||^2<\infty$ then 1 there exists constants $c$ and $\hat{\epsilon}$ such that

Proof: Note the crux of this argument is that for all queue sizes $q$, $q\cdot a$ is by some margin strictly less than the MaxWeight choice $\max_{\sigma\in <{\mathcal S}>} q\cdot \sigma$. Given $\epsilon$ above, we aim to use $\hat{\epsilon}$, the biggest such that $\bar{a}(t)+\hat{\epsilon}\mathbf{1} \in < {\mathcal S} >$ for all time. This can be defined as the biggest $\hat{\epsilon}$ such that $\hat{\epsilon}\leq \epsilon(t):= \max_{\sigma\in<{\mathcal S}>} \min_{j\in{\mathcal J}} (\sigma_j - \bar{a}_j(t))$ for all $t\in{\mathbb Z}_+$ i.e. so the distance of the smallest component to the boundary is maximized. Given this optimization description we observe the following

From this observation, we see one way that we can use $\hat{\epsilon}$ to bound the gap between $q\cdot \bar{a}(t)$ and $\max_{q\in<{\mathcal S}>} q\cdot \sigma$.

Summing the above expression from $t=0,...,T-1$, we gain the expression

Rearranging, dividing by $T$, and letting $T\rightarrow\infty$, we obtain the desired expression. $\square$

• The above schedule, result, and proof is a special case of the Blackwellâs Approachability Theorem. Here wish to approach the set ${\mathbb R}_{-}^{\mathcal J}$ and our Blackwell condition [Theorem [Blackwell:Blackwell].2] states that for every $\bar{a}$ there exist a $\sigma\in{\mathcal S}$ such that $\bar{a}\leq \sigma$ component-wise.

We now study $Q(t)$ as a Markov chain. To do this, we assume $\{a(t)\}_{t=1}^\infty$ are independent identically distributed random variables with mean $\bar{a}$. The result below proves that this Markov chain is transient whenever $\bar{a}\notin <{\mathcal S} >$. Thus $<{\mathcal S} >^\circ$ is the largest open region for which our Markov chain can be positive recurrent. We show that this Markov chain is positive recurrent for $\bar{a}\in <{\mathcal S}>^\circ$.

• As just described, if it is possible to find a policy to stabilize the queueing network then the MaxWeight scheduler will also stabilize the network. This property is sometimes called maximum stable.

Proposition: Given $\{a(t)\}_{t=1}^\infty$ are iid random variables with mean $\bar{a}$ and finite variance.

If $\bar{a}\in <{\mathcal S} >^\circ$ then $\{Q(t)\}_{t=0}^\infty$ is a positive recurrent Markov chain.

If $\bar{a}\notin <{\mathcal S} >$ then $\{Q(t)\}_{t=0}^\infty$ is a transient Markov chain.

Proof: If $\{a(t)\}_{t=1}^\infty$ are iid then $Q(t+1)$ is a function of $Q(t)$ and independent random variable $a(t)$ and so $\{Q(t)\}_{t=0}^\infty$ defines a discrete-time Markov chain. First suppose $\bar{a}\in <{\mathcal S} >^\circ$. The above Theorem holds. If the Markov chain $\{Q(t)\}_{t=0}^\infty$ were not positive recurrent then $\lim_{T\rightarrow\infty}\frac{1}{T} \sum_{t=0}^{T-1} {\mathbb I}[Q(t)=q]=0$ for all $q\in{\mathbb Z}_+^{\mathcal J}$. In particular, this would hold for all $q$ such that $q^\Sigma \leq c/2\epsilon$ which would then imply would not hold. Thus by this contradiction, this Markov chain is positive recurrent.

Now suppose $\bar{a}\notin <{\mathcal S} >$. As $<{\mathcal S} >$ is convex, there exists a hyper-plane separating $\bar{a}$ and $<{\mathcal S} >$, i.e. there exists a $p\in{\mathbb R}^{\mathcal J}$ and $\delta>0$ such that

We define $\hat{\sigma}(T)=\sum_{t=1}^T \sigma(t)/T$ and $\hat{a}(T)=\sum_{t=1}^T a(t)/T$, the average schedule and arrivals by time $T$. Now

In the second equality, we use the Strong Law of Large Numbers on $\{a(t)\}_{t=1}^\infty$. In the final inequality, we use our separating hyper-plane . Thus as $\lim_{T\rightarrow\infty} p\cdot Q(T) =\infty$, our Markov chain visits each set of states $\{ q\in{\mathbb Z}_+^{\mathcal J} : p\cdot q \leq c\}$ finitely many times, for each $c>0$. So, our Markov chain is transient. $\square$

Other maximum stable schedulers exist. For instance, we could estimate $\bar{a}$ and schedule. That is record $\hat{a}(t)=\sum_{\tau=1}^t a(\tau)/t$ and chooses a random schedule with mean $\sigma(t)=\hat{a}(t-1)+\epsilon(t)\mathbf{1}$ where $\epsilon(t)\in {\mathbb R}$ is maximal so that $\hat{a}(t-1)+\epsilon(t)\in< {\mathcal S} >$. These $\hat{a}(t)$ â and hence $\sigma(t)$ â would converge to give a stable queueing network. However, we needed to record the arrival history.

• The MaxWeight scheduler has the advantage that we do not need to record any historical information beyond the queueing networks state in order to construct a schedule. Such policies are often called on-line policies.
• It can be computationally expensive to calculate a solution to at each time-step. So, despite the desirable properties that we have just proven, the MaxWeight policy can be impractical.

1. Here $\mathbf{1}$ is the vector whose components are all ones.â©