A *linear program* is a constrained optimization problem with a linear objective, $c^\top x$, and linear functional constraints, $Ax \geq b$. For instance, we could write

$$\text{minimize} \quad c^\top x \quad \text{subject to} \quad Ax \geq b, \; x \geq 0. \tag{LP}$$

- Here $x \in \mathbb{R}^n$ is a vector, $A$ is an $m \times n$ matrix with row vectors $a_1, \dots, a_m$ and, as before, $c \in \mathbb{R}^n$ and $b \in \mathbb{R}^m$.
- A linear program could have some equality constraints, or could optimize some variables over all of $\mathbb{R}$ rather than $\mathbb{R}_+$. Even so, with the addition of extra variables and constraints and a suitable choice of $A$, $b$ and $c$, the optimization (LP) above can express any linear program.
- There are various ways to interpret (LP). To start with, we could interpret $c$ as the running costs of $n$ different machine types. The vector $b$ could represent the required rates of production of $m$ different goods. If machine $j$ makes good $i$ at rate $a_{ij}$, the $(i,j)$-th component of $A$, then the constraint $Ax \geq b$ must be met, and the optimization asks us to minimize the cost of production subject to the production constraints (see the numerical sketch after this list).
- Because our objective and constraints are convex, we can apply the Lagrangian approach.
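
As a concrete illustration of the production interpretation, here is a minimal numerical sketch using `scipy.optimize.linprog`. The data (two machine types, two goods, and their production rates) is invented purely for illustration:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: c[j] is the running cost of machine type j,
# b[i] the required rate of good i, and A[i, j] the rate at which
# machine j produces good i.
c = np.array([3.0, 2.0])
b = np.array([4.0, 6.0])
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])

# linprog minimizes c^T x subject to A_ub x <= b_ub and the bounds,
# so we negate Ax >= b to get (-A)x <= -b; x >= 0 via the bounds.
res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)
print(res.x)    # optimal usage of each machine type
print(res.fun)  # minimal running cost
```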

**Lemma 1.** The dual of (LP) is the optimization

$$\text{maximize} \quad b^\top y \quad \text{subject to} \quad A^\top y \leq c, \; y \geq 0. \tag{D}$$

**Proof.** Including slack variables $z \geq 0$, we write the constraint as $Ax - z = b$ and minimize the Lagrangian with Lagrange multipliers $y$:

$$L(x, z; y) = c^\top x - y^\top (Ax - z - b) = (c - A^\top y)^\top x + y^\top z + b^\top y.$$

Minimizing over $x \geq 0$ and $z \geq 0$, the infimum is finite exactly when $A^\top y \leq c$ and $y \geq 0$, in which case it equals $b^\top y$.

So by maximizing over such $y$, the dual of (LP) is as required:

$$\text{maximize} \quad b^\top y \quad \text{subject to} \quad A^\top y \leq c, \; y \geq 0.$$

**Lemma 2.** If $x$ is feasible for (LP) and $y$ is feasible for (D), and they obey the complementary slackness conditions $y^\top(Ax - b) = 0$ and $x^\top(c - A^\top y) = 0$, then $x$ is optimal for (LP) and $y$ is optimal for (D).

**Proof.** By Weak Duality, $c^\top x \geq b^\top y$ for all primal feasible $x$ and dual feasible $y$. But for $x$ and $y$ as above, $c^\top x = x^\top A^\top y = b^\top y$. Thus we see the primal and dual objectives are equal, and so these $x$ and $y$ are optimal.
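
Lemma 2 is easy to check numerically. Reusing the invented data from the sketch above, the following solves (LP) and (D) with `scipy` and confirms that the objectives agree and the slackness conditions hold at the optimum:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([3.0, 2.0]); b = np.array([4.0, 6.0])
A = np.array([[1.0, 2.0], [3.0, 1.0]])

# Primal (LP): min c^T x  s.t. Ax >= b, x >= 0.
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)
# Dual (D): max b^T y  s.t. A^T y <= c, y >= 0 (negated into a min).
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2)

x, y = primal.x, dual.x
print(np.isclose(c @ x, b @ y))            # objectives agree
print(np.isclose(y @ (A @ x - b), 0.0))    # y^T (Ax - b) = 0
print(np.isclose(x @ (c - A.T @ y), 0.0))  # x^T (c - A^T y) = 0
```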

The set $\{x \in \mathbb{R}^n : Ax \geq b, \; x \geq 0\}$ is the intersection of half-spaces. In other words, the feasible set is a *convex polytope*.

- An *extreme point* of a convex polytope (or indeed any convex set) $\mathcal{P}$ is a point $x \in \mathcal{P}$ such that if $x = \delta y + (1 - \delta) z$ for some $y, z \in \mathcal{P}$ and $\delta \in (0, 1)$, then $x = y = z$.
- For a polytope, we think of an extreme point as a point on a corner of the shape.

**Theorem 1.** Any linear program with a finite optimum has at least one optimum at an extreme point.

So, we just need to check extreme points to find an optimum. Let's consider the optimization

$$\text{minimize} \quad c^\top x \quad \text{subject to} \quad Ax = b, \; x \geq 0, \tag{LP$'$}$$
which, as we know, can be made equivalent to (LP).
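
The equivalence goes via slack variables: $Ax \geq b$ becomes $Ax - z = b$ with $z \geq 0$, and the slacks enter the objective with zero cost. A minimal sketch of this conversion (the function name is my own):

```python
import numpy as np

def to_standard_form(A, b, c):
    """Rewrite min c^T x s.t. Ax >= b, x >= 0 in the equality form
    min c'^T x' s.t. A'x' = b, x' >= 0, where x' = (x, z) and z are
    slack variables satisfying Ax - z = b."""
    m, _ = A.shape
    A_eq = np.hstack([A, -np.eye(m)])        # [A, -I] (x, z)^T = b
    c_eq = np.concatenate([c, np.zeros(m)])  # slacks cost nothing
    return A_eq, b, c_eq
```

Solving the equality-form problem and dropping the slack coordinates recovers a solution of the original problem.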

Consider the $m \times n$ matrix $A$ and let $B \subset \{1, \dots, n\}$ where $|B| = m$. We let $A_B$ be the $m \times m$ submatrix of $A$ formed by removing the columns not in $B$. We call the matrix $A_B$ a *basis*.^{1} Given $B$ and $N = \{1, \dots, n\} \setminus B$, we use the notation

$$x = (x_B, x_N), \qquad c = (c_B, c_N) \qquad \text{and} \qquad A = (A_B, A_N),$$

where here elements are understood to be ordered relative to $B$ and $N$. If $x_N = 0$, i.e. non-basis elements are zero, and $A_B x_B = b$, then we call $x$ a *basic solution*. We call a non-negative basic solution, $x \geq 0$, a *basic feasible solution* (bfs).
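
Computing a basic solution is a single linear solve. A small sketch, with `basic_solution` a helper name of my choosing:

```python
import numpy as np

def basic_solution(A, b, B):
    """Basic solution for the basis B, a collection of m column
    indices: set x_N = 0 and solve A_B x_B = b. The result is a
    basic feasible solution (bfs) when additionally x >= 0."""
    _, n = A.shape
    x = np.zeros(n)
    x[list(B)] = np.linalg.solve(A[:, list(B)], b)  # needs A_B non-singular
    return x
```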

Henceforth, we make the following assumptions on the matrix $A$ and vector $b$.

**Assumption 1.** The rows of $A$ are linearly independent.

**Assumption 2.** Every basic solution has exactly $m$ non-zero variables.

**Assumption 3.** Every $m \times m$ submatrix $A_B$ of $A$ is non-singular.

- Assumption 1 removes redundant constraints from the optimization (LP$'$).
- Assumption 3 implies that the rows of a basis $A_B$ form a basis of the vector space $\mathbb{R}^m$.
- Assumptions 2 and 3 can be made to hold by an infinitesimally small perturbation of the entries of $A$ and $b$.
- Assumptions 2 and 3 imply that each basis $A_B$ uniquely specifies a bfs $x$ with $x_B = A_B^{-1} b$ and $x_N = 0$.

The extreme points of the feasible set of (LP$'$) can now be characterized.

**Theorem 2.** The extreme points of $\{x \in \mathbb{R}^n : Ax = b, \; x \geq 0\}$ are exactly the set of basic feasible solutions.

- Finding an extreme point is now characterized and computable: namely, we find $x = (x_B, x_N)$ where $x_B = A_B^{-1} b \geq 0$ and $x_N = 0$ (as in the brute-force sketch below).
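
Combining Theorem 2 with this observation, on small problems one can brute-force an optimum by enumerating every basis and keeping the best feasible basic solution. A sketch, reusing the hypothetical `basic_solution` helper from earlier:

```python
from itertools import combinations

import numpy as np

def brute_force_lp(A, b, c):
    """Minimize c^T x over {x : Ax = b, x >= 0} by checking all bfs."""
    m, n = A.shape
    best_x, best_val = None, np.inf
    for B in combinations(range(n), m):
        if abs(np.linalg.det(A[:, list(B)])) < 1e-12:
            continue                       # skip singular A_B
        x = basic_solution(A, b, B)        # helper from the earlier sketch
        if np.all(x >= -1e-9) and c @ x < best_val:
            best_x, best_val = x, c @ x
    return best_x, best_val
```

Of course, there are $\binom{n}{m}$ bases, so this is only sensible for tiny instances; the Simplex Algorithm below visits far fewer of them.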

**Theorem 3.** [Simplex Theorem] Given Assumptions 1–3, for (LP$'$), a basic feasible solution $x = (x_B, x_N)$ is optimal iff

$$\bar{c}_N^\top := c_N^\top - c_B^\top A_B^{-1} A_N \geq 0. \tag{$*$}$$

Moreover, if $x$ is not optimal, i.e. $\bar{c}_j < 0$ for some $j \in N$, then there exists a new basis $B' = (B \cup \{j\}) \setminus \{i\}$ for some $i \in B$ such that the basic feasible solution $x'$ associated with $B'$ satisfies $c^\top x' \leq c^\top x$.

- This theorem gives both a method to verify optimality of a bfs and a mechanism to move to a better bfs from a non-optimal bfs. As we will discuss, this method is called the Simplex Algorithm.
- For a maximization, the inequality in ($*$) is reversed.
- Since $c_B^\top - c_B^\top A_B^{-1} A_B = 0$, optimality condition ($*$) can be expressed as

$$\bar{c}^\top := c^\top - c_B^\top A_B^{-1} A \geq 0.$$
- The vector $\bar{c}$, defined above, is called the *reduced cost* vector and its entries are *reduced costs*.
- We can interpret $y^\top = c_B^\top A_B^{-1}$ as variables of the dual of (LP$'$). So once $\bar{c} = c - A^\top y \geq 0$, we have a dual feasible solution $y$ and a primal feasible solution $x$. Thus, by essentially the same argument we used in Lemma 2, both $x$ and $y$ are optimal for the primal and the dual problems respectively (see the sketch after this list).
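
In code, the dual variables and reduced costs of the two bullets above amount to one linear solve and one matrix product (a sketch; the names are mine):

```python
import numpy as np

def reduced_costs(A, c, B):
    """Return dual variables y, with y^T = c_B^T A_B^{-1}, and the
    reduced cost vector c^T - y^T A. The bfs for basis B is optimal
    iff every reduced cost is non-negative."""
    A_B = A[:, list(B)]
    y = np.linalg.solve(A_B.T, c[list(B)])  # solves A_B^T y = c_B
    return y, c - A.T @ y
```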

**Proof of the Simplex Theorem.** Let's compare the bfs $x = (x_B, x_N)$, $x_N = 0$, to a feasible solution $x'$. As $x'$ is feasible, $A_B x'_B + A_N x'_N = b$. Thus, using Assumption 3,

$$x'_B = A_B^{-1}(b - A_N x'_N).$$

So

$$c^\top x' = c_B^\top x'_B + c_N^\top x'_N = c_B^\top A_B^{-1} b + \big(c_N^\top - c_B^\top A_B^{-1} A_N\big) x'_N = c^\top x + \bar{c}_N^\top x'_N.$$

Thus, if ($*$) holds then the bfs $x$ is optimal. Conversely, by Assumptions 2 and 3, for any $x'_N \geq 0$ sufficiently small, i.e. such that $|x'_N| \leq \epsilon$ for some $\epsilon > 0$, we can choose $x'_B$ so that $x'$ is feasible. Namely, $x'_B$ is given by $x'_B = A_B^{-1}(b - A_N x'_N)$, and it is positive because it is close to $x_B > 0$. Hence, if $\bar{c}_j < 0$ for some $j \in N$, taking $x'_N = \epsilon e_j$ gives

$$c^\top x' = c^\top x + \epsilon \bar{c}_j < c^\top x.$$

This verifies that ($*$) is necessary and sufficient for optimality.

Now suppose that the bfs $x$ is not optimal, i.e. $\bar{c}_j < 0$ for some $j \in N$. This suggests we should increase the $j$-th component. Take $x(\delta) = x + \delta d$ where $d_N = e_j$, the $j$-th unit vector, and $d_B = -A_B^{-1} A_N e_j$, so that $A x(\delta) = b$ for all $\delta$. Note

$$c^\top x(\delta) = c^\top x + \delta \bar{c}_j.$$

So the larger $\delta$ is, the more $x(\delta)$ improves on $x$. The biggest $\delta$ so that $x(\delta)$ is feasible is

$$\delta^* = \min_{i \in B \,:\, d_i < 0} \frac{x_i}{-d_i}.$$

Observe, if no such $\delta^*$ exists then (LP$'$) is unbounded. Otherwise, take $i^*$ to be the argument minimizing the ratio above. Note $x(\delta^*)$ is a bfs on the basis $B' = (B \cup \{j\}) \setminus \{i^*\}$ with $c^\top x(\delta^*) \leq c^\top x$.
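
Putting the pieces of this proof together gives the core loop of the Simplex Algorithm. The following is a bare-bones sketch under Assumptions 1–3 — dense linear solves, an arbitrary entering-variable rule, no anti-cycling safeguard — and it assumes an initial bfs is already known (finding one is usually a separate "Phase I" problem):

```python
import numpy as np

def simplex(A, b, c, B, tol=1e-9):
    """Minimize c^T x s.t. Ax = b, x >= 0, starting from the bfs
    with basis B (a list of m column indices)."""
    m, n = A.shape
    B = list(B)
    while True:
        A_B = A[:, B]
        x_B = np.linalg.solve(A_B, b)            # current bfs: x_N = 0
        y = np.linalg.solve(A_B.T, c[B])         # dual variables
        cbar = c - A.T @ y                       # reduced costs
        N = [j for j in range(n) if j not in B]
        entering = [j for j in N if cbar[j] < -tol]
        if not entering:                         # condition (*) holds
            x = np.zeros(n)
            x[B] = x_B
            return x, c @ x
        j = entering[0]                          # bring j into the basis
        d = np.linalg.solve(A_B, A[:, j])        # x_B moves at rate -d
        if np.all(d <= tol):                     # no ratio test winner
            raise ValueError("LP is unbounded")
        ratios = [(x_B[i] / d[i], i) for i in range(m) if d[i] > tol]
        _, i_star = min(ratios)                  # largest feasible step
        B[i_star] = j                            # pivot: B[i_star] leaves
```

Each pass through the loop is exactly one application of Theorem 3: verify ($*$), and if it fails, pivot to a basis whose bfs is no worse.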

- When $A$ is known, we can use the term “basis” interchangeably to mean $B$ or $A_B$.

Thanks! Could you give me a good link that introduces this ‘dual’ concept in optimization? I often see it, but I don’t understand how it comes about or what it is.


Hi Naren,

I’ve updated the post on Lagrangian Optimization. If you read everything up to Theorem 2 on that post, you should have everything you need.


Thank you!
