We consider one of the simplest iterative procedures for solving the (unconstrainted) optimization
where is a convex function twice differentiable function. For fixed
If , the gradient is most negative when . Thus, starting from define the gradient descent algorithm:
- We assume the Hessian matrix of at satisfies , for some .So, by a Taylor expansion
- We assume . So the right-hand inequality above guarantees that for all .
- We assume there is a unique finite solution to our optimization.1
We now see our algorithm converges exponentially fast to its solution, so-called linear convergence.
Given the assumptions above there exists and such that
Rewriting , we have
Using lower bound on the Hessian, we bound with the following Taylor expansion
ombining the last two inequalities, we have as required,
Finally for , we reapply our Hessian lower-bound assumption and use the above bound for :
Projected Gradient Descent
We consider a simple iterative procedure for solving a constrained optimization
where is a convex function twice differentiable function and where is some non-empty closed convex set, eg. . Like in Section [GD]we want to follow the steepest descent direction . However, such points need not belong to . Thus we consider the projection of on to :
and then, from , we define the projected gradient descent by
- After projecting it need not be true that . Thus we adjusted the step-size and our proof will study the distance between and the optimal solution rather than the gap .
• The key observation is that making a projection cannot increase distances
For all , we must have , i.e. the plane separates from . If this were not true, then we would have a contradiction, in particular, there would be a point on the line joining and that is closer to . We thus have
Adding together and then applying Cauchy-Schwartz implies
If the gradient of iterates are bounded, then
Hence if , and then .
Recurring the above expression yields Rearranging the above gives the required result.
- This assumption is not really needed, but it saves space.↩