- Dynamic Programs; Bellman’s Equation; An example.
For this section, consider the following dynamic programming formulation:
Time is discrete, $t = 0, 1, \dots, T$;
$x_t$ is the state at time $t$;
$a_t$ is the action at time $t$.
Def 1 [Plant Equation] The state evolves according to functions $f_t$. Here

$$x_{t+1} = f_t(x_t, a_t).$$

This is called the Plant Equation.
A policy $\pi$ chooses an action $a_t = \pi_t(x_t)$ at each time $t$. The (instantaneous) reward for taking action $a_t$ in state $x_t$ at time $t$ is $r_t(x_t, a_t)$, and $r_T(x_T)$ is the reward for terminating in state $x_T$ at time $T$.
Def 2 [Dynamic Program] Given initial state $x_0$, a dynamic program is the optimization

$$W(x_0) = \max_\pi \left\{ \sum_{t=0}^{T-1} r_t(x_t, a_t) + r_T(x_T) \right\}.$$

Further, let $R_\tau(x_\tau)$ (resp. $W_\tau(x_\tau)$) be the objective (resp. optimal objective) for when the summation is started from $t = \tau$, rather than $t = 0$.
Ex 1 [Bellman’s Equation] $W_T(x) = r_T(x)$ and for $t = T-1, \dots, 0$,

$$W_t(x) = \max_a \left\{ r_t(x, a) + W_{t+1}(x') \right\},$$

where $x' = f_t(x, a)$ and the maximum is taken over the actions available in state $x$ at time $t$.
Ex 2 An investor has a fund: it has $x_0$ pounds at time zero; money can’t be withdrawn; it pays interest at rate $\rho$ per year for $T$ years; each year the investor consumes a proportion $a_t$ of the interest and reinvests the rest.
What should the investor do to maximize total consumption?
Answers
Ans 1 Let $W_t(x_t)$ be the optimal objective from time $t$ onward. Splitting off the first reward term, and noting that the remaining maximization depends on the past only through $x_{t+1} = f_t(x_t, a_t)$,

$$W_t(x_t) = \max_\pi \left\{ r_t(x_t, a_t) + \sum_{\tau=t+1}^{T-1} r_\tau(x_\tau, a_\tau) + r_T(x_T) \right\} = \max_{a_t} \left\{ r_t(x_t, a_t) + W_{t+1}(f_t(x_t, a_t)) \right\}.$$
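Bellman’s equation turns the optimization into a backward recursion, which is straightforward to implement. Below is a minimal Python sketch for a finite state and action space; the names `states`, `actions`, `plant`, `reward` and `terminal_reward` are illustrative placeholders, not from the notes.

```python
# A minimal backward-induction sketch for the finite-horizon dynamic
# program above. All names (states, actions, plant, reward,
# terminal_reward) are illustrative placeholders, not from the notes.

def backward_induction(states, actions, plant, reward, terminal_reward, T):
    """Compute W[t][x] and a greedy policy pi[t][x] from the recursion
    W_t(x) = max_a { r_t(x, a) + W_{t+1}(f_t(x, a)) }, with W_T(x) = r_T(x)."""
    W = [dict() for _ in range(T + 1)]
    pi = [dict() for _ in range(T)]
    for x in states:
        W[T][x] = terminal_reward(x)            # W_T(x) = r_T(x)
    for t in range(T - 1, -1, -1):              # t = T-1, ..., 0
        for x in states:
            best_a, best_v = None, float("-inf")
            for a in actions(t, x):
                v = reward(t, x, a) + W[t + 1][plant(t, x, a)]
                if v > best_v:
                    best_a, best_v = a, v
            W[t][x], pi[t][x] = best_v, best_a
    return W, pi
```

An optimal policy is then read off as $a_t = \pi_t(x_t)$ along the trajectory generated by the plant equation.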
Ans 2 The yearly fund value satisfies $x_{t+1} = x_t(1 + \rho(1 - a_t))$, and consuming a proportion $a_t$ of the interest gives reward $a_t \rho x_t$. We apply backward induction: if the optimal reward from time $t+1$ onward is $W_{t+1}(x) = s_{t+1} x$, then

$$W_t(x) = \max_{a \in [0,1]} \left\{ a \rho x + s_{t+1} x \big(1 + \rho(1 - a)\big) \right\} = x \left[ s_{t+1}(1 + \rho) + \rho \max_{a \in [0,1]} a (1 - s_{t+1}) \right],$$

where

$$s_t = \begin{cases} s_{t+1} + \rho & \text{if } s_{t+1} \le 1 \text{ (optimal } a_t = 1\text{)},\\ s_{t+1}(1 + \rho) & \text{if } s_{t+1} > 1 \text{ (optimal } a_t = 0\text{)}. \end{cases}$$

Also, $W_T(x) = 0$ since money can’t be withdrawn, so $s_T = 0$, i.e. in the last year consume everything. Iterating backwards, $s_{t+1} \le 1$ holds exactly in the last $\lfloor \rho^{-1} \rfloor + 1$ years. Therefore the solution is to consume everything in those final years and
otherwise consume nothing, i.e. initially save and then consume in the last $\lfloor \rho^{-1} \rfloor + 1$ years.
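As a sanity check, the $s_t$ recursion can be run numerically. The following Python sketch uses the illustrative parameters $\rho = 0.1$ and $T = 25$ (assumptions, not from the notes) and recovers the save-then-consume structure.

```python
# Numeric check of Ans 2: run the recursion for s_t backwards from
# s_T = 0. The parameters rho = 0.1 and T = 25 are illustrative
# assumptions, not from the notes.

rho, T = 0.1, 25

s = [0.0] * (T + 1)                  # s_T = 0: nothing can be withdrawn at T
policy = [0.0] * T
for t in range(T - 1, -1, -1):
    if s[t + 1] <= 1:                # consume all the interest: a_t = 1
        policy[t], s[t] = 1.0, s[t + 1] + rho
    else:                            # reinvest all the interest: a_t = 0
        policy[t], s[t] = 0.0, s[t + 1] * (1 + rho)

print(policy)        # 14 zeros then 11 ones: save first, consume at the end
print(sum(policy))   # 11.0 consuming years = floor(1/rho) + 1
```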