We consider the following formulation of Lai, Robbins and Wei (1979), and Lai and Wei (1982). Consider the following regression problem,

$$y_n = \beta^\top x_n + \epsilon_n$$

for $n = 1, 2, \dots$, where $\epsilon_1, \epsilon_2, \dots$ are unobservable random errors and $\beta = (\beta_1, \dots, \beta_p)^\top$ are unknown parameters.

Typically for a regression problem, it is assumed that the inputs $x_n$ are given and the errors $\epsilon_n$ are IID random variables. However, we now want to consider a setting where we sequentially choose inputs $x_n$ and then get observations $y_n$, and the errors $(\epsilon_n)_{n \geq 1}$ are a martingale difference sequence with respect to the filtration $(\mathcal{F}_n)_{n \geq 0}$ generated by $(x_1, y_1, \dots, x_n, y_n, x_{n+1})$. That is, each input $x_n$ is $\mathcal{F}_{n-1}$-measurable and $\mathbb{E}[\epsilon_n \mid \mathcal{F}_{n-1}] = 0$.

We let $X_n$ be the $n \times p$ matrix of inputs, whose $t$-th row is $x_t^\top$, and $Y_n = (y_1, \dots, y_n)^\top$ be the vector of outputs. Further we let $\hat\beta_n$ be the least squares estimate of $\beta$ given $X_n$ and $Y_n$.

The following result gives a condition on the eigenvalues of the design matrix $V_n := X_n^\top X_n$ for $\hat\beta_n$ to converge to $\beta$ and also gives a rate of convergence.

**Theorem.** If $\lambda_{\min}(n)$ and $\lambda_{\max}(n)$ are, respectively, the minimum and maximum eigenvalues of the design matrix $V_n$, and if we assume that for some $\alpha > 2$,

$$\sup_n \mathbb{E}\big[\, |\epsilon_n|^\alpha \mid \mathcal{F}_{n-1} \big] < \infty \qquad \text{almost surely},$$

then whenever we have

$$\lambda_{\min}(n) \to \infty \quad \text{and} \quad \log \lambda_{\max}(n) = o\big(\lambda_{\min}(n)\big) \qquad \text{almost surely},$$

$\hat\beta_n$ converges to $\beta$ almost surely and

$$\big\|\hat\beta_n - \beta\big\|^2 = O\bigg(\frac{\log \lambda_{\max}(n)}{\lambda_{\min}(n)}\bigg) \qquad \text{almost surely}.$$

In what follows, $\|x\|$ is the Euclidean norm for a vector $x$ and $\|A\| = \max_{\|x\| = 1} \|Ax\|$ for a matrix $A$. (Note that it is well known that the maximum eigenvalue of $X_n^\top X_n$ equals $\|X_n\|^2$ and that $\|V_n^{-1}\| = \lambda_{\min}(n)^{-1}$.)
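Before turning to the proof, here is a quick numerical illustration of the theorem (my own addition, not from the papers, using numpy). With random Gaussian inputs and IID errors (a special case of a martingale difference sequence), $\lambda_{\min}(n)$ grows linearly in $n$, so the bound above forces the squared error to zero:

```python
import numpy as np

# Simulate the regression y_t = beta' x_t + eps_t with IID errors and
# compare ||beta_hat - beta||^2 with the rate log(lam_max) / lam_min.
rng = np.random.default_rng(0)
p, n = 3, 5000
beta = np.array([1.0, -2.0, 0.5])              # unknown parameters

X = rng.normal(size=(n, p))                    # inputs x_1, ..., x_n as rows
y = X @ beta + rng.normal(size=n)              # y_t = beta' x_t + eps_t

V = X.T @ X                                    # design matrix V_n = X_n' X_n
eigs = np.linalg.eigvalsh(V)
lam_min, lam_max = eigs[0], eigs[-1]

beta_hat = np.linalg.solve(V, X.T @ y)         # least squares estimate
err_sq = float(np.sum((beta_hat - beta) ** 2)) # ||beta_hat - beta||^2

print(err_sq, np.log(lam_max) / lam_min)
```

Here $\lambda_{\min}(n)$ and $\lambda_{\max}(n)$ are both of order $n$, so the ratio $\log \lambda_{\max}(n) / \lambda_{\min}(n)$ is of order $(\log n)/n$.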

**Outline of proof.** The least squares estimate for the above regression problem is given by

$$\hat\beta_n = \big(X_n^\top X_n\big)^{-1} X_n^\top Y_n.$$

So, writing $Y_n = X_n \beta + E_n$ with $E_n = (\epsilon_1, \dots, \epsilon_n)^\top$,

$$\hat\beta_n - \beta = V_n^{-1} S_n \qquad \text{where} \qquad S_n = X_n^\top E_n = \sum_{t=1}^n x_t \epsilon_t.$$

To prove the above theorem first note that

$$\big\|\hat\beta_n - \beta\big\|^2 = S_n^\top V_n^{-2} S_n \leq \big\|V_n^{-1}\big\| \, S_n^\top V_n^{-1} S_n = \frac{S_n^\top V_n^{-1} S_n}{\lambda_{\min}(n)}.$$
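As a quick sanity check (not from the original), the error decomposition $\hat\beta_n - \beta = V_n^{-1} S_n$ is easy to verify numerically:

```python
import numpy as np

# Verify the algebraic identity beta_hat - beta = V_n^{-1} S_n,
# with S_n = X_n' E_n the sum of the inputs weighted by the errors.
rng = np.random.default_rng(1)
p, n = 4, 200
beta = rng.normal(size=p)

X = rng.normal(size=(n, p))
eps = rng.normal(size=n)
y = X @ beta + eps

V = X.T @ X                                   # V_n = X_n' X_n
S = X.T @ eps                                 # S_n = sum_t x_t eps_t
beta_hat = np.linalg.solve(V, X.T @ y)        # least squares estimate

lhs = beta_hat - beta
rhs = np.linalg.solve(V, S)                   # V_n^{-1} S_n
print(np.max(np.abs(lhs - rhs)))
```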

For the inequality above we apply the Cauchy–Schwarz inequality. We bound $S_n^\top V_n^{-1} S_n$ using the Sherman–Morrison formula. Specifically we will show that

$$S_n^\top V_n^{-1} S_n = O(a_n)$$

where $(a_n)$ is some positive increasing sequence. So since

$$\big\|\hat\beta_n - \beta\big\|^2 \leq \frac{S_n^\top V_n^{-1} S_n}{\lambda_{\min}(n)},$$

convergence is determined by the rate of growth of the sequence

$$a_n = \sum_{t=1}^n x_t^\top V_t^{-1} x_t,$$

which, with some linear algebra, can be bounded by $O\big(\log \lambda_{\max}(n)\big)$. Thus we arrive at a bound of the form

$$\big\|\hat\beta_n - \beta\big\|^2 = O\bigg(\frac{\log \lambda_{\max}(n)}{\lambda_{\min}(n)}\bigg).$$
In what follows we must study the asymptotic behaviour of $S_n^\top V_n^{-1} S_n$. What we will show is

**Proposition.** Almost surely,

$$S_n^\top V_n^{-1} S_n = O\big(\log \lambda_{\max}(n)\big).$$

**Proof.** To prove this proposition we will require some lemmas, such as the Sherman–Morrison formula. These are stated and proven after the proof of this result.

The Sherman–Morrison formula states that:

$$\big(A + u v^\top\big)^{-1} = A^{-1} - \frac{A^{-1} u v^\top A^{-1}}{1 + v^\top A^{-1} u}.$$

Note that

$$S_n = S_{n-1} + x_n \epsilon_n \qquad \text{and} \qquad V_n = V_{n-1} + x_n x_n^\top.$$

Thus

$$S_n^\top V_n^{-1} S_n = S_{n-1}^\top V_n^{-1} S_{n-1} + 2 \epsilon_n \, x_n^\top V_n^{-1} S_{n-1} + \epsilon_n^2 \, x_n^\top V_n^{-1} x_n$$

and, applying the Sherman–Morrison formula to $V_n = V_{n-1} + x_n x_n^\top$,

$$S_{n-1}^\top V_n^{-1} S_{n-1} = S_{n-1}^\top V_{n-1}^{-1} S_{n-1} - \frac{\big(x_n^\top V_{n-1}^{-1} S_{n-1}\big)^2}{1 + x_n^\top V_{n-1}^{-1} x_n}.$$

Thus summing and rearranging a little,

$$S_n^\top V_n^{-1} S_n + \sum_{t=1}^n \frac{\big(x_t^\top V_{t-1}^{-1} S_{t-1}\big)^2}{1 + x_t^\top V_{t-1}^{-1} x_t} = \sum_{t=1}^n 2 \epsilon_t \, x_t^\top V_t^{-1} S_{t-1} + \sum_{t=1}^n \epsilon_t^2 \, x_t^\top V_t^{-1} x_t.$$

Notice in the above, the first summation (before the equals sign) only acts to decrease $S_n^\top V_n^{-1} S_n$, while on the right hand side, the first term is a sum of martingale differences and the second term is a sum of quadratic forms.
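This summed identity can be tested numerically. Below is a small numpy sketch of my own (it takes $V_0 = I$ so that every $V_t$ is invertible, a point the outline glosses over):

```python
import numpy as np

# Verify: S_n' V_n^{-1} S_n + sum of decreasing terms
#         = sum of martingale difference terms + sum of quadratic form terms.
rng = np.random.default_rng(2)
p, n = 3, 50

V = np.eye(p)                     # V_0 = I (assumption for invertibility)
S = np.zeros(p)                   # S_0 = 0
neg_sum = mds_sum = quad_sum = 0.0

for _ in range(n):
    x = rng.normal(size=p)
    eps = rng.normal()
    Vinv_prev = np.linalg.inv(V)
    # decreasing term: (x_t' V_{t-1}^{-1} S_{t-1})^2 / (1 + x_t' V_{t-1}^{-1} x_t)
    neg_sum += (x @ Vinv_prev @ S) ** 2 / (1 + x @ Vinv_prev @ x)
    V = V + np.outer(x, x)        # V_t = V_{t-1} + x_t x_t'
    Vinv = np.linalg.inv(V)
    mds_sum += 2 * eps * (x @ Vinv @ S)      # martingale difference terms
    quad_sum += eps ** 2 * (x @ Vinv @ x)    # quadratic form terms
    S = S + x * eps               # S_t = S_{t-1} + x_t eps_t

lhs = S @ np.linalg.inv(V) @ S + neg_sum
rhs = mds_sum + quad_sum
print(lhs, rhs)
```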

Now because the above sum of martingale differences is a martingale, the martingale strong law of large numbers gives

$$\sum_{t=1}^n 2 \epsilon_t \, x_t^\top V_t^{-1} S_{t-1} = o\Bigg(\sum_{t=1}^n \mathbb{E}\Big[\big(\epsilon_t \, x_t^\top V_t^{-1} S_{t-1}\big)^2 \,\Big|\, \mathcal{F}_{t-1}\Big]\Bigg) + O(1) = o\Bigg(\sum_{t=1}^n \frac{\big(x_t^\top V_{t-1}^{-1} S_{t-1}\big)^2}{1 + x_t^\top V_{t-1}^{-1} x_t}\Bigg) + O(1).$$

In the second equality above, we use that $x_t$, $V_t$ and $S_{t-1}$ are $\mathcal{F}_{t-1}$-measurable, that the conditional second moments of the errors are bounded, and that, by the Sherman–Morrison formula, $x_t^\top V_t^{-1} S_{t-1} = \big(x_t^\top V_{t-1}^{-1} S_{t-1}\big)\big/\big(1 + x_t^\top V_{t-1}^{-1} x_t\big)$. This term is then absorbed by the first summation on the left hand side. Thus we have that

$$S_n^\top V_n^{-1} S_n \leq O(1) + \sum_{t=1}^n \epsilon_t^2 \, x_t^\top V_t^{-1} x_t.$$
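The Sherman–Morrison consequence used here, $x_t^\top V_t^{-1} S_{t-1} = \big(x_t^\top V_{t-1}^{-1} S_{t-1}\big)\big/\big(1 + x_t^\top V_{t-1}^{-1} x_t\big)$, can be checked numerically (an illustration of mine, not from the papers):

```python
import numpy as np

# Check: x' (A + x x')^{-1} s == (x' A^{-1} s) / (1 + x' A^{-1} x),
# where A plays the role of V_{t-1}, x of x_t and s of S_{t-1}.
rng = np.random.default_rng(3)
p = 4
B = rng.normal(size=(p, p))
A = B @ B.T + np.eye(p)           # a positive definite matrix, like V_{t-1}
x = rng.normal(size=p)
s = rng.normal(size=p)

lhs = x @ np.linalg.solve(A + np.outer(x, x), s)
rhs = (x @ np.linalg.solve(A, s)) / (1 + x @ np.linalg.solve(A, x))
print(lhs, rhs)
```

Since $A$ is positive definite the denominator $1 + x^\top A^{-1} x$ is always greater than one, so the division is safe.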

By Lemma 2 (below) we have that

$$\sum_{t=1}^n x_t^\top V_t^{-1} x_t \leq p \log \lambda_{\max}(n) + O(1),$$

and, using the moment condition $\sup_n \mathbb{E}[|\epsilon_n|^\alpha \mid \mathcal{F}_{n-1}] < \infty$ with $\alpha > 2$, the same order of bound carries over to the weighted sum $\sum_{t=1}^n \epsilon_t^2 \, x_t^\top V_t^{-1} x_t$. Thus we have that

$$S_n^\top V_n^{-1} S_n = O\big(\log \lambda_{\max}(n)\big)$$

almost surely, as required.

**Lemma 1** [Sherman–Morrison Formula] For an invertible matrix $A \in \mathbb{R}^{d \times d}$ and two vectors $u$ and $v$ in $\mathbb{R}^d$ with $1 + v^\top A^{-1} u \neq 0$,

$$\big(A + u v^\top\big)^{-1} = A^{-1} - \frac{A^{-1} u v^\top A^{-1}}{1 + v^\top A^{-1} u}.$$

**Proof.** Recalling that the outer-product of two vectors $u$ and $v$ is the matrix $u v^\top$, it holds that

$$\big(u v^\top\big)\big(u v^\top\big) = \big(v^\top u\big)\, u v^\top.$$

(Nb. This is matrix multiplication: each column of $u v^\top$ is a constant times $u$ and each row is a constant times $v^\top$, so the dot product $v^\top u$ comes out.)

Using this identity note that

$$\big(I + u v^\top\big)\bigg(I - \frac{u v^\top}{1 + v^\top u}\bigg) = I + u v^\top - \frac{u v^\top + \big(v^\top u\big)\, u v^\top}{1 + v^\top u} = I.$$

So $\big(I + u v^\top\big)^{-1} = I - \frac{u v^\top}{1 + v^\top u}$. Now letting $\tilde{u} = A^{-1} u$,

$$\big(A + u v^\top\big)^{-1} = \big(I + \tilde{u} v^\top\big)^{-1} A^{-1} = \bigg(I - \frac{\tilde{u} v^\top}{1 + v^\top \tilde{u}}\bigg) A^{-1} = A^{-1} - \frac{A^{-1} u v^\top A^{-1}}{1 + v^\top A^{-1} u},$$

as required.
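For the sceptical reader, here is a direct numerical check of Lemma 1 (illustrative only):

```python
import numpy as np

# Compare the Sherman-Morrison update against a direct matrix inverse.
rng = np.random.default_rng(4)
d = 5
A = rng.normal(size=(d, d)) + d * np.eye(d)   # a well-conditioned invertible matrix
u, v = rng.normal(size=d), rng.normal(size=d)

Ainv = np.linalg.inv(A)
sm = Ainv - (Ainv @ np.outer(u, v) @ Ainv) / (1 + v @ Ainv @ u)
direct = np.linalg.inv(A + np.outer(u, v))
print(np.max(np.abs(sm - direct)))
```

The practical appeal of the formula is that, given $A^{-1}$, the rank-one update costs $O(d^2)$ rather than the $O(d^3)$ of a fresh inversion.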

The following lemma in some sense repeatedly analyses the change in the determinant under the Sherman–Morrison update.

**Lemma 2.** If $x_1, x_2, \dots$ are a sequence of vectors in $\mathbb{R}^p$ and we let $V_n = V_0 + \sum_{t=1}^n x_t x_t^\top$, where $V_0$ is positive definite, then

$$\sum_{t=1}^n x_t^\top V_t^{-1} x_t \leq \log \frac{\det V_n}{\det V_0} \leq p \log \lambda_{\max}(n) + O(1),$$

where $\lambda_{\max}(n)$ denotes the maximum eigenvalue of $V_n$.

**Proof.** First note that if $V_n = V_{n-1} + x_n x_n^\top$ then, as follows from the Sherman–Morrison formula,

$$x_n^\top V_n^{-1} x_n = \frac{x_n^\top V_{n-1}^{-1} x_n}{1 + x_n^\top V_{n-1}^{-1} x_n},$$

while, by the matrix determinant lemma, $\det V_n = \big(1 + x_n^\top V_{n-1}^{-1} x_n\big) \det V_{n-1}$. Thus

$$x_n^\top V_n^{-1} x_n = 1 - \frac{\det V_{n-1}}{\det V_n},$$

which should remind you of the derivative of the logarithm. (Also note that this tells us that the determinant is increasing and that $x_n^\top V_n^{-1} x_n \leq 1$.) If we apply this to the above sum and apply the concavity of the logarithm, in the form $1 - u \leq -\log u$ for $u > 0$,

$$\sum_{t=1}^n x_t^\top V_t^{-1} x_t = \sum_{t=1}^n \bigg(1 - \frac{\det V_{t-1}}{\det V_t}\bigg) \leq \sum_{t=1}^n \log \frac{\det V_t}{\det V_{t-1}} = \log \frac{\det V_n}{\det V_0}.$$

Since $\det V_n$ is the product of all eigenvalues of $V_n$, we have $\det V_n \leq \lambda_{\max}(n)^p$. So we see that

$$\sum_{t=1}^n x_t^\top V_t^{-1} x_t \leq p \log \lambda_{\max}(n) - \log \det V_0 = p \log \lambda_{\max}(n) + O(1),$$

as required.
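Again the lemma is easy to test numerically; the sketch below (my own addition, taking $V_0 = I$) checks both the exact determinant identity and the telescoped bound:

```python
import numpy as np

# Check, step by step:
#   x_t' V_t^{-1} x_t == 1 - det(V_{t-1}) / det(V_t)        (exact identity)
#   sum_t x_t' V_t^{-1} x_t <= log det(V_n) - log det(V_0)  (telescoped bound)
rng = np.random.default_rng(5)
p, n = 3, 40

V = np.eye(p)            # V_0 = I, so log det(V_0) = 0
total = 0.0
max_dev = 0.0
for _ in range(n):
    x = rng.normal(size=p)
    det_prev = np.linalg.det(V)
    V = V + np.outer(x, x)
    q = x @ np.linalg.solve(V, x)               # x_t' V_t^{-1} x_t
    max_dev = max(max_dev, abs(q - (1 - det_prev / np.linalg.det(V))))
    total += q

log_det = np.linalg.slogdet(V)[1]               # log det(V_n)
lam_max = np.linalg.eigvalsh(V)[-1]             # maximum eigenvalue of V_n
print(total, log_det, p * np.log(lam_max))
```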

**References**

This is based on reading:

T. L. Lai, H. Robbins and C. Z. Wei, “Strong consistency of least squares estimates in multiple regression II”, Journal of Multivariate Analysis, Volume 9, Issue 3, 1979, Pages 343–361.

T. L. Lai and C. Z. Wei, “Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems”, Annals of Statistics, Volume 10, Issue 1, 1982, Pages 154–166. doi:10.1214/aos/1176345697.