Your browser (Internet Explorer 6) is out of date. It has known security flaws and may not display all features of this and other websites. Learn how to update your browser.
The Statistical View on Least Squares

Least Squares and Statistics

Last time I talked about the beautiful linear Algebra interpretation of Least Squares which adds a wonderful geometric interpretation to the analytical footwork we had to do before. This time I will borrow from a video I saw on Khan Academy a long time ago. I connects some of the elemental principles of statistics with regression.

The basic idea is similar to the analytical approach. But this time, we will only try to fit a line to $$M$$ $$x$$/$$y$$ pairs. So, we have a model of the form

\begin{equation*}y(x) = m x + c.\end{equation*}

The squared errors between model and data is then

\begin{align*}S := \sum_i \epsilon_i^2 &= \sum_i (m x_i + c - y_i)^2 \\ &= \sum_i m^2 x_i^2 + 2 c m x_i + c^2 - 2 m x_i y_i - 2 c y_i + y_i^2\end{align*}

Of course, we search for the global minima of this solution with respect to the parameters $$m$$ and $$c$$. So let's find the derivations:

\begin{align*}0 \stackrel{!}{=} \frac{\partial S}{\partial m} &= \sum_i 2 m x_i^2 + 2 c x_i - 2 x_i y_i \\ 0 \stackrel{!}{=} \frac{\partial S}{\partial c} &= \sum_i 2 m x_i + 2 c - 2 y_i\end{align*}

We can conveniently drop the 2 that appears in front of all terms. We will now rewrite these equations by using the definition of the sample mean:

\begin{equation*}\overline{x} = \frac{\sum_i x_i}{M}\end{equation*}

This gives:

\begin{align*}0 \stackrel{!}{=} m M \overline{x^2} + M c \overline{x} - M \overline{x y} \\ 0 \stackrel{!}{=} m M \overline{x} + M c - M \overline{y}\end{align*}

Let's loose the $$M$$ s and solve for $$m$$ and $$c$$.

\begin{align*}m &= \frac{\overline{x}\,\overline{y}- \overline{x y}}{\overline{x}^2 - \overline{x^2}} \\ c &= \overline{y} - m \overline{x}\end{align*}

If you look closely at the term for the slope $$m$$ you see that this is actually just the covariance of $$x$$ and $$y$$ divided by the variance of $$x$$. I find this very intriguing.

blog comments powered by Disqus