class: center, middle, inverse, title-slide

# Gauss-Markov Theorem
## Part 2
### Dr. D’Agostino McGowan

---
layout: true

<div class="my-footer">
<span>
Dr. Lucy D'Agostino McGowan
</span>
</div>

---

## Gauss-Markov Theorem

* Least Squares is the **Best Linear Unbiased Estimator** (BLUE)

--

.question[
What does this mean?
]

--

* ⬜ **Best**: has the smallest variance

--

* ✅ **Linear**: it is linear in the observed output variables

--

* ✅ **Unbiased**: it is unbiased

--

* ✅ **Estimator**

--

.definition[
Let's prove _best_ now.
]

---

## What do we need to do?

☝️ Come up with another linear unbiased estimator of `\(\beta\)`; let's call it `\(\tilde\beta\)`

✌️ Show that this estimator has a variance that is no smaller than `\(\textrm{var}(\hat\beta|\mathbf{X})\)`

.question[
What is `\(\textrm{var}(\hat\beta|\mathbf{X})\)`?
]

---

## ☝️ A linear, unbiased estimator of `\(\beta\)`

* `\(\tilde\beta = \mathbf{C}y\)`

--

.question[
Why `\(\mathbf{C}y\)`? This is _linear_ in `\(y\)`.
]

--

* `\(\mathbf{C} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T + \mathbf{D}\)`

---

## ☝️ A linear, unbiased estimator of `\(\beta\)`

.question[
When is `\(\tilde\beta\)` unbiased?
]

--

Show that `\(E[\mathbf{C}y|\mathbf{X}] = \beta\)`

--

`$$E[\mathbf{C}y|\mathbf{X}] = E[((\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T + \mathbf{D})(\mathbf{X}\beta+\epsilon)|\mathbf{X}]$$`

--

## `Try It`

Solve for `\(E[\mathbf{C}y|\mathbf{X}]\)`.
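---

## A quick simulation check

A minimal sketch, with made-up values for `\(\mathbf{X}\)`, `\(\beta\)`, and `\(\sigma\)`, and a `\(\mathbf{D}\)` constructed so that `\(\mathbf{DX} = \mathbf{0}\)`; it assumes `\(E[\epsilon|\mathbf{X}] = 0\)`. Averaging `\(\tilde\beta = \mathbf{C}y\)` over many simulated datasets should recover `\(\beta\)`.

```{r}
set.seed(1)
n <- 100
beta <- c(1, 2)                          # made-up true coefficients
X <- cbind(1, rnorm(n))                  # design matrix: intercept + one predictor
sigma <- 3                               # made-up error SD

## any D = A(I - X(X^TX)^{-1}X^T) satisfies DX = 0
M <- diag(n) - X %*% solve(t(X) %*% X) %*% t(X)
D <- (matrix(rnorm(2 * n), nrow = 2) / 10) %*% M
max(abs(D %*% X))                        # essentially 0

C <- solve(t(X) %*% X) %*% t(X) + D      # C = (X^TX)^{-1}X^T + D

beta_tilde <- replicate(5000, {
  y <- X %*% beta + rnorm(n, sd = sigma) # y = X beta + epsilon
  as.vector(C %*% y)                     # one draw of beta-tilde = Cy
})
rowMeans(beta_tilde)                     # close to c(1, 2): unbiased
```

Any other choice of `\(\mathbf{D}\)` with `\(\mathbf{DX} = \mathbf{0}\)` would work the same way.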
---

## ☝️ A linear, unbiased estimator of `\(\beta\)`

.question[
When is `\(\tilde\beta\)` unbiased?
]

`$$\begin{align}E[\tilde\beta|\mathbf{X}] &= E[((\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T + \mathbf{D})(\mathbf{X}\beta+\epsilon)|\mathbf{X}]\\ &= \beta + \mathbf{D}\mathbf{X}\beta + ((\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T + \mathbf{D})\underbrace{E[\epsilon|\mathbf{X}]}_{0}\\ &=\mathbf{D}\mathbf{X}\beta + \beta\end{align}$$`

--

.definition[
We need `\(E[\mathbf{C}y|\mathbf{X}] = \beta\)`
]

* For `\(\tilde\beta\)` to be _unbiased_, `\(\mathbf{DX}\)` must be `\(\mathbf{0}\)`.

---

## Now let's calculate the variance

.question[
For our estimate `\(\hat\beta\)` to be _best_, what needs to be true?
]

--

`$$\textrm{var}(\hat\beta|\mathbf{X}) \leq \textrm{var}(\tilde\beta|\mathbf{X})$$`

--

`$$\textrm{var}(\tilde\beta|\mathbf{X}) = \textrm{var}(\mathbf{C}y|\mathbf{X})$$`

--

.question[
What is constant?
]

---

## Now let's calculate the variance

.question[
For our estimate `\(\hat\beta\)` to be _best_, what needs to be true?
]

`$$\textrm{var}(\hat\beta|\mathbf{X}) \leq \textrm{var}(\tilde\beta|\mathbf{X})$$`

`$$\begin{align}\textrm{var}(\tilde\beta|\mathbf{X}) &= \textrm{var}(\mathbf{C}y|\mathbf{X})\\ &= \mathbf{C}\textrm{var}(y|\mathbf{X})\mathbf{C}^T \end{align}$$`

## `Try It`

Finish solving for `\(\textrm{var}(\mathbf{C}y|\mathbf{X})\)`. _Remember `\(\mathbf{DX}=\mathbf{0}\)` for this to be unbiased._
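---

## Expanding `\(\mathbf{C}\textrm{var}(y|\mathbf{X})\mathbf{C}^T\)`

A worked sketch of the expansion, assuming `\(\textrm{var}(y|\mathbf{X}) = \sigma^2\mathbf{I}\)`:

`$$\begin{align}\mathbf{C}\textrm{var}(y|\mathbf{X})\mathbf{C}^T &= \sigma^2\mathbf{C}\mathbf{C}^T\\ &= \sigma^2\left[(\mathbf{X}^T\mathbf{X})^{-1} + (\mathbf{X}^T\mathbf{X})^{-1}(\mathbf{D}\mathbf{X})^T + \mathbf{D}\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1} + \mathbf{D}\mathbf{D}^T\right]\end{align}$$`

--

The two middle terms drop out because `\(\mathbf{DX} = \mathbf{0}\)`.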
---

## Now let's calculate the variance

.question[
For our estimate `\(\hat\beta\)` to be _best_, what needs to be true?
]

`$$\textrm{var}(\hat\beta|\mathbf{X}) \leq \textrm{var}(\tilde\beta|\mathbf{X})$$`

`$$\begin{align}\textrm{var}(\tilde\beta|\mathbf{X}) &= \textrm{var}(\mathbf{C}y|\mathbf{X})\\ &= \mathbf{C}\textrm{var}(y|\mathbf{X})\mathbf{C}^T\\ &= \sigma^2(\mathbf{X}^T\mathbf{X})^{-1}+\sigma^2\mathbf{D}\mathbf{D}^T\\ &=\textrm{var}(\hat\beta|\mathbf{X})+\sigma^2\mathbf{D}\mathbf{D}^T \end{align}$$`

--

* `\(\mathbf{DD}^T\)` is a *positive semidefinite* matrix, therefore `\(\textrm{var}(\hat\beta|\mathbf{X})\)` is **never** larger than `\(\textrm{var}(\tilde\beta|\mathbf{X})\)`

--

.definition[
Least squares is _best_ ✅
]
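---

## Seeing it numerically

A minimal sketch, rebuilding the same made-up `\(\mathbf{X}\)`, `\(\mathbf{D}\)`, and `\(\sigma\)` as in the earlier chunk and assuming `\(\textrm{var}(y|\mathbf{X}) = \sigma^2\mathbf{I}\)`: compare `\(\textrm{var}(\hat\beta|\mathbf{X}) = \sigma^2(\mathbf{X}^T\mathbf{X})^{-1}\)` with `\(\textrm{var}(\tilde\beta|\mathbf{X}) = \sigma^2\mathbf{C}\mathbf{C}^T\)`.

```{r}
set.seed(1)
n <- 100
X <- cbind(1, rnorm(n))
sigma <- 3

M <- diag(n) - X %*% solve(t(X) %*% X) %*% t(X)
D <- (matrix(rnorm(2 * n), nrow = 2) / 10) %*% M
C <- solve(t(X) %*% X) %*% t(X) + D

var_hat   <- sigma^2 * solve(t(X) %*% X)  # var(beta-hat | X)
var_tilde <- sigma^2 * C %*% t(C)         # var(beta-tilde | X)

diag(var_tilde) - diag(var_hat)           # >= 0 for every coefficient
eigen(var_tilde - var_hat)$values         # >= 0: the difference is sigma^2 DD^T, positive semidefinite
```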