class: center, middle, inverse, title-slide

# Gauss Markov Theorem
## Part 1
### Dr. D’Agostino McGowan

---
layout: true

<div class="my-footer">
<span>
Dr. Lucy D'Agostino McGowan
</span>
</div>

---

## Gauss Markov Theorem

* Least Squares is the **Best Linear Unbiased Estimator** (BLUE)

--

.question[
What does this mean?
]

--

* ⬜ **Best**: has the smallest variance

--

* ✅ **Linear**: it is linear in the observed output variables

--

* ⬜ **Unbiased**: it is unbiased

--

* ✅ **Estimator**

--

.definition[
Let's prove _unbiasedness_ first.
]

---

## Bias

.question[
What do we want to show?
]

--

`$$E[\hat{\beta}]=\beta$$`

--

* What is `\(\hat\beta\)`?

--

`$$\hat\beta = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^Ty$$`

--

* What is `\(y\)`?

--

`$$y = \mathbf{X}\beta + \epsilon$$`

---

## Bias

`$$\begin{align}\hat\beta &= (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^Ty\\ &=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T(\mathbf{X}\beta + \epsilon) \end{align}$$`

--

## <svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 576 512"><path d="M402.6 83.2l90.2 90.2c3.8 3.8 3.8 10 0 13.8L274.4 405.6l-92.8 10.3c-12.4 1.4-22.9-9.1-21.5-21.5l10.3-92.8L388.8 83.2c3.8-3.8 10-3.8 13.8 0zm162-22.9l-48.8-48.8c-15.2-15.2-39.9-15.2-55.2 0l-35.4 35.4c-3.8 3.8-3.8 10 0 13.8l90.2 90.2c3.8 3.8 10 3.8 13.8 0l35.4-35.4c15.2-15.3 15.2-40 0-55.2zM384 346.2V448H64V128h229.8c3.2 0 6.2-1.3 8.5-3.5l40-40c7.6-7.6 2.2-20.5-8.5-20.5H48C21.5 64 0 85.5 0 112v352c0 26.5 21.5 48 48 48h352c26.5 0 48-21.5 48-48V306.2c0-10.7-12.9-16-20.5-8.5l-40 40c-2.2 2.3-3.5 5.3-3.5 8.5z"/></svg> `Try it!`

Simplify this expression
---

## Bias

`$$\begin{align}\hat\beta &= (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^Ty\\ &=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T(\mathbf{X}\beta + \epsilon)\\ \end{align}$$`

---

## Bias

`$$\begin{align}\hat\beta &= (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^Ty\\ &=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T(\mathbf{X}\beta + \epsilon)\\ &=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{X}\beta + (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\epsilon\\ \end{align}$$`

---

## Bias

`$$\begin{align}\hat\beta &= (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^Ty\\ &=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T(\mathbf{X}\beta + \epsilon)\\ &=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{X}\beta + (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\epsilon\\ & = \mathbf{I}\beta +(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\epsilon \end{align}$$`

---

## Bias

Let's first calculate `\(E[\hat{\beta}|\mathbf{X}]\)`

--

.definition[
**Note**: `\(E[\epsilon|\mathbf{X}] = 0\)`; this is an _assumption_ of least squares
]

--

`$$E[\hat{\beta}|\mathbf{X}] = E[\mathbf{I}\beta +(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\epsilon|\mathbf{X}]$$`

--

## <svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 576 512"><path d="M402.6 83.2l90.2 90.2c3.8 3.8 3.8 10 0 13.8L274.4 405.6l-92.8 10.3c-12.4 1.4-22.9-9.1-21.5-21.5l10.3-92.8L388.8 83.2c3.8-3.8 10-3.8 13.8 0zm162-22.9l-48.8-48.8c-15.2-15.2-39.9-15.2-55.2 0l-35.4 35.4c-3.8 3.8-3.8 10 0 13.8l90.2 90.2c3.8 3.8 10 3.8 13.8 0l35.4-35.4c15.2-15.3 15.2-40 0-55.2zM384 346.2V448H64V128h229.8c3.2 0 6.2-1.3 8.5-3.5l40-40c7.6-7.6 2.2-20.5-8.5-20.5H48C21.5 64 0 85.5 0 112v352c0 26.5 21.5 48 48 48h352c26.5 0 48-21.5 48-48V306.2c0-10.7-12.9-16-20.5-8.5l-40 40c-2.2 2.3-3.5 5.3-3.5 8.5z"/></svg> `Try it!`

Using the information from the previous lecture, solve this expectation. Type up your answer in an .Rmd file and submit the .html file on Canvas.
---

## Bias

**Spoiler alert**

`$$E[\hat{\beta}|\mathbf{X}] = \beta$$`

--

.question[
We want to show: `\(E[\hat\beta] = \beta\)`
]

--

Let's use the law of iterated expectations!

`$$E_X[E[\hat\beta|\mathbf{X}]]=E[\hat\beta]$$`

---

## Bias

**Spoiler alert**

`$$E[\hat{\beta}|\mathbf{X}] = \beta$$`

.question[
We want to show: `\(E[\hat\beta] = \beta\)`
]

Let's use the law of iterated expectations!

`$$\begin{align} E_X[E[\hat\beta|\mathbf{X}]]&\\ &=E_X[\beta]\\ &=\beta\\ E[\hat\beta] &= \beta \end{align}$$`

---

## Gauss Markov Theorem

* Least Squares is the **Best Linear Unbiased Estimator** (BLUE)

--

.question[
What does this mean?
]

--

* ⬜ **Best**: has the smallest variance

--

* ✅ **Linear**: it is linear in the observed output variables

--

* ✅ **Unbiased**: it is unbiased

--

* ✅ **Estimator**
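
---

## Checking unbiasedness by simulation

A quick numerical sanity check of `\(E[\hat\beta|\mathbf{X}] = \beta\)`, written as a minimal sketch rather than a proof. It assumes a design with an intercept and one predictor, errors drawn from a standard normal (so `\(E[\epsilon|\mathbf{X}] = 0\)` holds), and made-up true coefficients `\(\beta = (2, 0.5)\)`. The helper names `ols_fit` and `simulate` are hypothetical, and Python is used only for illustration.

```python
import random


def ols_fit(x, y):
    """Compute beta_hat = (X^T X)^{-1} X^T y for X = [1, x] (intercept + one predictor)."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy]; invert the 2x2 matrix by hand.
    det = n * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


def simulate(beta=(2.0, 0.5), n=50, reps=5000, seed=1):
    """Average beta_hat over many error draws while holding X fixed."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]  # X is drawn once, then held fixed
    total0 = total1 = 0.0
    for _ in range(reps):
        # y = X beta + epsilon, with epsilon ~ N(0, 1) (an assumed error distribution)
        y = [beta[0] + beta[1] * xi + rng.gauss(0, 1) for xi in x]
        b0, b1 = ols_fit(x, y)
        total0 += b0
        total1 += b1
    return total0 / reps, total1 / reps


mean_b0, mean_b1 = simulate()
print(mean_b0, mean_b1)  # both averages should land close to the assumed beta
```

Any single `\(\hat\beta\)` misses `\(\beta\)`, but the average over repeated error draws should settle near it. That is what unbiasedness claims.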