class: center, middle, inverse, title-slide

# Generalized Least Squares

### Dr. D’Agostino McGowan

---
layout: true

<div class="my-footer">
<span>
Dr. Lucy D'Agostino McGowan
</span>
</div>

---

## GLS

* So far we have assumed `\(\textrm{var}(\epsilon)=\sigma^2\mathbf{I}\)`

--

.question[
When is this assumption wrong?
]

--

* Sometimes the errors have **non-constant variance**

--

.question[
How do you know?
]

--

* Sometimes the errors are **correlated**

---

## GLS

* Instead of assuming `\(\textrm{var}(\epsilon) = \sigma^2\mathbf{I}\)`, you can assume

`$$\textrm{var}(\epsilon)=\sigma^2\Sigma$$`

--

* `\(\sigma^2\)` is unknown (the absolute scale of the variation)

--

* `\(\Sigma\)` is known (the correlation and relative variance between the errors)

---

## GLS

* We can write `\(\Sigma = \mathbf{SS}^T\)` where `\(\mathbf{S}\)` is a triangular matrix

--

* _This is done using the Choleski decomposition, which is akin to taking the square root of a matrix_

--

* Then we can do a **transformation** of our original least squares model

--

`$$\begin{align}y &=\mathbf{X}\beta+\epsilon\\ \end{align}$$`

---

## GLS

* We can write `\(\Sigma = \mathbf{SS}^T\)` where `\(\mathbf{S}\)` is a triangular matrix

* _This is done using the Choleski decomposition, which is akin to taking the square root of a matrix_

* Then we can do a **transformation** of our original least squares model

`$$\begin{align}y &=\mathbf{X}\beta+\epsilon\\ \mathbf{S}^{-1}y &= \mathbf{S}^{-1}\mathbf{X}\beta+\mathbf{S}^{-1}\epsilon\\ \end{align}$$`

---

## GLS

* We can write `\(\Sigma = \mathbf{SS}^T\)` where `\(\mathbf{S}\)` is a triangular matrix

* _This is done using the Choleski decomposition, which is akin to taking the square root of a matrix_

* Then we can do a **transformation** of our original least squares model

`$$\begin{align}y &=\mathbf{X}\beta+\epsilon\\ \mathbf{S}^{-1}y &= \mathbf{S}^{-1}\mathbf{X}\beta+\mathbf{S}^{-1}\epsilon\\ y'&=\mathbf{X}'\beta + \epsilon' \end{align}$$`

---

## GLS

* Let's look at some properties of `\(y'=\mathbf{X}'\beta + \epsilon'\)`

--

`$$\textrm{var}(\epsilon')=\sigma^2\mathbf{I}$$`

--

* We've returned to the ordinary least squares problem if we regress `\(y'\)` on `\(\mathbf{X}'\)` (yay!)
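---

## GLS in R (a sketch)

A minimal R sketch of the transformation just described, on simulated data. The design, the diagonal `\(\Sigma\)`, and all object names below are hypothetical choices for illustration, not the lecture's example.

```r
set.seed(1)
n <- 100
x <- runif(n)
X <- cbind(1, x)                     # design matrix with an intercept column

## a known (diagonal) Sigma: error variance grows with x
Sigma <- diag(1 + 2 * x)

## simulate y = X beta + epsilon with var(epsilon) = sigma^2 * Sigma
beta  <- c(2, 5)
sigma <- 0.5
eps   <- rnorm(n, mean = 0, sd = sigma * sqrt(diag(Sigma)))
y     <- drop(X %*% beta) + eps

## Choleski factor: chol() returns upper-triangular R with Sigma = R^T R,
## so S = t(R) gives Sigma = S S^T
S    <- t(chol(Sigma))
Sinv <- solve(S)

## transform, then fit ordinary least squares on the transformed data
y_t <- drop(Sinv %*% y)
X_t <- Sinv %*% X
fit <- lm(y_t ~ X_t - 1)             # -1: X_t already contains the intercept column
coef(fit)                            # the GLS estimate of beta
```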
---
class: inverse

## <svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 576 512"><path d="M402.6 83.2l90.2 90.2c3.8 3.8 3.8 10 0 13.8L274.4 405.6l-92.8 10.3c-12.4 1.4-22.9-9.1-21.5-21.5l10.3-92.8L388.8 83.2c3.8-3.8 10-3.8 13.8 0zm162-22.9l-48.8-48.8c-15.2-15.2-39.9-15.2-55.2 0l-35.4 35.4c-3.8 3.8-3.8 10 0 13.8l90.2 90.2c3.8 3.8 10 3.8 13.8 0l35.4-35.4c15.2-15.3 15.2-40 0-55.2zM384 346.2V448H64V128h229.8c3.2 0 6.2-1.3 8.5-3.5l40-40c7.6-7.6 2.2-20.5-8.5-20.5H48C21.5 64 0 85.5 0 112v352c0 26.5 21.5 48 48 48h352c26.5 0 48-21.5 48-48V306.2c0-10.7-12.9-16-20.5-8.5l-40 40c-2.2 2.3-3.5 5.3-3.5 8.5z"/></svg> `Content Assessment`

* Show that

`$$\textrm{var}(\epsilon')=\sigma^2\mathbf{I}$$`

---

## GLS

* Let's look at some properties of `\(y'=\mathbf{X}'\beta + \epsilon'\)`

--

* The residual sum of squares is:

--

`$$\begin{align}RSS&=(y' - \mathbf{X}'\beta)^T(y' - \mathbf{X}'\beta)\\ \end{align}$$`

---

## GLS

* Let's look at some properties of `\(y'=\mathbf{X}'\beta + \epsilon'\)`

* The residual sum of squares is:

`$$\begin{align}RSS&=(y' - \mathbf{X}'\beta)^T(y' - \mathbf{X}'\beta)\\ &=(\mathbf{S}^{-1}y - \mathbf{S}^{-1}\mathbf{X}\beta)^T(\mathbf{S}^{-1}y - \mathbf{S}^{-1}\mathbf{X}\beta)\\ \end{align}$$`

---

## GLS

* Let's look at some properties of `\(y'=\mathbf{X}'\beta + \epsilon'\)`

* The residual sum of squares is:

`$$\begin{align}RSS&=(y' - \mathbf{X}'\beta)^T(y' - \mathbf{X}'\beta)\\ &=(\mathbf{S}^{-1}y - \mathbf{S}^{-1}\mathbf{X}\beta)^T(\mathbf{S}^{-1}y - \mathbf{S}^{-1}\mathbf{X}\beta)\\ &=(y - \mathbf{X}\beta)^T\mathbf{S}^{-T}\mathbf{S}^{-1}(y-\mathbf{X}\beta)\\ \end{align}$$`

---

## GLS

* Let's look at some properties of `\(y'=\mathbf{X}'\beta + \epsilon'\)`

* The residual sum of squares is:

`$$\begin{align}RSS&=(y' - \mathbf{X}'\beta)^T(y' - \mathbf{X}'\beta)\\ &=(\mathbf{S}^{-1}y - \mathbf{S}^{-1}\mathbf{X}\beta)^T(\mathbf{S}^{-1}y - \mathbf{S}^{-1}\mathbf{X}\beta)\\ &=(y - \mathbf{X}\beta)^T\mathbf{S}^{-T}\mathbf{S}^{-1}(y-\mathbf{X}\beta)\\ &=(y - \mathbf{X}\beta)^T\Sigma^{-1}(y-\mathbf{X}\beta) \end{align}$$`

---
class: inverse

## <svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 576 512"><path d="M402.6 83.2l90.2 90.2c3.8 3.8 3.8 10 0 13.8L274.4 405.6l-92.8 10.3c-12.4 1.4-22.9-9.1-21.5-21.5l10.3-92.8L388.8 83.2c3.8-3.8 10-3.8 13.8 0zm162-22.9l-48.8-48.8c-15.2-15.2-39.9-15.2-55.2 0l-35.4 35.4c-3.8 3.8-3.8 10 0 13.8l90.2 90.2c3.8 3.8 10 3.8 13.8 0l35.4-35.4c15.2-15.3 15.2-40 0-55.2zM384 346.2V448H64V128h229.8c3.2 0 6.2-1.3 8.5-3.5l40-40c7.6-7.6 2.2-20.5-8.5-20.5H48C21.5 64 0 85.5 0 112v352c0 26.5 21.5 48 48 48h352c26.5 0 48-21.5 48-48V306.2c0-10.7-12.9-16-20.5-8.5l-40 40c-2.2 2.3-3.5 5.3-3.5 8.5z"/></svg> `Content Assessment`

Minimize `\((y - \mathbf{X}\beta)^T\Sigma^{-1}(y-\mathbf{X}\beta)\)` with respect to `\(\beta\)` to calculate the estimate `\(\hat\beta\)`

[*Remember how we did this for least squares in the **Matrix Review** Lecture*](https://www.youtube.com/watch?v=Mz_6VSIPEI4&feature=youtu.be)
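---

## Checking the RSS identity (a sketch)

A quick numerical check, with arbitrary hypothetical objects of my own choosing, that the residual sum of squares computed from the transformed variables matches `\((y - \mathbf{X}\beta)^T\Sigma^{-1}(y-\mathbf{X}\beta)\)`.

```r
set.seed(1)
n <- 5
X <- cbind(1, rnorm(n))
y <- rnorm(n)
b <- c(1, 2)                                  # any candidate value of beta

## an arbitrary symmetric, positive definite Sigma for illustration
A     <- matrix(rnorm(n * n), n)
Sigma <- crossprod(A) + diag(n)

S    <- t(chol(Sigma))                        # Sigma = S S^T
Sinv <- solve(S)

## RSS computed from the transformed variables ...
rss_transformed <- crossprod(Sinv %*% (y - X %*% b))
## ... and RSS written with Sigma^{-1} directly
rss_direct <- t(y - X %*% b) %*% solve(Sigma) %*% (y - X %*% b)

all.equal(as.numeric(rss_transformed), as.numeric(rss_direct))   # TRUE
```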
---

## GLS

* Let's look at some properties of `\(y'=\mathbf{X}'\beta + \epsilon'\)`

--

.question[
What is the expectation of `\(\hat\beta\)`?
]

---
class: inverse

## <svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 576 512"><path d="M402.6 83.2l90.2 90.2c3.8 3.8 3.8 10 0 13.8L274.4 405.6l-92.8 10.3c-12.4 1.4-22.9-9.1-21.5-21.5l10.3-92.8L388.8 83.2c3.8-3.8 10-3.8 13.8 0zm162-22.9l-48.8-48.8c-15.2-15.2-39.9-15.2-55.2 0l-35.4 35.4c-3.8 3.8-3.8 10 0 13.8l90.2 90.2c3.8 3.8 10 3.8 13.8 0l35.4-35.4c15.2-15.3 15.2-40 0-55.2zM384 346.2V448H64V128h229.8c3.2 0 6.2-1.3 8.5-3.5l40-40c7.6-7.6 2.2-20.5-8.5-20.5H48C21.5 64 0 85.5 0 112v352c0 26.5 21.5 48 48 48h352c26.5 0 48-21.5 48-48V306.2c0-10.7-12.9-16-20.5-8.5l-40 40c-2.2 2.3-3.5 5.3-3.5 8.5z"/></svg> `Content Assessment`

Using the estimate `\(\hat\beta\)` that you previously calculated, take the expectation of `\(\hat\beta\)`. Is this an unbiased estimate for `\(\beta\)`?

[*Remember how we did this for least squares in the **Calculating a conditional expected value** Lecture*](https://www.youtube.com/watch?v=V3cZUDnEFEE&feature=youtu.be)

---

## GLS

* Let's look at some properties of `\(y'=\mathbf{X}'\beta + \epsilon'\)`

--

* The variance of `\(\hat\beta\)` is:

`$$\textrm{var}(\hat\beta)=(\mathbf{X}^T\Sigma^{-1}\mathbf{X})^{-1}\sigma^2$$`

---
class: inverse

## <svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 576 512"><path d="M402.6 83.2l90.2 90.2c3.8 3.8 3.8 10 0 13.8L274.4 405.6l-92.8 10.3c-12.4 1.4-22.9-9.1-21.5-21.5l10.3-92.8L388.8 83.2c3.8-3.8 10-3.8 13.8 0zm162-22.9l-48.8-48.8c-15.2-15.2-39.9-15.2-55.2 0l-35.4 35.4c-3.8 3.8-3.8 10 0 13.8l90.2 90.2c3.8 3.8 10 3.8 13.8 0l35.4-35.4c15.2-15.3 15.2-40 0-55.2zM384 346.2V448H64V128h229.8c3.2 0 6.2-1.3 8.5-3.5l40-40c7.6-7.6 2.2-20.5-8.5-20.5H48C21.5 64 0 85.5 0 112v352c0 26.5 21.5 48 48 48h352c26.5 0 48-21.5 48-48V306.2c0-10.7-12.9-16-20.5-8.5l-40 40c-2.2 2.3-3.5 5.3-3.5 8.5z"/></svg> `Content Assessment`

* Show that

`$$\textrm{var}(\hat\beta)=(\mathbf{X}^T\Sigma^{-1}\mathbf{X})^{-1}\sigma^2$$`

[*Remember how we did this for least squares in **The variance of the least squares estimator** Lecture*](https://www.youtube.com/watch?v=92plroy_7xU&feature=youtu.be)
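---

## Checking the variance of `\(\hat\beta\)` (a sketch)

A closing simulation sketch, not from the original deck, that compares the empirical covariance of the GLS estimate to `\((\mathbf{X}^T\Sigma^{-1}\mathbf{X})^{-1}\sigma^2\)`. The AR(1)-style `\(\Sigma\)` and all object names below are hypothetical.

```r
set.seed(1)
n     <- 50
x     <- runif(n)
X     <- cbind(1, x)
beta  <- c(2, 5)
sigma <- 1

## a hypothetical AR(1)-style correlation structure for Sigma
rho   <- 0.6
Sigma <- rho^abs(outer(1:n, 1:n, "-"))
S     <- t(chol(Sigma))                       # Sigma = S S^T
Sinv  <- solve(S)
X_t   <- Sinv %*% X                           # transformed design matrix

## theoretical variance: (X^T Sigma^{-1} X)^{-1} sigma^2
V_theory <- solve(t(X) %*% solve(Sigma) %*% X) * sigma^2

## GLS estimate via ordinary least squares on the transformed variables
gls_beta <- function(y) coef(lm(drop(Sinv %*% y) ~ X_t - 1))

## simulate many data sets and collect the GLS estimates
betas <- replicate(2000, {
  eps <- sigma * drop(S %*% rnorm(n))         # var(eps) = sigma^2 * Sigma
  gls_beta(drop(X %*% beta) + eps)
})
V_empirical <- var(t(betas))                  # empirical covariance of the estimates

round(V_theory, 3)
round(V_empirical, 3)                         # should be close to V_theory
```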