Robust Regression

# Robust Regression
### Dr. D’Agostino McGowan

---

<div class="my-footer">
<span>
Dr. Lucy D'Agostino McGowan
</span>
</div>

---

## Robust Regression

* When we have outliers, least squares doesn't always perform well

---

## M-estimation

* Modifies least squares to pick a `$\beta$` that minimizes the general _objective function_

`$$\sum_{i=1}^nf(y_i-x_i^T\beta)$$`
--

Some possibilities:

* `$f(e)=e^2$`
* `$f(e)=|e|$` (this is least absolute deviation regression)
* `$f(e)=\begin{cases}e^2/2&\textrm{if} |e|\leq c\\ c|e|-c^2/2&\textrm{otherwise}\end{cases}$` (Huber's method)

---

## Solving M-estimation

* Differentiating the _objective function_ with respect to `$\beta$` and setting equal to 0 produces `$p+1$` estimating equations for the coefficients

`$$\sum_{i=1}^{n}f'(y_i-x_i^T\beta)x_i^T=0$$`

* We can define a _weight function_ such that `$w(u) = f'(u)/u$`
---

## Solving M-estimation

* SO we can write the estimating equations as:

`$$\sum_{i=1}^{n}w_i(y_i-x_i^T\beta)x_i^T=0$$`

* Solving this is the same as solving weighted least squares! Minimizing `$\sum w_i^2e_i^2$`
--

* But it needs to be solved _iteratively_ because the weights depend on the residuals...which depend on the estimates...which depend on the weights
--

* Iteratively reweighted least squares (IRLS)
---

## IRLS

* Select starting values for `$\beta^{(0)}$`
--

* At each iteration calculate residuals `$e_i^{(t-1)}$` and weights `$w_i^{(t-1)}$`
--

* Solve for the new weighted least squares estimate

`$$\beta^{(t)}=(\mathbf{X}^T\mathbf{W}^{(t-1)}\mathbf{X})^{-1}\mathbf{X}^T\mathbf{W}^{(t-1)}\mathbf{y}$$`
--

* Repeat until converges!

---

## Weight functions

Method | Objective function | Weight function 
---|-----|----
Least Squares | `$f(e)=e^2$` | `$w(e)=1$`
LAD | `$f(e)=\lvert e\rvert$` | `$w(e) = 1/\lvert e\rvert$`
Huber | `$f(e)=\begin{cases}e^2/2&\textrm{if} \lvert e\rvert\leq c\\ c\lvert e\rvert-c^2/2&\textrm{otherwise}\end{cases}$` | `$w(e)=\begin{cases}1&\textrm{if }\lvert e\rvert\leq c\\c/\lvert e\rvert&\textrm{otherwise}\end{cases}$`

* The Huber method combines the downweighting of extreme cases with equal weighting for middle cases

---

## Robust regression in R

You can perform robust regression using the `rlm` function from the `MASS` package

```r
library(MASS)
mod <- rlm(y ~ x, data = data)
```

You can examine the weights like this

```r
mod$w
```

---

## Robust regression recap

* Useful when there are _outliers_ 
* If you find the results from least squares and robust regression noticeably different, robust regression is likely more trustworthy

---

## <svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 640 512"><path d="M624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"/></svg> `Application Exercise`

* Fit a robust regression model to the `gala` data from the `faraway` package, predicting `Species` using `Area`, `Elevation`, `Nearest`, `Scruz`, and `Adjacent`. 
* Which observations result in weights less than one? 
* What does this mean for these observations?