class: center, middle, inverse, title-slide

# Comparing Prediction Models

### Dr. D'Agostino McGowan

---

layout: true

<div class="my-footer">
<span>
Dr. Lucy D'Agostino McGowan
</span>
</div>

---

## Comparison metrics

* Adjusted `\(R^2\)`
* Mallows' Cp

---

## `\(R^2\)`

.question[
How do we estimate `\(R^2\)`?
]

`$$\Huge 1 - \frac{RSS}{TSS}$$`

--

.question[
How do you interpret this?
]

---

## `\(R^2\)`

`$$\Huge 1 - \frac{RSS}{TSS}$$`

* Adding a new variable to a model can **never increase** the RSS

--

.question[
What does this mean for `\(R^2\)`?
]

---

## `\(R^2\)`

* This means that `\(R^2\)` by itself is not a good criterion for determining model fit, because we'd always just pick the largest model!

--

* Enter Adjusted `\(R^2\)`!

---

## Adjusted `\(R^2\)`

`$$\Large R^2_{adj} = 1 - \frac{RSS/(n-(p+1))}{TSS/(n-1)}$$`

--

* Dividing by the degrees of freedom penalizes larger models, so this will only favor a larger model if the added variables have some predictive value

---

## Mallows' Cp

* This estimates the average mean squared error of prediction:

`$$\Large\frac{1}{\sigma^2}\sum_{i=1}^n E[(\hat{y}_i-E[y_i])^2]$$`

--

* This can be estimated by

`$$C_p = \frac{RSS_p}{\hat{\sigma}^2}+ 2(p+1)-n$$`

where `\(\hat\sigma^2\)` is estimated from the full model and `\(RSS_p\)` is the RSS from a reduced model with `\(p\)` predictors.

---

class: inverse

## <svg style="height:0.8em;top:.04em;position:relative;" viewBox="0 0 576 512"><path d="M402.6 83.2l90.2 90.2c3.8 3.8 3.8 10 0 13.8L274.4 405.6l-92.8 10.3c-12.4 1.4-22.9-9.1-21.5-21.5l10.3-92.8L388.8 83.2c3.8-3.8 10-3.8 13.8 0zm162-22.9l-48.8-48.8c-15.2-15.2-39.9-15.2-55.2 0l-35.4 35.4c-3.8 3.8-3.8 10 0 13.8l90.2 90.2c3.8 3.8 10 3.8 13.8 0l35.4-35.4c15.2-15.3 15.2-40 0-55.2zM384 346.2V448H64V128h229.8c3.2 0 6.2-1.3 8.5-3.5l40-40c7.6-7.6 2.2-20.5-8.5-20.5H48C21.5 64 0 85.5 0 112v352c0 26.5 21.5 48 48 48h352c26.5 0 48-21.5 48-48V306.2c0-10.7-12.9-16-20.5-8.5l-40 40c-2.2 2.3-3.5 5.3-3.5 8.5z"/></svg> `Application Exercise`

Show that Mallows' Cp for the full model (where the full model and the reduced model have the same `\(p\)` predictors) is equal to `\(p+1\)`.
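
---

## Numerical sketch

A minimal numerical illustration of the ideas above (not from the original slides: simulated data, and Python rather than R; all names are illustrative). It fits a one-predictor model and a model with an added pure-noise predictor, and computes `\(R^2\)`, adjusted `\(R^2\)`, and Mallows' Cp by hand from the formulas on the previous slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x_noise = rng.normal(size=n)            # pure-noise predictor
y = 2 + 3 * x1 + rng.normal(size=n)

def fit_metrics(X, y):
    """Least-squares fit with intercept; returns RSS, R^2, adjusted R^2."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    p = X.shape[1]                       # number of predictors
    r2 = 1 - rss / tss
    adj_r2 = 1 - (rss / (len(y) - (p + 1))) / (tss / (len(y) - 1))
    return rss, r2, adj_r2

rss_small, r2_small, adj_small = fit_metrics(x1.reshape(-1, 1), y)
rss_full, r2_full, adj_full = fit_metrics(np.column_stack([x1, x_noise]), y)

# R^2 can never decrease when a predictor is added,
# while adjusted R^2 penalizes the useless predictor.
print(r2_full >= r2_small)

# Mallows' Cp: sigma^2 is estimated from the full model (p = 2 here)
sigma2_hat = rss_full / (n - (2 + 1))
cp_small = rss_small / sigma2_hat + 2 * (1 + 1) - n
cp_full = rss_full / sigma2_hat + 2 * (2 + 1) - n
print(round(cp_full, 6))  # the full model's Cp equals p + 1 = 3
```

The last line previews the application exercise numerically: because `\(\hat\sigma^2 = RSS_{full}/(n-(p+1))\)`, the full model's `\(C_p\)` reduces algebraically to `\(p+1\)` regardless of the data.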