What is the hat matrix?
$$h_i = H_{ii} = \left[X(X^TX)^{-1}X^T\right]_{ii}$$
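As a quick numerical check of this definition, the sketch below builds the hat matrix by hand for the model `lm(mpg ~ disp, data = mtcars)`, the same model that is plotted later in these slides. The object names `X`, `H`, and `h` are illustrative choices, not part of the original slides.

```r
# Fit the simple linear regression used elsewhere in these slides
mod <- lm(mpg ~ disp, data = mtcars)

# Design matrix X: an intercept column plus disp
X <- model.matrix(mod)

# Hat matrix H = X (X^T X)^{-1} X^T
H <- X %*% solve(t(X) %*% X) %*% t(X)

# The leverages h_i are the diagonal entries of H
h <- diag(H)

# Compare with R's built-in calculation; this should be TRUE
all.equal(unname(h), unname(hatvalues(mod)))
```

Forming `H` explicitly is fine for small data sets like this one; for large `n` it is an `n` by `n` matrix, so `hatvalues()` is the more practical route.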
What do we use the diagonal of the hat matrix for?
$$\operatorname{var}(e_i) = \sigma^2(1 - h_i)$$
How do we know this? Let's show it using the fact that the hat matrix is idempotent and symmetric.

$$h_i = \sum_j H_{ij}H_{ji} = \sum_j H_{ij}^2 = H_{ii}^2 + \sum_{j \ne i} H_{ij}^2 = h_i^2 + \sum_{j \ne i} H_{ij}^2$$

The first equality uses idempotence ($H = HH$) and the second uses symmetry ($H_{ji} = H_{ij}$). Because the remaining sum is nonnegative, $h_i \ge h_i^2$, so $0 \le h_i \le 1$.
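The two properties used in this derivation can be checked numerically. The sketch below does so for the `mtcars` model used later in these slides; the object names are illustrative.

```r
mod <- lm(mpg ~ disp, data = mtcars)
X <- model.matrix(mod)
H <- X %*% solve(t(X) %*% X) %*% t(X)

# Symmetric: H equals its transpose (up to floating-point error)
all.equal(H, t(H))

# Idempotent: HH equals H (up to floating-point error)
all.equal(H %*% H, H)

# Each leverage equals the sum of squared entries in its row of H
all.equal(unname(rowSums(H^2)), unname(diag(H)))

# Smallest and largest leverage; both lie between 0 and 1
range(diag(H))
```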
$$r_i = \frac{e_i}{\hat{\sigma}\sqrt{1 - h_i}}$$
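A minimal sketch of this formula in R, again using the `mtcars` model from later in these slides rather than the exercise data. The names `e`, `h`, `sigma_hat`, and `r` are illustrative.

```r
mod <- lm(mpg ~ disp, data = mtcars)

e <- residuals(mod)               # raw residuals e_i
h <- hatvalues(mod)               # leverages h_i
sigma_hat <- summary(mod)$sigma   # residual standard error

# Standardized residuals: e_i / (sigma_hat * sqrt(1 - h_i))
r <- e / (sigma_hat * sqrt(1 - h))

# Compare with the built-in function; this should be TRUE
all.equal(r, rstandard(mod))
```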
Application Exercise

| y | x |
|---|---|
| 1 | 0 |
| 5 | 4 |
| 2 | 2 |
| 2 | 1 |
| 11 | 10 |
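Using the data above, calculate the standardized residuals (you can check your work with rstandard()) and the diagonal of the hat matrix (you can check it with hatvalues()).

```r
library(ggplot2)

# Fit the model and collect fitted values and standardized residuals
mod <- lm(mpg ~ disp, data = mtcars)
d <- data.frame(
  standardized_resid = rstandard(mod),
  fit = fitted(mod)
)

# Standardized residuals against fitted values, with a reference line at 0
ggplot(d, aes(fit, standardized_resid)) +
  geom_point() +
  geom_hline(yintercept = 0) +
  labs(y = "Standardized Residual")
```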





$$\hat{y}_{(i)} = x_i^T\hat{\beta}_{(i)}$$

Here $\hat{\beta}_{(i)}$ is the coefficient estimate from the model fit without observation $i$.
How do we determine "large"? We need to scale it using the variance!
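To make the leave-one-out prediction concrete, the sketch below refits the `mtcars` model from earlier in these slides without one observation and then predicts that observation. The choice `i <- 15` is arbitrary and the object names are illustrative.

```r
i <- 15  # arbitrary observation to leave out

# Fit the model without observation i
mod_minus_i <- lm(mpg ~ disp, data = mtcars[-i, ])

# Predict the held-out observation from the reduced fit
y_hat_minus_i <- predict(mod_minus_i, newdata = mtcars[i, ])

# Unscaled difference between the observed and leave-one-out predicted value
diff_i <- mtcars$mpg[i] - y_hat_minus_i
diff_i
```

On its own, `diff_i` is in the units of the response, which is why it needs to be scaled before it can be judged as large or small.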
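Application Exercise

Show that

$$\widehat{\operatorname{var}}\left(y_i - \hat{y}_{(i)}\right) = \hat{\sigma}^2_{(i)}\left(1 + x_i^T\left(X_{(i)}^TX_{(i)}\right)^{-1}x_i\right)$$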
| y | x |
|---|---|
| 1 | 0 |
| 5 | 4 |
| 2 | 2 |
| 2 | 1 |
| 11 | 10 |
Using the data above, calculate the estimated variance $\widehat{\operatorname{var}}\left(y_i - \hat{y}_{(i)}\right)$ for observation 5.
$$t_i = \frac{y_i - \hat{y}_{(i)}}{\hat{\sigma}_{(i)}\left(1 + x_i^T\left(X_{(i)}^TX_{(i)}\right)^{-1}x_i\right)^{1/2}}$$

These have a $t$ distribution with $(n-1)-(p+1) = n-p-2$ degrees of freedom if the model is correct and $\epsilon \sim N(0, \sigma^2 I)$.
$$t_i = r_i\left(\frac{n-p-2}{n-p-1-r_i^2}\right)^{1/2}$$
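The two expressions for $t_i$ can be compared numerically. The sketch below uses the `mtcars` model from earlier in these slides, with `p <- 1` predictor; the choice `i <- 15` is arbitrary and the object names are illustrative.

```r
mod <- lm(mpg ~ disp, data = mtcars)
n <- nrow(mtcars)
p <- 1  # number of predictors, not counting the intercept

# Shortcut: computed from the standardized residuals, no refitting needed
r <- rstandard(mod)
t_shortcut <- r * sqrt((n - p - 2) / (n - p - 1 - r^2))

# Direct: refit without observation i and apply the definition
i <- 15
mod_i <- lm(mpg ~ disp, data = mtcars[-i, ])
X_i <- model.matrix(mod_i)        # design matrix without row i
x_i <- c(1, mtcars$disp[i])       # intercept and disp for observation i
scale_i <- sqrt(1 + drop(t(x_i) %*% solve(t(X_i) %*% X_i) %*% x_i))
y_hat_i <- sum(x_i * coef(mod_i)) # leave-one-out prediction
t_direct <- (mtcars$mpg[i] - y_hat_i) / (summary(mod_i)$sigma * scale_i)

# Both comparisons should be TRUE
all.equal(unname(t_shortcut), unname(rstudent(mod)))
all.equal(unname(t_shortcut[i]), t_direct)
```

The shortcut is what makes these residuals cheap to compute: it needs only one model fit rather than `n` refits.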
Application Exercise

| y | x |
|---|---|
| 1 | 0 |
| 5 | 4 |
| 2 | 2 |
| 2 | 1 |
| 11 | 10 |
Calculate the studentized residuals for the data above.
It is good to understand how to calculate these studentized residuals by hand, but there is an R function that does this for you (rstudent())
$$D_i = \frac{\left(\hat{y} - \hat{y}_{(i)}\right)^T\left(\hat{y} - \hat{y}_{(i)}\right)}{(p+1)\hat{\sigma}^2} = \frac{1}{p+1}\, r_i^2\, \frac{h_i}{1 - h_i}$$
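As a sketch of how the two forms of Cook's Distance agree, the code below evaluates both for the `mtcars` model from earlier in these slides. It assumes $\hat{y}_{(i)}$ denotes the vector of predictions for all $n$ observations from the fit without observation $i$; the choice `i <- 15` is arbitrary and the object names are illustrative.

```r
mod <- lm(mpg ~ disp, data = mtcars)
p <- 1  # number of predictors, not counting the intercept

# Shortcut: from standardized residuals and leverages
r <- rstandard(mod)
h <- hatvalues(mod)
D_shortcut <- r^2 * h / ((p + 1) * (1 - h))

# Definition for a single observation: refit without i,
# then predict all n observations from the reduced fit
i <- 15
mod_i <- lm(mpg ~ disp, data = mtcars[-i, ])
y_hat <- fitted(mod)
y_hat_i <- predict(mod_i, newdata = mtcars)
D_direct <- sum((y_hat - y_hat_i)^2) / ((p + 1) * summary(mod)$sigma^2)

# Both comparisons should be TRUE
all.equal(D_shortcut, cooks.distance(mod))
all.equal(unname(D_shortcut[i]), D_direct)
```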

Application Exercise

| y | x |
|---|---|
| 1 | 0 |
| 5 | 4 |
| 2 | 2 |
| 2 | 1 |
| 11 | 10 |
Calculate Cook's Distance for the data above and plot it with the row number on the x-axis and Cook's Distance on the y-axis.
It is good to understand how to calculate these by hand, but there is an R function that does this for you (cooks.distance())