Exercise 1: Change the author to your name, knit, commit, and push

mpg wt
21.0 2.62
21.0 2.88
22.8 2.32
21.4 3.22
18.7 3.44
18.1 3.46
14.3 3.57
library(tidyverse)

Exercise 2: Using the data set below, create a response vector y and a design matrix X.

d <- data.frame(x = c(2.62, 2.88, 2.32, 3.22, 3.44, 3.46, 3.57),
                y = c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3))
y <- d$y 
X <- matrix(c(rep(1, 7), d$x), ncol = 2)
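
As a cross-check on the manual construction above, here is a minimal sketch that builds the same design matrix with base R's model.matrix(). The names X_manual and X_formula are illustrative and are not part of the exercise.

```r
# Same toy data as in the exercise
d <- data.frame(x = c(2.62, 2.88, 2.32, 3.22, 3.44, 3.46, 3.57),
                y = c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3))

# Manual construction: a column of ones (the intercept) next to the predictor
X_manual <- cbind(1, d$x)

# model.matrix() builds the same matrix from a formula and adds column names
X_formula <- model.matrix(y ~ x, data = d)

# Compare values only, ignoring the dimnames and attributes model.matrix() adds
all.equal(X_manual, X_formula, check.attributes = FALSE)
```

The formula version may be easier to extend if more predictors are added later, since the intercept column is generated automatically.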

Exercise 3: Calculate the hat matrix and y hat

Hint 1: Look back at the Linear Regression slides: https://sta-363-s20.lucymcgowan.com/slides/04-linear-regression.html#48

Hint 2: Look back at your linear models application exercise (appex-01-linear-models)

hat_matrix <- X %*% solve(t(X) %*% X) %*% t(X)
h <- diag(hat_matrix)
y_hat <- hat_matrix %*% y

You can use the diag() function to get the diagonal of the hat matrix
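
One way to sanity-check the matrix calculation is to compare it with what lm() reports. The sketch below assumes the same data as the exercise and uses fitted() and hatvalues() from base R's stats package. The names H and fit are illustrative.

```r
d <- data.frame(x = c(2.62, 2.88, 2.32, 3.22, 3.44, 3.46, 3.57),
                y = c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3))
y <- d$y
X <- cbind(1, d$x)

# Hat matrix H = X (X'X)^{-1} X', its diagonal, and the fitted values H y
H <- X %*% solve(t(X) %*% X) %*% t(X)
h <- diag(H)
y_hat <- H %*% y

# Fit the same simple linear regression with lm()
fit <- lm(y ~ x, data = d)

# Fitted values and leverages should agree up to floating-point error
all.equal(as.vector(y_hat), unname(fitted(fit)))
all.equal(h, unname(hatvalues(fit)))

# The leverages sum to the number of columns of X (intercept + slope)
sum(h)
```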

Exercise 4: Calculate the LOOCV error using y, y hat, and the diagonal of the hat matrix calculated in Exercise 3

mean(((y - y_hat) / (1 - h))^2)
## [1] 4.505596
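
The one-line calculation above is the leave-one-out shortcut for least squares, which avoids refitting the model n times. Written out, with n = 7 observations and h_i the i-th diagonal element of the hat matrix (the label CV_(n) is an assumed notation, not taken from this exercise):

```latex
\mathrm{CV}_{(n)} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_i - \hat{y}_i}{1 - h_i} \right)^2
```

Each residual is inflated by 1 / (1 - h_i), so high-leverage points contribute more to the error estimate than their ordinary residuals suggest.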

Exercise 5: Calculate the 7-fold cross-validation error (LOOCV, since n = 7) using the iterative method

Fill in the blanks in the function below, then change the chunk option to eval = TRUE

d <- d %>%
  mutate(k = 1:7)

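# Return the test MSE and size of fold K, using a model fit on all other folds
mse_k <- function(K = 1) {
  # Training set: every observation outside fold K
  d_k <- d %>%
    filter(k != K)
  model_k <- lm(y ~ x, data = d_k)
  # Test set: fold K, predicted from the model that never saw it
  d %>%
    filter(k == K) %>%
    mutate(p = predict(model_k, newdata = .)) %>%
    summarise(mse_k = mean((y - p)^2),
              n_k = n())
}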

map_df(1:7, ~ mse_k(.x)) %>%
  summarise(sum((n_k / sum(n_k)) * mse_k))
##   sum((n_k/sum(n_k)) * mse_k)
## 1                    4.505596
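
For comparison, here is a minimal base R sketch of the same iterative idea, without the tidyverse, checked against the hat-matrix shortcut from Exercise 4. The names sq_err, loocv_loop, and loocv_shortcut are illustrative and are not part of the exercise.

```r
d <- data.frame(x = c(2.62, 2.88, 2.32, 3.22, 3.44, 3.46, 3.57),
                y = c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3))
n <- nrow(d)

# Iterative LOOCV: refit without observation i, then predict observation i
sq_err <- numeric(n)
for (i in seq_len(n)) {
  fit_i <- lm(y ~ x, data = d[-i, ])
  pred_i <- predict(fit_i, newdata = d[i, , drop = FALSE])
  sq_err[i] <- (d$y[i] - pred_i)^2
}
loocv_loop <- mean(sq_err)

# Shortcut: one fit on all the data, residuals rescaled by 1 - leverage
fit <- lm(y ~ x, data = d)
loocv_shortcut <- mean((residuals(fit) / (1 - hatvalues(fit)))^2)

# The two estimates should agree up to floating-point error
all.equal(loocv_loop, loocv_shortcut)
```

Because every fold holds exactly one observation here, the weighted average in the exercise reduces to a plain mean of the seven squared prediction errors.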