Homework 2

Due: Wednesday 2020-03-04 at 11:59pm

Getting started

  1. For parts (a) and (b), indicate which of i. through iv. is correct. Justify your answer. (An optional R sketch for exploring this empirically appears after part (b).)

(a.) The lasso, relative to least squares, is:

(i.) More flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance.
(ii.) More flexible and hence will give improved prediction accuracy when its increase in variance is less than its decrease in bias.
(iii.) Less flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance.
(iv.) Less flexible and hence will give improved prediction accuracy when its increase in variance is less than its decrease in bias.

(b.) Repeat (a) for ridge regression relative to least squares.
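
If you want to check your reasoning empirically, here is a minimal, optional R sketch (not part of the required answer) that compares test error for least squares, ridge, and the lasso on simulated data using the glmnet package. All simulation settings (n, p, the true coefficients, the noise level) are arbitrary illustrative choices, not values specified by this assignment.

```r
# Optional sketch: compare test MSE of least squares, ridge, and the lasso
# on simulated data. All settings below are arbitrary illustrative choices.
library(glmnet)

set.seed(1)
n <- 100; p <- 50
X <- matrix(rnorm(n * p), n, p)
beta_true <- c(rep(2, 5), rep(0, p - 5))      # only 5 truly nonzero coefficients
y <- as.numeric(X %*% beta_true + rnorm(n, sd = 3))

X_test <- matrix(rnorm(n * p), n, p)
y_test <- as.numeric(X_test %*% beta_true + rnorm(n, sd = 3))

# Least squares
ols <- lm(y ~ X)
pred_ols <- cbind(1, X_test) %*% coef(ols)

# Ridge (alpha = 0) and lasso (alpha = 1), lambda chosen by cross-validation
ridge_cv <- cv.glmnet(X, y, alpha = 0)
lasso_cv <- cv.glmnet(X, y, alpha = 1)

mean((y_test - pred_ols)^2)                                     # test MSE, least squares
mean((y_test - predict(ridge_cv, X_test, s = "lambda.min"))^2)  # test MSE, ridge
mean((y_test - predict(lasso_cv, X_test, s = "lambda.min"))^2)  # test MSE, lasso
```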

  2. Suppose we estimate the regression coefficients in a linear regression model by minimizing

\[\sum_{i=1}^n\left(y_i-\beta_0 -\sum_{j=1}^p\beta_jx_{ij}\right)^2 \textrm{ subject to } \sum_{j=1}^p|\beta_j|\leq s\]

for a particular value of \(s\). For parts (a) through (e), indicate which of i. through v. is correct. Justify your answer. (An optional R sketch for exploring this numerically appears after part (e).)

(a.) As we increase \(s\) from 0, the training RSS will:

(i.) Increase initially, and then eventually start decreasing in an inverted U shape.
(ii.) Decrease initially, and then eventually start increasing in a U shape.
(iii.) Steadily increase.
(iv.) Steadily decrease.
(v.) Remain constant.

(b.) Repeat (a) for test RSS.

(c.) Repeat (a) for variance.

(d.) Repeat (a) for (squared) bias.

(e.) Repeat (a) for the irreducible error.
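
As an optional check, the sketch below traces the training RSS along the lasso path with glmnet, reusing the simulated X and y from the sketch in problem 1. The constrained problem above is equivalent to the penalized lasso that glmnet fits, with a larger budget \(s\) corresponding to a smaller penalty \(\lambda\); the \(\ell_1\) norm of the fitted coefficients plays the role of \(s\).

```r
# Optional sketch: training RSS along the lasso path,
# reusing X and y from the simulation sketch in problem 1.
library(glmnet)

lasso_path <- glmnet(X, y, alpha = 1)                  # lasso fits over a grid of lambda
train_rss  <- colSums((y - predict(lasso_path, X))^2)  # one training RSS per lambda

# The l1 norm of the fitted coefficients plays the role of the budget s
l1_norm <- colSums(abs(coef(lasso_path)[-1, ]))
plot(l1_norm, train_rss, type = "b",
     xlab = "sum of |beta_j| (the budget s)", ylab = "training RSS")
```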

  3. Suppose we estimate the regression coefficients in a linear regression model by minimizing

\[\sum_{i=1}^n\left(y_i-\beta_0-\sum_{j=1}^p\beta_jx_{ij}\right)^2+\lambda\sum_{j=1}^p\beta_j^2\]

for a particular value of \(\lambda\). For parts (a) through (e), indicate which of i. through v. is correct. Justify your answer. (An optional R sketch for exploring this numerically appears after part (e).)

(a.) As we increase \(\lambda\) from 0, the training RSS will:

(i.) Increase initially, and then eventually start decreasing in an inverted U shape.
(ii.) Decrease initially, and then eventually start increasing in a U shape.
(iii.) Steadily increase.
(iv.) Steadily decrease.
(v.) Remain constant.

(b.) Repeat (a) for test RSS.

(c.) Repeat (a) for variance.

(d.) Repeat (a) for (squared) bias.

(e.) Repeat (a) for the irreducible error.
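
Similarly, here is an optional sketch for tracing the training RSS of ridge regression over a grid of \(\lambda\) values, again reusing the simulated X and y from problem 1; the particular grid of \(\lambda\) values is an arbitrary choice.

```r
# Optional sketch: training RSS for ridge regression as lambda varies,
# reusing X and y from the simulation sketch in problem 1.
library(glmnet)

lambda_grid <- 10^seq(-2, 3, length.out = 100)          # arbitrary grid of lambda values
ridge_path  <- glmnet(X, y, alpha = 0, lambda = lambda_grid)
train_rss   <- colSums((y - predict(ridge_path, X))^2)  # one training RSS per lambda

# glmnet stores lambda in decreasing order, matching the columns of the predictions
plot(log10(ridge_path$lambda), train_rss, type = "b",
     xlab = "log10(lambda)", ylab = "training RSS")
```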

  4. Which is better for feature (variable) selection: ridge regression or the lasso? Justify your answer.

  5. Write the equation that is minimized by the elastic net. Explain how it is related to ridge regression and the lasso.

You will need to use math mode in your .Rmd file to do this; math mode renders most \(\LaTeX\) commands. If you don't know how to add math expressions to an .Rmd file, you can read about it here. If you are unfamiliar with \(\LaTeX\), here is a cheatsheet.
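
For example, in an .Rmd file, inline math goes between single dollar signs and display math between double dollar signs; the particular expressions below are just placeholders.

```latex
Inline math: the ridge penalty is $\lambda \sum_{j=1}^{p} \beta_j^2$.

Display math:
$$
\sum_{i=1}^{n}\left(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\right)^2
$$
```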