# Homework 2

Due: Wednesday 2020-03-04 at 11:59pm

# Getting started

• Go to the course organization on GitHub: https://github.com/sta-363-s20.

• Find the repo whose name starts with hw-02 and ends with your team name (this should be the only hw-02 repo available to you).

• In the repo, click the green Clone or download button and select Use HTTPS. Then click the clipboard icon to copy the repo URL.

• If using RStudio Cloud, go to RStudio Cloud and into the course workspace. Create a New Project from Git Repo. You will need to click on the down arrow next to the New Project button to see this option.

• If using RStudio Pro, create a new project by clicking File > New Project. Then click Version Control and Git/GitHub.

• Copy and paste the URL of your assignment repo into the dialog box.

• Hit OK, and you’re good to go!

1. For parts (a) and (b), indicate which of i. through iv. is correct. Justify your answer.

(a.) The lasso, relative to least squares, is:

(i.) More flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance.
(ii.) More flexible and hence will give improved prediction accuracy when its increase in variance is less than its decrease in bias.
(iii.) Less flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance.
(iv.) Less flexible and hence will give improved prediction accuracy when its increase in variance is less than its decrease in bias.

(b.) Repeat (a) for ridge regression relative to least squares.

2. Suppose we estimate the regression coefficients in a linear regression model by minimizing

$\sum_{i=1}^n\left(y_i-\beta_0 -\sum_{j=1}^p\beta_jx_{ij}\right)^2 \textrm{ subject to } \sum_{j=1}^p|\beta_j|\leq s$

for a particular value of $$s$$. For parts (a) through (e), indicate which of i. through v. is correct. Justify your answer.

(a.) As we increase $$s$$ from 0, the training RSS will:

(i.) Increase initially, and then eventually start decreasing in an inverted U shape.
(ii.) Decrease initially, and then eventually start increasing in a U shape.
(iii.) Steadily increase.
(iv.) Steadily decrease.
(v.) Remain constant.

(b.) Repeat (a) for test RSS.

(c.) Repeat (a) for variance.

(d.) Repeat (a) for (squared) bias.

(e.) Repeat (a) for the irreducible error.

3. Suppose we estimate the regression coefficients in a linear regression model by minimizing

$\sum_{i=1}^n\left(y_i-\beta_0-\sum_{j=1}^p\beta_jx_{ij}\right)^2+\lambda\sum_{j=1}^p\beta_j^2$

for a particular value of $$\lambda$$. For parts (a) through (e), indicate which of i. through v. is correct. Justify your answer.

(a.) As we increase $$\lambda$$ from 0, the training RSS will:

(i.) Increase initially, and then eventually start decreasing in an inverted U shape.
(ii.) Decrease initially, and then eventually start increasing in a U shape.
(iii.) Steadily increase.
(iv.) Steadily decrease.
(v.) Remain constant.

(b.) Repeat (a) for test RSS.

(c.) Repeat (a) for variance.

(d.) Repeat (a) for (squared) bias.

(e.) Repeat (a) for the irreducible error.

4. Which is better for feature (variable) selection, Ridge Regression or Lasso? Justify your answer.

5. Write the equation that is being minimized for an Elastic Net. Explain how this is related to Ridge Regression and Lasso.

You will need to use math mode in your .Rmd file to do this. This math mode renders most $$\LaTeX$$ commands. If you don’t know how to add math expressions to your .Rmd file, you can read about it here. If you are unfamiliar with $$\LaTeX$$, here is a cheatsheet.
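As a rough illustration of math mode, here is how the penalized objective with $$\lambda$$ stated earlier in this assignment might be typed into an .Rmd file. This sketch assumes the usual R Markdown convention of single dollar signs for inline math and double dollar signs for a displayed equation on its own line. The surrounding sentence is only placeholder text.

```latex
The tuning parameter $\lambda$ controls the size of the penalty.

$$
\sum_{i=1}^n\left(y_i-\beta_0-\sum_{j=1}^p\beta_jx_{ij}\right)^2+\lambda\sum_{j=1}^p\beta_j^2
$$
```

Subscripts with more than one character need braces, as in `x_{ij}`. Without them, only the first character is set as a subscript.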