Midterm 02 Review

Dr. D’Agostino McGowan

Ridge Review

What are we minimizing with Ridge Regression?

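$$\mathrm{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2$$

What is the resulting estimate for $\hat{\beta}_{\text{ridge}}$?

$$\hat{\beta}_{\text{ridge}} = (X^T X + \lambda I)^{-1} X^T y$$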

Why is this useful?
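
One reason, shown as a minimal sketch: when p > n the matrix XᵀX is singular, so least squares has no unique solution, but adding λI with λ > 0 makes the system solvable. The function name `ridge_coefficients` and the toy data are hypothetical, and the sketch ignores the intercept and any centering or scaling of the predictors.

```python
def ridge_coefficients(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y by Gauss-Jordan elimination."""
    n, p = len(X), len(X[0])
    # Build A = X^T X + lam * I and b = X^T y.
    A = [[sum(X[i][j] * X[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; no unique solution")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[j] / A[j][j] for j in range(p)]


# Hypothetical toy data: n = 2 observations, p = 3 predictors (p > n).
X = [[1.0, 2.0, 3.0],
     [2.0, 1.0, 0.0]]
y = [1.0, 2.0]

# With lam > 0 the system has a unique solution even though p > n.
beta_ridge = ridge_coefficients(X, y, lam=1.0)
```

Calling `ridge_coefficients(X, y, lam=0.0)` on the same data raises the `ValueError`, which is the least squares case failing.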

Derive the $\hat{\beta}_{\text{ridge}}$ values

Practice deriving the β coefficients by minimizing $\mathrm{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2$. Be sure you understand what each step is doing.

Answer

  • Start by expanding (FOILing) the expression we're minimizing

(Don't remember how to do that? Check out Slide #8 from the Ridge Regression lecture)

  • Take the derivative, set it equal to 0

(Don't remember how to do that? Check out Slide #10 from the Ridge Regression lecture)

  • Solve for β

(Don't remember how to do that? Check out Slide #17 from the Ridge Regression lecture)
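
For reference, the three steps can be written out as below. This is a sketch in matrix notation, assuming RSS is written as (y − Xβ)ᵀ(y − Xβ) and the penalty as λβᵀβ. The referenced lecture slides may order or label the steps differently.

```latex
\begin{align*}
\mathrm{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2
  &= (y - X\beta)^T (y - X\beta) + \lambda \beta^T \beta \\
  &= y^T y - 2 \beta^T X^T y + \beta^T X^T X \beta + \lambda \beta^T \beta
  && \text{(expand)} \\
-2 X^T y + 2 X^T X \beta + 2 \lambda \beta &= 0
  && \text{(differentiate with respect to } \beta \text{, set to 0)} \\
(X^T X + \lambda I)\, \beta &= X^T y
  && \text{(collect the } \beta \text{ terms)} \\
\hat{\beta}_{\text{ridge}} &= (X^T X + \lambda I)^{-1} X^T y
  && \text{(solve for } \beta \text{)}
\end{align*}
```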

Ridge Review

How is λ determined?

$$\mathrm{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2$$

What is the bias-variance trade-off?
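
The slide leaves both questions open. One common approach, assumed here, is to choose λ by cross-validation over a grid of candidate values. The sketch below uses leave-one-out cross-validation on a single-predictor ridge fit with no intercept. The data, the grid, and the function names are hypothetical and only illustrate the mechanics.

```python
def ridge_slope(xs, ys, lam):
    """Ridge estimate for one predictor with no intercept:
    minimizes sum((y - b * x)^2) + lam * b^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def loocv_error(xs, ys, lam):
    """Mean squared leave-one-out prediction error for a given lambda."""
    total = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        b = ridge_slope(train_x, train_y, lam)
        total += (ys[i] - b * xs[i]) ** 2
    return total / len(xs)


# Hypothetical toy data and candidate grid, for illustration only.
xs = [-2.0, -1.0, 0.5, 1.0, 2.0]
ys = [-1.9, -1.2, 0.8, 0.7, 2.3]
grid = [0.0, 0.1, 1.0, 10.0, 100.0]

# Pick the lambda with the smallest cross-validated error.
cv_errors = {lam: loocv_error(xs, ys, lam) for lam in grid}
best_lam = min(cv_errors, key=cv_errors.get)

# Larger lambda shrinks the estimate toward 0: less variance, more bias.
slopes = [ridge_slope(xs, ys, lam) for lam in grid]
```

The `slopes` list shows the trade-off directly: as λ grows, the fitted slope moves away from the least squares value (more bias) and depends less on the particular sample (less variance).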

Ridge Regression

Pros

  • Can be used when p > n
  • Can be used to help with multicollinearity
  • Will decrease variance (as λ increases)

Cons

  • Will have increased bias (compared to least squares)
  • Does not really help with variable selection (all variables are included in some regard, even if their β coefficients are really small)

Lasso!

  • The lasso is similar to ridge, but it actually drives some β coefficients to 0! (So it helps with variable selection)

$$\mathrm{RSS} + \lambda \sum_{j=1}^{p} |\beta_j|$$
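
To see why the absolute-value penalty produces exact zeros while the squared penalty does not, here is a minimal sketch. It assumes the simplest special case, in which each coefficient can be solved for on its own and the RSS reduces to a sum of (z − β)² terms, where z is the least squares estimate. The function names and the numbers are hypothetical.

```python
def ridge_1d(z, lam):
    """Minimizer of (z - b)^2 + lam * b^2: proportional shrinkage."""
    return z / (1.0 + lam)


def lasso_1d(z, lam):
    """Minimizer of (z - b)^2 + lam * |b|: soft-thresholding at lam / 2."""
    if z > lam / 2:
        return z - lam / 2
    if z < -lam / 2:
        return z + lam / 2
    return 0.0


# Hypothetical least squares estimates for three coefficients.
least_squares = [3.0, -0.4, 0.2]
lam = 1.0

ridge = [ridge_1d(z, lam) for z in least_squares]  # all shrunk, none exactly 0
lasso = [lasso_1d(z, lam) for z in least_squares]  # small ones become exactly 0
```

In this special case ridge divides every estimate by the same factor, so a nonzero estimate stays nonzero, while the lasso sets any estimate within λ/2 of zero to exactly 0.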

Lasso

Pros

  • Can be used when p > n
  • Can be used to help with multicollinearity
  • Will decrease variance (as λ increases)
  • Can be used for variable selection, since it will make some β coefficients exactly 0

Cons

  • Will have increased bias (compared to least squares)
  • If p > n, the lasso can select at most n variables

What if we want to do both?

  • Elastic net!

$$\mathrm{RSS} + \lambda_1 \sum_{j=1}^{p} \beta_j^2 + \lambda_2 \sum_{j=1}^{p} |\beta_j|$$

Elastic net

$$\mathrm{RSS} + \lambda_1 \sum_{j=1}^{p} \beta_j^2 + \lambda_2 \sum_{j=1}^{p} |\beta_j|$$

When will this be equivalent to Ridge Regression?

Elastic net

$$\mathrm{RSS} + \lambda_1 \sum_{j=1}^{p} \beta_j^2 + \lambda_2 \sum_{j=1}^{p} |\beta_j|$$

When will this be equivalent to Lasso?
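
A quick way to check your answers to the two equivalence questions is to compute the penalty term directly. In this minimal sketch, the function names and the coefficient values are hypothetical: setting λ2 = 0 leaves only the squared (ridge) term, and setting λ1 = 0 leaves only the absolute-value (lasso) term.

```python
def ridge_penalty(beta, lam):
    """lam times the sum of squared coefficients."""
    return lam * sum(b ** 2 for b in beta)


def lasso_penalty(beta, lam):
    """lam times the sum of absolute coefficients."""
    return lam * sum(abs(b) for b in beta)


def elastic_net_penalty(beta, lam1, lam2):
    """lam1 weights the squared (ridge) term, lam2 the absolute (lasso) term."""
    return lam1 * sum(b ** 2 for b in beta) + lam2 * sum(abs(b) for b in beta)


beta = [1.5, -0.5, 0.0]  # hypothetical coefficients, intercept excluded

same_as_ridge = elastic_net_penalty(beta, 2.0, 0.0) == ridge_penalty(beta, 2.0)
same_as_lasso = elastic_net_penalty(beta, 0.0, 3.0) == lasso_penalty(beta, 3.0)
```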

Nonlinear models

Polynomial Regression

$$\text{pop} = \beta_0 + \beta_1 \text{age} + \beta_2 \text{age}^2 + \beta_3 \text{age}^3 + \epsilon$$

Using the information below, write out the equation to predict the change in population when age changes from the 25th percentile (24.5) to the 75th percentile (73.5).

term         estimate  std.error  statistic  p.value
(Intercept) 1807.8528    56.1241    32.2117   0.0000
age          -39.6783     4.9849    -7.9596   0.0000
I(age^2)       0.2064     0.1185     1.7414   0.0849
I(age^3)       0.0001     0.0008     0.1869   0.8522
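
One way to check your equation is to evaluate the fitted polynomial at both ages and take the difference. The intercept cancels, so the change is β1(73.5 − 24.5) + β2(73.5² − 24.5²) + β3(73.5³ − 24.5³). This minimal sketch uses the estimates from the table. The variable and function names are hypothetical.

```python
# Estimates copied from the table above.
coef = {"intercept": 1807.8528, "age": -39.6783, "age2": 0.2064, "age3": 0.0001}


def predict_pop(age):
    """Fitted cubic polynomial in age."""
    return (coef["intercept"] + coef["age"] * age
            + coef["age2"] * age ** 2 + coef["age3"] * age ** 3)


age_25th, age_75th = 24.5, 73.5

# The intercept cancels in the difference, leaving only the age terms.
change = predict_pop(age_75th) - predict_pop(age_25th)
print(round(change, 2))  # about -914.87
```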

Nonlinear models

What is the difference between:

  • Polynomial regression
  • Linear Spline
  • Cubic Spline
  • Natural Spline

Degrees of freedom

Example

A model predicting mpg from horsepower and weight.

$$\text{mpg} = \beta_0 + \beta_1 \text{horsepower} + \beta_2 \text{weight} + \epsilon$$

How many degrees of freedom are used for the horsepower variable?

  • 1

Example

A model predicting mpg from horsepower and weight.

$$\text{mpg} = \beta_0 + \beta_1 \text{horsepower} + \beta_2 \text{horsepower}^2 + \beta_3 \text{weight} + \epsilon$$

How many degrees of freedom are used for the horsepower variable?

  • 2

Example

A model predicting mpg from horsepower and weight.

cubic spline with 3 knots

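$$\text{mpg} = \beta_0 + \beta_1 \text{horsepower} + \beta_2 \text{horsepower}^2 + \beta_3 \text{horsepower}^3 + \beta_4 b_4(\text{horsepower}) + \beta_5 b_5(\text{horsepower}) + \beta_6 b_6(\text{horsepower}) + \beta_7 \text{weight} + \epsilon$$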

How many degrees of freedom are used for the horsepower variable?

  • 6

Don't remember what those $b_i()$ are? Review Non-linear Slide #17
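
As a rough illustration of where the count of 6 comes from, the sketch below builds the horsepower columns for a cubic spline. It assumes each $b_i()$ is a truncated cubic term, (x − knot)³ when x is above the knot and 0 otherwise, which is a common textbook construction. The lecture slide may define them differently, and the knot locations and function name here are hypothetical.

```python
def cubic_spline_basis(x, knots):
    """Basis columns for one value x: x, x^2, x^3, then one truncated
    cubic term per knot. The intercept is not included."""
    columns = [x, x ** 2, x ** 3]
    columns += [max(x - k, 0.0) ** 3 for k in knots]
    return columns


# Hypothetical knot locations for horsepower, for illustration only.
knots = [75.0, 100.0, 150.0]

row = cubic_spline_basis(110.0, knots)
df_horsepower = len(row)  # 3 polynomial terms + 3 knot terms
```

Each column gets its own β coefficient, so three polynomial terms plus one term per knot gives the degrees of freedom used by horsepower.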

Example

A model predicting mpg from horsepower and weight.

natural cubic spline with 3 knots

$$\text{mpg} = \beta_0 + \beta_1 \text{horsepower} + \beta_2 r_2(\text{horsepower}) + \beta_3 r_3(\text{horsepower}) + \beta_4 \text{weight} + \epsilon$$

  • $r_i(\text{horsepower})$ is a function of horsepower similar to $b_i()$ from the cubic spline, but slightly different because of the restriction. You don't need to know its exact specification.

How many degrees of freedom are used for the horsepower variable?

  • 3

Other things to review

  • Make sure you are familiar with how tidymodels works
  • Remember your matrix facts