What are we minimizing with Ridge Regression?
$$\text{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2$$
What is the resulting estimate for $\hat{\beta}_{\text{ridge}}$?
$$\hat{\beta}_{\text{ridge}} = (X^T X + \lambda I)^{-1} X^T y$$
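The closed-form estimate can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming there is no intercept (or that X and y are already centered); the function name `ridge_closed_form` and the simulated data are illustrative and not part of the slides.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Return (X^T X + lam * I)^{-1} X^T y without forming the inverse."""
    p = X.shape[1]
    # Solving the linear system is more stable than calling np.linalg.inv.
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

beta_ols = ridge_closed_form(X, y, lam=0.0)     # lam = 0 gives least squares
beta_ridge = ridge_closed_form(X, y, lam=10.0)  # lam > 0 shrinks the estimate
```

Setting `lam=0.0` recovers the ordinary least squares solution, and a positive `lam` pulls the coefficient vector toward zero.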
Why is this useful?
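One reason often given is that $X^T X + \lambda I$ is invertible for any $\lambda > 0$, even when there are more predictors than observations and $X^T X$ alone is singular. The sketch below illustrates this on simulated data; the dimensions and $\lambda = 1$ are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 10                       # more predictors than observations
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

gram = X.T @ X
rank_ols = np.linalg.matrix_rank(gram)                      # at most n, so below p
rank_ridge = np.linalg.matrix_rank(gram + 1.0 * np.eye(p))  # full rank p

# The ridge system has a unique solution even though least squares does not.
beta_ridge = np.linalg.solve(gram + 1.0 * np.eye(p), X.T @ y)
```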
How is λ determined?
$$\text{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2$$
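One common way to pick $\lambda$ is k-fold cross-validation: fit on part of the data for each candidate value and keep the value with the lowest held-out prediction error. The slides do not state which method they have in mind, so this is only a sketch of that one approach; the helper names, the candidate grid, and the simulated data are all assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_error(X, y, lam, k=5, seed=0):
    """Mean squared prediction error of ridge, averaged over k folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for held_out in folds:
        train = np.setdiff1d(idx, held_out)   # every index not in this fold
        beta = ridge_fit(X[train], y[train], lam)
        errors.append(np.mean((y[held_out] - X[held_out] @ beta) ** 2))
    return float(np.mean(errors))

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 8))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=60)

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]   # hypothetical candidate grid
scores = [cv_error(X, y, lam) for lam in lambdas]
best_lam = lambdas[int(np.argmin(scores))]
```

Using the same `seed` for every candidate keeps the folds identical, so the candidates are compared on the same splits.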
What is the bias-variance trade-off?
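For ridge regression the trade-off can be computed exactly when the design matrix is treated as fixed, because the estimator is linear in y. The sketch below assumes a hypothetical true coefficient vector `beta_true` and noise standard deviation `sigma`, then reports the squared bias and total variance of the estimator for several values of $\lambda$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, sigma = 40, 5, 1.0
X = rng.normal(size=(n, p))                        # fixed design
beta_true = np.array([3.0, -2.0, 1.0, 0.0, 0.5])   # hypothetical true coefficients

def bias_sq_and_variance(lam):
    # The ridge estimator is linear in y: beta_hat = A @ y.
    A = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    bias = A @ X @ beta_true - beta_true       # E[beta_hat] - beta_true
    variance = sigma**2 * np.trace(A @ A.T)    # summed variance of the coefficients
    return float(bias @ bias), float(variance)

for lam in [0.0, 1.0, 10.0, 100.0]:
    b2, var = bias_sq_and_variance(lam)
    print(f"lambda={lam:6.1f}  bias^2={b2:.4f}  variance={var:.4f}")
```

As $\lambda$ grows, the squared bias rises from zero while the variance falls, which is the trade-off that the choice of $\lambda$ balances.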
$$\text{RSS} + \lambda \sum_{j=1}^{p} |\beta_j|$$
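This absolute-value penalty (the Lasso objective) has no closed-form solution in general. One standard way to minimize it is coordinate descent with soft-thresholding, sketched below for exactly the objective above with no intercept. The function names, the iteration count, and the simulated data are assumptions for illustration.

```python
import numpy as np

def soft_threshold(a, t):
    """Shrink a toward zero by t, returning exactly 0 when |a| <= t."""
    return np.sign(a) * max(abs(a) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for RSS + lam * sum(|beta_j|), no intercept."""
    p = X.shape[1]
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with predictor j's current contribution removed.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j
            # The threshold is lam / 2 because RSS carries no factor of 1/2.
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq[j]
    return beta

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=100)

beta_lasso = lasso_cd(X, y, lam=40.0)

# A penalty large enough that every coefficient is thresholded to zero.
lam_big = 2.0 * np.max(np.abs(X.T @ y)) + 1.0
beta_zero = lasso_cd(X, y, lam=lam_big)
```

Unlike the squared penalty, the absolute-value penalty can set coefficients exactly to zero, which is why a sufficiently large `lam` returns the all-zero vector.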
$$\text{RSS} + \lambda_1 \sum_{j=1}^{p} \beta_j^2 + \lambda_2 \sum_{j=1}^{p} |\beta_j|$$
What is the $\ell_1$ part of the penalty?
What is the $\ell_2$ part of the penalty?
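To make the two pieces concrete, the sketch below evaluates the combined objective for a small hand-made example and returns the squared ($\ell_2$) and absolute-value ($\ell_1$) penalty terms separately. The function name and the numbers are illustrative assumptions.

```python
import numpy as np

def elastic_net_objective(X, y, beta, lam1, lam2):
    """Return the full objective plus its two penalty terms."""
    rss = float(np.sum((y - X @ beta) ** 2))
    l2_part = lam1 * float(np.sum(beta ** 2))      # squared (ridge-style) term
    l1_part = lam2 * float(np.sum(np.abs(beta)))   # absolute-value (lasso-style) term
    return rss + l2_part + l1_part, l2_part, l1_part

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
beta = np.array([1.0, -2.0])

total, l2_part, l1_part = elastic_net_objective(X, y, beta, lam1=0.5, lam2=2.0)
# Residuals are (0, 4, 4), so RSS = 32; l2_part = 2.5, l1_part = 6.0, total = 40.5
```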
When will this be equivalent to Ridge Regression?
When will this be equivalent to Lasso?
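Both special cases can be read off the objective: with $\lambda_2 = 0$ only the squared penalty remains, and with $\lambda_1 = 0$ only the absolute-value penalty remains. The sketch below checks the first case numerically using a coordinate-descent solver for the combined objective. The solver `elastic_net_cd`, its iteration count, and the simulated data are assumptions for illustration, not a method given in the slides.

```python
import numpy as np

def elastic_net_cd(X, y, lam1, lam2, n_iter=500):
    """Coordinate descent for RSS + lam1*sum(beta^2) + lam2*sum(|beta|)."""
    p = X.shape[1]
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with predictor j's current contribution removed.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j
            # Soft-threshold for the l1 term, extra shrinkage for the l2 term.
            shrunk = np.sign(rho) * max(abs(rho) - lam2 / 2.0, 0.0)
            beta[j] = shrunk / (col_sq[j] + lam1)
    return beta

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(5)
X = rng.normal(size=(80, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.0]) + rng.normal(scale=0.5, size=80)

# lam2 = 0 leaves only the squared penalty, i.e. the ridge objective.
beta_l2_only = elastic_net_cd(X, y, lam1=5.0, lam2=0.0)
beta_ridge = np.linalg.solve(X.T @ X + 5.0 * np.eye(4), X.T @ y)

# lam1 = 0 leaves only the absolute-value penalty, i.e. the lasso objective.
beta_l1_only = elastic_net_cd(X, y, lam1=0.0, lam2=20.0)
```

With `lam2=0.0` the iterates converge to the ridge closed-form estimate, so `beta_l2_only` and `beta_ridge` agree up to numerical tolerance.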
How do you think $\lambda_1$ and $\lambda_2$ are chosen?
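One plausible answer is a search over a two-dimensional grid of candidate pairs, scoring each pair by prediction error on data not used for fitting. The sketch below uses a single hold-out split for brevity; the grid values, the split, the solver `elastic_net_cd`, and the simulated data are all assumptions for illustration.

```python
import itertools
import numpy as np

def elastic_net_cd(X, y, lam1, lam2, n_iter=500):
    """Coordinate descent for RSS + lam1*sum(beta^2) + lam2*sum(|beta|)."""
    p = X.shape[1]
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j
            shrunk = np.sign(rho) * max(abs(rho) - lam2 / 2.0, 0.0)
            beta[j] = shrunk / (col_sq[j] + lam1)
    return beta

# Simulated data with a train/validation split (assumptions for illustration).
rng = np.random.default_rng(6)
X = rng.normal(size=(120, 6))
y = X[:, 0] - X[:, 1] + rng.normal(size=120)
X_train, y_train = X[:80], y[:80]
X_val, y_val = X[80:], y[80:]

grid = [0.0, 1.0, 10.0]   # hypothetical candidate values for both penalties
results = {}
for lam1, lam2 in itertools.product(grid, grid):
    beta = elastic_net_cd(X_train, y_train, lam1, lam2)
    results[(lam1, lam2)] = float(np.mean((y_val - X_val @ beta) ** 2))

# Pair with the lowest validation error.
best_lam1, best_lam2 = min(results, key=results.get)
```

The grid includes the pair (0, 0), which corresponds to ordinary least squares, along with pure ridge and pure lasso settings, so the search can select any of those special cases.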