LDA
appex-02-lda
$$\hat\pi_k = \frac{n_k}{n}$$
$$\hat\mu_k = \frac{1}{n_k}\sum_{i:\, y_i = k} x_i$$
$$\hat\sigma^2 = \frac{1}{n-K}\sum_{k=1}^{K}\sum_{i:\, y_i = k}(x_i - \hat\mu_k)^2 = \sum_{k=1}^{K}\frac{n_k - 1}{n-K}\,\hat\sigma_k^2$$
$$\hat\sigma_k^2 = \frac{1}{n_k - 1}\sum_{i:\, y_i = k}(x_i - \hat\mu_k)^2$$
x | -1.6 | 0.2 | -0.9 | -2.0 | -3.0 | 1.9 | 1.2 | 2.2 | 2.7 | -0.5 | 1.8 | 3.3 | 5.0 | 3.4 | 4.2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
y | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 |
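The R code in this example works on a data frame called df whose construction is not shown. A minimal sketch, assuming df simply holds the table above with columns x and y (the pipelines additionally need dplyr loaded, for example via library(dplyr)):

```r
# Build the example data from the table: 15 observations, 3 classes of 5 each.
# Assumption: the slides' `df` has exactly these two columns.
df <- data.frame(
  x = c(-1.6, 0.2, -0.9, -2.0, -3.0,
         1.9, 1.2,  2.2,  2.7, -0.5,
         1.8, 3.3,  5.0,  3.4,  4.2),
  y = rep(c(1, 2, 3), each = 5)
)

table(df$y)  # class counts n_k
```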
$$\hat\pi_k = \frac{n_k}{n}$$
df %>% group_by(y) %>% summarise(n = n()) %>% mutate(pi = n / sum(n))
## # A tibble: 3 x 3
##       y     n    pi
##   <dbl> <int> <dbl>
## 1     1     5 0.333
## 2     2     5 0.333
## 3     3     5 0.333
group_by(): do calculations on groups
summarise(): reduce variables to values
mutate(): add new variables

How do we pull the $\hat\pi_k$ out into their own R object?
df %>% group_by(y) %>% summarise(n = n()) %>% mutate(pi = n / sum(n)) %>% pull(pi) -> pi
pi
## [1] 0.3333333 0.3333333 0.3333333
$$\hat\mu_k = \frac{1}{n_k}\sum_{i:\, y_i = k} x_i$$
df %>% group_by(y) %>% summarise(mu = mean(x))
## # A tibble: 3 x 2
##       y    mu
##   <dbl> <dbl>
## 1     1 -1.46
## 2     2  1.5
## 3     3  3.54
df %>% group_by(y) %>% summarise(mu = mean(x)) %>% pull(mu) -> mu
$$\hat\sigma^2 = \sum_{k=1}^{K}\frac{n_k - 1}{n-K}\,\hat\sigma_k^2$$
df %>%
  group_by(y) %>%
  summarise(var_k = var(x), n = n()) %>%
  mutate(v = ((n - 1) / (sum(n) - 3)) * var_k) %>%
  summarise(sigma_sq = sum(v))
## # A tibble: 1 x 1
##   sigma_sq
##      <dbl>
## 1     1.47
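The pooled variance can be written either as a double sum of squared deviations divided by n − K or as a weighted sum of the per-class sample variances. A base R cross-check that the two forms agree, assuming x and y hold the fifteen values from the table (the variable names here are illustrative, not from the slides):

```r
# Data copied from the table (x values and class labels y)
x <- c(-1.6, 0.2, -0.9, -2.0, -3.0, 1.9, 1.2, 2.2, 2.7, -0.5, 1.8, 3.3, 5.0, 3.4, 4.2)
y <- rep(1:3, each = 5)

n <- length(x)
K <- length(unique(y))

# Form 1: squared deviations from each observation's class mean, divided by n - K
class_means     <- ave(x, y)   # class mean repeated for every observation
sigma_sq_direct <- sum((x - class_means)^2) / (n - K)

# Form 2: weighted sum of the per-class sample variances
n_k   <- tapply(x, y, length)
var_k <- tapply(x, y, var)
sigma_sq_weighted <- sum((n_k - 1) / (n - K) * var_k)

c(direct = sigma_sq_direct, weighted = sigma_sq_weighted)
```

Both entries should print as roughly 1.47, matching the dplyr result.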
df %>%
  group_by(y) %>%
  summarise(var_k = var(x), n = n()) %>%
  mutate(v = ((n - 1) / (sum(n) - 3)) * var_k) %>%
  summarise(sigma_sq = sum(v)) %>%
  pull(sigma_sq) -> sigma_sq
$$\delta_k(x) = x\,\frac{\mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \log(\pi_k)$$
x <- 2
x * (mu / sigma_sq) - mu^2 / (2 * sigma_sq) + log(pi)
## [1] -3.8155857  0.1795063 -0.5436021
Which class should we give this point?
x <- 6
x * (mu / sigma_sq) - mu^2 / (2 * sigma_sq) + log(pi)
## [1] -7.796499  4.269486  9.108750
Which class should we give this point?
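The rule is to assign the class whose discriminant score is largest. A minimal base R sketch of that step, assuming the same fifteen training points; the helper name discriminant() and the name prior (used instead of pi to avoid masking R's built-in constant) are illustrative choices, not from the slides:

```r
x_train <- c(-1.6, 0.2, -0.9, -2.0, -3.0, 1.9, 1.2, 2.2, 2.7, -0.5, 1.8, 3.3, 5.0, 3.4, 4.2)
y_train <- rep(1:3, each = 5)

# Plug-in estimates (base R equivalents of the dplyr pipelines)
prior    <- as.numeric(table(y_train)) / length(y_train)
mu       <- as.numeric(tapply(x_train, y_train, mean))
sigma_sq <- sum((x_train - ave(x_train, y_train))^2) / (length(x_train) - 3)

# Hypothetical helper: one discriminant score per class for a single new point
discriminant <- function(x_new) {
  x_new * (mu / sigma_sq) - mu^2 / (2 * sigma_sq) + log(prior)
}

# Classify to the class with the largest score
which.max(discriminant(2))  # class 2
which.max(discriminant(6))  # class 3
```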
We can turn $\hat\delta_k(x)$ into estimates for class probabilities:
$$\hat P(Y = k \mid X = x) = \frac{e^{\hat\delta_k(x)}}{\sum_{l=1}^{K} e^{\hat\delta_l(x)}}$$
x <- 6
d <- x * (mu / sigma_sq) - mu^2 / (2 * sigma_sq) + log(pi)
exp(d) / sum(exp(d))
## [1] 4.515655e-08 7.850755e-03 9.921492e-01
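The ratio of exponentials is a softmax. For large scores, exp() can overflow, so a common variation subtracts the maximum score first; this leaves the result unchanged mathematically. A sketch using the scores for x = 6 shown earlier (the helper name softmax() is an illustrative choice):

```r
# Discriminant scores for x = 6, copied from the earlier output
d <- c(-7.796499, 4.269486, 9.108750)

# Hypothetical helper: softmax with the maximum subtracted for numerical stability
softmax <- function(d) {
  z <- exp(d - max(d))
  z / sum(z)
}

posterior <- softmax(d)
posterior
sum(posterior)  # the estimated probabilities sum to 1

# Scores this large would give Inf / Inf = NaN without the shift
softmax(c(1000, 1001, 1002))
```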
lda() in the MASS package:
library(MASS)
model <- lda(y ~ x, data = df)
predict(model, newdata = data.frame(x = 6))
## $class
## [1] 3
## Levels: 1 2 3
##
## $posterior
##              1           2         3
## 1 4.515655e-08 0.007850755 0.9921492
##
## $x
##        LD1
## 1 3.968523
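The posterior reported by predict() agrees with the hand computation. A sketch of that comparison, assuming df holds the fifteen points from the table and using prior in place of pi:

```r
library(MASS)

df <- data.frame(
  x = c(-1.6, 0.2, -0.9, -2.0, -3.0, 1.9, 1.2, 2.2, 2.7, -0.5, 1.8, 3.3, 5.0, 3.4, 4.2),
  y = rep(1:3, each = 5)
)

# Posterior from MASS::lda at x = 6
model    <- lda(y ~ x, data = df)
lda_post <- predict(model, newdata = data.frame(x = 6))$posterior

# Posterior from the plug-in estimates and the discriminant formula
prior    <- as.numeric(table(df$y)) / nrow(df)
mu       <- as.numeric(tapply(df$x, df$y, mean))
sigma_sq <- sum((df$x - ave(df$x, df$y))^2) / (nrow(df) - 3)
d        <- 6 * (mu / sigma_sq) - mu^2 / (2 * sigma_sq) + log(prior)
manual_post <- exp(d) / sum(exp(d))

rbind(lda = as.numeric(lda_post), manual = manual_post)
```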
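$$f(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}}\, e^{-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)}$$
$$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2}\mu_k^T \Sigma^{-1} \mu_k + \log \pi_k$$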
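A minimal sketch of the multivariate discriminant in R. The helper name discriminant_mv() and all numeric inputs are hypothetical, chosen only to illustrate the formula; with p = 1 the expression reduces to the univariate score used earlier.

```r
# Hypothetical helper: multivariate discriminant score for one class
discriminant_mv <- function(x, mu_k, Sigma, pi_k) {
  Sigma_inv_mu <- solve(Sigma, mu_k)  # Sigma^{-1} mu_k without forming the inverse
  drop(crossprod(x, Sigma_inv_mu) - 0.5 * crossprod(mu_k, Sigma_inv_mu) + log(pi_k))
}

# Toy inputs (illustrative values only, not estimated from any data set)
Sigma <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)
mu_1  <- c(0, 0)
mu_2  <- c(2, 1)
x_new <- c(1.5, 1)

scores <- c(discriminant_mv(x_new, mu_1, Sigma, 0.5),
            discriminant_mv(x_new, mu_2, Sigma, 0.5))
which.max(scores)  # index of the class with the largest score

# p = 1: same value as x * mu / sigma^2 - mu^2 / (2 * sigma^2) + log(pi)
discriminant_mv(2, 1.5, matrix(1.47), 1 / 3)
```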
| | True Default (No) | True Default (Yes) | Total |
|---|---|---|---|
| Predicted Default (No) | 9644 | 252 | 9896 |
| Predicted Default (Yes) | 23 | 81 | 104 |
| Total | 9667 | 333 | 10000 |
What is the misclassification rate?
Since this is training error, what is a possible concern?
What would the error rate be if we always classified to the prior class, No default?
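Both rates can be read directly off the confusion table. A short sketch using the table's counts (the object names are illustrative):

```r
# Counts from the confusion table (rows = predicted, columns = true)
conf <- matrix(c(9644, 23, 252, 81), nrow = 2,
               dimnames = list(predicted = c("No", "Yes"), truth = c("No", "Yes")))
n <- sum(conf)

# Misclassification rate: off-diagonal counts over the total
misclass <- (conf["Yes", "No"] + conf["No", "Yes"]) / n
misclass    # 0.0275

# Always predicting "No" misclassifies every true "Yes"
null_error <- sum(conf[, "Yes"]) / n
null_error  # 0.0333
```

The always-No classifier is only about half a percentage point worse than LDA on overall error, which is why overall error alone is not a sufficient summary here.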
Of the true No's, we make 23/9667 = 0.2% errors; of the true Yes's, we make 252/333 = 75.7% errors!

What is the false positive rate in the Credit Default example?
What is the false negative rate in the Credit Default example?
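A sketch computing both rates from the confusion table counts, treating Yes (default) as the positive class; the object names are illustrative:

```r
# Counts from the confusion table (rows = predicted, columns = true)
conf <- matrix(c(9644, 23, 252, 81), nrow = 2,
               dimnames = list(predicted = c("No", "Yes"), truth = c("No", "Yes")))

# False positive rate: true No's predicted as Yes, 23 / 9667
fpr <- conf["Yes", "No"] / sum(conf[, "No"])

# False negative rate: true Yes's predicted as No, 252 / 333
fnr <- conf["No", "Yes"] / sum(conf[, "Yes"])

round(c(fpr = fpr, fnr = fnr), 4)
```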
An observation is assigned to the Yes class if
$$\hat P(\text{Default} \mid \text{Balance}, \text{Student}) \ge 0.5$$
We can change the two error rates by changing the threshold from 0.5 to some other number between 0 and 1:
$$\hat P(\text{Default} \mid \text{Balance}, \text{Student}) \ge \text{threshold}$$
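To see how the threshold trades one error rate against the other, here is a small sketch. The posterior probabilities and labels are made-up toy values, not taken from the Default data, and error_rates() is a hypothetical helper:

```r
# Toy posterior probabilities of "Yes" and true labels (illustrative only)
posterior_yes <- c(0.05, 0.10, 0.30, 0.45, 0.60, 0.80)
truth         <- c("No", "No", "Yes", "No", "Yes", "Yes")

# Hypothetical helper: false positive and false negative rates at one threshold
error_rates <- function(threshold) {
  predicted <- ifelse(posterior_yes >= threshold, "Yes", "No")
  c(threshold = threshold,
    fpr = sum(truth == "No"  & predicted == "Yes") / sum(truth == "No"),
    fnr = sum(truth == "Yes" & predicted == "No")  / sum(truth == "Yes"))
}

# One row per threshold
rates <- t(sapply(c(0.2, 0.5, 0.7), error_rates))
rates
```

In this toy example, lowering the threshold reduces the false negative rate at the cost of more false positives, and raising it does the reverse.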
Which do you think is better, higher or lower AUC?
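AUC equals the probability that a randomly chosen positive gets a higher score than a randomly chosen negative, so 0.5 corresponds to random ranking and 1 to perfect separation. A rank-based sketch on made-up toy scores (not the Default data); auc() is a hypothetical helper, not a function from the slides:

```r
# Toy scores and labels (illustrative only)
posterior_yes <- c(0.05, 0.10, 0.30, 0.45, 0.60, 0.80)
truth         <- c("No", "No", "Yes", "No", "Yes", "Yes")

# Hypothetical helper: AUC from ranks (Mann-Whitney form)
auc <- function(score, is_positive) {
  r  <- rank(score)            # average ranks are used for ties
  n1 <- sum(is_positive)
  n0 <- sum(!is_positive)
  (sum(r[is_positive]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc(posterior_yes, truth == "Yes")                 # 8/9: positives mostly ranked higher
auc(1 - posterior_yes, truth == "Yes")             # 1/9: reversed ranking
auc(c(1, 2, 3, 4), c(FALSE, FALSE, TRUE, TRUE))    # 1: perfect separation
```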
Use the lda() function in R from the MASS package to fit the model, the predict() function to get predictions, the mutate() function to add the predicted class, and the summarise() function to add the false positive and false negative rates:

library(MASS)
library(dplyr)  # assumed: provides %>%, mutate(), and summarise()
# Default is assumed to be already loaded (it ships with the ISLR package)
model <- lda(default ~ balance + student + income, data = Default)
predictions <- predict(model)
Default %>%
  mutate(predicted_class = predictions$class) %>%
  summarise(fpr = sum(default == "No" & predicted_class == "Yes") / sum(default == "No"),
            fnr = sum(default == "Yes" & predicted_class == "No") / sum(default == "Yes"))
##           fpr       fnr
## 1 0.002275784 0.7627628

conf_mat() expects your outcome to be a factor variable:

library(MASS)
library(tidymodels)
model <- lda(default ~ balance + student + income, data = Default)
predictions <- predict(model)
Default %>%
  mutate(predicted_class = predictions$class,
         default = as.factor(default)) %>%
  conf_mat(default, predicted_class) %>%
  autoplot(type = "heatmap")