LDAappex-02-lda

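$$\hat{\pi}_k = \frac{n_k}{n}$$

$$\hat{\mu}_k = \frac{1}{n_k}\sum_{i:\, y_i = k} x_i$$

$$\hat{\sigma}^2 = \frac{1}{n-K}\sum_{k=1}^{K}\sum_{i:\, y_i = k}(x_i - \hat{\mu}_k)^2 = \sum_{k=1}^{K}\frac{n_k - 1}{n-K}\,\hat{\sigma}_k^2$$

$$\hat{\sigma}_k^2 = \frac{1}{n_k - 1}\sum_{i:\, y_i = k}(x_i - \hat{\mu}_k)^2$$
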
| x | -1.6 | 0.2 | -0.9 | -2.0 | -3.0 | 1.9 | 1.2 | 2.2 | 2.7 | -0.5 | 1.8 | 3.3 | 5.0 | 3.4 | 4.2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| y | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 |
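
The R code in this section refers to a data frame called `df` that is never constructed in the text. A minimal sketch of how it could be built from the table above, assuming the columns are named `x` and `y` (the names the later code uses) and that `y` is stored as a plain number:

```r
# Training data transcribed from the table above.
# Assumption: columns are named x and y, and y is numeric (not a factor).
df <- data.frame(
  x = c(-1.6, 0.2, -0.9, -2.0, -3.0,
         1.9, 1.2,  2.2,  2.7, -0.5,
         1.8, 3.3,  5.0,  3.4,  4.2),
  y = rep(c(1, 2, 3), each = 5)
)

# Quick check: 15 observations, 5 per class
table(df$y)
```

A base `data.frame` is used here so the sketch needs no packages; the dplyr verbs used later accept it as-is.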

$$\hat{\pi}_k = \frac{n_k}{n}$$

    df %>%
      group_by(y) %>%
      summarise(n = n()) %>%
      mutate(pi = n / sum(n))

    ## # A tibble: 3 x 3
    ##       y     n    pi
    ##   <dbl> <int> <dbl>
    ## 1     1     5 0.333
    ## 2     2     5 0.333
    ## 3     3     5 0.333

- `group_by()`: do calculations on groups
- `summarise()`: reduce variables to values
- `mutate()`: add new variables

How do we pull $\hat{\pi}_k$ out into their own R object?

    # Note: assigning to `pi` masks R's built-in constant pi for the rest of the session
    df %>%
      group_by(y) %>%
      summarise(n = n()) %>%
      mutate(pi = n / sum(n)) %>%
      pull(pi) -> pi

    pi

    ## [1] 0.3333333 0.3333333 0.3333333

$$\hat{\mu}_k = \frac{1}{n_k}\sum_{i:\, y_i = k} x_i$$

    df %>%
      group_by(y) %>%
      summarise(mu = mean(x))

    ## # A tibble: 3 x 2
    ##       y    mu
    ##   <dbl> <dbl>
    ## 1     1 -1.46
    ## 2     2  1.5
    ## 3     3  3.54

    df %>%
      group_by(y) %>%
      summarise(mu = mean(x)) %>%
      pull(mu) -> mu

$$\hat{\sigma}^2 = \sum_{k=1}^{K}\frac{n_k - 1}{n-K}\,\hat{\sigma}_k^2$$

    df %>%
      group_by(y) %>%
      summarise(var_k = var(x), n = n()) %>%
      mutate(v = ((n - 1) / (sum(n) - 3)) * var_k) %>%  # 3 = K, the number of classes
      summarise(sigma_sq = sum(v))

    ## # A tibble: 1 x 1
    ##   sigma_sq
    ##      <dbl>
    ## 1     1.47

    df %>%
      group_by(y) %>%
      summarise(var_k = var(x), n = n()) %>%
      mutate(v = ((n - 1) / (sum(n) - 3)) * var_k) %>%
      summarise(sigma_sq = sum(v)) %>%
      pull(sigma_sq) -> sigma_sq

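
As a hand check on the pooled variance, the arithmetic can be written out from the table values. This is a worked sketch, not part of the original slides; the within-class sums of squares are computed around the class means $-1.46$, $1.5$ and $3.54$, with $n_k = 5$ and $n - K = 12$.

```latex
\begin{aligned}
\hat{\sigma}_1^2 &= \tfrac{5.752}{4} = 1.438, \qquad
\hat{\sigma}_2^2 = \tfrac{6.180}{4} = 1.545, \qquad
\hat{\sigma}_3^2 = \tfrac{5.672}{4} = 1.418 \\[4pt]
\hat{\sigma}^2 &= \sum_{k=1}^{3} \frac{n_k - 1}{n - K}\,\hat{\sigma}_k^2
 = \frac{4}{12}\,(1.438 + 1.545 + 1.418)
 = \frac{17.604}{12} = 1.467
\end{aligned}
```

Rounded to two decimals this is the 1.47 that the R pipeline prints.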
$$\delta_k(x) = x\,\frac{\mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \log(\pi_k)$$

    x <- 2
    x * (mu / sigma_sq) - mu^2 / (2 * sigma_sq) + log(pi)

    ## [1] -3.8155857  0.1795063 -0.5436021

Which class should we give this point?

    x <- 6
    x * (mu / sigma_sq) - mu^2 / (2 * sigma_sq) + log(pi)

    ## [1] -7.796499  4.269486  9.108750

Which class should we give this point?
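
The classification rule is to pick the class whose discriminant score is largest. A minimal base-R sketch of that step, assuming the training data from the table in this section; `lda_scores()` and the variable names are hypothetical helpers introduced here, and `prior` is used instead of `pi` to avoid masking R's built-in constant.

```r
# Training data transcribed from the table in this section
x_train <- c(-1.6, 0.2, -0.9, -2.0, -3.0,
              1.9, 1.2,  2.2,  2.7, -0.5,
              1.8, 3.3,  5.0,  3.4,  4.2)
y_train <- rep(1:3, each = 5)

# Plug-in estimates: class proportions, class means, pooled variance
n_k       <- as.vector(table(y_train))
prior     <- n_k / sum(n_k)
mu        <- as.vector(tapply(x_train, y_train, mean))
within_ss <- as.vector(tapply(x_train, y_train, function(v) sum((v - mean(v))^2)))
sigma_sq  <- sum(within_ss) / (length(x_train) - length(n_k))

# Hypothetical helper: one discriminant score per class for a new point x0
lda_scores <- function(x0) {
  x0 * mu / sigma_sq - mu^2 / (2 * sigma_sq) + log(prior)
}

# The predicted class is the index of the largest score
which.max(lda_scores(2))
which.max(lda_scores(6))
```

With these data the largest score belongs to class 2 at `x = 2` and to class 3 at `x = 6`, consistent with the scores printed above.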
We can turn $\hat{\delta}_k(x)$ into estimates for class probabilities:

$$\hat{P}(Y = k \mid X = x) = \frac{e^{\hat{\delta}_k(x)}}{\sum_{l=1}^{K} e^{\hat{\delta}_l(x)}}$$

    x <- 6
    d <- x * (mu / sigma_sq) - mu^2 / (2 * sigma_sq) + log(pi)
    exp(d) / sum(exp(d))

    ## [1] 4.515655e-08 7.850755e-03 9.921492e-01

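
The expression `exp(d) / sum(exp(d))` works for the scores in this example, but `exp()` overflows to `Inf` once a score is large (beyond roughly 709 in double precision). A common workaround, sketched here as an optional variation rather than something the slides use, is to subtract the largest score before exponentiating; the common factor cancels, so the probabilities are unchanged. The function name `softmax` is introduced here for illustration.

```r
# Hypothetical helper: turn discriminant scores into class probabilities,
# subtracting max(d) first so that exp() cannot overflow.
softmax <- function(d) {
  z <- exp(d - max(d))
  z / sum(z)
}

# Scores for x = 6, copied from the output in this section
d <- c(-7.796499, 4.269486, 9.108750)

softmax(d)          # matches exp(d) / sum(exp(d))
softmax(d + 1000)   # same probabilities; exp(d + 1000) on its own would be Inf
```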
`lda()` in the MASS package

    library(MASS)
    model <- lda(y ~ x, data = df)
    predict(model, newdata = data.frame(x = 6))

    ## $class
    ## [1] 3
    ## Levels: 1 2 3
    ##
    ## $posterior
    ##              1           2         3
    ## 1 4.515655e-08 0.007850755 0.9921492
    ##
    ## $x
    ##        LD1
    ## 1 3.968523

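$$f(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}}\; e^{-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)}$$

$$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2}\mu_k^T \Sigma^{-1} \mu_k + \log \pi_k$$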
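
To connect the multivariate discriminant to the one-predictor version used earlier, it helps to set $p = 1$. The following is a short worked reduction, added as a sketch: with a single predictor, $\Sigma$ is the scalar $\sigma^2$, so $\Sigma^{-1} = 1/\sigma^2$ and the transposes drop out.

```latex
\begin{aligned}
\delta_k(x)
 &= x^T \Sigma^{-1} \mu_k - \tfrac{1}{2}\,\mu_k^T \Sigma^{-1} \mu_k + \log \pi_k \\
 &= x\,\frac{\mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \log \pi_k
 \qquad (p = 1,\ \Sigma = \sigma^2)
\end{aligned}
```

In both cases the score is linear in $x$, which is where the "linear" in linear discriminant analysis comes from.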

| | True Default (No) | True Default (Yes) | Total |
|---|---|---|---|
| Predicted Default (No) | 9644 | 252 | 9896 |
| Predicted Default (Yes) | 23 | 81 | 104 |
| Total | 9667 | 333 | 10000 |

What is the misclassification rate?

Since this is training error, what is a possible concern?

What would the error rate be if we classified to the prior, No default?

Of the true No's, we make 23/9667 = 0.2% errors; of the true Yes's, we make 252/333 = 75.7% errors!

What is the false positive rate in the Credit Default example?

What is the false negative rate in the Credit Default example?
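
The questions about the Credit Default table all reduce to arithmetic on its four cells. A minimal base-R sketch, assuming only the counts shown in the table (nothing is refit here, and the variable names are introduced for illustration):

```r
# Confusion matrix from the Credit Default table: rows = predicted, columns = truth
conf <- matrix(
  c(9644, 23,    # truth = No:  predicted No, predicted Yes
     252, 81),   # truth = Yes: predicted No, predicted Yes
  nrow = 2,
  dimnames = list(predicted = c("No", "Yes"), truth = c("No", "Yes"))
)
n <- sum(conf)

# Overall training error: off-diagonal cells over the total
misclassification <- (conf["Yes", "No"] + conf["No", "Yes"]) / n

# Error if everyone is classified as "No" (the majority class): all true Yes's are missed
prior_only_error <- sum(conf[, "Yes"]) / n

# False positive rate: true No's predicted Yes; false negative rate: true Yes's predicted No
fpr <- conf["Yes", "No"] / sum(conf[, "No"])
fnr <- conf["No", "Yes"] / sum(conf[, "Yes"])

round(c(misclassification = misclassification,
        prior_only_error  = prior_only_error,
        fpr = fpr, fnr = fnr), 4)
```

The overall error (275/10000) is only slightly below the error of always predicting No (333/10000), which is why the class-specific rates are worth looking at separately.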
We assign an observation to the Yes class if

$$\hat{P}(\text{Default} \mid \text{Balance}, \text{Student}) \ge 0.5$$

We can change the two error rates by changing the *threshold* from 0.5 to some other number between 0 and 1:

$$\hat{P}(\text{Default} \mid \text{Balance}, \text{Student}) \ge \text{threshold}$$
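
To see how the threshold trades one error rate against the other, here is a small base-R sketch. The posterior probabilities and true labels below are made up purely for illustration, and `classify_at()` and `error_rates()` are hypothetical helper names, not functions from any package.

```r
# Made-up posterior probabilities of default and true labels (illustration only)
post_yes <- c(0.05, 0.30, 0.45, 0.70)
truth    <- c("No", "Yes", "No", "Yes")

# Hypothetical helper: classify to "Yes" when the posterior reaches the threshold
classify_at <- function(post_yes, threshold = 0.5) {
  ifelse(post_yes >= threshold, "Yes", "No")
}

# Hypothetical helper: false positive and false negative rates
error_rates <- function(pred, truth) {
  c(fpr = sum(truth == "No"  & pred == "Yes") / sum(truth == "No"),
    fnr = sum(truth == "Yes" & pred == "No")  / sum(truth == "Yes"))
}

error_rates(classify_at(post_yes, 0.5), truth)  # default threshold
error_rates(classify_at(post_yes, 0.2), truth)  # lower threshold
```

On this toy input, lowering the threshold from 0.5 to 0.2 removes the false negatives at the cost of new false positives, which is the trade-off the threshold controls.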



Which do you think is better, higher or lower AUC?
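
One way to make the AUC concrete is to compute it directly. The sketch below uses the rank-based (Mann-Whitney) form of the AUC, which equals the proportion of positive/negative pairs in which the positive case gets the higher score; an AUC of 0.5 corresponds to random ranking and 1 to perfect separation. The scores and labels are made up for illustration, and `auc_rank()` is a hypothetical helper name.

```r
# Hypothetical helper: AUC via the rank-sum (Mann-Whitney) identity
auc_rank <- function(score, is_positive) {
  r  <- rank(score)          # average ranks are used for ties
  n1 <- sum(is_positive)     # number of positive cases
  n0 <- sum(!is_positive)    # number of negative cases
  (sum(r[is_positive]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Made-up scores and labels (illustration only)
score       <- c(0.1, 0.4, 0.35, 0.8)
is_positive <- c(FALSE, FALSE, TRUE, TRUE)

auc_rank(score, is_positive)                # 3 of the 4 positive/negative pairs are ordered correctly
auc_rank(c(0.1, 0.2, 0.8, 0.9), is_positive)  # perfectly separated scores
```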
Fit the model with the `lda()` function in R from the MASS package, get the fitted classes with the `predict()` function, attach them to the data with the `mutate()` function, and use the `summarise()` function to add the false positive and false negative rates.

    library(MASS)
    library(ISLR)        # assumed source of the Default data
    library(tidymodels)  # also attaches dplyr for %>%, mutate(), summarise()

    model <- lda(default ~ balance + student + income, data = Default)
    predictions <- predict(model)

    Default %>%
      mutate(predicted_class = predictions$class) %>%
      summarise(
        fpr = sum(default == "No" & predicted_class == "Yes") / sum(default == "No"),
        fnr = sum(default == "Yes" & predicted_class == "No") / sum(default == "Yes")
      )

    ##           fpr       fnr
    ## 1 0.002275784 0.7627628

`conf_mat()` expects your outcome to be a factor variable.

    Default %>%
      mutate(predicted_class = predictions$class,
             default = as.factor(default)) %>%
      conf_mat(default, predicted_class) %>%
      autoplot(type = "heatmap")