To illustrate logistic regression in R, let’s use the default data form the ISLR package.
library(ISLR)
data(Default)
head(Default)
## default student balance income
## 1 No No 729.5265 44361.625
## 2 No Yes 817.1804 12106.135
## 3 No No 1073.5492 31767.139
## 4 No No 529.2506 35704.494
## 5 No No 785.6559 38463.496
## 6 No Yes 919.5885 7491.559
summary(Default)
## default student balance income
## No :9667 No :7056 Min. : 0.0 Min. : 772
## Yes: 333 Yes:2944 1st Qu.: 481.7 1st Qu.:21340
## Median : 823.6 Median :34553
## Mean : 835.4 Mean :33517
## 3rd Qu.:1166.3 3rd Qu.:43808
## Max. :2654.3 Max. :73554
We observation corresponds to a credit card account.
We want to relate the binary outcome default to x=balance.
Is the outstanding balance on the card related to whether or not the account holder defaults.
boxplot(balance~default,Default)
lgm = glm(default~balance,Default,family=binomial(link = "logit"))
summary(lgm)
##
## Call:
## glm(formula = default ~ balance, family = binomial(link = "logit"),
## data = Default)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.2697 -0.1465 -0.0589 -0.0221 3.7589
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.065e+01 3.612e-01 -29.49 <2e-16 ***
## balance 5.499e-03 2.204e-04 24.95 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2920.6 on 9999 degrees of freedom
## Residual deviance: 1596.5 on 9998 degrees of freedom
## AIC: 1600.5
##
## Number of Fisher Scoring iterations: 8
phat = predict(lgm,type="response")
summary(phat)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000237 0.0003346 0.0021888 0.0333000 0.0142324 0.9810081
plot(Default$balance,phat)
bhat = lgm$coefficients
cat("bhat:",bhat,"\n")
## bhat: -10.65133 0.005498917
eta = bhat[1] + bhat[2] * Default$balance
phat0 = 1/(1+exp(-eta))
summary(phat-phat0)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1.110e-16 -3.390e-21 0.000e+00 -1.299e-19 0.000e+00 1.110e-16
How do I check bhat is the MLE??