7.4 Logistic regression
With a binary outcome measure, logistic regression is generally more appropriate than linear (OLS) regression. Use the glm()
function to estimate a generalized linear model, and specify the model family as binomial
within the arguments.
# create binary measure of "above average math proficiency"
dcps =
  dcps %>%
  mutate(AboveAvgMath = if_else(ProfMath > mean(ProfMath), 1, 0))

Model3 =
  glm(
    AboveAvgMath ~ ProfLang + NumTested, # specify model
    family = 'binomial', # logistic estimation
    data = dcps
  )
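Written out, the model specified in the glm() call is assumed to take the standard logit form. The symbols \(p\) (the probability that a school is above average in math) and \(\beta_0, \beta_1, \beta_2\) are notation introduced here for illustration:

\[
\log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1\,\text{ProfLang} + \beta_2\,\text{NumTested}
\]

Because the left-hand side is the log of the odds, exponentiating a coefficient, e.g. \(e^{\beta_1}\), gives the multiplicative change in the odds for a one-unit increase in that predictor, holding the other predictor fixed.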
To view the coefficient estimates and evaluate hypotheses, again apply the summary()
function to the model object.
# View estimates
summary(Model3)
##
## Call:
## glm(formula = AboveAvgMath ~ ProfLang + NumTested, family = "binomial",
## data = dcps)
##
## Deviance Residuals:
##    Min      1Q  Median      3Q     Max
## -2.938  -0.547  -0.351   0.213   2.115
##
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.22427    0.61253   -5.26  1.4e-07 ***
## ProfLang     0.11366    0.02412    4.71  2.4e-06 ***
## NumTested   -0.00239    0.00229   -1.04      0.3
## ---
## Signif. codes:
## 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 144.342 on 107 degrees of freedom
## Residual deviance: 77.127 on 105 degrees of freedom
## AIC: 83.13
##
## Number of Fisher Scoring iterations: 6
# Odds ratios
exp(coef(Model3))
## (Intercept)    ProfLang   NumTested
##     0.03978     1.12037     0.99761
The results indicate that a one-percentage-point increase in a school’s language proficiency is expected to raise the odds of being above average in math by about 12% (odds ratio 1.12), conditional on the number of students tested. Again, the increase is significant (\(p < 0.001\)).
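The arithmetic behind this interpretation can be reproduced from the reported estimates alone. The sketch below copies the coefficients from the summary output rather than refitting the model, and the school used for the predicted probability (40% language proficiency, 100 students tested) is a hypothetical example, not a row from the data.

```r
# Coefficient estimates copied from the summary(Model3) output
b = c(intercept = -3.22427, ProfLang = 0.11366, NumTested = -0.00239)

# Percentage change in the odds for a one-unit increase in each predictor
pct_change = (exp(b[c("ProfLang", "NumTested")]) - 1) * 100
round(pct_change, 1)

# Predicted probability for a hypothetical school (illustrative values only):
# 40% language proficiency, 100 students tested
log_odds = b[["intercept"]] + b[["ProfLang"]] * 40 + b[["NumTested"]] * 100
p_hat = plogis(log_odds) # inverse logit: exp(x) / (1 + exp(x))
p_hat
```

With the fitted model object available, predict(Model3, newdata = ..., type = "response") should return the same kind of predicted probability without copying coefficients by hand.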