
GLM - No R-squared output when running simple linear regression with categorical predictor

I am running a simple linear regression with a numerical response (wellbeing) and a categorical explanatory (education) variable. I know that there are ideas about dealing with the categorical variable as continuous, but in this case I want to keep treating it as a factor.

Now...

When I want to assess the quality of this model with R-squared, glance() from the broom package does not return that metric.

As I understand it, the null model here is the mean of the response variable, and the linear model I have fitted maps the response onto the explanatory variable. There must be some kind of effect size to gauge here.

What do you think? Why can't I get R-squared, and is there another kind of effect size that would tell me how much the model improves by including this categorical predictor?

library(tibble)

df <- tibble(education = c("Low", "Medium", "High", "Low", "Medium", "High", "High"),
             wellbeing = c(7, 6, 7, 4, 5, 4, 5))
df$education <- as.factor(df$education)

mdl <- glm(
  wellbeing ~ education + 0, 
  data = df,
  family = gaussian
)

library(dplyr)
library(broom)
mdl %>%
  glance() %>%
  pull(r.squared)

As pointed out in the comments by @Roland, you can use lm() and call summary() on the result. Note that lm() has no family argument, so it is dropped here:

summary(lm(
  wellbeing ~ education + 0, 
  data = df
))$r.squared

 0.9552469
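One caveat about this number: because the formula contains `+ 0`, the model has no intercept, and R then measures R-squared against a baseline of zero instead of the mean of the response. The question describes the null model as the mean of wellbeing, so a fit with an intercept may be closer to what was intended. The following is a minimal sketch using the question's data; the names `fit_no_int` and `fit_int` are illustrative and not from the original post.

```r
# Data from the question, built with base R so no extra packages are needed
df <- data.frame(
  education = factor(c("Low", "Medium", "High", "Low", "Medium", "High", "High")),
  wellbeing = c(7, 6, 7, 4, 5, 4, 5)
)

# No intercept (+ 0): R-squared compares the fit to a baseline of zero
fit_no_int <- lm(wellbeing ~ education + 0, data = df)
r2_no_int <- summary(fit_no_int)$r.squared

# With intercept: R-squared compares the fit to the mean of wellbeing
fit_int <- lm(wellbeing ~ education, data = df)
r2_int <- summary(fit_int)$r.squared

r2_no_int  # the value reported in this answer
r2_int     # much smaller for this data
```

Both models give the same fitted group means; only the baseline used for R-squared differs. That difference is why the no-intercept value looks so high for a predictor that barely separates the groups.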

Or, since we know that the formula for R-squared in an ordinary least squares fit is:

$$R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}$$

We can pull this out from your glm results:

mdl <- glm(
  wellbeing ~ education + 0, 
  data = df,
  family = gaussian
)

1 - mdl$deviance/mdl$null.deviance

 0.9552469
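To see what the two deviance components stand for, the same quantity can be rebuilt from the residuals. This is a sketch under the assumption that the model is Gaussian with the identity link and has no intercept, as in the question; in that case the residual deviance is the residual sum of squares and the null deviance is the sum of squared responses. The names `rss`, `ss_zero`, `r2_manual` and `r2_glm` are illustrative.

```r
# Data and model from the question (base R data frame instead of a tibble)
df <- data.frame(
  education = factor(c("Low", "Medium", "High", "Low", "Medium", "High", "High")),
  wellbeing = c(7, 6, 7, 4, 5, 4, 5)
)
mdl <- glm(wellbeing ~ education + 0, data = df, family = gaussian)

# Residual sum of squares: squared distances from the fitted group means
rss <- sum(residuals(mdl, type = "response")^2)

# With no intercept, the null model predicts zero, so the baseline is sum(y^2)
ss_zero <- sum(df$wellbeing^2)

r2_manual <- 1 - rss / ss_zero
r2_glm <- 1 - mdl$deviance / mdl$null.deviance

all.equal(r2_manual, r2_glm)
```

For non-Gaussian families the ratio of deviances is no longer the ordinary R-squared, so this shortcut should be read as specific to the Gaussian case shown here.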

