Estimating the average marginal effect of binary and continuous variables in a logit model in R

Using data from the National Health Interview Survey, I am hoping to analyze the average marginal effect that a variety of demographic factors have on the predicted probability of having hypertension, using a logistic regression. To clarify, by average marginal effect I mean the marginal effect computed at the mean of every X (like the Stata output).

My issue is that I have both binary and continuous independent variables, but from what I've read, it doesn't make sense to evaluate the binary variables at their mean, since it's either a 0 or 1. I don't know how to make the regression run where I can evaluate the continuous variables at their mean, but not the binary ones. Here is the code I have so far.


#Here I create a data frame of the means of the continuous variables 
mean_df=df %>% select(c(AGE,BMICALC,FAMSIZE,YEARSONJOB,HOURSWRK)) %>% summarise_all(mean)


#here is my regression, variables here not in the line of code above are binary 
logit_margin_diabetes <- glm(DIABETES~scale(AGE)+scale(IMMIGRANT)+scale(HOURSWRK)+scale(BELOW_TWICE_POVERTY)
+scale(BMICALC)+scale(FEMALE)+scale(FAMSIZE)+scale(EDUC_1)+scale(EDUC_2)+scale(EDUC_3)+
scale(EDUC_4)+scale(SMOKE)+scale(MARRIED)+scale(HISP)+scale(AFR_AM)+scale(WHITE), data = df,family="binomial")

#This is the stage where I want to apply the logit so it is evaluated at the means of the continuous variables. But I don't know what to do about the binary variables 
marg_mean<-margins(logit_margin_diabetes,data=mean_df)
summary(marg_mean)

Apologies, it was difficult for me to produce an MRE, since I don't know of a dataset in R that has this sort of information. But if anyone can provide any advice, that would be greatly appreciated. Thanks.

Here is the modified code per the comment below. But I would like the output to show the SE, AME, and p-values too.

margins(logit_margin, at=list(AGE=35.93349,BMICALC=26.90704, FAMSIZE=2.495413, YEARSONJOB=4.538336,
                                        HOURSWRK=32.53768,IMMIGRANT=1,
                                        BELOW_TWICE_POVERTY=1, FEMALE=1,
                                       EDUC_1=1,EDUC_2=1,EDUC_3=1,EDUC_4=1,
                                        SMOKE=1,MARRIED=1,HISP=1,
                                        AFR_AM=1,WHITE=1))
summary(marg_mean)

[Image: screenshot of the summary(marg_mean) output]

This is a photo of the new output I see after running summary(marg_mean)

The margins package takes care of this automatically if you declare a variable to be a factor. See the subsetting section of the vignette, or inspect the source code, to see that marginal effects are computed as differences for factor variables.

Note that the default setting for margins is to compute the "average marginal effect", and not the "marginal effect at the mean". IMO, the default setting is best in most cases, but if you insist on considering a "synthetic" average observation, it is easy to do with the at argument of the margins function.

Code example. In the first case, vs is treated as a continuous variable. In the second, vs is treated as a binary variable.

library(margins)

mod1 <- glm(am ~ hp + vs, data=mtcars, family=binomial)
mod2 <- glm(am ~ hp + factor(vs), data=mtcars, family=binomial)

margins(mod1)
#> Average marginal effects
#> glm(formula = am ~ hp + vs, family = binomial, data = mtcars)
#>        hp       vs
#>  -0.00203 -0.03193

margins(mod2)
#> Average marginal effects
#> glm(formula = am ~ hp + factor(vs), family = binomial, data = mtcars)
#>        hp      vs1
#>  -0.00203 -0.03154
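
To make the "differences for factor variables" point concrete, here is a rough base-R sketch of what an average marginal effect amounts to for each kind of variable. It is only an illustration under my own assumptions, not the package's actual implementation (margins uses numerical derivatives internally), and the names p_hat, ame_hp, d0, d1 and ame_vs are made up for this example.

```r
# Same model as mod2 above: hp is continuous, vs is treated as a binary factor.
mod2 <- glm(am ~ hp + factor(vs), data = mtcars, family = binomial)

# Continuous variable: average the derivative of the predicted probability
# over the observed data. For a logit this derivative is beta * p * (1 - p).
p_hat  <- predict(mod2, type = "response")
ame_hp <- mean(coef(mod2)["hp"] * p_hat * (1 - p_hat))

# Binary variable: set vs to 1 for everyone, then to 0 for everyone,
# and average the difference in predicted probabilities.
d1 <- transform(mtcars, vs = 1)
d0 <- transform(mtcars, vs = 0)
ame_vs <- mean(predict(mod2, newdata = d1, type = "response") -
               predict(mod2, newdata = d0, type = "response"))

print(c(hp = unname(ame_hp), vs1 = ame_vs))
```

The two numbers should land close to the hp and vs1 columns printed by margins(mod2), up to numerical precision.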

Edit: Here's an example of the at argument:

margins(mod1, at=list(hp=200, vs=0))
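
If you also want standard errors and p-values, calling summary() on the margins object should give them. Below is a small sketch, again on mtcars rather than the NHIS data, that holds the continuous variable at its mean while leaving the binary factor to be handled as a discrete difference. The names dat, mod, mem and mem_summary are just placeholders, and converting vs to a factor before fitting is my own choice here.

```r
library(margins)

# Convert the binary variable to a factor up front so margins treats it
# as a discrete change rather than a derivative.
dat <- transform(mtcars, vs = factor(vs))
mod <- glm(am ~ hp + vs, data = dat, family = binomial)

# Evaluate marginal effects with the continuous variable fixed at its mean.
mem <- margins(mod, data = dat, at = list(hp = mean(dat$hp)))

# summary() returns a data frame with columns such as AME, SE, z, p, lower, upper.
mem_summary <- summary(mem)
print(mem_summary)
```

With several continuous variables, each one would get its own entry in the at list; binary variables stored as factors do not need to be listed there.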
