
Regression in R with grouped variables

The dependent variable Value in the data frame DF is predicted from the independent variables Mean, X, and Y as follows:

DF <- DF %>% 
    group_by(Country, Sex) %>%
    do({ 
        mod = lm(Value ~ Mean + X + Y, data = .) 
        A <- predict(mod, .)
        data.frame(., A)
    })

The data are grouped by Country and Sex, so the fitted model can be written as:

Value(Country, Sex) = a0(Country, Sex) + a1(Country, Sex) Mean + a2(Country, Sex) X + a3(Country, Sex) Y

However, I want to use this formula:

Value(Country, Sex) = a0(Country, Sex) + a1(Country, Sex) Mean + a2(Country) X + a3(Country) Y

where a2 and a3 are independent of Sex. How can I do this?

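I don't think you can do this while grouping by both Country and Sex, because each group's model is fitted separately and so cannot share a2 and a3 across sexes. Instead, you could group by Country only and add interactions with Sex: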

DF <- DF %>% 
    group_by(Country) %>%
    do({ 
        mod = lm(Value ~ Sex + Mean*Sex + X + Y, data = .) 
        A <- predict(mod, .)
        data.frame(., A)
    })

Alternatively, estimate the model in one go, adding interactions with both Sex and Country:

mod <- lm(Value ~ Sex*Country*Mean + Country*X + Country*Y, data = DF)
A <- predict(mod)
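
As a rough check that the two approaches describe the same model, the sketch below fits both on simulated data and compares the predictions. The data, the group labels, and the names A_pooled and A_split are assumptions made up for illustration; only the model formulas come from the answer above. It uses base R rather than dplyr so that it runs without extra packages.

```r
# Simulated data: two countries, two sexes, 30 rows per cell.
# All values and labels here are illustrative assumptions.
set.seed(1)
DF <- expand.grid(Country = c("A", "B"), Sex = c("F", "M"), rep = 1:30)
DF$Mean  <- rnorm(nrow(DF))
DF$X     <- rnorm(nrow(DF))
DF$Y     <- rnorm(nrow(DF))
DF$Value <- 1 + 2 * DF$Mean + 0.5 * DF$X - DF$Y + rnorm(nrow(DF), sd = 0.1)

# Pooled model: intercept and Mean slope vary by Country and Sex,
# while the X and Y slopes vary by Country only.
pooled <- lm(Value ~ Sex * Country * Mean + Country * X + Country * Y, data = DF)
DF$A_pooled <- unname(predict(pooled, DF))

# Per-country models: intercept and Mean slope vary by Sex,
# while the X and Y slopes are shared across sexes.
DF$A_split <- NA_real_
for (cty in unique(as.character(DF$Country))) {
  rows <- DF$Country == cty
  fit  <- lm(Value ~ Sex * Mean + X + Y, data = DF[rows, ])
  DF$A_split[rows] <- unname(predict(fit, DF[rows, ]))
}

# The two sets of predictions are expected to agree up to numerical tolerance.
all.equal(DF$A_pooled, DF$A_split)
```

The predictions should coincide because both specifications span the same set of regressors. One practical difference is that the pooled model estimates a single residual variance for all countries, which affects standard errors but not the fitted values.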

