
model.matrix explanation in R

I'm trying to understand some code that builds a model matrix in R, but I'm having trouble with some basic formula syntax.

Here's some reproducible code below:

test_df <- data.frame(category = c("Poetry", "Narrative Film", "Music"),
                      country = c("GB", "US", "US"),
                      usd_goal_real = c(1534, 30000, 45000),
                      time_int = c(59, 60, 45),
                      state = c(0, 0, 0))
test_df2 <- data.frame(model.matrix( ~ . -1, test_df))
test_df3 <- data.frame(model.matrix( ~ . , test_df))

What exactly is specified in the line test_df2 <- data.frame(model.matrix( ~ . -1, test_df)) ?

Specifically, what does the ~ . -1 mean? Is this excluding a field from the model? How does it differ from the formula ~ . in the next line?

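The simplest answer is that the . stands for every column in test_df, and the -1 removes the intercept term from the model. The intercept is the column of ones that model.matrix names (Intercept), which data.frame() then renames to X.Intercept. data.frame(model.matrix( ~ . -1, test_df)) produces: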

  categoryMusic categoryNarrative.Film categoryPoetry countryUS usd_goal_real time_int state
1             0                      0              1         0          1534       59     0
2             0                      1              0         1         30000       60     0
3             1                      0              0         1         45000       45     0

and data.frame(model.matrix( ~ . , test_df)) produces:

  X.Intercept. categoryNarrative.Film categoryPoetry countryUS usd_goal_real time_int state
1            1                      0              1         0          1534       59     0
2            1                      1              0         1         30000       60     0
3            1                      0              0         1         45000       45     0

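Because the model contains a categorical variable, you will also notice that the Music level of category disappears when the intercept is included. The first level of the factor (Music, alphabetically) becomes the baseline that the intercept represents, and the remaining levels are measured relative to it. country is coded against its first level, GB, in both matrices, because only the first factor in the formula receives a full set of indicator columns when the intercept is removed.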
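To see which column the intercept replaces, it may help to compare the column names of the two matrices before data.frame() alters them. This is a minimal sketch that reuses test_df from the question; the names mm_no_int and mm_int are only illustrative.

```r
# Same data as in the question
test_df <- data.frame(
  category = c("Poetry", "Narrative Film", "Music"),
  country = c("GB", "US", "US"),
  usd_goal_real = c(1534, 30000, 45000),
  time_int = c(59, 60, 45),
  state = c(0, 0, 0)
)

mm_no_int <- model.matrix(~ . - 1, test_df)  # intercept removed
mm_int    <- model.matrix(~ ., test_df)      # intercept kept

# Raw column names, before data.frame() rewrites "(Intercept)" and spaces
colnames(mm_no_int)
colnames(mm_int)

# Column that exists only when the intercept is removed
dropped <- setdiff(colnames(mm_no_int), colnames(mm_int))
dropped

# First (baseline) level of category, which that column corresponds to
baseline <- levels(factor(test_df$category))[1]
baseline
```

Here dropped should be "categoryMusic" and baseline should be "Music", which matches the two tables above.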

These are two different ways of parameterizing the same model.
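One way to check that the two parameterizations describe the same model is to fit both and compare the fitted values. The sketch below uses a small made-up data set (toy, g, x and y are hypothetical and not from the question), because test_df has only three rows and a constant state column.

```r
set.seed(1)

# Hypothetical data: one three-level factor and one numeric predictor
toy <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 4)),
  x = rnorm(12)
)
toy$y <- rnorm(12)

fit_int    <- lm(y ~ g + x, data = toy)      # baseline level "a" sits in the intercept
fit_no_int <- lm(y ~ g + x - 1, data = toy)  # one coefficient per level of g

# Both fits span the same column space, so the fitted values agree
same_fit <- isTRUE(all.equal(fitted(fit_int), fitted(fit_no_int)))
same_fit

# Only the meaning of the coefficients changes:
# level "b" without an intercept = intercept + offset of "b" with an intercept
b_no_int <- unname(coef(fit_no_int)["gb"])
b_int    <- unname(coef(fit_int)["(Intercept)"] + coef(fit_int)["gb"])
c(b_no_int, b_int)
```

With the intercept, each factor coefficient is a difference from the baseline level. Without it, each coefficient is that level's own mean, adjusted for x.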
