A convoluted question and I'm not sure I'm expressing it as concisely as I could, but...
I want to fit multivariate generalised linear models, and because of the size and complexity of my models I have to use rxGlm() from the RevoScaleR package rather than the built-in glm() function.
It's important that each factor in the model has a reference level of my choosing, which I can set using relevel(). The nuisance is that relevel() reorders the factor levels, which makes the GLM output confusing to work with. I'd like to be able to retrieve the original factor level ordering after I've fitted the model, for presentation purposes.
A simple example:
library(RevoScaleR) # from Microsoft R Client
x <- data.frame(country = c("Australia", "Belgium", "Chile", "Belgium", "Belgium"),
                degree = c("Y", "Y", "N", "Y", "N"),
                salary = c(10000, 15000, 5000, 20000, 4000),
                stringsAsFactors = TRUE) # relevel() needs factors; R >= 4.0 no longer converts strings by default
model <- rxGlm(salary ~ country + degree, data = x, dropFirst = TRUE)
model$coefficients
This gives
      (Intercept) country=Australia   country=Belgium     country=Chile          degree=N          degree=Y
            -3500                NA              7500              8500                NA             13500
Both factors are ordered alphabetically here, so the reference levels are country = Australia and degree = N. Suppose I'd like the reference levels to be country = Belgium and degree = Y instead. I can relevel the factors and then rerun the model:
x$country <- relevel(x$country, ref = "Belgium")
x$degree <- relevel(x$degree, ref = "Y")
model <- rxGlm(salary ~ country + degree, data = x, dropFirst = TRUE)
model$coefficients
This now gives the same model, but presented differently:
      (Intercept)   country=Belgium country=Australia     country=Chile          degree=Y          degree=N
            17500                NA             -7500              1000                NA            -13500
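As a quick sanity check that the two fits describe the same model, the fitted value for each country/degree combination can be rebuilt by hand from both coefficient sets. This is only a sketch: the numbers are copied from the printed outputs, the NA entries (the dropped reference levels) are treated as 0, and `fit_a`, `fit_b` and `predict_cell` are illustrative names rather than anything provided by RevoScaleR.

```r
# Coefficients copied from the two printed outputs.
# Assumption: NA (the dropped reference level) contributes 0 to the linear predictor.
fit_a <- c(intercept = -3500, Australia = 0, Belgium = 7500, Chile = 8500,
           N = 0, Y = 13500)
fit_b <- c(intercept = 17500, Belgium = 0, Australia = -7500, Chile = 1000,
           Y = 0, N = -13500)

# The four distinct country/degree combinations in the example data.
cells <- data.frame(country = c("Australia", "Belgium", "Chile", "Belgium"),
                    degree  = c("Y", "Y", "N", "N"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: intercept + country effect + degree effect, looked up by name.
predict_cell <- function(fit, country, degree) {
  unname(fit["intercept"] + fit[country] + fit[degree])
}

pred_a <- predict_cell(fit_a, cells$country, cells$degree)
pred_b <- predict_cell(fit_b, cells$country, cells$degree)

# TRUE if both parameterisations give the same fitted values.
all.equal(pred_a, pred_b)
```

Only the baseline that each coefficient is measured against changes between the two fits, which is why the intercept moves while the fitted values do not.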
These are the coefficients I want, but now the ordering is wrong. Is there a simple way to rearrange model$coefficients using the factor ordering I had before the relevel() calls?
Thank you.
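Create a vector of names in the order you want, then index your coefficients using those names. Here sort() restores the alphabetical level order the factors had before relevel(). For example: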
Names <- c(
  '(Intercept)',
  paste('country', sort(levels(x$country)), sep = '='),
  paste('degree', sort(levels(x$degree)), sep = '=')
)
coefs2 <- coefs[Names]
Gives:
      (Intercept) country=Australia   country=Belgium     country=Chile          degree=N          degree=Y
            17500             -7500                NA              1000            -13500                NA
This uses the following stand-in for model$coefficients from the releveled fit:
coefs <- c(
  `(Intercept)` = 17500L, `country=Belgium` = NA, `country=Australia` = -7500L,
  `country=Chile` = 1000L, `degree=Y` = NA, `degree=N` = -13500L
)
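The sort() call only works because the original level order happened to be alphabetical. If the factors start out in some other order, one option is to record the levels before calling relevel() and build the name vector from that record. The following is a minimal sketch in base R only: `coefs` again stands in for model$coefficients, and `orig_levels`, `ordered_names` and `coefs_ordered` are illustrative names.

```r
# Example data with explicit factors (base R only, no RevoScaleR needed here).
x <- data.frame(
  country = factor(c("Australia", "Belgium", "Chile", "Belgium", "Belgium")),
  degree  = factor(c("Y", "Y", "N", "Y", "N"))
)

# Record the level order of each factor BEFORE releveling.
orig_levels <- lapply(x[c("country", "degree")], levels)

x$country <- relevel(x$country, ref = "Belgium")
x$degree  <- relevel(x$degree, ref = "Y")

# Stand-in for model$coefficients from the releveled fit (values from the question).
coefs <- c(
  `(Intercept)` = 17500, `country=Belgium` = NA, `country=Australia` = -7500,
  `country=Chile` = 1000, `degree=Y` = NA, `degree=N` = -13500
)

# Build "factor=level" names in the saved order, then index the coefficients by name.
ordered_names <- c(
  "(Intercept)",
  unlist(lapply(names(orig_levels),
                function(f) paste(f, orig_levels[[f]], sep = "=")))
)
coefs_ordered <- coefs[ordered_names]
coefs_ordered
```

Indexing by name rather than by position means the result does not depend on how the fitting function happens to order its output.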