This is an example of what I'm trying to do.
Create a list of combinations of dependent and independent variables:
a <- list(paste("Sepal.Length ~ Sepal.Width" ) ,
paste("Sepal.Width ~ Sepal.Length" )
)
Use lapply to run glm for each element of the list from step 1, and use a for loop to try two different family parameters:
param <- c("gaussian" , "Gamma" )
for(i in 1:2) {
print(lapply(a , FUN = function(X) glm(X , data = iris ,family = param[i] )))}
Is there a better way to achieve this without using for loop in the second step? This is what I have tried but it's not working.
a <-
list(
paste("Sepal.Length ~ Sepal.Width , data = iris , family = "Gaussian" " ) ,
paste("Sepal.Length ~ Sepal.Width , data = iris , family = "Gamma" " ) ,
paste("Sepal.Width ~ Sepal.Length , data = iris , family = "Gaussian" " ) ,
paste("Sepal.Width ~ Sepal.Length , data = iris , family = "Gamma" " )
)
lapply(a , FUN = function(X) glm(X))
Your paste does nothing here. Leave it out. Furthermore, the use of strings is also unnecessary here. Leave them out. The same goes for your family parameters: these are functions, so there is no need to quote them.
This already vastly simplifies the code, both in length and conceptually. Now we have this:
models = list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families = c(gaussian, Gamma)
And we can apply it:
lapply(models,
function (model) lapply(families,
function (family) glm(model, family, iris)))
… which is a nested application. The indentation hints at what belongs together. Since this is a bit odd, we can also use the Cartesian product of the different parameters:
params = as.data.frame(t(expand.grid(models, families)))
lapply(params, function (p) glm(formula = p[[1]], data = iris, family = p[[2]]))
The first line is a bit obscure here. expand.grid allows us to create a data frame of all parameter combinations. Here's an example:
> expand.grid(1 : 3, c('a', 'b'))
  Var1 Var2
1    1    a
2    2    a
3    3    a
4    1    b
5    2    b
6    3    b
Unfortunately, this data frame is in the wrong orientation to be used by lapply, because that applies over columns. So we transpose it with t (and convert it to a data.frame again, since t always returns a matrix).
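The orientation issue can be seen on a small example. This is only an illustrative sketch: the names grid, flipped, by_column and by_combination are made up here and do not come from the code above.

```r
# One row per parameter combination, as expand.grid() produces it.
grid <- expand.grid(x = 1:3, y = c("a", "b"), stringsAsFactors = FALSE)

# lapply() over a data frame visits its columns: 2 elements (x and y).
by_column <- lapply(grid, function(col) length(col))

# After transposing, each column holds one combination: 6 elements.
flipped <- as.data.frame(t(grid), stringsAsFactors = FALSE)
by_combination <- lapply(flipped, function(p) paste(p[[1]], p[[2]]))

length(by_column)
length(by_combination)
```

Note that t converts this mixed data frame to a character matrix, so the numbers arrive as strings in each combination.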
This piece of code is incredibly useful because it makes writing nested loops via lapply much more readable; unfortunately, it is itself quite unreadable, so we stick it into a function:
combine_parameters = function (...)
as.data.frame(t(expand.grid(...)))
This allows us to write elegant, readable code:
models = list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families = c(gaussian, Gamma)
params = combine_parameters(models, families)
lapply(params, function (p) glm(formula = p[[1]], family = p[[2]], data = iris))
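If you also want to know which fit belongs to which combination, one possible variation (a sketch, not part of the approach above) is to expand a grid of indices rather than of the objects themselves, and walk it with Map. The names idx and fits, and the " | " label format, are assumptions chosen for illustration.

```r
models <- list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families <- list(gaussian = gaussian, Gamma = Gamma)

# Grid of indices: one row per (model, family) pair.
idx <- expand.grid(model = seq_along(models), family = seq_along(families))

# Map() walks both index columns in parallel and fits one glm per row.
fits <- Map(
  function(m, f) glm(models[[m]], family = families[[f]], data = iris),
  idx$model, idx$family
)

# Label each fit with its formula and family (label format is arbitrary).
names(fits) <- paste(
  vapply(models[idx$model], deparse, character(1)),
  names(families)[idx$family],
  sep = " | "
)

names(fits)
```

Because the grid only contains integers, it stays an ordinary data frame and needs no transposing.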
Using lapply:
lapply(c("gaussian", "Gamma"), function(myFamily){
lapply(c("Sepal.Length ~ Sepal.Width" ,
"Sepal.Width ~ Sepal.Length"), function(myFormula){
glm(formula = myFormula, family = myFamily, data = iris)
})
})
EDIT: As mentioned in @KonradRudolph's answer, we can pass the families as a list of family objects instead of strings, which also lets us specify a link = argument, e.g.:
lapply(list(gaussian(link = "identity"), Gamma), function(myFamily){
lapply(c("Sepal.Length ~ Sepal.Width" ,
"Sepal.Width ~ Sepal.Length"), function(myFormula){
glm(formula = myFormula, family = myFamily, data = iris)
})
})
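To check that a link = choice is actually picked up, you can read it back from the fitted objects. A minimal sketch, assuming a single formula for brevity; the Gamma(link = "log") choice and the names families, fits and links are only for illustration.

```r
# Family objects with explicit links (the log link is an arbitrary example).
families <- list(gaussian(link = "identity"), Gamma(link = "log"))

fits <- lapply(families, function(myFamily) {
  glm(Sepal.Length ~ Sepal.Width, family = myFamily, data = iris)
})

# Each fitted model stores the family it was fitted with, including the link.
links <- vapply(fits, function(fit) fit$family$link, character(1))
links
```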