Using lapply to loop over glm
This is an example of what I'm trying to do.
Step 1: Create a list of combinations of dependent and independent variables:
a <- list(paste("Sepal.Length ~ Sepal.Width" ) ,
paste("Sepal.Width ~ Sepal.Length" )
)
Step 2: Use lapply to run glm for each element of the list from step 1, and wrap it in a for loop to test two different family parameters:
param <- c("gaussian" , "Gamma" )
for(i in 1:2) {
print(lapply(a , FUN = function(X) glm(X , data = iris ,family = param[i] )))}
Is there a better way to achieve this without using a for loop in the second step? This is what I have tried, but it's not working:
a <-
list(
paste("Sepal.Length ~ Sepal.Width , data = iris , family = "Gaussian" " ) ,
paste("Sepal.Length ~ Sepal.Width , data = iris , family = "Gamma" " ) ,
paste("Sepal.Width ~ Sepal.Length , data = iris , family = "Gaussian" " ) ,
paste("Sepal.Width ~ Sepal.Length , data = iris , family = "Gamma" " )
)
lapply(a , FUN = function(X) glm(X))
Your paste does nothing here. Leave it out. Furthermore, the use of strings is also unnecessary here. Leave them out. The same goes for your parameter families: these are functions, so there is no need to quote them.
This already vastly simplifies the code, both in length and conceptually. Now we have this:
models = list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families = c(gaussian, Gamma)
And we can apply it:
lapply(models,
       function (model) lapply(families,
                               function (family) glm(model, family, iris)))
… which is a nested application. The indentation hints at what belongs together.
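To make the shape of the nested result concrete, here is a minimal sketch that stores the fits and inspects them. The name fits is introduced here for illustration only; everything else comes from the code above.

```r
models <- list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families <- c(gaussian, Gamma)

# Outer list: one element per model. Inner list: one element per family.
fits <- lapply(models,
               function (model) lapply(families,
                                       function (family) glm(model, family, iris)))

length(fits)          # number of models
length(fits[[1]])     # number of families per model
coef(fits[[2]][[1]])  # second model (Sepal.Width ~ Sepal.Length), first family (gaussian)
```

So fits[[i]][[j]] is the fit of model i under family j.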
Since nesting lapply calls like this is a bit odd, we can also use the cartesian product of the different parameters:
params = as.data.frame(t(expand.grid(models, families)))
lapply(params, function (p) glm(formula = p[[1]], data = iris, family = p[[2]]))
The first line is a bit obscure here. expand.grid allows us to create a data frame of all parameter combinations. Here's an example:
> expand.grid(1 : 3, c('a', 'b'))
  Var1 Var2
1    1    a
2    2    a
3    3    a
4    1    b
5    2    b
6    3    b
Unfortunately, this data frame is in the wrong orientation to be used by lapply, because lapply applies over columns. So we transpose it with t (and convert it to a data.frame again, since t always returns a matrix).
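The orientation problem is easier to see on the small example grid. The following is a sketch only; the names grid, by_column, flipped and by_combination are made up for this illustration.

```r
grid <- expand.grid(n = 1:3, letter = c("a", "b"), stringsAsFactors = FALSE)

# lapply over the raw grid visits one *column* at a time (2 iterations).
by_column <- lapply(grid, function (col) length(col))
length(by_column)

# After transposing, every original row is a column, so lapply
# visits one parameter combination at a time (6 iterations).
flipped <- as.data.frame(t(grid), stringsAsFactors = FALSE)
by_combination <- lapply(flipped, function (p) paste(p[[1]], p[[2]]))
length(by_combination)
```

Note that with atomic inputs like these, the transposition coerces everything to character; the glm example above avoids that because its inputs are lists of formulas and functions.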
This transposition idiom is incredibly useful because it makes writing nested loops via lapply much more readable; unfortunately, it is itself quite unreadable, so we stick it into a function:
combine_parameters = function (...)
    as.data.frame(t(expand.grid(...)))
This allows us to write elegant, readable code:
models = list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families = c(gaussian, Gamma)
params = combine_parameters(models, families)
lapply(params, function (p) glm(formula = p[[1]], family = p[[2]], data = iris))
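If the list columns produced by transposing feel too opaque, one alternative (not the approach above, just a sketch of a different technique) is to build the grid from integer indices and walk it with Map. The names grid and fits are hypothetical.

```r
models <- list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families <- list(gaussian, Gamma)

# One row per (model index, family index) pair; the first index varies fastest.
grid <- expand.grid(model = seq_along(models), family = seq_along(families))

# Map walks both index columns in parallel, one combination per call.
fits <- Map(function (m, f) glm(models[[m]], family = families[[f]], data = iris),
            grid$model, grid$family)

length(fits)  # one fit per row of grid
```

Because grid keeps plain integer columns, it can also be bound to summary statistics later, for example cbind(grid, aic = sapply(fits, AIC)).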
Using lapply:
lapply(c("gaussian", "Gamma"), function(myFamily){
  lapply(c("Sepal.Length ~ Sepal.Width",
           "Sepal.Width ~ Sepal.Length"), function(myFormula){
    glm(formula = myFormula, family = myFamily, data = iris)
  })
})
EDIT: As mentioned in @KonradRudolph's answer, we can pass the families as a list of functions instead of strings, which also lets us supply a link = argument, e.g.:
lapply(list(gaussian(link = "identity"), Gamma), function(myFamily){
  lapply(c("Sepal.Length ~ Sepal.Width",
           "Sepal.Width ~ Sepal.Length"), function(myFormula){
    glm(formula = myFormula, family = myFamily, data = iris)
  })
})
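One practical downside of plain lapply is that the result is unnamed, so it is hard to tell which fit is which. A possible variation, sketched here with the hypothetical names formulas, families and fits, uses sapply with simplify = FALSE, which names each element after the character value it was called with:

```r
formulas <- c("Sepal.Length ~ Sepal.Width", "Sepal.Width ~ Sepal.Length")
families <- c("gaussian", "Gamma")

# simplify = FALSE keeps the list structure; character inputs become names.
fits <- sapply(families, function(myFamily){
  sapply(formulas, function(myFormula){
    glm(formula = myFormula, family = myFamily, data = iris)
  }, simplify = FALSE)
}, simplify = FALSE)

names(fits)
fits$Gamma[["Sepal.Width ~ Sepal.Length"]]
```

This only works for the string-based version, since the names are taken from the character vectors.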