
Using lapply to loop over glm

This is an example of what I'm trying to do.

Step 1:

Create a list of formulas, one for each combination of dependent and independent variable:

a <- list(paste("Sepal.Length ~  Sepal.Width" ) , 
paste("Sepal.Width ~ Sepal.Length" )
)

Step 2:

Use lapply to run glm for each element of the list from step 1, and wrap that in a for loop to test two different family parameters:

param <- c("gaussian" , "Gamma" )
for(i in 1:2) {
print(lapply(a , FUN = function(X) glm(X , data = iris ,family = param[i]    )))}

Is there a better way to achieve this without using a for loop in the second step? This is what I have tried, but it's not working:

a <- 
list(
paste("Sepal.Length ~  Sepal.Width , data = iris , family = "Gaussian" " ) , 
paste("Sepal.Length ~  Sepal.Width , data = iris , family = "Gamma" " ) ,                  
paste("Sepal.Width ~  Sepal.Length , data = iris , family = "Gaussian" " ) ,
paste("Sepal.Width ~  Sepal.Length , data = iris , family = "Gamma" " ) 
)

lapply(a , FUN = function(X) glm(X))

Your paste does nothing here; leave it out. Strings are also unnecessary, because formulas can be written directly. The same goes for your parameter families: these are functions, so there is no need to quote them.

This already vastly simplifies the code, both in length and conceptually. Now we have this:

models = list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families = c(gaussian, Gamma)
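
As a quick check of the claim that the families are functions, here is a minimal sketch. The variable name fam is only illustrative. Because functions cannot live in an atomic vector, c() returns a list here, and calling one of its elements produces a family object:

```r
families <- c(gaussian, Gamma)

is.function(gaussian)    # TRUE: a family generator is an ordinary function
is.list(families)        # TRUE: c() on functions falls back to a list

# Calling the generator yields the family object that glm works with.
fam <- families[[2]]()
class(fam)               # "family"
fam$family               # "Gamma"
fam$link                 # "inverse" (the default link for Gamma)
```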

And we can apply it:

lapply(models,
       function (model) lapply(families,
                               function (family) glm(model, family, iris)))
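
To show what the nested call returns, the following sketch stores the result and pulls one number out of each fit. Naming the list elements (gaussian, Gamma) and the variables fits and aics are assumptions made for illustration; the original code uses an unnamed list.

```r
models <- list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
# Named list (an assumption for this sketch) so the inner results carry labels.
families <- list(gaussian = gaussian, Gamma = Gamma)

# Outer level: one entry per model. Inner level: one fit per family.
fits <- lapply(models,
               function (model) lapply(families,
                                       function (family) glm(model, family, iris)))

# A single fit is addressed as fits[[model_index]][[family_name]].
gamma_fit <- fits[[1]][["Gamma"]]

# Collect the AIC of every fit: rows are families, columns are models.
aics <- sapply(fits, function (per_model) sapply(per_model, AIC))
```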

… which is a nested application. The indentation hints at what belongs together. Since this is a bit odd, we can also use the Cartesian product of the different parameters:

params = as.data.frame(t(expand.grid(models, families)))

lapply(params, function (p) glm(formula = p[[1]], data = iris, family = p[[2]]))

The first line is a bit obscure here. expand.grid creates a data frame of all parameter combinations. Here's an example:

> expand.grid(1 : 3, c('a', 'b'))

  Var1 Var2
1    1    a
2    2    a
3    3    a
4    1    b
5    2    b
6    3    b

Unfortunately, this data frame is in the wrong orientation to be used by lapply, because lapply applies over columns. So we transpose it with t (and convert it to a data.frame again, since t always returns a matrix).

This piece of code is incredibly useful because it makes writing nested loops via lapply much more readable; unfortunately, it is itself quite unreadable, so we stick it into a function:
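
The effect of the transposition can be checked on the small example above. This is only a sketch; the names grid, flipped and per_combination are illustrative.

```r
grid <- expand.grid(1:3, c("a", "b"))
dim(grid)      # 6 2: one row per combination, one column per parameter

# lapply(grid, ...) would visit the 2 parameter columns, not the 6 combinations.
length(grid)   # 2

# After transposing, every column holds one combination.
flipped <- as.data.frame(t(grid))
dim(flipped)   # 2 6

# lapply now visits each of the 6 combinations, each with 2 entries.
per_combination <- lapply(flipped, length)
```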

combine_parameters = function (...)
    as.data.frame(t(expand.grid(...)))

This allows us to write elegant, readable code:

models = list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families = c(gaussian, Gamma)
params = combine_parameters(models, families)
lapply(params, function (p) glm(formula = p[[1]], family = p[[2]], data = iris))
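
If the transposition feels too indirect, one possible alternative (not part of the answer above) is to build the grid over index positions and walk its two columns in parallel with Map. The names grid and fits are illustrative.

```r
models <- list(Sepal.Length ~ Sepal.Width, Sepal.Width ~ Sepal.Length)
families <- list(gaussian, Gamma)

# Every (model index, family index) pair; plain integers keep the grid simple.
grid <- expand.grid(model = seq_along(models), family = seq_along(families))

# Map steps through both index columns together, one fit per row of the grid.
fits <- Map(function (m, f) glm(models[[m]], family = families[[f]], data = iris),
            grid$model, grid$family)

length(fits)   # 4: two models times two families
```

Row i of grid records which model and family produced fits[[i]], so the results stay easy to label.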

Using lapply:

lapply(c("gaussian", "Gamma"), function(myFamily){
  lapply(c("Sepal.Length ~  Sepal.Width" , 
           "Sepal.Width ~ Sepal.Length"), function(myFormula){
             glm(formula = myFormula, family = myFamily, data = iris)
           })
})

EDIT: As mentioned in @KonradRudolph's answer, we can pass the families as a list of family objects, which lets us set a link = argument, e.g.:

lapply(list(gaussian(link = "identity"), Gamma), function(myFamily){
  lapply(c("Sepal.Length ~  Sepal.Width" , 
           "Sepal.Width ~ Sepal.Length"), function(myFormula){
             glm(formula = myFormula, family = myFamily, data = iris)
           })
})
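
The nested result can be awkward to index. As a sketch, assuming we name both levels (the names and the variables nested and flat are illustrative, not from the answer), unlist with recursive = FALSE flattens it into a single list of fits:

```r
families <- list(gaussian = gaussian(link = "identity"), Gamma = Gamma)
formulas <- c("Sepal.Length ~ Sepal.Width", "Sepal.Width ~ Sepal.Length")

nested <- lapply(families, function(myFamily){
  # setNames labels each inner result with its formula string.
  lapply(setNames(formulas, formulas), function(myFormula){
    glm(formula = myFormula, family = myFamily, data = iris)
  })
})

# Flatten one level only, so each glm object stays intact.
flat <- unlist(nested, recursive = FALSE)
length(flat)   # 4
```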

