
Error message when using lapply to apply a function to multiple dataframes in a list

My dataset looks like this, and I have a list of such data frames.

   Plot_ID Canopy_infection_rate DAI 
1  YO01    5                     7   
2  YO01    8                     14
3  YO01    10                    21

What I want to do is apply a function called "audpc_Canopyinfactionrate" to a list of data frames.

However, when I run lapply, I get the following error:

Error in FUN(X[[i]], ...) : argument "DAI" is missing, with no default

I've checked the list, and the columns in my data are not shifted.

Does anyone know what's wrong with it? Thanks.

Here is part of my code:

# Read files into a list
lst <- list()
for(i in seq_along(files)) {
  lst[[i]] <- read.delim(files[i], header = TRUE,  sep=" ")
}

#Apply a function to the list
densities <- list()
densities<- lapply(lst, audpc_Canopyinfactionrate)

#canopy infection rate 
audpc_Canopyinfactionrate <- function(Canopy_infection_rate,DAI){
  n <- length(DAI)
  meanvec <- matrix(-1,(n-1))
  intvec <- matrix(-1,(n-1))
  for(i in 1:(n-1)){
    meanvec[i] <- mean(c(Canopy_infection_rate[i],
                         Canopy_infection_rate[i+1]))
    intvec[i] <- DAI[i+1] - DAI[i]
  }

  infprod <- meanvec * intvec
  sum(infprod)

}

As pointed out in the comments, the problem lies in the way you are using lapply.

This function is built up like this: lapply(X, FUN, ...). FUN is the function that gets applied to each element of the list (or vector) X. So far so good.
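As a minimal illustration of that signature (the list x below is made up for the example and is not part of the question), lapply calls FUN once per element and collects the results in a list:

```r
# Toy list, invented purely for illustration
x <- list(a = 1:3, b = 4:6)

# FUN (here: sum) is called once for each element of x
res <- lapply(x, sum)
print(res)
```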

Back to your case: you want to apply the function audpc_Canopyinfactionrate() to every data frame in lst. This function takes two arguments, and I think this is where things got mixed up in your code. The way you are calling lapply, lst[[1]], lst[[2]], etc. are each passed as the only argument to audpc_Canopyinfactionrate(), so the whole data frame ends up in Canopy_infection_rate and nothing is supplied for DAI. That is exactly what the error message is telling you.
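The mismatch can be reproduced in a few lines. This is only a sketch: the function f is a hypothetical stand-in with the same two-argument signature, and the data frame reuses the three rows shown in the question.

```r
# Hypothetical stand-in with the same signature as the original function
f <- function(Canopy_infection_rate, DAI) length(DAI)

df <- data.frame(Canopy_infection_rate = c(5, 8, 10),
                 DAI = c(7, 14, 21))

# lapply hands each data frame to f as its FIRST argument only,
# so DAI is never supplied and evaluating it fails
msg <- tryCatch(lapply(list(df), f),
                error = function(e) conditionMessage(e))
print(msg)
```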

If you reformulate your function a bit so that it takes a single data frame, lst[[1]], lst[[2]], etc. can be its only argument, because you know each of them contains the columns you need, Canopy_infection_rate and DAI:

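audpc_Canopyinfactionrate <- function(df){
  # df: one data frame with the columns Canopy_infection_rate and DAI
  n <- nrow(df)
  meanvec <- matrix(-1, (n-1))
  intvec  <- matrix(-1, (n-1))
  # seq_len() skips the loop when there is only one row
  for(i in seq_len(n-1)){
    # mean infection rate of two consecutive assessments
    meanvec[i] <- mean(c(df$Canopy_infection_rate[i],
                         df$Canopy_infection_rate[i+1]))
    # number of days between the two assessments
    intvec[i] <- df$DAI[i+1] - df$DAI[i]
  }

  infprod <- meanvec * intvec
  return(sum(infprod))
}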

Call lapply in the following way:

lapply(lst, audpc_Canopyinfactionrate)
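To check the result by hand, here is a small self-contained run. The one-argument function is restated so the block works on its own; the first data frame uses the three rows from the question, while the second one (YO02) is invented for the example.

```r
# One-argument version, restated so this block is self-contained
audpc_Canopyinfactionrate <- function(df){
  n <- nrow(df)
  meanvec <- matrix(-1, (n-1))
  intvec  <- matrix(-1, (n-1))
  for(i in seq_len(n-1)){
    meanvec[i] <- mean(c(df$Canopy_infection_rate[i],
                         df$Canopy_infection_rate[i+1]))
    intvec[i] <- df$DAI[i+1] - df$DAI[i]
  }
  sum(meanvec * intvec)
}

lst <- list(
  # rows taken from the question
  data.frame(Plot_ID = "YO01",
             Canopy_infection_rate = c(5, 8, 10),
             DAI = c(7, 14, 21)),
  # made-up second plot, only for illustration
  data.frame(Plot_ID = "YO02",
             Canopy_infection_rate = c(0, 4, 4),
             DAI = c(7, 14, 21))
)

densities <- lapply(lst, audpc_Canopyinfactionrate)
print(densities)
# YO01: (6.5 * 7) + (9 * 7) = 108.5
# YO02: (2 * 7)   + (4 * 7) = 42
```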

Note: lapply can also be used with functions that take more than one argument, by passing the extra arguments through the ... in lapply(X, FUN, ...). In your case, however, I think this is not the best option.
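A rough sketch of what that would look like, and of one alternative. The name audpc2 and the sample values are assumptions made for this example; its body is a vectorised rewrite, not the original loop. Anything passed through ... is the same for every element, so it only fits if all plots share identical DAI values. A small anonymous wrapper avoids that restriction while keeping the two-argument function.

```r
# Hypothetical two-argument function (vectorised, for brevity)
audpc2 <- function(Canopy_infection_rate, DAI) {
  # mean of neighbouring rates times the interval between them
  meanvec <- (head(Canopy_infection_rate, -1) +
              tail(Canopy_infection_rate, -1)) / 2
  sum(meanvec * diff(DAI))
}

# Option 1: extra argument via '...'; DAI is identical for every element
rates <- list(c(5, 8, 10), c(0, 4, 4))
res_dots <- lapply(rates, audpc2, DAI = c(7, 14, 21))

# Option 2: anonymous wrapper that unpacks each data frame
lst <- list(
  data.frame(Canopy_infection_rate = c(5, 8, 10), DAI = c(7, 14, 21)),
  data.frame(Canopy_infection_rate = c(0, 4, 4),  DAI = c(7, 14, 21))
)
res_wrap <- lapply(lst, function(df) audpc2(df$Canopy_infection_rate, df$DAI))

print(res_dots)
print(res_wrap)
```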

