
R lapply: access elements of a list and perform calculations

I have a list of about 561 elements, each of which is a small square matrix with named rows and columns. Below is an example from the dataset:

structure(list(`111110` = structure(c(205, 4, 1, 6, 23, 0, 1, 
0, 0), .Dim = c(3L, 3L), .Dimnames = list(c("1", "4", "5"), c("1", 
"4", "5"))), `111120` = structure(c(181, 3, 4, 4), .Dim = c(2L, 
2L), .Dimnames = list(c("1", "4"), c("1", "4"))), `111130` = structure(c(71, 8, 3, 15, 114, 7, 6, 8, 56), .Dim = c(3L, 3L), .Dimnames = list(
c("1", "4", "5"), c("1", "4", "5"))), `111140` = structure(c(87, 
8, 9, 14), .Dim = c(2L, 2L), .Dimnames = list(c("1", "4"), c("1", 
"4"))), `111150` = structure(24, .Dim = c(1L, 1L), .Dimnames = list(
"1", "1")), `111160` = structure(48, .Dim = c(1L, 1L), .Dimnames = list(
"1", "1"))), .Names = c("111110", "111120", "111130", "111140", 
"111150", "111160"))

The dimensions of the elements range from 1 x 1 to 6 x 6. I would like to do the following calculations for each element in the list:

  1. If the element has a column named "5", sum the entries in column "5" except the one in the last row. If there is no column "5", the result should be blank.

  2. If the element has a column named "5", sum the entries in column "1" except the first one. If the element has no column named "5", the result should be blank.

  3. Collect the results of steps 1 and 2 in a data frame containing the unique id and both calculations.

I have tried the following (based on the answer provided below):

output <- c()
for(x in names(trans.by.naics)) {
  id <- x
  count.entry.5 <- ifelse("5" %in% colnames(trans.by.naics[[x]]),
                            sum(trans.by.naics[[x]][1 :nrow(trans.by.naics[[x]]), 5]) - trans.by.naics[[x]][5,5], "") # sum down the first four rows of column "5" if it exists
  count.entry.1 <- ifelse("5" %in% colnames(trans.by.naics[[x]]),
                     sum(trans.by.naics[[x]][1 : nrow(trans.by.naics[[x]]), 1]) - trans.by.naics[[x]][1,1], "") 
  thing <- data.frame(id, count.entry.5, count.entry.1)
  output <- rbind(output, thing)

}

But I get the following error when I run my code:

Error in trans.by.naics[[x]][1:nrow(trans.by.naics[[x]]), 5] : 
  subscript out of bounds
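The error message above is consistent with selecting the column by position (`5`) instead of by name (`"5"`): the sample matrices have at most three columns in this excerpt, so a fifth column does not exist even when one is labelled "5". A minimal sketch of the difference, using the first sample matrix (the names `m`, `by_name`, and `by_position` are introduced here only for illustration):

```r
# First sample matrix from the question (element `111110`)
m <- matrix(c(205, 4, 1, 6, 23, 0, 1, 0, 0), nrow = 3,
            dimnames = list(c("1", "4", "5"), c("1", "4", "5")))

# Character index: picks the column labelled "5" (here the third column)
by_name <- m[, "5"]

# Numeric index: asks for the fifth column, which this 3 x 3 matrix lacks;
# tryCatch captures the error message instead of stopping the script
by_position <- tryCatch(m[, 5], error = function(e) conditionMessage(e))
```

The same applies to `[5, 5]` and `[1, 1]` in the attempt: they address positions, not the row and column labels.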

The desired output looks like this:

      id count.entry.5 count.entry.1
1 111110             1             5
2 111120
3 111130            14            11
4 111140                            
5 111150                            
6 111160

Is there a good way to do this that won't take too long? Perhaps a more vectorized approach, or an lapply approach? Any advice or help is appreciated. Thanks!

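# `data` is the list of matrices (trans.by.naics in the question)
output <- NULL
for (x in names(data)) {
  m <- data[[x]]
  if ("5" %in% colnames(m)) {
    # Index columns by name, not position: "5" is a label, not the 5th column
    calc1 <- sum(m[-nrow(m), "5"])  # column "5" without its last row
    calc2 <- sum(m[-1, "1"])        # column "1" without its first row
  } else {
    calc1 <- NA
    calc2 <- NA
  }
  output <- rbind(output, data.frame(id = x, calc1, calc2))
}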
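Since the question also asks about an apply-style approach, the same logic could be sketched with `vapply`, which avoids growing a data frame with `rbind` on every iteration. This is only one possible variant, not part of the original answer; the helper name `summarise_one` is hypothetical, and the list below holds just four of the sample matrices.

```r
# Four of the sample matrices from the question
trans.by.naics <- list(
  `111110` = matrix(c(205, 4, 1, 6, 23, 0, 1, 0, 0), nrow = 3,
                    dimnames = list(c("1", "4", "5"), c("1", "4", "5"))),
  `111120` = matrix(c(181, 3, 4, 4), nrow = 2,
                    dimnames = list(c("1", "4"), c("1", "4"))),
  `111130` = matrix(c(71, 8, 3, 15, 114, 7, 6, 8, 56), nrow = 3,
                    dimnames = list(c("1", "4", "5"), c("1", "4", "5"))),
  `111150` = matrix(24, nrow = 1, dimnames = list("1", "1"))
)

# Hypothetical helper: returns both sums for one matrix, or NA for both
# when the matrix has no column named "5"
summarise_one <- function(m) {
  if ("5" %in% colnames(m)) {
    c(count.entry.5 = sum(m[-nrow(m), "5"]),  # column "5" minus last row
      count.entry.1 = sum(m[-1, "1"]))        # column "1" minus first row
  } else {
    c(count.entry.5 = NA_real_, count.entry.1 = NA_real_)
  }
}

# vapply returns a 2 x n numeric matrix: one column per list element
res <- vapply(trans.by.naics, summarise_one,
              c(count.entry.5 = 0, count.entry.1 = 0))

output <- data.frame(
  id = names(trans.by.naics),
  count.entry.5 = unname(res["count.entry.5", ]),
  count.entry.1 = unname(res["count.entry.1", ]),
  row.names = NULL
)
```

Missing results are stored as `NA` rather than empty strings so that both columns stay numeric.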
