Apply multiple functions to list of matrices to return data frame

I have a data frame like this:

df<- data.frame(year= c(rep("2004", 10), rep("2005", 10), rep("2006", 10), rep("2007", 10)), 
            lev1=c("A", "B", "C", "A", "D", "E", "D", "D", "B","B","C", "A","F","E","A","B",
                       "A", "B","C", "A", "D", "E", "D", "D", "B","B","C", "A","F","E","A", "B", "C", "A", "D","A","F","E","A","B" ), 
            lev2=c("X", "Y", "Z", "X", "W", "T", "W", "W", "Y","Y","Z", "T","U","V","Y","Y",
                      "W", "X","T", "W", "X", "Y", "Z", "X", "W", "T", "W", "W", "Y","Y","Z", "T","U","V","Y","Y",
                   "W", "X","T", "W"))

I also have code that builds a list of matrices (Results), one for each year. lev1 becomes the rows and lev2 becomes the columns. The values inside each matrix are the number of times the two levels co-occur.

sublist=NA
for (i in unique(df$year)){   
sublist[i]<-list(subset(df, df[,1] == i)) 
print(i)
}
Results = list()
for (i in 1: length(unique(sublist))){ 
if (length(sublist[[i]]) > 1 & length(sublist[[i]]) > 1 ){
rows<-unique(sublist[[i]][[2]]) 
cols<-unique(sublist[[i]][[3]]) 
matrix1<- matrix(nrow = length(rows), ncol = length(cols))
df = data.frame(sublist[[i]])
for (k in 1: length(rows)){
  sub_lev1<- subset(df,lev1 == rows[k]) 
  for (j in 1:length(cols)){ 
    sub_lev2<-subset(sub_lev1, lev2 == cols[j]) 
    matrix1[k,j]<-length(sub_lev2[,3])
  }
}
colnames(matrix1) <- cols
rownames(matrix1) <- rows
Results[[i]] = matrix1
}else{next}
}
Results

I would like to run a single function (networklevel() from library("bipartite")) on each element of the list. It returns multiple values, one for each of several network indices. Below I do it individually for each matrix.

d1<-networklevel(Results[[2]])
d2<-networklevel(Results[[3]])
d3<-networklevel(Results[[4]])
d4<-networklevel(Results[[5]])

The desired output is a data frame that includes the year, the name of the network index, and the value of each network index:

library(reshape2)  # assumed source of melt(); the original post does not say

d1<-data.frame(as.list(d1))
d1<- melt(d1)
d1$year<-rep("2004", nrow(d1))  # nrow(), not length(): length() of a data frame counts columns

d2<-data.frame(as.list(d2))
d2<- melt(d2)
d2$year<-rep("2005", nrow(d2))

d3<-data.frame(as.list(d3))
d3<- melt(d3)
d3$year<-rep("2006", nrow(d3))

d4<-data.frame(as.list(d4))
d4<- melt(d4)
d4$year<-rep("2007", nrow(d4))

output<- rbind(d1,d2,d3, d4)

A few problems I have: 1) For some reason the loop above returns the first matrix as NULL. How do I correct this? 2) When the matrices are stored in Results they are not indexed by year, but by 1-5. I would like to adjust the loop so that each matrix is indexed by the name of its year. I believe this would make it easier to create the output data frame downstream.

I have tried the following to return network indices for each element of the list, without success:

output<- lapply(mylist, FUN= function(x) networklevel(x)

I would appreciate any help running networklevel on all elements of the list at once. By default networklevel returns multiple network indices, so I need a solution that runs networklevel and collects all those indices for each matrix into an organized data frame that records the year each matrix came from. My actual dataset covers over 20 years, so I am looking for a solution that avoids doing this for each year/matrix separately.

Your first problem:

1) for some reason the loop above returns the first matrix as NULL. How do I correct this?

Change sublist = NA to sublist <- NULL. With NA, the placeholder is not removed when the for loop runs: it stays as the first element of sublist, ahead of the four per-year data frames. The second loop then skips that element (its length is 1, so the if condition fails) and never assigns Results[[1]], which is why the first matrix comes back as NULL.
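To make the difference visible, here is a minimal base R sketch. The data frame toy and the names with_na and with_null are made up for illustration and are not part of the original post; the loop body mirrors the one in the question.

```r
# Toy data standing in for the question's df (illustrative only).
toy <- data.frame(year = c("2004", "2004", "2005"),
                  lev1 = c("A", "B", "A"),
                  lev2 = c("X", "Y", "X"),
                  stringsAsFactors = FALSE)

# Starting from NA: the placeholder survives as an unnamed first element.
with_na <- NA
for (i in unique(toy$year)) {
  with_na[i] <- list(subset(toy, year == i))
}
length(with_na)    # 3: the NA plus one data frame per year

# Starting from NULL: only the per-year data frames are stored.
with_null <- NULL
for (i in unique(toy$year)) {
  with_null[i] <- list(subset(toy, year == i))
}
length(with_null)  # 2
names(with_null)   # "2004" "2005"
```

With the NULL start, the positions in sublist line up with the years, so the second loop fills Results starting at position 1.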

Second issue:

2) When the matrices are indexed in Results they are not indexed by year, rather 1-5. I would like to adjust the loop so that the name of the year is indexed.

I would try something like this: names(Results) <- c("2004", "2005", "2006", "2007")
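As an alternative to naming Results by hand, the year names can come along automatically if the matrices are built with split() and table() instead of the nested loops. This is only a sketch of a different technique, not the method used in the question. The names toy, by_year and Results2 are hypothetical, and table() orders rows and columns alphabetically rather than by first appearance, so the layout can differ from the loop's output even though the counts agree.

```r
# Toy data standing in for the question's df (illustrative only).
toy <- data.frame(year = c("2004", "2004", "2004", "2005", "2005"),
                  lev1 = c("A", "A", "B", "A", "C"),
                  lev2 = c("X", "X", "Y", "Z", "Z"),
                  stringsAsFactors = FALSE)

# One data frame per year; the list is named by year.
by_year <- split(toy, toy$year)

# Cross-tabulate lev1 against lev2 within each year.
# unclass() drops the "table" class so each element is a plain integer matrix.
Results2 <- lapply(by_year, function(d) unclass(table(d$lev1, d$lev2)))

names(Results2)               # "2004" "2005"
Results2[["2004"]]["A", "X"]  # 2: A and X co-occur twice in 2004
```

Because the list is named, a matrix can be pulled out by year, for example Results2[["2005"]], and the names carry through any later lapply() call.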

Third issue:

Looping output

In your lapply you do not need to wrap the call in function(x); simply pass networklevel itself, like this: output <- lapply(Results, bipartite::networklevel)

Then you can do something like this to get it into a data frame/matrix:

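#get to matrix (one row per list element, one column per network index)
dfoutput <- do.call(rbind, output)
#convert to df first so the index values stay numeric
dfoutput2 <- as.data.frame(dfoutput)
#add row names as variable - in your case it is year of analysis
#(row names are only present if the list was named, e.g. via names(Results))
dfoutput3 <- cbind(dfoutput2, nms = row.names(dfoutput2))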
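The question asks for a long layout with one row per year and index. Here is a hedged base R sketch of that reshaping. Because bipartite may not be installed, toy_indices is a made-up stand-in for networklevel that returns a named numeric vector, and Results_toy is an invented named list of matrices. With the real data you would pass bipartite::networklevel and the named Results list instead.

```r
# Hypothetical stand-in for bipartite::networklevel():
# returns a named numeric vector of "indices" for one matrix.
toy_indices <- function(m) {
  c(n_rows = nrow(m), n_cols = ncol(m), total = sum(m))
}

# Invented list of matrices, named by year.
Results_toy <- list("2004" = matrix(1:4, nrow = 2),
                    "2005" = matrix(1:6, nrow = 2))

# One named vector of indices per year.
output <- lapply(Results_toy, toy_indices)

# Stack into long format: one row per (year, index) pair.
long <- do.call(rbind, lapply(names(output), function(yr) {
  data.frame(year = yr,
             variable = names(output[[yr]]),
             value = unname(output[[yr]]),
             stringsAsFactors = FALSE)
}))
long
```

This avoids melt() entirely and keeps value numeric, at the cost of assuming that every call returns a named vector.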
