
lapply and aggregate columns for a list of dataframes in R

Newbie here, thanks in advance for the help.

I have a list of multiple dataframes (AllExplants), all with the same number of columns (with identical names) but different numbers of rows. I wish to aggregate columns for all the dataframes in the list at once.

My data: I will use a list of two dataframes here for simplicity.

AllExplants <- list(Explant1, Explant2)

Explant1:

 `Sample Name`                       `Tissue Category` `Annotation ID`                              All Negative `Non-nuclear`   PD1  PDL1
1 LT181- PD1 PDL1 MNF -1_Scan1.qptiff All               LT181- PD1 PDL1 MNF -1_Scan1_[10311,49192] 25140     4954            23  4418 15635
2 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Stroma            LT181- PD1 PDL1 MNF -1_Scan1_[10311,49192]  8788     1678            23  2922  4114
3 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Tumour            LT181- PD1 PDL1 MNF -1_Scan1_[10311,49192] 16344     3268             0  1496 11521
4 LT181- PD1 PDL1 MNF -1_Scan1.qptiff All               LT181- PD1 PDL1 MNF -1_Scan1_[10311,51272] 37930     9847           137  9821 17921
5 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Stroma            LT181- PD1 PDL1 MNF -1_Scan1_[10311,51272] 17400     5700           123  4914  6544
6 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Tumour            LT181- PD1 PDL1 MNF -1_Scan1_[10311,51272] 20526     4144            13  4907 11377
7 LT181- PD1 PDL1 MNF -1_Scan1.qptiff All               LT181- PD1 PDL1 MNF -1_Scan1_[12161,50230]  2315     1105            34   334   818
8 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Stroma            LT181- PD1 PDL1 MNF -1_Scan1_[12161,50230]  1666      934            30   266   427
9 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Tumour            LT181- PD1 PDL1 MNF -1_Scan1_[12161,50230]   639      164             1    68   391    

Explant2:

  `Sample Name`                       `Tissue Category` `Annotation ID`                              All Negative `Non-nuclear`   PD1  PDL1
1 LT181- PD1 PDL1 MNF -1_Scan1.qptiff All               LT181- PD1 PDL1 MNF -1_Scan1_[10872,46112] 19602     4370            47  3176 11983
2 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Stroma            LT181- PD1 PDL1 MNF -1_Scan1_[10872,46112]  8479     2158            36  2624  3644
3 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Tumour            LT181- PD1 PDL1 MNF -1_Scan1_[10872,46112] 11116     2207            11   552  8339
4 LT181- PD1 PDL1 MNF -1_Scan1.qptiff All               LT181- PD1 PDL1 MNF -1_Scan1_[11335,47845] 14783     2036            10  1697 10973
5 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Stroma            LT181- PD1 PDL1 MNF -1_Scan1_[11335,47845]  3179      494             6   894  1770
6 LT181- PD1 PDL1 MNF -1_Scan1.qptiff Tumour            LT181- PD1 PDL1 MNF -1_Scan1_[11335,47845] 11604     1542             4   803  9203

I wish to aggregate columns 4-8 (i.e. All, Negative, Non-nuclear, PD1, PDL1), grouped by column 2 (Tissue Category).

I can do this for each individual dataframe (when it is not in a list) with the code below:

Explant1_agg <- aggregate(Explant1 [,4:8], by=list(Explant1$`Tissue Category`), FUN=sum)

However, that is time-consuming. To apply the same function to all the dataframes in the list, I tried this code, having looked at posts here:

AllExplants_agg <- lapply(AllExplants, function(x) {aggregate(x[,4:8], by=list(x[,2]),  FUN=sum)})

However, R returns the error:

 Error in aggregate.data.frame(x[, 4:8], by = list(x[, 2]), FUN = sum) : 
  arguments must have same length 

Any help would be greatly appreciated!

I know this question is already old, but I encountered the same problem and fixed it, and I thought it might still be of interest.
I didn't manage to get aggregate to work on the list directly, but I worked around it with a for loop:

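# One output slot per data frame in the question's list, AllExplants
output_All_Explants <- vector("list", length(AllExplants))
for (i in seq_along(AllExplants)) {
    # Coerce to a plain data.frame so that `$` and `[` behave as in base R
    dummy <- as.data.frame(AllExplants[[i]])
    # Sum columns 4-8 within each Tissue Category (backticks are needed for the space)
    output_All_Explants[[i]] <- aggregate(dummy[, 4:8],
        by = list(Tissue = dummy$`Tissue Category`), FUN = sum)
}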
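
For what it's worth, the lapply version from the question can probably be made to work as well. A likely cause of the error, assuming the dataframes are tibbles (the backticked column names in the printout suggest so), is that x[,2] on a tibble returns a one-column table rather than a vector, so the element passed in by has length 1 instead of one entry per row. Extracting the column with [[ ]] returns a plain vector in both cases. Below is a minimal sketch with made-up toy data; the values, value_cols and the group name Tissue are my own placeholders, not from the question, and drop = FALSE is used only to mimic the tibble behaviour in base R.

```r
# Toy stand-ins for the question's data frames (values are invented).
Explant1 <- data.frame(
  `Sample Name`     = "scan1",
  `Tissue Category` = rep(c("All", "Stroma", "Tumour"), times = 2),
  `Annotation ID`   = rep(c("region1", "region2"), each = 3),
  All = c(30, 10, 20, 60, 25, 35),
  PD1 = c(5, 3, 2, 9, 4, 5),
  check.names = FALSE
)
Explant2 <- data.frame(
  `Sample Name`     = "scan1",
  `Tissue Category` = c("All", "Stroma", "Tumour"),
  `Annotation ID`   = "region3",
  All = c(12, 5, 7),
  PD1 = c(4, 1, 3),
  check.names = FALSE
)
AllExplants <- list(Explant1 = Explant1, Explant2 = Explant2)

# Columns to sum (hypothetical subset of the question's columns 4-8).
value_cols <- c("All", "PD1")

# Mimic the tibble case: a one-column data frame inside `by` has length 1,
# not nrow(x), so aggregate() should refuse it.
err <- tryCatch(
  aggregate(Explant1[value_cols],
            by = list(Explant1[, 2, drop = FALSE]), FUN = sum),
  error = function(e) conditionMessage(e)
)

# Fix: pull the grouping column out as a vector with [[ ]].
AllExplants_agg <- lapply(AllExplants, function(x) {
  aggregate(x[value_cols],
            by = list(Tissue = x[["Tissue Category"]]),
            FUN = sum)
})

print(err)
print(AllExplants_agg$Explant1)
```

Naming the list elements (Explant1 = ..., Explant2 = ...) means the result keeps those names, which makes it easier to tell the aggregated tables apart afterwards.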
