Summing values in an R table by 2 factors

Question

I have a single big text file which looks as follows:

tag            colony   diff
1035            03      498
1035            03     -44365
1035            03     -66652
1035            04      234234
1035            04     -343
1035            04     -23423
1037            10      234234
1037            10     -343
1037            10     -23423

Most 'tags' only have a single colony, such as 1037 in the above example. However, some have 2, such as 1035 having both 03 and 04. What I would like to do is sum the diff column for each tag, but separately for each colony, so the output would be something like this:

tag    colony    total
1035   03        -110519
1035   04        210468
1037   10        210468

So far (I've been working in R), I have been using aggregate:

x2 = aggregate(x$diff, by=list(tag=x$tag), FUN=sum)

But this would count all tags together, irrespective of colony. Is there a way of 'adding another level', so to speak, into the aggregate function, so that it counts the colonies separately?

Thanks

Answer 1

We can use dplyr:

library(dplyr)
df1 %>%
   group_by(tag, colony) %>%
   summarise(total = sum(diff))

Or data.table:

library(data.table)
setDT(df1)[, .(total = sum(diff)), .(tag, colony)]
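For anyone who wants to try the dplyr version end to end, here is a minimal sketch. It assumes df1 holds the sample rows from the question (built by hand here, with colony stored as character so "03" keeps its leading zero) and that dplyr 1.0.0 or later is installed, since the .groups argument does not exist in older versions. The name totals is just illustrative.

```r
library(dplyr)

# Sample data from the question; colony is character so "03" keeps its leading zero.
df1 <- data.frame(
  tag    = c(rep(1035L, 6), rep(1037L, 3)),
  colony = c(rep("03", 3), rep("04", 3), rep("10", 3)),
  diff   = c(498, -44365, -66652, 234234, -343, -23423, 234234, -343, -23423)
)

# One row per tag/colony pair; .groups = "drop" returns an ungrouped tibble
# (assumes dplyr >= 1.0.0).
totals <- df1 %>%
  group_by(tag, colony) %>%
  summarise(total = sum(diff), .groups = "drop")

print(totals)
```

Grouping by both columns is what keeps 1035/03 and 1035/04 apart; grouping by tag alone would merge them into a single row.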

Answer 2

x2 <- aggregate(x$diff, by=list(x$tag,x$colony), FUN=sum)

Or, equivalently, using the formula interface:

x2 <- aggregate(diff ~ tag + colony, data = x, FUN = sum)
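As a quick check, here is a base R sketch that runs both forms on the sample rows from the question. The data is read from an inline string rather than the real file, and the colClasses choice (colony as character, to preserve "03") is an assumption about how the file should be parsed. The names by_list and by_formula are illustrative.

```r
# Sample rows from the question, read inline; colony is kept as character
# so that "03" does not become 3.
x <- read.table(header = TRUE,
                colClasses = c("integer", "character", "numeric"),
                text = "tag   colony  diff
1035  03      498
1035  03      -44365
1035  03      -66652
1035  04      234234
1035  04      -343
1035  04      -23423
1037  10      234234
1037  10      -343
1037  10      -23423")

# by= interface: naming the list elements names the grouping columns in the
# result; the summed column is called "x" by default, so rename it.
by_list <- aggregate(x$diff, by = list(tag = x$tag, colony = x$colony), FUN = sum)
names(by_list)[names(by_list) == "x"] <- "total"

# Formula interface: the summed column keeps the name "diff".
by_formula <- aggregate(diff ~ tag + colony, data = x, FUN = sum)

print(by_list)
print(by_formula)
```

Both calls give one row per tag/colony combination, with sums of -110519, 210468 and 210468 for the sample data. Without names in the by= list, the grouping columns come back as Group.1 and Group.2.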

Answer 1
1 2017-04-19 13:47:15

Answer 2
0 2017-04-19 14:36:08
