Summing data based on a column in R
I have a data set that looks like this (the actual data is 10K by 5K, so I really need a shortcut):
Cluster | Item1 | Item2 | Item3 |
---|---|---|---|
1 | 1 | 2 | 2 |
1 | 3 | 1 | 1 |
1 | 1 | 3 | 0 |
2 | 3 | 2 | 0 |
2 | 0 | 0 | 2 |
2 | 4 | 2 | 2 |
3 | 0 | 1 | 1 |
3 | 1 | 1 | 2 |
I want to sum each column by cluster, so that the result looks like this:
Cluster | Item1 | Item2 | Item3 |
---|---|---|---|
1 | 5 | 6 | 3 |
2 | 7 | 4 | 4 |
3 | 1 | 2 | 3 |
I want to sum them by a certain column.
You can use aggregate (dat is the name of your data frame):
aggregate(dat[-1], dat["Cluster"], sum)
# Cluster Item1 Item2 Item3
# 1 1 5 6 3
# 2 2 7 4 4
# 3 3 1 2 3
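For anyone who wants to run this on the sample data from the question, here is a minimal sketch. Loading dat through read.table(text = ...) is an assumption made for illustration; the real data would presumably come from a file.

```r
# Rebuild the sample data from the question (illustrative only).
dat <- read.table(text = "Cluster Item1 Item2 Item3
1 1 2 2
1 3 1 1
1 1 3 0
2 3 2 0
2 0 0 2
2 4 2 2
3 0 1 1
3 1 1 2", header = TRUE)

# dat[-1] drops the first column (Cluster), leaving the columns to sum.
# dat["Cluster"] is a one-column data frame, so the grouping column
# keeps its name in the result.
res <- aggregate(dat[-1], dat["Cluster"], sum)
print(res)
```

Passing dat["Cluster"] rather than dat$Cluster is what keeps the column header; with a bare vector wrapped in list(), the grouping column would be called Group.1.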
With data.table:
library(data.table)
setDT(dat)[ , lapply(.SD, sum), by = Cluster]
# Cluster Item1 Item2 Item3
# 1: 1 5 6 3
# 2: 2 7 4 4
# 3: 3 1 2 3
With dplyr:
library(dplyr)
dat %>%
  group_by(Cluster) %>%
  # the original summarise_each(funs(sum)) is deprecated in recent dplyr releases
  summarise_all(sum)
# Cluster Item1 Item2 Item3
# 1 1 5 6 3
# 2 2 7 4 4
# 3 3 1 2 3
Thanks for your answer. I also used this, and it worked perfectly:
aggregate(. ~ Cluster, data=dat, FUN=sum)
# Cluster Item1 Item2 Item3
# 1 1 5 6 3
# 2 2 7 4 4
# 3 3 1 2 3
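One thing to watch with the formula interface, in case the real data has missing values: its na.action argument defaults to na.omit, so a row with an NA in any column is dropped entirely before summing. A small sketch of the difference, where the NA is a hypothetical value inserted only for illustration:

```r
dat <- read.table(text = "Cluster Item1 Item2 Item3
1 1 2 2
1 3 1 1
1 1 3 0
2 3 2 0
2 0 0 2
2 4 2 2
3 0 1 1
3 1 1 2", header = TRUE)

# Hypothetical missing value, not present in the question's data.
dat_na <- dat
dat_na$Item3[2] <- NA

# Default na.action = na.omit: the whole second row is removed,
# so Item1 and Item2 for cluster 1 lose that row's values as well.
by_formula <- aggregate(. ~ Cluster, data = dat_na, FUN = sum)

# Keep every row and skip NAs column by column instead.
by_column <- aggregate(. ~ Cluster, data = dat_na, FUN = sum,
                       na.rm = TRUE, na.action = na.pass)

print(by_formula)
print(by_column)
```

If the data has no missing values, both calls give the same result as the answer above.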
Try:
> sapply(ddf[-1], function(x) tapply(x,ddf$Cluster,sum))
Item1 Item2 Item3
1 5 6 3
2 7 4 4
3 1 2 3
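Note that this returns a matrix with the cluster labels as row names rather than a data frame with a Cluster column. A minimal sketch of that, assuming ddf holds the question's sample data; the rowsum call at the end is a base R cross-check added here, not part of the answer above:

```r
ddf <- read.table(text = "Cluster Item1 Item2 Item3
1 1 2 2
1 3 1 1
1 1 3 0
2 3 2 0
2 0 0 2
2 4 2 2
3 0 1 1
3 1 1 2", header = TRUE)

# One tapply per item column: sum that column within each Cluster value.
m <- sapply(ddf[-1], function(x) tapply(x, ddf$Cluster, sum))

# m is a matrix; the cluster labels are its row names, not a column.
# Turn them back into a regular Cluster column if a data frame is needed.
out <- data.frame(Cluster = as.integer(rownames(m)), m)
print(out)

# Cross-check against base R's rowsum, which sums rows by group.
rs <- rowsum(ddf[-1], ddf$Cluster)
same <- all(m == as.matrix(rs))
```

For data as wide as the 10K by 5K case in the question, it may be worth timing rowsum against the other approaches, since it handles all columns in a single call.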
If you want to sum all variables except the grouping variable, use across in dplyr:
df <- read.table(text = "Cluster Item1 Item2 Item3
1 1 2 2
1 3 1 1
1 1 3 0
2 3 2 0
2 0 0 2
2 4 2 2
3 0 1 1
3 1 1 2", header = TRUE)
library(dplyr)
df %>% group_by(Cluster) %>% summarise(across(everything(), sum))
# A tibble: 3 x 4
#   Cluster Item1 Item2 Item3
#     <int> <int> <int> <int>
# 1       1     5     6     3
# 2       2     7     4     4
# 3       3     1     2     3