Sum over a column while another column takes a particular value
Input:
a 3 hi
a 4 hi
a NA hi
b 7 lo
b 2 lo
b 3 lo
c 1 hi
c 6 hi
Desired output:
a 7 hi
b 12 lo
c 7 hi
Basically, I would like to obtain the sum of the second column for each unique value in column 1, ignoring missing values (NA). I would also like to obtain the string in column 3 associated with each unique value in column 1.
dat <- data.frame(letters = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'), numbers = c(3, 4, NA, 7, 2, 3, 1, 6), chars = c("hi", "hi", "hi", "lo", "lo", "lo", "hi", "hi"))
Using dplyr:
library(dplyr)
dat %>%
group_by(letters, chars) %>%
summarise(n = sum(numbers, na.rm = TRUE))
Source: local data frame [3 x 3]
Groups: letters
letters chars n
1 a hi 7
2 b lo 12
3 c hi 7
Using plyr:
library(plyr)
# Pass summarise as a function, not a string, so the result column keeps the name n
ddply(dat, c("letters", "chars"), summarise, n = sum(numbers, na.rm = TRUE))
letters chars n
1 a hi 7
2 b lo 12
3 c hi 7
You basically want some variant of the split-apply-combine method.
Using data.table:
> library(data.table)
> setDT(dat)  # convert to a data.table by reference (setDF would convert back to a data.frame)
> dat[, list(sum(numbers, na.rm = TRUE), unique(chars)), by = letters]
letters V1 V2
1: a 7 hi
2: b 12 lo
3: c 7 hi
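For comparison, the same split-apply-combine idea can probably also be written in base R without any extra packages. The following is only a sketch using aggregate() with a formula. It rebuilds dat so that it runs on its own, and the name res is just an illustrative choice. It groups on both letters and chars, which assumes (as in the example data) that each letter maps to a single string.

```r
# Rebuild the example data so the snippet is self-contained
dat <- data.frame(
  letters = c("a", "a", "a", "b", "b", "b", "c", "c"),
  numbers = c(3, 4, NA, 7, 2, 3, 1, 6),
  chars   = c("hi", "hi", "hi", "lo", "lo", "lo", "hi", "hi")
)

# Split by letters and chars, apply sum() to numbers, combine into one data frame.
# na.action = na.pass keeps the NA row so that na.rm = TRUE inside sum() handles it.
res <- aggregate(numbers ~ letters + chars, data = dat, FUN = sum,
                 na.rm = TRUE, na.action = na.pass)

# aggregate() orders rows by the grouping columns with chars varying slowest,
# so reorder by letters to match the desired output
res <- res[order(res$letters), ]
print(res)
```

If a letter could appear with more than one string in chars, this would return one row per letter/string pair rather than one row per letter, so the grouping would need to be adjusted for that case.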