
Sum over a column while another column takes a particular value

Input:

a   3   hi
a   4   hi
a   NA  hi
b   7   lo
b   2   lo
b   3   lo
c   1   hi
c   6   hi

Desired output:

a   7   hi
b   12  lo
c   7   hi

Basically, I would like to sum the second column within each unique value of column 1, ignoring NA (so group a sums to 7). I would also like to obtain the string in column 3 associated with each unique value in column 1.

dat <- data.frame(letters = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'), numbers = c(3, 4, NA, 7, 2, 3, 1, 6), chars = c("hi", "hi", "hi", "lo", "lo", "lo", "hi", "hi"))

Using dplyr:

library(dplyr)

dat %>%
  group_by(letters, chars) %>%
  summarise(n = sum(numbers, na.rm = TRUE))

Source: local data frame [3 x 3]
Groups: letters

  letters chars  n
1       a    hi  7
2       b    lo 12
3       c    hi  7

Using plyr:

library(plyr)

# Pass summarise as a function, not a quoted string, so the result column keeps the name n
ddply(dat, c("letters", "chars"), summarise, n = sum(numbers, na.rm = TRUE))

  letters chars  n
1       a    hi  7
2       b    lo 12
3       c    hi  7

You basically want some variant of the split-apply-combine method: split the rows by column 1, apply a summary to each group, and combine the results into one table.
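To make the split-apply-combine idea concrete, here is a minimal base R sketch that needs no extra packages. It rebuilds `dat` so the block stands alone; the helper name `summarise_piece` is hypothetical, and the code assumes `chars` is constant within each `letters` group, as it is in the example data.

```r
# Rebuild the example data so this block is self-contained.
dat <- data.frame(
  letters = c("a", "a", "a", "b", "b", "b", "c", "c"),
  numbers = c(3, 4, NA, 7, 2, 3, 1, 6),
  chars   = c("hi", "hi", "hi", "lo", "lo", "lo", "hi", "hi"),
  stringsAsFactors = FALSE
)

# Split: one data frame per unique value of `letters`.
pieces <- split(dat, dat$letters)

# Apply: reduce each piece to a single summary row.
# `summarise_piece` is a hypothetical helper written for this illustration.
summarise_piece <- function(piece) {
  data.frame(
    letters = piece$letters[1],
    n       = sum(piece$numbers, na.rm = TRUE),  # NA values are dropped
    chars   = piece$chars[1],  # assumption: chars is constant within a group
    stringsAsFactors = FALSE
  )
}
rows <- lapply(pieces, summarise_piece)

# Combine: stack the summary rows back into one data frame.
result <- do.call(rbind, rows)
rownames(result) <- NULL
print(result)
```

The packages shown in the other answers wrap these same three steps in a single call, which is usually shorter and faster on large data.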

Using data.table:

> library(data.table)
> setDT(dat)  # convert dat to a data.table by reference (setDF would do the opposite)
> dat[, list(sum(numbers, na.rm = TRUE), unique(chars)), by = letters]
   letters V1 V2
1:       a  7 hi
2:       b 12 lo
3:       c  7 hi
