How to divide between groups of rows using dplyr?

I have this data frame:

x <- data.frame(
    name = rep(letters[1:4], each = 2),
    condition = rep(c("A", "B"), times = 4),
    value = c(2,10,4,20,8,40,20,100)
) 
#   name condition value
# 1    a         A     2
# 2    a         B    10
# 3    b         A     4
# 4    b         B    20
# 5    c         A     8
# 6    c         B    40
# 7    d         A    20
# 8    d         B   100

I want to group by name and, within each group, divide the value of the row with condition == "B" by the value of the row with condition == "A", to get this:

data.frame(
    name = letters[1:4],
    value = c(5,5,5,5)
)
#   name value
# 1    a     5
# 2    b     5
# 3    c     5
# 4    d     5

I know something like this gets me pretty close:

x$value[which(x$condition == "B")]/x$value[which(x$condition == "A")]

but I was wondering whether there is an easy way to do this with dplyr. (My data frame is a toy example; I arrived at it by chaining multiple group_by and summarise calls.)

Try:

library(dplyr)

x %>%
  group_by(name) %>%
  # within each name, divide the "B" value by the "A" value
  summarise(value = value[condition == "B"] / value[condition == "A"])

This gives:

#Source: local data frame [4 x 2]
#
#    name value
#  (fctr) (dbl)
#1      a     5
#2      b     5
#3      c     5
#4      d     5
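
The summarise call above relies on every name having exactly one "A" row and one "B" row. As a minimal sketch of how that assumption could be made explicit, the helper below (ratio_b_over_a is a hypothetical name, not part of dplyr) returns NA for any group that does not have exactly one row of each condition. The x_missing data frame is made up for illustration by dropping one row from the question's data.

```r
library(dplyr)

x <- data.frame(
  name = rep(letters[1:4], each = 2),
  condition = rep(c("A", "B"), times = 4),
  value = c(2, 10, 4, 20, 8, 40, 20, 100)
)

# Hypothetical helper: B / A for one group, or NA when the group
# does not contain exactly one "A" row and one "B" row.
ratio_b_over_a <- function(value, condition) {
  a <- value[condition == "A"]
  b <- value[condition == "B"]
  if (length(a) != 1 || length(b) != 1) {
    return(NA_real_)
  }
  b / a
}

# Illustrative input: drop the "B" row for name "d"
x_missing <- x[-8, ]

res <- x_missing %>%
  group_by(name) %>%
  summarise(value = ratio_b_over_a(value, condition))
```

With this input, groups a, b and c still yield a ratio, while d yields NA instead of silently disappearing or raising a warning.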

I'd use spread from tidyr.

library(dplyr)
library(tidyr)

x %>%
  spread(condition, value) %>%
  mutate(value = B/A)

#   name  A   B value
# 1    a  2  10     5
# 2    b  4  20     5
# 3    c  8  40     5
# 4    d 20 100     5

You could then add select(-A, -B) to drop the extra columns.
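
In tidyr 1.0.0 and later, spread is superseded by pivot_wider. A rough equivalent of the pipeline above, assuming a recent tidyr is installed, might look like this (the result name ratios is only for illustration):

```r
library(dplyr)
library(tidyr)

x <- data.frame(
  name = rep(letters[1:4], each = 2),
  condition = rep(c("A", "B"), times = 4),
  value = c(2, 10, 4, 20, 8, 40, 20, 100)
)

ratios <- x %>%
  # one row per name, with the A and B values side by side
  pivot_wider(names_from = condition, values_from = value) %>%
  mutate(value = B / A) %>%
  # keep only the columns from the desired output
  select(name, value)
```

As with spread, this assumes each name/condition pair occurs only once; duplicated pairs would need to be aggregated first.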

Using data.table: convert the data.frame to a data.table with setDT(x), then, grouped by name, divide the value that corresponds to condition "B" by the value that corresponds to condition "A".

library(data.table)
setDT(x)[,.(value = value[condition=="B"]/value[condition=="A"]) , name]
#    name value
#1:    a     5
#2:    b     5
#3:    c     5
#4:    d     5

Or reshape from 'long' to 'wide' and divide the 'B' column by the 'A' column.

dcast(setDT(x), name~condition, value.var='value')[, .(name, value = B/A)]
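
Note that setDT(x) converts x to a data.table by reference, so x itself is changed. If the original data.frame should stay untouched, a sketch along these lines, using as.data.table to work on a copy, may be preferable (dt and res are illustrative names):

```r
library(data.table)

x <- data.frame(
  name = rep(letters[1:4], each = 2),
  condition = rep(c("A", "B"), times = 4),
  value = c(2, 10, 4, 20, 8, 40, 20, 100)
)

# as.data.table() returns a copy, so x remains a plain data.frame
dt <- as.data.table(x)

# per name, divide the "B" value by the "A" value
res <- dt[, .(value = value[condition == "B"] / value[condition == "A"]), by = name]
```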
