How can I use dplyr group_by() and then concatenate the values of one column into a single column in R?

Question

I have data in the form of:

M | Y | title | terma | termb | termc
4 | 2009 | titlea | 2 | 0 | 1
6 | 2001 | titleb | 0 | 1 | 0
4 | 2009 | titlec | 1 | 0 | 1

I'm using dplyr's group_by() and summarise() to count instances of terms for each title:

data %>%
 gather(key = term, value = total, terma:termc) %>%
 group_by(M, Y, title, term) %>%
 summarise(total = sum(total))

Which gives me something like this:

M | Y | title | term | count
4 | 2009 | titlea | terma | 2
4 | 2009 | titlea | termc | 1
6 | 2001 | titleb | termb | 1
4 | 2009 | titlec | terma | 1
4 | 2009 | titlec | termc | 1

Instead, I would like to group by M, Y, and term, then concatenate the titles that fall into each group and add their totals together. The desired output would look like this:

M | Y | title | term | count
4 | 2009 | titlea, titlec | terma | 3
4 | 2009 | titlea, titlec | termc | 2
6 | 2001 | titleb | termb | 1

How can I do this? Any help appreciated!

Answer 1

We can do

library(dplyr)
library(tidyr)
data %>% 
    # turn zero counts into NA so pivot_longer() can drop them
    mutate_at(vars(starts_with('term')), na_if, 0L) %>%
    pivot_longer(cols = starts_with('term'), names_to = 'term',
       values_to = 'count', values_drop_na = TRUE) %>%
    group_by(M, Y, term) %>% 
    # join the titles in each group and add up their counts
    summarise(title = toString(title), count = sum(count))
# A tibble: 3 x 5
# Groups:   M, Y [2]
#      M     Y term  title          count
#  <int> <int> <chr> <chr>          <int>
#1     4  2009 terma titlea, titlec     3
#2     4  2009 termc titlea, titlec     2
#3     6  2001 termb titleb             1
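
The concatenation step above relies on base R's toString(), which should behave like paste() with collapse = ", ". A minimal sketch of that equivalence, using a hypothetical vector (titles) to stand in for the titles of one group:

```r
# Hypothetical vector standing in for the titles that land in one group
titles <- c("titlea", "titlec")

# toString() joins the elements with ", "
joined_a <- toString(titles)

# paste() with collapse gives the same string and allows other separators
joined_b <- paste(titles, collapse = ", ")

identical(joined_a, joined_b)
```

If a different separator is wanted, for example "; ", paste(title, collapse = "; ") can be used inside summarise() in place of toString(title).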

data

data <- structure(list(M = c(4L, 6L, 4L), Y = c(2009L, 2001L, 2009L), 
    title = c("titlea", "titleb", "titlec"), terma = c(2L, 0L, 
    1L), termb = c(0L, 1L, 0L), termc = c(1L, 0L, 1L)),
    class = "data.frame", row.names = c(NA, 
-3L))

Answer 2

@akrun was very close. This ended up working:

data %>%
    pivot_longer(cols = terma:termc, names_to = 'term', values_to = 'count') %>%
    filter(count != 0) %>%
    group_by(M, Y, term) %>%
    summarise(title = toString(title), count = sum(count))
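
For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the sample data from the question and applies the same pipeline. Two things are my additions rather than part of the answer: unique() around title, which only matters if the same title appears on more than one row of a group, and .groups = "drop", which assumes dplyr 1.0.0 or later and returns an ungrouped result.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
data <- data.frame(
  M = c(4L, 6L, 4L),
  Y = c(2009L, 2001L, 2009L),
  title = c("titlea", "titleb", "titlec"),
  terma = c(2L, 0L, 1L),
  termb = c(0L, 1L, 0L),
  termc = c(1L, 0L, 1L),
  stringsAsFactors = FALSE
)

result <- data %>%
  # one row per (title, term) pair
  pivot_longer(cols = terma:termc, names_to = "term", values_to = "count") %>%
  # drop terms that never occur for a title
  filter(count != 0) %>%
  group_by(M, Y, term) %>%
  # join the distinct titles in each group and add up their counts
  summarise(title = toString(unique(title)), count = sum(count), .groups = "drop")

print(result)
```

With the sample data this should give the three rows shown in the desired output of the question.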

Answer 1 (2020-02-27 19:39:02)
Answer 2 (accepted, 2020-02-27 20:00:18)
