How can I group_by() and then concatenate values of one column into a single column in R using dplyr?

I have data in the form of:

M | Y | title | terma | termb | termc
4 | 2009 | titlea | 2 | 0 | 1
6 | 2001 | titleb | 0 | 1 | 0
4 | 2009 | titlec | 1 | 0 | 1

I'm using dplyr's group_by() and summarise() to count instances of terms for each title:

data %>%
 gather(key = term, value = total, terma:termc) %>%
 group_by(M, Y, title, term) %>%
 summarise(total = sum(total))

Which gives me something like this:

M | Y | title | term | count
4 | 2009 | titlea | terma | 2
4 | 2009 | titlea | termc | 1
6 | 2001 | titleb | termb | 1
4 | 2009 | titlec | terma | 1
4 | 2009 | titlec | termc | 1

Instead, I would like to be able to group by M, Y, and term, then concatenate any titles that are grouped and add their totals together. Desired output would look like this:

M | Y | title | term | count
4 | 2009 | titlea, titlec | terma | 3
4 | 2009 | titlea, titlec | termc | 2
6 | 2001 | titleb | termb | 1

How can I do this? Any help appreciated!

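We can convert the zeros to NA, reshape to long format while dropping the NA rows, and then group by M, Y, and term: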

library(dplyr)
library(tidyr)
data %>% 
    mutate(across(starts_with('term'), ~ na_if(.x, 0))) %>%
    pivot_longer(cols = starts_with('term'), names_to = 'term',
       values_to = 'count', values_drop_na = TRUE) %>%
    group_by(M, Y, term) %>% 
    summarise(title = toString(title), count = sum(count))
# A tibble: 3 x 5
# Groups:   M, Y [2]
#      M     Y term  title          count
#  <int> <int> <chr> <chr>          <int>
#1     4  2009 terma titlea, titlec     3
#2     4  2009 termc titlea, titlec     2
#3     6  2001 termb titleb             1

data

data <- structure(list(M = c(4L, 6L, 4L), Y = c(2009L, 2001L, 2009L), 
    title = c("titlea", "titleb", "titlec"), terma = c(2L, 0L, 
    1L), termb = c(0L, 1L, 0L), termc = c(1L, 0L, 1L)),
    class = "data.frame", row.names = c(NA, 
-3L))
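
The concatenation step in the summarise() call relies on base R's toString(), which behaves like paste(x, collapse = ", "). A minimal sketch of that step in isolation, assuming a made-up vector `titles` that stands in for the titles falling into one group:

```r
# Hypothetical vector standing in for the titles that land in one group
titles <- c("titlea", "titlec")

# toString() collapses a vector into a single comma-separated string
joined <- toString(titles)

# paste() with collapse gives the same result and allows another separator
joined_paste <- paste(titles, collapse = ", ")
joined_semicolon <- paste(titles, collapse = "; ")

# unique() drops repeated titles before collapsing, if duplicates can occur
joined_unique <- toString(unique(c(titles, "titlea")))
```

If a different separator is wanted in the grouped result, swapping toString(title) for paste(title, collapse = "; ") inside summarise() should be enough.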

@akrun was very close. This ended up working:

data %>%
    pivot_longer(cols = terma:termc, names_to = 'term', values_to = 'count') %>%
    filter(count != 0) %>%
    group_by(M, Y, term) %>%
    summarise(title = toString(title), count = sum(count))
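
For anyone who wants to run the filter-based approach end to end, here is a self-contained sketch on the sample data from the question. The name `result` and the `.groups = "drop"` argument are additions for illustration: the latter assumes dplyr 1.0.0 or later and returns an ungrouped tibble instead of one still grouped by M and Y.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
data <- data.frame(
  M = c(4L, 6L, 4L),
  Y = c(2009L, 2001L, 2009L),
  title = c("titlea", "titleb", "titlec"),
  terma = c(2L, 0L, 1L),
  termb = c(0L, 1L, 0L),
  termc = c(1L, 0L, 1L)
)

result <- data %>%
  # One row per (title, term) pair
  pivot_longer(cols = terma:termc, names_to = "term", values_to = "count") %>%
  # Drop terms that never occur for a title, so they are not concatenated
  filter(count != 0) %>%
  group_by(M, Y, term) %>%
  # Join the titles in each group and add up their counts
  summarise(title = toString(title), count = sum(count), .groups = "drop")
```

Filtering before grouping matters here: without it, every title would appear in every group for its M and Y, including terms it has a zero count for.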
