I want to flatten a list column of character vectors in a tibble, concatenating each row's elements into a single comma-separated string. Example data:
library(tidyverse)
tibble(person = c("Alice", "Bob", "Mary"),
score = list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue"))
# A tibble: 3 x 2
person score
<chr> <list>
1 Alice <chr [3]>
2 Bob <chr [3]>
3 Mary <chr [1]>
Expected output:
tibble(person = c("Alice", "Bob", "Mary"),
score = c("Red, Green, Blue", "Orange, Green, Yellow", "Blue" ))
# A tibble: 3 x 2
person score
<chr> <chr>
1 Alice Red, Green, Blue
2 Bob Orange, Green, Yellow
3 Mary Blue
I suspect there's a very neat tidyverse solution to this, but I've been unable to find an answer after extensive searching; I'm probably using the wrong search terms (unnest/concatenate). A tidyverse solution would be preferred. Thank you.
You can do:
library(dplyr)
library(purrr)
df %>%
mutate(score = map_chr(score, toString))
# A tibble: 3 x 2
person score
<chr> <chr>
1 Alice Red, Green, Blue
2 Bob Orange, Green, Yellow
3 Mary Blue
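For reference, `toString()` is base R shorthand for `paste(x, collapse = ", ")`, so it returns exactly one string per list element, which is what `map_chr()` requires. A minimal sketch, where the vector `x` is made up for illustration:

```r
# Illustrative vector, not taken from the data above
x <- c("Red", "Green", "Blue")

a <- toString(x)                 # collapse with the default ", " separator
b <- paste(x, collapse = ", ")   # the explicit equivalent

identical(a, b)
```

A length-one vector such as `"Blue"` passes through unchanged, which matches the row for Mary.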
If you have multiple list columns you can do:
df <- tibble(person = c("Alice", "Bob", "Mary"),
score1 = list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue"),
score2 = rev(list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue")))
df %>%
mutate_if(is.list, ~ map_chr(.x, toString))
# A tibble: 3 x 3
person score1 score2
<chr> <chr> <chr>
1 Alice Red, Green, Blue Blue
2 Bob Orange, Green, Yellow Orange, Green, Yellow
3 Mary Blue Red, Green, Blue
A simple way would be to unnest the data into long format and collapse it by group.
library(dplyr)
df %>%
tidyr::unnest(score) %>%
group_by(person) %>%
summarise(score = toString(score))
# person score
# <chr> <chr>
#1 Alice Red, Green, Blue
#2 Bob Orange, Green, Yellow
#3 Mary Blue
Another option is to use rowwise():
df %>% rowwise() %>% mutate(score = toString(score))
Base R solution 1:
df$score <- sapply(df$score, toString)
Base R solution 2:
df$score <- unlist(lapply(df$score, paste, collapse = ", "))
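One caveat with `sapply()` is that it chooses its return type from the data, so an empty list comes back as `list()` rather than a character vector. A slightly more defensive sketch uses `vapply()`, which declares the expected result up front. The `scores` list below is illustrative and stands in for `df$score`:

```r
# Illustrative stand-in for df$score
scores <- list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue")

# character(1) asserts that every element collapses to exactly one string
out <- vapply(scores, toString, character(1))

# An empty input still yields a character vector of length zero
empty <- vapply(list(), toString, character(1))
```

With a tibble this would be used as `df$score <- vapply(df$score, toString, character(1))`.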
Data:
df <- tibble(person = c("Alice", "Bob", "Mary"),
score = list(c("Red", "Green", "Blue"), c("Orange", "Green", "Yellow"), "Blue"))
Here is a general approach for (recent) tidyverse versions that does not interfere with grouping:
df %>% mutate(across(where(is.list), ~ sapply(., toString)))