Manipulate each Row of a Dataframe containing Lists in R
I have a dataframe that looks like this:
df <- data.frame(id=c("list1", "list2"))
df$Content <- list(c("A", "B", "C"), c("A", "B", "A"))
For each row in "Content", I would like to first remove duplicates, then find all rows containing certain elements. For example, searching for "A" should return both row 1 and row 2.
I've tried using duplicated() with apply(), but it seems to find duplicates at the list level (that is, it checks whether c("A", "B", "C") matches c("A", "B", "A")) instead of finding duplicates within each list.
Similarly, I'm having trouble identifying the presence of a specific element in the list, instead of trying to match things to the list as a whole.同样,我无法识别列表中特定元素的存在,而不是尝试将事物与整个列表相匹配。
The only thing I could think of is using a for loop, but I was wondering if there's a more elegant way to do this.
We can use map() to loop over the list elements and return the unique elements, then filter() the rows of the dataset where there is 'A' in 'Content'.
library(dplyr)
library(purrr)
df %>%
  # deduplicate within each row's vector
  mutate(Content = map(Content, unique)) %>%
  # keep rows whose vector contains 'A'
  filter(map_lgl(Content, ~ 'A' %in% .x))
# id Content
#1 list1 A, B, C
#2 list2 A, B
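For comparison, the same two steps can be sketched in base R without any packages. This is a minimal illustration rather than part of the original answer; the names has_A and result are made up for the example. It also shows why duplicated() on the column did not help: it compares whole vectors across rows instead of looking inside each one.

```r
# Rebuild the example data from the question
df <- data.frame(id = c("list1", "list2"))
df$Content <- list(c("A", "B", "C"), c("A", "B", "A"))

# duplicated() on a list column compares whole vectors between rows,
# so it cannot find the repeated "A" inside the second row
duplicated(df$Content)

# lapply() visits each list element, so unique() runs within each row
df$Content <- lapply(df$Content, unique)

# vapply() returns one TRUE/FALSE per row: does this row's vector contain "A"?
has_A <- vapply(df$Content, function(x) "A" %in% x, logical(1))

# Keep only the rows where "A" is present
result <- df[has_A, ]
```

vapply() with logical(1) is used instead of sapply() so that the result is guaranteed to be a logical vector with one value per row.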
Another option is to unnest() the list column (unnest() comes from tidyr), keep the distinct() rows, do a group_by() on id and filter() the groups that contain 'A', and then either condense() (from the devel version of dplyr) or summarise() the result back into a list column.
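library(tidyr)  # provides unnest()
df %>%
  unnest(c(Content)) %>%
  distinct() %>%
  group_by(id) %>%
  filter('A' %in% Content) %>%
  # condense() is from the devel version of dplyr; summarise(Content = list(Content)) is the alternative
  condense(Content)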
# A tibble: 2 x 2
# Rowwise: id
# id Content
# <fct> <list>
#1 list1 <chr [3]>
#2 list2 <chr [2]>
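Since condense() is only described as coming from a devel version of dplyr, here is a sketch of the summarise() variant that the answer mentions as the alternative. It assumes released versions of dplyr and tidyr are installed; the name res is made up for the example.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
df <- data.frame(id = c("list1", "list2"))
df$Content <- list(c("A", "B", "C"), c("A", "B", "A"))

res <- df %>%
  unnest(c(Content)) %>%                # one row per (id, element) pair
  distinct() %>%                        # drop repeated (id, element) pairs
  group_by(id) %>%
  filter("A" %in% Content) %>%          # keep only groups that contain "A"
  summarise(Content = list(Content))    # collapse each group back into a list column
```

Unlike the condense() output shown above, the result here is an ordinary (not rowwise) tibble with one row per id.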