Manipulating each row of a data frame that contains lists in R

Question

I have a data frame that looks like this:

df <- data.frame(id=c("list1", "list2"))
df$Content <- list(c("A", "B", "C"), c("A", "B", "A"))

For each row in "Content", I would like to first remove duplicates, then find all rows containing a certain element, for example "A"; here that would return both rows 1 and 2.

I've tried using duplicated() with apply(), but it seems to look for duplicates at the list level (that is, whether c("A", "B", "C") matches c("A", "B", "A")) instead of finding duplicates within each list.

Similarly, I'm having trouble checking for the presence of a specific element within each list, rather than matching against the list as a whole.

The only thing I could think of is a for loop, but I was wondering if there's a more elegant way to do this.

Answer 1

We can use map to loop over the list elements and return the unique elements of each, then filter the rows of the dataset where 'A' is in 'Content':

library(dplyr)
library(purrr)
df %>%
   mutate(Content = map(Content, unique)) %>%   # de-duplicate within each row's vector
   filter(map_lgl(Content, ~ 'A' %in% .x))      # keep rows whose vector contains 'A'
#      id Content
# 1 list1 A, B, C
# 2 list2    A, B
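
For comparison, the same two steps can be sketched in base R without extra packages. This is only an illustration and not part of the original answer; the names has_A and result are made up for the example.

```r
# Same example data as in the question.
df <- data.frame(id = c("list1", "list2"))
df$Content <- list(c("A", "B", "C"), c("A", "B", "A"))

# Step 1: drop duplicates *within* each list element.
df$Content <- lapply(df$Content, unique)

# Step 2: one TRUE/FALSE per row, testing membership inside each element.
has_A <- vapply(df$Content, function(x) "A" %in% x, logical(1))

# Keep only the rows whose vector contains "A".
result <- df[has_A, , drop = FALSE]
```

lapply() and vapply() work on one list element at a time, which is what avoids the whole-list comparison described in the question.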

Another option is to unnest the list column, apply a group_by filter to the distinct rows, and then condense (from the devel version of dplyr) or summarise back into a list column:

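library(tidyr)  # provides unnest()
df %>%
    unnest(c(Content)) %>%          # one row per list element
    distinct() %>%                  # drop repeated id/element pairs
    group_by(id) %>%
    filter('A' %in% Content) %>%    # keep groups that contain 'A'
    condense(Content)               # condense() is from the devel version of dplyr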
# A tibble: 2 x 2
# Rowwise:  id
#   id    Content
#   <fct> <list>
# 1 list1 <chr [3]>
# 2 list2 <chr [2]>
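
The answer also mentions summarise as a way to rebuild the list column. A minimal sketch of that variant, assuming released versions of dplyr and tidyr in which condense() may not be available (the name result is made up for the example):

```r
library(dplyr)
library(tidyr)

# Same example data as in the question.
df <- data.frame(id = c("list1", "list2"))
df$Content <- list(c("A", "B", "C"), c("A", "B", "A"))

result <- df %>%
  unnest(c(Content)) %>%               # one row per list element
  distinct() %>%                       # drop repeated id/element pairs
  group_by(id) %>%
  filter("A" %in% Content) %>%         # keep groups that contain "A"
  summarise(Content = list(Content))   # collapse each group back into a list column
```

Unlike the condense() output, the result here should be an ordinary tibble with one row per id rather than a rowwise one.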

Solution 1
1 · Accepted · 2020-03-12 20:02:58