How to subset a dataframe based on values in a list column

I'm pulling information from an API, and some columns contain nested values. I need to filter on those values in order to return the information I need. Here's an example:

library(dplyr)

# Make data: `problem` is a list column with a variable number of items per row
problem <- list(list("thing 1", "thing 2"), list("thing 1", "thing 2", "thing 3"), list("thing 1"))

df <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem))

# How can I subset the rows where the problem column contains "thing 3"?
filter(df, name == "sue") # this works fine
filter(df, "thing 3" %in% problem) # this doesn't

It's obvious to me that this happens because the list is nested and filter() isn't "seeing" the data, but it's less clear to me how to get around it. Additionally, the data I'm returning is fairly large and has an arbitrary number of items per list within the column, so I don't want to unnest the column if I can avoid it.

EDIT: I'm not married to a dplyr solution, and in fact if there is a data.table solution I'd be especially interested to hear it, but I'm fine with base R or whatever!

Any help would be appreciated.

library(purrr) # provides map_lgl()

df %>%
  filter(map_lgl(problem, ~ any("thing 3" == .x)))

  name      problem
1  sue thing 1,....
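
For anyone wondering why the original filter() call comes back empty: `%in%` is evaluated once against the whole column rather than once per row, so it yields a single value instead of one logical per row. The following is a minimal base R sketch of the same per-row idea as map_lgl() above, assuming the example data from the question; the names `whole_column`, `per_row`, and `result` are only illustrative.

```r
# Rebuild the example data from the question
problem <- list(list("thing 1", "thing 2"),
                list("thing 1", "thing 2", "thing 3"),
                list("thing 1"))
df <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem))

# Evaluated against the whole column: a single logical, not one per row
whole_column <- "thing 3" %in% df$problem

# Evaluated per row: flatten each row's list and test for an exact match
per_row <- vapply(df$problem,
                  function(x) "thing 3" %in% unlist(x),
                  logical(1))

# Keep only the rows whose list contains "thing 3"
result <- df[per_row, ]
```

Because vapply() walks the list column in place, nothing has to be unnested, which was one of the constraints in the question.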
We can try subset() from base R:

subset(df, grepl("thing 3", problem))

Output:
  name      problem
2  sue thing 1,....
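
One caveat with the grepl() approach: it does pattern matching on a text representation of each row's list, so "thing 3" would also match an item such as "thing 30". The sketch below illustrates the difference on made-up data (the names "ann" and "bob" and the item "thing 30" are hypothetical, not from the question) and shows an exact-match alternative that should behave the same way on the original data.

```r
# Hypothetical data: the first row contains "thing 30" but not "thing 3"
problem <- list(list("thing 1", "thing 30"), list("thing 3"))
df <- data.frame(name = c("ann", "bob"), problem = I(problem))

# grepl() coerces each list element to text and matches substrings
substring_hits <- grepl("thing 3", df$problem)

# Exact comparison against the flattened items of each row
exact_hits <- vapply(df$problem,
                     function(x) any(unlist(x) == "thing 3"),
                     logical(1))

# subset() accepts the exact-match vector just like the grepl() result
exact_rows <- subset(df, exact_hits)
```

If the items can never be prefixes of one another, the grepl() version is shorter and should give the same rows.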

