How to subset a dataframe based on values in a list column
I'm pulling down information from an API, and some columns contain nested values. I need to filter on those values to return the rows I need. Here's an example:
library(dplyr)
# Make Data
problem <- list(list("thing 1", "thing 2"), list("thing 1", "thing 2", "thing 3"), list("thing 1"))
name <- list("joe", "sue", "nancy")
df <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem))
# How can I subset the rows where the problem column contains "thing 3"?
filter(df, name == "sue") # this works fine
filter(df, "thing 3" %in% problem) # this doesn't
It's obvious to me that this happens because the list is nested and filter() isn't "seeing" the data, but it's less clear to me how to get around it. Additionally, the data I'm getting back is fairly large and has an arbitrary number of items per list within the column, so I don't want to unnest the column if I can avoid it.
EDIT: I'm not married to a dplyr solution. In fact, if there is a data.table solution, I'd be especially interested to hear it, but I'm fine with base R or anything else!
Any help would be appreciated.
library(purrr)  # provides map_lgl()
df %>%
  filter(map_lgl(problem, ~ any('thing 3' == .x)))
name problem
1 sue thing 1,....
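For context on why the original filter(df, "thing 3" %in% problem) returns no rows: %in% is evaluated once against the whole column, so it produces a single value instead of one value per row. Here is a minimal sketch in base R (no packages assumed) that rebuilds the example data and tests each row's list separately. The names whole_column, has_thing3 and result are only illustrative.

```r
# Rebuild the example data from the question
problem <- list(list("thing 1", "thing 2"),
                list("thing 1", "thing 2", "thing 3"),
                list("thing 1"))
df <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem))

# %in% looks at the column as a whole, so this is a single value,
# not one value per row
whole_column <- "thing 3" %in% df$problem
length(whole_column)

# Test each row's list on its own; vapply() guarantees a logical vector
has_thing3 <- vapply(df$problem, function(x) "thing 3" %in% x, logical(1))

# Keep only the rows where the test is TRUE
result <- df[has_thing3, ]
```

This keeps the list column as it is, so nothing has to be unnested.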
Or use subset() from base R:
subset(df, grepl("thing 3", problem))
name problem
2 sue thing 1,....
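One caveat with grepl() here: it coerces each row's list to a string and then does a pattern match, so "thing 3" would also match a value such as "thing 30". If exact matching matters, a per-row %in% test inside subset() is one option. The sketch below uses made-up data (df2, with a "thing 30" entry that is not in the original question) to show the difference. The last part is an assumed data.table equivalent of the same row test, and it only runs if that package is installed.

```r
# Hypothetical data: "thing 30" is added only to illustrate partial matching
problem2 <- list(list("thing 1", "thing 30"),
                 list("thing 1", "thing 2", "thing 3"),
                 list("thing 1"))
df2 <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem2))

# Pattern match: also picks up the row that contains "thing 30"
by_pattern <- subset(df2, grepl("thing 3", problem))

# Exact match: test membership within each row's list
by_exact <- subset(df2, vapply(problem, function(x) "thing 3" %in% x, logical(1)))

# The same row test used as a data.table row filter (skipped if data.table is missing)
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::data.table(name = c("joe", "sue", "nancy"), problem = problem2)
  dt_exact <- dt[vapply(problem, function(x) "thing 3" %in% x, logical(1))]
}
```

With this data, by_pattern keeps both joe and sue, while by_exact keeps only sue.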