
How to subset a dataframe based on values in a list column

I'm having an issue where I'm pulling down information from an API, and there are nested values within specific columns. I need to filter on those values in order to return the information I need. Here's an example:

library(dplyr)

# Make Data
problem <- list(list("thing 1", "thing 2"), list("thing 1", "thing 2", "thing 3"), list("thing 1"))
name <- list("joe", "sue", "nancy")

df <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem))

# How can I subset the rows where the problem column contains "thing 3"?
filter(df, name == "sue") # this works fine
filter(df, "thing 3" %in% problem) # this doesn't

It's obvious to me that it's because the list is nested and filter() isn't "seeing" the data, but it's less clear to me how to get around it. Additionally, the data that I'm returning is fairly large, and has an arbitrary number of items per list within the column, so I don't want to unnest the column if I can avoid it.

EDIT: I'm not married to a dplyr solution. If there is a data.table solution, I'd be especially interested to hear it, but base R or anything else is fine too!

Any help would be appreciated.

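library(purrr) # provides map_lgl()

# Test each row's list separately and keep the rows where any element matches
df %>%
  filter(map_lgl(problem, ~ any("thing 3" == .x)))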

  name      problem
1  sue thing 1,....
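
For readers who prefer to avoid extra packages, the same per-row test can be sketched in base R. This is only an illustration that rebuilds the example data from the question; the names `keep` and `result` are made up here, and the exact-match test with `%in%` is an assumption about what "contains" should mean.

```r
# Rebuild the example data from the question
problem <- list(list("thing 1", "thing 2"),
                list("thing 1", "thing 2", "thing 3"),
                list("thing 1"))
df <- data.frame(name = c("joe", "sue", "nancy"), problem = I(problem))

# One TRUE/FALSE per row: does this row's list contain exactly "thing 3"?
# (keep and result are hypothetical names used only for this sketch)
keep <- vapply(df$problem, function(x) "thing 3" %in% x, logical(1))

# Index the rows with the logical vector; the list column is never unnested
result <- df[keep, ]
print(result$name)
```

Because vapply() is told to expect logical(1), it should stop with an error if any row yields something other than a single TRUE or FALSE, which can help when the API returns irregular data.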
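We can also try subset() from base R. Note that grepl() does substring matching on the text form of each list, so "thing 3" would also match a value such as "thing 30":
subset(df, grepl("thing 3", problem))

Output: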
  name      problem
2  sue thing 1,....
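
Since the question also asks about data.table, here is one possible sketch. It assumes the data.table package is installed and that the nested data can be stored as a plain list column; `problems_raw`, `dt`, and `hits` are names invented for this example, not anything from the original post.

```r
library(data.table) # assumption: data.table is installed

# Hypothetical name for the nested values, to avoid clashing with the column
problems_raw <- list(list("thing 1", "thing 2"),
                     list("thing 1", "thing 2", "thing 3"),
                     list("thing 1"))

# data.table stores a list argument as a list column
dt <- data.table(name = c("joe", "sue", "nancy"), problem = problems_raw)

# Filter in i with one logical value per row; nothing is unnested
hits <- dt[vapply(problem, function(x) "thing 3" %in% x, logical(1))]
print(hits$name)
```

The row test is the same exact-match check as in the other approaches; only the subsetting syntax changes.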
