
Subset dataframe based on match in list

I want to subset my dataframe by matching against string values in a list. My df looks like this:

df3:
     functions                         main         
1    burger_function  (desc)           c("burger", "fries", "coke", "onion rings", "cheese")    
2    steak_function  (desc)            c("steak", "mash", "jack", "gravy", "cajun_fries")          
3    chicken_function (desc)           c("chicken", "salad", "sprite", "soup")       
4    fish_function (desc)              c("fish", "rice", "water", "garlic_bread")      
   

My first column contains functions, each with a description. I want to be able to search for a "main" value and subset the list to show which function it belongs to. So far I have tried this code, but I am getting wrong values mixed in with the right ones. Is there a better way to accomplish this?

func_sub <- df3[sapply(df3$main, function(x) x %in% "fries"),]

To end up with something like this:

df3:
     functions                         main         
1    burger_function  (desc)           "fries" 

We may need "fries" %in% x instead of x %in% "fries", because the former returns a single TRUE/FALSE for each row, whereas the version the OP used returns a vector of TRUE/FALSE values for each row.

df3[sapply(df3$main, function(x) "fries" %in% x),]
           functions                                     main
1 burger_function  (desc) burger, fries, coke, onion rings, cheese
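
To see why the order of the operands matters, here is a minimal sketch on a single vector. The name `x` simply stands for one element of `df3$main` (the first row of the example data); `elementwise` and `single` are illustrative names only.

```r
# One element of the list column, taken from the first row of the example data
x <- c("burger", "fries", "coke", "onion rings", "cheese")

# x %in% "fries" tests every element of x, so it has the same length as x
elementwise <- x %in% "fries"

# "fries" %in% x tests the single string against x, so it has length 1
single <- "fries" %in% x

length(elementwise)  # 5
length(single)       # 1
```

Because a row filter needs exactly one TRUE/FALSE per row, the second form (or the first wrapped in any()) is the one that fits inside sapply here.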

With the OP's code, we may also wrap the comparison in any to return a single TRUE/FALSE:

df3[sapply(df3$main, function(x) any(x %in% "fries")),]
         functions                                     main
1 burger_function  (desc) burger, fries, coke, onion rings, cheese

Note: This only subsets the rows of the original data, not the elements of the list. If we need to subset 'main' as well:

out <- df3[sapply(df3$main, function(x) "fries" %in% x),]
out$main <- lapply(out$main, function(x) x[x %in% "fries"])
out
                functions  main
1 burger_function  (desc) fries
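
The row filter and the element filter can also be combined in a small helper. This is only a sketch: `subset_by_main` is a hypothetical name, not part of any package, and it assumes the data frame has a list column called `main`, as in the example data (reduced to two rows here). It uses `vapply` instead of `sapply` so that the row filter is guaranteed to be a logical vector.

```r
# Reduced version of the example data: two rows, with 'main' as a list column
df3 <- data.frame(functions = c("burger_function  (desc)", "steak_function  (desc)"))
df3$main <- list(
  c("burger", "fries", "coke", "onion rings", "cheese"),
  c("steak", "mash", "jack", "gravy", "cajun_fries")
)

# Hypothetical helper: keep rows whose 'main' contains 'value',
# then keep only the matching elements inside 'main'
subset_by_main <- function(df, value) {
  # any() collapses the comparison to one TRUE/FALSE per row
  keep <- vapply(df$main, function(x) any(x %in% value), logical(1))
  out <- df[keep, , drop = FALSE]
  out$main <- lapply(out$main, function(x) x[x %in% value])
  out
}

res <- subset_by_main(df3, "fries")
res
```

Since %in% matches whole strings, the steak row is not returned even though it contains "cajun_fries".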

data

df3 <- structure(list(functions = c("burger_function  (desc)", "steak_function  (desc)", 
"chicken_function (desc)", "fish_function (desc)"), main = list(
    c("burger", "fries", "coke", "onion rings", "cheese"), c("steak", 
    "mash", "jack", "gravy", "cajun_fries"), c("chicken", "salad", 
    "sprite", "soup"), c("fish", "rice", "water", "garlic_bread"
    ))), row.names = c(NA, -4L), class = "data.frame")
