
How to filter a nested tibble if one row contains list()/no nested tibble

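I am struggling to filter a nested tibble when one row does not contain a nested tibble.

my_df contains a nested tibble in the column products. I want to filter the nested tibble so that it keeps only the rows whose column food contains the value apple.

I can do this with mutate(products=map(products, ~filter(.x, str_detect(food, "apple")))). However, this approach fails when my_df has a row that contains no nested tibble, or an empty one (list()).

I tried to work around the issue by creating a helper column that holds the number of rows in each nested tibble, and then applying the search only to the rows where nrow > 0. However, my approach with case_when fails and I don't know why.

I would be grateful for any hint. Note that I am aware that I could split my_df into two separate dfs (one with list(), one with nested tibbles) and bind them together again afterwards with bind_rows. The approach with case_when seems more convenient in my use case, and I would like to understand why it doesn't work. Reprex below. Many thanks!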

library(tidyverse)


my_df <- structure(list(branch_name = c("basket1", "basket2"), products = list(
  structure(list(), class = c(
    "tbl_df", "tbl",
    "data.frame"
  ), row.names = integer(0), .Names = character(0)),
  structure(list(
    food = c(
      "apple",
      "grape"
    ),
    supplier = c("john", "jack")),
  class = c("tbl_df", "tbl", "data.frame"),
  row.names = c(NA, -2L)
  )
)), row.names = c(NA, -2L), class = c(
  "tbl_df",
  "tbl", "data.frame"
))
my_df
#> # A tibble: 2 x 2
#>   branch_name products        
#>   <chr>       <list>          
#> 1 basket1     <tibble [0 x 0]>
#> 2 basket2     <tibble [2 x 2]>


# Try to filter the nested df 'products', keeping only rows where str_detect(food, "apple") is TRUE
# Fails
x <- my_df %>% 
  mutate(products=map(products, ~filter(.x, str_detect(food, "apple"))))
#> Error in `mutate_cols()`:
#> ! Problem with `mutate()` column `products`.
#> i `products = map(products, ~filter(.x, str_detect(food, "apple")))`.
#> x Problem with `filter()` input `..1`.
#> i Input `..1` is `str_detect(food, "apple")`.
#> x object 'food' not found
#> Caused by error in `h()`:
#> ! Problem with `filter()` input `..1`.
#> i Input `..1` is `str_detect(food, "apple")`.
#> x object 'food' not found

  
# The filter works once the rows with an empty nested df (list()) are dropped
y <- my_df %>% 
  mutate(products_nrow=map_dbl(products, nrow)) %>% 
  filter(products_nrow>0) %>% 
  mutate(products=map(products, ~filter(.x, str_detect(food, "apple"))))

#correct result
y  
#> # A tibble: 1 x 3
#>   branch_name products         products_nrow
#>   <chr>       <list>                   <dbl>
#> 1 basket2     <tibble [1 x 2]>             2
y$products
#> [[1]]
#> # A tibble: 1 x 2
#>   food  supplier
#>   <chr> <chr>   
#> 1 apple john


# Account for the number of rows of the nested df and use case_when; fails
my_df %>% 
  mutate(products_nrow=map_dbl(products, nrow)) %>% 
  mutate(products=case_when(
    products_nrow>0 ~ map(products, ~filter(.x, str_detect(food, "apple"))),
    TRUE ~ products))
#> Error in `mutate_cols()`:
#> ! Problem with `mutate()` column `products`.
#> i `products = case_when(...)`.
#> x Problem with `filter()` input `..1`.
#> i Input `..1` is `str_detect(food, "apple")`.
#> x object 'food' not found
#> Caused by error in `h()`:
#> ! Problem with `filter()` input `..1`.
#> i Input `..1` is `str_detect(food, "apple")`.
#> x object 'food' not found

Created on 2022-03-18 by the reprex package (v2.0.1)

You can use an if condition to check whether the nested dataset has a column food:

library(dplyr)
library(purrr)
library(stringr)

my_df %>% 
  mutate(products = map(products, ~ if ("food" %in% names(.x)) filter(.x, str_detect(food, "apple")) else .x))
#> # A tibble: 2 × 2
#>   branch_name products        
#>   <chr>       <list>          
#> 1 basket1     <tibble [0 × 0]>
#> 2 basket2     <tibble [1 × 2]>
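
On why the case_when attempt in the question fails: case_when is vectorised, and each right-hand side is evaluated over the whole column before the conditions select values, so map still calls filter on the empty tibble. A minimal sketch of one alternative is purrr's map_if, which applies the function only to elements that pass a predicate. The my_df built below is a simplified stand-in for the question's data, and this is only one option, not necessarily what the answer above intends.

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# Simplified stand-in for the question's data (assumption):
# one empty nested tibble and one filled nested tibble
my_df <- tibble(
  branch_name = c("basket1", "basket2"),
  products = list(
    tibble(),
    tibble(food = c("apple", "grape"), supplier = c("john", "jack"))
  )
)

# map_if() applies the function only where the predicate is TRUE;
# elements that fail the predicate (the empty tibble) are returned unchanged
res <- my_df %>%
  mutate(products = map_if(
    products,
    ~ nrow(.x) > 0,
    ~ filter(.x, str_detect(food, "apple"))
  ))

# Number of rows left in each nested tibble
map_int(res$products, nrow)
```

The predicate here tests nrow; testing "food" %in% names(.x) instead would mirror the if condition used above.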

Another possible solution:

library(tidyverse)

my_df[["products"]] <-
  map(my_df[["products"]], ~ if (nrow(.x) != 0) filter(.x, food == "apple") else .x)

my_df

#> # A tibble: 2 × 2
#>   branch_name products        
#>   <chr>       <list>          
#> 1 basket1     <tibble [0 × 0]>
#> 2 basket2     <tibble [1 × 2]>
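
Note that this version compares with food == "apple", whereas the question uses str_detect(food, "apple"). The two agree on the question's data but are not equivalent in general: == requires the whole string to match, while str_detect matches the pattern anywhere in the string. A small sketch with made-up values (not taken from the question) shows the difference:

```r
library(stringr)

# Hypothetical values, chosen only to illustrate the difference
food <- c("apple", "pineapple", "grape")

exact   <- food == "apple"            # whole-string equality
partial <- str_detect(food, "apple")  # regex match anywhere in the string

exact
partial
```

Here "pineapple" is kept by str_detect but not by ==, so pick whichever matches the intended meaning of "contains apple".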

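A hacky solution that doesn't directly answer your question, but probably the easiest thing to do, is to simply unnest (which drops the empty tibbles) and nest again before applying your filter: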

 my_df %>% 
   unnest(products) %>%
   nest(products = -branch_name) %>%
   mutate(products=map(products, ~filter(.x, str_detect(food, "apple"))))

Resulting in:

# A tibble: 1 × 2
  branch_name products        
  <chr>       <list>          
1 basket2     <tibble [1 × 2]>
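
The result above no longer contains basket1, because unnest drops rows whose nested element is empty. If that row should survive the round trip, one possible variation is the keep_empty argument of tidyr's unnest, which replaces size-0 elements with a single row of missing values. The sketch below assumes a simplified my_df standing in for the question's data and a reasonably recent tidyr; it has not been checked against the package versions used in the question.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(stringr)
library(tibble)

# Simplified stand-in for the question's data (assumption)
my_df <- tibble(
  branch_name = c("basket1", "basket2"),
  products = list(
    tibble(),
    tibble(food = c("apple", "grape"), supplier = c("john", "jack"))
  )
)

res <- my_df %>%
  # keep_empty = TRUE turns the empty tibble into one row of NAs
  # instead of dropping the basket1 row
  unnest(products, keep_empty = TRUE) %>%
  nest(products = -branch_name) %>%
  # str_detect() returns NA for the NA row, and filter() drops NA rows,
  # so basket1 ends up with a nested tibble that has no rows
  mutate(products = map(products, ~ filter(.x, str_detect(food, "apple"))))

res
```

Unlike the original data, the nested tibble for basket1 now has the columns food and supplier with zero rows, rather than being a tibble with no columns.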
