Filter list variable in dplyr

In general, how do we filter by a list variable in dplyr?

E.g. a data frame where one variable is a list of objects of different classes:

library(tidyverse)

aa <- tibble(ss = c(1,2),
             dd = list(NA,
                       matrix(data = c(1,2,3,4),
                              nrow = 2,
                              ncol = 2)))

> aa
# A tibble: 2 x 2
#     ss dd           
#  <dbl> <list>       
#1  1.00 <lgl [1]>    
#2  2.00 <dbl [2 × 2]>

For example, if I want to filter for logicals (though it could be anything), and the variable were not a list, it would be as simple as:

aa %>% filter(is.logical(dd))

But this returns

# A tibble: 0 x 2
# ... with 2 variables: ss <dbl>, dd <list>

Because it's not the first element that's a logical, it's the first element of the first element: dd[1] is still a list of length one, while dd[[1]] is the logical inside it:

> is.logical(aa$dd[1])
# [1] FALSE
> is.logical(aa$dd[[1]])
# [1] TRUE
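
To make the distinction concrete, here is a minimal sketch that rebuilds the same tibble and inspects the column with base R only. The variable names (whole_column, per_row and so on) are illustrative, not part of the original post. It shows that is.logical() on the whole column is a single FALSE, which is why filter() keeps no rows, whereas testing element by element gives one value per row.

```r
library(tibble)

aa <- tibble(ss = c(1, 2),
             dd = list(NA, matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)))

# The column itself is a list, so the test on the whole column is one FALSE
whole_column <- is.logical(aa$dd)

# `[` keeps the list wrapper; `[[` extracts the element inside it
single_bracket <- class(aa$dd[1])
double_bracket <- class(aa$dd[[1]])

# Testing element by element gives one logical value per row
per_row <- vapply(aa$dd, is.logical, logical(1))
```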

One may use purrr::map for other operations on nested list variables, but this also doesn't work.

> aa %>% filter(map(.x = dd,
+                   .f = is.logical))
# Error in filter_impl(.data, quo) : basic_string::resize

What am I missing here?

As 'dd' is a list column, we can loop through it using map. Each element of 'dd' can itself have more than one element, so we add the condition that all of its elements are NA, and filter the rows of the dataset on that:

library(tidyverse)
aa %>%
   filter(map_lgl(dd, ~ .x %>%
                           is.na %>% 
                             all))
# A tibble: 1 x 2
#     ss dd       
#   <dbl> <list>   
#1     1 <lgl [1]>
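
The need for all() can be seen by looking at what is.na() returns for each element. The sketch below works on the bare list rather than the tibble, and the names na_per_element and all_na are assumptions made for illustration.

```r
library(purrr)

dd <- list(NA, matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2))

# is.na() is element-wise, so the matrix yields a 2 x 2 logical matrix
na_per_element <- map(dd, is.na)

# all() collapses each result to a single TRUE/FALSE, which map_lgl() requires
all_na <- map_lgl(dd, ~ all(is.na(.x)))
```

Without all(), the second element would produce four values instead of one, and map_lgl() would refuse it.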

If this is about filtering based on class:

aa %>%
    filter(map_lgl(dd, is.logical))
# A tibble: 1 x 2
#     ss dd       
#  <dbl> <list>   
#1     1 <lgl [1]>

In the OP's code, the map output is still a list. map_lgl returns a logical vector instead, which is what filter needs.
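
As a rough check of that difference, the following sketch compares the two return types and then filters on the logical vector. The names as_list, keep and result are hypothetical.

```r
library(dplyr)
library(purrr)

aa <- tibble(ss = c(1, 2),
             dd = list(NA, matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)))

# map() always returns a list, one element per row
as_list <- map(aa$dd, is.logical)

# map_lgl() returns a plain logical vector of the same length
keep <- map_lgl(aa$dd, is.logical)

# filter() accepts a logical vector with one value per row
result <- filter(aa, keep)
```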

The best I can do is to create a dummy variable using is.logical with purrr::map, unlist it, filter by it, then un-select the dummy variable. Works, but what a kerfuffle.

aa %>%
  mutate(ff = map(.x = dd,
                       .f = is.logical),
         ff = unlist(ff)) %>%
  filter(ff == TRUE) %>%
  select(-ff)

# A tibble: 1 x 2
#      ss dd       
#   <dbl> <list>   
# 1  1.00 <lgl [1]>
