Apply dplyr functions on a single column across a list using piping
I'm trying to filter on a specific column across a list of data frames. For a single data frame, using dplyr, I would typically write:
#creating dataframe
df <- data.frame(a = 0:10, d = 10:20)
# filtering column a for rows greater than 7
df %>% filter(a > 7)
I've tried doing this across a list using the following:
# creating list
x <- list(data.frame(a = 0:10, b = 10:20),
data.frame(c = 11:20, d = 21:30),
data.frame(e = 15:25, f = 35:45))
# selecting the appropriate column and trying to filter
# this is not working
x[1][[1]][1] %>% lapply(. %>% {filter(. > 2)})
# however, if I use the min() function it works
x[1][[1]][1] %>% lapply(. %>% {min(.)})
I find the %>% syntax quite easy to understand and use. However, in this case, selecting a specific column and doing something as simple as filtering does not work. I'm guessing map could be equally useful. Any help is appreciated.
You can use filter_at to refer to a column by position:
library(dplyr)
purrr::map(x, ~.x %>% filter_at(1, any_vars(. > 7)))
In filter, you can subset the column by position and use it directly:
purrr::map(x, ~.x %>% filter(.[[1]] > 7))
In base R, that would be:
lapply(x, function(y) y[y[[1]] > 7, ])
It seems you are interested in checking the condition on the first column of each data frame in your list. One solution using dplyr would be:
lapply(x, function(df) {df %>% filter_at(1, ~. > 7)})
The 1 in filter_at indicates that the condition is checked on the first column (1 is a positional index) of each data frame in the list.
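As a quick check of the positional-index idea, here is a minimal sketch that reuses the list x from the question, where the first column has a different name in each data frame. The name res is only illustrative, and all_vars() is used here instead of the formula shorthand; with a single column the two should behave the same.

```r
library(dplyr)

# Same list as in the question: the first column is named a, c and e respectively
x <- list(data.frame(a = 0:10, b = 10:20),
          data.frame(c = 11:20, d = 21:30),
          data.frame(e = 15:25, f = 35:45))

# Keep the rows where the first column (selected by position) is greater than 7
res <- lapply(x, function(df) df %>% filter_at(1, all_vars(. > 7)))

# Number of rows kept in each data frame
sapply(res, nrow)
# [1]  3 10 11
```

Because the column is selected by position, the same call works even though the column names differ between the data frames.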
After the discussion in the comments, I propose the following solution:
lapply(x, function(df) {df %>% filter(a > 7) %>% select(a) %>% slice(1)})
Input data
x <- list(data.frame(a = 0:10, b = 10:20),
data.frame(a = 11:20, b = 21:30),
data.frame(a = 15:25, b = 35:45))
Output
[[1]]
a
1 8
[[2]]
a
1 11
[[3]]
a
1 15
Using filter with across:
library(dplyr)
library(purrr)
map(x, ~ .x %>%
filter(across(names(.)[1], ~ .> 7)))
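Recent dplyr releases discourage across() inside filter() in favour of if_all() and if_any(). The following is a hedged sketch of the same filter written with if_all(), assuming dplyr 1.0.4 or later is installed; the names res, df and col are only illustrative.

```r
library(dplyr)
library(purrr)

# Same list as in the question
x <- list(data.frame(a = 0:10, b = 10:20),
          data.frame(c = 11:20, d = 21:30),
          data.frame(e = 15:25, f = 35:45))

# if_all() takes a tidyselect specification, so 1 selects the first column by position
res <- map(x, function(df) filter(df, if_all(1, function(col) col > 7)))

# The first data frame keeps only the rows where a is 8, 9 or 10
res[[1]]
```

With a single selected column, if_all() and if_any() give the same result; the difference only matters when several columns are selected.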