Filter across columns in dplyr

I want to filter the iris data frame so that it returns only the rows where Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width are all greater than 2, using the filter and across functions. I have the code below:

iris%>%
  filter(across(c(Sepal.Length, Sepal.Width , Petal.Length, Petal.Width), >2))

The error message is: Error: unexpected '>' in:

Can anyone suggest amendments to the code to solve this?

A possible solution, based on dplyr:

library(dplyr)

iris%>%
  filter(across(where(is.numeric), ~ .x > 2))

Or:

iris%>%
  filter(across(c(Sepal.Length,Sepal.Width,Petal.Length,Petal.Width), ~ .x > 2))

Or even:

iris%>%
  filter(across(ends_with(c("Length","Width")), ~ .x > 2))

Two possibilities:

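# 1. pass the operator in its prefix form, followed by its second argument
iris %>%
  filter(across(c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width), `>`, 2))

# 2. pass an anonymous function
iris %>%
  filter(across(c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width), ~ .x > 2))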

# or

iris %>%
  filter(across(c(Sepal.Length, Sepal.Width , Petal.Length, Petal.Width), function(x) x > 2))

Let's start with the second possibility. There we use anonymous function notation: the first version is purrr's style (a formula lambda), and the second is, let's call it, the classic style. purrr's style only works in packages that support it.

Now the first one. What across() wants as its second argument is a function, and an operator has to be given in its prefix form (see Advanced R). All functions in R have this form, but often it is not necessary to use it. For example:

2 + 2
`+`(2, 2)

Both lines do the same thing.

In across(), when you pass a function as the second argument, you can pass, after a comma, any further arguments that should be handed on to that function. For >, the first argument is the first number(s), which is where the values from iris go, and the second argument is the number 2, i.e. the number you chose to compare the column values against.
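The prefix form and the forwarding of extra arguments can be tried outside dplyr. The sketch below uses base R's lapply() as a stand-in, since it also forwards the arguments that follow the function; it is only an illustration of the idea, not a description of how across() is implemented. The variable names are made up for the example.

```r
# An infix call and its prefix form give the same result.
x <- c(1, 3, 5)
infix  <- x > 2
prefix <- `>`(x, 2)

# Arguments placed after the function are forwarded to it:
# each column becomes the first argument of `>`, and 2 the second.
forwarded <- lapply(iris[1:4], `>`, 2)

# The same comparison written with a classic anonymous function.
lambda <- lapply(iris[1:4], function(col) col > 2)
```

Both lists hold one logical vector per column, which is the kind of result filter() then combines row by row.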

A potential solution using dplyr:

iris %>% filter(Sepal.Length > 2 & Sepal.Width > 2 & Petal.Length > 2 & Petal.Width > 2)

And the condensed version:

iris %>% filter_at(vars(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width),all_vars(.>2))
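The scoped verbs such as filter_at() have been superseded since dplyr 1.0.0. Assuming dplyr 1.0.4 or later is installed, the same condensed filter can probably be written with if_all(), which was added for exactly this kind of "every column must pass" condition. A minimal sketch, with the variable names chosen only for illustration:

```r
library(dplyr)

# Columns to test, written out to mirror the question.
cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Keep rows where every selected column is greater than 2.
all_above <- iris %>%
  filter(if_all(all_of(cols), ~ .x > 2))

# The same filter spelled out by hand, for comparison.
by_hand <- iris %>%
  filter(Sepal.Length > 2 & Sepal.Width > 2 & Petal.Length > 2 & Petal.Width > 2)
```

Comparing all_above with by_hand is a quick way to confirm that the shorter form keeps the same rows.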

Another option, since the variables you are using have similar names, is to select them by name pattern:

iris_filter_contain = iris %>%
  filter(across(c(contains("Petal"), starts_with("Sepal")), ~ .x > 2))
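When selecting by name pattern, it can help to check which columns each helper actually matches before filtering. The sketch below uses select() for that; the variable names are hypothetical. Note that no iris column ends in "Sepal", so ends_with("Sepal") matches nothing, while starts_with("Sepal") or contains("Sepal") picks up both sepal columns.

```r
library(dplyr)

# Column names matched by each selection helper on iris.
petal_cols    <- names(select(iris, contains("Petal")))
sepal_cols    <- names(select(iris, starts_with("Sepal")))
no_match_cols <- names(select(iris, ends_with("Sepal")))
```

An empty selection does not raise an error, so a mistyped helper can silently drop a column from the condition.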
