Filter data frame by character column name (in dplyr)

I have a data frame and want to filter it in one of two ways, by either column "this" or column "that". I would like to be able to refer to the column name as a variable. How (in dplyr, if that makes a difference) do I refer to a column name by a variable?

library(dplyr)
df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
df
#   this that
# 1    1    1
# 2    2    1
# 3    2    2
df %>% filter(this == 1)
#   this that
# 1    1    1

But say I want to use the variable column to hold either "this" or "that", and filter on whatever the value of column is. Both as.symbol and get work in other contexts, but not this:

column <- "this"
df %>% filter(as.symbol(column) == 1)
# [1] this that
# <0 rows> (or 0-length row.names)
df %>% filter(get(column) == 1)
# Error in get("this") : object 'this' not found

How can I turn the value of column into a column name?

From the current dplyr help file (emphasis by me):

dplyr used to offer twin versions of each verb suffixed with an underscore. These versions had standard evaluation (SE) semantics: rather than taking arguments by code, like NSE verbs, they took arguments by value. Their purpose was to make it possible to program with dplyr. However, dplyr now uses tidy evaluation semantics. NSE verbs still capture their arguments, but you can now unquote parts of these arguments. This offers full programmability with NSE verbs. Thus, the underscored versions are now superfluous.

So we basically need to do two things to be able to refer to the value "this" of the variable column inside dplyr::filter():

  1. We need to turn the variable column, which is of type character, into type symbol.

    Using base R this can be achieved by the function as.symbol(), which is an alias for as.name(). The former is preferred by the tidyverse developers because it follows a more modern terminology (R types instead of S modes).

    Alternatively, the same can be achieved by rlang::sym() from the tidyverse.

  2. We need to unquote the symbol from 1).

    What unquoting exactly means can be learned in the vignette Programming with dplyr. It is achieved by the syntactic sugar !!.

    (In earlier versions of dplyr (or the underlying rlang, respectively) there used to be situations (incl. yours) where !! would collide with the single !, but this is not an issue anymore since !! gained the right operator precedence.)

Applied to your example:

library(dplyr)
df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))
column <- "this"

df %>% filter(!!as.symbol(column) == 1)
#   this that
# 1    1    1
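
The same two steps can be wrapped in a small helper, which is handy when the column name arrives as a function argument. This is only a sketch: filter_by_col and its arguments are hypothetical names, not part of dplyr, and rlang::sym() stands in for as.symbol().

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): keep the rows where the column
# named by the string `col` equals `value`.
filter_by_col <- function(data, col, value) {
  col_sym <- rlang::sym(col)          # step 1: character -> symbol
  filter(data, !!col_sym == value)    # step 2: unquote the symbol with !!
}

df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
filter_by_col(df, "this", 1)   # rows where this == 1
filter_by_col(df, "that", 1)   # rows where that == 1
```

Note that `value` is looked up with data masking, so this sketch assumes the data frame has no column that is itself called value.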

I would steer clear of using get() altogether. It seems like it would be quite dangerous in this situation, especially if you're programming. You could use either an unevaluated call or a pasted character string, but you'll need to use filter_() instead of filter().

df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
column <- "this"

Option 1 - using an unevaluated call:

You can hard-code y as 1, but here I show it as y to illustrate how you can change the expression values easily.

expr <- lazyeval::interp(quote(x == y), x = as.name(column), y = 1)
## or 
## expr <- substitute(x == y, list(x = as.name(column), y = 1))
df %>% filter_(expr)
#   this that
# 1    1    1

Option 2 - using paste() (and obviously easier):

df %>% filter_(paste(column, "==", 1))
#   this that
# 1    1    1

The main thing about these two options is that we need to use filter_() instead of filter(). In fact, from what I've read, if you're programming with dplyr you should always use the *_() functions.

I used this post as a helpful reference: "character string as function argument r". I'm using dplyr version 0.3.0.2.

Here's another solution for the latest dplyr version:

df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))
column <- "this"

df %>% filter(.[[column]] == 1)

#  this that
#1    1    1

Regarding Richard's solution, I just want to add that if the column is of type character, you can use shQuote to filter by character values.

For example, you can use

df %>% filter_(paste(column, "==", shQuote("a")))

If you have multiple filters, you can specify collapse = "&" in paste.

df %>% filter_(paste(c("column1","column2"), "==", shQuote(c("a","b")), collapse = "&"))
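
To see what filter_() actually receives here, it can help to print the string that paste() builds. A small base-R sketch, reusing the placeholder column names column1 and column2 from the line above (cond is just an illustrative variable name):

```r
# Build the condition string only; no data frame or dplyr is needed for this.
cond <- paste(c("column1", "column2"), "==", shQuote(c("a", "b")),
              collapse = "&")
print(cond)
# [1] "column1 == 'a'&column2 == 'b'"
```

shQuote() wraps each value in single quotes, so the values are parsed as character literals rather than as column names.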

The most up-to-date way to do this is my.data.frame %>% filter(.data[[myName]] == 1), where myName is a variable in the environment that contains the column name.
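
Applied to the data frame from the question, this might look as follows. It is a minimal sketch; myName and result are just illustrative variable names.

```r
library(dplyr)

df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
myName <- "this"   # ordinary variable holding the column name as a string

# .data is the pronoun for the data frame being filtered; [[ ]] looks the
# column up by its string name.
result <- df %>% filter(.data[[myName]] == 1)
result
```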

Or using filter_at

library(dplyr)
df %>% 
   filter_at(vars(column), any_vars(. == 1))

Like Salim B explained above, but with a minor change:

df %>% filter(1 == !!as.name(column))

i.e. just reverse the condition, because otherwise (at least in older versions of rlang) !! behaves like

!!(as.name(column)==1)

You can use the across(all_of()) syntax; it takes a string as its argument.

column = "this"
df %>% filter(across(all_of(column)) == 1)
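
In newer dplyr releases, if_all() and if_any() are the helpers intended for this kind of column selection inside filter(). A hedged sketch of the same filter, assuming dplyr 1.0.4 or later (where if_all() is available); result is just an illustrative variable name:

```r
library(dplyr)

df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
column <- "this"

# Keep rows where every selected column (here just one) equals 1.
result <- df %>% filter(if_all(all_of(column), ~ .x == 1))
result
```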
