
R: use a string to refer to a column

I would like to subset a data frame by referring to a column with a string, keeping the rows where that column's values fulfill a condition. Given the following code:

 employee <- c('John Doe','Peter Gynn','Jolie Hope')
 salary <- c(21000, 23400, 26800)
 startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))
 employ.data <- data.frame(employee, salary, startdate)
 salary_string <- "salary"

I want to get all salaries over 23000 by using salary_string to refer to the column name.

I tried without success:

set <- subset(employ.data, salary_string > 23000)
set2 <- employ.data[, employ.data$salary_string > 23000]

This does not seem to work because salary_string is of type character, but what I need is some sort of "column name object". Using as.name(salary_string) does not work either. I know I could get the subset by using

set <- subset(employ.data, salary > 23000)

But my goal is to use the column name held as a character string (salary_string), once with subset(employ.data, ...) and once with employ.data[, ...].

The short answer is: don't use subset; use something like this instead:

employ.data[employ.data[salary_string]>23000,]
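A minimal sketch of why the `$` attempt in the question fails and how the same idea can be written with `[[`, which returns the column as a plain vector. The data are rebuilt from the question; the names `keep` and `set2` are only illustrative.

```r
# Rebuild the example data from the question.
employee <- c('John Doe', 'Peter Gynn', 'Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1', '2008-3-25', '2007-3-14'))
employ.data <- data.frame(employee, salary, startdate)
salary_string <- "salary"

# `$` takes its argument literally, so this looks for a column
# called "salary_string", which does not exist.
no_such_column <- is.null(employ.data$salary_string)

# `[[` evaluates its argument, so the string is used as the column name
# and the result is a numeric vector that can be compared directly.
keep <- employ.data[[salary_string]] > 23000

# Row subset: the logical index goes before the comma.
set2 <- employ.data[keep, ]
print(set2)
```

Note that the condition belongs in the row position (before the comma); in the question's second attempt it was placed in the column position.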

Here's another idea:

dplyr::filter(employ.data, get(salary_string) > 23000)

Which gives:

#    employee salary  startdate
#1 Peter Gynn  23400 2008-03-25
#2 Jolie Hope  26800 2007-03-14
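As a possible variation, assuming the dplyr package is installed: `get()` also searches the enclosing environments if no column matches, whereas the `.data` pronoun restricts the lookup to the data frame's own columns. This is a sketch of that alternative, not part of the original answer; `high_paid` is an illustrative name.

```r
# Assumes dplyr is installed; data rebuilt from the question.
library(dplyr)

employ.data <- data.frame(
  employee  = c('John Doe', 'Peter Gynn', 'Jolie Hope'),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c('2010-11-1', '2008-3-25', '2007-3-14'))
)
salary_string <- "salary"

# .data[[...]] looks the name up among the data frame's columns only.
high_paid <- filter(employ.data, .data[[salary_string]] > 23000)
print(high_paid)
```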

For the sake of showing how to achieve the result with subset():

The issue you're having is because subset() uses non-standard evaluation: its condition is evaluated with the data frame's columns as variables, so a character string is never looked up as a column name. Here's one way to substitute your string into the subset() function.

## set up an unevaluated call
e <- call(">", as.name(salary_string), 23000)
## evaluate it in subset()
subset(employ.data, eval(e))
#     employee salary  startdate
# 2 Peter Gynn  23400 2008-03-25
# 3 Jolie Hope  26800 2007-03-14

Or, as Steven suggests, the following would also work well.

subset(employ.data, eval(as.name(salary_string)) > 23000)
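To see what these approaches build, the unevaluated call can be inspected before it is used. The sketch below also shows one more way to get the same result, using base R's bquote() to splice the symbol into the whole subset() call; this variant is an addition, not something from the answers above, and `cl` and `res` are illustrative names.

```r
# Data rebuilt from the question.
employ.data <- data.frame(
  employee  = c('John Doe', 'Peter Gynn', 'Jolie Hope'),
  salary    = c(21000, 23400, 26800),
  startdate = as.Date(c('2010-11-1', '2008-3-25', '2007-3-14'))
)
salary_string <- "salary"

# as.name() turns the string into a symbol; call() wraps it in a comparison.
e <- call(">", as.name(salary_string), 23000)
e_text <- deparse(e)   # the call as text: "salary > 23000"

# bquote() replaces .(...) with its value, producing the call
# subset(employ.data, salary > 23000), which eval() then runs.
cl <- bquote(subset(employ.data, .(as.name(salary_string)) > 23000))
res <- eval(cl)
print(res)
```

Because the symbol is substituted before subset() ever sees the condition, subset() receives exactly the expression it would get if the column name had been typed by hand.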
