Using a string to refer to a column in R

Question

I would like to subset a data frame by referring to a column with a string, keeping the rows where that column's values fulfill a condition. Given the following code:

 employee <- c('John Doe','Peter Gynn','Jolie Hope')
 salary <- c(21000, 23400, 26800)
 startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))
 employ.data <- data.frame(employee, salary, startdate)
 salary_string <- "salary"

I want to get all salaries over 23000 by using salary_string to refer to the column name.

I tried the following without success:

set <- subset(employ.data, salary_string > 23000)
set2 <- employ.data[, employ.data$salary_string > 23000]

This does not seem to work because salary_string is of type character, whereas what I need is some sort of "column name object". Using as.name(salary_string) does not work either. I know I could get the subset by using

set <- subset(employ.data, salary > 23000)

But my goal is to use the column name held as a character string (salary_string), once with subset(employ.data, ...) and once with employ.data[, ...].

Answer 1

The short answer is: don't use subset; use something like this instead:

employ.data[employ.data[salary_string] > 23000, ]
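A slightly more explicit variant of the same idea, as a minimal sketch: `[[` extracts the column named by salary_string as a plain vector, so the comparison yields an ordinary logical vector that can be used for row indexing. The names `rows`, `high_paid` and `high_salaries` are introduced here only for illustration and are not part of the original answer.

```r
# Data from the question
employee <- c('John Doe', 'Peter Gynn', 'Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1', '2008-3-25', '2007-3-14'))
employ.data <- data.frame(employee, salary, startdate)
salary_string <- "salary"

# `[[` returns the column as a vector, so `rows` is a plain logical vector
rows <- employ.data[[salary_string]] > 23000

# Full rows that satisfy the condition
high_paid <- employ.data[rows, ]

# Only the matching values of the column itself
high_salaries <- employ.data[rows, salary_string]
```

The second form corresponds to the employ.data[, ...] style asked about in the question: the string goes in the column position, and the logical vector selects the rows.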

Answer 2

Here's another idea:

dplyr::filter(employ.data, get(salary_string) > 23000)

Which gives:

#    employee salary  startdate
#1 Peter Gynn  23400 2008-03-25
#2 Jolie Hope  26800 2007-03-14
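As a hedged alternative within dplyr: get() resolves the name against the columns first and then keeps searching enclosing environments, so a string that matches no column may pick up an unrelated variable instead. The `.data` pronoun limits the lookup to columns of the data frame. This sketch assumes dplyr is installed; `result` is an illustrative name.

```r
# Data from the question
employee <- c('John Doe', 'Peter Gynn', 'Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1', '2008-3-25', '2007-3-14'))
employ.data <- data.frame(employee, salary, startdate)
salary_string <- "salary"

# .data[[...]] looks the string up among the data frame's columns only
result <- dplyr::filter(employ.data, .data[[salary_string]] > 23000)
```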

Answer 3

For the sake of showing how to achieve the result with subset():

The issue you're having is because subset() uses non-standard evaluation. Here's one way to substitute your string into the subset() function.

## set up an unevaluated call
e <- call(">", as.name(salary_string), 23000)
## evaluate it in subset()
subset(employ.data, eval(e))
#     employee salary  startdate
# 2 Peter Gynn  23400 2008-03-25
# 3 Jolie Hope  26800 2007-03-14

Or, as Steven suggests, the following would also work well.

subset(employ.data, eval(as.name(salary_string)) > 23000)
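To make the non-standard evaluation point concrete, here is a small sketch of what the attempts in the question appear to evaluate to. The variable names `cond` and `sym` are illustrative only.

```r
# Data from the question
employee <- c('John Doe', 'Peter Gynn', 'Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1', '2008-3-25', '2007-3-14'))
employ.data <- data.frame(employee, salary, startdate)
salary_string <- "salary"

# subset() finds no column called `salary_string`, so it falls back to the
# variable of that name: the condition compares the string "salary" with
# 23000, giving a single value rather than one value per row
cond <- salary_string > 23000
length(cond)

# `$` takes its argument literally, and no column is named `salary_string`
is.null(employ.data$salary_string)

# as.name() only builds a symbol; it must be evaluated against the data
# frame before it stands for the column's values
sym <- as.name(salary_string)
is.name(sym)
identical(eval(sym, employ.data), employ.data$salary)
```

This is why both solutions above wrap the symbol in eval(): inside subset(), the expression is evaluated with the data frame's columns in scope, so the symbol `salary` resolves to the column.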

3 answers

Answer 1
5, accepted, 2015-04-29 22:32:07

Answer 2
3, 2015-04-29 22:50:04

Answer 3
2, 2015-04-29 22:41:58
