R: how to pass a variable into a function to subset data.frame
dat = data.frame(height = c(20, 20, 40, 50, 60, 10),
                 weight = c(100, 200, 300, 200, 140, 240),
                 age = c(19, 20, 20, 19, 10, 11))
f = function(x){
  subset.19 = dat$x[dat$age == 19]
  subset.20 = dat$x[dat$age == 20]
  t.test(subset.19, subset.20)
}
f("weight")
I get an error:
Error in var(x) : 'x' is NULL
In addition: Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
2: In mean.default(x) : argument is not numeric or logical: returning NA
I think this is because dat$x is always NULL, since there is no column named x in the data.frame. I don't think I am actually passing the variable name into the function: dat$x always looks up the column literally named x in dat, not the column whose name I passed in (i.e. weight). So my question is: how can I pass in the column name I want so that this function runs?
As @agstudy and @docendodiscimus mentioned in the comments, it is better to use [ or [[ instead of $ when passing a column name into a function. $ treats what follows it as a literal name, whereas [ and [[ evaluate their argument, so the string stored in x is used as the column name.
f <- function(x){
  # x is a string holding the column name, e.g. "weight"
  subset.19 = dat[, x][dat$age == 19]
  subset.20 = dat[, x][dat$age == 20]
  t.test(subset.19, subset.20)
}
f("weight")
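To make the difference between $ and [[ concrete, here is a minimal sketch using the same data as the question. The helper name compare_by_age and its arguments (data, column, age_a, age_b) are hypothetical and not part of the original answer. They only illustrate one way of passing the data frame in explicitly instead of relying on the global dat.

```r
# Same data as in the question.
dat <- data.frame(height = c(20, 20, 40, 50, 60, 10),
                  weight = c(100, 200, 300, 200, 140, 240),
                  age    = c(19, 20, 20, 19, 10, 11))

col <- "weight"

# `$` does not evaluate `col`; it looks for a column literally named "col".
is.null(dat$col)                    # TRUE

# `[[` evaluates `col` first, so the column named "weight" is returned.
identical(dat[[col]], dat$weight)   # TRUE

# Hypothetical helper (names are illustrative assumptions): takes the data
# frame and the column name as arguments rather than using a global object.
compare_by_age <- function(data, column, age_a = 19, age_b = 20) {
  # Fail early with a clear message if the column does not exist.
  stopifnot(column %in% names(data))
  values_a <- data[[column]][data$age == age_a]
  values_b <- data[[column]][data$age == age_b]
  t.test(values_a, values_b)
}

result <- compare_by_age(dat, "weight")
```

Checking that the column exists up front avoids the rather opaque "'x' is NULL" error from the question, because a missing column is reported before t.test is ever called.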