
Vector of variable names in R

I'd like to create a function that automatically generates uni- and multivariate regression analyses, but I can't figure out how to specify variables in vectors. This seems very easy, but after skimming the documentation I haven't figured it out so far...

Easy example

a<-rnorm(100)
b<-rnorm(100)
k<-c("a","b")
d<-c(a,b)
summary(k[1])

But k[1] is "a", a character vector, and d is just b appended to a, not the variable names. In effect I'd like k[1] to represent the vector a.

Appreciate any answers...

//M

You can use the "get" function to fetch an object from a character string holding its name, but in the long run it is better to store the variables in a list and access them that way. Things become much simpler: you can grab subsets, and you can use lapply or sapply to run the same code on every element. When saving or deleting, you can work on the entire list rather than trying to remember every element. E.g.:

mylist <- list(a=rnorm(100), b=rnorm(100) )
names(mylist)
summary(mylist[[1]])
# or
summary(mylist[['a']])
# or
summary(mylist$a)
# or 
d <- 'a'
summary(mylist[[d]])

# or
lapply( mylist, summary )
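If the variables already exist as separate objects, as in the question, a minimal sketch of the "get" route looks like the following. The setup (a, b, k) mirrors the question, and the set.seed call is only an assumption added for reproducibility. mget() is the vectorised counterpart of get() and returns a named list, so it also serves as a bridge to the list-based approach.

```r
# Setup mirroring the question; set.seed() is added only for reproducibility.
set.seed(1)
a <- rnorm(100)
b <- rnorm(100)
k <- c("a", "b")

# get() looks up a single object by its name.
summary(get(k[1]))

# mget() looks up several names at once and returns a named list.
mylist <- mget(k)
names(mylist)
sapply(mylist, mean)
```

Once the objects are in mylist, the standalone copies a and b are no longer needed for the analysis.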

If you are programmatically creating models for analysis with lm (or other modeling functions), then one approach is to subset your data and use the "." in the formula, which stands for all remaining columns of the data, e.g.:

yvar <- 'Sepal.Width'
xvars <- c('Petal.Width','Sepal.Length')
fit <- lm( Sepal.Width ~ ., data=iris[, c(yvar,xvars)] )

Or you can build the formula using "paste" or "sprintf", then use "as.formula" to convert it to a formula, e.g.:

yvar <- 'Sepal.Width'
xvars <- c('Petal.Width','Sepal.Length')
my.formula <- paste( yvar, '~', paste( xvars, collapse=' + ' ) )
my.formula <- as.formula(my.formula)
fit <- lm( my.formula, data=iris )
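As a possible alternative to pasting strings by hand, base R's reformulate() builds a formula directly from character vectors. The sketch below is not from the original answer. It reuses yvar and xvars from above and fits both one multivariate model and one univariate model per predictor, which is roughly what the question asks for; the names multi.fit and uni.fits are made up for illustration.

```r
yvar  <- "Sepal.Width"
xvars <- c("Petal.Width", "Sepal.Length")

# reformulate() builds response ~ term1 + term2 + ... from character vectors.
multi.formula <- reformulate(xvars, response = yvar)
multi.fit <- lm(multi.formula, data = iris)

# One univariate model per predictor, stored in a named list.
uni.fits <- lapply(xvars, function(x) {
  lm(reformulate(x, response = yvar), data = iris)
})
names(uni.fits) <- xvars

# Coefficient table of each univariate fit.
lapply(uni.fits, function(fit) coef(summary(fit)))
```

Keeping the fits in a named list means the same lapply/sapply pattern shown earlier can be used to summarise them.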

Note also the problem of multiple comparisons if you are looking at many different models fit automatically.
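One common way to account for this, sketched here as an illustration rather than a recommendation from the original answer, is to collect the p-values of interest and adjust them with p.adjust(). The extra predictor Petal.Length and the choice of the Holm method are assumptions made for the example.

```r
yvar  <- "Sepal.Width"
# Petal.Length is added here only to have more than two tests.
xvars <- c("Petal.Width", "Petal.Length", "Sepal.Length")

# Slope p-value from a univariate model of yvar on each predictor.
raw.p <- sapply(xvars, function(x) {
  fit <- lm(as.formula(paste(yvar, "~", x)), data = iris)
  coef(summary(fit))[x, "Pr(>|t|)"]
})

# Holm adjustment across all tested models.
adj.p <- p.adjust(raw.p, method = "holm")
data.frame(raw = raw.p, holm = adj.p)
```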

You could use a list, k = list(a, b). This creates a list with components a and b, but it is not a list of variable names.

get() is what you're looking for:

summary(get(k[1]))

Edit: get() is not what you're looking for; it's list(). get() could be useful too, though.

If you're looking for automatic generation of regression analyses, you might actually benefit from using eval(), although every R programmer will warn you against using eval() unless you know very well what you're doing. Please read the help files for eval() and parse() very carefully before you use them.

An example:

d <- data.frame(
  var1 = rnorm(1000),
  var2 = rpois(1000,4),
  var3 = sample(letters[1:3],1000,replace=TRUE)
)

vars <- names(d)

auto.lm <- function(d,dep,indep){
      expr <- paste(
          "out <- lm(",
          dep,
          "~",
          paste(indep,collapse="*"),
          ",data=d)"
      )
      eval(parse(text=expr))
      return(out)
}

auto.lm(d,vars[1],vars[2:3])
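Given the warning about eval(), here is a sketch of how the same function might be written without eval(parse()), by building the formula with reformulate(). The name auto.lm2 is hypothetical and this is only one possible variant, not the answerer's code; the set.seed call is added for reproducibility.

```r
# Hypothetical variant of auto.lm() that avoids eval(parse()).
auto.lm2 <- function(d, dep, indep) {
  # "*" requests main effects plus all interactions, as in the original.
  rhs <- paste(indep, collapse = "*")
  lm(reformulate(rhs, response = dep), data = d)
}

set.seed(42)
d <- data.frame(
  var1 = rnorm(1000),
  var2 = rpois(1000, 4),
  var3 = sample(letters[1:3], 1000, replace = TRUE)
)

fit <- auto.lm2(d, "var1", c("var2", "var3"))
formula(fit)
```

Because the formula is built as an object rather than as text to be parsed, the fitted model's call stays short and the data argument is passed through unchanged.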
