How can I extract variables from an R expression to be evaluated in a data.frame context?
I have expressions stored as character strings that are supposed to be evaluated in a data.table (not important, just context). To make sure all the required columns are present, I would like to extract the column names used within the R expression.
What I want:
library(data.table)
DT <- data.table(p001=rnorm(10),p002=rnorm(10),p003=rnorm(10))
expr <- 'p001+mean(p001,na.rm=TRUE)-weighted.mean(p002,w=p003)+someRandomOtherColumn'
# DT[,test:=p001+mean(p001,na.rm=TRUE)-weighted.mean(p002,w=p003)+someRandomOtherColumn]
# would fail as someRandomOtherColumn is not in the columns
Basically, I am looking for a way (probably a regex) to extract p001, p002, p003, someRandomOtherColumn from expr.
My view on it: as I see it, I should be able to capture p001, p001, TRUE, p002, p003, someRandomOtherColumn with some regex that captures the things within f(,), and then filter for 'allowed' column names (TRUE is not one in this case).
Nested f(,,) calls are not an issue, as I can call the same function recursively, and nested f(,(),) calls are also fine.
What I have: so far this is what I have. It can be made to work, but it feels bad:
expr <- 'p001+mean(p001,na.rm=TRUE)-weighted.mean(p002,w=p003)+someRandomOtherColumn'
clean <- function(string) gsub(string, pattern='[_.a-zA-Z]+\\(([^)]*)\\)', replacement='\\1', perl=TRUE)
clean(expr)
[1] "p001+p001,na.rm=TRUE-p002,w=p003+someRandomOtherColumn"
# Then I can remove =* and then split on ,|+|-|*
When you add a ~ to your expression, you can create a valid R formula expression:
expr <- '~ p001+mean(p001,na.rm=TRUE)-weighted.mean(p002,w=p003)+someRandomOtherColumn'
This string can be converted to a formula with as.formula. Afterwards, the variable names can be extracted with all.vars:
all.vars(as.formula(expr))
# [1] "p001" "p002" "p003" "someRandomOtherColumn"
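To get from the extracted names to the actual check the question asks for, one option is to compare them against the column names with setdiff. The following is only a minimal sketch: the helper name missing_columns is hypothetical, a base data.frame stands in for the data.table (names() works the same way on both), and the "~" is pasted on in code rather than typed into the string.

```r
# Hypothetical helper: returns the variables used in `expr_string`
# that are not columns of `data`.
missing_columns <- function(expr_string, data) {
  # Prepend "~" so the string parses as a one-sided formula
  vars <- all.vars(as.formula(paste("~", expr_string)))
  setdiff(vars, names(data))
}

# A base data.frame stands in for the data.table here (assumption)
DF <- data.frame(p001 = rnorm(10), p002 = rnorm(10), p003 = rnorm(10))
expr <- 'p001+mean(p001,na.rm=TRUE)-weighted.mean(p002,w=p003)+someRandomOtherColumn'

not_found <- missing_columns(expr, DF)
if (length(not_found) > 0) {
  message("Missing columns: ", paste(not_found, collapse = ", "))
}
```

Note that all.vars returns every variable name in the expression, so a name that refers to an object in the calling environment rather than to a column would also be reported as missing by this sketch.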