
deparse(substitute()) within function using data.table as argument

If I want to deparse the argument of a function for an error or a warning, something strange happens when the argument is converted to a data.table within the function:

library(data.table)

e <- data.frame(x = 1:10)
### something strange is happening
foo <- function(u) {
  u <- data.table(u)
  warning(deparse(substitute(u)), " is not a data.table")
  u
}
foo(e)

##  foo(e)
##      x
##  1:  1
##  2:  2
##  3:  3
##  4:  4
##  5:  5
##  6:  6
##  7:  7
##  8:  8
##  9:  9
## 10: 10
## Warning message:
## In foo(e) :
##   structure(list(x = 1:10), .Names = "x", row.names = c(NA, -10L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x10026568>) is not a data.table

If I deparse it before calling data.table(), everything works fine:

### ok
foo1 <- function(u) {
  nu <- deparse(substitute(u))
  u <- data.table(u)
  warning(nu, " is not a data.table")
  u
}
## foo1(e)
##      x
##  1:  1
##  2:  2
##  3:  3
##  4:  4
##  5:  5
##  6:  6
##  7:  7
##  8:  8
##  9:  9
## 10: 10
## Warning message:
## In foo1(e) : e is not a data.table

By the way, it makes no difference whether e is already a data.table or not. I came across this while profiling some code, where deparse was very time-consuming because e was quite large.

What is happening here, and how can I handle such functions for both data.frame and data.table input?

nachti

This is because substitute behaves differently when you are dealing with a normal variable instead of a promise object. A promise object is what a formal argument is bound to, and it has a special slot that contains the expression that generated it. In other words, a promise object is a variable in a function that is part of the argument list of that function. When you use substitute on a promise object in a function, it retrieves the expression that was supplied for that formal argument in the call to the function. From ?substitute:

Substitution takes place by examining each component of the parse tree as follows: If it is not a bound symbol in env, it is unchanged. If it is a promise object, i.e., a formal argument to a function or explicitly created using delayedAssign(), the expression slot of the promise replaces the symbol. If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.
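The difference between the promise case and the ordinary-variable case can be seen in a minimal base R sketch. The function names show_promise, show_forced and show_overwritten are hypothetical and only serve as illustration:

```r
# u is still a promise: substitute() returns the caller's expression.
show_promise <- function(u) {
  deparse(substitute(u))
}

# Merely evaluating the promise does not discard its expression slot.
show_forced <- function(u) {
  force(u)
  deparse(substitute(u))
}

# Reassigning u replaces the promise with an ordinary local variable,
# so substitute() now returns the value, which deparse() spells out.
show_overwritten <- function(u) {
  u <- c(u, 99)
  deparse(substitute(u))
}

e <- c(1, 2)
show_promise(e)      # "e"
show_forced(e)       # "e"
show_overwritten(e)  # "c(1, 2, 99)"
```

Note that it is the reassignment, not the evaluation, that changes what substitute sees.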

In your case, you actually overwrite the original promise variable with a new one with:

u <- data.table(u)

at which point u becomes a normal variable that contains a data table. When you call substitute on u after this point, substitute just returns the data table itself, which deparse then converts back into the R code that would generate it. That is why it is slow.
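To get a feel for the cost, the following base R sketch compares deparsing a bare symbol with deparsing a whole object. The data frame big is made-up example data, and the sketch compares output size rather than run time, since timings vary between machines:

```r
# Hypothetical example data; any large data.frame behaves similarly.
big <- data.frame(x = seq_len(1e5) / 7)

# A bare symbol, comparable to what substitute() returns for a promise.
name_only <- deparse(quote(big))

# The full object, comparable to what gets deparsed after the overwrite.
full_value <- deparse(big)

sum(nchar(name_only))   # 3
sum(nchar(full_value))  # far larger: every value is written out as code
```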

This also explains why your second example works: you call substitute while the variable is still a promise (i.e. before you overwrite u). This is also the answer to your second question. Either substitute before you overwrite your promise, or don't overwrite your promise.
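The "don't overwrite your promise" option could look roughly like the sketch below. It assumes the data.table package is installed; foo2 and dt are hypothetical names, and the check with is.data.table is an addition for illustration, not part of the original function:

```r
library(data.table)

foo2 <- function(u) {
  # Bind the converted object to a new name so that u stays a promise.
  dt <- as.data.table(u)
  if (!is.data.table(u)) {
    # u was never reassigned, so substitute() still sees the caller's expression.
    warning(deparse(substitute(u)), " is not a data.table")
  }
  dt
}

e <- data.frame(x = 1:10)
res <- foo2(e)  # warns: e is not a data.table
```

Because u is never reassigned, this works for both data.frame and data.table input, and the warning is only raised for the former.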

For more details, see section 2.1.8 of the R Language Definition (promises), which I excerpt here:

Promise objects are part of R's lazy evaluation mechanism. They contain three slots: a value, an expression, and an environment. When a function is called the arguments are matched and then each of the formal arguments is bound to a promise. The expression that was given for that formal argument and a pointer to the environment the function was called from are stored in the promise.
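The slots described in the excerpt can be observed indirectly from base R. The sketch below uses delayedAssign to create a promise by hand; promise_slots, lazy and the variable p are hypothetical names chosen for illustration:

```r
promise_slots <- function() {
  delayedAssign("p", 1 + 2)          # p is a promise in this function's frame
  before <- deparse(substitute(p))   # expression slot, nothing evaluated yet
  value  <- p                        # forcing the promise fills the value slot
  after  <- deparse(substitute(p))   # the expression slot is still available
  list(before = before, value = value, after = after)
}

str(promise_slots())
# before is "1 + 2", value is 3, after is "1 + 2"

# Lazy evaluation: an argument that is never used is never evaluated.
lazy <- function(x) "argument never evaluated"
lazy(stop("this error is never raised"))
```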

You could probably do this with sprintf too, along with is.data.table.

> library(data.table)
> e <- data.frame(x = 1:10)
> foo <- function(u){
      nu <- deparse(substitute(u))
      if(!is.data.table(u)){
          warning(sprintf('%s is not a data table', nu))
          u
      } else {
          u
      }
  }
> foo(e)
    x
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10
Warning message:
In foo(e) : e is not a data table
