
Non-standard evaluation of by in data.table

I am lost with the evaluation of by in data.table. What is the correct way to merge the functionality of LJ and LJ2 into one function?

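library(data.table)

# LJ works when by_ is passed as a character value, e.g. "A"
LJ <- function(dt_x_, dt_y_, by_)
{
    merge(
        dt_x_,
        dt_y_,
        by = eval(substitute(by_)), all.x = TRUE, sort = FALSE)
}
# LJ2 works when by_ is passed as an unquoted column name, e.g. A
LJ2 <- function(dt_x_, dt_y_, by_)
{
    merge(
        dt_x_,
        dt_y_,
        by = deparse(substitute(by_)), all.x = TRUE, sort = FALSE)
}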
LJ(
    data.table(A = c(1,2,3)),
    data.table(A = c(1,2,3), B = c(11,12,13)), 
    "A")
LJ2(
    data.table(A = c(1,2,3)),
    data.table(A = c(1,2,3), B = c(11,12,13)), 
    A)

I consider this a bad idea. Have the user always pass a character value. If you really want to support both, you could do this:

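LJ3 <- function(dt_x_, dt_y_, by_)
{ 
  # turn the unevaluated argument into text and strip any double quotes,
  # so that both A and "A" end up as the string "A"
  by_ <- gsub('\"', "", deparse(substitute(by_)), fixed = TRUE)
  # left join: keep every row of dt_x_, look up matches in dt_y_
  dt_y_[dt_x_, on = by_] 
}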

LJ3(
  data.table(A = c(4,1,2,3)),
  data.table(A = c(1,2,3), B = c(11,12,13)), 
  A)
#   A  B
#1: 4 NA
#2: 1 11
#3: 2 12
#4: 3 13

LJ3(
  data.table(A = c(4,1,2,3)),
  data.table(A = c(1,2,3), B = c(11,12,13)), 
  "A")
#   A  B
#1: 4 NA
#2: 1 11
#3: 2 12
#4: 3 13

This question is not really specific to data.table. The by parameter of merge.data.table always expects a character value, as does on.
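To make the difference between the two call styles concrete, here is a minimal base R sketch of what substitute() captures in each case. The helper show_by is hypothetical and exists only for illustration; it is not part of data.table.

```r
# Hypothetical helper, for illustration only: reports what the argument
# looks like inside the function before it is evaluated.
show_by <- function(by_) {
  expr <- substitute(by_)
  list(
    is_symbol    = is.name(expr),       # TRUE for an unquoted name such as A
    is_character = is.character(expr),  # TRUE for a string such as "A"
    as_text      = if (is.character(expr)) expr else deparse(expr)
  )
}

bare   <- show_by(A)    # captured as a symbol
quoted <- show_by("A")  # captured as a character constant

bare$as_text
quoted$as_text
```

In both cases as_text is a character value, which is the form that by and on are given in the functions above.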

Edit: @eddi points out that the above will fail if you have column names with an actual " in them (something you should avoid in general, but which may happen if you fread input files prepared by others).

An alternative that can handle such edge cases would be:

LJ4 <- function(dt_x_, dt_y_, by_)
{ 
  by_ <- substitute(by_)
  if (!is.character(by_)) by_ <- deparse(by_)
  dt_y_[dt_x_, on = by_] 
}
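As a quick check, the following sketch exercises LJ4 with both call styles. It assumes the data.table package is installed; the names x, y, res_bare and res_quoted are made up for this example.

```r
library(data.table)

# LJ4 repeated from above so that this snippet runs on its own.
LJ4 <- function(dt_x_, dt_y_, by_)
{
  by_ <- substitute(by_)
  if (!is.character(by_)) by_ <- deparse(by_)
  dt_y_[dt_x_, on = by_]
}

x <- data.table(A = c(4, 1, 2, 3))
y <- data.table(A = c(1, 2, 3), B = c(11, 12, 13))

res_bare   <- LJ4(x, y, A)    # unquoted column name
res_quoted <- LJ4(x, y, "A")  # character column name

# Both calls should give the same left join of x onto y.
all.equal(res_bare, res_quoted)
```

One limitation of this design: a column name stored in a variable is not looked up. With key <- "A", the call LJ4(x, y, key) captures the symbol key itself, so the join looks for a column named key rather than A. That is part of why always passing a character value is the simpler interface.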
