
Programming on the data.table with "env" in a function

I want to join two data.tables inside a function. However, when I use the new env argument for programming on data.table, the join fails because the column I try to join on does not exist, i.e. I get an "argument specifying columns received non-existing columns" error. How can I programmatically pass the matching column for joining two data.tables into a function? A minimal working example of the surprising failure is below.

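library(data.table)  # the env argument requires data.table >= 1.15.0
library(magrittr)    # provides %>%

dt.mwe.1 <- data.table(a = c(1,2,3,4,0,10))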

mwe_function = function(dt, merge_var){
  dt.internal = 
    data.table(z = min(dt):max(dt)) %>% 
    .[ , .(mv = z) , env = list(mv = merge_var)] %>% `[`
  dt.internal2 = 
    data.table(z = min(dt):max(dt)) %>% 
    .[ , .(mv = z) , env = list(mv = merge_var)] %>% `[`
  dt.internal
  dt.internal[dt.internal2, on = .(mv), 
              env = list(mv = merge_var)] %>% `[`
}
# fails: on = .(mv) looks for a column literally named "mv"
mwe_function(dt = dt.mwe.1, merge_var = "a")
# also fails
mwe_function(dt = dt.mwe.1, merge_var = a)

Maybe I am missing your point, but what about:

mwe_function = function(dt, merge_var){
  dt.internal = 
    data.table(z = min(dt):max(dt)) %>% 
    .[ , .(mv = z) , env = list(mv = merge_var)] %>% `[`
  dt.internal2 = 
    data.table(z = min(dt):max(dt)) %>% 
    .[ , .(mv = z) , env = list(mv = merge_var)] %>% `[`
  dt.internal
  dt.internal[dt.internal2, on = merge_var] %>% `[`
}

mwe_function(dt = dt.mwe.1, merge_var = "a")

#         a
#     <int>
#  1:     0
#  2:     1
#  3:     2
#  4:     3
#  5:     4
#  6:     5
#  7:     6
#  8:     7
#  9:     8
# 10:     9
# 11:    10

From the help page ?data.table:

 env: List or an environment, passed to 'substitute2' for substitution of parameters in 'i', 'j' and 'by' (or 'keyby'). Use 'verbose' to preview constructed expressions.

So I guess the env substitution is not applied to the on argument. That is not a problem, though, because on accepts character strings as input anyway.
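
As a minimal sketch of that distinction (assuming data.table 1.15.0 or later, where env is available; the tables x and y and the extra column w are made up for illustration), env renames the column inside j, while on is simply handed the character string:

```r
library(data.table)

merge_var <- "a"   # hypothetical column name supplied as a string

# env substitutes mv inside j, so each result has a column named "a"
x <- data.table(z = 0:3)[, .(mv = z), env = list(mv = merge_var)]
y <- data.table(z = 2:5)[, .(mv = z, w = z * 10L), env = list(mv = merge_var)]

# on takes the column name as a character vector, so no substitution is needed
joined <- x[y, on = merge_var]
names(joined)
```

Because on is an ordinary argument here, the same string that drives the env substitution can be reused for the join.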


NSE Approach

mwe_function = function(dt, merge_var){
  merge_var <- as.character(substitute(merge_var))
  dt.internal = 
    data.table(z = min(dt):max(dt)) %>% 
    .[ , .(mv = z) , env = list(mv = merge_var)] %>% `[`
  dt.internal2 = 
    data.table(z = min(dt):max(dt)) %>% 
    .[ , .(mv = z) , env = list(mv = merge_var)] %>% `[`
  dt.internal
  dt.internal[dt.internal2, on = merge_var] %>% `[`
}

mwe_function(dt = dt.mwe.1, merge_var = a)

#         a
#     <int>
#  1:     0
#  2:     1
#  3:     2
#  4:     3
#  5:     4
#  6:     5
#  7:     6
#  8:     7
#  9:     8
# 10:     9
# 11:    10
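
One caveat with the substitute() version, sketched below with a hypothetical helper col_name() that isolates the capture step: substitute() records whatever expression the caller typed. A bare name and a string literal both work, but a variable that holds the column name yields the variable's own name instead of its value.

```r
# col_name() is a hypothetical helper, not part of data.table
col_name <- function(merge_var) as.character(substitute(merge_var))

col_name(a)        # bare name       -> "a"
col_name("a")      # string literal  -> "a"

my_col <- "a"
col_name(my_col)   # variable        -> "my_col", not "a"
```

If callers are likely to pass the column name through a variable, the string-based version above is the safer interface.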
