简体   繁体   中英

R: Using get and data.table within a user defined function

I'm in the process of learning how to write functions in R that use common packages like data.table and dplyr.

This function I wrote calculates the percentage of observations in a particular category, within some other group (example: the share of cars with 10-20mpg, that were released in year 2015) and produces a table. Here it is without a function around it:

library(data.table)
library(scales)


#Create test dataframe and cut off points
test<-data.frame(x=c(0:10), y=c(rep(1,5),rep(2,6)), z=c("A","A","A","B","B","B","C","C","C","C","C"))
test <- data.table(test)


#trial non function version (calculating share of row by category z): works
tmp<-test[,.(N=.N), keyby=.(y,z)]
tmp[,total:=sum(N), by=y]
tmp[,percent:=percent(N/total)]
dcast(tmp,y ~ z, value.var="percent")

But in order to make it work within a function I had to use get. Once get is evaluated, the two categorical variables have to be referred to as "get" and "get.1" for the rest of the code (see below). Is there a way to avoid this?

#Two way table function: data.table

tw_tab<-function(dt,v1,v2){

#set up variables as charaters
v1<-as.character(substitute(v1))
v2<-as.character(substitute(v2))
dt<-as.character(substitute(dt))

#function
tmp<-get(dt)[,.(N=.N), keyby=.(get(v1),get(v2))]
tmp[,total:=sum(N), by=get]
tmp[,percent:=percent(N/total)]
dcast(tmp,get ~ get.1, value.var="percent")

}

#test function
tw_tab(test, y, z)

I tried just using "get(v1)" and "get(v2)" throughout the code, but this doesn't work

I've looked at other posts on user functions with data.table (eg. Get a user-defined function work in data.table ) But they don't seem to touch on this issue/encounter it.

I'm new to this, so would appreciate any other feedback/comments on better ways to do this people have.

You don't have to call get on dt (Based on my experience, get is most often used to refer to a column using string) and you can supply character vector to by or keyby :

tw_tab <- function(dt,v1,v2){

    #set up variables as charaters
    v1<-as.character(substitute(v1))
    v2<-as.character(substitute(v2))

    #function
    tmp <- dt[,.(N=.N), keyby = c(v1, v2)]
    tmp[,total:=sum(N), by= c(v1)]
    tmp[,percent:=percent(N/total)]
    dcast(tmp, paste(v1, '~', v2), value.var="percent")
}

#test function
tw_tab(test, y, z)
#    y     A     B     C
# 1: 1 60.0% 40.0%    NA
# 2: 2    NA 16.7% 83.3%

Here is also a solution using xtabs and prop.table :

tw_tab <- function(x, v1, v2){
    fm <- bquote(~ .(substitute(v1)) + .(substitute(v2)))
    res <- prop.table(xtabs(formula = fm, data = x), 1)
    res <- as.data.frame.matrix(res)
    res[] <- lapply(res, scales::percent)
    return(res)
}

tw_tab(test, y, z)
#     A     B     C
# 1 60% 40.0%  0.0%
# 2  0% 16.7% 83.3%

I'd do...

row_pct = function(DT, fm){
  all = all.vars(fm)
  lhs = all.vars(fm[[2]])
  rhs = all.vars(fm[[3]])

  DT[, .N, by=all][, 
    p := percent(N/sum(N)), by=lhs][, 
    dcast(.SD, eval(fm), value.var = "p", fill = percent(0))]
}

Examples:

row_pct(test, y ~ z)

   y   A     B     C
1: 1 60%   40%    0%
2: 2  0% 16.7% 83.3%

row_pct(data.table(mtcars), cyl + gear ~ carb)

   cyl gear    1     2     3     4    6   8
1:   4    3 100%    0%    0%    0%   0%  0%
2:   4    4  50%   50%    0%    0%   0%  0%
3:   4    5   0%  100%    0%    0%   0%  0%
4:   6    3 100%    0%    0%    0%   0%  0%
5:   6    4   0%    0%    0%  100%   0%  0%
6:   6    5   0%    0%    0%    0% 100%  0%
7:   8    3   0% 33.3% 25.0% 41.7%   0%  0%
8:   8    5   0%    0%    0%   50%   0% 50%

If for some reason you want to enter row and col vars separately:

row_pct2 = function(DT, rowvars, colvar){
      fm  = substitute(`~`(rowvars, colvar))
      row_pct(DT, fm)
}

# Examples:
row_pct2(test, y, z)
row_pct2(data.table(mtcars), cyl + gear, carb)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM