
tapply with a function of several arguments

I can use the tapply function for basic operations (e.g., using the mtcars data, calculating the mean weight by number of cylinders).

library(data.table)
mtcars <- data.table(mtcars)
tapply(X = mtcars[,wt], 
       INDEX = mtcars[,cyl],
       mean)

However, I do not know how to perform more complex operations, e.g. the correlation between the wt and qsec variables by number of cylinders. I tried the following attempts, but they do not work.

tapply(X = mtcars[,.(wt, qsec)], 
       INDEX = mtcars[,cyl],
       cor.test(mtcars[,wt], mtcars[,qsec]))
Error in match.fun(FUN) :  'cor.test(mtcars[, wt], mtcars[, qsec])' is not a function, character or symbol

tapply(X = rownames(mtcars[,.(wt,qsec,cyl)]), 
       INDEX = mtcars[,cyl],
       function(r) cor.test(mtcars[r, 1],
                            mtcars[r, 2]))

Any idea how to do this efficiently with tapply or another apply-family function?

In my mind, a tapply variant for data.table should take a FUN that operates on indexed subsets (groups of rows) of the data.table. I have defined dt_tapply as I imagine it should behave, and it seems practical enough.

library(data.table)

data(mtcars)
mtcars = data.table(mtcars)

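# iterate over the groups of rows defined by INDEX, like tapply but for whole table rows
dt_tapply = function(dx,INDEX,FUN,...) {
  FUN = match.fun(FUN)  # accept a function or its name, as tapply does
  lapply(sort(unique(INDEX)),function(i){
    # call FUN on the rows of group i, forwarding any extra arguments
    do.call(FUN,c(list(dx[INDEX==i,]),list(...)))
  })
}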


dt_tapply(mtcars,mtcars$cyl,summary)

# a custom function that computes a result from several columns of the subset
compute_cor_wtqsec = function(dx) {
  cor(dx$wt,dx$qsec)
}

#dt_tapply that function
dt_tapply(mtcars,mtcars$cyl,compute_cor_wtqsec)




[[1]]
      mpg             cyl         disp              hp              drat             wt             qsec      
 Min.   :21.40   Min.   :4   Min.   : 71.10   Min.   : 52.00   Min.   :3.690   Min.   :1.513   Min.   :16.70  
 1st Qu.:22.80   1st Qu.:4   1st Qu.: 78.85   1st Qu.: 65.50   1st Qu.:3.810   1st Qu.:1.885   1st Qu.:18.56  
 Median :26.00   Median :4   Median :108.00   Median : 91.00   Median :4.080   Median :2.200   Median :18.90  
 Mean   :26.66   Mean   :4   Mean   :105.14   Mean   : 82.64   Mean   :4.071   Mean   :2.286   Mean   :19.14  
 3rd Qu.:30.40   3rd Qu.:4   3rd Qu.:120.65   3rd Qu.: 96.00   3rd Qu.:4.165   3rd Qu.:2.623   3rd Qu.:19.95  
 Max.   :33.90   Max.   :4   Max.   :146.70   Max.   :113.00   Max.   :4.930   Max.   :3.190   Max.   :22.90  
       vs               am              gear            carb      
 Min.   :0.0000   Min.   :0.0000   Min.   :3.000   Min.   :1.000  
 1st Qu.:1.0000   1st Qu.:0.5000   1st Qu.:4.000   1st Qu.:1.000  
 Median :1.0000   Median :1.0000   Median :4.000   Median :2.000  
 Mean   :0.9091   Mean   :0.7273   Mean   :4.091   Mean   :1.545  
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:2.000  
 Max.   :1.0000   Max.   :1.0000   Max.   :5.000   Max.   :2.000  

[[2]]
      mpg             cyl         disp             hp             drat             wt             qsec      
 Min.   :17.80   Min.   :6   Min.   :145.0   Min.   :105.0   Min.   :2.760   Min.   :2.620   Min.   :15.50  
 1st Qu.:18.65   1st Qu.:6   1st Qu.:160.0   1st Qu.:110.0   1st Qu.:3.350   1st Qu.:2.822   1st Qu.:16.74  
 Median :19.70   Median :6   Median :167.6   Median :110.0   Median :3.900   Median :3.215   Median :18.30  
 Mean   :19.74   Mean   :6   Mean   :183.3   Mean   :122.3   Mean   :3.586   Mean   :3.117   Mean   :17.98  
 3rd Qu.:21.00   3rd Qu.:6   3rd Qu.:196.3   3rd Qu.:123.0   3rd Qu.:3.910   3rd Qu.:3.440   3rd Qu.:19.17  
 Max.   :21.40   Max.   :6   Max.   :258.0   Max.   :175.0   Max.   :3.920   Max.   :3.460   Max.   :20.22  
       vs               am              gear            carb      
 Min.   :0.0000   Min.   :0.0000   Min.   :3.000   Min.   :1.000  
 1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:3.500   1st Qu.:2.500  
 Median :1.0000   Median :0.0000   Median :4.000   Median :4.000  
 Mean   :0.5714   Mean   :0.4286   Mean   :3.857   Mean   :3.429  
 3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
 Max.   :1.0000   Max.   :1.0000   Max.   :5.000   Max.   :6.000  

[[3]]
      mpg             cyl         disp             hp             drat             wt             qsec      
 Min.   :10.40   Min.   :8   Min.   :275.8   Min.   :150.0   Min.   :2.760   Min.   :3.170   Min.   :14.50  
 1st Qu.:14.40   1st Qu.:8   1st Qu.:301.8   1st Qu.:176.2   1st Qu.:3.070   1st Qu.:3.533   1st Qu.:16.10  
 Median :15.20   Median :8   Median :350.5   Median :192.5   Median :3.115   Median :3.755   Median :17.18  
 Mean   :15.10   Mean   :8   Mean   :353.1   Mean   :209.2   Mean   :3.229   Mean   :3.999   Mean   :16.77  
 3rd Qu.:16.25   3rd Qu.:8   3rd Qu.:390.0   3rd Qu.:241.2   3rd Qu.:3.225   3rd Qu.:4.014   3rd Qu.:17.55  
 Max.   :19.20   Max.   :8   Max.   :472.0   Max.   :335.0   Max.   :4.220   Max.   :5.424   Max.   :18.00  
       vs          am              gear            carb     
 Min.   :0   Min.   :0.0000   Min.   :3.000   Min.   :2.00  
 1st Qu.:0   1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.25  
 Median :0   Median :0.0000   Median :3.000   Median :3.50  
 Mean   :0   Mean   :0.1429   Mean   :3.286   Mean   :3.50  
 3rd Qu.:0   3rd Qu.:0.0000   3rd Qu.:3.000   3rd Qu.:4.00  
 Max.   :0   Max.   :1.0000   Max.   :5.000   Max.   :8.00  

[[1]]
[1] 0.6380214

[[2]]
[1] 0.8659614

[[3]]
[1] 0.5365487
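
For comparison, here is a minimal sketch of two other ways to get the same per-group correlation without a helper function. It assumes the stock mtcars data; the names dt, r_tapply, r_by and test_by are chosen only for this illustration. The first form stays close to the question: tapply is given row numbers as X, so FUN can index as many columns as it needs. The other two use data.table's own grouping, which is probably the more idiomatic route when the data is already a data.table.

```r
library(data.table)

dt <- as.data.table(mtcars)  # fresh copy; `dt` is a name chosen for this sketch

# 1) tapply over row indices: FUN receives the row numbers of one group,
#    so it can index any number of columns
r_tapply <- tapply(seq_len(nrow(dt)), dt$cyl,
                   function(i) cor(dt$wt[i], dt$qsec[i]))

# 2) data.table grouping: j is evaluated once per value of cyl
r_by <- dt[, list(r = cor(wt, qsec)), keyby = cyl]

# 3) cor.test() returns a list-like object; keep only the fields needed per group
test_by <- dt[, {
  ct <- cor.test(wt, qsec)
  list(r = unname(ct$estimate), p_value = ct$p.value)
}, keyby = cyl]

print(r_tapply)
print(r_by)
print(test_by)
```

Note that the first attempt in the question fails because FUN must be a function (or its name), not the result of calling one, and because tapply expects X to be a vector that can be split by INDEX rather than a table.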
