I am experimenting with dcast.data.table for weighted.mean. However, it throws an error for this function.
library(data.table)
dat = data.table(
x = c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3),
y = c(4,4,4,4,4,4,5,5,5,5,5,5,6,6,6,6,6,6),
z = c(7:24),
w = c(0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.3, 0.3, 0.3, 0.7, 0.7, 0.7)
)
dcast.data.table(
dat,
x~y,
fun.aggregate = weighted.mean, w = 'w',
value.var= 'z'
)
# Error in weighted.mean.default(z, w = "w") :
# 'x' and 'w' must have the same length
There are workarounds that suggest using either dplyr or data.table[], but none explain why dcast doesn't work.
As @Frank points out, the fun.aggregate argument of dcast can only take functions whose output is a single value. However, I don't think that this is the issue with weighted.mean. If I don't specify weights, I get a valid answer:
dcast.data.table(
dat,
x~y,
fun.aggregate = weighted.mean,
value.var= 'z'
# ,w = 'w'
)
This is also demonstrated with the quantile function, which gives me a valid answer when the result of each call is a single value (i.e. when a single value is specified for probs):
dcast.data.table(
dat,
x~y,
fun.aggregate = quantile,
value.var= 'z',
probs = c(0.25)
)
However, when it is written to output a vector for each combination, I get an error commensurate with the limitation of fun.aggregate, but different from the error I get when using weighted.mean:
dcast.data.table(
dat,
x~y,
fun.aggregate = quantile,
value.var= 'z',
probs = c(0.25,0.75)
)
# Error: Aggregating function(s) should take vector inputs and return a single value (length=1). However, function(s) returns length!=1. This value will have to be used to fill any missing combinations, and therefore must be length=1. Either override by setting the 'fill' argument explicitly or modify your function to handle this case appropriately.
It seems that dcast doesn't split up the w argument for each combination of x and y, and instead passes it unchanged to the weighted.mean function. I want to understand what internally prevents this function from doing this.
What about this?
dat = data.frame(x = c(1,1,2,2),
y = c(4,4,5,5),
z = c(1,2,3,4),
w = c(1,2,1,2))
#weighted.sum
reshape2::dcast(data = dat, formula=x~y,
fun.aggregate = function(x){mean(x*dat$w)*length(x)},
value.var= c('z'))
#weighted.mean
reshape2::dcast(data = dat, formula=x~y,
fun.aggregate = function(x){mean(x*dat$w)},
value.var= c('z'))
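Note that dat$w inside the anonymous function refers to the whole weight column, not the weights of the current cell, so the result depends on how R recycles the shorter vector and the second call does not divide by the sum of the weights. A hedged alternative, along the lines of the data.table[] workaround mentioned in the question, is to aggregate first and reshape afterwards. This is only a sketch; the names agg, wide and wm are made up for illustration:

```r
library(data.table)

dat <- data.table(x = c(1, 1, 2, 2),
                  y = c(4, 4, 5, 5),
                  z = c(1, 2, 3, 4),
                  w = c(1, 2, 1, 2))

# Aggregate first: inside [ , by = ], z and w are subset together per group
agg <- dat[, .(wm = weighted.mean(z, w)), by = .(x, y)]

# Then reshape; each (x, y) cell already holds a single value,
# so no fun.aggregate is needed
wide <- dcast(agg, x ~ y, value.var = "wm")
print(wide)
```

Combinations of x and y that do not occur in the data come out as NA in the wide table.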