I want to demean a whole data.table object (or just a list of many columns of it) by groups. Here's my approach so far:
setkey(myDt, groupid)
for (col in colnames(wagesOfFired)){
myDt[, paste(col, 'demeaned', sep='.') := col - mean(col), with=FALSE]
}
which gives
Error in col - mean(col) : non-numeric argument to binary operator
Here's some sample data. In this simple case there are only two columns, but I typically have so many columns that I want to iterate over a list.
y groupid x
1: 3.46000 51557094 97
2: 111.60000 51557133 25
3: 29.36000 51557133 23
4: 96.38000 51557133 9
5: 65.22000 51557193 32
6: 66.05891 51557328 10
7: 9.74000 51557328 180
8: 61.59000 51557328 18
9: 9.99000 51557328 18
10: 89.68000 51557420 447
11: 129.24436 51557429 15
12: 3.46000 51557638 3943
13: 117.36000 51557642 11
14: 9.51000 51557653 83
15: 68.16000 51557653 518
16: 96.38000 51557653 14
17: 9.53000 51557678 18
18: 7.96000 51557801 266
19: 51.88000 51557801 49
20: 10.70000 51558040 1034
The problem is that col is a string, so col-mean(col) cannot be computed.
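To make the string issue concrete, here is a minimal sketch of how the loop body could be repaired by looking the column up by name with get(). The toy table dt and the helper variable newcol are hypothetical and are not taken from the question's data.

```r
library(data.table)

# Hypothetical toy data, not the question's data
dt <- data.table(groupid = c(1L, 1L, 2L, 2L), y = c(1, 3, 10, 20))

col <- "y"
# col holds the column's *name* (a character string), not its values,
# which is why col - mean(col) fails
stopifnot(is.character(col))

# get(col) fetches the column by name inside j; (newcol) := assigns to
# the column whose name is stored in newcol
newcol <- paste(col, "demeaned", sep = ".")
dt[, (newcol) := get(col) - mean(get(col)), by = groupid]
```

This keeps the loop structure from the question. The single-call version that follows avoids the loop altogether.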
myNames <- names(myDt)
myDt[,paste(myNames,"demeaned",sep="."):=
lapply(.SD,function(x)x-mean(x)),
by=groupid,.SDcols=myNames]
Comments:
Everything is done in one operation, because using [ repeatedly can be slow.
You can change myNames to some subset of the column names.
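As an illustration of the last point, here is a small self-contained sketch on toy data. The table dt, its values, and the name cols are assumptions made for the example. Excluding groupid from the column list is a choice made here, not something the answer above prescribes.

```r
library(data.table)

# Hypothetical toy data with the same column names as the question
dt <- data.table(
  groupid = c(1L, 1L, 2L, 2L, 2L),
  y       = c(1, 3, 10, 20, 30),
  x       = c(2, 4, 6, 6, 6)
)

# Demean every column except the grouping column (assumed choice)
cols <- setdiff(names(dt), "groupid")

# One call: .SD holds the columns named in .SDcols for each group,
# and lapply() subtracts the group mean from each of them
dt[, paste(cols, "demeaned", sep = ".") := lapply(.SD, function(v) v - mean(v)),
   by = groupid, .SDcols = cols]

print(dt)
```

Within each group the new columns sum to zero, which is a quick sanity check that the demeaning was applied by group rather than over the whole table.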