As an example, I will be using the mtcars data available in R:
library(data.table)
data(mtcars)
setDT(mtcars)
Let's say I want to group the data by three variables, namely carb, cyl, and gear. I have done this as follows. However, I am sure there is a better way, as this is quite repetitive.
newDTcars <- mtcars[, mtcars[, mtcars[, .N, by = carb], by = cyl], by = gear]
Secondly, I would like to have the data in a wide format, with a separate column for every gear level. For illustration purposes I have done this using tidyr; however, I would like to have this done the "data.table" way.
newDTcars %>% tidyr::spread(gear, N)
The emphasis of this question is on keeping the solution in the data.table world, as I would like to learn more about data.table.
In data.table, we can group by multiple columns in a single by, and to reshape to wide format we can use dcast.
library(data.table)
dcast(mtcars[, .N, .(carb, cyl, gear)], carb+cyl~gear, value.var = "N")
#   carb cyl  3  4  5
#1:    1   4  1  4 NA
#2:    1   6  2 NA NA
#3:    2   4 NA  4  2
#4:    2   8  4 NA NA
#5:    3   8  3 NA NA
#6:    4   6 NA  4 NA
#7:    4   8  5 NA  1
#8:    6   6 NA NA  1
#9:    8   8 NA NA  1
You may use the fill argument in dcast to replace NAs with 0 or any other value.
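As a minimal sketch of the fill argument mentioned above, the same call can be written with fill = 0L so that carb/cyl/gear combinations that never occur show a count of 0 instead of NA. The names dt, counts and wide are only illustrative, and the data is copied with as.data.table so the built-in mtcars is left unchanged.

```r
library(data.table)

# Work on a data.table copy so the built-in mtcars data frame stays untouched
dt <- as.data.table(mtcars)

# Count rows for every (carb, cyl, gear) combination in one grouping step
counts <- dt[, .N, by = .(carb, cyl, gear)]

# Spread gear into columns; fill = 0L puts 0 where a combination does not occur
wide <- dcast(counts, carb + cyl ~ gear, value.var = "N", fill = 0L)
print(wide)
```

Using 0L rather than 0 keeps the filled cells the same integer type as the counts in N.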