Group a table by multiple factors and spread it from long to wide format - the data.table way in R

Question

As an example, I will be using the mtcars data available in R:

library(data.table)
data(mtcars)
setDT(mtcars)

Let's say I want to group the data by three variables, namely carb, cyl, and gear. I have done this as follows. However, I am sure there is a better way, as this is quite repetitive.

newDTcars <- mtcars [, mtcars[, mtcars[, .N , by = carb], by = cyl], by= gear]

Secondly, I would like to have the data in a wide format, where there is a separate column for every gear level. For illustration purposes I have done this using tidyr; however, I would like to have this done the "data.table" way.

newDTcars %>% tidyr::spread(gear, N)

The emphasis of this question is to keep the solution within the data.table world, as I would like to learn more about data.table.

Answer 1

In data.table, we can group by multiple columns at once, and to reshape from long to wide we can use dcast.

library(data.table)
dcast(mtcars[, .N, .(carb, cyl, gear)], carb+cyl~gear, value.var = "N")

#   carb cyl  3  4  5
#1:    1   4  1  4 NA
#2:    1   6  2 NA NA
#3:    2   4 NA  4  2
#4:    2   8  4 NA NA
#5:    3   8  3 NA NA
#6:    4   6 NA  4 NA
#7:    4   8  5 NA  1
#8:    6   6 NA NA  1
#9:    8   8 NA NA  1

You may use the fill argument in dcast to replace the NAs with 0 or any other number.
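As a minimal sketch of that suggestion, the two steps can be written separately, with fill = 0 so that combinations with no cars show up as zero counts instead of NA. The names dt, counts and wide are only illustrative, and the code works on a copy of mtcars rather than converting the built-in data set in place.

```r
library(data.table)

# Work on a copy so the built-in mtcars data set is left untouched
dt <- as.data.table(mtcars)

# Step 1: count rows for every carb/cyl/gear combination (long format)
counts <- dt[, .N, by = .(carb, cyl, gear)]

# Step 2: one column per gear level; missing combinations are filled with 0
wide <- dcast(counts, carb + cyl ~ gear, value.var = "N", fill = 0)
wide
```

Keeping the grouped counts in their own object makes it easy to inspect the long format before reshaping; the nested one-liner from the answer gives the same table apart from the fill value.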

Answer 1 (accepted, 2020-03-20 11:06:17), score 2
