Calculate percentage summaries in data.table

If this is my dataset:

library(data.table)    
dt <- data.table(
  record=c(1:20),
  area=rep(LETTERS[1:4], c(4, 6, 3, 7)), 
  score=c(1,1:3,2:3,1,1,1,2,2,1,2,1,1,1,1,1:3),
  cluster=c("X", "Y", "Z")[c(1,1:3,3,2,1,1:3,1,1:3,3,3,3,1:3)]
)

What is the best way to use data.table to calculate percentage summaries like this one:

prop.table(table(dt$area, dt$score), 1)*100

However, I would also like more flexibility in the inputs to this summary, for example including only the records that belong to cluster 'X', or only those in clusters 'Y' and 'Z'.

dt[,.N,by=list(area,score)][,perc:=100*N/sum(N),by=area][,.SD]

Follow it with dcast.data.table if you need the wide layout.
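
To cover the cluster filtering asked about in the question, one option is to put the condition in the `i` argument before counting and then reshape with dcast. The following is only a sketch: the helper name `perc_by_area` and its `keep` argument are hypothetical and not part of the original answer, and `fill = 0` is my assumption about how absent area/score combinations should be shown.

```r
library(data.table)

dt <- data.table(
  record  = 1:20,
  area    = rep(LETTERS[1:4], c(4, 6, 3, 7)),
  score   = c(1, 1:3, 2:3, 1, 1, 1, 2, 2, 1, 2, 1, 1, 1, 1, 1:3),
  cluster = c("X", "Y", "Z")[c(1, 1:3, 3, 2, 1, 1:3, 1, 1:3, 3, 3, 3, 1:3)]
)

# Hypothetical helper (not from the original post): percentage of each score
# within an area, using only the rows whose cluster is listed in `keep`.
perc_by_area <- function(d, keep = unique(d$cluster)) {
  # Filter in `i`, then count rows per (area, score) pair.
  long <- d[cluster %in% keep, .N, by = .(area, score)]
  # Convert the counts to percentages of each area's total.
  long[, perc := 100 * N / sum(N), by = area]
  # One row per area, one column per score; absent combinations become 0.
  dcast(long, area ~ score, value.var = "perc", fill = 0)
}

all_clusters <- perc_by_area(dt)                  # no filtering
only_x       <- perc_by_area(dt, "X")             # cluster X only
y_and_z      <- perc_by_area(dt, c("Y", "Z"))     # clusters Y and Z
```

With no filter, the percentages should agree with the prop.table call in the question. Note that a score which never occurs in the filtered rows gets no column at all, and an area with no matching rows is dropped rather than shown as a row of zeros.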
