R: summarizing a data.table using multiple factors

Question

I have the data.table below:

'data.frame':   66977 obs. of  16 variables:
 $ SUBS                         : int  
 $ CITY                         : Factor w/ 18 levels 
 $ VALUE_SEG                    : Factor w/ 7 levels 
 $ region                       : Factor w/ 5 levels 
 $ SUM.DATA_PPU_REV_DEC.        : num  
 $ SUM.DATA_BUNDLE_REV_DEC.     : int  
 $ SUM.DATA_USAGE_TOTAL_KB_DEC. : num  
 $ SUM.THIS_MONTH_REV_DEC.      : num  
 $ SUM.VOICE_ONNET_DURATION_DEC.: num  
 $ SUM.VOICE_ONNET_REV_DEC.     : num  
 $ SUM.VOICE_OFFNET_REV_DEC.    : num  
 $ SUM.SMS_ONNET_REV_DEC.       : num  
 $ SUM.SMS_OFFNET_REV_DEC.      : int  
 $ SUM.RECHARGE_DEC.            : int  
 $ STATUS_DEC                   : Factor w/ 5 levels 
 $ TYPE_DEC_2                   : Factor w/ 6 levels

I want to group it by two of the factor variables, say VALUE_SEG and region, get the sum of each numeric variable, and create a new column for each factor variable with the count of observations. I tried aggregate, ddply, and others, and got various types of errors :( Thanks in advance.

Answer 1

Here is an option using data.table:

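library(data.table)
# setDT() converts the data.frame to a data.table by reference.
# Numeric columns are summed; each factor column becomes the group's row count (.N).
setDT(data)[, lapply(.SD, function(x) if (is.numeric(x)) sum(x) else .N),
            by = list(VALUE_SEG, region)]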
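
To see what this expression returns, here is a minimal sketch on toy data. The table `dt` and the columns `REV` and `RECHARGE` are made up for illustration and only stand in for the real columns; the second call is a variation (not part of the answer) that keeps a single count column, here called `n_obs`, instead of one count per factor column.

```r
library(data.table)

# Toy data standing in for the real table (hypothetical values and column names).
dt <- data.table(
  VALUE_SEG  = factor(c("A", "A", "B", "B", "B")),
  region     = factor(c("north", "north", "north", "south", "south")),
  STATUS_DEC = factor(c("x", "y", "x", "x", "y")),
  REV        = c(1.5, 2.5, 3, 4, 5),
  RECHARGE   = c(10L, 20L, 30L, 40L, 50L)
)

# Same expression as in the answer: numeric columns are summed,
# every other column is replaced by the number of rows in the group.
res <- dt[, lapply(.SD, function(x) if (is.numeric(x)) sum(x) else .N),
          by = .(VALUE_SEG, region)]
print(res)

# Variation: sum only the numeric columns and add one count column.
num_cols <- names(dt)[sapply(dt, is.numeric)]
res2 <- dt[, c(lapply(.SD, sum), list(n_obs = .N)),
           by = .(VALUE_SEG, region), .SDcols = num_cols]
print(res2)
```

In the first result every non-grouping factor column holds the same number (the group size), so the variation may be easier to read when one count per group is enough.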

Answer 2

I recommend separating the numeric and factor variables and summarizing each with dplyr. It could look like this:

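library(dplyr)  # across() and where() need dplyr >= 1.0.0

## For numeric variables: sum within each group
## (summarize_each() and funs() are deprecated in current dplyr)
summary1 <- data %>%
   group_by(VALUE_SEG, region) %>%
   summarise(across(where(is.numeric), sum), .groups = "drop")

## For factors: count the observations within each group
summary2 <- data %>%
   group_by(VALUE_SEG, region) %>%
   summarise(across(where(is.factor), length), .groups = "drop")

## Then you can merge these results on both grouping variables
Summary <- merge(summary1, summary2, by = c("VALUE_SEG", "region"))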

For more details on using this package, visit this link.
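
If one observation count per group is enough, the two summaries and the merge can be collapsed into a single pass. The following is a sketch under that assumption, using toy data; the data frame `df`, the columns `REV` and `RECHARGE`, and the output column `n_obs` are hypothetical names chosen for illustration, and `across()` requires dplyr 1.0.0 or later.

```r
library(dplyr)

# Toy data standing in for the real table (hypothetical values and column names).
df <- data.frame(
  VALUE_SEG  = factor(c("A", "A", "B", "B", "B")),
  region     = factor(c("north", "north", "north", "south", "south")),
  STATUS_DEC = factor(c("x", "y", "x", "x", "y")),
  REV        = c(1.5, 2.5, 3, 4, 5),
  RECHARGE   = c(10L, 20L, 30L, 40L, 50L)
)

# Sum every numeric column and count the rows of each group in one summarise().
summary_all <- df %>%
  group_by(VALUE_SEG, region) %>%
  summarise(across(where(is.numeric), sum), n_obs = n(), .groups = "drop")

print(summary_all)
```

Because both grouping variables stay in the result, no merge step is needed afterwards.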


Answer 1
3 2015-04-23 12:07:23

Answer 2
1 (accepted) 2015-04-23 08:36:07
