How do I split the values in one column into equal ranges and sum the associated values of another column in R?

Question

I have a data frame named Cust_Amount that looks like this:

Age    Amount_Spent
25       20
43       15
32       27
37       10
45       17
29       10
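
For anyone who wants to reproduce the answers, the table above can be rebuilt as a data frame. This is only a sketch: the integer column types are an assumption, and the answers below refer to the same data as df1 and mydf.

```r
# Rebuild the example data from the question.
# Integer columns are an assumption; the original types are not stated.
Cust_Amount <- data.frame(
  Age          = c(25L, 43L, 32L, 37L, 45L, 29L),
  Amount_Spent = c(20L, 15L, 27L, 10L, 17L, 10L)
)

# Quick sanity check: six customers
print(Cust_Amount)
```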

I want to split it into equal-width age groups and sum the amount spent within each group, like this:

Age_Group  Total_Amount
 20-30     30
 30-40     37
 40-50     32

Answer 1

We can use cut to bin "Age" and then take the sum of "Amount_Spent" by that grouping variable (df1 is the data frame from the question).

library(data.table)
setDT(df1)[,.(Total_Amount = sum(Amount_Spent)) , 
       by = .(Age_Group = cut(Age, breaks = c(20, 30, 40, 50)))]

Or with dplyr:

library(dplyr)
df1 %>%
    group_by(Age_Group = cut(Age, breaks = c(20, 30, 40, 50))) %>%
    summarise(Total_Amount = sum(Amount_Spent))
#     Age_Group Total_Amount
#      <fctr>        <int>
#1   (20,30]           30
#2   (30,40]           37
#3   (40,50]           32
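
The breaks above are hard-coded, and cut closes intervals on the right by default, so an age of exactly 30 falls in (20,30]. If the breaks should follow the data, and the groups should read as 20-29, 30-39 and so on, something like the following base R sketch may help. The bin width of 10 and the variable names width, lower, upper and totals are assumptions for illustration.

```r
df1 <- data.frame(Age          = c(25, 43, 32, 37, 45, 29),
                  Amount_Spent = c(20, 15, 27, 10, 17, 10))

width <- 10  # assumed bin width

# Round the observed range outwards to multiples of `width`.
# The upper bound is pushed strictly above max(Age) so no value becomes NA.
lower  <- floor(min(df1$Age) / width) * width
upper  <- (floor(max(df1$Age) / width) + 1) * width
breaks <- seq(lower, upper, by = width)

# right = FALSE makes the intervals left-closed: [20,30), [30,40), ...
df1$Age_Group <- cut(df1$Age, breaks = breaks, right = FALSE)

totals <- aggregate(Amount_Spent ~ Age_Group, data = df1, FUN = sum)
print(totals)
```

With right = FALSE an age of exactly 30 is counted in the 30-40 group instead of the 20-30 group; which side to close depends on how the age bands are defined.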

Answer 2

Here is a base R solution that uses cut and aggregate, then setNames to name the result columns:

mydf$Age_Group <- cut(mydf$Age, breaks = seq(20,50, by = 10))
with(mydf, setNames(aggregate(Amount_Spent ~ Age_Group, FUN = sum), 
                    c('Age_Group', 'Total_Spent')))

  Age_Group Total_Spent
1   (20,30]          30
2   (30,40]          37
3   (40,50]          32

We can go a step further and use gsub to match your desired output (note that I am not a regex expert):

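# Strip "(" and "]" from labels such as "(20,30]", then turn "," into " - ".
# fixed = TRUE treats each pattern as a literal string, not a regex.
mydf$Age_Group <- 
    gsub(pattern = ',',
     x = gsub(pattern = ']', 
     x = gsub(pattern = '(', x = mydf$Age_Group, replacement = '', fixed = TRUE),
     replacement = '', fixed = TRUE),
     replacement = ' - ', fixed = TRUE)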
with(mydf, setNames(aggregate(Amount_Spent ~ Age_Group, FUN = sum), 
                  c('Age_Group', 'Total_Spent')))

  Age_Group Total_Spent
1   20 - 30          30
2   30 - 40          37
3   40 - 50          32
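
As a possible alternative to post-processing with gsub, cut accepts a labels argument, so the "20-30" style names can be built from the breaks up front. This is a sketch and not part of the original answer; the names breaks, labels and result are illustrative, and the column is named Total_Amount to match the question.

```r
mydf <- data.frame(Age          = c(25, 43, 32, 37, 45, 29),
                   Amount_Spent = c(20, 15, 27, 10, 17, 10))

breaks <- seq(20, 50, by = 10)

# Pair each break with the next one: "20-30", "30-40", "40-50"
labels <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")

# cut() needs exactly one label per interval, i.e. length(breaks) - 1
mydf$Age_Group <- cut(mydf$Age, breaks = breaks, labels = labels)

result <- setNames(aggregate(Amount_Spent ~ Age_Group, data = mydf, FUN = sum),
                   c("Age_Group", "Total_Amount"))
print(result)
```

Because the labels are set inside cut, Age_Group stays a factor with its levels in interval order, whereas the gsub approach converts it to character.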

Answer 1: score 5, accepted, 2016-07-25 16:32:28

Answer 2: score 3, 2016-07-25 16:46:10
