R: Grouping data into equal groups by a metric variable

Question

I'm struggling to get a well-performing script for this problem: I have a table with the columns score, x, and y. I want to sort the table by score and then build groups based on the x value. Each group should have an equal sum (not count) of x. x is a metric (numeric) variable in the dataset and represents the historic turnover of a customer.

score       x   y
0.436024136 3   435
0.282303336 46  56
0.532358015 24  34
0.644236597 0   2
0.99623626  0   4
0.557673456 56  46
0.08898779  0   7
0.702941303 453 2
0.415717835 23  1
0.017497461 234 3
0.426239166 23  59
0.638896238 234 86
0.629610596 26  68
0.073107526 0   35
0.85741877  0   977
0.468612039 0   324
0.740704267 23  56
0.720147257 0   68
0.965212467 23  0

Answer 1

A good way to do this is to add a group variable to the data.frame using cumsum. You can then easily sum x within each group, e.g. with subset.

data.frame$group <- cumsum(as.numeric(data.frame$x)) %/% ceiling(sum(data.frame$x) / 3) + 1

Remarks:

cumsum(as.numeric()) works reliably on big data.frames: converting x to numeric (double) first avoids the integer overflow that cumsum can hit on a large integer column.
%/% is integer division: it returns the whole-number part of the quotient.
The '+ 1' just makes the group numbers start at 1 instead of 0.
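
For illustration, here is a minimal sketch that applies the formula above to the sample data from the question. The names df, n_groups, target and group_sums are made up for this example, and sorting with the highest score first is an assumption (the question only says "sort by score"). tapply is used here in place of subset to get the per-group sums.

```r
# Sample data from the question (score, x, y).
df <- data.frame(
  score = c(0.436024136, 0.282303336, 0.532358015, 0.644236597, 0.99623626,
            0.557673456, 0.08898779, 0.702941303, 0.415717835, 0.017497461,
            0.426239166, 0.638896238, 0.629610596, 0.073107526, 0.85741877,
            0.468612039, 0.740704267, 0.720147257, 0.965212467),
  x = c(3, 46, 24, 0, 0, 56, 0, 453, 23, 234, 23, 234, 26, 0, 0, 0, 23, 0, 23),
  y = c(435, 56, 34, 2, 4, 46, 7, 2, 1, 3, 59, 86, 68, 35, 977, 324, 56, 68, 0)
)

n_groups <- 3  # hypothetical choice, matching the 3 in the answer

# Assumption: highest score first; use decreasing = FALSE for the opposite order.
df <- df[order(df$score, decreasing = TRUE), ]

# Target total of x per group, then integer-divide the running total by it.
target <- ceiling(sum(df$x) / n_groups)
df$group <- cumsum(as.numeric(df$x)) %/% target + 1

# Sum of x per group.
group_sums <- tapply(df$x, df$group, sum)
print(group_sums)
```

With this sample the group sums come out as 46, 713 and 409, so the groups are only approximately equal: a row is assigned by the running total including its own x, and the single row with x = 453 pushes the total far past the first boundary. Also note that if sum(x) is an exact multiple of the target, the last row would land in group n_groups + 1; capping with pmin(df$group, n_groups) is one possible guard.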

Thank you, @Ronak Shah!

Answer 1
0 Accepted 2020-07-08 06:01:41