In R, create a sequential 1-to-N column based on values in other columns

Question

This seems like a straightforward data manipulation problem, but we would like to avoid a for loop that simply compares the values in each row with the previous row. We have the following dataframe:

zed = data.frame(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1),
  b = c('a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'e', 'e', 'a', 'a'),
  c = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1),
  stringsAsFactors = FALSE
)

The desired output is:

output = data.frame(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1),
  b = c('a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'e', 'e', 'a', 'a'),
  c = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1),
  group = c(1, 1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 7, 8, 8, 9, 9),
  stringsAsFactors = FALSE
)

> output
   a b c group
1  1 a 1     1
2  1 a 1     1
3  1 b 1     2
4  1 b 1     2
5  1 b 1     2
6  1 c 1     3
7  1 c 2     4
8  1 d 2     5
9  2 d 2     6
10 2 d 2     6
11 2 d 2     6
12 2 d 3     7
13 2 e 3     8
14 2 e 3     8
15 1 a 1     9
16 1 a 1     9

The dataframe begins with the columns a, b, c, and we need to add the group column to it. The group column starts at 1 and increases by 1 whenever any of the values in a, b, c differs from its value in the previous row.

This is not quite as simple as doing a group_by() on a, b, c, because the same combination of values can appear again later in the dataframe without being adjacent (e.g. rows 1,2 == rows 15,16, but they are not in the same group because they are not consecutive in the dataframe).
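To illustrate the point, here is a small base R check of what a plain "one id per distinct combination" approach would give. The names key and naive_group are only illustrative, and the paste() key assumes the values themselves contain no spaces.

```r
zed <- data.frame(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1),
  b = c('a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'e', 'e', 'a', 'a'),
  c = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1),
  stringsAsFactors = FALSE
)

# One key per row, built from the three columns (illustrative helper)
key <- paste(zed$a, zed$b, zed$c)

# Naive id: index of each row's combination among the distinct combinations
naive_group <- match(key, unique(key))

# Rows 15 and 16 fall back into the same id as rows 1 and 2,
# which is not what we want
naive_group[c(1, 2, 15, 16)]
```

Rows 15 and 16 get the same id as rows 1 and 2, so only 8 ids are produced instead of the 9 we need.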

Answer 1

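We can use rleid() from data.table, which produces a run-length id that increments whenever the combination of a, b, c changes from the previous row: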

library(data.table)
setDT(zed)[, group := .GRP, .(rleid(a, b, c))]
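If adding a package dependency is not an option, the same idea can be sketched in base R. This is not the data.table code above, only a minimal equivalent under the assumption that the columns contain no NA values. The names cols, changed_by_col and changed are illustrative.

```r
zed <- data.frame(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1),
  b = c('a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'e', 'e', 'a', 'a'),
  c = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1),
  stringsAsFactors = FALSE
)

cols <- c("a", "b", "c")
n <- nrow(zed)

# For each column: TRUE where row i differs from row i - 1 (i = 2..n)
changed_by_col <- lapply(zed[cols], function(x) x[-1] != x[-n])

# A new group starts when any of the columns changed
changed <- Reduce(`|`, changed_by_col)

# The first row always starts group 1; each change bumps the counter
zed$group <- cumsum(c(TRUE, changed))

zed$group
# 1 1 2 2 2 3 4 5 6 6 6 7 8 8 9 9
```

The comparison is vectorized per column, so no explicit for loop over rows is needed.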
