In R, create sequential 1 to N column based on values in other columns
This seems like a straightforward data manipulation problem, but we would like to avoid a for loop that simply compares the values in each row. We have the following dataframe:
zed = data.frame(
a = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1),
b = c('a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'e', 'e', 'a', 'a'),
c = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1),
stringsAsFactors = FALSE
)
and this is the desired output:
output = data.frame(
a = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1),
b = c('a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'e', 'e', 'a', 'a'),
c = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1),
group = c(1, 1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 7, 8, 8, 9, 9),
stringsAsFactors = FALSE
)
> output
   a b c group
1  1 a 1     1
2  1 a 1     1
3  1 b 1     2
4  1 b 1     2
5  1 b 1     2
6  1 c 1     3
7  1 c 2     4
8  1 d 2     5
9  2 d 2     6
10 2 d 2     6
11 2 d 2     6
12 2 d 3     7
13 2 e 3     8
14 2 e 3     8
15 1 a 1     9
16 1 a 1     9
The dataframe begins with the columns a, b, c, and we need to add the group column to it. The group column starts at 1 and increases by 1 whenever any of the values in a, b, c differs from its value in the previous row.
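The rule described above can be sketched in base R without an explicit for loop. This is only an illustration of the rule, not the method from the answer; the helper names cols and changed are made up for this example. It compares each column with itself shifted by one row, combines the comparisons, and takes a cumulative sum.

```r
zed <- data.frame(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1),
  b = c('a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'e', 'e', 'a', 'a'),
  c = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1),
  stringsAsFactors = FALSE
)

# Columns that define a group (hypothetical helper name)
cols <- c("a", "b", "c")

# For each column, TRUE where row i differs from row i - 1 (rows 2..n),
# then OR the per-column results together
changed <- Reduce(`|`, lapply(zed[cols], function(x) x[-1] != x[-length(x)]))

# The first row always starts group 1; every change starts a new group
zed$group <- cumsum(c(TRUE, changed))
```

Note that this sketch assumes the columns contain no NA values, since a comparison with NA would propagate NA into the cumulative sum.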
This is not quite as simple as doing a group_by() on a, b, c, because the same combination of values can appear again later in the dataframe without being adjacent (e.g. rows 1-2 are equal to rows 15-16, but they are not in the same group because they do not appear consecutively).
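We can use rleid() from data.table, which produces a new run id each time the combination of a, b, c changes from one row to the next: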
library(data.table)
setDT(zed)[, group := .GRP, .(rleid(a, b, c))]
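As a quick check of the data.table approach above, the sketch below rebuilds the example data and compares the result with the expected group column from the question. It also shows what should be an equivalent shortcut: since rleid() already returns run ids starting at 1, assigning it directly avoids the grouping step. The names expected and group2 are introduced only for this example.

```r
library(data.table)

zed <- data.table(
  a = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1),
  b = c('a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'e', 'e', 'a', 'a'),
  c = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1, 1)
)

# Expected group column, copied from the question
expected <- c(1, 1, 2, 2, 2, 3, 4, 5, 6, 6, 6, 7, 8, 8, 9, 9)

# Approach from the answer: group by the run id and number the groups with .GRP
zed[, group := .GRP, by = .(rleid(a, b, c))]

# Shortcut: use the run id itself as the group number
zed[, group2 := rleid(a, b, c)]

# Both columns should agree with the expected values
all(zed$group == expected)
all(zed$group2 == expected)
```

The shortcut is the shorter of the two to read, while the .GRP form makes the grouping explicit; which one to prefer is mostly a matter of taste.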