
R data.table: how to change each preceding 0 into a 1 within a column?

I have the following R data.table, which is composed of only one column:

library(data.table)

DT <- data.table(first_column = c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0))

> DT
    first_column
 1:            0
 2:            0
 3:            0
 4:            1
 5:            1
 6:            1
 7:            0
 8:            0
 9:            1
10:            1
11:            0
12:            0
13:            0
14:            0
15:            1
16:            1
17:            1
18:            1
19:            1
20:            0
21:            0
...          ...

The binary column first_column is composed of "clusters" of consecutive ones.

I would like to take the 0 immediately preceding each cluster and turn it into a 1. In other words, wherever a 1 follows a 0, that preceding 0 should be changed into a 1.

EDIT: To be more clear, the pattern 0001110011000011111... would become 0011110111000111111...

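Try this using diff:

# which() turns the logical result into row positions, so the index
# (one element shorter than the column) is never recycled
DT$first_column[which(diff(DT$first_column) == 1)] <- 1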

    # first_column
 # 1:            0
 # 2:            0
 # 3:            1
 # 4:            1
 # 5:            1
 # 6:            1
 # 7:            0
 # 8:            1
 # 9:            1
# 10:            1
# 11:            0
# 12:            0
# 13:            0
# 14:            1
# 15:            1
# 16:            1
# 17:            1
# 18:            1
# 19:            1
# 20:            0
# 21:            0
    # first_column

Basically, diff outputs 1 at every position where a 0 is immediately followed by a 1, so indexing with that result selects exactly the 0 preceding each cluster.
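To make the indexing explicit, here is a small sketch on toy vectors (x and z are hypothetical, not taken from the question). It also shows why it can be safer to wrap the comparison in which(): diff() returns one element fewer than its input, and a logical index that is too short is recycled.

```r
# Hypothetical toy vector, chosen only for illustration
x <- c(0, 0, 1, 1, 0, 1)

d <- diff(x)           # x[i + 1] - x[i]; has length(x) - 1 elements
idx <- which(d == 1)   # positions of each 0 that is directly followed by a 1

y <- x
y[idx] <- 1            # turn those 0s into 1s

# Recycling pitfall: a vector that starts with 0, 1
z <- c(0, 1, 0)

z_logical <- z
z_logical[diff(z) == 1] <- 1        # c(TRUE, FALSE) is recycled to c(TRUE, FALSE, TRUE)

z_which <- z
z_which[which(diff(z) == 1)] <- 1   # only position 1 is changed
```

With the data in the question the bare logical index happens to give the right answer, because the first difference is 0 and so the recycled element is FALSE.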

This will replace the final value of each 0/1 "group" with a 1. That is redundant for the groups of 1s, but it is what you want to accomplish for the groups of 0s (if I read your question correctly).

DT[, c(head(first_column, -1), 1), by=rleid(first_column)]

rleid is used to group adjacent 0s and 1s, and head with -1 keeps all but the final element.

Or even better, you can use replace as @Frank suggests, like this:

DT[, replace(first_column, .N, 1), by=rleid(first_column)]

where .N is used to specify the final row in the group. Both of these return

    rleid V1
 1:     1  0
 2:     1  0
 3:     1  1
 4:     2  1
 5:     2  1
 6:     2  1
 7:     3  0
 8:     3  1
 9:     4  1
10:     4  1
11:     5  0
12:     5  0
13:     5  0
14:     5  1
15:     6  1
16:     6  1
17:     6  1
18:     6  1
19:     6  1
20:     7  0
21:     7  1
    rleid V1
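As a rough illustration of the building blocks used above, the following sketch applies rleid, head and replace to plain vectors. The names x, grp and g are made up for this example.

```r
library(data.table)

# First ten values of the column from the question
x <- c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1)

# Run id: increases by one each time the value changes
grp <- rleid(x)

# One run of zeros, as seen inside a single group
g <- c(0, 0, 0)

via_head <- c(head(g, -1), 1)            # drop the last element, append a 1
via_replace <- replace(g, length(g), 1)  # copy of g with the last element set to 1
```

Both approaches produce the same vector for a group; replace simply avoids rebuilding it from two pieces.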

These solutions (incorrectly) fill in the final observation with a 1. One way to avoid this is to add a check before filling in the values.

DT[, if(.I[.N] < nrow(DT)) replace(first_column, .N, 1) else first_column,
   by=rleid(first_column)]

Here, .I[.N] < nrow(DT) returns TRUE for every group except the final group. The final observation of this group is left "as is."
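To see what the check compares, the sketch below lists the last row number of each run and then stores the result in a new column. The column name fixed and the helper table last_rows are assumptions made for illustration; the original answer only returns the values.

```r
library(data.table)

DT <- data.table(first_column = c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0,
                                  1, 1, 1, 1, 1, 0, 0))

# Last row number of each run; only the final run ends at nrow(DT)
last_rows <- DT[, .(last_row = .I[.N]), by = .(run = rleid(first_column))]

# Same check as above, with the result written to a new column by reference
DT[, fixed := if (.I[.N] < nrow(DT)) replace(first_column, .N, 1) else first_column,
   by = rleid(first_column)]

DT[, fixed]
```

Writing to a separate column keeps the original values available for comparison; assigning back to first_column is a design choice left to the reader.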

If I understood the OP correctly, he wants to turn any occurrence of the sub-sequence 0,1 into 1,1:

DT <- data.table(first_column = c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0))

DT[first_column == 0 & shift(first_column, type = "lead") == 1, first_column := 1]

DT[, first_column]
# [1] 0 0 1 1 1 1 0 1 1 1 0 0 0 1 1 1 1 1 1 0 0

At the expense of implicit type conversions from double to logical, this can be written more concisely as:

DT <- data.table(first_column = c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0))

DT[!first_column & shift(first_column, type = "lead"), first_column := 1]
DT[, first_column]
# [1] 0 0 1 1 1 1 0 1 1 1 0 0 0 1 1 1 1 1 1 0 0

This relies on the fact that 0 is treated as FALSE and any non-zero number as TRUE.
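A short sketch of that coercion on a toy vector may help (x, nxt, cond and toy are hypothetical names). It also shows why the last row is never changed: shift with type = "lead" has no successor for the final element and returns NA there, and data.table does not select rows where the condition in i is NA.

```r
library(data.table)

x <- c(0, 1, 0, 0, 1, 0)         # hypothetical toy vector

as.logical(x)                    # 0 becomes FALSE, non-zero becomes TRUE

nxt <- shift(x, type = "lead")   # next value; NA for the last element
cond <- !x & nxt                 # TRUE where a 0 is directly followed by a non-zero value

toy <- data.table(v = x)
toy[cond, v := 1]                # rows where cond is NA are not selected
toy[, v]
```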
