
How can I create a new column that numbers the sequences between zero values?

I would like to create a new column that counts the distinct sequences in my data, where a sequence runs from a zero value up to the next zero value, with the 1 values in between. Consecutive zeros belong to the same sequence, as the examples below show.

I am using R.

There are two scenarios. In both, I have the Conversions column and would like to create the New Column.

First Scenario (when my Conversions Column starts with 1):

Conversions New Column (The Sequence)
1 1
1 1
0 2
1 2
1 2
1 2
0 3
1 3
1 3
0 4
0 4
0 4
1 4
1 4
1 4
0 5
0 5

Second Scenario (when my Conversions Column starts with 0):

Conversions New Column (The Sequence)
0 1
0 1
0 1
1 1
0 2
1 2
1 2
1 2
0 3
0 3
1 3
0 4
1 4
1 4
0 5
1 5
1 5

Thanks

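library(dplyr)

# First scenario from the question: conversion starts with 1.
# `sequence` holds the expected result; `id` is the row number.
dt1 <- tibble(
  conversion = c(1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0),
  sequence = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5),
  id = 1:17
)

# Second scenario from the question: conversion starts with 0.
dt2 <- tibble(
  conversion = c(0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1),
  sequence = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5),
  id = 1:17
)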

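build_seq <- function(df) {
  df %>%
    mutate(
      # Mark each row where conversion drops from 1 to 0 with its row id;
      # every other row (including the first, whose lag is NA) becomes NA.
      new_col = ifelse((conversion - lag(conversion, 1)) == -1, id, NA),
      # Turn the marked ids into consecutive integers 1, 2, 3, ...
      new_col = as.numeric(as.factor(new_col))
    ) %>%
    # Carry each marker down to the rows that follow it.
    tidyr::fill(new_col, .direction = "down") %>%
    mutate(
      # Rows before the first drop are sequence 1; shift the rest up by one.
      new_col = ifelse(is.na(new_col), 1, new_col + 1)
    )
}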

new_dt1 <- build_seq(dt1)
new_dt2 <- build_seq(dt2)

# Compare against the expected column; both checks return TRUE.
all(new_dt1$new_col == new_dt1$sequence)
all(new_dt2$new_col == new_dt2$sequence)
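
The same numbering can probably be reached more directly with a running count of the 1-to-0 drops. The sketch below is an alternative to build_seq, not part of the answer above; it uses base R only, and the helper name seq_between_zeros is made up for illustration. It assumes the column contains only 0s and 1s and no missing values.

```r
# Hypothetical helper (not from the original answer): a new sequence starts
# at every 0 that directly follows a 1.
seq_between_zeros <- function(conversion) {
  # Previous value of each element. The first element has no predecessor,
  # so it is padded with its own value and can never count as a drop.
  prev <- c(conversion[1], head(conversion, -1))
  # TRUE exactly where the series falls from 1 to 0.
  starts_new_run <- prev == 1 & conversion == 0
  # Running count of drops, shifted so the first sequence is numbered 1.
  cumsum(starts_new_run) + 1
}

# The two Conversions columns from the question.
conv1 <- c(1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0)
conv2 <- c(0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1)

seq1 <- seq_between_zeros(conv1)
seq2 <- seq_between_zeros(conv2)
```

Inside a dplyr pipeline the helper could be called as mutate(new_col = seq_between_zeros(conversion)), which avoids the id column and the tidyr::fill step.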


 