The Most Efficient Way of Forming Groups using R

Question

I have a tibble dt given as follows:

library(tidyverse) 

dt <- tibble(x=as.integer(c(0,0,1,0,0,0,1,1,0,1))) %>% 
  mutate(grp = as.factor(c(rep("A",3), rep("B",4), rep("C",1), rep("D",2))))
dt

As one can observe, the rule for grouping is that each group either:

starts with 0 and ends with 1 (e.g., groups A, B, D), or
contains only a single 1 (e.g., group C).

Problem: Given a tibble with an integer column x of 0s and 1s that starts with 0 and ends with 1, what is the most efficient way to obtain such a grouping in R? (Any grouping symbols/factors may be used.)
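For illustration only, the rule above can be checked against the example data in base R. This sketch assumes one particular reading of the rule, namely that every group holds exactly one 1 and that this 1 is the group's last element; the helper name ends_at_first_one is hypothetical and not part of the original question.

```r
# Example data from the question, as plain vectors.
x   <- as.integer(c(0, 0, 1, 0, 0, 0, 1, 1, 0, 1))
grp <- factor(c(rep("A", 3), rep("B", 4), "C", rep("D", 2)))

# Hypothetical helper: TRUE when a group holds exactly one 1
# and that 1 is the group's last element (assumed reading of the rule).
ends_at_first_one <- function(v) sum(v) == 1L && v[length(v)] == 1L

# Apply the check to each group; one logical value per group label.
rule_ok <- tapply(x, grp, ends_at_first_one)
all(rule_ok)
```

Under that reading, a group boundary falls immediately after every 1 in x.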

Answer 1

We can take the cumulative sum of 'x' (assuming it is binary), lag it by one position, add 1, and use the result as an index into LETTERS. (Note that LETTERS is used only to match the expected output; it has just 26 elements, so it can label at most 26 groups.)

library(dplyr)
dt %>%
  mutate(grp2 = LETTERS[lag(cumsum(x), default = 0) + 1])

Output:

# A tibble: 10 x 3
       x grp   grp2 
   <int> <fct> <chr>
 1     0 A     A    
 2     0 A     A    
 3     1 A     A    
 4     0 B     B    
 5     0 B     B    
 6     0 B     B    
 7     1 B     B    
 8     1 C     C    
 9     0 D     D    
10     1 D     D
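To make the intermediate steps visible, here is a minimal base R sketch of the same idea. It is not the answerer's code: c(0L, head(..., -1)) is used as a stand-in for dplyr's lag(..., default = 0), and the "G" prefix for labels is an arbitrary choice for illustration.

```r
# Same data as in the question.
x <- as.integer(c(0, 0, 1, 0, 0, 0, 1, 1, 0, 1))

# Step 1: running count of 1s; it increases on the row that closes a group.
closed <- cumsum(x)
# 0 0 1 1 1 1 2 3 3 4

# Step 2: shift down by one row so the closing 1 stays in the group it ends.
shifted <- c(0L, head(closed, -1))
# 0 0 0 1 1 1 1 2 3 3

# Step 3: add 1 to obtain 1-based group ids.
grp_id <- shifted + 1L
# 1 1 1 2 2 2 2 3 4 4

# Letter labels work for up to 26 groups; beyond that LETTERS[i] is NA.
grp_letters <- LETTERS[grp_id]

# Labels built from the integer id have no such limit ("G" is arbitrary).
grp_label <- paste0("G", grp_id)
```

If the labels themselves do not matter, the integer grp_id can be used directly as the grouping variable.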

Answer 2

The strategy proposed by Akrun is fantastic; still, to show that it can also be done with accumulate:

library(tidyverse) 

dt <- tibble(x=as.integer(c(0,0,1,0,0,0,1,1,0,1))) %>% 
  mutate(grp = as.factor(c(rep("A",3), rep("B",4), rep("C",1), rep("D",2))))

dt %>%
  mutate(GRP = accumulate(lag(x, default = 0), .init = 1, ~ if (.y != 1) .x else .x + 1)[-1])
#> # A tibble: 10 x 3
#>        x grp     GRP
#>    <int> <fct> <dbl>
#>  1     0 A         1
#>  2     0 A         1
#>  3     1 A         1
#>  4     0 B         2
#>  5     0 B         2
#>  6     0 B         2
#>  7     1 B         2
#>  8     1 C         3
#>  9     0 D         4
#> 10     1 D         4

Created on 2021-06-13 by the reprex package (v2.0.0)
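Since the question asks about efficiency, the two strategies can be compared side by side. The following is a rough base R sketch, not code from either answer: Reduce(..., accumulate = TRUE) stands in for purrr's accumulate(), the function names by_cumsum and by_reduce are hypothetical, and the random test vector is made up purely for timing.

```r
x <- as.integer(c(0, 0, 1, 0, 0, 0, 1, 1, 0, 1))

# Vectorised strategy (cumulative sum, shifted by one row, plus 1).
by_cumsum <- function(x) c(0L, head(cumsum(x), -1)) + 1L

# Iterative strategy: carry the group id forward row by row.
by_reduce <- function(x) {
  prev <- c(0L, head(x, -1))  # previous row's value, 0 for the first row
  # Move to the next group whenever the previous row was a 1.
  step <- function(acc, p) if (p == 1L) acc + 1L else acc
  Reduce(step, prev, init = 1L, accumulate = TRUE)[-1]  # drop the initial value
}

# Both give the same ids on the example data.
identical(by_cumsum(x), by_reduce(x))

# Rough timing on a larger made-up binary vector; results vary by machine.
set.seed(1)
big <- sample(0:1, 1e5, replace = TRUE)
system.time(by_cumsum(big))
system.time(by_reduce(big))
```

The vectorised version does its work in a single call to cumsum, whereas the iterative version calls an R function once per row, so one would generally expect the former to scale better on long vectors.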

solution1
2 ACCEPTED 2021-06-12 23:51:48

solution2
2 2021-06-13 03:48:52
