
Generate id within group

I have the following dataset:

varA <- c(rep("A",2), rep("B",4))
varB <- c(rep("aaaa",2), rep("bbbb", 3), rep("cccc",1) )

dat <- data.frame(varA, varB)
dat 
  varA varB
1    A aaaa
2    A aaaa
3    B bbbb
4    B bbbb
5    B bbbb
6    B cccc

I would like to generate ids for each subgroup (varB), such that the first subgroup is 1, the second is 2, etc., within varA. The ids can repeat across the dataset, as long as different subgroups within the same varA value get different ids.

This is the needed result:

  varA varB res
1    A aaaa   1
2    A aaaa   1
3    B bbbb   1
4    B bbbb   1
5    B bbbb   1
6    B cccc   2 

How can I do this with R?

I tried cur_group_id() in dplyr but it is not working for me...

thanks!!

You can use data.table::rleid(), i.e.

library(dplyr)

# dat, varA and varB are the names from the question
dat %>% 
  group_by(varA) %>% 
  mutate(id = data.table::rleid(varB))  # needs the data.table package installed

# A tibble: 6 x 3
# Groups:   varA [2]
#  varA  varB     id
#  <chr> <chr> <int>
#1 A     aaaa      1
#2 A     aaaa      1
#3 B     bbbb      1
#4 B     bbbb      1
#5 B     bbbb      1
#6 B     cccc      2
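
Note that rleid() numbers runs of consecutive equal values, so it gives the ids asked for only when equal varB values sit next to each other within each varA group, as they do in the example data. If that cannot be guaranteed, a minimal sketch that does not depend on row order (and does not need data.table) is to take the position of each value among the group's unique values. The names res and id below are just illustrative:

```r
library(dplyr)

# Same data as in the question
dat <- data.frame(
  varA = c(rep("A", 2), rep("B", 4)),
  varB = c(rep("aaaa", 2), rep("bbbb", 3), rep("cccc", 1)),
  stringsAsFactors = FALSE
)

res <- dat %>%
  group_by(varA) %>%
  # position of each value among the unique values of its varA group,
  # numbered in order of first appearance
  mutate(id = match(varB, unique(varB))) %>%
  ungroup()

res$id
```

For the example data both approaches should agree; they differ only if a varB value reappears later in the same varA group after a different value.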

Another potential solution:

library(tidyverse)
varA <- c(rep("A",2), rep("B",4))
varB <- c(rep("aaaa",2), rep("bbbb", 3), rep("cccc",1) )

dat <- data.frame(varA, varB)

dat %>%
  group_by(varA) %>%
  mutate(count = ifelse(varB != lag(varB, default = "NA"),
                       1, 0)) %>%
  mutate(rleid = cumsum(count))
#> # A tibble: 6 × 4
#> # Groups:   varA [2]
#>   varA  varB  count rleid
#>   <chr> <chr> <dbl> <dbl>
#> 1 A     aaaa      1     1
#> 2 A     aaaa      0     1
#> 3 B     bbbb      1     1
#> 4 B     bbbb      0     1
#> 5 B     bbbb      0     1
#> 6 B     cccc      1     2

Created on 2021-12-16 by the reprex package (v2.0.1)
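
The same idea can be written more compactly, as a rough sketch: compare each value with the previous one inside the group, using the group's first value as the lag default so that no placeholder string such as "NA" is needed, then take the running count of changes and add 1. The names res and id are illustrative only:

```r
library(dplyr)

# Same data as in the question
dat <- data.frame(
  varA = c(rep("A", 2), rep("B", 4)),
  varB = c(rep("aaaa", 2), rep("bbbb", 3), rep("cccc", 1)),
  stringsAsFactors = FALSE
)

res <- dat %>%
  group_by(varA) %>%
  # TRUE wherever varB differs from the previous row of the same group;
  # the first row is compared with itself, so it never counts as a change
  mutate(id = cumsum(varB != lag(varB, default = first(varB))) + 1L) %>%
  ungroup()

res$id
```

Like rleid(), this counts runs of consecutive equal values, so it assumes the rows of each subgroup are adjacent.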
