I have the following dataset:
varA <- c(rep("A",2), rep("B",4))
varB <- c(rep("aaaa",2), rep("bbbb", 3), rep("cccc",1) )
dat <- data.frame(varA, varB)
dat
  varA varB
1    A aaaa
2    A aaaa
3    B bbbb
4    B bbbb
5    B bbbb
6    B cccc
I would like to generate IDs for each varB subgroup within varA, so that the first subgroup in each varA group gets 1, the second gets 2, and so on. The IDs can repeat across the dataset, but different subgroups within the same varA group must get different IDs.
This is the needed result:
  varA varB res
1    A aaaa   1
2    A aaaa   1
3    B bbbb   1
4    B bbbb   1
5    B bbbb   1
6    B cccc   2
How can I do this with R?
I tried cur_group_id() in dplyr, but it is not giving me this result.
thanks!!
You can use data.table::rleid(), i.e.:
library(dplyr)

# Uses the question's data frame (dat) and its column names (varA, varB)
dat %>%
  group_by(varA) %>%
  mutate(id = data.table::rleid(varB))
# A tibble: 6 x 3
# Groups:   varA [2]
#  varA  varB     id
#  <chr> <chr> <int>
#1 A     aaaa      1
#2 A     aaaa      1
#3 B     bbbb      1
#4 B     bbbb      1
#5 B     bbbb      1
#6 B     cccc      2
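One caveat: rleid() numbers consecutive runs, so a varB value that reappears later within the same varA group would start a new ID. If each distinct varB value should keep a single ID per group regardless of row order, a base R sketch along these lines may help. It assumes the question's dat and writes to a column named res as in the desired output; it is an illustration, not part of the original answer.

```r
# Rebuild the example data from the question
dat <- data.frame(
  varA = c(rep("A", 2), rep("B", 4)),
  varB = c(rep("aaaa", 2), rep("bbbb", 3), rep("cccc", 1))
)

# ave() splits the row indices by varA; within each group, match() against
# unique() numbers the distinct varB values in order of first appearance.
dat$res <- ave(
  seq_len(nrow(dat)), dat$varA,
  FUN = function(i) match(dat$varB[i], unique(dat$varB[i]))
)

dat$res
#> [1] 1 1 1 1 1 2
```

On the example data this gives the same IDs as rleid(), because every varB value occupies one contiguous block within its varA group.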
Another potential solution:
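library(tidyverse)

varA <- c(rep("A", 2), rep("B", 4))
varB <- c(rep("aaaa", 2), rep("bbbb", 3), rep("cccc", 1))
dat <- data.frame(varA, varB)

dat %>%
  group_by(varA) %>%
  # Flag rows where varB differs from the previous row in the group.
  # The default "NA" is a placeholder string (not a missing value), so the
  # first row of each group is always flagged; it assumes no varB equals "NA".
  mutate(count = ifelse(varB != lag(varB, default = "NA"), 1, 0)) %>%
  # The running total of the flags is the subgroup id
  mutate(rleid = cumsum(count))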
#> # A tibble: 6 × 4
#> # Groups:   varA [2]
#>   varA  varB  count rleid
#>   <chr> <chr> <dbl> <dbl>
#> 1 A     aaaa      1     1
#> 2 A     aaaa      0     1
#> 3 B     bbbb      1     1
#> 4 B     bbbb      0     1
#> 5 B     bbbb      0     1
#> 6 B     cccc      1     2
Created on 2021-12-16 by the reprex package (v2.0.1)
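The same run-counting idea can also be sketched in base R with rle(), without any packages. The helper name run_id and the column name run are made up for this illustration.

```r
# Hypothetical helper: number the consecutive runs in a vector 1, 2, 3, ...
run_id <- function(x) {
  r <- rle(x)
  rep(seq_along(r$lengths), r$lengths)
}

dat <- data.frame(
  varA = c(rep("A", 2), rep("B", 4)),
  varB = c(rep("aaaa", 2), rep("bbbb", 3), rep("cccc", 1))
)

# Apply the helper to varB separately within each varA group
dat$run <- ave(
  seq_len(nrow(dat)), dat$varA,
  FUN = function(i) run_id(dat$varB[i])
)

dat$run
#> [1] 1 1 1 1 1 2

# Run ids restart counting whenever the value changes, even if it was seen before
run_id(c("x", "y", "x"))
#> [1] 1 2 3
```

Like the cumsum() approach above, this counts changes between adjacent rows, so it relies on the rows of each subgroup being contiguous.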