Probably something very easy for you guys. As indicated in the title, I would like to create a new column of running numbers based on the row entries of a different column (in this case the ASV column). The entries in the reference column contain duplicate values.
ASV New_column
wthjjwjjgbwurigwe434j 1
wthjjwjjgbwurigwe434j 1
wthjjwjjgbwurigwe434j 1
21y4hghgw6yw8ngqoigj7 2
21y4hghgw6yw8ngqoigj7 2
1387341yqfysddhas394h 3
Appreciate your help.
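If we assume your data frame is named 'dat', we can use the following code. The levels are set explicitly because factor() sorts them alphabetically by default, which would not number the values in order of appearance:
dat$New_column <- as.integer(factor(dat$ASV, levels = unique(dat$ASV)))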
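One thing worth checking here: `factor()` sorts its levels by default, so the integer codes follow alphabetical order rather than order of appearance. A minimal sketch on the sample data from the question (the data frame name `dat` is the assumption made above), comparing the default with `levels = unique(...)`:

```r
# Sample data from the question; `dat` is an assumed name
dat <- data.frame(ASV = c("wthjjwjjgbwurigwe434j", "wthjjwjjgbwurigwe434j",
                          "wthjjwjjgbwurigwe434j", "21y4hghgw6yw8ngqoigj7",
                          "21y4hghgw6yw8ngqoigj7", "1387341yqfysddhas394h"))

# Default levels are sorted, so the ids follow alphabetical order
ids_sorted <- as.integer(factor(dat$ASV))

# Explicit levels keep the order of first appearance
ids_appearance <- as.integer(factor(dat$ASV, levels = unique(dat$ASV)))
```

On this data the first vector numbers the rows 3, 3, 3, 2, 2, 1, while the second gives the 1, 1, 1, 2, 2, 3 shown in the question.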
Updated: I decided to come up with another solution, as rleid may lead to misleading results.
library(dplyr)
# Assumes identical ASV values sit in adjacent rows
df %>%
  mutate(id = cumsum(!duplicated(ASV)))
ASV id
1 wthjjwjjgbwurigwe434j 1
2 wthjjwjjgbwurigwe434j 1
3 wthjjwjjgbwurigwe434j 1
4 21y4hghgw6yw8ngqoigj7 2
5 21y4hghgw6yw8ngqoigj7 2
6 1387341yqfysddhas394h 3
We could use match:
dat$New_column <- with(dat, match(ASV, unique(ASV)))
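The approaches in this thread differ when equal values are not adjacent. A small sketch in base R, using a made-up vector `x` (not from the question) in which "a" reappears after "b". The run-length id is emulated with `rle()` as a stand-in for `rleid`, which is an assumption about how one would reproduce it without data.table:

```r
x <- c("a", "a", "b", "a")  # hypothetical data with a non-adjacent repeat

# Run-length id: every run gets a new id, so the last "a" becomes 3
r <- rle(x)
run_id <- rep(seq_along(r$lengths), r$lengths)

# cumsum over first occurrences: the last "a" inherits the id of "b"
cum_id <- cumsum(!duplicated(x))

# match against the unique values: the last "a" gets its original id back
match_id <- match(x, unique(x))
```

Only `match_id` gives the repeated "a" the same number as its first occurrence, so `match` looks like the safer choice when the column is not sorted or grouped.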
If new ids are to be allocated alphabetically, dense_rank in dplyr can be used:
df %>% mutate(New_column = dense_rank(ASV))
ASV New_column
1 wthjjwjjgbwurigwe434j 3
2 wthjjwjjgbwurigwe434j 3
3 wthjjwjjgbwurigwe434j 3
4 21y4hghgw6yw8ngqoigj7 2
5 21y4hghgw6yw8ngqoigj7 2
6 1387341yqfysddhas394h 1
Or, with cur_group_id(), which numbers the groups in their sorted order:
df %>%
  group_by(ASV) %>%
  mutate(New_column = cur_group_id()) %>%
  ungroup()
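For comparison, the alphabetical numbering can also be sketched in base R without dplyr, by matching against the sorted unique values. This is an alternative, not a description of what dplyr does internally; `df` is assumed to hold the sample data from the question:

```r
# Sample data from the question; `df` is an assumed name
df <- data.frame(ASV = c("wthjjwjjgbwurigwe434j", "wthjjwjjgbwurigwe434j",
                         "wthjjwjjgbwurigwe434j", "21y4hghgw6yw8ngqoigj7",
                         "21y4hghgw6yw8ngqoigj7", "1387341yqfysddhas394h"))

# Position of each value among the sorted distinct values
df$New_column <- match(df$ASV, sort(unique(df$ASV)))
```

Dropping `sort()` gives back the order-of-appearance numbering from the `match` answer above.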