Assign running number based on row entries R

Question

Probably something very easy for you guys. As indicated in the title, I would like to create a new column containing running numbers based on the entries of another column (here, the ASV column). The reference column contains duplicate values, and rows with the same value should get the same number.

 ASV                   New_column 
 wthjjwjjgbwurigwe434j     1 
 wthjjwjjgbwurigwe434j     1
 wthjjwjjgbwurigwe434j     1 
 21y4hghgw6yw8ngqoigj7     2 
 21y4hghgw6yw8ngqoigj7     2 
 1387341yqfysddhas394h     3

Appreciate your help.

Answer 1

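Assuming your data frame is named 'dat', you can convert the column to a factor and take its integer codes. Setting the levels to unique(dat$ASV) numbers the values in order of first appearance; by default, factor() sorts the levels.

dat$New_column <- as.integer(factor(dat$ASV, levels = unique(dat$ASV)))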
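As a minimal sketch of how the level order affects the result, the snippet below rebuilds the question's example and computes both variants. The data frame name dat follows the answer; the vector names sorted_id and appearance_id are made up for illustration.

```r
# Example data from the question
dat <- data.frame(ASV = c("wthjjwjjgbwurigwe434j", "wthjjwjjgbwurigwe434j",
                          "wthjjwjjgbwurigwe434j", "21y4hghgw6yw8ngqoigj7",
                          "21y4hghgw6yw8ngqoigj7", "1387341yqfysddhas394h"))

# Default factor levels are sorted, so the ids follow sorted order
sorted_id <- as.integer(factor(dat$ASV))
# 3 3 3 2 2 1

# Levels taken from unique() keep the order of first appearance
appearance_id <- as.integer(factor(dat$ASV, levels = unique(dat$ASV)))
# 1 1 1 2 2 3
```

Only the second variant reproduces the New_column shown in the question.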

Answer 2

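Update: I decided to come up with another solution, as rleid may lead to misleading results. Note that the approach below assumes identical ASV values sit in consecutive rows, as in the example.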

library(dplyr)

df %>%
  # the running count increases each time an ASV value is seen for the first time
  mutate(id = cumsum(!duplicated(ASV)))

                    ASV id
1 wthjjwjjgbwurigwe434j  1
2 wthjjwjjgbwurigwe434j  1
3 wthjjwjjgbwurigwe434j  1
4 21y4hghgw6yw8ngqoigj7  2
5 21y4hghgw6yw8ngqoigj7  2
6 1387341yqfysddhas394h  3
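A short base R sketch of what happens when a value reappears later in the column. The vector x and the result names are hypothetical, and the run-based id is written with rle() on the assumption that it mirrors what a run-length id such as rleid produces.

```r
x <- c("a", "a", "b", "a")   # "a" reappears after "b"

# Run-based id: every new run of equal values gets a new number
runs <- rle(x)
run_id <- rep(seq_along(runs$lengths), runs$lengths)
# 1 1 2 3

# Counting first occurrences: the last "a" inherits the current count
cumsum_id <- cumsum(!duplicated(x))
# 1 1 2 2

# Looking each value up among the unique values gives one id per value
match_id <- match(x, unique(x))
# 1 1 2 1
```

On data where equal values are always adjacent, as in the question, all three give the same ids.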

Answer 3

We could use match, which returns the position of each ASV in the vector of unique values:

dat$New_column <- with(dat, match(ASV, unique(ASV)))

Answer 4

If the new ids are to be allocated alphabetically, dense_rank in dplyr can be used:

df %>% mutate(New_column = dense_rank(ASV))

                    ASV New_column
1 wthjjwjjgbwurigwe434j          3
2 wthjjwjjgbwurigwe434j          3
3 wthjjwjjgbwurigwe434j          3
4 21y4hghgw6yw8ngqoigj7          2
5 21y4hghgw6yw8ngqoigj7          2
6 1387341yqfysddhas394h          1

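Or, with group_by() and cur_group_id(), which numbers the groups in sorted order as well:

df %>%
  group_by(ASV) %>%
  mutate(New_column = cur_group_id()) %>%
  ungroup()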

4 answers

solution1
5 2021-06-02 14:39:19

solution2
2 2021-06-02 14:53:00

solution3
2 2021-06-02 17:11:09

solution4
1 2021-06-03 05:15:35
