Add counter for Matching ID based on value in another column in R

I am struggling to work out the logic needed to build a counter/index that separates genuine matches from non-genuine matches. A simplified example of my data is as follows:

ID    track
x       10
x       10
x       3
x       3
x       1
y       2

The final data frame I wish to get is as follows:

ID    track   Counter
x       10      1
x       10      1
x       3       2
x       3       2
x       1       3
y       2       1

Hence, whenever the ID and the track are both the same, the rows should share one value in the Counter column (starting at 1); whenever the ID is the same but the track changes, the counter should go up by 1, and so on. When a new ID comes up, the counter starts from 1 again.

Any advice would be great.

You may use

library(tidyverse)
# Within each ID, count how many distinct track values have been seen so far
data %>% group_by(ID) %>% mutate(Counter = cumsum(!duplicated(track)))

The trick is to use !duplicated to flag entries that have not been seen before and cumsum to turn those flags into a running counter. E.g.,

!duplicated(data$track[1:5])
# [1]  TRUE FALSE  TRUE FALSE  TRUE
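
If you would rather not load the tidyverse, the same idea can be sketched in base R with ave(), which applies a function to one column separately within each group. This is only a minimal sketch: the data frame below is rebuilt by hand from the example in the question, and storing track as numeric is an assumption about the real data.

```r
# Example data rebuilt from the question (assumed: track is numeric).
data <- data.frame(
  ID    = c("x", "x", "x", "x", "x", "y"),
  track = c(10, 10, 3, 3, 1, 2)
)

# ave() runs FUN on track separately within each ID.
# !duplicated(x) is TRUE the first time a value appears in the group,
# and cumsum() counts those first appearances.
data$Counter <- ave(data$track, data$ID,
                    FUN = function(x) cumsum(!duplicated(x)))

print(data)
```

For the six rows above, Counter comes out as 1, 1, 2, 2, 3, 1, which matches the table in the question.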

@Julius' answer works as long as no track value comes back later within the same ID. If the track can revert to a previous value, duplicated treats it as already seen and the counter is not incremented. If that can happen in your data and you need the counter to go up on every change, I would suggest using lag from dplyr.

library(dplyr)
# Compare each track with the previous row of the same ID; every change adds 1
df %>% group_by(ID) %>% mutate(count = cumsum(track != lag(track, default = track[1])) + 1)

Results with a couple more data points (track goes back to 3 after 1):

# A tibble: 8 x 3
# Groups:   ID [2]
#   ID    track count
#   <fct> <int> <dbl>
# 1 x        10     1
# 2 x        10     1
# 3 x         3     2
# 4 x         3     2
# 5 x         1     3
# 6 x         3     4
# 7 x         3     4
# 8 y         2     1
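
To make the result above reproducible, here is a rough base R sketch of the same "increment on every change" logic. The data frame is typed in from the printed tibble, and run_counter is a hypothetical helper name made up for this illustration, not something from dplyr or base R.

```r
# Data typed in from the printed result above (assumed: track is numeric).
df <- data.frame(
  ID    = c("x", "x", "x", "x", "x", "x", "x", "y"),
  track = c(10, 10, 3, 3, 1, 3, 3, 2)
)

# Hypothetical helper: numbers consecutive runs of equal values 1, 2, 3, ...
run_counter <- function(x) {
  # TRUE on the first row and wherever the value differs from the row before it
  changed <- c(TRUE, x[-1] != x[-length(x)])
  cumsum(changed)
}

# Apply the helper separately within each ID.
df$count <- ave(df$track, df$ID, FUN = run_counter)

print(df)
```

Because each row is compared with its neighbour rather than with everything seen so far, the second run of 3s for ID x gets 4 instead of falling back to 2.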
