How to assign incremental values based on two columns in R?
My data set looks like this:
ID VISIT_ID DATE DV
1001 112233 12-23 3
1001 112233 12-23 4
1001 112244 12-23 5
1001 112244 12-23 6
1001 112244 12-23 7
1001 112244 12-23 8
1002 112254 12-23 3
1002 112254 12-23 4
1002 112254 12-23 5
1002 112264 12-23 6
1002 112264 12-23 7
1002 112264 12-23 8
I want results like the ones below: an incremental encounter value assigned to each unique VISIT_ID. The sequence should restart from 1 for each ID. Any help would be much appreciated.
ID VISIT_ID DATE DV ENCOUNTER
1001 112233 12-23 3 1
1001 112233 12-23 4 1
1001 112244 12-23 5 2
1001 112244 12-23 6 2
1001 112244 12-23 7 2
1001 112244 12-23 8 2
1002 112254 12-23 3 1
1002 112254 12-23 4 1
1002 112254 12-23 5 1
1002 112264 12-23 6 2
1002 112264 12-23 7 2
1002 112264 12-23 8 2
We can use match to find the index of each unique 'VISIT_ID' after grouping by 'ID':
library(dplyr)
df1 %>%
group_by(ID) %>%
mutate(ENCOUNTER = match(VISIT_ID, unique(VISIT_ID)))
# ID VISIT_ID DATE DV ENCOUNTER
# <int> <int> <chr> <int> <int>
#1 1001 112233 12-23 3 1
#2 1001 112233 12-23 4 1
#3 1001 112244 12-23 5 2
#4 1001 112244 12-23 6 2
#5 1001 112244 12-23 7 2
#6 1001 112244 12-23 8 2
#7 1002 112254 12-23 3 1
#8 1002 112254 12-23 4 1
#9 1002 112254 12-23 5 1
#10 1002 112264 12-23 6 2
#11 1002 112264 12-23 7 2
#12 1002 112264 12-23 8 2
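To see why match works here, this is a minimal base R sketch on the VISIT_ID values of a single ID (the names visits, first_seen and encounter are made up for illustration; the values are copied from the question):

```r
# VISIT_ID values for ID 1001, copied from the question's data
visits <- c(112233, 112233, 112244, 112244, 112244, 112244)

# unique() keeps the distinct values in order of first appearance
first_seen <- unique(visits)

# match() returns, for each element, its position in first_seen
encounter <- match(visits, first_seen)
print(encounter)
```

group_by(ID) simply repeats this step separately for each ID, which is why the counter restarts at 1.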
Another option is duplicated:
df1 %>%
group_by(ID) %>%
mutate(ENCOUNTER = cumsum(!duplicated(VISIT_ID)))
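One thing worth knowing: the match and duplicated versions agree on the question's data, but they can differ if a VISIT_ID comes back after another visit within the same ID. A small sketch, assuming a made-up vector where the first visit reappears (the names x, by_match and by_cumsum are hypothetical):

```r
# Hypothetical VISIT_ID sequence for one ID where the first visit reappears
x <- c(112233, 112244, 112233)

# match() gives the repeated visit its original number
by_match <- match(x, unique(x))

# cumsum(!duplicated()) only counts first appearances, so the repeated
# visit inherits the running total instead of its original number
by_cumsum <- cumsum(!duplicated(x))

print(by_match)
print(by_cumsum)
```

If rows for the same VISIT_ID are always contiguous, as in the question, either one should do.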
Or using data.table:
library(data.table)
setDT(df1)[, ENCOUNTER := match(VISIT_ID, unique(VISIT_ID)), by = ID]
Or with base R:
with(df1, ave(VISIT_ID, ID, FUN = function(x) cumsum(!duplicated(x))))
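The line above only returns the vector. A self-contained sketch that rebuilds the sample data and stores the result as a column could look like this (the data frame is retyped from the question, and DATE is assumed to be plain text):

```r
# Sample data retyped from the question; DATE is kept as a character column
df1 <- data.frame(
  ID = rep(c(1001L, 1002L), each = 6),
  VISIT_ID = c(112233L, 112233L, 112244L, 112244L, 112244L, 112244L,
               112254L, 112254L, 112254L, 112264L, 112264L, 112264L),
  DATE = "12-23",
  DV = rep(3:8, times = 2)
)

# ave() applies FUN to VISIT_ID within each ID and returns the results
# in the original row order, so they can be assigned straight to a column
df1$ENCOUNTER <- with(df1, ave(VISIT_ID, ID,
                               FUN = function(x) cumsum(!duplicated(x))))
print(df1)
```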
With base R ave, we can convert VISIT_ID to factor and then to numeric to get a unique number for every VISIT_ID within each ID:
df$ENCOUNTER <- ave(df$VISIT_ID, df$ID,FUN = function(x) as.numeric(as.factor(x)))
df
# ID VISIT_ID DATE DV ENCOUNTER
#1 1001 112233 12-23 3 1
#2 1001 112233 12-23 4 1
#3 1001 112244 12-23 5 2
#4 1001 112244 12-23 6 2
#5 1001 112244 12-23 7 2
#6 1001 112244 12-23 8 2
#7 1002 112254 12-23 3 1
#8 1002 112254 12-23 4 1
#9 1002 112254 12-23 5 1
#10 1002 112264 12-23 6 2
#11 1002 112264 12-23 7 2
#12 1002 112264 12-23 8 2
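A caveat with the factor approach: factor levels are sorted, so the numbers follow the sorted order of VISIT_ID rather than the order in which the visits appear. On the question's data the two coincide, but a small sketch with a made-up, unsorted vector shows the difference (the names x, by_factor and by_match are hypothetical):

```r
# Hypothetical VISIT_ID values for one ID, with the larger ID appearing first
x <- c(112244, 112233, 112244)

# as.factor() sorts its levels, so 112233 becomes level 1 and 112244 level 2
by_factor <- as.numeric(as.factor(x))

# match() against unique() numbers the visits in order of first appearance
by_match <- match(x, unique(x))

print(by_factor)
print(by_match)
```

So if encounters should be numbered in the order they occur in the data, match is probably the safer choice unless the data is sorted by VISIT_ID first.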