I have the following df:
df <- data.frame(ID = c(1,1,2,2,2,3,3,3,3),
Attendance = c(1, 1, NA, 1,1, NA, 1, NA, 1 ))
And I want this one:
df <- data.frame(ID = c(1,1,2,2,2,3,3,3,3),
Attendance = c(1, 1, NA, 1,1, NA, 1, NA, 1),
Visit = c(1,2,0,1,2,0,1,0,2))
How can I fill the 'Visit' column with a running count (cumsum) of each ID's appearances, based on the 'Attendance' column value, while ignoring NAs or 0s (those rows should get 0)?
I tried something like this with the ave function, without success:
df$Visit <- ifelse(!is.na(df$ID), (ave(df$ID, df$ID, FUN=cumsum))/df$ID, 0)
I have achieved the result by creating an auxiliary df with:
aux <- df[complete.cases(df$Attendance),]
then counting the visits with the ave function and merging the result back, but I'm sure there is an easier way.
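For reference, the auxiliary-data-frame workaround described above might look roughly like the sketch below. The `row` helper column and the `res` name are assumptions added here so that the merged result can be put back in the original row order; they are not part of the original attempt.

```r
# Sample data from the question
df <- data.frame(ID = c(1, 1, 2, 2, 2, 3, 3, 3, 3),
                 Attendance = c(1, 1, NA, 1, 1, NA, 1, NA, 1))

# Hypothetical helper column: remember each row's original position
df$row <- seq_len(nrow(df))

# Keep only the rows with a recorded attendance and count them per ID
aux <- df[complete.cases(df$Attendance), ]
aux$Visit <- ave(aux$Attendance, aux$ID, FUN = cumsum)

# Merge the counts back; rows missing from aux get NA, which becomes 0
res <- merge(df, aux[, c("row", "Visit")], by = "row", all.x = TRUE)
res$Visit[is.na(res$Visit)] <- 0
res <- res[order(res$row), c("ID", "Attendance", "Visit")]
res$Visit
```

This works, but it needs an extra key column and a merge just to line the counts up again, which is what the answers avoid.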
We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)), group by 'ID', specify i as a logical vector that is TRUE for the non-NA elements of 'Attendance', and assign (:=) the row id of 'Attendance' (rowidv) as the 'Visit' column. Then replace the NA values in 'Visit' with 0:
library(data.table)
# Number the non-NA rows within each ID, then set the remaining (NA) rows to 0
setDT(df)[!is.na(Attendance), Visit := rowidv(Attendance), by = ID][is.na(Visit), Visit := 0]
df
# ID Attendance Visit
#1: 1 1 1
#2: 1 1 2
#3: 2 NA 0
#4: 2 1 1
#5: 2 1 2
#6: 3 NA 0
#7: 3 1 1
#8: 3 NA 0
#9: 3 1 2
Or, if we are using ave, create an index for the non-NA rows and then use ave on those rows:
i1 <- !is.na(df$Attendance)
df$Visit <- 0
df$Visit[i1] <- with(df[i1, ], ave(Attendance, ID, FUN = cumsum))
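The question also mentions ignoring 0s, which the data.table and ave snippets above do not cover because the sample data contains only 1 and NA. A minimal base R sketch that treats both NA and 0 as "no visit" could look like this; the `ok` name and the 0 placed in the sample data are assumptions for illustration.

```r
# Sample data from the question, with one NA swapped for a 0 (an assumption)
df <- data.frame(ID = c(1, 1, 2, 2, 2, 3, 3, 3, 3),
                 Attendance = c(1, 1, NA, 1, 1, 0, 1, NA, 1))

# TRUE only where Attendance is recorded and non-zero
ok <- !is.na(df$Attendance) & df$Attendance != 0

# Running count of counted rows per ID, forced to 0 on the ignored rows
df$Visit <- ave(as.integer(ok), df$ID, FUN = cumsum) * ok
df$Visit
```

Because the count is taken over the logical flag rather than over 'Attendance' itself, it does not depend on the attended rows all holding the value 1.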
Another option, with dplyr:
library(dplyr)
df %>%
  group_by(ID) %>%
  # Count non-NA rows cumulatively within each ID; NA rows get 0
  mutate(Visit = if_else(is.na(Attendance), 0, cumsum(if_else(is.na(Attendance), 0, 1))))