I am trying to mark the maximum and minimum date of observations per ID using data.table. I thought this would be a straightforward exercise, but I do not understand why I do not get the result I want: the following data.table command flags only the overall min and max, not the min and max per ID, even though by=ID is specified.
Reproducible example (to mark maximum value by ID):
library(data.table)
date1 = as.POSIXct(Sys.Date())  # a Date needs no format string; the second argument of as.POSIXct is tz
date2 = date1 - 70000
date3 = date1 - 7000
date4 = date1 + 90000
DT = data.table(ID= rep(1:2,each = 3), Date=c(date1,date2,date3,date4,date1,date2))
# create position marker (2 means middle value for date - not min/not max)
DT[,Position:=2]
# change position marker to 3 if latest date
DT[Date==max(Date),Position:=3, by=ID]
Why does data.table not consider the "by=ID" part? What am I overlooking?
Versions: data.table 1.9.2, R 3.0.3
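I believe the condition in i (Date==max(Date)) is evaluated first, on the whole table, so it filters the data down to the row with the overall maximum date. The by statement is applied only afterwards, to the rows that remain. Perhaps what you want is: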
DT[, Position := ifelse(Date==max(Date),3,2), by= ID]
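Since the question asks for both the minimum and the maximum, here is a minimal sketch that extends the same idea to flag both in one pass. The fixed base timestamp and the code 1 for "earliest date" are my own assumptions for illustration (the question only defines 2 and 3); the dates are offset exactly as in the original example.

```r
library(data.table)

# Fixed timestamp so the result is reproducible (illustrative value)
base <- as.POSIXct("2014-03-01 00:00:00", tz = "UTC")
DT <- data.table(
  ID   = rep(1:2, each = 3),
  Date = c(base, base - 70000, base - 7000, base + 90000, base, base - 70000)
)

# The filter in i runs on the whole table, before any grouping,
# so only the row holding the overall maximum survives
overall <- DT[Date == max(Date)]

# Doing the comparison inside j lets max()/min() see one ID at a time.
# 3 = latest date, 1 = earliest date (assumed code), 2 = anything in between
DT[, Position := ifelse(Date == max(Date), 3,
                 ifelse(Date == min(Date), 1, 2)), by = ID]
```

Here overall holds a single row (ID 2), which is why the original command only changed one marker. After the grouped assignment, each ID has its own 3 and 1.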