I have a large data table of patient data. I want to delete rows where "id" is duplicated without losing the information in the "date" column.
id date
01 2004-07-01
02 NA
03 2013-11-15
03 2005-03-15
04 NA
05 2011-07-01
05 2012-07-01
I could do this in one of two ways:
create a column that overwrites the date column values by concatenating all the dates for that ID, i.e.:
id date_new
01 2004-07-01
02 NA
03 2013-11-15; 2005-03-15
04 NA
05 2011-07-01; 2012-07-01
or
create one new column for each additional date, i.e.:
id date_new date_new2
01 2004-07-01 NA
02 NA NA
03 2013-11-15 2005-03-15
04 NA NA
05 2011-07-01 2012-07-01
I have tried a few things, but they keep crashing my R session (I get the message "R Session Aborted. R encountered a fatal error. The session was terminated."):
setkey(DT, "id")
unique_DT <- subset(unique(DT))
and:
DT[!duplicated(DT[, "id", with = FALSE])]
However, besides crashing R, neither of these solutions does what I want with the dates.
Any ideas? I am new to data.table (and R generally), but I have the vague sense that I could solve this with := somehow.
Try this:
dt[,c(date_new=paste(date,collapse="; "),.SD),by=id]
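If the goal is exactly one row per id, the following is a minimal sketch of both layouts from the question using data.table. The sample table is rebuilt from the question's data, and the names `collapse_dates`, `collapsed`, `idx` and `wide` are made up for illustration. It assumes the missing dates are real `NA` values rather than the string "NA", and that the dates are stored as character.

```r
library(data.table)

# Sample data rebuilt from the question (dates kept as character, real NAs)
DT <- data.table(
  id   = c("01", "02", "03", "03", "04", "05", "05"),
  date = c("2004-07-01", NA, "2013-11-15", "2005-03-15", NA,
           "2011-07-01", "2012-07-01")
)

# Hypothetical helper: paste the non-missing dates, keep NA if there are none
collapse_dates <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0L) NA_character_ else paste(x, collapse = "; ")
}

# Layout 1: one row per id, all dates in a single string column
collapsed <- DT[, .(date_new = collapse_dates(date)), by = id]

# Layout 2: one row per id, one column per date
# idx numbers the rows within each id: date_new1, date_new2, ...
DT[, idx := paste0("date_new", rowid(id))]
wide <- dcast(DT, id ~ idx, value.var = "date")
```

Note that the wide layout names the first column `date_new1` rather than `date_new`; it can be renamed afterwards with `setnames()` if the exact names from the question matter.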
You can use the aggregate function, which should do what you want. I had some trouble with the dates being converted to factors, but wrapping the date vector in I() seems to keep it as character.
id=c(1,2,3,3,4,5,5)
date = c("2004-07-01","NA","2013-11-15","2005-03-15","NA",
"2011-07-01","2012-07-01")
data=as.data.frame(list(id=id,date=date))
data$date=as.character(data$date)
aggregate(list(date = I(data$date)),by=list(id = data$id),c)
id date
1 1 2004-07-01
2 2 NA
3 3 2013-11-15, 2005-03-15
4 4 NA
5 5 2011-07-01, 2012-07-01
Edit: this version still uses aggregate, but with paste instead of c. Setting the collapse argument to ";" should solve the separator problem.
newdata = aggregate(list(date = I(data$date)),
by=list(id = data$id),
function(x){paste(unique(x),collapse=";")})
newdata
id date
1 1 2004-07-01
2 2 NA
3 3 2013-11-15;2005-03-15
4 4 NA
5 5 2011-07-01;2012-07-01
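For the second layout in the question (one column per additional date), a base R sketch along the same lines might look like this. It is an illustration rather than part of the answer above: the counter column `n` and the result name `wide` are made up, and the missing dates are assumed to be real `NA` values instead of the string "NA".

```r
# Same sample data as above, but with real NAs and no factor conversion
data <- data.frame(
  id   = c(1, 2, 3, 3, 4, 5, 5),
  date = c("2004-07-01", NA, "2013-11-15", "2005-03-15", NA,
           "2011-07-01", "2012-07-01"),
  stringsAsFactors = FALSE
)

# Number the rows within each id: 1 for the first date, 2 for the second, ...
data$n <- ave(seq_along(data$id), data$id, FUN = seq_along)

# Spread the dates into one column per position: date_new1, date_new2, ...
wide <- reshape(data, idvar = "id", timevar = "n",
                direction = "wide", sep = "_new")
```

Ids with only one date get `NA` in `date_new2`, which matches the layout sketched in the question.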