Thanks to both of you for suggesting elegant solutions, Both solutions worked for me, but only the melt()
and back-join solution worked for a data.table with dates instead of numeric values.
EDIT
I implemented the proposed data.table solution through melting and joining back with the obtained results from Wimpel as his/her solution also works with dates stored in the date columns instead of the intial toy data that was all integer values.
I prefered the readability of Peace Wang's solution though using data.table assignments and IMO it is much clearer syntax than the melt()
solution, however (at least for me), it does not work with columns of type date.
Benchmarking both solutions for numeric/integer data, saw the melt()
solution as clear winner.
EDIT 2 To replicate the NA-values through conversion that I get if I implement the solution proposed by Peace Wang, see below for the corrected version of the input data.table.
I have sth like this: Image a list of patient records with measurements taken at various dates. The colnames of the date columns would be sth like "2020-12-15" / "2021-01-15" etc.
ID Date_1 Date_2 Date_3
1 1990-01-01 1990-02-01 1990-03-01
2 1990-01-01 1990-02-01 1990-03-01
3 1990-01-01 1982-02-01 1990-03-01
I have determined the mimum value of each row in my data.table dt
like this:
dt <- dt[, Min := do.call(pmin, c(.SD, list(na.rm = TRUE))), .SDcols = -(1)]
So far so good. Now I want to add a new col Min_Date
stating the corresponding col name (aka date in my example) of the found miniumum value per row to finally get sth lik this:
ID Date_1 Date_2 Date_3 Min Min_Date
1 1990-01-01 1990-02-01 1990-03-01 1990-01-01 Date_1
2 1990-01-01 1990-02-01 1990-03-01 1990-01-01 Date_1
3 1990-01-01 1982-02-01 1990-03-01 1982-02-01 Date_2
I tried variations of:
dt <- dt[, Min_Date := do.call(which.pmin, c(.SD, list(na.rm = TRUE))),
.SDcols = (2:4)]
and then trying to do sth with the col index. Don't really know my way around .I
yet, but I couldn't make it work when used in sth along these lines:
exclusions.dt[exclusions.dt[, .I[which.min(.SD)], ISSUE_ID, .SDcols = (2:6)]$V1]
Would appreciate any pointer!
The following code can work.
dt <- fread("
ID Date_1 Date_2 Date_3
1 100 200 300
2 100 500 300
3 200 150 400
")
dt[, `:=`(Min = do.call(pmin, c(.SD, list(na.rm = TRUE))),
date_min = colnames(.SD)[apply(.SD, 1, which.min)]
),
.SDcols = -1]
If you got the NAs
warning, maybe you can try to refresh dt
firstly.
Here is another data.table
approach
#sample data
library( data.table )
DT <- fread("ID Date_1 Date_2 Date_3
1 100 200 300
2 100 500 300
3 200 150 400 ")
#melt to long format and get rows with minimum value by ID
DT.min <- melt( DT, id.vars = "ID" )[ , .SD[ which.min(value) ], by = ID]
# ID variable value
# 1: 1 Date_1 100
# 2: 2 Date_1 100
# 3: 3 Date_2 150
#join back to DT
DT[ DT.min, `:=`( Min = i.value, Min_Date = i.variable ), on = .(ID)][]
# ID Date_1 Date_2 Date_3 Min Min_Date
# 1: 1 100 200 300 100 Date_1
# 2: 2 100 500 300 100 Date_1
# 3: 3 200 150 400 150 Date_2
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.