R - data.table: assign the name of the column holding the row minimum as the value of a new column

Thanks to both of you for suggesting elegant solutions. Both worked for me, but only the melt() and back-join solution worked for a data.table with dates instead of numeric values.

EDIT

I implemented the data.table solution proposed by Wimpel (melting, then joining the results back), because that solution also works with dates stored in the date columns, unlike the initial toy data, which was all integer values.

I preferred the readability of Peace Wang's solution, though: it uses data.table assignments and IMO has much clearer syntax than the melt() solution. However (at least for me), it does not work with columns of type Date.

Benchmarking both solutions on numeric/integer data showed the melt() solution as the clear winner.


EDIT 2: To replicate the NA values from conversion that I get when I implement the solution proposed by Peace Wang, see below for the corrected version of the input data.table.

I have something like this: imagine a list of patient records with measurements taken at various dates. The colnames of the date columns would be something like "2020-12-15" / "2021-01-15" etc.

 ID   Date_1       Date_2      Date_3   
  1   1990-01-01   1990-02-01  1990-03-01      
  2   1990-01-01   1990-02-01  1990-03-01       
  3   1990-01-01   1982-02-01  1990-03-01 

I have determined the minimum value of each row in my data.table dt like this:

dt <- dt[, Min := do.call(pmin, c(.SD, list(na.rm = TRUE))), .SDcols = -(1)]

So far so good. Now I want to add a new column Min_Date stating the corresponding column name (i.e. the date in my example) of the minimum value found per row, to finally get something like this:

  ID   Date_1       Date_2      Date_3        Min        Min_Date
  1    1990-01-01   1990-02-01  1990-03-01   1990-01-01  Date_1
  2    1990-01-01   1990-02-01  1990-03-01   1990-01-01  Date_1
  3    1990-01-01   1982-02-01  1990-03-01   1982-02-01  Date_2

I tried variations of:

dt <- dt[, Min_Date := do.call(which.pmin, c(.SD, list(na.rm = TRUE))),
                           .SDcols = (2:4)]

and then tried to do something with the column index. I don't really know my way around .I yet, and I couldn't make it work when used in something along these lines:

exclusions.dt[exclusions.dt[, .I[which.min(.SD)], ISSUE_ID, .SDcols = (2:6)]$V1]

Would appreciate any pointers!

The following code works.

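library(data.table)

# toy input: one ID column followed by the value columns
dt <- fread("
  ID   Date_1   Date_2   Date_3
  1    100      200      300
  2    100      500      300
  3    200      150      400
")

# Min: row-wise minimum; date_min: name of the column holding that minimum
dt[, `:=`(Min = do.call(pmin, c(.SD, list(na.rm = TRUE))),
          date_min = colnames(.SD)[apply(.SD, 1, which.min)]
         ),
   .SDcols = -1]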

If you get a warning about NAs, try re-creating dt first.
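For columns of type Date, a variant along the following lines might work. It is only a sketch, and it rests on an assumption: that the NA warning arises because apply() coerces .SD to a character matrix when the columns are Dates, so which.min() has nothing numeric to compare. The helper name date_cols is made up for illustration, and every row is assumed to contain at least one non-missing date.

```r
library(data.table)

# toy data from the question, stored as real Date columns
dt <- data.table(
  ID     = 1:3,
  Date_1 = as.Date(c("1990-01-01", "1990-01-01", "1990-01-01")),
  Date_2 = as.Date(c("1990-02-01", "1990-02-01", "1982-02-01")),
  Date_3 = as.Date(c("1990-03-01", "1990-03-01", "1990-03-01"))
)

# hypothetical helper: the columns to compare row-wise
date_cols <- c("Date_1", "Date_2", "Date_3")

# row-wise minimum; pmin() keeps the Date class
dt[, Min := do.call(pmin, c(.SD, list(na.rm = TRUE))), .SDcols = date_cols]

# name of the column holding the minimum
dt[, Min_Date := {
  # convert each Date column to its underlying number and bind into a numeric matrix
  m <- do.call(cbind, lapply(.SD, as.numeric))
  # which.min() returns the first minimum per row, so ties go to the leftmost column
  # (assumes no row is entirely NA)
  names(.SD)[apply(m, 1, which.min)]
}, .SDcols = date_cols]

print(dt)
```

Min_Date is computed in a separate step with an explicit .SDcols so that the freshly added Min column is never part of the comparison.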

Here is another data.table approach:

#sample data
library( data.table )
DT <- fread("ID   Date_1   Date_2   Date_3   
  1    100      200      300      
  2    100      500      300      
  3    200      150      400    ")

#melt to long format and get rows with minimum value by ID
DT.min <- melt( DT, id.vars = "ID" )[ , .SD[ which.min(value) ], by = ID]
#    ID variable value
# 1:  1   Date_1   100
# 2:  2   Date_1   100
# 3:  3   Date_2   150

#join back to DT
DT[ DT.min, `:=`( Min = i.value, Min_Date = i.variable ), on = .(ID)][]
#    ID Date_1 Date_2 Date_3 Min Min_Date
# 1:  1    100    200    300 100   Date_1
# 2:  2    100    500    300 100   Date_1
# 3:  3    200    150    400 150   Date_2
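Since the question is really about date columns, here is a sketch of the same melt-and-join idea on the Date version of the toy data. The intermediate names long and DT.min are only illustrative, and variable.factor = FALSE is my own choice so that Min_Date comes back as character rather than factor.

```r
library(data.table)

# toy data from the question, stored as real Date columns
DT <- data.table(
  ID     = 1:3,
  Date_1 = as.Date(c("1990-01-01", "1990-01-01", "1990-01-01")),
  Date_2 = as.Date(c("1990-02-01", "1990-02-01", "1982-02-01")),
  Date_3 = as.Date(c("1990-03-01", "1990-03-01", "1990-03-01"))
)

# melt to long format: one row per (ID, column name, date)
long <- melt(DT, id.vars = "ID",
             variable.name = "Min_Date", value.name = "Min",
             variable.factor = FALSE)

# keep the row with the earliest date per ID (first one wins on ties)
DT.min <- long[, .SD[which.min(Min)], by = ID]

# join back to DT by reference
DT[DT.min, `:=`(Min = i.Min, Min_Date = i.Min_Date), on = .(ID)]

print(DT)
```

Rows in which every date is NA get no match in DT.min, so they would end up with NA in both new columns after the join.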
