Maximum value of one data.table column based on other columns
I have an R data.table:
library(data.table)
DT = data.table(x=rep(c("b","a",NA_character_),each=3), y=rep(c('A', NA_character_, 'C'), each=3), z=c(NA_character_), v=1:9)
DT
# x y z v
#1: b A NA 1
#2: b A NA 2
#3: b A NA 3
#4: a NA NA 4
#5: a NA NA 5
#6: a NA NA 6
#7: NA C NA 7
#8: NA C NA 8
#9: NA C NA 9
For each column, I want the maximum of column v taken over the rows where that column is not NA. I am currently using:
sapply(DT, function(x) { ifelse(all(is.na(x)), NA_integer_, max(DT[['v']][!is.na(x)])) })
#x y z v
#6 9 NA 9
Is there a simpler way to achieve this?
Here is one way. It returns -Inf (with a warning) when all values of a column are NA; you can replace that with NA afterwards if you prefer:
DT[, lapply(.SD, function(x) max(v[!is.na(x)]))]
# x y z v
# 1: 6 9 -Inf 9
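The "replace it later" step mentioned above could look roughly like the following. This is only a sketch: the name `res` and the use of `suppressWarnings()` and `replace()` are my own choices, not part of the original answer, and `suppressWarnings()` hides every warning raised by the wrapped call, not just the one from `max()`.

```r
library(data.table)

DT <- data.table(
  x = rep(c("b", "a", NA_character_), each = 3),
  y = rep(c("A", NA_character_, "C"), each = 3),
  z = NA_character_,
  v = 1:9
)

# max() of an empty vector warns and returns -Inf; that warning is expected here
res <- suppressWarnings(
  DT[, lapply(.SD, function(x) max(v[!is.na(x)]))]
)

# Swap any infinite result for NA, column by column
res <- res[, lapply(.SD, function(col) replace(col, is.infinite(col), NA))]
res
```

With the sample data, only column z is affected, since it is the only column that is entirely NA.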
As suggested by @DavidArenburg, to handle the case where all values of a column are NA cleanly (no warning, and NA returned directly), you can do:
DT[, lapply(.SD, function(x) {
  temp <- v[!is.na(x)]
  if(!length(temp)) NA else max(temp)
})]
# x y z v
#1: 6 9 NA 9
We can use summarise_each from dplyr:
library(dplyr)
DT %>%
  summarise_each(funs(max(v[!is.na(.)])))
# x y z v
#1: 6 9 -Inf 9
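In recent dplyr releases, summarise_each() and funs() are deprecated, and across() is the usual replacement. A possible equivalent is sketched below. It assumes dplyr 1.0.0 or later, uses a plain data frame `df` (a name I introduce here) holding the same data as DT, and adds the empty-vector check so that an all-NA column yields NA instead of -Inf. It relies on v being the last column, so the other columns are summarised before v itself is.

```r
library(dplyr)

# Same data as DT in the question, stored as a plain data frame
df <- data.frame(
  x = rep(c("b", "a", NA_character_), each = 3),
  y = rep(c("A", NA_character_, "C"), each = 3),
  z = NA_character_,
  v = 1:9
)

res <- df %>%
  summarise(across(everything(), ~ {
    vals <- v[!is.na(.x)]                             # v on rows where this column is not NA
    if (length(vals) > 0) max(vals) else NA_integer_  # avoid max() on an empty vector
  }))
res
```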