Select first non-NA value using R
df<-data.frame(ID = c(1,1,1,2,3,3,3),
test = c(NA, 5.5, 6.4, NA, 7.3, NA, 10.9))
I want to create a variable called "value", which is the first non-NA value of test for each individual ID. For ID 2, which has only NA, the value is NA.
The expected output is:
df<-data.frame(ID = c(1,1,1,2,3,3,3),
test = c(NA, 5.5, 6.4, NA, 7.3, NA, 10.9),
value = c(5.5, 5.5, 5.5, NA, 7.3, 7.3, 7.3))
We can use first on the non-NA elements after grouping:
library(dplyr)
df <- df %>%
  group_by(ID) %>%
  mutate(value = first(test[complete.cases(test)]))
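To see what is being computed per group, here is a minimal base R sketch of the same idea. The helper name first_non_na is hypothetical (it is not part of dplyr or of the answer above), and the sketch assumes that indexing an empty vector with [1] returning NA is the behaviour wanted for all-NA groups.

```r
# Same data as in the question.
df <- data.frame(ID = c(1, 1, 1, 2, 3, 3, 3),
                 test = c(NA, 5.5, 6.4, NA, 7.3, NA, 10.9))

# Hypothetical helper: drop the NAs, then take the first remaining element.
# If nothing remains, indexing position 1 of an empty vector gives NA.
first_non_na <- function(x) x[!is.na(x)][1]

# One value per ID; the result is named by ID ("1", "2", "3").
per_id <- tapply(df$test, df$ID, first_non_na)

# Spread the per-ID value back onto every row of that ID.
df$value <- as.vector(per_id[as.character(df$ID)])
df
```

The intermediate per_id makes it easy to inspect the single value chosen for each ID before it is repeated across the rows.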
You can use ave to group by ID, and which.max to select, with [, the first non-NA value.
df$value <- ave(df$test, df$ID, FUN=function(x) x[which.max(!is.na(x))])
df
# ID test value
#1 1 NA 5.5
#2 1 5.5 5.5
#3 1 6.4 5.5
#4 2 NA NA
#5 3 7.3 7.3
#6 3 NA 7.3
#7 3 10.9 7.3
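It may not be obvious why this also returns NA for an ID with no non-NA values. A small sketch with two made-up vectors (x_mixed and x_all_na are illustrative names only) shows the behaviour of which.max on a logical vector:

```r
# Base R only; no packages required.
x_mixed  <- c(NA, 5.5, 6.4)   # first non-NA value sits at position 2
x_all_na <- c(NA_real_)       # a group with no non-NA value at all

# which.max returns the index of the first maximum.
# For a logical vector that is the first TRUE, if there is one.
pos_mixed <- which.max(!is.na(x_mixed))

# When every element is FALSE, the maximum is FALSE and its first
# occurrence is position 1, so the NA at position 1 gets selected.
pos_all_na <- which.max(!is.na(x_all_na))

x_mixed[pos_mixed]
x_all_na[pos_all_na]
```

So the all-NA case falls back to the first element of the group, which is itself NA.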
Here is a data.table option using first + na.omit:
> library(data.table)
> setDT(df)[, value := first(na.omit(test)), by = ID][]
ID test value
1: 1 NA 5.5
2: 1 5.5 5.5
3: 1 6.4 5.5
4: 2 NA NA
5: 3 7.3 7.3
6: 3 NA 7.3
7: 3 10.9 7.3