How do I find the 1st non-NA value in a row?
Suppose I have the following:
df <- data.frame(dt=c(as.Date('2019-02-02'), as.Date('2019-02-04'), as.Date('2019-02-05'), as.Date('2020-03-04')), v1=c(1,2,NA,NA), v2=c(NA,3,4,NA), v3=c(NA,NA,3,5), v4=c(2, 4, 6, NA))
> read.zoo(df)
v1 v2 v3 v4
2019-02-02 1 NA NA 2
2019-02-04 2 3 NA 4
2019-02-05 NA 4 3 6
2020-03-04 NA NA 5 NA
For each row, and for each column that has a value, I would like to find the first non-NA value that occurs in a later column.
So for example, for '2019-02-02': v1 has a value of 1, v2 has NA so we skip it, v3 has NA so we skip it, but v4 is NOT NA, so I would like to return its value, 2, for row 1, col 1. Looking at the next column, v2, in the same row it is NA, so we skip it since it is not a number. v3 is also NA, so we skip it. v4 is NOT NA, but there are no columns following it, so we return NA. Therefore our 1st row will be:
c1 c2 c3 c4
2 NA NA NA
Going through all the rows in this example, I am expecting the output to be:
c1 c2 c3 c4
1 2019-02-02 2 NA NA NA
2 2019-02-04 3 4 NA NA
3 2019-02-05 NA 3 6 NA
4 2020-03-04 NA NA NA NA
It looks like all I need to do is shift the non-NA values in each row to the left, but I can't seem to figure out how to do it...
NOTE: I would prefer a base-R solution using zoo
Here's a solution applying a custom function:
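res = t(apply(df[-1], 1, function(x) {
  # positions of the non-NA entries in this row
  val = which(!is.na(x))
  # each non-NA entry (except the last) takes the value of the next non-NA entry
  x[val[-length(val)]] = x[val[-1]]
  # the last non-NA entry has nothing after it, so it becomes NA
  x[val[length(val)]] = NA
  return(x)
}))
# re-attach the date column
cbind(df[1], res)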
# dt v1 v2 v3 v4
# 1 2019-02-02 2 NA NA NA
# 2 2019-02-04 3 4 NA NA
# 3 2019-02-05 NA 3 6 NA
# 4 2020-03-04 NA NA NA NA
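To see why this works, it may help to trace the function on a single row. The sketch below is only an illustration: the name `shift_to_next_value` is made up here, and its body is the anonymous function from the answer above, applied to the '2019-02-05' row.

```r
# Hypothetical helper name; the body is the anonymous function from the answer.
shift_to_next_value <- function(x) {
  val <- which(!is.na(x))             # positions of non-NA entries
  x[val[-length(val)]] <- x[val[-1]]  # copy each next non-NA value backwards
  x[val[length(val)]] <- NA           # last non-NA entry has no successor
  x
}

# The '2019-02-05' row from the question
row3 <- c(v1 = NA, v2 = 4, v3 = 3, v4 = 6)

which(!is.na(row3))                   # non-NA positions: 2, 3, 4
shifted <- shift_to_next_value(row3)
print(shifted)                        # v1 = NA, v2 = 3, v3 = 6, v4 = NA

# A row with no values at all is returned unchanged
all_na <- shift_to_next_value(c(v1 = NA_real_, v2 = NA_real_))
```

Positions 2 and 3 receive the values from positions 3 and 4, and position 4 is then blanked, which matches the third row of the expected output.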
I'm not sure how to do it with base R. But in tidyverse:
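library(dplyr)
library(tidyr)

# Note: rows with fewer than two non-NA values (here 2020-03-04) drop out
# of the result, and only the columns that receive a value are created.
df %>%
  gather(key, value, -dt) %>%
  arrange(dt, key) %>%
  mutate(key2 = as.numeric(substr(key, 2, 2))) %>%
  filter(!is.na(value)) %>%
  group_by(dt) %>%
  mutate(ind = lag(key2, default = NA), index = paste0("c", ind)) %>%
  ungroup() %>%
  filter(!is.na(ind)) %>%
  select(dt, index, value) %>%
  spread(index, value)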
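Since the question asks for a result that stays in zoo, here is a minimal sketch of one way to adapt the base-R answer, assuming the zoo package is installed. The names `shift_row` and `shifted_z` are introduced here for illustration only; `read.zoo`, `coredata`, `index` and `zoo` are the package's own functions.

```r
library(zoo)

df <- data.frame(
  dt = as.Date(c("2019-02-02", "2019-02-04", "2019-02-05", "2020-03-04")),
  v1 = c(1, 2, NA, NA), v2 = c(NA, 3, 4, NA),
  v3 = c(NA, NA, 3, 5), v4 = c(2, 4, 6, NA)
)
z <- read.zoo(df)

# Hypothetical helper: same logic as the apply() answer, for one row
shift_row <- function(x) {
  val <- which(!is.na(x))
  x[val[-length(val)]] <- x[val[-1]]
  x[val[length(val)]] <- NA
  x
}

# Work on the underlying matrix, then rebuild a zoo object on the same dates
res <- t(apply(coredata(z), 1, shift_row))
shifted_z <- zoo(res, order.by = index(z))
print(shifted_z)
```

The column names stay v1 to v4 here; rename them with `colnames(shifted_z) <- paste0("c", 1:4)` if the c1 to c4 labels from the question are wanted.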