How do I find the first non-NA value in a row?

Suppose I have the following:

library(zoo)

df <- data.frame(dt = as.Date(c('2019-02-02', '2019-02-04', '2019-02-05', '2020-03-04')),
                 v1 = c(1, 2, NA, NA), v2 = c(NA, 3, 4, NA),
                 v3 = c(NA, NA, 3, 5), v4 = c(2, 4, 6, NA))
> read.zoo(df)
           v1 v2 v3 v4
2019-02-02  1 NA NA  2
2019-02-04  2  3 NA  4
2019-02-05 NA  4  3  6
2020-03-04 NA NA  5 NA

For each row, and for each column that holds a value, I would like to find the first non-NA value that occurs in a later column.

So, for example, for '2019-02-02':

  • v1 holds a value (1). Looking to its right, v2 is NA so we skip it, v3 is NA so we skip it, but v4 is not NA, so I would like to return its value, 2, for row 1, column 1.
  • Moving on to the next column, v2: in this row it is NA, so we skip it since it is not a number.
  • v3 is also NA, so we skip it.
  • v4 is not NA, but there are no columns following it, so we return NA.

Therefore our first row will be:

c1 c2 c3 c4
2  NA NA NA
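
The walkthrough above can be sketched for a single row in base R. This is only an illustration of the rule as described; next_non_na is a hypothetical helper name, not something from zoo or from the original post.

```r
# Hypothetical helper: for every non-NA cell, return the first non-NA
# value found to its right; NA cells and cells with nothing after them
# yield NA.
next_non_na <- function(x) {
  n <- length(x)
  out <- rep(NA_real_, n)
  for (i in seq_len(n)) {
    if (is.na(x[i])) next                       # NA cells are skipped
    later <- which(!is.na(x[-seq_len(i)]))      # non-NA offsets to the right of i
    if (length(later) > 0) out[i] <- x[i + later[1]]
  }
  out
}

next_non_na(c(1, NA, NA, 2))   # row for 2019-02-02: 2 NA NA NA
next_non_na(c(NA, 4, 3, 6))    # row for 2019-02-05: NA 3 6 NA
```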

Going through all the rows in this example, I am expecting the output to be:

             c1 c2 c3 c4
1 2019-02-02  2 NA NA NA
2 2019-02-04  3  4 NA NA
3 2019-02-05 NA  3  6 NA
4 2020-03-04 NA NA NA NA

It looks like all I need to do is shift the column values in each row to the left, but I can't seem to figure out how to do it...

NOTE: I would prefer a base-R solution using zoo.

Here's a solution applying a custom function:

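res = t(apply(df[-1], 1, function(x) {
     val = which(!is.na(x))              # positions of the non-NA values in this row
     x[val[-length(val)]] = x[val[-1]]   # each non-NA cell takes the next non-NA value
     x[val[length(val)]] = NA            # the last non-NA cell has nothing after it
     return(x)
     }
  ))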

cbind(df[1], res)
#           dt v1 v2 v3 v4
# 1 2019-02-02  2 NA NA NA
# 2 2019-02-04  3  4 NA NA
# 3 2019-02-05 NA  3  6 NA
# 4 2020-03-04 NA NA NA NA
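
Since the question mentions zoo, one possible variant uses zoo's na.locf with fromLast = TRUE ("next observation carried backward"). This is a sketch under the assumption that the zoo package is installed; next_value is a hypothetical helper name, and the c1..c4 column names simply follow the expected output in the question.

```r
library(zoo)

df <- data.frame(
  dt = as.Date(c("2019-02-02", "2019-02-04", "2019-02-05", "2020-03-04")),
  v1 = c(1, 2, NA, NA), v2 = c(NA, 3, 4, NA),
  v3 = c(NA, NA, 3, 5), v4 = c(2, 4, 6, NA)
)

# Hypothetical helper, not part of zoo.
next_value <- function(x) {
  shifted <- c(x[-1], NA)                 # the value one column to the right
  filled <- na.locf(shifted, fromLast = TRUE, na.rm = FALSE)  # fill gaps from the right
  filled[is.na(x)] <- NA                  # cells that were NA stay NA
  filled
}

res <- t(apply(df[-1], 1, next_value))
colnames(res) <- paste0("c", seq_len(ncol(res)))

# Wrap the result as a zoo object indexed by date.
z <- zoo(res, order.by = df$dt)
z
```

Shifting first and filling afterwards means each cell sees only values strictly to its right, which is what distinguishes this from a plain na.locf on the row.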

I'm not sure how to do it with base R, but in tidyverse:

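library(dplyr)
library(tidyr)

df %>%
  gather(key, value, -dt) %>%                          # long format: one row per (dt, column)
  arrange(dt, key) %>%
  mutate(key2 = as.numeric(substr(key, 2, 2))) %>%     # column number taken from "v1".."v4"
  filter(!is.na(value)) %>%
  group_by(dt) %>%
  mutate(ind = lag(key2, default = NA),                # previous non-NA column in the same row
         index = paste0("c", ind)) %>%
  ungroup() %>%
  filter(!is.na(ind)) %>%
  select(dt, index, value) %>%
  spread(index, value)
# Note: dates with fewer than two non-NA values (here 2020-03-04) drop out
# of the result, and no c4 column is created.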
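
Because the pipeline above filters out NA values before reshaping, a date with fewer than two non-NA values (2020-03-04 here) does not appear in its result, and no column is produced for the last position. A possible variant that keeps every row and column is sketched below. It assumes dplyr and tidyr (version 1.0 or later, for pivot_longer and pivot_wider) are installed, keeps the original v1..v4 column names, and the nxt column name is made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  dt = as.Date(c("2019-02-02", "2019-02-04", "2019-02-05", "2020-03-04")),
  v1 = c(1, 2, NA, NA), v2 = c(NA, 3, 4, NA),
  v3 = c(NA, NA, 3, 5), v4 = c(2, 4, 6, NA)
)

res <- df %>%
  pivot_longer(-dt, names_to = "key", values_to = "value") %>%
  group_by(dt) %>%
  mutate(nxt = lead(value)) %>%                  # value one column to the right
  fill(nxt, .direction = "up") %>%               # carry the next observation backward
  mutate(nxt = if_else(is.na(value), NA_real_, nxt)) %>%  # NA cells stay NA
  ungroup() %>%
  select(dt, key, nxt) %>%
  pivot_wider(names_from = key, values_from = nxt)

res
```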
