Replace NA with original value when using lag() function in R
I am using dplyr's lag() function and I am trying to figure out how to get the original value, instead of NA, as the default for the blank lagged cells.
Here is my code:
df <- data_frame(d1 = runif(10, 1, 5),
                 d2 = runif(10, 2, 6),
                 d3 = runif(10, 3, 7),
                 d4 = runif(10, 4, 8),
                 d5 = runif(10, 5, 9),
                 d6 = runif(10, 6, 10),
                 d7 = runif(10, 7, 11),
                 d8 = runif(10, 8, 12)) %>% rownames_to_column()

df %>%
  gather(key = "col", value = "val", -"rowname") %>%
  group_by(col) %>%
  mutate(new_col = ifelse(val >= lag(val, 2) + lag(val, 2)*0.4, NA, val))
It doesn't work if I run this code (which, honestly, I rather expected):
df %>%
  gather(key = "col", value = "val", -"rowname") %>%
  group_by(col) %>%
  mutate(new_col = if_else(val >= lag(val, 2, default = val) + lag(val, 2, default = val)*0.4, NA, val))
What am I missing to arrive at this result?
rowname col val new_col
<chr> <chr> <dbl> <dbl>
1 1 d1 1.31 **1.31**
2 2 d1 4.10 **4.10**
3 3 d1 3.81 NA
4 4 d1 4.52 4.52
5 5 d1 3.89 3.89
6 6 d1 1.01 1.01
7 7 d1 2.68 2.68
8 8 d1 2.81 NA
9 9 d1 1.18 1.18
10 10 d1 1.19 1.19
# ... with 70 more rows
Appreciate any help!
You could replace the first n values, which are NA because of the lag, with the original values.
library(dplyr)

n <- 2

df %>%
  tidyr::pivot_longer(cols = -rowname, values_to = 'val', names_to = 'col') %>%
  group_by(col) %>%
  mutate(new_col = if_else(val >= lag(val, n) + lag(val, n)*0.4, NA_real_, val),
         new_col = replace(new_col, 1:n, val[1:n]))
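To see what the replace() step does in isolation, here is a minimal sketch on a single vector. The values in val are made up for illustration and are not taken from the question's data.

```r
library(dplyr)

# Hypothetical values, chosen only for illustration
val <- c(1, 2, 3, 2, 5)
n <- 2

# lag(val, n) is NA in the first n positions, so the comparison
# (and therefore if_else) returns NA there
flagged <- if_else(val >= lag(val, n) + lag(val, n) * 0.4, NA_real_, val)

# Put the original values back into the first n positions only;
# NAs produced by the threshold test further down are left alone
new_col <- replace(flagged, seq_len(n), val[seq_len(n)])

print(new_col)
```

Only the leading n positions are restored, so rows that were set to NA on purpose by the comparison stay NA.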
coalesce() is made for this kind of problem.
library(tidyverse)
set.seed(42)
df <- data_frame(d1 = runif(10, 1, 5),
d2 = runif(10, 2, 6),
d3 = runif(10, 3, 7),
d4 = runif(10, 4, 8),
d5 = runif(10, 5, 9),
d6 = runif(10, 6, 10),
d7 = runif(10, 7, 11),
d8 = runif(10, 8, 12)) %>% rownames_to_column()
#> Warning: `data_frame()` is deprecated as of tibble 1.1.0.
#> Please use `tibble()` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_warnings()` to see where this warning was generated.
df %>%
  gather(key = "col", value = "val", -"rowname") %>%
  group_by(col) %>%
  mutate(new_col = ifelse(val >= lag(val, 2) + lag(val, 2)*0.4, NA, val),
         new_col_no_na = coalesce(new_col,val))
#> # A tibble: 80 x 5
#> # Groups: col [8]
#> rowname col val new_col new_col_no_na
#> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 1 d1 4.66 NA 4.66
#> 2 2 d1 4.75 NA 4.75
#> 3 3 d1 2.14 2.14 2.14
#> 4 4 d1 4.32 4.32 4.32
#> 5 5 d1 3.57 NA 3.57
#> 6 6 d1 3.08 3.08 3.08
#> 7 7 d1 3.95 3.95 3.95
#> 8 8 d1 1.54 1.54 1.54
#> 9 9 d1 3.63 3.63 3.63
#> 10 10 d1 3.82 NA 3.82
#> # ... with 70 more rows
Created on 2020-06-07 by the reprex package (v0.3.0)
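Note that coalesce(new_col, val) fills every NA in new_col, including the ones the threshold test set on purpose, so new_col_no_na ends up equal to val in every row of the output. If the deliberate NAs should be kept, one possible variation is to apply coalesce() to the lagged value instead. This is only a sketch: the vector is made up for illustration, and it assumes val is strictly positive (as in the question's runif() ranges), so that a value compared against itself never passes the 40% test.

```r
library(dplyr)

# Hypothetical values, chosen only for illustration (all positive)
val <- c(1, 2, 3, 2, 5)

# Where the lag is missing (first two positions), fall back to val itself
prev <- coalesce(lag(val, 2), val)

# For positive val, val >= 1.4 * val is FALSE, so the first two rows keep val;
# later rows are set to NA only when they exceed the lagged value by 40%
new_col <- if_else(val >= prev + prev * 0.4, NA_real_, val)

print(new_col)
```

Inside the grouped mutate() from the question, the same idea would mean computing the coalesced lag first and using it in the comparison.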