
R: Fill all NAs in a set of columns with the values above

I have a data frame and vectors of names of "a" columns and "b" columns:

x <- data.frame(a1 = c(1, NA, rep(1, 3), NA),
                a2 = c(2, NA, rep(2, 3), NA),
                a3 = c(3, NA, rep(3, 3), NA),
                b1 = c(10, 10, NA, rep(10, 2), NA),
                b2 = c(20, 20, NA, rep(20, 2), NA),
                b3 = c(30, 30, NA, rep(30, 2), NA),
                c = c(2, 3, 5, NA, 9, 8))
avars <- names(x)[1:3]
bvars <- names(x)[4:6]

Is there an elegant way, using the dynamic variable name vectors 'avars' and 'bvars', to fill all the NAs in the avars and bvars columns with the values above them?

I understand I could use a loop like this:

library(tidyr)
for(i in c(avars, bvars)) x <- x %>% fill(!!i)
x

But maybe there is a more elegant solution? Thank you!

Use na.locf from the zoo package, applied to just the columns of interest:

> library(zoo)
> na.locf(x[c(avars, bvars)])
  a1 a2 a3 b1 b2 b3
1  1  2  3 10 20 30
2  1  2  3 10 20 30
3  1  2  3 10 20 30
4  1  2  3 10 20 30
5  1  2  3 10 20 30
6  1  2  3 10 20 30
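
Since the question asks to fill only the avars and bvars columns and leave c alone, one possible variation is to run na.locf on just those columns and assign the result back into the data frame. This is a sketch, not part of the original answer: x_filled and vars are illustrative names, and na.rm = FALSE is set on the assumption that a leading NA should be kept rather than dropped.

```r
library(zoo)

x <- data.frame(a1 = c(1, NA, rep(1, 3), NA),
                a2 = c(2, NA, rep(2, 3), NA),
                a3 = c(3, NA, rep(3, 3), NA),
                b1 = c(10, 10, NA, rep(10, 2), NA),
                b2 = c(20, 20, NA, rep(20, 2), NA),
                b3 = c(30, 30, NA, rep(30, 2), NA),
                c = c(2, 3, 5, NA, 9, 8))
avars <- names(x)[1:3]
bvars <- names(x)[4:6]

# Illustrative names: vars holds the columns to fill, x_filled the result
vars <- c(avars, bvars)
x_filled <- x

# Carry the last observation forward column by column;
# na.rm = FALSE keeps each column at its original length
x_filled[vars] <- lapply(x[vars], na.locf, na.rm = FALSE)

x_filled
```

Column c is not in vars, so its NA in row 4 is left as it is.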

You can use tidyr::fill() along with grep to fill down only the avars and bvars columns:

library(tidyverse)

x %>% fill(grep("^[ab]", names(.)))

  a1 a2 a3 b1 b2 b3  c
1  1  2  3 10 20 30  2
2  1  2  3 10 20 30  3
3  1  2  3 10 20 30  5
4  1  2  3 10 20 30 NA
5  1  2  3 10 20 30  9
6  1  2  3 10 20 30  8

The regular expression ^[ab] matches only column names that start with either a or b.
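
To see which columns a pattern picks up before passing it to fill(), it can be tried against the names with base R's grepl. The snippet below is a small check only; the vector cols mirrors the column names of x, and "cab" is a hypothetical extra name added to show what the ^ anchor rules out.

```r
# Column names from the example, plus a hypothetical "cab" column
cols <- c("a1", "a2", "a3", "b1", "b2", "b3", "c", "cab")

# Anchored: the name must start with a or b
anchored <- cols[grepl("^[ab]", cols)]
anchored
# "a1" "a2" "a3" "b1" "b2" "b3"

# Unanchored: an a or b anywhere in the name is enough
unanchored <- cols[grepl("[ab]", cols)]
unanchored
# "a1" "a2" "a3" "b1" "b2" "b3" "cab"
```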

Or, per your comment, using avars and bvars:

x %>% fill(grep(paste0(c(avars,bvars), collapse = "|"), names(x)))

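This is still tidier than the for loop, because all the columns are filled in a single fill() call. Note that the combined pattern is not anchored, so it would also match any other column whose name merely contains one of these names (for example, a column called a10 would match a1).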
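
Another option, which avoids building a regular expression at all, is to hand the name vectors to fill() through the tidyselect helper all_of(). This is a sketch offered as an alternative rather than what the answer above proposes; it assumes a tidyr version that re-exports all_of() (1.0.0 or later, to my knowledge), and filled is an illustrative name.

```r
library(tidyr)

x <- data.frame(a1 = c(1, NA, rep(1, 3), NA),
                a2 = c(2, NA, rep(2, 3), NA),
                a3 = c(3, NA, rep(3, 3), NA),
                b1 = c(10, 10, NA, rep(10, 2), NA),
                b2 = c(20, 20, NA, rep(20, 2), NA),
                b3 = c(30, 30, NA, rep(30, 2), NA),
                c = c(2, 3, 5, NA, 9, 8))
avars <- names(x)[1:3]
bvars <- names(x)[4:6]

# all_of() selects exactly the named columns: no pattern matching,
# and it raises an error if any of the names is missing from x
filled <- fill(x, all_of(c(avars, bvars)))

filled
```

Because the selection is by exact name, column c is left untouched and a similarly named column such as a10 would not be picked up by accident.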
