tidyr::fill() with sequential integers rather than a repeated value

After grouping by id I wish to replace the NAs in dist_from_top with sequential values such that dist_from_top becomes c(5,4,3,2,1,5,4,3,2). I am using the one dist_from_top value within each id grouping as a seed of sorts to fill in the values of dist_from_top that are above and below.

tidyr::fill() can fill in the same value throughout the grouping, but I can't think of a way to make it increase and decrease by 1 as it fills. Any help is greatly appreciated.

library(dplyr)
library(tidyr)

df <- 
  tribble(
    ~id, ~mgr, ~dist_from_top,
    "A", "B",  NA,
    "A", "C",  NA,
    "A", "D",  3,
    "A", "E",  NA,
    "A", "F",  NA,
    "B", "C",  NA,
    "B", "D",  4,
    "B", "E",  NA,
    "B", "F",  NA
  )

An "almost there" solution using fill()

df %>% 
  group_by(id) %>% 
  fill(dist_from_top, .direction = "up") %>%
  fill(dist_from_top, .direction = "down")
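For reference, here is a small sketch of what that attempt yields on the sample data, which shows why it is only "almost there". The name `filled` is introduced here for illustration and is not part of the original post.

```r
library(dplyr)
library(tidyr)

df <- tibble::tribble(
  ~id, ~mgr, ~dist_from_top,
  "A", "B",  NA,
  "A", "C",  NA,
  "A", "D",  3,
  "A", "E",  NA,
  "A", "F",  NA,
  "B", "C",  NA,
  "B", "D",  4,
  "B", "E",  NA,
  "B", "F",  NA
)

# `filled` is a hypothetical name for the intermediate result
filled <- df %>%
  group_by(id) %>%
  fill(dist_from_top, .direction = "up") %>%
  fill(dist_from_top, .direction = "down") %>%
  ungroup()

# Every row in a group gets the seed value itself: 3 for id "A", 4 for id "B"
filled$dist_from_top
```

The seed is copied unchanged in both directions, so the column ends up constant within each id instead of counting down.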
  1. Create a column that counts downwards in each group, from any starting point:

     ... %>% mutate(rn = -row_number()) 
  2. Add the offset that is defined by the difference between dist_from_top and rn for the one row where dist_from_top is not NA:

     ... %>% mutate(dist_from_top = rn + max(dist_from_top - rn, na.rm = TRUE)) 

    max() is used here only to pick out that single offset; this assumes each group has exactly one value that isn't NA.
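The one-seed-per-group assumption can be checked before running the pipeline. This is a minimal sketch, not part of the original answer; `seed_counts` and `n_seeds` are names made up for this illustration.

```r
library(dplyr)

df <- tibble::tribble(
  ~id, ~mgr, ~dist_from_top,
  "A", "B",  NA,
  "A", "C",  NA,
  "A", "D",  3,
  "A", "E",  NA,
  "A", "F",  NA,
  "B", "C",  NA,
  "B", "D",  4,
  "B", "E",  NA,
  "B", "F",  NA
)

# Count the non-NA seed values in each group (hypothetical helper columns)
seed_counts <- df %>%
  group_by(id) %>%
  summarise(n_seeds = sum(!is.na(dist_from_top)), .groups = "drop")

# Groups that break the assumption; empty for the sample data
bad_groups <- filter(seed_counts, n_seeds != 1)
bad_groups
```

If `bad_groups` is not empty, a group has either no seed or several, and the max() trick would need rethinking for those ids.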

Both mutate() operations operate on groups:

df %>%
  group_by(id) %>%
  mutate(rn = ...) %>%
  mutate(dist_from_top = ...) %>%
  ungroup() %>%
  select(-rn)
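With the two expressions from the steps above substituted for the placeholders, the whole pipeline can be sketched as follows on the sample data from the question. The name `result` is only for illustration.

```r
library(dplyr)

df <- tibble::tribble(
  ~id, ~mgr, ~dist_from_top,
  "A", "B",  NA,
  "A", "C",  NA,
  "A", "D",  3,
  "A", "E",  NA,
  "A", "F",  NA,
  "B", "C",  NA,
  "B", "D",  4,
  "B", "E",  NA,
  "B", "F",  NA
)

result <- df %>%
  group_by(id) %>%
  # Step 1: a counter that decreases by 1 per row within each group
  mutate(rn = -row_number()) %>%
  # Step 2: shift the counter so it matches the seed on the seed row
  mutate(dist_from_top = rn + max(dist_from_top - rn, na.rm = TRUE)) %>%
  ungroup() %>%
  select(-rn)

result$dist_from_top
# 5 4 3 2 1 5 4 3 2
```

For id "A" the seed row has dist_from_top = 3 and rn = -3, so the offset is 6 and the column becomes -1 + 6, -2 + 6, and so on.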

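If there is an all-NA group, max() has no values left after dropping the NAs: it returns -Inf with a warning, so dist_from_top becomes -Inf for every row in that group.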
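One possible way to keep such groups as NA and avoid the warning is to guard the max() call. This is a sketch under assumptions, not part of the original answer: the group "C", the `offset` column, and the name `result` are all invented for the example.

```r
library(dplyr)

# Sample data with a hypothetical group "C" that has no seed value
df <- tibble::tribble(
  ~id, ~mgr, ~dist_from_top,
  "A", "B",  NA,
  "A", "C",  3,
  "A", "D",  NA,
  "C", "E",  NA,
  "C", "F",  NA
)

result <- df %>%
  group_by(id) %>%
  mutate(
    rn = -row_number(),
    # Only call max() when the group has at least one non-NA value
    offset = if (all(is.na(dist_from_top))) {
      NA_real_
    } else {
      max(dist_from_top - rn, na.rm = TRUE)
    },
    dist_from_top = rn + offset
  ) %>%
  ungroup() %>%
  select(-rn, -offset)

result$dist_from_top
# 4 3 2 NA NA
```

The `if` is evaluated once per group, so groups with a seed behave as before and groups without one stay NA.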
