
How do I fill the NA to next row in R?

I want to fill each NA with the value from the next row. Here is the dataset.

df <- structure(list(
  timestamp = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
                          1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L),
    .Label = c("2019-07-07 00:00:00", "2019-07-07 00:00:01", "2019-07-07 00:00:02",
               "2019-07-07 00:00:03", "2019-07-07 00:00:04", "2019-07-07 00:00:05",
               "2019-07-07 00:00:06", "2019-07-07 00:00:07", "2019-07-07 00:00:08",
               "2019-07-07 00:00:09", "2019-07-07 00:00:10"), class = "factor"),
  source = structure(c(NA, NA, NA, 1L, NA, NA, 1L, NA, NA, NA, NA,
                       NA, 2L, NA, 2L, NA, NA, 2L, NA, NA, 2L, NA),
    .Label = c("USER_A", "USER_B"), class = "factor"),
  value = c(NA, NA, NA, 1L, NA, NA, 1L, NA, NA, NA, NA,
            NA, 1L, NA, 1L, NA, NA, 2L, NA, NA, 3L, NA)),
  class = "data.frame", row.names = c(NA, -22L))

             timestamp source value
1  2019-07-07 00:00:00   <NA>    NA
2  2019-07-07 00:00:01   <NA>    NA
3  2019-07-07 00:00:02   <NA>    NA
4  2019-07-07 00:00:03 USER_A     1
5  2019-07-07 00:00:04   <NA>    NA
6  2019-07-07 00:00:05   <NA>    NA
7  2019-07-07 00:00:06 USER_A     1
8  2019-07-07 00:00:07   <NA>    NA
9  2019-07-07 00:00:08   <NA>    NA
10 2019-07-07 00:00:09   <NA>    NA
11 2019-07-07 00:00:10   <NA>    NA
12 2019-07-07 00:00:00   <NA>    NA
13 2019-07-07 00:00:01 USER_B     1
14 2019-07-07 00:00:02   <NA>    NA
15 2019-07-07 00:00:03 USER_B     1
16 2019-07-07 00:00:04   <NA>    NA
17 2019-07-07 00:00:05   <NA>    NA
18 2019-07-07 00:00:06 USER_B     2
19 2019-07-07 00:00:07   <NA>    NA
20 2019-07-07 00:00:08   <NA>    NA
21 2019-07-07 00:00:09 USER_B     3
22 2019-07-07 00:00:10   <NA>    NA

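The table cycles through the same timestamps for each source. Each source (A and B) has a fixed number of rows (in this case 00:00:00 to 00:00:10).

Here is the expected result table.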

             timestamp source value
1  2019-07-07 00:00:00   <NA>    NA
2  2019-07-07 00:00:01   <NA>    NA
3  2019-07-07 00:00:02   <NA>    NA
4  2019-07-07 00:00:03 USER_A     1
5  2019-07-07 00:00:04 USER_A     1
6  2019-07-07 00:00:05 USER_A     1
7  2019-07-07 00:00:06 USER_A     1
8  2019-07-07 00:00:07   <NA>    NA
9  2019-07-07 00:00:08   <NA>    NA
10 2019-07-07 00:00:09   <NA>    NA
11 2019-07-07 00:00:10   <NA>    NA
12 2019-07-07 00:00:00   <NA>    NA
13 2019-07-07 00:00:01 USER_B     1
14 2019-07-07 00:00:02 USER_B     1
15 2019-07-07 00:00:03 USER_B     1
16 2019-07-07 00:00:04 USER_B     2
17 2019-07-07 00:00:05 USER_B     2
18 2019-07-07 00:00:06 USER_B     2
19 2019-07-07 00:00:07 USER_B     3
20 2019-07-07 00:00:08 USER_B     3
21 2019-07-07 00:00:09 USER_B     3
22 2019-07-07 00:00:10   <NA>    NA

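For USER_A, the value and source in rows 5 and 6 are replaced with the value and source from row 7. The USER_B rows are filled from the next non-NA row in the same way.

How can I do this in R?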

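Here is one way using dplyr, relying on each source having a fixed number of rows. We first create a group for every n rows (group1) and add a new column group2 that is 1 only between the first and last non-NA positions within that group, and 0 elsewhere. We then group_by group2 and use fill with .direction = "up", so that each NA takes the next non-missing value. The default direction ("down") would carry the previous value forward instead, which does not match the expected output for USER_B (rows 16 and 17 should be 2, not 1).

n <- 11  # number of rows per source
library(dplyr)  

df %>%
  # group1: one block per n consecutive rows
  group_by(group1 = gl(n()/n, n)) %>%
  # group2: 1 between the first and last non-NA row of each block, else 0
  mutate(group2 = 0, 
         group2 = replace(group2, min(which(!is.na(source))) : 
                                  max(which(!is.na(source))), 1)) %>%
  group_by(group2) %>%
  # fill each NA from the next non-missing row
  tidyr::fill(source, value, .direction = "up") %>% 
  ungroup() %>%
  select(-group1, -group2) 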

# A tibble: 22 x 3
#   timestamp           source value
#   <fct>               <fct>  <int>
# 1 2019-07-07 00:00:00 NA        NA
# 2 2019-07-07 00:00:01 NA        NA
# 3 2019-07-07 00:00:02 NA        NA
# 4 2019-07-07 00:00:03 USER_A     1
# 5 2019-07-07 00:00:04 USER_A     1
# 6 2019-07-07 00:00:05 USER_A     1
# 7 2019-07-07 00:00:06 USER_A     1
# 8 2019-07-07 00:00:07 NA        NA
# 9 2019-07-07 00:00:08 NA        NA
#10 2019-07-07 00:00:09 NA        NA
# … with 12 more rows
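
For comparison, the same idea can be sketched in base R without dplyr or tidyr. This is only an illustration, not part of the original answer: the helper names fill_up_block and fill_by_block are hypothetical, the data is rebuilt as plain character and integer vectors from the printed table (timestamps omitted), and it assumes, as above, that every source occupies exactly n consecutive rows.

```r
# Data rebuilt from the question's printed table (source and value columns only).
src <- c(NA, NA, NA, "USER_A", NA, NA, "USER_A", NA, NA, NA, NA,
         NA, "USER_B", NA, "USER_B", NA, NA, "USER_B", NA, NA, "USER_B", NA)
val <- c(NA, NA, NA, 1L, NA, NA, 1L, NA, NA, NA, NA,
         NA, 1L, NA, 1L, NA, NA, 2L, NA, NA, 3L, NA)
df2 <- data.frame(source = src, value = val, stringsAsFactors = FALSE)

# Hypothetical helper: within one block, copy the next non-NA entry backward,
# but only between the first and last non-NA positions of that block.
fill_up_block <- function(x) {
  idx <- which(!is.na(x))
  if (length(idx) == 0) return(x)        # block has nothing to fill from
  for (i in seq(max(idx), min(idx))) {   # walk backward through the inner range
    if (is.na(x[i])) x[i] <- x[i + 1]    # x[i + 1] is already non-NA here
  }
  x
}

# Hypothetical helper: apply fill_up_block to each block and reassemble.
fill_by_block <- function(x, block) {
  unsplit(lapply(split(x, block), fill_up_block), block)
}

n <- 11                                           # rows per source (assumed fixed)
block <- rep(seq_len(nrow(df2) / n), each = n)    # 1,1,...,2,2,...

df2$source <- fill_by_block(df2$source, block)
df2$value  <- fill_by_block(df2$value, block)
print(df2)
```

Rows before the first and after the last non-NA entry of each block are left as NA, which is what the expected table asks for.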

