Replacing a loop with a more efficient solution

Question

My data looks like this:

test <- structure(list(value = c(0, 781, 1109, 57, 250, 541, 533, 320, 
322, 1033, 291, 2213, 1845, 618, 271, 525, 88, 1354, 217, 820, 
786, 119, 41, 316, 153, 378, 172, 615, 383, 168, 1448, 824, 85, 
224310, 1186, 1488, 244, 368, 133, 488, 118, 4505, 1411, 649, 
690, 548, 226, 393, 1042, 92, 521, 212, 1015, 380, 2944, 54376, 
1396, 429, 2725, 171, 1874, 87, 547, 488, 140, 169, 237, 1749, 
1144, 156, 843, 116, 313, 601, 679, 464, 1092, 178, 28, 57, 550, 
498, 64, 48143, 352, 4100, 232, 1936, 189, 940, 180, 1051, 2917, 
2397, 229, 802, 540, 297, 505, 1649), count = c(1L, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2L, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 3L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4L, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
)), row.names = c(NA, -100L), class = c("tbl_df", "tbl", "data.frame"
))

The column value holds arbitrary values, and the column count is mostly NA. What I need in the end is for every NA in count to take the most recent non-NA value above it. So the first rows should have count == 1, and from the row where count changes to 2 onward they should have count == 2. So far I am using a loop:

for (i in 1:length(test$value))
{
  if(isTRUE(is.na(test$count[i]))){
    test$count[i] <- test$count[i-1]
  }
}

However, this takes forever. Can anyone think of a more efficient way to get the same result as the loop? This would help me out a lot! Thanks in advance!

Answer 1

You can use fill from the tidyr package to do exactly this:

tidyr::fill(test, count)
#> # A tibble: 100 x 2
#>    value count
#>    <dbl> <int>
#>  1     0     1
#>  2   781     1
#>  3  1109     1
#>  4    57     1
#>  5   250     1
#>  6   541     1
#>  7   533     1
#>  8   320     1
#>  9   322     1
#> 10  1033     1
#> # ... with 90 more rows
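
Note that fill returns a new data frame rather than modifying its input, so the result has to be assigned to keep it. A minimal sketch on a small made-up data frame (df and filled are hypothetical names, not the asker's data):

```r
library(tidyr)

# Small hypothetical data frame with the same shape as the question's data
df <- data.frame(
  value = c(10, 20, 30, 40, 50),
  count = c(1L, NA, NA, 2L, NA)
)

# The default .direction = "down" carries the last non-NA value forward.
# fill() does not change df itself, so assign the result.
filled <- fill(df, count)
filled$count
```

For the question's data this would be test <- tidyr::fill(test, count).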

Answer 2

You can also use na.locf() from zoo:

library(zoo)
#Code
test$count <- na.locf(test$count)

Output:

# A tibble: 100 x 2
   value count
   <dbl> <int>
 1     0     1
 2   781     1
 3  1109     1
 4    57     1
 5   250     1
 6   541     1
 7   533     1
 8   320     1
 9   322     1
10  1033     1
# ... with 90 more rows
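
One caveat worth knowing: by default na.locf() drops leading NAs, so the result can be shorter than the input and the assignment back into the column would fail. The question's data starts with a non-NA count, so it is not affected. A small sketch on a made-up vector x (hypothetical, not from the question) shows the difference:

```r
library(zoo)

# Hypothetical vector that starts with an NA
x <- c(NA, 1L, NA, NA, 2L, NA)

dropped <- na.locf(x)                 # default na.rm = TRUE removes the leading NA
kept    <- na.locf(x, na.rm = FALSE)  # keeps the leading NA, same length as x
kept0   <- na.locf0(x)                # na.locf0 always returns the input length

length(dropped)
length(kept)
```

If the column might begin with NA, na.rm = FALSE or na.locf0() is the safer choice for assigning back into a data frame.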

Answer 3

We can also use na.locf0() from zoo:

library(zoo)
transform(test, count = na.locf0(count))

Or use nafill from data.table for an efficient version:

library(data.table)
# setDT converts test to a data.table by reference; := then updates count in place
setDT(test)[, count := nafill(count, type = 'locf')]

Output:

test
#      value count
#  1:      0     1
#  2:    781     1
#  3:   1109     1
#  4:     57     1
#  5:    250     1
#  6:    541     1
#  7:    533     1
#  8:    320     1
#  9:    322     1
# 10:   1033     1
# 11:    291     1
# 12:   2213     1
# 13:   1845     1
# 14:    618     1
# ..
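
For comparison, the same fill can be sketched in base R without any package. This is not taken from the answers above; locf_base is a hypothetical helper name, and the approach assumes a plain atomic vector. It counts how many non-NA values have been seen so far and uses that count as an index into the observed values:

```r
# Hypothetical helper: last observation carried forward using only base R
locf_base <- function(x) {
  observed <- !is.na(x)
  # idx[i] is the number of non-NA values at or before position i
  idx <- cumsum(observed)
  # Prepend NA so positions before the first observation (idx == 0) stay NA
  c(NA, x[observed])[idx + 1L]
}

x <- c(NA, 1L, NA, NA, 2L, NA)
result <- locf_base(x)
result
```

Applied to the question's data this would be test$count <- locf_base(test$count). Like the package functions, it is vectorized, so it avoids the element-by-element assignment that makes the loop slow.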
