
Replace loop with more efficient solution

My data looks like this:

test <- structure(list(value = c(0, 781, 1109, 57, 250, 541, 533, 320, 
322, 1033, 291, 2213, 1845, 618, 271, 525, 88, 1354, 217, 820, 
786, 119, 41, 316, 153, 378, 172, 615, 383, 168, 1448, 824, 85, 
224310, 1186, 1488, 244, 368, 133, 488, 118, 4505, 1411, 649, 
690, 548, 226, 393, 1042, 92, 521, 212, 1015, 380, 2944, 54376, 
1396, 429, 2725, 171, 1874, 87, 547, 488, 140, 169, 237, 1749, 
1144, 156, 843, 116, 313, 601, 679, 464, 1092, 178, 28, 57, 550, 
498, 64, 48143, 352, 4100, 232, 1936, 189, 940, 180, 1051, 2917, 
2397, 229, 802, 540, 297, 505, 1649), count = c(1L, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2L, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, 3L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4L, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
)), row.names = c(NA, -100L), class = c("tbl_df", "tbl", "data.frame"
))

The column value holds arbitrary values, and the column count is mostly filled with NA. What I need in the end is for every NA in count to be replaced by the last preceding value that was not NA. So the first rows should have count == 1, and as soon as count changes to 2, the following rows should have count == 2. So far I am using a loop:

for (i in 1:length(test$value))
{
  if(isTRUE(is.na(test$count[i]))){
    test$count[i] <- test$count[i-1]
  }
}

However, this takes forever. Can anyone think of a more efficient way to get the same result as the loop? This would help me out a lot! Thanks in advance!

You can use fill() from the tidyr package to do exactly this:

tidyr::fill(test, count)
#> # A tibble: 100 x 2
#>    value count
#>    <dbl> <int>
#>  1     0     1
#>  2   781     1
#>  3  1109     1
#>  4    57     1
#>  5   250     1
#>  6   541     1
#>  7   533     1
#>  8   320     1
#>  9   322     1
#> 10  1033     1
#> # ... with 90 more rows
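To make the behaviour of fill() easier to see, here is a minimal sketch on a small made-up data frame (toy and its values are hypothetical, not from the question). It assumes tidyr 1.0.0 or later for the .direction = "downup" option.

```r
library(tidyr)

# Hypothetical toy data: a leading NA followed by two runs.
toy <- data.frame(
  value = c(10, 20, 30, 40, 50, 60),
  count = c(NA, 1L, NA, NA, 2L, NA)
)

# Default direction is "down": each NA takes the last non-NA value above it.
# A leading NA has nothing above it, so it stays NA.
filled_down <- fill(toy, count)
filled_down$count

# "downup" fills down first, then fills any remaining leading NAs upwards.
filled_both <- fill(toy, count, .direction = "downup")
filled_both$count
```

In the question's data the first value of count is not NA, so the default direction is enough there.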

You can also use na.locf() from zoo:

library(zoo)
# na.rm = FALSE keeps any leading NAs, so the result has the same length as the input
test$count <- na.locf(test$count, na.rm = FALSE)

Output:

# A tibble: 100 x 2
   value count
   <dbl> <int>
 1     0     1
 2   781     1
 3  1109     1
 4    57     1
 5   250     1
 6   541     1
 7   533     1
 8   320     1
 9   322     1
10  1033     1
# ... with 90 more rows

We can also use na.locf0() from zoo:

library(zoo)
transform(test, count = na.locf0(count))
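The two zoo functions differ in how they treat leading NAs, which matters when the result is assigned back to a column. The following is a small sketch on a made-up vector (x is hypothetical, not from the question):

```r
library(zoo)

# Hypothetical vector with a leading NA.
x <- c(NA, 1L, NA, NA, 2L, NA)

# By default (na.rm = TRUE) na.locf() drops leading NAs, so the result is shorter.
dropped <- na.locf(x)
length(dropped)

# With na.rm = FALSE the leading NA is kept and the length is preserved.
kept <- na.locf(x, na.rm = FALSE)
length(kept)

# na.locf0() always returns a vector of the same length as its input.
kept0 <- na.locf0(x)
length(kept0)
```

For the question's data this makes no difference, because the first element of count is not NA.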

Or use nafill() from data.table for an efficient version:

library(data.table)
# setDT() converts test to a data.table by reference; := then updates count in place
setDT(test)[, count := nafill(count, type = "locf")]

Output:

test
#      value count
#  1:      0     1
#  2:    781     1
#  3:   1109     1
#  4:     57     1
#  5:    250     1
#  6:    541     1
#  7:    533     1
#  8:    320     1
#  9:    322     1
# 10:   1033     1
# 11:    291     1
# 12:   2213     1
# 13:   1845     1
# 14:    618     1
# ..
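If adding a package is not an option, the same fill can be sketched in base R with cumsum(). This is not taken from the answers above; it is an alternative shown on a made-up vector, and it assumes the first element is not NA (as in the question's data). The sketch also checks that the result matches the loop from the question.

```r
# Hypothetical toy vector; the first element is non-NA, as in the question.
count <- c(1L, NA, NA, 2L, NA, 3L, NA, NA)

# Reference result: the element-by-element loop from the question.
loop_fill <- count
for (i in seq_along(loop_fill)) {
  if (is.na(loop_fill[i])) {
    loop_fill[i] <- loop_fill[i - 1]
  }
}

# Vectorized version: cumsum(!is.na(count)) gives, for every position,
# how many non-NA values have been seen so far, i.e. the index of the
# most recent non-NA value within the vector of non-NA values.
idx <- cumsum(!is.na(count))
vec_fill <- count[!is.na(count)][idx]

# Both approaches should agree.
identical(loop_fill, vec_fill)
```

With leading NAs, idx would start at 0 and those positions would be dropped, so this sketch would need extra handling in that case.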
