Replace loop with more efficient solution
So my data looks like this:
test <- structure(list(value = c(0, 781, 1109, 57, 250, 541, 533, 320,
322, 1033, 291, 2213, 1845, 618, 271, 525, 88, 1354, 217, 820,
786, 119, 41, 316, 153, 378, 172, 615, 383, 168, 1448, 824, 85,
224310, 1186, 1488, 244, 368, 133, 488, 118, 4505, 1411, 649,
690, 548, 226, 393, 1042, 92, 521, 212, 1015, 380, 2944, 54376,
1396, 429, 2725, 171, 1874, 87, 547, 488, 140, 169, 237, 1749,
1144, 156, 843, 116, 313, 601, 679, 464, 1092, 178, 28, 57, 550,
498, 64, 48143, 352, 4100, 232, 1936, 189, 940, 180, 1051, 2917,
2397, 229, 802, 540, 297, 505, 1649), count = c(1L, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 2L, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 3L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4L,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
)), row.names = c(NA, -100L), class = c("tbl_df", "tbl", "data.frame"
))
Column value has some random values and column count is mostly filled with NAs. What I need in the end is that every NA in count should be the same as the last value that was not NA. So the first couple of rows should be count == 1, and as soon as count changes to 2 it should be count == 2. So far I am using a loop:
for (i in 1:length(test$value))
{
if(isTRUE(is.na(test$count[i]))){
test$count[i] <- test$count[i-1]
}
}
However, this takes forever. Can anyone think of a more efficient way to get the same result as the loop? This would help me out a lot! Thanks in advance!
You can use fill from the tidyr package to do exactly this:
tidyr::fill(test, count)
#> # A tibble: 100 x 2
#> value count
#> <dbl> <int>
#> 1 0 1
#> 2 781 1
#> 3 1109 1
#> 4 57 1
#> 5 250 1
#> 6 541 1
#> 7 533 1
#> 8 320 1
#> 9 322 1
#> 10 1033 1
#> # ... with 90 more rows
You can also use na.locf() from zoo:
library(zoo)
#Code
test$count <- na.locf(test$count)
Output:
# A tibble: 100 x 2
value count
<dbl> <int>
1 0 1
2 781 1
3 1109 1
4 57 1
5 250 1
6 541 1
7 533 1
8 320 1
9 322 1
10 1033 1
# ... with 90 more rows
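One caveat worth checking before assigning the result back into a column: by default na.locf() drops leading NAs (na.rm = TRUE), so the result can be shorter than the input. That does not matter for the data above, because count starts with 1, but it would for a column that begins with NA. A minimal sketch, using a made-up vector x rather than the data from the question:

```r
library(zoo)

# Hypothetical example vector that starts with an NA (not the question's data)
x <- c(NA, 1L, NA, NA, 2L, NA)

# Default na.rm = TRUE removes the leading NA, so the result is shorter than x
shorter <- na.locf(x)
length(shorter)

# na.rm = FALSE keeps the leading NA in place and preserves the length
same_length <- na.locf(x, na.rm = FALSE)
same_length
```

na.locf0() from the same package always returns a vector of the same length as its input, which makes it convenient inside transform() or mutate().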
We can also use:
library(zoo)
transform(test, count = na.locf0(count))
Or use nafill from data.table for an efficient version:
library(data.table)
setDT(test)[, count:= nafill(count, type = 'locf')]
Output:
test
# value count
# 1: 0 1
# 2: 781 1
# 3: 1109 1
# 4: 57 1
# 5: 250 1
# 6: 541 1
# 7: 533 1
# 8: 320 1
# 9: 322 1
# 10: 1033 1
# 11: 291 1
# 12: 2213 1
# 13: 1845 1
# 14: 618 1
# ..
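If adding a package is not an option, the same "last observation carried forward" idea can be sketched in base R without a loop. The helper name fill_forward below is hypothetical (it does not come from any package), and this is only one possible vectorized approach: it records the position of every non-NA value, takes the running maximum of those positions, and uses the result as an index.

```r
# Hypothetical base R helper; a sketch, not a drop-in replacement for the
# package functions shown above.
fill_forward <- function(x) {
  # Position of each non-NA element, 0 where the element is NA
  idx <- ifelse(is.na(x), 0L, seq_along(x))
  # Running maximum: for each position, the index of the last non-NA seen so far
  last_seen <- cummax(idx)
  # Positions before the first non-NA value have no source; keep them NA
  last_seen[last_seen == 0L] <- NA_integer_
  x[last_seen]
}

filled <- fill_forward(c(1L, NA, NA, 2L, NA))
filled
```

With the data from the question this would be applied as test$count <- fill_forward(test$count).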