[英]Fill data.table with NA-values, using the nearest non-na-observation
我有一個 data.table 如下:
library(data.table)
dt.tst <- CJ(Type = c("A", "B"),
Range_val = seq(0,20000, by = 1000))
dt.tst[Range_val == 2000 & Type == "A", Value := 0.987]
dt.tst[Range_val == 2000 & Type == "B", Value := 1.987]
dt.tst[Range_val == 9000 & Type == "A", Value := 1.056]
dt.tst[Range_val == 9000 & Type == "B", Value := 2.138]
dt.tst[Range_val == 16000 & Type == "A", Value := 1.563]
dt.tst[Range_val == 16000 & Type == "B", Value := 2.089]
我想在值列中填寫 NA:s:
我可以用for循環來做這個非常簡單但很慢的事情。 所以我想要一種更干凈的方式(而且更快)。 這樣做的好 data.table 方式是什么?
總會有非 na 值,但間隔可能不同。
dt.tst[is.na(Value), Value := dt.tst[!is.na(Value)][dt.tst[is.na(Value)], roll = "nearest", on = .(Type, Range_val)]$Value]
output
dt.tst
# Type Range_val Value
# 1: A 0 0.987
# 2: A 1000 0.987
# 3: A 2000 0.987
# 4: A 3000 0.987
# 5: A 4000 0.987
# 6: A 5000 0.987
# 7: A 6000 1.056
# 8: A 7000 1.056
# 9: A 8000 1.056
# 10: A 9000 1.056
# 11: A 10000 1.056
# 12: A 11000 1.056
# 13: A 12000 1.056
# 14: A 13000 1.563
# 15: A 14000 1.563
# 16: A 15000 1.563
# 17: A 16000 1.563
# 18: A 17000 1.563
# 19: A 18000 1.563
# 20: A 19000 1.563
# 21: A 20000 1.563
# 22: B 0 1.987
# 23: B 1000 1.987
# 24: B 2000 1.987
# 25: B 3000 1.987
# 26: B 4000 1.987
# 27: B 5000 1.987
# 28: B 6000 2.138
# 29: B 7000 2.138
# 30: B 8000 2.138
# 31: B 9000 2.138
# 32: B 10000 2.138
# 33: B 11000 2.138
# 34: B 12000 2.138
# 35: B 13000 2.089
# 36: B 14000 2.089
# 37: B 15000 2.089
# 38: B 16000 2.089
# 39: B 17000 2.089
# 40: B 18000 2.089
# 41: B 19000 2.089
# 42: B 20000 2.089
如果我正確理解了您的目標,即在Value
列中分配一個NA
與其最接近的非NA
值,您可能可以嘗試max.col
+ outer
如下所示
dt.tst[
,
Value2 := replace(
Value,
is.na(Value),
na.omit(Value)[max.col(-abs(outer(which(is.na(Value)), which(!is.na(Value)), `-`)), "last")]
), Type
][]
這使
Type Range_val Value Value2
1: A 0 NA 0.987
2: A 1000 NA 0.987
3: A 2000 0.987 0.987
4: A 3000 NA 0.987
5: A 4000 NA 0.987
6: A 5000 NA 0.987
7: A 6000 NA 1.056
8: A 7000 NA 1.056
9: A 8000 NA 1.056
10: A 9000 1.056 1.056
11: A 10000 NA 1.056
12: A 11000 NA 1.056
13: A 12000 NA 1.056
14: A 13000 NA 1.563
15: A 14000 NA 1.563
16: A 15000 NA 1.563
17: A 16000 1.563 1.563
18: A 17000 NA 1.563
19: A 18000 NA 1.563
20: A 19000 NA 1.563
21: A 20000 NA 1.563
22: B 0 NA 1.987
23: B 1000 NA 1.987
24: B 2000 1.987 1.987
25: B 3000 NA 1.987
26: B 4000 NA 1.987
27: B 5000 NA 1.987
28: B 6000 NA 2.138
29: B 7000 NA 2.138
30: B 8000 NA 2.138
31: B 9000 2.138 2.138
32: B 10000 NA 2.138
33: B 11000 NA 2.138
34: B 12000 NA 2.138
35: B 13000 NA 2.089
36: B 14000 NA 2.089
37: B 15000 NA 2.089
38: B 16000 2.089 2.089
39: B 17000 NA 2.089
40: B 18000 NA 2.089
41: B 19000 NA 2.089
42: B 20000 NA 2.089
Type Range_val Value Value2
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.