Left join doesn't do matching correctly
I have the following data:
structure(list(Time = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1), AgentID = 1:40, State = c(59L, 28L, 84L, 11L,
5L, 8L, 14L, 71L, 47L, 7L, 84L, 95L, 91L, 92L, 99L, 34L, 70L,
37L, 55L, 96L, 46L, 38L, 71L, 2L, 61L, 13L, 73L, 26L, 44L, 59L,
52L, 53L, 42L, 66L, 23L, 11L, 42L, 77L, 38L, 48L), Action = c(-1L,
-1L, 1L, -1L, 1L, 1L, 1L, -1L, -1L, -1L, -1L, 1L, 1L, 1L, 1L,
-1L, 1L, 1L, -1L, -1L, 1L, 1L, -1L, 1L, 1L, -1L, -1L, 1L, -1L,
-1L, 1L, -1L, -1L, 1L, 1L, -1L, -1L, 1L, 1L, 1L), Reward = c(-987L,
15L, -479L, -485L, -785L, -683L, -1281L, -990L, -886L, -186L,
-83L, -285L, -886L, -387L, -1087L, -791L, -687L, -988L, -888L,
-285L, -888L, -690L, -185L, -387L, -789L, -589L, -1089L, -391L,
-388L, -1193L, 18L, -388L, -989L, -278L, -487L, -988L, -484L,
-588L, -282L, -790L), ActorLoss = c(-685.0519, 10.296739, -332.56876,
-339.06058, -543.66394, -471.6095, -890.6096, -685.5919, -615.5027,
-128.70341, -57.796043, -194.98253, -615.1243, -269.02368, -758.362,
-548.063, -478.1057, -690.9155, -616.49274, -197.9684, -615.4089,
-478.90158, -128.84201, -268.25974, -546.7193, -407.66656, -752.44385,
-270.63773, -268.98254, -825.52856, 12.47267, -268.95764, -684.3579,
-190.53835, -336.1535, -687.00714, -335.5734, -408.69858, -196.12567,
-549.0034), CriticLoss = c(346.44806, 3.8356564, 264.62875, 223.61797,
282.86646, 264.60562, 412.33395, 346.4176, 300.00894, 141.48476,
100.09644, 223.62798, 331.69186, 200.03246, 360.58798, 316.22833,
264.60284, 374.19714, 300.0259, 173.24109, 331.69604, 264.6325,
141.4761, 244.95958, 346.43042, 282.85916, 331.66525, 244.9461,
244.9759, 374.21875, 4.2367864, 244.98549, 374.17026, 173.27281,
223.65364, 346.42776, 223.63875, 282.86008, 173.25778, 316.2569
), N = c(40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L,
40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L,
40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L,
40L, 40L, 40L), SimulationID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), discountFactor = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0)), row.names = c(NA, -40L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x55d1abc6fab0>)
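What I want to do is create a new data.table, say dStat, that contains the columns Time, SimulationID, N, discountFactor and position. The first four columns (Time, SimulationID, N, discountFactor) come from the data above, and for every combination of SimulationID, N and discountFactor, position is simply seq(1,100,1).
Then I want to create a new column in dStat named pDensity, such that pDensity is the number of AgentIDs in the data above whose State equals that position.
I tried: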
dStat <- a[, list(position = seq(0,L,1)), by=.(Time,SimulationID,N,discountFactor)]
dStat[a, pDensity:= .N, on=.(position=State,Time,SimulationID,N,discountFactor)]
But pDensity is 40 for every position where at least one AgentID has State = position.
However, 40 is the total number of AgentIDs, not the number of AgentIDs that satisfy State = position.
What am I doing wrong here?
It does seem that all 40 agent IDs satisfy State = position.
This is expected: every State value in the data falls within position = 1:100, so each row always finds a match.
dStat[a,.(Time,SimulationID,N,discountFactor,i.State,x.position,AgentID) , on=.(position=State,Time,SimulationID,N,discountFactor)]
Time SimulationID N discountFactor i.State x.position AgentID
1: 1 1 40 0 59 59 1
2: 1 1 40 0 28 28 2
3: 1 1 40 0 84 84 3
4: 1 1 40 0 11 11 4
5: 1 1 40 0 5 5 5
6: 1 1 40 0 8 8 6
7: 1 1 40 0 14 14 7
8: 1 1 40 0 71 71 8
9: 1 1 40 0 47 47 9
10: 1 1 40 0 7 7 10
11: 1 1 40 0 84 84 11
12: 1 1 40 0 95 95 12
13: 1 1 40 0 91 91 13
14: 1 1 40 0 92 92 14
15: 1 1 40 0 99 99 15
16: 1 1 40 0 34 34 16
17: 1 1 40 0 70 70 17
18: 1 1 40 0 37 37 18
19: 1 1 40 0 55 55 19
20: 1 1 40 0 96 96 20
21: 1 1 40 0 46 46 21
22: 1 1 40 0 38 38 22
23: 1 1 40 0 71 71 23
24: 1 1 40 0 2 2 24
25: 1 1 40 0 61 61 25
26: 1 1 40 0 13 13 26
27: 1 1 40 0 73 73 27
28: 1 1 40 0 26 26 28
29: 1 1 40 0 44 44 29
30: 1 1 40 0 59 59 30
31: 1 1 40 0 52 52 31
32: 1 1 40 0 53 53 32
33: 1 1 40 0 42 42 33
34: 1 1 40 0 66 66 34
35: 1 1 40 0 23 23 35
36: 1 1 40 0 11 11 36
37: 1 1 40 0 42 42 37
38: 1 1 40 0 77 77 38
39: 1 1 40 0 38 38 39
40: 1 1 40 0 48 48 40
Time SimulationID N discountFactor i.State x.position AgentID
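The join output above shows every row of the data finding a match. Without a `by`, `j` is evaluated once over all matched rows, so `.N` is the total number of matches rather than a per-position count. A minimal sketch of that behaviour, using small made-up tables (the values below are hypothetical stand-ins, not the question's data):

```r
library(data.table)

# Hypothetical toy tables: 5 agents, positions 1..5
a <- data.table(AgentID = 1:5, State = c(2L, 2L, 3L, 5L, 5L))
dStat <- data.table(position = 1:5)

# No `by`: .N is computed once for the whole join result (5 matched rows)
# and that single value is assigned to every matched position.
dStat[a, pDensity := .N, on = .(position = State)]

print(dStat)
```

Positions with no agent (1 and 4 here) stay NA, while every matched position receives the same total, which mirrors the 40 seen in the question.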
Solved!
dStat <- d[, list(position = seq(0,L,1)), by=.(Time,SimulationID,N,discountFactor)]
dStat[d, pDensity:= .N/N, on=.(position=State,Time,SimulationID,N,discountFactor), by=.(position, Time, SimulationID, N, discountFactor)]
by adding by=.(position, Time, SimulationID, N, discountFactor) to the join operation.
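An alternative that may be easier to reason about is to count first and join afterwards. This is only a sketch on made-up data, not the approach from the post: the tables and the helper column `nAgents` are hypothetical, and only `position`, `State`, `N` and `pDensity` follow the names used above.

```r
library(data.table)

# Hypothetical toy tables: 5 agents, positions 1..5, N = 5 agents in total
a <- data.table(AgentID = 1:5, State = c(2L, 2L, 3L, 5L, 5L), N = 5L)
dStat <- data.table(position = 1:5, N = 5L)

# Step 1: count agents per State (renamed to position for the join)
counts <- a[, .(nAgents = .N), by = .(position = State, N)]

# Step 2: update join; i.nAgents is the count from `counts`,
# N is the column of dStat
dStat[counts, pDensity := i.nAgents / N, on = .(position, N)]

# Step 3: positions without any agent get density 0 instead of NA
dStat[is.na(pDensity), pDensity := 0]

print(dStat)
```

Because the aggregation happens before the join, each position matches at most one row of `counts`, so no grouping is needed inside the join itself.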