5-D kernel density estimation in R using the "kde" function
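I want to run a kernel density estimate on 5-dimensional data (x, y, z, time, size) using the "kde" function from the "ks" library in R. The manual says it can do kernel density estimation for 1- to 6-dimensional data (page 24 of the manual: http://cran.r-project.org/web/packages/ks/ks.pdf).
My problem is that for more than 3 dimensions it says I need to specify eval.points. I don't know how to specify the evaluation points, because there are no examples for more than 3 dimensions. For example, if I want to generate a regular 3-D sequence of data in the problem space and use it as the evaluation points, how should I do that?
Here is my data: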
422.697323 164.19886 2.457419 8.083796636 0.83367586
423.008236 163.32434 0.5551326 37.58477455 0.893893903
204.733908 218.36365 1.9397874 37.88324312 0.912809449
203.963056 218.4808 0.3723791 43.21775903 0.926406005
100.727581 46.60876 1.4022341 49.41510519 0.782807523
453.335182 244.25521 1.6292517 51.73779175 0.903910803
134.909462 210.96333 2.2389119 53.13433521 0.896529401
135.300562 212.02055 0.6739541 67.55073745 0.748783521
258.237117 134.29735 2.1205291 76.34032587 0.735699304
341.305271 149.26953 3.718958 94.33975483 0.849509216
307.138925 59.60571 0.6311074 106.9636715 0.987923188
307.76875 58.91453 2.6496741 113.8515307 0.802115718
415.025535 217.17398 1.7155688 115.7464603 0.875580325
414.977687 216.73327 1.7107369 115.9776948 0.767143582
311.006135 173.24378 2.7819572 120.8079566 0.925380118
310.116929 174.28122 4.3318722 129.2648401 0.776528535
347.260911 37.34946 3.5155427 136.7851291 0.851787115
351.317624 33.65703 0.5806926 138.7349284 0.909723017
4.471892 59.42068 1.4062959 139.0543783 0.967270976
5.480223 59.72857 2.7326106 139.2114277 0.987787428
199.513023 21.53302 2.5163259 143.5895625 0.864164659
198.718031 23.50163 0.4801849 147.2280466 0.741587333
26.650517 35.2019 0.8246514 150.4876506 0.744788202
25.089379 90.47825 0.8700944 152.1944046 0.777252476
26.307439 88.41552 2.4422487 155.9090026 0.952215177
234.282901 236.11422 1.8115261 155.9658144 0.776284654
235.052948 236.77437 1.9644963 156.6900297 0.944285448
23.048202 98.6261 3.4573048 159.7700912 0.773057491
21.516695 98.05431 2.5029284 160.8202997 0.978779087
213.936324 151.87013 3.1042192 161.0612489 0.80499513
277.887935 197.25753 1.3659279 163.673142 0.758978575
277.239746 197.54001 2.2109361 166.2629868 0.775325157
Here is the code I am using:
library(ks)
library(rgl)
kern <- read.table(file.choose(), sep=",")
hat <- kde(kern)
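It works for up to 3 dimensions, but for 4 and 5 dimensions it says: need to specify eval.points for more than 3 dimensions.
I would also like to know how to plot these kernels. For example, using z as a conditioning variable, plotting x, y and time in a 3D scatter plot, and using different colors for different ranges of size.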
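Like you, I initially couldn't find a useful example, and the documentation doesn't really describe the type of object that is expected. For your 5-d dataset I tried setting up a 5-d grid of points built from the 10th, 25th, 50th, 75th and 90th percentiles of each dimension. My dataset is named "dat":
evpts <- do.call(expand.grid, lapply(dat, quantile, probs=c(0.1,.25,.5,.75,.9)) )
I then passed it to the kde function, and that seemed to satisfy the algorithm. Whether this is "correct" really needs to be checked. No guarantees.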
> hat <- kde(dat, eval.points= evpts)
> str(hat)
List of 8
$ x : num [1:31, 1:5] 423 423 205 204 101 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:5] "V1" "V2" "V3" "V4" ...
$ eval.points:'data.frame': 3125 obs. of 5 variables:
..$ V1: Named num [1:3125] 23 118 234 326 415 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "25%" "50%" "75%" ...
..$ V2: Named num [1:3125] 35.2 35.2 35.2 35.2 35.2 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
..$ V3: Named num [1:3125] 0.581 0.581 0.581 0.581 0.581 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
..$ V4: Named num [1:3125] 43.2 43.2 43.2 43.2 43.2 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
..$ V5: Named num [1:3125] 0.749 0.749 0.749 0.749 0.749 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
..- attr(*, "out.attrs")=List of 2
.. ..$ dim : Named int [1:5] 5 5 5 5 5
.. .. ..- attr(*, "names")= chr [1:5] "V1" "V2" "V3" "V4" ...
.. ..$ dimnames:List of 5
.. .. ..$ V1: chr [1:5] "V1= 23.0482" "V1=117.8185" "V1=234.2829" "V1=326.1557" ...
.. .. ..$ V2: chr [1:5] "V2= 35.20190" "V2= 59.51319" "V2=149.26953" "V2=211.49194" ...
.. .. ..$ V3: chr [1:5] "V3=0.5806926" "V3=1.1180112" "V3=1.9397874" "V3=2.5830000" ...
.. .. ..$ V4: chr [1:5] "V4= 43.21776" "V4= 71.94553" "V4=129.26484" "V4=151.34103" ...
.. .. ..$ V5: chr [1:5] "V5=0.7487835" "V5=0.7764066" "V5=0.8517871" "V5=0.9190948" ...
$ estimate : Named num [1:3125] 3.23e-08 5.70e-08 1.01e-08 4.07e-10 6.20e-12 ...
..- attr(*, "names")= chr [1:3125] "1" "2" "3" "4" ...
$ H : num [1:5, 1:5] 5073.879 1010.815 1.211 -651.089 -0.223 ...
$ gridded : logi FALSE
$ binned : logi FALSE
$ names : chr [1:5] "V1" "V2" "V3" "V4" ...
$ w : num [1:31] 1 1 1 1 1 1 1 1 1 1 ...
- attr(*, "class")= chr "kde"
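For the regular grid the question asks about, the same construction should work with seq() in place of quantile(). The following is only a sketch under assumptions: the data frame is a randomly generated stand-in with made-up column names, the choice of 5 points per axis is arbitrary, and the kde() call is skipped if ks is not installed.

```r
# Hypothetical stand-in for the 5-column data set (x, y, z, time, size).
set.seed(1)
dat <- data.frame(x    = runif(30, 0, 450),
                  y    = runif(30, 20, 250),
                  z    = runif(30, 0.3, 4.5),
                  time = runif(30, 8, 170),
                  size = runif(30, 0.73, 0.99))

# n equally spaced values spanning the observed range of each column.
n <- 5
axes <- lapply(dat, function(v) seq(min(v), max(v), length.out = n))

# Full Cartesian product of the axes: n^5 rows, one per evaluation point.
evpts <- expand.grid(axes)

# Hand the grid to kde(); both inputs are converted to plain matrices first.
if (requireNamespace("ks", quietly = TRUE)) {
  hat <- ks::kde(as.matrix(dat), eval.points = as.matrix(evpts))
}
```

The grid grows as n^5, so even a modest n per axis produces a lot of evaluation points; 5 per axis already gives 3125 rows.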
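I did find an earlier version of the package documentation that offers this worked example of a 4-d run, and I think my effort is essentially the same, modulo the differences in the data: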
data(iris)
ir <- iris[,1:4][iris[,5]=="setosa",]
H.scv <- Hscv(ir)
fhat <- kde(ir, H.scv, eval.points=ir)
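On the plotting part of the question, one possible starting point is sketched below. It is not from the ks documentation and it plots the raw observations rather than the density estimate itself: it conditions on z by keeping one slab of the data, bins size into ranges, and colors the x, y, time scatter by bin. The stand-in data, the column names, the z cut-off of 2 and the three colors are all assumptions made for illustration, and the rgl call only runs in an interactive session where rgl is installed.

```r
# Hypothetical stand-in data, with columns named as in the question.
set.seed(1)
dat <- data.frame(x    = runif(30, 0, 450),
                  y    = runif(30, 20, 250),
                  z    = runif(30, 0.3, 4.5),
                  time = runif(30, 8, 170),
                  size = runif(30, 0.73, 0.99))

# Condition on z: keep the observations in one slab (the cut-off is arbitrary).
slab <- dat[dat$z < 2, ]

# Bin size into three equal-width ranges and give each bin its own color.
bins <- cut(slab$size, breaks = 3)
cols <- c("blue", "orange", "red")[as.integer(bins)]

# 3D scatter of x, y and time, colored by size range.
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(slab$x, slab$y, slab$time, col = cols,
              xlab = "x", ylab = "y", zlab = "time")
}
```

Repeating this for several z slabs would give one panel per conditioning range.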