
5-D kernel density estimation in R using the "kde" function

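I want to perform kernel density estimation on 5-dimensional data (x, y, z, time, size) using the "kde" function from the "ks" library in R. Its manual says it can do kernel density estimation for 1- to 6-dimensional data (page 24 of the manual: http://cran.r-project.org/web/packages/ks/ks.pdf).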

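My problem is that it says I need to specify eval.points for more than 3 dimensions. I don't know how to specify the evaluation points, because there are no examples with more than 3 dimensions. For example, if I want to generate regular 3D sequence data in the problem space and use it as the evaluation points, how should I do that?
Here is my data: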

422.697323  164.19886   2.457419    8.083796636  0.83367586
423.008236  163.32434   0.5551326   37.58477455  0.893893903
204.733908  218.36365   1.9397874   37.88324312  0.912809449
203.963056  218.4808    0.3723791   43.21775903  0.926406005
100.727581  46.60876    1.4022341   49.41510519  0.782807523
453.335182  244.25521   1.6292517   51.73779175  0.903910803
134.909462  210.96333   2.2389119   53.13433521  0.896529401
135.300562  212.02055   0.6739541   67.55073745  0.748783521
258.237117  134.29735   2.1205291   76.34032587  0.735699304
341.305271  149.26953   3.718958    94.33975483  0.849509216
307.138925  59.60571    0.6311074   106.9636715  0.987923188
307.76875   58.91453    2.6496741   113.8515307  0.802115718
415.025535  217.17398   1.7155688   115.7464603  0.875580325
414.977687  216.73327   1.7107369   115.9776948  0.767143582
311.006135  173.24378   2.7819572   120.8079566  0.925380118
310.116929  174.28122   4.3318722   129.2648401  0.776528535
347.260911  37.34946    3.5155427   136.7851291  0.851787115
351.317624  33.65703    0.5806926   138.7349284  0.909723017
4.471892    59.42068    1.4062959   139.0543783  0.967270976
5.480223    59.72857    2.7326106   139.2114277  0.987787428
199.513023  21.53302    2.5163259   143.5895625  0.864164659
198.718031  23.50163    0.4801849   147.2280466  0.741587333
26.650517   35.2019     0.8246514   150.4876506  0.744788202
25.089379   90.47825    0.8700944   152.1944046  0.777252476
26.307439   88.41552    2.4422487   155.9090026  0.952215177
234.282901  236.11422   1.8115261   155.9658144  0.776284654
235.052948  236.77437   1.9644963   156.6900297  0.944285448
23.048202   98.6261     3.4573048   159.7700912  0.773057491
21.516695   98.05431    2.5029284   160.8202997  0.978779087
213.936324  151.87013   3.1042192   161.0612489  0.80499513
277.887935  197.25753   1.3659279   163.673142   0.758978575
277.239746  197.54001   2.2109361   166.2629868  0.775325157

Here is the code I am using:

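library(ks)   # provides kde()
library(rgl)  # 3D plotting
# Choose the data file interactively; sep must match the file's delimiter
kern <- read.table(file.choose(), sep=",")
hat <- kde(kern)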

It works for up to 3 dimensions, but for 4 and 5 dimensions it reports that eval.points needs to be specified for more than 3 dimensions.

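Also, I would like to know how to plot these kernels. For example, using z as a conditioning variable, plotting x, y, and time in a 3D scatter plot, and using different colors for different size ranges.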

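Like you, I could not find a worked example at first, and the documentation does not really describe the expected object type. For your 5-D data set, I tried setting up a 5-D grid of points built from the 10th, 25th, 50th, 75th, and 90th percentiles of each dimension. My data set is named "dat":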

evpts <- do.call(expand.grid, lapply(dat, quantile, probs=c(0.1, 0.25, 0.5, 0.75, 0.9)))

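I then passed it to the kde function, and that seemed to satisfy the algorithm. Whether this is "correct" really needs to be checked. No guarantees.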

> hat <- kde(dat, eval.points= evpts)
> str(hat)
List of 8
 $ x          : num [1:31, 1:5] 423 423 205 204 101 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:5] "V1" "V2" "V3" "V4" ...
 $ eval.points:'data.frame':    3125 obs. of  5 variables:
  ..$ V1: Named num [1:3125] 23 118 234 326 415 ...
  .. ..- attr(*, "names")= chr [1:3125] "10%" "25%" "50%" "75%" ...
  ..$ V2: Named num [1:3125] 35.2 35.2 35.2 35.2 35.2 ...
  .. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
  ..$ V3: Named num [1:3125] 0.581 0.581 0.581 0.581 0.581 ...
  .. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
  ..$ V4: Named num [1:3125] 43.2 43.2 43.2 43.2 43.2 ...
  .. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
  ..$ V5: Named num [1:3125] 0.749 0.749 0.749 0.749 0.749 ...
  .. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
  ..- attr(*, "out.attrs")=List of 2
  .. ..$ dim     : Named int [1:5] 5 5 5 5 5
  .. .. ..- attr(*, "names")= chr [1:5] "V1" "V2" "V3" "V4" ...
  .. ..$ dimnames:List of 5
  .. .. ..$ V1: chr [1:5] "V1= 23.0482" "V1=117.8185" "V1=234.2829" "V1=326.1557" ...
  .. .. ..$ V2: chr [1:5] "V2= 35.20190" "V2= 59.51319" "V2=149.26953" "V2=211.49194" ...
  .. .. ..$ V3: chr [1:5] "V3=0.5806926" "V3=1.1180112" "V3=1.9397874" "V3=2.5830000" ...
  .. .. ..$ V4: chr [1:5] "V4= 43.21776" "V4= 71.94553" "V4=129.26484" "V4=151.34103" ...
  .. .. ..$ V5: chr [1:5] "V5=0.7487835" "V5=0.7764066" "V5=0.8517871" "V5=0.9190948" ...
 $ estimate   : Named num [1:3125] 3.23e-08 5.70e-08 1.01e-08 4.07e-10 6.20e-12 ...
  ..- attr(*, "names")= chr [1:3125] "1" "2" "3" "4" ...
 $ H          : num [1:5, 1:5] 5073.879 1010.815 1.211 -651.089 -0.223 ...
 $ gridded    : logi FALSE
 $ binned     : logi FALSE
 $ names      : chr [1:5] "V1" "V2" "V3" "V4" ...
 $ w          : num [1:31] 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "class")= chr "kde"
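
The `str` output shows one density value in `estimate` per row of `eval.points` (3125 = 5^5 grid points). The question also asked about a regular grid of evaluation points; a minimal sketch of that variant follows, assuming equally spaced values over each column's observed range. The simulated `dat` and the names `n_per_dim`, `axes`, `grid_pts`, `hat_grid`, and `res` are illustrative and not from the post, and as with the quantile grid, the result should be checked rather than trusted.

```r
library(ks)

# Simulated stand-in for the 5-column data (hypothetical; use the real data instead).
set.seed(1)
dat <- matrix(rnorm(60 * 5), ncol = 5,
              dimnames = list(NULL, paste0("V", 1:5)))

# Equally spaced values per dimension, spanning each column's observed range.
n_per_dim <- 5
axes <- lapply(seq_len(ncol(dat)), function(j)
  seq(min(dat[, j]), max(dat[, j]), length.out = n_per_dim))
names(axes) <- colnames(dat)

# Full 5-D grid: n_per_dim^5 rows, one column per dimension.
grid_pts <- as.matrix(expand.grid(axes))

# Bandwidth is left to kde()'s default, as in the call above.
hat_grid <- kde(x = dat, eval.points = grid_pts)

# Pair each grid point with its estimated density and show the densest points.
res <- data.frame(grid_pts, density = as.numeric(hat_grid$estimate))
head(res[order(res$density, decreasing = TRUE), ], 5)
```

The grid grows as `n_per_dim^5`, so even 10 values per dimension already means 100,000 evaluation points.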

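I did find an earlier version of the package documentation that offers this worked example of a 4-D run, and I think my attempt is essentially the same apart from the details (different data, bandwidth selection, and evaluation points):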

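data(iris)
ir <- iris[,1:4][iris[,5]=="setosa",]    # the 4 measurements, setosa rows only
H.scv <- Hscv(ir)                        # smoothed cross-validation bandwidth matrix
fhat <- kde(ir, H.scv, eval.points=ir)   # evaluate the density at the data points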
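
The same pattern should carry over to 5-D data, and it gives one density value per observation, which is convenient for the plotting part of the question. A rough sketch, assuming columns ordered x, y, z, time, size as in the question; the simulated `x`, the z-slice bounds, and the colour choices are all hypothetical. Colouring by the size column instead only requires swapping the vector passed to `cut`.

```r
library(ks)

# Simulated stand-in for the data (hypothetical); columns: x, y, z, time, size.
set.seed(2)
x <- matrix(runif(60 * 5), ncol = 5,
            dimnames = list(NULL, c("x", "y", "z", "time", "size")))

# Evaluate the 5-D density at the observations themselves.
fhat <- kde(x = x, eval.points = x)

# Bin the density values into quartiles and map each bin to a colour.
dens <- as.numeric(fhat$estimate)
breaks <- unname(quantile(dens, probs = seq(0, 1, 0.25)))
dens_bin <- cut(dens, breaks = breaks, include.lowest = TRUE)
cols <- c("blue", "green", "orange", "red")[as.integer(dens_bin)]

# Condition on z: keep only points inside an arbitrary z slice.
keep <- x[, "z"] >= 0.25 & x[, "z"] <= 0.75

# 3D scatter of x, y, time for that slice (needs rgl and an interactive session).
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(x[keep, "x"], x[keep, "y"], x[keep, "time"],
              col = cols[keep], size = 6,
              xlab = "x", ylab = "y", zlab = "time")
}
```

Repeating the last step for several z slices gives a crude conditional view, one plot per slice.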
