
What quantile types match the 3 definitions of percentile in R?

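There are 3 definitions of percentile:

  1. The smallest number that is greater than x% of the y numbers
  2. The smallest number that is greater than or equal to x% of the y numbers
  3. A weighted average of the percentiles from 1 and 2

Which value of the type argument of quantile() matches each of these three definitions?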

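If by "quartile" you mean quantile(): none of them. It's not that simple. As the documentation states when you run help(quantile):

"One of the nine quantile algorithms discussed in Hyndman and Fan (1996), selected by type, is employed."

The paper can be found here: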

https://www.researchgate.net/profile/Rob_Hyndman/publication/222105754_Sample_Quantiles_in_Statistical_Packages/links/02e7e530c316d129d7000000.pdf

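It is worth a read to understand what is involved in getting a computer to do the "intuitive" thing. :)

You can try it out and fiddle with the numbers in prbs to see how quantile behaves: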

aa <- 1:10
prbs <- c(0.2, 0.22, 0.29)

for(typ in 1:9){
  this_line <- paste0("type=", typ)
  this_val <- paste0("qval=",quantile(aa, probs=prbs, type=typ))
  print(paste(this_line,this_val))
}

This gives the following:

[1] "type=1 qval=2" "type=1 qval=3" "type=1 qval=3"
[1] "type=2 qval=2.5" "type=2 qval=3"   "type=2 qval=3"  
[1] "type=3 qval=2" "type=3 qval=2" "type=3 qval=3"
[1] "type=4 qval=2"   "type=4 qval=2.2" "type=4 qval=2.9"
[1] "type=5 qval=2.5" "type=5 qval=2.7" "type=5 qval=3.4"
[1] "type=6 qval=2.2"  "type=6 qval=2.42" "type=6 qval=3.19"
[1] "type=7 qval=2.8"  "type=7 qval=2.98" "type=7 qval=3.61"
[1] "type=8 qval=2.4"              "type=8 qval=2.60666666666667" "type=8 qval=3.33"
[1] "type=9 qval=2.425"  "type=9 qval=2.63"   "type=9 qval=3.3475"
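
To see where the fractional values for types 4 to 9 come from, here is a minimal sketch that recomputes them by hand. It assumes the position formulas from Hyndman and Fan (1996) as I understand them, followed by linear interpolation between the two neighbouring order statistics. The helpers hf_position and hf_quantile are illustrative names of my own, not part of R.

```r
# Fractional position h for the continuous types (4 to 9).
# hf_position is a hypothetical helper, not an R function.
hf_position <- function(n, p, type) {
  switch(as.character(type),
    "4" = n * p,
    "5" = n * p + 0.5,
    "6" = (n + 1) * p,
    "7" = (n - 1) * p + 1,
    "8" = (n + 1/3) * p + 1/3,
    "9" = (n + 1/4) * p + 3/8,
    stop("only types 4 to 9 are sketched here"))
}

# Linear interpolation between the order statistics around h.
hf_quantile <- function(x, p, type) {
  x <- sort(x)
  n <- length(x)
  h <- hf_position(n, p, type)
  h <- pmin(pmax(h, 1), n)   # clamp to the valid index range
  lo <- floor(h)
  hi <- pmin(lo + 1, n)
  x[lo] + (h - lo) * (x[hi] - x[lo])
}

aa <- 1:10
prbs <- c(0.2, 0.22, 0.29)
for (typ in 4:9) {
  by_hand <- hf_quantile(aa, prbs, typ)
  built_in <- unname(quantile(aa, probs = prbs, type = typ))
  print(all.equal(by_hand, built_in))
}
```

For example, type 6 with p = 0.22 and n = 10 gives h = 11 * 0.22 = 2.42, which lies between the 2nd and 3rd values, matching the 2.42 in the output above. Types 1 to 3 are discontinuous (they pick or average order statistics rather than interpolate) and are left out of this sketch.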

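To get an answer, I generated random samples and used a correlation test to find which quantile types match the three definitions. It is not the most elegant code, but...

Here is the code: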

##### program to test types
## generate 100 random samples of 100 numbers
set.seed(3)
x <- rnorm(100000, mean = 50, sd = 10)
means <- replicate(100, sample(x, 100))

# create answer matrix (starts with one all-NA row, dropped later by complete.cases)
answers <- matrix(ncol = 12)
colnames(answers) <- c("def1", "def2", "def3", paste0("quan", 1:9))

# compute the three percentile definitions and all nine quantile types for
# the x-th percentile of the sample `bar`, then append them to `answers`
printallper <- function(x, bar) {
  bar <- sort(bar)
  # fractional rank of the x-th percentile
  per <- (x / 100) * (length(bar) + 1)
  # indexing with a fractional rank truncates it, so make that explicit
  whole <- floor(per)
  dec <- per - whole
  low <- bar[whole]
  high <- bar[whole + 1]

  # definition 1: the value just above the rank
  perres1 <- round(high, digits = 2)
  # definition 2: the value at the rank
  perres2 <- round(low, digits = 2)
  # definition 3: weighted average of the two
  perres3 <- round(low + dec * (high - low), digits = 2)

  # quantile types 1 to 9
  q <- sapply(1:9, function(typ) {
    round(unname(quantile(bar, x / 100, type = typ)), digits = 2)
  })

  answers <<- rbind(answers, c(perres1, perres2, perres3, q))
}

# run the 25th percentile for each row of the means matrix
invisible(apply(means, 1, function(row) printallper(25, row)))

# correlate the various percentiles
cor_answers <- cor(answers[complete.cases(answers), ])

# print correlations of the 3 percentile definitions with the quantile types
cor_answers[1:3, ]

Results:

      def1    def2    def3    quan1   quan2   quan3   quan4   quan5   quan6   quan7   quan8   quan9
def1  1.0000  0.9763  0.9867  0.9763  0.9941  0.9763  0.9763  0.9941  0.9867  0.9985  0.9920  0.9763
def2  0.9763  1.0000  0.9984  1.0000  0.9939  1.0000  1.0000  0.9939  0.9984  0.9864  0.9958  0.9926
def3  0.9867  0.9984  1.0000  0.9984  0.9984  0.9984  0.9984  0.9984  1.0000  0.9939  0.9993  0.9984

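The results show (a correlation of 1.0000 is taken as a match):

  • Percentile definition 1 (def1) does not match any quantile type
  • Percentile definition 2 (def2) matches quantile types 1, 3 and 4
  • Percentile definition 3 (def3) matches quantile type 6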
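
A correlation of 1 is indirect evidence, so here is a rough sketch of a direct check on a single sample using all.equal(). It assumes the same rank convention as the code above, per = p * (n + 1), and the same reading of the three definitions. The helper name matches is my own.

```r
# Direct comparison for one sample of 100 values at the 25th percentile.
set.seed(3)
s <- sort(rnorm(100, mean = 50, sd = 10))
p <- 0.25
per <- p * (length(s) + 1)       # 25.25 for n = 100
low <- s[floor(per)]             # definition 2 as coded above
high <- s[floor(per) + 1]        # definition 1 as coded above
def3 <- low + (per - floor(per)) * (high - low)

# For each quantile type, is the given value equal to quantile()'s result?
# `matches` is a hypothetical helper for this sketch.
matches <- function(value) {
  vapply(1:9, function(typ) {
    isTRUE(all.equal(value, unname(quantile(s, p, type = typ))))
  }, logical(1))
}

which(matches(low))    # types equal to definition 2
which(matches(def3))   # types equal to definition 3
which(matches(high))   # types equal to definition 1
```

Note that the agreement of types 1, 3 and 4 is specific to cases where p * n is a whole number, as it is here (0.25 * 100 = 25). The 1:10 example earlier in this thread shows those three types giving different values at p = 0.22, so the def2 match should not be read as holding for every p and n.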
