
What quantile types match the 3 definitions of percentile in R

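There are 3 definitions of percentile:

  1. The smallest number that is greater than x% of the y numbers
  2. The smallest number that is greater than or equal to x% of the y numbers
  3. A weighted mean of the percentiles in 1 and 2

Which values of the type argument of quantile() match these three definitions?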

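If by "quartile" you mean quantile(): none of them. It's not that simple. As the documentation states when you try help(quantile):

One of the nine quantile algorithms discussed in Hyndman and Fan (1996), selected by type, is employed.

The paper can be found here: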

https://www.researchgate.net/profile/Rob_Hyndman/publication/222105754_Sample_Quantiles_in_Statistical_Packages/links/02e7e530c316d129d7000000.pdf

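It is worth a read to appreciate what is involved in getting a computer to do the "intuitive" thing. :)

You can try it out and fiddle with the numbers in prbs to see how quantile behaves: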

aa <- 1:10
prbs <- c(0.2, 0.22, 0.29)

for(typ in 1:9){
  this_line <- paste0("type=", typ)
  this_val <- paste0("qval=",quantile(aa, probs=prbs, type=typ))
  print(paste(this_line,this_val))
}

This gives:

[1] "type=1 qval=2" "type=1 qval=3" "type=1 qval=3"
[1] "type=2 qval=2.5" "type=2 qval=3"   "type=2 qval=3"  
[1] "type=3 qval=2" "type=3 qval=2" "type=3 qval=3"
[1] "type=4 qval=2"   "type=4 qval=2.2" "type=4 qval=2.9"
[1] "type=5 qval=2.5" "type=5 qval=2.7" "type=5 qval=3.4"
[1] "type=6 qval=2.2"  "type=6 qval=2.42" "type=6 qval=3.19"
[1] "type=7 qval=2.8"  "type=7 qval=2.98" "type=7 qval=3.61"
[1] "type=8 qval=2.4"  "type=8 qval=2.60666666666667" "type=8 qval=3.33"
[1] "type=9 qval=2.425"  "type=9 qval=2.63"   "type=9 qval=3.3475"
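
For the continuous types 4 to 9, the values above can be reproduced by hand. Here is a minimal sketch, assuming the parametrisation given in help(quantile): Q(p) = (1 - g) * x[j] + g * x[j + 1], with j = floor(n * p + m) and g = n * p + m - j, where the constant m depends on the type. The helper name hf_quantile is made up for illustration, it only handles a single probability p, and it skips the floating-point safeguards that quantile() itself applies.

```r
# Hypothetical helper (not part of R): sample quantile for types 4 to 9,
# written directly from the formula in help(quantile). Scalar p only.
hf_quantile <- function(x, p, type = 7) {
  x <- sort(x)
  n <- length(x)
  # m is the type-specific offset added to n * p
  m <- switch(as.character(type),
              "4" = 0,
              "5" = 0.5,
              "6" = p,
              "7" = 1 - p,
              "8" = (p + 1) / 3,
              "9" = p / 4 + 3 / 8,
              stop("only types 4 to 9 are sketched here"))
  h <- n * p + m
  j <- floor(h)            # index of the lower order statistic
  g <- h - j               # interpolation weight for the upper one
  lo <- x[max(j, 1)]       # clamp to the smallest value
  hi <- x[min(j + 1, n)]   # clamp to the largest value
  (1 - g) * lo + g * hi
}

aa <- 1:10
sapply(4:9, function(typ) hf_quantile(aa, 0.22, type = typ))
```

The last line should reproduce the middle column of the output above for types 4 to 9 (2.2, 2.7, 2.42, 2.98, 2.6067, 2.63), which makes it easier to see that the types differ only in the offset m.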

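To get an answer, I generated random samples and used a correlation test to find which quantile types match the three definitions. It isn't the most elegant code, but...

Here is the code: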

##### program to test types
## generate 100 random samples of 100 numbers
set.seed(3)
x <- rnorm(100000, mean = 50, sd = 10)
means <- replicate(100, sample(x, 100))

# create answer matrix; its initial all-NA row is dropped later by complete.cases()
answers <- matrix(ncol = 12)
colnames(answers) <- c("def1", "def2", "def3", paste0("quan", 1:9))

printallper <- function(x, bar) {
  # position of the x-th percentile in the sorted data: (x / 100) * (n + 1)
  bar <- sort(bar)
  per <- (x / 100) * (length(bar) + 1)
  whole <- floor(per)
  dec <- per - whole
  low <- bar[whole]
  high <- bar[whole + 1]

  # get per1: the value just above the percentile position
  perres1 <- round(high, digits = 2)
  # get per2: the value at or just below the percentile position
  perres2 <- round(low, digits = 2)
  # get per3: linear interpolation between the two
  perres3 <- round(low + dec * (high - low), digits = 2)

  # q types 1 to 9
  qs <- sapply(1:9, function(typ) {
    round(quantile(bar, x / 100, type = typ), digits = 2)
  })

  # append one row to the global answer matrix
  answers <<- rbind(answers, c(perres1, perres2, perres3, qs))
}

# run all percentiles for data in means matrix (each row is treated as one sample)
invisible(apply(means, 1, function(x) printallper(25, x)))

# correlate various percentiles
cor_answers <- cor(answers[complete.cases(answers), ])

# print correlations for 3 definitions of percentiles with quantiles
cor_answers[1:3, ]

Results:

        def1    def2    def3   quan1   quan2   quan3   quan4   quan5   quan6   quan7   quan8   quan9
def1  1.0000  0.9763  0.9867  0.9763  0.9941  0.9763  0.9763  0.9941  0.9867  0.9985  0.9920  0.9763
def2  0.9763  1.0000  0.9984  1.0000  0.9939  1.0000  1.0000  0.9939  0.9984  0.9864  0.9958  0.9926
def3  0.9867  0.9984  1.0000  0.9984  0.9984  0.9984  0.9984  0.9984  1.0000  0.9939  0.9993  0.9984

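The results (for the 25th percentile of samples of 100 values) show:

  • Percentile definition 1 (def1) does not match any quantile type
  • Percentile definition 2 (def2) matches quantile types 1, 3 and 4
  • Percentile definition 3 (def3) matches quantile type 6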
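
Because the code above compares rounded values through correlations, the matches may be easier to see by testing for equality directly on a single sample. Here is a rough sketch that uses the same three definitions as the code above; the seed and the names defs, qs and matches are arbitrary choices made for this illustration.

```r
set.seed(1)                       # arbitrary seed, not the one used above
bar <- sort(rnorm(100, mean = 50, sd = 10))
p <- 0.25

pos   <- p * (length(bar) + 1)    # 25.25 for n = 100
whole <- floor(pos)
low   <- bar[whole]
high  <- bar[whole + 1]

# the three definitions, computed as in printallper() but without rounding
defs <- c(def1 = high,
          def2 = low,
          def3 = low + (pos - whole) * (high - low))

# the nine quantile types for the same sample
qs <- sapply(1:9, function(typ) unname(quantile(bar, p, type = typ)))

# for each definition, list the types that agree up to floating-point tolerance
matches <- lapply(defs, function(d) {
  which(vapply(qs, function(q) isTRUE(all.equal(q, d)), logical(1)))
})
print(matches)
```

For n = 100 and the 25th percentile this should list types 1, 3 and 4 for def2, type 6 for def3 and nothing for def1. The agreement between def2 and types 1, 3 and 4 relies on n * p being a whole number: in the 1:10 example further up, with a probability of 0.22, types 1, 3 and 4 give 3, 2 and 2.2 respectively.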
