
What quantile types match the 3 definitions of percentile in R?

There are 3 definitions of percentile:

  1. lowest number greater than x% of y numbers
  2. smallest number greater than or equal to x% of y numbers
  3. weighted mean of the percentiles from 1 & 2

Which values of the type argument of quantile() match these three definitions?

If by "quartile" you mean quantile(): none of them. It's not quite that simple. As the documentation says when you run help(quantile):

One of the nine quantile algorithms discussed in Hyndman and Fan (1996), selected by type, is employed.

The paper can be found here:

https://www.researchgate.net/profile/Rob_Hyndman/publication/222105754_Sample_Quantiles_in_Statistical_Packages/links/02e7e530c316d129d7000000.pdf

It's worth reading to get an idea of what is involved in getting a computer to do something "intuitive". :)
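As a rough illustration of why there are so many types: going by the description in ?quantile, the continuous types 4 to 9 all interpolate between two neighbouring order statistics and differ only in where they place the fractional position. The sketch below is a minimal reimplementation under that reading. hf_quantile is a hypothetical helper name, not something R provides, and it skips the small floating-point fuzz that the real quantile() applies.

```r
# Sketch of the interpolation rule behind quantile() types 4 to 9.
# hf_quantile is a hypothetical name used only for this illustration.
hf_quantile <- function(x, probs, type = 7) {
  stopifnot(type %in% 4:9)
  # (alpha, beta) pairs for types 4..9, as listed in ?quantile
  ab <- list(c(0, 1), c(0.5, 0.5), c(0, 0),
             c(1, 1), c(1/3, 1/3), c(3/8, 3/8))[[type - 3]]
  xs <- sort(x)
  n <- length(xs)
  m <- ab[1] + probs * (1 - ab[1] - ab[2])
  h <- n * probs + m              # fractional position in the sorted sample
  j <- floor(h)                   # index of the lower neighbour
  g <- h - j                      # interpolation weight
  lo <- xs[pmin(pmax(j, 1), n)]       # clamp indices to 1..n
  hi <- xs[pmin(pmax(j + 1, 1), n)]
  lo + g * (hi - lo)
}

# Compare against the built-in for each continuous type
aa <- 1:10
prbs <- c(0.2, 0.22, 0.29)
for (typ in 4:9) {
  stopifnot(isTRUE(all.equal(
    hf_quantile(aa, prbs, typ),
    quantile(aa, probs = prbs, type = typ, names = FALSE)
  )))
}
```

Types 1 to 3 are discontinuous (they always return an actual data value or the average of two), so they are left out of this sketch.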

You can get a feel for quantile's behaviour by trying this and fiddling with the numbers in prbs:

aa <- 1:10
prbs <- c(0.2, 0.22, 0.29)

for(typ in 1:9){
  this_line <- paste0("type=", typ)
  this_val <- paste0("qval=",quantile(aa, probs=prbs, type=typ))
  print(paste(this_line,this_val))
}

Which gives the lines:

[1] "type=1 qval=2" "type=1 qval=3" "type=1 qval=3"
[1] "type=2 qval=2.5" "type=2 qval=3"   "type=2 qval=3"  
[1] "type=3 qval=2" "type=3 qval=2" "type=3 qval=3"
[1] "type=4 qval=2"   "type=4 qval=2.2" "type=4 qval=2.9"
[1] "type=5 qval=2.5" "type=5 qval=2.7" "type=5 qval=3.4"
[1] "type=6 qval=2.2"  "type=6 qval=2.42" "type=6 qval=3.19"
[1] "type=7 qval=2.8"  "type=7 qval=2.98" "type=7 qval=3.61"
[1] "type=8 qval=2.4"  "type=8 qval=2.60666666666667" "type=8 qval=3.33"
[1] "type=9 qval=2.425"  "type=9 qval=2.63"   "type=9 qval=3.3475"
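If the pasted strings are hard to compare, the same numbers can be collected into a matrix with one column per type. This is only a convenience sketch; tab is a name made up for this example.

```r
aa <- 1:10
prbs <- c(0.2, 0.22, 0.29)

# One row per probability, one column per quantile type
tab <- sapply(1:9, function(typ) quantile(aa, probs = prbs, type = typ))
colnames(tab) <- paste0("type", 1:9)

print(round(tab, 4))
```

Reading across a row shows how far the nine types can disagree for a single probability on a small sample.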

To get the answer, I generated random samples and used correlations to find which quantile type matches each of the three definitions. The code is not the most elegant.

Here's the code:

##### program to test types
## generate 100 random samples of 100 numbers
set.seed(3)
x <- rnorm(100000, mean = 50, sd = 10)
means <- replicate(100, sample(x, 100))

# create answer matrix (its initial all-NA row is dropped later by complete.cases)
answers <- matrix(ncol = 12)
colnames(answers) <- c("def1", "def2", "def3", paste0("quan", 1:9))


printallper <- function(x, bar) {
  # get values for per calcs: position of the x-th percentile in the sorted data
  bar <- sort(bar)
  per <- (x / 100) * (length(bar) + 1)
  # indexing with a fractional value truncates it, so make the truncation explicit
  whole <- floor(per)
  dec <- per - whole
  low <- bar[whole]
  high <- bar[whole + 1]
  # get per1: the value just above the position
  perres1 <- round(high, digits = 2)
  # get per2: the value at the position
  perres2 <- round(low, digits = 2)
  # get per3: linear interpolation between the two
  perres3 <- round(low + dec * (high - low), digits = 2)
  # q types 1 to 9
  q <- sapply(1:9, function(typ) {
    round(quantile(bar, x / 100, type = typ, names = FALSE), digits = 2)
  })
  # append one row of results to the global answers matrix
  answers <<- rbind(answers, c(perres1, perres2, perres3, q))
}

# run the 25th percentile for every row of the means matrix
# (invisible() stops apply() from printing its return value)
invisible(apply(means, 1, function(x) printallper(25, x)))

# correlate various percentiles
cor_answers <- cor(answers[complete.cases(answers), ])

# print correlations of the 3 definitions of percentile with the quantile types
cor_answers[1:3, ]

The result:

        def1    def2    def3   quan1   quan2   quan3   quan4   quan5   quan6   quan7   quan8   quan9
def1  1.0000  0.9763  0.9867  0.9763  0.9941  0.9763  0.9763  0.9941  0.9867  0.9985  0.9920  0.9763
def2  0.9763  1.0000  0.9984  1.0000  0.9939  1.0000  1.0000  0.9939  0.9984  0.9864  0.9958  0.9926
def3  0.9867  0.9984  1.0000  0.9984  0.9984  0.9984  0.9984  0.9984  1.0000  0.9939  0.9993  0.9984

The results (for the 25th percentile of samples of 100 values, treating a correlation of 1.0000 as a match) show:

  • percentile definition 1 (def1) matches no quantile type
  • percentile definition 2 (def2) matches quantile types 1, 3, and 4
  • percentile definition 3 (def3) matches quantile type 6
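A correlation of 1 only shows that two columns move together, so a more direct check is to compare the values themselves on a single sample. The sketch below does that for definitions 2 and 3 as coded above (position (n + 1) * p, truncated or interpolated). The sample and the names match2 and match3 are made up for this example. It only covers p = 0.25 with n = 100, where n * p is a whole number, so it should not be read as saying the same types agree for every n and p.

```r
set.seed(3)
bar <- sort(rnorm(100, mean = 50, sd = 10))
n <- length(bar)

per <- 0.25 * (n + 1)     # position used by the three definitions
whole <- floor(per)
def2 <- bar[whole]
def3 <- bar[whole] + (per - whole) * (bar[whole + 1] - bar[whole])

# 25th percentile under each of the nine quantile types
q <- sapply(1:9, function(typ) quantile(bar, 0.25, type = typ, names = FALSE))

# Which types reproduce each definition, up to floating-point tolerance?
match2 <- which(sapply(q, function(v) isTRUE(all.equal(v, def2))))
match3 <- which(sapply(q, function(v) isTRUE(all.equal(v, def3))))

print(match2)
print(match3)
```

Changing n or the probability and rerunning is a quick way to see how far the agreement carries.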
