What quantile types match the 3 definitions of percentile in R?
There are 3 definitions of percentile.
Which values of the type argument of quantile() match these three definitions?
If by "quantile" you mean quantile(): none of them. It is not that simple. As the documentation says when you run help(quantile):
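One of the nine quantile algorithms discussed in Hyndman and Fan (1996), selected by type, is employed.
The paper can be found here:
It is worth a read to appreciate what is involved in getting a computer to do the "intuitive" thing. :)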
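To make the type argument a little more concrete, here is a minimal sketch of the continuous types 4 to 9, following the formulas as I read them on the help(quantile) page: each type interpolates linearly between two neighbouring order statistics and differs only in an offset m. The helper name my_quantile is hypothetical and not part of R; it handles a single probability only and skips the rounding fuzz that the real quantile() applies.

```r
# Hypothetical helper (not part of base R): continuous quantile types 4-9,
# based on the formulas given on the help(quantile) page.
my_quantile <- function(x, p, type = 7) {
  stopifnot(type %in% 4:9, length(p) == 1, p >= 0, p <= 1)
  x <- sort(x)
  n <- length(x)
  # Offset m that distinguishes the types
  m <- switch(as.character(type),
              "4" = 0,
              "5" = 0.5,
              "6" = p,
              "7" = 1 - p,
              "8" = (p + 1) / 3,
              "9" = p / 4 + 3 / 8)
  h <- n * p + m                   # fractional position in the sorted sample
  j <- floor(h)                    # index of the lower order statistic
  g <- h - j                       # interpolation weight in [0, 1)
  lo <- x[min(max(j, 1), n)]       # clamp both indices to 1..n at the extremes
  hi <- x[min(max(j + 1, 1), n)]
  (1 - g) * lo + g * hi
}

# Compare with the built-in for one type and one probability
my_quantile(1:10, 0.29, type = 6)
quantile(1:10, 0.29, type = 6, names = FALSE)
```

Types 1 to 3 are discontinuous (they pick or average sample values instead of interpolating), so they are left out of this sketch.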
You can experiment and fiddle with the numbers in prbs to see how quantile behaves:
aa <- 1:10
prbs <- c(0.2, 0.22, 0.29)
# print the quantiles for each of the nine types
for (typ in 1:9) {
  this_line <- paste0("type=", typ)
  this_val <- paste0("qval=", quantile(aa, probs = prbs, type = typ))
  print(paste(this_line, this_val))
}
This gives the following:
[1] "type=1 qval=2" "type=1 qval=3" "type=1 qval=3"
[1] "type=2 qval=2.5" "type=2 qval=3" "type=2 qval=3"
[1] "type=3 qval=2" "type=3 qval=2" "type=3 qval=3"
[1] "type=4 qval=2" "type=4 qval=2.2" "type=4 qval=2.9"
[1] "type=5 qval=2.5" "type=5 qval=2.7" "type=5 qval=3.4"
[1] "type=6 qval=2.2" "type=6 qval=2.42" "type=6 qval=3.19"
[1] "type=7 qval=2.8" "type=7 qval=2.98" "type=7 qval=3.61"
[1] "type=8 qval=2.4" "type=8 qval=2.60666666666667" "type=8 qval=3.33"
[1] "type=9 qval=2.425" "type=9 qval=2.63" "type=9 qval=3.3475"
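To get an answer, I generated random samples and used a correlation test to find which quantile types match the three definitions. This is not the most elegant code, but...
Here is the code: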
##### program to test types
## generate 100 random samples of 100 numbers
set.seed(3)
x <- rnorm(100000,mean = 50, sd = 10)
means <- replicate(100, sample(x, 100))
#create answer matrix
answers <- matrix(ncol=12)
# column names chosen to match the headers in the results shown below
colnames(answers) <- c("def1", "def2", "def3", paste0("quan", 1:9))
printallper <- function(x, bar) {
  # get values for per calcs
  bar <- sort(bar)
  per <- (x / 100) * (length(bar) + 1)
  # get per1 (R truncates a fractional index such as 25.25 to 25)
  perres1 <<- round(bar[per + 1], digits = 2)
  # get per2
  perres2 <<- round(bar[per], digits = 2)
  # get per3: linear interpolation between the two neighbouring values
  whole <- floor(per)
  dec <- per - whole
  low <- bar[per]
  high <- bar[per + 1]
  final <- (dec * (high - low)) + bar[per]
  perres3 <<- round(final, digits = 2)
  # q types
  q1 <- round(quantile(bar, (x / 100), type = 1), digits = 2)
  q2 <- round(quantile(bar, (x / 100), type = 2), digits = 2)
  q3 <- round(quantile(bar, (x / 100), type = 3), digits = 2)
  q4 <- round(quantile(bar, (x / 100), type = 4), digits = 2)
  q5 <- round(quantile(bar, (x / 100), type = 5), digits = 2)
  q6 <- round(quantile(bar, (x / 100), type = 6), digits = 2)
  q7 <- round(quantile(bar, (x / 100), type = 7), digits = 2)
  q8 <- round(quantile(bar, (x / 100), type = 8), digits = 2)
  q9 <- round(quantile(bar, (x / 100), type = 9), digits = 2)
  # append one row of results to the global answers matrix
  answers <<- rbind(answers, c(perres1, perres2, perres3, q1, q2, q3, q4, q5, q6, q7, q8, q9))
}
#run all percentiles for data in means matrix
apply(means,1,function(x) printallper(25,x))
# correlate various percentiles
cor_answers <- cor(answers[complete.cases(answers),])
# print correlations of the 3 definitions of percentile with the quantile types
cor_answers[1:3,]
Results:
       def1   def2   def3  quan1  quan2  quan3  quan4  quan5  quan6  quan7  quan8  quan9
def1 1.0000 0.9763 0.9867 0.9763 0.9941 0.9763 0.9763 0.9941 0.9867 0.9985 0.9920 0.9763
def2 0.9763 1.0000 0.9984 1.0000 0.9939 1.0000 1.0000 0.9939 0.9984 0.9864 0.9958 0.9926
def3 0.9867 0.9984 1.0000 0.9984 0.9984 0.9984 0.9984 0.9984 1.0000 0.9939 0.9993 0.9984
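The results show that def2 correlates at 1.0000 with types 1, 3 and 4, and def3 correlates at 1.0000 with type 6. def1 has no exact match among the nine types; its highest correlation is 0.9985, with type 7.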
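A correlation of 1.0000 at four decimals cannot distinguish an exact match from a very close one, and it only covers the 25th percentile of samples of size 100. A more direct check is to compare the values themselves. The sketch below assumes the three definitions are exactly the ones coded in printallper above; three_defs and matching_types are hypothetical helper names introduced only for this illustration.

```r
# Assumption: def1, def2, def3 are the three calculations used in printallper.
three_defs <- function(bar, pct) {
  bar <- sort(bar)
  per <- (pct / 100) * (length(bar) + 1)   # position (n + 1) * p
  lo <- floor(per)
  c(def1 = bar[lo + 1],                                    # next value up
    def2 = bar[lo],                                        # value at floor(position)
    def3 = bar[lo] + (per - lo) * (bar[lo + 1] - bar[lo])) # linear interpolation
}

# For each definition, list the quantile types that give the same value
matching_types <- function(bar, pct, tol = 1e-8) {
  defs <- three_defs(bar, pct)
  qs <- sapply(1:9, function(t) quantile(bar, pct / 100, type = t, names = FALSE))
  lapply(defs, function(d) which(abs(qs - d) < tol))
}

set.seed(3)
matching_types(rnorm(100, mean = 50, sd = 10), 25)  # n = 100, 25th percentile
matching_types(1:10, 29)                            # n = 10, 29th percentile
```

The second call is there because a match can depend on the sample size and the percentile. For aa <- 1:10 and a probability of 0.29, the output listed earlier gives 2.9 for type 4, while def2 picks the third sorted value, 3. So the agreement between def2 and type 4 seen at n = 100 and the 25th percentile should not be read as a general equivalence.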