
How to exactly match the results of the cumulative distribution function and the quantile function?

As we know, the quantile function is the inverse of the cumulative distribution function.

Then, for an existing empirical distribution (a data vector), how can the results of the cumulative distribution function and the quantile function be made to match exactly?

Here is an example in MATLAB.

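a = [150   154   151   153   124];
% count the occurrences of each unique value (x_val comes back sorted)
[x_count, x_val] = hist(a, unique(a));
% compute the empirical cumulative probabilities
p = cumsum(x_count)/sum(x_count);
% try to invert the CDF with quantile
x_out = quantile(a, p)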

In the cumulative distribution function, the correspondence between the cumulative probability p and the value x should be:

x = 124   150   151   153   154
p = 0.2000    0.4000    0.6000    0.8000    1.0000

But when p and quantile are used to compute x_out, the result differs from x:

x_out =

  137.0000  150.5000  152.0000  153.5000  154.0000

References

  1. quantile function
  2. MATLAB quantile function

From the docs:

For a data vector of five elements such as {6, 3, 2, 10, 1}, the sorted elements {1, 2, 3, 6, 10} respectively correspond to the 0.1, 0.3, 0.5, 0.7, 0.9 quantiles.

So if you want to get back exactly the numbers you put in for x, and your x has 5 elements, then your p needs to be p = [0.1, 0.3, 0.5, 0.7, 0.9]. The complete algorithm is explicitly defined in the documentation.
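As a rough sketch of the rule quoted above, the snippet below asks quantile for the (i - 0.5)/n levels and then rebuilds the interpolation by hand for the p used in the question. The variable names (p_mid, pos, x_manual) are made up for illustration, the manual interp1 step is my reading of the documented algorithm rather than MATLAB's actual source, and quantile itself is assumed to be available (it ships with the Statistics and Machine Learning Toolbox).

```matlab
a = [150 154 151 153 124];
n = numel(a);
x_sorted = sort(a);                      % 124 150 151 153 154

% Levels at which the sorted data are returned unchanged: (i - 0.5)/n
p_mid  = ((1:n) - 0.5) / n;              % 0.1 0.3 0.5 0.7 0.9
x_back = quantile(a, p_mid);             % should equal x_sorted up to rounding

% Hand-rolled interpolation for the p from the question (assumed rule:
% the i-th sorted value sits at level (i - 0.5)/n, linear in between,
% clamped to the smallest/largest value outside that range)
p   = (1:n) / n;                         % 0.2 0.4 0.6 0.8 1.0
pos = min(max(n * p + 0.5, 1), n);       % fractional index: 1.5 2.5 3.5 4.5 5
x_manual = interp1(1:n, x_sorted, pos);  % approx. 137 150.5 152 153.5 154
```

Under these assumptions, p = 0.2 lands halfway between the first and second sorted values, (124 + 150)/2 = 137, which is the first entry of x_out in the question.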

You have assumed that to get x back, p should be [0.2, 0.4, 0.6, 0.8, 1]. But then why not p = [0, 0.2, 0.4, 0.6, 0.8]? MATLAB's algorithm seems to simply take the average of the two conventions, which gives [0.1, 0.3, 0.5, 0.7, 0.9].
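The arithmetic behind that observation can be checked in a few lines. This is only an illustration; the names p_upper, p_lower and p_avg are hypothetical and do not come from MATLAB.

```matlab
n = 5;
i = 1:n;

p_upper = i / n;                   % 0.2 0.4 0.6 0.8 1.0  (convention assumed in the question)
p_lower = (i - 1) / n;             % 0   0.2 0.4 0.6 0.8  (the equally plausible alternative)
p_avg   = (p_upper + p_lower) / 2; % 0.1 0.3 0.5 0.7 0.9  (same as (i - 0.5)/n)
```

The averaged levels are the ones the documentation assigns to the sorted data, so only they return the original values from quantile.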

Note that R defines nine different algorithms for quantiles, so you need to state your assumptions clearly.
