
How to create a Q-Q plot between two samples of different size in Python?

I have an original data sample and a simulated version of it (don't ask me how I simulated it), and I want to check whether their histograms match. The best way seems to be a Q-Q plot, but the statsmodels library does not allow samples of different sizes.

Constructing a Q-Q plot involves finding corresponding quantiles in both sets and plotting them against one another. When one set is larger than the other, common practice is to take the quantile levels of the smaller set and use linear interpolation to estimate the corresponding quantiles in the larger set. This is described here: http://www.itl.nist.gov/div898/handbook/eda/section3/qqplot.htm

This is relatively straightforward to do manually:

import numpy as np
import matplotlib.pyplot as plt

test1 = np.random.normal(0, 1, 1000)
test2 = np.random.normal(0, 1, 800)

# Calculate quantiles
test1.sort()
quantile_levels1 = np.arange(len(test1), dtype=float) / len(test1)

test2.sort()
quantile_levels2 = np.arange(len(test2), dtype=float) / len(test2)

# Use the smaller set of quantile levels to create the plot
quantile_levels = quantile_levels2

# We already have the set of quantiles for the smaller data set
quantiles2 = test2

# We find the set of quantiles for the larger data set using linear interpolation
quantiles1 = np.interp(quantile_levels, quantile_levels1, test1)

# Plot the quantiles to create the qq plot
plt.plot(quantiles1, quantiles2)

# Add a reference line
maxval = max(test1[-1], test2[-1])
minval = min(test1[0], test2[0])
plt.plot([minval, maxval], [minval, maxval], 'k-')

plt.show()
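
If you want something reusable, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: `qq_points` is a made-up helper name (it is not part of NumPy or statsmodels), the `(i + 0.5) / n` plotting positions are just one common convention, and it relies on `np.quantile`, whose default method interpolates linearly. Unlike the code above, it interpolates both samples at the shared levels instead of using the smaller sample's sorted values directly, so the points will differ slightly.

```python
import numpy as np


def qq_points(sample_a, sample_b):
    """Return matched quantiles (q_a, q_b) for two samples of any sizes.

    Hypothetical helper for illustration; not a NumPy or statsmodels function.
    """
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    # Use as many quantile levels as the smaller sample has points.
    n = min(a.size, b.size)
    # Assumed plotting positions: midpoints of n equal-width probability bins.
    levels = (np.arange(n) + 0.5) / n
    # np.quantile interpolates linearly between order statistics by default.
    return np.quantile(a, levels), np.quantile(b, levels)


rng = np.random.default_rng(0)
q_a, q_b = qq_points(rng.normal(0, 1, 1000), rng.normal(0, 1, 800))
```

Plotting `q_a` against `q_b` as points (for example with `plot(q_a, q_b, 'o')`) and adding the same reference line as above gives the Q-Q plot.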
