python: sns distplot area overlap

How can I get the overlapping area of two sns.distplots?

Apart from the difference in means (as below), I would like to add a number that describes how different the (normalised) distributions are. For example, two distributions could have the same mean but still look very different if they are not normal.

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

x1 = np.random.normal(size=2000)
x2 = np.random.normal(size=1000)+1

sns.distplot(x1, hist=False, kde=True, color="r", norm_hist=True)
sns.distplot(x2, hist=False, kde=True, color="b", norm_hist=True)

m1 = x1.mean()
m2 = x2.mean()

plt.title("m1={:2.2f}, m2={:2.2f} (diffInMean={:2.2f})".format(m1, m2, m1-m2))

plt.show(block=True)

In case anybody is interested: I now approximate it by numerically integrating the binned distributions (unfortunately not quite the one-liner I was looking for):

import matplotlib.pyplot as plt
import numpy as np

data1 = np.random.normal(size=9000)
data2 = np.random.normal(size=5000, loc=0.5, scale=1.5)
num_bins = 100

# Shared bin edges so both histograms are compared bin by bin
# (num_bins + 1 edges give num_bins bins).
xmin = min(data1.min(), data2.min())
xmax = max(data1.max(), data2.max())
bins = np.linspace(xmin, xmax, num_bins + 1)

# Weight each sample by 1/N so that every histogram sums to 1.
weights1 = np.ones_like(data1) / len(data1)
weights2 = np.ones_like(data2) / len(data2)

hist_1 = np.histogram(data1, bins, weights=weights1)[0]
hist_2 = np.histogram(data2, bins, weights=weights2)[0]

# Total variation distance; 1 - tvd is the probability mass the two share.
tvd = 0.5 * np.abs(hist_1 - hist_2).sum()
print("overlap: {:.2f} percent".format((1 - tvd) * 100))

plt.figure()
ax = plt.gca()
ax.hist(data1, bins, weights=weights1, color='red', edgecolor='white', alpha=0.5)
ax.hist(data2, bins, weights=weights2, color='blue', edgecolor='white', alpha=0.5)
plt.show()
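
Since the question is about the smoothed curves rather than histograms, the same idea can also be tried on kernel density estimates: evaluate both densities on a common grid and integrate the pointwise minimum. The following is only a minimal sketch. The helper names `gaussian_kde_pdf` and `kde_overlap`, the grid size, and the padding are my own choices for illustration, and the bandwidth is assumed to follow Scott's rule, so the result will not necessarily match the curves seaborn draws if different smoothing settings are in use.

```python
import numpy as np


def gaussian_kde_pdf(samples, grid):
    """Evaluate a 1-D Gaussian KDE of `samples` on `grid`.

    Assumption: bandwidth from Scott's rule, h = std * n**(-1/5).
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    h = samples.std(ddof=1) * n ** (-1.0 / 5.0)
    # Standardised distance between every grid point and every sample.
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))


def kde_overlap(a, b, num_points=512, pad=3.0):
    """Approximate the area under min(pdf_a, pdf_b), a value between 0 and 1."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pad the grid so the tails of both estimates are covered.
    spread = max(a.std(ddof=1), b.std(ddof=1))
    lo = min(a.min(), b.min()) - pad * spread
    hi = max(a.max(), b.max()) + pad * spread
    grid = np.linspace(lo, hi, num_points)
    dx = grid[1] - grid[0]
    shared = np.minimum(gaussian_kde_pdf(a, grid), gaussian_kde_pdf(b, grid))
    # Riemann sum on the uniform grid.
    return float(shared.sum() * dx)


rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
x2 = rng.normal(size=1000) + 1
print("overlap: {:.1f} percent".format(100 * kde_overlap(x1, x2)))
```

An overlap near 1 means the two estimated densities are almost identical, and a value near 0 means they barely share any probability mass. Unlike the histogram version, the result does not depend on a bin count, but it does depend on the bandwidth.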
