
Is there a way to plot multiple cumulative histograms on the same axes, where the datasets are normalized?

I am attempting to plot two datasets on the same axes using matplotlib.pyplot.hist(). I would like the datasets to share a y axis such that the 100% value of each dataset appears at the same spot on the plot. The datasets contain different amounts of data. I am generating cumulative distribution plots, both for total area and for percent density. I have tried passing density = True as an argument, but the plots did not have the same shape as when plotted separately. Here is the code I have used so far:


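import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

# condensed and montreal are defined elsewhere (not shown)
data = pd.Series(condensed['slope'])
data_mtl = pd.Series(montreal['slope'])

fig, ax = plt.subplots(figsize=(60, 20))
# Raw cumulative counts for both datasets, drawn as step outlines
ax.hist([data, data_mtl], bins=200, color=['g', 'r'], cumulative=True, histtype='step')
ax.set_xlabel('ΔCO/ΔCO₂ (ppb/ppm)')
ax.set_ylabel("% Density")

# Enlarge every text element on the axes
for item in ([ax.title, ax.xaxis.label, ax.yaxis.label] +
             ax.get_xticklabels() + ax.get_yticklabels()):
    item.set_fontsize(50)

# The y axis is scaled so that len(data_mtl) counts reads as 100%
ax.yaxis.set_major_formatter(ticker.PercentFormatter(xmax=len(data_mtl)))
plt.show()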

[Image: generated plot]

EDIT: Separate plots with an 80th percentile line vs. with density = True

[Image: September 1]

[Image: September 2]

With density = True:

[Image: together]

Provided you use density=True and cumulative=True for all plots, it should work fine:

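import matplotlib.pyplot as plt
import numpy as np

# Two samples of very different sizes
data1 = np.random.normal(size=100)
data2 = np.random.rand(4000)
# density=True normalizes each histogram; cumulative=True accumulates it up to 1.0
plt.hist(data1, cumulative=True, density=True)
plt.hist(data2, cumulative=True, density=True)
plt.show()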
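As a quick numerical check of the snippet above, here is a minimal sketch that computes the cumulative, density-normalized bin heights by hand. It uses np.histogram instead of plt.hist (my substitution, so no figure is needed), and the helper name cumulative_density and the fixed seed are assumptions for illustration only.

```python
import numpy as np

# Fixed seed so the sketch is reproducible (an assumption, not in the original answer)
rng = np.random.default_rng(0)
small = rng.normal(size=100)   # 100 samples
large = rng.random(4000)       # 4000 samples


def cumulative_density(values, bins=10):
    """Hypothetical helper: cumulative sum of density * bin width for each bin."""
    density, edges = np.histogram(values, bins=bins, density=True)
    # density * width is the fraction of samples in each bin; cumsum accumulates it
    return np.cumsum(density * np.diff(edges)), edges


cum_small, _ = cumulative_density(small)
cum_large, _ = cumulative_density(large)

# Both curves should end at (approximately) 1.0 despite the different sample sizes
print(cum_small[-1], cum_large[-1])
```

The last value of each curve is the total fraction of samples, which is why both datasets share the same top of the y axis regardless of how many points they contain.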


If your x-axis is not large enough to include all the data and has a sparse region, it could appear that a curve "does not reach 1.0". I would try again with density=True. Also, cumulative=True is required to guarantee that the maximum bin height is the same for every dataset.
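Applied to a setup like the one in the question, this might look roughly like the sketch below. The random arrays are placeholders standing in for condensed['slope'] and montreal['slope'], and the shared bin edges and PercentFormatter(xmax=1.0) are my assumptions about what would suit the asker's plot, not something stated in the answer.

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

# Placeholder data (assumption): stand-ins for the asker's two slope columns
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=300)
data_mtl = rng.normal(loc=6.0, scale=3.0, size=5000)

# Shared bin edges spanning both datasets, so no data falls outside the x range
lo = min(data.min(), data_mtl.min())
hi = max(data.max(), data_mtl.max())
bins = np.linspace(lo, hi, 201)  # 200 bins, as in the question

fig, ax = plt.subplots(figsize=(12, 4))
# Without stacking, each dataset is normalized separately, then accumulated
heights, edges, _ = ax.hist(
    [data, data_mtl],
    bins=bins,
    color=['g', 'r'],
    density=True,
    cumulative=True,
    histtype='step',
    label=['dataset 1', 'dataset 2'],
)

# With density=True the curves top out at 1.0, so 1.0 is what should read as 100%
ax.yaxis.set_major_formatter(ticker.PercentFormatter(xmax=1.0))
ax.set_xlabel('ΔCO/ΔCO₂ (ppb/ppm)')
ax.set_ylabel('Cumulative fraction')
ax.legend(loc='upper left')
```

Call plt.show() afterwards to display the figure. Using one set of bin edges for both datasets is a design choice that keeps the curves directly comparable and avoids the clipped x range mentioned above.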
