Plot a fitted curve for a percentage histogram (not the actual data)

Question

I first tried to plot my data as percentages, as follows:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
plt.hist(data, weights=np.ones(len(data)) / len(data), bins=5)
plt.gca().yaxis.set_major_formatter(PercentFormatter(1))
plt.grid()
plt.show()

This gives me the following plot.

I then used this line to fit a curve to the "percentage data":

import seaborn as sns
p = sns.displot(data=data, x="Dist", kde=True, bins=5)

Which gives me this:

But this curve was fitted to the raw data, not to the percentages in the 5 bins. If, for example, you had 10 bins, you could understand why there is a bump at the end. We don't want to see that bump. What I really want is a curve like this:

Answer 1

The KDE plot approximates the data as a sum of Gaussian bell curves. One idea is to regroup the data and place the values at the center of each bar.

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

z = [1.83E-05,2.03E-05,3.19E-05,3.39E-05,3.46E-05,3.56E-05,3.63E-05,3.66E-05,4.13E-05,4.29E-05,4.29E-05,4.79E-05,5.01E-05,5.07E-05,5.08E-05,5.21E-05,5.39E-05,5.75E-05,5.91E-05,5.95E-05,5.98E-05,6.00E-05,6.40E-05,6.41E-05,6.67E-05,6.79E-05,6.79E-05,6.92E-05,7.03E-05,7.17E-05,7.45E-05,7.75E-05,7.99E-05,8.03E-05,8.31E-05,8.74E-05,9.69E-05,9.80E-05,9.86E-05,0.000108267,0.000108961,0.000109634,0.000111083,0.000111933,0.00011491,0.000126831,0.000135493,0.000138174,0.000141792,0.000150507,0.000155346,0.000155516,0.000202407,0.000243149,0.000248106,0.00025259,0.000254496,0.000258372,0.000258929,0.000265318,0.000293665,0.000312719,0.000430077]

counts, bin_edges = np.histogram(z, 5)
centers = (bin_edges[:-1] + bin_edges[1:]) / 2
regrouped_data = np.repeat(centers, counts)

sns.histplot(data=regrouped_data, kde=True, bins=bin_edges)
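
To make the "sum of Gaussian bell curves" idea concrete, here is a minimal NumPy-only sketch of what a KDE computes on the regrouped data. The helper name `gaussian_kde_manual`, the synthetic stand-in sample, and the Scott's-rule bandwidth are all assumptions for illustration; seaborn's own curve may differ in bandwidth and scaling.

```python
import numpy as np

def gaussian_kde_manual(samples, grid, bandwidth):
    """Average of Gaussian bumps, one centered on each sample (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    # Standardized distance from every grid point to every sample
    u = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * u**2) / (bandwidth * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)

# Stand-in data (an assumption, not the data from the question)
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=0.6, size=200)

# Regroup: every observation is moved to the center of its bin
counts, bin_edges = np.histogram(sample, 5)
centers = (bin_edges[:-1] + bin_edges[1:]) / 2
regrouped = np.repeat(centers, counts)

# Scott's rule-of-thumb bandwidth (an assumed choice)
bandwidth = regrouped.std(ddof=1) * len(regrouped) ** (-1 / 5)

# Evaluate the curve on a grid that extends past both ends of the histogram
grid = np.linspace(bin_edges[0] - 4 * bandwidth, bin_edges[-1] + 4 * bandwidth, 2000)
density = gaussian_kde_manual(regrouped, grid, bandwidth)

# Riemann-sum check that the curve is a proper density
area = float(np.sum(density) * (grid[1] - grid[0]))
```

Because the regrouped data contains only five distinct values, the curve is built from five weighted bells, so it follows the bars rather than the fine structure of the raw data.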

Normally, a kdeplot can be extended via the clip= parameter, but unfortunately kde_kws={'clip': bin_edges[[0, -1]]} doesn't work here. To extend the KDE, one trick is to keep the lowest and highest values of the original data: subtract one from the counts of the first and last bins, and append the lowest and highest values to the regrouped data.

counts, bin_edges = np.histogram(z, 5)
centers = (bin_edges[:-1] + bin_edges[1:]) / 2
counts[[0, -1]] -= 1
regrouped_data = np.concatenate([np.repeat(centers, counts), bin_edges[[0, -1]]])

sns.histplot(data=regrouped_data, kde=True, bins=bin_edges, stat='percent')
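
As a quick sanity check on the trick above, the following NumPy-only sketch confirms that the adjusted regrouping leaves the bar heights unchanged while restoring the original minimum and maximum. The names `adjusted`, `recount`, and `percent` are introduced here for illustration only.

```python
import numpy as np

z = [1.83E-05,2.03E-05,3.19E-05,3.39E-05,3.46E-05,3.56E-05,3.63E-05,3.66E-05,4.13E-05,4.29E-05,4.29E-05,4.79E-05,5.01E-05,5.07E-05,5.08E-05,5.21E-05,5.39E-05,5.75E-05,5.91E-05,5.95E-05,5.98E-05,6.00E-05,6.40E-05,6.41E-05,6.67E-05,6.79E-05,6.79E-05,6.92E-05,7.03E-05,7.17E-05,7.45E-05,7.75E-05,7.99E-05,8.03E-05,8.31E-05,8.74E-05,9.69E-05,9.80E-05,9.86E-05,0.000108267,0.000108961,0.000109634,0.000111083,0.000111933,0.00011491,0.000126831,0.000135493,0.000138174,0.000141792,0.000150507,0.000155346,0.000155516,0.000202407,0.000243149,0.000248106,0.00025259,0.000254496,0.000258372,0.000258929,0.000265318,0.000293665,0.000312719,0.000430077]

counts, bin_edges = np.histogram(z, 5)
centers = (bin_edges[:-1] + bin_edges[1:]) / 2

# Work on a copy so the original counts stay available for comparison
adjusted = counts.copy()
adjusted[[0, -1]] -= 1
regrouped_data = np.concatenate([np.repeat(centers, adjusted), bin_edges[[0, -1]]])

# Re-bin the regrouped data with the same edges; the outer edges fall
# into the first and last bins, so the counts should match the original
recount, _ = np.histogram(regrouped_data, bins=bin_edges)

# Bar heights as percentages of all observations
percent = 100 * recount / recount.sum()
```

The two appended edge values replace one center value in each outer bin, so the histogram is identical while the data range seen by the KDE spans the full width of the bars.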

Answer 1 was accepted on 2022-12-06 22:08:18.
