
Seaborn: distplot() with relative frequency

I am trying to make some histograms in Seaborn for a research project. I would like the y-axis to show relative frequency and the x-axis to run from -180 to 180. Here is the code I have for one of my histograms:

import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline
import seaborn as sns

df = pd.read_csv('sample.csv', index_col=0)

x = df.Angle
sns.distplot(x, kde=False);

This outputs: [image: Seaborn frequency plot]

I can't figure out how to convert the output to a frequency instead of a count. I've tried a number of different types of graphs to get frequency output, but to no avail. I have also come across this question, which appears to ask for a countplot with frequencies (but with another function). I've tried using it as a guide but have failed. Any help would be greatly appreciated. I'm very new to this software and to Python as well.

My data looks like the following and can be downloaded: sample data

There is a sns.distplot argument that converts the histogram from counts to a density (which is how seaborn refers to the normalized frequency). It defaults to False, so you have to set it to True. Note that a density is scaled so that the bar areas sum to 1, so the bar heights equal relative frequencies only when the bin width is 1. In your case:

sns.distplot(x, kde=False, norm_hist=True)

Then, if you want the x-axis to run from -180 to 180, just use:

plt.xlim(-180,180)

From the Seaborn docs:

norm_hist : bool, optional

If True, the histogram height shows a density rather than a count. This is implied if a KDE or fitted density is plotted.
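A density is not quite the same thing as a relative frequency. The following is a minimal sketch with NumPy of how the two are related; the sample values and the 20-degree bins are illustrative assumptions, not the asker's data. The density in a bin is its relative frequency divided by the bin width, so it is the bar areas, not the bar heights, that sum to 1.

```python
import numpy as np

# Illustrative sample of angles (an assumption; not the asker's data).
angles = np.array([-170.0, -95.5, -20.0, -3.2, 4.1, 12.7, 15.0, 88.8, 179.0])

# Bin edges from -180 to 180 in steps of 20 degrees.
bins = np.arange(-180, 181, 20)

counts, edges = np.histogram(angles, bins)
density, _ = np.histogram(angles, bins, density=True)

widths = np.diff(edges)            # every bin is 20 wide here
rel_freq = counts / counts.sum()   # bar heights that sum to 1

# Density times bin width recovers the relative frequency.
print(np.allclose(density * widths, rel_freq))
print(rel_freq.sum(), density.sum())
```

With bins of width 20, the density values are the relative frequencies divided by 20, which is why a norm_hist=True plot of angle data shows very small numbers on the y-axis.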

Especially as a beginner, try to keep things simple. You have a list of numbers

a = [-0.126,1,9,72.3,-44.2489,87.44]

of which you want to create a histogram. In order to define a histogram, you need some bins. So let's say you want to divide the range between -180 and 180 into bins of width 20,

import numpy as np
bins = np.arange(-180,181,20)

You can compute the histogram with numpy.histogram, which returns the counts in the bins.

hist, edges = np.histogram(a, bins)
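Two details of numpy.histogram matter for angle data. Each bin includes its left edge and excludes its right edge, except the last bin, which includes both; values outside the outermost edges are silently dropped. A small sketch, with values made up purely for illustration:

```python
import numpy as np

bins = np.arange(-180, 181, 20)   # 19 edges, hence 18 bins

# Hypothetical values chosen to sit on or beyond the edges.
values = [-180, 0, 180, 181, -200]

hist, edges = np.histogram(values, bins)

print(len(edges), len(hist))   # number of edges and number of bins
print(hist[0], hist[-1])       # -180 lands in the first bin, 180 in the last
print(hist.sum())              # 181 and -200 are not counted
```

So an angle of exactly 180 is still counted, but anything beyond the range is not, which changes the total you divide by when computing relative frequencies.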

The relative frequency is the number in each bin divided by the total number of events,

freq = hist/float(hist.sum())

The quantity freq is hence the relative frequency, which you want to plot as a bar plot

import matplotlib.pyplot as plt
plt.bar(bins[:-1], freq, width=20, align="edge", ec="k" )

This results in the following plot, from which you can read e.g. that 33% of the values lie in the range between 0 and 20.

[image: bar plot of the relative frequencies]
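The 33% figure can also be checked without looking at the plot. This is a short sketch that reuses the list a and the bins from above; the use of np.searchsorted to locate the bin starting at 0 is just one convenient way to do the lookup.

```python
import numpy as np

a = [-0.126, 1, 9, 72.3, -44.2489, 87.44]
bins = np.arange(-180, 181, 20)

hist, edges = np.histogram(a, bins)
freq = hist / hist.sum()

# Print only the non-empty bins as "[left, right): frequency".
for left, right, f in zip(edges[:-1], edges[1:], freq):
    if f > 0:
        print(f"[{left}, {right}): {f:.3f}")

# Locate the bin whose left edge is 0 and read off its frequency.
i = int(np.searchsorted(edges, 0))
print(freq[i])
```

Two of the six values (1 and 9) fall between 0 and 20, so that bin holds 2/6 of the data; -0.126 is just below 0 and lands in the bin from -20 to 0.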

Complete code:

import numpy as np
import matplotlib.pyplot as plt

a = [-0.126,1,9,72.3,-44.2489,87.44]

bins = np.arange(-180,181,20)

hist, edges = np.histogram(a, bins)
freq = hist/float(hist.sum())

plt.bar(bins[:-1],freq,width=20, align="edge", ec="k" )

plt.show()
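As a possible alternative, the same numbers can be obtained in a single step by giving every observation a weight of 1/N through the weights argument of numpy.histogram (matplotlib's plt.hist accepts a weights argument as well). This is a sketch of that idea rather than part of the original answer:

```python
import numpy as np

a = [-0.126, 1, 9, 72.3, -44.2489, 87.44]
bins = np.arange(-180, 181, 20)

# Each observation contributes 1/N to its bin instead of 1.
weights = np.full(len(a), 1.0 / len(a))
freq_w, edges = np.histogram(a, bins, weights=weights)

# Compare with dividing the counts by their total.
hist, _ = np.histogram(a, bins)
freq = hist / hist.sum()
print(np.allclose(freq_w, freq))
```

The two approaches agree here because every value lies inside the bin range. If some values fell outside it, the weighted version would still divide by the full length of a, while hist.sum() only counts the values that were binned.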
