How can I change the values on the y-axis of a histogram plot in Python?

I have data in a CSV file and I am trying to plot a histogram using matplotlib. Here is the code I am using:

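import pandas as pd
from matplotlib import pyplot as plt

data = pd.read_csv('data.csv')  # hypothetical file name; the question does not give one
data.hist(bins=10)
plt.ylabel('Frequency')
plt.xlabel('Data')
plt.show()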

[Image: histogram produced by the code above]

This is the plot that I get. Now, using the same code, I need to create a normalized histogram that shows the probability distribution of the data. On the y-axis, instead of the number of data points that fall in each bin, it should show the number of data points in that bin divided by the total number of data points.

How should I do it?

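pandas' DataFrame.hist() is a thin wrapper around matplotlib's pyplot.hist(), and most keyword arguments are passed through to it. One of them is density=. With density=True, the bar heights are scaled so that the total area of the histogram is 1.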

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

data = pd.DataFrame(np.random.uniform(258.1, 262.3, 20))
data.hist(bins=10, density=True)
plt.ylabel('Density')
plt.xlabel('Data')
plt.show()

[Image: example plot]
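
Note that density=True divides each count by both the total number of points and the bin width, so the bar heights only equal count / N when the bins have width 1. If the goal is literally "points in the bin divided by total points", one option is the weights= parameter of pyplot.hist(). The following is a minimal sketch, assuming the same kind of random illustrative data as above (the seed and the variable names values, weights, heights and fractions_from_density are only for this example):

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Illustrative data only, same shape as the example above; seeded so it repeats.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.uniform(258.1, 262.3, 20))
values = data[0].to_numpy()

# A weight of 1/N per point makes each bar's height equal to count / N.
weights = np.full(len(values), 1.0 / len(values))
heights, edges, _ = plt.hist(values, bins=10, weights=weights)

# For comparison: density=True gives count / (N * bin_width), so multiplying
# by the bin widths recovers the same per-bin fractions.
density, _ = np.histogram(values, bins=10, density=True)
fractions_from_density = density * np.diff(edges)

plt.ylabel('Fraction of data points')
plt.xlabel('Data')
plt.show()
```

With this weighting the bar heights sum to 1, whereas with density=True it is the bar areas that sum to 1.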

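A related library, seaborn, can draw a density histogram together with a KDE curve as a smooth approximation of the probability distribution. In seaborn 0.11 and later this is done with histplot(); the older distplot() is deprecated.

import seaborn as sns
sns.histplot(data[0], bins=10, stat='density', kde=True)
plt.show()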
