
Spread out data on the histogram matplotlib jupyter

Bear with me, I'm new to matplotlib. I am trying to spread out the data in my histogram. Below is the result of what I coded:

[Image: my attempt]

What I want to achieve is this:

[Image: what I want to achieve]

I tried spreading out the bins, but that only decreased the frequency and did not spread out the graph. Below is my code:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

#Loading data
url = 'https://raw.githubusercontent.com/diggledoot/dataset/master/uber-raw-data-apr14.csv'
latlong = pd.read_csv(url)

#Rounding off data for more focused results
n=2
latlong['Lon']=[round(x,n) for x in latlong['Lon']]
latlong['Lat']=[round(x,n) for x in latlong['Lat']]

#Plot
plt.figure(figsize=(8,6))
plt.title('Rides based on latitude')
plt.hist(latlong['Lat'],bins=100,color='cyan')
plt.xlabel('Latitude')
plt.ylabel('Frequency')
plt.xticks(np.arange(round(latlong.Lat.min(),1),round(latlong.Lat.max(),1),0.1),rotation=45)
plt.show()

How do I space out x-ticks in a similar fashion to the histogram I want to achieve?

If you do

frequency, bins = np.histogram(latlong['Lat'], bins=20)
print(frequency)
print(bins)

you get

[     1      7     12     18    301  35831 504342  22081   1256    580
     63     12      8      1      2      0      0      0      0      1]
[40.07   40.1725 40.275  40.3775 40.48   40.5825 40.685  40.7875 40.89
 40.9925 41.095  41.1975 41.3    41.4025 41.505  41.6075 41.71   41.8125
 41.915  42.0175 42.12  ]

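You can see that nearly all of the counts fall between about 40.58 and 40.89, while a handful of values lie very far from the mean and stretch the x-axis out to 42.12.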
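If you would rather not read the bounds of the dense region off the bin counts by eye, quantiles give a quick estimate. The sketch below uses synthetic data as a stand-in for latlong['Lat'] (an assumption, so that it runs without downloading the CSV), and the 0.1% and 99.9% cutoffs are arbitrary choices, not something taken from the dataset.

```python
import numpy as np

# Synthetic stand-in for latlong['Lat'] (assumption): a tight cluster
# of values plus a few far-away outliers that stretch the range.
rng = np.random.default_rng(0)
lat = np.concatenate([rng.normal(40.74, 0.03, 10_000), [40.07, 41.3, 42.12]])

# Bounds that keep the central 99.8% of the values.
# The 0.001 / 0.999 cutoffs are arbitrary and can be tuned.
low, high = np.quantile(lat, [0.001, 0.999])

# Fraction of values that fall inside the chosen bounds.
inside = (lat >= low) & (lat <= high)

print(f"full range:    {lat.min():.2f} to {lat.max():.2f}")
print(f"central range: {low:.2f} to {high:.2f}")
print(f"share of values inside: {inside.mean():.4f}")
```

With the real data you would pass latlong['Lat'] instead of the synthetic array.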

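You can keep those far-from-the-mean values from stretching the axis by clipping your variable of interest between a specified min and max before plotting the histogram. Note that np.clip does not drop the outliers; it sets them to the nearest bound, so they are counted in the first and last bins. Something like this: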

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
#Loading data
url = 'https://raw.githubusercontent.com/diggledoot/dataset/master/uber-raw-data-apr14.csv'
latlong = pd.read_csv(url)

#Plot
plt.figure(figsize=(8,6))
plt.title('Rides based on latitude')
plt.hist(np.clip(latlong['Lat'], 40.6, 40.9),bins=50,color='cyan')
plt.xlabel('Latitude')
plt.ylabel('Frequency')
plt.show()

This will yield the following

[Image: resulting histogram]
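An alternative worth considering is the range argument of plt.hist, which leaves values outside the given interval out of the bins instead of moving them to the edges as np.clip does. The sketch below also spaces the x-ticks evenly over that interval, which is what the question asks about. It uses synthetic data in place of latlong['Lat'] (an assumption, to keep it runnable offline); the 40.6 and 40.9 bounds are the ones used above, and the tick spacing of 0.05 is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for latlong['Lat'] (assumption): a tight cluster
# of values plus three far-away outliers.
rng = np.random.default_rng(0)
lat = np.concatenate([rng.normal(40.74, 0.03, 10_000), [40.07, 41.3, 42.12]])

low, high = 40.6, 40.9  # same bounds as in the clipping example

fig, ax = plt.subplots(figsize=(8, 6))

# range= restricts the bins to [low, high]; values outside are not counted.
counts, edges, _ = ax.hist(lat, bins=50, range=(low, high), color='cyan')

# Seven evenly spaced ticks from low to high, i.e. one every 0.05 degrees.
ax.set_xticks(np.linspace(low, high, 7))
ax.tick_params(axis='x', rotation=45)

ax.set_title('Rides based on latitude')
ax.set_xlabel('Latitude')
ax.set_ylabel('Frequency')
plt.show()

# The three outliers are missing from the total, unlike with np.clip.
print(int(counts.sum()), lat.size)
```

Whether to clip or to use range depends on whether you want the outliers to remain visible as spikes in the edge bins or to be left out of the plot entirely.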

