
Can someone please help me make my overlapping histograms?

I have three dataframes in the format below. Each has a column with the month of the year as a number and, next to it, a column with the number of items occurring in that month. I want to create an overlapping histogram that shows how the three compare, but for some reason I keep getting the same thing!

    month_box  Sum Value
0           1       4812
1           2       2053
2           3       2405
3           4       2353
4           5       2427
5           6       2484
6           8       2579
7           9       2580
8          10       2497
9          11       2510
10         12       2202

The code I am using is below:

sns.distplot(bex_boxdf['month_box'],kde=False,label = 'Bexley')
sns.distplot(west_boxdf['month_box'],kde=False,label = 'Westminster')
sns.distplot(gwch_boxdf['month_box'],kde=False,label = 'Greenwich')
plt.legend(prop={'size': 12})
plt.title('Crime by month')
plt.xlabel('Month')
plt.ylabel('Density')

I attach the result I get below. Any help would be appreciated, thank you.

[image]

Using the data provided by @Esa, here are three different views using Matplotlib.

There is also a 'stepfilled' histogram type that I didn't include but could be useful depending on the distribution of the data:

import pandas as pd
import matplotlib.pyplot as plt

# Example monthly totals, one column per borough
df = pd.DataFrame({'month_box': [1,2,3,4,5,6,7,8,9,10,11,12],
               'Bexley_sum': [4812,2053,2405,2353,2427,2484,2579,2580,
                             2497,2510,2202,2021],
               'Westminster_sum': [4712,2050,2435,2323,2487,2414,2679,2780,
                             2490,2110,2702,2022],
               'Greenwich_sum': [4812,2053,2405,2353,2427,2484,2579,2580,
                             2497,2510,2202,2021],
               })

# Each series holds the 12 monthly totals for one borough
data = [df["Bexley_sum"], df["Westminster_sum"], df["Greenwich_sum"]]
labels = ["Bexley", "Westminster", "Greenwich"]

# Three panels that share both axes so the views are directly comparable
fig, ax = plt.subplots(ncols=3, sharex=True, sharey=True)
ax[0].hist(x=data, histtype="bar", label=labels)         # side-by-side bars
ax[1].hist(x=data, histtype="barstacked", label=labels)  # stacked bars
ax[2].hist(x=data, histtype="step", label=labels)        # unfilled outlines
plt.legend()  # drawn on the current (last) axes only
plt.show()

[image: three different types of histograms]
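The 'stepfilled' type mentioned above could be sketched roughly as follows. This is only an illustration and not part of the original answer: the helper name `plot_stepfilled` and the `alpha` value are my own assumptions, and the numbers are the same example values used above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Example values copied from the answer above.
df = pd.DataFrame({
    "month_box": list(range(1, 13)),
    "Bexley_sum": [4812, 2053, 2405, 2353, 2427, 2484, 2579, 2580,
                   2497, 2510, 2202, 2021],
    "Westminster_sum": [4712, 2050, 2435, 2323, 2487, 2414, 2679, 2780,
                        2490, 2110, 2702, 2022],
    "Greenwich_sum": [4812, 2053, 2405, 2353, 2427, 2484, 2579, 2580,
                      2497, 2510, 2202, 2021],
})


def plot_stepfilled(frame):
    """Hypothetical helper: overlapping filled histograms of the monthly totals."""
    labels = ["Bexley", "Westminster", "Greenwich"]
    data = [frame[name + "_sum"].to_numpy() for name in labels]
    fig, ax = plt.subplots()
    # Passing all datasets in one call makes them share the same bin edges;
    # alpha < 1 keeps the overlapping regions visible.
    counts, bin_edges, _ = ax.hist(data, histtype="stepfilled",
                                   alpha=0.4, label=labels)
    ax.legend()
    return fig, ax, counts


fig, ax, counts = plot_stepfilled(df)
```

Call `plt.show()` afterwards to display the figure. Because the filled areas cover each other, a fairly low `alpha` tends to be needed for all three to stay readable.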

Matplotlib has a lot of customization options. Referring to the documentation could be useful.

https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html

Sometimes it is easier to plot directly from the dataframe or with matplotlib instead of seaborn, and other times seaborn is better, so it's best to learn both at least at some level.

Here's a simple solution if you first arrange your data into one dataframe. You did not provide the other two dataframes so I set some values as an example.

import pandas as pd
import matplotlib.pyplot as plt

# Example values: one row per month, one column per borough
df = pd.DataFrame({'month_box': [1,2,3,4,5,6,7,8,9,10,11,12],
               'Bexley_sum': [4812,2053,2405,2353,2427,2484,2579,2580,
                             2497,2510,2202,2021],
               'Westminster_sum': [4712,2050,2435,2323,2487,2414,2679,2780,
                             2490,2110,2702,2022],
               'Greenwich_sum': [4812,2053,2405,2353,2427,2484,2579,2580,
                             2497,2510,2202,2021],
               })

# Grouped bar chart: three bars per month, one per borough
df.plot(x='month_box', y=['Bexley_sum', 'Westminster_sum', 'Greenwich_sum'], kind='bar')
plt.show()

[image]
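A likely reason the original `distplot` calls all looked alike is that a histogram of `month_box` only counts how many rows carry each month number. Each month appears once per dataframe, so the totals column never enters the plot. If true overlapping histograms are wanted rather than grouped bars, one possible sketch is to pass the totals as `weights` to Matplotlib's `hist`. The helper name `plot_monthly_overlap` and the `alpha` value are assumptions of mine, and the data are the example values from above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Example values from the answer above.
df = pd.DataFrame({
    "month_box": list(range(1, 13)),
    "Bexley_sum": [4812, 2053, 2405, 2353, 2427, 2484, 2579, 2580,
                   2497, 2510, 2202, 2021],
    "Westminster_sum": [4712, 2050, 2435, 2323, 2487, 2414, 2679, 2780,
                        2490, 2110, 2702, 2022],
    "Greenwich_sum": [4812, 2053, 2405, 2353, 2427, 2484, 2579, 2580,
                      2497, 2510, 2202, 2021],
})


def plot_monthly_overlap(frame):
    """Hypothetical helper: overlay one weighted histogram per borough."""
    fig, ax = plt.subplots()
    # Edges at 0.5, 1.5, ..., 12.5 give one bin centred on each month.
    bin_edges = np.arange(0.5, 13.5, 1.0)
    heights = {}
    for name in ["Bexley", "Westminster", "Greenwich"]:
        # weights makes each bar as tall as that month's total,
        # instead of counting rows (which would always be 1).
        counts, _, _ = ax.hist(frame["month_box"], bins=bin_edges,
                               weights=frame[name + "_sum"],
                               alpha=0.4, label=name)
        heights[name] = counts
    ax.set_xticks(range(1, 13))
    ax.set_xlabel("Month")
    ax.set_ylabel("Total")
    ax.legend(prop={"size": 12})
    return fig, ax, heights


fig, ax, heights = plot_monthly_overlap(df)
```

Call `plt.show()` afterwards to display the figure. With separate dataframes, as in the question, the same idea would apply by calling `ax.hist` once per dataframe with its own `Sum Value` column as the weights.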
