
How to create a bar chart in Python with multiple x-axes

I have a dataset with 3 columns: BOROUGHS, COMPLAINT_DATE, OFFENSE.

NOTE: the date format looks like this: 2010-01-30

I know how to create a simple bar chart, like this:

df.plot(kind="bar")

But I need something like this:

[image]

This chart shows the 5 boroughs, the number of complaints, and the year. It also uses colors.

First, how do you make something like that? Second, does this type of chart have a name, such as multi-bar chart or something like that?

EDIT: [image]

The purple color should be first in the bar, but it says that it has more crime.

EDIT #2: Also, look at these numbers based on 2010 and 2019. [image]

EDIT #3: Too small; the number of crimes is not shown at the bottom. [image]

Thanks,

  • The data will need to be grouped and aggregated by count, and then pivoted into the correct shape.
    • Use the .dt accessor to extract the year from the 'complaint_date' column.
  • See pandas.DataFrame.plot & pandas.DataFrame.plot.bar for all the available parameters.
import pandas as pd
import matplotlib.pyplot as plt

# sample data
data = {'boroughs': ['x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z', 'x'],
        'complaint_date': ['2020-11-1', '2020-11-1', '2020-11-1', '2019-11-1', '2019-11-1', '2019-11-1', '2020-11-1', '2020-11-1', '2020-11-1', '2019-11-1', '2019-11-1', '2019-11-1', '2019-11-1'],
        'offense': ['a', 'b', 'c', 'a', 'b', 'c', 'd', 'e', 'f', 'd', 'e', 'f', 'd']}

# create dataframe
df = pd.DataFrame(data)

# convert date column to datetime dtype
df.complaint_date = pd.to_datetime(df.complaint_date)

# groupby year and borough to get count of offenses
dfg = df.groupby([df.complaint_date.dt.year, 'boroughs']).boroughs.count().reset_index(name='count')

# display(dfg)
#    complaint_date boroughs  count
# 0            2019        x      3
# 1            2019        y      2
# 2            2019        z      2
# 3            2020        x      2
# 4            2020        y      2
# 5            2020        z      2

# pivot into the correct form for stacked bar
dfp = dfg.pivot(index='complaint_date', columns='boroughs', values='count')

# display(dfp)
# boroughs        x  y  z
# complaint_date
# 2019            3  2  2
# 2020            2  2  2

# plot
dfp.plot.bar(stacked=True, xlabel='Year Complaint Filed', ylabel='Volume of Complaints')
plt.legend(title='Boroughs', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=0)

[image]
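
As an aside, the group, count, and pivot steps above can also be collapsed into a single call. The following is a minimal sketch, not part of the original answer, that swaps in pandas.crosstab for the groupby and pivot; the name dfp_alt is only illustrative, and the sample data is the same as above without the 'offense' column, which is not needed for counting.

```python
import pandas as pd

# same sample data as above (the 'offense' column is not needed for counting)
df = pd.DataFrame({
    'boroughs': ['x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z', 'x'],
    'complaint_date': ['2020-11-1'] * 3 + ['2019-11-1'] * 3
                      + ['2020-11-1'] * 3 + ['2019-11-1'] * 4,
})
df['complaint_date'] = pd.to_datetime(df['complaint_date'])

# count rows for every (year, borough) pair; the result is already in wide form
# (dfp_alt is an illustrative name, not from the original answer)
dfp_alt = pd.crosstab(df['complaint_date'].dt.year, df['boroughs'])

# stacked bar chart: one bar per year, one colored segment per borough
ax = dfp_alt.plot.bar(stacked=True, rot=0)
ax.set_xlabel('Year Complaint Filed')
ax.set_ylabel('Volume of Complaints')
ax.legend(title='Boroughs', bbox_to_anchor=(1.05, 1), loc='upper left')
```

Setting the labels on the returned Axes avoids depending on the xlabel and ylabel keyword arguments of the pandas plotting call.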

Response to comment

  • In response to AttributeError: 'Rectangle' object has no property 'xlabel'
  • pandas probably needs to be updated; this was run in version 1.1.3. The xlabel and ylabel parameters of DataFrame.plot were added in pandas 1.1.0; on older versions, set the labels with matplotlib instead:
# plot
dfp.plot.bar(stacked=True)
plt.legend(title='Boroughs', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xlabel('Year Complaint Filed')
plt.ylabel('Volume of Complaints')
plt.xticks(rotation=0)
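
The edits to the question also mention the order of the segments in each bar and the missing counts. The following is a rough sketch of one way to handle both, under two assumptions: the intent is for the borough with the largest total to sit at the bottom of each bar, and matplotlib is version 3.4 or newer (needed for Axes.bar_label). The counts are made up for illustration and the name dfp_sorted is hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# made-up counts in the same wide shape as dfp (years as index, boroughs as columns)
dfp = pd.DataFrame({'x': [3, 2], 'y': [5, 6], 'z': [1, 2]}, index=[2019, 2020])
dfp.index.name = 'complaint_date'
dfp.columns.name = 'boroughs'

# pandas stacks columns left to right, so the first column ends up at the bottom;
# order the columns by total count, largest first
column_order = dfp.sum().sort_values(ascending=False).index
dfp_sorted = dfp[column_order]

ax = dfp_sorted.plot.bar(stacked=True, rot=0)
ax.set_xlabel('Year Complaint Filed')
ax.set_ylabel('Volume of Complaints')
ax.legend(title='Boroughs', bbox_to_anchor=(1.05, 1), loc='upper left')

# write each segment's count in the middle of the segment (matplotlib >= 3.4)
for container in ax.containers:
    ax.bar_label(container, label_type='center')
```

When some segments are very small next to others, centered labels can overlap; a larger figsize or the grouped bars shown below may read better.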

A better option than a stacked bar

  • Use seaborn.barplot
  • This will provide a better overall representation of the relative values for each bar.
import seaborn as sns

# use dfg from above

# plot
fig, ax = plt.subplots(figsize=(6, 4))
sns.barplot(y='complaint_date', x='count', data=dfg, hue='boroughs', orient='h', ax=ax)

# use log scale since you have large numbers
plt.xscale('log')

# relocate the legend
plt.legend(title='Boroughs', bbox_to_anchor=(1.05, 1), loc='upper left')

[image]

  • See this question or this question to change the format of the x-tick values from exponents to integers.
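
One possible way to do that tick formatting, sketched here with matplotlib.ticker.FuncFormatter rather than taken from the linked questions. The counts and the helper name plain_int are hypothetical, and plain matplotlib bars stand in for the seaborn plot so the example is self-contained.

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def plain_int(value, pos):
    """Format a tick value as an integer with thousands separators."""
    return f'{value:,.0f}'


fig, ax = plt.subplots(figsize=(6, 4))

# hypothetical counts, large enough that a log scale makes sense
ax.barh(['2019', '2020'], [1200, 45000])
ax.set_xscale('log')

# replace the default 10^n labels on the major ticks with plain integers
ax.xaxis.set_major_formatter(FuncFormatter(plain_int))
```

The same set_major_formatter call should work on the ax passed to sns.barplot above, as long as it comes after the log scale is set.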
