I'm new to Pandas and I'm looking for a way to plot data that has been grouped by two columns. Here's my example:
First I group by the year of the 'Date' column and by the 'Primary Type' column.
groups = df.groupby([df['Date'].map(lambda x: x.year), pri_type['Primary Type']])
Now from that I can get a series of basically exactly what I want to plot.
groups.size().head()
Date  Primary Type
2001  ARSON                   1010
      ASSAULT                31384
      BATTERY                93448
      BURGLARY               26011
      CRIM SEXUAL ASSAULT     1794
dtype: int64
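For reference, here is a minimal sketch of the same grouping on a tiny made-up data set. The rows and the variable name `counts` are hypothetical, and it assumes 'Date' is already a datetime column and that both columns live in the same DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the real data: one row per reported incident.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2001-01-05", "2001-03-17", "2001-07-02",
        "2002-02-11", "2002-09-30", "2002-12-24",
    ]),
    "Primary Type": ["ARSON", "ASSAULT", "ASSAULT", "ARSON", "ARSON", "BATTERY"],
})

# .dt.year extracts the year from a datetime column; it does the same job
# as mapping `lambda x: x.year` over the column.
year = df["Date"].dt.year

# Group by (year, type) and count the rows in each group.
counts = df.groupby([year, "Primary Type"]).size()
print(counts)
```

The resulting Series has the same two-level (Date, Primary Type) index as the output above.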
But when I plot this, I get a very messy plot with thousands of labels on the x axis. What I would like is a plot with the date on the x axis and a legend listing all the Primary Types. Something similar to this graph:
Thanks in advance!
What do you want to be displayed on the x axis, date? If so, you can set date as index: groups.set_index('Date')
The solution that I came up with is to convert the series to a data frame and use the unstack() method. Here is what I did:
# convert to a dataframe
df = groups.size().to_frame()
|      |                | 0
|------|----------------|-------
| Date | Primary Type   |
| 2001 | ARSON          | 1010
|      | ASSAULT        | 31384
|      | BATTERY        | 93234
|      | BURGLARY       | 26031
|      | CRIM SEXUAL AS | 1723
# unstack() to pivot the data which puts it in the correct format for plot()
df.unstack(level=-1)
|              | 0       |         |         ...
|--------------|---------|---------|---------...
| Primary Type | ARSON   | ASSAULT | BATTERY ...
| Date         |         |         |         ...
| 2001         | 1010.0  | 31384.0 | 93234.0 ...
| 2002         | 2938.0  | 31993.0 | 94235.0 ...
| 2003         | 955.0   | 30082.0 | 92834.0 ...
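As a possible variation, the Series returned by size() can be unstacked directly, which skips to_frame() and so never creates the `0` column level. The sketch below uses made-up counts and hypothetical names (`counts`, `wide`); it is one way to do the pivot, not necessarily the only one.

```python
import pandas as pd

# Hypothetical counts shaped like groups.size(): a Series indexed by
# (Date, Primary Type). The numbers are made up.
index = pd.MultiIndex.from_tuples(
    [(2001, "ARSON"), (2001, "ASSAULT"), (2002, "ARSON"), (2002, "BATTERY")],
    names=["Date", "Primary Type"],
)
counts = pd.Series([1, 2, 2, 1], index=index)

# Unstacking the Series itself moves 'Primary Type' into the columns
# without an extra `0` level above them.
# fill_value=0 fills (year, type) pairs that never occur, so the counts
# are not turned into floats with NaN.
wide = counts.unstack(level="Primary Type", fill_value=0)
print(wide)
```

With one row per year and one column per type, the table is already in the shape that plot() expects.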
This almost produces the graph I was after, apart from the extra 0 column level left over from to_frame(), which I can probably get rid of. The result is still not very readable, but it answers my question of how to plot it:
df.unstack(level=-1).plot(kind='bar', figsize=(10, 10))
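If readability is the remaining problem, a line chart with one line per type may be easier to scan than grouped bars. The following is only a sketch under assumptions: the table `wide` and its numbers are made up, it assumes the columns are plain type names (no `0` level), and the choice of a line chart is mine rather than something the question requires.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical wide table: one row per year, one column per Primary Type.
wide = pd.DataFrame(
    {"ARSON": [10, 29, 9], "ASSAULT": [31, 32, 30], "BATTERY": [93, 94, 92]},
    index=pd.Index([2001, 2002, 2003], name="Date"),
)

fig, ax = plt.subplots(figsize=(10, 6))

# Each column becomes one line; the index (year) goes on the x axis.
wide.plot(kind="line", marker="o", ax=ax)

ax.set_xticks(wide.index)            # one tick per year
ax.set_xlabel("Year")
ax.set_ylabel("Number of incidents")
ax.legend(title="Primary Type")      # legend entries are the column names
fig.tight_layout()
```

Calling fig.savefig() with a file name, or plt.show() on an interactive backend, displays the result. Swapping kind="line" for kind="bar" gives the grouped bar chart instead.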