Complex dataframe plotting with Pandas / Matplotlib

I'd like to create a single time-series graph from a pandas dataframe that looks like the following:

*sample of a simplified version of my dataframe:*

index    to_network    count
201401   net_1         100
201401   net_2         200
201401   net_3         150
201402   net_1         300
201402   net_2         250
201403   net_1         175

Ultimately, the final graph should be a time-series line graph (the index on the x-axis, 'count' on the y-axis) with multiple lines, one for each network in the to_network column (e.g., one line for net_1).

I've been reading the 'Python for Data Analysis' book, but its examples don't appear to cover a case this complex.

Does this work?

df.groupby('to_network')['count'].plot()

If you want the dates to display correctly, you can convert the index first:

df.index = pd.to_datetime(df.index, format='%Y%m')
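
As a minimal sketch of that conversion, the snippet below rebuilds the sample frame from the question and turns its YYYYMM index into a DatetimeIndex. It assumes the index holds integers; the explicit `astype(str)` cast is an added precaution so the format string is matched against text, and is not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the question; the integer index holds YYYYMM values.
df = pd.DataFrame(
    {
        "to_network": ["net_1", "net_2", "net_3", "net_1", "net_2", "net_1"],
        "count": [100, 200, 150, 300, 250, 175],
    },
    index=[201401, 201401, 201401, 201402, 201402, 201403],
)

# Cast to str first (assumption: the index is numeric), then parse as year + month.
df.index = pd.to_datetime(df.index.astype(str), format="%Y%m")

print(df.index)
```

Each value is parsed as the first day of its month, so the x-axis of a later plot is laid out on a real time scale rather than on raw integers.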

To answer your question, I worked through it in a notebook here: http://nbviewer.ipython.org/github/ericmjl/Stack-Overflow-Answers/blob/master/20141020%20Complex%20Pandas%20Plotting/Untitled0.ipynb

The core idea is to do a groupby, and then plot only the column that you're interested in.

The code is also pasted below:

import pandas as pd

df = pd.read_csv("data.csv")
df.groupby("to_network")['count'].plot()
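
For a version that runs without data.csv, here is a rough sketch of the same groupby idea using the sample rows from the question. The explicit loop, the shared `ax`, the markers, and the axis labels are illustrative choices rather than part of the original answer; the loop is only there to give each line a legend label.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Sample data mirroring the question (the index holds YYYYMM values).
df = pd.DataFrame(
    {
        "to_network": ["net_1", "net_2", "net_3", "net_1", "net_2", "net_1"],
        "count": [100, 200, 150, 300, 250, 175],
    },
    index=[201401, 201401, 201401, 201402, 201402, 201403],
)

fig, ax = plt.subplots()

# Iterating over a groupby yields (network name, sub-frame) pairs.
for name, group in df.groupby("to_network"):
    # Markers keep single-point series such as net_3 visible.
    group["count"].plot(ax=ax, label=name, marker="o")

ax.set_xlabel("index")
ax.set_ylabel("count")
ax.legend(title="to_network")
```

In an interactive session, `plt.show()` (or simply ending the notebook cell) displays the figure with one line per network.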

Also, be sure to include Daniele's contribution, which converts the index to proper dates:

df.index = pd.to_datetime(df.index, format='%Y%m')

For attribution, I have up-voted her answer in addition to citing it here.

I hope this answers the question; if it did, please accept the answer!

The default behavior of plot in pandas is to use the index as the x-axis and plot one line per column. So you want to reshape your data frame to mirror that structure. You can do the following:

df.pivot_table(index='index', columns='to_network', values='count', aggfunc='sum').plot()

This pivots your df (which is in the long format, à la ggplot) into a wide frame from which pandas' default plot behavior produces the desired result: one line per network, with the index as the x-axis and count as the value.
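
To make the reshaping concrete, the sketch below applies the same pivot_table call to the sample rows from the question and prints the wide frame. It assumes 'index' is an ordinary column, so that pivot_table can refer to it by name; the variable name `wide` is only illustrative.

```python
import pandas as pd

# Long-format sample data; "index" is a regular column here (assumption).
df = pd.DataFrame(
    {
        "index": [201401, 201401, 201401, 201402, 201402, 201403],
        "to_network": ["net_1", "net_2", "net_3", "net_1", "net_2", "net_1"],
        "count": [100, 200, 150, 300, 250, 175],
    }
)

# One row per period, one column per network, summed counts as the values.
wide = df.pivot_table(index="index", columns="to_network", values="count", aggfunc="sum")

print(wide)
```

Periods in which a network has no rows come out as NaN, which a line plot leaves as gaps. If zeros are appropriate for your data, passing `fill_value=0` to pivot_table is one option. Calling `wide.plot()` then draws one line per column.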
