
How to plot a 'multiple-line' line graph in python

I have a dataframe that looks like the following:

month      source_id   revenue
April      PA0057       16001.0
           PA0202       54063.0
           PA0678       24219.0
           PA0873       41827.0
August     PA0057       40673.0
           PA0202       75281.0
           PA0678       60318.0
           PA0873       55243.0
December   PA0057       49781.0
           PA0202       71797.0
           PA0678       24975.0
           PA0873       57630.0
February   PA0057       13193.0
           PA0202       44211.0
           PA0678       29862.0
           PA0873       36436.0
January    PA0057       65707.0
           PA0202       67384.0
           PA0678       29392.0
           PA0873       46854.0
July       PA0057       31533.0
           PA0202       49663.0
           PA0678       10520.0
           PA0873       53634.0
June       PA0057       97229.0
           PA0202       56115.0
           PA0678       72770.0
           PA0873       51260.0
March      PA0057       44622.0
           PA0202       54079.0
           PA0678       36776.0
           PA0873       42873.0
May        PA0057       38077.0
           PA0202       68103.0
           PA0678       78012.0
           PA0873       83464.0
November   PA0057       26599.0
           PA0202       53050.0
           PA0678       87853.0
           PA0873       65499.0
October    PA0057       47638.0
           PA0202       44445.0
           PA0678       49983.0
           PA0873       57926.0
September  PA0057       46171.0
           PA0202       49202.0
           PA0678       42598.0
           PA0873       65660.0

I want to draw a line plot where the x-axis is the month and the y-axis is revenue. There are 4 source_id values (PA0057, PA0202, PA0678, PA0873), and I want one line for each of them.

How do I show this as 4 lines on a line graph?

I have tried the following:

import matplotlib.pyplot as plt
my_df.plot(x='month', y='revenue', kind='line') 
plt.show()

but it doesn't give me the expected result, because I am not passing in the source ids.

  • The dataframe looks like the result of pandas.DataFrame.groupby
    • Presumably something similar to df.groupby(['month', 'source_id']).agg({'revenue': sum})
  • Use pandas.Categorical to set the 'month' column as ordered
    • The calendar module is part of the Standard Library; it is used here only for an ordered list of month names (calendar.month_name)
  • Use seaborn.lineplot with the hue parameter to plot the dataframe.
    • seaborn is a high-level API for matplotlib and makes many plots easier to produce.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import calendar

# given some dataframe, perform groupby and reset the index
dfg = df.groupby(['month', 'source_id']).agg({'revenue': 'sum'}).reset_index()

# display(dfg) - your dataframe should be in the following form
#     month source_id  revenue
# 0   April    PA0057    16001
# 1   April    PA0202    54063
# 2   April    PA0678    24219
# 3   April    PA0873    41827
# 4  August    PA0057    40673

# set the month column as categorical and set the order for calendar months
dfg['month'] = pd.Categorical(dfg['month'], categories=list(calendar.month_name)[1:], ordered=True)

# plot with seaborn and use the hue parameter
plt.figure(figsize=(10, 6))
sns.lineplot(x='month', y='revenue', data=dfg, hue='source_id')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=90)
plt.show()
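
To try the ordering step on its own, here is a minimal sketch using a few rows copied from the question. The names sample, month_order and ordered_months are only illustrative, and it assumes the default locale, where calendar.month_name yields English month names.

```python
import calendar

import pandas as pd

# A few rows copied from the question; `sample` is an illustrative name.
sample = pd.DataFrame({
    "month": ["April", "April", "August", "August", "January", "January"],
    "source_id": ["PA0057", "PA0202", "PA0057", "PA0202", "PA0057", "PA0202"],
    "revenue": [16001.0, 54063.0, 40673.0, 75281.0, 65707.0, 67384.0],
})

# calendar.month_name[0] is an empty string, so skip it.
month_order = list(calendar.month_name)[1:]
sample["month"] = pd.Categorical(sample["month"], categories=month_order, ordered=True)

# Sorting now follows the calendar instead of the alphabet.
ordered_months = sample.sort_values("month")["month"].unique().tolist()
print(ordered_months)
```

With the ordered categorical in place, January sorts before April and August, which is also the order seaborn uses on the x-axis.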


If none of the columns in your example are the index, you can reshape your df with:

df = df.set_index(['month', 'source_id']).unstack()

This gives you a new dataframe with month as the index and source_id as the columns. Then you can call plot:

df.plot()

The result will have as many lines as there are source_id values in the data.
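
As a rough, self-contained sketch of that reshape (the rows are copied from the question; sample, wide and present are illustrative names): selecting the revenue column before unstacking is an optional extra step that keeps the column labels as plain source_id values, and the reindex at the end is an addition to put the months in calendar order, assuming English month names from calendar.month_name.

```python
import calendar

import pandas as pd

# A few rows copied from the question; `sample` is an illustrative name.
sample = pd.DataFrame({
    "month": ["April", "April", "August", "August", "January", "January"],
    "source_id": ["PA0057", "PA0202", "PA0057", "PA0202", "PA0057", "PA0202"],
    "revenue": [16001.0, 54063.0, 40673.0, 75281.0, 65707.0, 67384.0],
})

# One row per month, one column per source_id.
wide = sample.set_index(["month", "source_id"])["revenue"].unstack()

# Put the rows in calendar order, keeping only the months that are present.
present = [m for m in list(calendar.month_name)[1:] if m in wide.index]
wide = wide.reindex(present)

print(wide)
```

Calling wide.plot() on this frame then draws one line per column, with the months along the x-axis in calendar order.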
