How to make a stacked line chart with different y-axes in matplotlib?

I am wondering how to make a stacked line chart in matplotlib that draws on different columns. The aggregation has to be done on two different columns, so I think I need to build one big dataframe to use for plotting. I haven't found a neat, convenient way to do this with pandas and matplotlib. Can anyone suggest how to do this?

My attempt

This is the first aggregation I need to do:

import pandas as pd
import matplotlib.pyplot as plt

url = "https://gist.githubusercontent.com/adamFlyn/4657714653398e9269263a7c8ad4bb8a/raw/fa6709a0c41888503509e569ace63606d2e5c2ff/mydf.csv"
df = pd.read_csv(url, parse_dates=['date'])

# .copy() avoids a SettingWithCopyWarning when a column is added to df_re later
df_re = df[df['retail_item'].str.contains("GROUND BEEF")].copy()
df_rei = df_re.groupby(['date', 'retail_item']).agg({'number_of_ads': 'sum'})
df_rei = df_rei.reset_index(level=[0,1])
df_rei['year'] = df_rei['date'].dt.year
# week of the year, 00-53, with Monday as the first day of the week
df_rei['week'] = df_rei['date'].dt.strftime('%W').astype('uint8')

df_ret_df1 = df_rei.groupby(['retail_item', 'week'])['number_of_ads'].agg(['max', 'min', 'mean']).stack().reset_index(level=[2]).rename(columns={'level_2': 'mm', 0: 'vals'}).reset_index()

And this is the second aggregation I need to do. It is similar to the first one, except that it uses a different column:

df_re['price_gap'] = df_re['high_price'] - df_re['low_price']
dff_rei1 = df_re.groupby(['date', 'retail_item']).agg({'price_gap': 'mean'})
dff_rei1 = dff_rei1.reset_index(level=[0,1])
dff_rei1['year'] = dff_rei1['date'].dt.year
# week of the year, 00-53, with Monday as the first day of the week
dff_rei1['week'] = dff_rei1['date'].dt.strftime('%W').astype('uint8')

dff_ret_df2 = dff_rei1.groupby(['retail_item', 'week'])['price_gap'].agg(['max', 'min', 'mean']).stack().reset_index(level=[2]).rename(columns={'level_2': 'mm', 0: 'vals'}).reset_index()

Now I am struggling with how to combine the output of the first and second aggregations into one dataframe for making the stacked line chart. Is that possible?

Goal:

I want to make stacked line charts whose y-axes use different columns: one y-axis should show the number of ads and the other the price range, while the x-axis shows the 52-week period. This is the partial code I wrote to make the line chart:

import seaborn as sns

for g, d in df_ret_df1.groupby('retail_item'):
    fig, ax = plt.subplots(figsize=(7, 4), dpi=144)
    sns.lineplot(x='week', y='vals', hue='mm', data=d, alpha=.8, ax=ax)
    y1 = d[d.mm == 'max']
    y2 = d[d.mm == 'min']
    ax.fill_between(x=y1.week, y1=y1.vals, y2=y2.vals)

    # price_gap lives in dff_rei1 (the second aggregation), not in df_rei
    for year in dff_rei1['year'].unique():
        data = dff_rei1[(dff_rei1.date.dt.year == year) & (dff_rei1.retail_item == g)]
        sns.lineplot(x='week', y='price_gap', ci=None, data=data, label=year, alpha=.8, ax=ax)

Is there an elegant way to build the plotting data so that aggregation on different columns can be done easily in pandas? Or is there another way to make this happen?

Desired output:

Here is the output I want to get: [image]

How should I build the plotting data in order to get a plot like this?

The pandas groupby feature is very versatile, and you can reduce the lines of code considerably to get the final dataframe for plotting.

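# Series.dt.week was removed in pandas 2.0; .dt.isocalendar().week is its replacement
plotdf = df_re.groupby(['retail_item', df_re['date'].dt.year, df_re['date'].dt.isocalendar().week]).agg({'number_of_ads':'sum','price_gap':'mean'}).unstack().T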
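To make the shape of the resulting frame easier to see, here is a minimal sketch of the same grouping on synthetic data. The frame `demo`, its values, and the name `plotdf_demo` are made up for illustration; only the column names mirror the question.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df_re: column names mirror the question, values are invented.
rng = np.random.default_rng(0)
dates = pd.date_range("2019-01-04", periods=104, freq="W-FRI")
n = 2 * len(dates)
demo = pd.DataFrame({
    "date": np.tile(dates, 2),
    "retail_item": np.repeat(["GROUND BEEF 80%", "GROUND BEEF 90%"], len(dates)),
    "number_of_ads": rng.integers(100, 1000, size=n),
    "low_price": rng.uniform(2.0, 3.0, size=n),
})
demo["high_price"] = demo["low_price"] + rng.uniform(0.5, 2.0, size=n)
demo["price_gap"] = demo["high_price"] - demo["low_price"]

# Group keys derived from the date column, named so the index levels are readable.
year = demo["date"].dt.year.rename("year")
week = demo["date"].dt.isocalendar().week.rename("week")

plotdf_demo = (
    demo.groupby(["retail_item", year, week])
        .agg({"number_of_ads": "sum", "price_gap": "mean"})
        .unstack()  # move the innermost level (week) into the columns
        .T          # rows: (measure, week); columns: (retail_item, year)
)

print(plotdf_demo.index.names)
print(plotdf_demo.columns.names)
print(plotdf_demo.loc["price_gap"].head())
```

Selecting one measure with `.loc` then gives a frame indexed by week with one column per (item, year) pair, which is the layout that `DataFrame.plot` turns into one line per column.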

Once the aggregation is done the right way, use a for loop to show each of the required measures in a separate plot. Plot a shaded range by using the pandas describe feature to compute the min and max on the fly:

f,axs = plt.subplots(2,1,figsize=(20,14))
axs=axs.ravel()

for i,x in enumerate(['number_of_ads','price_gap']):
    plotdf.loc[x].plot(rot=90,grid=True,ax=axs[i])
    plotdf.loc[x].T.describe().T[['min','max']].plot(kind='area',color=['w','grey'],alpha=0.3,ax=axs[i],title= x)

[image]
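One thing to be aware of: pandas area plots stack their columns by default, so a `min`/`max` area plot shades up to min + max rather than up to max. If the band should span exactly min to max, a possible alternative is matplotlib's `fill_between`. The sketch below uses invented data; `wide` and `plot_with_band` are hypothetical names, not part of the answer above.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Invented data: one row per week, one column per year.
rng = np.random.default_rng(1)
weeks = pd.Index(range(1, 53), name="week")
wide = pd.DataFrame(
    rng.integers(100, 1000, size=(52, 3)),
    index=weeks,
    columns=[2018, 2019, 2020],
)

def plot_with_band(frame, ax, title):
    """Draw one line per column plus a grey band between the row-wise min and max."""
    frame.plot(ax=ax, grid=True, alpha=0.8)
    lo = frame.min(axis=1)
    hi = frame.max(axis=1)
    ax.fill_between(frame.index, lo, hi, color="grey", alpha=0.3, label="min-max range")
    ax.set_title(title)
    ax.legend()
    return lo, hi

fig, ax = plt.subplots(figsize=(10, 4))
lo, hi = plot_with_band(wide, ax, "number_of_ads (synthetic)")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result. The same helper could be called once per measure inside the loop above, passing `plotdf.loc[x]` and `axs[i]`.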

Edit with updated code:

# Series.dt.week was removed in pandas 2.0; .dt.isocalendar().week is its replacement
plotdf = df_re.groupby(['retail_item', df_re['date'].dt.year, df_re['date'].dt.isocalendar().week]).agg({'number_of_ads':'sum','weighted_avg':'mean'}).unstack().T
f,axs = plt.subplots(3,2,figsize=(20,14))
axs=axs.ravel()
i=0
# one panel per (retail item, measure) pair
for col in plotdf.columns.get_level_values(0).unique():
    for x in ['number_of_ads','weighted_avg']:
        plotdf.loc[x,col].plot(rot=90,grid=True,ax=axs[i])
        plotdf.loc[x,col].T.describe().T[['min','max']].plot(kind='area',color=['w','grey'],alpha=0.3,ax=axs[i],title= col+', '+x)
        i+=1
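If the two measures should share a single panel instead of sitting in separate subplots, one option is a secondary y-axis created with `Axes.twinx()`. This is only a sketch on invented numbers; `ads` and `gap` are hypothetical series standing in for one item's weekly ad count and price gap.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Invented weekly series for a single retail item.
rng = np.random.default_rng(2)
weeks = np.arange(1, 53)
ads = pd.Series(rng.integers(100, 1000, size=52), index=weeks)
gap = pd.Series(rng.uniform(0.5, 2.0, size=52), index=weeks)

fig, ax_left = plt.subplots(figsize=(10, 4))
ax_right = ax_left.twinx()  # second y-axis sharing the same x-axis

# Each measure gets its own y-scale, so the small price values are not flattened.
ax_left.plot(ads.index, ads.values, color="tab:blue")
ax_right.plot(gap.index, gap.values, color="tab:red")

ax_left.set_xlabel("week")
ax_left.set_ylabel("number_of_ads", color="tab:blue")
ax_right.set_ylabel("price_gap", color="tab:red")
```

Whether this reads better than stacked subplots depends on the data: two scales in one panel save space, but the crossing points of the two lines carry no meaning.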
