简体   繁体   English

Pandas 数据框到 Seaborn 分组条形图

[英]Pandas Dataframe to Seaborn Grouped Barchart

I have the following dataframe which I have obtained from a larger dataframe which lists the worst 10 "Benchmark Returns" and their corresponding portfolio returns and dates:我从一个更大的数据框中获得了以下数据框,其中列出了最差的 10 个“基准收益”及其相应的投资组合收益和日期:

在此处输入图片说明

I've managed to create a Seaborn bar plot which lists Benchmark Returns against their corresponding dates with this script:我设法创建了一个 Seaborn 条形图,其中列出了使用此脚本的相应日期的基准回报:

import pandas as pd
import seaborn as sns

df = pd.read_csv('L:\\My Documents\\Desktop\\Data NEW.csv', parse_dates = True)

df = df.nsmallest(10, columns = 'Benchmark Returns')
df = df[['Date', 'Benchmark Returns', 'Portfolio Returns']]
p6 = sns.barplot(x = 'Date', y = 'Benchmark Returns', data = df)
p6.set(ylabel = 'Return (%)')
for x_ticks in p6.get_xticklabels():
    x_ticks.set_rotation(90)

And it produces this plot:它产生了这个情节:

在此处输入图片说明

However, what I'd like is a grouped bar plot that contains both Benchmark Returns and Portfolio Returns, where two different colours are used to distinguish between these two categories.但是,我想要的是包含基准回报和投资组合回报的分组条形图,其中使用两种不同的颜色来区分这两个类别。

I've tried several different methods but nothing seems to work.我尝试了几种不同的方法,但似乎没有任何效果。

Thanks in advance for all your help!在此先感谢您的帮助!

The trick is to transform the pandas df from wide to long format诀窍是将大熊猫df从宽格式转换为长格式

Step 1: Prepare data第 1 步:准备数据

import seaborn as sns

np.random.seed(123)
index = np.random.randint(1,100,10)

x1 = pd.date_range('2000-01-01','2015-01-01').map(lambda t: t.strftime('%Y-%m-%d'))
dts = np.random.choice(x1,10)

benchmark = np.random.randn(10)
portfolio = np.random.randn(10)

df = pd.DataFrame({'Index': index,
                   'Dates': dts,
                   'Benchmark': benchmark,
                   'Portfolio': portfolio},
                    columns = ['Index','Dates','Benchmark','Portfolio'])

Step 2: From "wide" to "long" format第 2 步:从“宽”格式到“长”格式

df1 = pd.melt(df, id_vars=['Index','Dates']).sort_values(['variable','value'])
df1

    Index   Dates   variable    value
9   48  2012-06-13  Benchmark   -1.410301
1   93  2002-07-31  Benchmark   -1.301489
8   97  2005-01-21  Benchmark   -1.100985
0   67  2011-06-01  Benchmark   0.126526
4   84  2003-09-25  Benchmark   0.465645
3   18  2009-07-13  Benchmark   0.522742
5   58  2007-12-04  Benchmark   0.724915
7   98  2002-12-28  Benchmark   0.746581
6   87  2009-02-07  Benchmark   1.495827
2   99  2000-04-21  Benchmark   2.207427
16  87  2009-02-07  Portfolio   -2.750224
14  84  2003-09-25  Portfolio   -1.855637
15  58  2007-12-04  Portfolio   -1.779455
19  48  2012-06-13  Portfolio   -1.774134
11  93  2002-07-31  Portfolio   -0.984868
12  99  2000-04-21  Portfolio   -0.748569
10  67  2011-06-01  Portfolio   -0.747651
18  97  2005-01-21  Portfolio   -0.695981
17  98  2002-12-28  Portfolio   -0.234158
13  18  2009-07-13  Portfolio   0.240367

Step 3: Plot第 3 步:绘图

sns.barplot(x='Dates', y='value', hue='variable', data=df1)
plt.xticks(rotation=90)
plt.ylabel('Returns')
plt.title('Portfolio vs Benchmark Returns');

在此处输入图片说明

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM