![](/img/trans.png)
[英]How to plot subplots from a condition applied on a single column and the data available on another single coloumn of the same dataframe?
[英]How to plot columns from a dataframe, as subplots
我在这里做错了什么? 我想从df
创建新数据框,并在每个新创建的数据框(Emins、FTSE、Stoxx 和 Nikkei)的折线图中使用日期作为 x 轴。
我有一个从 data.xlsx 创建的名为df
的数据框,它看起来像这样:
Dates ES1 Z 1 VG1 NK1
0 2005-01-04 -0.0126 0.0077 -0.0030 0.0052
1 2005-01-05 -0.0065 -0.0057 0.0007 -0.0095
2 2005-01-06 0.0042 0.0017 0.0051 0.0044
3 2005-01-07 -0.0017 0.0061 0.0010 -0.0009
4 2005-01-11 -0.0065 -0.0040 -0.0147 0.0070
3670 2020-09-16 -0.0046 -0.0065 -0.0003 -0.0009
3671 2020-09-17 -0.0083 -0.0034 -0.0039 -0.0086
3672 2020-09-18 -0.0024 -0.0009 -0.0009 0.0052
3673 2020-09-23 -0.0206 0.0102 0.0022 -0.0013
3674 2020-09-24 0.0021 -0.0136 -0.0073 -0.0116
从df
我创建了 4 个新的数据框,称为 Eminis、FTSE、Stoxx 和 Nikkei。
谢谢你的帮助!!!!
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('classic')
df = pd.read_excel('data.xlsx')
df = df.rename(columns={'Dates':'Date','ES1': 'Eminis', 'Z 1': 'FTSE','VG1': 'Stoxx','NK1': 'Nikkei','TY1': 'Notes','G 1': 'Gilts', 'RX1': 'Bunds','JB1': 'JGBS','CL1': 'Oil','HG1': 'Copper','S 1': 'Soybeans','GC1': 'Gold','WILLTIPS': 'TIPS'})
headers = df.columns
Eminis = df[['Date','Eminis']]
FTSE = df[['Date','FTSE']]
Stoxx = df[['Date','Stoxx']]
Nikkei = df[['Date','Nikkei']]
# create multiple plots via plt.subplots(rows,columns)
fig, axes = plt.subplots(2,2, figsize=(20,15))
x = Date
y1 = Eminis
y2 = Notes
y3 = Stoxx
y4 = Nikkei
# one plot on each subplot
axes[0][0].line(x,y1)
axes[0][1].line(x,y2)
axes[1][0].line(x,y3)
axes[1][1].line(x,y4)
plt.legends()
plt.show()
.stack
将数据帧从宽格式转换为长(整洁)格式。
seaborn.relplot
,它可以从长格式的数据帧创建FacetGrid
。
seaborn
是matplotlib
的高级 API,使绘图更容易。import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# import data from excel, or setup test dataframe
data = {'Dates': ['2005-01-04', '2005-01-05', '2005-01-06', '2005-01-07', '2005-01-11', '2020-09-16', '2020-09-17', '2020-09-18', '2020-09-23', '2020-09-24'],
'ES1': [-0.0126, -0.0065, 0.0042, -0.0017, -0.0065, -0.0046, -0.0083, -0.0024, -0.0206, 0.0021],
'Z 1': [0.0077, -0.0057, 0.0017, 0.0061, -0.004, -0.0065, -0.0034, -0.0009, 0.0102, -0.0136],
'VG1': [-0.003, 0.0007, 0.0051, 0.001, -0.0147, -0.0003, -0.0039, -0.0009, 0.0022, -0.0073],
'NK1': [0.0052, -0.0095, 0.0044, -0.0009, 0.007, -0.0009, -0.0086, 0.0052, -0.0013, -0.0116]}
df = pd.DataFrame(data)
# rename columns
df = df.rename(columns={'Dates':'Date','ES1': 'Eminis', 'Z 1': 'FTSE','VG1': 'Stoxx','NK1': 'Nikkei'})
# set Date to a datetime
df.Date = pd.to_datetime(df.Date)
# set Date as the index
df.set_index('Date', inplace=True)
# stack the dataframe
dfs = df.stack().reset_index().rename(columns={'level_1': 'Stock', 0: 'val'})
# to select only a subset of values from Stock, to plot, select them with Boolean indexing
df_select = dfs[dfs.Stock.isin(['Eminis', 'FTSE', 'Stoxx', 'Nikkei'])]`
# df_select.head()
Date Stock val
0 2005-01-04 Eminis -0.0126
1 2005-01-04 FTSE 0.0077
2 2005-01-04 Stoxx -0.0030
3 2005-01-04 Nikkei 0.0052
4 2005-01-05 Eminis -0.0065
# plot
sns.relplot(data=df_select, x='Date', y='val', col='Stock', col_wrap=2, kind='line')
Date
无定义x = Date
y2 = Notes
: Notes
未定义.line
不是plt
方法并导致AttributeError
; 它应该是plt.plot
y1 - y4
是数据帧,但传递给 y 轴的 plot 方法,这会导致TypeError: unhashable type: 'numpy.ndarray'
; 一列应该作为y
传递。.legends
不是方法; 这是.legend
Eminis = df[['Date','Eminis']]
FTSE = df[['Date','FTSE']]
Stoxx = df[['Date','Stoxx']]
Nikkei = df[['Date','Nikkei']]
# create multiple plots via plt.subplots(rows,columns)
fig, axes = plt.subplots(2,2, figsize=(20,15))
x = df.Date
y1 = Eminis.Eminis
y2 = FTSE.FTSE
y3 = Stoxx.Stoxx
y4 = Nikkei.Nikkei
# one plot on each subplot
axes[0][0].plot(x,y1, label='Eminis')
axes[0][0].legend()
axes[0][1].plot(x,y2, label='FTSE')
axes[0][1].legend()
axes[1][0].plot(x,y3, label='Stoxx')
axes[1][0].legend()
axes[1][1].plot(x,y4, label='Nikkei')
axes[1][1].legend()
plt.show()
优雅的解决方案是:
执行此操作的代码是:
fig, a = plt.subplots(2, 2, figsize=(12, 6), tight_layout=True)
df.plot(ax=a, subplots=True, rot=60);
为了测试上面的代码,我创建了以下 DataFrame:
np.random.seed(1)
ind = pd.date_range('2005-01-01', '2006-12-31', freq='7D')
df = pd.DataFrame(np.random.rand(ind.size, 4),
index=ind, columns=['ES1', 'Z 1', 'VG1', 'NK1'])
并得到以下图片:
由于我的测试数据是随机的,我假设了“7 天”的频率,以使图片不太“杂乱”。 对于您的真实数据,请考虑使用例如“7D”频率和mean()聚合函数重新采样。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.