![](/img/trans.png)
[英]How to map sevaral keywords with a dataframe column values using pandas in python
[英]Python Pandas plot using dataframe column values
我正在尝试使用数据框绘制图形。
我正在使用“pandas_datareader”来获取数据。
所以我的代码如下:
tickers = ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]
import pandas_datareader.data as web
import datetime as dt
end = dt.datetime.now().strftime("%Y-%m-%d")
start = (dt.datetime.now()-dt.timedelta(days=365*3)).strftime("%Y-%m-%d")
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
data = []
for ticker in tickers:
sub_df = web.get_data_yahoo(ticker, start, end)
sub_df["name"] = ticker
data.append(sub_df)
data = pd.concat(data)
所以在变量data
,有 8 列 = ['Date', 'Open', 'High' ,'Low' ,'Close' 'Volume', 'Adj Close','name']
我想要做的是绘制一个图形,以“日期”值作为 x 参数,“高”作为 y 参数,多列作为“名称”列值(=[“AAPL”,“GOOG”,“MSFT” ,"XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"])。
我怎样才能做到这一点?
当我执行data.plot()
,结果很好地将data
作为 x 参数,但有 5 列['open','high','low','close','volume','adj close']
不是7 列["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]
:我想要什么做。 结果如下:
您需要对数据进行整形,以便名称成为数据框的标题,因为您只想绘制High
,您可以提取High
和name
列,并将其转换为宽格式,然后进行绘图:
import matplotlib as mpl
mpl.rcParams['savefig.dpi'] = 120
high = data[["High", "name"]].set_index("name", append=True).High.unstack("name")
# notice here I scale down the BRK-A column so that it will be at the same scale as other columns
high['BRK-A'] = high['BRK-A']/1000
high.head()
ax = high.plot(figsize = (16, 10))
您应该按name
对数据进行分组,然后进行绘图。 像data.groupby('name').plot ()
应该会让你开始。 您可能需要输入date
作为 x 值和high
值作为 y。 由于我在移动设备上,目前无法自己测试。
更新
拿到电脑后,我意识到我有点不对劲。 您需要在分组之前重置索引,然后绘制并最终更新图例。 像这样:
fig, ax = plt.subplots()
names = data.name.unique()
data.reset_index().groupby('name').plot(x='Date', y='High', ax=ax)
plt.legend(names)
plt.show()
当然,如果您希望此图表有意义,您将需要对价值进行某种形式的调整,因为 BRK-A 比任何其他股票都贵得多。
我只想补充一点, pandas_datareader
允许我们一步轻松地将所有数据读入 Pandas.Panel:
p = web.DataReader(tickers, 'yahoo', start, end)
现在我们可以根据需要轻松切片
# i'll intentionally exclude `BRK-A` as it spoils the whole graph
p.loc['High', :, ~p.minor_axis.isin(['BRK-A'])].plot(figsize=(10,8))
或者,您可以即时切片并仅保存High
值:
df = web.DataReader(tickers, 'yahoo', start, end).loc['High']
这给了我们:
In [68]: df
Out[68]:
AAPL AMZN BRK-A FB GE GOOG JNJ MSFT WFC XOM
Date
2014-03-13 539.659988 383.109985 188852.0 71.349998 26.000000 1210.502120 94.199997 38.450001 48.299999 94.570000
2014-03-14 530.890015 378.570007 186507.0 69.430000 25.379999 1190.872020 93.440002 38.139999 48.070000 94.220001
2014-03-17 529.969994 378.850006 185790.0 68.949997 25.629999 1197.072063 94.180000 38.410000 48.169998 94.529999
2014-03-18 531.969986 379.000000 185400.0 69.599998 25.730000 1211.532091 94.239998 39.900002 48.450001 95.250000
2014-03-19 536.239990 379.000000 185489.0 69.290001 25.700001 1211.992061 94.360001 39.549999 48.410000 95.300003
2014-03-20 532.669975 373.000000 186742.0 68.230003 25.370001 1209.612076 94.190002 40.650002 49.360001 94.739998
2014-03-21 533.750000 372.839996 188598.0 67.919998 25.830000 1209.632048 95.930000 40.939999 49.970001 95.989998
... ... ... ... ... ... ... ... ... ... ...
2017-03-02 140.279999 854.820007 266445.0 137.820007 30.230000 834.510010 124.360001 64.750000 59.790001 84.250000
2017-03-03 139.830002 851.989990 264690.0 137.330002 30.219999 831.359985 123.930000 64.279999 59.240002 83.599998
2017-03-06 139.770004 848.489990 263760.0 137.830002 30.080000 828.880005 124.430000 64.559998 58.880001 82.900002
2017-03-07 139.979996 848.460022 263560.0 138.369995 29.990000 833.409973 124.459999 64.779999 58.520000 83.290001
2017-03-08 139.800003 853.070007 263900.0 137.990005 29.940001 838.150024 124.680000 65.080002 59.130001 82.379997
2017-03-09 138.789993 856.400024 263620.0 138.570007 29.830000 842.000000 126.209999 65.199997 58.869999 81.720001
2017-03-10 139.360001 857.349976 263800.0 139.490005 30.430000 844.909973 126.489998 65.260002 59.180000 82.470001
[755 rows x 10 columns]
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.