Python Pandas plot using dataframe column values
I'm trying to plot a graph using dataframes.
I'm using pandas_datareader to get the data.
My code is below:
tickers = ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]
import pandas_datareader.data as web
import datetime as dt
end = dt.datetime.now().strftime("%Y-%m-%d")
start = (dt.datetime.now()-dt.timedelta(days=365*3)).strftime("%Y-%m-%d")
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
data = []
for ticker in tickers:
    sub_df = web.get_data_yahoo(ticker, start, end)
    sub_df["name"] = ticker
    data.append(sub_df)
data = pd.concat(data)
So the variable data has 8 columns: ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close', 'name']
The variable data is shown below:
What I want to do is plot a graph with the 'Date' values on the x-axis and 'High' on the y-axis, with one line for each value in the 'name' column (["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]).
How can I do this?
When I executed data.plot(), the result correctly uses the date as the x-axis, but it draws one line per price column (the 6 columns ['open','high','low','close','volume','adj close']) instead of one line per ticker (the 10 names ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]), which is what I want. The result is below:
You need to reshape your data so that the names become the column headers of the data frame. Since you only want to plot High, you can extract the High and name columns, transform them to wide format, and then plot:
import matplotlib as mpl
mpl.rcParams['savefig.dpi'] = 120
high = data[["High", "name"]].set_index("name", append=True).High.unstack("name")
# notice here I scale down the BRK-A column so that it will be at the same scale as other columns
high['BRK-A'] = high['BRK-A']/1000
high.head()
ax = high.plot(figsize = (16, 10))
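The same long-to-wide reshape can also be written with pivot. Below is a minimal sketch on synthetic data, so it runs without a download; the prices and the long_df and wide names are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Synthetic long-format data standing in for the downloaded prices.
# All values here are made up for illustration.
dates = pd.date_range("2017-01-02", periods=5, freq="B")
long_df = pd.concat(
    [
        pd.DataFrame(
            {"High": np.linspace(base, base + 4, 5), "name": name},
            index=dates,
        )
        for name, base in [("AAPL", 100.0), ("MSFT", 60.0), ("XOM", 80.0)]
    ]
)
long_df.index.name = "Date"

# Long -> wide: one row per date, one column per ticker.
wide = long_df.reset_index().pivot(index="Date", columns="name", values="High")

# DataFrame.plot draws one line per column, so each ticker gets its own line.
ax = wide.plot(figsize=(8, 5))
ax.set_ylabel("High")
```

In a notebook with %matplotlib inline the figure appears directly; in a plain script you would likely need to call plt.show() from matplotlib.pyplot afterwards.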
You should group your data by name and then plot. Something like data.groupby('name').plot() should get you started. You may need to pass date as the x value and high as the y value. I can't test it myself at the moment as I am on mobile.
Update
After getting to a computer, I realized I was a bit off. You need to reset the index before grouping, then plot, and finally update the legend. Like so:
fig, ax = plt.subplots()
# groupby iterates over groups in sorted key order, so sort the labels to match
names = sorted(data.name.unique())
data.reset_index().groupby('name').plot(x='Date', y='High', ax=ax)
plt.legend(names)
plt.show()
Granted, if you want this graph to make any sense, you will need to adjust the values in some way, as BRK-A is far more expensive than any of the other equities.
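One possible adjustment is to rebase every series to its first value, so that all lines start at 1.0 and show relative change. This is only a sketch of one option, not the adjustment the answer necessarily had in mind; the prices below are made up and the high frame stands in for a wide table of High values with one column per ticker.

```python
import pandas as pd

# Made-up wide-format prices: one column per ticker, indexed by date.
dates = pd.date_range("2017-01-02", periods=4, freq="B")
high = pd.DataFrame(
    {
        "AAPL": [100.0, 102.0, 101.0, 105.0],
        "BRK-A": [250000.0, 252500.0, 255000.0, 257500.0],
    },
    index=dates,
)

# Rebase: divide each column by its first row so every series starts at 1.0.
rebased = high / high.iloc[0]
ax = rebased.plot(figsize=(8, 5))
ax.set_ylabel("High relative to first day")

# Alternative: keep the raw prices and use a logarithmic y-axis instead.
ax_log = high.plot(logy=True, figsize=(8, 5))
```

Rebasing keeps the shape of each series comparable, while the log axis keeps the actual price levels readable on one chart.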
@Psidom and @Grr have already given you very good answers.
I just wanted to add that pandas_datareader allows us to conveniently read all the data into a pandas.Panel in one step:
p = web.DataReader(tickers, 'yahoo', start, end)
Now we can easily slice it as we wish:
# i'll intentionally exclude `BRK-A` as it spoils the whole graph
p.loc['High', :, ~p.minor_axis.isin(['BRK-A'])].plot(figsize=(10,8))
Alternatively, you can slice on the fly and save only the High values:
df = web.DataReader(tickers, 'yahoo', start, end).loc['High']
which gives us:
In [68]: df
Out[68]:
AAPL AMZN BRK-A FB GE GOOG JNJ MSFT WFC XOM
Date
2014-03-13 539.659988 383.109985 188852.0 71.349998 26.000000 1210.502120 94.199997 38.450001 48.299999 94.570000
2014-03-14 530.890015 378.570007 186507.0 69.430000 25.379999 1190.872020 93.440002 38.139999 48.070000 94.220001
2014-03-17 529.969994 378.850006 185790.0 68.949997 25.629999 1197.072063 94.180000 38.410000 48.169998 94.529999
2014-03-18 531.969986 379.000000 185400.0 69.599998 25.730000 1211.532091 94.239998 39.900002 48.450001 95.250000
2014-03-19 536.239990 379.000000 185489.0 69.290001 25.700001 1211.992061 94.360001 39.549999 48.410000 95.300003
2014-03-20 532.669975 373.000000 186742.0 68.230003 25.370001 1209.612076 94.190002 40.650002 49.360001 94.739998
2014-03-21 533.750000 372.839996 188598.0 67.919998 25.830000 1209.632048 95.930000 40.939999 49.970001 95.989998
... ... ... ... ... ... ... ... ... ... ...
2017-03-02 140.279999 854.820007 266445.0 137.820007 30.230000 834.510010 124.360001 64.750000 59.790001 84.250000
2017-03-03 139.830002 851.989990 264690.0 137.330002 30.219999 831.359985 123.930000 64.279999 59.240002 83.599998
2017-03-06 139.770004 848.489990 263760.0 137.830002 30.080000 828.880005 124.430000 64.559998 58.880001 82.900002
2017-03-07 139.979996 848.460022 263560.0 138.369995 29.990000 833.409973 124.459999 64.779999 58.520000 83.290001
2017-03-08 139.800003 853.070007 263900.0 137.990005 29.940001 838.150024 124.680000 65.080002 59.130001 82.379997
2017-03-09 138.789993 856.400024 263620.0 138.570007 29.830000 842.000000 126.209999 65.199997 58.869999 81.720001
2017-03-10 139.360001 857.349976 263800.0 139.490005 30.430000 844.909973 126.489998 65.260002 59.180000 82.470001
[755 rows x 10 columns]
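Note that Panel was removed in pandas 0.25, so the slicing above only applies to older versions. A rough equivalent, assuming you have one DataFrame per ticker (for example from the loop in the question), is to concatenate them column-wise with the ticker as the outer column level. In this sketch the High values are rounded from the table above, the Low values are made-up placeholders, and the frames dict is a hypothetical stand-in for the downloaded data.

```python
import pandas as pd

dates = pd.date_range("2017-03-06", periods=3, freq="B")

# One small frame per ticker. High is rounded from the output above;
# Low is a made-up placeholder so there is more than one field to slice.
frames = {
    "AAPL": pd.DataFrame(
        {"High": [139.77, 139.98, 139.80], "Low": [138.0, 138.5, 138.2]},
        index=dates,
    ),
    "BRK-A": pd.DataFrame(
        {"High": [263760.0, 263560.0, 263900.0], "Low": [262000.0, 262100.0, 262300.0]},
        index=dates,
    ),
    "MSFT": pd.DataFrame(
        {"High": [64.56, 64.78, 65.08], "Low": [64.0, 64.2, 64.5]},
        index=dates,
    ),
}

# Passing a dict to concat uses its keys as the outer column level:
# the columns become a MultiIndex of (ticker, field).
combined = pd.concat(frames, axis=1)

# Select the High field for every ticker, then drop BRK-A as in the answer.
high = combined.xs("High", axis=1, level=1).drop(columns=["BRK-A"])
ax = high.plot(figsize=(10, 8))
```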