Python Pandas plot using dataframe column values

Question

I'm trying to plot a graph using dataframes.

I'm using 'pandas_datareader' to get the data.

My code is below:

import datetime as dt

import matplotlib.pyplot as plt
import pandas as pd
import pandas_datareader.data as web

%matplotlib inline

tickers = ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]

# Date range: the last three years up to today
end = dt.datetime.now().strftime("%Y-%m-%d")
start = (dt.datetime.now()-dt.timedelta(days=365*3)).strftime("%Y-%m-%d")

# Download each ticker and tag its rows with the ticker name
data = []
for ticker in tickers:
    sub_df = web.get_data_yahoo(ticker, start, end)
    sub_df["name"] = ticker
    data.append(sub_df)
data = pd.concat(data)

So the variable data has 8 columns (counting the Date index): ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close', 'name']

The variable 'data' is shown below:

What I want to do is plot a graph with the 'Date' values on the x-axis and 'High' on the y-axis, with one line per value of the 'name' column (= ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"]).

How can I do this?

When I executed data.plot(), the result uses the dates as the x-axis correctly, but it draws the 6 columns ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'] rather than the 10 tickers ["AAPL","GOOG","MSFT","XOM","BRK-A","FB","JNJ","GE","AMZN","WFC"] that I want. The result is below:

Answer 1

You need to reshape your data so that the names become the column headers of the data frame. Since you only want to plot High, you can extract the High and name columns, transform them to wide format, and then plot:

import matplotlib as mpl
mpl.rcParams['savefig.dpi'] = 120

high = data[["High", "name"]].set_index("name", append=True).High.unstack("name")

# notice here I scale down the BRK-A column so that it will be at the same scale as other columns
high['BRK-A'] = high['BRK-A']/1000
high.head()

ax = high.plot(figsize = (16, 10))
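The long-to-wide reshape above can also be expressed with pivot. The following is only a minimal sketch on made-up data: the three tickers, the dates, and the prices are assumptions standing in for the downloaded frame, not real quotes.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded data: a Date index, a "High" column
# and a "name" column, stacked in long format (all values are made up).
dates = pd.date_range("2017-01-02", periods=3, freq="D")
frames = []
for i, ticker in enumerate(["AAPL", "GOOG", "MSFT"]):
    sub_df = pd.DataFrame({"High": np.arange(3, dtype=float) + 10 * i}, index=dates)
    sub_df.index.name = "Date"
    sub_df["name"] = ticker
    frames.append(sub_df)
data = pd.concat(frames)

# Long to wide: one row per date, one column per ticker.
high = data.reset_index().pivot(index="Date", columns="name", values="High")
print(high)
```

Calling high.plot(figsize=(16, 10)) on the result should then draw one line per ticker, as in the answer.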

Answer 2

You should group your data by name and then plot. Something like data.groupby('name').plot() should get you started. You may need to feed in Date as the x value and High as the y value. I can't test it myself at the moment as I am on mobile.

Update

After getting to a computer I realized I was a bit off. You need to reset the index before grouping, then plot each group onto the same axes with its own label so that the legend matches the lines. Like so:

fig, ax = plt.subplots()
# groupby iterates the names in sorted order, so label each line from its
# own group key instead of relying on the order of data.name.unique()
for name, group in data.reset_index().groupby('name'):
    group.plot(x='Date', y='High', ax=ax, label=name)
plt.show()

Granted, if you want this graph to make any sense you will need to adjust the values in some way, as BRK-A is far more expensive than any of the other equities.
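One possible adjustment, offered as a sketch rather than as what this answer prescribes, is to rebase every series to its first value so that all lines start at 1.0 and show relative change. The example assumes the prices are already in wide format (one column per ticker) and uses made-up numbers.

```python
import pandas as pd

# Made-up wide-format prices: one column per ticker, one row per date.
high = pd.DataFrame(
    {"AAPL": [100.0, 110.0, 120.0], "BRK-A": [200000.0, 210000.0, 190000.0]},
    index=pd.date_range("2017-01-02", periods=3, freq="D"),
)

# Divide every column by its first value so each series starts at 1.0.
rebased = high / high.iloc[0]
print(rebased)
```

Plotting rebased instead of high puts BRK-A on the same scale as the other tickers without picking an arbitrary divisor.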

Answer 3

@Psidom and @Grr have already given you very good answers.

I just wanted to add that pandas_datareader allows us to read all the data into a Pandas.Panel conveniently in one step:

p = web.DataReader(tickers, 'yahoo', start, end)

Now we can easily slice it as we wish:

# I'll intentionally exclude `BRK-A` as it spoils the whole graph
p.loc['High', :, ~p.minor_axis.isin(['BRK-A'])].plot(figsize=(10,8))

Alternatively, you can slice on the fly and keep only the High values:

df = web.DataReader(tickers, 'yahoo', start, end).loc['High']

which gives us:

In [68]: df
Out[68]:
                  AAPL        AMZN     BRK-A          FB         GE         GOOG         JNJ       MSFT        WFC        XOM
Date
2014-03-13  539.659988  383.109985  188852.0   71.349998  26.000000  1210.502120   94.199997  38.450001  48.299999  94.570000
2014-03-14  530.890015  378.570007  186507.0   69.430000  25.379999  1190.872020   93.440002  38.139999  48.070000  94.220001
2014-03-17  529.969994  378.850006  185790.0   68.949997  25.629999  1197.072063   94.180000  38.410000  48.169998  94.529999
2014-03-18  531.969986  379.000000  185400.0   69.599998  25.730000  1211.532091   94.239998  39.900002  48.450001  95.250000
2014-03-19  536.239990  379.000000  185489.0   69.290001  25.700001  1211.992061   94.360001  39.549999  48.410000  95.300003
2014-03-20  532.669975  373.000000  186742.0   68.230003  25.370001  1209.612076   94.190002  40.650002  49.360001  94.739998
2014-03-21  533.750000  372.839996  188598.0   67.919998  25.830000  1209.632048   95.930000  40.939999  49.970001  95.989998
...                ...         ...       ...         ...        ...          ...         ...        ...        ...        ...
2017-03-02  140.279999  854.820007  266445.0  137.820007  30.230000   834.510010  124.360001  64.750000  59.790001  84.250000
2017-03-03  139.830002  851.989990  264690.0  137.330002  30.219999   831.359985  123.930000  64.279999  59.240002  83.599998
2017-03-06  139.770004  848.489990  263760.0  137.830002  30.080000   828.880005  124.430000  64.559998  58.880001  82.900002
2017-03-07  139.979996  848.460022  263560.0  138.369995  29.990000   833.409973  124.459999  64.779999  58.520000  83.290001
2017-03-08  139.800003  853.070007  263900.0  137.990005  29.940001   838.150024  124.680000  65.080002  59.130001  82.379997
2017-03-09  138.789993  856.400024  263620.0  138.570007  29.830000   842.000000  126.209999  65.199997  58.869999  81.720001
2017-03-10  139.360001  857.349976  263800.0  139.490005  30.430000   844.909973  126.489998  65.260002  59.180000  82.470001

[755 rows x 10 columns]
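Later pandas releases removed Panel, so the slicing above may not run on a current installation. A similar per-ticker table of High values can be built from ordinary DataFrames. The sketch below is an assumption-laden illustration, not the pandas_datareader API: the frames dict is a hypothetical stand-in for per-ticker downloads, and its values are made up.

```python
import pandas as pd

dates = pd.date_range("2017-03-08", periods=2, freq="D")

# Hypothetical per-ticker frames standing in for downloaded data (values made up).
frames = {
    "AAPL": pd.DataFrame({"High": [139.8, 138.8], "Low": [138.0, 137.0]}, index=dates),
    "MSFT": pd.DataFrame({"High": [65.1, 65.2], "Low": [64.0, 64.5]}, index=dates),
}

# Concatenating a dict side by side yields columns indexed by (ticker, field).
wide = pd.concat(frames, axis=1)

# A cross-section on the field level keeps one "High" column per ticker.
df = wide.xs("High", axis=1, level=1)
print(df)
```

The resulting df has the same layout as the output shown above: dates as rows and one column per ticker.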
