Converting MultiIndex Pandas DataFrame to Pivot
Seems like this should be easy, but I cannot get it to work for the life of me.
I have a dataframe for stock data like below. How can I convert the below dataframe into a pivot table with the date as the rows, stock symbols as the columns, and Adj Close as the values (picture at bottom)?
I am getting the dataframe with this code:
pricing = web.DataReader(['MSFT', 'AAPL'], 'yahoo', datetime.datetime(2020, 1, 1), datetime.datetime(2020, 2, 10))
EDIT:
If I just do
pricing = web.DataReader(['MSFT', 'AAPL'], 'yahoo', datetime.datetime(2020, 1, 1), datetime.datetime(2020, 2, 10))['Adj Close']
then I run into problems using .loc[] to grab data using another pivot's index.
I have a second pivot (shown below) and when I try to do pricing.loc[pivot2.index] I get an error.
Error:
KeyError: "None of [DatetimeIndex(['1999-01-01', '2000-01-01', '2003-01-01', '2004-01-01',\\n '2005-01-01', '2006-01-01', '2007-01-01', '2008-01-01',\\n '2009-01-01', '2010-01-01', '2011-01-01', '2012-01-01',\\n '2013-01-01', '2014-01-01', '2015-01-01', '2016-01-01',\\n '2017-01-01', '2018-01-01', '2019-01-01'],\\n dtype='datetime64[ns]', name='date', freq=None)] are in the [index]"
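A note on this error: every date in the message is January 1 of a year from 1999 to 2019, while the request above covers 2020-01-01 to 2020-02-10, so none of those labels can exist in pricing's index. The sketch below reproduces that situation with made-up data and shows two possible workarounds. The name other_index is a hypothetical stand-in for pivot2.index, and the prices are invented for illustration.

```python
import pandas as pd

# Stand-in for the 'Adj Close' frame: a few dates in early 2020 (made-up values).
idx = pd.date_range("2020-01-02", periods=5, freq="D", name="Date")
pricing = pd.DataFrame(
    {"MSFT": [1.0, 2.0, 3.0, 4.0, 5.0], "AAPL": [6.0, 7.0, 8.0, 9.0, 10.0]},
    index=idx,
)

# Hypothetical stand-in for pivot2.index: yearly dates absent from pricing.index.
other_index = pd.DatetimeIndex(["2018-01-01", "2019-01-01"], name="date")

# .loc with a list-like of labels raises KeyError when labels are missing.
try:
    pricing.loc[other_index]
    raised = False
except KeyError:
    raised = True

# Option 1: reindex keeps the requested labels and fills missing rows with NaN.
aligned = pricing.reindex(other_index)

# Option 2: keep only the labels that both indexes share (empty here).
common = pricing.index.intersection(other_index)
subset = pricing.loc[common]
```

Which option fits depends on whether rows for the missing dates should appear as NaN or be dropped. If the two frames are meant to overlap, the date ranges themselves probably need to be brought in line first.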
Try using:
pricing['Adj Close']
where,
import numpy as np
import pandas as pd

pricing = pd.DataFrame(np.random.randint(50, 75, (10, 6)),
                       index=pd.date_range('01/2/20', periods=10, freq='D'),
                       columns=pd.MultiIndex.from_product([['Adj Close', 'Close', 'High'], ['MSFT', 'AAPL']]))
           Adj Close      Close      High
                MSFT AAPL  MSFT AAPL MSFT AAPL
2020-01-02        67   71    58   60   50   53
2020-01-03        54   59    64   72   62   50
2020-01-04        51   53    56   70   63   51
2020-01-05        64   71    74   62   68   62
2020-01-06        74   68    69   71   60   62
2020-01-07        55   55    51   70   74   72
2020-01-08        60   58    74   70   73   69
2020-01-09        51   58    72   54   50   61
2020-01-10        64   56    74   52   59   57
2020-01-11        55   50    68   61   60   59
Using basic indexing on an axis with a MultiIndex, we can just select 'Adj Close' at level 0.
pricing['Adj Close']
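Output (the values differ from the sample frame above, presumably because np.random.randint drew new numbers on a separate run):
            MSFT  AAPL
2020-01-02    66    51
2020-01-03    67    67
2020-01-04    74    74
2020-01-05    73    66
2020-01-06    68    52
2020-01-07    67    50
2020-01-08    73    54
2020-01-09    66    52
2020-01-10    62    73
2020-01-11    61    71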
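As a possible alternative to basic indexing, DataFrame.xs can take the same cross-section while naming the axis and level explicitly. The sketch below rebuilds a frame shaped like the sample above. The fixed seed and the variable names adj_basic, adj_xs and msft are assumptions added for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same shape as the sample frame above; the seed only makes runs repeatable.
rng = np.random.default_rng(0)
pricing = pd.DataFrame(
    rng.integers(50, 75, (10, 6)),
    index=pd.date_range("2020-01-02", periods=10, freq="D"),
    columns=pd.MultiIndex.from_product(
        [["Adj Close", "Close", "High"], ["MSFT", "AAPL"]]
    ),
)

# Basic indexing on the outer column level.
adj_basic = pricing["Adj Close"]

# Cross-section on level 0 of the columns; the selected level is dropped.
adj_xs = pricing.xs("Adj Close", axis=1, level=0)

# A single symbol can be selected with a tuple key, which returns a Series.
msft = pricing[("Adj Close", "MSFT")]

print(adj_xs.head())
```

Both selections give a frame with dates as rows and stock symbols as columns, which is the layout the question asks for. xs mainly helps when the level to select is not the outermost one.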