
Pandas Pivot Table Subsetting

My pivot table looks like this:

Symbol     DIA   QQQ        SPY   XLE   DIA   QQQ        SPY   XLE  DIA  QQQ  \
          Open  Open       Open  Open  High  High       High  High  Low  Low   
Date                                                                           
19930129   NaN   NaN  29.083294   NaN   NaN   NaN  29.083294   NaN  NaN  NaN   
19930201   NaN   NaN  29.083294   NaN   NaN   NaN  29.269328   NaN  NaN  NaN   
19930202   NaN   NaN  29.248658   NaN   NaN   NaN  29.352010   NaN  NaN  NaN   
19930203   NaN   NaN  29.372680   NaN   NaN   NaN  29.662066   NaN  NaN  NaN   
19930204   NaN   NaN  29.744748   NaN   NaN   NaN  29.827430   NaN  NaN  NaN   

Symbol          SPY  XLE    DIA    QQQ        SPY    XLE           DIA  \
                Low  Low  Close  Close      Close  Close  Total Volume   
Date                                                                     
19930129  28.938601  NaN    NaN    NaN  29.062624    NaN           NaN   
19930201  29.083294  NaN    NaN    NaN  29.269328    NaN           NaN   
19930202  29.186647  NaN    NaN    NaN  29.331340    NaN           NaN   
19930203  29.352010  NaN    NaN    NaN  29.641396    NaN           NaN   
19930204  29.414021  NaN    NaN    NaN  29.765419    NaN           NaN   

Symbol             QQQ           SPY           XLE  
          Total Volume  Total Volume  Total Volume  
Date                                                
19930129           NaN         15167           NaN  
19930201           NaN          7264           NaN  
19930202           NaN          3043           NaN  
19930203           NaN          8004           NaN  
19930204           NaN          8035           NaN 

How does one go about subsetting for a particular day and for a particular column value, say Closing prices for all symbols?

19930129 NaN NaN 29.062624 NaN

I tried pt['Close'], but it didn't work. Only pt['SPY'] works, and it gives me the whole table of values for symbol SPY.

You could use pd.IndexSlice:

pt = pt.sort_index(axis=1)
pt.loc['19930129', pd.IndexSlice[:, 'Close']]

Using pd.IndexSlice requires the selection axis to be fully lexsorted, hence the call to sort_index. (Older pandas versions used pt.sortlevel(axis=1) for this; sortlevel was deprecated in pandas 0.20 and later removed.)

Alternatively, slice(None) could also be used to select everything from the first column index level:

pt = pt.sort_index(axis=1)
pt.loc['19930129', (slice(None), 'Close')]
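As a minimal sketch of the two selections above, the snippet below builds a small stand-in for the pivot table. The two-symbol data, the string-valued Date index, and the level name "Field" are assumptions made for illustration, not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the pivot table: columns are (Symbol, Field) pairs.
columns = pd.MultiIndex.from_product(
    [["SPY", "DIA"], ["Open", "Close"]], names=["Symbol", "Field"]
)
pt = pd.DataFrame(
    [[29.08, 29.06, np.nan, np.nan],
     [29.08, 29.27, np.nan, np.nan]],
    index=pd.Index(["19930129", "19930201"], name="Date"),
    columns=columns,
)

# Lexsort the column MultiIndex before slicing on its second level.
pt = pt.sort_index(axis=1)

# Both spellings select the 'Close' entry of every symbol for one date.
close_by_slice = pt.loc["19930129", pd.IndexSlice[:, "Close"]]
close_by_tuple = pt.loc["19930129", (slice(None), "Close")]
```

Both results are Series that keep the two-level (Symbol, Field) index, so the SPY close can be read back with close_by_slice[("SPY", "Close")].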

To select the ith row, but select the columns by label, you could use

pt.loc[pt.index[i], (slice(None), 'Close')]
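A short sketch of this position-to-label translation follows, together with a purely positional alternative based on a boolean column mask. The sample data and the level name "Field" are hypothetical, and the label-based form assumes the index labels are unique.

```python
import numpy as np
import pandas as pd

# Hypothetical two-symbol stand-in for the pivot table in the question.
columns = pd.MultiIndex.from_product(
    [["DIA", "SPY"], ["Close", "Open"]], names=["Symbol", "Field"]
)
pt = pd.DataFrame(
    [[np.nan, np.nan, 29.06, 29.08],
     [np.nan, np.nan, 29.27, 29.08]],
    index=pd.Index(["19930129", "19930201"], name="Date"),
    columns=columns,
)

i = 1  # ordinal row position

# Label-based: translate the position into its row label first.
by_label = pt.loc[pt.index[i], (slice(None), "Close")]

# Purely positional: build a boolean mask over the columns and use iloc.
is_close = pt.columns.get_level_values("Field") == "Close"
by_position = pt.iloc[i, is_close]
```

The iloc variant never looks at row labels, so it behaves the same whether the index holds strings or integers.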

Or, you could use pt.ix as Andy Hayden suggests, but be aware that if pt has an integer-valued index, then pt.ix performs label-based row indexing, not ordinal indexing. Note also that .ix was deprecated in pandas 0.20 and removed in pandas 1.0, so this only applies to older versions.

So as long as 19930129 (and the other index values) are not integers, i.e. pt.index is not an Int64Index, you could use

pt.ix[i, (slice(None), 'Close')]

Note that chained indexing, such as

pt.iloc[i].loc[(slice(None), 'Close')]

should be avoided when performing assignments, since assignment with chained indexing may fail to modify pt .
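For assignments, a single .loc call that names both the row and the columns avoids the problem. The sketch below uses hypothetical sample data and an arbitrary replacement value of 0.0.

```python
import numpy as np
import pandas as pd

# Hypothetical two-symbol stand-in for the pivot table in the question.
columns = pd.MultiIndex.from_product(
    [["DIA", "SPY"], ["Close", "Open"]], names=["Symbol", "Field"]
)
pt = pd.DataFrame(
    [[np.nan, np.nan, 29.06, 29.08],
     [np.nan, np.nan, 29.27, 29.08]],
    index=pd.Index(["19930129", "19930201"], name="Date"),
    columns=columns,
)

i = 1  # ordinal row position

# One indexing operation on pt itself, so the write lands in pt
# rather than in a temporary object produced by a first indexing step.
pt.loc[pt.index[i], (slice(None), "Close")] = 0.0
```

Only the Close columns of the selected row change; the Open columns and the other rows are left as they were.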

An alternative is to use xs, "cross-section":

In [21]: df.xs(axis=1, level=1, key="Open")
Out[21]:
Symbol    DIA  QQQ        SPY  XLE
Date
19930129  NaN  NaN  29.083294  NaN
19930201  NaN  NaN  29.083294  NaN
19930202  NaN  NaN  29.248658  NaN
19930203  NaN  NaN  29.372680  NaN
19930204  NaN  NaN  29.744748  NaN

In [22]: df.xs(axis=1, level=1, key="Open").loc[19930129]
Out[22]:
Symbol
DIA          NaN
QQQ          NaN
SPY    29.083294
XLE          NaN
Name: 19930129, dtype: float64

This is somewhat less powerful than unutbu's answer (using IndexSlice).
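A self-contained sketch of the xs approach, applied to Close rather than Open, is shown below. The sample data and the level name "Field" are assumptions. By default xs drops the level it selects on, and drop_level=False keeps it.

```python
import numpy as np
import pandas as pd

# Hypothetical two-symbol stand-in for the pivot table in the question.
columns = pd.MultiIndex.from_product(
    [["DIA", "SPY"], ["Close", "Open"]], names=["Symbol", "Field"]
)
pt = pd.DataFrame(
    [[np.nan, np.nan, 29.06, 29.08],
     [np.nan, np.nan, 29.27, 29.08]],
    index=pd.Index(["19930129", "19930201"], name="Date"),
    columns=columns,
)

# Cross-section on the second column level: one column per symbol remains.
closes = pt.xs("Close", axis=1, level="Field")
one_day = closes.loc["19930129"]

# Keep the 'Field' level in the result instead of dropping it.
kept = pt.xs("Close", axis=1, level="Field", drop_level=False)
```

Here xs is given a single key for the chosen level, whereas pd.IndexSlice also accepts a list of labels such as pd.IndexSlice[:, ['Open', 'Close']], which is one sense in which it is the more flexible of the two.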
