
How to select rows in the last key of first level in a Multiindex dataframe?

I have a Pandas DataFrame that looks like the following:

                       data
date       signal                     
2012-11-01 a           0.04
           b           0.03
2012-12-01 a          -0.01
           b           0.00
2013-01-01 a          -0.00
           b          -0.01

I am trying to get only the rows under the last key of the first level of the MultiIndex, which is date in this case:

2013-01-01 a          -0.00
           b          -0.01

The first index level is a datetime. What would be the most elegant way to select these last rows?

One way is to access the MultiIndex's levels directly (and use the last one):

In [11]: df.index.levels
Out[11]: [Index([bar, baz, foo, qux], dtype=object), Index([one, two], dtype=object)]

In [12]: df.index.levels[0][-1]
Out[12]: 'qux'
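One caveat with reading `levels` directly: after rows are filtered out, a MultiIndex can still list keys that no longer occur in the data. The sketch below is only an illustration, with a seeded random frame standing in for the example DataFrame, and shows `remove_unused_levels()` as one way to guard against that.

```python
import numpy as np
import pandas as pd

# Stand-in for the example DataFrame (values are random; the seed is an assumption).
arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=arrays)

# Drop every 'qux' row; the level values themselves are not pruned.
filtered = df[df.index.get_level_values(0) != "qux"]
stale_last = filtered.index.levels[0][-1]

# Pruning unused level values gives the last key that is really present.
clean_last = filtered.index.remove_unused_levels().levels[0][-1]
print(stale_last, clean_last)
```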

And select these rows with ix (note that ix was deprecated in pandas 0.20 and removed in 1.0; loc is the label-based replacement):

In [13]: df.ix[df.index.levels[0][-1]]
Out[13]:
            0         1         2         3
one  1.225973 -0.703952  0.265889  1.069345
two -1.521503  0.024696  0.109501 -1.584634

In [14]: df.ix[df.index.levels[0][-1]:]
Out[14]:
                0         1         2         3
qux one  1.225973 -0.703952  0.265889  1.069345
    two -1.521503  0.024696  0.109501 -1.584634

(Using @Jeff's example DataFrame.)
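On current pandas versions the same two selections can be written with `loc`. This is a minimal sketch, assuming a sorted MultiIndex and using a seeded random frame in place of the example DataFrame, so the numbers will differ from the output above.

```python
import numpy as np
import pandas as pd

# Stand-in for the example DataFrame (random values; the seed is an assumption).
arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=arrays)

last_key = df.index.levels[0][-1]

# Scalar label: the first level is dropped from the result.
dropped = df.loc[last_key]

# Label slice from the last key onward: both levels are kept.
kept = df.loc[last_key:]

print(dropped)
print(kept)
```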

Perhaps a more elegant way is to use tail (if you know the last group will always have exactly two rows):

In [15]: df.tail(2)
Out[15]:
                0         1         2         3
qux one  1.225973 -0.703952  0.265889  1.069345
    two -1.521503  0.024696  0.109501 -1.584634
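If the size of the last group is not known in advance, it can be counted first and then passed to `tail`. The following is a sketch under the assumption that the index is sorted, so the last group sits contiguously at the end; the frame is a seeded random stand-in for the example DataFrame.

```python
import numpy as np
import pandas as pd

# Stand-in for the example DataFrame (random values; the seed is an assumption).
arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=arrays)

# First-level key of the very last row.
last_key = df.index[-1][0]

# Number of rows that share that key.
n_last = int((df.index.get_level_values(0) == last_key).sum())

# Take exactly that many rows from the end.
last_group = df.tail(n_last)
print(last_group)
```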

In pandas 0.11 (coming this week), this is a reasonable way to do it:

In [50]: arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
   .....:           np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]

In [51]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays)

In [52]: df
Out[52]: 
                0         1         2         3
bar one -1.798562  0.852583 -0.148094 -2.107990
    two -1.091486 -0.748130  0.519758  2.621751
baz one -1.257548  0.210936 -0.338363 -0.141486
    two -0.810674  0.323798 -0.030920 -0.510224
foo one -0.427309  0.933469 -1.259559 -0.771702
    two -2.060524  0.795388 -1.458060 -1.762406
qux one -0.574841  0.023691 -1.567137  0.462715
    two  0.936323  0.346049 -0.709112  0.045066

In [53]: df.loc['qux'].iloc[[-1]]
Out[53]: 
            0         1         2         3
two  0.936323  0.346049 -0.709112  0.045066

This will work in 0.10.1:

In [63]: df.ix['qux'].ix[-1]
Out[63]: 
0    0.936323
1    0.346049
2   -0.709112
3    0.045066
Name: two, dtype: float64

Another way, which also works in 0.10.1:

In [59]: df.xs(('qux','two'))
Out[59]: 
0    0.936323
1    0.346049
2   -0.709112
3    0.045066
Name: (qux, two), dtype: float64
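`xs` can also select on a single level, which returns every row under a first-level key rather than one specific row. A short sketch, again on a seeded random stand-in for the example DataFrame; `drop_level=False` is used here to keep the first level in the result.

```python
import numpy as np
import pandas as pd

# Stand-in for the example DataFrame (random values; the seed is an assumption).
arrays = [
    np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),
    np.array(["one", "two", "one", "two", "one", "two", "one", "two"]),
]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((8, 4)), index=arrays)

# All rows under 'qux', with the first level dropped (the default).
without_level = df.xs("qux")

# Same rows, but keeping the full two-level index.
with_level = df.xs("qux", level=0, drop_level=False)

print(without_level)
print(with_level)
```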

If you have a dataframe df with a MultiIndex already defined, then:

df2 = df.ix[df.index[len(df.index)-1][0]]

would also work.
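The same idea applied to the DataFrame from the question might look like the sketch below. It reads the last key with `df.index[-1][0]` and uses a boolean mask instead of `ix`; the frame is rebuilt from the values shown in the question, and the variable names are only illustrative.

```python
import pandas as pd

# Rebuild the question's DataFrame from the values shown there.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2012-11-01", "2012-12-01", "2013-01-01"]), ["a", "b"]],
    names=["date", "signal"],
)
df = pd.DataFrame({"data": [0.04, 0.03, -0.01, 0.00, -0.00, -0.01]}, index=index)

# Date of the very last row (first element of the last index tuple).
last_date = df.index[-1][0]

# Keep every row whose date equals that value; both index levels are preserved.
mask = df.index.get_level_values("date") == last_date
last_rows = df[mask]
print(last_rows)
```

Note that `df.index[-1][0]` is the key of the last row as stored, so it only matches the latest date when the index is sorted by date.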

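You can use iloc to get the last row (this returns only the single last row, not every row under the last first-level key):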

   df.iloc[-1]

