
How to compare the data and choose the maximum one from multiIndex dataframe in pandas?

For example:

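import numpy as np
import pandas as pd

# Two-level index: the first array is the group, the second the item within it
arrays = [np.array(['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['AA', 'AB', 'AC', 'BA', 'BB', 'CA', 'CB', 'DA', 'DB'])]
df = pd.DataFrame(np.random.randn(9, 1), index=arrays)
df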

Output:

         0
bar AA   1.740325
    AB   2.017906
    AC  -0.873244
baz BA  -1.761734
    BB   0.467648
foo CA   0.740907
    CB  -0.322276
qux DA   0.607481
    DB  -0.460324

For each first-level group, I want to keep only the row with the maximum value, like this:

   1    2   0
0  bar  AB  2.017906
1  baz  BB  0.467648
2  foo  CA  0.740907
3  qux  DA  0.607481

I found an answer that works:

v = df.groupby(level=0).idxmax().values
df.loc[v.ravel()]
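
To see why the .ravel() call is needed, it may help to inspect what idxmax returns here. The sketch below reuses the values printed in the question instead of random numbers, so the frame is an assumption made for reproducibility. It converts the labels to a plain list before passing them to .loc, which is a small variation on the snippet above.

```python
import numpy as np
import pandas as pd

# Fixed values taken from the question's printed output (not random).
index = pd.MultiIndex.from_arrays([
    ['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
    ['AA', 'AB', 'AC', 'BA', 'BB', 'CA', 'CB', 'DA', 'DB'],
])
values = [1.740325, 2.017906, -0.873244, -1.761734, 0.467648,
          0.740907, -0.322276, 0.607481, -0.460324]
df = pd.DataFrame({0: values}, index=index)

# Without selecting a column, idxmax returns a DataFrame:
# one row per group, one column per original column.
v = df.groupby(level=0).idxmax().values
print(v.shape)        # 2-D: (number of groups, number of columns)

# ravel() flattens that to a 1-D array of (level 0, level 1) labels.
labels = list(v.ravel())
print(labels)

# A list of tuples selects the matching rows of a MultiIndex frame.
result = df.loc[labels]
print(result)
```

Because the frame has a single column, the 2-D array has shape (4, 1) and flattening it loses nothing. With several columns, each column could pick a different row per group, so selecting the column first is the clearer option.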

The solution can be simplified by selecting the column to check for maximum values before calling DataFrameGroupBy.idxmax:

import numpy as np
import pandas as pd

np.random.seed(234)
arrays = [np.array(['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['AA', 'AB', 'AC', 'BA', 'BB', 'CA', 'CB', 'DA', 'DB'])]
df = pd.DataFrame(np.random.randn(9, 1), index=arrays)
print(df)

               0
bar AA  0.818792
    AB -1.043551
    AC  0.350901
baz BA  0.921578
    BB -0.087382
foo CA -3.128885
    CB -0.969733
qux DA  0.934666

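# idxmax on column 0 returns one (level 0, level 1) label per group;
# .loc then selects exactly those rows
df = df.loc[df.groupby(level=0)[0].idxmax()]
print(df)
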
               0
bar AA  0.818792
baz BA  0.921578
foo CB -0.969733
qux DA  0.934666
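
The question asks for a flat table with a plain integer index rather than a MultiIndex. A minimal sketch of that last step, assuming reset_index is an acceptable way to get there, is shown below. It uses the values printed in the question so the result can be compared with the desired output; the column names level_0 and level_1 are the defaults pandas assigns to unnamed index levels.

```python
import pandas as pd

# Fixed values taken from the question's printed output (not random).
index = pd.MultiIndex.from_arrays([
    ['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
    ['AA', 'AB', 'AC', 'BA', 'BB', 'CA', 'CB', 'DA', 'DB'],
])
values = [1.740325, 2.017906, -0.873244, -1.761734, 0.467648,
          0.740907, -0.322276, 0.607481, -0.460324]
df = pd.DataFrame({0: values}, index=index)

# One (level 0, level 1) label per first-level group.
idx = df.groupby(level=0)[0].idxmax()

# Keep only the rows holding each group's maximum.
best = df.loc[idx.tolist()]

# Move both index levels into ordinary columns and number the rows 0..n-1.
flat = best.reset_index()
print(flat)
```

If different column names are wanted, they can be assigned afterwards, for example with flat.columns = [...].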
