How do I compare the values and select the maximum row per group from a MultiIndex DataFrame in pandas?
For example:
import numpy as np
import pandas as pd
arrays = [np.array(['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['AA', 'AB', 'AC', 'BA', 'BB', 'CA', 'CB', 'DA', 'DB'])]
df = pd.DataFrame(np.random.randn(9, 1), index=arrays)
df
Output:
                0
bar AA   1.740325
    AB   2.017906
    AC  -0.873244
baz BA  -1.761734
    BB   0.467648
foo CA   0.740907
    CB  -0.322276
qux DA   0.607481
    DB  -0.460324
In the end, I want to keep only the row with the maximum value in each first-level group, like this:
     1   2         0
0  bar  AB  2.017906
1  baz  BB  0.467648
2  foo  CA  0.740907
3  qux  DA  0.607481
I found an answer here:
v = df.groupby(level=0).idxmax().values
df.loc[v.ravel()]
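To see why this works, it may help to look at the intermediate values. The following is only a sketch: it uses the fixed numbers from the printout in the question instead of random data, and the name `labels` is introduced here for illustration. On a MultiIndex, idxmax returns the full index label of each group's maximum as a tuple, and ravel() flattens the resulting one-column array so it can be passed to .loc.

```python
import pandas as pd

# Fixed values copied from the question's printout (instead of random data).
index = pd.MultiIndex.from_arrays([
    ['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
    ['AA', 'AB', 'AC', 'BA', 'BB', 'CA', 'CB', 'DA', 'DB'],
])
values = [1.740325, 2.017906, -0.873244, -1.761734, 0.467648,
          0.740907, -0.322276, 0.607481, -0.460324]
df = pd.DataFrame({0: values}, index=index)

# idxmax on the grouped frame gives one (level 0, level 1) tuple per group,
# stored in a one-column DataFrame, so .values has shape (4, 1).
v = df.groupby(level=0).idxmax().values

# Flatten to a plain list of tuples; `labels` is an illustrative name.
labels = list(v.ravel())
print(labels)

# Selecting by the list of tuples returns the row holding each group's maximum.
result = df.loc[labels]
print(result)
```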
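The solution can be simplified by selecting the column in which to look for the maximum before calling idxmax (see DataFrameGroupBy.idxmax), so the result is a Series of index labels and no .values or ravel() is needed: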
np.random.seed(234)
arrays = [np.array(['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['AA', 'AB', 'AC', 'BA', 'BB', 'CA', 'CB', 'DA', 'DB'])]
df = pd.DataFrame(np.random.randn(9, 1), index=arrays)
print(df)
                0
bar AA   0.818792
    AB  -1.043551
    AC   0.350901
baz BA   0.921578
    BB  -0.087382
foo CA  -3.128885
    CB  -0.969733
qux DA   0.934666
# idxmax returns the full (level 0, level 1) label of the maximum in each group
df = df.loc[df.groupby(level=0)[0].idxmax()]
print(df)
                0
bar AA   0.818792
baz BA   0.921578
foo CB  -0.969733
qux DA   0.934666
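The result above still has a MultiIndex, whereas the question asks for a flat table with a default integer index. One way to get that layout is probably reset_index, sketched below. This assumes the fixed values from the question's printout rather than random data; the names `best` and `flat` are introduced here for illustration, and `level_0` / `level_1` are the column names pandas assigns to unnamed index levels.

```python
import pandas as pd

# Fixed values copied from the question's printout (instead of random data).
index = pd.MultiIndex.from_arrays([
    ['bar', 'bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
    ['AA', 'AB', 'AC', 'BA', 'BB', 'CA', 'CB', 'DA', 'DB'],
])
values = [1.740325, 2.017906, -0.873244, -1.761734, 0.467648,
          0.740907, -0.322276, 0.607481, -0.460324]
df = pd.DataFrame({0: values}, index=index)

# Row with the maximum of column 0 within each first-level group.
best = df.loc[df.groupby(level=0)[0].idxmax().tolist()]

# Move both index levels into ordinary columns and get a 0..n-1 index.
flat = best.reset_index()
print(flat)
```

If different column headers are wanted, flat.columns can be reassigned afterwards.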