I need to be able sort the result of Pandas' 2nd groupby by Category.
The first groupby creates a list from another column, and second one is the groupby result I need. The problem is that the 2nd groupby does not honour the original sorted categorical index of the Dataframe
import pandas as pd
import numpy as np
import numpy.ma as ma
from pathlib import Path
fr = Path('../data/rules-1.xlsx')
df = pd.read_excel(fr, sheet_name='MS')
from pandas.api.types import CategoricalDtype
print('Before:')
display(df)
ms_cat = ['Parent-C', 'Parent-A', 'Parent-B']
df['ParentMS'] = df['ParentMS'].astype(CategoricalDtype(list(ms_cat)),order=True)
df = df.reset_index()
df = df.set_index('ParentMS')
df = df.sort_index()
print('After:')
display(df)
df_g = df. groupby(['ParentMS', 'Milestone'])['Tasks'].apply(list)
df_g = df_g.groupby('ParentMS')
# Category sort is not honored after the second groupby()
for name, group in df_g:
print(name, group)
This the input file:
[enter image description here][1]
[1]: https://i.stack.imgur.com/KZnZD.png
Combining the two "df_g" lines did the trick for me. I cannot explain it but it worked
df_g = df.groupby(['ParentMS', 'Milestone'])['RN'].apply(list).groupby('ParentMS')
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.