I'm trying to sort the unique values in a pandas DataFrame after a groupby:
import pandas as pd

df = pd.DataFrame({
    'gr1': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'gr1_sum': [100, 100, 100, 100, 200, 200, 200, 200],
    'rank_gr1': [2, 2, 2, 2, 1, 1, 1, 1],
    'gr2': ['a1', 'a1', 'a2', 'a2', 'b1', 'b1', 'b2', 'b2'],
    'gr2_sum': [30, 30, 40, 40, 20, 20, 10, 10]})
#df.sort_values(by=['col2'],inplace = True)
rank_gr1_sort = pd.unique(df['rank_gr1'].values)
rank_gr2_sort = df.sort_values(['rank_gr1']).groupby(['gr1','gr2'])['gr2_sum'].unique()
rank_gr1_sort
array([2, 1], dtype=int64)
rank_gr2_sort
gr1  gr2
A    a1     [30]
     a2     [40]
B    b1     [20]
     b2     [10]
Name: gr2_sum, dtype: object
What I need is this:
gr1  gr2
B    b1     [20]
     b2     [10]
A    a1     [30]
     a2     [40]
Name: gr2_sum, dtype: object
How can I achieve this output?
Thanks!
pandas groupby sort within groups
Pandas Number of Unique Values and sort by the number of unique
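Pass sort=False to groupby. By default groupby sorts the group keys, which undoes the ordering from sort_values; with sort=False the groups appear in the order their keys are first encountered.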
From the docs:
sort : bool, default True
    Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. Groupby preserves the order of rows within each group.
rank_gr2_sort = df.sort_values(['rank_gr1']).groupby(
['gr1','gr2'],sort=False)['gr2_sum'].unique()
gr1  gr2
B    b1     [20]
     b2     [10]
A    a1     [30]
     a2     [40]
Name: gr2_sum, dtype: object
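To see the effect of the flag in isolation, here is a minimal sketch that runs the same groupby twice on the question's data and compares the order of the group keys. The variable names ordered, default_keys and unsorted_keys are made up for this illustration, and kind='mergesort' is an addition that is not in the answer above: it asks for a stable sort so that rows sharing the same rank_gr1 keep their original relative order.

```python
import pandas as pd

df = pd.DataFrame({
    'gr1': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'rank_gr1': [2, 2, 2, 2, 1, 1, 1, 1],
    'gr2': ['a1', 'a1', 'a2', 'a2', 'b1', 'b1', 'b2', 'b2'],
    'gr2_sum': [30, 30, 40, 40, 20, 20, 10, 10],
})

# Assumption: a stable sort (mergesort) so ties on rank_gr1 keep their row order.
ordered = df.sort_values('rank_gr1', kind='mergesort')

# Default sort=True: group keys are sorted, so the rank ordering is lost.
default_keys = list(
    ordered.groupby(['gr1', 'gr2'])['gr2_sum'].unique().index
)

# sort=False: groups come out in order of first appearance in `ordered`.
unsorted_keys = list(
    ordered.groupby(['gr1', 'gr2'], sort=False)['gr2_sum'].unique().index
)

print(default_keys)   # keys in sorted order: A groups first
print(unsorted_keys)  # keys in rank order: B groups first
```

Because the result only depends on the row order going into groupby, any other ordering of the frame (for example sorting by several columns) should carry through the same way as long as sort=False is set.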