How to slice a pandas groupby

Question

I have about 40,000 groups after running:

groups=data.groupby('A')

I need to subdivide them into sub-groups of 10,000 each, without overlap and keeping the groupby structure, e.g. group1=groups[0:10000], group2=groups[10000:20000], ..., so that I can re-use them in other scripts. How can I do that?

Thank you!

Answer 1

A GroupBy object has no iloc, so in that case you can simply slice the original DataFrame using iloc:

group1=data.iloc[0:10000,:]
group2=data.iloc[10000:20000,:]
...
group4=data.iloc[30000:40000,:]

This applies when you want to slice by row position, i.e. by the number of rows required. Note that it cuts by rows, not by groups.

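If you want to do it category-wise, then after performing the groupby with an aggregation (sum is used here only as an example) you can simply do this:

agg=data.groupby('A').sum()
group1=agg.loc['category 1']

The code in the question does not apply an aggregation, so groups is a GroupBy object rather than a DataFrame and cannot be sliced like one. Refer to the groupby documentation to see how groupby works.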
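Since row positions and group positions are not the same thing, here is a minimal sketch of slicing by group number instead, using GroupBy.ngroup() and then grouping each piece again so the groupby structure is kept. The toy data, the value column 'B', and chunk_size are assumptions made for illustration; the question would use 10000.

```python
import pandas as pd

# Toy data: column 'A' matches the question, 'B' is a made-up value column.
data = pd.DataFrame({"A": ["x", "y", "x", "z", "y", "w"],
                     "B": [1, 2, 3, 4, 5, 6]})
chunk_size = 2  # hypothetical; the question uses 10000

groups = data.groupby("A")

# ngroup() labels every row with the 0-based number of its group.
# With the default sort=True, groups are numbered in sorted key order.
group_number = groups.ngroup()

# Keep the rows whose group number falls in each chunk, then group again.
# Series.between is inclusive on both ends by default.
group1 = data[group_number.between(0, chunk_size - 1)].groupby("A")
group2 = data[group_number.between(chunk_size, 2 * chunk_size - 1)].groupby("A")

print(list(group1.groups))
print(list(group2.groups))
```

Each of group1 and group2 is an ordinary GroupBy object over a disjoint set of keys, so any later aggregation can be applied to them as usual.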

Answer 2

Unless you're aggregating right afterwards, groupby might be overkill for this task.

data = data.set_index('A')
group_idx = data.index.drop_duplicates()
sub_group_1 = data.loc[group_idx[:10000]]

This will get you the first 10,000 groups.
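The same idea can be extended to every chunk with a loop. This is only a sketch: the toy data, the value column 'B', and chunk_size are assumptions, and the keys are chunked in order of first appearance (unlike groupby, which sorts keys by default).

```python
import pandas as pd

# Toy data: column 'A' matches the question, 'B' is a made-up value column.
data = pd.DataFrame({"A": ["x", "y", "x", "z", "y", "w"],
                     "B": [1, 2, 3, 4, 5, 6]})
chunk_size = 2  # hypothetical; the question uses 10000

indexed = data.set_index("A")
# Unique group keys, in order of first appearance.
group_idx = indexed.index.drop_duplicates()

# One sub-DataFrame per chunk of keys; chunks do not overlap.
sub_groups = [
    indexed.loc[group_idx[start:start + chunk_size]]
    for start in range(0, len(group_idx), chunk_size)
]

for number, sub in enumerate(sub_groups, start=1):
    print(number, sorted(set(sub.index)))
```

If a groupby structure is needed later, calling sub.groupby(level=0) on any element of sub_groups groups it by the 'A' index again.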
