
How to slice a pandas groupby

I have about 40,000 groups after running this code:

groups=data.groupby('A')

I need to subdivide them into sub-groups of 10,000, without overlap and keeping the groupby structure, something like group1=groups[0:10000], group2=groups[10000:20000]..., so that I can reuse them in other scripts. How can I do that?

Thank you !

In that case you can simply slice using iloc. Note that iloc is a DataFrame indexer, not a GroupBy method, so apply it to the DataFrame itself:

group1=data.iloc[0:10000,:]
group2=data.iloc[10000:20000,:]
...
group4=data.iloc[30000:40000,:]

This is for when you want to slice by row position, i.e. by the number of rows required. It selects rows, not whole groups.
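To make the difference between slicing rows and slicing groups concrete, here is a minimal sketch on toy data. The column name `value` and the sample rows are assumptions for illustration only, and `ngroup()` is an alternative to the approach above, not something this answer proposed.

```python
import pandas as pd

# Toy data; 'value' is a hypothetical column, 'A' is the group key from the question.
data = pd.DataFrame({
    "A": ["x", "y", "x", "z", "y", "z"],
    "value": [10, 20, 30, 40, 50, 60],
})

# Row-position slice: the first three rows, whatever groups they belong to.
first_rows = data.iloc[0:3, :]

# Group-number slice: ngroup() labels each row with its group's number
# (0, 1, 2, ... in sorted key order), so whole groups stay together.
group_no = data.groupby("A").ngroup()
first_two_groups = data[group_no < 2]
```

With 10,000 in place of 2, a condition such as `(group_no >= 10000) & (group_no < 20000)` would pick the second block of groups.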

If you want to do it category-wise, then after performing the groupby and an aggregation you can simply do this:

groups=data.groupby('A').agg('sum')  # 'sum' is only an example aggregation
group1=groups.loc['category 1']

The code in the question calls groupby without any aggregation, so groups is a GroupBy object rather than an aggregated DataFrame. Refer to the pandas groupby documentation to see how groupby works.
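The category-wise selection can be sketched as follows. The data, the `value` column, and the choice of `sum` as the aggregation are assumptions for illustration; `get_group` is shown as a way to pull one group's rows when no aggregation is wanted.

```python
import pandas as pd

# Toy data; 'value' and the category labels are hypothetical.
data = pd.DataFrame({
    "A": ["category 1", "category 2", "category 1"],
    "value": [1, 2, 3],
})

# Aggregating turns the GroupBy into a DataFrame indexed by the group key.
totals = data.groupby("A").agg("sum")

# .loc then selects one category by its key (here a Series of aggregated values).
group1 = totals.loc["category 1"]

# Without aggregating, get_group returns the original rows of a single group.
rows_cat1 = data.groupby("A").get_group("category 1")
```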

Unless you're aggregating right afterwards, groupby might be overkill for this task.

data = data.set_index('A')
group_idx = data.index.drop_duplicates()
sub_group_1 = data.loc[group_idx[:10000]]

This will get you the first 10,000 groups.
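If the groupby structure is still needed for each block, one possible extension is to chunk the unique keys and re-group each subset. This is a sketch under assumptions: the toy data, the `value` column, and `chunk_size = 2` (standing in for 10,000) are illustrative, not from the question.

```python
import pandas as pd

# Toy data; 'value' is a hypothetical column, 'A' is the group key.
data = pd.DataFrame({
    "A": ["a", "b", "a", "c", "d", "b", "e"],
    "value": [1, 2, 3, 4, 5, 6, 7],
})

chunk_size = 2  # stands in for 10000 in the question

# Unique keys in order of first appearance.
keys = data["A"].drop_duplicates().to_numpy()

# One GroupBy object per chunk of keys; chunks never share a key,
# so the sub-groups do not overlap.
sub_groups = []
for start in range(0, len(keys), chunk_size):
    chunk_keys = keys[start:start + chunk_size]
    subset = data[data["A"].isin(chunk_keys)]
    sub_groups.append(subset.groupby("A", sort=False))

# Number of groups held by each chunk.
group_counts = [g.ngroups for g in sub_groups]
```

Each element of `sub_groups` behaves like the original `groups` object, restricted to its own block of keys.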

