How can I subset a dataframe by rank and put the subsets in a list?

I'm looking for a more automated approach to subset this dataframe by rank and put the subsets in a list. If there happen to be 150 ranks, I can't write out each subset individually.

ID    |  GROUP   |  RANK
1     |    A     |    1
2     |    B     |    2
3     |    C     |    3
2     |    A     |    1
2     |    E     |    2
2     |    G     |    3

How can I subset the dataframe by RANK and then put every subset in a list? (Not using group by.) I know how to subset them individually, but I'm not sure how to do this when there are many more ranks.

Output:

ranks = [df1,df2,df3....and so on]

Just use groupby directly in a list comprehension:

>>> ranks = [sub_df for _, sub_df in df.groupby('RANK')]

This generates a list of dataframes, one sub-dataframe per rank.

You can also do a dict comprehension:

>>> dic = {rank: sub_df for rank, sub_df in df.groupby('RANK')}

so that you can access the sub-dataframe for rank == 1 via dic[1].
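
For reference, here is a minimal, self-contained sketch of both comprehensions run against the sample data from the question. The column values are copied from the table above; the names sub_df and by_rank are only illustrative.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 2, 2, 2],
    "GROUP": ["A", "B", "C", "A", "E", "G"],
    "RANK": [1, 2, 3, 1, 2, 3],
})

# One sub-dataframe per rank; groupby sorts the group keys by default
ranks = [sub_df for _, sub_df in df.groupby("RANK")]

# The same subsets, keyed by rank value (illustrative name)
by_rank = {rank: sub_df for rank, sub_df in df.groupby("RANK")}

print(len(ranks))                    # 3
print(by_rank[1]["GROUP"].tolist())  # ['A', 'A']
```

The list is convenient when only the order of ranks matters; the dict is easier when you need to look up a specific rank.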


In more detail, pd.DataFrame.groupby is a method that returns a DataFrameGroupBy object. A DataFrameGroupBy object is iterable, which means you can loop over it with a for loop. Each iteration yields a tuple of two values: the first is the group key (in this case, an integer rank), and the second is the sub-dataframe containing the rows for that key.
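
A small sketch of that iteration, assuming the same sample data as in the question (the variable names grouped and pairs are illustrative):

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 2, 2, 2],
    "GROUP": ["A", "B", "C", "A", "E", "G"],
    "RANK": [1, 2, 3, 1, 2, 3],
})

grouped = df.groupby("RANK")  # a DataFrameGroupBy object

# Each iteration yields a (key, sub-dataframe) pair
for rank, sub_df in grouped:
    print(rank, sub_df.shape)
# 1 (2, 3)
# 2 (2, 3)
# 3 (2, 3)

# Materialising the iterable gives a list of 2-tuples
pairs = list(grouped)
```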
