
How do you group pandas dataframe rows based on permutation of booleans?

Imagine there is a pandas dataframe with five columns and n rows. Each column holds a boolean value.

Maths says there should be 2^5 = 32 possible combinations of boolean values.

How do I group them by the permutation of boolean values associated with each row so I can get a count on each group or return other properties?

For example, how do I find out how many rows are associated with TTTTT, TTTTF, or whatever permutation I'm interested in?

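There are a couple of ways of doing this. One way is to group by all the columns you care about at once. If you want the number of rows in each group, call the GroupBy.size method on the result. (GroupBy.count counts the non-null values in each remaining column instead, so it returns an empty frame when every column is a grouping key.)

df.groupby(['c1', 'c2', 'c3', 'c4', 'c5']).size()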

Or more simply, if all the columns are of interest:

df.groupby(list(df.columns)).size()
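
As a rough illustration of grouping on all the boolean columns, here is a minimal sketch on made-up data. The column names c1 to c5 and the four sample rows are assumptions for the example, not something from the question. It counts the rows per combination and then counts one specific combination (TTTTF) with a boolean mask.

```python
import pandas as pd

# Hypothetical sample data: five boolean columns, four rows.
cols = ["c1", "c2", "c3", "c4", "c5"]
df = pd.DataFrame(
    [
        [True, True, True, True, True],
        [True, True, True, True, False],
        [True, True, True, True, True],
        [False, False, True, False, False],
    ],
    columns=cols,
)

# Number of rows for each combination that occurs in the data.
counts = df.groupby(cols).size()
print(counts)

# Count a single combination of interest (here TTTTF) with a row-wise mask.
target = [True, True, True, True, False]
n_ttttf = int((df[cols] == target).all(axis=1).sum())
print(n_ttttf)
```

Only combinations that actually occur show up in the grouped result, so a combination with no matching rows is simply absent rather than listed with a count of zero.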

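You could also pack each row's booleans into a single integer from 0 to 31, treating the first column as the most significant bit, and group on that. Note that the sum has to run across the columns (axis=1) so that you get one number per row:

df['Num'] = (df.to_numpy().astype(int) << [4, 3, 2, 1, 0]).sum(axis=1)
df.groupby('Num').size()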
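
The number-based approach can be sketched as follows, again on made-up data. The column names, the sample rows, and the helper names weights and codes are assumptions for illustration. Each row is encoded as a 5-bit integer with c1 as the most significant bit, and a code can be turned back into a T/F label with plain string formatting.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: five boolean columns, four rows.
cols = ["c1", "c2", "c3", "c4", "c5"]
df = pd.DataFrame(
    [
        [True, True, True, True, True],
        [True, True, True, True, False],
        [True, True, True, True, True],
        [False, False, True, False, False],
    ],
    columns=cols,
)

# Bit weights 16, 8, 4, 2, 1: the first column is the most significant bit.
weights = 1 << np.arange(len(cols) - 1, -1, -1)

# One integer per row: dot product of the 0/1 values with the weights.
codes = df[cols].to_numpy().astype(int) @ weights
df["Num"] = codes

# Number of rows for each code that occurs in the data.
counts = df.groupby("Num").size()
print(counts)

# Decode a code back into a T/F label, e.g. 30 -> "TTTTF".
label = format(30, "05b").replace("1", "T").replace("0", "F")
print(label)
```

Grouping on one integer column keeps the result index flat, which can be easier to work with than a five-level index, at the cost of needing the decode step to read the groups.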

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 