How do you group pandas dataframe rows based on permutation of booleans?

Question

Imagine there is a pandas dataframe with five columns and n rows. Each column holds a boolean value.

Maths says there should be 2^5 = 32 possible permutations of boolean values.

How do I group them by the permutation of boolean values associated with each row so I can get a count on each group or return other properties?

For example, how do I find out how many rows are associated with TTTTT or TTTTF or whatever permutation I'm interested in?

Answer 1

There are a couple of ways of doing this. One way is to group by all the columns you care about at once. If you want the number of rows in each group, call the GroupBy.size method on the result. (GroupBy.count counts non-null values in the columns that are not grouping keys, so it returns an empty frame when every column is a key.)

df.groupby(['c1', 'c2', 'c3', 'c4', 'c5']).size()

Or more simply, if all the columns are of interest:

df.groupby(list(df.columns)).size()
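
As a rough illustration of grouping on every column at once, here is a minimal sketch. The column names c1 to c5 and the four sample rows are made up for the example, and size is used because it counts rows per group. Only combinations that actually occur in the data show up in the result, so a missing combination is treated as a count of zero.

```python
import pandas as pd

# Hypothetical data: five boolean columns, four rows.
df = pd.DataFrame(
    {
        "c1": [True, True, False, True],
        "c2": [True, True, False, True],
        "c3": [True, True, True, True],
        "c4": [True, True, False, True],
        "c5": [True, False, False, True],
    }
)
cols = ["c1", "c2", "c3", "c4", "c5"]

# One entry per combination that occurs in df; the value is the row count.
counts = df.groupby(cols).size()

# Plain dict keyed by 5-tuples of booleans, for simple lookups.
lookup = counts.to_dict()

n_ttttt = lookup.get((True, True, True, True, True), 0)
n_ttttf = lookup.get((True, True, True, True, False), 0)
n_fffff = lookup.get((False,) * 5, 0)  # never occurs, so falls back to 0

print(counts)
print(n_ttttt, n_ttttf, n_fffff)
```

If you need the rows themselves rather than a count, `df.groupby(cols).get_group(key)` with the same kind of tuple key returns the matching sub-frame.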

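You could also convert the booleans to a number, treating each row as a 5-bit binary number with the first column as the most significant bit, and group on that. This assumes df holds only the five boolean columns:

df['Num'] = (df.to_numpy() << [4, 3, 2, 1, 0]).sum(axis=1)
df.groupby('Num').size()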
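
To make the number encoding concrete, here is a small sketch under the same assumptions as before (hypothetical columns c1 to c5, made-up rows). It uses a dot product with powers of two instead of a shift, which gives the same codes; the sum has to run across each row, so the result has one code per row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: five boolean columns, four rows.
df = pd.DataFrame(
    {
        "c1": [True, True, False, True],
        "c2": [True, True, False, True],
        "c3": [True, True, True, True],
        "c4": [True, True, False, True],
        "c5": [True, False, False, True],
    }
)
cols = ["c1", "c2", "c3", "c4", "c5"]

# Bit weights, most significant first: [16, 8, 4, 2, 1].
weights = 1 << np.arange(len(cols) - 1, -1, -1)

# One integer in 0..31 per row: TTTTT -> 31, TTTTF -> 30, FFTFF -> 4.
codes = df[cols].to_numpy(dtype=int) @ weights

# Count rows per code without modifying the original frame.
by_num = df.assign(Num=codes).groupby("Num").size()

print(by_num)
# Turn a code back into a readable T/F pattern.
print(format(30, "05b").replace("1", "T").replace("0", "F"))
```

The integer key is convenient when you want to index or sort the 32 possible groups, while grouping on the columns directly keeps the result readable without decoding.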
