Producing every combination of columns from one pandas dataframe in python

Question

I'd like to take a dataframe and visualize how useful each column is in a k-neighbors analysis. Is there a way to loop through the dataframe, dropping columns and re-running the analysis, so that I get an accuracy for every combination of columns? I'm not sure whether pandas has functions I'm not aware of that would make this easier, or how to loop through the dataframe to produce every combination of its columns. If I have not explained this well, I will try to create a diagram.

a | b | c | labels
1 | 2 | 3 | 0
5 | 6 | 7 | 1

The dataframe above would produce something like this after being run through the splitting and k-neighbors function:

a & b = 43%

a & c = 56%

b & c = 78%

a & b & c = 95%

Answer 1

import itertools

min_size = 2
max_size = df.shape[1]
# Chain together the combinations of every size from min_size to max_size.
column_subsets = itertools.chain.from_iterable(
    itertools.combinations(df.columns, size)
    for size in range(min_size, max_size + 1)
)
for column_subset in column_subsets:
    foo(df[list(column_subset)])

where df is your dataframe and foo is whatever k-neighbors analysis you're doing. Although you said "all combinations", I set min_size to 2 because your example only includes subsets of size 2 or more. Strictly speaking, these are "subsets" of the columns rather than "combinations".
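To make the enumeration concrete, here is a minimal sketch that uses only the column names from the question. The helper names column_subsets and score_subsets and the evaluate callback are hypothetical, introduced for illustration only. The labels column is left out on the assumption that it holds the target and should not be used as a feature.

```python
from itertools import chain, combinations


def column_subsets(columns, min_size=2, max_size=None):
    """Yield every subset of `columns` with min_size..max_size members, as tuples."""
    columns = list(columns)
    if max_size is None:
        max_size = len(columns)
    return chain.from_iterable(
        combinations(columns, size) for size in range(min_size, max_size + 1)
    )


def score_subsets(columns, evaluate, min_size=2):
    """Map each column subset to whatever `evaluate(subset)` returns."""
    return {subset: evaluate(subset) for subset in column_subsets(columns, min_size)}


# Feature columns from the question; 'labels' is deliberately excluded
# (assumption: it is the target, not a feature).
feature_columns = ["a", "b", "c"]
subsets = list(column_subsets(feature_columns))
print(subsets)
# [('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b', 'c')]
```

With a real dataframe, evaluate could fit a k-neighbors model on df[list(subset)] against df["labels"] and return its accuracy. If scikit-learn is the library in use (an assumption, since the question does not say), cross_val_score with a KNeighborsClassifier would be one way to do that. Keep in mind that the number of subsets roughly doubles with each added column, so this is only practical for a modest number of columns.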

1 answer

solution1
0 ACCEPTED 2018-01-29 16:52:48
