
Producing every combination of columns from one pandas dataframe in python

I'd like to take a dataframe and visualize how useful each column is in a k-nearest-neighbors analysis. Is there a way to loop over the dataframe, dropping columns each time, so that I get an accuracy score for every combination of columns? I'm not sure whether pandas has functions I'm unaware of that would make this easier, or how to loop over the dataframe to produce every combination of the original columns. Here is a small example of what I mean:

a | b | c | labels
1 | 2 | 3 | 0
5 | 6 | 7 | 1

The dataframe above would produce something like this after being run through the splitting and k-neighbors function:

a & b = 43%

a & c = 56%

b & c = 78%

a & b & c = 95%

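import itertools

# Feature columns only; assumes the target column is named "labels", as in the question.
feature_cols = [c for c in df.columns if c != "labels"]
min_size = 2
max_size = len(feature_cols)
# Chain together the combinations of every size from min_size up to max_size.
column_subsets = itertools.chain.from_iterable(
    itertools.combinations(feature_cols, size) for size in range(min_size, max_size + 1)
)
for column_subset in column_subsets:
    foo(df[list(column_subset)])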

Here, df is your dataframe and foo is whatever k-nearest-neighbors analysis you're running. Although you asked for "all combinations", I set min_size to 2 because your example only shows subsets of two or more columns; set it to 1 to include single columns as well. Strictly speaking, these are "subsets" of the columns rather than "combinations".
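To get the kind of per-subset accuracy listing shown in the question, the loop can collect one score per subset instead of just calling a function. The following is only a sketch under some assumptions: the helper names column_subsets and score_column_subsets are made up for illustration, and evaluate stands in for whatever routine you use to turn a set of feature columns into an accuracy.

```python
import itertools


def column_subsets(columns, min_size=2, max_size=None):
    """Yield every subset of `columns` with min_size..max_size members, as tuples."""
    columns = list(columns)
    if max_size is None:
        max_size = len(columns)
    for size in range(min_size, max_size + 1):
        # combinations() keeps the original column order within each subset
        yield from itertools.combinations(columns, size)


def score_column_subsets(df, feature_cols, evaluate, min_size=2):
    """Map each column subset to the score returned by evaluate(df[subset]).

    `evaluate` is a hypothetical, user-supplied callable (for example one
    that fits a k-nearest-neighbors model and returns its accuracy).
    """
    return {
        subset: evaluate(df[list(subset)])
        for subset in column_subsets(feature_cols, min_size=min_size)
    }


print(list(column_subsets(["a", "b", "c"])))
# [('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b', 'c')]
```

If you use scikit-learn, evaluate could be something like lambda X: cross_val_score(KNeighborsClassifier(), X, df["labels"]).mean(), assuming the target column is called "labels". Keep in mind that the number of subsets roughly doubles with every extra column, so this approach is only practical for a modest number of columns.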


 