Using itertools.combinations with columns

I have a DataFrame df with 3 columns: A, B and C.

A B C
2 4 4
5 2 5
6 9 5 

My goal is to use itertools.combinations to find all non-repeating column pairs, and to put the first column of each pair in one DataFrame and the second column of each pair in another. For this DataFrame the pairs would be A:B, A:C, B:C.
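To make the pairing concrete, here is a minimal sketch that rebuilds the sample frame above and lists the label pairs. It relies on the fact that iterating a DataFrame yields its column labels; the variable name `pairs` is only illustrative.

```python
from itertools import combinations

import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"A": [2, 5, 6], "B": [4, 2, 9], "C": [4, 5, 5]})

# combinations works on the column labels, not on the column data,
# so each pair is a tuple of two header names.
pairs = list(combinations(df.columns, 2))
print(pairs)  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

The first elements of the pairs (A, A, B) are the columns wanted in df1, and the second elements (B, C, C) are the columns wanted in df2.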

So the first DataFrame, df1, would hold the first column of each of those pairs:

df1=A A B
    2 2 4
    5 5 2
    6 6 9

and the second, df2, would hold the second column of each pair:

   B C C
   4 4 4
   2 5 5
   9 5 5

I'm trying to do something with itertools like:

    for cola, colb in itertools.combinations(df, 2):
        df1[cola]=cola
        df2[colb]=colb

I know that makes no sense. I could convert each column to a list, run itertools over the list of lists, append the results to two lists A and B, and then turn those lists back into DataFrames, but then I'm missing the headers. I tried adding the headers to the lists, but when I rebuild the DataFrame the indexing seems off and I can't seem to fix it. So I'm trying to see whether there is a way to run itertools over entire columns, headers included.

Use the zip function to separate the column labels that belong in each DataFrame, and then use pandas.concat to construct the new DataFrames:

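from itertools import combinations

import pandas as pd

# Split the label pairs into "first of each pair" and "second of each pair"
df1_cols, df2_cols = zip(*combinations(df.columns, 2))

# Place the selected columns side by side; the headers are preserved
df1 = pd.concat([df[col] for col in df1_cols], axis=1)
df2 = pd.concat([df[col] for col in df2_cols], axis=1)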
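For reference, a self-contained sketch that runs the same idea on the sample data from the question. As a variation that is not part of the original answer, it selects the columns with a list of labels instead of pandas.concat; the names `first_labels` and `second_labels` are illustrative.

```python
from itertools import combinations

import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"A": [2, 5, 6], "B": [4, 2, 9], "C": [4, 5, 5]})

# Unzip the label pairs: first elements go to df1, second elements to df2.
first_labels, second_labels = zip(*combinations(df.columns, 2))

# Indexing with a list of labels keeps the headers and allows repeats,
# so df1 gets columns A, A, B and df2 gets columns B, C, C.
df1 = df[list(first_labels)]
df2 = df[list(second_labels)]

print(df1)
print(df2)
```

Both results contain duplicate column names, so a later lookup such as df1["A"] returns a two-column DataFrame rather than a single Series.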
