
How to create a dataframe containing the Cartesian product between two lists as elements?

I have a fairly large dataframe with 3 columns, each of which contains a list. Each list can be arbitrarily long (and sometimes is quite long).

a           Yes              No
[1, 2, 3]   ["a", "b"]       ["A", "B", "C"]
[7, 11, 6]  ["a", "d", "f"]  ["C", "H", "L", "Z"]

I want to create a new dataframe that looks like this:

C1    C2    value 
 1     a      1 
 1     b      1
 2     a      1
 2     b      1
...
 3     b      1
 1     A      0
 1     B      0
 1     C      0
...
 3     C      0
...
 7     a      1
 7     d      1
...
 6     Z      0      

The order of the rows does not matter. I'm looking for an efficient way to do this.

What I'm doing at the moment is the following (I might be overcomplicating things a bit):

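import itertools
import pandas as pd

new_df = pd.DataFrame(columns=["C1", "C2", "value"])

def compute_pairs(row, new_df):
    # Pair every element of `a` with every element of `Yes` (value 1).
    for p in itertools.product(row.a, row.Yes):
        new_df.loc[len(new_df)] = list(p) + [1]
    # Pair every element of `a` with every element of `No` (value 0).
    for p in itertools.product(row.a, row.No):
        new_df.loc[len(new_df)] = list(p) + [0]

# df1 is the original dataframe; each row is passed in as a Series.
df1.apply(lambda x: compute_pairs(x, new_df), axis=1)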

However, the for loop is quite slow. I tried to use map, but I failed.

Any improvements?
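One likely reason the loop is slow is that each `new_df.loc[len(new_df)] = ...` assignment grows the frame one row at a time. A minimal sketch of the same itertools approach, collecting the tuples in a plain list and building the dataframe once at the end, might look like this. The sample `df1` and the helper name `build_pairs` are assumptions made for illustration, not part of the original post.

```python
import itertools
import pandas as pd

# Sample data mirroring the table in the question (assumed construction).
df1 = pd.DataFrame({
    "a": [[1, 2, 3], [7, 11, 6]],
    "Yes": [["a", "b"], ["a", "d", "f"]],
    "No": [["A", "B", "C"], ["C", "H", "L", "Z"]],
})

def build_pairs(df):
    """Hypothetical helper: one row per (a, Yes) pair with value 1
    and one row per (a, No) pair with value 0."""
    rows = []
    for a, yes, no in zip(df["a"], df["Yes"], df["No"]):
        rows.extend((x, y, 1) for x, y in itertools.product(a, yes))
        rows.extend((x, n, 0) for x, n in itertools.product(a, no))
    # Build the frame once instead of appending row by row.
    return pd.DataFrame(rows, columns=["C1", "C2", "value"])

new_df = build_pairs(df1)
```

This keeps the Python-level loop over source rows, so it is not fully vectorized, but it avoids repeatedly resizing the dataframe.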

Let's try melt with explode:

(df.melt('a',var_name='value',value_name='C')
   .explode('C').explode('a')
)

Output:

    a value  C
0   1   Yes  a
0   2   Yes  a
0   3   Yes  a
0   1   Yes  b
0   2   Yes  b
0   3   Yes  b
1   7   Yes  a
1  11   Yes  a
1   6   Yes  a
1   7   Yes  d
1  11   Yes  d
1   6   Yes  d
1   7   Yes  f
1  11   Yes  f
1   6   Yes  f
2   1    No  A
2   2    No  A
2   3    No  A
2   1    No  B
2   2    No  B
2   3    No  B
2   1    No  C
2   2    No  C
2   3    No  C
3   7    No  C
3  11    No  C
3   6    No  C
3   7    No  H
3  11    No  H
3   6    No  H
3   7    No  L
3  11    No  L
3   6    No  L
3   7    No  Z
3  11    No  Z
3   6    No  Z
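The output above still has `Yes`/`No` labels and the column names `a`, `value`, `C`, whereas the question asks for `C1`, `C2`, and a 1/0 `value`. A possible follow-up, sketched here under the assumption that `Yes` should map to 1 and `No` to 0, renames the columns and maps the labels. The sample `df` is reconstructed from the question's table.

```python
import pandas as pd

# Sample data mirroring the table in the question (assumed construction).
df = pd.DataFrame({
    "a": [[1, 2, 3], [7, 11, 6]],
    "Yes": [["a", "b"], ["a", "d", "f"]],
    "No": [["A", "B", "C"], ["C", "H", "L", "Z"]],
})

out = (
    df.melt("a", var_name="value", value_name="C2")
      .explode("C2")                       # one row per element of Yes/No
      .explode("a")                        # one row per element of a
      .rename(columns={"a": "C1"})
      .assign(value=lambda d: d["value"].map({"Yes": 1, "No": 0}))
      .loc[:, ["C1", "C2", "value"]]       # reorder to the requested layout
      .reset_index(drop=True)
)
```

The `reset_index(drop=True)` call only replaces the repeated index values left behind by `explode`; it can be dropped if the original row labels are useful.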

