How to create a dataframe containing lthe cartesian product between two lists as elements?

Question

I have a quite big dataframe containing 3 columns, each of which contains a list. Each list can be arbitrarily long (and is, sometimes, quite long).

 a          Yes                 No  
[1, 2, 3]  ["a", "b"]        ["A", "B", "C"]
[7, 11, 6] ["a", "d", "f"]   ["C", "H", "L", "Z"]

I want to create a new dataframe which looks like this:

C1    C2    value 
 1     a      1 
 1     b      1
 2     a      1
 2     b      1
...
 3     b      1
 1     A      0
 1     B      0
 1     C      0
...
 3     C      0
...
 7     a      1
 7     d      1
...
 6     Z      0

The order of the rows does not matter. I'm looking for an efficient way of doing that.

What I'm doing at the moment is the following (I might be complicating things a bit):

new_df = pd.DataFrame(columns=["C1", "C2", "value"])
def compute_pairs(df1, new_df):
  perm = list(itertools.product(df1.a, df1.Yes))
  for p in perm:
    new_df.loc[len(new_df)] = list(p) + [1]
  perm = list(itertools.product(df1.a, df1.No))
  for p in perm:
    df.loc[len(new_df)] = list(p) + [0]

df1.apply(lambda x: compute_pairs(x, new_df), axis=1)

However, the for look is quite slow. I tried to use map but I failed.

Any improvements?

Answer 1

Let's try melt with explode :

(df.melt('a',var_name='value',value_name='C')
   .explode('C').explode('a')
)

Output:

    a value  C
0   1   Yes  a
0   2   Yes  a
0   3   Yes  a
0   1   Yes  b
0   2   Yes  b
0   3   Yes  b
1   7   Yes  a
1  11   Yes  a
1   6   Yes  a
1   7   Yes  d
1  11   Yes  d
1   6   Yes  d
1   7   Yes  f
1  11   Yes  f
1   6   Yes  f
2   1    No  A
2   2    No  A
2   3    No  A
2   1    No  B
2   2    No  B
2   3    No  B
2   1    No  C
2   2    No  C
2   3    No  C
3   7    No  C
3  11    No  C
3   6    No  C
3   7    No  H
3  11    No  H
3   6    No  H
3   7    No  L
3  11    No  L
3   6    No  L
3   7    No  Z
3  11    No  Z
3   6    No  Z

How to create a dataframe containing lthe cartesian product between two lists as elements?

Question

1 answers

solution1
1 ACCPTED 2020-10-19 13:50:52

How to create a dataframe containing lthe cartesian product between two lists as elements?

Question

1 answers

solution1 1 ACCPTED 2020-10-19 13:50:52

solution1
1 ACCPTED 2020-10-19 13:50:52