
Create a pandas DataFrame from a Cartesian product of two large lists

I'm looking for the simplest way to create a DataFrame from two others such that it contains every combination of their elements (their Cartesian product). For instance, given these two DataFrames:

import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

The result must be:

   0   1
0  A  x1
1  A  x2
2  A  x3
3  A  x4
4  A  x5
5  A  x6
6  A  x7
7  A  x8
8  B  x1
9  B  x2

I tried building the combinations directly from the lists. That works fine for small lists but not for large ones. Thank you.

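import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

# Give both frames the same constant key so every row matches every row
df1['key'] = 0
df2['key'] = 0

# Merge on the constant key, then drop it
print(df1.merge(df2, on='key', how='outer').drop(columns='key'))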

Prints:

   0_x 0_y
0    A  x1
1    A  x2
2    A  x3
3    A  x4
4    A  x5
5    A  x6
6    A  x7
7    A  x8
8    B  x1
9    B  x2

...

You can use itertools.product:

import itertools
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]
result = pd.DataFrame(list(itertools.product(list1, list2)))
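
For large lists, the intermediate Python list of tuples can itself get expensive. A minimal sketch of an alternative, assuming NumPy is available: build the two columns directly with np.repeat and np.tile. This is a swapped-in technique rather than part of the answer above, and whether it actually helps depends on your data, so treat it as something to measure rather than a guarantee.

```python
import numpy as np
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

result = pd.DataFrame({
    # Repeat each element of list1 once per element of list2: A, A, ..., B, B, ...
    0: np.repeat(list1, len(list2)),
    # Cycle through list2 once per element of list1: x1..x8, x1..x8, ...
    1: np.tile(list2, len(list1)),
})

print(result.shape)
```

The result always has len(list1) * len(list2) rows, so it is worth checking that product before materializing the frame.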

You want to join each element in df1 with all elements of df2.

You can do it using df.merge:

In [1820]: df1['tmp'] = 1   ## Create a dummy key in df1
In [1821]: df2['tmp'] = 1   ## Create a dummy key in df2

## Merge both frames on `tmp`
In [1824]: df1.merge(df2, on='tmp').drop(columns='tmp').rename(columns={'0_x': '0', '0_y': '1'})
Out[1824]: 
    0   1
0   A  x1
1   A  x2
2   A  x3
3   A  x4
4   A  x5
5   A  x6
6   A  x7
7   A  x8
8   B  x1
9   B  x2
10  B  x3
11  B  x4
12  B  x5
13  B  x6
14  B  x7
15  B  x8
16  C  x1
17  C  x2
18  C  x3
...
...
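
If you are on pandas 1.2 or newer, merge also accepts how="cross", which produces the same Cartesian product without the dummy key. A minimal sketch, assuming that version; the column names "left" and "right" are made up here for illustration and simply avoid the 0_x / 0_y suffixes:

```python
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

# Hypothetical column names, chosen so the merged frame needs no renaming
df1 = pd.DataFrame({"left": list1})
df2 = pd.DataFrame({"right": list2})

# how="cross" pairs every row of df1 with every row of df2 (pandas >= 1.2)
result = df1.merge(df2, how="cross")

print(result.head(10))
```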


 