繁体   English   中英

从两个大列表的笛卡尔积创建 pandas DataFrame

[英]Create a pandas DataFrame from a Cartesian product of two large lists

我正在寻找从其他两个创建数据框的最简单方法,以便它包含其元素的所有组合。 例如,我们有这两个数据框:

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

结果必须是:

   0   1
0  A  x1
1  A  x2
2  A  x3
3  A  x4
4  A  x5
5  A  x6
6  A  x7
7  A  x8
8  B  x1
9  B  x2

我尝试从列表中合并,它适用于小列表,但不适用于大列表。 谢谢

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

df1['key'] = 0
df2['key'] = 0
print( df1.merge(df2, on='key', how='outer').drop(columns='key') )

印刷:

   0_x 0_y
0    A  x1
1    A  x2
2    A  x3
3    A  x4
4    A  x5
5    A  x6
6    A  x7
7    A  x8
8    B  x1
9    B  x2

...

您可以使用itertools.product

import itertools
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]
result = pd.DataFrame(list(itertools.product(list1, list2)))

您想将df1中的每个元素与df2的所有元素连接起来。

您可以使用df.merge来做到这一点:

In [1820]: df1['tmp'] = 1   ## Create a dummy key in df1
In [1821]: df2['tmp'] = 1   ## Create a dummy key in df2

## Merge both frames on `tmp`
In [1824]: df1.merge(df2, on='tmp').drop('tmp', 1).rename(columns={'0_x': '0', '0_y':'1'}) 
Out[1824]: 
    0   1
0   A  x1
1   A  x2
2   A  x3
3   A  x4
4   A  x5
5   A  x6
6   A  x7
7   A  x8
8   B  x1
9   B  x2
10  B  x3
11  B  x4
12  B  x5
13  B  x6
14  B  x7
15  B  x8
16  C  x1
17  C  x2
18  C  x3
...
...

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM