Join a Pandas dataframe with a dataframe of lists
As illustrated below, I have two pandas dataframes that I want to combine. The first holds information on a huge number of products. The second holds the category of each product, where each entry in the CatName column is a list.
   CatId  Date          CatId  CatName
0  C2     01-15      0  C1     [crime, alt]
1  C1     01-15      1  C2     [crime, bests]
2  C1     01-15      2  C3     [fantasy, american]
3  C3     01-16
.
.
n  C2     02-17
I am interested in the following dataframe:
CatId Date
0 [crime, bests] 01-15
1 [crime, alt] 01-15
2 [crime, alt] 01-15
3 [fantasy, american] 01-16
.
.
n [crime, bests] 02-17
For efficiency (due to the size of the dataset), I am trying to avoid looping.
Is this possible in Python?
I believe you need map with a Series created by set_index:
print (df1)
CatId Date
0 C2 01-15
1 C1 01-15
2 C1 01-15
3 C3 01-16
n C2 02-17
print (df2)
CatId CatName
0 C1 [crime, alt]
1 C2 [crime, bests]
2 C3 [fantasy, american]
df1['CatId'] = df1['CatId'].map(df2.set_index('CatId')['CatName'])
print (df1)
CatId Date
0 [crime, bests] 01-15
1 [crime, alt] 01-15
2 [crime, alt] 01-15
3 [fantasy, american] 01-16
n [crime, bests] 02-17
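For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same map / set_index approach. The sample frames are rebuilt from the question, and the names lookup and result are my own, chosen only for illustration. It assumes each CatId appears only once in df2; any CatId in df1 that has no match in df2 comes out as NaN.

```python
import pandas as pd

# Sample data rebuilt from the question (the row labelled "n" becomes row 4).
df1 = pd.DataFrame({
    "CatId": ["C2", "C1", "C1", "C3", "C2"],
    "Date": ["01-15", "01-15", "01-15", "01-16", "02-17"],
})
df2 = pd.DataFrame({
    "CatId": ["C1", "C2", "C3"],
    "CatName": [["crime", "alt"], ["crime", "bests"], ["fantasy", "american"]],
})

# Build a lookup Series indexed by CatId (assumed unique in df2).
lookup = df2.set_index("CatId")["CatName"]

# Vectorised lookup: each CatId in df1 is replaced by its list from df2.
# Writing to a new column keeps the original ids around for checking.
result = df1.assign(CatName=df1["CatId"].map(lookup))

print(result)
```

If you would rather overwrite the id column as in the answer above, assign the mapped Series back to df1['CatId'] instead of using assign.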