Join a Pandas dataframe with a dataframe of lists

As illustrated below, I have two pandas dataframes that I want to combine. The first holds information on a huge number of products. The second holds the category information for those products, where each entry in the category column is a list.

   CatId   Date           CatId      CatName
0     C2   01-15       0     C1   [crime, alt]
1     C1   01-15       1     C2   [crime, bests]
2     C1   01-15       2     C3   [fantasy, american]
3     C3   01-16       
.
.
n     C2   02-17

I am interested in the following dataframe:

      CatId             Date           
0  [crime, bests]       01-15      
1  [crime, alt]         01-15      
2  [crime, alt]         01-15      
3  [fantasy, american]  01-16       
.
.
n  [crime, bests]       02-17

For efficiency (the dataset is large), I am trying to avoid looping.

Is this possible in Python?

I believe you need map with a Series created by set_index:

print (df1)
  CatId   Date
0    C2  01-15
1    C1  01-15
2    C1  01-15
3    C3  01-16
n    C2  02-17

print (df2)

  CatId              CatName
0    C1         [crime, alt]
1    C2       [crime, bests]
2    C3  [fantasy, american]

df1['CatId'] = df1['CatId'].map(df2.set_index('CatId')['CatName'])
print (df1)
                 CatId   Date
0       [crime, bests]  01-15
1         [crime, alt]  01-15
2         [crime, alt]  01-15
3  [fantasy, american]  01-16
n       [crime, bests]  02-17
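
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The frames are rebuilt by hand from the printed output above, so the default integer index (instead of the 0..n labels) and the variable names lookup, mapped and merged are my own assumptions. It also assumes the CatId values in df2 are unique, since map looks each value up in the index of the Series. The left merge at the end is only shown for comparison in case you want to keep the original CatId column too.

```python
import pandas as pd

# Sample frames reconstructed from the printed output above (illustrative data only).
df1 = pd.DataFrame({
    "CatId": ["C2", "C1", "C1", "C3", "C2"],
    "Date": ["01-15", "01-15", "01-15", "01-16", "02-17"],
})
df2 = pd.DataFrame({
    "CatId": ["C1", "C2", "C3"],
    "CatName": [["crime", "alt"], ["crime", "bests"], ["fantasy", "american"]],
})

# Lookup Series: index is CatId, values are the CatName lists.
# Assumption: CatId is unique in df2.
lookup = df2.set_index("CatId")["CatName"]

# Replace every CatId with its list; ids that are missing from lookup become NaN.
mapped = df1.assign(CatId=df1["CatId"].map(lookup))
print(mapped)

# Alternative for comparison: a left merge keeps CatId and adds CatName next to it.
merged = df1.merge(df2, on="CatId", how="left")
print(merged)
```

Both versions are vectorized lookups, so there is no explicit Python loop over the rows of df1.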

