
Merging on non-unique column - pandas python

I have been trying to merge two DataFrames (df and df_details) in a similar fashion to an Excel "vlookup", but I am getting strange results. Below I show the structure of the two DataFrames, without real data for simplicity.

df_details:

Abstract_Title  |  Abstract_URL  |  Session_No_v2  | Session_URL | Session_ID
  -------------------------------------------------------------------------
Abstract_Title1    Abstract_URL1         1          Session_URL1     12345
Abstract_Title2    Abstract_URL2         1          Session_URL1     12345
Abstract_Title3    Abstract_URL3         1          Session_URL1     12345
Abstract_Title4    Abstract_URL4         2          Session_URL2     22222 
Abstract_Title5    Abstract_URL5         2          Session_URL2     22222
Abstract_Title6    Abstract_URL6         3          Session_URL3     98765
Abstract_Title7    Abstract_URL7         3          Session_URL3     98765

df:

Session_Title   |   Session_URL   |   Sponsors   |    Type    |   Session_ID
    -------------------------------------------------------------------------------
Session_Title1     Session_URL1        x, y z     Paper             12345
Session_Title2     Session_URL2         x, y      Presentation      22222
Session_Title3     Session_URL3        a, b ,c    Presentation      98765
Session_Title4     Session_URL4          c        Talk              12121
Session_Title5     Session_URL5         a, x      Paper             33333

I want to merge on Session_ID, and I want the final DataFrame to look like: [image]

I've tried the following script, which yields a DataFrame that duplicates certain rows (several times) and does strange things. For example, df_details has 7,046 rows and df has 1,856 rows, but when I run the following merge code, final_df ends up with 21,148 rows:

final_df = pd.merge(df_details, df, how = 'outer', on = 'Session_ID')

Please help!

To generate your final output table I used the following code:

final_df = pd.merge(df_details, df[['Session_ID',
                                'Session_Title',
                                'Sponsors',
                                'Type']], left_on = ['Session_ID'], right_on = ['Session_ID'], how = 'outer')
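For what it's worth, here is a small sketch of what how='outer' does on frames shaped like the ones in the question. The values are the placeholder ones from the tables above, trimmed to a few columns, and the indicator=True argument is only there to show where each row came from. An outer merge keeps sessions that have no abstracts at all, so they appear as extra rows with NaN in the abstract columns.

```python
import pandas as pd

# Toy frames shaped like the question's tables (placeholder values only).
df_details = pd.DataFrame({
    "Abstract_Title": [f"Abstract_Title{i}" for i in range(1, 8)],
    "Session_ID": [12345, 12345, 12345, 22222, 22222, 98765, 98765],
})
df = pd.DataFrame({
    "Session_Title": [f"Session_Title{i}" for i in range(1, 6)],
    "Type": ["Paper", "Presentation", "Presentation", "Talk", "Paper"],
    "Session_ID": [12345, 22222, 98765, 12121, 33333],
})

# indicator=True adds a "_merge" column: both / left_only / right_only.
outer = pd.merge(df_details, df, on="Session_ID", how="outer", indicator=True)
left = pd.merge(df_details, df, on="Session_ID", how="left")

# Rows marked right_only are sessions with no matching abstract.
print(outer["_merge"].value_counts())
print(len(outer), len(left))
```

On this toy data the outer result has two more rows than the left result, one for each session (12121 and 33333) that has no abstracts.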

Use 'left' instead of 'outer'.

final_df = pd.merge(df_details, df[['Session_ID','Session_Title','Sponsors','Type']], left_on = ['Session_ID'], right_on =['Session_ID'], how = 'left')
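One caveat, offered as a guess since the real data isn't shown: 21,148 is more than 7,046 + 1,856, and that cannot happen if every Session_ID appears only once in df. So some Session_ID values are probably repeated in df as well, and in that case a left merge will still multiply rows. The sketch below uses made-up frames with one deliberately duplicated session to show how to spot this, how validate='many_to_one' makes pandas raise instead of silently duplicating, and one possible fix (drop_duplicates, which assumes the repeated session rows are true duplicates).

```python
import pandas as pd

# Hypothetical data: session 12345 is listed twice in df on purpose.
df_details = pd.DataFrame({
    "Abstract_Title": ["A1", "A2", "A3"],
    "Session_ID": [12345, 12345, 22222],
})
df = pd.DataFrame({
    "Session_ID": [12345, 12345, 22222],
    "Session_Title": ["S1", "S1", "S2"],
})

# Which keys are repeated on the lookup side?
dup_ids = df.loc[df["Session_ID"].duplicated(keep=False), "Session_ID"].unique()
print("repeated Session_IDs in df:", dup_ids)

# A plain left merge pairs every abstract with every copy of its session.
naive = pd.merge(df_details, df, on="Session_ID", how="left")
print(len(df_details), "->", len(naive))

# validate='many_to_one' raises MergeError if the right-hand keys are not unique.
try:
    pd.merge(df_details, df, on="Session_ID", how="left", validate="many_to_one")
    validated = True
except pd.errors.MergeError:
    validated = False

# One possible fix, assuming the repeated rows carry the same information.
df_unique = df.drop_duplicates(subset="Session_ID")
clean = pd.merge(df_details, df_unique, on="Session_ID", how="left",
                 validate="many_to_one")
print(len(df_details), "->", len(clean))
```

If the repeated rows in df actually differ (for example, different Sponsors for the same Session_ID), dropping duplicates would discard information, so it is worth inspecting dup_ids before choosing a fix.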
