
How to merge two columns of different data frames and, if a match is found, write "True" in a new column using pandas

I'm working on a pandas project. I have two data frames similar to the ones below:

DF1 :

Data1    Data2      Data3
Head     Cat        Fire
Limbs    Dog        Snow
Eyes     Fish       Water
Mouth    Dragon     Air


DF2 :

 Data1     Data2      
 Limbs     Dog        
 Mouth     Dragon        
 Head      Cat 

Based on the above data frames, I need to compare both DFs and, if a match is found, write "True" in a separate column, else "False".

Example: say I pick the first row of DF2, with the combination (Limbs, Dog). This combination should be searched for in DF1. As we can see, it exists in the 2nd row, so DF1's Data3 value "Snow" should be written to DF2's Data3 column. "True" should also be written to a new column if the match is found.

Expected output:

Data1    Data2     Data3    Data4
Limbs    Dog       Snow     True
Mouth    Dragon    Air      True
Head     Cat       Fire     True
Eyes     Fish      Water    False

Currently, I have tried merging the two data frames.

Current code:

df3 = pd.merge(df, valid_req, on=['Data1', 'Data2'])

df3


Data1    Data2     Data3
Limbs    Dog       Snow
Mouth    Dragon    Air
Head     Cat       Fire

How can I achieve the expected output?
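For context, pd.merge defaults to how='inner', which keeps only key pairs present in both frames. That is likely why the (Eyes, Fish) row is missing from the attempt above. A minimal sketch that reproduces this with the sample data; the variable names df1 and df2 are assumptions, since the question's code uses df and valid_req:

```python
import pandas as pd

# Sample data copied from the question; the names df1/df2 are assumed.
df1 = pd.DataFrame({
    "Data1": ["Head", "Limbs", "Eyes", "Mouth"],
    "Data2": ["Cat", "Dog", "Fish", "Dragon"],
    "Data3": ["Fire", "Snow", "Water", "Air"],
})
df2 = pd.DataFrame({
    "Data1": ["Limbs", "Mouth", "Head"],
    "Data2": ["Dog", "Dragon", "Cat"],
})

# Default how='inner': only key pairs found in both frames survive.
inner = pd.merge(df1, df2, on=["Data1", "Data2"])

# how='left': every row of df1 is kept, whether it matched or not.
left = pd.merge(df1, df2, on=["Data1", "Data2"], how="left")

print(len(inner), len(left))
```

With a left join all four rows of df1 are retained, which is the starting point for flagging matches.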

You can assign a temporary column to df2 and then merge using how='left':

In [1665]: df2['tmp'] = 1

In [1668]: x = df1.merge(df2, on=['Data1', 'Data2'], how='left')

In [1667]: x
Out[1667]: 
   Data1   Data2  Data3  tmp
0   Head     Cat   Fire  1.0
1  Limbs     Dog   Snow  1.0
2   Eyes    Fish  Water  NaN
3  Mouth  Dragon    Air  1.0

Finally, use numpy.where to assign the new column Data4: True where x['tmp'] == 1, else False:

In [1668]: import numpy as np

In [1669]: x['Data4'] = np.where(x.tmp.eq(1), True, False)

Drop the now-unnecessary tmp column using df.drop. Then x is your final output:

In [1671]: x.drop(columns='tmp', inplace=True)

In [1672]: x
Out[1672]: 
   Data1   Data2  Data3  Data4
0   Head     Cat   Fire   True
1  Limbs     Dog   Snow   True
2   Eyes    Fish  Water  False
3  Mouth  Dragon    Air   True
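The same steps can be collected into one self-contained script. This is a sketch rather than the answer's exact code: it uses assign() so that df2 itself is not modified, and notna() in place of numpy.where. Both substitutions are my own, and the data frames are rebuilt from the question's sample data.

```python
import pandas as pd

# Sample data from the question.
df1 = pd.DataFrame({
    "Data1": ["Head", "Limbs", "Eyes", "Mouth"],
    "Data2": ["Cat", "Dog", "Fish", "Dragon"],
    "Data3": ["Fire", "Snow", "Water", "Air"],
})
df2 = pd.DataFrame({
    "Data1": ["Limbs", "Mouth", "Head"],
    "Data2": ["Dog", "Dragon", "Cat"],
})

# assign() returns a copy, so df2 is left without the helper column.
x = df1.merge(df2.assign(tmp=1), on=["Data1", "Data2"], how="left")

# Unmatched rows have NaN in tmp, so notna() gives the boolean flag directly.
x["Data4"] = x["tmp"].notna()
x = x.drop(columns="tmp")

print(x)
```

Avoiding inplace=True and the mutation of df2 keeps the inputs reusable if the comparison has to be rerun.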

Use DataFrame.merge with a left join and the indicator=True parameter, then build the new column by comparing the _merge indicator with 'both'; DataFrame.pop removes the indicator column in the same step:

df = df1.merge(df2, on=['Data1', 'Data2'], how='left', indicator=True)
df['Data4'] = df.pop('_merge').eq('both')
print (df)
   Data1   Data2  Data3  Data4
0   Head     Cat   Fire   True
1  Limbs     Dog   Snow   True
2   Eyes    Fish  Water  False
3  Mouth  Dragon    Air   True
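One caveat worth sketching: if df2 contains the same (Data1, Data2) pair more than once, a left merge repeats the matching df1 row. The example below adds a duplicated pair to df2, which is hypothetical and not part of the question's data, and de-duplicates the keys before merging.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Data1": ["Head", "Limbs", "Eyes", "Mouth"],
    "Data2": ["Cat", "Dog", "Fish", "Dragon"],
    "Data3": ["Fire", "Snow", "Water", "Air"],
})
# Hypothetical variation: (Limbs, Dog) appears twice in df2.
df2 = pd.DataFrame({
    "Data1": ["Limbs", "Mouth", "Head", "Limbs"],
    "Data2": ["Dog", "Dragon", "Cat", "Dog"],
})

keys = ["Data1", "Data2"]

# Keep only the key columns and drop repeated pairs so that each
# df1 row can match at most once.
out = df1.merge(df2[keys].drop_duplicates(), on=keys, how="left", indicator=True)

# _merge is 'both' for matched rows and 'left_only' otherwise.
out["Data4"] = out.pop("_merge").eq("both")

print(out)
```

Selecting only the key columns from df2 also avoids suffixed columns such as Data3_x and Data3_y when both frames carry a Data3 column.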

Simply use the apply function on DF1 to create Data4:

import pandas as pd

DF1 = pd.DataFrame([
    ["Head", "Cat", "Fire"],
    ["Limbs", "Dog", "Snow"],
    ["Eyes", "Fish", "Water"],
    ["Mouth", "Dragon", "Air"]
], columns=["Data1", "Data2", "Data3"])

DF2 = pd.DataFrame([
    ["Limbs", "Dog", "Snow"],
    ["Mouth", "Dragon", "Air"],
    ["Head", "Cat", "Fire"]
], columns=["Data1", "Data2", "Data3"])

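# Flag each DF1 row whose Data1 value occurs anywhere in DF2.
# Note: only Data1 is compared here; Data2 is not part of the check.
DF1["Data4"] = DF1["Data1"].apply(lambda cell: DF2[DF2["Data1"]==cell]["Data1"].count()>0)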
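Since the question asks for a match on the combination of Data1 and Data2, a row-wise variant of the apply approach may be closer to what is needed. The following is a sketch under that assumption; the pairs set is a helper introduced here for illustration, not part of the answer above.

```python
import pandas as pd

DF1 = pd.DataFrame([
    ["Head", "Cat", "Fire"],
    ["Limbs", "Dog", "Snow"],
    ["Eyes", "Fish", "Water"],
    ["Mouth", "Dragon", "Air"],
], columns=["Data1", "Data2", "Data3"])

DF2 = pd.DataFrame([
    ["Limbs", "Dog"],
    ["Mouth", "Dragon"],
    ["Head", "Cat"],
], columns=["Data1", "Data2"])

# Hypothetical helper: the set of (Data1, Data2) pairs present in DF2.
pairs = set(zip(DF2["Data1"], DF2["Data2"]))

# axis=1 passes each row to the lambda, so both columns can be checked together.
DF1["Data4"] = DF1.apply(lambda row: (row["Data1"], row["Data2"]) in pairs, axis=1)

print(DF1)
```

Row-wise apply is convenient for small frames; on large data the merge-based answers are usually faster.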
