Compare two columns in a pandas DataFrame (Python)

I have two Excel files as input.

F1
FileName        Name   Gender
4-F_994637.txt  XXX    Not Identified
4-F_994576.txt  XXX    Not Identified
3-F_977039.txt  XXX    Not Identified
4-F_992516.txt  XXX    Not Identified
3-F_980311.txt  XXX    Not Identified
4-F_994638.txt  XXX    Female
4-F_994126.txt  XXX    Female
3-F_677039.txt  XXX    female
4-F_322516.txt  XXX    male
3-F_677311.txt  XXX    male

F2
FileName        Name   Gender
4-F_994637.txt  XXX    Male
4-F_994576.txt  XXX    Male
3-F_977039.txt  XXX    Male
4-F_992516.txt  XXX    Male
3-F_980311.txt  XXX    Male

First I need to select the rows where Gender is 'Not Identified', then compare F1 and F2 on the "FileName" column. If there is a match, I should replace the Gender in F1 with Male.

I tried the code below, but I get an error:

import pandas as pd

df1=pd.read_excel('F1.xlsx')
df2=pd.read_excel('F2.xlsx')

pick=df1[df1["Gender"]=='Not Identified']
filecompare=pick["FileName"] == df2["FileName"]

ValueError: Can only compare identically-labeled Series objects

Can someone explain what this error is about?

Thanks, Meera

With pandas version 1.1.1, your code block

pick=df1[df1["Gender"]=='Not Identified']
filecompare = pick["FileName"] == df2["FileName"]
display(filecompare)

returns

[image: output_for_question]
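On the error itself: as far as I know, `==` between two Series does not align them; pandas raises this ValueError whenever the two index objects are not identical (different length or different labels). A minimal sketch with made-up Series `a` and `b` (hypothetical names and values, not from the question's files):

```python
import pandas as pd

# Hypothetical data: same kind of values, but the indexes differ in length.
a = pd.Series(["x.txt", "y.txt", "z.txt"], index=[0, 1, 2])
b = pd.Series(["x.txt", "y.txt"], index=[0, 1])

try:
    a == b  # element-wise comparison needs identical index labels
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)
    print(message)

# isin() tests membership by value and ignores the other Series' index.
aligned = a.isin(b)
print(aligned.tolist())
```

So the comparison may happen to work when the filtered rows and the second file share the same index labels, and fail on real files where they do not.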

You can check for matching file names with the isin() method and set the Gender of those rows to 'Male' with .loc.

cond_male = (df1['Gender']=='Not Identified') & (df1['FileName'].isin(df2['FileName']))
df1.loc[cond_male,'Gender']='Male'

returns

[image: output for question]
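For anyone who wants to try this without the Excel files, here is a self-contained sketch. The frames are built inline from a few of the question's rows, and the row with file name `3-F_000000.txt` is a hypothetical addition to show that an unmatched 'Not Identified' row is left alone.

```python
import pandas as pd

# Stand-ins for F1.xlsx and F2.xlsx, built inline instead of read_excel().
df1 = pd.DataFrame({
    "FileName": ["4-F_994637.txt", "4-F_994576.txt", "3-F_000000.txt",
                 "4-F_994638.txt", "4-F_322516.txt"],
    "Name": ["XXX"] * 5,
    "Gender": ["Not Identified", "Not Identified", "Not Identified",
               "Female", "male"],
})
df2 = pd.DataFrame({
    "FileName": ["4-F_994637.txt", "4-F_994576.txt"],
    "Name": ["XXX"] * 2,
    "Gender": ["Male", "Male"],
})

# Rows that are 'Not Identified' in df1 AND whose FileName appears in df2.
cond_male = (df1["Gender"] == "Not Identified") & df1["FileName"].isin(df2["FileName"])

# Overwrite Gender only for those rows.
df1.loc[cond_male, "Gender"] = "Male"
print(df1)
```

Both conditions are boolean Series indexed like df1, so combining them with `&` never hits the labeling problem from the question.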
