
Compare two DataFrames and get only the non-matching values, with index and column names (pandas, Python)

df1:

ID Name  Number
0  AAA    123
1  BBB    456
2  CCC    789

df2:

ID Name  Number
0  AAA    123
1  BBB    456
2  CCC    963    <-- non-matching value

I want to compare the two DataFrames above, df1 and df2, and get the result in the format below: only the non-matching values, together with their column names.

Expected output:

ID Number
2  963

Can anyone help me with the code? I am new to pandas. Thank you so much.

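You can use .merge() with indicator=True and then filter the result on the _merge indicator column. Because no on= argument is given, the merge joins on every column the two frames share, so a df2 row that differs from df1 in any column is marked left_only: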

df3 = df2.merge(df1, how='left', indicator=True)
df3[df3['_merge'] == 'left_only'][['ID', 'Number']]

Result:

   ID  Number
2   2     963
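
For reference, the merge approach above can be reproduced end to end with the question's sample data. This is a minimal sketch: the frames are rebuilt by hand, and the variable name `mismatches` is introduced here only for illustration.

```python
import pandas as pd

# Sample frames rebuilt from the question
df1 = pd.DataFrame({"ID": [0, 1, 2],
                    "Name": ["AAA", "BBB", "CCC"],
                    "Number": [123, 456, 789]})
df2 = pd.DataFrame({"ID": [0, 1, 2],
                    "Name": ["AAA", "BBB", "CCC"],
                    "Number": [123, 456, 963]})

# With no `on=` argument, merge joins on every column the frames share,
# so a df2 row that differs from df1 in any column finds no partner.
df3 = df2.merge(df1, how="left", indicator=True)

# Rows marked 'left_only' exist in df2 but have no exact match in df1.
mismatches = df3.loc[df3["_merge"] == "left_only", ["ID", "Number"]]
print(mismatches)
```

Note that merge returns a fresh 0..n-1 index, so the row label in the result equals the ID here only because df2 already had a default index.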

Edit

If you have multiple columns and would rather not list the column names in order to highlight the differences, you can try:

df2[(df1 != df2)].dropna(how='all', axis=1).dropna(how='all', axis=0)

Demo

df1

   ID Name  Number1  Number2  Number3
0   0  AAA      123       12     1111
1   1  BBB      456       22     2222
2   2  CCC      789       32     3333


df2

   ID Name  Number1  Number2  Number3
0   0  AAA      123       12     1111
1   1  BBB      456       22     2255
2   2  CCC      963       32     3333


df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0)


   Number1  Number3
1      NaN   2255.0
2    963.0      NaN

The non-NaN values are the differences. The row labels on the left are the index, which coincides with the ID in this example.
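
The demo can be run as a self-contained script. The sketch below rebuilds the two demo frames by hand and assumes they have identical index and column labels, because `df1 != df2` raises a ValueError otherwise. The names `mask` and `diff` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [0, 1, 2],
                    "Name": ["AAA", "BBB", "CCC"],
                    "Number1": [123, 456, 789],
                    "Number2": [12, 22, 32],
                    "Number3": [1111, 2222, 3333]})
df2 = pd.DataFrame({"ID": [0, 1, 2],
                    "Name": ["AAA", "BBB", "CCC"],
                    "Number1": [123, 456, 963],
                    "Number2": [12, 22, 32],
                    "Number3": [1111, 2255, 3333]})

# True wherever the two frames disagree, cell by cell
mask = df1 != df2

# Keep df2's values at mismatching cells (NaN elsewhere), then drop
# the columns and rows that contain no mismatch at all.
diff = df2[mask].dropna(how="all", axis=1).dropna(how="all", axis=0)
print(diff)
```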

Edit 2

If your numbers are all integers and you don't want pandas to display them as floats (a side effect of the NaN values), you can use:

df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0).fillna('').astype(str).replace(r'\.0$', '', regex=True)


  Number1 Number3
1            2255
2     963        

Or, simply use:

df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0).astype('Int64')


   Number1  Number3
1     <NA>     2255
2      963     <NA>

You can use the following, which compares only the Number column:

df2[df1.Number != df2.Number][['ID', 'Number']]
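
The one-liner above relies on both frames having the same index: the `!=` operator only works when the two Series carry identical labels in the same order, and pandas raises a ValueError otherwise. If the rows may arrive in a different order, one option is to align on ID first. The sketch below assumes ID is unique in both frames and that both contain the same IDs; the names `a`, `b`, `is_different`, and `changed` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [0, 1, 2],
                    "Name": ["AAA", "BBB", "CCC"],
                    "Number": [123, 456, 789]})
# Same data as the question's df2, but with the rows in a different order
df2 = pd.DataFrame({"ID": [2, 0, 1],
                    "Name": ["CCC", "AAA", "BBB"],
                    "Number": [963, 123, 456]})

# Use ID as the index so rows are matched by key, not by position
a = df1.set_index("ID")
b = df2.set_index("ID")

# Series.ne aligns the two Series on their index labels before comparing
is_different = a["Number"].ne(b["Number"])

# The boolean Series is aligned to b's index when used with .loc
changed = b.loc[is_different, ["Number"]]
print(changed)
```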

You can extract whatever data you want from the output, which contains the details of all mismatches.
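
One built-in way to get a report of all mismatches is DataFrame.compare, available in pandas 1.1 and later. This is offered as an alternative sketch and may not be the method this answer used; it assumes both frames have identical index and column labels, and the name `report` is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [0, 1, 2],
                    "Name": ["AAA", "BBB", "CCC"],
                    "Number1": [123, 456, 789],
                    "Number2": [12, 22, 32],
                    "Number3": [1111, 2222, 3333]})
df2 = pd.DataFrame({"ID": [0, 1, 2],
                    "Name": ["AAA", "BBB", "CCC"],
                    "Number1": [123, 456, 963],
                    "Number2": [12, 22, 32],
                    "Number3": [1111, 2255, 3333]})

# Only rows and columns with at least one difference are kept.
# 'self' holds df1's value and 'other' holds df2's value at each mismatch;
# cells that are equal in both frames show as NaN.
report = df1.compare(df2)
print(report)
```

The result has two-level columns, so a single value can be read with a tuple key such as `report.loc[2, ("Number1", "other")]`.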


