简体   繁体   中英

Comparing NaN columns in pandas/numpy

I want to compare two columns in a dataframe which may contain NaN values.

Based on link I have tried to adapt my code but am struggling with the following:

(s1[s1.notnull()] == s2[s2.notnull()]).all()

How can I adjust this code so that I do not need to extract s1 and s2 as series from the dataframe but can apply it directly:

import pandas as pd
import numpy as np

mydata = [{'SystemA': 1.78, 'SystemB':1.78},
          {'SystemA': 2.5, 'SystemB':2.5},
          {'SystemA': np.nan, 'SystemB':np.nan}]

df = pd.DataFrame(mydata)
print(df)

# does not return single True or False
# gives wrong result when e.g. 2.5 is changed to 2.6 in just one column
c =  (df['SystemA'].notnull() == df['SystemB'].notnull()).all()

print('\nc:',c)

The equals method ignores nans:

>>> df
   SystemA  SystemB
0     1.78     1.78
1     2.50     2.50
2      NaN      NaN
>>> (df["SystemA"] == df["SystemB"]).all()
False
>>> df["SystemA"].equals(df["SystemB"])
True
>>> df.iloc[1,1] = 2.6
>>> df
   SystemA  SystemB
0     1.78     1.78
1     2.50     2.60
2      NaN      NaN
>>> df["SystemA"].equals(df["SystemB"])
False

although it also checks for dtype equality, so if you have a Series of integers and a Series of floats, it'll say they're not equal even if you have

>>> ((df.SystemA == df.SystemB) | (df.SystemA.isnull() & df.SystemB.isnull())).all()
True

This may or may not be a problem for you.

is that what you are looking for:

df['SystemA'].dropna() == df['SystemB'].dropna()

?

mydata = [{'SystemA': 1.78, 'SystemB':1.78},
          {'SystemA': 2.5, 'SystemB':2.5},
          {'SystemA': np.nan, 'SystemB':np.nan}]

df = pd.DataFrame(mydata)

This will get you rows where both cols are null:

df[pd.isnull(df['SystemA']) & (pd.isnull(df['SystemB']))]

    SystemA     SystemB
 2  NaN     NaN

This will get you rows where both cols are notnull:

df[(df['SystemA'].notnull()) & (df['SystemB'].notnull())]

SystemA     SystemB
0   1.78    1.78
1   2.50    2.50

I'm assuming you something along the lines of one of those?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM