
Python nan and None

I am trying to implement SCD2 (slowly changing dimension, type 2) with Python. The source files are .xlsx files and will be loaded into PostgreSQL.

I identify the changed rows in the merged DataFrame with the following condition:

ChangedRecords = np.where((All_Data.is_current_x.notnull()) & (All_Data.is_current_y.notnull())
                              & ((All_Data['col1_x'] != All_Data['col1_y'])
                                 | (All_Data['col2_x'] != All_Data['col2_y'])
                                 | (All_Data['col3_x'] != All_Data['col3_y'])
                                 ... ),1,0)

The problem is that NULL values from the source (.xlsx) are displayed as nan, while NULL values from the destination (PostgreSQL) are displayed as None. My code therefore treats these values as different, so every row that contains a NULL ends up in ChangedRecords, which is wrong.
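To make the comparison problem concrete, here is a minimal sketch of a null-safe "differs" check. It assumes the merged frame has `_x`/`_y` column pairs as in the snippet above; the helper name `differs` and the sample values are made up for illustration and are not part of the original code. The idea is to rely on `isna()`, which treats both None and NaN as missing, instead of on `!=` alone.

```python
import numpy as np
import pandas as pd


def differs(left, right):
    """True where two Series differ, treating missing values on both sides as equal."""
    # isna() is True for both None and NaN, so the two kinds of NULL match here.
    both_missing = left.isna() & right.isna()
    return (left != right) & ~both_missing


# Hypothetical sample data: NaN on the source side, None on the database side.
merged = pd.DataFrame({
    "col1_x": pd.Series(["a", np.nan, "c", np.nan], dtype=object),
    "col1_y": pd.Series(["a", None, "d", "e"], dtype=object),
})

merged["changed"] = np.where(differs(merged["col1_x"], merged["col1_y"]), 1, 0)
print(merged)
```

With a check like this, a row is flagged only when the values really differ or when exactly one side is missing, so the NULL representation of each side no longer matters.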

The datatype of the column is String.

I used the code below to replace them, but it does not seem to work.

Source = ftp_invoices.replace([np.nan],[None])

I still get nan for the NULL values.

I think nan is a string here and not a real NaN (note the lowercase spelling).
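One way to confirm this guess is to count what kind of "missing" values a column actually holds. The following is only a sketch: the helper `kinds_of_missing` and the sample Series are hypothetical, and the check for the string form assumes it is spelled "nan" in any letter case.

```python
import numpy as np
import pandas as pd


def kinds_of_missing(series):
    """Count None, float NaN and the literal string 'nan' in an object Series."""
    counts = {"None": 0, "float NaN": 0, "string 'nan'": 0}
    for value in series:
        if value is None:
            counts["None"] += 1
        elif isinstance(value, float) and np.isnan(value):
            counts["float NaN"] += 1
        elif isinstance(value, str) and value.strip().lower() == "nan":
            counts["string 'nan'"] += 1
    return counts


# dtype=object keeps None, NaN and the string as three distinct values.
sample = pd.Series([None, np.nan, "nan", "abc"], dtype=object)
print(kinds_of_missing(sample))
```

If the string count is non-zero, the column was probably converted to text at some point (for example with `astype(str)`), which turns real NaN values into the string nan.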

Try:

Source = ftp_invoices.replace('nan', None)

Example:

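import numpy as np
import pandas as pd

# np.nan is used because the np.NaN alias was removed in NumPy 2.0
df = pd.DataFrame({'A': [None, np.nan, 'nan']})
print(df)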

# Output
      A
0  None  # <- None
1   NaN  # <- np.nan
2   nan  # <- the string nan

# Replace real NaN by None
>>> df.replace(np.nan, None)
      A
0  None
1  None  # Only here
2   nan

# Replace nan strings by None
>>> df.replace('nan', None)
      A
0  None
1   NaN
2  None  # Only here

# Combine all
>>> df.replace({'nan': None, np.nan: None})
      A
0  None
1  None
2  None
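If the goal is to turn every real missing value in a whole DataFrame into None before loading or comparing, a commonly used alternative to `replace` is to cast to object dtype and mask with `where`. This is a sketch, not the only way to do it: the helper name `nan_to_none` and the sample frame are hypothetical, and the cast to object is there on the assumption that numeric columns would otherwise coerce None back to NaN.

```python
import numpy as np
import pandas as pd


def nan_to_none(df):
    """Return a copy of df in which every missing value (NaN or None) is None."""
    # Object dtype can hold None; a float column cannot.
    as_object = df.astype(object)
    # Keep values where they are present, put None everywhere else.
    return as_object.where(as_object.notna(), None)


# Hypothetical sample standing in for the frame read from the .xlsx files.
source = pd.DataFrame({
    "col1": ["a", np.nan, "c"],
    "amount": [1.5, np.nan, 3.0],
})

cleaned = nan_to_none(source)
print(cleaned)
```

Note that this handles real NaN values only; literal 'nan' strings still need the `replace('nan', None)` step shown above.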
