
Using Python Pandas, can I replace values of one column in a df based on another column only when a "nan" value does not exist?

Let's say I have a data frame like this:

import pandas as pd
data1 = {
     "date": [1, 2, 3],
     "height": [420.3242, 380.1, 390],
     "height_new": [300, 380.1, "nan"],
     "duration": [50, 40, 45],
     "feeling" : ["great","good","great"]
    }
df = pd.DataFrame(data1)

And I want to update the "height" column with the "height_new" column but not when the value for "height_new" is "nan". Any hints on how to do this in a Pythonic manner?

I have some rough code that gets the job done, but it feels clunky (too many lines of code).

for x, y in zip(df['height'], df['height_new']) :
  if y != 'nan':
    df['height'].replace(x, y, inplace= True)
    x = y

You can use pandas.Series.where with pandas.Series.notna:

df["height"] = df["height_new"].where(df["height_new"].notna(), df["height"])

# Output:

print(df)
   date  height  height_new  duration feeling
0     1   300.0       300.0        50   great
1     2   380.1       380.1        40    good
2     3   390.0         NaN        45   great
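
For comparison, here is a minimal sketch of the same update written three ways. It assumes the missing entry in "height_new" is a real NaN (np.nan) rather than the string "nan"; the variable names via_where, via_fillna and via_combine are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "height": [420.3242, 380.1, 390.0],
    # Assumption: the missing value is a real NaN, not the string "nan".
    "height_new": [300.0, 380.1, np.nan],
})

# Keep height_new where it is present, otherwise fall back to height.
via_where = df["height_new"].where(df["height_new"].notna(), df["height"])

# Same result: fill the gaps in height_new with the aligned values of height.
via_fillna = df["height_new"].fillna(df["height"])

# Same result again: prefer height_new, fall back to height.
via_combine = df["height_new"].combine_first(df["height"])

df["height"] = via_fillna
print(df["height"].tolist())  # [300.0, 380.1, 390.0]
```

All three align on the index, so they should give the same column as long as both Series share the DataFrame's index.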

NB: If "nan" is a literal string (as in the sample data in the question), notna() will not treat it as missing, so use this instead:

df["height"] = df["height_new"].where(df["height_new"].ne("nan"), df["height"])
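
Another option for the literal-string case, sketched here as an assumption rather than as part of the original answer, is to convert "height_new" to numbers first so that "nan" becomes a real NaN. Because the column mixes numbers and a string, it has object dtype, and a result built from it directly will likely stay object dtype; converting first keeps "height" as float64. The name new_numeric is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "height": [420.3242, 380.1, 390],
    "height_new": [300, 380.1, "nan"],  # the string "nan", as in the question
})

# Turn the column into floats; anything that cannot be parsed becomes NaN.
new_numeric = pd.to_numeric(df["height_new"], errors="coerce")

# Use the new value where one exists, otherwise keep the old height.
df["height"] = new_numeric.fillna(df["height"])

print(df["height"].tolist())  # [300.0, 380.1, 390.0]
```

This leaves "height_new" itself untouched; assign new_numeric back to it if the cleaned column is wanted as well.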
