
Pandas: replace values in column with condition

I have a dataframe:

city_reg     city_live   reg_region    live_region 
 Moscow         Tver        77            69
 Tambov         Tumen'      86            86

I need to replace the values in city_reg with the values from city_live where reg_region == live_region.

I tried:

df.loc[df.reg_region == df.live_region, 'city_reg'] = df['city_live']

but it raises

ValueError: cannot reindex from a duplicate axis

How can I fix that?

Use mask or numpy.where, both of which work fine with duplicated indices:

# create duplicated indices for the test
df.index = [0, 0]
print(df)
  city_reg city_live  reg_region  live_region
0   Moscow      Tver          77           69
0   Tambov    Tumen'          86           86

df['city_reg'] = df['city_reg'].mask(df.reg_region == df.live_region,  df['city_live'])

Or:

# take city_live where the regions match, otherwise keep city_reg
df['city_reg'] = np.where(df.reg_region == df.live_region, df['city_live'], df['city_reg'])

print (df)
  city_reg city_live  reg_region  live_region
0   Moscow      Tver          77           69
0   Tumen'    Tumen'          86           86
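
For anyone who wants to run this end to end, here is a minimal self-contained sketch of both approaches on the sample data from the question. The variable names same_region, via_mask and via_where are only illustrative, and the frame is rebuilt by hand with the duplicated index used in the test above.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with deliberately duplicated index labels.
df = pd.DataFrame(
    {
        "city_reg": ["Moscow", "Tambov"],
        "city_live": ["Tver", "Tumen'"],
        "reg_region": [77, 86],
        "live_region": [69, 86],
    },
    index=[0, 0],
)

# True for rows where both region codes are equal.
same_region = df["reg_region"] == df["live_region"]

# Series.mask replaces values where the condition is True.
via_mask = df["city_reg"].mask(same_region, df["city_live"])

# np.where(cond, a, b) picks from a where cond is True, otherwise from b.
via_where = np.where(same_region, df["city_live"], df["city_reg"])

print(via_mask.tolist())
print(list(via_where))
```

Note the argument order in np.where: the value for matching rows comes second, the fallback comes third.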

Try this:

mask = df.reg_region == df.live_region
df.loc[mask, 'city_reg'] = df.loc[mask, 'city_live']

#   city_reg city_live  reg_region  live_region
# 0   Moscow      Tver          77           69
# 1   Tumen'    Tumen'          86           86

This works because applying the same mask to both sides of the assignment gives the left-hand and right-hand sides the same index labels, so pandas can align them.
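
A minimal sketch of that idea on the sample data, assuming a default unique index. The second half is a variation that is not part of the answer above: it converts the right-hand side to a plain array with to_numpy(), so the values are assigned by position instead of being aligned by label, which is one way to sidestep duplicated labels. The names df, dup and same are only illustrative.

```python
import pandas as pd

data = {
    "city_reg": ["Moscow", "Tambov"],
    "city_live": ["Tver", "Tumen'"],
    "reg_region": [77, 86],
    "live_region": [69, 86],
}

# Case 1: unique index, same mask on both sides of the assignment.
df = pd.DataFrame(data)
same = df["reg_region"] == df["live_region"]
df.loc[same, "city_reg"] = df.loc[same, "city_live"]

# Case 2 (variation): duplicated index, right-hand side as a plain array
# so pandas assigns by position rather than aligning on labels.
dup = pd.DataFrame(data, index=[0, 0])
same_dup = dup["reg_region"] == dup["live_region"]
dup.loc[same_dup, "city_reg"] = dup.loc[same_dup, "city_live"].to_numpy()

print(df["city_reg"].tolist())
print(dup["city_reg"].tolist())
```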


 