
Python pandas .map saves only last edit

I am trying to use pandas .map to edit a dataset as in the following code:

df['Region'] = df['Region'].astype('category')
reg = df['Region']
cats = reg.cat.categories
ncats = len(cats)
n = len(os)

north = (...)
south = (...)
center = (...)
islands = (...)

d1 = {cats[i]:'South' for i in range(ncats) if cats[i] in south}
d2 = {cats[i]:'North' for i in range(ncats) if cats[i] in north}
d3 = {cats[i]:'Center' for i in range(ncats) if cats[i] in center}
d4 = {cats[i]:'Islands' for i in range(ncats) if cats[i] in islands}

df['Reg_cat'] = df['Region'].map(d1)
df['Reg_cat'] = df['Region'].map(d2)
df['Reg_cat'] = df['Region'].map(d3)
df['Reg_cat'] = df['Region'].map(d4)
df['Reg_cat'] = df['Reg_cat'].astype('category')
df['Reg_cat'].cat.categories
df['Reg_cat']

The code does work but it only applies the last .map request. So in this case it applies d4. If d1 is the last one it applies that one. What am I doing wrong?

Each map call returns NaN for every value that is not a key in the mapper, and each assignment replaces the whole Reg_cat column, so only the result of the last call survives.
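To see the effect in isolation, here is a minimal sketch. The region names "A" to "D" and the two small dictionaries are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical region names, used only for illustration.
df = pd.DataFrame({"Region": ["A", "B", "C", "D"]})

d1 = {"A": "South"}
d2 = {"B": "North"}

first = df["Region"].map(d1)   # only "A" is mapped, every other row is NaN
second = df["Region"].map(d2)  # only "B" is mapped, every other row is NaN

df["Reg_cat"] = first
df["Reg_cat"] = second  # replaces the whole column, so the "South" label is lost

print(df)
```

After the second assignment the row for "A" is NaN again, which matches the behaviour described in the question.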

Try building a single dictionary and passing that instead.

m = {'North' : north, 'South' : south, 'Center' : center, 'Islands' : islands}
d = {v2 : k for k, v in m.items() for v2 in v}

df['Reg_cat'] = df['Region'].map(d)

Note:

  • you don't need reg
  • you don't need cats
  • you don't need ncats
  • you also (not surprisingly) don't need n, whatever that is
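Put together, the single-dictionary approach might look like the sketch below. The region lists are placeholders, since the question elides the real ones as (...), and the sample DataFrame is invented for the example.

```python
import pandas as pd

# Hypothetical region lists; the real ones are elided as (...) in the question.
north = ("Lombardia", "Veneto")
south = ("Puglia", "Calabria")
center = ("Lazio", "Toscana")
islands = ("Sicilia", "Sardegna")

# Made-up sample data.
df = pd.DataFrame({"Region": ["Lazio", "Sicilia", "Veneto", "Puglia", "Lazio"]})

# Label -> regions, then inverted into one region -> label lookup.
m = {"North": north, "South": south, "Center": center, "Islands": islands}
d = {region: label for label, regions in m.items() for region in regions}

# One map call over the source column, then convert to a categorical.
df["Reg_cat"] = df["Region"].map(d).astype("category")

print(df)
```

Any region missing from all four lists would still come out as NaN, so it is worth checking df['Reg_cat'].isna().sum() on real data.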

Every time you call df['Reg_cat'] = df['Region'].map(d#) you overwrite the value of df['Reg_cat']. If you'd like to keep all the values, consider adding them as separate columns.
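A rough sketch of the separate-columns idea follows. The column names Reg_south and Reg_north, the region names, and the final combine_first step are my own additions for illustration, not something the question asked for.

```python
import pandas as pd

# Hypothetical data and partial mappers, for illustration only.
df = pd.DataFrame({"Region": ["A", "B", "C"]})
partial_maps = {
    "south": {"A": "South"},
    "north": {"B": "North"},
}

# Store each partial result in its own column instead of overwriting one.
for name, mapper in partial_maps.items():
    df[f"Reg_{name}"] = df["Region"].map(mapper)

# Optionally merge them: take Reg_south and fill its NaN rows from Reg_north.
df["Reg_cat"] = df["Reg_south"].combine_first(df["Reg_north"])

print(df)
```

This keeps every intermediate result visible, although a single combined dictionary is usually simpler when one label per row is all that is needed.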
