
If column value is “foo”, append dataframe with new values on the same row?

I have a dataframe containing country names, and I would like to add the coordinates of each country's capital to it.

I created a dict with all the coordinates that's formatted like this:

{'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773), 
'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

I have a dataframe where a column is "COUNTRY", and want there to be two new columns "LAT", "LON" where I will store the coordinates. I tried converting the dict to a dataframe directly but it didn't work as I wanted it to.

Is it viable to create an empty df with two columns "LAT" and "LON", merge it with the original df, and then iterate through it, checking the country and adding the coordinates one by one? Or is there a better way to do it?

The df has about 30k entries and a country can appear many times, so I'm afraid iterating will cause a bit of overhead. I'm new to pandas, so I might be missing a built-in feature that would work well for this.

Do you have any thoughts on the best way to approach this?

Thanks in advance

Use two dict comprehensions that select the first and second value of each tuple by indexing with [0] and [1], and pass each resulting dict to map:

import pandas as pd

d = {'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773),
     'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

df = pd.DataFrame({'COUNTRY': ['Zimbabwe', 'Hungary', 'Slovakia']})

# The tuples are (longitude, latitude), e.g. Harare lies at about 17.8°S, 31.0°E,
# so index 1 holds the latitude and index 0 the longitude.
df['LAT'] = df['COUNTRY'].map({k: v[1] for k, v in d.items()})
df['LON'] = df['COUNTRY'].map({k: v[0] for k, v in d.items()})
print(df)
    COUNTRY        LAT        LON
0  Zimbabwe -17.831773  31.045686
1   Hungary  47.498382  19.040471
2  Slovakia        NaN        NaN
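
The NaN values for Slovakia above come from a country that has no entry in the dict. With about 30k rows it may be worth listing such countries before relying on the new columns. A minimal sketch, assuming the same d and df as above (the name missing is only illustrative):

```python
import pandas as pd

d = {'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773),
     'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

df = pd.DataFrame({'COUNTRY': ['Zimbabwe', 'Hungary', 'Slovakia']})

# Rows whose country is not a key of d would end up as NaN after map()
not_in_dict = ~df['COUNTRY'].isin(list(d))

# Distinct country names that still need coordinates
missing = df.loc[not_in_dict, 'COUNTRY'].unique()
print(list(missing))
```

Because isin works on the whole column at once, this check does not need an explicit Python loop over the rows.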

Adding to the solution above, you can also convert the dict to a DataFrame and select each row of coordinates with iloc:

import pandas as pd

d = {'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773), 'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

# One column per country; row 0 holds the first tuple value, row 1 the second
d = pd.DataFrame(d)
print(d)

   Czech Republic   Zimbabwe    Hungary   Nigeria
0       14.421254  31.045686  19.040471  7.489297
1       50.087465 -17.831773  47.498382  9.064331

df = pd.DataFrame({'COUNTRY': ['Zimbabwe', 'Hungary', 'Slovakia']})

# The tuples are (longitude, latitude), so row 1 is the latitude and row 0 the longitude
df['LAT'] = df['COUNTRY'].map(d.iloc[1])
df['LON'] = df['COUNTRY'].map(d.iloc[0])

print(df)

    COUNTRY        LAT        LON
0  Zimbabwe -17.831773  31.045686
1   Hungary  47.498382  19.040471
2  Slovakia        NaN        NaN
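
The question also mentions merging, so here is a hedged sketch of that route: build a small lookup DataFrame from the dict and left-join it on the COUNTRY column. It assumes the tuples are ordered (longitude, latitude), which the values suggest (Harare lies at about 17.8°S, 31.0°E). The names coord_dict, coords and out are only illustrative.

```python
import pandas as pd

coord_dict = {'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773),
              'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

df = pd.DataFrame({'COUNTRY': ['Zimbabwe', 'Hungary', 'Slovakia']})

# One row per country, indexed by country name.
# Assumption: each tuple is (longitude, latitude).
coords = pd.DataFrame.from_dict(coord_dict, orient='index', columns=['LON', 'LAT'])

# Left join on the COUNTRY column against the index of coords;
# every row of df is kept and unmatched countries get NaN.
out = df.join(coords, on='COUNTRY')
print(out)
```

This adds both columns in one step and keeps the original row order, so a country that appears many times simply picks up the same coordinates on each of its rows.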

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 