Append new values to a dataframe on the same row if a column value is "foo"?

Question

I have a dataframe containing country names, and I would like to append the coordinates of each country's capital to it.

I created a dict with all the coordinates, formatted like this:

{'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773), 
'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

I have a dataframe with a "COUNTRY" column, and I want two new columns, "LAT" and "LON", to store the coordinates. I tried converting the dict to a dataframe directly, but it didn't work the way I wanted.

Is it viable to create an empty df with two columns "LAT" and "LON", merge it with the original df, and then iterate through it, checking the country and adding the coordinates one by one? Or is there a better way of doing it?

A country can appear many times in the df, which has about 30k entries, so I'm afraid iterating will cause a bit of overhead. I'm new to Pandas, so I might be missing a built-in feature that would work well here.

Do you have any thoughts on the best way to approach this?

Thanks in advance

Answer 1

Use map with two dict comprehensions, selecting the first and second value of each tuple by indexing with [0] and [1]:

import pandas as pd

d = {'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773),
     'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

df = pd.DataFrame({'COUNTRY': ['Zimbabwe', 'Hungary', 'Slovakia']})

# Map each country to the first / second element of its tuple;
# countries missing from d (here Slovakia) become NaN
df['LAT'] = df['COUNTRY'].map({k: v[0] for k, v in d.items()})
df['LON'] = df['COUNTRY'].map({k: v[1] for k, v in d.items()})
print(df)
    COUNTRY        LAT        LON
0  Zimbabwe  31.045686 -17.831773
1   Hungary  19.040471  47.498382
2  Slovakia        NaN        NaN
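
One thing worth checking before relying on the snippet above: the sample tuples look like they are in (longitude, latitude) order, since Prague sits at roughly 50.09°N, 14.42°E. If that assumption holds for the whole dict, index [1] belongs in LAT and index [0] in LON. A minimal sketch under that assumption, which also lists the countries that have no entry in the dict (the name `missing` is only illustrative):

```python
import pandas as pd

d = {'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773),
     'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

df = pd.DataFrame({'COUNTRY': ['Zimbabwe', 'Hungary', 'Slovakia']})

# Assumption: each tuple is (longitude, latitude), so swap the indices
df['LAT'] = df['COUNTRY'].map({k: v[1] for k, v in d.items()})
df['LON'] = df['COUNTRY'].map({k: v[0] for k, v in d.items()})

# Countries without an entry in d are left as NaN; collect them for review
missing = df.loc[df['LAT'].isna(), 'COUNTRY'].unique().tolist()
print(missing)
```

If the dict really is (latitude, longitude), keep the indices as in the answer; the `missing` check works either way.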

Answer 2

In addition to the solution above, you can also convert the dict to a DataFrame and select its rows with iloc:

import pandas as pd

d = {'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773), 'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

# Keys become columns; row 0 holds the first tuple elements, row 1 the second
d = pd.DataFrame(d)
print(d)

   Czech Republic   Zimbabwe    Hungary   Nigeria
0       14.421254  31.045686  19.040471  7.489297
1       50.087465 -17.831773  47.498382  9.064331

df = pd.DataFrame({'COUNTRY': ['Zimbabwe', 'Hungary', 'Slovakia']})

# d.iloc[0] and d.iloc[1] are Series indexed by country name, so map can use them as lookups
df['LAT'] = df['COUNTRY'].map(d.iloc[0])
df['LON'] = df['COUNTRY'].map(d.iloc[1])

print(df)

    COUNTRY        LAT        LON
0  Zimbabwe  31.045686 -17.831773
1   Hungary  19.040471  47.498382
2  Slovakia        NaN        NaN
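
The question also asks about merging instead of iterating. As a rough sketch of that route, the dict can be turned into a lookup frame with one row per country and then left-joined onto the original frame in a single call. The names `coords` and `result` are illustrative, and the column order follows the answers above (first tuple element as LAT); swap the names if the tuples are actually (longitude, latitude).

```python
import pandas as pd

d = {'Czech Republic': (14.4212535, 50.0874654), 'Zimbabwe': (31.045686, -17.831773),
     'Hungary': (19.0404707, 47.4983815), 'Nigeria': (7.4892974, 9.0643305)}

# A country may repeat many times in the real data; Hungary repeats here
df = pd.DataFrame({'COUNTRY': ['Zimbabwe', 'Hungary', 'Slovakia', 'Hungary']})

# One row per country, indexed by country name
coords = pd.DataFrame.from_dict(d, orient='index', columns=['LAT', 'LON'])

# Left join: match df['COUNTRY'] against the index of coords;
# rows without a match (Slovakia) get NaN in both new columns
result = df.join(coords, on='COUNTRY')
print(result)
```

The join keeps the original row order and row count, so repeated countries simply receive the same coordinates each time.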

Answer 1: score 3, accepted, 2018-07-04 12:28:05

Answer 2: score 1, 2018-07-05 00:57:19
