Create a new column in the original dataframe if the column from another dataframe and a column from original dataframe have matching values
I have two dataframes in Python. One has more than 90,000 rows.
I would like to create a new column in the original dataframe from another dataframe if column values of the second dataframe match values in the original dataframe.
For example, if I'm given two DataFrames like this:
import pandas as pd

countries = {'Country': ['India', 'South Korea', 'France', 'Austria', 'India', 'Spain',
                         'France', 'Algeria', 'Angola', 'Spain', 'Belgium', 'Austria'],
             'Capital': ['Delhi', 'Seoul', 'Paris', 'Vienna', 'Delhi', 'Madrid', 'Paris',
                         'Algiers', 'Luanda', 'Madrid', 'Brussels', 'Vienna'],
             'Landmark': ['TajMahal', 'Seoul Tower', 'EiffelTower', 'Belvedere Palace', 'TajMahal',
                          'La Sagrada', 'EiffelTower', 'Algiers Memorial', 'Ruacana Falls',
                          'La Sagrada', 'Grand Place', 'Belvedere Palace']
             }
language = {'Country': ['India', 'South Korea', 'France', 'Algeria', 'Angola', 'Spain',
                        'Belgium', 'Austria'],
            'Language': ['Hindi', 'Korean', 'French', 'Arabic', 'Portuguese', 'Spanish',
                         'Dutch', 'German']
            }

# Build the two DataFrames shown below
df1 = pd.DataFrame(countries)
df2 = pd.DataFrame(language)
>>>df1
        Country   Capital          Landmark
0         India     Delhi          TajMahal
1   South Korea     Seoul       Seoul Tower
2        France     Paris       EiffelTower
3       Austria    Vienna  Belvedere Palace
4         India     Delhi          TajMahal
5         Spain    Madrid        La Sagrada
6        France     Paris       EiffelTower
7       Algeria   Algiers  Algiers Memorial
8        Angola    Luanda     Ruacana Falls
9         Spain    Madrid        La Sagrada
10      Belgium  Brussels       Grand Place
11      Austria    Vienna  Belvedere Palace
>>>df2
       Country    Language
0        India       Hindi
1  South Korea      Korean
2       France      French
3      Algeria      Arabic
4       Angola  Portuguese
5        Spain     Spanish
6      Belgium       Dutch
7      Austria      German
I would like to get a result like this:
>>>df1
        Country   Capital          Landmark    Language
0         India     Delhi          TajMahal       Hindi
1   South Korea     Seoul       Seoul Tower      Korean
2        France     Paris       EiffelTower      French
3       Austria    Vienna  Belvedere Palace      German
4         India     Delhi          TajMahal       Hindi
5         Spain    Madrid        La Sagrada     Spanish
6        France     Paris       EiffelTower      French
7       Algeria   Algiers  Algiers Memorial      Arabic
8        Angola    Luanda     Ruacana Falls  Portuguese
9         Spain    Madrid        La Sagrada     Spanish
10      Belgium  Brussels       Grand Place       Dutch
11      Austria    Vienna  Belvedere Palace      German
ValueError Traceback (most recent call last)
<ipython-input-13-c4d8473be816> in <module>
----> 1 df2['Countrylanguage'] = languages
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
3368 else:
3369 # set column
-> 3370 self._set_item(key, value)
3371
3372 def _setitem_slice(self, key, value):
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py in _set_item(self, key, value)
3443
3444 self._ensure_valid_index(value)
-> 3445 value = self._sanitize_column(key, value)
3446 NDFrame._set_item(self, key, value)
3447
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py in _sanitize_column(self, key, value, broadcast)
3628
3629 # turn me into an ndarray
-> 3630 value = sanitize_index(value, self.index, copy=False)
3631 if not isinstance(value, (np.ndarray, Index)):
3632 if isinstance(value, list) and len(value) > 0:
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/internals/construction.py in sanitize_index(data, index, copy)
517
518 if len(data) != len(index):
--> 519 raise ValueError('Length of values does not match length of index')
520
521 if isinstance(data, ABCIndexClass) and not copy:
ValueError: Length of values does not match length of index
What is the right way of adding a new column to the original DataFrame?
Thank you for your help!
There are many ways to do that, including merge, join, and map. Here's one of them:
df1.merge(df2)
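For reference, here is a minimal sketch of the merge approach on a shortened version of the data from the question. The `on='Country'` and `how='left'` arguments are my own additions, not part of the one-liner above: a left merge keeps every row of df1 in its original order and leaves NaN where a country has no match in df2. Note that merge returns a new DataFrame rather than modifying df1, so the result has to be assigned somewhere.

```python
import pandas as pd

# Shortened sample data; 'Spain' is deliberately missing from df2
df1 = pd.DataFrame({
    'Country': ['India', 'France', 'India', 'Spain'],
    'Capital': ['Delhi', 'Paris', 'Delhi', 'Madrid'],
})
df2 = pd.DataFrame({
    'Country': ['India', 'France'],
    'Language': ['Hindi', 'French'],
})

# Left merge on the shared key column: all rows of df1 are kept,
# and unmatched countries get NaN in the Language column
merged = df1.merge(df2, on='Country', how='left')
print(merged)
```

With the default `how='inner'`, rows of df1 whose country is absent from df2 would be dropped instead.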
Alternatively, I would recommend creating the following dictionary and using map:
language = {'India': 'Hindi',
'South Korea': 'Korean',
'France': 'French',
'Algeria': 'Arabic',
'Angola': 'Portuguese',
'Spain': 'Spanish',
'Belgium': 'Dutch',
'Austria': 'German'}
df1['Language'] = df1['Country'].map(language)
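If the lookup already lives in a second DataFrame, the dictionary does not have to be typed out by hand. The sketch below is one possible variant, assuming each country appears only once in df2: it builds a Series indexed by Country and passes that to map. The name `lookup` is just an illustrative choice.

```python
import pandas as pd

# Shortened sample data; 'Spain' is deliberately missing from df2
df1 = pd.DataFrame({
    'Country': ['India', 'France', 'India', 'Spain'],
    'Capital': ['Delhi', 'Paris', 'Delhi', 'Madrid'],
})
df2 = pd.DataFrame({
    'Country': ['India', 'France'],
    'Language': ['Hindi', 'French'],
})

# Series mapping Country -> Language (assumes Country is unique in df2)
lookup = df2.set_index('Country')['Language']

# map looks up each country in the Series; unmatched values become NaN
df1['Language'] = df1['Country'].map(lookup)
print(df1)
```

Unlike merge, this adds the column to df1 in place and keeps its existing index.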