How to add a value to a new column of a dataframe based on a match in another dataframe?
I have two dataframes loaded from two large Excel files. Dataframe 1 is always smaller than dataframe 2. The elements of dataframe 1 are unique, while in dataframe 2 many elements can be repeated with the same ID and the same code.
I'm trying to add a new column to dataframe 1: the new column holds the 'code' from dataframe 2 (the value is added when the IDs of both dataframes match).
I managed to solve this with two nested for loops, but the process is too slow. Is there an alternative way to add the new column?
The following dataframes are very small and only illustrate the example; in practice I have a large amount of data with a large number of columns.
import pandas as pd

# Small lookup target: one row per unique ID
details_1 = {'ID': ['ID01', 'ID02', 'ID03', 'ID04', 'ID05'],
             'Qty': [1, 2, 3, 4, 5]}
# Larger table holding the code for each ID
details_2 = {'ID': ['IDA01', 'ID03', 'ID01', 'ID02', 'IDA02', 'IDX12',
                    'IDA03', 'IDA04', 'IDA05', 'ID04', 'ID05'],
             'code': ['ab', 'yz', 'acv', 'abc', 'efs', 'xw2',
                      'fgt', 'axf', 'ard', 'afd', 'x01']}

df1 = pd.DataFrame(details_1, columns=['ID', 'Qty'])
df2 = pd.DataFrame(details_2, columns=['ID', 'code'])
Expected output of print(df3):
     ID  Qty new_code
0  ID01    1      acv
1  ID02    2      abc
2  ID03    3       yz
3  ID04    4      afd
4  ID05    5      x01
You can use the .isin method:
df3 = df1.set_index('ID')
# Keep only the df2 rows whose ID occurs in df1, then align the 'code' column on ID
df3['new_code'] = df2.loc[df2['ID'].isin(df1['ID'])].set_index('ID')['code']
df3.reset_index(inplace=True)
Output:
     ID  Qty new_code
0  ID01    1      acv
1  ID02    2      abc
2  ID03    3       yz
3  ID04    4      afd
4  ID05    5      x01
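The question says IDs in dataframe 2 can repeat (always with the same code). Index-based assignment generally needs the IDs on the right-hand side to be unique, so one possible variant is to drop the repeated rows first and then look each ID up with Series.map. This is only a sketch: the extra 'ID03' row and the name `lookup` are made up for illustration and are not part of the original data.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ['ID01', 'ID02', 'ID03', 'ID04', 'ID05'],
                    'Qty': [1, 2, 3, 4, 5]})
# Assumed sample: 'ID03' appears twice with the same code, as described in the question
df2 = pd.DataFrame({'ID': ['IDA01', 'ID03', 'ID01', 'ID02', 'ID03', 'ID04', 'ID05'],
                    'code': ['ab', 'yz', 'acv', 'abc', 'yz', 'afd', 'x01']})

# One row per ID, then a Series indexed by ID that acts as a lookup table
lookup = df2.drop_duplicates(subset='ID').set_index('ID')['code']

df3 = df1.copy()
# map() looks up each ID in the lookup Series; IDs without a match become NaN
df3['new_code'] = df3['ID'].map(lookup)
print(df3)
```

Because drop_duplicates keeps the first occurrence by default, this relies on repeated IDs really carrying the same code.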
Use the merge method instead.
Here is the example code.
df3 = pd.merge(df1, df2, on = "ID")
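Note that a plain merge is an inner join: it keeps the column name 'code', drops dataframe 1 rows that have no match, and repeats a row for every duplicate ID in dataframe 2. A minimal sketch of one way to get the column layout from the question, assuming repeated IDs always share the same code (the duplicated 'ID03' row and the name `codes` are illustrative only):

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ['ID01', 'ID02', 'ID03', 'ID04', 'ID05'],
                    'Qty': [1, 2, 3, 4, 5]})
# Assumed sample: 'ID03' appears twice with the same code
df2 = pd.DataFrame({'ID': ['IDA01', 'ID03', 'ID01', 'ID02', 'ID03', 'ID04', 'ID05'],
                    'code': ['ab', 'yz', 'acv', 'abc', 'yz', 'afd', 'x01']})

# Keep one row per ID and rename the column to match the desired output
codes = (df2[['ID', 'code']]
         .drop_duplicates(subset='ID')
         .rename(columns={'code': 'new_code'}))

# how='left' keeps every row of df1, in df1's order; unmatched IDs get NaN
df3 = df1.merge(codes, on='ID', how='left')
print(df3)
```

Selecting only the 'ID' and 'code' columns before merging also avoids pulling every other column of a wide dataframe 2 into the result.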