Pandas DataFrame: How to merge left with a second DataFrame on a combination of index and columns
I'm trying to merge two DataFrames.
I want to merge on two keys: one is the index of the second DataFrame and the other is one of its columns. The column and index names differ between the two DataFrames.
Example:
import pandas as pd

df2 = pd.DataFrame([(i, 'ABCDEFGHJKL'[j], i*2 + j)
                    for i in range(10)
                    for j in range(10)],
                   columns=['Index', 'Sub', 'Value']).set_index('Index')

df1 = pd.DataFrame([['SOMEKEY-A', 0, 'A', 'MORE'],
                    ['SOMEKEY-B', 4, 'C', 'MORE'],
                    ['SOMEKEY-C', 7, 'A', 'MORE'],
                    ['SOMEKEY-D', 5, 'Z', 'MORE']],
                   columns=['key', 'Ext. Index', 'Ext. Sub', 'Description']
                   ).set_index('key')
df1 prints out:
key        Ext. Index  Ext. Sub  Description
SOMEKEY-A  0           A         MORE
SOMEKEY-B  4           C         MORE
SOMEKEY-C  7           A         MORE
SOMEKEY-D  5           Z         MORE
The first lines of df2 are:
Index  Sub  Value
0      A    0
0      B    1
0      C    2
0      D    3
0      E    4
I want to match df1's "Ext. Index" and "Ext. Sub" columns against df2, where "Index" is the index and "Sub" is a column.
The expected result is:
key        Ext. Index  Ext. Sub  Description  Ext. Value
SOMEKEY-A  0           A         MORE         0
SOMEKEY-B  4           C         MORE         10
SOMEKEY-C  7           A         MORE         14
SOMEKEY-D  5           Z         MORE         None
Manually, the merge works like this:
def get_value(x):
    # Look up the df2 row whose index and 'Sub' both match this df1 row.
    try:
        return df2[(df2.Sub == x['Ext. Sub']) &
                   (df2.index == x['Ext. Index'])]['Value'].iloc[0]
    except IndexError:
        # No matching row in df2.
        return None

df1['Ext. Value'] = df1.apply(get_value, axis=1)
Can I do this with a pd.merge or pd.concat command, without changing df2 by turning df2.index into a column?
Try using:
df_new = (df1.merge(df2[['Sub', 'Value']],
                    how='left',
                    left_on=['Ext. Index', 'Ext. Sub'],
                    right_on=[df2.index, 'Sub'])
             .set_index(df1.index)
             .drop('Sub', axis=1))
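As a possible alternative (not part of the answer above), the same lookup can be sketched with DataFrame.join. The idea is to build a temporary Series indexed by both "Index" and "Sub", then join it to df1 on the two "Ext." columns. The names lookup and df_joined are made up for this illustration, and the sketch assumes each (Index, Sub) pair occurs at most once in df2, as it does in the sample data.

```python
import pandas as pd

# Same sample data as in the question.
df2 = pd.DataFrame(
    [(i, 'ABCDEFGHJKL'[j], i * 2 + j) for i in range(10) for j in range(10)],
    columns=['Index', 'Sub', 'Value'],
).set_index('Index')

df1 = pd.DataFrame(
    [['SOMEKEY-A', 0, 'A', 'MORE'],
     ['SOMEKEY-B', 4, 'C', 'MORE'],
     ['SOMEKEY-C', 7, 'A', 'MORE'],
     ['SOMEKEY-D', 5, 'Z', 'MORE']],
    columns=['key', 'Ext. Index', 'Ext. Sub', 'Description'],
).set_index('key')

# Hypothetical helper: a Series keyed by (Index, Sub).
# set_index(..., append=True) returns a new object, so df2 itself is unchanged.
lookup = df2.set_index('Sub', append=True)['Value'].rename('Ext. Value')

# join() is a left join by default; the two df1 columns listed in `on`
# are matched against the two levels of lookup's MultiIndex.
df_joined = df1.join(lookup, on=['Ext. Index', 'Ext. Sub'])
print(df_joined)
```

With this approach df1 keeps its "key" index and the new column is already named "Ext. Value", whereas the merge-based version leaves it named "Value". In both cases the row with no match in df2 (SOMEKEY-D) gets NaN rather than None, so the column is stored as floats.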