
Pandas DataFrame: How to merge left with a second DataFrame on a combination of index and columns

I'm trying to merge two DataFrames.

I want to merge on two keys: one that is the index of the second DataFrame and one that is a regular column of the second DataFrame. The column/index names differ between the two DataFrames.

Example:

import pandas as pd

df2 = pd.DataFrame([(i,'ABCDEFGHJKL'[j], i*2 + j) 
                    for i in range(10) 
                    for j in range(10)],
                    columns = ['Index','Sub','Value']).set_index('Index')

df1 = pd.DataFrame([['SOMEKEY-A',0,'A','MORE'],
                    ['SOMEKEY-B',4,'C','MORE'],
                    ['SOMEKEY-C',7,'A','MORE'],
                    ['SOMEKEY-D',5,'Z','MORE']
                   ], columns=['key', 'Ext. Index', 'Ext. Sub', 'Description']
                  ).set_index('key')

df1 prints out:

key        Ext. Index  Ext. Sub  Description
SOMEKEY-A  0           A         MORE
SOMEKEY-B  4           C         MORE
SOMEKEY-C  7           A         MORE
SOMEKEY-D  5           Z         MORE

The first lines of df2 are:

Index  Sub  Value
0      A    0
0      B    1
0      C    2
0      D    3
0      E    4

I want to merge "Ext. Index" and "Ext. Sub" from df1 with df2, where the index is "Index" and the column is "Sub".

The expected result is:

key        Ext. Index  Ext. Sub  Description  Ext. Value
SOMEKEY-A  0           A         MORE         0
SOMEKEY-B  4           C         MORE         10
SOMEKEY-C  7           A         MORE         14
SOMEKEY-D  5           Z         MORE         None

Manually, the merge works like this:

def get_value(x):
    try:
        return df2[(df2.Sub == x['Ext. Sub']) & 
                   (df2.index == x['Ext. Index'])]['Value'].iloc[0]
    except IndexError:
        return None

df1['Ext. Value'] = df1.apply(get_value, axis = 1)

Can I do this with a pd.merge or pd.concat command, without changing df2 by turning df2.index into a column?

Try using:

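# right_on accepts array-likes, so df2.index can be passed directly as a key
df_new = (df1.merge(df2[['Sub', 'Value']],
                    how='left',
                    left_on=['Ext. Index', 'Ext. Sub'],
                    right_on=[df2.index, 'Sub'])
          # restore 'key' as the index; assumes each row of df1 matches at most one row of df2
          .set_index(df1.index)
          .drop('Sub', axis=1)
          # match the column name used in the expected result
          .rename(columns={'Value': 'Ext. Value'}))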
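
An alternative that also leaves df2 untouched is to build a temporary lookup Series keyed on (Index, Sub) and join df1 against it. This is only a minimal sketch, not the method from the answer above: the name `lookup` is made up for illustration, and it assumes each (Index, Sub) pair occurs at most once in df2.

```python
import pandas as pd

# Same sample data as in the question
df2 = pd.DataFrame([(i, 'ABCDEFGHJKL'[j], i * 2 + j)
                    for i in range(10)
                    for j in range(10)],
                   columns=['Index', 'Sub', 'Value']).set_index('Index')

df1 = pd.DataFrame([['SOMEKEY-A', 0, 'A', 'MORE'],
                    ['SOMEKEY-B', 4, 'C', 'MORE'],
                    ['SOMEKEY-C', 7, 'A', 'MORE'],
                    ['SOMEKEY-D', 5, 'Z', 'MORE']],
                   columns=['key', 'Ext. Index', 'Ext. Sub', 'Description']
                   ).set_index('key')

# Temporary Series with a two-level index (Index, Sub).
# set_index returns a new object, so df2 itself is not modified.
lookup = df2.set_index('Sub', append=True)['Value'].rename('Ext. Value')

# DataFrame.join is a left join by default; `on` matches the two df1 columns
# against the two index levels of `lookup`, in order.
result = df1.join(lookup, on=['Ext. Index', 'Ext. Sub'])

print(result)
```

DataFrame.join keeps the 'key' index of df1, so no separate set_index step is needed. The row with no match (SOMEKEY-D) gets NaN rather than None, which turns 'Ext. Value' into a float column.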

