
Join two DataFrames by index and columns

I'm trying to join two DataFrames by index. They can have columns in common, and I only want to take a value from the second one if that specific value is NaN or doesn't exist in the first. I'm using the pandas example, so I've got:

import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                    index=[0, 1, 2, 3])

which displays as

    A   B   C   D
0  A0  B0  C0  D0
1  A1  B1  C1  D1
2  A2  B2  C2  D2
3  A3  B3  C3  D3

and

df4 = pd.DataFrame({'B': ['B2p', 'B3p', 'B6p', 'B7p'],
                    'D': ['D2p', 'D3p', 'D6p', 'D7p'],
                    'F': ['F2p', 'F3p', 'F6p', 'F7p']},
                    index=[2, 3, 6, 7])

which displays as

    B    D    F
2  B2p  D2p  F2p
3  B3p  D3p  F3p
6  B6p  D6p  F6p
7  B7p  D7p  F7p

and the desired result is:

     A    B    C    D    F
0   A0   B0   C0   D0  NaN
1   A1   B1   C1   D1  NaN
2   A2   B2   C2   D2  F2p
3   A3   B3   C3   D3  F3p
6  NaN  B6p  NaN  D6p  F6p
7  NaN  B7p  NaN  D7p  F7p

This is a good use case for combine_first: the row and column indices of the resulting DataFrame are the union of the two inputs, i.e. where one of the DataFrames lacks an index label, the value from the other is used (the same behaviour as if it contained a NaN):

df1.combine_first(df4)

    A    B    C    D    F
0   A0   B0   C0   D0  NaN
1   A1   B1   C1   D1  NaN
2   A2   B2   C2   D2  F2p
3   A3   B3   C3   D3  F3p
6  NaN  B6p  NaN  D6p  F6p
7  NaN  B7p  NaN  D7p  F7p
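
To make the precedence rule concrete, here is a minimal self-contained sketch that rebuilds the two frames from the question and checks a few cells. The names result, reversed_result, left and filled are only illustrative and are not part of the original post; the last part uses a small made-up frame to show the "value is NaN" case, which the example data above does not exercise.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])
df4 = pd.DataFrame({'B': ['B2p', 'B3p', 'B6p', 'B7p'],
                    'D': ['D2p', 'D3p', 'D6p', 'D7p'],
                    'F': ['F2p', 'F3p', 'F6p', 'F7p']},
                   index=[2, 3, 6, 7])

# The calling frame (df1) wins wherever it has a non-null value;
# df4 only fills cells that are missing or NaN in df1.
result = df1.combine_first(df4)
print(result)

# Swapping the order swaps the precedence: df4's values are kept
# for the overlapping rows 2 and 3.
reversed_result = df4.combine_first(df1)

# Hypothetical frame (not from the question) with an explicit NaN:
# the label 3 exists in `left`, but its value is null, so it is filled.
left = pd.DataFrame({'B': ['B2', np.nan]}, index=[2, 3])
filled = left.combine_first(df4[['B']])
print(filled)
```

If the opposite precedence is wanted (values from df4 preferred wherever it has them), calling df4.combine_first(df1) as in reversed_result is one way to get it.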
