How can I merge two DataFrames, df1 and df2, to get df3, which contains only the rows of df1 and df2 that have the same index (and the same values in the columns)?
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                    'B': ['B0', 'B1', 'B2', 'B7'],
                    'C': ['C0', 'C1', 'C2', 'C7'],
                    'D': ['D0', 'D1', 'D2', 'D7']},
                   index=[0, 1, 2, 7])
A second case, where df2 has no rows in common with df1:
df2 = pd.DataFrame({'A': ['A1'],
                    'B': ['B1'],
                    'C': ['C1'],
                    'D': ['D1']},
                   index=[1])
Expected df3 for the first df2:
Out[13]:
A B C D
0 A0 B0 C0 D0
2 A2 B2 C2 D2
7 A7 B7 C7 D7
Expected df3 for the second df2:
Empty DataFrame
Columns: [A, B, C, D]
Index: []
Just merge:
In[111]:
df1.merge(df2)
Out[111]:
A B C D
0 A0 B0 C0 D0
By default, merge joins on all columns the two DataFrames have in common and performs an inner join, so only rows where all values agree are kept.
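As a minimal sketch of that default, using the updated df1 and df2 from the question (the names common and explicit are only for illustration): calling merge with no arguments should behave like an inner join on every shared column, and the result gets a fresh 0..n-1 index rather than the original labels.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                    'B': ['B0', 'B1', 'B2', 'B7'],
                    'C': ['C0', 'C1', 'C2', 'C7'],
                    'D': ['D0', 'D1', 'D2', 'D7']},
                   index=[0, 1, 2, 7])

# No arguments: inner join on all shared column names (A, B, C, D).
common = df1.merge(df2)

# The same join with the defaults spelled out.
explicit = df1.merge(df2, how='inner', on=['A', 'B', 'C', 'D'])

print(common)
```

Note that the index plays no part in this comparison; only the column values are matched.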
Looking at the index matching requirement, I'd filter the df prior to the merge:
In[131]:
filtered = df1.reindex(df2.index).dropna()  # df1.loc[df2.index] raises KeyError on missing labels in pandas >= 1.0
filtered
Out[131]:
A B C D
1 A1 B1 C1 D1
and then merge:
In[132]:
filtered.merge(df2)
Out[132]:
A B C D
0 A0 B0 C0 D0
If the indices do not match at all, say the first row of df2 is 1 instead of 2:
In[133]:
filtered = df1.reindex(df2.index).dropna()  # df1.loc[df2.index] raises KeyError on missing labels in pandas >= 1.0
filtered
Out[133]:
A B C D
1 A1 B1 C1 D1
then merge will return an empty df because the index row value doesn't agree:
In[134]:
filtered.merge(df2)
Out[134]:
Empty DataFrame
Columns: [A, B, C, D]
Index: []
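The index-filtering step can also be sketched with Index.intersection, which only ever looks up labels that df1 actually has. This is an illustrative variant, not the code that produced the transcripts above; the names shared, df2_alt and no_overlap are assumptions made for the example, and the data is the updated df1 and df2 from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                    'B': ['B0', 'B1', 'B2', 'B7'],
                    'C': ['C0', 'C1', 'C2', 'C7'],
                    'D': ['D0', 'D1', 'D2', 'D7']},
                   index=[0, 1, 2, 7])

# Labels present in both frames.
shared = df1.index.intersection(df2.index)

# Keep only the rows of df1 whose label also exists in df2, then merge on values.
filtered = df1.loc[shared]
result = filtered.merge(df2)

# Second case from the question: df2 shares no labels with df1.
df2_alt = pd.DataFrame({'A': ['A1'], 'B': ['B1'], 'C': ['C1'], 'D': ['D1']},
                       index=[1])
no_overlap = df1.loc[df1.index.intersection(df2_alt.index)]  # empty frame
```

As with the plain merge, result carries a fresh integer index rather than the original labels.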
UPDATE
On your new dataset, merge will reset the index, which is the default behaviour:
In[152]:
filtered.merge(df2)
Out[152]:
A B C D
0 A0 B0 C0 D0
1 A2 B2 C2 D2
2 A7 B7 C7 D7
To retain the original indices, we can build a boolean mask with the equality operator and then call dropna. Wherever the values don't agree, the masked frame holds NaN, so dropna removes those rows. This should handle all cases:
In[153]:
filtered[filtered == df2.loc[filtered.index]].dropna()
Out[153]:
A B C D
0 A0 B0 C0 D0
2 A2 B2 C2 D2
7 A7 B7 C7 D7
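If you would rather stay with merge and still keep the labels, one possible sketch is to move the index into an ordinary column first, so that the inner join compares the labels along with A..D. This assumes the index is unnamed (reset_index then calls the new column 'index'); the names merged and df3 are chosen for the example, and the data is the updated df1 and df2 from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                    'B': ['B0', 'B1', 'B2', 'B7'],
                    'C': ['C0', 'C1', 'C2', 'C7'],
                    'D': ['D0', 'D1', 'D2', 'D7']},
                   index=[0, 1, 2, 7])

# reset_index() turns the unnamed index into a column called 'index',
# so the default inner merge now matches on index, A, B, C and D together.
merged = df1.reset_index().merge(df2.reset_index())

# Put the labels back as the index and drop the leftover 'index' name.
df3 = merged.set_index('index').rename_axis(None)

print(df3)
```

The trade-off is an extra round trip through reset_index and set_index, in exchange for not having to filter by index beforehand.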
If you are sure that the values are the same, you can do:
df1.loc[df1.index.to_series().isin(df2.index)]
There's no need to do a merge.
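A short sketch of that index-only selection on the updated data from the question. Index.isin can be called on the index directly, which should give the same boolean mask without the to_series step; the name same_index is just for illustration. Keep in mind that this checks labels only and never compares the column values.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                    'B': ['B0', 'B1', 'B2', 'B7'],
                    'C': ['C0', 'C1', 'C2', 'C7'],
                    'D': ['D0', 'D1', 'D2', 'D7']},
                   index=[0, 1, 2, 7])

# Boolean mask: True where a df1 label also appears in df2's index.
mask = df1.index.isin(df2.index)

# Rows of df1 with a matching label; the original labels are preserved.
same_index = df1.loc[mask]

print(same_index)
```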