Subsetting a dataframe by two columns from another dataframe
I have two datasets containing names. What is the easiest Pythonic way to subset df2 so that it contains only the rows whose (first name, last name) pair also appears in df1? Thank you.
import pandas as pd
names1 = {
'index' : [1, 2, 3],
'col1' : ['John', 'Jerry', 'John'],
'col2' : ['Doe', 'Peters', 'Smith']
}
names2 = {
'index' : [1, 2, 3, 4],
'col1' : ['John', 'Bob','Jerry', 'John'],
'col2' : ['Smith', 'Lacko', 'Peters', 'Nowak'],
'col3' : [12, 13, 14, 15]
}
df1 = pd.DataFrame(names1).set_index(["index"])
df2 = pd.DataFrame(names2).set_index(["index"])
print(df1,'\n')
print(df2)
        col1    col2
index
1       John     Doe
2      Jerry  Peters
3       John   Smith

        col1    col2  col3
index
1       John   Smith    12
2        Bob   Lacko    13
3      Jerry  Peters    14
4       John   Nowak    15
Desired output:
        col1    col2  col3
index
1       John   Smith    12
3      Jerry  Peters    14
Use reset_index before merge, and then set_index to restore the original index:
df = df2.reset_index().merge(df1).set_index('index')
print (df)
        col1    col2  col3
index
1       John   Smith    12
3      Jerry  Peters    14
The reset_index step is needed because merge on its own discards the original index values:
print (df2.merge(df1))
    col1    col2  col3
0   John   Smith    12
1  Jerry  Peters    14
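
For comparison, here is a minimal sketch of two variants on the same data. The first spells out the join keys and join type instead of relying on merge's default of joining on all shared column names. The second is a different technique, not part of the answer above: it builds a boolean mask with MultiIndex.isin, so df2's index is never touched. The names keys, merged, pairs_in_df1, mask and filtered are only illustrative.

```python
import pandas as pd

# Same data as in the question, with the index column named "index".
df1 = pd.DataFrame(
    {"col1": ["John", "Jerry", "John"], "col2": ["Doe", "Peters", "Smith"]},
    index=pd.Index([1, 2, 3], name="index"),
)
df2 = pd.DataFrame(
    {
        "col1": ["John", "Bob", "Jerry", "John"],
        "col2": ["Smith", "Lacko", "Peters", "Nowak"],
        "col3": [12, 13, 14, 15],
    },
    index=pd.Index([1, 2, 3, 4], name="index"),
)

keys = ["col1", "col2"]  # columns that together identify a person

# Variant 1: the reset_index / merge / set_index chain, with explicit keys.
merged = (
    df2.reset_index()
    .merge(df1[keys], on=keys, how="inner")
    .set_index("index")
)

# Variant 2: keep the rows of df2 whose (col1, col2) pair occurs in df1.
pairs_in_df1 = pd.MultiIndex.from_frame(df1[keys])
mask = pd.MultiIndex.from_frame(df2[keys]).isin(pairs_in_df1)
filtered = df2[mask]

print(merged)
print(filtered)
```

One difference to keep in mind: if df1 contained the same (col1, col2) pair more than once, the merge would repeat the matching df2 rows, whereas the mask keeps each df2 row at most once.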