
Subsetting dataframe by two columns from other dataframe

I have two datasets containing names. What is the easiest Pythonic way to subset df2 so that it contains only the rows whose (first name, last name) pair also appears in df1? Thank you.

import pandas as pd

names1 = {
    'index' : [1, 2, 3], 
    'col1'  : ['John', 'Jerry', 'John'],
    'col2'  : ['Doe', 'Peters', 'Smith']
}




names2 = {
    'index' : [1, 2, 3, 4], 
    'col1'  : ['John', 'Bob','Jerry', 'John'],
    'col2'  : ['Smith', 'Lacko', 'Peters', 'Nowak'],
    'col3'  : [12, 13, 14, 15]
}


df1 = pd.DataFrame(names1).set_index(["index"])
df2 = pd.DataFrame(names2).set_index(["index"])

print(df1,'\n')
print(df2)

        col1    col2
index               
1       John     Doe
2      Jerry  Peters
3       John   Smith 

        col1    col2  col3
index                     
1       John   Smith    12
2        Bob   Lacko    13
3      Jerry  Peters    14
4       John   Nowak    15

Desired output:

       col1   col2   col3
index                     
1      John   Smith    12
3      Jerry  Peters   14

Use reset_index before merge, then set_index afterwards:

# Move the index into a regular column so it survives the merge, then restore it.
df = df2.reset_index().merge(df1).set_index('index')
print(df)
        col1    col2  col3
index                     
1       John   Smith    12
3      Jerry  Peters    14
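
The same approach can be written with the join columns spelled out, which may be easier to read when the two frames share other column names. This is only a sketch under assumptions: the names keys and result are illustrative, and the drop_duplicates call is an addition (not part of the answer above) to keep a repeated name in df1 from multiplying rows of df2.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"col1": ["John", "Jerry", "John"], "col2": ["Doe", "Peters", "Smith"]},
    index=pd.Index([1, 2, 3], name="index"),
)
df2 = pd.DataFrame(
    {
        "col1": ["John", "Bob", "Jerry", "John"],
        "col2": ["Smith", "Lacko", "Peters", "Nowak"],
        "col3": [12, 13, 14, 15],
    },
    index=pd.Index([1, 2, 3, 4], name="index"),
)

# Illustrative name: the columns that identify a person.
keys = ["col1", "col2"]

result = (
    df2.reset_index()                      # keep df2's index as a regular column
       .merge(df1[keys].drop_duplicates(), # assumption: de-duplicate the lookup pairs
              on=keys, how="inner")        # keep only pairs present in both frames
       .set_index("index")                 # restore df2's original index
)
print(result)
```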

This is needed because merge on its own discards the original index values:

print(df2.merge(df1))
    col1    col2  col3
0   John   Smith    12
1  Jerry  Peters    14
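
If the goal is only to filter df2 and keep its index untouched, a boolean mask avoids the merge altogether. The sketch below is an alternative technique, not the method from the answer above: it builds (col1, col2) pairs with pd.MultiIndex.from_frame and tests membership with isin. The names keys, pairs_df1, pairs_df2, mask and subset are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"col1": ["John", "Jerry", "John"], "col2": ["Doe", "Peters", "Smith"]},
    index=pd.Index([1, 2, 3], name="index"),
)
df2 = pd.DataFrame(
    {
        "col1": ["John", "Bob", "Jerry", "John"],
        "col2": ["Smith", "Lacko", "Peters", "Nowak"],
        "col3": [12, 13, 14, 15],
    },
    index=pd.Index([1, 2, 3, 4], name="index"),
)

keys = ["col1", "col2"]

# Turn each row's (col1, col2) values into a tuple-like MultiIndex entry.
pairs_df1 = pd.MultiIndex.from_frame(df1[keys])
pairs_df2 = pd.MultiIndex.from_frame(df2[keys])

# True for each df2 row whose pair also occurs somewhere in df1.
mask = pairs_df2.isin(pairs_df1)

# Boolean indexing keeps df2's original index and column order.
subset = df2[mask]
print(subset)
```

Because this only filters rows, duplicate pairs in df1 cannot create extra rows in the result.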


 