
How to filter the data from two data frames with the repeated values in pandas?

I have two data frames like below:

import pandas as pd
import numpy as np

data = {'Name': ['Tom', 'Joseph', 'Krish', 'John', 'rack', 'rox', 'selena', 'jha'], 'Age': [20, 21, 18, 20, 30, 20, 18, 20]}
df = pd.DataFrame(data)

print(df)

# Output :
#       Name    Age
#   0   Tom     20
#   1   Joseph  21
#   2   Krish   18
#   3   John    20
#   4   rack    30
#   5   rox     20
#   6   selena  18
#   7   jha     20

data = {'Named': ['Raj', 'kir', 'cena','ang'], 'Age': [20, 21,18,30]}  
df1 = pd.DataFrame(data)  
    
print(df1)

# Output :    
#   Named Age
# 0 Raj   20
# 1 kir   21
# 2 cena  18
# 3 ang   30

Now I want to filter the Age column of df with the Age column of df1. The output should also include the duplicate values. I tried a simple filter, but it excludes the duplicates and only gives the unique values. How do I filter so that the duplicate values are included as well?

My code and output:

res = df1[df1['Age'].isin(df['Age'])]
   Named    Age
0   Raj     20
1   kir     21
2   cena    18
3   ang     30


Expected output:


    Named  Age
0   Raj    20
1   kir    21
2   cena   18
3   Raj    20
4   ang    30
5   Raj    20
6   cena   18
7   Raj    20

It looks like you want a right merge:

df1.merge(df[['Age']].dropna(), on='Age', how='right')

Output:

  Named  Age
0   Raj   20
1   kir   21
2  cena   18
3   Raj   20
4   ang   30
5   Raj   20
6  cena   18
7   Raj   20
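
For anyone who wants to run this end to end, here is a minimal self-contained sketch using the two frames from the question. The `isin` filter only builds a boolean mask over the four rows of df1, so it can never return more than four rows; the right merge instead produces one row per row of df. The second variant, a lookup with `Series.map`, is not part of the answer above; it is just an alternative that should give the same rows, assuming each age appears only once in df1. The names `merged`, `lookup` and `mapped` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'Joseph', 'Krish', 'John', 'rack', 'rox', 'selena', 'jha'],
    'Age': [20, 21, 18, 20, 30, 20, 18, 20],
})
df1 = pd.DataFrame({
    'Named': ['Raj', 'kir', 'cena', 'ang'],
    'Age': [20, 21, 18, 30],
})

# Right merge: keeps every row of df (the right frame), in df's order,
# and pulls in the matching 'Named' value from df1 for each age.
merged = df1.merge(df[['Age']], on='Age', how='right')

# Alternative sketch (assumes ages are unique in df1): build an
# Age -> Named lookup and map each age in df through it.
lookup = df1.set_index('Age')['Named']
mapped = pd.DataFrame({'Named': df['Age'].map(lookup), 'Age': df['Age']})

print(merged)
print(mapped)
```

If an age in df has no match in df1, both variants leave NaN in `Named` for that row rather than dropping it.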
