简体   繁体   中英

Selecting rows from a pandas dataframe based on the values of some columns in a list of the same dataframe?

Let's suppose, there is a dataframe:

df1 = 
   A  B   C
0  1  a  a1
1  2  b  b2
2  3  c  c3
3  4  d  d4
4  5  e  e5
5  6  f  f6

Created as:

a1 = [1,2,3,4,5,6]
a2 = ['a','b','c','d','e','f']
a3 = ['a1','b2','c3','d4','e5','f6']
df1 = pd.DataFrame(list(zip(a1,a2,a3)),columns=["A","B","C"])

Here, I am considering Columns A and B to be something like primary keys for this dataframe. So, PK = ["A","B"] .

I have another list, list1 = [[2,'b'],[5,'e']] , which is a subset of the dataframe df[PK] .

Is there any way I can get the rows corresponding to these primary key values inside the list from the dataframe df?

Something like: df1 = df[df[PK].values.isin(list1)] which doesn't work as I expect.

I would like to get an output df1 as:

df1 =
   A  B   C
1  2  b  b2
4  5  e  e5

There are some similar questions, which I have gone through in this portal. But none of them showed me how to select rows based on filter on multiple columns as mentioned above. Thanks in advance.

Here is how you can use pandas.DataFrame.merge() :

import pandas as pd
a1 = [1,2,3,4,5,6]
a2 = ['a','b','c','d','e','f']
a3 = ['a1','b2','c3','d4','e5','f6']

df1 = pd.DataFrame(list(zip(a1,a2,a3)),columns=["A","B","C"])

PK = ["A","B"]

list1 = [[2,'b'],[5,'e']]

df2 = df1.merge(pd.DataFrame(list1,columns=PK),on=PK)

print(df2)

Output:

   A  B   C
0  2  b  b2
1  5  e  e5

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM