简体   繁体   中英

Select rows from a DataFrame based on values in a MULTIPLE columns in pandas

So this question concerns how to select a subset of rows in a data frame based on values in an array (or a single column). It is not enough for me to solve my problem.

I have many different tables in multiple directories. I have a dictionary with relations between tables (eg keys for join). For each table T1, I lookup other tables (T2, T3...) that share same column names (keys) and I want to filter those tables (T2, T3...) to include rows that have matching key values in a set of columns with T1. Key set may vary! T1 may connect to T2 on one column (key) while T1 may connect with T2 on 5 keys! I do not know this beforehand.

So for example I have t1, t2, t3 and pks=["id"] (t1-->t2), fks=["id", "index", "zip"] (t1-->t3)

t1
id|index|zip|v
10|10000|200|20

t2
id|v
10|30
20|50
30|70

t3
id|index|zip|v
00|10000|200|10
10|10000|200|20
10|10000|300|30
10|10000|200|10

the output of t2 and t3 would be

t2
id|v
10|30

and t3

id|index|zip|v
10|10000|200|20
10|10000|200|10

Looking at the previous answer I would probably need to do smth like

filtered_t2 = t2.loc[t2[pks].isin(t1[fks])]

But i get the following error

ValueError: Cannot index with multidimensional key

Probably in this way I cannot handle compound key, but it also fails if I just provide one key -- 'id'! So maybe it cannot accept an array as values ...

How do I handle it when pks and fks are arrays of variable sizes?

Would this be a correct approach:

    filter = None
    for p, f in zip(pks, fks):
        if filter is None:
            filter = t2[p].isin(t1[f])
        else:
            filter &= t2[p].isin(t1[f])

    filtered_ft = t2.loc[filter]

Thanks!

Let us try merge here

t2.merge(t1,how='inner',on=['id'])

t3.merge(t1,how='inner',on=['id','index','zip'])

Do another way

t2[t2[pks].apply(tuple,1).isin(t1[pks].apply(tuple,1))]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM