Merge columns from several dataframes with specific values with Pandas

I have 7 dataframes that contain only "OK" and "KO" values, and the only column that connects them is the ID.

df1:
ID, Name, Address, Email
1, OK, OK, OK
2, OK, KO, OK
3, OK, OK, KO

df2:
ID, Job, Credit_Card, Driving_License_Number
1, OK, OK, OK
2, KO, KO, OK
3, OK, OK, OK

I'm trying to find a way to query or merge all the "KO" values into a single CSV file / DataFrame so I can easily check which columns failed the test.

Something like this:

ID_2, ID_3
Address, Email
Job
Credit_Card

So, with this I know that ID_2 is missing the Address, Job and Credit Card information and ID_3 is missing the Email.

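Let's merge them on ID first, then do a matrix multiplication. Multiplying the boolean "KO" mask by the column names keeps a name where the cell is True and yields an empty string where it is False, and the sum across each row concatenates the kept names: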

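# Join the frames on ID and use ID as the index
merged = df1.merge(df2, on='ID').set_index('ID')

# Boolean mask of 'KO' cells times the column names, then drop the trailing ', '
(merged.eq('KO') @ (merged.columns + ', ')).str[:-2]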

Output:

ID
1                             
2    Address, Job, Credit_Card
3                        Email
dtype: object
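
Since the question mentions seven dataframes while the answer merges only two, here is a minimal, self-contained sketch of one way to extend the idea. It assumes every frame shares the same set of IDs and that column names do not collide across frames. The list `frames` and the helper `failed_columns` are names made up for this illustration, and the row-wise `apply` is swapped in for the matrix multiplication purely as an alternative; it should give the same strings for this sample data.

```python
from functools import reduce

import pandas as pd

# Sample frames copied from the question; the real data has seven of these.
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["OK", "OK", "OK"],
    "Address": ["OK", "KO", "OK"],
    "Email": ["OK", "OK", "KO"],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Job": ["OK", "KO", "OK"],
    "Credit_Card": ["OK", "KO", "OK"],
    "Driving_License_Number": ["OK", "OK", "OK"],
})

# Assumption: the remaining frames would be appended to this list.
frames = [df1, df2]

# Merge all frames pairwise on ID, then use ID as the index.
merged = reduce(lambda left, right: left.merge(right, on="ID"), frames)
merged = merged.set_index("ID")


def failed_columns(row):
    """Return the names of the columns holding 'KO' in this row, comma-separated."""
    return ", ".join(row[row.eq("KO")].index)


# One string per ID; an empty string means nothing failed for that ID.
failed = merged.apply(failed_columns, axis=1)

# Keep only the IDs that have at least one failing column.
only_failures = failed[failed.ne("")]
print(only_failures)
```

To get the single CSV file the question asks for, `only_failures.to_csv(...)` with a path of your choice should be enough.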
