Compare Multiple DataFrames Add New Column Fill With Binary Values For Matches
Let's say I have 2 dataframes: one merged dataframe of all instances, and another with only the unique instances of the column 'id'.
df1 looks something like this:
| id | category_name |
| --- | --- |
| 459291 | c1 |
| 349532 | c1 |
| 459291 | c2 |
| 719300 | c1 |
| 349532 | c3 |
| 459291 | c4 |
| 649202 | c2 |
| 459291 | c5 |
df2 looks something like this:
| id | category_name |
| --- | --- |
| 459291 | c1 |
| 349532 | c1 |
| 719300 | c1 |
| 649202 | c2 |
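(For context, a minimal sketch of how these two frames relate, assuming df2 is simply df1 with duplicate ids dropped, keeping the first occurrence; the variable names are from the post:)
import pandas as pd

df1 = pd.DataFrame({'id': [459291, 349532, 459291, 719300, 349532, 459291, 649202, 459291],
                    'category_name': ['c1', 'c1', 'c2', 'c1', 'c3', 'c4', 'c2', 'c5']})

# Keeping only the first row per id reproduces the df2 table above
df2 = df1.drop_duplicates(subset='id')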
What I want to do is create a new column on df2 for each value in the column 'category_name', and output a 1 or 0 depending on whether that unique 'id' has a matching 'category_name'. I would then drop the column 'category_name'. So, the expected output I'm looking for would be something like this:
| id | c1 | c2 | c3 | c4 |
| --- | --- | --- | --- | --- |
| 459291 | 1 | 1 | 1 | 1 |
| 349532 | 1 | 1 | 0 | 0 |
| 719300 | 1 | 0 | 0 | 0 |
| 649202 | 0 | 1 | 0 | 0 |
I feel like this could possibly be done using just the merged dataframe as well, but I'm not sure how I would drop the duplicates while keeping the new column values for each unique id. Any help is greatly appreciated!
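(A minimal sketch of that merged-dataframe route, not from the original post: one-hot encode 'category_name' with pd.get_dummies, then collapse the duplicate ids by taking the per-id maximum, so any id that ever had a category gets a 1:)
import pandas as pd

df1 = pd.DataFrame({'id': [459291, 349532, 459291, 719300, 349532, 459291, 649202, 459291],
                    'category_name': ['c1', 'c1', 'c2', 'c1', 'c3', 'c4', 'c2', 'c5']})

# One row per instance, with a 0/1 indicator column per category
dummies = pd.get_dummies(df1['category_name'])

# max() per id collapses the duplicates: 1 if the id ever had that category
result = dummies.groupby(df1['id']).max().astype(int).reset_index()
print(result)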
This is a way to do it with pivot_table(), though for some reason I can't get around having to add the aux column:
import pandas as pd

df = pd.DataFrame({'id': [459291, 349532, 459291, 719300, 349532, 459291, 649202, 459291],
                   'playlist': ['new', 'new', 'top', 'new', 'top', 'old', 'top', 'workout']})

# Helper column so aggfunc='count' has something to count per (id, playlist) pair
df['aux'] = 1

# Pivot to one column per playlist; pairs that never occur become NaN, so fill with 0
new_df = pd.pivot_table(df, index='id', columns=['playlist'],
                        aggfunc='count', values='aux').fillna(0).astype(int)
print(new_df)
Output:
playlist new old top workout
id
349532 1 0 1 0
459291 1 1 1 1
649202 0 0 1 0
719300 1 0 0 0
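(As a footnote, my own suggestion rather than part of the original answer: pd.crosstab produces the same table without the aux helper column, since it tabulates id/playlist co-occurrence counts directly:)
import pandas as pd

df = pd.DataFrame({'id': [459291, 349532, 459291, 719300, 349532, 459291, 649202, 459291],
                   'playlist': ['new', 'new', 'top', 'new', 'top', 'old', 'top', 'workout']})

# Counts of each (id, playlist) pair; no helper column needed
new_df = pd.crosstab(df['id'], df['playlist'])
print(new_df)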