How do you combine two data frames on two columns and add an extra boolean column showing whether the row appeared in the second data frame?
I have two CSV files: one with an ID, a year and some other stats, and another with just the name and year. The first is a collection of all users, years and the corresponding stats; the second lists the winners of an award and the year they won it.
I'm trying to combine the two so that the first data set gains an extra column holding a boolean value that shows whether the user won the award that year.
E.g. [embiijo01,2018-19,...] and [embiijo01,2018-19] would turn into [embiijo01,2018-19,...,1]
# 'ID' and 'Year' are placeholders for whatever your identifier and year columns are called
pd.merge(csv_1, csv_2, on=['ID', 'Year'], how='left', indicator=True)
After the merge, you should be able to use a simple conditional check to add the boolean column.
https://towardsdatascience.com/why-and-how-to-use-merge-with-pandas-in-python-548600f7e738
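As a minimal sketch of the merge-then-flag idea described above: the column names "ID", "Year", "Stat" and "Won", the ID "playerx01" and the numbers are made up for illustration and are not from the original files. It assumes both files can be given the same key column names, and it uses the `indicator=True` option of `merge`, which adds a `_merge` column saying whether each row matched.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files (normally loaded with pd.read_csv).
# Column names and values are assumptions for illustration only.
stats = pd.DataFrame({
    "ID": ["embiijo01", "embiijo01", "playerx01"],
    "Year": ["2017-18", "2018-19", "2018-19"],
    "Stat": [1.0, 2.0, 3.0],
})
winners = pd.DataFrame({
    "ID": ["embiijo01"],
    "Year": ["2018-19"],
})

# Remove repeated (ID, Year) pairs so the left merge cannot duplicate rows of stats.
winners = winners.drop_duplicates(subset=["ID", "Year"])

# Left merge on BOTH keys; indicator=True adds a "_merge" column that is
# "both" when the (ID, Year) pair was found in winners, else "left_only".
merged = stats.merge(winners, on=["ID", "Year"], how="left", indicator=True)

# Turn the indicator into a 1/0 flag and drop the helper column.
merged["Won"] = (merged["_merge"] == "both").astype(int)
merged = merged.drop(columns="_merge")

print(merged)
```

If the winners file calls its identifier column something else (for example "Name"), one option is to rename it before merging, or to pass `left_on=` and `right_on=` instead of `on=`. Use `.astype(bool)` or skip the cast if you prefer True/False over 1/0.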