Is it possible to find common values in two dataframes using Python?
I have a dataframe df1 that lists the e-mails of people who downloaded a certain e-book, and another dataframe df2 that lists the e-mails of people who downloaded a different e-book.
I want to find the people who downloaded both e-books, that is, the common values between df1 and df2, using Python.
Is it possible to do that? How?
This has already been discussed. See the linked question:
Find the common values in columns in Pandas dataframe
Assuming the two data frames are df1 and df2, each with an email column, you can do the following:
intersected_df = pd.merge(df1, df2, how='inner')
This data frame will contain the rows whose emails are found in both df1 and df2. With no on argument, pd.merge joins on every column name the two frames share, so this works as written when email is the only shared column.
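As a minimal sketch of the merge approach above, the snippet below builds two small frames with made-up addresses. The sample data and the drop_duplicates step are assumptions added for illustration, not part of the original answer. It passes on="email" explicitly so the join does not depend on which other columns the frames happen to share.

```python
import pandas as pd

# Hypothetical sample data; only the column name "email" comes from the answer.
df1 = pd.DataFrame(
    {"email": ["ann@example.com", "bob@example.com", "bob@example.com", "cy@example.com"]}
)
df2 = pd.DataFrame(
    {"email": ["bob@example.com", "cy@example.com", "dee@example.com"]}
)

# Drop repeated emails first; otherwise every repeat produces an extra joined row.
df1_unique = df1.drop_duplicates(subset="email")
df2_unique = df2.drop_duplicates(subset="email")

# The inner join keeps only emails that appear in both frames.
intersected_df = pd.merge(df1_unique, df2_unique, how="inner", on="email")
print(intersected_df)
```

If the frames carry extra columns (a download date, say), the merged result keeps them too, with suffixes added where names collide.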
Dump the emails from df1 into a set, in order to avoid duplicates.
Dump the emails from df2 into a set, for the same reason.
Then take the intersection of the two sets:
set1 = set(df1.Emails)
set2 = set(df2.Emails)
common = set1.intersection(set2)
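The set-based steps above can be tried end to end with the sketch below. The sample addresses are invented, and the final isin filter is an optional extra (an assumption about what a reader might want next) for getting back the matching rows of df1 instead of a bare set.

```python
import pandas as pd

# Hypothetical sample data; the column name "Emails" follows the answer above.
df1 = pd.DataFrame(
    {"Emails": ["ann@example.com", "bob@example.com", "bob@example.com"]}
)
df2 = pd.DataFrame({"Emails": ["bob@example.com", "dee@example.com"]})

# Sets discard duplicates, so each address is counted once per frame.
set1 = set(df1["Emails"])
set2 = set(df2["Emails"])

# Addresses present in both frames.
common = set1 & set2  # same result as set1.intersection(set2)

# Optional: recover the rows of df1 whose address is in the intersection.
rows_in_both = df1[df1["Emails"].isin(common)]
print(common)
print(rows_in_both)
```

The set gives only the distinct addresses, while the isin filter keeps every matching row of df1, repeats included.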
I believe you should merge the two dataframes:
merged = pd.merge(df1, df2, how='inner', on=['e-mails'])
and then drop the NaN values:
merged.dropna(inplace=True)
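A small sketch of why the dropna step can matter, assuming some rows have a missing address (the sample data is invented). An inner join does not introduce NaN by itself, but pandas matches missing keys against each other, so a row with no e-mail in both frames can survive the merge.

```python
import pandas as pd

# Hypothetical sample data with one missing address in each frame.
df1 = pd.DataFrame({"e-mails": ["ann@example.com", None, "bob@example.com"]})
df2 = pd.DataFrame({"e-mails": [None, "ann@example.com", "dee@example.com"]})

# Inner join on the shared column; missing keys are matched with each other.
merged = pd.merge(df1, df2, how="inner", on=["e-mails"])

# Remove any rows where the address itself is missing.
merged = merged.dropna(subset=["e-mails"])
print(merged)
```

Reassigning the result of dropna, as here, behaves the same as inplace=True and makes it easier to chain further steps.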