
Is it possible to find common values in two dataframes using Python?

I have a dataframe df1 containing the e-mail addresses of people who downloaded a certain e-book, and another dataframe df2 containing the e-mail addresses of people who downloaded a different e-book.

I want to find the people who downloaded both e-books, that is, the values common to df1 and df2, using Python.

Is that possible? If so, how?

This has already been discussed. See the linked question:

Find the common values in columns in Pandas dataframe

Assuming both data frames, df1 and df2, have an email column, you can do the following:

intersected_df = pd.merge(df1, df2, how='inner')

The resulting data frame contains only the rows whose emails are found in both df1 and df2.
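
As a minimal sketch of the merge above, the example below builds two small frames and keeps only the shared addresses. The sample addresses are made up for illustration, and passing `on="email"` is an assumption that makes the join key explicit; without it, `pd.merge` joins on every column name the two frames share.

```python
import pandas as pd

# Hypothetical sample data; the column name "email" follows the answer above.
df1 = pd.DataFrame({"email": ["ana@example.com", "bo@example.com", "cy@example.com"]})
df2 = pd.DataFrame({"email": ["bo@example.com", "cy@example.com", "di@example.com"]})

# An inner merge keeps only the rows whose key appears in both frames.
intersected_df = pd.merge(df1, df2, how="inner", on="email")

print(intersected_df)
```

If either frame lists the same address more than once, the merge returns one row per matching pair, so calling `drop_duplicates()` on the result may be worthwhile.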

  1. Put the emails from df1 into a set, which removes duplicates.
  2. Put the emails from df2 into a set, for the same reason.
  3. Find the intersection of these two sets, like this:
```
set1 = set(df1.Emails)
set2 = set(df2.Emails)
common = set1.intersection(set2)
```
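
The three steps above can be sketched end to end as follows. The sample addresses are invented for illustration, and the final `isin` filter is an optional extra (not part of the original answer) for getting back the matching rows rather than a bare set.

```python
import pandas as pd

# Hypothetical sample data; "Emails" matches the column name used above.
df1 = pd.DataFrame({"Emails": ["ana@example.com", "bo@example.com", "bo@example.com"]})
df2 = pd.DataFrame({"Emails": ["bo@example.com", "cy@example.com"]})

# Steps 1 and 2: converting to sets discards duplicate addresses.
set1 = set(df1["Emails"])
set2 = set(df2["Emails"])

# Step 3: addresses present in both sets.
common = set1.intersection(set2)

# Optional: recover the rows of df1 whose address is in the common set.
# Duplicate rows in df1 are kept by this filter.
rows_in_both = df1[df1["Emails"].isin(common)]
```

A set is unordered, so this approach suits cases where only the addresses matter; the merge-based answers keep the other columns as well.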

I believe you should merge the two dataframes:

merged = pd.merge(df1, df2, how='inner', on=['e-mails'])

and then drop the NaN values:

merged.dropna(inplace=True)
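
A small sketch of why the `dropna` step can matter, assuming both frames may contain missing addresses: the pandas documentation notes that null keys on both sides of a merge are matched against each other, so a row with a missing e-mail could otherwise survive the inner merge. The sample data and the added `drop_duplicates` call are illustrative assumptions.

```python
import pandas as pd

# Hypothetical sample data with a missing address in each frame.
df1 = pd.DataFrame({"e-mails": ["ana@example.com", None, "bo@example.com"]})
df2 = pd.DataFrame({"e-mails": ["bo@example.com", None, "cy@example.com"]})

# Inner merge on the shared column.
merged = pd.merge(df1, df2, how="inner", on=["e-mails"])

# Drop rows with a missing key, then remove any repeated addresses.
merged = (
    merged.dropna(subset=["e-mails"])
    .drop_duplicates()
    .reset_index(drop=True)
)

print(merged)
```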


 