Compare the values of a column in one dataframe with a column in another dataframe

Question

I have 2 dataframes. df1 is

   DATE
2020-05-20
2020-05-21

and df2 is

ID    NAME    DATE
1     abc     2020-05-20
2     bcd     2020-05-20
3     ggg     2020-05-25
4     jhg     2020-05-26

I want to compare the values of df1 with df2. For example, take the first value of df1 (2020-05-20), find it in df2, and show the matching rows as a filtered subset.
My code is:

for index,row in df1.iterrows():
    x = row['DATE']
    if x == df2['DATE']:
        print('Found')
        new = df2[df2['DATE'] == x]
        print(new)
    else:
        print('Not Found')

But I am getting the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answer 1

x == df2['DATE'] is a pd.Series (of Booleans), not a single value. You have to reduce it to a single Boolean value before it can be evaluated in a condition.

You can use either .any() or .all(), depending on what you need. I assumed you need .any() here.

for index,row in df1.iterrows():
    x = row['DATE']
    if (x == df2['DATE']).any():
        print('Found')
        new = df2[df2['DATE'] == x]
        print(new)
    else:
        print('Not Found')
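
To see why the original condition fails, it can help to look at what the comparison actually returns. The following is a minimal sketch that rebuilds df2 from the question, assuming the dates are stored as plain strings (the question does not say whether they are strings or datetimes). The names mask, any_match, all_match and error_text are only illustrative.

```python
import pandas as pd

# df2 from the question; dates as strings is an assumption
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "NAME": ["abc", "bcd", "ggg", "jhg"],
    "DATE": ["2020-05-20", "2020-05-20", "2020-05-25", "2020-05-26"],
})

x = "2020-05-20"
mask = df2["DATE"] == x           # element-wise: one Boolean per row of df2

any_match = bool(mask.any())      # True if at least one row matches
all_match = bool(mask.all())      # True only if every row matches

# Using the Series directly as a condition raises the error from the question
try:
    if mask:
        pass
except ValueError as exc:
    error_text = str(exc)

print(mask.tolist(), any_match, all_match)
print(error_text)
```

Here .any() answers "does this date occur anywhere in df2?", which is what the loop needs, while .all() would only be true if every row of df2 had that date.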

Also see here for a pure pandas solution for this.
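
The link target is not preserved here. One loop-free option, which is not necessarily what the linked answer showed, is to filter with Series.isin. This sketch assumes the question's data with dates stored as strings; found and missing are illustrative names.

```python
import pandas as pd

# Data from the question; dates as strings is an assumption
df1 = pd.DataFrame({"DATE": ["2020-05-20", "2020-05-21"]})
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "NAME": ["abc", "bcd", "ggg", "jhg"],
    "DATE": ["2020-05-20", "2020-05-20", "2020-05-25", "2020-05-26"],
})

# Rows of df2 whose DATE appears anywhere in df1
found = df2[df2["DATE"].isin(df1["DATE"])]

# Dates of df1 that never appear in df2
missing = df1[~df1["DATE"].isin(df2["DATE"])]

print(found)
print(missing)
```

This gives all matching rows in one subset rather than one printout per date, so it fits best when the per-date 'Found' / 'Not Found' messages are not needed.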

Answer 2

You can create one extra column in df1 and use np.where to fill it.

import numpy as np
df1['Match'] = np.where(df1.DATE.isin(df2.DATE),'Found', 'Not Found')
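
For reference, here is the same idea as a self-contained run on the question's data, assuming string dates. It is only a sketch to show what the new column looks like.

```python
import numpy as np
import pandas as pd

# Data from the question; dates as strings is an assumption
df1 = pd.DataFrame({"DATE": ["2020-05-20", "2020-05-21"]})
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "NAME": ["abc", "bcd", "ggg", "jhg"],
    "DATE": ["2020-05-20", "2020-05-20", "2020-05-25", "2020-05-26"],
})

# isin gives one Boolean per df1 row; np.where maps it to a label
df1["Match"] = np.where(df1["DATE"].isin(df2["DATE"]), "Found", "Not Found")

print(df1)
```

Note that this labels the rows of df1 but does not return the matching rows of df2.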

Answer 3

This can also be done as a merge, which I think makes it a bit clearer, as it is only one line with no branching. You can also add the validate parameter to make sure that each key is unique in either the left or right dataset.

import pandas

df1 = pandas.DataFrame(['2020-05-20', '2020-05-21'], columns=['DATE'])
df2 = pandas.DataFrame({'NAME': ['abc', 'bcd', 'ggg', 'jhg'],
                        'DATE': ['2020-05-20', '2020-05-20', '2020-05-25', '2020-05-26']})

# Left join: every df1 date is kept; dates missing from df2 get NaN in NAME
df3 = df1.merge(right=df2, on='DATE', how='left')
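
The validate parameter mentioned above is not used in the snippet, so here is a hedged sketch of how it might look on the question's data (string dates assumed). It also adds indicator=True, which is not part of the original answer, to mark which df1 dates were found. The names merged and one_to_one_ok are illustrative.

```python
import pandas as pd

# Data from the question; dates as strings is an assumption
df1 = pd.DataFrame({"DATE": ["2020-05-20", "2020-05-21"]})
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "NAME": ["abc", "bcd", "ggg", "jhg"],
    "DATE": ["2020-05-20", "2020-05-20", "2020-05-25", "2020-05-26"],
})

# validate="one_to_many" checks that DATE is unique in df1 (the left side);
# indicator=True adds a _merge column: "both" or "left_only" for a left join
merged = df1.merge(df2, on="DATE", how="left",
                   validate="one_to_many", indicator=True)
print(merged)

# df2 repeats 2020-05-20, so a one-to-one check is expected to fail
one_to_one_ok = True
try:
    df1.merge(df2, on="DATE", how="left", validate="one_to_one")
except pd.errors.MergeError:
    one_to_one_ok = False
```

Rows where _merge is "left_only" correspond to the 'Not Found' case in the loop version.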
