Filter a dataframe by rows

Question

Hi, I am a beginner Python user and I need some help. I am trying to filter one dataframe against another.

Df1

 date          emp#   sku     transaction#   
 2017-01-01    10     200     399              
 2017-01-01    10     201     399             
 2017-01-01    10     202     399             
 2017-01-01    11     203     399             
 2017-01-01    11     200     399

Df2

 date          emp#   sku     transaction#
 2017-01-01    10     200     301
 2017-01-01    11     200     301

Desired Df1

 date          emp#   sku     transaction#
 2017-01-01    10     200     399
 2017-01-01    11     200     399

I know this can work with an inner join (on emp# and sku), but I would end up with unwanted extra columns. How can I do this as a filter?

Answer 1

Use merge and the on parameter:

Df1.merge(Df2, on=['date','emp#','sku'], suffixes=('','_y'))\
   .drop('transaction#_y', axis=1)

Output:

         date  emp#  sku  transaction#
0  2017-01-01    10  200           399
1  2017-01-01    11  200           399
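A variation on the merge above, as a minimal sketch: if only the key columns of Df2 take part in the merge, no suffixed columns are created and nothing has to be dropped afterwards. The frames are rebuilt here from the tables in the question; the names `keys` and `filtered` are just illustrative.

```python
import pandas as pd

# Sample frames reconstructed from the question.
df1 = pd.DataFrame({
    "date": ["2017-01-01"] * 5,
    "emp#": [10, 10, 10, 11, 11],
    "sku": [200, 201, 202, 203, 200],
    "transaction#": [399] * 5,
})
df2 = pd.DataFrame({
    "date": ["2017-01-01"] * 2,
    "emp#": [10, 11],
    "sku": [200, 200],
    "transaction#": [301] * 2,
})

keys = ["date", "emp#", "sku"]

# Keep only the key columns from df2 so the merge adds no extra columns;
# drop_duplicates guards against repeated keys multiplying rows of df1.
filtered = df1.merge(df2[keys].drop_duplicates(), on=keys, how="inner")
print(filtered)
```

Note that merge returns a fresh 0..n-1 index, so the original row labels of df1 are not preserved.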

Answer 2

Here is one way without pd.merge. The benefit of this method is that you don't have to juggle column names.

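# Index df2 by the key columns so the assignment below aligns on (emp#, sku)
df2 = df2.set_index(['emp#', 'sku'])
# Overwrite df2's transaction# with the matching values from df1
df2['transaction#'] = df1.set_index(['emp#', 'sku'])['transaction#']
df2 = df2.reset_index()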

#    emp#  sku        date  transaction#
# 0    10  200  2017-01-01           399
# 1    11  200  2017-01-01           399
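The same index-based idea can also be written as a boolean mask on df1, which leaves df2 untouched and keeps df1's column order and row labels. This is a sketch, not part of the original answer: it assumes the key columns are emp# and sku, and `mask` and `filtered` are illustrative names.

```python
import pandas as pd

# Sample frames reconstructed from the question.
df1 = pd.DataFrame({
    "date": ["2017-01-01"] * 5,
    "emp#": [10, 10, 10, 11, 11],
    "sku": [200, 201, 202, 203, 200],
    "transaction#": [399] * 5,
})
df2 = pd.DataFrame({
    "date": ["2017-01-01"] * 2,
    "emp#": [10, 11],
    "sku": [200, 200],
    "transaction#": [301] * 2,
})

keys = ["emp#", "sku"]

# Build a MultiIndex of (emp#, sku) pairs for each frame, then test,
# row by row, whether df1's pair also occurs in df2.
mask = df1.set_index(keys).index.isin(df2.set_index(keys).index)
filtered = df1[mask]
print(filtered)
```

Because this is plain boolean indexing, the surviving rows keep their original labels (0 and 4 for the sample data).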

Answer 3

You can filter against df2 by converting the desired columns into a dictionary, with orient set to list, and then checking whether the values exist using isin. Lastly, take the min of each row to ensure both conditions are met, i.e.

False + False = False
False + True = False
True + False = False
True + True = True

cols = ['emp#','sku']
df1[df1[cols].isin(df2[cols].to_dict(orient='list')).min(1)]

         date  emp#  sku  transaction#
0  2017-01-01    10  200           399
4  2017-01-01    11  200           399
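One thing worth keeping in mind with this approach: isin with a dictionary tests each column against its own list of values, independently of the other column. For the sample data in the question that gives the desired rows, but it is not the same as matching (emp#, sku) pairs. The sketch below uses small made-up frames (not the question's data) to show the difference; the tuple-based comparison is just one possible way to match pairs.

```python
import pandas as pd

# Hypothetical data chosen so the two checks disagree.
df1 = pd.DataFrame({"emp#": [10, 10, 11, 11], "sku": [200, 201, 200, 201]})
df2 = pd.DataFrame({"emp#": [10, 11], "sku": [200, 201]})
cols = ["emp#", "sku"]

# Column-wise check: emp# must be in [10, 11] AND sku must be in [200, 201].
columnwise = df1[df1[cols].isin(df2[cols].to_dict(orient="list")).all(axis=1)]

# Pair-wise check: the (emp#, sku) tuple itself must occur in df2.
pairs = set(df2[cols].itertuples(index=False, name=None))
pairwise = df1[[row in pairs for row in df1[cols].itertuples(index=False, name=None)]]

print(len(columnwise))          # all four rows pass the column-wise check
print(pairwise.index.tolist())  # only rows 0 and 3 are real pairs from df2
```

If the key combinations in df2 always cover every cross-combination of their values, the two checks agree; otherwise the pair-wise form is the stricter filter.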

Answer 4

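You need an inner join, which keeps only the rows that exist in both dataframes. Join on the key columns only; a bare df1.join(df2, how='inner') matches on the index and raises an error because the column names overlap:

# Inner join on the key columns; df2 contributes no extra columns
df1.merge(df2[['emp#', 'sku']], on=['emp#', 'sku'], how='inner')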

4 answers

Answer 1
Score 2, 2018-03-05 23:37:02

Answer 2
Score 1, accepted, 2018-03-06 00:37:00

Answer 3
Score 0, 2018-03-06 01:42:09

Answer 4
Score -1, 2018-03-05 23:27:13
