How to extract rows of a pandas dataframe according to conditions based on another dataframe

I have these two dataframes:

import pandas as pd

df1 = pd.DataFrame({'Points':[1,2,3,4,5], 'ColX':[9,8,7,6,5]})
df1
    Points  ColX
0        1     9
1        2     8
2        3     7
3        4     6
4        5     5

# ColY is irrelevant here; it only shows that df2 may contain columns beyond the ones this question needs
df2 = pd.DataFrame({'Points':[2,5], 'Sum':[-1,1], 'ColY':[2,4]})
df2
    Points  Sum  ColY
0        2   -1     2
1        5    1     4

I would like to get a dataframe with the rows of df1 where:

  • the values of column Points in df1 are also in the column Points of df2
  • the corresponding values of column Sum in df2 are between 0 and 2

Consequently, I would like to get this dataframe (the index does not matter):

    Points  ColX
4        5     5

I tried the following, but it didn't work:

df1[df1.merge(df2, on = 'Points')['Sum'] <= 2 and ['Sum']>=0] 

Could you please help me find the right code?

Try this:

df1[df1['Points'].isin(df2.query('0 <= Sum <= 2')['Points'])]

Output:

   Points  ColX
4       5     5

Explained:

  • df2.query('0 <= Sum <= 2') first filters df2 down to the valid records.
  • Boolean indexing with isin then keeps the df1 rows whose Points value appears in the Points column of the filtered df2.
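
The two steps above can be written out one at a time to make the intermediate objects visible. This is only an illustrative sketch using the dataframes from the question; the names valid, mask and result are introduced here and are not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'Points': [1, 2, 3, 4, 5], 'ColX': [9, 8, 7, 6, 5]})
df2 = pd.DataFrame({'Points': [2, 5], 'Sum': [-1, 1], 'ColY': [2, 4]})

# Step 1: keep only the df2 rows whose Sum lies in the closed range [0, 2]
valid = df2.query('0 <= Sum <= 2')

# Step 2: boolean mask over df1, True where Points appears in the filtered df2
mask = df1['Points'].isin(valid['Points'])

# Step 3: boolean indexing keeps the matching df1 rows and their original index
result = df1[mask]
print(result)
```

Because the filtering happens on df1 itself, the result keeps df1's columns and index labels unchanged.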

Use Series.between to build a boolean mask that filters df2, then pass the filtered Points values to Series.isin to build the mask for df1:

df = df1[df1['Points'].isin(df2.loc[df2['Sum'].between(0,2), 'Points'])]
print (df)
   Points  ColX
4       5     5
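
If the bounds need to vary, the same idea can be wrapped in a small function. The sketch below assumes pandas 1.3 or later, where Series.between accepts the string values 'both', 'neither', 'left' and 'right' for inclusive; rows_in_range is a hypothetical helper name, not something from the answer.

```python
import pandas as pd

df1 = pd.DataFrame({'Points': [1, 2, 3, 4, 5], 'ColX': [9, 8, 7, 6, 5]})
df2 = pd.DataFrame({'Points': [2, 5], 'Sum': [-1, 1], 'ColY': [2, 4]})

def rows_in_range(left, right, low, high, inclusive='both'):
    """Hypothetical helper: rows of `left` whose Points occur in `right`
    on a row where Sum falls between `low` and `high`."""
    in_range = right['Sum'].between(low, high, inclusive=inclusive)
    return left[left['Points'].isin(right.loc[in_range, 'Points'])]

# Closed interval [0, 1]: the row with Sum == 1 counts
closed = rows_in_range(df1, df2, 0, 1)

# Open interval (0, 1): Sum == 1 sits on the boundary and is excluded
open_ = rows_in_range(df1, df2, 0, 1, inclusive='neither')

print(closed)
print(open_)
```

By default between includes both endpoints, which matches the "between 0 and 2" requirement in the question.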

Your merge-based attempt can be made to work by filtering with DataFrame.query:

df = df1.merge(df2, on = 'Points').query('0<=Sum<=2')[df1.columns]
print (df)
   Points  ColX
1       5     5
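
Note that the merge version returns a fresh index (1 here instead of 4). It can also return more rows than the isin version when df2 contains repeated Points values. A minimal sketch of that difference follows; df2_dup is made-up data for illustration only and does not come from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'Points': [1, 2, 3, 4, 5], 'ColX': [9, 8, 7, 6, 5]})

# Hypothetical variant of df2 in which Points == 5 appears twice
df2_dup = pd.DataFrame({'Points': [2, 5, 5], 'Sum': [-1, 1, 2]})

# merge produces one output row per matching pair, so Points == 5 appears twice
merged = df1.merge(df2_dup, on='Points').query('0 <= Sum <= 2')[df1.columns]

# isin only tests membership, so each df1 row appears at most once
via_isin = df1[df1['Points'].isin(df2_dup.query('0 <= Sum <= 2')['Points'])]

# Dropping duplicates brings the merge result back to one row
deduped = merged.drop_duplicates()

print(len(merged), len(via_isin), len(deduped))
```

If df2 is known to have unique Points values, both approaches select the same rows.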

This also works, although the result keeps the Sum and ColY columns from df2:

df3 = df1.merge(df2, on='Points')
result = df3[(df3.Sum >= 0) & (df3.Sum <= 2)]
result

Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you repost this content, please credit this site's URL or the original address.
