
How to filter out rows of one Python pandas dataframe from another dataframe by comparing columns?

I'm trying to exclude rows from one dataframe that also occur in another dataframe:

import pandas

df = pandas.DataFrame({'A': ['Chr1', 'Chr1', 'Chr1','Chr1', 'Chr1', 'Chr1','Chr2','Chr2'], 'B': [10,20,30,40,50,60,15,20]})

errors = pandas.DataFrame({'A': ['Chr1', 'Chr1'], 'B': [20,50]})

As a result, the rows in df that are equal to a row in errors should be left out:

df:
'A'    'B'
Chr1    10
Chr1    30
Chr1    40
Chr1    60
Chr2    15
Chr2    20

It doesn't seem to work with df.merge, and I don't want to iterate over all rows, since the dataframes get pretty large.

Best,

David

Add an extra column to errors

errors['temp'] = 1

Merge the two dataframes

merged_df = pandas.merge(df,errors,how='outer')

Now keep only those rows where 'temp' is NaN, i.e. rows that had no match in errors

merged_df = merged_df[ merged_df['temp'] != 1 ]
del merged_df['temp']

print(merged_df)

      A   B
 0  Chr1  10
 2  Chr1  30
 3  Chr1  40
 5  Chr1  60
 6  Chr2  15
 7  Chr2  20
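
The same anti-join can probably be written without the helper column by using the `indicator` argument of `merge`. The following is a minimal sketch, assuming a pandas version recent enough to support `indicator=True` and `drop(columns=...)`; the names `marked` and `filtered` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': ['Chr1', 'Chr1', 'Chr1', 'Chr1', 'Chr1', 'Chr1', 'Chr2', 'Chr2'],
                   'B': [10, 20, 30, 40, 50, 60, 15, 20]})
errors = pd.DataFrame({'A': ['Chr1', 'Chr1'], 'B': [20, 50]})

# Left merge on both columns; the '_merge' column records where each row was found.
# drop_duplicates() keeps repeated rows in errors from multiplying rows of df.
marked = df.merge(errors.drop_duplicates(), on=['A', 'B'], how='left', indicator=True)

# Keep the rows that exist only in df, then drop the marker column.
filtered = marked[marked['_merge'] == 'left_only'].drop(columns='_merge')
print(filtered)
```

A left merge keeps the row order of df, so no re-sorting should be needed afterwards.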

You can do the following for the two columns:

 print(df[ ~df['A'].isin(errors['A']) | ~df['B'].isin(errors['B']) ])
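
One caveat: the expression above tests each column independently, so it gives the expected result for the example data but may drop rows whose A and B values each appear in errors, just not in the same row. A hedged sketch of a row-wise alternative is to compare (A, B) pairs through a `MultiIndex`. This assumes a pandas version that provides `MultiIndex.from_frame`; `errors2` and the helper `exclude_rows` are hypothetical, made up only to show the difference.

```python
import pandas as pd

df = pd.DataFrame({'A': ['Chr1', 'Chr1', 'Chr1', 'Chr1', 'Chr1', 'Chr1', 'Chr2', 'Chr2'],
                   'B': [10, 20, 30, 40, 50, 60, 15, 20]})
errors = pd.DataFrame({'A': ['Chr1', 'Chr1'], 'B': [20, 50]})

def exclude_rows(frame, unwanted, cols=('A', 'B')):
    """Return the rows of frame whose values in cols do not match any row of unwanted."""
    cols = list(cols)
    frame_keys = pd.MultiIndex.from_frame(frame[cols])
    unwanted_keys = pd.MultiIndex.from_frame(unwanted[cols])
    # isin on a MultiIndex compares whole (A, B) tuples, not single columns.
    return frame[~frame_keys.isin(unwanted_keys)]

filtered = exclude_rows(df, errors)
print(filtered)

# Hypothetical second error table: (Chr2, 20) is NOT listed here,
# but 'Chr2' and 20 each occur somewhere in it.
errors2 = pd.DataFrame({'A': ['Chr1', 'Chr2'], 'B': [20, 15]})
row_wise = exclude_rows(df, errors2)
column_wise = df[~df['A'].isin(errors2['A']) | ~df['B'].isin(errors2['B'])]
print(len(row_wise), len(column_wise))
```

With `errors2`, the row-wise version keeps the row (Chr2, 20), while the column-wise expression removes it.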

