Custom function to filter values in pandas DataFrame columns
I want to create a custom function to filter a pandas DataFrame.
def df_filter(df, first_elem, col1, col2, other_elems):
    '''
    df: main dataframe
    first_elem: first element to search
    col1: first column to search for the first element
    col2: second column to search for first element
    other_elems: list with other elements to search for
    '''
    first_flt = df.loc[(df[col1] == first_elem) | (df[col2] == first_elem)]
    second_flt = first_flt.loc[(first_flt[col1] == other_elems[0]) | (first_flt[col1] == other_elems[1])]
    return second_flt
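The first filter searches col1 and col2 for occurrences of the first element and selects those rows to create first_flt. This part works.
In the second filter, I want to search for more values, supplied in a list (other_elems), and filter again. The catch is that the number of items in this list can vary depending on what I pass in, for example other_elems = ['one', 'two', 'three']
or other_elems = ['one', 'two', 'three', 'four']
So this part has to be built according to the number of elements in other_elems: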
first_flt.loc[(first_flt[col1] == other_elems[0]) | (first_flt[col1] == other_elems[1])...]
Any ideas on how to do this?
If other_elems is an iterable, you can use the pandas isin method on the column.
In your example:
second_flt = first_flt.loc[first_flt[col1].isin(other_elems)]
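To make the isin suggestion concrete, here is a minimal sketch on made-up data. The DataFrame contents and the column names "a" and "b" are assumptions for illustration only. It also builds the chained == / | expression from the question dynamically, to show that both give the same mask for a list of any (non-zero) length.

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical sample data; column names and values are made up.
df = pd.DataFrame({
    "a": ["one", "two", "three", "four", "x"],
    "b": ["x", "x", "y", "x", "one"],
})

other_elems = ["one", "three", "four"]  # the list can have any length

# Series.isin returns a boolean mask: True where the value is in the list.
mask = df["a"].isin(other_elems)
second_flt = df.loc[mask]

# The same mask built the long way: OR together one comparison per element.
# (This form needs a non-empty list; isin also handles an empty one.)
manual_mask = reduce(operator.or_, [df["a"] == elem for elem in other_elems])
```

The isin form is usually easier to read, and it does not need to be rebuilt when the list grows or shrinks.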
You just want to build this as a single filter by combining the two separate filters:
def df_filter(df, first_elem, col1, col2, other_elems):
    '''
    df: main dataframe
    first_elem: first element to search
    col1: first column to search for the first element
    col2: second column to search for first element
    other_elems: list with other elements to search for
    '''
    # rows where col1 or col2 matches first_elem
    filt1 = (df[col1] == first_elem) | (df[col2] == first_elem)
    # rows where col1 matches other_elems[0] or other_elems[1]
    filt2 = (df[col1] == other_elems[0]) | (df[col1] == other_elems[1])
    # keep only rows that satisfy both filters
    filt_final = filt1 & filt2
    return df[filt_final]
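The two suggestions can be put together: combine the masks with &, and use isin for the second mask so the length of other_elems no longer matters. The sketch below is one possible way to do that, not necessarily what either answerer had in mind. The function name df_filter_isin and the sample data are hypothetical.

```python
import pandas as pd


def df_filter_isin(df, first_elem, col1, col2, other_elems):
    """Hypothetical variant of df_filter that accepts a list of any length."""
    # rows where col1 or col2 matches first_elem
    first_mask = (df[col1] == first_elem) | (df[col2] == first_elem)
    # rows where col1 matches any value in other_elems
    other_mask = df[col1].isin(other_elems)
    # keep only rows that satisfy both conditions
    return df[first_mask & other_mask]


# Made-up data for illustration only.
df = pd.DataFrame({
    "col1": ["one", "two", "three", "x", "four"],
    "col2": ["x", "x", "y", "one", "x"],
})

short_result = df_filter_isin(df, "x", "col1", "col2", ["one"])
long_result = df_filter_isin(df, "x", "col1", "col2", ["one", "two", "four"])
```

Because both masks are computed on the original df, no intermediate DataFrame is created, and the result has the same rows as filtering in two steps.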