custom function to filter values in pandas dataframe columns

I would like to create a custom function to filter a pandas dataframe.

def df_filter(df, first_elem, col1, col2, other_elems):
    '''
    df: main dataframe
    first_elem: first element to search
    col1: first column to search for the first element
    col2: second column to search for first element
    other_elems: list with other elements to search for
    '''
    first_flt = df.loc[(df[col1] == first_elem) | (df[col2] == first_elem)]
    second_flt = first_flt.loc[(first_flt[col1] == other_elems[0]) | (first_flt[col1] == other_elems[1])] 
    return second_flt

The first filter searches col1 and col2 for first_elem and keeps the matching rows as first_flt. This part works.

In the second filter I would like to search for more values, provided in a list (other_elems), and filter again. The crucial point is that the number of items in this list varies depending on what I pass in, e.g. other_elems = ['one', 'two', 'three'] or other_elems = ['one', 'two', 'three', 'four'].

Therefore this part has to be built based on the number of elements in other_elems:

first_flt.loc[(first_flt[col1] == other_elems[0]) | (first_flt[col1] == other_elems[1])...] 

Any ideas how to do this?

If other_elems is a list (or another list-like iterable), you can use the pandas isin method on the column (Series.isin).

In your example:

second_flt = first_flt.loc[first_flt[col1].isin(other_elems)]
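
Put together, the isin suggestion could look like the sketch below. The sample dataframe and its column names "a" and "b" are made up for illustration and are not from the question.

```python
import pandas as pd


def df_filter(df, first_elem, col1, col2, other_elems):
    """Keep rows where first_elem is in col1 or col2 and col1 is in other_elems."""
    # Step 1: rows where either column equals first_elem.
    first_flt = df.loc[(df[col1] == first_elem) | (df[col2] == first_elem)]
    # Step 2: isin accepts a list of any length, so no chained == comparisons.
    return first_flt.loc[first_flt[col1].isin(other_elems)]


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "a": ["one", "two", "x", "three", "four"],
    "b": ["x", "x", "one", "x", "y"],
})

result = df_filter(df, "x", "a", "b", ["one", "two", "three"])
print(result)
```

The same call works unchanged when other_elems has two, three, or four entries, because the list length never appears in the code.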

You can build a single filter by combining two individual boolean masks:

def df_filter(df, first_elem, col1, col2, other_elems):
    '''
    df: main dataframe
    first_elem: first element to search
    col1: first column to search for the first element
    col2: second column to search for first element
    other_elems: list with other elements to search for
    '''
    filt1 = (df[col1] == first_elem) | (df[col2] == first_elem) # rows where col1 or col2 match first_elem
    filt2 = (df[col1] == other_elems[0]) | (df[col1] == other_elems[1]) # rows where col1 equals other_elems[0] or other_elems[1]
    filt_final = filt1 & filt2
    return df[filt_final]
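
The function above hard-codes two entries of other_elems. One way to extend the same mask-combining idea to a list of any length is to build one equality mask per element and OR them together. This is only a sketch: df_filter_masks and the sample data are hypothetical names, and for plain equality checks the isin approach from the other answer is probably simpler.

```python
from functools import reduce
import operator

import pandas as pd


def df_filter_masks(df, first_elem, col1, col2, other_elems):
    """Hypothetical variant of df_filter that ORs one mask per element."""
    filt1 = (df[col1] == first_elem) | (df[col2] == first_elem)
    # One boolean Series per element of other_elems.
    masks = [df[col1] == elem for elem in other_elems]
    # Start from an all-False mask so an empty list selects no rows.
    all_false = pd.Series(False, index=df.index)
    filt2 = reduce(operator.or_, masks, all_false)
    return df[filt1 & filt2]


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "a": ["one", "two", "x", "three", "four"],
    "b": ["x", "x", "one", "x", "y"],
})

result = df_filter_masks(df, "x", "a", "b", ["one", "two", "three"])
print(result)
```

Building the masks in a list is mainly useful when each condition is more than an equality test, since any boolean Series can be added to the list before reducing.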
