
How do I create a function that will accept a pandas dataframe and remove rows containing a specific value?

I want to create a function that accepts a pandas DataFrame and a specific value (to_drop), and removes every row containing that value.

For example if I have this dataframe:

import pandas as pd
d = {'Name': ['John', 'Bill', 'Frank'], 'A': [1, 5, 7], 'B': [2, 0, 6], 'C': [3, 1, 9]}
df = pd.DataFrame(d)

If the value I choose is 0, the function should remove Bill's row and return the rows for John and Frank.

I am trying to use:

def drop_row(df, to_drop):
    new_df = df[df.column != to_drop]
    return new_df

This raises an AttributeError, which I expected, because this syntax only works when you select a specific column.

Thank you!

Use pandas.DataFrame.any or pandas.DataFrame.all along axis=1 on the condition:

>>> df[df.ne(0).all(axis=1)]
    Name  A  B  C
0   John  1  2  3
2  Frank  7  6  9

>>> df[~df.eq(0).any(axis=1)]
    Name  A  B  C
0   John  1  2  3
2  Frank  7  6  9

You can make a function out of this, but frankly, it's unnecessary:

>>> drop_row = lambda df: df[~df.eq(0).any(axis=1)]
>>> drop_row(df)
    Name  A  B  C
0   John  1  2  3
2  Frank  7  6  9

Here is how it works, step by step:

>>> df.ne(0) # items (n)ot (e)qual to 0:
   Name     A      B     C
0  True  True   True  True
1  True  True  False  True
2  True  True   True  True

>>> df.ne(0).all(axis=1)  # checks if all values along axis 1 are True
0     True
1    False
2     True
dtype: bool

>>> df[df.ne(0).all(axis=1)]  # returns only the rows where the mask is True (i.e. index 0 and 2)
    Name  A  B  C
0   John  1  2  3
2  Frank  7  6  9

You need to learn the tools to implement the logic you already have. The missing pieces are the any and all functions. Look up how to iterate over selected columns of a DataFrame, and put that into a list comprehension. Then apply all to the result, since a row should be kept only when every column differs from to_drop. The filtering syntax (as opposed to using the drop method) will look something like this pseudocode:

df[ all( [df.column != to_drop for column ...] ) ]

I'll leave the iteration syntax up to your research.
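For reference, here is one way the pseudocode above could be made runnable. This is only a sketch: Python's built-in all cannot combine a list of boolean Series (it raises a ValueError about ambiguous truth values), so this version swaps in numpy.logical_and.reduce to AND the per-column masks element-wise. The optional columns parameter is a hypothetical addition, not something from the question.

```python
import numpy as np
import pandas as pd

def drop_row(df, to_drop, columns=None):
    # `columns` is a hypothetical optional argument: the columns to check.
    # By default every column is checked.
    cols = df.columns if columns is None else columns
    # One boolean Series per column: True where the value differs from to_drop.
    masks = [df[col] != to_drop for col in cols]
    # Element-wise AND across the per-column masks (the built-in all() would
    # raise a ValueError here). Assumes at least one column is checked.
    keep = np.logical_and.reduce(masks)
    return df[keep]

d = {'Name': ['John', 'Bill', 'Frank'], 'A': [1, 5, 7], 'B': [2, 0, 6], 'C': [3, 1, 9]}
df = pd.DataFrame(d)

no_zero = drop_row(df, 0)                   # checks every column
no_one_in_a = drop_row(df, 1, columns=['A'])  # checks column A only
```

With the example data, drop_row(df, 0) keeps John and Frank, and restricting the check to column A with to_drop=1 keeps Bill and Frank.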

Define your function as:

def drop_row(df, to_drop):
    return df[~df.eq(to_drop).any(axis=1)]

Then if you call, e.g., drop_row(df, 1), you will get:

    Name  A  B  C
2  Frank  7  6  9

i.e. the rows with index 0 and 1, both of which contain 1 in some column, are dropped.
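A small self-contained check of the function above, using the DataFrame from the question. It is only a sketch; the variable names result and renumbered are made up for illustration. It shows that the filter returns a new DataFrame, leaves the original untouched, and keeps the original index labels unless you reset them.

```python
import pandas as pd

def drop_row(df, to_drop):
    # Keep only the rows in which no column equals to_drop.
    return df[~df.eq(to_drop).any(axis=1)]

d = {'Name': ['John', 'Bill', 'Frank'], 'A': [1, 5, 7], 'B': [2, 0, 6], 'C': [3, 1, 9]}
df = pd.DataFrame(d)

result = drop_row(df, 0)   # Bill's row (index 1) is removed
print(result)
print(len(df))             # the original df still has all of its rows

# The surviving rows keep their old index labels (0 and 2);
# reset_index(drop=True) renumbers them from 0 if that matters.
renumbered = result.reset_index(drop=True)
print(renumbered)
```

One caveat: eq never matches NaN, because NaN is not equal to itself, so dropping rows with missing values needs isna (or dropna) instead.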
