
Python Pandas: How can I drop rows using df.drop and df.loc?

Suppose I have the following DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'user': ['Adam', 'Barry', 'Cindy', 'Dirk', 'Ella'],
        'income': [50000, 0, 100000, 30000, 0],
        'net worth': [250000, 1000000, 2000000, 50000, 0]
    }
)


So far, I've been removing rows based on conditions using the following:

df2 = df[df.income != 0]


And using multiple conditions like so:

df3 = df[(df['income'] != 0) & (df['net worth'] > 100000)]


Question: Is this the preferred way to drop rows? If not, what is? Is it possible to do this via df.drop and df.loc? What would the syntax be?

.loc selects the subset of rows you want to keep, whereas .drop removes the rows you specify. drop needs the row labels (index values) of the rows to remove.
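For the .loc side of the question, here is a minimal sketch using the same DataFrame as in the question. The variable names `keep` and `df3_loc` are only illustrative; the idea is to build a boolean mask of the rows to keep and pass it to .loc:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'user': ['Adam', 'Barry', 'Cindy', 'Dirk', 'Ella'],
        'income': [50000, 0, 100000, 30000, 0],
        'net worth': [250000, 1000000, 2000000, 50000, 0],
    }
)

# Boolean mask: True for the rows we want to keep (illustrative name)
keep = (df['income'] != 0) & (df['net worth'] > 100000)

# .loc with a boolean mask returns only the rows where the mask is True
df3_loc = df.loc[keep]
print(df3_loc)
```

For a plain row filter like this, df.loc[keep] and df[keep] should select the same rows; .loc mainly becomes useful when you also want to pick columns in the same step, e.g. df.loc[keep, ['user', 'income']].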

The equivalent of your last filter with drop is:

>>> df.drop(df[~((df['income'] != 0) & (df['net worth'] > 100000))].index)

    user  income  net worth
0   Adam   50000     250000
2  Cindy  100000    2000000

# Or, more readably, with the condition negated via De Morgan's laws:
>>> df.drop(df[(df['income'] == 0) | (df['net worth'] <= 100000)].index)

    user  income  net worth
0   Adam   50000     250000
2  Cindy  100000    2000000
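As a quick sanity check, the three spellings can be compared directly. This is just a sketch on the question's data; the names `mask`, `via_indexing`, `via_loc` and `via_drop` are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'user': ['Adam', 'Barry', 'Cindy', 'Dirk', 'Ella'],
        'income': [50000, 0, 100000, 30000, 0],
        'net worth': [250000, 1000000, 2000000, 50000, 0],
    }
)

# Rows to keep (illustrative name)
mask = (df['income'] != 0) & (df['net worth'] > 100000)

via_indexing = df[mask]              # plain boolean indexing
via_loc = df.loc[mask]               # same mask through .loc
via_drop = df.drop(df[~mask].index)  # drop the labels of the unwanted rows

# All three should hold the same rows; drop returns a new DataFrame,
# so the original df still has all five rows
same = via_indexing.equals(via_loc) and via_loc.equals(via_drop)
print(same, len(df))
```

One thing to keep in mind with the drop version: it works on index labels, so if the index contains duplicate labels, dropping a label removes every row that carries it.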

Which syntax do you prefer?


 