How to delete rows from a pandas DataFrame using a list of indices

Question

Introduction

We have the following dataframe, which we create from a CSV file.

import pandas as pd

data = pd.read_csv(path + name, usecols=['QTS','DSTP','RSTP','DDATE','RDATE','DTIME','RTIME','DCXR','RCXR','FARE'])

I want to delete specific rows from the dataframe. For this purpose I used a list and appended the ids of the rows I want to delete.

indices = []  # collects the index labels of the rows to delete
for index, row in data.iterrows():
    if row['FARE'] >= 2500.00:
        indices.append(index)

From here I am lost. I don't know how to use the ids in the list to delete the rows from the dataframe.

Question

The list containing the row ids must be used in the dataframe to delete rows. Is it possible to do it?

Constraints

We can't use data.drop(index,inplace=True) because it really slows the process.
We cannot use a filter because I have some special constraints.

Answer 1

If you are trying to remove rows whose 'FARE' values are greater than or equal to 2500, you can use a mask that keeps the rows with values below 2500 -

df_out = df.loc[df.FARE.values < 2500] # Or df[df.FARE.values < 2500]
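As a minimal sketch of the mask approach, here it is on a tiny in-memory frame. The sample rows and the carrier codes are made up for illustration; only the FARE and DCXR column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "DCXR": ["AA", "BA", "LH", "AF"],
    "FARE": [1200.0, 2500.0, 3100.0, 800.0],
})

# Boolean NumPy array: True for the rows to keep (fare strictly below 2500).
keep = df["FARE"].values < 2500

# Selecting with the mask keeps the original index labels of the surviving rows.
df_out = df.loc[keep]
print(df_out)
```

Note that the surviving rows keep their original index labels, so a later lookup by label still works.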

For large datasets, we might want to work with underlying array data and then construct the output dataframe -

df_out = pd.DataFrame(df.values[df.FARE.values < 2500], columns=df.columns)
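One thing worth checking with the array-based route is what happens to dtypes and the index. The sketch below uses made-up sample data and assumes the frame mixes string and numeric columns, as the question's CSV appears to; in that case df.values is an object array, and the rebuilt frame gets a fresh 0..n-1 index.

```python
import pandas as pd

# Hypothetical sample data with mixed column types.
df = pd.DataFrame({
    "DCXR": ["AA", "BA", "LH", "AF"],
    "FARE": [1200.0, 2500.0, 3100.0, 800.0],
})

mask = df["FARE"].values < 2500

# Filter the underlying 2-D array, then rebuild a frame around it.
rebuilt = pd.DataFrame(df.values[mask], columns=df.columns)

# With mixed column types the array is object-typed, so FARE is no longer float here.
print(rebuilt.dtypes)

# infer_objects() asks pandas to re-infer better dtypes for object columns.
restored = rebuilt.infer_objects()
print(restored.dtypes)
```

If every column is numeric the dtype issue does not arise, but the original index labels are still lost, so the list of indices from the question would no longer line up with the result.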

To use those indices generated from the loopy code in the question -

df_out = df.loc[np.setdiff1d(df.index, indices)]

Or with masking again -

df_out = df.loc[~df.index.isin(indices)]  # or df[~df.index.isin(indices)]
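To see the index-list variants side by side, here is a small sketch on made-up data with non-default index labels. It also builds indices without iterrows(), and it includes a single DataFrame.drop call on the whole list. That last variant rests on an assumption: that the slowdown mentioned in the question came from calling drop once per row rather than once in total.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the index labels are arbitrary.
df = pd.DataFrame({"FARE": [1200.0, 2500.0, 3100.0, 800.0]},
                  index=[10, 11, 12, 13])

# Vectorized equivalent of the iterrows() loop from the question.
indices = df.index[df["FARE"] >= 2500].tolist()

# Variant 1: labels that are in the index but not in the list.
# np.setdiff1d returns sorted unique values, so row order follows sorted labels.
by_setdiff = df.loc[np.setdiff1d(df.index, indices)]

# Variant 2: boolean mask over the index; keeps the original row order.
by_isin = df[~df.index.isin(indices)]

# Variant 3: one drop call with the full list, returning a new frame.
by_drop = df.drop(indices)

print(by_isin)
```

On a sorted, unique index all three give the same frame. With an unsorted or duplicated index the setdiff1d variant can reorder or mishandle rows, so the isin mask is the safer default.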

Answer 2

How about filtering the data with the DataFrame.query() method:

cols = ['QTS','DSTP','RSTP','DDATE','RDATE','DTIME','RTIME','DCXR','RCXR','FARE']
df = pd.read_csv(path + name, usecols=cols).query("FARE < 2500")
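A self-contained sketch of the same chain, reading from an in-memory string instead of a file so it can be run as-is. The CSV content and the EXTRA column are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the file at path + name.
csv_text = (
    "QTS,FARE,EXTRA\n"
    "1,1200.0,x\n"
    "2,2500.0,y\n"
    "3,3100.0,z\n"
    "4,800.0,w\n"
)

cols = ["QTS", "FARE"]

# usecols limits the columns that are parsed; query() then keeps matching rows.
df = pd.read_csv(io.StringIO(csv_text), usecols=cols).query("FARE < 2500")
print(df)
```

Filtering right after the read means the rows with fares of 2500 or more are never kept in a named variable, which saves a separate deletion step.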

2 answers

Answer 1: 3, accepted, 2017-05-29 13:55:10
Answer 2: 0, 2017-05-29 15:08:44