Deleting rows of a pandas DataFrame based on the index (multiple criteria) (Python 3.5.1)

Question

Suppose I have a pandas DataFrame with a MultiIndex on the rows. How can I delete rows based on the values of one level of the index, using multiple criteria?

For example, suppose I have:

import pandas as pd

df = {'population': [100, 200, 300, 400, 500, 600, 700, 800]}
arrays = [['NJ', 'NJ', 'NY', 'NY', 'CA', 'CA', 'NV', 'NV'],
          ['A', 'B', None, 'D', 'E', 'F', None, 'G']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['state', 'county'])

df = pd.DataFrame(df, index=index)

                   population
state   county  
NJ        A          100
          B          200
NY        NaN        300
          D          400
CA        E          500
          F          600
NV        NaN        700
          G          800

I want to delete all rows where the county level of the index is NaN, and also the rows where it is equal to 'D' or 'G'. In other words, I want to end up with this DataFrame:

                   population
state   county  
NJ        A          100
          B          200
CA        E          500
          F          600

The following sort of works:

df = df.iloc[df.index.get_level_values('county') != 'D']
df = df.iloc[df.index.get_level_values('county') != 'G']

But the problem is that in my real use case there are several of these criteria. Also, I can't seem to find a way to delete NaNs using this method.

Thanks!

Answer 1

Call drop and pass a list with level='county' to drop the row labels that have those values on that index level (np here is numpy, imported as np):

In [284]:
df.drop(['D', 'G', np.nan], level='county')

Out[284]:
              population
state county            
NJ    A              100
      B              200
CA    E              500
      F              600
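
As a minimal sketch of the same idea, the string labels can be dropped with drop and the missing values removed in a separate step, which avoids depending on how a given pandas version matches NaN inside drop. The variable names trimmed and result are illustrative only and do not come from the answer.

```python
import pandas as pd

# Rebuild the example frame from the question.
arrays = [['NJ', 'NJ', 'NY', 'NY', 'CA', 'CA', 'NV', 'NV'],
          ['A', 'B', None, 'D', 'E', 'F', None, 'G']]
index = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=['state', 'county'])
df = pd.DataFrame({'population': [100, 200, 300, 400, 500, 600, 700, 800]},
                  index=index)

# Step 1: drop the rows whose county label is 'D' or 'G'.
trimmed = df.drop(['D', 'G'], level='county')

# Step 2: keep only the rows whose county label is not missing.
county = trimmed.index.get_level_values('county')
result = trimmed[county.notna()]

print(result)
```

Note that drop returns a new DataFrame, so the result has to be assigned if it should replace df.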

Answer 2

You could try using the inversion operator (~) with boolean indexing. For example:

import numpy as np
df[~(df.index.get_level_values('county').isin(['A', 'B', np.nan]))]

This line of code says "select from df where county is NOT in some list". The list here is only illustrative; for the question as asked it would be ['D', 'G', np.nan].
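
When there are many criteria, the boolean mask can be built once and inverted. The following is a sketch under stated assumptions: drop_by_level_values is a hypothetical helper (not a pandas function), and it tests for missing labels explicitly with isna() instead of putting np.nan in the list.

```python
import pandas as pd

def drop_by_level_values(frame, level, values, drop_missing=True):
    """Hypothetical helper: remove rows whose label on `level` is in `values`.

    If drop_missing is True, rows whose label on that level is NaN are
    removed as well.
    """
    labels = frame.index.get_level_values(level)
    mask = labels.isin(list(values))       # True where the label should go
    if drop_missing:
        mask = mask | labels.isna()        # also flag missing labels
    return frame[~mask]                    # keep everything not flagged

# Rebuild the example frame from the question.
arrays = [['NJ', 'NJ', 'NY', 'NY', 'CA', 'CA', 'NV', 'NV'],
          ['A', 'B', None, 'D', 'E', 'F', None, 'G']]
index = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=['state', 'county'])
df = pd.DataFrame({'population': [100, 200, 300, 400, 500, 600, 700, 800]},
                  index=index)

filtered = drop_by_level_values(df, 'county', ['D', 'G'])
print(filtered)
```

Adding another criterion then only means extending the list passed as values.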
