Delete rows pandas Dataframe based on index (multiple criteria) (Python 3.5.1)

Suppose I have a Pandas DataFrame with a MultiIndex on the rows. How can I delete rows based on the value of one level of the index, using multiple criteria?

For example, suppose I have

import pandas as pd

df = {'population': [100, 200, 300, 400, 500, 600, 700, 800]}
arrays = [['NJ', 'NJ', 'NY', 'NY', 'CA', 'CA', 'NV', 'NV'],
          ['A', 'B', None, 'D', 'E', 'F', None, 'G']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['state', 'county'])

df = pd.DataFrame(df, index=index)

                   population
state   county  
NJ        A          100
          B          200
NY        NaN        300
          D          400
CA        E          500
          F          600
NV        NaN        700
          G          800   

I want to delete all rows where the county level of the index is NaN, and also the rows where it is equal to 'D' or 'G'. In other words, I want to end up with a DataFrame

                   population
state   county  
NJ        A          100
          B          200
CA        E          500
          F          600  

The following sort of works:

df = df.iloc[df.index.get_level_values('county') != 'D']
df = df.iloc[df.index.get_level_values('county') != 'G']

But the problem is that in my real use case there are several of these criteria. Also, I can't seem to find a way to delete NaNs using this method.

Thanks!

Call drop and pass a list together with level='county' to drop the row labels that have those values on that index level:

In [284]:
import numpy as np  # use np.nan; the np.NaN alias was removed in NumPy 2.0
df.drop(['D', 'G', np.nan], level='county')

Out[284]:
              population
state county            
NJ    A              100
      B              200
CA    E              500
      F              600
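If your pandas version raises a KeyError when NaN is included in the label list (I am not sure which releases accept it), a possible workaround is to drop only the real labels with drop and remove the NaN rows with a separate mask. This is just a sketch that rebuilds the frame from the question; the names `labels_to_drop` and `result` are mine, not part of the question.

```python
import pandas as pd

# Rebuild the example frame from the question.
arrays = [['NJ', 'NJ', 'NY', 'NY', 'CA', 'CA', 'NV', 'NV'],
          ['A', 'B', None, 'D', 'E', 'F', None, 'G']]
index = pd.MultiIndex.from_arrays(arrays, names=['state', 'county'])
df = pd.DataFrame({'population': [100, 200, 300, 400, 500, 600, 700, 800]},
                  index=index)

# Step 1: drop the non-missing labels on the 'county' level.
labels_to_drop = ['D', 'G']  # hypothetical name; extend with your own criteria
result = df.drop(labels_to_drop, level='county')

# Step 2: keep only the rows whose 'county' value is not missing.
result = result[result.index.get_level_values('county').notna()]

print(result)
```

The two steps are independent, so either one can be used on its own if you only need to remove labels or only need to remove NaN.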

You could try using the inverse operator (~) with boolean indexing. For example,

import numpy as np
df[~(df.index.get_level_values('county').isin(['A', 'B', np.nan]))]

This line of code says "select from df where county is NOT in some list".
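Applied to the exact criteria in the question ('D', 'G' and NaN), the same idea might look like the sketch below. It handles NaN with an explicit isna() check rather than relying on isin to match NaN, because an ordinary comparison such as `!= np.nan` is always True and so can never filter NaN out. The names `unwanted`, `mask` and `result` are just illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
arrays = [['NJ', 'NJ', 'NY', 'NY', 'CA', 'CA', 'NV', 'NV'],
          ['A', 'B', None, 'D', 'E', 'F', None, 'G']]
index = pd.MultiIndex.from_arrays(arrays, names=['state', 'county'])
df = pd.DataFrame({'population': [100, 200, 300, 400, 500, 600, 700, 800]},
                  index=index)

county = df.index.get_level_values('county')

unwanted = ['D', 'G']  # hypothetical list; add as many criteria as you need

# True for every row that should be removed: a listed label or a missing value.
mask = county.isin(unwanted) | county.isna()

# ~ inverts the mask, so only the rows that are NOT flagged are kept.
result = df[~mask]

print(result)
```

Adding another criterion is then only a matter of appending to the list, which avoids chaining one filter line per value.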
