Querying Pandas DataFrame with column name that contains a space or using the drop method with a column name that contains a space

I am looking to use pandas to drop rows based on the column name (which contains a space) and the cell value. I have tried various ways to achieve this (the drop and query methods), but it seems I'm failing because of the space in the name. Is there a way to query the data using a name that has a space in it, or do I need to clean all spaces first?

Data, in the form of a CSV file:

Date,"price","Sale Item"
2012-06-11,1600.20,item1
2012-06-12,1610.02,item2
2012-06-13,1618.07,item3
2012-06-14,1624.40,item4
2012-06-15,1626.15,item5
2012-06-16,1626.15,item6
2012-06-17,1626.15,item7

Attempted examples:

df.drop(['Sale Item'] != 'Item1')
df.drop('Sale Item' != 'Item1')
df.drop("'Sale Item'] != 'Item1'")

df.query('Sale Item' != 'Item1')
df.query(['Sale Item'] != 'Item1')
df.query("'Sale Item'] != 'Item1'")

Error received in most cases:

ImportError: 'numexpr' not found. Cannot use engine='numexpr' for query/eval if 'numexpr' is not installed

If I understood your issue correctly, maybe you can just apply a filter like:

df = df[df['Sale Item'] != 'item1']

which returns:

         Date    price Sale Item
1  2012-06-12  1610.02     item2
2  2012-06-13  1618.07     item3
3  2012-06-14  1624.40     item4
4  2012-06-15  1626.15     item5
5  2012-06-16  1626.15     item6
6  2012-06-17  1626.15     item7
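
For anyone wanting to reproduce this, here is a minimal self-contained sketch. It assumes the CSV from the question is loaded from an in-memory string (the `csv_text` name and the use of `io.StringIO` are only for illustration). Bracket indexing takes the column name as an ordinary string, so the space should need no special handling.

```python
import io

import pandas as pd

# Sample data copied from the question; csv_text is an illustrative name.
csv_text = """Date,"price","Sale Item"
2012-06-11,1600.20,item1
2012-06-12,1610.02,item2
2012-06-13,1618.07,item3
2012-06-14,1624.40,item4
2012-06-15,1626.15,item5
2012-06-16,1626.15,item6
2012-06-17,1626.15,item7
"""

df = pd.read_csv(io.StringIO(csv_text))

# Build a boolean mask; the column name is a plain string key,
# so the embedded space is not a problem here.
mask = df["Sale Item"] != "item1"

# Keep only the rows where the mask is True.
filtered = df[mask]
print(filtered)
```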

As you can see from the documentation -

DataFrame.drop(labels, axis=0, level=None, inplace=False, errors='raise')

Return new object with labels in requested axis removed

DataFrame.drop() takes the index labels of the rows to drop, not a condition. Hence you would most probably need something like -

df.drop(df.loc[df['Sale Item'] != 'item1'].index)

Please note that this drops the rows that meet the condition, so the result is the rows that don't meet it. If you want the opposite, you can put the ~ operator before your condition to negate it.
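
To make the difference concrete, here is a small sketch on a trimmed-down version of the question's data (the three-row frame and the variable names `cond`, `kept_item1` and `without_item1` are made up for illustration):

```python
import pandas as pd

# A shortened, illustrative version of the data from the question.
df = pd.DataFrame({
    "price": [1600.20, 1610.02, 1618.07],
    "Sale Item": ["item1", "item2", "item3"],
})

cond = df["Sale Item"] != "item1"

# drop() removes the rows whose index labels are passed in,
# so only the rows that do NOT meet the condition remain.
kept_item1 = df.drop(df[cond].index)

# Negating the condition with ~ drops only the item1 row instead.
without_item1 = df.drop(df[~cond].index)

print(kept_item1)
print(without_item1)
```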

But this seems like overkill; it would be easier to just use Boolean indexing to get the rows you want (as indicated in the other answer).


Demo -

In [20]: df
Out[20]:
         Date    price Sale Item
0  2012-06-11  1600.20     item1
1  2012-06-12  1610.02     item2
2  2012-06-13  1618.07     item3
3  2012-06-14  1624.40     item4
4  2012-06-15  1626.15     item5
5  2012-06-16  1626.15     item6
6  2012-06-17  1626.15     item7

In [21]: df.drop(df.loc[df['Sale Item'] != 'item1'].index)
Out[21]:
         Date   price Sale Item
0  2012-06-11  1600.2     item1
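
As for the `query` route from the question: newer pandas releases (0.25 and later, as far as I know) accept a column name with spaces inside the query string if it is wrapped in backticks. A minimal sketch, assuming such a version and a shortened, illustrative frame; `engine="python"` is passed explicitly here only to sidestep the `numexpr` ImportError shown in the question.

```python
import pandas as pd

# A shortened, illustrative version of the data from the question.
df = pd.DataFrame({
    "Date": ["2012-06-11", "2012-06-12", "2012-06-13"],
    "price": [1600.20, 1610.02, 1618.07],
    "Sale Item": ["item1", "item2", "item3"],
})

# Backticks mark `Sale Item` as a single column name despite the space.
# engine="python" avoids requiring the optional numexpr package.
result = df.query("`Sale Item` != 'item1'", engine="python")
print(result)
```

On older versions without backtick support, Boolean indexing as shown in the answers remains the simpler option.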
