
Python Pandas - Query and boolean in dataframe columns

I have a DataFrame with several columns, and I want to query it based on several criteria.

My df (I don't know how to make the columns aligned on the topic):

Date        Type          IsInScope CostTable  Value
2017-04-01  CostEurMWh    True      Standard   0.22
2018-01-01  CostEurMWh    True      Standard   0.80
2019-01-01  CostEurMWh    True      Standard   1.72
2017-04-01  CostEurMWh    False     Standard   0.00

I have thousands of other rows with other Types and dates.

On the other hand, I have something I would like to price, and to do so I need to get the proper value based on some parameters.

I have a dict like this: {'ID' : 'Customer1', 'IsInScope' : True, 'CostTable' : 'Standard'}

I want to run a query like df.query('IsInScope' == True & 'CostTable' == 'Standard'), but when I do, I get an empty df. I think the problem comes from the way pandas handles booleans in query, having read this thread: How to use query function with bool in python pandas?

When I replace my 'IsInScope' values with strings like 'YES'/'NO' and query with 'YES' instead of True, it works perfectly, so I know the problem comes from the query part.

The only thing is that I don't know how to properly do my query in this example.

Should I convert my column to strings and not use booleans?

I've tried changing the dtype of the IsInScope column to bool, and it doesn't change anything.

The dtype of my 'IsInScope' column is bool.

I hope I've been clear

Thanks for your help

Regards,

Eric

We can solve your problem in several ways. I will show you two of them here:

  1. With Boolean indexing
  2. With query.

Note: since your IsInScope column has dtype bool, we can simplify the condition a bit, as follows:


1. Boolean indexing

df1 = df[df['IsInScope'] & (df['CostTable'] == 'Standard')]

Output

print(df1)
         Date        Type  IsInScope CostTable  Value
0  2017-04-01  CostEurMWh       True  Standard   0.22
1  2018-01-01  CostEurMWh       True  Standard   0.80
2  2019-01-01  CostEurMWh       True  Standard   1.72

2. DataFrame.query

df2 = df.query("IsInScope & CostTable == 'Standard'")

Output

print(df2)
         Date        Type  IsInScope CostTable  Value
0  2017-04-01  CostEurMWh       True  Standard   0.22
1  2018-01-01  CostEurMWh       True  Standard   0.80
2  2019-01-01  CostEurMWh       True  Standard   1.72
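
To try both approaches end to end, here is a minimal sketch that rebuilds the four sample rows from the question (the real data has thousands more rows, so this frame is only an assumption for illustration). Note that the whole condition is passed to query as a single string, with the column names unquoted inside it.

```python
import pandas as pd

# Sample rows copied from the question; the real data is much larger.
df = pd.DataFrame({
    "Date": ["2017-04-01", "2018-01-01", "2019-01-01", "2017-04-01"],
    "Type": ["CostEurMWh"] * 4,
    "IsInScope": [True, True, True, False],
    "CostTable": ["Standard"] * 4,
    "Value": [0.22, 0.80, 1.72, 0.00],
})

# 1. Boolean indexing: the bool column is already a valid mask.
df1 = df[df["IsInScope"] & (df["CostTable"] == "Standard")]

# 2. DataFrame.query: one string holding the whole expression.
df2 = df.query("IsInScope & CostTable == 'Standard'")

# Both approaches should select the same three rows.
print(df1.equals(df2))
```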

Note that we don't have to write IsInScope == True explicitly, because a boolean value can be used as a condition on its own:

x = [True, False]

for y in x:
    if y:
        print(y)

Output

True
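
Since the parameters in the question come from a dict, one possible way to feed them into query is to reference local variables with the @ prefix. This is only a sketch: the helper name rows_for and the variable names in_scope and cost_table are made up for illustration, and the sample frame again holds just the four rows shown in the question.

```python
import pandas as pd

# Sample rows copied from the question; the real data is much larger.
df = pd.DataFrame({
    "Date": ["2017-04-01", "2018-01-01", "2019-01-01", "2017-04-01"],
    "Type": ["CostEurMWh"] * 4,
    "IsInScope": [True, True, True, False],
    "CostTable": ["Standard"] * 4,
    "Value": [0.22, 0.80, 1.72, 0.00],
})

# Parameter dict as given in the question.
params = {"ID": "Customer1", "IsInScope": True, "CostTable": "Standard"}


def rows_for(frame, parameters):
    """Hypothetical helper: select rows matching the dict's scope and cost table."""
    in_scope = parameters["IsInScope"]
    cost_table = parameters["CostTable"]
    # '@name' makes query look up a local Python variable instead of a column.
    return frame.query("IsInScope == @in_scope and CostTable == @cost_table")


matching = rows_for(df, params)
print(matching)
```

Here the explicit == comparison is kept on purpose, so the same expression also works when the dict holds 'IsInScope': False.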

