
How to create a filter condition from a dataframe based on some criteria?

I need to create a filter condition from a dataframe based on some criteria, and I am stuck. Could you please help me solve this problem? Thanks in advance!

Example:

Df:

   0          1
0 kol_id    101152
1 jnj_id    7124166
2 thrc_cd   VIR
3 operator  <=
4 start_dt  05/10/2018

Using the above dataframe, I need to create the filter query below:

kol_id = '101152' and jnj_id = '7124166' and thrc_cd = 'VIR' and start_dt <= '05/10/2018'

If an operator row appears, its value is the operator to use in the next row's filter condition. Every other filter condition uses the '=' operator.

You can use masking and shifting in pandas.

Let's assume you have the following:

import pandas as pd

df = pd.DataFrame({'col1': ['kol', 'jnj', 'thrc', 'operator', 'start'], 'col2': [100, 200, 'VIR', '<=', '05/10/2018']})

In [10]: df                                                                                      
Out[10]: 
       col1        col2
0       kol         100
1       jnj         200
2      thrc         VIR
3  operator          <=
4     start  05/10/2018

If you mask every row where col1 is not operator, shift the result down by one row, and fill the gaps with the default operator, you get:

df['shifted'] = df.mask(df['col1']!='operator').shift()['col2'].fillna('==')

In [14]: df                                                                                     
Out[14]: 
       col1        col2 shifted
0       kol         100      ==
1       jnj         200      ==
2      thrc         VIR      ==
3  operator          <=      ==
4     start  05/10/2018      <=

Now you can get rid of the rows containing operator in col1.
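The answer does not show this step, so here is a minimal sketch of one way to do it. It assumes the same sample frame as above and uses `Series.where` in place of `DataFrame.mask` to build the `shifted` column; the `reset_index(drop=True)` call is there only so the index runs 0 to 3 again.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['kol', 'jnj', 'thrc', 'operator', 'start'],
    'col2': [100, 200, 'VIR', '<=', '05/10/2018'],
})

# Keep col2 only on 'operator' rows, move it down one row,
# and use '==' wherever no operator was carried over.
df['shifted'] = df['col2'].where(df['col1'] == 'operator').shift().fillna('==')

# Drop the 'operator' rows themselves and renumber the index.
df = df[df['col1'] != 'operator'].reset_index(drop=True)
```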

Then concatenate the columns to build your conditions:

In [17]: df['res'] = '(' + df['col1'] + df.shifted + df['col2'].astype(str) + ')'                

In [18]: df                                                                                      
Out[18]: 
       col1        col2 shifted                  res
0       kol         100      ==           (kol==100)
1       jnj         200      ==           (jnj==200)
2      thrc         VIR      ==          (thrc==VIR)
3     start  05/10/2018      <=  (start<=05/10/2018)

Finally, join the res column to get your conditions:

In [19]: ' and '.join(df.res)                                                                    
Out[19]: '(kol==100) and (jnj==200) and (thrc==VIR) and (start<=05/10/2018)'
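The result above uses `==` and leaves the values unquoted, while the question asks for `=` and single-quoted values. As a rough sketch, the same mask-and-shift idea can be wrapped in a helper that produces the string from the question. The name `build_filter` and its `default_op` parameter are made up for illustration, and the frame is assumed to have the names in its first column and the values in its second.

```python
import pandas as pd

def build_filter(df, default_op='='):
    """Build a filter string from a two-column (name, value) frame.

    A row named 'operator' is not a condition itself; its value replaces
    the default operator for the row that follows it.
    """
    names = df.iloc[:, 0].astype(str)
    values = df.iloc[:, 1].astype(str)
    is_op = names == 'operator'
    # Operator carried down from a preceding 'operator' row, default elsewhere.
    ops = values.where(is_op).shift().fillna(default_op)
    conditions = names + ' ' + ops + " '" + values + "'"
    # Skip the 'operator' rows when joining.
    return ' and '.join(conditions[~is_op])

df = pd.DataFrame([
    ['kol_id', '101152'],
    ['jnj_id', '7124166'],
    ['thrc_cd', 'VIR'],
    ['operator', '<='],
    ['start_dt', '05/10/2018'],
])
query = build_filter(df)
print(query)
```

The values are interpolated as-is, with no escaping, so this is only suitable for trusted input.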
