
How to create a filter condition from a dataframe based on some criteria?

I need to create a filter condition from a dataframe based on some criteria, and I am stuck. Could you please help me solve this problem? Thanks in advance!

Example:

Df:

   0          1
0 kol_id    101152
1 jnj_id    7124166
2 thrc_cd   VIR
3 operator  <=
4 start_dt  05/10/2018

Using the above dataframe, I need to create the following filter query:

kol_id = '101152' and jnj_id = '7124166' and thrc_cd = 'VIR' and start_dt <= '05/10/2018'

If a row named operator appears, its value is the operator to use in the filter condition built from the next row. Every other condition uses the '=' operator.

You can use masking and shifting in pandas.

Let's assume you have the following:

import pandas as pd
df = pd.DataFrame({'col1': ['kol', 'jnj', 'thrc', 'operator', 'start'], 'col2': [100, 200, 'VIR', '<=', '05/10/2018']})

In [10]: df                                                                                      
Out[10]: 
       col1        col2
0       kol         100
1       jnj         200
2      thrc         VIR
3  operator          <=
4     start  05/10/2018

Mask every row where col1 is not operator, shift the result down by one row, then fill the gaps with the default operator. Store the result in a new column named shifted:

df['shifted'] = df.mask(df['col1']!='operator').shift()['col2'].fillna('==')

In [14]: df                                                                                     
Out[14]: 
       col1        col2 shifted
0       kol         100      ==
1       jnj         200      ==
2      thrc         VIR      ==
3  operator          <=      ==
4     start  05/10/2018      <=
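
The same column can also be computed from col2 alone with Series.where. This is a minimal sketch of an equivalent formulation, not the answer's original code; the variable names is_operator and shifted are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['kol', 'jnj', 'thrc', 'operator', 'start'],
    'col2': [100, 200, 'VIR', '<=', '05/10/2018'],
})

# True only on the row that carries the operator.
is_operator = df['col1'].eq('operator')

# Keep col2 where col1 is 'operator' (NaN elsewhere), move it down one row
# so it lines up with the row it applies to, then fill in the default.
shifted = df['col2'].where(is_operator).shift().fillna('==')
print(shifted.tolist())
```

Working on a single Series avoids masking the whole dataframe only to select one column afterwards.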

Now you can get rid of the rows whose col1 value is operator, since the operator has already been copied onto the row it applies to.
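
The answer does not show this step, so here is one possible way to do it with a boolean filter. The reset_index call is an assumption, added so that the index runs 0 to 3 as in the output shown further down.

```python
import pandas as pd

# State of the dataframe after the shifted column has been added.
df = pd.DataFrame({
    'col1': ['kol', 'jnj', 'thrc', 'operator', 'start'],
    'col2': [100, 200, 'VIR', '<=', '05/10/2018'],
    'shifted': ['==', '==', '==', '==', '<='],
})

# Keep every row whose col1 is not the literal 'operator',
# then renumber the index so it is contiguous again.
df = df[df['col1'] != 'operator'].reset_index(drop=True)
print(df)
```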

Then concatenate the columns to build each condition:

In [17]: df['res'] = '(' + df['col1'] + df.shifted + df['col2'].astype(str) + ')'                

In [18]: df                                                                                      
Out[18]: 
       col1        col2 shifted                  res
0       kol         100      ==           (kol==100)
1       jnj         200      ==           (jnj==200)
2      thrc         VIR      ==          (thrc==VIR)
3     start  05/10/2018      <=  (start<=05/10/2018)

Finally, join the res column to get your conditions:

In [19]: ' and '.join(df.res)                                                                    
Out[19]: '(kol==100) and (jnj==200) and (thrc==VIR) and (start<=05/10/2018)'
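
Note that this result uses == and leaves the values unquoted, while the question asks for = with quoted values. The sketch below applies the same mask-and-shift idea to the question's own dataframe and produces that format. The helper name build_filter and its default_op parameter are hypothetical, and the code assumes every value should be wrapped in single quotes without any escaping.

```python
import pandas as pd

# The dataframe from the question: column 0 holds names, column 1 holds values.
df = pd.DataFrame([
    ['kol_id', '101152'],
    ['jnj_id', '7124166'],
    ['thrc_cd', 'VIR'],
    ['operator', '<='],
    ['start_dt', '05/10/2018'],
])

def build_filter(frame, default_op='='):
    """Hypothetical helper: turn name/value rows into a filter string."""
    is_operator = frame[0].eq('operator')
    # An operator row applies to the row right after it;
    # every other row falls back to default_op.
    ops = frame[1].where(is_operator).shift().fillna(default_op)
    rows = frame[~is_operator]
    # Series are aligned on the index, so each name meets its own operator.
    parts = rows[0] + ' ' + ops[~is_operator] + " '" + rows[1].astype(str) + "'"
    return ' and '.join(parts)

query = build_filter(df)
print(query)
# kol_id = '101152' and jnj_id = '7124166' and thrc_cd = 'VIR' and start_dt <= '05/10/2018'
```

If the values can contain quotes or come from untrusted input, building the query by string concatenation is not safe and a parameterized approach would be preferable.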
