Filter rows by criteria and select multiple columns from a dataframe with Python pandas

If I have the following dataframe subset:

       A   B   C   D   E   Date
   R0  xy  78  io  16  73  2021-03-25
   R1  xx  27  ya  80   1  2021-04-20
   R2  xx  53  ya  27  44  2021-06-20
   R3  xx  65  io  30  84  2021-08-22
   R4  xv   9  ui  62   1  2021-08-01

How can I use pandas to get the following dataframe:

       A   B   C   Date
   R1  xx  27  ya  2021-04-20
   R2  xx  53  ya  2021-06-20

I was thinking of selecting the columns first by doing:

sbset = subset[['A','B','C', 'Date' ]]

and then filtering where A = 'xx' and C = 'ya', but with a dataframe of 1 million observations and 127 variables it takes too long. Can I do both actions (filter by two or more variables and select several columns) in one step?

Another question: if the dataframe stores the dates as strings, how can I convert them to a date type?

Thanks for reading.

You just need boolean masking for this:

mask=(df['A']=='xx') & (df['C']=='ya')

Finally:

result=df[mask]

Now if you print result, you will get the rows you want.
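Note that df[mask] keeps every column. To filter the rows and pick the columns in one step, and to cover the date question as well, something like the following should work. This is only a sketch: the sample frame is rebuilt by hand from the table in the question, and it assumes the dates are strings in YYYY-MM-DD form.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (illustrative only)
df = pd.DataFrame(
    {
        "A": ["xy", "xx", "xx", "xx", "xv"],
        "B": [78, 27, 53, 65, 9],
        "C": ["io", "ya", "ya", "io", "ui"],
        "D": [16, 80, 27, 30, 62],
        "E": [73, 1, 44, 84, 1],
        "Date": ["2021-03-25", "2021-04-20", "2021-06-20",
                 "2021-08-22", "2021-08-01"],
    },
    index=["R0", "R1", "R2", "R3", "R4"],
)

# Boolean mask for the rows, list of labels for the columns
mask = (df["A"] == "xx") & (df["C"] == "ya")
cols = ["A", "B", "C", "Date"]

# .loc applies the row filter and the column selection in a single call;
# .copy() makes the result independent of df before it is modified
result = df.loc[mask, cols].copy()

# Convert the string dates to datetime64 (assumes YYYY-MM-DD strings)
result["Date"] = pd.to_datetime(result["Date"], format="%Y-%m-%d")

print(result)
print(result.dtypes)
```

On a large frame you could also run pd.to_datetime on the whole df['Date'] column once, before filtering, if you need real dates everywhere and not only in the filtered result.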
