Filter rows by criteria and select multiple columns from a dataframe with Python pandas
If I have the following dataframe `subset`:
     A   B   C   D   E        Date
R0  xy  78  io  16  73  2021-03-25
R1  xx  27  ya  80   1  2021-04-20
R2  xx  53  ya  27  44  2021-06-20
R3  xx  65  io  30  84  2021-08-22
R4  xv   9  ui  62   1  2021-08-01
How can I use pandas to get the following dataframe?
     A   B   C        Date
R1  xx  27  ya  2021-04-20
R2  xx  53  ya  2021-06-20
I was thinking of selecting the columns first by doing:
sbset = subset[['A', 'B', 'C', 'Date']]
and then filtering where A = 'xx' and C = 'ya', but with a dataframe of 1 million observations and 127 variables it takes too long. Can I do both actions (filter on two or more variables and select several columns) in one step?
Another question: if the dataframe stores the dates as strings, how can I convert them to a date type?
Thanks for reading.
You just need boolean masking for this (here `df` is your dataframe):
mask = (df['A'] == 'xx') & (df['C'] == 'ya')
Finally:
result = df[mask]
Now if you print `result` you will get the rows you want.
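The mask above filters rows but keeps all columns, and it leaves the date question open. A minimal sketch of one way to cover both, assuming the sample data from the question and that the dates are stored as `YYYY-MM-DD` strings: pass the mask and a column list to `df.loc` in a single indexing call, then convert the `Date` column with `pd.to_datetime`. The column list and the explicit `format` string are assumptions based on the sample shown, not something the original answer specifies.

```python
import pandas as pd

# Sample data copied from the question; `df` follows the naming used in the answer.
df = pd.DataFrame(
    {
        "A": ["xy", "xx", "xx", "xx", "xv"],
        "B": [78, 27, 53, 65, 9],
        "C": ["io", "ya", "ya", "io", "ui"],
        "D": [16, 80, 27, 30, 62],
        "E": [73, 1, 44, 84, 1],
        # Dates held as plain strings, as described in the question.
        "Date": ["2021-03-25", "2021-04-20", "2021-06-20", "2021-08-22", "2021-08-01"],
    },
    index=["R0", "R1", "R2", "R3", "R4"],
)

# Row filter: both conditions must hold.
mask = (df["A"] == "xx") & (df["C"] == "ya")

# Select rows and columns in one indexing call; .copy() gives an
# independent frame so the assignment below does not touch df.
result = df.loc[mask, ["A", "B", "C", "Date"]].copy()

# Convert the string dates to datetime64 (format assumed from the sample).
result["Date"] = pd.to_datetime(result["Date"], format="%Y-%m-%d")

print(result)
```

Whether this is fast enough on 1 million rows and 127 columns depends on your data; selecting only the needed columns in the same `.loc` call at least avoids building an intermediate frame with all columns.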