
Filtering column values based on another column condition

I have a dataset with multiple columns, two of which I want to use to filter the data as follows:

  1. In the Reason column, keep all rows whose value is A3.
  2. In the Reason column, keep all values (even null) for rows that have a date (obj) in Goods_Issue_Date_(GID).

*Feel free to assign a value to the nulls in Reason that have a date in GID.

and drop the other values, such as c, b, and so on.

I have used this code, which works fine for the A3 values:

Df = Data[Data["Reason"].isin(['A3'])]....?

import pandas as pd
df=pd.DataFrame({'Reason':{0: 'b',1: 'c',2: 'a3',3: ' ',4: ' ',5: 'a3',6: 'a3',7: ' ',8: 'b',9: ' ',}, 'Goods_Issue_Date_(GID)':{0: ' -1',1: '2 ',2: ' ',3: ' ',4: '2021-11-03T00:00:00',5: '2021-11-03T00:00:00',6: '',7: '',8: '0.5',9: '2021-11-03T00:00:00'}})

Reason  Goods_Issue_Date_(GID)
b       -1
c       2
a3
        2021-11-03T00:00:00
        2021-11-03T00:00:00
a3
a3
        0
b       0.5
        2021-11-03T00:00:00

This assumes your date strings always end in 00:00:

df[(df['Reason']=='a3') | (df['Goods_Issue_Date_(GID)'].str[-5:]=='00:00')]

  Reason Goods_Issue_Date_(GID)
2     a3
4           2021-11-03T00:00:00
5     a3    2021-11-03T00:00:00
6     a3
9           2021-11-03T00:00:00
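A suffix check like the one above would also accept any other string that happens to end in 00:00. A minimal sketch of a stricter test, assuming every date in the column looks like 2021-11-03T00:00:00 (the pattern and the names `iso_pattern`, `looks_like_date` and `filtered` are illustrative, not from the original post):

```python
import pandas as pd

df = pd.DataFrame({
    'Reason': ['b', 'c', 'a3', ' ', ' ', 'a3', 'a3', ' ', 'b', ' '],
    'Goods_Issue_Date_(GID)': [' -1', '2 ', ' ', ' ', '2021-11-03T00:00:00',
                               '2021-11-03T00:00:00', '', '', '0.5',
                               '2021-11-03T00:00:00'],
})
gid = 'Goods_Issue_Date_(GID)'

# Assumption: dates are always written as YYYY-MM-DDTHH:MM:SS.
iso_pattern = r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}'

# Strip stray whitespace, then require the whole string to match the pattern.
looks_like_date = df[gid].str.strip().str.fullmatch(iso_pattern, na=False)

# Keep rows where Reason is a3 OR the GID value looks like a date.
filtered = df[(df['Reason'] == 'a3') | looks_like_date]
print(filtered.index.tolist())  # [2, 4, 5, 6, 9]
```

On the sample data this keeps the same rows as the suffix check; it only differs when a non-date string happens to end in 00:00.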

Alternatively, keep the rows whose Goods_Issue_Date_(GID) value parses as a date:

df[(df['Reason']=='a3') | pd.to_datetime(df['Goods_Issue_Date_(GID)'], errors='coerce').notna()]
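
The question also invites assigning a value to the empty Reason entries that are kept only because of their date. A rough sketch of both steps together follows. It passes an explicit `format` to `pd.to_datetime`, which is an assumption about the data, so that loose strings such as '2 ' or '0.5' are not left to format inference. The placeholder label 'date_only' and the variable names are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    'Reason': ['b', 'c', 'a3', ' ', ' ', 'a3', 'a3', ' ', 'b', ' '],
    'Goods_Issue_Date_(GID)': [' -1', '2 ', ' ', ' ', '2021-11-03T00:00:00',
                               '2021-11-03T00:00:00', '', '', '0.5',
                               '2021-11-03T00:00:00'],
})
gid = 'Goods_Issue_Date_(GID)'

# Assumption: dates always follow this exact format; anything else becomes NaT.
parsed = pd.to_datetime(df[gid].str.strip(),
                        format='%Y-%m-%dT%H:%M:%S', errors='coerce')
has_date = parsed.notna()

# Compare Reason case-insensitively and ignore surrounding whitespace.
is_a3 = df['Reason'].str.strip().str.lower() == 'a3'

# Keep rows that satisfy either rule; copy so the fill below does not touch df.
result = df[is_a3 | has_date].copy()

# Give blank Reason entries (kept only for their date) a placeholder label.
blank_reason = result['Reason'].str.strip() == ''
result.loc[blank_reason, 'Reason'] = 'date_only'  # hypothetical label
print(result)
```

If the real column holds true nulls rather than blank strings, `result['Reason'].isna()` would be the condition to use in place of the blank-string test.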
