
How do I handle different date formats in a pandas column?

In one dataset, the date column is written in several different formats. They are not the usual numeric formats; the days of the week and the months are spelt out. Some rows have the month abbreviated, others have it spelt in full, which makes a simple pd.to_datetime(df, format) difficult. I thought about running a for loop in which I split each row by '-':

for x in df['Date']:
    if len(x.split('-')[1]) <= 6:
        ...  # body not given in the original question

But then I realized this wasn't a great condition. I am thinking the solution might be regex? What should I do?

[Image: a sample of the dataset]

You don't need to iterate; you can use .loc with a .str accessor split:

df.loc[df['Date'].str.split('-', expand=True)[1].str.len() <= 6, 'Date']

This selects the matching rows of the Date column. You can assign a new value to that selection if you like, or just read it.
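As a rough sketch of how this could play out end to end, the snippet below builds a small hypothetical frame, selects the short-month rows with a boolean mask, and then parses the whole column by trying two formats in turn. The sample values and the weekday-month-day-year layout are assumptions, since the original dataset sample is not shown, and the `%A`, `%b` and `%B` directives assume English day and month names. Trying both formats avoids relying on the length check, which cannot tell "May" or "August" from an abbreviation.

```python
import pandas as pd

# Hypothetical sample data; the real layout of the 'Date' column is an assumption.
df = pd.DataFrame({"Date": [
    "Monday-Jan-06-2020",
    "Tuesday-January-07-2020",
    "Friday-May-08-2020",
    "Saturday-Aug-01-2020",
]})

# The month is the second piece after splitting on '-'.
month = df["Date"].str.split("-", expand=True)[1]

# Rows whose month part has at most 6 characters (the condition from the question).
short_rows = df.loc[month.str.len() <= 6, "Date"]

# Parse with the abbreviated-month format, then with the full-month format.
# errors="coerce" turns rows that do not match a format into NaT.
parsed_short = pd.to_datetime(df["Date"], format="%A-%b-%d-%Y", errors="coerce")
parsed_full = pd.to_datetime(df["Date"], format="%A-%B-%d-%Y", errors="coerce")

# Keep the first successful parse for each row.
df["Parsed"] = parsed_short.fillna(parsed_full)
```

Any row that matches neither format stays as NaT in the hypothetical `Parsed` column, so `df[df["Parsed"].isna()]` is a quick way to find values that need another format.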
