
Converting multiple date formats into one format in Python

The date variable in my data is in multiple formats such as DD/MM/YYYY, D/MM/YY and DD/M/YYYY, for example: 12/8/2017, 27/08/17, 8/9/2017, 10/9/2017, 15/09/17.

I need to convert these formats into one single format, DD/MM/YYYY.

I tried to create a parsing function:

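from datetime import datetime as dt

def parse_date(date):
    # Empty strings become None; everything else is parsed as DD/MM/YY.
    if date == '':
        return None
    else:
        return dt.strptime(date, '%d/%m/%y').date()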

When I apply this function to my dataset, it throws the following error:

ValueError                                Traceback (most recent call last)
in ()
----> 1 data.Date = data.Date.apply(parse_date)

ValueError: unconverted data remains: 17

How can I solve the "unconverted data remains" error?

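You can use the dateutil module to do this. Pass dayfirst=True, because the data is day-first and the parser otherwise reads ambiguous dates such as 12/8/2017 as month-first:

import dateutil.parser as dparser
a = ["12/8/2017", "27/08/17", "8/9/2017", "10/9/2017", "15/09/17"]

for i in a:
    # dayfirst=True reads "12/8/2017" as 12 August rather than 8 December
    print(dparser.parse(i, dayfirst=True, fuzzy=True).date())

Result:

2017-08-12
2017-08-27
2017-09-08
2017-09-10
2017-09-15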
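
The dates above print in ISO form (YYYY-MM-DD), while the question asks for DD/MM/YYYY strings. A minimal sketch of that last step with strftime follows. The `parsed` list is hard-coded here as an assumption, standing in for values parsed day-first from the sample strings.

```python
from datetime import date

# Hard-coded stand-ins for parsed values (assumed day-first reading of the samples).
parsed = [date(2017, 8, 12), date(2017, 8, 27), date(2017, 9, 8)]

# %d and %m are zero-padded and %Y is the four-digit year, giving DD/MM/YYYY.
formatted = [d.strftime("%d/%m/%Y") for d in parsed]
print(formatted)
```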

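This is because %y expects a two-digit year. For a string like 12/8/2017 it consumes "20" as the year and leaves "17" unconverted, which is exactly what the error message reports. The directive for a four-digit year is %Y.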
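
One way to handle both year widths with only the standard library is to try each format in turn. The sketch below assumes the data contains only day-first dates with two- or four-digit years. `parse_date` keeps the question's name, while `FORMATS` is a name introduced here for illustration.

```python
from datetime import datetime

# Candidate formats, tried in order: %Y matches a four-digit year,
# %y a two-digit year. Both are day-first.
FORMATS = ("%d/%m/%Y", "%d/%m/%y")

def parse_date(text):
    """Return a date for a day-first string, or None for an empty string."""
    if text == "":
        return None
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f"unrecognised date format: {text!r}")

for raw in ["12/8/2017", "27/08/17"]:
    print(parse_date(raw))
```

The function can be applied to the column in the same way as in the question, and the resulting dates formatted with strftime("%d/%m/%Y") if strings are needed.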

To cover multiple date formats, you can have a look at the dateparser library (see its documentation).

Otherwise you will have to go through the possible formats manually or expand the dates yourself. If you are sure that only the year part needs expanding, you can do something like this before feeding the string to the parser:

date_parts = date.split('/')
if len(date_parts[2]) == 2:
    date_parts[2] = "20" + date_parts[2]
date = '/'.join(date_parts)
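
Wrapped as a function, the snippet above might look like the following sketch. `expand_year` is a hypothetical name, and the hard-coded "20" prefix assumes every two-digit year belongs to the 2000s.

```python
from datetime import datetime

def expand_year(text):
    """Prefix a two-digit year with "20" (assumes all dates are 20xx)."""
    date_parts = text.split("/")
    if len(date_parts[2]) == 2:
        date_parts[2] = "20" + date_parts[2]
    return "/".join(date_parts)

# After expansion a single four-digit-year format covers every sample.
for raw in ["27/08/17", "8/9/2017"]:
    print(datetime.strptime(expand_year(raw), "%d/%m/%Y").date())
```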

I think using the dateparser library is the way to go, as it is more extensible.

A basic approach is to split each string on the slashes, pad the parts, and re-join them with the correct number of digits:

date = "12/8/2017"

parts = date.split("/")

print(parts) # ['12', '8', '2017']

if len(parts[0]) == 1:
    parts[0] = "0" + parts[0]
if len(parts[1]) == 1:
    parts[1] = "0" + parts[1]
if len(parts[2]) == 2:
    parts[2] = "20" + parts[2]
newDate = "/".join(parts)
# or 
newDate = parts[0] + "/" + parts[1] + "/" + parts[2]

print(newDate) # 12/08/2017

Then you have a consistent date format throughout. (An additional check is required if your dates extend into the last century.)
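
The extra check for last-century dates could be a pivot on the two-digit year. In this sketch `expand_two_digit_year` and its `pivot` parameter are hypothetical, and the cutoff of 30 is an arbitrary assumption that should be chosen to suit the data.

```python
def expand_two_digit_year(yy, pivot=30):
    """Map a two-digit year string to four digits.

    Years below `pivot` are read as 20xx, the rest as 19xx.
    The default pivot is an assumption, not a standard.
    """
    century = "20" if int(yy) < pivot else "19"
    return century + yy

print(expand_two_digit_year("17"))  # 2017
print(expand_two_digit_year("85"))  # 1985
```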

I would test this first, and consider the other answers' approaches if this is not performant.
