
python - convert multiple string datetime formats into a specific date format

I have a column date that holds date strings of 6 different lengths:

import pandas as pd

df = pd.DataFrame({'date': {0: '2020-03-21T10:13:08',  1: '2020-03-21T17:43:03',  2: '2020-03-21T13:13:30',  3: '2020-03-21T20:43:02',  4: '3/8/20 5:31',  5: '3/8/20 5:19',  6: '3/22/20 23:45',  7: '3/22/20 23:45',  8: '2/1/2020 11:53',  9: '2/1/2020 10:53',  10: '1/31/2020 15:20',  11: '1/31/2020 10:37',  12: '2020-04-04 23:34:21',  13: '2020-04-04 23:34:21'}},
             index=range(0,14))

I need to convert all of these different datetime strings to a date format. The approach I'm using is:

  1. Find the first white space and extract the date that precedes it.

  2. Convert it according to its string length (each string length has its own date format, as you can see below in the format argument).

  3. Apply (2) to the corresponding rows of the dataframe df.

You can see this approach here:

df.loc[df["date"].str.find(" ") == 10, "date"] = pd.to_datetime(df.loc[df["date"].str.find(" ") == 10, "date"].str[0:10])
df.loc[df["date"].str.find(" ") == -1, "date"] = pd.to_datetime(df.loc[df["date"].str.find(" ") == 10, "date"].str[0:10])
df.loc[df["date"].str.find(" ") == 6, "date"] = pd.to_datetime(df.loc[df["date"].str.find(" ") == 6, "date"].str[0:6], format="%m/%d/%y")
df.loc[df["date"].str.find(" ") == 7, "date"] = pd.to_datetime(df.loc[df["date"].str.find(" ") == 7, "date"].str[0:7], format="%m/%d/%y")
df.loc[df["date"].str.find(" ") == 8, "date"] = pd.to_datetime(df.loc[df["date"].str.find(" ") == 8, "date"].str[0:8], format="%m/%d/%Y")
df.loc[df["date"].str.find(" ") == 9, "date"] = pd.to_datetime(df.loc[df["date"].str.find(" ") == 9, "date"].str[0:9], format="%m/%d/%Y")

Everything works until step 3, where I try to apply all the format changes to the dataframe, but I can't understand why it doesn't give the result it should. Any suggestions?

By the way, it has to be scalable (I have a lot of rows per format string).
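For reference, the three steps above can be sketched so that every mask is built from the untouched original strings and the results are written back only once. In the code from the question, the column is overwritten while later lines still read from it, and the second line assigns to the rows without a space while reading from the rows whose space is at position 10, which may be why the output looks wrong. The names raw, space_pos, rules and parts are illustrative, and the sample is shortened to one row per format.

```python
import pandas as pd

df = pd.DataFrame({"date": [
    "2020-03-21T10:13:08", "3/8/20 5:31", "3/22/20 23:45",
    "2/1/2020 11:53", "1/31/2020 15:20", "2020-04-04 23:34:21",
]})

raw = df["date"]                 # original strings, never overwritten below
space_pos = raw.str.find(" ")    # step 1: index of the first space (-1 if none)

# Step 2: first-space position -> (characters to keep, format of that slice).
# Same pairs as in the question, with -1 reading from its own rows.
rules = {
    -1: (10, "%Y-%m-%d"),
    10: (10, "%Y-%m-%d"),
    6: (6, "%m/%d/%y"),
    7: (7, "%m/%d/%y"),
    8: (8, "%m/%d/%Y"),
    9: (9, "%m/%d/%Y"),
}

# Step 3: parse each group in one vectorized call, then restore the row order.
parts = [
    pd.to_datetime(raw[space_pos == pos].str[:length], format=fmt)
    for pos, (length, fmt) in rules.items()
]
df["date"] = pd.concat(parts).reindex(df.index)
print(df)
```

Each group is parsed with a single pd.to_datetime call, so the cost grows with the number of formats rather than with a Python-level loop over rows. Rows whose first-space position is not listed in rules would end up as NaT after the reindex.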

What works for me is converting all values to datetimes and then removing the times, with Series.dt.floor if the output should stay datetimes or with Series.dt.date if the output should be Python dates:

# keep the datetime64 dtype, with the time part set to midnight
df['date'] = pd.to_datetime(df['date']).dt.floor('d')
# alternatively, for Python date objects:
#df['date'] = pd.to_datetime(df['date']).dt.date
print(df)
         date
0  2020-03-21
1  2020-03-21
2  2020-03-21
3  2020-03-21
4  2020-03-08
5  2020-03-08
6  2020-03-22
7  2020-03-22
8  2020-02-01
9  2020-02-01
10 2020-01-31
11 2020-01-31
12 2020-04-04
13 2020-04-04
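One caveat, as far as I know: since pandas 2.0, pd.to_datetime infers a single format from the first value, so a column that mixes formats like this one may raise a ValueError unless format="mixed" is passed. The sketch below wraps that in a helper; parse_mixed is a hypothetical name, not a pandas function, and the version check is only an assumption about how one might support both major versions.

```python
import pandas as pd

def parse_mixed(strings: pd.Series) -> pd.Series:
    """Hypothetical helper: parse a Series whose rows use different formats."""
    major = int(pd.__version__.split(".")[0])
    if major >= 2:
        # pandas 2.x: infer the format separately for every element
        return pd.to_datetime(strings, format="mixed")
    # pandas 1.x: the plain call already falls back to per-element parsing
    return pd.to_datetime(strings)

raw = pd.Series([
    "2020-03-21T10:13:08", "3/8/20 5:31", "3/22/20 23:45",
    "2/1/2020 11:53", "1/31/2020 15:20", "2020-04-04 23:34:21",
])

stamps = parse_mixed(raw)
as_datetimes = stamps.dt.floor("D")   # datetime64 dtype, time set to 00:00:00
as_dates = stamps.dt.date             # object dtype holding datetime.date values
print(as_datetimes)
print(as_dates)
```

The floored column keeps the datetime64 dtype, so the .dt accessor and date arithmetic still work on it. The .dt.date variant yields plain Python objects, which is convenient for display but slower on large columns.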

Your solution can be simplified: take the first 10 characters, then split on whitespace and keep the first value:

df['date'] = pd.to_datetime(df['date'].str[:10].str.split().str[0])
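To make the string step concrete, here is a small sketch of what the slice-and-split produces, followed by one format-explicit way to parse the result. Trying each candidate format with errors="coerce" and filling the gaps is my own assumption about how to avoid format inference, not part of the answer above; tokens and candidates are illustrative names.

```python
import pandas as pd

raw = pd.Series([
    "2020-03-21T10:13:08", "3/8/20 5:31", "3/22/20 23:45",
    "2/1/2020 11:53", "1/31/2020 15:20", "2020-04-04 23:34:21",
])

# First 10 characters, split on whitespace, keep the first piece.
tokens = raw.str[:10].str.split().str[0]
print(tokens.tolist())

# Parse with every candidate format; rows that do not match become NaT.
candidates = [
    pd.to_datetime(tokens, format=fmt, errors="coerce")
    for fmt in ("%Y-%m-%d", "%m/%d/%y", "%m/%d/%Y")
]

# Coalesce: take the first successful parse for each row.
parsed = candidates[0]
for other in candidates[1:]:
    parsed = parsed.fillna(other)
print(parsed)
```

Each token matches exactly one of the three formats here, so the order of the candidates does not matter. Any row that matches none of them stays NaT, which makes unexpected inputs easy to spot.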
import pandas as pd

df = pd.DataFrame({'date': {0: '2020-03-21T10:13:08',  1: '2020-03-21T17:43:03',  2: '2020-03-21T13:13:30',  3: '2020-03-21T20:43:02',  4: '3/8/20 5:31',  5: '3/8/20 5:19',  6: '3/22/20 23:45',  7: '3/22/20 23:45',  8: '2/1/2020 11:53',  9: '2/1/2020 10:53',  10: '1/31/2020 15:20',  11: '1/31/2020 10:37',  12: '2020-04-04 23:34:21',  13: '2020-04-04 23:34:21'}}, 
             index=range(0,14))
df
    date
0   2020-03-21T10:13:08
1   2020-03-21T17:43:03
2   2020-03-21T13:13:30
3   2020-03-21T20:43:02
4   3/8/20 5:31
5   3/8/20 5:19
6   3/22/20 23:45
7   3/22/20 23:45
8   2/1/2020 11:53
9   2/1/2020 10:53
10  1/31/2020 15:20
11  1/31/2020 10:37
12  2020-04-04 23:34:21
13  2020-04-04 23:34:21

df['date'] = pd.to_datetime(df['date'])
df
    date
0   2020-03-21 10:13:08
1   2020-03-21 17:43:03
2   2020-03-21 13:13:30
3   2020-03-21 20:43:02
4   2020-03-08 05:31:00
5   2020-03-08 05:19:00
6   2020-03-22 23:45:00
7   2020-03-22 23:45:00
8   2020-02-01 11:53:00
9   2020-02-01 10:53:00
10  2020-01-31 15:20:00
11  2020-01-31 10:37:00
12  2020-04-04 23:34:21
13  2020-04-04 23:34:21

