
How to convert a datetime string with a comma (,) in the milliseconds part in Python/pandas?

I have the following problem:

The date-time columns in my data have the following format (the columns are "Date" as dd.mm.yyyy and "Time" as hh:mm:ss.fff,f):

01.03.2019  12:29:15.732,7

I looked around but couldn't find a format option that handles the part after the comma (the digit following the milliseconds). A source that didn't help me: https://docs.python.org/2/library/datetime.html

I am reading the CSV file with Python 3 and pd.read_csv().

I have the following workaround, which truncates the comma and the digit after it.

It is terribly slow because it truncates more than 50000 strings in my dataset:

data = pd.read_csv('xyz.csv', sep=';', low_memory = False, parse_dates = [['Date', 'Time']], 
                   date_parser = lambda x, y : pd.to_datetime((x + ' ' + y)[:23], format='%d.%m.%Y %H:%M:%S.%f'))

What I want is a format string that deals with the comma, either by discarding the whole milliseconds part or by converting it correctly to microseconds.

Side note: with R I simply used "%d.%m.%Y %H:%M:%S", which discarded the milliseconds without throwing an error.

ResidentSleeper is correct: you can use pd.to_datetime() and drop the comma.

import pandas as pd

data1 = {'Date': ['01.03.2019  12:29:15.732,7',
                  '01.03.2019  12:29:15.732,7',
                  '01.03.2019  12:29:15.732,7',
                  '01.03.2019  12:29:15.732,7'], 
        'Value': [1, 2, 3, 4]}

df1 = pd.DataFrame(data1)

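# Drop the comma so '732,7' becomes the fraction '7327' (0.7327 s).
# The explicit format makes pandas read '01.03.2019' day-first (1 March);
# without it the string is parsed month-first (3 January).
df1['Date'] = pd.to_datetime(df1['Date'].str.replace(',', ''),
                             format='%d.%m.%Y %H:%M:%S.%f')

print(df1)

                        Date  Value
0 2019-03-01 12:29:15.732700      1
1 2019-03-01 12:29:15.732700      2
2 2019-03-01 12:29:15.732700      3
3 2019-03-01 12:29:15.732700      4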
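
For the layout in the question, with separate "Date" and "Time" columns in a semicolon-separated file, the same idea can probably be applied after loading instead of through date_parser. The following is only a sketch under assumptions: the inline CSV text stands in for 'xyz.csv', and the "Value" column and the new column names "Date_Time" and "Date_Time_s" are made up for illustration. It reads both columns as strings, removes the comma for the whole column at once, and calls pd.to_datetime() a single time. The last step shows the other option the question asks for, discarding the fractional part entirely.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'xyz.csv'; the 'Value' column is an assumption.
csv_text = (
    "Date;Time;Value\n"
    "01.03.2019;12:29:15.732,7;1\n"
    "01.03.2019;12:29:16.001,2;2\n"
)

# Read the two columns as plain strings, without parse_dates or date_parser.
data = pd.read_csv(io.StringIO(csv_text), sep=";",
                   dtype={"Date": str, "Time": str})

# Join the columns and drop the comma in one vectorized pass: '732,7' -> '7327'.
combined = data["Date"] + " " + data["Time"].str.replace(",", "", regex=False)

# One to_datetime call for the whole column, with the day-first format spelled out.
data["Date_Time"] = pd.to_datetime(combined, format="%d.%m.%Y %H:%M:%S.%f")

# Alternative: discard the fractional part by keeping only the text before the dot.
whole = data["Date"] + " " + data["Time"].str.split(".", n=1).str[0]
data["Date_Time_s"] = pd.to_datetime(whole, format="%d.%m.%Y %H:%M:%S")

print(data[["Date_Time", "Date_Time_s", "Value"]])
```

Because the string operations and the parsing each run once over the whole column, this should avoid the per-row work of the lambda workaround, though the actual speed-up depends on the data and the pandas version.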
