Parsing dates with milliseconds in pandas read_csv

Question

My .csv looks like this:

     date      time  
0    20190101  181555700  
1    20190101  181545515

where the format is YYYYMMDD for date and HHMMSSMMM for time (the last MMM are milliseconds). For example, the first row would be 2019-01-01 18:15:55.700.

Is there a way to parse this directly in pd.read_csv() without having to convert it later? Using only parse_dates does not work, as it doesn't recognize the format. What I would like is a single column in my dataframe, with the timestamp correctly parsed, like

    timestamp
0   2019-01-01 18:15:55.700

Answer 1

You can use to_timedelta with the unit option to turn your time into a timedelta and add it to date:

import pandas as pd
df = pd.read_csv('file.csv', parse_dates=['date'])
# unit='ms' interprets each time value as a count of milliseconds
df['date'] = df.date + pd.to_timedelta(df.time, unit='ms')

or:

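df = pd.read_csv('file.csv')
# Without parse_dates the date column is an integer, so pass an explicit format
df['date'] = pd.to_datetime(df.date.astype(str), format='%Y%m%d') + pd.to_timedelta(df.time, unit='ms')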

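Output (the timestamps are wrong: unit='ms' treats 181555700 as a count of milliseconds, roughly 50.4 hours, instead of 18:15:55.700):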

                     date       time
0 2019-01-03 02:25:55.700  181555700
1 2019-01-03 02:25:45.515  181545515
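
If you want to stay with the timedelta idea, the time value has to be split into its digit fields first, because HHMMSSmmm is a packed number and not an elapsed duration. Below is a minimal sketch of that, assuming both columns were read as integers; the sample frame and the `offset` and `timestamp` names are only for illustration.

```python
import pandas as pd

# Sample data in the question's layout (both columns read as integers)
df = pd.DataFrame({"date": [20190101, 20190101], "time": [181555700, 181545515]})

t = df["time"]
# Peel off each digit field of HHMMSSmmm and convert it with its own unit
offset = (
    pd.to_timedelta(t // 10_000_000, unit="h")          # HH
    + pd.to_timedelta(t // 100_000 % 100, unit="min")   # MM
    + pd.to_timedelta(t // 1_000 % 100, unit="s")       # SS
    + pd.to_timedelta(t % 1_000, unit="ms")             # mmm
)

# Parse the date from its YYYYMMDD string form, then add the time of day
df["timestamp"] = pd.to_datetime(df["date"].astype(str), format="%Y%m%d") + offset
print(df["timestamp"])
```

Integer arithmetic like this also copes with times before 10:00:00, where the integer has fewer than nine digits.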

Update per comment: concatenate both columns as strings and parse them with an explicit format:

df['date'] = pd.to_datetime(df.date.astype(str)+df.time.astype(str), format='%Y%m%d%H%M%S%f')

Output:

                     date       time
0 2019-01-01 18:15:55.700  181555700
1 2019-01-01 18:15:45.515  181545515
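
One caveat with the string approach: if time was read as an integer, leading zeros are gone for anything before 10:00:00, and the concatenated string can then be rejected or silently misparsed. A small sketch of a workaround, assuming integer columns, is to pad the time back to nine digits before concatenating. The sample values here are made up for illustration.

```python
import pandas as pd

# Second row is 00:05:45.515, which an integer column stores as 545515
df = pd.DataFrame({"date": [20190101, 20190101], "time": [181555700, 545515]})

# Restore the leading zeros so every time string is exactly HHMMSSmmm
time_str = df["time"].astype(str).str.zfill(9)

df["timestamp"] = pd.to_datetime(
    df["date"].astype(str) + time_str,
    format="%Y%m%d%H%M%S%f",
)
print(df["timestamp"])
```

Reading the file with dtype=str in the first place avoids losing the zeros, provided the file itself contains them.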

Answer 2

I think this is close to what you need:

import pandas as pd
import datetime as dt

data = pd.read_csv(
   './a.csv',
   delimiter='\t',
   index_col=0,
   parse_dates=[1],
   converters={'time': lambda t: dt.datetime.strptime(t, '%H%M%S%f').time()}
)

Output:

        date             time
0 2019-01-01  18:15:55.700000
1 2019-01-01  18:15:45.515000
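
The converter works with only three fractional digits because, in strptime, %f accepts one to six digits and pads on the right with zeros. A quick standalone check of that behaviour (the `parse_hhmmssmmm` name is just for this sketch):

```python
import datetime as dt

def parse_hhmmssmmm(text):
    """Parse an HHMMSSmmm string into a datetime.time."""
    # %f reads the trailing "700" as 700000 microseconds
    return dt.datetime.strptime(text, "%H%M%S%f").time()

parsed = parse_hhmmssmmm("181555700")
print(parsed)
```

Note that converters receive the raw cell text as a string, which is why strptime can be applied to it directly.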

After some searching I found this:

data = pd.read_csv(
   './a.csv',
   delimiter='\t',
   index_col=1,
   parse_dates={'datetime': [1, 2]},
   converters={'time': lambda t: dt.datetime.strptime(t, '%H%M%S%f').time()}
)

And the output is:

                 datetime
0 2019-01-01 18:15:55.700
1 2019-01-01 18:15:45.515
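
As far as I know, recent pandas releases (2.2 onward) deprecate combining columns through the dict form of parse_dates, so it may be safer to wrap the read and the conversion in one small helper. This is only a sketch under that assumption: `read_with_timestamp` is a hypothetical name, and the sample is comma-separated rather than tab-separated.

```python
import io
import pandas as pd

def read_with_timestamp(source):
    """Read a date/time CSV and return a single parsed timestamp column."""
    # Keep both columns as text so no leading zeros are lost
    raw = pd.read_csv(source, dtype=str)
    combined = raw["date"] + raw["time"].str.zfill(9)
    parsed = pd.to_datetime(combined, format="%Y%m%d%H%M%S%f")
    return pd.DataFrame({"timestamp": parsed})

# In-memory stand-in for the file from the question
sample = io.StringIO("date,time\n20190101,181555700\n20190101,181545515\n")
result = read_with_timestamp(sample)
print(result)
```

The conversion still happens after read_csv returns, but the caller only ever sees the finished timestamp column.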

2 answers

Answer 1
4 2020-05-18 19:51:03

Answer 2
1 accepted 2020-05-18 19:49:50
