
Splitting timestamp column into separate date and time columns

I have a pandas DataFrame with over 1000 timestamps (example below) that I would like to loop through:

2016-02-22 14:59:44.561776

I'm having a hard time splitting this timestamp into two columns, 'date' and 'time'. The date format can stay the same, but the time needs to be converted to CST (including milliseconds).

Thanks for the help.

I had the same problem and this worked for me.

Suppose the date column in your dataset is called "date":

import pandas as pd
df = pd.read_csv(file_path)

df['Dates'] = pd.to_datetime(df['date']).dt.date
df['Time'] = pd.to_datetime(df['date']).dt.time

This will give you two columns, "Dates" and "Time", holding the split date and time values.

I'm not sure why you would want to do this in the first place, but if you really must...

df = pd.DataFrame({'my_timestamp': pd.date_range('2016-1-1 15:00', periods=5)})

>>> df
         my_timestamp
0 2016-01-01 15:00:00
1 2016-01-02 15:00:00
2 2016-01-03 15:00:00
3 2016-01-04 15:00:00
4 2016-01-05 15:00:00

df['new_date'] = [d.date() for d in df['my_timestamp']]
df['new_time'] = [d.time() for d in df['my_timestamp']]

>>> df
         my_timestamp    new_date  new_time
0 2016-01-01 15:00:00  2016-01-01  15:00:00
1 2016-01-02 15:00:00  2016-01-02  15:00:00
2 2016-01-03 15:00:00  2016-01-03  15:00:00
3 2016-01-04 15:00:00  2016-01-04  15:00:00
4 2016-01-05 15:00:00  2016-01-05  15:00:00

The conversion to CST is trickier. I assume that the current timestamps are naive ('unaware'), i.e. they do not have a timezone attached? If they have none, which timezone should they be interpreted in before converting?

For more details:

https://docs.python.org/2/library/datetime.html

How to make an unaware datetime timezone aware in python
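As a rough sketch of the timezone step, assuming the naive timestamps were recorded in UTC and that "CST" means US Central time (both are assumptions, the question does not say), the values can be localized with dt.tz_localize and then converted with dt.tz_convert before splitting. The zone name 'America/Chicago' is chosen here for illustration only:

```python
import pandas as pd

df = pd.DataFrame(
    {"my_timestamp": pd.to_datetime(["2016-02-22 14:59:44.561776"])}
)

# Assumption: the naive timestamps represent UTC.
as_utc = df["my_timestamp"].dt.tz_localize("UTC")

# Assumption: "CST" means US Central time.
central = as_utc.dt.tz_convert("America/Chicago")

# Split the converted values; .dt.time drops the timezone info
# but keeps the microseconds.
df["new_date"] = central.dt.date
df["new_time"] = central.dt.time
```

If the source timestamps are in a different zone, only the argument to tz_localize should need to change.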

EDIT

An alternative method that only loops once across the timestamps instead of twice:

new_dates, new_times = zip(*[(d.date(), d.time()) for d in df['my_timestamp']])
df = df.assign(new_date=new_dates, new_time=new_times)

The easiest way is to use the pandas.Series dt accessor, which works on columns with a datetime dtype (see pd.to_datetime). In this case, pd.date_range creates an example column with a datetime dtype, so use .dt.date and .dt.time:

df = pd.DataFrame({'full_date': pd.date_range('2016-1-1 10:00:00.123', periods=10, freq='5H')})
df['date'] = df['full_date'].dt.date
df['time'] = df['full_date'].dt.time

In [166]: df
Out[166]:
                full_date        date             time
0 2016-01-01 10:00:00.123  2016-01-01  10:00:00.123000
1 2016-01-01 15:00:00.123  2016-01-01  15:00:00.123000
2 2016-01-01 20:00:00.123  2016-01-01  20:00:00.123000
3 2016-01-02 01:00:00.123  2016-01-02  01:00:00.123000
4 2016-01-02 06:00:00.123  2016-01-02  06:00:00.123000
5 2016-01-02 11:00:00.123  2016-01-02  11:00:00.123000
6 2016-01-02 16:00:00.123  2016-01-02  16:00:00.123000
7 2016-01-02 21:00:00.123  2016-01-02  21:00:00.123000
8 2016-01-03 02:00:00.123  2016-01-03  02:00:00.123000
9 2016-01-03 07:00:00.123  2016-01-03  07:00:00.123000
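The question asks for the time "including milliseconds", while .dt.time prints six fractional digits. One possible way to get exactly three digits, assuming a string column is acceptable, is to format with dt.strftime and trim the last three characters. The column name time_ms is made up for this sketch:

```python
import pandas as pd

df = pd.DataFrame({
    "full_date": pd.date_range(
        "2016-01-01 10:00:00.123", periods=3, freq=pd.Timedelta(hours=5)
    )
})

# %f always produces six digits (microseconds); dropping the last
# three characters leaves millisecond precision.
df["time_ms"] = df["full_date"].dt.strftime("%H:%M:%S.%f").str[:-3]
```

Note that the result is text, not datetime.time objects, so it suits display or export rather than further time arithmetic.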

If your timestamps are already in pandas format (not string), then:

df["date"] = df["timestamp"].dt.date
df["time"] = df["timestamp"].dt.time

If your timestamp is a string, you can parse it using the datetime module:

from datetime import datetime
df["timestamp"] = df["timestamp"].apply(
    lambda x: datetime.strptime(x, "%Y-%m-%d %H:%M:%S.%f"))

Source: http://pandas.pydata.org/pandas-docs/stable/timeseries.html

If your timestamp is a string, you can convert it to a datetime object:

from datetime import datetime

timestamp = '2016-02-22 14:59:44.561776'
dt = datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S.%f')

From then on you can bring it to whatever format you like.

Try
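For instance, a parsed datetime can be split with its date() and time() methods, and time.isoformat accepts timespec='milliseconds' (Python 3.6+) to limit the fractional part. The variable names below are illustrative only:

```python
from datetime import datetime

timestamp = '2016-02-22 14:59:44.561776'
dt = datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S.%f')

# Split into separate date and time objects.
date_part = dt.date()
time_part = dt.time()

# Render as text; timespec truncates the time to milliseconds.
date_text = date_part.isoformat()
time_text = time_part.isoformat(timespec='milliseconds')
```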

s = '2016-02-22 14:59:44.561776'

date,time = s.split()

then convert time as needed.

If you want to further split the time,

hour, minute, second = time.split(':')
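The same split can be applied to a whole column of timestamp strings with the pandas str accessor. This is a minimal sketch that assumes every value has the form "date time" separated by a single space; the column names are made up:

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": [
        "2016-02-22 14:59:44.561776",
        "2016-02-23 08:15:01.000123",
    ]
})

# Split once on the first space; expand=True returns two columns.
df[["date", "time"]] = df["timestamp"].str.split(" ", n=1, expand=True)
```

Both new columns hold plain strings, so any timezone conversion would still need the values parsed into datetimes first.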

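Try this:

def time_date(timestamp_str):
    # Split "YYYY-MM-DD HH:MM:SS.ffffff" into its date and time parts.
    date_time = timestamp_str.split(' ')
    # Drop the fractional seconds from the time part.
    time = date_time[1].split('.')
    return date_time[0], time[0]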

In addition to @Alexander's answer, if you want a one-liner:

df['new_date'],df['new_time'] = zip(*[(d.date(), d.time()) for d in df['my_timestamp']])

If your timestamp is a string, you can convert it to a pandas timestamp before splitting it.

#convert to pandas timestamp
data["old_date"] = pd.to_datetime(data.old_date)

#split columns
data["new_date"] = data["old_date"].dt.date
data["new_time"] = data["old_date"].dt.time
