How can I split a DataFrame column with datetimes into two columns: one with dates and one with times of the day?
I have a data frame called data, which has a column Dates like this:
Dates
0 2015-05-13 23:53:00
1 2015-05-13 23:53:00
2 2015-05-13 23:33:00
3 2015-05-13 23:30:00
4 2015-05-13 23:30:00
I know how to add a column to a data frame, but how do I split Dates into two columns like this?
Day Time
0 2015-05-13 23:53:00
1 2015-05-13 23:53:00
2 2015-05-13 23:33:00
3 2015-05-13 23:30:00
4 2015-05-13 23:30:00
If your series is s, then this will create such a DataFrame:
pd.DataFrame({
    'date': pd.to_datetime(s).dt.date,
    'time': pd.to_datetime(s).dt.time})
This works because once the series has been converted with pd.to_datetime, the dt accessor can be used to extract the parts.
Example
>>> import pandas as pd
>>> s = pd.Series(['2015-05-13 23:53:00', '2015-05-13 23:53:00'])
>>> pd.DataFrame({
...     'date': pd.to_datetime(s).dt.date,
...     'time': pd.to_datetime(s).dt.time})
date time
0 2015-05-13 23:53:00
1 2015-05-13 23:53:00
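A small variation on the snippet above, offered as a sketch rather than the answer's own code: parse the series once and reuse the result, so that pd.to_datetime does not run twice. The name `parsed` and the sample values are assumptions introduced for illustration. Note that dt.date and dt.time yield plain Python date and time objects, not datetime64 values.

```python
import datetime
import pandas as pd

s = pd.Series(['2015-05-13 23:53:00', '2015-05-13 23:33:00'])

# Parse once; `parsed` is an illustrative name, not from the original answer.
parsed = pd.to_datetime(s)

result = pd.DataFrame({
    'date': parsed.dt.date,   # column of datetime.date objects
    'time': parsed.dt.time,   # column of datetime.time objects
})
print(result)
```

Each cell can then be compared against ordinary datetime.date and datetime.time values.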
If your Dates column is a string:
data['Day'], data['Time'] = zip(*data.Dates.str.split())
>>> data
Dates Day Time
0 2015-05-13 23:53:00 2015-05-13 23:53:00
1 2015-05-13 23:53:00 2015-05-13 23:53:00
2 2015-05-13 23:33:00 2015-05-13 23:33:00
3 2015-05-13 23:33:00 2015-05-13 23:33:00
4 2015-05-13 23:33:00 2015-05-13 23:33:00
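If the zip idiom feels opaque, one possible alternative is str.split with expand=True, which returns the pieces as a DataFrame. This is only a sketch: it assumes every value has exactly one space between the date and the time, and the name `parts` and the sample data are introduced here for illustration. Both new columns remain strings.

```python
import pandas as pd

data = pd.DataFrame({'Dates': ['2015-05-13 23:53:00', '2015-05-13 23:30:00']})

# Split on the first space only; expand=True returns a DataFrame
# whose columns are labelled 0 and 1.
parts = data['Dates'].str.split(' ', n=1, expand=True)

data['Day'] = parts[0]
data['Time'] = parts[1]
print(data)
```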
If it is a timestamp:
data['Day'], data['Time'] = zip(*[(d.date(), d.time()) for d in data.Dates])
If the Dates column holds strings, convert it with to_datetime. Then you can use dt.date and dt.time, and finally drop the original Dates column:
print(df['Dates'].dtypes)
object
print(type(df.at[0, 'Dates']))
<class 'str'>
# Parse the strings into datetime64 values
df['Dates'] = pd.to_datetime(df['Dates'])
print(df['Dates'].dtypes)
datetime64[ns]
print(df)
Dates
0 2015-05-13 23:53:00
1 2015-05-13 23:53:00
2 2015-05-13 23:33:00
3 2015-05-13 23:30:00
4 2015-05-13 23:30:00
# Extract the date and time parts, then drop the original column
df['Date'] = df['Dates'].dt.date
df['Time'] = df['Dates'].dt.time
df = df.drop('Dates', axis=1)
print(df)
Date Time
0 2015-05-13 23:53:00
1 2015-05-13 23:53:00
2 2015-05-13 23:33:00
3 2015-05-13 23:30:00
4 2015-05-13 23:30:00
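The same steps can be collected into a self-contained Python 3 sketch. The helper name split_datetime_column and its parameters are hypothetical, chosen for illustration; only pd.to_datetime, dt.date, dt.time and drop come from the answer above.

```python
import datetime
import pandas as pd


def split_datetime_column(df, column='Dates', date_col='Date', time_col='Time'):
    """Return a copy of df with `column` replaced by date and time columns.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    # Works whether the column holds strings or is already datetime64.
    parsed = pd.to_datetime(out[column])
    out[date_col] = parsed.dt.date
    out[time_col] = parsed.dt.time
    return out.drop(columns=[column])


df = pd.DataFrame({'Dates': ['2015-05-13 23:53:00', '2015-05-13 23:30:00']})
result = split_datetime_column(df)
print(result)
```

Working on a copy leaves the caller's frame untouched, which is a design choice of this sketch rather than something the answer requires.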
attrgetter + pd.concat + join
You can use operator.attrgetter with pd.concat to add an arbitrary number of datetime attributes to your dataframe as separate series:
from operator import attrgetter
fields = ['date', 'time']
df = df.join(pd.concat(attrgetter(*fields)(df['Date'].dt), axis=1, keys=fields))
print(df)
Date date time
0 2015-05-13 23:53:00 2015-05-13 23:53:00
1 2015-01-13 15:23:00 2015-01-13 15:23:00
2 2016-01-13 03:33:00 2016-01-13 03:33:00
3 2018-02-13 20:13:25 2018-02-13 20:13:25
4 2017-05-12 06:52:00 2017-05-12 06:52:00
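To make the "arbitrary number of attributes" point concrete, here is a self-contained sketch with a longer field list. The sample data and the choice of year, month and hour are assumptions for illustration. One caveat: with a single field name, attrgetter returns the attribute itself rather than a tuple, so the pd.concat call as written expects at least two fields.

```python
from operator import attrgetter

import pandas as pd

df = pd.DataFrame({'Date': pd.to_datetime(['2015-05-13 23:53:00',
                                           '2018-02-13 20:13:25'])})

# Any attribute names exposed by the .dt accessor can go here.
fields = ['date', 'time', 'year', 'month', 'hour']

# attrgetter(*fields) returns one Series per field, as a tuple;
# pd.concat places them side by side and labels them via keys.
extracted = pd.concat(attrgetter(*fields)(df['Date'].dt), axis=1, keys=fields)
df = df.join(extracted)
print(df)
```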