Pandas Series: string date to epoch unix seconds
I have a Pandas DataFrame in which one column holds dates as strings, as shown below:
0 time
1 September 20 2016
2 September 20 2016
3 September 19 2016
4 September 16 2016
What would be a succinct way to replace time with epoch Unix seconds?
You can modify the values of a column with the Series' apply method by passing it a function that performs the desired operation on each value.
For handling datetimes, you can use dateutil.parser.parse to parse arbitrary strings into datetime objects.
import pandas as pd
from dateutil.parser import parse

s = pd.Series(['September 20 2016',
               'September 20 2016',
               'September 19 2016',
               'September 16 2016'])
df = pd.DataFrame(s)

def dt2epoch(value):
    d = parse(value)      # parse the date string into a datetime object
    return d.timestamp()  # seconds since the epoch, as a float

df[0].apply(dt2epoch)  # applies the given function to each value of the column
Result:
0 1474329600
1 1474329600
2 1474243200
3 1473984000
Name: 0, dtype: float64
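One caveat about the result above: datetime.timestamp() interprets a naive datetime in the machine's local time zone, so the values shown (which correspond to midnight UTC) are only reproduced on a machine set to UTC. The following is a minimal sketch, assuming the dates should be treated as UTC; the helper name dt2epoch_utc is hypothetical and not part of the original answer.

```python
from datetime import timezone

import pandas as pd
from dateutil.parser import parse

s = pd.Series(['September 20 2016',
               'September 20 2016',
               'September 19 2016',
               'September 16 2016'])

def dt2epoch_utc(value):
    # Hypothetical helper: mark the parsed naive datetime as UTC so the
    # result does not depend on the local time zone of the machine.
    d = parse(value).replace(tzinfo=timezone.utc)
    return int(d.timestamp())

epochs = s.apply(dt2epoch_utc)
print(epochs.tolist())  # [1474329600, 1474329600, 1474243200, 1473984000]
```

Returning an int here also gives an integer column rather than float64.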
You could try to_datetime.
import pandas as pd
your_df['time']=pd.to_datetime(your_df['time'])
Edit: To get the epoch from the resulting datetime values, you can cast the series to int64, which gives the number of nanoseconds since the epoch, and then divide by 10^9 (the number of nanoseconds in a second).
import numpy as np
your_df['time'] = (pd.to_datetime(your_df['time']).astype(np.int64)/10**9).astype(np.int64)
The last conversion is needed only if you want integers (the division gives you floats instead).
Note: If you have NaT values in your time series, they will show up as the integer value -9223372036. You may want to either filter them out up front or have them output as NaN (in which case the resulting series must be of a float type instead of int).
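The two options in the note can be sketched as follows. This is only an illustration under stated assumptions, not the answer's own code: the sample series s is made up, and instead of the int64 cast it subtracts the epoch and calls .dt.total_seconds(), which carries NaT through as NaN rather than as a sentinel integer.

```python
import pandas as pd

# Hypothetical sample data with one missing date
s = pd.Series(['September 20 2016', None, 'September 16 2016'])
dt = pd.to_datetime(s)  # the missing entry becomes NaT

# Seconds since the epoch as floats; NaT is carried through as NaN.
# Keeping this float series is the "output as NaN" option.
epoch_float = (dt - pd.Timestamp('1970-01-01')).dt.total_seconds()

# The "filter up front" option: drop the missing rows, after which
# converting to integers is safe.
epoch_int = epoch_float.dropna().astype('int64')

print(epoch_int.tolist())  # [1474329600, 1473984000]
```

Note that dropna() keeps the original index labels, so the integer result can still be aligned back to the DataFrame if needed.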