pandas datetime to unix timestamp seconds

Question

The official documentation of pandas.to_datetime says:

unit : string, default ‘ns’

unit of the arg (D,s,ms,us,ns) denote the unit, which is an integer or float number. This will be based off the origin. Example, with unit='ms' and origin='unix' (the default), this would calculate the number of milliseconds to the unix epoch start.

So I tried this:

import pandas as pd
df = pd.DataFrame({'time': [pd.to_datetime('2019-01-15 13:25:43')]})
df_unix_sec = pd.to_datetime(df['time'],unit='ms',origin='unix')
print(df)
print(df_unix_sec)

                 time
0   2019-01-15 13:25:43
0   2019-01-15 13:25:43
Name: time, dtype: datetime64[ns]

The second output is no different from the first. It always shows the datetime value rather than the number of milliseconds since the start of the Unix epoch. Why is that? Am I missing something?

Answer 1

I think you misunderstood what the argument is for. The purpose of origin='unix' is to convert an integer timestamp to a datetime, not the other way around.

pd.to_datetime(1.547559e+09, unit='s', origin='unix') 
# Timestamp('2019-01-15 13:30:00')
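To make the direction of the conversion concrete, here is a minimal sketch. The variable names are illustrative only; it shows that an integer is turned into a datetime, while a value that is already a datetime comes back unchanged, which is what the question observed.

```python
import pandas as pd

# An integer is read as "seconds since the epoch" and becomes a Timestamp.
from_int = pd.to_datetime(1547559000, unit='s', origin='unix')
print(from_int)

# A column that already holds datetimes has nothing for `unit` to convert,
# so the same values come back.
already_dt = pd.Series([pd.Timestamp('2019-01-15 13:30:00')])
unchanged = pd.to_datetime(already_dt, unit='ms', origin='unix')
print((unchanged == already_dt).all())
```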

Here are some options:

Option 1: integer division

Conversely, you can get the timestamp by converting to integer (which gives nanoseconds since the epoch) and dividing by 10⁹. Note that / returns floats, as below; use // if you want integer seconds.

pd.to_datetime(['2019-01-15 13:30:00']).astype(int) / 10**9
# Float64Index([1547559000.0], dtype='float64')

Pros:

super fast

Cons:

makes assumptions about how pandas internally stores dates
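Applied to the DataFrame from the question, Option 1 might look like the sketch below. The column name unix_sec is made up for illustration, and the explicit cast to datetime64[ns] is an added precaution (not part of the original answer) so that dividing by 10**9 does not depend on the resolution the column happens to be stored in.

```python
import pandas as pd

df = pd.DataFrame({'time': pd.to_datetime(['2019-01-15 13:25:43'])})

# Pin the resolution to nanoseconds, then view the values as int64
# nanoseconds since the epoch.
ns = df['time'].astype('datetime64[ns]').astype('int64')

# Floor division keeps the result as integers (whole seconds).
df['unix_sec'] = ns // 10**9
print(df['unix_sec'].tolist())
```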

Option 2: recommended by pandas

Pandas docs recommend using the following method:

# create test data
dates = pd.to_datetime(['2019-01-15 13:30:00'])

# calculate unix datetime
(dates - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')

[out]:
Int64Index([1547559000], dtype='int64')

Pros:

"idiomatic", recommended by the library

Cons:

unwieldy
not as performant as integer division
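The same subtraction works on a DataFrame column, and changing the Timedelta changes the unit of the result. A minimal sketch, assuming the question's data; the column names unix_sec and unix_ms are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'time': pd.to_datetime(['2019-01-15 13:25:43'])})

# Time elapsed since the Unix epoch, as a timedelta column.
elapsed = df['time'] - pd.Timestamp('1970-01-01')

# Floor-dividing by a Timedelta gives a count of that unit as integers.
df['unix_sec'] = elapsed // pd.Timedelta('1s')
df['unix_ms'] = elapsed // pd.Timedelta('1ms')
print(df[['unix_sec', 'unix_ms']])
```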

Option 3: `pd.Timestamp`

If you have a single date string, you can use pd.Timestamp as shown in the other answer:

pd.Timestamp('2019-01-15 13:30:00').timestamp()
# 1547559000.0

If you have to coerce multiple datetimes (where pd.to_datetime is your only option), you can initialize and map:

pd.to_datetime(['2019-01-15 13:30:00']).map(pd.Timestamp.timestamp)
# Float64Index([1547559000.0], dtype='float64')

Pros:

best method for a single datetime string
easy to remember

Cons:

not as performant as integer division

Answer 2

You can use the timestamp() method, which returns the POSIX timestamp as a float:

pd.Timestamp('2021-04-01').timestamp()

[Out]:
1617235200.0

pd.Timestamp('2021-04-01 00:02:35.234').timestamp()

[Out]:
1617235355.234
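If whole seconds are wanted, the float can be truncated with int(). The sketch below also attaches an explicit UTC timezone; for the naive timestamp above pandas gives the same number, so this only makes the assumption visible. The variable names are illustrative.

```python
import pandas as pd

naive = pd.Timestamp('2021-04-01 00:02:35.234')
aware = pd.Timestamp('2021-04-01 00:02:35.234', tz='UTC')

# Truncate the fractional part to get whole seconds.
whole_seconds = int(naive.timestamp())
print(whole_seconds)

# The naive timestamp is treated as UTC, so both give the same value.
print(naive.timestamp() == aware.timestamp())
```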

Answer 3

The value attribute of a pandas Timestamp holds the time since the Unix epoch in nanoseconds. You can convert it to microseconds, milliseconds, or seconds by dividing by 1e3, 1e6, or 1e9 respectively. See the code below.

import pandas as pd
date_1 = pd.to_datetime('2020-07-18 18:50:00')
print(date_1.value)
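A short sketch of those conversions, using integer floor division so the results stay exact integers (the names ns, us, ms and sec are illustrative):

```python
import pandas as pd

date_1 = pd.to_datetime('2020-07-18 18:50:00')
ns = date_1.value      # nanoseconds since the Unix epoch

us = ns // 10**3       # microseconds
ms = ns // 10**6       # milliseconds
sec = ns // 10**9      # seconds
print(us, ms, sec)
```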

Answer 4

If you access a single datetime64 value from a DataFrame, pandas will most likely return a Timestamp object, which is how pandas represents individual datetime64 values.

You can call the to_datetime64() method on that pd.Timestamp object to convert it to a numpy.datetime64 object with ns precision.
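This answer stops at the numpy.datetime64 object. One way to get from there to Unix seconds, sketched here as an assumption rather than as the answer's own method, is to cast to second resolution in NumPy and then to an integer:

```python
import pandas as pd

df = pd.DataFrame({'time': pd.to_datetime(['2019-01-15 13:25:43'])})

ts = df['time'].iloc[0]        # a pandas.Timestamp
np_dt = ts.to_datetime64()     # a numpy.datetime64

# Cast to second resolution, then read the underlying integer count.
unix_sec = int(np_dt.astype('datetime64[s]').astype('int64'))
print(unix_sec)
```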

Answers (4)

Answer 1: score 91, accepted, 2019-01-22 17:26:51
Answer 2: score 20, 2021-04-29 11:57:07
Answer 3: score 1, 2021-03-24 08:25:59
Answer 4: score -2, 2020-07-25 18:55:34
