Showing out of bounds nanosecond timestamp error when converting column to datetime format

Question

I am using the Meteorite Landings dataset, which can be found here: https://www.kaggle.com/nasa/meteorite-landings#meteorite-landings.csv

A snap of the data: https://imgur.com/a/CrwL3h6

The dataset has a 'year' column, which I renamed to 'year1':

data = data.rename(columns = {"year":"year1"})

The year1 column looks like this:

0        01/01/1880 12:00:00 AM
1                 1/1/1951 0:00
2                 1/1/1952 0:00
3                 1/1/1976 0:00
4                 1/1/1902 0:00
                  ...          
45711             1/1/1990 0:00
45712             1/1/1999 0:00
45713             1/1/1939 0:00
45714             1/1/2003 0:00
45715             1/1/1976 0:00
Name: year1, Length: 45716, dtype: object

I want to convert this column to datetime format so that I can keep only the year. The day, month, and time are the same in every row and carry no information, and the column is named 'year' anyway.

I used this:

data['year1'] = pd.to_datetime(data['year1'])

It shows an error when I try to do so:

OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1583-01-01 00:00:00

In order to solve this, I tried using this:

data['year1'] = pd.to_datetime(data['year1'],errors='coerce')

But even after doing so, the year1 column is not in datetime format.

What can I do to convert it into datetime format?

Sample Data:

                name   id nametype     recclass      mass  fall    year    reclat    reclong               GeoLocation
              Aachen    1    Valid           L5      21.0  Fell  1880.0  50.77500    6.08333     (50.775000, 6.083330)
              Aarhus    2    Valid           H6     720.0  Fell  1951.0  56.18333   10.23333    (56.183330, 10.233330)
                Abee    6    Valid          EH4  107000.0  Fell  1952.0  54.21667 -113.00000  (54.216670, -113.000000)
            Acapulco   10    Valid  Acapulcoite    1914.0  Fell  1976.0  16.88333  -99.90000   (16.883330, -99.900000)
             Achiras  370    Valid           L6     780.0  Fell  1902.0 -33.16667  -64.95000  (-33.166670, -64.950000)
            Adhi Kot  379    Valid          EH4    4239.0  Fell  1919.0  32.10000   71.80000    (32.100000, 71.800000)
 Adzhi-Bogdo (stone)  390    Valid        LL3-6     910.0  Fell  1949.0  44.83333   95.16667    (44.833330, 95.166670)
                Agen  392    Valid           H5   30000.0  Fell  1814.0  44.21667    0.61667     (44.216670, 0.616670)
              Aguada  398    Valid           L6    1620.0  Fell  1930.0 -31.60000  -65.23333  (-31.600000, -65.233330)
       Aguila Blanca  417    Valid            L    1440.0  Fell  1920.0 -30.86667  -64.55000  (-30.866670, -64.550000)

Answer 1

Pandas, with its default nanosecond resolution, cannot represent datetimes earlier than 1677-09-21 (pd.Timestamp.min), which is why 1583-01-01 raises OutOfBoundsDatetime. But no matter, because your input CSV file has the year column as exactly that: the year alone. So just stop doing whatever you're doing that converts the year column into datetimes, and load it as a plain integer column.
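
If the column has already ended up as date strings like the ones in the question, one possible way to get a plain integer year without going through datetimes at all is to pull the digits out of the text. This is only a rough sketch: the sample Series below is made up to imitate the question's formats, the names digits and year are hypothetical, and it assumes every year is written with four digits after a slash.

```python
import pandas as pd

# Earliest value a nanosecond-resolution Timestamp can hold (falls in 1677)
print(pd.Timestamp.min)

# Hypothetical sample imitating the mixed string formats shown in the question
year1 = pd.Series(
    ["01/01/1880 12:00:00 AM", "1/1/1951 0:00", "1/1/1583 0:00", None],
    name="year1",
)

# Capture the four digits after a slash; rows with no match become NaN
digits = year1.str.extract(r"/(\d{4})", expand=False)

# Convert to a nullable integer column so missing years stay missing
year = pd.to_numeric(digits, errors="coerce").astype("Int64")
print(year)
```

The nullable Int64 dtype is used here so that rows with no year stay missing instead of forcing the whole column to float.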
