
Converting objects from CSV into datetime

I've got an imported CSV file which has multiple columns with dates in the format "5 Jan 2001 10:20". (Note that the day is not zero-padded.)

If I check df.dtypes, the columns show up as object rather than as strings or datetimes. I need to be able to subtract two column values to work out the difference, so I'm trying to get them into a state where I can do that.

At the moment if I try the test subtraction at the end I get the error unsupported operand type(s) for -: 'str' and 'str' .

I've tried multiple methods but have run into a problem every way I've tried. Any help would be appreciated. If I need to give any more information then I will.

As suggested by @MaxU, you can use pd.to_datetime() method to bring the values of the given column to the 'appropriate' format, like this:

df['datetime'] = pd.to_datetime(df.datetime)

You would have to do this for every column that needs to be transformed to the right dtype.
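As a minimal sketch of that conversion for the format in the question, the example below passes an explicit format string so pandas does not have to guess. The column names "start" and "end" and the sample values are made up for illustration; they are not from the original post.

```python
import pandas as pd

# Hypothetical data: the column names and values are assumptions.
df = pd.DataFrame({
    "start": ["5 Jan 2001 10:20", "17 Feb 2001 08:05"],
    "end": ["6 Jan 2001 12:50", "17 Feb 2001 09:35"],
})

# %d also accepts a day that is not zero-padded when parsing.
fmt = "%d %b %Y %H:%M"
for col in ["start", "end"]:
    df[col] = pd.to_datetime(df[col], format=fmt)

# Both columns are now datetimes, so subtraction yields timedeltas.
df["elapsed"] = df["end"] - df["start"]
print(df.dtypes)
print(df["elapsed"])
```

Passing format is optional, but it avoids ambiguity and is usually faster than letting pandas infer the layout.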

Alternatively, you can use parse_dates argument of pd.read_csv() method, like this:

df = pd.read_csv(path, parse_dates=[1,2,3])

where columns 1,2,3 are expected to contain data that can be interpreted as dates.
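A small self-contained sketch of the parse_dates route follows. It reads from an in-memory string instead of a real file, and the column names "id", "start" and "end" are assumptions chosen for the example. parse_dates accepts column names as well as positions.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = """id,start,end
1,5 Jan 2001 10:20,6 Jan 2001 12:50
2,17 Feb 2001 08:05,17 Feb 2001 09:35
"""

# Ask read_csv to parse the named columns as dates while loading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["start", "end"])

print(df.dtypes)
print(df["end"] - df["start"])
```

If a column contains values that cannot be parsed, read_csv may leave it as object, so it is worth checking df.dtypes afterwards.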

I hope this helps.

Convert a column to datetime using this approach:

df["Date"] = pd.to_datetime(df["Date"])

If the column has empty or unparseable values, set errors='coerce' so that they become NaT instead of raising an error:

df["Date"] = pd.to_datetime(df["Date"], errors='coerce')

After which you should be able to subtract two dates.
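To illustrate what errors='coerce' does, here is a short sketch with made-up values: one valid date, one missing value, and one string that is not a date. The explicit format is an assumption based on the layout given in the question.

```python
import pandas as pd

# Hypothetical column with a valid date, a missing value, and garbage.
raw = pd.Series(["5 Jan 2001 10:20", None, "not a date"])

# Values that cannot be parsed become NaT instead of raising an error.
parsed = pd.to_datetime(raw, format="%d %b %Y %H:%M", errors="coerce")

print(parsed)
print(parsed.isna().sum())  # number of entries that could not be parsed
```

NaT propagates through subtraction, so rows with a missing date produce a missing difference rather than an exception.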

example:

import pandas
# Build the frame directly from the two timestamp columns
df = pandas.DataFrame({
    'to': [pandas.Timestamp('2014-01-24 13:03:12.050000'), pandas.Timestamp('2014-01-27 11:57:18.240000'), pandas.Timestamp('2014-01-23 10:07:47.660000')],
    'fr': [pandas.Timestamp('2014-01-26 23:41:21.870000'), pandas.Timestamp('2014-01-27 15:38:22.540000'), pandas.Timestamp('2014-01-23 18:50:41.420000')],
})
# Difference in (fractional) hours; recent pandas versions reject astype('timedelta64[h]')
df['ans'] = (df['fr'] - df['to']) / pandas.Timedelta(hours=1)

consult this answer for more details:

Calculate Pandas DataFrame Time Difference Between Two Columns in Hours and Minutes

If you want to load the column as datetime directly while reading the CSV, consider this example:

Pandas read csv dateint columns to datetime

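I found that the problem was to do with missing values within the column. Coercing them, so df["Date"] = pd.to_datetime(df["Date"], errors='coerce'), solves the problem. (Old pandas versions spelled this option coerce=True; current versions only accept errors='coerce'.)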
