
Error when using Pandas to read dates from Excel file and sort them

I'm using Pandas to read an Excel file that contains Title and Date columns. When I manually set up a test version like this:

import pandas as pd

df = pd.DataFrame(data={'Title': ['Movie1', 'Movie2', 'Movie3', 'Movie4'],
                    'Date': ['1991-11', '1991', '1991', '1991-10-31']})
print(df)

It prints as expected and, most importantly, I'm able to sort it exactly how I would like by using print(df.sort_values('Date')). Below is the output I'm ultimately trying to achieve. As you can see, there are instances of YYYY-MM-DD, YYYY-MM and just YYYY.

    Title        Date
1  Movie2        1991
2  Movie3        1991
3  Movie4  1991-10-31
0  Movie1     1991-11

My issues occur when I try to run print(df.sort_values('Date')) on the actual Excel file, which I read using read_excel. I get TypeError: '<' not supported between instances of 'int' and 'str'.
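The error can be reproduced without Excel or Pandas. The sketch below assumes, as the answer further down suggests, that the year-only cells come back as ints while the text-formatted cells come back as strs; the list contents are made up for illustration.

```python
# Assumption: the 'Date' column ends up holding a mix of ints
# (year-only cells) and strs (cells entered as text).
dates = [1991, '1991-11', 1991, '1991-10-31']

try:
    sorted(dates)  # sorting has to compare an int with a str
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Once every value is a str, the comparison is well defined again.
sorted_as_text = sorted(str(d) for d in dates)
print(sorted_as_text)
```

Sorting the string versions gives the same order as the manual test above, because the values all use zero-padded, year-first formats.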

I've narrowed it down to how I'm entering YYYY-MM and YYYY-MM-DD dates into the Excel file. If I run it with just YYYY dates, it sorts correctly. To get YYYY-MM and YYYY-MM-DD dates to display correctly in the Excel file, I have to prepend a back-tick to them. Perhaps this is what is causing the problem.

Hopefully someone else has run into this before. Is there a way to read those dates with leading back-ticks correctly, using Pandas?

Or, is there a better way to enter dates into the Excel file for use with Pandas? (This may be just as much an Excel question as a Pandas question).

To fix the TypeError, you can convert the ints in the 'Date' column to strings, so that every value in the column has the same type.

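# Cast every value to str so the column no longer mixes ints and strs
df['Date'] = df['Date'].astype(str)
# Parse into a separate datetime column; the original text is kept
df['pdDate'] = pd.to_datetime(df['Date'])
# sort_values returns a new DataFrame, so assign the result
df = df.sort_values('pdDate')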

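The code above first casts the column to strings, then creates a new column with the dates converted to datetime and sorts by that column. The 'Date' column keeps the same formats as in Excel, and the rows are sorted correctly. Sorting the column as strings also works once every value is a string, but the order is then lexicographic, which matches chronological order only while the dates stay in a consistent, zero-padded, year-first format.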
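As a minimal end-to-end sketch of this approach, the DataFrame below is a hypothetical stand-in for what read_excel might return (ints for year-only cells, strs for the rest). It parses each value individually rather than in one to_datetime call, on the assumption that some pandas versions are stricter about a column whose values use differing formats.

```python
import pandas as pd

# Hypothetical stand-in for the read_excel result: year-only cells
# arrive as ints, the text-formatted cells as strs.
df = pd.DataFrame({'Title': ['Movie1', 'Movie2', 'Movie3', 'Movie4'],
                   'Date': ['1991-11', 1991, 1991, '1991-10-31']})

# Make the column uniformly text so values can be compared and parsed.
df['Date'] = df['Date'].astype(str)

# Parse each value on its own, so the differing formats do not have
# to share one inferred format. A bare year becomes January 1.
df['pdDate'] = df['Date'].map(pd.to_datetime)

# sort_values returns a new DataFrame; mergesort is stable, so rows
# with equal dates keep their original relative order.
result = df.sort_values('pdDate', kind='mergesort')
print(result[['Title', 'Date']])
```

Another option, not shown here because it needs a real file, is to pass dtype={'Date': str} to read_excel so the column is read as text from the start.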
