
Why does Python pandas read a string in an Excel file as datetime?

I have the following questions.

I have Excel files as follows:

[enter image description here]

When I read the file using df = pd.read_excel(file, dtype=str), the value in the first row turns into 2003-02-14 00:00:00, while the rest are displayed as they are.

How do I prevent pd.read_excel() from converting the value into a datetime or anything else?

Thanks!

As @ddejohn correctly pointed out in the comments, this behavior actually comes from Excel, which automatically converts the value to a date. pandas therefore receives that value as a date, and since, as you say, you cannot modify the input Excel file, you have to convert it back to the format you expect afterwards.
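To illustrate why dtype=str does not help here, the following is a minimal standard-library sketch. It assumes the Excel reader hands a date cell over as a datetime value and a text cell as a plain string, and that the string conversion happens only afterwards. The variable names and the sample text value are hypothetical.

```python
from datetime import datetime

# Assumption: a cell that Excel stored as a date arrives as a datetime value,
# while a cell that Excel left as text arrives as a plain string.
date_cell = datetime(2003, 2, 14)
text_cell = "99.99.99"  # hypothetical value that Excel did not treat as a date

# Converting to str afterwards cannot recover what was originally typed.
as_str_date = str(date_cell)
as_str_text = str(text_cell)

print(as_str_date)  # 2003-02-14 00:00:00
print(as_str_text)  # 99.99.99
```

Under that assumption, the original text typed into the cell is already lost by the time pandas sees it, which is why the fix has to be applied after reading.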

Here is a short script to make it work as you expect:

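import pandas as pd

def rev(x: str) -> str:
    '''
    Converts '2003-02-14 00:00:00' back to '14.02.03'.
    Values that do not look like a converted date are returned unchanged.
    '''
    hours = '00:00:00'
    # Skip missing values (NaN) and strings that were never converted to a date
    if not isinstance(x, str) or hours not in x:
        return x
    y = x.split()[0]   # '2003-02-14'
    y = y.split('-')   # ['2003', '02', '14']
    # Reverse to day, month, year and keep the last two characters of each part
    return '.'.join([i[-2:] for i in y[::-1]])

file = r'C:\your\folder\path\Classeur1.xlsx'
df = pd.read_excel(file, dtype=str)

df['column'] = df['column'].apply(rev)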

Replace 'column' in df['column'] with your actual column name. You then get the desired format in your DataFrame.
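As a possible variant, the same conversion can be sketched with datetime.strptime instead of string splitting, so that only strings matching the exact 'YYYY-MM-DD 00:00:00' pattern are rewritten. This is a sketch under that assumption, not part of the original answer; the function name rev_strict is hypothetical.

```python
from datetime import datetime

def rev_strict(x):
    """Return 'DD.MM.YY' for strings like '2003-02-14 00:00:00'; otherwise return x unchanged."""
    # Leave missing values (e.g. NaN) and other non-string objects alone
    if not isinstance(x, str):
        return x
    try:
        parsed = datetime.strptime(x, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        # Not in the converted-date format, so keep the original text
        return x
    # Only treat midnight timestamps as dates that Excel converted
    if (parsed.hour, parsed.minute, parsed.second) != (0, 0, 0):
        return x
    return parsed.strftime("%d.%m.%y")

print(rev_strict("2003-02-14 00:00:00"))  # 14.02.03
print(rev_strict("some text"))            # some text
```

It would be applied the same way, with df['column'].apply(rev_strict). The stricter parsing avoids rewriting unrelated strings that merely happen to contain '00:00:00'.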

