
Incorrect date format in CSV file converting to SQL Server Table with Python

I am reading ~50 CSV files, one for each month over the past few years, and appending them to the same table one after another. After the first year, the date format in the CSV files changed from YYYY-mm-dd to mm/dd/YYYY. SQL Server expects YYYY-mm-dd and accepts it without complaint, but once the format switches in the CSV, my program crashes. I wrote a piece of code to try to convert the data to the correct format, but it didn't work:

if '/' in df['SubmissionDate'].iloc[0]:
    df['SubmissionDate'] = pd.to_datetime(df['SubmissionDate'], format='%m/%d/%Y')

I believe that this would have worked, barring the issue that some of the rows of data have no date, so I need to either find some other way to allow the SQL Insert statement to accept this different date format, or avoid trying to convert the blank items in the Submission Date column.

Any help would be greatly appreciated!

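It sounds like you are not using parse_dates= when loading the CSV file into the DataFrame. In the pandas version I tested, the date parser was able to handle multiple date formats, even within the same file. (This relies on each value being parsed on its own; pandas 2.0 and later infer one format for the whole column, so a mixed column may be left as strings there.)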

import io
import pandas as pd

csv = io.StringIO(
    """\
id,a_date
1,2001-01-01
2,1/2/2001
3,12/31/2001
4,31/12/2001
"""
)
# Read from the in-memory buffer defined above so the example is self-contained.
df = pd.read_csv(csv, parse_dates=["a_date"])
print(df)
"""
   id     a_date
0   1 2001-01-01
1   2 2001-01-02
2   3 2001-12-31
3   4 2001-12-31
"""
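
If you would rather not depend on the parser guessing, here is a minimal sketch that handles the two formats from the question explicitly and leaves blank dates as missing values. The helper name normalize_dates and the sample data are made up for illustration, and it assumes the only formats present are YYYY-mm-dd and mm/dd/YYYY. Passing None for the blank rows is what most database drivers translate to SQL NULL, but check this against the driver you use.

```python
import io
import pandas as pd


def normalize_dates(series):
    # Hypothetical helper: try each known format separately.
    # With errors="coerce", values that do not match the format become NaT.
    iso = pd.to_datetime(series, format="%Y-%m-%d", errors="coerce")
    us = pd.to_datetime(series, format="%m/%d/%Y", errors="coerce")
    # Prefer the ISO result and fall back to the mm/dd/YYYY result.
    # Blank or unrecognized values stay NaT.
    return iso.fillna(us)


# Illustrative data: two formats plus one row with no date.
csv = io.StringIO(
    "id,SubmissionDate\n"
    "1,2001-01-01\n"
    "2,1/2/2001\n"
    "3,\n"
    "4,12/31/2001\n"
)
# Read the column as text so pandas does not try to parse it first.
df = pd.read_csv(csv, dtype={"SubmissionDate": str})
df["SubmissionDate"] = normalize_dates(df["SubmissionDate"])

# Values to hand to the INSERT statement: datetime.date objects, or None
# for rows that have no date.
params = [d.date() if pd.notna(d) else None for d in df["SubmissionDate"]]
print(params)
```

Parsing each format in its own pass means a value can never be silently read as day-first when it was meant as month-first, which is the main risk of letting the parser guess.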


 