Getting "Error converting data type VARCHAR to DATETIME" while copying data from Azure Blob to Azure DW through PolyBase

I am new to the Azure environment. I am using Data Factory to copy data from a CSV file in Azure Blob storage, which has three columns (id, age, birth date), to a table in Azure SQL Data Warehouse. The birth date is in the format "MM/dd/yyyy", and I am using PolyBase to copy the data from the blob to my table in Azure DW. The table's columns are defined as (int, int, datetime).

I can copy my data if I use the "Bulk Insert" option in Data Factory, but I get an error when I choose the PolyBase copy. Changing the date format in the pipeline does not help either. PolyBase copies successfully if I change the date format in my file to "yyyy/MM/dd".

Is there a way to copy data from my blob to my table without having to change the date format in the source file to "yyyy/MM/dd"?

I assume you have created an external file format which you reference in your external table?

The CREATE EXTERNAL FILE FORMAT statement has an option that defines how a date is represented, DATE_FORMAT, and you set it to match how your source data represents datetime values.

Something like this:

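-- your_format is a placeholder name (a hyphen is not valid in an unquoted identifier)
CREATE EXTERNAL FILE FORMAT your_format
WITH
(
  FORMAT_TYPE = DELIMITEDTEXT,
  FORMAT_OPTIONS (
      FIELD_TERMINATOR = '|',        -- set this to the delimiter used in your file
      DATE_FORMAT = 'MM/dd/yyyy' )   -- how dates appear in the source file
);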

You can find more about this at: https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-file-format-transact-sql?view=sql-server-ver15

This error is resolved now. I was entering the date format as 'MM/dd/yyyy', whereas Data Factory expected just MM/dd/yyyy without any quotes.

To summarize what I learned while copying data with an 'MM/dd/yyyy' date format from Azure Blob to Azure SQL Data Warehouse:

1) If you are using the Azure portal to copy data from Blob storage to Azure SQL Data Warehouse with the Data Factory copy option:

  • Create a copy data pipeline using Data Factory.
  • Specify your input data source and your destination data store.
  • Under field mappings, choose datetime for the column that contains the date, click the little icon on its right to bring up the custom date format field, and enter your date format without quotes, e.g. MM/dd/yyyy in my case.
  • Run your pipeline and it should complete successfully.

2) You can use PolyBase directly by:

  • Creating an external data source that specifies the location of your input file, e.g. a CSV file on Blob storage in my case.
  • Creating an external file format that specifies the delimiter and the custom date format used in your input file, e.g. MM/dd/yyyy.
  • Creating an external table that defines all the columns present in your source file and uses the external data source and file format you defined above.
  • Creating your own table from the external table with CREATE TABLE AS SELECT (CTAS), as Niels stated in his answer above. I used Microsoft SQL Server Management Studio for this process.
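
The PolyBase steps in option 2 can be sketched roughly as follows. This is a minimal sketch, not the exact script I ran: every object name (BlobCredential, BlobSource, CsvDateFormat, dbo.PeopleExternal, dbo.People), the folder path and the angle-bracket placeholders are hypothetical, and it assumes a database master key already exists and that the file is comma-delimited with the three columns from the question.

```sql
-- Hypothetical names and placeholders throughout; replace them with your own.
-- Assumes a database master key has already been created in this database.

-- Credential holding the storage account access key.
CREATE DATABASE SCOPED CREDENTIAL BlobCredential
WITH IDENTITY = 'user',   -- arbitrary string; not used for access-key authentication
     SECRET   = '<storage-account-access-key>';

-- Step 1: external data source pointing at the blob container.
CREATE EXTERNAL DATA SOURCE BlobSource
WITH (
    TYPE       = HADOOP,
    LOCATION   = 'wasbs://<container>@<account>.blob.core.windows.net',
    CREDENTIAL = BlobCredential
);

-- Step 2: file format with the delimiter and the custom date format.
CREATE EXTERNAL FILE FORMAT CsvDateFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',          -- assumed comma-delimited CSV
        DATE_FORMAT      = 'MM/dd/yyyy'  -- matches the source file
    )
);

-- Step 3: external table describing the columns in the source file.
CREATE EXTERNAL TABLE dbo.PeopleExternal (
    id         INT,
    age        INT,
    birth_date DATETIME
)
WITH (
    LOCATION    = '/people/',            -- hypothetical folder inside the container
    DATA_SOURCE = BlobSource,
    FILE_FORMAT = CsvDateFormat
);

-- Step 4: load into a regular table with CTAS.
CREATE TABLE dbo.People
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT id, age, birth_date
FROM dbo.PeopleExternal;
```

Running SELECT TOP 10 * FROM dbo.PeopleExternal before the CTAS is a quick way to check that the dates parse with the chosen DATE_FORMAT.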

