Azure Data Factory schema mapping not working with SQL sink
I have a simple pipeline that loads data from a CSV file into an Azure SQL database.
I have added a data flow in which I made sure the schema matches the SQL table. One field contains numbers with leading zeros. Its data type in the source projection is set to string, the field is mapped to the SQL sink as a string, and the SQL column is nvarchar(50).
Once the pipeline runs, all the leading zeros are lost and the field appears to be treated as a decimal:
Original data: 0012345
Inserted data: 12345.0
The CSV data shows correctly in the data preview, but for some reason it loses its formatting during the insert.
Any ideas how I can get it to insert correctly?
I reproduced this in my lab and was able to load the data as expected. Please see the repro details below.
Source file (CSV file):
Sink table (SQL table):
ADF:
Connect the data flow source to the CSV source file. As my file is in text format, all the source columns in the projection are strings.
Source data preview:
Connect the sink to the Azure SQL database to load the data into the destination table.
Note: You can also add a derived column before the sink to convert the value to string, as the sink data type is a string.
Thank you very much for your response.
As per your post, the data flow appears to be working correctly. I finally found the issue in the transformation step: I have an Azure Batch service that runs a Python script, which does a basic transformation and saves the output to a CSV file.
Interestingly, when I preview the data in the data flow it looks as expected. However, the values stored in SQL do not.
For the sake of others with a similar issue: my existing Python script converted a float column to string. The conversion kept one decimal place, and since all of my numbers are integers, they ended up with a trailing .0.
The solution was to convert the values to integer first and then to string:
df['col_name'] = df['col_name'].astype('Int64').astype('str')
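For anyone who wants to see the difference in isolation, here is a minimal sketch. The sample values, the in-memory CSV text and the fixed width of 7 are assumptions made for illustration; only the column name and the Int64 cast come from the script above. Note that the cast removes the trailing .0 but does not bring back leading zeros, so the sketch also shows two possible ways to keep them: reading the column as string in the first place, or re-padding with zfill if the width is known.

```python
import io
import pandas as pd

# A float column, as in the script described above (values are illustrative).
df = pd.DataFrame({"col_name": [12345.0, 42.0]})

# Casting float straight to string keeps the trailing ".0".
as_float_str = df["col_name"].astype(str).tolist()

# Going through the nullable integer dtype first drops the ".0".
as_int_str = df["col_name"].astype("Int64").astype(str).tolist()

# Reading the CSV with dtype=str skips numeric inference entirely,
# so leading zeros survive. The CSV text here is a made-up sample.
csv_text = "col_name\n0012345\n0000042\n"
raw = pd.read_csv(io.StringIO(csv_text), dtype=str)
preserved = raw["col_name"].tolist()

# If the zeros are already gone, zfill can re-pad to a fixed width.
WIDTH = 7  # assumption: every value is meant to be 7 characters wide
repadded = (
    df["col_name"].astype("Int64").astype(str).str.zfill(WIDTH).tolist()
)

print(as_float_str, as_int_str, preserved, repadded)
```

Which option fits depends on whether the transformation actually needs the column as a number at any point. If it does not, keeping it as a string from the first read is probably the simplest route.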