Azure Data Factory: wrong and unexpected datatype conversion during import from CSV to SQL Server via pipeline
I am trying to load data from a CSV file into a SQL Server database using an Azure Data Factory Copy Data activity. During the import, the data is converted to other types.
In the source preview in the pipeline I see the following:
1- the value "0044" is converted to 44
2- the value 2020000000000000 is converted to 2E+16
3- the value 5.2 is converted to February 5th
4- the value 9.78 is converted to September 1978
So far I could not find a solution for 0044. For the other cases, here is what I did:
For 2, I enclosed the number 2020000000000000 in "" and then it worked, though for some reason it ends up enclosed in four quotes, like so: ""2020000000000000"". For 3 and 4, I replaced the dot with a comma and then it worked.
But I would like to be able to tell the import utility to treat everything as a string and do the conversions in the database.
How can I achieve this?
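For comparison, a plain-text CSV parser keeps every field as a string; the mangling only appears when a consumer (Excel, or a typed copy mapping) re-interprets the values. A minimal Python sketch, illustrative only and not part of the ADF pipeline, with made-up column names:

```python
import csv
import io

# Sample data containing the problematic values from the question.
raw = "Code,Amount,Ratio\n0044,2020000000000000,5.2\n"

reader = csv.DictReader(io.StringIO(raw))
row = next(reader)

# The csv module performs no type inference, so nothing is mangled:
print(row["Code"])    # "0044" keeps its leading zeros
print(row["Amount"])  # "2020000000000000", no scientific notation
print(row["Ratio"])   # "5.2", not a date
```

This is exactly the behavior the question asks for from the copy activity: read everything as String and convert later, inside the database.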
The code shows the following for one of the columns affected by 3 and 4:
{
    "source": {
        "name": "Amount",
        "type": "String"
    },
    "sink": {
        "name": "Amount",
        "type": "String"
    }
}
Best Regards,
The default data type for every column in a CSV file is String.
For Azure SQL Database/SQL Server, we cannot store the value '0044' with the int data type. You need to keep '0044' as a String.
If the column has already been loaded as int, we can convert 44 back to '0044' with a query like:
select right('0000'+ltrim([a]),4) new_a, b from test12
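The same zero-padding logic can be sketched outside of SQL; in Python, str.zfill mirrors the right('0000' + ..., 4) trick for values that fit within the target width (the function name here is hypothetical, for illustration only):

```python
def pad_code(n: int, width: int = 4) -> str:
    """Rebuild the left-zero-padded string form of an int,
    analogous to right('0000' + ltrim(n), width) in T-SQL.

    Note: the T-SQL expression truncates values longer than
    `width` digits, while zfill leaves them untouched.
    """
    return str(n).zfill(width)

print(pad_code(44))  # "0044"
```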
When we copy data from a CSV file, you need to think about whether each value in the CSV file fits a valid data type in Azure SQL Database/SQL Server. For example, the value '2020000000000000' exceeds the range of the int type.
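That overflow is easy to verify: SQL Server's int is a signed 32-bit integer with a maximum of 2^31 − 1 = 2,147,483,647, so this value needs bigint (signed 64-bit) instead. A quick Python check:

```python
INT_MAX = 2**31 - 1      # upper bound of SQL Server's int
BIGINT_MAX = 2**63 - 1   # upper bound of SQL Server's bigint

value = 2020000000000000
print(value > INT_MAX)      # True: the value does not fit in int
print(value <= BIGINT_MAX)  # True: it fits comfortably in bigint
```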
It is very important to design the sink table well. The suggestion is that you first create the sink table in your Azure SQL database with a suitable data type for every column, then set the column mappings in the Copy activity manually:
Mapping settings:
Pipeline running:
Data check in SQL database:
Update:
The issue has now been solved by Ramiro Kollmannsperger himself:
"my sink table in the database has only nvarchar columns. I did this so after a lot of headaches with datatypes and length. I decided that it is easier for me to just do the conversions from nvarchar in the database into a staging table. What helped in the end was to do the schema import in the source Dataset where the csv is read. There is a tab "connection" and next to it another tab "schema" where you can import the schema. After doing this it worked."
Hope this helps.
My sink table in the database has only nvarchar columns. I did this after a lot of headaches with datatypes and lengths. I decided that it is easier for me to just do the conversions from nvarchar in the database into a staging table.
What helped in the end was to do the schema import in the source Dataset where the CSV is read. There is a tab "Connection" and next to it another tab "Schema" where you can import the schema. After doing this it worked.