Azure Data Factory - copy task using Rest API is only returning first row upon execution
I have a copy task in ADF that is pulling data from a REST API into an Azure SQL Database. I've created the mappings and pulled in a collection reference as follows:
You will notice it's only outputting one row (the first row) when running the copy task. I know this is usually because you are pulling from a nested JSON array, in which case the collection reference should resolve this so it pulls from the array, but I can't for the life of me get it to pull multiple records even after setting the collection.
There's a trick to this. You import schemas, then you put the name of the array in the collection reference, then you import schemas again, and then it works.

(Screenshot from Azure Data Factory.)
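For reference, a minimal sketch of what the resulting copy activity mapping (a `TabularTranslator`) looks like, assuming the API wraps its records in a top-level `value` array; the `id`/`name` field names here are hypothetical:

```json
{
  "type": "TabularTranslator",
  "mappings": [
    { "source": { "path": "['id']" },   "sink": { "name": "Id" } },
    { "source": { "path": "['name']" }, "sink": { "name": "Name" } }
  ],
  "collectionReference": "$['value']"
}
```

The key point is that `collectionReference` points at the array to iterate, and each `source.path` is then relative to one element of that array.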
Because of an Azure Data Factory design limitation, pulling JSON data and inserting it straight into Azure SQL Database isn't a good approach. Even after using the collection reference you might not get the desired results.
The recommended approach is to store the output of the REST API as a JSON file in Azure Blob Storage with a Copy Data activity. Then you can use that file as the source and do the transformation in a Data Flow. Alternatively, you can use a Lookup activity to get the JSON data and invoke a stored procedure to store the data in Azure SQL Database (this way is cheaper and its performance is better).
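A rough sketch of that Lookup-plus-stored-procedure pattern as pipeline JSON. The activity, dataset, and linked service names are hypothetical, and the stored procedure `usp_LoadFromJson` is assumed to exist on the SQL side and parse the payload (e.g. with OPENJSON):

```json
{
  "activities": [
    {
      "name": "LookupRestJson",
      "type": "Lookup",
      "typeProperties": {
        "source": { "type": "RestSource" },
        "dataset": { "referenceName": "RestApiDataset", "type": "DatasetReference" },
        "firstRowOnly": false
      }
    },
    {
      "name": "WriteToSql",
      "type": "SqlServerStoredProcedure",
      "dependsOn": [
        { "activity": "LookupRestJson", "dependencyConditions": [ "Succeeded" ] }
      ],
      "linkedServiceName": { "referenceName": "AzureSqlLinkedService", "type": "LinkedServiceReference" },
      "typeProperties": {
        "storedProcedureName": "usp_LoadFromJson",
        "storedProcedureParameters": {
          "json": { "value": "@string(activity('LookupRestJson').output.value)", "type": "String" }
        }
      }
    }
  ]
}
```

Keep in mind the Lookup activity caps its output (documented as the first 5,000 rows / a few MB), so this pattern only suits modest payloads.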
Use the flatten transformation to take array values inside hierarchical structures such as JSON and unroll them into individual rows. This process is known as denormalization.

Refer to this third-party tutorial for more details.
Hey, I had this issue and I noticed that the default column names for the JSON branches were really long, and in my target CSV the header row got truncated after a bit. I was able to get ADF working by just renaming them in the mapping section. For example, I had:

['hours']['monday']['openIntervals'][0]['endTime']

in the source and changed it to MondayCloseTime in the destination.
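In mapping JSON terms, that rename is just the sink name on the mapping entry (a sketch; the surrounding translator properties are omitted):

```json
{
  "source": { "path": "['hours']['monday']['openIntervals'][0]['endTime']" },
  "sink": { "name": "MondayCloseTime" }
}
```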
It just started working. You can also just turn off the header on the output for a quick test before rewriting all the column names, as that also got it working for me.
I assume it writes out the truncated header row at the same time as the first row of data and then tries to use that header row afterwards, but since it doesn't match what it's expecting it just ends. It's a bit annoying that it doesn't give an error or anything, but anyway, this worked for me.
Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license. If you need to repost, please cite this site or the original source. For any questions contact: yoyou2525@163.com.