
ADF Pipeline Adding Sequential Value in Copy Activity

Apologies if this has been asked and answered elsewhere. If it has, please refer to the URL in the comments on the replies. So here is the situation:

I am making an API request and receive an auth_token in response, which I use in the Copy Activity as the Authorization header to retrieve data in JSON format and sink it to an Azure SQL Database. I am able to map all the elements I receive in the JSON to the columns of the Azure SQL Database. However, there are two columns ( UploadId and RowId ) that still need to be populated.

  • UploadId is a GUID which will be the same for the whole batch of rows (this I've managed to solve)
  • RowId will be a sequence starting at 1 and running to the end of that batch; for the next batch (with a new GUID value) it resets back to 1.

The database will look something like this:

| APILoadTime |      UploadId     |    RowId    |
|  2020-02-01 | 29AD7-12345-22EwQ |      1      |
|  2020-02-01 | 29AD7-12345-22EwQ |      2      |
|  2020-02-01 | 29AD7-12345-22EwQ |      3      |
|  2020-02-01 | 29AD7-12345-22EwQ |      4      |
|  2020-02-01 | 29AD7-12345-22EwQ |      5      |
--------------------------------------------------> End of Batch One / Start of Batch Two
|  2020-02-01 | 30AD7-12345-22MLK |      1      |
|  2020-02-01 | 30AD7-12345-22MLK |      2      |
|  2020-02-01 | 30AD7-12345-22MLK |      3      |
|  2020-02-01 | 30AD7-12345-22MLK |      4      |
|  2020-02-01 | 30AD7-12345-22MLK |      5      |
--------------------------------------------------> End of Batch Two and so on ... 
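For reference, the sink table sketched above could be defined roughly as follows. This is an assumption for illustration only: the table and column names mirror the question, but the types are guesses (the sample UploadId values are not GUID-shaped, so a string column may fit better than uniqueidentifier):

```sql
-- Hypothetical sink table; names follow the question, types are assumed.
CREATE TABLE dbo.ApiLoad
(
    APILoadTime date        NOT NULL,
    UploadId    varchar(50) NOT NULL,  -- same GUID-like value for a whole batch
    RowId       int         NOT NULL   -- 1..n within each batch
);
```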

Is there a way in Azure Data Factory's Copy Activity to achieve this RowId behavior, or is it even possible within Azure SQL Database?

Apologies for the long description, and thank you in advance for any help! Regards

You need to use a window function to achieve this. ADF Data Flows have a Window transformation that is designed to do exactly this.

Otherwise, you could load the data into a staging table and then use Azure SQL to window the data as you select it out, like:

SELECT
    APILoadTime
    ,UploadId
    ,ROW_NUMBER() OVER (PARTITION BY UploadId ORDER BY APILoadTime) AS RowId
FROM dbo.MyTable;
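The windowed SELECT above can also feed an INSERT into the final table directly. A sketch, assuming hypothetical staging and final table names (dbo.MyStagingTable and dbo.MyFinalTable are placeholders, not from the question):

```sql
-- Move staged rows into the final table, numbering rows per batch.
INSERT INTO dbo.MyFinalTable (APILoadTime, UploadId, RowId)
SELECT
    APILoadTime
    ,UploadId
    ,ROW_NUMBER() OVER (PARTITION BY UploadId ORDER BY APILoadTime) AS RowId
FROM dbo.MyStagingTable;
```

PARTITION BY UploadId is what makes the numbering restart at 1 for each new batch GUID.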

Thanks a lot @Leon Yue and @JeffRamos, I've managed to figure out the solution, so I'm posting it here for everyone else who might encounter the same situation.

The solution I found was to use a Stored Procedure activity within Azure Data Factory, run before the Azure Data Flow activity. This is the code I used for creating the RowId reseed procedure:

CREATE PROCEDURE resetRowId
AS
BEGIN
    -- Note: DBCC CHECKIDENT takes a *table* name, not a database name.
    -- Replace dbo.MyTable with the sink table whose IDENTITY column should reset.
    DBCC CHECKIDENT ('dbo.MyTable', RESEED, 0);
END
GO

Once I have this Stored Procedure, all I did was something like this:

[Screenshot: Azure Data Factory pipeline resetting RowId]

This does it for you; the reason I kept it at 0 is so that when new data comes in, it starts from 1 again.
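Reseeding to 0 yields 1 on the next insert because SQL Server issues current seed + increment. This assumes RowId is defined as an IDENTITY column in the sink table, along these lines (an assumed definition, not shown in the original answer):

```sql
-- Assumed column definition: with IDENTITY(1,1), after RESEED 0
-- the next inserted row gets RowId = 0 + 1 = 1.
RowId int IDENTITY(1,1) NOT NULL
```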

Hope this helps others too.

Thank you all who helped in some way.


 