
Azure Data Factory Pipeline trigger time

This may be simple, but I am having a hard time understanding the exact trigger time of an Azure Data Factory pipeline. I followed the MS tutorial to create a DF pipeline that copies data from Blob to Azure SQL.

I created the pipeline at "1-March 16:14 IST (10:44 AM UTC)" with the schedule below:

Start date - 02/28/2017 12:00 AM UTC

End date - 03/04/2017 11:59 PM UTC

Recurring every 1 Day

After the pipeline was created, it immediately ran for the window below:

Window Start - 02/28/2017 12:00 AM UTC

Window End - 03/01/2017 12:00 AM UTC

Attempt End - 03/01/2017 10:45 AM UTC

Attempt Start - 03/01/2017 10:44 AM UTC

Now my question is why it didn't run for the window (03/01/17 12:00 AM UTC to 03/02/17 12:00 AM UTC), given that the pipeline was created within that window. I mean it ran for the previous day's window but not for the current day's window.

So what is the exact time at which a pipeline is triggered for each window?


As asked by Paul, here are more configuration values:

Pipeline:

"policy": {
    "timeout": "1.00:00:00",
    "concurrency": 1,
    "executionPriorityOrder": "NewestFirst",
    "style": "StartOfInterval",
    "retry": 3,
    "longRetry": 0,
    "longRetryInterval": "00:00:00"
},
"scheduler": {
    "frequency": "Day",
    "interval": 1
},

"start": "2017-02-28T00:00:00Z",
"end": "2017-03-04T23:59:00Z",

Source Dataset:

"availability": {
    "frequency": "Day",
    "interval": 1
},
"external": true,
"policy": {},

Destination Dataset:

"availability": {
    "frequency": "Day",
    "interval": 1
},
"external": false,
"policy": {},

Below is the execution log:

Start & End Time
03/01/2017 12:00 AM UTC - 03/02/2017 12:00 AM UTC
Attempt Time : 03/02/2017 12:01 AM

Can you please provide the JSON for the pipeline schedule, the dataset internals (in and out) and the copy activity scheduler?

The attribute values from these 4 different blocks of code are what affect the ADF time slice behaviour. There will be something you've missed in your configuration when you provisioned the slices. Also be mindful that time slices are very different from a SQL Agent schedule, despite the poorly named JSON attribute 'schedule'! It is simply the start and end of the timeline that is to be sliced up by the defined intervals.
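To make the "timeline sliced up by intervals" idea concrete, here is a minimal Python sketch using the start, end and daily interval from the question. It is only an illustration, not ADF's actual scheduler: the helper name `daily_windows` is hypothetical, and dropping a trailing window that does not fit entirely before the end time is an assumption.

```python
from datetime import datetime, timedelta, timezone


def daily_windows(start, end, interval_days=1):
    """Yield (window_start, window_end) pairs that fit entirely inside [start, end].

    Assumption: a trailing partial window is dropped. ADF may treat it differently.
    """
    step = timedelta(days=interval_days)
    window_start = start
    while window_start + step <= end:
        yield window_start, window_start + step
        window_start += step


# Values taken from the pipeline "start" and "end" in the question.
start = datetime(2017, 2, 28, 0, 0, tzinfo=timezone.utc)
end = datetime(2017, 3, 4, 23, 59, tzinfo=timezone.utc)

windows = list(daily_windows(start, end))
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Each pair is one time slice. The pipeline's start and end only bound the timeline; they do not say when any individual slice is executed.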

Additionally, there are settings that state what order to run things in and when the time slice should execute, e.g. at the start or the end.
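The log in the question suggests that each daily window was attempted only after it had closed: the 02/28 window ran at creation time on 03/01, and the 03/01 window ran at 12:01 AM on 03/02. The sketch below models just that observation in Python, together with an ordering flag named after the `executionPriorityOrder` value in the question. The function `ready_windows` is hypothetical, the "run after the window ends" rule is an assumption drawn from the log rather than from ADF documentation, and the sketch does not track slices that have already completed.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def ready_windows(windows, now, order="NewestFirst"):
    """Return windows whose end has passed at `now`, ordered for execution.

    Assumption: a slice is eligible only once its window has closed.
    """
    ready = [w for w in windows if w[1] <= now]
    # Newest window first when order is "NewestFirst", oldest first otherwise.
    return sorted(ready, key=lambda w: w[0], reverse=(order == "NewestFirst"))


day = timedelta(days=1)
first = datetime(2017, 2, 28, tzinfo=UTC)
# Four daily windows starting 02/28, 03/01, 03/02 and 03/03.
windows = [(first + i * day, first + (i + 1) * day) for i in range(4)]

# Pipeline creation time from the question: 03/01/2017 10:44 AM UTC.
at_creation = ready_windows(windows, datetime(2017, 3, 1, 10, 44, tzinfo=UTC))

# Just after the next midnight: 03/02/2017 12:01 AM UTC.
after_midnight = ready_windows(windows, datetime(2017, 3, 2, 0, 1, tzinfo=UTC))
```

Under these assumptions, only the 02/28 window is eligible at creation time, and the 03/01 window becomes eligible just after midnight on 03/02, which matches the attempt times reported in the question.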

This is a handy Microsoft article that I often refer to:

https://docs.microsoft.com/en-us/azure/data-factory/data-factory-scheduling-and-execution

Hope this helps.
