
Stream Analytics: best parameters for auto-pausing a day-wise TUMBLINGWINDOW stream job, and the best trigger time to set for that function

Context

I have a day-wise TUMBLINGWINDOW (similar to the one shown below):

SELECT
    DATEADD(day, -1, System.Timestamp()) AS WindowStart,
    System.Timestamp() AS WindowEnd, 
    TollId, 
    COUNT(*)
FROM Input TIMESTAMP BY EntryTime  
GROUP BY TumblingWindow(day, 1), TollId

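I have been reading the auto-pause documentation and have started following the steps it describes. I have a test stream job and a function app that hosts the auto-pause PowerShell code, all set up so that I can experiment without affecting the actual job (I am using a separate test job for now). The PowerShell code is unchanged apart from the parameter values. I have not actually started the test stream job yet; I plan to do so once I have a better idea of which parameters and trigger times to use for the auto-pause.

Here is a previous Stack Overflow post (which I created) that gives additional explanation of what I am trying to achieve:

Post explaining how the start time works, with a concrete example illustrating how I would like the job to pause

Summary of the background scenario from the other post

The goal is to have the stream job run once a day, for long enough that the day-wise TUMBLINGWINDOW can output the full day's data each day. To leave enough time for this, I think the job can stay off for most of the day (from 00:30 UTC onwards) and switch on at 23:30 UTC to "catch up on the backlog" for the day (00:00-23:30 UTC). The day-wise window then outputs at 00:00 UTC, and the job switches off afterwards, say at 00:30 UTC (leaving enough time to make sure the output is written). This cycle then repeats every day.

My question

Do the main parameters I have chosen (added below) match my intention as described above? If so, how do I set the trigger time of the function app so that the code runs as expected with these parameters?

Do I set the trigger to run the script at 23:30 and 00:30 (the documentation mentions this is done with a CRON expression), since at those two points it needs to start or stop the job respectively?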

# This snippet is taken from the auto-pause doc linked above

# Set my own values in minutes based on above discussion
$restartThresholdMinute = 1380 # This is M (1380min = 23hours ie time left off 00:30-23:30 UTC)
$stopThresholdMinute = 60 # This is N (60min = 1hours ie time left on 23:30-00:30 UTC)

# Have left these as default due to present advice
$maxInputBacklog = 0 # The amount of backlog we tolerate when stopping the job (in event count, 0 is a good starting point)
$maxWatermark = 10 # The amount of watermark we tolerate when stopping the job (in seconds, 10 is a good starting point at low SUs)

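Side note:

If my parameters are not a good starting choice, what other suggestions are there? Please keep in mind the main constraints I discussed in the Context section.

Edit: update 2022-03-16

@Florian Based on my understanding of what you mentioned in your post, I have some ideas, but I am not sure of the best way to handle this. It would be great if you could add an adaptation of your code for this implementation to your answer.

  • The overall structure of the PowerShell script can stay the same. Eventually it would be good to change the console output messages as well, but I have not done that yet.
    • One main difference would be the if-else logic for starting/stopping the job, which would have a condition comparing the current time against a predefined set value rather than relying on M and N.
    • Perhaps the watermark and backlog checks could be kept purely as console output for reference, but removed from all the conditions.
    • -OutputStartMode LastOutputEventTime as the start_time option (I think this is basically the "when last stopped" time), to ensure we do not lose any data and get the full day's data you mentioned in the previous post.
  • For this initial concept I have kept almost all of the documentation code (even where it may not be needed), only adding a few variables and changing the stop/start if-else conditions.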
# Input bindings are passed in via param block.
Param($Timer)

# Stop on error
$ErrorActionPreference = 'stop'

# Write an information log with the current time.
$currentUTCtime = (Get-Date).ToUniversalTime()
$currentUTCstringtime = Get-Date -Date $currentUTCtime -UFormat %R  # nishcs edit: Getting the 24hour UTC time format as a string
Write-Host "asaRobotPause - PowerShell timer trigger function is starting at time: $currentUTCtime"

# Set variables
[string]$restartTime = $env:restartTime # nishcs edit: Set this to '23:30'. These could in fact be hard-coded (perhaps it is best practice to keep them as function app settings, not sure).
[string]$stopTime = $env:stopTime # nishcs edit: Set this to '00:30'. These could in fact be hard-coded (perhaps it is best practice to keep them as function app settings, not sure).

$maxInputBacklog = $env:maxInputBacklog
$maxWatermark = $env:maxWatermark

$restartThresholdMinute = $env:restartThresholdMinute
$stopThresholdMinute = $env:stopThresholdMinute

$subscriptionId = $env:subscriptionId
$resourceGroupName = $env:resourceGroupName
$asaJobName = $env:asaJobName

$resourceId = "/subscriptions/$($subscriptionId )/resourceGroups/$($resourceGroupName)/providers/Microsoft.StreamAnalytics/streamingjobs/$($asaJobName)"

# Check if managed identity has been enabled and granted access to a subscription, resource group, or resource
$AzContext = Get-AzContext -ErrorAction SilentlyContinue
if (-not $AzContext.Subscription.Id)
{
    Throw ("Managed identity is not enabled for this app or it has not been granted access to any Azure resources. Please see https://learn.microsoft.com/en-us/azure/app-service/overview-managed-identity for additional details.")
}

try
{
    # throw "This is an error."
    
    # Check current ASA job status
    $currentJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotPause - Job $($asaJobName) is $($currentJobState)."

    # Switch state
    if ($currentJobState -eq "Running")
    { 
        # Get-AzActivityLog issues warnings about deprecation coming in future releases, here we ignore them via -WarningAction Ignore
        # We check in 1000 record of history, to make sure we're not missing what we're looking for. It may need adjustment for a job that has a lot of logging happening.
        # There is a bug in Get-AzActivityLog that triggers an error when Select-Object First is in the same pipeline (on the same line). We move it down.
        $startTimeStamp = Get-AzActivityLog -ResourceId $resourceId -MaxRecord 1000 -WarningAction Ignore | Where-Object {$_.EventName.Value -like "Start Job*"}
        $startTimeStamp = $startTimeStamp | Select-Object -First 1 | Foreach-Object {$_.EventTimeStamp}

        # Get-AzMetric issues warnings about deprecation coming in future releases, here we ignore them via -WarningAction Ignore
        $currentBacklog = Get-AzMetric -ResourceId $resourceId -TimeGrain 00:01:00 -MetricName "InputEventsSourcesBacklogged" -DetailedOutput -WarningAction Ignore
        $currentWatermark = Get-AzMetric -ResourceId $resourceId -TimeGrain 00:01:00 -MetricName "OutputWatermarkDelaySeconds" -DetailedOutput -WarningAction Ignore

        # Metrics always lag 1-3 minutes behind, so grabbing the last N minutes actually means checking N+3. This may be overly safe and can be fine-tuned down per job.
        $Backlog =  $currentBacklog.Data | `
                        Where-Object {$_.Maximum -ge 0} | `
                        Sort-Object -Property Timestamp -Descending | `
                        Where-Object {$_.Timestamp -ge $startTimeStamp} | `
                        Select-Object -First $stopThresholdMinute | 
                        Measure-Object -Sum Maximum
        $BacklogSum = $Backlog.Sum

        $Watermark = $currentWatermark.Data | `
                        Where-Object {$_.Maximum -ge 0} | `
                        Sort-Object -Property Timestamp -Descending | `
                        Where-Object {$_.Timestamp -ge $startTimeStamp} | `
                        Select-Object -First $stopThresholdMinute | `
                        Measure-Object -Average Maximum
        $WatermarkAvg = [int]$Watermark.Average

        Write-Output "asaRobotPause - Job $($asaJobName) is running since $($startTimeStamp) with a sum of $($BacklogSum) backlogged events, and an average watermark of $($WatermarkAvg) sec, for $($Watermark.Count) minutes."

        # nishcs edit: Conditions no longer reliant on the M and N minute. Just on the predefined start/ stop time that have been set.
        if (
            ($currentUTCstringtime -eq $stopTime)
            )
        {
            Write-Output "asaRobotPause - Job $($asaJobName) is stopping..."
            Stop-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName
        }
        else {
            Write-Output "asaRobotPause - Job $($asaJobName) is not stopping yet, current time $($currentUTCstringtime) is not the configured stop time $($stopTime)."
        }
    }

    elseif ($currentJobState -eq "Stopped")
    {
        # Get-AzActivityLog issues warnings about deprecation coming in future releases, here we ignore them via -WarningAction Ignore
        # We check in 1000 record of history, to make sure we're not missing what we're looking for. It may need adjustment for a job that has a lot of logging happening.
        # There is a bug in Get-AzActivityLog that triggers an error when Select-Object First is in the same pipeline (on the same line). We move it down.
        $stopTimeStamp = Get-AzActivityLog -ResourceId $resourceId -MaxRecord 1000 -WarningAction Ignore | Where-Object {$_.EventName.Value -like "Stop Job*"}
        $stopTimeStamp = $stopTimeStamp | Select-Object -First 1 | Foreach-Object {$_.EventTimeStamp}

        # Get-Date returns a local time, we project it to the same time zone (universal) as the result of Get-AzActivityLog that we extracted above
        $minutesSinceStopped = ((Get-Date).ToUniversalTime()- $stopTimeStamp).TotalMinutes

        # nishcs edit: Conditions no longer reliant on the M and N minute. Just on the predefined start/ stop time that have been set. 
        if ($currentUTCstringtime -eq $restartTime)
        {
            Write-Output "asaRobotPause - Job $($asaJobName) was paused $([int]$minutesSinceStopped) minutes ago, set interval is $($restartThresholdMinute), it is now starting..."
            Start-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName -OutputStartMode LastOutputEventTime
        }
        else{
            Write-Output "asaRobotPause - Job $($asaJobName) was paused $([int]$minutesSinceStopped) minutes ago, set interval is $($restartThresholdMinute), it will not be restarted yet."
        }
    }
    else {
        Write-Output "asaRobotPause - Job $($asaJobName) is not in a state I can manage: $($currentJobState). Let's wait a bit, but consider helping if that doesn't go away!"
    }

    # Final ASA job status check
    $newJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotPause - Job $($asaJobName) was $($currentJobState), is now $($newJobState). Job completed."
}
catch
{
    throw $_.Exception.Message
}
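
One caveat with the exact string comparison in the script above: $currentUTCstringtime -eq $stopTime is only true if the function happens to run during that exact minute. A minimal sketch of a more tolerant check, assuming a hypothetical helper named Test-NearTimeOfDay (not part of the documentation script) and an arbitrary 5-minute tolerance:

```powershell
# Hypothetical helper: returns $true when $Now falls within
# $ToleranceMinutes of the 'HH:mm' time of day given in $Target.
function Test-NearTimeOfDay {
    param(
        [Parameter(Mandatory)] [datetime] $Now,
        [Parameter(Mandatory)] [string]   $Target,
        [int] $ToleranceMinutes = 5   # assumed value, tune per trigger frequency
    )

    # Parse 'HH:mm' into a TimeSpan measured from midnight.
    $targetSpan = [timespan]::ParseExact($Target, 'hh\:mm', [cultureinfo]::InvariantCulture)

    # Absolute distance in minutes between the two times of day.
    $diff = [math]::Abs(($Now.TimeOfDay - $targetSpan).TotalMinutes)

    # Times on either side of midnight are close, so wrap around 24 hours.
    if ($diff -gt 720) { $diff = 1440 - $diff }

    return ($diff -le $ToleranceMinutes)
}

# Example: decide whether the stop branch should fire.
$nowUtc = (Get-Date).ToUniversalTime()
if (Test-NearTimeOfDay -Now $nowUtc -Target '00:30') {
    Write-Output "Within the stop window."
}
```

With a tolerance like this, the timer should fire less often than the tolerance window is wide, or the job state check has to guard against acting twice. If the start and stop are split into two separately triggered functions, no time comparison is needed at all.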

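I think you need a different scheduling logic from the one described in the article.

From the article:

  • A stopped job is restarted after M minutes
  • A running job is stopped any time after N minutes, as long as its backlog and watermark metrics are healthy

I think what you need is:

  • A stopped job is restarted at 23:30
  • A running job is stopped at 00:30 (you can still check the watermark, but if you give it enough time this may be unnecessary)

The simplest way to implement your use case is to create 2 simple jobs, one for starting and one for stopping. In terms of triggers:

Let me know if you need help adapting the code.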
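
As a hedged illustration of the trigger setup: Azure Functions timer triggers take a six-field NCRONTAB expression (second, minute, hour, day, month, day of week), evaluated in UTC by default. Assuming one function per action, a function.json for the restart function firing daily at 23:30 UTC might look like the sketch below. The binding name Timer is chosen to match the Param($Timer) block used in the scripts, and the times come from the question rather than from the original answer.

```json
{
  "bindings": [
    {
      "name": "Timer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 30 23 * * *"
    }
  ]
}
```

The stop function would use the same binding with "schedule": "0 30 0 * * *" so that it fires at 00:30 UTC.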

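Posting here for completeness of the question. I have provided the modified scripts to handle stopping and starting at specific times.

This follows from @Florian's suggestion.

Method 1: Function App method

If you plan to host the code in a function app, you can create 2 separate functions within a single function app: one for stopping and one for restarting the stream job. The PowerShell code for each function (run.ps1) is attached below. The parameters for each function can be added in the Configuration section of the function app and pulled into the script using the environment variable syntax.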
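
A minimal sketch of those settings, assuming local development with Azure Functions Core Tools, where application settings live in local.settings.json (in Azure they are entered as application settings instead). The setting names match the $env: variables read by the scripts; all values are placeholders.

```json
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "powershell",
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "subscriptionId": "<subscription-id>",
    "resourceGroupName": "<resource-group-name>",
    "asaJobName": "<stream-analytics-job-name>"
  }
}
```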

Function 1 (restart job): asa-autorestart

<# 
Function for restarting the stream job.
This uses the when last stopped logic to try and ensure no data is missed during the restart process, this can be changed as necessary.
#>

# Input bindings are passed in via param block.
Param($Timer)

# Stop on error
$ErrorActionPreference = 'stop'

# Write an information log with the current time.
$currentUTCtime = (Get-Date).ToUniversalTime()
Write-Host "asaRobotRestart - PowerShell timer trigger function is starting at time: $currentUTCtime"

# Set variables
$resourceGroupName = $env:resourceGroupName
$asaJobName = $env:asaJobName

# Not used in the code but kept just in case
$subscriptionId = $env:subscriptionId
#$resourceId = "/subscriptions/$($subscriptionId )/resourceGroups/$($resourceGroupName)/providers/Microsoft.StreamAnalytics/streamingjobs/$($asaJobName)"

# Check if managed identity has been enabled and granted access to a subscription, resource group, or resource
$AzContext = Get-AzContext -ErrorAction SilentlyContinue
if (-not $AzContext.Subscription.Id)
{
    Throw ("Managed identity is not enabled for this app or it has not been granted access to any Azure resources. Please see https://learn.microsoft.com/en-us/azure/app-service/overview-managed-identity for additional details.")
}

try
{
    # throw "This is an error."
    
    # Check current ASA job status
    $currentJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotRestart - Job $($asaJobName) is $($currentJobState)."

    if ($currentJobState -eq "Stopped")
    {   
        # Conditions no longer reliant on the M and N minute. Just on the predefined restart trigger time that has been set.
        Write-Output "asaRobotRestart - Job $($asaJobName) is now starting from when last stopped..."
        Start-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName -OutputStartMode LastOutputEventTime
        Write-Output "asaRobotRestart - Job $($asaJobName) has been started."
    }
    else {
        Write-Output "asaRobotRestart - Job $($asaJobName) is not in a state I can manage: $($currentJobState). Let's wait a bit, but consider helping if that doesn't go away!"
    }
    # Final ASA job status check
    $newJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotRestart - Job $($asaJobName) was $($currentJobState), is now $($newJobState). Job completed."
}
catch
{
    throw $_.Exception.Message
}

Function 2 (stop job): asa-autostop

<# 
Function for stopping the stream job.
#>

# Input bindings are passed in via param block.
Param($Timer)

# Stop on error
$ErrorActionPreference = 'stop'

# Write an information log with the current time.
$currentUTCtime = (Get-Date).ToUniversalTime()
Write-Host "asaRobotStop - PowerShell timer trigger function is starting at time: $currentUTCtime"

# Set variables
$resourceGroupName = $env:resourceGroupName
$asaJobName = $env:asaJobName

# Not used in the code but kept just in case
$subscriptionId = $env:subscriptionId
#$resourceId = "/subscriptions/$($subscriptionId )/resourceGroups/$($resourceGroupName)/providers/Microsoft.StreamAnalytics/streamingjobs/$($asaJobName)"

# Check if managed identity has been enabled and granted access to a subscription, resource group, or resource
$AzContext = Get-AzContext -ErrorAction SilentlyContinue
if (-not $AzContext.Subscription.Id)
{
    Throw ("Managed identity is not enabled for this app or it has not been granted access to any Azure resources. Please see https://learn.microsoft.com/en-us/azure/app-service/overview-managed-identity for additional details.")
}

try
{
    # throw "This is an error."
    
    # Check current ASA job status
    $currentJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotStop - Job $($asaJobName) is $($currentJobState)."

    # Switch state
    if ($currentJobState -eq "Running")
    { 
        # Conditions no longer reliant on the M and N minute. Just on the predefined stop trigger time that has been set.
        Write-Output "asaRobotStop - Job $($asaJobName) is stopping..."
        Stop-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName
        Write-Output "asaRobotStop - Job $($asaJobName) has stopped."

    }
    else {
        Write-Output "asaRobotStop - Job $($asaJobName) is not in a state I can manage: $($currentJobState). Let's wait a bit, but consider helping if that doesn't go away!"
    }
    # Final ASA job status check
    $newJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotStop - Job $($asaJobName) was $($currentJobState), is now $($newJobState). Job completed."
}
catch
{
    throw $_.Exception.Message
}

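Method 2: Automation job method

If you plan to host the code in an Automation account, you can create 2 separate runbooks within one Automation account: one for stopping and one for restarting the stream job. The PowerShell code for each runbook is attached below. Once a runbook is published, you can schedule it to run at a specific time and supply its parameters on the schedule. The parameters are then pulled into the script using standard parameter syntax.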
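
The scheduling step can also be scripted. The following is a sketch only, assuming the Az.Automation module and an already published runbook; Get-NextUtcOccurrence and Register-DailyRunbook are hypothetical helper names, and the 10-minute lead is an assumed safety margin because Azure Automation expects a schedule's start time to lie a few minutes in the future.

```powershell
# Hypothetical helper: next occurrence of a UTC time of day that is
# at least $LeadMinutes after $NowUtc.
function Get-NextUtcOccurrence {
    param(
        [Parameter(Mandatory)] [datetime] $NowUtc,
        [Parameter(Mandatory)] [int] $Hour,
        [Parameter(Mandatory)] [int] $Minute,
        [int] $LeadMinutes = 10   # assumed margin
    )
    $next = $NowUtc.Date.AddHours($Hour).AddMinutes($Minute)
    if ($next -lt $NowUtc.AddMinutes($LeadMinutes)) { $next = $next.AddDays(1) }
    return $next
}

# Hypothetical helper: creates a daily schedule and links it to a runbook.
function Register-DailyRunbook {
    param(
        [string] $ResourceGroupName,
        [string] $AutomationAccountName,
        [string] $RunbookName,
        [int] $Hour,
        [int] $Minute,
        [hashtable] $RunbookParameters
    )
    $start = Get-NextUtcOccurrence -NowUtc (Get-Date).ToUniversalTime() -Hour $Hour -Minute $Minute
    $scheduleName = "$RunbookName-daily"

    New-AzAutomationSchedule -ResourceGroupName $ResourceGroupName `
        -AutomationAccountName $AutomationAccountName `
        -Name $scheduleName -StartTime $start -DayInterval 1 -TimeZone 'UTC' | Out-Null

    Register-AzAutomationScheduledRunbook -ResourceGroupName $ResourceGroupName `
        -AutomationAccountName $AutomationAccountName `
        -RunbookName $RunbookName -ScheduleName $scheduleName `
        -Parameters $RunbookParameters | Out-Null
}

# Example usage (placeholder values, not executed here):
# $p = @{ subscriptionId = '<subscription-id>'; resourceGroupName = '<rg>'; asaJobName = '<job>' }
# Register-DailyRunbook -ResourceGroupName '<rg>' -AutomationAccountName '<account>' -RunbookName 'asa-autorestart' -Hour 23 -Minute 30 -RunbookParameters $p
# Register-DailyRunbook -ResourceGroupName '<rg>' -AutomationAccountName '<account>' -RunbookName 'asa-autostop' -Hour 0 -Minute 30 -RunbookParameters $p
```

The keys of the parameter hashtable are meant to match the Param block of the runbooks shown here (subscriptionId, resourceGroupName, asaJobName).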

Runbook 1 (restart job): asa-autorestart

#Re-starting job
Param(
    [string]$subscriptionId,
    [string]$resourceGroupName,
    [string]$asaJobName
)
# Stop on error
$ErrorActionPreference = 'stop'
# Write an information log with the current time.
$currentUTCtime = (Get-Date).ToUniversalTime()
Write-Host "asaRobotRestart - PowerShell timer trigger function is starting at time: $currentUTCtime"
# Set variables
$resourceId = "/subscriptions/$($subscriptionId )/resourceGroups/$($resourceGroupName)/providers/Microsoft.StreamAnalytics/streamingjobs/$($asaJobName)"
# Ensures you do not inherit an AzContext in your runbook
Disable-AzContextAutosave -Scope Process | Out-Null
# Connect using a Managed Service Identity
try {
        $AzureContext = (Connect-AzAccount -Identity).context
    }
catch{
        Write-Output "There is no system-assigned user identity. Aborting.";
        exit
    }
try
{
    # throw "This is an error."
    # Check current ASA job status
    $currentJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotRestart - Job $($asaJobName) is $($currentJobState)."
    if ($currentJobState -eq "Stopped")
    {
        # Conditions no longer reliant on the M and N minute. Just on the predefined restart trigger time that has been set.
        Write-Output "asaRobotRestart - Job $($asaJobName) is now starting from when last stopped..."
        Start-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName -OutputStartMode LastOutputEventTime
        Write-Output "asaRobotRestart - Job $($asaJobName) has been started."
    }
    else {
        Write-Output "asaRobotRestart - Job $($asaJobName) is not in a state I can manage: $($currentJobState). Let's wait a bit, but consider helping if that doesn't go away!"
    }
    # Final ASA job status check
    $newJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotRestart - Job $($asaJobName) was $($currentJobState), is now $($newJobState). Job completed."
}
catch
{
    throw $_.Exception.Message
}

Runbook 2 (stop job): asa-autostop

# Stopping job
Param(
    [string]$subscriptionId,
    [string]$resourceGroupName,
    [string]$asaJobName
)
# Stop on error
$ErrorActionPreference = 'stop'
# Write an information log with the current time.
$currentUTCtime = (Get-Date).ToUniversalTime()
Write-Host "asaRobotStop - PowerShell timer trigger function is starting at time: $currentUTCtime"
# Set variables
$resourceId = "/subscriptions/$($subscriptionId )/resourceGroups/$($resourceGroupName)/providers/Microsoft.StreamAnalytics/streamingjobs/$($asaJobName)"
# Ensures you do not inherit an AzContext in your runbook
Disable-AzContextAutosave -Scope Process | Out-Null
# Connect using a Managed Service Identity
try {
        $AzureContext = (Connect-AzAccount -Identity).context
    }
catch{
        Write-Output "There is no system-assigned user identity. Aborting.";
        exit
    }
try
{
    # throw "This is an error."
    # Check current ASA job status
    $currentJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotStop - Job $($asaJobName) is $($currentJobState)."
    # Switch state
    if ($currentJobState -eq "Running")
    {
        # Conditions no longer reliant on the M and N minute. Just on the predefined stop trigger time that has been set.
        Write-Output "asaRobotStop - Job $($asaJobName) is stopping..."
        Stop-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName
        Write-Output "asaRobotStop - Job $($asaJobName) has stopped."
    }
    else {
        Write-Output "asaRobotStop - Job $($asaJobName) is not in a state I can manage: $($currentJobState). Let's wait a bit, but consider helping if that doesn't go away!"
    }
    # Final ASA job status check
    $newJobState = Get-AzStreamAnalyticsJob  -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
    Write-Output "asaRobotStop - Job $($asaJobName) was $($currentJobState), is now $($newJobState). Job completed."
}
catch
{
    throw $_.Exception.Message
}
