
Using PowerShell, how do you rerun a failed pipeline in Azure Data Factory by run ID?

In Azure Data Factory I need to re-run more than 4000 failed pipeline runs. This is possible in the Azure portal UI, but I am trying to automate the re-run process in PowerShell.

I am not able to find the PowerShell command or steps to re-run a failed pipeline by its run ID.

I found this question while searching for a solution to the same problem, so here's an answer with what I've found.

If you want to re-run the entire pipeline and don't care whether Data Factory technically considers it a re-run, you can use the Get-AzDataFactoryV2PipelineRun cmdlet to get a list of runs, filter to the failed runs, and then pass each run's parameters to a call to Invoke-AzDataFactoryV2Pipeline.
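The approach above could be sketched roughly as follows. This is a minimal sketch, not a tested production script: it assumes the Az.DataFactory module is installed and that you have already signed in with Connect-AzAccount. The function names Select-FailedRun and Invoke-FailedRunAsNew are hypothetical helpers made up for this example, and the conversion of the run's Parameters dictionary into a hashtable is an assumption about what the -Parameter argument needs.

```powershell
# Hypothetical helper: keep only the runs whose Status is 'Failed'.
function Select-FailedRun {
    param([object[]]$Runs)
    @($Runs | Where-Object { $_.Status -eq 'Failed' })
}

# Hypothetical helper: start a fresh run for every failed run in a time window,
# reusing the parameters of the failed run. Data Factory treats these as new runs,
# not as re-runs of the originals.
function Invoke-FailedRunAsNew {
    param(
        [string]$ResourceGroupName,
        [string]$DataFactoryName,
        [datetime]$LastUpdatedAfter,
        [datetime]$LastUpdatedBefore
    )

    $runs = Get-AzDataFactoryV2PipelineRun `
        -ResourceGroupName $ResourceGroupName `
        -DataFactoryName $DataFactoryName `
        -LastUpdatedAfter $LastUpdatedAfter `
        -LastUpdatedBefore $LastUpdatedBefore

    foreach ($run in (Select-FailedRun -Runs $runs)) {
        # Copy the run's parameters into a plain hashtable (assumed shape).
        $runParameters = @{}
        if ($run.Parameters) {
            foreach ($key in $run.Parameters.Keys) {
                $runParameters[$key] = $run.Parameters[$key]
            }
        }

        $invokeArgs = @{
            ResourceGroupName = $ResourceGroupName
            DataFactoryName   = $DataFactoryName
            PipelineName      = $run.PipelineName
        }
        if ($runParameters.Count -gt 0) {
            $invokeArgs.Parameter = $runParameters
        }

        # Returns the run ID of the newly started run.
        Invoke-AzDataFactoryV2Pipeline @invokeArgs
    }
}
```

Nothing runs until you call Invoke-FailedRunAsNew with your own resource group, factory name, and date range. With thousands of failed runs it is probably worth trying it on a narrow time window first.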

It looks like fairly soon that cmdlet will be updated to allow a true re-run (based on the response to this issue someone raised on the Visually monitor Azure Data Factory doc from Microsoft).

If you're in a hurry to be able to do a true re-run, the functionality is already included in the REST API by calling createRun with some optional parameters.
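As a rough illustration of the REST route, the sketch below builds the createRun request path with the optional referencePipelineRunId, isRecovery, and startFromFailure query parameters and sends it with Invoke-AzRestMethod from the Az.Accounts module. The helper names Get-RerunPath and Invoke-PipelineRerun are hypothetical, the api-version value 2018-06-01 is the one I believe the Data Factory v2 API uses, and sending the request without a body is an assumption on my part, so check the REST reference before relying on it.

```powershell
# Hypothetical helper: build the relative ARM path for a createRun call that
# reruns an earlier pipeline run identified by $RunId.
function Get-RerunPath {
    param(
        [string]$SubscriptionId,
        [string]$ResourceGroupName,
        [string]$DataFactoryName,
        [string]$PipelineName,
        [string]$RunId,
        [switch]$StartFromFailure
    )

    $path = "/subscriptions/$SubscriptionId/resourceGroups/$ResourceGroupName" +
            "/providers/Microsoft.DataFactory/factories/$DataFactoryName" +
            "/pipelines/$PipelineName/createRun"

    # isRecovery groups the new run with the referenced run.
    $query = "api-version=2018-06-01&referencePipelineRunId=$RunId&isRecovery=true"
    if ($StartFromFailure) {
        $query += '&startFromFailure=true'
    }

    '{0}?{1}' -f $path, $query
}

# Hypothetical helper: send the request using the current Az login context
# and return the run ID of the new run.
function Invoke-PipelineRerun {
    param([string]$Path)

    $response = Invoke-AzRestMethod -Path $Path -Method POST
    ($response.Content | ConvertFrom-Json).runId
}
```

Usage would look something like `Invoke-PipelineRerun -Path (Get-RerunPath -SubscriptionId $sub -ResourceGroupName $rg -DataFactoryName $df -PipelineName $name -RunId $id -StartFromFailure)`, where all of those variables are placeholders for your own values.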

Edit: This was added in the Azure PowerShell module v4.8.0, released in October 2020 (docs). You can now pass a pipeline run ID to the Invoke-AzDataFactoryV2Pipeline cmdlet using -ReferencePipelineRunId to have it use the parameters from that run. You can also use the -StartFromFailure switch to have it rerun only the failed activities.
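For the bulk case in the question, a loop along these lines might work. Treat it as a sketch under assumptions: it needs Az.DataFactory from Az 4.8.0 or later and an existing Connect-AzAccount session, the function names New-RerunArgument and Invoke-FailedRunRerun are hypothetical, and adding -IsRecovery alongside -StartFromFailure reflects my reading of the REST API description rather than something stated above.

```powershell
# Hypothetical helper: build the splatting hashtable for one true re-run.
function New-RerunArgument {
    param(
        [string]$ResourceGroupName,
        [string]$DataFactoryName,
        [string]$PipelineName,
        [string]$RunId
    )

    @{
        ResourceGroupName      = $ResourceGroupName
        DataFactoryName        = $DataFactoryName
        PipelineName           = $PipelineName
        ReferencePipelineRunId = $RunId   # reuse the parameters of this run
        IsRecovery             = $true    # group the new run with the original
        StartFromFailure       = $true    # rerun only the failed activities
    }
}

# Hypothetical helper: rerun every failed run last updated inside the window.
function Invoke-FailedRunRerun {
    param(
        [string]$ResourceGroupName,
        [string]$DataFactoryName,
        [datetime]$LastUpdatedAfter,
        [datetime]$LastUpdatedBefore
    )

    $failedRuns = Get-AzDataFactoryV2PipelineRun `
        -ResourceGroupName $ResourceGroupName `
        -DataFactoryName $DataFactoryName `
        -LastUpdatedAfter $LastUpdatedAfter `
        -LastUpdatedBefore $LastUpdatedBefore |
        Where-Object { $_.Status -eq 'Failed' }

    foreach ($run in $failedRuns) {
        $rerunArgs = New-RerunArgument `
            -ResourceGroupName $ResourceGroupName `
            -DataFactoryName $DataFactoryName `
            -PipelineName $run.PipelineName `
            -RunId $run.RunId

        $newRunId = Invoke-AzDataFactoryV2Pipeline @rerunArgs
        Write-Host "Reran $($run.RunId) as $newRunId"
    }
}
```

If you already have the list of run IDs, you can skip the query and call Invoke-AzDataFactoryV2Pipeline with the hashtable from New-RerunArgument for each ID directly.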

I just found a tutorial, Azure Data Factory: Detecting and Re-Running failed ADF Slices, which provides a PowerShell script to automate re-running failed slices.

PowerShell script:

Login-AzureRmAccount
$slices= @()
$tableName=@()
$failedSlices= @()
$failedSlicesCount= @()
$tableNames=@()

$Subscription="Provide Subscription ID"

Select-AzureRMSubscription -SubscriptionId $Subscription
$DataFactoryName="Provide Data Factory Name"
$resourceGroupName ="Provide Resource Group Name for Data Factory"

$startDateTime ="2015-05-01" #Start Date for Slices
$endDateTime="2015-08-01" # End Date for Slices


#Get Dataset names in Data Factory - you can explicitly give a table name using the $tableName variable if you want to run only for an individual table
$tableNames = Get-AzureRMDataFactoryDataset -DataFactoryName $DataFactoryName -ResourceGroupName $resourceGroupName | ForEach {$_.DatasetName}

$tableNames #lists tablenames

foreach ($tableName in $tableNames)
{
    $slices += Get-AzureRMDataFactorySlice -DataFactoryName $DataFactoryName -DatasetName $tableName -StartDateTime $startDateTime -EndDateTime $endDateTime -ResourceGroupName $resourceGroupName -ErrorAction Stop
}


$failedSlices = $slices | Where {$_.Status -eq 'Failed'}

$failedSlicesCount = @($failedSlices).Count

if ( $failedSlicesCount -gt 0 ) 
{

    write-host "Total number of slices Failed:$failedSlicesCount"
    $Prompt = Read-host "Do you want to Rerun these failed slices? (Y | N)" 
    if ( $Prompt -eq "Y" -Or $Prompt -eq "y" )
    {

        foreach ($failed in $failedSlices)
        {
            Write-Host "Rerunning slice of Dataset $($failed.DatasetName) with StartDateTime $($failed.Start) and EndDateTime $($failed.End)"
            Set-AzureRMDataFactorySliceStatus -UpdateType UpstreamInPipeline -Status Waiting -DataFactoryName $($failed.DataFactoryName) -DatasetName $($failed.DatasetName) -ResourceGroupName $resourceGroupName -StartDateTime "$($failed.Start)" -EndDateTime "$($failed.End)" 


        }
    }

}
else
{
    write-host "There are no Failed slices in the given time period."
}

Hope this helps.
