In Azure Data Factory I need to re-run more than 4000 failed pipelines. This is possible in the Azure portal UI, but I am trying to automate the re-run process in PowerShell.
I am not able to find the command or steps in PowerShell to re-run a failed pipeline by its run ID.
I found this question while searching for a solution to the same problem, so here's an answer with what I've found.
If you want to re-run the entire pipeline and you don't care whether Data Factory technically considers it a re-run, you could use the Get-AzDataFactoryV2PipelineRun cmdlet to get a list of runs, filter to the failed runs, then use the same parameters in a call to Invoke-AzDataFactoryV2Pipeline.
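The list, filter, and re-invoke approach can be sketched roughly as follows. This is a minimal sketch, not a tested production script: the function names ConvertTo-ParameterHashtable and Invoke-FailedPipelineRerun are hypothetical helpers introduced here, and it assumes the run objects expose Status, PipelineName, and Parameters properties. Run parameters come back as strings, so pipelines with non-string parameters may need extra conversion.

```powershell
# Copy a run's parameter dictionary into the Hashtable that -Parameter expects.
# (Hypothetical helper, not part of the Az module.)
function ConvertTo-ParameterHashtable {
    param($Parameters)
    $table = @{}
    if ($null -ne $Parameters) {
        foreach ($key in $Parameters.Keys) { $table[$key] = $Parameters[$key] }
    }
    return $table
}

# Start a fresh run of every pipeline whose run failed inside the time window.
# Data Factory treats these as new runs, not as re-runs of the originals.
function Invoke-FailedPipelineRerun {
    param(
        [string]$ResourceGroupName,
        [string]$DataFactoryName,
        [datetime]$After,
        [datetime]$Before
    )

    $failedRuns = Get-AzDataFactoryV2PipelineRun -ResourceGroupName $ResourceGroupName `
            -DataFactoryName $DataFactoryName `
            -LastUpdatedAfter $After -LastUpdatedBefore $Before |
        Where-Object { $_.Status -eq 'Failed' }

    foreach ($run in $failedRuns) {
        $invokeArgs = @{
            ResourceGroupName = $ResourceGroupName
            DataFactoryName   = $DataFactoryName
            PipelineName      = $run.PipelineName
        }
        # Only pass -Parameter when the original run actually had parameters.
        $runParameters = ConvertTo-ParameterHashtable $run.Parameters
        if ($runParameters.Count -gt 0) { $invokeArgs.Parameter = $runParameters }

        # The cmdlet returns the ID of the newly created run.
        Invoke-AzDataFactoryV2Pipeline @invokeArgs
    }
}
```

After signing in with Connect-AzAccount, a call might look like: Invoke-FailedPipelineRerun -ResourceGroupName "my-rg" -DataFactoryName "my-adf" -After (Get-Date).AddDays(-7) -Before (Get-Date), where the resource group and factory names are placeholders.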
It looks like that Invoke cmdlet will be updated fairly soon to allow a true re-run, based on the response to an issue someone raised on Microsoft's "Visually monitor Azure Data Factory" doc.
If you're in a hurry to be able to do a true re-run, the functionality is already included in the REST API: call createRun with some optional parameters.
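For illustration, a createRun call from PowerShell might look like the sketch below. It assumes the 2018-06-01 API version and the optional query parameters referencePipelineRunId, isRecovery, and startFromFailure, and it uses Invoke-AzRestMethod from Az.Accounts to handle authentication. Get-CreateRunPath and Invoke-PipelineRerunViaRest are hypothetical helper names, and reading runId from the response body is an assumption about the response shape, so check all of this against the REST reference before relying on it.

```powershell
# Build the ARM path for createRun, asking for a recovery re-run of an earlier run.
# (Hypothetical helper; the query parameter names are assumed from the REST API.)
function Get-CreateRunPath {
    param(
        [string]$SubscriptionId,
        [string]$ResourceGroupName,
        [string]$DataFactoryName,
        [string]$PipelineName,
        [string]$ReferencePipelineRunId,
        [bool]$StartFromFailure = $true
    )
    $path = "/subscriptions/$SubscriptionId/resourceGroups/$ResourceGroupName" +
            "/providers/Microsoft.DataFactory/factories/$DataFactoryName" +
            "/pipelines/$PipelineName/createRun?api-version=2018-06-01"
    $path += "&referencePipelineRunId=$ReferencePipelineRunId&isRecovery=true"
    if ($StartFromFailure) { $path += "&startFromFailure=true" }
    return $path
}

# POST to createRun and return the new run ID (assumes a JSON body with a runId field).
function Invoke-PipelineRerunViaRest {
    param([string]$Path)
    $response = Invoke-AzRestMethod -Method POST -Path $Path
    if ($response.StatusCode -ne 200) {
        throw "createRun failed with status $($response.StatusCode): $($response.Content)"
    }
    return ($response.Content | ConvertFrom-Json).runId
}
```

You would build the path once per failed run ID and pass it to Invoke-PipelineRerunViaRest after signing in with Connect-AzAccount.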
Edit: True re-run support was added to the Azure PowerShell module in v4.8.0, released in October 2020 (docs). You can now pass a pipeline run ID to the Invoke-AzDataFactoryV2Pipeline cmdlet using -ReferencePipelineRunId to have it use the parameters from that run. You can also use the -StartFromFailure switch to have it rerun only the failed activities.
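Applied to a batch of failed runs, that could look something like the sketch below. Invoke-FailedRunRecovery is a hypothetical wrapper name. The -IsRecovery switch is included on the assumption that -StartFromFailure only takes effect in recovery mode, as the cmdlet documentation describes it, so verify that against your module version.

```powershell
# Re-run every failed pipeline run in the time window as a true re-run,
# restarting from the failed activities and reusing each run's parameters.
function Invoke-FailedRunRecovery {
    param(
        [string]$ResourceGroupName,
        [string]$DataFactoryName,
        [datetime]$After,
        [datetime]$Before
    )

    $failedRuns = Get-AzDataFactoryV2PipelineRun -ResourceGroupName $ResourceGroupName `
            -DataFactoryName $DataFactoryName `
            -LastUpdatedAfter $After -LastUpdatedBefore $Before |
        Where-Object { $_.Status -eq 'Failed' }

    foreach ($run in $failedRuns) {
        $invokeArgs = @{
            ResourceGroupName      = $ResourceGroupName
            DataFactoryName        = $DataFactoryName
            PipelineName           = $run.PipelineName
            ReferencePipelineRunId = $run.RunId
            IsRecovery             = $true   # assumption: needed for -StartFromFailure
            StartFromFailure       = $true
        }
        $newRunId = Invoke-AzDataFactoryV2Pipeline @invokeArgs

        # Emit a small record so the old and new run IDs can be logged or exported.
        [pscustomobject]@{
            PipelineName = $run.PipelineName
            FailedRunId  = $run.RunId
            NewRunId     = $newRunId
        }
    }
}
```

Piping the output to Export-Csv gives a record of which new run replaced which failed run, which is handy when several thousand runs are involved.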
I just found a tutorial, "Azure Data Factory: Detecting and Re-Running failed ADF Slices", which provides a PowerShell script to automate re-running failures. Note that the script uses the AzureRM cmdlets for Data Factory version 1 slices.
PowerShell script:
Login-AzureRmAccount
$slices= @()
$tableName=@()
$failedSlices= @()
$failedSlicesCount= @()
$tableNames=@()
$Subscription="Provide Subscription ID"
Select-AzureRMSubscription -SubscriptionId $Subscription
$DataFactoryName="Provide Data Factory Name"
$resourceGroupName ="Provide Resource Group Name for Data Factory"
$startDateTime ="2015-05-01" #Start Date for Slices
$endDateTime="2015-08-01" # End Date for Slices
#Get dataset names in the Data Factory - you can explicitly set a table name in $tableNames instead if you want to run for an individual table only
$tableNames = Get-AzureRMDataFactoryDataset -DataFactoryName $DataFactoryName -ResourceGroupName $resourceGroupName | ForEach {$_.DatasetName}
$tableNames #lists tablenames
foreach ($tableName in $tableNames)
{
    $slices += Get-AzureRMDataFactorySlice -DataFactoryName $DataFactoryName -DatasetName $tableName -StartDateTime $startDateTime -EndDateTime $endDateTime -ResourceGroupName $resourceGroupName -ErrorAction Stop
}
$failedSlices = $slices | Where {$_.Status -eq 'Failed'}
$failedSlicesCount = @($failedSlices).Count
if ( $failedSlicesCount -gt 0 )
{
    write-host "Total number of slices Failed:$failedSlicesCount"
    $Prompt = Read-host "Do you want to Rerun these failed slices? (Y | N)"
    # -eq is case-insensitive, so this matches both "Y" and "y"
    if ( $Prompt -eq "Y" )
    {
        foreach ($failed in $failedSlices)
        {
            write-host "Rerunning slice of Dataset $($failed.DatasetName) with StartDateTime $($failed.Start) and EndDateTime $($failed.End)"
            # Resetting the slice to Waiting makes Data Factory process it again
            Set-AzureRMDataFactorySliceStatus -UpdateType UpstreamInPipeline -Status Waiting -DataFactoryName $($failed.DataFactoryName) -DatasetName $($failed.DatasetName) -ResourceGroupName $resourceGroupName -StartDateTime "$($failed.Start)" -EndDateTime "$($failed.End)"
        }
    }
}
else
{
    write-host "There are no Failed slices in the given time period."
}
Hope this helps.