
AzCopy from HDInsight cluster failing in PowerShell script

I have a PowerShell script that creates some output using Hive on HDInsight. The output is placed in a blob, and then I copy it to a local machine using AzCopy. I do this a lot to get various pieces of data that I need, often calling that script multiple times. The problem is that at some point AzCopy errors out with the message "The condition specified using HTTP conditional header(s) is not met.", but only after numerous successful iterations.

I am not sure what this means, and a Fiddler transcript did not help much either. I tried deleting the file and repeating the AzCopy command, but the error persisted, so it might have something to do with the AzCopy HTTP session. Can anyone enlighten me?

PS C:\hive> AzCopy /Y /Source:https://msftcampusdata.blob.core.windows.net/crunch88-1 /Dest:c:\hive\extracts\data\ /SourceKey:attEwHZ9AGq7pzzTYwRvjWwcmwLvFqnkxIvJcTblYnZAs1GSsCCtvbBKz9T/TTtwDSVMDuU3DenBbmOYqPIMhQ== /Pattern:hivehost/stdout 
AzCopy : [2015/05/10 15:08:44][ERROR] hivehost/stdout: The remote server returned an error: (412) The condition specified using HTTP conditional header(s) 
is not met..
At line:1 char:1
+ AzCopy /Y /Source:https://msftcampusdata.blob.core.windows.net/crunch88-1 /Dest: ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: ([2015/05/10 15:...s) is not met..:String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError


In order to ensure data integrity during the download, AzCopy passes the ETag of the source blob in the HTTP header "If-Match" while reading data from the source blob. Thus HTTP status code 412 (Precondition Failed), "The condition specified using HTTP conditional header(s) is not met.", simply means that your blobs were being changed while AzCopy was downloading them.

Please avoid changing source blobs while downloading them. If you have to change source blobs simultaneously, you can try the following workaround:

First take a snapshot of the source blob, and then download the blob with AzCopy with the /Snapshot option specified, so that AzCopy will try to download the source blob and all of its snapshots. Although the download of the source blob itself may fail with 412 (Precondition Failed), the download of the snapshot can succeed. The file name of the downloaded snapshot is: {blob name without extension} ({snapshot timestamp}).{extension}.
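A rough sketch of the workaround, using your container and pattern (this assumes the classic Azure Storage PowerShell cmdlets available at the time; the `$key` variable standing in for your storage account key, and the exact cmdlet names, are assumptions that depend on your module version):

```powershell
$key = "<storage account key>"   # your /SourceKey value

# Take a snapshot of the source blob; a snapshot is immutable,
# so it cannot change while AzCopy is downloading it.
$ctx  = New-AzureStorageContext -StorageAccountName "msftcampusdata" -StorageAccountKey $key
$blob = Get-AzureStorageBlob -Container "crunch88-1" -Blob "hivehost/stdout" -Context $ctx
$blob.ICloudBlob.CreateSnapshot()

# /Snapshot tells AzCopy to copy the blob's snapshots as well; even if the
# base blob download fails with 412, the snapshot download can succeed.
AzCopy /Y /Snapshot /Source:https://msftcampusdata.blob.core.windows.net/crunch88-1 `
  /Dest:c:\hive\extracts\data\ /SourceKey:$key /Pattern:hivehost/stdout
```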

For further information about AzCopy and the /Snapshot option, please refer to Getting Started with the AzCopy Command-Line Utility.

Some Updates:

Did you terminate AzCopy and then resume it with the same command line? If so, you need to make sure the source blob wasn't changed after the previous execution, because AzCopy has to ensure the source blob remains unchanged between the first download attempt and the download completing successfully. To check whether resuming occurred, look for the following in AzCopy's output: "Incomplete operation with same command line detected at the journal directory {Dir Path}, AzCopy will start to resume.".

Because /Y is specified in your command line, the resuming prompt is always answered "Yes". To avoid the resuming behavior, you can clean up the default journal folder "%LocalAppData%\Microsoft\Azure\AzCopy" before executing AzCopy, or specify /Z:<path> to configure a unique journal folder for each execution.
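For example, either of the following should prevent an accidental resume (the journal path passed to /Z: is illustrative, and `$key` stands in for your storage account key):

```powershell
# Option 1: remove any leftover journal so AzCopy starts fresh instead of resuming.
Remove-Item "$env:LocalAppData\Microsoft\Azure\AzCopy\*" -Recurse -Force -ErrorAction SilentlyContinue

# Option 2: give each run its own journal folder with /Z:
AzCopy /Y /Z:c:\hive\journal\run1 /Source:https://msftcampusdata.blob.core.windows.net/crunch88-1 `
  /Dest:c:\hive\extracts\data\ /SourceKey:$key /Pattern:hivehost/stdout
```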
