SQL Server Backup To URL failing intermittently

SQL Server 2014 SP2, Windows Server 2012 R2 (full updates) running on a DS13 Azure VM.

I'm running a full backup to an Azure Storage account, and it's intermittently failing, without much information.

The database in question is just under 100GB uncompressed, and just under 9GB compressed.

With dbcc traceon(3051,-1) I am able to see the log contents. The only signs that anything is wrong are a number of these:

7/12/2016 3:45:16 PM: Result recorded Exception Message: The underlying connection was closed: A connection that was expected to be kept alive was closed by the server.
7/12/2016 3:45:16 PM: HTTP status code -1, HTTP Status Message

7/12/2016 3:45:25 PM: Result recorded Exception Message: Unable to read data from the transport connection: The connection was closed.
7/12/2016 3:45:25 PM: HTTP status code -1, HTTP Status Message

7/12/2016 3:45:25 PM: Result recorded Exception Message: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host.
7/12/2016 3:45:25 PM: HTTP status code -1, HTTP Status Message

Eventually I see:

7/12/2016 3:45:39 PM: Throttling State Encountered: ParallelThreads allowed 1, Outstanding Ops 16, throttleDelta 1

It stays at 1 ParallelThread for a bit, then slowly starts ramping back up with normal results, until the end of the log:

7/12/2016 3:49:29 PM: An unexpected exception occurred during communication on VDI Channel.
7/12/2016 3:49:29 PM: Exception Info: Unrecoverable error occurred during Flush operation
7/12/2016 3:49:29 PM: Stack:
    at Microsoft.SqlServer.VdiInterface.VDI.AsyncIOCompletion(BlobRequestOptions options, List`1 asyncResults, CloudPageBlob pageBlob, Boolean onFlush)
    at Microsoft.SqlServer.VdiInterface.VDI.PerformPageDataTransfer(CloudPageBlob pageBlob, AccessCondition leaseCondition, Boolean forBackup)
7/12/2016 3:49:29 PM: The Active queue had 0 requests until we got a clearerror
7/12/2016 3:49:29 PM: A fatal error occurred during Engine Communication, exception information follows
7/12/2016 3:49:29 PM: Exception Info: Unrecoverable error occurred during Flush operation
7/12/2016 3:49:29 PM: Stack:
    at Microsoft.SqlServer.VdiInterface.VDI.PerformPageDataTransfer(CloudPageBlob pageBlob, AccessCondition leaseCondition, Boolean forBackup)
    at BackupToUrl.Program.MainInternal(String[] args)
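To compare a failing run against a successful one, it can help to tally the log entries instead of reading them by eye. The following is only a rough sketch, assuming the log uses the `timestamp: message` shape shown in the excerpts above; the function names `split_entries` and `summarize` are made up for illustration and are not part of any SQL Server tooling.

```python
import re
from collections import Counter

# Timestamp prefix as it appears in the excerpts, e.g. "7/12/2016 3:45:16 PM: ".
# Assumption: every log entry starts with a prefix of this shape.
ENTRY_START = re.compile(r"\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M: ")


def split_entries(text):
    """Split raw log text into (timestamp, message) pairs.

    Splitting on the timestamp prefix rather than on newlines means this
    also works when several entries were pasted onto one line.
    """
    matches = list(ENTRY_START.finditer(text))
    entries = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        timestamp = match.group(0)[:-2]  # drop the trailing ": "
        message = text[match.end():end].strip()
        entries.append((timestamp, message))
    return entries


def summarize(text):
    """Return (exception counts, throttling entries, fatal entries)."""
    exceptions = Counter()
    throttles = []
    fatal = []
    for timestamp, message in split_entries(text):
        if message.startswith("Result recorded Exception Message:"):
            detail = message.split("Exception Message:", 1)[1].strip()
            exceptions[detail] += 1
        elif message.startswith("Throttling State Encountered"):
            throttles.append((timestamp, message))
        elif "Unrecoverable error" in message or "fatal error" in message:
            fatal.append((timestamp, message))
    return exceptions, throttles, fatal


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py <path to the saved log text>
    with open(sys.argv[1], encoding="utf-8") as handle:
        counts, throttles, fatal = summarize(handle.read())
    for detail, count in counts.most_common():
        print(f"{count:4d}  {detail}")
    print(f"throttling events: {len(throttles)}")
    print(f"fatal entries: {len(fatal)}")
```

A run that shows a burst of connection-closed exceptions shortly before the first throttling entry, as in the excerpts here, points at the network or storage path rather than at the backup command itself.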

In Task Manager, I can see BackupToUrl.exe disappear, but the SQL query still executes for a while. The Azure storage account still shows the database as 1TB in size (as it does while it's normally in the process of uploading). Eventually the SQL query returns with the following error, and the Azure storage account is updated to remove the .bak file:

Processed 7056520 pages for database '<removed>', file '<removed>' on file 1.
Processed 3 pages for database '<removed>', file '<removed>_log' on file 1.
Msg 3271, Level 16, State 1, Line 1
A nonrecoverable I/O error occurred on file "https://<removed>.blob.core.windows.net/<removed>/<removed>.bak:" Backup to URL received an exception from the remote endpoint. Exception Message: Unrecoverable error occurred during Flush operation.
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

Does anyone have any clue what can be done to diagnose and resolve this problem?

It turned out that the Azure VM's host was having I/O errors when communicating with the storage account. Once the VM was redeployed to new hardware, the problem was resolved. The root cause was allegedly a platform bug.
