
Copy specific files in a blob from one storage account to a different storage account

I have a list of files in a blob in a storage account that I need to move to another storage account. Is there a way to specifically select blob files and move only the selected subset to a different storage account? If so, how can I do it?

Edit: The list of blobs that need to be moved will be updated over time, so the process will need to run on an ongoing basis.

You can implement this with a Logic App that uses a Recurrence trigger:

  1. Run on a fixed schedule (every X minutes or hours)
  2. Invoke your stored procedure to get the list of files
  3. For each file, use the Copy Blob action to copy the source blob to the destination blob
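For anyone who prefers to see those three steps as code rather than as a designer workflow, here is a minimal Python sketch of a single scheduled pass. It is not Logic Apps code. `list_blobs` and `copy_blob` are hypothetical callables you would supply yourself (one wrapping your stored procedure, one wrapping whatever copy call you use), and `destination_url` assumes you want to keep the same container and blob path in the destination account.

```python
from typing import Callable, Iterable
from urllib.parse import urlsplit


def destination_url(source_url: str, dest_account_url: str) -> str:
    """Keep the container/blob path and swap in the destination endpoint.

    Assumption: the destination uses the same container and blob names.
    Any query string on the source URL (e.g. a SAS token) is dropped.
    """
    path = urlsplit(source_url).path  # e.g. "/container/folder/file.csv"
    return dest_account_url.rstrip("/") + path


def run_once(
    list_blobs: Callable[[], Iterable[str]],
    copy_blob: Callable[[str, str], None],
    dest_account_url: str,
) -> int:
    """One scheduled pass: fetch the current list and copy each entry."""
    copied = 0
    for source_url in list_blobs():
        copy_blob(source_url, destination_url(source_url, dest_account_url))
        copied += 1
    return copied
```

Whatever scheduler you pick (the Recurrence trigger, a timer, cron) would call `run_once` on each tick.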

If you want to use Azure Functions for this, the most rudimentary approach I would recommend starts from the fact that this problem is really about I/O more than it is about compute. So while there are patterns you can use to scale out work with Azure Functions, those probably don't make much sense for this kind of problem.

The simplest approach here is to use a single timer-triggered function. You'll schedule this function to run as frequently as you need. Its job will be to execute your stored procedure, enumerate the results, and then queue up each result for copying via a TransferManager from the Azure Storage Data Movement Library.

If you're not familiar with the TransferManager class already, it takes care of tracking and optimizing the concurrent throughput of I/O operations for you. You would likely want to create a single TransferContext representing the batch of work the function is working on so you can keep track of progress, deal with failures, handle overwrite situations, and so on. You would be using the CopyAsync method and, again if you're not familiar with this API, there is a parameter on this method named isServiceCopy. Since you're copying between two Azure Storage accounts, you definitely want to set this so that it is a pure server-to-server copy and the I/O doesn't have to pass through the server that your function instance is running on at all; your function ends up being little more than an orchestrator of the copying.
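To make the shape of that concrete, here is a rough Python sketch of the same idea: one context object per batch that records what was copied, skipped, or failed, with the copies issued concurrently. This is not the TransferManager API. `BatchContext`, `copy_batch`, `start_copy` and `exists` are all names I made up for illustration. In a real function, `start_copy` would wrap a call that asks the storage service to do the copy itself (in the Python SDK that could be `BlobClient.start_copy_from_url`), so the bytes never flow through your function.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class BatchContext:
    """Hypothetical stand-in for a transfer context: one per batch of work."""
    overwrite: bool = False
    copied: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)
    failed: Dict[str, str] = field(default_factory=dict)


def copy_batch(
    pairs: Iterable[Tuple[str, str]],
    start_copy: Callable[[str, str], None],  # should request a service-side copy
    exists: Callable[[str], bool],           # does the destination blob exist?
    context: BatchContext,
    max_workers: int = 8,
) -> BatchContext:
    def copy_one(src: str, dst: str) -> str:
        # Overwrite policy: leave existing destinations alone unless told otherwise.
        if not context.overwrite and exists(dst):
            return "skipped"
        start_copy(src, dst)
        return "copied"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(copy_one, s, d): s for s, d in pairs}
        # Outcomes are recorded on the calling thread, so no locking is needed.
        for future in as_completed(futures):
            src = futures[future]
            try:
                outcome = future.result()
            except Exception as exc:
                context.failed[src] = str(exc)
            else:
                target = context.copied if outcome == "copied" else context.skipped
                target.append(src)
    return context
```

After the batch finishes, `context.failed` tells you which sources to retry or log, which is the kind of bookkeeping the transfer context gives you in the .NET library.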

Now, like I said, this is the most rudimentary approach I would suggest. There are other things to consider, such as remaining idempotent in the face of failures. For example, if the stored procedure you're calling only returns a particular blob URI once (e.g. a poor man's queue in SQL Server) and your Azure Function fails for some reason, then you would lose that work. I would need to understand more details about your setup to prescribe a more durable alternative, but you'd definitely want to change this approach so that the actual copying is decoupled from the execution of the stored procedure, to reduce the likelihood of failure there.
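One way to picture that decoupling, as a sketch only: read the pending items without consuming them, and mark an item done only after its copy has succeeded. The three callables below (`fetch_pending`, `copy_blob`, `mark_done`) are hypothetical and stand in for your own queries and copy call. I'm also assuming that repeating a copy for the same blob is harmless, so a retry after a crash does no damage.

```python
from typing import Callable, Dict, Iterable, Tuple


def process_pending(
    fetch_pending: Callable[[], Iterable[Tuple[str, str]]],
    copy_blob: Callable[[str, str], None],
    mark_done: Callable[[str], None],
) -> Tuple[int, Dict[str, str]]:
    """Copy every pending (source, destination) pair.

    An item is acknowledged only after its copy succeeded, so a failed
    or interrupted run leaves it pending for the next scheduled run.
    """
    done = 0
    errors: Dict[str, str] = {}
    for src, dst in fetch_pending():
        try:
            copy_blob(src, dst)
        except Exception as exc:
            errors[src] = str(exc)  # not acknowledged: stays pending
            continue
        mark_done(src)
        done += 1
    return done, errors
```

If the function dies between `copy_blob` and `mark_done`, the worst case is that the blob is copied again on the next run, rather than being lost.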
