
How do I back up data from Azure Cosmos DB to Azure Blob Storage for a specific date range?

I have an Azure Cosmos DB account. What I want to do is back up data that is one month old from Azure Cosmos DB to Azure Blob Storage using my Node.js app. I have already created a pipeline and triggered it using the Create Run pipeline API for Node.js (via Azure Data Factory). But I am not able to figure out how to make the pipeline select only the data that is one month old relative to the current date. Any suggestions?

EDIT: Actually, I want to run the API daily so that it backs up data which is one month old. For example, let's say I get 100 entries in my Cosmos DB today; the pipeline should select data older than the current date minus 30 days and back it up, so that at any point my Azure Cosmos DB holds only the most recent 30 days of data and the rest is backed up to Azure Blob Storage.
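A minimal sketch of the parameter side of this (not the asker's actual code): compute the 30-day cutoff in Node.js and hand it to the pipeline run as a parameter. The parameter name `cutoffDate` and the resource names are assumptions; the ADF pipeline would need to declare a matching parameter.

```javascript
// Sketch: compute a 30-day cutoff and pass it to an ADF pipeline run as a
// parameter. Assumes the pipeline declares a string parameter named
// "cutoffDate"; resource group, factory, and pipeline names are placeholders.
const cutoff = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000);
const runParameters = { cutoffDate: cutoff.toISOString() };

// With @azure/arm-datafactory, the run would be created roughly like this:
// const client = new DataFactoryManagementClient(credential, subscriptionId);
// await client.pipelines.createRun("myResourceGroup", "myFactory",
//   "myBackupPipeline", { parameters: runParameters });

console.log(runParameters);
```

Inside the pipeline, the copy activity's source query can then reference the `cutoffDate` parameter to filter documents.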

Just a supplement to @David's answer here. If you mean the Cosmos DB SQL API, it has an automatic backup mechanism, described in this link: Automatic and online backups.

With Azure Cosmos DB, not only your data, but also the backups of your data are highly redundant and resilient to regional disasters. The automated backups are currently taken every four hours and at any point of time, the latest two backups are stored. If you have accidentally deleted or corrupted your data, you should contact Azure support within eight hours so that the Azure Cosmos DB team can help you restore the data from the backups.

However, you cannot access these backups directly. Azure Cosmos DB uses them only if a backup restore is initiated.

But the document provides two options for managing your own backups:

  1. Use Azure Data Factory to move data periodically to a storage of your choice.
  2. Use the Azure Cosmos DB change feed to read data periodically, for full backups as well as for incremental changes, and store it in your own storage.

You could trigger the copy activity in ADF on a schedule. If you want to filter data by date, look into the `_ts` system property in Cosmos DB, which holds each document's last-modified time as a Unix timestamp in seconds.
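As a sketch of that filter, the cutoff can be computed in epoch seconds (the unit `_ts` uses) and plugged into a parameterized Cosmos DB SQL query; everything here besides `_ts` itself is illustrative:

```javascript
// Sketch: build a Cosmos DB SQL query that selects documents last modified
// at least 30 days ago. _ts is a system property set by Cosmos DB holding
// the last-modified time in Unix epoch seconds.
const cutoffSeconds = Math.floor(Date.now() / 1000) - 30 * 24 * 60 * 60;
const querySpec = {
  query: 'SELECT * FROM c WHERE c._ts <= @cutoff',
  parameters: [{ name: '@cutoff', value: cutoffSeconds }]
};
console.log(querySpec);
```

The same `WHERE c._ts <= ...` condition can be written directly into the copy activity's source query in ADF, with the cutoff supplied as a pipeline parameter or dynamic expression.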

Not sure what pipeline you're referring to. That said: Cosmos DB doesn't have any built-in backup tools. You'd need to select and copy this data programmatically.

If using the MongoDB API, you could pass a query parameter to the mongoexport command-line tool (to serve as your date filter), but you'd still need to run mongoexport from your VM, write to a local directory, then copy to blob storage (I don't know if you can install/run MongoDB tools in something like Azure Functions or a DevOps pipeline).
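The mongoexport route could look roughly like the sketch below. The connection string, database/collection names, and the `createdAt` field are all placeholders (the actual export and upload commands are shown commented out, since they need a reachable server and valid credentials):

```shell
# Sketch: build a 30-day date filter for mongoexport's --query option.
# Uses GNU date; the "createdAt" field name is a placeholder for whatever
# date field your documents actually have.
CUTOFF=$(date -u -d "@$(( $(date +%s) - 30*24*60*60 ))" +%Y-%m-%dT%H:%M:%SZ)
QUERY="{\"createdAt\": {\"\$lt\": {\"\$date\": \"$CUTOFF\"}}}"
echo "$QUERY"

# Export matching documents, then copy the file to Blob Storage:
# mongoexport --uri "mongodb://<account>.mongo.cosmos.azure.com:10255/mydb" \
#   --collection mycollection --query "$QUERY" --out backup.json
# az storage blob upload --account-name <storage-account> \
#   --container-name backups --name backup.json --file backup.json
```

The upload step uses the Azure CLI rather than a manual copy, which avoids needing a VM with local MongoDB tools beyond mongoexport itself.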
