
How to write only non-existing records to Cosmos DB using the Azure Cosmos DB Spark connector?

I am using Databricks to write data from a CSV file to Cosmos DB with the Spark connector. My Cosmos DB container already holds a few records, so when I run the Databricks notebook, it should write only the records that don't already exist in the DB. I tried SaveMode.Ignore, but it doesn't help.

df.write.mode(SaveMode.Ignore).cosmosDB(writeConfig)

Ideally, SaveMode.Ignore would skip the existing records and write only the ones that don't exist in the DB, but that is not what happens.

It would be a great help if anyone has suggestions on how to achieve this.

Thanks.

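Create the container with a unique key on a field that is unique in the CSV file. Cosmos DB then rejects any write that would duplicate a unique key value. Note that a unique key policy can only be set when the container is created, and uniqueness is enforced within each logical partition.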
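As a rough sketch of what creating such a container could look like from a notebook, assuming the Azure Cosmos DB Java SDK v4 (`com.azure:azure-cosmos`) is attached to the cluster. The environment variable names, the database `mydb`, the container `records`, the partition key path `/pk` and the unique field `/recordId` are all placeholders made up for illustration, not values from the question:

```scala
import java.util.Collections

import com.azure.cosmos.CosmosClientBuilder
import com.azure.cosmos.models.{CosmosContainerProperties, UniqueKey, UniqueKeyPolicy}

// Assumption: endpoint and key are supplied through these (hypothetical) environment variables.
val client = new CosmosClientBuilder()
  .endpoint(sys.env("COSMOS_ENDPOINT"))
  .key(sys.env("COSMOS_KEY"))
  .buildClient()

// Hypothetical database name.
client.createDatabaseIfNotExists("mydb")
val database = client.getDatabase("mydb")

// Declare /recordId (a placeholder for the unique CSV field) as a unique key.
val uniqueKey = new UniqueKey(Collections.singletonList("/recordId"))
val uniqueKeyPolicy = new UniqueKeyPolicy()
  .setUniqueKeys(Collections.singletonList(uniqueKey))

// The policy has to be part of the container definition at creation time.
val containerProperties = new CosmosContainerProperties("records", "/pk")
  .setUniqueKeyPolicy(uniqueKeyPolicy)

database.createContainerIfNotExists(containerProperties)
client.close()
```

With a container like this, an insert that repeats an existing `/recordId` value in the same logical partition is rejected by the service. How the Spark connector reacts to those rejections (skipping the row or failing the job) depends on the connector version and its write settings, so that is worth testing on a small sample first.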

More info: https://docs.microsoft.com/en-us/azure/cosmos-db/unique-keys
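A different approach from the unique key, which may be worth trying if the container already exists and cannot be recreated, is to filter the DataFrame on the Spark side before writing, using a left anti join against the records already in the container. The snippet below is only a sketch with plain Spark: the helper name `onlyNew`, the key column `id` and the two in-memory DataFrames are assumptions standing in for the real CSV data and for the data read back from Cosmos DB.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// In a Databricks notebook this returns the already running session.
val spark = SparkSession.builder().master("local[1]").appName("only-new-records").getOrCreate()
import spark.implicits._

// Hypothetical helper: keep only the incoming rows whose key columns
// match no row in `existing` (left anti join on the key columns).
def onlyNew(incoming: DataFrame, existing: DataFrame, keyCols: Seq[String]): DataFrame =
  incoming.join(existing.select(keyCols.head, keyCols.tail: _*), keyCols, "left_anti")

// Stand-in for the records already stored in the container.
val existing = Seq(("1", "alice"), ("2", "bob")).toDF("id", "name")
// Stand-in for the rows loaded from the CSV file.
val incoming = Seq(("2", "bob"), ("3", "carol")).toDF("id", "name")

// Only the row with id "3" remains, because id "2" is already present.
val toWrite = onlyNew(incoming, existing, Seq("id"))
toWrite.show()
```

In the notebook, `existing` would be the DataFrame read from the container, and `toWrite` would go through the same `write...cosmosDB(writeConfig)` call as in the question. The check and the write are not atomic, so a concurrent writer could still insert a duplicate in between. A unique key on the container closes that gap.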
