Azure Search Service comparing documents to be uploaded and deleting

I'm very new to the Azure Search Service. For my current project, I am uploading a large number of documents to an Azure Search index. We will be using the Azure Cognitive Search REST API (documentation here https://docs.microsoft.com/en-us/rest/api/searchservice/addupdate-or-delete-documents ) to add new documents and update existing ones using the mergeOrUpload action. This approach is fine so long as we are only adding or updating data.

I have been trying to find out whether there is a way of comparing the documents already in the index with what I am about to upload, to see if there is any data that should be deleted. That is, what I am about to upload contains some documents that should no longer be in the index, and I want to delete only those specific ones. I can't see that any of the upload, merge, etc. actions will help here. There is a delete action, but it removes a specified document and relies on me knowing exactly which document needs to be deleted. If possible, I'd prefer a way of comparing that removes the need for any manual intervention. Does anyone know of a way to handle this?

You need to define a unique ID (the key field) for the documents in your index. With mergeOrUpload, Azure Cognitive Search checks whether a document with the ID you are sending already exists. If it does, the fields you supply are merged into the existing document; if there is no match for the document ID, the document is inserted as a new one.
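As a rough illustration of the above, here is a minimal Python sketch that builds the JSON body for a mergeOrUpload batch. The key field name `id`, the `title` field, and the helper name `build_merge_or_upload_batch` are assumptions made for the example; your index schema will differ. The sketch only builds the payload and does not send it.

```python
import json


def build_merge_or_upload_batch(docs, key_field="id"):
    """Wrap each document in a mergeOrUpload action.

    `key_field` is an assumed name for the index key; every document
    must carry it so the service can match it to an existing document.
    """
    actions = []
    for doc in docs:
        if key_field not in doc:
            raise ValueError(f"document is missing key field {key_field!r}")
        actions.append({"@search.action": "mergeOrUpload", **doc})
    # The indexing endpoint expects the actions under a top-level "value" array.
    return {"value": actions}


# Hypothetical documents for illustration only.
batch = build_merge_or_upload_batch([
    {"id": "1", "title": "First document"},
    {"id": "2", "title": "Second document"},
])
print(json.dumps(batch, indent=2))
```

The resulting body would be POSTed to the index's documents indexing endpoint described in the documentation linked in the question.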

There is a difference between using the API directly to push content like you describe here vs. defining a data source.

If you want deletes to be handled for you, you can upload your content to Azure Blob Storage or another content source that is supported out of the box. In this scenario, you define a data source and an indexer and wire them up to your storage. As you add, change, and delete content, the necessary changes are reflected in Azure Search for you (for deletions, the data source needs a deletion detection policy, such as soft delete). See the article Import data wizard for Azure Cognitive Search for a step-by-step example.

When you use the API, you are responsible for keeping track of the state of your documents. When content is added, changed, or deleted, you have to do what's necessary to reflect that in the index.
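One way to do that bookkeeping yourself is to diff the keys currently in the index against the keys in the incoming set and emit delete actions for the ones that are missing. The sketch below assumes the incoming documents represent the complete desired state of the index, that the key field is called `id`, and that you have already collected the existing keys (for example by paging through a query that selects only the key field). The function names are hypothetical, and nothing here calls the service.

```python
MAX_ACTIONS_PER_BATCH = 1000  # the indexing API accepts up to 1000 actions per request


def plan_sync_actions(existing_keys, incoming_docs, key_field="id"):
    """Return mergeOrUpload actions for incoming docs plus delete actions
    for keys that are in the index but absent from the incoming set."""
    incoming_keys = {doc[key_field] for doc in incoming_docs}
    actions = [{"@search.action": "mergeOrUpload", **doc} for doc in incoming_docs]
    stale_keys = sorted(set(existing_keys) - incoming_keys)
    # A delete action only needs the key of the document to remove.
    actions += [{"@search.action": "delete", key_field: key} for key in stale_keys]
    return actions


def to_batches(actions, size=MAX_ACTIONS_PER_BATCH):
    """Split the action list into request bodies of at most `size` actions."""
    for start in range(0, len(actions), size):
        yield {"value": actions[start:start + size]}


# Hypothetical data: the index holds 1, 2, 3; the new full set holds 2 and 4.
existing = ["1", "2", "3"]
incoming = [{"id": "2", "title": "B"}, {"id": "4", "title": "D"}]
for body in to_batches(plan_sync_actions(existing, incoming)):
    print(body)
```

Each request body would then be sent to the same indexing endpoint used for mergeOrUpload. If the incoming data is only a partial update rather than the full set, this diff would delete too much, so the source data would need an explicit deletion marker instead.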
