How do I prevent a version conflict when reindexing/adding the same document back into the Solr core?

I have a Solr core containing 60k documents. I have updated the field types in schema.xml, and I do not want to delete the core in order to reindex. Instead, I retrieve the documents with a Solr search and then try to add each document back into Solr under the same id. When I do this, I get a version conflict.

Example: I retrieve one document using a pysolr search request. The document looks like this:

doc = {

The above document still exists in Solr and I do not want to change it. I want to reindex it (add it back into Solr) because the field types in schema.xml have changed.

When I do:

import pysolr

core = pysolr.Solr('http://localhost:10000/solr/core', always_commit=True)
core.add([doc])  # re-add the retrieved document, _version_ included

I get the following error:

pysolr.SolrError: Solr responded with an error (HTTP 409): [Reason: version conflict for person_abcd expected=1691404871556661248 actual=1691426574942863360]

Why does the 'actual' version change instead of staying the same as the 'expected' version?

How can I solve this (examples are appreciated)?

The _version_ field is used internally by Solr to manage partial updates and the update log. You should not include it in your documents when reindexing. Just remove it.
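As a rough illustration of removing the field, here is a minimal sketch. The helper names strip_version and reindex_all, the batch size, and the sort on an "id" field are assumptions made for this example, not something from the question. It also assumes every field you need is returned by the search, since Solr can only give back fields it actually keeps (for example stored fields).

```python
def strip_version(doc):
    """Return a copy of a Solr document without the internal _version_ field."""
    return {key: value for key, value in doc.items() if key != "_version_"}


def reindex_all(core_url, batch_size=500):
    """Read documents back from the core in pages and re-add them without _version_.

    Hypothetical helper: assumes the unique key field is called "id".
    """
    import pysolr  # imported here so strip_version can be used without pysolr

    core = pysolr.Solr(core_url, always_commit=False)
    start = 0
    while True:
        # A stable sort keeps the paging consistent while documents are re-added.
        results = core.search("*:*", start=start, rows=batch_size, sort="id asc")
        docs = [strip_version(d) for d in results.docs]
        if not docs:
            break
        core.add(docs)
        start += batch_size
    core.commit()
```

With the field stripped, Solr treats each add as a plain overwrite of the document with that id and assigns a fresh _version_ itself, so the expected/actual comparison never happens.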

If you need Solr's Optimistic Concurrency feature, the _version_ must be specified as part of the update command in the request, not in the documents.
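For the optimistic concurrency case, one way to pass _version_ on the request is as a query parameter of the /update handler. The sketch below only builds the HTTP request using the standard library. The function name build_versioned_update is hypothetical, and the URL and version value are taken from the question purely as sample data.

```python
import json
import urllib.parse
import urllib.request


def build_versioned_update(core_url, doc, expected_version):
    """Build a POST to /update that succeeds only if the stored version matches.

    The document itself should not carry a _version_ field; the expected
    version travels as a request parameter instead.
    """
    body = json.dumps([doc]).encode("utf-8")
    query = urllib.parse.urlencode({"_version_": expected_version, "commit": "true"})
    return urllib.request.Request(
        f"{core_url}/update?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example (not sent here): Solr answers with HTTP 409 if the version differs.
# req = build_versioned_update("http://localhost:10000/solr/core",
#                              {"id": "person_abcd"}, 1691426574942863360)
# urllib.request.urlopen(req)
```

If the stored version no longer matches, the usual pattern is to fetch the document again, reapply the change, and retry with the new version.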


