Lucene: deleting from the index based on multiple fields

Question

I need to delete a document from a Lucene search index. The standard approach:

indexReader.deleteDocuments(new Term("field_name", "field value"));

won't do the trick, because I need to perform the deletion based on multiple fields. I need something like this:

(pseudo code)
TermAggregator terms = new TermAggregator();
terms.add(new Term("field_name1", "field value 1"));
terms.add(new Term("field_name2", "field value 2"));
indexReader.deleteDocuments(terms.toTerm());

Is there any construct for that?

Answer 1

IndexWriter has methods that allow more powerful deleting, such as IndexWriter.deleteDocuments(Query). You can build a BooleanQuery that is the conjunction of the terms identifying the documents you wish to delete, and pass that.
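
To make the conjunction concrete, here is a small plain-Java model of a delete that requires every listed field to match. It does not use Lucene: ConjunctiveDeleteModel, deleteMatchingAll and the sample documents are hypothetical names invented for this illustration. It only shows the intended semantics, namely that a document is removed when all field/value pairs match, not when just one of them does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model (no Lucene) of deleting by a conjunction of field/value pairs.
// All names here are hypothetical and exist only for illustration.
class ConjunctiveDeleteModel {

    // Removes every document whose fields match ALL of the given criteria.
    // Returns the number of documents removed. In this model, an empty
    // criteria map removes nothing.
    static int deleteMatchingAll(List<Map<String, String>> docs, Map<String, String> criteria) {
        if (criteria.isEmpty()) {
            return 0;
        }
        int before = docs.size();
        docs.removeIf(doc -> criteria.entrySet().stream()
                .allMatch(c -> c.getValue().equals(doc.get(c.getKey()))));
        return before - docs.size();
    }

    // Builds a small mutable sample "index" of three documents.
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("field value 1", "field value 2")); // matches both fields
        docs.add(doc("field value 1", "other"));         // matches field_name1 only
        docs.add(doc("other", "field value 2"));         // matches field_name2 only
        return docs;
    }

    private static Map<String, String> doc(String value1, String value2) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("field_name1", value1);
        d.put("field_name2", value2);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = sampleDocs();
        Map<String, String> criteria = new LinkedHashMap<>();
        criteria.put("field_name1", "field value 1");
        criteria.put("field_name2", "field value 2");
        int removed = deleteMatchingAll(docs, criteria);
        System.out.println("removed=" + removed + ", remaining=" + docs.size());
    }
}
```

Only the first sample document is removed. The two documents that match a single field survive, which is exactly what a delete on one Term cannot guarantee.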

Answer 2

Choice of Analyzer

First of all, watch out which analyzer you are using. I was stumped for a while, only to realise that the StandardAnalyzer filters out common words like 'the' and 'a'. This is a problem when your field has the value 'A', because that value is stripped before it can be matched. You might want to consider the KeywordAnalyzer:

See this post about the analyzer.

// Create an analyzer:
// NOTE: We want the keyword analyzer so that it doesn't strip or alter any terms:
// In our example, the Standard Analyzer removes the term 'A' because it is a common English word.
// https://stackoverflow.com/a/9071806/231860
KeywordAnalyzer analyzer = new KeywordAnalyzer();
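
The difference between the two strategies can be sketched in plain Java. This is a simplified model and not Lucene's actual analysis chain: AnalyzerModel, its two methods and the three-word stop list are assumptions made up for the illustration. It shows why a field value of 'A' yields no tokens under stop-word filtering but survives when the whole value is kept as a single token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified model (no Lucene) of two analysis strategies.
// The stop-word list and both methods are hypothetical, for illustration only.
class AnalyzerModel {

    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the"));

    // Roughly what a stop-word-filtering analyzer does: split into words,
    // lowercase them, and drop common words.
    static List<String> stopFilteringTokens(String value) {
        List<String> tokens = new ArrayList<>();
        for (String word : value.split("\\W+")) {
            String lower = word.toLowerCase(Locale.ROOT);
            if (!lower.isEmpty() && !STOP_WORDS.contains(lower)) {
                tokens.add(lower);
            }
        }
        return tokens;
    }

    // Roughly what a keyword-style analyzer does: keep the whole value,
    // unchanged, as one token.
    static List<String> keywordTokens(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        System.out.println(stopFilteringTokens("A")); // prints []
        System.out.println(keywordTokens("A"));       // prints [A]
    }
}
```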

Query Parser

Next, you can either create your query using the QueryParser:

See this post about overriding the default operator.

// Create a query parser without a default field in this example (the first argument):
QueryParser queryParser = new QueryParser("", analyzer);

// Optionally, set the default operator to be AND (we leave it the default OR):
// https://stackoverflow.com/a/9084178/231860
// queryParser.setDefaultOperator(QueryParser.Operator.AND);

// Parse the query:
Query multiTermQuery = queryParser.parse("field_name1:\"field value 1\" AND field_name2:\"field value 2\"");
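
When the field values come from variables rather than literals, the query string has to be assembled and quoted by hand. A minimal sketch of that step in plain Java follows. QueryStringModel, quote and conjunction are hypothetical helpers, not part of Lucene, and the sketch assumes field names contain no characters that are special in the query syntax. Inside a quoted phrase it escapes only the backslash and the double quote.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Lucene) that builds a query string of the
// form field1:"value1" AND field2:"value2" from a map of criteria.
class QueryStringModel {

    // Wraps a value in double quotes, escaping backslashes and double quotes.
    static String quote(String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Joins all field:"value" clauses with an explicit AND, in map order.
    // Assumes field names need no escaping.
    static String conjunction(Map<String, String> criteria) {
        StringJoiner joiner = new StringJoiner(" AND ");
        for (Map.Entry<String, String> entry : criteria.entrySet()) {
            joiner.add(entry.getKey() + ":" + quote(entry.getValue()));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> criteria = new LinkedHashMap<>();
        criteria.put("field_name1", "field value 1");
        criteria.put("field_name2", "field value 2");
        // prints field_name1:"field value 1" AND field_name2:"field value 2"
        System.out.println(conjunction(criteria));
    }
}
```

The resulting string is the same one passed to queryParser.parse above.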

Query API

Or you can achieve the same by constructing the query yourself using the query API:

See this tutorial about creating the BooleanQuery.

BooleanQuery multiTermQuery = new BooleanQuery();
multiTermQuery.add(new TermQuery(new Term("field_name1", "field value 1")), BooleanClause.Occur.MUST);
multiTermQuery.add(new TermQuery(new Term("field_name2", "field value 2")), BooleanClause.Occur.MUST);

Numeric Field Queries (Int etc.)

When the key fields are numeric (an IntField, for example), you can't use a TermQuery. Instead you must use a NumericRangeQuery whose lower and upper bounds are both set to the value you want to match, with both ends inclusive.

See the answer to this question.

// NOTE: For IntFields, we need NumericRangeQueries:
// https://stackoverflow.com/a/14076439/231860
BooleanQuery multiTermQuery = new BooleanQuery();
multiTermQuery.add(NumericRangeQuery.newIntRange("field_name1", 1, 1, true, true), BooleanClause.Occur.MUST);
multiTermQuery.add(NumericRangeQuery.newIntRange("field_name2", 2, 2, true, true), BooleanClause.Occur.MUST);

Delete the Documents that Match the Query

Finally, we pass the query to the writer to delete the documents that match it:

See the answer to this question.

// Remove the document by using a multi key query:
// http://www.avajava.com/tutorials/lessons/how-do-i-combine-queries-with-a-boolean-query.html
writer.deleteDocuments(multiTermQuery);

Answer 1
2 Accepted 2011-01-31 13:22:23

Answer 2
0 2017-05-13 12:16:08