
Remove Lucene document by query of not analyzed text field

I'm 99% sure I had this working in the past, but maybe I'm wrong.

Anyway, I'd like to delete a Lucene document by a Field that is stored but not analyzed and contains text.

The problem, it seems, is that calling indexWriter.deleteDocuments(query) doesn't delete the document unless the field referenced in query is Field.Index.ANALYZED or contains a simple number.

Some code:

Integer myId = 1234;
Document doc = new Document();
// Field takes a String value, so convert the Integer explicitly
Field field = new Field("MyIdField", String.valueOf(myId), Field.Store.YES, Field.Index.ANALYZED);
doc.add(field);
indexWriter.addDocument(doc);
indexWriter.commit();

...

QueryParser parser = new QueryParser(VERSION, "MyIdField", ANALYZER);
Query query = parser.parse("MyIdField:1234");
indexWriter.deleteDocuments(query);
indexWriter.commit();

Everything works!
Sweet. What if the field is not analyzed?

Field field = new Field("MyIdField", String.valueOf(myId), Field.Store.YES, Field.Index.NOT_ANALYZED);

Still works!
Awesome. What if it's not just a number?

Field field = new Field("MyIdField", "ID" + myId, Field.Store.YES, Field.Index.NOT_ANALYZED);
...
Query query = parser.parse("MyIdField:ID1234");

Doesn't work! Darn.
The query doesn't match the document, so it isn't deleted.
What if we do analyze it?

Field field = new Field("MyIdField", "ID" + myId, Field.Store.YES, Field.Index.ANALYZED);
...
Query query = parser.parse("MyIdField:ID1234");

It works again!

OK, so a field that is not analyzed can still be queried, but only if it contains just a number? Am I missing something?

Thanks for taking the time.

Note:
Technically, there are two fields, making it an AND query. As such, I'd prefer to delete the documents with a Query rather than a Term. I'm not sure if that makes a difference, but I wanted to emphasize that I would like to stick with a solution using a Query.

According to this question, you have to use a PhraseQuery to search a not analyzed field. Your code

Query query = parser.parse("MyIdField:ID1234");

would yield a TermQuery instead, and thus won't match.

I recommend trying a KeywordAnalyzer instead. Remember that even if your field isn't analyzed, the query parser can still analyze your query string, so the match could fail anyway.
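To illustrate why a purely numeric ID can still match while "ID1234" does not, here is a toy model in plain Java. It uses no Lucene classes. It assumes that ANALYZER lowercases tokens, as StandardAnalyzer does; the question does not say which analyzer is used, so treat that as an assumption. All names in it (NotAnalyzedMatchDemo, indexNotAnalyzed, matches, LOWERCASING, KEYWORD) are made up for the illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

// Toy model only: this is not Lucene code and uses no Lucene classes.
class NotAnalyzedMatchDemo {
    // Stand-in for the terms of a NOT_ANALYZED field: values are stored verbatim.
    static final Set<String> indexedTerms = new HashSet<>();

    // Hypothetical stand-in for an analyzer that lowercases query text (assumption).
    static final UnaryOperator<String> LOWERCASING = s -> s.toLowerCase(Locale.ROOT);

    // Hypothetical stand-in for keyword-style analysis: the text is left untouched.
    static final UnaryOperator<String> KEYWORD = s -> s;

    // Adds the value to the index exactly as given, with no analysis.
    static void indexNotAnalyzed(String value) {
        indexedTerms.add(value);
    }

    // Runs the query text through the given analyzer, then does an exact term lookup.
    static boolean matches(String queryText, UnaryOperator<String> queryAnalyzer) {
        return indexedTerms.contains(queryAnalyzer.apply(queryText));
    }

    public static void main(String[] args) {
        indexNotAnalyzed("1234");
        indexNotAnalyzed("ID1234");
        System.out.println(matches("1234", LOWERCASING));   // digits are unchanged by lowercasing
        System.out.println(matches("ID1234", LOWERCASING)); // looks up "id1234", which was never indexed
        System.out.println(matches("ID1234", KEYWORD));     // looks up "ID1234" verbatim
    }
}
```

In real Lucene code, the counterpart of KEYWORD in this sketch would be passing a KeywordAnalyzer to the QueryParser, as suggested above.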
