
Storing intervals in the Lucene index

I have documents with annotated zones, say 'title', 'body', and 'comments' (zones may also be nested). I want to search for the word 'Obama' in the 'title' zone. I can use a SpanQuery like word:'Obama' & zone:'title' matching at the same position, but that means I need to store a zone attribute for every word position in the document. Can I instead store zones as interval coordinates and then run queries only inside those intervals?

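It seems messy, but you could store the interval with each word as a Dewey-Decimal-like encoded hierarchy (see my Stupid Lucene Tricks: Hierarchies), which would let you search at any part of the hierarchy (all text, all titles, only title semantics, etc.).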
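To make the idea concrete, here is a minimal sketch in plain Python rather than Lucene itself. It models a positional index in which every token position carries one term per ancestor of its zone path, so a query on any level of the hierarchy becomes an exact term match at the same position. The zone numbering (`1` = title, `2` = body, `2.1` = comments nested in body), the class `TinyPositionalIndex`, and the helper `zone_prefixes` are all hypothetical names chosen for illustration; they are not part of any Lucene API.

```python
from collections import defaultdict


def zone_prefixes(path):
    """Expand a zone path such as '2.1' into itself plus all ancestors: ['2', '2.1']."""
    parts = path.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


class TinyPositionalIndex:
    """Toy stand-in for a positional inverted index (hypothetical, not Lucene)."""

    def __init__(self):
        # term -> doc_id -> set of token positions
        self.postings = defaultdict(lambda: defaultdict(set))

    def add(self, doc_id, tokens):
        """tokens is a list of (word, zone_path) pairs in document order."""
        for pos, (word, path) in enumerate(tokens):
            self.postings["word:" + word.lower()][doc_id].add(pos)
            # Index every ancestor so that a search on '2' also covers '2.1'.
            for prefix in zone_prefixes(path):
                self.postings["zone:" + prefix][doc_id].add(pos)

    def search(self, word, zone_path):
        """Return {doc_id: positions} where the word occurs inside the zone."""
        words = self.postings.get("word:" + word.lower(), {})
        zones = self.postings.get("zone:" + zone_path, {})
        hits = {}
        for doc_id, positions in words.items():
            common = positions & zones.get(doc_id, set())
            if common:
                hits[doc_id] = sorted(common)
        return hits


# Assumed zone codes: '1' = title, '2' = body, '2.1' = comments inside body.
index = TinyPositionalIndex()
index.add(1, [("Obama", "1"), ("speaks", "1"),
              ("Today", "2"), ("Obama", "2"), ("nice", "2.1")])
index.add(2, [("Budget", "1"), ("news", "2"), ("Obama", "2.1")])

title_hits = index.search("Obama", "1")      # title only
body_hits = index.search("Obama", "2")       # body, including nested comments
comment_hits = index.search("Obama", "2.1")  # comments only
```

Because each ancestor path is indexed as its own term, zone `1` never collides with a zone such as `10`, which a naive string-prefix match would confuse. The cost is the one the question already points out: the zone is still recorded for each word position, only now in a form that handles nesting.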

