Multi-term phrases in Lucene

I am reading the Lucene in Action book and I do not understand the part about multi-term phrases.

The following text is indexed:

the quick brown fox jumped over the lazy dog

You then add the following terms to the PhraseQuery: quick jumped lazy, with a slop of 4. That results in a match, but I don't understand how. How do you calculate the number of moves when there are more than two terms?

The same goes for the terms lazy jumped quick with a slop of 8.

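The slop is effectively an edit distance: the number of single-position moves needed to turn the query phrase into the arrangement of those terms found in the document. Each extra position opened up between two terms adds 1 to the distance. Transposing two adjacent terms adds 2: the first move puts the two terms on top of one another, and the second moves one past the other.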
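For a two-term phrase, those two rules reduce to a little arithmetic. The sketch below is plain Python, not Lucene code. `pair_slop` is a hypothetical helper, and it assumes each term occurs once and that positions are zero-based token indexes. It measures how far the second query term sits from the slot an exact phrase would put it in.

```python
def pair_slop(first_pos: int, second_pos: int) -> int:
    """Moves needed for the two-term phrase "first second" to match.

    first_pos and second_pos are the document positions of the first and
    second query terms (hypothetical inputs for illustration). An exact
    phrase expects the second term directly after the first, so every
    position of deviation costs one move.
    """
    expected_second_pos = first_pos + 1
    return abs(second_pos - expected_second_pos)


# Document "quick brown fox": quick=0, brown=1, fox=2.
print(pair_slop(0, 2))  # query "quick fox": one extra term in between -> 1
print(pair_slop(2, 0))  # query "fox quick": transposition plus the gap -> 3

# Document "quick fox": quick=0, fox=1.
print(pair_slop(1, 0))  # query "fox quick": a bare transposition -> 2
```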

You can go through the edits one at a time to illustrate:

  • quick jumped lazy (distance: 0)
  • quick _ jumped lazy (distance: 1)
  • quick _ _ jumped lazy (distance: 2)
  • quick _ _ jumped _ lazy (distance: 3)
  • quick _ _ jumped _ _ lazy (distance: 4)

And for the second case:

  • lazy jumped quick (distance: 0)
  • lazy/jumped quick (distance: 1)
  • lazy/jumped/quick (distance: 2; all three terms superimposed, in the same position)
  • quick lazy/jumped (distance: 3)
  • quick jumped lazy (distance: 4)
  • quick _ jumped lazy (distance: 5)
  • quick _ _ jumped lazy (distance: 6)
  • quick _ _ jumped _ lazy (distance: 7)
  • quick _ _ jumped _ _ lazy (distance: 8)
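Both walk-throughs can also be checked numerically. The sketch below is plain Python rather than Lucene code, and `required_slop` is a hypothetical helper. It assumes every query term occurs in the document (for a repeated token only the first occurrence is considered) and that tokens are split on whitespace with no stop-word removal. For each query term it takes the document position minus the query position. The spread between the largest and smallest of those offsets is the number of moves, which is, as far as I understand it, the quantity the sloppy phrase matching compares against the slop.

```python
def required_slop(document: str, query: str) -> int:
    """Smallest slop at which `query` would match `document`.

    Illustrative only: uses the first occurrence of each query term and
    raises ValueError if a query term is missing from the document.
    """
    doc_tokens = document.split()
    query_terms = query.split()

    # Offset of each term: where it is in the document minus where the
    # query phrase expects it. In an exact match all offsets are equal.
    offsets = [
        doc_tokens.index(term) - query_pos
        for query_pos, term in enumerate(query_terms)
    ]
    return max(offsets) - min(offsets)


text = "the quick brown fox jumped over the lazy dog"
print(required_slop(text, "quick jumped lazy"))  # 4
print(required_slop(text, "lazy jumped quick"))  # 8
```

Under these assumptions, a query matches when the returned value is less than or equal to the slop set on the PhraseQuery, which agrees with the slops of 4 and 8 from the book.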
