I am reading the Lucene in Action book and I do not understand the multi-term phrases part.
The following text is indexed:
the quick brown fox jumped over the lazy dog
You then add the following terms to the PhraseQuery: quick, jumped, lazy, with a slop of 4. That results in a match, but I don't understand how. How do you calculate the number of moves when there are multiple terms?
The same goes for the terms lazy, jumped, quick with a slop of 8.
The slop is actually an edit distance. Each extra position inserted between terms adds 1 to the distance; transposing two adjacent terms adds 2 (the first edit moves the two terms atop one another, the second moves them past each other).
You can go through the edits one at a time to illustrate:
quick jumped lazy (distance: 0)
quick _ jumped lazy (distance: 1)
quick _ _ jumped lazy (distance: 2)
quick _ _ jumped _ lazy (distance: 3)
quick _ _ jumped _ _ lazy (distance: 4)
And for the second case:
lazy jumped quick (distance: 0)
lazy/jumped quick (distance: 1)
lazy/jumped/quick (distance: 2; all three terms superimposed, in the same position)
quick lazy/jumped (distance: 3)
quick jumped lazy (distance: 4)
quick _ jumped lazy (distance: 5)
quick _ _ jumped lazy (distance: 6)
quick _ _ jumped _ lazy (distance: 7)
quick _ _ jumped _ _ lazy (distance: 8)
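If you would rather compute the number than step through the moves, here is a rough sketch in plain Java (no Lucene dependency). It is my own simplified model, not Lucene's scorer: for each query term it takes the term's position in the document minus its position in the phrase, and the required slop is the spread (max minus min) of those shifted positions. It assumes every phrase term occurs exactly once in the document; the class and method names (SlopSketch, requiredSlop) are made up for illustration. As far as I can tell this agrees with the edit-by-edit walk for both cases.

```java
import java.util.Arrays;
import java.util.List;

final class SlopSketch {

    /**
     * Smallest slop at which the phrase would match, under the simplifying
     * assumption that each phrase term occurs exactly once in the document.
     * Returns -1 if a phrase term does not occur in the document at all.
     */
    static int requiredSlop(List<String> docTokens, List<String> phraseTerms) {
        if (phraseTerms.isEmpty()) {
            return 0;
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < phraseTerms.size(); i++) {
            // First occurrence only; repeated terms are out of scope for this sketch.
            int docPos = docTokens.indexOf(phraseTerms.get(i));
            if (docPos < 0) {
                return -1;
            }
            // Shift each document position back by the term's offset in the phrase.
            // An exact phrase match gives the same shifted value for every term.
            int shifted = docPos - i;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        // The spread of the shifted positions is the number of moves needed.
        return max - min;
    }

    public static void main(String[] args) {
        List<String> doc =
            Arrays.asList("the quick brown fox jumped over the lazy dog".split(" "));
        // quick=1, jumped=4, lazy=7 -> shifted 1, 3, 5 -> spread 4
        System.out.println(requiredSlop(doc, Arrays.asList("quick", "jumped", "lazy")));
        // lazy=7, jumped=4, quick=1 -> shifted 7, 3, -1 -> spread 8
        System.out.println(requiredSlop(doc, Arrays.asList("lazy", "jumped", "quick")));
    }
}
```

Running main prints 4 and then 8, the two slop values from the book. Treat it as a way to check your intuition rather than a description of what Lucene does internally, since real documents can contain a term more than once.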