Hibernate search to find partial matches of a phrase
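In my project we use Hibernate Search 4.5 with Lucene analyzers and Solr. I provide a text field for my clients. When they type a phrase, I would like to find all User entities whose names contain the given phrase.
For example, consider a list of entries with the following titles in the database:
[ Alan Smith, John Cane, Juno Taylor, Tom Caner Junior ]
jun should return Juno Taylor and Tom Caner Junior
an should return Alan Smith, John Cane and Tom Caner Junior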
@AnalyzerDef(name = "customanalyzer", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
})
@Analyzer(definition = "customanalyzer")
public class Student implements Serializable {
@Column(name = "Fname")
@Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
private String fname;
@Column(name = "Lname")
@Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
private String lname;
}
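I tried a wildcard search, but:
"Wildcard queries do not apply the analyzer on the matching terms. Otherwise the risk of * or ? being mangled is too high."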
Query luceneQuery = mythQB
.keyword()
.wildcard()
.onFields("fname")
.matching("ju*")
.createQuery();
How can I achieve this?
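First, you did not assign the analyzer to your fields, so it is currently not used. You should use @Field.analyzer.
Second, to answer your question: this kind of text is best analyzed with an EdgeNGramFilter. You should add this filter to your analyzer definition.
Edit: Also, to prevent a query such as "sathya" from matching "sachana", you should use a different analyzer at query time.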
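To make the two points above more concrete, here is a minimal plain-Java sketch of what edge n-grams look like and why n-gramming the query as well causes false positives. It does not use Lucene: the class and method names are hypothetical, stemming is left out, and a minimum gram size of 1 is assumed, so treat it as an illustration rather than the filter's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EdgeNGramSketch {

    // Returns the prefixes of a token with lengths minGram..maxGram,
    // roughly what an edge n-gram filter emits for a single token.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Index-time analysis in this sketch: whitespace split, lowercase, edge n-grams.
    // (Assumption: minGram = 1; maxGram = 15 as in the analyzer definition.)
    static List<String> indexTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            terms.addAll(edgeNGrams(token.toLowerCase(Locale.ROOT), 1, 15));
        }
        return terms;
    }

    // Query analyzed WITHOUT n-grams: the whole lowercased term must be indexed.
    static boolean matchesPlainQuery(String indexedText, String query) {
        return indexTerms(indexedText).contains(query.toLowerCase(Locale.ROOT));
    }

    // Query analyzed WITH n-grams: any shared prefix is enough to match.
    static boolean matchesNGramQuery(String indexedText, String query) {
        List<String> indexed = indexTerms(indexedText);
        for (String gram : edgeNGrams(query.toLowerCase(Locale.ROOT), 1, 15)) {
            if (indexed.contains(gram)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("junior", 1, 15));                  // [j, ju, jun, juni, junio, junior]
        System.out.println(matchesPlainQuery("Tom Caner Junior", "jun")); // true
        System.out.println(matchesNGramQuery("sachana", "sathya"));       // true: both share "s" and "sa"
        System.out.println(matchesPlainQuery("sachana", "sathya"));       // false
        System.out.println(matchesPlainQuery("Alan Smith", "an"));        // false: "an" is not a prefix
    }
}
```

Note the last line: edge n-grams only cover prefixes of each word, so "an" would not find "Alan" this way. If matches in the middle of a word are really needed, a full n-gram filter (NGramFilterFactory) is the analogous option, at the cost of a larger index.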
Below is a complete example.
// Assumption: Hibernate Search 4.5 predates Java 8 repeatable annotations,
// so several analyzer definitions are wrapped in @AnalyzerDefs.
@AnalyzerDefs({
// Index-time analyzer: lowercase, stem, then emit edge n-grams (prefixes)
@AnalyzerDef(name = "customanalyzer", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") }),
@TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = { @Parameter(name = "maxGramSize", value = "15") })
}),
// Query-time analyzer: same chain without the n-gram filter
@AnalyzerDef(name = "customanalyzer_query", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
@TokenFilterDef(factory = LowerCaseFilterFactory.class),
@TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
})
})
public class Student implements Serializable {
@Column(name = "Fname")
@Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
private String fname;
@Column(name = "Lname")
@Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
private String lname;
}
Then explicitly state that you want to use this "query" analyzer when building the query:
QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Student.class)
// Here come the assignments of "query" analyzers
.overridesForField( "fname", "customanalyzer_query" )
.overridesForField( "lname", "customanalyzer_query" )
.get();
// Then it's business as usual
Query luceneQuery = queryBuilder.keyword().onFields("fname", "lname").matching("sathya").createQuery();
FullTextQuery query = fullTextEntityManager.createFullTextQuery(luceneQuery, Student.class);
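See also: https://stackoverflow.com/a/43047342/6692043
By the way, if your data only contains first and last names, you should not use stemming (SnowballPorterFilterFactory): it will only make your search less accurate for no good reason.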
Why not use a standard TypedQuery?
(where String term is your search term)
TypedQuery<Student> q = em.createQuery(
"SELECT s " +
"FROM Student s " +
"WHERE s.fname like :search " +
"OR s.lname like :search", Student.class);
q.setParameter("search", "%" + term + "%");
I haven't tested this, but something like this should do the trick.
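One caveat with the LIKE approach: a `%` or `_` typed by the user acts as a wildcard, and whether the comparison is case-sensitive depends on the database. A possible way to handle both is sketched below. The helper name, the choice of `!` as escape character, and the use of `lower(...)` are my own assumptions, not something from the question, so adapt as needed.

```java
import java.util.Locale;

public class LikePatternSketch {

    // Hypothetical JPQL variant of the query above: lowercases the columns
    // and declares '!' as the escape character for the LIKE pattern.
    static final String JPQL =
            "SELECT s FROM Student s "
          + "WHERE lower(s.fname) LIKE :search ESCAPE '!' "
          + "OR lower(s.lname) LIKE :search ESCAPE '!'";

    // Escapes the escape character itself first, then the LIKE wildcards,
    // lowercases the term and wraps it in % for a "contains" match.
    static String toContainsPattern(String term) {
        String escaped = term
                .replace("!", "!!")
                .replace("%", "!%")
                .replace("_", "!_");
        return "%" + escaped.toLowerCase(Locale.ROOT) + "%";
    }

    public static void main(String[] args) {
        System.out.println(toContainsPattern("Jun"));   // %jun%
        System.out.println(toContainsPattern("50%_a")); // %50!%!_a%
    }
}
```

The result would then be bound with `q.setParameter("search", LikePatternSketch.toContainsPattern(term))`. Keep in mind that a pattern with a leading `%` usually cannot use a regular index, so this may get slow on large tables.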