Create list of custom stop words in Elasticsearch using Java
To improve the search results I get from Elasticsearch, I want to extend my stop word list from my Java code. Until now I have been using the default list of the stop analyzer, which does not include interrogative words such as What, Who and Why. We want to remove these words, plus some additional ones, from the query text when searching.
I have tried the code from here (the last answer):
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "standard",
          "stopwords": [ "and", "the" ]
        }
      }
    }
  }
}
I tried this from Java, but it didn't work for me.
Important query
How can we create our own list of stop words, and how do we use it in our code with a query like this one?
QueryStringQueryBuilder qb = new QueryStringQueryBuilder(text).analyzer("stop");
qb.field("question_title");
qb.field("level");
qb.field("category");
qb.field("question_tags");
SearchResponse response = client.prepareSearch("questionindex")
        .setSearchType(SearchType.QUERY_AND_FETCH)
        .setQuery(qb)
        .execute()
        .actionGet();
SearchHit[] results = response.getHits().getHits();
System.out.println("response-" + results.length);
Currently I am using the default stop analyzer, which only removes a limited set of stop words:
"a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"
But I want to extend this list.
You're on the right track. In your first listing (from the documentation about stopwords) you created a custom analyzer called my_analyzer for the index called my_index. Its effect is to remove "and" and "the" from any text that you analyze with my_analyzer.
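To make that effect concrete, here is a rough plain-Java illustration of what a standard-type analyzer with a stop word list does to a piece of text. It is only a sketch: the class and method names are made up for this example, and the tokenization is a simplification, not Elasticsearch's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopwordDemo {

    // Simplified stand-in for an analyzer with stop words:
    // split on anything that is not a letter or digit, lowercase each token,
    // then drop tokens that appear in the stop word set.
    public static List<String> analyze(String text, Set<String> stopwords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            String token = raw.toLowerCase(Locale.ROOT);
            if (!stopwords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Same two stop words as in my_analyzer.
        Set<String> stopwords = new HashSet<>(Arrays.asList("and", "the"));
        // prints [quick, fox, lazy, dog]
        System.out.println(analyze("The quick fox and the lazy dog", stopwords));
    }
}
```

Because tokens are lowercased before the stop word check in this sketch, the stop words themselves are written in lowercase.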
Now to actually use it, you should:
1. Define my_analyzer on the index you're querying (questionindex?).
2. Create a mapping for your documents that uses my_analyzer for the fields where you would like to remove "and" and "the" (for example the question_title field).
3. Test out your analyzer using the Analyze API:
GET /questionindex/_analyze?field=question.question_title&text=No quick brown fox jumps over my lazy dog and the indolent cat
4. Reindex your documents.
Try this as a starting point:
PUT /questionindex
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "standard",
          "stopwords": [ "and", "the" ]
        }
      }
    }
  },
  "mappings": {
    "question": {
      "properties": {
        "question_title": {
          "type": "string",
          "analyzer": "my_analyzer"
        },
        "level": {
          "type": "integer"
        },
        "category": {
          "type": "string"
        },
        "question_tags": {
          "type": "string"
        }
      }
    }
  }
}
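Since the question asks for a longer list built from Java, here is a minimal sketch of how the settings part of that request body could be generated. It uses only the standard library. The class name, the method names and the three extra words are assumptions for illustration; the default words are the ones quoted in the question. The resulting string would still have to be sent with the index-creation request by whatever client you use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StopwordSettings {

    // Default English stop words, as listed in the question.
    static final List<String> DEFAULT_STOPWORDS = Arrays.asList(
        "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
        "in", "into", "is", "it", "no", "not", "of", "on", "or", "such",
        "that", "the", "their", "then", "there", "these", "they", "this",
        "to", "was", "will", "with");

    // Hypothetical additions (lowercase); replace with your own words.
    static final List<String> EXTRA_STOPWORDS = Arrays.asList("what", "who", "why");

    // Default list followed by the extra words.
    public static List<String> extendedStopwords() {
        List<String> all = new ArrayList<>(DEFAULT_STOPWORDS);
        all.addAll(EXTRA_STOPWORDS);
        return all;
    }

    // Builds the "settings" JSON for a standard-type analyzer with stop words.
    // Assumes the analyzer name and the words contain no quotes or backslashes,
    // so no JSON escaping is done here.
    public static String buildSettingsJson(String analyzerName, List<String> stopwords) {
        String words = stopwords.stream()
            .map(w -> "\"" + w + "\"")
            .collect(Collectors.joining(", "));
        return "{ \"settings\": { \"analysis\": { \"analyzer\": { \""
            + analyzerName + "\": { \"type\": \"standard\", \"stopwords\": [ "
            + words + " ] } } } } }";
    }

    public static void main(String[] args) {
        System.out.println(buildSettingsJson("my_analyzer", extendedStopwords()));
    }
}
```

On the query side, note that the code in the question calls .analyzer("stop"), which asks for the built-in stop analyzer rather than the custom one. Once my_analyzer exists on questionindex, passing "my_analyzer" there instead, or leaving the call out so that the analyzer mapped to each field is used, should make the query text go through your own list.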