Elasticsearch: Getting analyzer used for indexing a given field from the client side

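Is there a way to programmatically get, from the client side, the analyzer that an Elasticsearch server instance uses to index a given field (assuming, of course, that the analyzer is available on both sides)?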

For example, given a mapping like this:

{
    "mappings": {
        "article": {
            "properties": {
                "text": {
                    "type": "string",
                    "index": "analyzed",
                    "analyzer": "spanish"
                }
            }
        }
    }
}

How can I use Elasticsearch's Java client to get an org.apache.lucene.analysis.es.SpanishAnalyzer for the field text, as in the code below?

import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Collections;

import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;

public class QueryAnalyzerTest {

    public static void main(final String[] args) throws UnknownHostException {
        final String docTextFieldName = "text";
        Iterable<SearchHit> hits = Collections.emptyList();

        try (final Client client = TransportClient.builder().build()
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300))) {
            final QueryBuilder queryBuilder = QueryBuilders.matchQuery(docTextFieldName, "anuncio");
            final SearchRequestBuilder searchRequestBuilder = client.prepareSearch("news").setQuery(queryBuilder)
                    .setTypes("article");
            final SearchResponse response = searchRequestBuilder.get();
            hits = response.getHits();
        }

        hits.forEach(hit -> {
            final String docText = (String) hit.getSource().get(docTextFieldName);
            // TODO: Tokenize "docText" with the exact same tokenizer used when
            // indexing the field
        });

    }

}

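You can definitely get the mapping of the text field programmatically with client().admin().indices().prepareGetFieldMappings("indexName"), and from it you can retrieve the logical name of the analyzer (i.e. "spanish"), but you will not get the analyzer's class name.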

For that, you need to call AnalysisRegistry.getAnalyzer("spanish"), which will give you the correct analyzer instance.
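
The two steps above (read the logical analyzer name from the field mapping, then resolve that name to an analyzer) can be sketched with plain Java collections. This is only an illustration under stated assumptions: the nested map stands in for the mapping from the question, rather than a real response from the Elasticsearch client, and AnalyzerNameResolver with its lookup table is a hypothetical helper, not part of Elasticsearch or Lucene. Only the Lucene class names in the table are real.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class AnalyzerNameResolver {

    // Hypothetical client-side table: logical analyzer name -> Lucene analyzer class name.
    private static final Map<String, String> ANALYZER_CLASSES = new HashMap<>();
    static {
        ANALYZER_CLASSES.put("spanish", "org.apache.lucene.analysis.es.SpanishAnalyzer");
        ANALYZER_CLASSES.put("english", "org.apache.lucene.analysis.en.EnglishAnalyzer");
        ANALYZER_CLASSES.put("standard", "org.apache.lucene.analysis.standard.StandardAnalyzer");
    }

    /** Walks mappings -> type -> properties -> field and returns its "analyzer" value, if present. */
    static Optional<String> analyzerNameOf(Map<String, Object> root, String type, String field) {
        Object node = root;
        for (String key : new String[] {"mappings", type, "properties", field, "analyzer"}) {
            if (!(node instanceof Map)) {
                return Optional.empty(); // path does not exist in this mapping
            }
            node = ((Map<?, ?>) node).get(key);
        }
        return node instanceof String ? Optional.of((String) node) : Optional.empty();
    }

    /** Resolves a logical analyzer name to a class name using the hypothetical table above. */
    static Optional<String> analyzerClassFor(String logicalName) {
        return Optional.ofNullable(ANALYZER_CLASSES.get(logicalName));
    }

    /** Builds an in-memory copy of the mapping shown in the question. */
    static Map<String, Object> sampleMapping() {
        Map<String, Object> text = new HashMap<>();
        text.put("type", "string");
        text.put("index", "analyzed");
        text.put("analyzer", "spanish");

        Map<String, Object> properties = new HashMap<>();
        properties.put("text", text);

        Map<String, Object> article = new HashMap<>();
        article.put("properties", properties);

        Map<String, Object> mappings = new HashMap<>();
        mappings.put("article", article);

        Map<String, Object> root = new HashMap<>();
        root.put("mappings", mappings);
        return root;
    }

    public static void main(String[] args) {
        Optional<String> name = analyzerNameOf(sampleMapping(), "article", "text");
        Optional<String> className = name.flatMap(AnalyzerNameResolver::analyzerClassFor);
        System.out.println("analyzer name:  " + name.orElse("(none)"));
        System.out.println("analyzer class: " + className.orElse("(unknown)"));
    }
}
```

A hand-maintained table like this only covers analyzers you list yourself. Custom analyzers defined in the index settings have no single class name, so they would need to be rebuilt on the client from those settings.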
