How can I search with Unicode characters in Elasticsearch?

I have indexed a MySQL column into Elasticsearch, and this column contains values in several languages (AR/EN/RO). How can I search within these indexes with a Unicode string?

$hosts = ['localhost:9200'];
$client = \Elasticsearch\ClientBuilder::create()->setHosts($hosts)->build();

$body = '{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {"query": {"wildcard": {"text": {"value": "*'.$term.'*"}}}},
            {"query": {"wildcard": {"group": {"value": "hotels_cities"}}}}
          ]
        }
      }
    }
  }
}';

$params['index'] = 'my_custom_index_name';
$params['type']  = 'translator_translations';
$params['body']  = $body;

$results = $client->search($params);

The search returns zero hits.

There is something called an analyzer, but I could not find any information about how to use it from PHP.

I think I found out how to index text in Unicode languages in Elasticsearch; I hope this is useful to someone.

  • First, set your index name.

  • Second, define the language settings for the index with token filters and a language analyzer, like this:

     $client = ClientBuilder::create()      // Instantiate a new ClientBuilder
         ->setHosts(['localhost:9200'])     // Set the hosts
         ->build();

     $lang = 'el';                          // Greek in my case
     $param['index'] = 'test_' . $lang;     // index name

     // uncomment this line if you want to delete an existing index
     // $response = $client->indices()->delete($param);

     $body = '{
       "settings": {
         "analysis": {
           "filter": {
             "greek_stop": { "type": "stop", "stopwords": "_greek_" },
             "greek_lowercase": { "type": "lowercase", "language": "greek" },
             "greek_keywords": { "type": "keyword_marker", "keywords": ["παράδειγμα"] },
             "greek_stemmer": { "type": "stemmer", "language": "greek" }
           },
           "analyzer": {
             "greek": {
               "tokenizer": "standard",
               "filter": [ "greek_lowercase", "greek_stop", "greek_keywords", "greek_stemmer" ]
             }
           }
         }
       }
     }';

     $param['body'] = $body;                // store the JSON body as a parameter in the main array
     $response = $client->indices()->create($param);
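The same settings can also be written as a PHP array rather than a hand-written JSON string, since the elasticsearch-php client accepts an array as the body and serializes it itself. A minimal sketch, assuming the filter and analyzer names from the snippet above; the helper name buildGreekIndexParams and the index name test_el are only illustrative:

```php
<?php
// Hypothetical helper: builds the parameters for $client->indices()->create().
// Filter and analyzer names are copied from the JSON settings above.
function buildGreekIndexParams(string $indexName): array
{
    return [
        'index' => $indexName,
        'body'  => [
            'settings' => [
                'analysis' => [
                    'filter' => [
                        'greek_stop'      => ['type' => 'stop', 'stopwords' => '_greek_'],
                        'greek_lowercase' => ['type' => 'lowercase', 'language' => 'greek'],
                        'greek_keywords'  => ['type' => 'keyword_marker', 'keywords' => ['παράδειγμα']],
                        'greek_stemmer'   => ['type' => 'stemmer', 'language' => 'greek'],
                    ],
                    'analyzer' => [
                        'greek' => [
                            'tokenizer' => 'standard',
                            'filter'    => ['greek_lowercase', 'greek_stop', 'greek_keywords', 'greek_stemmer'],
                        ],
                    ],
                ],
            ],
        ],
    ];
}

$params = buildGreekIndexParams('test_el');

// Print the body as JSON to inspect it; JSON_UNESCAPED_UNICODE keeps the
// Greek keyword readable instead of turning it into \uXXXX escapes.
echo json_encode($params['body'], JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT), PHP_EOL;

// With a running cluster and a built client (assumption):
// $response = $client->indices()->create($params);
```

Building the body as an array avoids quoting mistakes when non-ASCII values are involved, because PHP handles the JSON encoding.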

Then start indexing your values that contain Greek characters.
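Indexing and searching then pass the Unicode text as ordinary UTF-8 PHP strings. A rough sketch, assuming a field called title and an index called test_el; both function names are hypothetical, and older Elasticsearch versions also expect a 'type' entry in the parameters. Keep in mind that a custom analyzer only applies to a field whose mapping references it; otherwise the default analyzer is used.

```php
<?php
// Hypothetical helper: parameters for $client->index().
function buildIndexDocParams(string $index, string $id, string $title): array
{
    return [
        'index' => $index,
        'id'    => $id,
        'body'  => ['title' => $title],
    ];
}

// Hypothetical helper: parameters for $client->search() using a match query,
// which runs the search text through the analyzer configured for the field.
function buildMatchSearchParams(string $index, string $field, string $text): array
{
    return [
        'index' => $index,
        'body'  => [
            'query' => [
                'match' => [$field => $text],
            ],
        ],
    ];
}

$doc    = buildIndexDocParams('test_el', '1', 'Αυτό είναι ένα παράδειγμα');
$search = buildMatchSearchParams('test_el', 'title', 'παράδειγμα');

// Inspect the query body that would be sent to Elasticsearch.
echo json_encode($search['body'], JSON_UNESCAPED_UNICODE), PHP_EOL;

// With a running cluster and a built client (assumption):
// $client->index($doc);
// $results = $client->search($search);
```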

You should use fieldname.keyword:

$rowArray = array();
$rowArray['term'] = array();
$rowArray['term']['title.keyword'] = (string)$csvRow[1];
$whereArray['bool']['must'][] = $rowArray;
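The fragment above only builds the condition. A minimal sketch of a full search request around it, assuming the index my_custom_index_name from the question and that title.keyword exists as an unanalyzed sub-field (the default for dynamically mapped strings in Elasticsearch 5 and later); the helper name and the sample value are hypothetical:

```php
<?php
// Hypothetical helper: wraps one exact-match condition in a bool/must query.
function buildExactMatchParams(string $index, string $field, string $value): array
{
    $whereArray = ['bool' => ['must' => []]];
    // A term query compares the stored value as-is, without analysis.
    $whereArray['bool']['must'][] = ['term' => [$field . '.keyword' => $value]];

    return [
        'index' => $index,
        'body'  => ['query' => $whereArray],
    ];
}

$params = buildExactMatchParams('my_custom_index_name', 'title', 'Αθήνα');

// Inspect the query body that would be sent to Elasticsearch.
echo json_encode($params['body'], JSON_UNESCAPED_UNICODE), PHP_EOL;

// With a running cluster and a built client (assumption):
// $results = $client->search($params);
```

Because the term query is not analyzed, the value must match the indexed string exactly, including case and accents.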

