Elasticsearch problem with pre-defined mapping and then index docs

I am trying to index Stack Overflow data. First, I create an index with a specified mapping and settings:

    @classmethod
    def create_index_with_set_map(cls, name, elasticsearch):
        """
        Create index with default mappings and settings (and analyzer).

        Arguments:
        name -- The name of the index.
        elasticsearch -- Elasticsearch instance for connection.
        """

        # Only the mapping definition; the "mappings" key is added in `body` below.
        mappings = {
            "properties": {
                "Body": {
                    "type": "text",
                    "analyzer": "whitespace",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }}}
       
        settings = {
            "analysis": {
                "analyzer": {
                    "default": {
                        "type": "whitespace"
                    }
                }
            }
        }

        body = {
            "settings": settings,
            "mappings": mappings

        }
        res = elasticsearch.indices.create(index=name, body=body)
        print(res)

Then I try to bulk index my docs:

    @classmethod
    def start_index(cls, index_name, index_path, elasticsearch, doc_type):
        """
        Index the documents using the bulk helper.

        Arguments:
        index_name -- the name of the index
        index_path -- the path of the XML file to index
        elasticsearch -- Elasticsearch instance for connection
        doc_type -- doc type
        """

        for lines in Parser.xml_reader(index_path):
            actions = [
                {
                    "_index": index_name,
                    "_type": doc_type,
                    "_id": Parser.post_parser(line)['Id'],
                    "_source":  Parser.post_parser(line)


                }
                for line in lines if Parser.post_parser(line) is not None
            ]

            helpers.bulk(elasticsearch, actions)

This gives the following error:

    ('500 document(s) failed to index.', [{'index': {'_index': 'sof-question-answer2', '_type': 'Stackoverflow', '_id': '1', 'status': 400, 'error': {'type': 'illegal_argument_exception', 'reason': 'Mapper for [Body] conflicts with existing mapping:\\n[mapper [Body] has different [analyzer]]'}, 'data': ...}

It looks like the sof-question-answer2 index has already been created with a different analyzer, probably the default standard analyzer.

If you run GET sof-question-answer2/_mapping in Kibana, you will see that the Body field does not have the whitespace analyzer.
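The same check can be done from Python rather than Kibana. This is only a sketch: it assumes the mapping response is a plain dict in the typeless shape `{index: {"mappings": {"properties": {...}}}}`, and `field_analyzer` is a hypothetical helper name, not part of the Elasticsearch client.

```python
def field_analyzer(mapping_response, index_name, field):
    """Return the analyzer configured for `field`, or None if none is set.

    Assumes the typeless mapping response shape:
    {index: {"mappings": {"properties": {...}}}}.
    """
    properties = (
        mapping_response.get(index_name, {})
        .get("mappings", {})
        .get("properties", {})
    )
    return properties.get(field, {}).get("analyzer")


# Hand-written example response for an index whose Body field has no
# explicit analyzer, so the helper returns None for it.
sample = {
    "sof-question-answer2": {
        "mappings": {"properties": {"Body": {"type": "text"}}}
    }
}
print(field_analyzer(sample, "sof-question-answer2", "Body"))
```

Against a live cluster, the first argument would be the result of `elasticsearch.indices.get_mapping(index="sof-question-answer2")`. A result other than `"whitespace"` for Body would be consistent with the conflict reported in the error.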

To resolve this issue, you will have to delete your index, update your mapping, and reindex your data (if you have any).
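The delete-and-recreate step could look roughly like the sketch below. `recreate_index` is a hypothetical helper name. It relies only on the client's `indices.exists`, `indices.delete` and `indices.create` calls, and passes `body=` the same way the question's code does. The `body` dict repeats the mapping and settings from the question.

```python
def recreate_index(es, name, body):
    """Drop index `name` if it exists, then create it again with `body`.

    WARNING: deleting an index removes all of its documents, so they
    have to be indexed again afterwards.
    """
    if es.indices.exists(index=name):
        es.indices.delete(index=name)
    return es.indices.create(index=name, body=body)


# Mapping and settings as intended in the question.
body = {
    "settings": {
        "analysis": {"analyzer": {"default": {"type": "whitespace"}}}
    },
    "mappings": {
        "properties": {
            "Body": {
                "type": "text",
                "analyzer": "whitespace",
                "fields": {
                    "keyword": {"type": "keyword", "ignore_above": 256}
                },
            }
        }
    },
}
```

With a connected client this would be called as `recreate_index(elasticsearch, "sof-question-answer2", body)`, followed by running the bulk indexing again.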
