
Renaming fields to new index in Elasticsearch

I have an index with this mapping:

curl -XPUT 'http://localhost:9200/origindex/_mapping/page' -d '
   {
    "page" : {
        "properties" : {
            "title" : {"type" : "text"},
            "body" : {"type" : "text"},
            "other": {"type": "text"}
        }
     }
   }'

In a new index, I want to copy "title" to "title1" and "title2", and "body" to "body1" and "body2" (disregarding "other"), and change the type from "page" to "articles_eng". The new index has this mapping:

curl -XPUT 'http://localhost:9200/newindex/_mapping/articles_eng' -d '                             
{                                                                                                  
    "articles_eng" : {                                                                             
        "properties" : {                                                                           
            "title1" : {                                                                     
                 "type" : "text",                                                                  
                 "analyzer" : "my_analyzer1"                                                    
             },                                                                                     
            "title2" : {                                                                   
                 "type" : "text",                                                                  
                 "analyzer": "my_analyzer2"                                                    
             },                                                                                     
            "body1": {                                                                       
                "type" : "text",                                                                  
                "analyzer": "my_analyzer1"                                                     
            },                                                                                     
            "body2" : {                                                                     
                "type" : "text",                                                                  
                "analyzer": "my_analyzer2" 
            }                                                   
        }                                                                                      
    }                                                                                          
}'                                                                                              

From looking at this answer and the Elasticsearch reindex docs, I came up with something like this:

curl -XPOST http://localhost:9200/_reindex -d '{                                                   
    "source": {                                                                                    
        "index": "origindex",                                                                          
        "type": "page",                                                                            
        "query": {                                                                                 
           "match_all": {}                                                                         
        },                                                                                         
        "_source": [ "title", "body" ]                                                             
    },                                                                                             
    "dest": {                                                                                      
        "index": "newindex"                                                                        
    },                                                                                             
    "script": {                                                                                    
        "inline": "ctx._type = \"articles_eng\"";                                                  
                  "ctx._title1 = ctx._source._title";                                         
                  "ctx._title2 = ctx._source._title";                                       
                  "ctx._body1 = ctx._source._body";                                          
                  "ctx._body2 = ctx._source._body"                                                                                                   
    }                                                                                              
}'

I'm having trouble with the script lines. If I do only the top line (changing the document type), everything works fine. If I add the rest of the lines, I get an error

"[reindex] failed to parse field [script]"

caused by

"Unexpected character (';' (code 59)): was expecting comma to separate Object entries\\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@37649463; line: 14, column: 50]"

Even if I can sort out the issue with the multiple statements, putting in just the second line gives me the error

"Invalid fields added to context [title1]"}]

Can anyone help me out? It seems like this shouldn't be impossible to do.

If I do only the top line (changing the document type), everything works fine. If I add the rest of the lines, I get an error

You don't need to wrap each inline statement in its own double quotes. Instead, separate the statements with semicolons (;) and enclose the whole script in a single pair of double quotes ("), as shown below:

"script": {
    "inline": "ctx._source.title1 = ctx._source.title; ctx._source.title2 = ctx._source.remove(\"title\");ctx._source.body1 = ctx._source.body; ctx._source.body2 = ctx._source.remove(\"body\");ctx._type=\"articles_eng\""
}
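To see why the original request failed to parse, here is a minimal Python sketch (not part of the original answer) that uses only the standard `json` module. It checks that several quoted strings separated by `;` are not valid JSON, then joins the statements into the single string value that `"inline"` expects. The names `broken`, `statements`, `script` and `payload` are made up for illustration.

```python
import json

# Two quoted strings separated by ';' inside one object: not valid JSON,
# which is the same kind of input that triggered the parse error above.
broken = '{"inline": "ctx._type = \\"articles_eng\\""; "ctx._source.title1 = ctx._source.title"}'

try:
    json.loads(broken)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False

# The script statements from the answer, one per list entry.
statements = [
    "ctx._source.title1 = ctx._source.title",
    'ctx._source.title2 = ctx._source.remove("title")',
    "ctx._source.body1 = ctx._source.body",
    'ctx._source.body2 = ctx._source.remove("body")',
    'ctx._type = "articles_eng"',
]

# Join everything into ONE string; json.dumps escapes the inner quotes.
script = {"inline": "; ".join(statements)}
payload = json.dumps({"script": script})
```

Building the body this way leaves the quote escaping to the serializer, so the inner `\"title\"` escapes do not have to be written by hand.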

Even if I can sort out the issue with the multiple statements, putting in just the second line gives me the error

You are trying to access the source fields in the wrong way. Metadata fields (like _id, _type, _index) should be accessed as ctx._type / ctx._id, whereas source fields (like title, body, other in your case) should be accessed as ctx._source.title / ctx._source.body.

So finally, your reindex request should look like this:

POST _reindex
{
  "source": {
    "index": "origindex",
    "_source": [ "title", "body" ]
  },
  "dest": {
    "index": "newindex"
  },
  "script": {
    "inline": "ctx._source.title1 = ctx._source.title; ctx._source.title2 = ctx._source.remove(\"title\");ctx._source.body1 = ctx._source.body; ctx._source.body2 = ctx._source.remove(\"body\");ctx._type=\"articles_eng\""
  }
}
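As a rough illustration of what this request does to each document, the following plain-Python stand-in (not Painless, and not something Elasticsearch runs) mimics the `_source` filter and the script on a dictionary shaped like the script context. The function names `filter_source` and `apply_script` and the sample document are hypothetical.

```python
def filter_source(source, fields):
    """Mimic the "_source": ["title", "body"] filter: keep only the listed fields."""
    return {key: value for key, value in source.items() if key in fields}


def apply_script(ctx):
    """Plain-Python equivalent of the inline script, applied to a ctx-like dict."""
    source = ctx["_source"]
    source["title1"] = source["title"]
    source["title2"] = source.pop("title")  # like ctx._source.remove("title")
    source["body1"] = source["body"]
    source["body2"] = source.pop("body")    # like ctx._source.remove("body")
    ctx["_type"] = "articles_eng"           # metadata sits on ctx, not ctx._source
    return ctx


# Hypothetical document from origindex.
doc = {
    "_type": "page",
    "_id": "1",
    "_source": {"title": "Hello", "body": "World", "other": "ignored"},
}
doc["_source"] = filter_source(doc["_source"], ["title", "body"])
result = apply_script(doc)
```

The second assignment of each pair uses a removing read, so the original `title` and `body` keys do not end up in the new index alongside the renamed copies.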

Hope this helps!
