繁体   English   中英

尝试在摄取附件字段中插入null时,ElasticSearch返回错误

[英]ElasticSearch returning error when trying to insert null to ingest-attachment field

我已经安装了摄取附件处理器,正在使用Java代码从一个索引documents读取文件路径,并在另一个索引documents_attachment中将文件内容建立索引。

在此过程中,如果文件可用,它将解码为base64并将这些内容附加到json字段fileContent并在另一个索引documents_attachment对这些字段进行索引。

如果文件不可用,则尝试将null作为值附加到json字段fileContent并尝试对这些字段进行索引。 在此过程中,当我尝试向json字段fileContent插入null时,出现以下错误。

请在下面找到错误。

ElasticsearchStatusException[Elasticsearch exception [type=exception, reason=java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=field [fileContent] is null, cannot parse.]];
    at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
    at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:573)
    at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:549)
    at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:456)
    at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:429)
    at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:312)
    at com.es.utility.DocumentIndex.main(DocumentIndex.java:193)
    Suppressed: org.elasticsearch.client.ResponseException: method [PUT], host [http://localhost:9200], URI [/document_attachment_dev/doc/129439?pipeline=document_attachment_dev&timeout=1m], status line [HTTP/1.1 500 Internal Server Error]
{"error":{"root_cause":[{"type":"exception","reason":"java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","header":{"processor_type":"attachment"}}],"type":"exception","reason":"java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","caused_by":{"type":"illegal_argument_exception","reason":"java.lang.IllegalArgumentException: field [fileContent] is null, cannot parse.","caused_by":{"type":"illegal_argument_exception","reason":"field [fileContent] is null, cannot parse."}},"header":{"processor_type":"attachment"}},"status":500}

请找到我的Java代码。

public class DocumentIndex {

    private final static String INDEX = "documents_local";  
    private final static String ATTACHMENT = "document_attachment"; 
    private final static String TYPE = "doc";
    private static final Logger logger = Logger.getLogger(Thread.currentThread().getStackTrace()[0].getClassName());

    public static void main(String args[]) throws IOException {


        RestHighLevelClient restHighLevelClient = null;
        Document doc=new Document();

        logger.info("Started Indexing the Document.....");

        try {
            restHighLevelClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http"),
                    new HttpHost("localhost", 9201, "http")));
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }


        //Fetching Id, FilePath & FileName from Document Index. 
        SearchRequest searchRequest = new SearchRequest(INDEX); 
        searchRequest.types(TYPE);
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        QueryBuilder qb = QueryBuilders.matchAllQuery();
        searchSourceBuilder.query(qb);
        searchSourceBuilder.size(3000);
        searchRequest.source(searchSourceBuilder);
        SearchResponse searchResponse = null;
        try {
             searchResponse = restHighLevelClient.search(searchRequest);
        } catch (IOException e) {
            e.getLocalizedMessage();
        }

        SearchHit[] searchHits = searchResponse.getHits().getHits();
        long totalHits=searchResponse.getHits().totalHits;
        logger.info("Total Hits --->"+totalHits);

        int line=1;

        Map<String, Object> jsonMap ;
        for (SearchHit hit : searchHits) {

            String encodedfile = null;
            File file=null;

            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            doc.setId((int) sourceAsMap.get("id"));
            doc.setApp_language(sourceAsMap.get("app_language").toString());


            String filepath=doc.getPath().concat(doc.getFilename());

            logger.info("Line Number--> "+line+++"ID---> "+doc.getId()+"File Path --->"+filepath);

            try(PrintWriter out = new PrintWriter(new FileOutputStream(new File("d:\\AllFilePath.txt"), true))  ){
                out.println("Line Number--> "+line+"ID---> "+doc.getId()+"File Path --->"+filepath);
            }

            file = new File(filepath);
            if(file.exists() && !file.isDirectory()) {
                try {
                      try(PrintWriter out = new PrintWriter(new FileOutputStream(new File("d:\\AvailableFile.txt"), true))  ){
                            out.println("Line Number--> "+line+++"ID---> "+doc.getId()+"File Path --->"+filepath);
                        }
                    FileInputStream fileInputStreamReader = new FileInputStream(file);
                    byte[] bytes = new byte[(int) file.length()];
                    fileInputStreamReader.read(bytes);
                    encodedfile = new String(Base64.getEncoder().encodeToString(bytes));
                } catch (FileNotFoundException e) {
                    e.printStackTrace();
                }
            }

            jsonMap = new HashMap<>();
            jsonMap.put("id", doc.getId());
            jsonMap.put("app_language", doc.getApp_language());
            jsonMap.put("fileContent", encodedfile); // inserting null here when file is not available and it is not able to encoded.

            String id=Long.toString(doc.getId());

            IndexRequest request = new IndexRequest(ATTACHMENT, "doc", id )
                    .source(jsonMap)
                    .setPipeline(ATTACHMENT);

            PrintStream printStream = new PrintStream(new File("d:\\exception.txt"));
            try {
                IndexResponse response = restHighLevelClient.index(request);

            } catch(ElasticsearchException e) {
                if (e.status() == RestStatus.CONFLICT) {
                }
                e.printStackTrace(printStream);
            }

            line++;

        }

        logger.info("Indexing done.....");

    }

}

请找到我的地图详细信息

PUT _ingest/pipeline/document_attachment
    {
      "description" : "Extract attachment information",
      "processors" : [
        {
          "attachment" : {
            "field" : "fileContent"
          }
        }
      ]
    }

PUT document_attachment
{

     "settings": {
     "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        },
        "product_catalog_keywords_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },

  "mappings" : {
    "doc" : {
      "properties" : {
        "attachment" : {
          "properties" : {
            "content" : {
              "type" : "text",
              "analyzer": "custom_analyzer"
            },
            "content_length" : {
              "type" : "long"
            },
            "content_type" : {
              "type" : "text"
            },
            "language" : {
              "type" : "text"
            }
          }
        },
        "fileContent" : {
          "type" : "text"
        },
        "id": {
        "type": "long"
        },
        "app_language" : {
        "type" : "text"
        },

      }
    }
  }
}

我正在使用以下用于提取附件处理器的映射配置,并且在文件内容不可用(空)时它正在工作。

PUT _ingest/pipeline/document_attachment
    {
  "description" : "my first pipeline with handled exceptions",
  "processors" : [
    {
      "attachment" : {
        "field" : "fileContent",
        "on_failure" : [
          {
            "set" : {
              "field" : "error",
              "value" : "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    }
  ]
}

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM