Logstash delivering less than half of the data on Docker

I've got the following Docker Compose configuration and run it with docker-compose up:

version: "2"

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:${ELK_VERSION}
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xmx2G -Xms2G"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elk

  logstash:
    image: docker.elastic.co/logstash/logstash:${ELK_VERSION}
    volumes:
      - ./logstash/config/pipelines.yml:/usr/share/logstash/config/pipelines.yml:ro
      - ./logstash/pipeline:/usr/share/logstash/pipeline
      - ./logstash/drivers:/usr/share/logstash/drivers
      - ./logstash/shared:/usr/share/logstash/shared
    environment:
      - xpack.monitoring.collection.enabled=true
      - "LS_JAVA_OPTS=-Xmx2G -Xms2G"
    networks:
      - elk
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:${ELK_VERSION}
    environment:
      - server.name=kibana
      - elasticsearch.url=http://elasticsearch:9200
    ports:
      - 5601:5601
    networks:
      - elk
    depends_on:
      - elasticsearch

volumes:
  esdata:
    driver: local

networks:
  elk:

There are plenty of resources assigned to it. I run two simple pipelines with persistent queues that hold less than 100 MB of data combined.
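
The queue settings live in the mounted pipelines.yml. A minimal sketch of what that file contains; the pipeline IDs match the logs further down, while the config paths are assumptions:

# pipelines.yml - two pipelines, each backed by an on-disk (persistent) queue
- pipeline.id: via-categories
  path.config: "/usr/share/logstash/pipeline/via-categories.conf"   # assumed path
  queue.type: persisted
- pipeline.id: via-brands-categories
  path.config: "/usr/share/logstash/pipeline/via-brands-categories.conf"   # assumed path
  queue.type: persisted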

However, only about a third of the data is indexed. Monitoring in Kibana shows:

  • Events Received: 390.8k
  • Events Emitted: 120.1k
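
The same counters can be read from the Logstash node stats API on port 9600 (the logs below confirm the API endpoint is up). Since the compose file doesn't publish that port, query it from inside the container; this assumes curl is available in the Logstash image:

docker-compose exec logstash curl -s 'http://localhost:9600/_node/stats/events?pretty'
# "in" and "out" in the response correspond to Events Received and Events Emitted above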

The Logstash logs don't indicate any issues either:

[2018-12-20T10:12:56,420][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elasticsearch:9200/]}}
[2018-12-20T10:12:56,443][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://elasticsearch:9200/"}
[2018-12-20T10:12:56,453][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-12-20T10:12:56,453][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2018-12-20T10:12:56,513][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://elasticsearch:9200"]}
[2018-12-20T10:12:56,746][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>".monitoring-logstash", :thread=>"#<Thread:0x1d13c506 sleep>"}
[2018-12-20T10:12:56,782][INFO ][logstash.agent           ] Pipelines running {:count=>3, :running_pipelines=>[:"via-brands-categories", :"via-categories", :".monitoring-logstash"], :non_running_pipelines=>[]}
[2018-12-20T10:12:57,731][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2018-12-20T10:13:00,148][INFO ][logstash.inputs.jdbc     ] (1.116852s)
            SELECT categoryid
              , categoryname
              , industryCode
              , skucount
              , TO_JSON_STRING(permissionIds) as permissionIds
              , TO_JSON_STRING(filters) as filters
              , lastupdate
              , deleted
            FROM migration_search.categories;
[2018-12-20T10:13:00,162][INFO ][logstash.inputs.jdbc     ] (1.114124s)
            SELECT brandid
              , brandname
              , categoryid
              , categoryname
              , industryCode
              , skucount
              , TO_JSON_STRING(permissionIds) as permissionIds
              , TO_JSON_STRING(filters) as filters
              , lastupdate
              , deleted
            FROM migration_search.brands_categories;
[2018-12-20T10:13:50,058][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>"via-categories", :thread=>"#<Thread:0x47925db9@/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:51 run>"}
[2018-12-20T10:15:16,179][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>"via-brands-categories", :thread=>"#<Thread:0x7e52add8 run>"}
[2018-12-20T10:15:18,108][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>".monitoring-logstash", :thread=>"#<Thread:0x1d13c506 run>"}

What could be the reason that Logstash doesn't emit all of the events?

It turns out that's the way persistent queues work: once the inputs finish, Logstash shuts down even if not everything has been drained from the queue.

Adding queue.drain: true to each of my pipelines in pipelines.yml resolved it. Logstash now waits until all of the data in the queue has been processed before shutting down.
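
For reference, the change is one line per pipeline in pipelines.yml (config paths assumed, as above):

# pipelines.yml - drain the persistent queue completely before shutting down
- pipeline.id: via-categories
  path.config: "/usr/share/logstash/pipeline/via-categories.conf"
  queue.type: persisted
  queue.drain: true
- pipeline.id: via-brands-categories
  path.config: "/usr/share/logstash/pipeline/via-brands-categories.conf"
  queue.type: persisted
  queue.drain: true

With queue.drain enabled, a one-shot jdbc pipeline like these behaves like a batch job: it only terminates after the queue is empty.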
