Logstash delivering less than half of the data on Docker

I've got the following Docker Compose configuration and run it with docker-compose up:

version: "2"

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:${ELK_VERSION}
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xmx2G -Xms2G"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elk

  logstash:
    image: docker.elastic.co/logstash/logstash:${ELK_VERSION}
    volumes:
      - ./logstash/config/pipelines.yml:/usr/share/logstash/config/pipelines.yml:ro
      - ./logstash/pipeline:/usr/share/logstash/pipeline
      - ./logstash/drivers:/usr/share/logstash/drivers
      - ./logstash/shared:/usr/share/logstash/shared
    environment:
      - xpack.monitoring.collection.enabled=true
      - "LS_JAVA_OPTS=-Xmx2G -Xms2G"
    networks:
      - elk
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:${ELK_VERSION}
    environment:
      - server.name=kibana
      - elasticsearch.url=http://elasticsearch:9200
    ports:
      - 5601:5601
    networks:
      - elk
    depends_on:
      - elasticsearch

volumes:
  esdata:
    driver: local

networks:
  elk:

There are plenty of resources assigned to it. I run two simple pipelines with persistent queues that hold less than 100 MB of data combined.
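
The queue settings live in the mounted pipelines.yml. A minimal sketch of what that file contains; the pipeline IDs match the logs further down, while the config paths are assumptions:

# pipelines.yml - two pipelines, each backed by an on-disk (persistent) queue
- pipeline.id: via-categories
  path.config: "/usr/share/logstash/pipeline/via-categories.conf"   # assumed path
  queue.type: persisted
- pipeline.id: via-brands-categories
  path.config: "/usr/share/logstash/pipeline/via-brands-categories.conf"   # assumed path
  queue.type: persisted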

However, only about a third of the data is indexed. Monitoring in Kibana shows:

  • Events Received: 390.8k
  • Events Emitted: 120.1k
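
The same counters can be read from the Logstash node stats API on port 9600 (the logs below confirm the API endpoint is up). Since the compose file doesn't publish that port, query it from inside the container; this assumes curl is available in the Logstash image:

docker-compose exec logstash curl -s 'http://localhost:9600/_node/stats/events?pretty'
# "in" and "out" in the response correspond to Events Received and Events Emitted above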

The Logstash logs don't indicate any issues either:

[2018-12-20T10:12:56,420][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elasticsearch:9200/]}}
[2018-12-20T10:12:56,443][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://elasticsearch:9200/"}
[2018-12-20T10:12:56,453][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-12-20T10:12:56,453][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2018-12-20T10:12:56,513][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://elasticsearch:9200"]}
[2018-12-20T10:12:56,746][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>".monitoring-logstash", :thread=>"#<Thread:0x1d13c506 sleep>"}
[2018-12-20T10:12:56,782][INFO ][logstash.agent           ] Pipelines running {:count=>3, :running_pipelines=>[:"via-brands-categories", :"via-categories", :".monitoring-logstash"], :non_running_pipelines=>[]}
[2018-12-20T10:12:57,731][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2018-12-20T10:13:00,148][INFO ][logstash.inputs.jdbc     ] (1.116852s)
            SELECT categoryid
              , categoryname
              , industryCode
              , skucount
              , TO_JSON_STRING(permissionIds) as permissionIds
              , TO_JSON_STRING(filters) as filters
              , lastupdate
              , deleted
            FROM migration_search.categories;
[2018-12-20T10:13:00,162][INFO ][logstash.inputs.jdbc     ] (1.114124s)
            SELECT brandid
              , brandname
              , categoryid
              , categoryname
              , industryCode
              , skucount
              , TO_JSON_STRING(permissionIds) as permissionIds
              , TO_JSON_STRING(filters) as filters
              , lastupdate
              , deleted
            FROM migration_search.brands_categories;
[2018-12-20T10:13:50,058][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>"via-categories", :thread=>"#<Thread:0x47925db9@/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:51 run>"}
[2018-12-20T10:15:16,179][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>"via-brands-categories", :thread=>"#<Thread:0x7e52add8 run>"}
[2018-12-20T10:15:18,108][INFO ][logstash.pipeline        ] Pipeline has terminated {:pipeline_id=>".monitoring-logstash", :thread=>"#<Thread:0x1d13c506 run>"}

What could be the reason that Logstash doesn't emit all of the events?

It turns out that's the way persistent queues work: once the inputs finish, Logstash shuts down even if not everything has been drained from the queue.

Adding queue.drain: true to each of my pipelines in pipelines.yml resolved it. Logstash now waits until all of the data in the queue has been processed before shutting down.
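
For reference, the change is one line per pipeline in pipelines.yml (config paths assumed, as above):

# pipelines.yml - drain the persistent queue completely before shutting down
- pipeline.id: via-categories
  path.config: "/usr/share/logstash/pipeline/via-categories.conf"
  queue.type: persisted
  queue.drain: true
- pipeline.id: via-brands-categories
  path.config: "/usr/share/logstash/pipeline/via-brands-categories.conf"
  queue.type: persisted
  queue.drain: true

With queue.drain enabled, a one-shot jdbc pipeline like these behaves like a batch job: it only terminates after the queue is empty.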
