
Polling logs with logstash (ELK)

I have a dilemma regarding polling and storing log files.
We need to monitor our logs from CloudHub, aggregate them with Logstash, and store them (probably in Elasticsearch).

Anypoint Runtime Manager only appears to support pushing events to third-party systems when running on-premises (not from the cloud), so I decided to build a demo that polls logs through a REST API via the Logstash http_poller input plugin.
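For context, a minimal http_poller input for this setup might look like the sketch below. The URL, application name, and auth header are placeholders illustrating the plugin's shape, not the actual CloudHub API contract:

```
input {
  http_poller {
    urls => {
      cloudhub_logs => {
        method => get
        # Hypothetical endpoint and token -- substitute the real
        # CloudHub / Anypoint logs API and credentials for your org
        url => "https://anypoint.mulesoft.com/cloudhub/api/v2/applications/my-app/logs"
        headers => {
          Authorization => "Bearer ${ANYPOINT_TOKEN}"
        }
      }
    }
    # Poll on a fixed schedule; each poll returns the latest window of logs
    schedule => { every => "30s" }
    request_timeout => 60
    codec => "json"
  }
}
```

Because each poll retrieves a window of recent entries, consecutive windows should overlap slightly; that trades a guaranteed set of duplicates for a lower risk of gaps, which is what makes deduplication downstream necessary.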

I'm facing some decisions I don't have much experience with.
When polling, you always retrieve the last x logs over a certain time interval. I assume these parameters will depend on the type of logs, but I still wonder at what level you should deal with duplicated retrievals, and how you handle the risk of missing logs entirely.

Is this something you would handle at the storage level, or something you would handle immediately in Logstash?
Thanks for sharing your thoughts on the subject.

I can't say that I have a lot of experience with this subject, but here is what I think.

If Logstash is running as a service, handling duplicates will depend largely on what the API returns in its output.

At the same time, if the response contains a unique identifier, you can use it to make Logstash avoid duplicates.
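If the response does not carry a natural unique ID, one common approach (not specific to CloudHub) is to derive one with the Logstash fingerprint filter and use it as the Elasticsearch document ID. The field names below are assumptions; adjust them to whatever your events actually contain:

```
filter {
  fingerprint {
    # Hash the fields that uniquely identify a log line;
    # "message" and "timestamp" are placeholders for your schema
    source => ["message", "timestamp"]
    concatenate_sources => true
    method => "SHA256"
    target => "[@metadata][fingerprint]"
  }
}

output {
  elasticsearch {
    hosts => ["yourEsHost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
    # Same fingerprint => same document: re-ingesting a duplicate
    # becomes an overwrite instead of a second document
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```

This pushes deduplication to the storage level via idempotent writes, so overlapping poll windows are harmless.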

From the answer to "Change ID in elasticsearch":

elasticsearch {
    # "hosts" replaced the old "host" option, and the "cluster"
    # option was removed in newer versions of the plugin
    hosts => ["yourEsHost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
    # Reusing a field of the event as the document ID turns
    # duplicate events into overwrites of the same document
    document_id => "%{someFieldOfMyEvent}"
}
