
How to add a sequence id when using Logstash to parse logs

I want to index Hadoop logs with Logstash and Elasticsearch. Here is my problem: I load logs into Elasticsearch with Logstash, and I want to search the events in Elasticsearch while keeping them in the same order as in the original log files. But it doesn't work. For example, the events in the original log file may look like:

2013-12-25 23:10:19,022 INFO A..
2013-12-25 23:10:19,022 INFO B..
2013-12-25 23:10:19,022 INFO C..

But when I search them in Elasticsearch sorted by "@timestamp", the result may look like this:

2013-12-25 23:10:19,022 INFO B..
2013-12-25 23:10:19,022 INFO A..
2013-12-25 23:10:19,022 INFO C..

Because the timestamps of these three events are the same, the search result cannot preserve the original order.

Here is my idea: I can add an id to each event; the id is assigned while Logstash parses the data and it increases along with the timestamp. Then when I search for events, I can sort by id instead of timestamp, and the events will stay in the right order even when their timestamps are the same.

But I don't know how to add this extra auto-incrementing 'id' field with Logstash. I looked through the Logstash conf file and didn't find a solution. Please give me some advice on how to implement this, thanks a lot!
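Once such a field exists, I expect I could ask Elasticsearch to sort on it instead of @timestamp. Just as a sketch, assuming the id ends up in a field called seq and the index is named logstash-* (both are placeholders), the query might look roughly like:

curl -XGET 'http://localhost:9200/logstash-*/_search?pretty' \
     -H 'Content-Type: application/json' -d '{
  "query": { "match_all": {} },
  "sort": [ { "seq": { "order": "asc" } } ]
}'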

You can try using the timestamp to insert a new field seq. Here is the configuration:

ruby {
    code => "
        # store the current time, down to the millisecond, as an integer (e.g. 20131225231019022)
        event['seq'] = Time.now.strftime('%Y%m%d%H%M%S%L').to_i
    "
}

With this solution you don't need to write any plugin. In this example we use the current timestamp, down to the millisecond, as the value of the seq field. However, if your machine is fast and two events are processed within the same millisecond, they will get the same value. Please give it a try.
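If you need values that are strictly increasing even within the same millisecond, a simple alternative is to keep a counter inside the ruby filter. This is only a sketch: it assumes a single pipeline worker (run Logstash with -w 1) so the counter is not shared across threads, and it uses the same old event API (event['seq']) as above; on Logstash 5.0 and later you would write event.set('seq', ...) instead.

filter {
  ruby {
    # init runs once when the pipeline starts: set up the counter
    init => "@seq = 0"
    # code runs for every event: bump the counter and store it on the event
    code => "
      @seq += 1
      event['seq'] = @seq
    "
  }
}

Note that the counter restarts from 0 whenever Logstash restarts, so it only guarantees ordering within one run; sorting by @timestamp first and seq second keeps the order stable across restarts.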
