
Logstash date parsing as timestamp using the date filter

Well, after looking around quite a lot, I could not find a solution to my problem, as it "should" work, but obviously doesn't. I'm using Logstash 1.4.2-1-2-2c0f5a1 on an Ubuntu 14.04 LTS machine, and I am receiving messages such as the following one:

2014-08-05 10:21:13,618 [17] INFO  Class.Type - This is a log message from the class:
  BTW, I am also multiline

In the input configuration, I have a multiline codec and the event is parsed correctly. I also separate the event text into several parts so that it is easier to read.

In the end, I obtain, as seen in Kibana, something like the following (JSON view):

{
  "_index": "logstash-2014.08.06",
  "_type": "customType",
  "_id": "PRtj-EiUTZK3HWAm5RiMwA",
  "_score": null,
  "_source": {
    "@timestamp": "2014-08-06T08:51:21.160Z",
    "@version": "1",
    "tags": [
      "multiline"
    ],
    "type": "utg-su",
    "host": "ubuntu-14",
    "path": "/mnt/folder/thisIsTheLogFile.log",
    "logTimestamp": "2014-08-05;10:21:13.618",
    "logThreadId": "17",
    "logLevel": "INFO",
    "logMessage": "Class.Type - This is a log message from the class:\r\n  BTW, I am also multiline\r"
  },
  "sort": [
    "21",
    1407315081160
  ]
}

You may have noticed that I put a ";" in the timestamp. The reason is that I want to be able to sort the logs using the timestamp string, and apparently Logstash is not that good at that (e.g. http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/multi-fields.html ).

I have unsuccessfully tried to use the date filter in multiple ways, and it apparently did not work.

date {
            locale => "en"
            match => ["logTimestamp", "YYYY-MM-dd;HH:mm:ss.SSS", "ISO8601"]
            timezone => "Europe/Vienna"
            target => "@timestamp"
            add_field => { "debug" => "timestampMatched"}
        }

Since I read that the Joda library may have problems if the string is not strictly ISO 8601-compliant (it is very picky and expects a T, see https://logstash.jira.com/browse/LOGSTASH-180 ), I also tried to use mutate to convert the string to something like 2014-08-05T10:21:13.618 and then use "YYYY-MM-dd'T'HH:mm:ss.SSS". That also did not work.
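As a quick sanity check outside of Logstash, the conversion I am attempting can be sketched in Python (purely illustrative, not part of the pipeline; `%f` plays the role of Joda's `SSS` here):

```python
from datetime import datetime

# Raw timestamp as it appears in the log line (comma before the milliseconds)
raw = "2014-08-05 10:21:13,618"

# Normalize the comma to a dot so the fractional seconds parse cleanly
normalized = raw.replace(",", ".")

# Equivalent of the Joda pattern "YYYY-MM-dd HH:mm:ss.SSS"
ts = datetime.strptime(normalized, "%Y-%m-%d %H:%M:%S.%f")
print(ts.isoformat(timespec="milliseconds"))  # 2014-08-05T10:21:13.618
```

The string parses fine once the comma is normalized, which suggests the pattern itself is not the problem.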

I do not want to have to manually put a +02:00 on the time because that would give problems with daylight saving.

In any of these cases, the event goes to Elasticsearch, but the date filter apparently does nothing, as @timestamp and logTimestamp are different and no debug field is added.

Any idea how I could make the logTime strings properly sortable? I focused on converting them to a proper timestamp, but any other solution would also be welcome.

As you can see below (screenshot: sorting by the timestamp works fine):

When sorting over @timestamp, Elasticsearch can do it properly, but since this is not the "real" log timestamp, but rather the time the Logstash event was read, I (obviously) also need to be able to sort over logTimestamp. This is what is then output. Obviously not that useful:

(Screenshot: sorting by the string does not work. Any suggestions?)

Any help is welcome! Just let me know if I forgot some information that may be useful.

Update:

Here is the filter config file that finally worked:

# Filters messages like this:
# 2014-08-05 10:21:13,618 [17] INFO  Class.Type - This is a log message from the class:
#  BTW, I am also multiline

# Take only type- events (type-componentA, type-componentB, etc)
filter {
    # You cannot write an "if" outside of the filter!
    if "type-" in [type] {
        grok {
            # Parse timestamp data. We need the "(?m)" so that grok (Oniguruma internally) correctly parses multi-line events
            patterns_dir => "./patterns"
            match => [ "message", "(?m)%{TIMESTAMP_ISO8601:logTimestampString}[ ;]\[%{DATA:logThreadId}\][ ;]%{LOGLEVEL:logLevel}[ ;]*%{GREEDYDATA:logMessage}" ]
        }

        # The timestamp may have commas instead of dots. Convert so as to store everything in the same way
        mutate {
            gsub => [
                # replace all commas with dots
                "logTimestampString", ",", "."
                ]
        }

        mutate {
            gsub => [
                # make the logTimestamp sortable. With a space, it is not! This does not work that well, in the end
                # but somehow apparently makes things easier for the date filter
                "logTimestampString", " ", ";"
                ]
        }

        date {
            locale => "en"
            match => ["logTimestampString", "YYYY-MM-dd;HH:mm:ss.SSS"]
            timezone => "Europe/Vienna"
            target => "logTimestamp"
        }
    }
}

filter {
    if "type-" in [type] {
        # Remove already-parsed data
        mutate {
            remove_field => [ "message" ]
        }
    }
}
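One detail worth noting in the grok pattern above: in Oniguruma (the regex engine grok uses), `(?m)` makes `.` match newline characters, which is what lets `GREEDYDATA` capture the trailing multiline part of the event. In Python terms that corresponds to `re.DOTALL`, as this rough sketch shows (the pattern is a simplified stand-in for the grok expression, not the grok pattern itself):

```python
import re

# A multiline event as assembled by the multiline codec
event = ("2014-08-05 10:21:13,618 [17] INFO  Class.Type - This is a log message from the class:\n"
         "  BTW, I am also multiline")

# Simplified equivalent of the grok pattern; re.DOTALL plays the role of (?m) in Oniguruma
pattern = re.compile(
    r"(?P<logTimestampString>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r"[ ;]\[(?P<logThreadId>\d+)\][ ;](?P<logLevel>\w+)[ ;]*(?P<logMessage>.*)",
    re.DOTALL,
)

m = pattern.match(event)
print(m.group("logLevel"))    # INFO
# logMessage spans both lines, including the "BTW, I am also multiline" part
```

Without `re.DOTALL` (i.e. without `(?m)` in grok), `logMessage` would stop at the first newline.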

I have tested your date filter. It works for me!

Here is my configuration:

input {
    stdin{}
}

filter {
    date {
        locale => "en"
        match => ["message", "YYYY-MM-dd;HH:mm:ss.SSS"]
        timezone => "Europe/Vienna"
        target => "@timestamp"
        add_field => { "debug" => "timestampMatched"}
   }
}

output {
    stdout {
            codec => "rubydebug"
    }
}

And I use this input:

2014-08-01;11:00:22.123

The output is:

{
   "message" => "2014-08-01;11:00:22.123",
  "@version" => "1",
"@timestamp" => "2014-08-01T09:00:22.123Z",
      "host" => "ABCDE",
     "debug" => "timestampMatched"
}
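Note how `timezone => "Europe/Vienna"` does the daylight-saving-aware conversion for you: 11:00 local time on 1 August 2014 is CEST (UTC+2), so @timestamp ends up at 09:00 UTC. The same conversion can be sketched in Python with `zoneinfo` (an illustration of the behavior, assuming system tz data is available):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# Input line, with the same ';' separator the filter expects
raw = "2014-08-01;11:00:22.123"

# Parse as local time in Europe/Vienna (what timezone => "Europe/Vienna" does)
local = datetime.strptime(raw, "%Y-%m-%d;%H:%M:%S.%f").replace(tzinfo=ZoneInfo("Europe/Vienna"))

# Convert to UTC, which is how @timestamp is stored
utc = local.astimezone(timezone.utc)
print(utc.isoformat())  # 2014-08-01T09:00:22.123000+00:00
```

A winter date would be converted with +01:00 instead, with no change to the configuration.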

So, please make sure that your logTimestamp has the correct value. It is probably some other problem. Or you can provide your log event and Logstash configuration for more discussion. Thank you.

This worked for me, with a slightly different datetime format:

# 2017-11-22 13:00:01,621 INFO [AtlassianEvent::0-BAM::EVENTS:pool-2-thread-2] [BuildQueueManagerImpl] Sent ExecutableQueueUpdate: addToQueue, agents known to be affected: []
input {
   file {
       path => "/data/atlassian-bamboo.log"
       start_position => "beginning"
       type => "logs"      
       codec => multiline {
                pattern => "^%{TIMESTAMP_ISO8601} "
                charset => "ISO-8859-1"
                negate => true
                what => "previous"                
       }       
   }
}
filter {
   grok {
      match => [ "message", "(?m)^%{TIMESTAMP_ISO8601:logtime}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}\[%{DATA:thread_id}\]%{SPACE}\[%{WORD:classname}\]%{SPACE}%{GREEDYDATA:logmessage}" ]
   }

    date {
        match => ["logtime", "yyyy-MM-dd HH:mm:ss,SSS", "yyyy-MM-dd HH:mm:ss,SSS Z", "MMM dd, yyyy HH:mm:ss a" ]
        timezone => "Europe/Berlin"
   }   
}

output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
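Unlike the mutate/gsub approach above, this configuration handles the comma before the milliseconds directly in the date pattern ("yyyy-MM-dd HH:mm:ss,SSS"), so no field rewriting is needed. The same idea in Python terms (purely illustrative): the comma is just a literal character in the format string.

```python
from datetime import datetime

# Timestamp exactly as it appears in the Bamboo log line
raw = "2017-11-22 13:00:01,621"

# The comma is a literal in the format, mirroring the Joda pattern "yyyy-MM-dd HH:mm:ss,SSS"
ts = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S,%f")
print(ts.isoformat(timespec="milliseconds"))  # 2017-11-22T13:00:01.621
```

Listing several patterns in `match`, as the config does, simply tries each in turn until one parses.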
