
Logstash filter Regex into Field

I am facing some issues with parsing a logline.

I have thousands of log lines, and every log line contains a hostname like ABC123DF. I have written a regex, and I want to apply it to the log line and put the hostname in the field "victim", like this:

add_field => [ "victim", "/[a-z][a-z][a-z][0-9][0-9][0-9].."

I have used the mutate filter, but the result is:

victim /[a-z][a-z][a-z][0-9][0-9][0-9]..

I would like to see:

victim ABC123DF

How do I do this?

You don't even need complex regex to achieve this. Logstash provides several filters for extracting fields; in your case, the grok filter is a good choice.

Let's suppose your log lines look like this:

20:20:20 ABC123DF 192.168.1.1

Then a grok filter like this would parse the hostname properly:

grok {
    match => { "message" => "%{TIME:timestamp} %{HOST:host} %{IP:srcip}" }
}
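
For the sample line above, this pattern would add the following fields to the event:

    timestamp: 20:20:20
    host:      ABC123DF
    srcip:     192.168.1.1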

You can also use custom regex inside grok (see the docs). Example pattern:

%{GREEDYDATA} (?<host>.*) %{GREEDYDATA}
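
For example, a minimal sketch that pulls the hostname from the question's log lines straight into a victim field with an inline named capture (the character classes are assumed from the question's pattern, widened to upper case so ABC123DF matches):

grok {
    # inline Oniguruma named capture: three letters, three digits, two trailing characters
    match => { "message" => "(?<victim>[A-Za-z]{3}[0-9]{3}[A-Za-z0-9]{2})" }
}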

However, I would recommend avoiding raw regex in grok; it is a better approach to go with the included patterns. Use the grok debugger to find the right patterns for you.

I use this site, http://grokconstructor.appspot.com/do/match#result , to test my regex.

In my mongodb log, I have this text:

2016-05-17T16:26:07.018-0300 I QUERY [conn50628097] getmore dataBaseName.collectionName query: { empresa: "********" } cursorid:443889850763 ntoreturn:0 keyUpdates:0 writeConflicts:0 numYields:69 nreturned:8886 reslen:1184746 locks:{ Global: { acquireCount: { r: 140 } }, Database: { acquireCount: { r: 70 }, acquireWaitCount: { r: 66 }, timeAcquiringMicros: { r: 98046 } }, Collection: { acquireCount: { r: 70 } } } 178ms

To get the query and the execution time, I use this filter in the config file:

    filter {
      if [source] == "/var/log/mongodb/mongod.log" {
        grok {
        match => {
            "message" => [
              "(getmore|query)\s(?<mongo_database>[a-zA-Z_]+)\.(?<mongo_collection>[a-zA-Z_.]+)\s?query:\s(?<mongo_query>\{.*?\})\s(cursorid|planSummary).*?\s(?<mongo_totaltime>[0-9]+ms)"
            ]
          }
        }
      }
    }
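
Run against the sample log line above, those named captures yield:

    mongo_database   => dataBaseName
    mongo_collection => collectionName
    mongo_query      => { empresa: "********" }
    mongo_totaltime  => 178ms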

Use:

    (?<you_new_field_name>you_regex)you_regex(?<you_new_field_name>you_regex)(?<you_new_field_name>you_regex)

After this, you can add:

    add_field => [ "tag_text_optional%{you_new_field_name}", "%{you_new_field_name}" ]
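
For example, with the fields captured by the filter above (the slow_query_info field name here is just an illustration):

    add_field => [ "slow_query_info", "%{mongo_database}.%{mongo_collection} took %{mongo_totaltime}" ]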

So in my case the logline is:

2015-10-20 14:45:42,156 [pool-3-thread-1] INFO audit Terminated abc123df from group LLDS2Cassandra [LOCAL] with NetworkCorruption

grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{DATA}\] %{LOGLEVEL:loglevel} %{GREEDYDATA} %{VICTIM:victim}" }
}

And in the grok patterns I put the following line: VICTIM [a-z][a-z][a-z][0-9][0-9][0-9]..
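
A custom pattern like that normally lives in its own patterns file, which the grok filter loads via its patterns_dir option (a minimal sketch; ./patterns is an example path):

grok {
    # ./patterns/custom contains one definition per line, e.g.:
    # VICTIM [a-z][a-z][a-z][0-9][0-9][0-9]..
    patterns_dir => ["./patterns"]
    match => { "message" => "%{VICTIM:victim}" }
}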

To get the following result:

FIELD: TIMESTAMP VALUE: 2015-10-20 14:45:42,156

FIELD: VICTIM VALUE: abc123df

FIELD: LOGLEVEL VALUE: INFO
