
Logstash filter Regex into Field

I am having trouble parsing a log line.

I have thousands of log lines, and every log line contains a hostname like ABC123DF. I have written a regex, and I want to apply it to the log line and put the hostname in the field "victim", like this:

add_field => [ "victim", "/[a-z][a-z][a-z][0-9][0-9][0-9].." ]

I have used the mutate filter, but the result is:

victim /[a-z][a-z][a-z][0-9][0-9][0-9]..

I would like to see:

victim ABC123DF

How do I do this?

You don't even need a complex regex to achieve this. Logstash has several filters that can extract fields; in your case, the grok filter is a good choice.

Let's suppose your log lines look like this:

20:20:20 ABC123DF 192.168.1.1

Then a grok filter like this would parse the hostname properly:

grok {
    match => { "message" => "%{TIME:timestamp} %{HOST:host} %{IP:srcip}" }
}

You can also use regex inside grok (see the docs). Example pattern:

%{GREEDYDATA} (?<host>.*) %{GREEDYDATA}

However, I would recommend avoiding regex in grok. It is better to use the included patterns. Use the grok debugger to find the right patterns for your log lines.
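To apply this to the hostname from the question: add_field in mutate only stores the literal string (apart from %{field} references), so it never runs the regex. An inline named capture in grok does. A minimal sketch, assuming the hostname is always three letters, three digits and two more letters (like ABC123DF) and that the raw line is in the message field:

```
filter {
  grok {
    # (?<victim>...) is an inline named capture: whatever the regex
    # matches is stored in a new field called "victim".
    # Assumption: hostnames look like ABC123DF (3 letters, 3 digits, 2 letters).
    match => { "message" => "(?<victim>[A-Za-z]{3}[0-9]{3}[A-Za-z]{2})" }
  }
}
```

Grok does not anchor the pattern, so this picks up the first substring of that shape anywhere in the line. Tighten the character classes if other tokens in your logs could match too.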

I use this site, http://grokconstructor.appspot.com/do/match#result, to test my regex.

In my MongoDB log, I have this line:

2016-05-17T16:26:07.018-0300 I QUERY [conn50628097] getmore dataBaseName.collectionName query: { empresa: "********" } cursorid:443889850763 ntoreturn:0 keyUpdates:0 writeConflicts:0 numYields:69 nreturned:8886 reslen:1184746 locks:{ Global: { acquireCount: { r: 140 } }, Database: { acquireCount: { r: 70 }, acquireWaitCount: { r: 66 }, timeAcquiringMicros: { r: 98046 } }, Collection: { acquireCount: { r: 70 } } } 178ms

To extract the query and the time, I use this filter in the config file:

    filter {
      if [source] == "/var/log/mongodb/mongod.log" {
        grok {
          match => {
            "message" => [
              "(getmore|query)\s(?<mongo_database>[a-zA-Z_]+)\.(?<mongo_collection>[a-zA-Z_.]+)\s?query:\s(?<mongo_query>\{.*?\})\s(cursorid|planSummary).*?\s(?<mongo_totaltime>[0-9]+ms)"
            ]
          }
        }
      }
    }

Use named capture groups:

    (?<your_new_field_name>your_regex)your_regex(?<your_new_field_name>your_regex)(?<your_new_field_name>your_regex)

After this, you can add:

    add_field => [ "tag_text_optional%{your_new_field_name}", "%{your_new_field_name}" ]

So in my case the logline is:

2015-10-20 14:45:42,156 [pool-3-thread-1] INFO audit Terminated abc123df from group LLDS2Cassandra [LOCAL] with NetworkCorruption

grok {
    match => { "message" => "%{TIME:timestamp} %{LOGLEVEL} %{VICTIM:victim} " }
}

And in the grok patterns I put the following line: VICTIM [a-z][a-z][a-z][0-9][0-9][0-9] .

To get the following result:

FIELD: TIMESTAMP VALUE: 2015-10-20 14:45:42,156

FIELD: VICTIM VALUE: abc123df

FIELD: LOGLEVEL VALUE: INFO
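
Note that the sample line has a full date, a thread name in brackets and the words "audit Terminated" before the hostname, so the match pattern has to account for them. A sketch that should match that line, assuming a grok filter version that supports pattern_definitions (otherwise keep VICTIM in a patterns file and point patterns_dir at it). The field name thread is made up for illustration, and the VICTIM definition assumes hostnames of three letters, three digits and two letters:

```
filter {
  grok {
    # Custom pattern defined inline instead of in a patterns file.
    pattern_definitions => { "VICTIM" => "[a-z]{3}[0-9]{3}[a-z]{2}" }
    # TIMESTAMP_ISO8601 covers "2015-10-20 14:45:42,156";
    # DATA captures the thread name between the brackets.
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{DATA:thread}\] %{LOGLEVEL:loglevel} audit Terminated %{VICTIM:victim}" }
  }
}
```

For the sample line this should give timestamp, loglevel and victim as separate fields. Check it in the grok debugger against a few real lines before relying on it.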
