
elasticsearch: extract number from a field

I'm using elasticsearch and kibana for storing my logs. Now what I want is to extract a number from a field and store it in a new field.

So for instance, having this:

accountExist execution time: 1046 ms

I would like to extract the number (1046) and see it in a new field in kibana.

Is it possible? If so, how? Thanks for the help.

You'll need to do this before/during indexing.

Within Elasticsearch, you can get what you need during indexing:

  1. Define a new analyzer using the Pattern Analyzer to wrap a regular expression (for your purposes, to capture consecutive digits in the string; there is a good answer on this topic).
  2. Create your new numeric field in the mapping to hold the extracted times.
  3. Use copy_to to copy the log message from the input field to the new numeric field from (2) where the new analyzer will parse it.

The Analyze API can be helpful for testing purposes.
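
To try out the capture pattern before touching the mapping, a request body for the Analyze API can be put together along these lines. This is only a sketch under assumptions: it uses an inline pattern tokenizer with a capture group instead of the Pattern Analyzer named above (the analyzer treats its pattern as a separator), the helper names are made up, and the local `re.findall` call merely approximates what the tokenizer should emit. Nothing is sent to a cluster.

```python
import json
import re

# Sample log line from the question.
SAMPLE = "accountExist execution time: 1046 ms"

# Assumption: with "group": 1, a pattern tokenizer emits the captured digits as tokens.
DIGITS = "([0-9]+)"


def build_analyze_body(text):
    """Build a body for POST /_analyze using an inline pattern tokenizer."""
    return {
        "tokenizer": {"type": "pattern", "pattern": DIGITS, "group": 1},
        "text": text,
    }


def emulate_tokens(text):
    """Local approximation of the tokens the request above should return."""
    return re.findall(DIGITS, text)


if __name__ == "__main__":
    print(json.dumps(build_analyze_body(SAMPLE), indent=2))
    print(emulate_tokens(SAMPLE))
```

If the tokens returned by the real Analyze API differ from the local approximation, trust the API response and adjust the pattern.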

Although they are not performant, scripted fields in Kibana are an option if you must avoid reindexing.

Introduction here: https://www.elastic.co/blog/using-painless-kibana-scripted-fields

  • enable painless regex support by putting the following in your elasticsearch.yml:

    script.painless.regex.enabled: true

  • restart elasticsearch
  • create a new scripted field in Kibana through Management -> Index Patterns -> Scripted Fields
  • select painless as the language and number as the type
  • create the actual script, for example:
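    // Read the raw log line from the document source.
    def logMsg = params['_source']['log_message'];
    if (logMsg == null) {
        // Sentinel value for documents without a log message.
        return -10000;
    }
    // matches() must cover the whole string, hence the leading and trailing .*
    def m = /.*accountExist execution time: ([0-9]+) ms.*$/.matcher(logMsg);
    if (m.matches()) {
        return Integer.parseInt(m.group(1));
    } else {
        // Sentinel value for log lines without the timing message.
        return -10000;
    }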
  • you must reload the page completely for the new fields to be executed; simply re-running a search on an already open Discover page will not pick up the new fields. (This almost made me quit trying to get this working -.-)
  • use the script in discover or visualizations
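
The regular expression in the script can be checked offline before it goes into Kibana. The sketch below mirrors the scripted field's logic in Python. The function name `extract_exec_time` and the extra sample messages are made up for illustration, and Python's `re` is only assumed to behave like Painless (Java) regexes for a pattern this simple.

```python
import re

# Same sentinel the scripted field returns when nothing can be extracted.
SENTINEL = -10000

# Same pattern as in the Painless script.
PATTERN = re.compile(r".*accountExist execution time: ([0-9]+) ms.*$")


def extract_exec_time(log_message):
    """Return the execution time in ms, or SENTINEL if the line does not match."""
    if log_message is None:
        return SENTINEL
    # fullmatch() plays the role of Java's Matcher.matches(): the whole string must match.
    m = PATTERN.fullmatch(log_message)
    if m:
        return int(m.group(1))
    return SENTINEL


if __name__ == "__main__":
    for line in ["accountExist execution time: 1046 ms", "some other log line", None]:
        print(repr(line), "->", extract_exec_time(line))
```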

While I do understand that scripting fields is not performant for millions of log entries, my use case is a very specific log entry that is logged 10 times a day in total, and I only use the resulting fields to create a visualization or in analysis where I reduce the candidates through regular queries in advance.

It would be interesting to know whether those fields can be calculated only in situations where you need them (or where they make sense and are computable to begin with, i.e. to make the "return -10000" unnecessary). Currently they are applied to, and show up for, every log entry.
You can generate scripted fields inside queries like this: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html but that seems a bit too buried under the hood to maintain easily :/
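
For reference, a search request with a script field could be assembled roughly as follows. This is a sketch, not a tested query: the script field name `exec_time_ms` and the helper `build_search_body` are hypothetical, `log_message` is the field name used in the script above, and the body is only built and printed, not sent anywhere. The `match_phrase` query narrows the hits to lines that contain the timing message, so the script is evaluated only for documents where it makes sense.

```python
import json

# Painless source for the script field; painless regex support must be enabled on the cluster.
PAINLESS_SOURCE = (
    "def m = /.*accountExist execution time: ([0-9]+) ms.*$/"
    ".matcher(params['_source']['log_message']); "
    "return m.matches() ? Integer.parseInt(m.group(1)) : -10000;"
)


def build_search_body():
    """Build a search body that computes the script field only for matching log lines."""
    return {
        # Restrict the hits first, so the script runs on relevant documents only.
        "query": {"match_phrase": {"log_message": "accountExist execution time"}},
        # _source is not returned by default once script_fields is used, so request it.
        "_source": ["log_message"],
        "script_fields": {
            "exec_time_ms": {  # hypothetical field name
                "script": {"lang": "painless", "source": PAINLESS_SOURCE}
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body(), indent=2))
```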
