Logstash: split a path, take the value at position [2], and insert it into a field

Question

I need to extract a certain value from a path by its position.

Example: say I split the following path into an array, using '\\' as the split character:

E:\\OUM82\\APP\\Logs\\UploadManager_20062019.log

I should get something like this:

[0]=E:
[1]=OUM82
[2]=APP (this is the value I want to put into a field)
[3]=Logs
[4]=UploadManager_20062019.log

So I always want to take whatever is in [2]. How do I implement this? Is it something to do with Ruby?

Edit:

I tried this approach (as @baudsp suggested), but I'm still getting "_grokparsefailure":

grok {
    match => { path => "%{GREEDYDATA:pathDriveSign}\\%{GREEDYDATA:RootFolder}\\%{GREEDYDATA:customerFolder}" }
}

Here is the stdout output:

{
      "tags" => [
    [0] "beats_input_codec_plain_applied",
    [1] "_grokparsefailure"
],
     "agent" => {
            "type" => "filebeat",
    "ephemeral_id" => "bd6ace26-79cd-4297-bfb5-5add9f4b4217",
              "id" => "83fb6261-5872-4d95-853a-44f2cc41d436",
         "version" => "7.0.0",
        "hostname" => "OctUpload"
},
   "message" => "2019-06-13 17:40:34,591 INFO QueriesParserEngine.Run - GSP queries parser engine end. Total run time duration: 00:02:32.1831164 ",
"@timestamp" => 2019-06-22T16:25:26.204Z,
     "cloud" => {
    "provider" => "az",
     "machine" => {
        "type" => "Standard_DS13_v2"
    },
      "region" => "westeurope",
    "instance" => {
        "name" => "OctUpload",
          "id" => "768097b1-bfb9-4939-b99c-5337aede39ca"
    }
},
 "extractor" => "SQLSERVER",
     "input" => {
    "type" => "log"
},
       "ecs" => {
    "version" => "1.0.0"
},
  "@version" => "1",
    "fields" => {
    "logtype" => "log4net"
},
      "host" => {
              "os" => {
           "build" => "14393.2608",
         "version" => "10.0",
            "name" => "Windows Server 2016 Datacenter",
        "platform" => "windows",
          "kernel" => "10.0.14393.2608 (rs1_release.181024-1742)",
          "family" => "windows"
    },
              "id" => "d79c20df-4184-41a8-b95d-83669c8e3fbe",
            "name" => "OctUpload",
    "architecture" => "x86_64",
        "hostname" => "OctUpload"
},
       "log" => {
      "file" => {
        "path" => "E:\\OUM82\\Micron\\TI_DS_FILES\\SQLSERVER_LOGS\\QueriesParser-SQLS-BOMSSPROD66-2_13062019_173801 - Copy.log"
    },
    "offset" => 927068
}

}

Answer 1

NB: I'm not sure this is the best filter to use here, but it's the one I've used the most, and it should work.

If you are only interested in the APP part of your path, you should be able to retrieve it with the grok filter.

Supposing that your path is in a field called path:

grok {
   match => {path => "^%{DATA}\\%{DATA}\\%{DATA:value}\\"}
}

The filter will put the value APP in the value field.
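As a rough illustration of what the pattern above matches, here is a minimal Ruby sketch that runs outside Logstash. It assumes grok's DATA is the lazy `.*?` pattern, and it reads the path from the nested location seen in the question's output (log, then file, then path; in Logstash field-reference syntax that would be [log][file][path]) rather than from a top-level path field. The `event` hash and the `third_component` helper are hypothetical stand-ins, not Logstash APIs.

```ruby
# Stand-in for the event shown in the question (only the relevant part).
event = {
  "log" => {
    "file" => {
      "path" => 'E:\OUM82\Micron\TI_DS_FILES\SQLSERVER_LOGS\QueriesParser-SQLS-BOMSSPROD66-2_13062019_173801 - Copy.log'
    }
  }
}

# Rough plain-Ruby counterpart of "^%{DATA}\\%{DATA}\\%{DATA:value}\\":
# DATA is assumed to be the lazy pattern .*?, and \\ in a regex literal
# matches exactly one backslash in the data.
THIRD_COMPONENT = /\A.*?\\.*?\\(?<value>.*?)\\/

# Hypothetical helper: returns the third path component, or nil if the
# path is missing or has fewer than three backslash-terminated parts.
def third_component(path)
  match = THIRD_COMPONENT.match(path.to_s)
  match && match[:value]
end

path = event.dig("log", "file", "path")
puts third_component(path)
```

If the path really does live under [log][file][path], pointing the grok match at that field instead of path may be part of what removes the _grokparsefailure tag, though that has not been tested here.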

For more information, see the grok filter documentation.

Answer 2

Another, better solution, by Badger from the ELK team:

You cannot do it with mutate+split (which is what I would normally suggest) due to this issue, which affects regexps, single-quoted strings, and double-quoted strings.

It is possible using grok if you enable config.support_escapes in logstash.yml. Believe it or not,

grok {
    match => { "path" => "^(?<pathDriveSign>\w{1}):\\\\(?<RootFolder>[^\\\\]+)\\\\(?<customerFolder>[^\\\\]+)\\\\." }
}

will get you

"RootFolder" => "OUM82",
"pathDriveSign" => "E",
"customerFolder" => "APP",

Do not ask me to explain why 4 backslashes are required to represent a single backslash.
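For comparison, the same named captures can be tried in plain Ruby, where a regex literal needs only two backslashes to match one. This is a sketch for checking the capture groups, not the Logstash configuration itself; `path_parts` is a hypothetical helper, and the idea that the extra doubling comes from config-string escaping is an assumption rather than something verified here.

```ruby
# In a Ruby regex literal, \\ matches one backslash in the data. Any further
# doubling in a Logstash config is assumed to come from config-string
# escaping (not verified here).
PATH_PARTS = /\A(?<pathDriveSign>\w):\\(?<RootFolder>[^\\]+)\\(?<customerFolder>[^\\]+)\\./

# Hypothetical helper: returns the named captures as a Hash with string keys,
# or an empty Hash when the path does not match.
def path_parts(path)
  match = PATH_PARTS.match(path.to_s)
  match ? match.named_captures : {}
end

p path_parts('E:\OUM82\APP\Logs\UploadManager_20062019.log')
```

Returning an empty hash on a failed match keeps the caller from having to handle nil, which loosely mirrors how a failed grok leaves the event without the new fields.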

There is also a sneaky way to do it in ruby. You cannot have a backslash at the end of a string, so we create a string that contains a backslash and extract the backslash from it.

ruby {
    code => '
        backslash = "\\\\Z"[0]
        event.set("components", event.get("path").split(backslash))
    '
}

results in

"components" => [
    [0] "E:",
    [1] "OUM82",
    [2] "APP",
    [3] "Logs",
    [4] "UploadManager_20062019.log"
]
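The question asks for the value at position [2] in its own field, so one more step is needed after the split. A minimal standalone Ruby sketch of that step follows. `path_component` is a hypothetical helper, not part of Logstash, and the field name customerFolder is borrowed from the question. In standalone Ruby the backslash can be written directly as "\\"; the string-indexing trick above is only needed inside the Logstash config.

```ruby
# Hypothetical helper (not part of Logstash): return the component at `index`
# of a backslash-separated path, or nil when the path is missing or too short.
def path_component(path, index)
  return nil unless path.is_a?(String)

  backslash = "\\" # a one-character string holding a single backslash
  path.split(backslash)[index]
end

puts path_component('E:\OUM82\APP\Logs\UploadManager_20062019.log', 2)

# Inside a ruby filter the same idea would presumably read as follows, with
# `backslash` obtained as in the answer above:
#   component = event.get("path").split(backslash)[2]
#   event.set("customerFolder", component) if component
```

Guarding against a nil result avoids creating an empty field for paths with fewer than three components.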


Answer 1
1 2019-06-21 12:10:30

Answer 2
0 2019-06-22 19:46:12
