
Parsing out awkward JSON in Logstash

Afternoon,

I've been trying to sort this out for the past few weeks and cannot find a solution. We receive some logs via a third party, and so far I've used grok to pull the value below out into the details field. Annoyingly, this would be extremely simple if it weren't for all the backslashes.

Is there an easy way to parse this data out as JSON in Logstash?

{\"CreationTime\":\"2021-05-11T06:42:44\",\"Id\":\"xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx\",\"Operation\":\"SearchMtpBatch\",\"OrganizationId\":\"xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx\",\"RecordType\":52,\"UserKey\":\"eample@example.onmicrosoft.com\",\"UserType\":5,\"Version\":1,\"Workload\":\"SecurityComplianceCenter\",\"UserId\":\"example@example.onmicrosoft.com\",\"AadAppId\":\"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx\",\"DataType\":\"MtpBatch\",\"DatabaseType\":\"DataInsights\",\"RelativeUrl\":\"/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx\",\"ResultCount\":\"1\"}

You can achieve this easily with the json filter:

filter {
  json {
    source => "message"
  }
}
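The question says grok places the value in a details field rather than message, so the filter would need to point there instead. A minimal sketch under that assumption follows; the target name details_json is hypothetical and only keeps the parsed keys from landing at the event root:

```
filter {
  json {
    # "details" is the field the question says grok populates
    source => "details"
    # hypothetical target field; omit this line to write the keys at the event root
    target => "details_json"
    # this is the filter's default failure tag, written out here for visibility
    tag_on_failure => ["_jsonparsefailure"]
  }
}
```

If the field still contains the backslashes, the event should come out tagged _jsonparsefailure, which is a quick way to tell whether the escaping really is present in the data.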

If your source data actually contains those backslashes, then you need to remove them somehow before Logstash can recognise the message as valid JSON.

You could do that before it hits Logstash, in which case the json codec will probably work as expected. Or, if you want Logstash to handle it, you can use the mutate filter's gsub option followed by the json filter:

filter {
  mutate {
    gsub => ["message", "[\\]", "" ]
  }
  json {
    source => "message"
  }
}

A couple of things to note. This blindly strips out every backslash, so if your string values might ever contain backslashes of their own, you need to do something a little more sophisticated. I've also had trouble escaping backslashes in gsub before, and found that wrapping the backslash in a regex character class ([]) is safer.
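One possible "more sophisticated" approach, offered as an untested sketch rather than a verified fix: treat the text as having exactly one extra layer of escaping and remove only that layer, by replacing each backslash-plus-character pair with the character alone. It assumes the default config.support_escapes: false setting, and that the replacement string is handed to Ruby's gsub so that \1 acts as a backreference:

```
filter {
  mutate {
    # replace "backslash + any character" with just that character,
    # so \" becomes " and a doubled backslash becomes a single one
    gsub => ["message", "[\\](.)", "\1"]
  }
  json {
    source => "message"
  }
}
```

Because matches are consumed left to right, a doubled backslash inside a value collapses to a single one instead of vanishing. The sketch does not handle other escape sequences in the outer layer (an outer \n would become a plain n), so it only suits single-line payloads like the one in the question.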

Here's a docker one-liner to run that config. The stdin input and stdout output are the defaults when using -e to specify config on the command line, so I've omitted them here for readability:

docker run --rm -it docker.elastic.co/logstash/logstash:7.12.1 -e 'filter { mutate { gsub => ["message", "[\\]", "" ]} json { source => "message" } }'

Pasting your example in and hitting return results in this output:

{
        "@timestamp" => 2021-05-13T01:57:40.736Z,
       "RelativeUrl" => "/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx",
    "OrganizationId" => "xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx",
           "UserKey" => "eample@example.onmicrosoft.com",
          "DataType" => "MtpBatch",
           "message" => "{\"CreationTime\":\"2021-05-11T06:42:44\",\"Id\":\"xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx\",\"Operation\":\"SearchMtpBatch\",\"OrganizationId\":\"xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx\",\"RecordType\":52,\"UserKey\":\"eample@example.onmicrosoft.com\",\"UserType\":5,\"Version\":1,\"Workload\":\"SecurityComplianceCenter\",\"UserId\":\"example@example.onmicrosoft.com\",\"AadAppId\":\"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx\",\"DataType\":\"MtpBatch\",\"DatabaseType\":\"DataInsights\",\"RelativeUrl\":\"/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx\",\"ResultCount\":\"1\"}",
          "UserType" => 5,
            "UserId" => "example@example.onmicrosoft.com",
              "type" => "stdin",
              "host" => "de2c988c09c7",
          "@version" => "1",
         "Operation" => "SearchMtpBatch",
          "AadAppId" => "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx",
       "ResultCount" => "1",
      "DatabaseType" => "DataInsights",
           "Version" => 1,
        "RecordType" => 52,
      "CreationTime" => "2021-05-11T06:42:44",
                "Id" => "xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx",
          "Workload" => "SecurityComplianceCenter"
}
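In the output above, the raw message is kept alongside the parsed fields and ResultCount is still a string. If that is not wanted, a sketch like the following could tidy it up. It relies on the common remove_field option being applied only when the filter succeeds; converting ResultCount to an integer is an assumption about what is wanted downstream:

```
filter {
  mutate {
    gsub => ["message", "[\\]", ""]
  }
  json {
    source => "message"
    # drop the raw text only when parsing succeeded
    remove_field => ["message"]
  }
  mutate {
    # assumption: ResultCount is wanted as a number rather than a string
    convert => { "ResultCount" => "integer" }
  }
}
```

Events that fail to parse keep their message field, so nothing is lost when the input turns out not to be valid JSON.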
