How do I parse JSON log file with jq?

I have two types of logs in a JSON log file, and I want to parse each event and label it with a tag using a jq filter. An example of each event is shown below.

The goal is to label each event so that if the message begins with TR, .sourcetype=application_log; otherwise, if the message begins with IP, .sourcetype=access_log.

So far, I'm working with this, run against test.log:

jq -r '.[] | select(.log[12:14] == "TR") | .sourcetype = "application_log" | .sourcetype' test.log

{
"log": "{\"message\":\"TR=failed to send order confirmation to \\\"someone@example.com\\\": rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.64.5.235:5000: i/o timeout\\\"\",\"severity\":\"warning\",\"timestamp\":\"2019-07-23T00:47:07.216693578Z\"}\n",
"stream": "stdout",
"time": "2019-07-23T00:47:07.222368843Z"
}

{
"log": "{\"message\":\"IP=failed to send order confirmation to \\\"someone@example.com\\\": rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.64.5.235:5000: i/o timeout\\\"\",\"severity\":\"warning\",\"timestamp\":\"2019-07-23T00:47:07.216693578Z\"}\n",
"stream": "stdout",
"time": "2019-07-23T00:47:07.222368843Z"
}

If I understand the task correctly, a solution would be:

.log[12:14] as $code    
| if ($code == "TR") then .sourcetype = "application_log"
  elif ($code == "IP") then .sourcetype = "access_log"
  else .
  end
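
As a usage sketch, the filter above can be wrapped in a small shell function and fed one event per line. The function name label_events and the shortened sample record are made up for illustration; -c (compact output) is only a convenience.

```shell
# Hypothetical wrapper around the filter above; reads files given as
# arguments, or stdin when none are given.
label_events() {
  jq -c '
    # Characters 12-13 of the raw .log string are the two-letter prefix
    # that follows {"message":" in the embedded JSON.
    .log[12:14] as $code
    | if   $code == "TR" then .sourcetype = "application_log"
      elif $code == "IP" then .sourcetype = "access_log"
      else .
      end
  ' "$@"
}

# Shortened sample event with the same shape as the ones in the question.
printf '%s\n' '{"log":"{\"message\":\"TR=failed to send order\"}\n","stream":"stdout"}' \
  | label_events
```

With the file from the question, the call would be label_events test.log. Events whose prefix is neither TR nor IP pass through unchanged.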

If you want the .log values as JSON objects so you can add the .sourcetype there, you would have to use fromjson on the original .log values, along the lines of:

.log |= fromjson
| .log.message[0:2] as $code
| if ($code == "TR") then .log.sourcetype = "application_log"
  elif ($code == "IP") then .log.sourcetype = "access_log"
  else .
  end
| .log |= tostring   # is this line really needed?
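
A variant sketch of the same round trip, assuming every event's .log decodes to an object with a string message: once .log has been parsed, the prefix can be tested with startswith instead of a slice, so nothing depends on character offsets. The function name label_inner is hypothetical, and tojson is used for re-serialization (for an object it yields the same string as tostring).

```shell
# Hypothetical wrapper: decode .log, tag the decoded object, re-encode it.
label_inner() {
  jq -c '
    .log |= fromjson
    | if   (.log.message | startswith("TR")) then .log.sourcetype = "application_log"
      elif (.log.message | startswith("IP")) then .log.sourcetype = "access_log"
      else .
      end
    # Re-encode so .log is a string again, as in the input.
    | .log |= tojson
  ' "$@"
}

# Shortened sample event with the same shape as the ones in the question.
printf '%s\n' '{"log":"{\"message\":\"IP=failed to send order\"}\n","stream":"stdout"}' \
  | label_inner
```

Note that the re-encoded .log string no longer carries the trailing newline of the original value. If downstream consumers can accept .log as an object, the final re-encoding step can simply be dropped.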

Alternatively, the same operation can be done with jtc, a walk-path based Unix utility:

bash $ jtc -aw'[log]:<"TR=>R<V:"application_log">v[-1]' -w'[log]:<"IP>R<V:"access_log">v[-1]' -i0 -T'{"sourcetype":"{V}"}' log.json 
{
   "log": "{\"message\":\"TR=failed to send order confirmation to \\\"someone@example.com\\\": rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.64.5.235:5000: i/o timeout\\\"\",\"severity\":\"warning\",\"timestamp\":\"2019-07-23T00:47:07.216693578Z\"}\n",
   "sourcetype": "application_log",
   "stream": "stdout",
   "time": "2019-07-23T00:47:07.222368843Z"
}
{
   "log": "{\"message\":\"IP=failed to send order confirmation to \\\"someone@example.com\\\": rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp 10.64.5.235:5000: i/o timeout\\\"\",\"severity\":\"warning\",\"timestamp\":\"2019-07-23T00:47:07.216693578Z\"}\n",
   "sourcetype": "access_log",
   "stream": "stdout",
   "time": "2019-07-23T00:47:07.222368843Z"
}
bash $ 
  • There are two walk-paths here, one matching the TR type of log record and one matching the IP type. On a successful match, each defines the variable V with the corresponding content. Both walks are applied to every JSON; whichever one succeeds defines the content of V.
  • The insert option ( -i ) carries a dummy operand ( 0 ) because it is entirely replaced by the template ( -T ), which produces the value you need.

PS. Disclosure: I'm the creator of jtc, a shell CLI tool for JSON operations.
