
How do I parse JSON log file with jq?

I have two types of logs in a JSON log file, and I want to parse and label each event with a tag using a jq filter. An example of each event is shown below.

The goal is to label each event: if the message begins with TR, set .sourcetype=application_log; otherwise, if it begins with IP, set .sourcetype=access_log.

So far, I'm working with this on test.log:

jq -r '.[] | select(.log[12:14] == "TR") | .sourcetype = "application_log" | .sourcetype'

"log": "{\"message\":\"TR=failed to send order confirmation to \\\"someone@example.com\\\": rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp i/o timeout\\\"\",\"severity\":\"warning\",\"timestamp\":\"2019-07-23T00:47:07.216693578Z\"}\n",
"stream": "stdout",
"time": "2019-07-23T00:47:07.222368843Z"

"log": "{\"message\":\"IP=failed to send order confirmation to \\\"someone@example.com\\\": rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp i/o timeout\\\"\",\"severity\":\"warning\",\"timestamp\":\"2019-07-23T00:47:07.216693578Z\"}\n",
"stream": "stdout",
"time": "2019-07-23T00:47:07.222368843Z"

If I understand the task correctly, a solution would be:

.log[12:14] as $code
| if ($code == "TR") then .sourcetype = "application_log"
  elif ($code == "IP") then .sourcetype = "access_log"
  else .
  end
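
The slice works because each .log string starts with the 12-character prefix {"message":" so indices 12 and 13 hold the first two characters of the message. As a rough cross-check of that rule outside jq, here is a minimal Python sketch; the helper name tag_event and the shortened sample events are made up for illustration and are not part of the original logs.

```python
import json

def tag_event(event):
    """Add a top-level sourcetype based on the two characters at .log[12:14]."""
    code = event["log"][12:14]  # same indices as the jq slice .log[12:14]
    if code == "TR":
        event["sourcetype"] = "application_log"
    elif code == "IP":
        event["sourcetype"] = "access_log"
    return event  # events matching neither prefix pass through unchanged

# Shortened, hypothetical events; compact separators mimic the original encoding.
samples = [
    {"log": json.dumps({"message": "TR=failed to send"}, separators=(",", ":")) + "\n",
     "stream": "stdout"},
    {"log": json.dumps({"message": "IP=failed to send"}, separators=(",", ":")) + "\n",
     "stream": "stdout"},
]

for event in samples:
    print(tag_event(event).get("sourcetype"))
```

Note that this positional test is brittle: it only holds while "message" is the first key in the encoded .log string.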

If you want the .log values as JSON objects so you can add the .sourcetype there, you would have to use fromjson on the original .log values, along the lines of:

.log |= fromjson
| .log.message[0:2] as $code
| if ($code == "TR") then .log.sourcetype = "application_log"
  elif ($code == "IP") then .log.sourcetype = "access_log"
  else .
  end
| .log |= tostring   # is this line really needed?
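
For comparison, the decode, tag, re-encode round trip can be sketched in Python as follows. This is only an illustration under the assumption that every .log value is a valid JSON object with a message key; the function name tag_inner is hypothetical.

```python
import json

def tag_inner(event):
    """Decode .log, add sourcetype inside it, then re-encode it as a string."""
    inner = json.loads(event["log"])  # plays the role of jq's fromjson; the trailing "\n" is tolerated
    code = inner["message"][:2]       # prefix of the decoded message, not of the raw string
    if code == "TR":
        inner["sourcetype"] = "application_log"
    elif code == "IP":
        inner["sourcetype"] = "access_log"
    # Re-serialize compactly; this step corresponds to `.log |= tostring`
    # and can be skipped if .log should stay a nested object.
    event["log"] = json.dumps(inner, separators=(",", ":"))
    return event
```

Because the prefix is read from the decoded message, this variant does not depend on the key order inside the .log string. The re-encoded string does not keep the original trailing newline.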

Alternatively, the same operation can be accomplished with jtc, a walk-path based Unix utility:

bash $ jtc -aw'[log]:<"TR=>R<V:"application_log">v[-1]' -w'[log]:<"IP>R<V:"access_log">v[-1]' -i0 -T'{"sourcetype":"{V}"}' log.json 
   "log": "{\"message\":\"TR=failed to send order confirmation to \\\"someone@example.com\\\": rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp i/o timeout\\\"\",\"severity\":\"warning\",\"timestamp\":\"2019-07-23T00:47:07.216693578Z\"}\n",
   "sourcetype": "application_log",
   "stream": "stdout",
   "time": "2019-07-23T00:47:07.222368843Z"
   "log": "{\"message\":\"IP=failed to send order confirmation to \\\"someone@example.com\\\": rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \\\"transport: Error while dialing dial tcp i/o timeout\\\"\",\"severity\":\"warning\",\"timestamp\":\"2019-07-23T00:47:07.216693578Z\"}\n",
   "sourcetype": "access_log",
   "stream": "stdout",
   "time": "2019-07-23T00:47:07.222368843Z"
bash $ 
  • there are 2 walk-paths here (one matching the TR type of log record and one matching the IP type), each defining the variable V with the respective content upon a successful match. Both walks are applied to each JSON; whichever succeeds defines the content of V
  • the insert option (-i) carries a dummy operand (0) because it is entirely replaced by the template (-T), which is the one you require

PS. Disclosure: I'm the creator of jtc, a shell CLI tool for JSON operations.

