简体   繁体   中英

In Bash, how can I parse multiple newline delimited JSON objects from a log file?

I am parsing through a log file and get result lines (using grep) like the following:

2017-01-26 17:19:40 +0000 docker: {"source":"stdout","log":"I, [2017-01-26T17:19:40.703988 #24]  INFO -- : {\"tags\":\"structured_log\",\"payload\":{\"results\":[{\"baserate\":\"-1\"}]},\"commit_stamp\":1485451180,\"resource\":\"google_price_result_metric\",\"object_id\":\"20170126171940700\"}","container_id":"6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e","container_name":"/test-container-b49c8188c3ebe4b93300"}
2017-01-26 17:19:40 +0000 docker: {"container_id":"6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e","container_name":"/test-container-b49c8188c3ebe4b93300","source":"stdout","log":"I, [2017-01-26T17:19:40.704364 #24]  INFO -- : method=POST path=/prices.xml format=xml controller=TestController action=prices status=200 duration=1686.51 view=0.08 db=0.62"}

I then extract the JSON objects with the following command:

... | grep -o -E "\\{.*$"

I know I can parse a single line with python -mjson.tool like so:

... | grep -o -E "\\{.*$" | tail -n1 | python -mjson.tool

But I want to parse both lines (or n lines). How can I do this in bash? (I think xargs is supposed to let me do this, but I am new to the tool and can't figure it out)

jq can be told to accept plain text as input, and attempt to parse an extracted subset as JSON. Consider the following example, tested with jq 1.5:

jq -R 'capture("docker: (?<json>[{].*[}])$") | .json? | select(.) | fromjson' <<'EOF'
2017-01-26 17:19:40 +0000 docker: {"source":"stdout","log":"I, [2017-01-26T17:19:40.703988 #24]  INFO -- : {\"tags\":\"structured_log\",\"payload\":{\"results\":[{\"baserate\":\"-1\"}]},\"commit_stamp\":1485451180,\"resource\":\"google_price_result_metric\",\"object_id\":\"20170126171940700\"}","container_id":"6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e","container_name":"/test-container-b49c8188c3ebe4b93300"}
2017-01-26 17:19:40 +0000 docker: {"container_id":"6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e","container_name":"/test-container-b49c8188c3ebe4b93300","source":"stdout","log":"I, [2017-01-26T17:19:40.704364 #24]  INFO -- : method=POST path=/prices.xml format=xml controller=TestController action=prices status=200 duration=1686.51 view=0.08 db=0.62"}
EOF

...properly yields:

{
  "source": "stdout",
  "log": "I, [2017-01-26T17:19:40.703988 #24]  INFO -- : {\"tags\":\"structured_log\",\"payload\":{\"results\":[{\"baserate\":\"-1\"}]},\"commit_stamp\":1485451180,\"resource\":\"google_price_result_metric\",\"object_id\":\"20170126171940700\"}",
  "container_id": "6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e",
  "container_name": "/test-container-b49c8188c3ebe4b93300"
}
{
  "container_id": "6ecbf7f64e4c9557e9dd1efbc6666a3c6c53f9cd5c18414ed5633cad8c302e",
  "container_name": "/test-container-b49c8188c3ebe4b93300",
  "source": "stdout",
  "log": "I, [2017-01-26T17:19:40.704364 #24]  INFO -- : method=POST path=/prices.xml format=xml controller=TestController action=prices status=200 duration=1686.51 view=0.08 db=0.62"
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM