
jq variable substitution works in shell but not in script

The following command works just fine in the shell, but it fails when executed from a script. What am I missing?

jsonSelectWords='select(.words!=6) | select(.words!=1173) | select(.words!=1) | select(.words!=8) | select(.words!=9) | select(.words!=27)'

cat file.json | jq "$jsonSelectWords"

In the script the select statement is created dynamically, so I cannot provide it directly.

input=file.json

local jsonSelectWords="'"
for word in "${wordDupArray[@]}"
do
    jsonSelectWords+="select(.words!=$word) | "
done
jsonSelectWords="${jsonSelectWords::-3}"
jsonSelectWords+="'"


cat $input | jq "$jsonSelectWords"

The execution of the last line gives the following error.

jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Unix shell quoting issues?) at <top-level>, line 1:

'select(.words!=6) | select(.words!=1173) | select(.words!=1) | select(.words!=8) | select(.words!=9) | select(.words!=27)'

jq: 1 compile error

Any hints? I tried different variations, as well as wrapping the whole statement in $(cat $input | jq "$jsonSelectWords").

I have also tried the following: cat $input | jq --args JSW "$jsonSelectWords" '$JSW' (with the single quotes removed from the initial string, with '[$JSW]', and so on). This just outputs the content of jsonSelectWords.

The following lines are examples of the content of file.json, i.e. $input.

{"timestamp":"2022-03-09T12:30:23.329630917+01:00","scheme":"http","port":"80","path":"/","body-sha256":"0bfc0bdeb920ce4701f130e6e6a33c8aaf558fae44c7479cc1629930cb0f4535","header-sha256":"d9522b92bb09e71b719804f522f0b3b49aa77974c8d79e644fb45a7b3327f73e","a":["81.91.86.14"],"url":"http://01.akce.omv.com:80","input":"01.akce.omv.com","location":"https://01.akce.omv.com/","webserver":"openresty","content-type":"text/html","method":"GET","host":"81.91.86.14","content-length":95,"status-code":301,"response-time":"194.004475ms","failed":false,"lines":3,"words":6}
{"timestamp":"2022-03-09T12:30:23.355007661+01:00","scheme":"http","port":"80","path":"/","body-sha256":"d6285599bd6f2851fc17e0244ad212a58d8d539231f804f81b5b98289197afa0","header-sha256":"96884ec058c78d0ea282a2d51be4ce0f5c7bc05d8fe3e8dd8f6fb73dd4fa2cd6","a":["81.91.86.14","40.90.4.7","64.4.48.7","2603:1061::7","2620:1ec:8ec::7"],"url":"http://09-mail2.akce.omv.com:80","input":"09-mail2.akce.omv.com","location":"https://09-mail2.akce.omv.com/","webserver":"openresty","content-type":"text/html","method":"GET","host":"81.91.86.14","content-length":101,"status-code":301,"response-time":"233.377898ms","failed":false,"lines":3,"words":6}
{"timestamp":"2022-03-09T12:30:23.450849812+01:00","scheme":"http","port":"80","path":"/","body-sha256":"c186820e328bf631a2943f77e52e9e8319ddfefade6d308a2a22ef996176bbe6","header-sha256":"61e4f3139518b49cac86b77a4f9f06da98d53f2eb12dbff574b5a0ea66327478","a":["81.91.86.14"],"url":"http://09-server2.akce.omv.com:80","input":"09-server2.akce.omv.com","location":"https://09-server2.akce.omv.com/","webserver":"openresty","content-type":"text/html","method":"GET","host":"81.91.86.14","content-length":103,"status-code":301,"response-time":"268.856986ms","failed":false,"lines":3,"words":6}

Solution

local jsonSelectWords=""                          # no literal ' inside the value
for word in "${wordDupArray[@]}"
do
    jsonSelectWords+="select(.words!=$word) | "
done
jsonSelectWords="${jsonSelectWords::-3}"          # drop the trailing " | "

jq "$jsonSelectWords" "$input"                    # quote $input; no need for cat
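
What matters here is that the literal ' characters are no longer part of the variable's value. Quotes typed on the command line are consumed by the shell before jq starts, whereas quotes stored inside a variable are ordinary data and reach jq as part of the program. A minimal sketch of the difference, using a made-up two-line input instead of file.json:

```shell
#!/usr/bin/env bash
# Illustrative only: the sample input is made up for this sketch.
sample='{"words":6}
{"words":7}'

good='select(.words != 6)'       # the shell strips these quotes; jq never sees them
bad="'select(.words != 6)'"      # the inner single quotes are part of the value

printf 'good filter: %s\n' "$good"   # good filter: select(.words != 6)
printf 'bad filter:  %s\n' "$bad"    # bad filter:  'select(.words != 6)'

good_out=$(jq -c "$good" <<< "$sample")        # valid program
bad_out=$(jq -c "$bad" <<< "$sample" 2>&1)     # jq reports a compile error
bad_status=$?                                  # exit status of the failed jq call

printf '%s\n' "$good_out"                      # {"words":7}
printf 'bad filter exit status: %s\n' "$bad_status"
```

The second call fails the same way as in the question, because jq is asked to compile a program that starts with a ' character.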

Programmatically producing code that is then executed (here, a bash script building and running a jq filter) is generally considered less readable (what happens in jq is very fragmented), more error-prone (this is what brought you here), and a security risk in principle (in a complex chain of dependencies you might not have full control over what ends up being executed).

Therefore, try to change your approach so that the only variable part is the input data, while the code is a literal, invariable string written so that it can react to the varying data.

Given your sample (I presume the code above is just a small snippet relevant to the actual question, so take this as a hint rather than a general solution), you are trying to reduce an input stream of JSON objects from file.json by comparing the numeric content of their words field to a list of numbers stored in the bash array wordDupArray. More specifically, given a single input object, you want to pass it through to the output if .words holds a number that is not present in the list, or else drop it if the number is present. Let's implement that.
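
As a minimal sketch of that idea for a single value (the sample input is made up, and $w is just an illustrative variable name): the filter text stays constant, and the number is handed to jq as data through --argjson.

```shell
#!/usr/bin/env bash
# Illustrative only: sample input and variable names are made up.
sample='{"words":6}
{"words":7}'

word=6   # the varying data

# The jq program is a fixed, single-quoted string; bash never edits it.
# --argjson binds the JSON value 6 to the jq variable $w.
kept=$(jq -c --argjson w "$word" 'select(.words != $w)' <<< "$sample")

printf '%s\n' "$kept"   # {"words":7}
```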

If jq is given a stream of objects, it will process them one by one, so breaking down the input stream to a single object already happens automatically. For the comparison part, jq needs to be given the list of numbers from the bash array. As jq is a JSON processor, it'd be best to provide the list as a JSON array, thus the task at hand is to convert the bash array into a JSON array.

There are many ways to accomplish this. As the array contains only numbers, you can take advantage of the fact that a number by itself is already a valid JSON document. One approach is to have another jq call take a stream of numbers and output them as a JSON-encoded array using the --slurp (or -s) option. Back in bash, store that output in a variable and provide it to the actual jq call using the --argjson option, which lets you access the JSON array as a variable inside jq.

wordDupArray=(6 1173 1 8 9 27)                   # dummy init of your bash array
jsonarray="$(jq -sc . <<< "${wordDupArray[@]}")" # will contain "[6,1173,1,8,9,27]"
jq --argjson list "$jsonarray" ' … jq filter using the array in $list … ' file.json
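
A slightly more defensive sketch of the conversion step, assuming the array holds only numbers as in the question: it prints one element per line with printf and passes an explicit . filter, so the result does not depend on how the here-string joins the elements or on jq's default filter.

```shell
#!/usr/bin/env bash
wordDupArray=(6 1173 1 8 9 27)   # dummy values, as above

# One number per line; each line is a valid JSON document on its own.
# -s (--slurp) collects all documents into one array, -c prints it compactly.
jsonarray=$(printf '%s\n' "${wordDupArray[@]}" | jq -sc .)

printf '%s\n' "$jsonarray"       # [6,1173,1,8,9,27]
```

If an element is not valid JSON (an unquoted word, for instance), this jq call fails with a parse error before the main filter ever runs.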

For the sake of variation, another way is to use the --slurpfile option, which by itself already combines a stream of JSON documents into a JSON array and similarly lets you access that array through a variable. The major difference is that it requires the documents to be provided as a file rather than as a JSON-encoded string. This can be mimicked with process substitution in bash:

wordDupArray=(6 1173 1 8 9 27)                   # dummy init of your bash array
jq --slurpfile list <(cat <<< "${wordDupArray[@]}") ' … using $list … ' file.json

For the main task, filtering the input objects from file.json according to a match in the $list array, you can check for inequality just as you did before, but now against the array's items $list[], and have the all function check whether the condition holds for every item (if all comparisons are true, none of the items matched).

jq --slurpfile list <(cat <<< "${wordDupArray[@]}") \
  'select([.words != $list[]] | all)' file.json
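
To see what select receives, the two stages can be run separately. This is a small sketch with a made-up three-item list and made-up one-field input objects, using --argjson so that it needs no extra file:

```shell
#!/usr/bin/env bash
list='[6,1173,1]'   # made-up exclusion list for this sketch

# Stage 1: one comparison result per list item.
cmp_hit=$(jq -c --argjson list "$list" '[.words != $list[]]' <<< '{"words":6}')
cmp_miss=$(jq -c --argjson list "$list" '[.words != $list[]]' <<< '{"words":7}')
printf '%s\n' "$cmp_hit"    # [false,true,true]  (6 is in the list)
printf '%s\n' "$cmp_miss"   # [true,true,true]   (7 is not)

# Stage 2: all collapses each array into the single boolean that select uses.
all_hit=$(jq --argjson list "$list" '[.words != $list[]] | all' <<< '{"words":6}')
all_miss=$(jq --argjson list "$list" '[.words != $list[]] | all' <<< '{"words":7}')
printf '%s %s\n' "$all_hit" "$all_miss"   # false true
```

An object is therefore kept only when every comparison is true, that is, when its .words value equals none of the list items.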


Again, for the sake of variation, you could also use the IN function which returns whether or not a given value appears in a given stream (not to be confused with the in function which is for checking keys in objects), and the not function to select the cases where a match could not be found.

jq --slurpfile list <(cat <<< "${wordDupArray[@]}") \
  'select(.words | IN($list[]) | not)' file.json
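
Putting the pieces together, here is one possible end-to-end sketch. The function name filter_words and the sample objects are hypothetical; it uses --argjson instead of --slurpfile and assumes a jq version that provides IN (1.6 or later).

```shell
#!/usr/bin/env bash
# filter_words FILE NUMBER...
# Hypothetical helper: prints the objects in FILE whose .words value
# is not among the given numbers.
filter_words() {
    local input=$1
    shift
    local list
    # Turn the remaining arguments into a JSON array, e.g. [6,1173,1].
    list=$(printf '%s\n' "$@" | jq -sc .) || return
    # The filter is a fixed string; only the data bound to $list varies.
    jq -c --argjson list "$list" 'select(.words | IN($list[]) | not)' "$input"
}

# Made-up sample data standing in for file.json.
tmp=$(mktemp)
printf '%s\n' \
    '{"url":"a","words":6}' \
    '{"url":"b","words":42}' \
    '{"url":"c","words":27}' > "$tmp"

wordDupArray=(6 1173 1 8 9 27)
result=$(filter_words "$tmp" "${wordDupArray[@]}")
printf '%s\n' "$result"   # {"url":"b","words":42}
rm -f "$tmp"
```

Because the jq program never changes, a malformed element in the array shows up as a JSON parse error in the conversion step rather than as a compile error in the filter.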


All in all, these solutions are more stable and robust because the code is invariable and self-contained, and more comprehensible because contiguous code is easier to follow. Even in the case of a failure, you can expect more helpful error messages than the generic "compile error", which is even harder to trace when the code actually executed is unknown because it changes from run to run.
