Optimize multiple sed commands in shell script
I have a folder containing many text files with JSON content. Using jq, I can extract the "commodities" array from each file and write it to a file. That file, "commodities-output.txt", is a temporary file that contains the brackets "[" and "]" and "null" values in addition to the string values from the array. I want to remove the square brackets and the "null" values and collect the unique string values in a text file. Is there a way to optimize the sed commands so that I don't have to create temporary files such as "commodities-output.txt", and end up with a single output file containing all the unique string values I need (sorting is optional)?
$F=foldername
for entry in $F*.json
do
echo "processing $entry"
jq '.[].commodities' $entry >> commodities-output.txt
done
sed '/[][]/d' commodities-output.txt | sed '/null/d' commodities-output.txt | sort commodities-output.txt | uniq >> commodities.txt
echo "processing complete!"
You can easily do all of this in jq.
files=( "$F"*.json )
echo "$0: processing ${files[0]}" >&2
jq '.[] | select(.commodities != [] and .commodities != null) | .commodities' "${files[0]}"
I refactored to use a Bash array to get the first of the matching files.
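For the jq-only route, here is a minimal sketch that also covers the deduplication and sorting steps. It assumes each file holds an array of objects whose "commodities" field is either an array of strings or null; the function name unique_commodities is hypothetical and not part of the original answer.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print every unique commodity string, sorted, one per line.
# Assumption: each input file is a JSON array of objects whose "commodities"
# field is an array of strings, null, or missing.
unique_commodities() {
    # -n plus "inputs" lets a single jq process read all files;
    # "[]?" skips null or missing arrays, "strings" keeps only string values,
    # "unique" sorts and deduplicates, and -r prints the strings without quotes.
    jq -rn '[inputs | .[].commodities[]? | strings] | unique | .[]' "$@"
}

# Usage: unique_commodities "$F"*.json > commodities.txt
```

Because jq emits the strings directly, no brackets or null lines reach the output, so neither sed nor a temporary file is needed in this variant.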
If for some reason you can't refactor your code to run entirely in jq, you definitely want to prefer pipes over temporary files.
for entry in $F*.json
do
echo "$0: processing $entry" >&2
jq '.[].commodities' "$entry"
break
done |
sed -e '/[][]/d' -e '/null/d' |
sort -u > commodities.txt
Notice also how we take care to print the progress diagnostics to standard error (>&2) and include the name of the script in the diagnostic message. That way, when you have scripts running scripts running scripts, you can see which one wants your attention.
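The separation of data and diagnostics can be tried out without jq. The sketch below uses cat as a stand-in for the jq call, and the function names emit_lines and clean_lines are hypothetical; only the sed expressions and sort -u come from the answer above.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the jq loop: data goes to stdout, progress to stderr.
emit_lines() {
    local entry
    for entry in "$@"; do
        echo "$0: processing $entry" >&2
        cat -- "$entry"    # assumption: replace with the real jq call
    done
}

# Drop lines containing a square bracket or "null", then deduplicate.
clean_lines() {
    sed -e '/[][]/d' -e '/null/d' | sort -u
}

# Usage: emit_lines "$F"*.json | clean_lines > commodities.txt
```

Since the progress messages travel on standard error, they still show up on the terminal but never end up in commodities.txt.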
...
# write to target file, no temp needed
jq '.[].commodities' $entry >> commodities.txt
...
# Read the file with the first sed command, then pipe its output to the next sed command (which reads stdin) and on to the following commands
# sort has a -u flag that does the same job as uniq, so you don't need a separate command
# At the end, overwrite the target file with the result from sort; use -o rather than "> commodities.txt", because the shell would truncate the file before sed gets to read it
sed '/[][]/d' commodities.txt | sed '/null/d' | sort -u -o commodities.txt
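One caveat when a pipeline reads from and writes to the same file: a "> file" redirection is set up by the shell before any command in the pipeline runs, so the file is emptied before sed can read it. A minimal sketch of an in-place variant follows; the function name dedupe_in_place is hypothetical, and it relies on sort reading all of its input before opening the -o target, which GNU sort documents for non-merge sorts.

```shell
#!/usr/bin/env bash

# Hypothetical helper: filter and deduplicate a file in place, without a temp file.
dedupe_in_place() {
    local file=$1
    # sed reads the file and streams it into sort; sort only opens "$file"
    # for writing once it has consumed all of its input.
    sed -e '/[][]/d' -e '/null/d' "$file" | sort -u -o "$file"
}

# Usage: dedupe_in_place commodities.txt
```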