Optimizing multiple sed commands in a shell script

Question

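I have a folder containing many text files with JSON content. Using jq, I can extract the "commodities" array and write it to a file. "commodities-output.txt" is a temporary file that, besides the string values from the arrays, also contains the brackets "[" and "]" and "null" values. I want to remove the square brackets and the "null" values and end up with the unique string values in a text file. Is there a way to optimize the sed commands so that I don't have to create a temporary text file such as "commodities-output.txt", and instead have a single output file containing all the string values I need, unique and sorted (sorting is optional)?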

F=foldername
for entry in $F*.json
do
  echo "processing $entry"
  jq '.[].commodities' $entry >> commodities-output.txt
done
sed '/[][]/d' commodities-output.txt | sed '/null/d' commodities-output.txt | sort commodities-output.txt | uniq >> commodities.txt

echo "processing complete!"

Answer 1

You can easily do all of this in jq.

files=( "$F"*.json )
echo "$0: processing ${files[0]}" >&2
jq '.[] | select(.commodities != [] and .commodities != null) | .commodities' "${files[0]}"

I refactored this to use a Bash array to get the first matching file.
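As a rough sketch of how far jq alone can take this, the filter below also unpacks the arrays so no brackets are left for sed to remove. It assumes each file holds an array of objects whose `commodities` value is either an array of strings or null; that shape is inferred from the question, not stated in it. Unlike the snippet above, it passes every matching file to jq. The sample files and the scratch directory are hypothetical.

```shell
# Hypothetical sample data in a scratch directory, for illustration only.
F=$(mktemp -d)/
printf '%s\n' '[{"commodities":["wheat","corn"]},{"commodities":null},{"commodities":[]}]' > "${F}a.json"
printf '%s\n' '[{"commodities":["corn","rice"]},{"name":"no commodities key"}]' > "${F}b.json"

# -r prints raw strings without quotes. The first select() drops a null
# (or missing) commodities value, .[] unpacks each array, and the second
# select() drops any null elements inside the arrays.
jq -r '.[].commodities | select(. != null) | .[] | select(. != null)' "$F"*.json |
sort -u > "${F}commodities.txt"

cat "${F}commodities.txt"
```

Because jq emits one bare string per line here, `sort -u` is the only other tool needed and sed drops out of the pipeline entirely.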

If for some reason you cannot refactor your code to run entirely in jq, you should definitely prefer a pipeline over a temporary file.

for entry in "$F"*.json
do
  echo "$0: processing $entry" >&2
  jq '.[].commodities' "$entry"
  break
done |
sed -e '/[][]/d' -e '/null/d' |
sort -u > commodities.txt

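Note also how we take care to print the progress diagnostics to standard error (>&2) and to include the script's name in the diagnostic message. That way, when you have scripts running scripts running scripts, you can see which one needs your attention.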
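To see why sending diagnostics to standard error matters in a pipeline, here is a small sketch that needs no JSON files. The function `emit_sample` is a hypothetical stand-in for `jq '.[].commodities' "$entry"`, and the file names are made up; nothing is read from disk.

```shell
# emit_sample stands in for jq's pretty-printed output (hypothetical data).
emit_sample() {
  printf '%s\n' '[' '  "wheat",' '  "corn"' ']' 'null'
}

run_pipeline() {
  for entry in first.json second.json
  do
    # Diagnostics go to stderr, so they bypass the pipe below.
    echo "$0: processing $entry" >&2
    emit_sample
  done |
  sed -e '/[][]/d' -e '/null/d' |
  sort -u
}

run_pipeline
```

Only the data lines reach sed and sort; the "processing" messages still show up on the terminal. The surviving lines keep the indentation, quotes, and trailing commas that jq's pretty-printer emits, so the same string can appear both with and without a comma.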

Answer 2

...
# write to target file, no temp needed
jq '.[].commodities' "$entry" >> commodities.txt
...
# Read the file with the first sed command and pipe its output to the next sed command (which reads stdin), and so on
# sort has a -u flag that does the same job as uniq, so no separate command is needed
# Write the result back with sort -o rather than "> commodities.txt": the redirect would typically
# truncate the file before sed has read it, whereas GNU sort opens its -o file only after reading all input
sed '/[][]/d' commodities.txt | sed '/null/d' | sort -u -o commodities.txt
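One caveat when a pipeline reads a file and also redirects its output to that same file with `>`: the shell typically truncates the file before the first command gets to read it, so the result tends to be empty. The sketch below shows one way around that, assuming GNU sort, which opens the file named by `-o` only after it has read all of its input. The scratch file `target` and its contents are hypothetical.

```shell
# Hypothetical scratch file standing in for commodities.txt.
target=$(mktemp)
printf '%s\n' '[' '  "wheat"' ']' 'null' '[' '  "corn"' ']' > "$target"

# Risky: "> $target" is usually opened (and emptied) before sed reads it.
# sed -e '/[][]/d' -e '/null/d' "$target" | sort -u > "$target"

# Safer with GNU sort: -o writes the file only after the input is exhausted.
sed -e '/[][]/d' -e '/null/d' "$target" | sort -u -o "$target"

cat "$target"
```

If portability beyond GNU sort matters, writing to a differently named output file, or piping the loop's output straight into sed, sidesteps the question.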

Answer 1: score 1, accepted, 2022-08-29 13:13:25

Answer 2: score -1, 2022-08-29 10:58:26
