
Bash: reading regexes from a file and substituting them into sed inline as a variable

I am stuck on how sed interacts with variables. I am reading a list of regexes from a file and substituting each one into sed to mask certain sensitive information in a log file. If I hard-code the regex, sed works perfectly, but it behaves differently when the regex comes from a variable.

con-list.txt contains the following:
(HTTP\/)(.{2})(.*?)(.{2})(group\.com)
(end\sretrieve\sfacility\s)(.{2})(.*?)(.{3})$

I am not sure whether the dollar sign in the regex is interfering with the sed command.

input="/c/Users/con-list.txt"
inputfiles="/c/Users/test.log"
echo $inputfiles
while IFS= read -r var
do
  #echo "Searching $var"
  count1=`zgrep -E "$var" "$inputfiles" | wc -l`
  if [ ${count1} -ne 0 ] 
  then
    echo "total:${count1} ::: ${var}"
    sed -r -i "s|'[$]var'|'\1\2XXXX\4\5'|g" $inputfiles #this doesnt work
    sed -r -i "s/(HTTP\/)(.{2})(.*?)(.{2})(group\.com)/'\1\2XXXX\4\5'/g"     $inputfiles #This works
    egrep -in "${var}" $inputfiles
  fi
done < "$input"

I need sed to accept the regex as a variable read from the file, so that I can automate the masking of sensitive information in logs.

$ ./zgrep2.sh
/c/Users/test.log
total:4 ::: (HTTP\/)(.{2})(.*?)(.{2})(group\.comp\.com\@GROUP\.COM)
sed: -e expression #1, char 30: invalid reference \5 on `s' command's RHS

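Your idea was right, but $var is never expanded in your sed command. The shell expands it only when it appears as plain $var inside double quotes. Written as [$]var, it reaches sed unchanged as a bracket expression matching a literal dollar sign followed by the text var, so the pattern has no capture groups and \5 is an invalid reference.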
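A small sketch of why the original command fails, assuming a POSIX-style shell and a sed that accepts -E (equivalent to -r in GNU sed). The variable names var, broken and status are only for illustration, and (.*) is written in place of (.*?) because POSIX ERE has no lazy quantifier. Printing the script string shows what sed actually receives: the text [$]var passes through untouched, and the regex stored in var never appears.

```shell
#!/usr/bin/env bash
# Illustrative pattern only; it is deliberately never used below.
var='(HTTP\/)(.{2})(.*)(.{2})(group\.com)'

# Same quoting as the failing command. "$]" is not a valid expansion,
# so the shell leaves the dollar sign alone and $var is not substituted.
broken="s|'[$]var'|'\1\2XXXX\4\5'|g"
printf '%s\n' "$broken"

# sed sees a pattern without capture groups, so the back-references
# in the replacement are rejected and sed exits with an error.
if printf 'x\n' | sed -E "$broken" 2>/dev/null; then
  status="accepted"
else
  status="rejected"
fi
echo "$status"
```

The first line printed is the sed script exactly as typed, which confirms that no expansion took place.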

Also, you don't need wc -l to count matches. The utilities in the grep family all implement a -c flag that returns the number of matching lines. That said, you don't even need the count; you can simply use the command's exit status (whether or not a match was found):

if zgrep -qE "$var" "$inputfiles" ; then
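As a rough illustration of both options, the sketch below counts matching lines with -c and then branches on the exit status with -q. It uses plain grep on a temporary file so it runs without compressed logs; the sample lines and the names log, count and found are invented for the demo, and the slash in the pattern is left unescaped because grep does not need it escaped.

```shell
#!/usr/bin/env bash
# Hypothetical sample log, created only for this sketch.
log=$(mktemp)
printf '%s\n' \
  'Got creds for HTTP/PPCKSAPOD81.group.com' \
  'nothing sensitive here' \
  'Got creds for HTTP/PPCKSAPOD21.group.com' > "$log"

var='(HTTP/)(.{2})(.*)(.{2})(group\.com)'

# -c prints the number of matching lines, replacing "grep | wc -l".
count=$(grep -cE "$var" "$log")
echo "matching lines: $count"

# -q prints nothing; only the exit status is used (0 means a match was found).
if grep -qE "$var" "$log"; then
  found=yes
else
  found=no
fi
echo "found: $found"

rm -f "$log"
```

If the count is not needed for logging, the -q form alone is enough to decide whether to run sed.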

Assuming you might need the count for debugging purposes, you can keep your approach with the script modified as below.

Notice how var is interpolated in the sed substitution: the rest of the sed script stays in single quotes so that it is passed literally, and only $var sits inside double quotes, where the shell expands it.

while IFS= read -r var
do
  count1=$(zgrep -Ec "$var" "$inputfiles")
  if [ "${count1}" -ne 0 ] 
  then
    sed -r -i 's|'"$var"'|\1\2XXXX\4\5|g' "$inputfiles"
    sed -r -i "s/(HTTP\/)(.{2})(.*?)(.{2})(group\.com)/'\1\2XXXX\4\5'/g" "$inputfiles"
    egrep -in "${var}" "$inputfiles"
  fi
done < "$input"
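To see what the mixed quoting in the sed line produces, the sketch below builds the same string into a variable and prints it before handing it to sed. The names script and masked and the sample input line are assumptions made for the demo; (.*) is written instead of (.*?) because POSIX ERE has no lazy quantifier, and -E is used as the portable spelling of -r.

```shell
#!/usr/bin/env bash
var='(HTTP\/)(.{2})(.*)(.{2})(group\.com)'

# Three adjacent pieces are joined by the shell into one word:
#   's|'                   literal, single-quoted
#   "$var"                 expanded, double-quoted
#   '|\1\2XXXX\4\5|g'      literal, single-quoted
script='s|'"$var"'|\1\2XXXX\4\5|g'
printf '%s\n' "$script"
# prints: s|(HTTP\/)(.{2})(.*)(.{2})(group\.com)|\1\2XXXX\4\5|g

# The assembled script now contains five capture groups, so \5 is valid.
masked=$(printf '%s\n' 'Got creds for HTTP/PPCKSAPOD81.group.com' | sed -E "$script")
printf '%s\n' "$masked"
```

Because the pattern is spliced into the script as text, a regex that contains the chosen delimiter (here |) would break the command, so the delimiter has to be a character the patterns never use.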

You need:

sed -r -i "s/$var"'/\1\2XXXX\4\5/g' $inputfiles

You also need to provide sample input (a useful portion of the log file) so that we can verify our solutions.

EDIT: with a slight change to $var, I think this is what you want:

$ cat ~/tmp/j
Got creds for HTTP/PPCKSAPOD81.group.com
Got creds for HTTP/PPCKSAPOD21.group.com
Got creds for HTTP/PPCKSAPOD91.group.com
Got creds for HTTP/PPCKSWAOD81.group.com
Got creds for HTTP/PPCKSDBOD81.group.com
Got creds for HTTP/PPCKSKAOD81.group.com
$ echo $var
(HTTP\/)(.{2})(.*?)(.{2})(.group\.com)
$ sed -r "s/$var"'/\1\2XXXX\4\5/' ~/tmp/j 
Got creds for HTTP/PPXXXX81.group.com
Got creds for HTTP/PPXXXX21.group.com
Got creds for HTTP/PPXXXX91.group.com
Got creds for HTTP/PPXXXX81.group.com
Got creds for HTTP/PPXXXX81.group.com
Got creds for HTTP/PPXXXX81.group.com
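Putting the pieces together, here is a minimal end-to-end sketch rather than a drop-in script. Its assumptions: a sed that accepts -E; a temporary directory, file contents and a .masked suffix that are invented for the demo; a temporary file plus mv instead of -i, since -i takes different arguments across sed implementations; and (.*) written where the patterns above have (.*?).

```shell
#!/usr/bin/env bash
# Hypothetical working files, created only for this sketch.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
patterns="$workdir/con-list.txt"
log="$workdir/test.log"

printf '%s\n' '(HTTP\/)(.{2})(.*)(.{2})(.group\.com)' > "$patterns"
printf '%s\n' \
  'Got creds for HTTP/PPCKSAPOD81.group.com' \
  'unrelated line' > "$log"

# Read one regex per line; -r keeps the backslashes in the pattern intact.
while IFS= read -r var; do
  [ -n "$var" ] || continue   # skip empty lines in the pattern file
  # "$var" is expanded by the shell; the replacement stays single-quoted.
  sed -E "s/$var"'/\1\2XXXX\4\5/g' "$log" > "$log.masked" &&
    mv "$log.masked" "$log"
done < "$patterns"

cat "$log"
```

With this sample data the first line becomes Got creds for HTTP/PPXXXX81.group.com and the unrelated line is left untouched. Every pattern in the file must define at least five capture groups, otherwise sed reports an invalid reference.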
