BASH: How to use a variable as regex in AWK

I have spent hours on Awk tutorials but I cannot get around this one: I want to use a variable as a regex in an awk query. Here is an example of what I want to achieve:

#!/bin/bash
#My test array
testarray=(teststring[1078] teststringthatshouldnotmatch teststring[5845])

#myregex as a variable
regex="teststring\[.*"

#the awk
for value in ${testarray[*]}
do
echo ${value} | awk '{if ($1 ~ regex) print}'
done

I would expect Awk to match test strings 1 and 3, but it matches all of them. Thanks for any light on this one.

When using a string in a regexp context you need to escape twice anything you want escaped. Always quote your shell variables, there's no need to call match(), you should put the condition in the condition section of the awk script rather than inside an if in the action part, and there's no need for an explicit print. Also, .* means zero or more repetitions of any character, so it can match zero characters and adds nothing useful to your regexp. All you need is:

regex='teststring\\['
...
awk -v test="$regex" '$1~test'

Look:

$ cat tst.sh
#!/bin/bash
#My test array
testarray=(teststring[1078] teststringthatshouldnotmatch teststring[5845])

#myregex as a variable
regex='teststring\\['

#the awk
for value in "${testarray[@]}"
do
    echo "$value" | awk -v test="$regex" '$1 ~ test'
done
$
$ ./tst.sh
teststring[1078]
teststring[5845]
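
To see why the backslash is doubled, here is a minimal sketch that prints what awk actually stores after `-v` has processed escape sequences in the assignment. The helper name `show_awk_value` is hypothetical and exists only for this illustration; the behavior shown assumes a POSIX-conforming awk.

```shell
# show_awk_value (hypothetical helper) prints the value awk holds after
# -v has interpreted escape sequences in the assigned string.
show_awk_value() {
    awk -v test="$1" 'BEGIN { print test }'
}

# The shell passes two backslashes; awk collapses them into one.
show_awk_value 'teststring\\['   # prints teststring\[

# The remaining single backslash then escapes the [ when the string is used as a regex.
printf '%s\n' 'teststring[1078]' 'teststringthatshouldnotmatch' |
    awk -v test='teststring\\[' '$1 ~ test'
```

The second command should print only the line containing the literal bracket, since the regex awk ends up with is `teststring\[`.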

I found a way in the end: the awk command should be written like this to allow a variable to be used (the variable needs to be re-declared with -v):

awk -v test=$regex '{if (match($1, test)) {print}}'

Maybe there is a better way, but this one does the trick :)

EDIT AFTER SEEING THE ANSWERS: Thanks, I will update my code.

The answer to the seemingly strange behavior of awk is quite simple.

Shell variables are not awk variables.

While the shell variable regex holds the string you assigned to it, the awk variable regex is still the empty string, which matches any string.
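
A minimal sketch of that behavior, assuming any POSIX awk: the shell variable is set but never passed to awk, so inside awk `regex` is an uninitialized variable, which acts as an empty regex.

```shell
# This variable exists only in the shell; awk has its own, unset variable named regex.
regex='teststring\['

printf '%s\n' 'teststringthatshouldnotmatch' | awk '$1 ~ regex'
# prints teststringthatshouldnotmatch, because the empty regex matches any string
```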

Shell variables are accessible via the ENVIRON array in awk.

When using this approach, remember that, as with any process started from the shell, only exported shell variables are copied into the environment of the child process.

So don't forget to export any variables you want to access via ENVIRON.

To make your script work, change $1 ~ regex to $1 ~ ENVIRON["regex"].

You may also assign the shell variable regex to the awk variable regex on the command line using the -v switch. In this case awk interprets escape sequences in the assigned value, so any backslash has to be doubled; the ENVIRON solution mentioned above may therefore be the more elegant one.
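
The ENVIRON approach described in this answer can be sketched as follows. The helper name `match_env` is hypothetical, and the test strings are passed as quoted arguments rather than through the array from the question. Note that the regex keeps its single backslash, because values read from ENVIRON are not subject to the escape processing that `-v` applies.

```shell
# Export the variable so it is copied into the environment of the awk process.
export regex='teststring\['

# match_env (hypothetical helper) prints each argument on its own line and
# lets awk keep the lines whose first field matches the exported regex.
match_env() {
    printf '%s\n' "$@" | awk '$1 ~ ENVIRON["regex"]'
}

match_env 'teststring[1078]' 'teststringthatshouldnotmatch' 'teststring[5845]'
```

This should print only the first and third strings. Without the `export`, ENVIRON["regex"] would be empty inside awk and every line would match again.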
