AWK sub function syntax

I have a file with the contents:

aaa.bbb.ccc ddd.eee.fff.ggg h.i.j.k

If I use the code:

awk '{sub(/\.$/, ""); print $1}' test.txt
Returns: aaa.bbb.ccc

awk '{sub(/\.$/, ""); print $3}' test.txt
Returns: h.i.j.k

I understand the sub function is used as: sub(regexp, replacement, target)

I don't understand this part, .$/, from the sub function. What is the .$?

Thanks

UPDATE

Ok, I like your way of explaining things - thank you!

If I apply this to a real example,

/usr/bin/host 172.0.0.10

10.0.0.172.in-addr.arpa domain name pointer hostname.domain.com.

  1. /usr/bin/host 172.0.0.10 | /bin/awk '{sub(/.$/, ""); print $5}' gives: hostname.domain.com

  2. /usr/bin/host 172.0.0.10 | /bin/awk '{sub(/.$/, ""); print $1}' gives: 10.0.0.172.in-addr.arpa

- The sub function will match at the end of the line, as there is a "."
- What is the "" doing?
- I don't understand how awk is splitting things into columns.

sub(/regexp/, replacement, target)
sub(/\.$/, replacement, target)

Your regexp is \.$ , not .$/

\ is the escape character. It escapes the character that follows it, stripping that character of its regex meaning so that it is treated literally.

. in a regex matches any single character, unless it is escaped by \ as in your example, in which case it matches only the dot character itself.

$ simply means the end of the line (more precisely, the end of the string being searched, which is the whole line by default).

Putting this together, \.$ is an escaped dot at the end of the line. This would match, for example, any line that ends in a period.

In your example, the sub doesn't substitute anything because there is no . at the end of the line (your input ends with .k). So your first awk just prints the 1st column, and the other one prints the 3rd column.
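To make the difference between the escaped and the unescaped dot concrete, here is a minimal sketch that runs both patterns on the sample line from the question. The variable names `line`, `escaped` and `unescaped` are only illustrative, and the input is fed through printf instead of test.txt so the snippet is self-contained.

```shell
# Sample line from the question (it ends in "k", not in a dot).
line='aaa.bbb.ccc ddd.eee.fff.ggg h.i.j.k'

# Escaped dot: only a literal "." at the end of the line would be removed.
# There is none, so the line comes back unchanged.
escaped=$(printf '%s\n' "$line" | awk '{sub(/\.$/, ""); print}')

# Unescaped dot: whatever the last character is gets removed, here the "k".
unescaped=$(printf '%s\n' "$line" | awk '{sub(/.$/, ""); print}')

printf '%s\n%s\n' "$escaped" "$unescaped"
```

The first line printed is identical to the input; the second has lost its final character and ends in `h.i.j.`.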

Update

For your updated question.

By default, awk splits each line into columns on whitespace. So in your input, the columns are like this:

 10.0.0.172.in-addr.arpa domain name pointer hostname.domain.com.
|----------$1-----------|--$2--|-$3-|--$4---|----------$5--------|

In your sub command, awk finds the dot at the end of the line and replaces it with "", which is the empty string (i.e. it just deletes it).

So your 1st command, {sub(/.$/, ""); print $5}, prints the 5th column, which is hostname.domain.com. with the . at the end replaced by nothing (deleted). It's worth noting that in this regex you no longer escape the . , so the pattern matches any character at the end of the line and deletes it (it just happens to be a . in your input).

Your other command, {sub(/.$/, ""); print $1}, deletes the character at the very end of the line and then just prints the first column, 10.0.0.172.in-addr.arpa
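As a rough illustration of the two points above (whitespace splitting, and sub acting on the whole line before the column is printed), the sketch below uses a line modeled on the host output in the question instead of calling host itself. The variable names are assumptions made for the example.

```shell
# A line modeled on the host output quoted in the question.
line='10.0.0.172.in-addr.arpa domain name pointer hostname.domain.com.'

# NF is awk's built-in count of fields on the current line.
nf=$(printf '%s\n' "$line" | awk '{print NF}')

# sub() with no target edits $0; awk then re-splits the line,
# so $5 is printed without the final character.
host=$(printf '%s\n' "$line" | awk '{sub(/.$/, ""); print $5}')

# $1 is untouched by the edit, since the change was at the end of the line.
first=$(printf '%s\n' "$line" | awk '{sub(/.$/, ""); print $1}')

printf '%s fields\n%s\n%s\n' "$nf" "$host" "$first"
```

This prints a field count of 5, then `hostname.domain.com` and `10.0.0.172.in-addr.arpa`.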

You can also set custom column separators in awk. I recommend you read some introductions and tutorials on awk to get a better understanding of how it works, e.g. a simple awk tutorial.
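For instance, a custom separator can be given with the -F option. The sketch below is only an illustration: it assumes you want to split the hostname from the example on dots, and the variable names are made up for the purpose.

```shell
# Hostname from the example, without the trailing dot.
name='hostname.domain.com'

# -F. makes "." the field separator instead of whitespace.
first=$(printf '%s\n' "$name" | awk -F. '{print $1}')

# With "." as the separator, the name consists of three fields.
count=$(printf '%s\n' "$name" | awk -F. '{print NF}')

printf '%s %s\n' "$first" "$count"
```

Here $1 is `hostname` and NF is 3.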

sub(regexp, replacement, target)

So here we used the regex \.$ , which matches a dot at the end. In sub(/\.$/, "") we didn't give a target, so it defaults to $0, i.e. the whole line. If you specify a target, the trailing dot is removed only from that particular column.

awk '{sub(/\.$/, ""); print $1}' test.txt

Removes a dot only if it is present at the very end of the line, and prints only column 1. If there is no dot at the end, no replacement occurs.

awk '{sub(/\.$/, ""); print $3}' test.txt

Removes the dot at the end of the line and prints only column 3. Because there is no dot at the end, it returns the third (and last) column as it is.

Example:

$ cat file
aaa.bbb.ccc. ddd.eee.fff.ggg h.i.j.k.
$ awk '{sub(/\.$/, ""); print $1}' file
aaa.bbb.ccc.
$ awk '{sub(/\.$/, ""); print $3}' file
h.i.j.k
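The example above leaves the dot on the first column because sub worked on the whole line. As a sketch of the target argument mentioned earlier, passing $1 as the third argument restricts the substitution to the first column; the variable names here are illustrative only.

```shell
# Same contents as the example file above.
line='aaa.bbb.ccc. ddd.eee.fff.ggg h.i.j.k.'

# Third argument $1: only the first field is searched and modified.
first=$(printf '%s\n' "$line" | awk '{sub(/\.$/, "", $1); print $1}')

# The rest of the line is untouched, so its final dot is still there.
whole=$(printf '%s\n' "$line" | awk '{sub(/\.$/, "", $1); print}')

printf '%s\n%s\n' "$first" "$whole"
```

The first column now prints as `aaa.bbb.ccc`, while the full line still ends in `h.i.j.k.`.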

I had one table with this format

<table width="700" border="1" align="center" cellpadding="0" cellspacing="0" bordercolor="ffcc00" bgcolor="ffcc00">
<tbody>
        <th colspan="7" bordercolor="ffcc00" bgcolor="000000" scope="col">
            <div align="center" class="style2">
                Exciter Power Supply</div>
        </th>
    </tr>
    <tr>
        <th width="175" bordercolor="ffcc00" bgcolor="000000" scope="col">
            <div align="center" class="style1">+ 3 V </div>
        </th>
        <th width="175" bordercolor="ffcc00" bgcolor="000000" scope="col">
            <div align="center" class="style1">
                OK</div>
        </th>
        <th width="175" bordercolor="ffcc00" bgcolor="000000" scope="col">
            <div align="center" class="style1">&nbsp;+ 5 V</div>
        </th>
        <th width="175" bordercolor="ffcc00" bgcolor="000000" scope="col">
            <div align="center" class="style1">
                OK</div>
        </th>
    </tr>
    
</tbody>

When I get the value of + 3 V

curl -s http://my-site/index.htm | sed -e 's/<[^>]*>//g' | awk '/\+ 3 V/{getline;  print}'

I got the output OK&nbsp; + 5 V

To remove the blank space and the text of the other field, I use sub() to replace the characters, plus tr to remove the leftover characters

curl -s http://my-site/index.htm | sed -e 's/<[^>]*>//g' | awk '/\+ 3 V/{getline; sub(/\+ 5 V/, ""); print}' | tr "&nbsp;" " "

My output is only OK
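One caveat: tr works on individual characters rather than on the string &nbsp;, so any n, b, s or p elsewhere in the output would be affected too. A possible alternative, sketched here on a made-up sample line rather than on the real page, is to remove the entity inside awk with gsub.

```shell
# Hypothetical line, as it might look after sed has stripped the tags.
line='OK&nbsp;+ 5 V'

# sub() drops the unwanted "+ 5 V" text ("+" is escaped so it is literal);
# gsub() then deletes every "&nbsp;" as a whole string.
result=$(printf '%s\n' "$line" | awk '{sub(/\+ 5 V/, ""); gsub(/&nbsp;/, ""); print}')

printf '%s\n' "$result"
```

On this sample the result is just `OK`, with no trailing spaces left behind.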
