How to use awk to do number comparisons and create a list - using Awk on macOS with CRLF line endings

I'm trying to list the values from my knife command's output that are greater than a given threshold. I'm trying to do this with awk; I've been researching examples and came up with the command below. However, it does not produce the output I expect.

For example, with this command I get the following output:

knife ssh -x foobar -a ec2.local_ipv4 "chef_environment:prod AND roles:db_cluster AND AND ipaddress:10.1.*" 'netstat -na | grep EST | wc -l'

Output:

10.1.3.129 2273
10.1.3.130 2533
10.1.3.131 1981
10.1.2.133 1965

Now I want to use awk to keep only the values (2nd column, without the IPs) that are greater than 2000.

I tried the following awk statement, but to no avail:

knife ssh -x foobar -a ec2.local_ipv4 "chef_environment:prod AND roles:db_cluster AND AND ipaddress:10.1.*" 'netstat -na | grep EST | wc -l' \
| awk '{if ($2 > 2000) print $2; else echo "Nothing to print"}`

Output:

10.1.3.129 2273
10.1.3.130 2533
10.1.3.131 1981
10.1.2.133 1965

Expected output:

2273
2533

tl;dr

The simplest approach is to remove \r instances from the output before passing it to awk:

knife ... | tr -d '\r' | awk ...

This assumes that \r instances only occur as part of \r\n pairs that designate line endings, which is generally the case.
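Applied to the filtering goal from the question, this approach might look like the sketch below. The `sample` function is a hypothetical stand-in for the real knife output (assumed here so the example is reproducible); the threshold of 2000 comes from the question.

```shell
# Hypothetical stand-in for the knife output: CRLF-terminated lines.
sample() {
  printf '10.1.3.129 2273\r\n10.1.3.130 2533\r\n10.1.3.131 1981\r\n'
}

# Strip every CR, then print only the counts (2nd field) above 2000.
result=$(sample | tr -d '\r' | awk '$2 > 2000 { print $2 }')
printf '%s\n' "$result"
```

With the CRs gone, awk sees the 2nd field as a plain number, so the comparison is numeric.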


From your comments, we now know that your input has Windows-style CRLF (\r\n) line endings and that you're on macOS Sierra (10.12).

That said, your sample output is inconsistent with the awk command in your question.

Leaving that issue aside, there are two basic approaches:

  • (a) Translate \r\n (CRLF) sequences to just \n (LF) first.

  • (b) Work around the issue by modifying Awk's input-record separator.


The following examples use simplified input and a simplified command to focus on the core issue:

  • printf '10.1.3.129 2273\r\n10.1.3.130 2533\r\n' is used to produce 2 CRLF-terminated (\r\n-terminated) input lines containing 2 whitespace-separated fields each.

  • awk '{ print $2 }' | cat -e - or a variation thereof - prints the 2nd whitespace-separated field from each line using awk, and cat -e is used to visualize control characters in the output: $ represents a \n (LF) char (the end of the line in Unix terms), and other control characters are visualized as ^<letter>, i.e., in caret notation; therefore, \r (CR) is represented as ^M.

    • By default, the \r would be included in the output, because awk doesn't consider it whitespace (by which the lines are split into fields) - which is clearly undesired. The output would look as follows, where ^M indicates the unwanted inclusion of \r:

       2273^M$
       2533^M$

    • With an effective solution, the \r would not be included in the output, and the output would look as follows (note the absence of ^M):

       2273$
       2533$
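The unwanted ^M case described above can be reproduced with the simplified input and command, with no CR handling at all. This is only an illustrative sketch; the variable name `raw` is an assumption made for the example.

```shell
# Without any CR handling, the CR stays attached to the 2nd field,
# and cat -e makes it visible as ^M before the end-of-line marker $.
raw=$(printf '10.1.3.129 2273\r\n10.1.3.130 2533\r\n' | awk '{ print $2 }' | cat -e)
printf '%s\n' "$raw"
```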

Solutions based on approach (a):

Most typically, the dos2unix utility is used to translate Windows-style line breaks to Unix-style ones, but that utility doesn't come with macOS.
It's easy to install it via Homebrew, however.
Then use knife ... | dos2unix | awk ...
(Alternatively, send the output to a file first and update that file in-place before further processing: dos2unix file.)

Alternatively, brought to you by the Shameless Self-Promotion Department, you can install my nws CLI; if you have Node.js installed, install it by simply running [sudo] npm install -g nws-cli and then use knife ... | nws --lf | awk ...
(Alternatively, send the output to a file first and update that file in-place before further processing: nws --lf -i file; nws can also translate from LF to CRLF and offers other whitespace-related functions.)

There are also fairly simple ways to use stock macOS utilities - see this answer of mine.

The simplest solution with stock utilities is to use tr to blindly remove any \r instances:

$ printf '10.1.3.129 2273\r\n10.1.3.130 2533\r\n' |
    tr -d '\r' | awk '{ print $2 }' | cat -e
2273$
2533$
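If deleting every \r is too broad, for instance because a CR in the middle of a line should be preserved, one variant is to strip only a trailing CR inside awk itself. This is a sketch and not part of the original answer; it relies on awk re-splitting the fields after $0 is modified, and the variable name `trimmed` is an assumption for illustration.

```shell
# Remove a CR only at the very end of each record, then print the 2nd field.
# sub() without a third argument operates on $0, which re-splits the fields.
trimmed=$(printf '10.1.3.129 2273\r\n10.1.3.130 2533\r\n' |
  awk '{ sub(/\r$/, ""); print $2 }' | cat -e)
printf '%s\n' "$trimmed"
```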

Solution based on approach (b):

$ printf '10.1.3.129 2273\r\n10.1.3.130 2533\r\n' |
    awk -v RS='\r' 'NF {print $2}' | cat -e
2273$
2533$

Note how -v RS='\r' defines \r as RS, the input-record separator, which means that it is automatically excluded from each record (line) that awk reads and splits into fields.

NF, placed as a condition before the action ({...}), is necessary to eliminate the empty line that results from reading the final \n as a separate record.

  • This could be avoided if we could define RS as \r\n, but, sadly, the BSD Awk on macOS doesn't support multi-character input-record separators (in line with the POSIX spec.).
    Via Homebrew, however, you could install GNU Awk, which does support such separators, which would simplify the command to:
    gawk -v RS='\r\n' '{print $2}'
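Combining approach (b) with the threshold filter from the question might look like the following sketch. The sample data and the variable name `filtered` are assumptions for illustration; the threshold of 2000 comes from the question.

```shell
# Treat CR as the record separator; the leading LF of each later record is
# ignored by default field splitting, and NF skips the final LF-only record.
filtered=$(printf '10.1.3.129 2273\r\n10.1.3.130 2533\r\n10.1.3.131 1981\r\n' |
  awk -v RS='\r' 'NF && $2 > 2000 { print $2 }')
printf '%s\n' "$filtered"
```

Because only a single-character RS is used, this form should work with the stock BSD Awk as well as with GNU Awk.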
