Regular expressions with the gawk command

Question

In Linux, I'm running the command

pmap -x $PID | tail -n 1

This gives me a line like the following:

total kB         168194812  870692  852296

I'm trying to extract the 2nd number (RSS) for use. I found this example that works on regex101.com:

/[^\d]*[\d]+[\s]+([\d]+)/

However, when I run it against my line of text, I don't get any output:

echo "total kB         168194812  870692  852296" | gawk 'match($0, /[^\d]*[\d]+[\s]+([\d]+)/, a) {print a[1]}'

I'm expecting it to print

870692

Answer 1

Like this:

$ pmap -x $PID | tail -n 1 | gawk 'match($0, /[^0-9]*[0-9]+\s+([0-9]+)/, a) {print a[1]}'
870692

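The expression \d is specific to Perl/PCRE-compatible regex flavours; some languages such as Python support it too. gawk does not, so use [0-9] (or [[:digit:]]) instead. gawk does support \s as a GNU extension, but only outside a bracket expression, which is why [\s] becomes \s above.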
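For an awk without gawk's three-argument match() (an assumption about the target system, not something the question states), the same pattern can be sketched with two sub() calls. The \d and \s shorthands are spelled out as bracket expressions, and the variable names line and rss_kb are made up for illustration.

```shell
line='total kB         168194812  870692  852296'

rss_kb=$(printf '%s\n' "$line" | awk '{
  # Remove the leading non-digits, the first number and the gap after it
  # (the [^\d]*[\d]+[\s]+ part of the original pattern).
  sub(/^[^0-9]*[0-9]+[ \t]+/, "")
  # Remove everything from the first remaining non-digit onwards.
  sub(/[^0-9].*$/, "")
  print
}')

echo "$rss_kb"
```

What is left after both substitutions is the second number on the line.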

You can simplify to:

awk '{print $4}'

Using grep:

grep -oP '\d+(?=\s+\d+$)'

Answer 2

What about just displaying the 4th field with

awk '{print $4}'

With your example

echo "total kB         168194812  870692  852296" | awk '{print $4}'

returns

870692
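If you would rather not depend on tail -n 1, a possible variant is to let awk pick the line itself. This is only a sketch: the two-line sample below is made up in the shape of pmap -x output, and sample and rss_kb are illustrative names.

```shell
# Made-up two-line sample in the shape of pmap -x output (header + total line).
sample='Address           Kbytes     RSS   Dirty Mode  Mapping
total kB         168194812  870692  852296'

# Print field 4 only on the line whose first two fields are "total" and "kB".
rss_kb=$(printf '%s\n' "$sample" | awk '$1 == "total" && $2 == "kB" { print $4 }')

echo "$rss_kb"
```

The header line is skipped because its first field is not total, so only the RSS value from the summary line is captured.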

Answer 3

With GNU grep and your shown sample, please try the following grep code.

echo "total kB         168194812  870692  852296" |
grep -oP '^total kB\s+\d+\s+\K\d+'

Explanation:

I am using the -oP options of GNU grep here: -o prints only the matched part of the line, and -P enables the PCRE regex flavour.
The regex is ^total kB\s+\d+\s+\K\d+ where:
^total kB\s+\d+\s+ matches total kB at the start of the line, followed by spaces, then digits, then spaces.
\K discards everything matched so far from the reported match. That text must still be present for the regex to match, but it is not printed.
\d+ then matches 1 or more digits, which is the required output.
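Where grep -P is not available (some grep builds lack PCRE support), roughly the same extraction can be sketched with sed, using a capture group in place of \K. Note that this swaps in a different technique from the answer above; line and rss_kb are illustrative names.

```shell
line='total kB         168194812  870692  852296'

# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded. \(...\) captures the second number and \1 emits it.
rss_kb=$(printf '%s\n' "$line" |
  sed -n 's/^total kB  *[0-9][0-9]*  *\([0-9][0-9]*\).*/\1/p')

echo "$rss_kb"
```

The prefix before the group plays the role of the text that \K throws away: it must match, but it does not appear in the output.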

Answer 4

If you want to use awk, you can match digits with [0-9] and non-digits with the negated version [^0-9].

As you output a single line with tail -n 1, using GNU awk you could also set the record separator (RS) to 1 or more digits, and print the record terminator (RT) when the record number (NR) is 2.

echo "total kB         168194812  870692  852296" | 
awk -v RS='[0-9]+' 'NR == 2 {print RT}'

Output

870692
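A regex RS and the RT variable are gawk extensions. As a rough portable sketch of the same "Nth run of digits" idea, a loop over match() works in other awk implementations too. The helper name nth_number is hypothetical, as are line and rss_kb.

```shell
line='total kB         168194812  870692  852296'

# nth_number N: print the Nth run of digits found on each input line.
nth_number() {
  awk -v n="$1" '{
    rest = $0
    for (i = 1; match(rest, /[0-9]+/); i++) {
      if (i == n) { print substr(rest, RSTART, RLENGTH); break }
      rest = substr(rest, RSTART + RLENGTH)   # continue after this match
    }
  }'
}

rss_kb=$(printf '%s\n' "$line" | nth_number 2)
echo "$rss_kb"
```

Passing 2 selects the RSS column here, just as NR == 2 does in the gawk version.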

4 answers

Answer 1: 3, accepted, 2022-11-28 21:48:44
Answer 2: 2, 2022-11-28 21:48:45
Answer 3: 2, 2022-11-29 02:39:25
Answer 4: 1, 2022-11-28 22:17:15
