Print only a part of a match with grep

I'd like to know whether a single grep command can handle the following situation.

I have a dhcpd.conf file in which DHCP hosts are defined. Given a hostname, I need to find its MAC address in that file. I need the address to disable the host's PXE boot config, but that's not part of this question.

The file's syntax is uniform, but I still want to make the command reasonably fool-proof. Here is how the hosts are defined:

    host client1 { hardware ethernet 12:23:34:56:78:89; fixed-address 192.168.1.11; filename "pxelinux.0"; }
    host client2 { hardware ethernet 23:34:45:56:67:78; fixed-address 192.168.1.12; filename "pxelinux.0"; }
    host client3 { hardware ethernet AB:CD:EF:01:23:45; fixed-address 192.168.1.13; filename "pxelinux.0"; }
    host client4 { hardware ethernet C1:CA:88:FA:F4:90; fixed-address 192.168.1.14; filename "pxelinux.0"; }

We assume that every host definition takes exactly one line, even though the dhcpd.conf syntax would allow options to be broken across several lines. We do assume, however, that the order of the options may differ.

I came up with the following grep command:

grep -o "^[^#]*host.*${DHCP_HOSTNAME}.*hardware ethernet.*..:..:..:..:..:..;" /etc/dhcp/dhcpd-hosts.conf

It is supposed to ignore commented-out lines, allow arbitrary whitespace between tokens, and match up to the end of the MAC address. When I run it, I get lines like this:

host client1 { hardware ethernet 12:23:34:56:78:89;

This is great! But the point is that I only need the MAC address, without the preceding junk. I know that using another grep, or cut, or awk to extract only the MAC address from this output would be trivial. But I wonder: is there a way to use a single grep command to get the end result, without having to pipe this output into another filter? Obviously I can't leave out the beginning of the pattern, because I want a specific hostname, and matching only " ..:..:..:..:..:.. " would give me all the MAC addresses.

Once again, I want a single command (not necessarily grep) that extracts only the right MAC address from the file. So I am not interested in solutions of the form "grep ... | grep ..." or "grep ... | cut ...", etc.

Of course, in practice nothing bad happens if I chain several filters with pipes; I am just curious whether the problem can be solved with one filter.

I would assign the output to a variable.

You can use a Perl one-liner to match each line of the file against a single regex with an appropriate capture group, and print the submatch for each line that matches.

There are several ways to use Perl for this task. I suggest going with the perl -ne {program} idiom, which implicitly loops over the lines of stdin and executes the one-liner {program} once for each line, with the current line made available as the $_ special variable. (Note: The -n option does not cause the final value of $_ to be automatically printed at the end of each iteration of the implicit loop, which is what the -p option would do; that is, perl -pe {program}.)

Below is the solution. Note that I decided to pass the target hostname using the obscure -s option, which enables parsing of variable assignment specifications after the {program} argument, similar to awk's -v option. (It is not possible to pass normal command-line arguments with the -n option, because the implicit while (<>) { ... } loop treats all such arguments as file names, but the -s mechanism provides an excellent solution. See "Is it possible to pass command-line arguments to @ARGV when using the -n or -p options?".) This design avoids the need to embed the $DHCP_HOSTNAME variable in the {program} string itself, which allows us to single-quote it and save a few (actually 8) backslashes.

DHCP_HOSTNAME='client3';
perl -nse 'print($1) if m(^\s*host\s*$host\s*\{.*\bhardware\s*ethernet\s*(..:..:..:..:..:..));' -- -host="$DHCP_HOSTNAME" <dhcpd.cfg;
## AB:CD:EF:01:23:45

I often prefer Perl to sed for the following reasons:

  • Perl provides a complete general-purpose programming environment, whereas sed is more limited.
  • Perl has an enormous repository of publicly available modules on CPAN, which can easily be installed and then used with the -M{module} option. sed is not extensible.
  • Perl has a much more powerful regular expression engine than sed, with lookaround assertions, backtracking control verbs, within-regex and replacement Perl code, many more options and special escapes, embedded group options, and more. See perlre.
  • Counter-intuitively, despite its greater sophistication, Perl is often much faster than sed due to its two-pass process and highly optimized opcode implementation. See http://rc3.org/2014/08/28/surprisingly-perl-outperforms-sed-and-awk/ for example.
  • I often find that the equivalent Perl implementation is more intuitive than that of sed, since sed has a more primitive set of commands for manipulating the underlying text.

I'd choose sed for this, because you can use a regexp for line addressing:

sed -e "/host  *${DHCP_HOSTNAME}/!d" -e "s/.*\(hardware [^;]*\).*/\1/g"

The first expression deletes all lines not matching ${DHCP_HOSTNAME} (you might want to escape the value in the shell if your hostnames could contain regexp metacharacters, but I'll assume they don't).

The second expression matches the hardware address portion (the text from "hardware" up to, but not including, the semicolon) and deletes the rest of the line.
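
As a rough sketch of how the same address-then-substitute idea could print the bare MAC address, the variant below uses sed -n with the p flag instead of !d. The sample file is made up. The stricter address (line must start with "host", name must be followed by an opening brace) and the capture of only the hex digits and colons are my own assumptions about what the question wants.

```shell
# Hypothetical sample data standing in for the real dhcpd.conf.
conf=$(mktemp)
cat >"$conf" <<'EOF'
#host client3 { hardware ethernet 00:00:00:00:00:00; fixed-address 192.168.1.99; }
host client3 { fixed-address 192.168.1.13; hardware ethernet AB:CD:EF:01:23:45; filename "pxelinux.0"; }
host client30 { hardware ethernet 11:22:33:44:55:66; fixed-address 192.168.1.30; }
EOF

DHCP_HOSTNAME='client3'

# The address selects uncommented "host <name> {" lines only.
# The substitution keeps just the text between "hardware ethernet" and ";",
# and the p flag prints the line only when the substitution succeeded.
mac=$(sed -n "/^[[:space:]]*host[[:space:]][[:space:]]*${DHCP_HOSTNAME}[[:space:]]*{/s/.*hardware[[:space:]][[:space:]]*ethernet[[:space:]][[:space:]]*\([0-9A-Fa-f:]*\);.*/\1/p" "$conf")

echo "$mac"
rm -f "$conf"
```

As in the original command, the hostname is pasted into the regexp unescaped, so this still assumes hostnames without regexp metacharacters.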

You can try grep -o with this expression:

grep -o "[0-9A-F]\{2\}:[0-9A-F]\{2\}:[0-9A-F]\{2\}:[0-9A-F]\{2\}:[0-9A-F]\{2\}:[0-9A-F]\{2\}"

Output:

12:23:34:56:78:89
23:34:45:56:67:78
AB:CD:EF:01:23:45
C1:CA:88:FA:F4:90

The expression above prints only the MAC addresses from the DHCP config file, one per line, for every host in the file.

Since people are also answering with different tools, I think awk might be a good alternative as well.

$ cat so
host client1 { hardware ethernet 12:23:34:56:78:89; fixed-address 192.168.1.11; filename "pxelinux.0"; }
host client2 { hardware ethernet 23:34:45:56:67:78; fixed-address 192.168.1.12; filename "pxelinux.0"; }
#host client3 { hardware ethernet AB:CD:EF:01:23:45; fixed-address 192.168.1.13; filename "pxelinux.0"; }
host client3 { hardware ethernet AB:CD:EF:01:23:45; fixed-address 192.168.1.13; filename "pxelinux.0"; }
host client4 { hardware ethernet C1:CA:88:FA:F4:90; fixed-address 192.168.1.14; filename "pxelinux.0"; }
$ awk '/^[^#]/ && /client3/ { printf ("%s: %s\n",  $2, $6); }' so
client3: AB:CD:EF:01:23:45;

I use a double match to exclude commented lines, and simply use field indexes to print the wanted information. That way, it should also be easy to remove the PXE part. For instance, removing the filename directive for client3 could be done as follows:

$ awk '/^[^#]/ && /client3/ { gsub(/filename[^;]+;/, ""); print; }' so
host client3 { hardware ethernet AB:CD:EF:01:23:45; fixed-address 192.168.1.13;  }

Specifying a custom image (pxecustom.0):

$ awk '/^[^#]/ && /client3/ { gsub(/filename[^;]+;/, "filename \"pxecustom.0\";"); print; }' so
host client3 { hardware ethernet AB:CD:EF:01:23:45; fixed-address 192.168.1.13; filename "pxecustom.0"; }
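
The field-index approach above relies on "hardware ethernet" being the first option and leaves the trailing semicolon on the MAC address. A possible sketch that tolerates reordered options is shown below: it scans the fields for the "hardware ethernet" pair and strips the semicolon. The sample file and the variable names are made up, and it assumes whitespace separates the hostname from the opening brace.

```shell
# Hypothetical sample data standing in for the real dhcpd.conf.
conf=$(mktemp)
cat >"$conf" <<'EOF'
#host client3 { hardware ethernet 00:00:00:00:00:00; fixed-address 192.168.1.99; }
host client3 { fixed-address 192.168.1.13; hardware ethernet AB:CD:EF:01:23:45; filename "pxelinux.0"; }
host client30 { hardware ethernet 11:22:33:44:55:66; fixed-address 192.168.1.30; }
EOF

DHCP_HOSTNAME='client3'

mac=$(awk -v host="$DHCP_HOSTNAME" '
  # Exact comparisons: "#host" lines and longer names such as client30 do not match.
  $1 == "host" && $2 == host {
    for (i = 3; i + 2 <= NF; i++)
      if ($i == "hardware" && $(i + 1) == "ethernet") {
        sub(/;$/, "", $(i + 2))   # drop the trailing semicolon
        print $(i + 2)
        exit                      # stop at the first matching host line
      }
  }' "$conf")

echo "$mac"
rm -f "$conf"
```

Passing the hostname with -v keeps it out of the program text, so it is compared as a plain string rather than interpreted as a regexp.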
