Perl: Grabbing the nth and mth delimited words from each line in a file

Adding hosts to be monitored in Nagios is more tedious than in the program we used before: Nagios requires defining a host object, whereas the previous program only needed the IP and hostname. I figured it'd be best to automate this, and that it'd be a great time to learn Perl, because all I know at the moment is C/C++ and Java.

The file I read from looks like this:

xxx.xxx.xxx.xxx hostname #comments. i.dont. care. about

All I want are the first two bunches of characters. These are obviously space delimited, but for the sake of generality, the delimiter might as well be anything. To make it more general, why not the first and third, or the fourth and tenth? Surely there must be some regex action involved, but I'll leave that tag off for the moment, just in case.

The one-liner is great if you're not writing more Perl to handle the result.

More generally though, in the context of a larger Perl program, you would either write a custom regular expression, for example:

if ($line =~ m/(\S+)\s+(\S+)/) {
     $ip = $1;          # first run of non-whitespace characters
     $hostname = $2;    # second run of non-whitespace characters
}

... or you would use the split operator.

my @arr = split(' ', $line);   # ' ' splits on any run of whitespace; / / would split on every single space
$ip = $arr[0];
$hostname = $arr[1];

Either way, add logic to check for invalid input.
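As a rough idea of what that checking could look like, here is a minimal sketch. The helper name parse_host_line, the sample lines, and the loose IPv4 shape check are my own assumptions for illustration; none of them come from the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (ip, hostname) for a usable line,
# or an empty list for blank, comment-only, or malformed lines.
sub parse_host_line {
    my ($line) = @_;
    $line =~ s/#.*//;                         # drop a trailing comment
    my ($ip, $hostname) = split ' ', $line;   # first two whitespace-delimited fields
    return unless defined $hostname;          # fewer than two fields
    # Loose shape check only: four dot-separated groups of 1-3 digits.
    return unless $ip =~ /\A\d{1,3}(?:\.\d{1,3}){3}\z/;
    return ($ip, $hostname);
}

# Sample lines made up for this sketch.
for my $line ("10.0.0.5 web01 #prod box", "garbage", "# only a comment") {
    if (my ($ip, $hostname) = parse_host_line($line)) {
        print "$ip $hostname\n";
    } else {
        warn "skipping malformed line: $line\n";
    }
}
```

The pattern only checks the general shape of an address (it would accept 999.1.1.1), so tighten it if you need real validation.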

Let's turn this into code golf! Based on David's excellent answer, here's mine:

perl -ane 'print "@F[0,1]\n";'

Edit: A real golf submission would look more like this (shaving off five strokes):

perl -ape '$_="@F[0,1]
"'

but that's less readable for this question's purposes. :-P

Here's a general solution (if we step away from code-golfing a bit).

#!/usr/bin/perl -n
chomp;                    # strip newline (in case next line doesn't strip it)
s/#.*//;                  # strip comments
next unless /\S/;         # don't process line if it has nothing (left)
@fields = (split)[0,1];   # split line, and get wanted fields
print join(' ', @fields), "\n";

Normally split splits on whitespace. If that's not what you want (e.g., parsing /etc/passwd), you can pass a delimiter as a regex:

@fields = (split /:/)[0,2,4..6];

Of course, if you're parsing colon-delimited files, chances are also good that such files don't have comments and you don't have to strip them.

A simple one-liner is

perl -nae 'print "$F[0] $F[1]\n";'

You can change the delimiter with -F.
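Inside a script, the counterpart of -F is handing the delimiter to split yourself. Here is a minimal sketch of the general "nth and mth field" case. The helper name pick_fields and both sample lines are made up for illustration, and positions are zero-based, like @F.

```perl
use strict;
use warnings;

# Hypothetical helper: split $line on the pattern $delim and
# return the fields at the given zero-based positions.
sub pick_fields {
    my ($line, $delim, @positions) = @_;
    chomp $line;                        # drop a trailing newline, if any
    my @fields = split $delim, $line;
    return @fields[@positions];         # array slice: any positions, in any order
}

# Sample data invented for this sketch.
my $passwd_line = "alice:x:1000:1000:Alice Example:/home/alice:/bin/sh\n";
print join(' ', pick_fields($passwd_line, qr/:/, 0, 2)), "\n";                  # login name and UID
print join(' ', pick_fields("10.0.0.5 web01 extra\n", qr/\s+/, 0, 1)), "\n";    # IP and hostname
```

A position past the end of the line comes back as undef, so check the result before printing if the input may be short.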

David Nehme said:

perl -nae 'print "$F[0] $F[1]\n";'

which uses the -a switch. I had to look that one up:

-a   turns on autosplit mode when used with a -n or -p.  An implicit split
     command to the @F array is done as the first thing inside the implicit
     while loop produced by the -n or -p.

You learn something every day. -n wraps your program in the following loop, so it runs once for each input line:

LINE:
    while (<>) {
        ...             # your program goes here
    }

And finally, -e is a way to directly enter a single line of a program. You can have more than one -e. Most of this was lifted from the perlrun(1) manpage.

Since ray asked, I thought I'd rewrite my whole program without using Perl's implicitness (except the use of <ARGV>; that's hard to write out by hand). This will probably make Python people happier (braces notwithstanding :-P):

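while (my $line = <ARGV>) {
    chomp $line;                              # strip the trailing newline, if any
    $line =~ s/#.*//;                         # strip comments
    next unless $line =~ /\S/;                # skip lines with nothing left
    my @fields = (split ' ', $line)[0,1];     # split on whitespace, keep the first two fields
    print join(' ', @fields), "\n";
}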

Is there anything I missed? Hopefully not. The ARGV filehandle is special. It causes each named file on the command line to be read, unless none are specified, in which case it reads standard input.

Edit: Oh, I forgot. split ' ' is magical too, unlike split / /. The latter just matches a single space. The former matches any amount of any whitespace. This magical behaviour is used by default if no pattern is specified for split. (Some would say, but what about /\s+/? ' ' and /\s+/ are similar, except for how whitespace at the beginning of a line is treated. So ' ' really is magical.)
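To make the difference concrete, here is a small sketch that counts the fields each form produces. The sample string is made up; it has two leading spaces and two spaces between a and b.

```perl
use strict;
use warnings;

my $line = "  a  b c";   # two leading spaces, two spaces between a and b

my @magic  = split ' ',   $line;   # awk-style: leading whitespace is skipped
my @single = split / /,   $line;   # every single space is a field boundary
my @regex  = split /\s+/, $line;   # runs of whitespace, but a leading empty field is kept

print scalar(@magic),  "\n";   # 3 fields: a, b, c
print scalar(@single), "\n";   # 6 fields: "", "", a, "", b, c
print scalar(@regex),  "\n";   # 4 fields: "", a, b, c
```

So with /\s+/ an indented line shifts every field index by one, which is exactly the kind of thing that bites when you grab fields by position.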

The moral of the story is, Perl is great if you like lots of magical behaviour. If you won't have a bar of it, use Python. :-P

To find the Nth to Mth characters in line number L (example: finding a volume label)


@echo off

REM Next line writes a command's output to a file; skip it if you already have a file to read
vol E: > %temp%\justtmp.txt
REM vol E: prints the volume label of drive E

REM Next line picks which line to read: more +0 starts at line 1
for /f "usebackq delims=" %%a in (`more +0 %temp%\justtmp.txt`) DO (set findstringline=%%a& goto :nextstep)

:nextstep

REM Next line takes a substring: skip the first 22 characters, then keep up to 40 characters
set result=%findstringline:~22,40%

echo %result%
pause
exit /b

Save as find label.cmd

The result will be the label of your drive E.

Enjoy
