
File parsing in Perl: reading the first and last lines

I have a file whose first line is:

=== Verbose logging started: 1/3/2017  17:41:55  Build type: SHIP UNICODE 5.00.7601.00  Calling process: C:\Windows\SysWOW64\msiexec.exe ===

and whose last line is:

=== Verbose logging stopped: 1/3/2017  17:49:17 ===

I am interested in the time fields in those lines (17:41:55 and 17:49:17) and want to find the elapsed time from start to stop.

I tried reading the file into an array and fetching the first and last lines:

my $last = pop (@array);
my $first = shift (@array);

But extracting the time field from those lines is proving difficult.

Could you please suggest an alternative?

If you want to read the first and last lines of a potentially very large log file, you shouldn't slurp it all into an array, as that may consume a lot of memory. Instead, read just the first and last lines.

You can read the first line easily enough.

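use v5.10;
use strict;
use warnings;
use autodie;

# $logfile holds the path to the log file.
# Three-argument open avoids surprises with unusual filenames.
open my $fh, '<', $logfile;
my $first = <$fh>;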

You can read the last line by using seek to jump to the end of the file and then reading backwards in chunks with read until you have a whole line. That can get complicated. Fortunately, File::ReadBackwards does that for you.

use Carp;
use File::ReadBackwards;

my $backwards = File::ReadBackwards->new( $logfile )
    or croak "Can't open $logfile: $!";
my $last = $backwards->readline;
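For illustration only, the manual seek-and-read approach described above could be sketched roughly as follows. The helper name last_line and the 4096-byte default chunk size are assumptions made for this example, not part of the original answer, and File::ReadBackwards remains the more robust choice.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last non-blank line of a file by
# reading fixed-size chunks backwards from the end.
sub last_line {
    my ($path, $chunk_size) = @_;
    $chunk_size ||= 4096;    # assumed default, tune as needed

    open(my $fh, '<:raw', $path) or die "Can't open $path: $!";
    my $pos = -s $fh;        # start at the end of the file
    my $buf = '';

    while ($pos > 0) {
        # Step back one chunk (or whatever is left) and read it.
        my $len = $pos < $chunk_size ? $pos : $chunk_size;
        $pos -= $len;
        seek($fh, $pos, 0) or die "Can't seek in $path: $!";
        my $got = read($fh, my $chunk, $len);
        die "Can't read $path: $!" unless defined $got;
        $buf = $chunk . $buf;

        # Ignore trailing blank lines; once a newline precedes the
        # remaining text, the last line is complete.
        (my $trimmed = $buf) =~ s/\s+\z//;
        my $nl = rindex($trimmed, "\n");
        return substr($trimmed, $nl + 1) if $nl >= 0;
    }

    # Reached the start of the file: it holds at most one non-blank line.
    $buf =~ s/\s+\z//;
    return length($buf) ? $buf : undef;
}
```

Calling last_line($logfile) would return the final non-blank line without its trailing newline, or undef for an empty file.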

Note that if there are any stray blank lines at the end of the file, one of those will be returned as the last line, so you might want to keep reading until you get what you're looking for.

# Read lines backwards until we get something that
# contains non-whitespace. $last is declared outside the
# loop so it is still available afterwards.
my $last;
while( defined( $last = $backwards->readline ) ) {
    last if $last =~ /\S/;
}

Here's a simpler, but slower (for large files), way to get the first and last lines. Read the first line as before, then read every remaining line but keep only the most recent one.

my $last;
while( my $line = <$fh> ) { $last = $line }

It still has to read the whole file, but it keeps only one line in memory at a time.
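Put together, the single-pass approach might look like the sketch below. The helper name first_and_last is hypothetical, and skipping blank lines when tracking the last line is an added assumption borrowed from the note about stray newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: one pass over the file, remembering only the
# first line and the most recent non-blank line.
sub first_and_last {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!";

    my $first = <$fh>;
    my $last  = $first;
    while (my $line = <$fh>) {
        $last = $line if $line =~ /\S/;    # skip blank lines
    }
    close $fh;

    return unless defined $first;          # empty file
    chomp($first, $last);
    return ($first, $last);
}
```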


Once you have both lines, you can parse each one and turn it into a Time::Piece object, which is easier to work with.

use Time::Piece;

# === Verbose logging started: 1/3/2017  17:41:55 ... ===
# === Verbose logging stopped: 1/3/2017  17:49:17 ===
sub log_time {
    my $line = shift;

    # This captures the 1/3/2017  17:49:17 part.
    # m{...} is used so the slashes in the date need no escaping.
    my($datetime) = $line =~
        m{^=== Verbose logging (?:started|stopped):\s*(\d+/\d+/\d+\s+\d+:\d+:\d+)};

    # Parse it into a Time::Piece object.
    return Time::Piece->strptime($datetime, "%m/%d/%Y %H:%M:%S");
}

strptime ("string parse time") is a function many languages provide for parsing dates. strftime ("string format time") is used to format dates. They share the same format mini-language, so have a look at the strftime docs to understand the format string used here.
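As a small illustration of that shared mini-language, the same format string can be used in both directions. This is only a sketch using the sample timestamp from the question; the second format is an arbitrary choice for the example.

```perl
use strict;
use warnings;
use Time::Piece;

my $format = "%m/%d/%Y %H:%M:%S";

# Parse: whitespace in the format matches any run of whitespace in the input.
my $t = Time::Piece->strptime("1/3/2017  17:41:55", $format);

# Format with the same string, then with a different one.
print $t->strftime($format), "\n";              # 01/03/2017 17:41:55
print $t->strftime("%Y-%m-%d %H:%M"), "\n";     # 2017-01-03 17:41
```

Note that %m/%d/%Y reads 1/3/2017 as January 3; if the log uses day-first dates, the format would need to be %d/%m/%Y instead.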

Once you have the two Time::Piece objects, you can get the difference in seconds by subtracting one from the other.

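my $start = log_time($first);
my $end   = log_time($last);

# Pass the difference as a separate argument: "." and "-" have the same
# precedence, so "..." . $end - $start would concatenate first.
say "Seconds elapsed: ", $end - $start;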
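As a worked example, here is a self-contained sketch that runs the two sample lines from the question through the same parsing and subtraction. The minutes-and-seconds breakdown is an addition for readability, not part of the original answer.

```perl
use strict;
use warnings;
use Time::Piece;

# The two sample lines from the question.
my $first = '=== Verbose logging started: 1/3/2017  17:41:55  Build type: SHIP UNICODE 5.00.7601.00  Calling process: C:\Windows\SysWOW64\msiexec.exe ===';
my $last  = '=== Verbose logging stopped: 1/3/2017  17:49:17 ===';

sub log_time {
    my $line = shift;
    my ($datetime) = $line =~
        m{^=== Verbose logging (?:started|stopped):\s*(\d+/\d+/\d+\s+\d+:\d+:\d+)};
    return Time::Piece->strptime($datetime, "%m/%d/%Y %H:%M:%S");
}

# Subtracting two Time::Piece objects yields a Time::Seconds object.
my $elapsed = log_time($last) - log_time($first);
my $seconds = $elapsed->seconds;

printf "Elapsed: %d seconds (%d min %d s)\n",
    $seconds, int($seconds / 60), $seconds % 60;
# Elapsed: 442 seconds (7 min 22 s)
```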

I have a slightly less sophisticated approach than Schwern's, which is to use the Unix commands head and tail:

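#!/usr/bin/perl

use strict;
use warnings;

# Note: the filename is interpolated into a shell command, so this is
# only safe for trusted paths without spaces or shell metacharacters.
my $first = `head -n 1 $ARGV[0]`;
my $last  = `tail -n 1 $ARGV[0]`;

# Each captured line already ends with a newline.
print $first;
print $last;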
