Efficiently reading and processing a file word by word in Perl

Question

I'm new to Perl and would like to know how I can make this run faster. Here is my current code. Any help is much appreciated.

#!/usr/bin/perl

use strict;
use warnings;

open( FILE_IN, "<practicecase.txt" ) or die "$!";

open( FILE_OUT, ">extracted.txt" ) or die "$!";

print "Extracting inputs\n";

while (<FILE_IN>) {
    if ( $_ =~ m/^second_word/ ) {
        my @filepath2 = split (/\s+/, $_);
        print FILE_OUT $filepath2[1]."\n";
    }
    if ($_ =~ m/^first_word/ ) {
        my @filepath1 = split (/\s+/, $_);
        print FILE_OUT $filepath1[1]."\n";
    }
}

exit;

My input file, practicecase.txt, looks like this:

first_word some/filepath
second_word another/filepath

My output file, extracted.txt, looks like this:

some/filepath
another/filepath

Thanks a lot!

Answer 1

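Your algorithm is already about as fast as it is going to get. The one optimisation I have made is to use a single regex pattern that looks for first_word or second_word at the start of the line and also captures the second field of that line, so there is no need for a separate split.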

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;
use autodie;

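open my $in_fh,  '<', 'practicecase.txt';
open my $out_fh, '>', 'extracted.txt';

# Print the status message before select, so that it goes to STDOUT
# rather than into extracted.txt
print "Extracting inputs\n";

# From here on, a bare print writes to the output file
select $out_fh;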

while ( <$in_fh> ) {
    print "$1\n" if / ^ (?:first|second)_word \s+ (\S+) /x;
}
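
To see what the single pattern does without touching any files, here is a minimal sketch that runs it over a few in-memory lines. The sub name extract_path and the third sample line are made up for illustration; they are not part of the code above.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: returns the second field of a line that starts
# with first_word or second_word, or undef for any other line.
sub extract_path {
    my ($line) = @_;
    return $line =~ / ^ (?:first|second)_word \s+ (\S+) /x ? $1 : undef;
}

# Illustrative sample lines; the third one is invented to show a non-match.
my @lines = (
    "first_word some/filepath\n",
    "second_word another/filepath\n",
    "third_word ignored/filepath\n",
);

for my $line (@lines) {
    my $path = extract_path($line);
    print "$path\n" if defined $path;
}
```

This prints some/filepath and another/filepath and skips the third line, because the pattern is anchored with ^ and only accepts the two known prefixes.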

Answer 2

A slight modification of Borodin's code:

#!/usr/bin/perl

use strict;
use warnings;
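use Cwd qw( abs_path );
use File::Basename qw( dirname );

# Locate both files relative to the directory containing this script
my $dir      = dirname( abs_path($0) );
my $file_in  = "$dir/practicecase.txt";
my $file_out = "$dir/extracted.txt";

# Three-argument open keeps the mode separate from the file name
open my $in,  '<', $file_in  or die "Cannot open $file_in: $!";
open my $out, '>', $file_out or die "Cannot open $file_out: $!";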

print "Extracting inputs\n";

while ( <$in> ) {
    print $out "$1\n" if / ^ (?:first|second)_word\s+(.+?)$ /x;
}

close($in);
close($out);
exit;

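Basically, I changed the regex to capture everything up to the end of the line, in case the paths in your text file contain spaces or other whitespace that you want to keep. Also, don't forget to close your file handles.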
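
The difference between the two capture groups only shows up when the second field contains whitespace. Here is a small sketch comparing them on one made-up line. The sub names capture_field and capture_rest and the sample path are assumptions for illustration only.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical wrapper around the (\S+) pattern: stops at the first whitespace.
sub capture_field {
    my ($line) = @_;
    return $line =~ / ^ (?:first|second)_word \s+ (\S+) /x ? $1 : undef;
}

# Hypothetical wrapper around the (.+?)$ pattern: runs to the end of the line.
sub capture_rest {
    my ($line) = @_;
    return $line =~ / ^ (?:first|second)_word \s+ (.+?) $ /x ? $1 : undef;
}

# Invented sample line whose path contains spaces.
my $line = "first_word some dir/file path\n";

print capture_field($line), "\n";    # prints "some"
print capture_rest($line),  "\n";    # prints "some dir/file path"
```

Note that the end-of-line version also keeps any trailing spaces on the line, so which pattern is right depends on what the input file really looks like.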

2 answers

Answer 1: score 3, accepted, 2015-06-03 18:55:46

Answer 2: score -1, 2015-06-03 21:21:41
