Reading a large file starting from a specific line in Perl

Question

I have to match two patterns (pattern_1 and pattern_2).
The data to match for pattern_2 depends on pattern_1 (pattern_2 uses some data extracted from the pattern_1 match).
pattern_2 always occurs after pattern_1.
Once pattern_2 has been matched, I need to move back to the place where pattern_1 was matched and start again.

I have the following code:

open(DATA_IN, "<$in_file") or die "Couldn't open file $in_file, $!";
open(DATA_OUT, ">$out_file") or die "Couldn't open file $out_file, $!";
while(<DATA_IN>){
    if($_ =~ /pattern_1/){
        #extract some data
        open(DATA_TEMP, "<$in_file") or die "Couldn't open file $in_file, $!";
        TEMP: while(<DATA_TEMP>){
            if($_ =~ /pattern_2/){
                my $i = 0;
                my $line;
                while ($i<4){
                    $line = <DATA_TEMP>;
                    $i++;
                }
                print $line; #print the data 4 lines after the matched pattern_2
                last TEMP;
            }
        }
    }
}

It works fine, but for every pattern_1 match it reopens $in_file and reads it again from the start, which takes a long time. Can you suggest a way to read $in_file only from the pattern_1 match onwards?

Answer 1

You can use the seek() and tell() functions to move around in the file, so it never has to be reopened and rescanned from the start. Something like the following:

open(DATA_IN, "<$in_file") or die "Couldn't open file $in_file, $!";
open(DATA_OUT, ">$out_file") or die "Couldn't open file $out_file, $!";
while(<DATA_IN>){
    if($_ =~ /pattern_1/){
        # Save the current position
        my $saved_position = tell(DATA_IN);

        # extract some data
        TEMP: while(<DATA_IN>){
            if($_ =~ /pattern_2/){
                my $i = 0;
                my $line;
                while ($i<4){
                    $line = <DATA_IN>;
                    $i++;
                }
                print $line; #print the data 4 lines after the matched pattern_2
                last TEMP;
            }
        }

        # Restore the saved position
        seek(DATA_IN, $saved_position, 0);
    }
}
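
The same idea can be wrapped in a small reusable routine. What follows is only a sketch under a few assumptions: the helper name scan_pairs and its $offset parameter are hypothetical, the two patterns are passed in as precompiled qr// regexes, and lexical filehandles with three-argument open are used as a stylistic substitution for the bareword handles above. The third argument to seek (0) means the position is measured from the start of the file.

```perl
use strict;
use warnings;

# Hypothetical helper: for each line matching $re1, scan forward to the first
# line matching $re2, collect the line $offset lines after it, then rewind to
# just after the $re1 line so the outer scan continues from there.
sub scan_pairs {
    my ($path, $re1, $re2, $offset) = @_;
    my @found;

    open(my $in, '<', $path) or die "Couldn't open file $path, $!";
    while (my $first = <$in>) {
        next unless $first =~ $re1;

        # tell() reports the byte offset of the next unread line,
        # i.e. the line right after the pattern_1 match.
        my $saved_position = tell($in);

        while (my $second = <$in>) {
            next unless $second =~ $re2;
            my $line;
            for (1 .. $offset) {
                $line = <$in>;
                last unless defined $line;   # ran off the end of the file
            }
            push @found, $line if defined $line;
            last;
        }

        # Go back to the saved offset; this also clears any end-of-file state.
        seek($in, $saved_position, 0) or die "Couldn't seek in $path, $!";
    }
    close($in);
    return @found;
}
```

Under these assumptions, scan_pairs($in_file, qr/pattern_1/, qr/pattern_2/, 4) would return the lines that the loop above prints. Because the saved offset always points past the pattern_1 line, the outer loop keeps advancing, and the file is opened only once.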

Solution 1
2 2016-06-24 13:25:37
