
Perl script to copy the file content between certain lines

I am new to Perl scripting and need help with the following problem. I have many files containing details of people. From each file, I want to print the contents that come after one particular line and before another. For example, one of the files contains the following details:

My name is XYZ.
Address: ***
ID:12414
Country:USA
End XYZ

Another file contains details like:

My name is ABC.
Address: ###
ID:124344
Country:Singapore
End ABC

I want to print the lines from the first file that come after My name is XYZ and before End XYZ into my new file. Similarly, I want to print the contents of the second file that come after My name is ABC and before End ABC into my new file.

I wrote the logic below, but I am not sure of the Perl syntax for printing only the lines after one particular line and before another.

while(<file1>)
{
    if () # if we read the phrase "My name" in file1, start printing after this line
    {
        print  # print the contents into file3 (output file)
        if() # if we read the phrase "End" in file1, stop printing the content into file3
    }
}

I hope my question is clear. Any help is appreciated.

You can get the lines between My name is <name>. and End <name> with one of several regexes.

Lazy:

My name is ([^\n]+)\.(.*?)End \1

Greedy:

My name is ([^\n]+)\.(.*)End \1

Optimized:

My name is ([^\s]+)\.((?:(?!End \1)[^\n]*\n)+)End \1

The lazy and greedy versions need the s modifier so that . also matches newlines; the optimized version matches newlines explicitly, so it works with or without it. If a file contains more than one record to parse, you will also need the g modifier.

The back-reference \1 ties the End line to the captured name, so you do not need to know the name in advance. The content you want ends up in capture group 2.

What's the difference between the three regexes? Mostly speed. Depending on how many files you need to parse, that may matter. Note also that the greedy version runs to the last End <name> in the input, so it matches differently if the same name appears more than once.

The optimized one is the best if there is significant variance in what you are parsing. It works the same way as this other regex I wrote. (You should do some testing if speed is important.)

It should be fairly straightforward to write the code from this.
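As a rough sketch of how the lazy pattern could be applied, the snippet below holds the file content in a single string and collects capture group 2 from every match. The helper name extract_blocks and the sample text are made up for illustration, and the extra \n after the name line is an addition here so that group 2 does not start with a blank line.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between each
# "My name is <name>." line and the matching "End <name>" line.
sub extract_blocks {
    my ($text) = @_;
    my @blocks;
    # /s lets . match newlines; /g finds every record in the string.
    # The \n after \. is an addition to the pattern from the answer.
    while ($text =~ /My name is ([^\n]+)\.\n(.*?)End \1/sg) {
        push @blocks, $2;    # capture group 2 holds the wanted lines
    }
    return @blocks;
}

# Sample data standing in for the content of a slurped file.
my $text = join '',
    "My name is XYZ.\n",
    "Address: ***\n",
    "ID:12414\n",
    "Country:USA\n",
    "End XYZ\n";

print for extract_blocks($text);
```

For a real file, the string would come from reading the whole file at once, for example with a localized $/ as described on the slurp page linked in the next answer.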

OK. I believe your question is about the Perl syntax for printing to the output file. I will try to give you a slightly more complete solution based on your description of what you are trying to do. This is just a quick, very simple code example. (For some reference, you may also want to look at http://perlmaven.com/slurp .)

First, let's call your new file "newfile.txt" and your source file(s) "sourcefile.txt". Here is some code with comments:

# First I would turn off output buffering. Note that $| only affects
# the currently selected handle (STDOUT by default), not newfile.txt.
$|++;

# Now open newfile.txt for writing the information you want
open my $NEWFILE, '>', 'newfile.txt' or die "Cannot open newfile.txt: $!";

# Now open sourcefile.txt (or iterate over a list of them)
open my $SOURCEFILE, '<', 'sourcefile.txt' or die "Cannot open sourcefile.txt: $!";

# Now go through the sourcefile and get info you want to 
# add to your newfile

# set a variable to print data to newfile - initialize to
# N or false
$data_wanted = "N";

# read the source file line by line

while (<$SOURCEFILE>) {
      # Test to see if data is between "My name" and "End"
      if ($_ =~ /^My name/ ) {
          # start marker: print from the next line onwards
          $data_wanted = "Y";
          next;
      } 
      elsif ($_ =~ /^End/ ) {
          # end marker: stop printing
          $data_wanted = "N";
          next;
      } 
      elsif ($_ =~ /^STUFF TO OMIT/) {
          # skip any other line you don't want
          next;
      }

      if ( $data_wanted eq "Y" ) {
          print $NEWFILE $_;
      }

      # you don't really need this but
      # it will show you how this works in perl
      next;  

}  # end of while

# finish by closing the files

close $SOURCEFILE;
close $NEWFILE;

##########################################

Hope this helps ;-)
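Since the question involves many source files and a single new file, here is a minimal sketch of the "iterate over a list of them" idea mentioned above. It uses the same flag approach; the helper name copy_between and the output name file3.txt are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: copies the lines between the "My name" and "End"
# markers of every source file into one output file.
sub copy_between {
    my ($out_name, @sources) = @_;
    open my $out, '>', $out_name or die "Cannot write $out_name: $!";
    for my $source (@sources) {
        open my $in, '<', $source or die "Cannot read $source: $!";
        my $wanted = 0;    # reset the flag for every file
        while (my $line = <$in>) {
            if    ($line =~ /^My name /) { $wanted = 1; next; }
            elsif ($line =~ /^End /)     { $wanted = 0; next; }
            print {$out} $line if $wanted;
        }
        close $in;
    }
    close $out or die "Cannot close $out_name: $!";
}

# Source files are taken from the command line, if any were given.
copy_between('file3.txt', @ARGV) if @ARGV;
```

Saved under a name of your choosing, it could be run as perl yourscript.pl file1 file2, which would write the selected lines of both files to file3.txt.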

Is this what you are looking for?

while (<>) {
    if ( /^My name / .. /^End / ) {
        if ( /^My name / ) {
            # Do nothing, or anything you would like for this line.
        } elsif ( /^End / ) {
            # Do nothing, or anything you would like for this line.
        } else {
           print $_;
        }
    }
}
