Perl Program to Read a Text File, Search for text within the File and copy the text to a new file

This is a fixed-format file, and this is what you see when I open it in Notepad or UltraEdit. This is just sample data; my real text file has about 200,000 lines and many directories like the ones in the example. Basically, I am trying to take the path from each " Directory of V:\word" line and append "V:\word" to the end of every line below it, until the next header, " Directory of V:\word\excels", at which point "V:\word\excels" gets appended instead, and so on. Would you be able to help me out and possibly throw a dog a bone? Thanks, much appreciated!

 Directory of V:\word
04/30/2007  11:49 AM        938,458   BUILTIN\Admin       Filename.pdf
04/06/2012  01:13 PM          3,801   AMERICAS\DoeJ       Filename3.pdf
01/11/2007  12:05 PM         26,624   BUILTIN\Admin       Filename2.doc
08/01/2007  11:57 AM         18,432   BUILTIN\Admin       Filename5.xls
 Directory of V:\word\excels
03/03/2010  10:42 AM         35,840   AMERICAS\DavisF     Billing3-3.xls
02/24/2010  10:31 AM         34,380   AMERICAS\StewartF   Allie2-24.xls

This is what I am trying to accomplish

 Directory of V:\word
04/30/2007  11:49 AM        938,458   BUILTIN\Admin       Filename.pdf     V:\word
04/06/2012  01:13 PM          3,801   AMERICAS\DoeJ       Filename3.pdf    V:\word
01/11/2007  12:05 PM         26,624   BUILTIN\Admin       Filename2.doc    V:\word
08/01/2007  11:57 AM         18,432   BUILTIN\Admin       Filename5.xls    V:\word
 Directory of V:\word\excels
03/03/2010  10:42 AM         35,840   AMERICAS\DavisF     Billing3-3.xls   V:\word\excels
02/24/2010  10:31 AM         34,380   AMERICAS\StewartF   Allie2-24.xls    V:\word\excels



This is what I have in Perl. I'm still stuck, but I think I'm making some progress.

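    #!/usr/bin/perl
    use strict;
    use warnings;
    use autodie;                              # open/close die on failure

    open(my $in, '<', 'List.txt');
    my $str = ' Directory of V:\word';        # still hard-coded; single quotes keep the backslash
    while (<$in>)
    {
        chomp;
        # The sample data is space-aligned, so splitting on tabs leaves most fields undefined
        my ($Date, $Time, $Size, $User, $Filename) = split("\t");
        print $Date, $Time, $Size, $User, $Filename, substr $str,14;

        print "$_\n";
    }
    close($in);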

@Pichi's one-liner will do what you want if your file is given on stdin or passed as an argument. Since it is a bit opaque, here's what it is doing in an explicit manner:

    # What's this doing?  perl -lpe'/ Directory of (.*)/?$a=$1:($_.="\t$a")'

    my $suffix;                              # Pichi uses $a, a quietly special var I usually avoid

    while (defined(my $line = <ARGV>)) {     # Magic ARGV filehandle - stdin or arguments
      chomp($line);                          # Remove newline (-l switch)

      if ($line =~ / Directory of (.*)/) {   # This is the ?: clause
        $suffix = $1;
      } else {
        $line .= "\t$suffix";
      }

      print "$line\n";                       # Print (-p) with newline (-l, again)
    }

Perl's convenient one-liners actually do a bit more than that (e.g., $/ and $\ are explicitly set, and the print is error-checked), but that's essentially the approach.

Why not awk?

    awk '/ Directory of /{at=$3;print;next}{print $0""FS""at}' your_file

Perl:

    perl -lne 'if(/ Directory of (.*)/){$a=$1;print}else{$_.="\t".$a;print}' your_file

If you want to edit the file in place:

    perl -i -lne 'if(/ Directory of (.*)/){$a=$1;print}else{$_.="\t".$a;print}' your_file

If you want to do it using a Perl script (good opportunity to learn about it):

First, to write to a NEW file you need to open two files: FILE1, which holds the data to read, and FILE2, which you will write to. Open the first in read mode ('<') and the second in write mode ('>'). See perldoc -f open for details.
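A minimal sketch of that two-filehandle setup, assuming lexical filehandles and the three-argument form of open. The sub name copy_lines and the output name List_out.txt are hypothetical, chosen only for illustration. It copies lines through unchanged; the per-line logic would go inside the loop.

```perl
use strict;
use warnings;

# Copy every line of $in_path to $out_path unchanged.
# (copy_lines is a hypothetical helper name, not part of any library.)
sub copy_lines {
    my ($in_path, $out_path) = @_;

    # '<' opens for reading; '>' creates the file or truncates an existing one.
    open(my $in,  '<', $in_path)  or die "Cannot read $in_path: $!";
    open(my $out, '>', $out_path) or die "Cannot write $out_path: $!";

    while (my $line = <$in>) {
        print {$out} $line;          # per-line changes would happen here
    }

    close($in);
    close($out) or die "Cannot close $out_path: $!";
    return;
}

# Example call, using the question's List.txt and a hypothetical output name:
# copy_lines('List.txt', 'List_out.txt');
```

Note that '>' overwrites an existing output file without asking, so the output name should differ from the input name.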

Then, in the while loop, I recommend reading each line into its own variable ...

    while (my $line = <$file1>)
    {
        ...   # process $line here
    }

... and then, depending on how the line begins (test it with an 'if' statement and a regular expression), do one action or another.

To print the directory at the end of each line, also store the path from the " Directory of" lines (the 'if' branch) in a variable. You can also use a regular expression (regex) to strip out the part of that line you are not interested in.

Easy ... ;-)
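The steps above can be sketched as one small sub. This is only an illustration under stated assumptions: the name append_directory is hypothetical, lines are assumed to be already chomped, and the suffix is joined with a tab as in the one-liners. Two choices here go beyond the answers and are my own: blank lines and lines that appear before the first header are left untouched.

```perl
use strict;
use warnings;

# Return a copy of @lines in which every ordinary line carries the most
# recently seen directory, separated by a tab.
# (append_directory is a hypothetical name used only for this sketch.)
sub append_directory {
    my @lines = @_;                   # work on a copy of the caller's data
    my $dir;                          # undefined until the first header line
    my @out;

    for my $line (@lines) {
        if ($line =~ /^\s*Directory of (.*\S)/) {
            $dir = $1;                # remember the path, e.g. V:\word
        }
        elsif (defined $dir && $line =~ /\S/) {
            $line .= "\t$dir";        # tag non-blank lines once a header was seen
        }
        push @out, $line;
    }
    return @out;
}

# Usage with the first two lines of the question's sample data:
my @result = append_directory(
    ' Directory of V:\word',
    '04/30/2007  11:49 AM        938,458   BUILTIN\Admin       Filename.pdf',
);
print "$_\n" for @result;
```

Keeping the logic in a sub that takes and returns plain lists makes it easy to check against a few sample lines before running it over the full 200,000-line file.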

Short version:

    perl -lpe'/ Directory of (.*)/?$a=$1:($_.="\t$a")'
