
Search the front of a string to replace the end of the string Perl

After getting some help, here is what I have come up with (I was hoping to learn by trying to put multiple scripts together). The script below performs the HW and OW replacements, but the body of the if statement never runs.

#!/usr/bin/perl
use strict;  
use warnings 'all';
$^I = '.bak'; # create a backup copy 
while (<>) {
   s/HW/HT/g; # do the replacement of HW with HT
   s/OW/OT/g; # do a second replacement OW with OT
   # Hopefully run the if statement
   my @parts = /\s*\S+/g;
   if ( $parts[1] =~ s/([HO])W/$1T/ ) {
    $parts[5] = sprintf '%*d',
            length $parts[5],
            $parts[1] =~ /HT/ ? 2002 : 2001;
      }
print @parts, "\n";
}

I have left the rest of the post below in case people have similar problems.

I would like to use Perl to replace text in a file by searching for specific letters at the beginning of the string. For example here is a section of the file:

 6  HT     4.092000    4.750000   -0.502000     0     5     7
 7  HT     5.367000    5.548000   -0.325000     0     5     6
 8  OT    -5.470000    5.461000    1.463000     0     9    10
 9  HT    -5.167000    4.571000    1.284000     0     8    10
10  HT    -4.726000    6.018000    1.235000     0     8     9
11  OT    -4.865000   -5.029000   -3.915000     0    12    13
12  HT    -4.758000   -4.129000   -3.608000     0    11    13

I would like to search for HT and, on the lines where it is found, replace the "0" in the column of zeros with 2002. I know how to replace the entire column of zeros, but I don't know how to make the replacement line-specific. After handling HT, I then need to search for OT and replace the 0 on those lines with 2001.

Basically I need to search a string that identifies the line and replace a specific string of that line while the text that lies between is variable. The output needs to be printed to a new_file.xyz. Also I will be doing this repeatedly on lots of files. Thanks for your help.

Here is the python code that I was using but could not figure out how to make the "file.txt" be a variable to accept the file typed after the command. This code requires that I change the "file.txt" to be the name of the file every time I use it. Also I could not get it to print to a new file.

python code:

#!/usr/bin/python

with open('file.txt') as f:
    lines = f.readlines()
    new_lines = []
    for line in lines:
        if "HT" in line:
            new_line = line.replace(' 0 ', '2002')
            new_lines.append(new_line)
        else:
            new_lines.append(line)
    content = ''.join(new_lines)
    print(content)
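
For the two limitations described above (the hard-coded file name and the missing output file), one possible approach is sketched below. It is only a sketch under stated assumptions: the function names fix_line and convert_file are hypothetical, the input and output names are taken from the command line, and column six is assumed to be the column of zeros. It avoids anything specific to Python 3, so it should also run on 2.7.

```python
#!/usr/bin/python
import re
import sys

# Value written into column six, keyed by the label found in column two.
NEW_VALUES = {'HT': 2002, 'OT': 2001}


def fix_line(line):
    """Return the line with column six replaced when column two is HT or OT."""
    # Each part is one field together with its leading spaces,
    # so re-joining the parts keeps every column where it was.
    parts = re.findall(r'\s*\S+', line)
    if len(parts) > 5 and parts[1].strip() in NEW_VALUES:
        width = len(parts[5])
        parts[5] = str(NEW_VALUES[parts[1].strip()]).rjust(width)
    return ''.join(parts) + '\n'


def convert_file(in_name, out_name):
    """Read in_name line by line and write the converted lines to out_name."""
    with open(in_name) as src, open(out_name, 'w') as dst:
        for line in src:
            dst.write(fix_line(line))


if __name__ == '__main__':
    if len(sys.argv) == 3:
        convert_file(sys.argv[1], sys.argv[2])
    else:
        sys.stderr.write('usage: script.py input_file output_file\n')
```

It would be called as, for example, python script.py file.txt new_file.xyz, which leaves file.txt untouched.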

I have been able to do some of the work in Perl and was hoping to have a single script that would carry out all of the replacement steps in order, since every HT starts out as HW and every OT starts out as OW. Perl script:

#!/usr/bin/perl

use strict;
use warnings;

$^I = '.bak'; # create a backup copy 

while (<>) {
   s/HW/HT/g; # do the replacement
   s/OW/OT/g; # do a second replacement
   print; # print to the modified file
}

Thanks for your help.
Oh, and I am unfortunately limited to Python 2.7, since someone suggested code for Python 3.0. I am purely a user of a university cluster, but I will ask about upgrading Python.

Update

So what you really want to do is to change all HW to HT and OW to OT in the second column, and change column six to 2001 for OW and 2002 for HW?

That looks like this

use strict;
use warnings 'all';

while ( <DATA> ) {

    my @parts = /\s*\S+/g;

    if ( $parts[1] =~ s/([HO])W/$1T/ ) {

        $parts[5] = sprintf '%*d',
                length $parts[5],
                $1 eq 'H' ? 2002 : 2001;
    }

    print @parts, "\n";
}


__DATA__
 6  HW     4.092000    4.750000   -0.502000     0     5     7
 7  HW     5.367000    5.548000   -0.325000     0     5     6
 8  OW    -5.470000    5.461000    1.463000     0     9    10
 9  HW    -5.167000    4.571000    1.284000     0     8    10
10  HW    -4.726000    6.018000    1.235000     0     8     9
11  OW    -4.865000   -5.029000   -3.915000     0    12    13
12  HW    -4.758000   -4.129000   -3.608000     0    11    13

output

 6  HT     4.092000    4.750000   -0.502000  2002     5     7
 7  HT     5.367000    5.548000   -0.325000  2002     5     6
 8  OT    -5.470000    5.461000    1.463000  2001     9    10
 9  HT    -5.167000    4.571000    1.284000  2002     8    10
10  HT    -4.726000    6.018000    1.235000  2002     8     9
11  OT    -4.865000   -5.029000   -3.915000  2001    12    13
12  HT    -4.758000   -4.129000   -3.608000  2002    11    13
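
Since the question mentions being limited to Python 2.7, here is a rough Python counterpart of the Perl update above. It is a sketch, not a definitive port: the names RULES and convert_line are made up for illustration. Note that the decision is taken on the original HW/OW label. If every HW has already been rewritten to HT earlier in the loop, a later test for HW can never succeed.

```python
import re
import sys

# Original column-two label -> (new label, new value for column six).
RULES = {'HW': ('HT', 2002), 'OW': ('OT', 2001)}


def convert_line(line):
    """Rename HW/OW in column two and set column six, keeping field widths."""
    # Each part is one field together with its leading spaces.
    parts = re.findall(r'\s*\S+', line)
    if len(parts) > 5:
        rule = RULES.get(parts[1].strip())
        if rule is not None:
            label, value = rule
            # Right-justify inside the original width so columns stay aligned.
            parts[1] = label.rjust(len(parts[1]))
            parts[5] = str(value).rjust(len(parts[5]))
    return ''.join(parts) + '\n'


sample = [
    ' 6  HW     4.092000    4.750000   -0.502000     0     5     7\n',
    ' 8  OW    -5.470000    5.461000    1.463000     0     9    10\n',
]
for row in sample:
    sys.stdout.write(convert_line(row))
```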



In case it is important, this solution takes care to keep the position of every value constant within each line.

The lines to be modified are selected by checking whether the second field contains the string HT or OT. I don't know whether that is adequate, given the small data sample that you offer.

This is for demonstration purposes. I trust you are able to modify the code to open an external file if necessary and read the data from a file handle other than DATA.

use strict;
use warnings 'all';

while ( <DATA> ) {

    my @parts = /\s*\S+/g;

    if ( $parts[1] =~ /[HO]T/ ) {

        $parts[5] = sprintf '%*d',
                length $parts[5],
                $parts[1] =~ /HT/ ? 2002 : 2001;
    }

    print @parts, "\n";
}


__DATA__
 6  HT     4.092000    4.750000   -0.502000     0     5     7
 7  HT     5.367000    5.548000   -0.325000     0     5     6
 8  OT    -5.470000    5.461000    1.463000     0     9    10
 9  HT    -5.167000    4.571000    1.284000     0     8    10
10  HT    -4.726000    6.018000    1.235000     0     8     9
11  OT    -4.865000   -5.029000   -3.915000     0    12    13
12  HT    -4.758000   -4.129000   -3.608000     0    11    13

output

 6  HT     4.092000    4.750000   -0.502000  2002     5     7
 7  HT     5.367000    5.548000   -0.325000  2002     5     6
 8  OT    -5.470000    5.461000    1.463000  2001     9    10
 9  HT    -5.167000    4.571000    1.284000  2002     8    10
10  HT    -4.726000    6.018000    1.235000  2002     8     9
11  OT    -4.865000   -5.029000   -3.915000  2001    12    13
12  HT    -4.758000   -4.129000   -3.608000  2002    11    13

It looks like the file uses fixed-width fields, so you can edit the columns in place by character offset with substr:

sub trim { $_[0] =~ s/^\s+//r =~ s/\s+\z//r }

while (<>) {
   my $code = trim(substr($_, 2, 4));
   if ($code eq "HW") {
      substr($_,  2, 4, "  HT");
      substr($_, 43, 6, "  2002");
   }
   elsif ($code eq "OW") {
      substr($_,  2, 4, "  OT");
      substr($_, 43, 6, "  2001");
   }

   print;
}
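
The same fixed-width idea can be sketched in Python 2.7 with string slicing. The offsets (2 to 6 for the label, 43 to 49 for the sixth column) are copied from the substr calls above and are only valid if every line follows the layout of the sample. The function name convert_fixed is hypothetical.

```python
def convert_fixed(line):
    """Edit the label and the sixth column by character offset."""
    # line[2:6] covers the same four characters as substr($_, 2, 4).
    code = line[2:6].strip()
    if code == 'HW':
        return line[:2] + '  HT' + line[6:43] + '  2002' + line[49:]
    if code == 'OW':
        return line[:2] + '  OT' + line[6:43] + '  2001' + line[49:]
    # Any other label: leave the line exactly as it was.
    return line
```

Slicing never touches the fields in between, so the spacing of the untouched columns is preserved by construction.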

Cleaner:

sub parse {
   my ( @format, @row );
   while ($_[0] =~ /\G\s*(\S+)/g) {
      push @row, $1;
      push @format, '%'.( $+[0] - $-[0] ).'s';
   }
   return ( join('', @format)."\n", @row );
}

while (<>) {
   my ($format, @row) = parse($_);

   if    ($row[1] eq "HW") { $row[1] = "HT";  $row[5] = 2002; }
   elsif ($row[1] eq "OW") { $row[1] = "OT";  $row[5] = 2001; }

   printf($format, @row);
}
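
For comparison, a Python 2.7 sketch of the same parse-then-reformat idea follows. The function names parse, render and convert are hypothetical. As in the Perl version, each field's width is measured together with its leading spaces, and the new values are right-justified into that width.

```python
import re


def parse(line):
    """Return (widths, fields); each width includes the leading spaces."""
    widths, fields = [], []
    for match in re.finditer(r'\s*(\S+)', line):
        widths.append(match.end() - match.start())
        fields.append(match.group(1))
    return widths, fields


def render(widths, fields):
    """Right-justify every field into its original width."""
    return ''.join(str(f).rjust(w) for w, f in zip(widths, fields)) + '\n'


def convert(line):
    widths, row = parse(line)
    if len(row) > 5:
        if row[1] == 'HW':
            row[1], row[5] = 'HT', 2002
        elif row[1] == 'OW':
            row[1], row[5] = 'OT', 2001
    return render(widths, row)
```

Separating the widths from the values means the editing step works on clean strings such as 'HW' rather than on padded ones.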

It seems you want to use a regular expression to perform the string substitution. IMO, you should do all of your operations in a single substitution: it is no more complicated, it is probably faster, and it is less error-prone because it is shorter.

Here is how I understood your requirement: each line has an H or an O followed by a T or a W, which you want to force to T; then three fields to copy unchanged; then a fourth field. If that fourth field is 0, you want to replace it with 2002 or 2001 according to whether the letter is H or O.

This gives:

while (my $line = <>) {
    $line =~ s/(\s*)([HO])(T|W)(\s+\S+\s+\S+\s+\S+)(\s+\d+)/$1.$2.'T'.$4.($5 == 0 ? ($2 eq 'H' ? '  2002' : '  2001') : $5)/eg;
    print $line;
}
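
A possible Python 2.7 equivalent of this single substitution uses re.sub with a replacement function in place of Perl's /e modifier. This is a sketch: the names PATTERN, _replace and convert are hypothetical, and unlike the Perl one-liner the new value is padded to the width of the field it replaces instead of using a fixed two-space prefix.

```python
import re

# Label (H or O followed by T or W), three fields to copy, then the
# integer field that may need replacing.
PATTERN = re.compile(r'(\s*)([HO])[TW](\s+\S+\s+\S+\s+\S+)(\s+\d+)')


def _replace(match):
    lead, atom, middle, col6 = match.groups()
    # int() ignores the leading spaces captured with the digits.
    if int(col6) == 0:
        col6 = ('2002' if atom == 'H' else '2001').rjust(len(col6))
    return lead + atom + 'T' + middle + col6


def convert(line):
    """Force the label to end in T and fill in a zero sixth column."""
    return PATTERN.sub(_replace, line)
```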

