After getting some help here is what I have come up with (I was hoping to learn by trying to put multiple scripts together). The script below will do the HW and OW replacements but does not run the if statement.
#!/usr/bin/perl
use strict;
use warnings 'all';
$^I = '.bak'; # create a backup copy
while (<>) {
    s/HW/HT/g; # do the replacement of HW with HT
    s/OW/OT/g; # do a second replacement OW with OT
    # Hopefully run the if statement
    my @parts = /\s*\S+/g;
    if ( $parts[1] =~ s/([HO])W/$1T/ ) {
        $parts[5] = sprintf '%*d',
            length $parts[5],
            $parts[1] =~ /HT/ ? 2002 : 2001;
    }
    print @parts, "\n";
}
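A likely reason the if statement never runs: by the time it is reached, the two global substitutions have already turned every HW and OW into HT and OT, so the later s/([HO])W/$1T/ has nothing left to replace and returns false. The same ordering problem can be sketched in Python (the sample line and its spacing are assumptions based on the data further down):

```python
import re

# Hypothetical input line; the exact column spacing is an assumption.
line = "    6  HW    4.092000    4.750000   -0.502000     0     5     7"

# Step 1: the global replacements run first, as in the script above.
line = line.replace("HW", "HT").replace("OW", "OT")

# Step 2: the later test looks for [HO]W in the second field,
# but that text no longer exists, so zero substitutions are made.
parts = re.findall(r"\s*\S+", line)
new_field, count = re.subn(r"([HO])W", r"\1T", parts[1])

print(parts[1].strip())  # HT
print(count)             # 0
```

Dropping the two global replacements, or testing for HT and OT instead of HW and OW, would let the condition succeed.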
I have left the rest of the post below in case people have similar problems.
I would like to use Perl to replace text in a file by searching for specific letters at the beginning of the string. For example here is a section of the file:
6 HT 4.092000 4.750000 -0.502000 0 5 7
7 HT 5.367000 5.548000 -0.325000 0 5 6
8 OT -5.470000 5.461000 1.463000 0 9 10
9 HT -5.167000 4.571000 1.284000 0 8 10
10 HT -4.726000 6.018000 1.235000 0 8 9
11 OT -4.865000 -5.029000 -3.915000 0 12 13
12 HT -4.758000 -4.129000 -3.608000 0 11 13
I would like to use HT as the search and be able to replace the "0" in the column of zeros with 2002. I know how to replace the entire column of zeros but I don't know how to make it line specific. After using HT as the search I need to then search OT and replace the 0 column with 2001.
Basically I need to search a string that identifies the line and replace a specific string of that line while the text that lies between is variable. The output needs to be printed to a new_file.xyz. Also I will be doing this repeatedly on lots of files. Thanks for your help.
Here is the Python code that I was using, but I could not figure out how to make "file.txt" a variable that accepts the file name typed after the command. As written, I have to change "file.txt" to the name of the file every time I use it. I also could not get it to print to a new file.
python code:
#!/usr/bin/python
with open('file.txt') as f:
    lines = f.readlines()
new_lines = []
for line in lines:
    if "HT" in line:
        new_line = line.replace(' 0 ', '2002')
        new_lines.append(new_line)
    else:
        new_lines.append(line)
content = ''.join(new_lines)
print(content)
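For the two things the Python version could not do (take the file name from the command line and write to a new file), a minimal sketch follows. It should run on Python 2.7 as well as Python 3. The names rewrite and convert are made up for this example, and it assumes whitespace-separated columns with the 0 in the sixth field. Note that it rejoins fields with single spaces, which may not suit a column-sensitive format.

```python
import sys


def rewrite(lines):
    """Set the sixth field to 2002 on HT lines and 2001 on OT lines."""
    out = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 6 and fields[1] in ("HT", "OT"):
            fields[5] = "2002" if fields[1] == "HT" else "2001"
            # Rejoining with single spaces discards the original column widths.
            line = " ".join(fields) + "\n"
        out.append(line)
    return out


def convert(in_name, out_name):
    """Read in_name, rewrite its lines, and write the result to out_name."""
    with open(in_name) as src:
        lines = src.readlines()
    with open(out_name, "w") as dst:
        dst.writelines(rewrite(lines))


if __name__ == "__main__":
    # sys.argv[1] and sys.argv[2] are the names typed after the command.
    if len(sys.argv) == 3:
        convert(sys.argv[1], sys.argv[2])
    else:
        sys.stderr.write("usage: python retag.py input_file output_file\n")
```

Saved under a hypothetical name such as retag.py, it would be run as: python retag.py file.txt new_file.xyz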
I have been able to do some of the work in Perl and was hoping to have a single script that would carry out all of the replace steps in sequential order, since all of the HT start out as HW and all the OT start out as OW.
Perl script:
#!/usr/bin/perl
use strict;
use warnings;
$^I = '.bak'; # create a backup copy
while (<>) {
    s/HW/HT/g; # do the replacement
    s/OW/OT/g; # do a second replacement
    print; # print to the modified file
}
Thanks for your help.
Oh and I am unfortunately limited to Python 2.7 as someone suggested code for python 3.0. I am purely a user of a university cluster but will ask about upgrading python.
So what you really want to do is to change all HW to HT and OW to OT in the second column, and change column six to 2001 for OW and 2002 for HW?
That looks like this
use strict;
use warnings 'all';
while ( <DATA> ) {
    my @parts = /\s*\S+/g;
    if ( $parts[1] =~ s/([HO])W/$1T/ ) {
        $parts[5] = sprintf '%*d',
            length $parts[5],
            $1 eq 'H' ? 2002 : 2001;
    }
    print @parts, "\n";
}
__DATA__
6 HW 4.092000 4.750000 -0.502000 0 5 7
7 HW 5.367000 5.548000 -0.325000 0 5 6
8 OW -5.470000 5.461000 1.463000 0 9 10
9 HW -5.167000 4.571000 1.284000 0 8 10
10 HW -4.726000 6.018000 1.235000 0 8 9
11 OW -4.865000 -5.029000 -3.915000 0 12 13
12 HW -4.758000 -4.129000 -3.608000 0 11 13
Output:
6 HT 4.092000 4.750000 -0.502000 2002 5 7
7 HT 5.367000 5.548000 -0.325000 2002 5 6
8 OT -5.470000 5.461000 1.463000 2001 9 10
9 HT -5.167000 4.571000 1.284000 2002 8 10
10 HT -4.726000 6.018000 1.235000 2002 8 9
11 OT -4.865000 -5.029000 -3.915000 2001 12 13
12 HT -4.758000 -4.129000 -3.608000 2002 11 13
In case it is important, this solution takes care to keep the positions of all the values constant within each line.
The lines to be modified are selected by checking whether the second field contains the string HT or OT. I don't know if that is adequate given the small data sample that you offer.
This is for demonstration purposes. I trust you are able to modify the code to open an external file if necessary and read the data from a file handle other than DATA.
use strict;
use warnings 'all';
while ( <DATA> ) {
    my @parts = /\s*\S+/g;
    if ( $parts[1] =~ /[HO]T/ ) {
        $parts[5] = sprintf '%*d',
            length $parts[5],
            $parts[1] =~ /HT/ ? 2002 : 2001;
    }
    print @parts, "\n";
}
__DATA__
6 HT 4.092000 4.750000 -0.502000 0 5 7
7 HT 5.367000 5.548000 -0.325000 0 5 6
8 OT -5.470000 5.461000 1.463000 0 9 10
9 HT -5.167000 4.571000 1.284000 0 8 10
10 HT -4.726000 6.018000 1.235000 0 8 9
11 OT -4.865000 -5.029000 -3.915000 0 12 13
12 HT -4.758000 -4.129000 -3.608000 0 11 13
Output:
6 HT 4.092000 4.750000 -0.502000 2002 5 7
7 HT 5.367000 5.548000 -0.325000 2002 5 6
8 OT -5.470000 5.461000 1.463000 2001 9 10
9 HT -5.167000 4.571000 1.284000 2002 8 10
10 HT -4.726000 6.018000 1.235000 2002 8 9
11 OT -4.865000 -5.029000 -3.915000 2001 12 13
12 HT -4.758000 -4.129000 -3.608000 2002 11 13
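Since the question mentions Python 2.7, here is a rough Python equivalent of the position-preserving idea used in the Perl above. Each token is captured together with its leading whitespace, and the new value is right-justified in the width of the token it replaces. The function name retag and the sample spacing are assumptions for illustration.

```python
import re


def retag(line):
    """Change HW/OW to HT/OT in field 2 and set field 6, keeping column positions."""
    # Each token keeps its leading whitespace, so joining the tokens restores the layout.
    parts = re.findall(r"\s*\S+", line)
    if len(parts) >= 6:
        match = re.search(r"([HO])W", parts[1])
        if match:
            parts[1] = parts[1].replace(match.group(0), match.group(1) + "T", 1)
            value = 2002 if match.group(1) == "H" else 2001
            # Right-justify in the old token's width, like Perl's sprintf '%*d'.
            # If that width is under four characters the field grows.
            parts[5] = "%*d" % (len(parts[5]), value)
    # Trailing whitespace (including the newline) is not captured, so the caller adds it back.
    return "".join(parts)


# Hypothetical line; the exact spacing of the real file is not known.
sample = "    6  HW    4.092000    4.750000   -0.502000     0     5     7"
print(retag(sample))
```

With the sample above, the sixth token is six characters wide, so the replacement is padded to six characters and every later column stays where it was.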
It looks like it uses fixed-width fields, so
sub trim { $_[0] =~ s/^\s+//r =~ s/\s+\z//r }
while (<>) {
    my $code = trim(substr($_, 2, 4));
    if ($code eq "HW") {
        substr($_, 2, 4, "  HT");    # padded to the 4-character field
        substr($_, 43, 6, "  2002"); # padded to the 6-character field
    }
    elsif ($code eq "OW") {
        substr($_, 2, 4, "  OT");
        substr($_, 43, 6, "  2001");
    }
    print;
}
Cleaner:
sub parse {
    my ( @format, @row );
    while ($_[0] =~ /\G\s*(\S+)/g) {
        push @row, $1;
        push @format, '%'.( $+[0] - $-[0] ).'s';
    }
    return ( join('', @format)."\n", @row );
}
while (<>) {
    my ($format, @row) = parse($_);
    if ($row[1] eq "HW") { $row[1] = "HT"; $row[5] = 2002; }
    elsif ($row[1] eq "OW") { $row[1] = "OT"; $row[5] = 2001; }
    printf($format, @row);
}
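The same idea (record each field's width while parsing, then print the edited values back into those widths) might look like this in Python. This is a sketch, not a line-for-line port; parse, render and retag_by_format are names invented here.

```python
import re


def parse(line):
    """Return (widths, row): each field's width including leading spaces, and its value."""
    widths, row = [], []
    for m in re.finditer(r"\s*(\S+)", line):
        row.append(m.group(1))
        widths.append(m.end() - m.start())
    return widths, row


def render(widths, row):
    """Right-justify every value in the width recorded for its field."""
    return "".join(value.rjust(width) for width, value in zip(widths, row))


def retag_by_format(line):
    widths, row = parse(line)
    if len(row) >= 6:
        if row[1] == "HW":
            row[1], row[5] = "HT", "2002"
        elif row[1] == "OW":
            row[1], row[5] = "OT", "2001"
    return render(widths, row)


# Hypothetical line; the exact spacing of the real file is not known.
sample = "    6  HW    4.092000    4.750000   -0.502000     0     5     7"
print(retag_by_format(sample))
```

Separating values from widths keeps the comparison simple (row[1] == "HW" with no whitespace to worry about), at the cost of one extra pass to rebuild the line.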
It seems you want to use a regular expression to perform string substitution. IMO, you should do all your operations in a single substitution: it is no more complicated, it is probably faster, and it is less error-prone because it is shorter.
Here is how I have understood your requirement: in your lines, you have an H or an O followed by a T or a W that you want to force to T, then 3 fields you want to copy, then a 4th field. If the 4th field is 0, you want to replace it with 2002 or 2001 according to the letter H or O.
This gives:
while (my $line = <>) {
    $line =~ s/(\s*)([HO])(T|W)(\s+\S+\s+\S+\s+\S+)(\s+\d+)/$1.$2.'T'.$4.($5 == 0 ? ($2 eq 'H' ? ' 2002' : ' 2001') : $5)/eg;
    print $line;
}
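For comparison, the single-substitution approach can be sketched in Python with re.sub and a replacement function, which plays the role of the /e modifier. The names PATTERN, _replace and retag_once are invented for this sketch. As in the Perl, the replacement is a single space plus the number, so column widths are not preserved.

```python
import re

# Groups: leading space, H or O, T or W, three copied fields, then the flag field.
PATTERN = re.compile(r"(\s*)([HO])([TW])(\s+\S+\s+\S+\s+\S+)(\s+\d+)")


def _replace(m):
    lead, atom, _, coords, flag = m.groups()
    # int() tolerates the leading whitespace captured with the flag.
    if int(flag) == 0:
        flag = " 2002" if atom == "H" else " 2001"
    return lead + atom + "T" + coords + flag


def retag_once(line):
    """Force the second letter to T and fill in a zero flag, in one substitution."""
    return PATTERN.sub(_replace, line)


print(retag_once("6 HW 4.092000 4.750000 -0.502000 0 5 7"))
print(retag_once("8 OW -5.470000 5.461000 1.463000 0 9 10"))
```

A flag that is already non-zero is left untouched, which makes it safe to run the function on a file that has been partly converted.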