How can I change, in a given file, a URL's ending ".pl" to ".en" and a penultimate ".com" to ".org"?
For example, http://www.addres.pl should change to: http://www.addres.en
If the address looks like http://www.addres.com.pl, it should change to: http://www.addres.org.en
If it looks like http://www.addres.com.ru, only .com should change: http://www.addres.org.ru
Example of text file input:
http://www.addres.org.en
http://www.addres.com.pl
http://www.addre.pl
http://www.addres.en
http://www.addres.ru
http://com ddd http://www.com.pl.com.pl.com.pl.com.pl
aaa http://www.addres.com.pl! bbb
ccc (http://www.addre.pl) ddd
Example of console output:
http://www.addres.org.en
http://www.addres.org.en
http://www.addre.en
http://www.addres.en
http://www.addres.ru
http://com ddd http://www.com.pl.com.pl.com.pl.org.en
aaa http://www.addres.org.en! bbb
ccc (http://www.addre.en) ddd
For now I have this, which checks whether the argument is a file and then tries the substitutions:
#!/usr/bin/perl
use warnings;
use strict;
use File::Find;
if (($#ARGV+1 != 1 )||(! -f $ARGV[0]))
{
print "podaj plik\n";  # Polish for "provide a file"
exit 1;
}
#!/usr/local/bin/perl
open (MYFILE, $ARGV[0]);
while (<MYFILE>) {
chomp;
my $url = $_;
for ($url) {
#s|(com)(.??)|org$2| and last;
s|com.pl|org.en| and last;
s|com[.]|org.| and last;
s|[.]pl|.en|;
}
print "$url\n";
}
close (MYFILE);
exit 0;
How can I make this
s|com[.]ru|org.ru| and last;
apply to all such addresses, something like this
s|com[.]??|org.??| and last;
where ?? can be, for example, ru or en, or anything other than pl?
Quick and dirty:
use strict;
while (<>) {
s|com[.]pl\b|org.en| or
s|[.]pl\b|.en| or
s|com[.]ru\b|org.ru|;
print;
}
Pay attention to the order of the regexes, and call it from the command line: perl script.pl in.txt
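To see why the order matters, here is a small sketch that applies the same two substitutions in both orders to one address. The sub names specific_first and general_first are made up for this illustration.

```perl
use strict;
use warnings;

# Specific rule first: "com.pl" is rewritten as a whole.
sub specific_first {
    my ($line) = @_;
    $line =~ s|com[.]pl\b|org.en| or $line =~ s|[.]pl\b|.en|;
    return $line;
}

# General rule first: ".pl" is rewritten, the "or" chain stops,
# and ".com" is never touched.
sub general_first {
    my ($line) = @_;
    $line =~ s|[.]pl\b|.en| or $line =~ s|com[.]pl\b|org.en|;
    return $line;
}

print specific_first("http://www.addres.com.pl"), "\n";  # http://www.addres.org.en
print general_first("http://www.addres.com.pl"), "\n";   # http://www.addres.com.en
```

Because the substitutions are chained with "or", only the first one that matches is applied, so the most specific pattern has to come first.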
Then learn the proper three-argument way to open files, using lexical variables for filehandles. This prevents global filehandles with names as common as MYFILE from clobbering one another, and the file gets closed when the lexical variable goes out of scope.
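As a rough sketch of what that could look like, assuming the same substitutions as in the quick-and-dirty version (the helper name rewrite_file is hypothetical, not something from your script):

```perl
use strict;
use warnings;

# Hypothetical helper: apply the substitutions to every line of $path
# and return the rewritten lines.
sub rewrite_file {
    my ($path) = @_;

    # Three-argument open: the mode '<' is separate from the file name,
    # and $fh is a lexical handle instead of a global like MYFILE.
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";

    my @out;
    while (my $line = <$fh>) {
        $line =~ s|com[.]pl\b|org.en|
            or $line =~ s|[.]pl\b|.en|
            or $line =~ s|com[.]ru\b|org.ru|;
        push @out, $line;
    }

    close($fh) or die "Cannot close '$path': $!";
    return @out;
}

# Only read a file when exactly one is given on the command line.
print rewrite_file($ARGV[0]) if @ARGV == 1;
```

The explicit close is optional here, since $fh would be closed anyway when it goes out of scope at the end of the sub, but it lets you report an error if closing fails.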
Added:
Looking at your new sample lines, I think you probably need something more like this (I included the regex you asked for at the end of your last edit):
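while (<>) {
    # In the replacement part, use $1 and $2 rather than \1 and \2.
    s|com[.]pl([\s!)])|org.en$1|
    or s|[.]pl([\s!)])|.en$1|
    # (?!pl\b) rejects "pl"; a class like [!pl] would only match one of "!", "p", "l".
    or s|com[.](?!pl\b)(\w+)([\s!)])|org.$1$2| ;
    print;
}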
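If you want something you can check against your sample lines, here is a minimal sketch of the same idea wrapped in a sub. The name rewrite_line and the $end pattern are my own assumptions: I treat whitespace, "!", ")" or the end of the string as the end of an address, and I use a lookahead so that character is not consumed.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite every address ending in one line of text.
sub rewrite_line {
    my ($line) = @_;

    # Assumed end of an address: whitespace, "!", ")" or end of string.
    my $end = qr/(?=[\s!)]|\z)/;

    $line =~ s/com[.]pl$end/org.en/g;             # .com.pl -> .org.en
    $line =~ s/[.]pl$end/.en/g;                   # remaining .pl -> .en
    $line =~ s/com[.](?!pl\b)(\w+)$end/org.$1/g;  # .com.xx -> .org.xx, xx not pl
    return $line;
}

print rewrite_line($_), "\n" for (
    "http://www.addres.com.pl",       # http://www.addres.org.en
    "http://www.addres.com.ru",       # http://www.addres.org.ru
    "ccc (http://www.addre.pl) ddd",  # ccc (http://www.addre.en) ddd
);
```

Here the three substitutions run one after another instead of being chained with "or", and each uses /g, so a line containing several addresses has all of them rewritten.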
For further advice, read my comments below.