I'm trying to write a script that takes a string with accented characters and returns their unaccented counterparts.
I managed to put together something that almost works after looking at some tutorials, but I have a problem.
My code does what I want on a string literal, but it does absolutely nothing to input read from <STDIN>.
My code:
use strict;
use warnings;
my %replace = (
'é' => "e",
'á' => "a",
'ő' => "o",
'ö' => "o",
'ó' => "o",
'ú' => "u",
'ü' => "u",
'ű' => "u",
'í' => "i",
);
my $regex = join "|", keys %replace;
$regex = qr/$regex/;
my $s = <STDIN>;
$s = substr $s, 0, length($s) - 1;
my $var = "$s - öüóőúéáű";
$var =~ s/($regex)/$replace{$1}/g;
$s = $var;
print "$s\n";
If I type "öüóőúéáű" into <STDIN>, I get the following output:
öüóőúéáű - ouooueau
Could someone tell me what I'm doing wrong?
EDIT:
I checked, and when I read from <DATA> instead of <STDIN>, it works properly:
use strict;
use warnings;
my %replace = (
'é' => "e",
'á' => "a",
'ő' => "o",
'ö' => "o",
'ó' => "o",
'ú' => "u",
'ü' => "u",
'ű' => "u",
'í' => "i",
);
my $regex = join "|", keys %replace;
$regex = qr/$regex/;
my $s = <DATA>;
$s = substr $s, 0, length($s) - 1;
my $var = "$s - öüóőúéáű";
$var =~ s/($regex)/$replace{$1}/g;
$s = $var;
print "$s\n";
__DATA__
öüóőúéáű
EDIT2:
I have now changed the read to my $s = <DATA>." - ".<>; so that it takes the characters from <DATA> as well as from <STDIN>.
This showed me that the substitution still matches the <DATA> text and does nothing to the <STDIN> text, so I get the following output:
uaeuoouoi - űáéúőóüöí - uaeuoouoi
from the following code:
use strict;
use warnings;
use utf8;
my %replace = (
'é' => "e",
'á' => "a",
'ő' => "o",
'ö' => "o",
'ó' => "o",
'ú' => "u",
'ü' => "u",
'ű' => "u",
'í' => "i",
);
my $regex = join "|", keys %replace;
$regex = qr/$regex/;
my $s = <DATA>." - ".<>;
$s = substr $s, 0, length($s) - 1;
my $var = "$s - űáéúőóüöí";
$var =~ s/($regex)/$replace{$1}/g;
$s = $var;
print "$s\n";
__DATA__
űáéúőóüöí
where the <STDIN> input is űáéúőóüöí
In my case, your program gives the expected result:
use strict;
use warnings;
my %replace = (
'é' => "e",
'á' => "a",
'ő' => "o",
'ö' => "o",
'ó' => "o",
'ú' => "u",
'ü' => "u",
'ű' => "u",
'í' => "i",
);
my $regex = join "|", keys %replace;
$regex = qr/$regex/;
my $s = <DATA>;
$s = substr $s, 0, length($s) - 1;
my $var = "$s - öüóőúéáű";
$var =~ s/($regex)/$replace{$1}/g;
$s = $var;
print "$s\n";
__DATA__
öüóőúéáű
Where I get:
$ perl test.pl
ouooueau - ouooueau
So the problem is elsewhere, most likely an encoding issue.
You can try adding this line to your program:
use utf8;
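Note that use utf8 only tells Perl that the source file itself is UTF-8; it does not decode what arrives on <STDIN>. That would explain why the literal and the <DATA> text match while the <STDIN> text does not. A minimal sketch of one way to decode the input as well, assuming the terminal really sends UTF-8 (on a console with another code page the encoding name would have to be changed to match). The helper name unaccent is made up for this example:

```perl
use strict;
use warnings;
use utf8;                               # the literals below are UTF-8 text
use open qw(:std :encoding(UTF-8));     # decode STDIN, encode STDOUT and STDERR

my %replace = (
    'é' => 'e', 'á' => 'a', 'í' => 'i',
    'ő' => 'o', 'ö' => 'o', 'ó' => 'o',
    'ú' => 'u', 'ü' => 'u', 'ű' => 'u',
);

# Every key is a single character, so a character class is enough.
my $class = join '', keys %replace;
my $regex = qr/([$class])/;

# Hypothetical helper, not part of the original post.
sub unaccent {
    my ($text) = @_;
    $text =~ s/$regex/$replace{$1}/g;   # /g handles every accented character
    return $text;
}

# Read decoded lines from STDIN and print them without accents.
while (my $line = <STDIN>) {
    chomp $line;
    print unaccent($line), "\n";
}
```

With both pragmas in place the hash keys and the input are character strings, so the same regex applies to either.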
Also, you can simplify your program like this:
use strict;
use warnings;
my %replace = (
'é' => "e",
'á' => "a",
'ő' => "o",
'ö' => "o",
'ó' => "o",
'ú' => "u",
'ü' => "u",
'ű' => "u",
'í' => "i",
);
while (<DATA>) {
    for my $key (keys %replace) {
        # /g replaces every occurrence of the key, not just the first one
        s/$key/$replace{$key}/g;
    }
    print;
}
__DATA__
öüóőúéáű
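If maintaining the lookup table by hand becomes tedious, another option is to let Unicode do the work: decompose each character and drop the combining marks. This is a sketch of a different technique from the one in the question, using the core Unicode::Normalize module; the helper name strip_marks is made up for this example, and the input is assumed to be already decoded text:

```perl
use strict;
use warnings;
use utf8;                               # the literal below is UTF-8 text
use open qw(:std :encoding(UTF-8));     # encode output as UTF-8
use Unicode::Normalize qw(NFD);

# Hypothetical helper, not part of the original post.
sub strip_marks {
    my ($text) = @_;
    my $decomposed = NFD($text);        # e.g. 'ő' becomes 'o' followed by U+030B
    $decomposed =~ s/\p{Mn}+//g;        # remove the combining (nonspacing) marks
    return $decomposed;
}

print strip_marks("öüóőúéáűí"), "\n";
```

This covers all the letters in the table above, but letters that have no decomposition (for example ø or ł) pass through unchanged, so a lookup table may still be needed for those.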