
How to save a ® (registered trademark) in Perl ASP SQLite database

I cannot isolate this problem. I have an HTML form text input that is being saved into an SQLite database via Perl ASP. It happens whether I save the form data ® as is, or replace the character first using:

    $registered = chr(174);
    $DESCRIPTION =~ s/$registered/R/g;

When the data is retrieved I get an extra character: Â® (or ÂR if I replace the trademark symbol with the code above). If I save it again I get î , and again ÃÂî . Where are the à's coming from??

Set the sqlite_unicode attribute to 1 in your connect:

$dbh = DBI->connect( "dbi:SQLite:dbname=foo", "", "", { sqlite_unicode => 1 } );

Once that is set, you may need to mark binary data explicitly as binary when binding it (the SQL_BLOB constant is imported with use DBI qw(:sql_types);):

$sth->bind_param(1, $binary_data, SQL_BLOB);

The string is probably still a sequence of UTF-8 encoded bytes when you are working with it. A registered trademark symbol in UTF-8 is two bytes (0xC2 0xAE), and you are only replacing one of them (0xAE, which is what chr(174) matches). See more information here for the encoding of that character.
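To make the byte counts concrete, here is a small sketch using only the core Encode module. The variable names are illustrative and not from the original post. It shows that the symbol encodes to two bytes, and that encoding those bytes a second time (which is what effectively happens when UTF-8 bytes are mistaken for Latin-1 characters) adds the extra bytes that display as Ã and Â. That is one plausible source of the growing garbage on each save.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Helper (hypothetical name): render each byte of a string as two hex digits.
sub hex_bytes {
    my ($s) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $s;
}

# U+00AE as a single Perl character.
my $char = chr(174);

# The same character as UTF-8 encoded bytes: two bytes, not one.
my $bytes     = encode('UTF-8', $char);
my $bytes_hex = hex_bytes($bytes);      # "C2 AE"

# Encoding the already-encoded bytes again treats 0xC2 and 0xAE as the
# characters U+00C2 and U+00AE and encodes each one separately.
my $double     = encode('UTF-8', $bytes);
my $double_hex = hex_bytes($double);    # "C3 82 C2 AE"

print "once:  $bytes_hex\n";
print "twice: $double_hex\n";
```

Decoded once, the doubly encoded value reads as the two characters U+00C2 (Â) and U+00AE (®).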

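If you want to replace the symbol with a regex while the string is still encoded bytes, match both bytes rather than the single character that chr(174) produces. You should be able to do this:

s/\xc2\xae/R/g;

\xNN matches the character (here, the byte) with the given hexadecimal value. Note that \x{c2ae} would not work for this: it matches the single code point U+C2AE, not the byte pair C2 AE. I obtained the hex encoding from the page linked above.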

For more information see "Escape Sequences" in perlre.

Also see the Encode core module for more information on how Perl handles character encodings.
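As a minimal sketch of the Encode approach, assuming the form value arrives as UTF-8 encoded bytes (the sample string and variable names below are made up for illustration): decode once at the input boundary, and after that chr(174) matches the whole symbol.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical raw form value: the UTF-8 bytes for "Acme(R) Widget"
# with a real registered trademark symbol in place of "(R)".
my $raw = "Acme\xC2\xAE Widget";

# Decode once so Perl sees characters instead of bytes.
my $description = decode('UTF-8', $raw);

# On the decoded string, U+00AE is a single character,
# so a one-character pattern replaces the whole symbol.
(my $replaced = $description) =~ s/\x{AE}/R/g;

print length($raw), " bytes, ", length($description), " characters\n";
print "$replaced\n";
```

When writing the value back out to a byte-oriented destination, the matching step is encode('UTF-8', ...) at the output boundary.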

Maybe this tour will shed some light on what you are hitting? I'm guessing chr2 is where your issue lies.

use strictures;  # CPAN module; "use strict; use warnings;" is a close core-only substitute
use utf8;
use DBI;

my $dbh = DBI->connect("dbi:SQLite::memory:", undef, undef,
                       { sqlite_unicode => 1,
                         PrintError => 1 } );

$dbh->do(<<"");
   CREATE TABLE moo (
    name TEXT
    ,string TEXT )

my $insert = $dbh->prepare("INSERT INTO moo VALUES ( ?, ? )");

my %reg = ( raw => "®", # note "use utf8;"
            "chr" => chr(174) );

while ( my ( $name, $reg ) = each %reg )
{
    $insert->execute($name, $reg);
}

# And a couple without placeholders (which we all know is EVIL, right?)
$dbh->do(<<"");
    INSERT INTO moo VALUES( "raw2", "®" )

my $reg = chr(174);
$dbh->do(<<"");
    INSERT INTO moo VALUES( "chr2", "$reg" )

my $sth = $dbh->prepare("SELECT * FROM moo");

$sth->execute;

binmode STDOUT, ":encoding(UTF-8)";
while ( my $row = $sth->fetchrow_hashref )
{
    print $row->{name}, " -> ", $row->{string}, $/;
}

__DATA__
chr -> ®
raw -> ®
raw2 -> ®
"\x{00ae}" does not map to utf8.
chr2 -> \xAE
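One plausible reading of the chr2 failure, offered as an assumption rather than something stated in the post: chr(174) yields a string that Perl stores internally as the single byte 0xAE, and interpolating it into the SQL text hands SQLite a byte that is not valid UTF-8. The sketch below uses only the built-in utf8:: functions to show the internal flag changing after utf8::upgrade while the string still compares equal to chr(174). Whether upgrading before interpolation cures the insert may depend on the DBD::SQLite version, so treat it as something to try rather than a confirmed fix.

```perl
use strict;
use warnings;

# chr(174) produces a one-character string without the internal UTF-8 flag.
my $reg = chr(174);
my $flag_before = utf8::is_utf8($reg) ? 1 : 0;

# utf8::upgrade changes only the internal representation;
# the string still holds the same single character.
utf8::upgrade($reg);
my $flag_after = utf8::is_utf8($reg) ? 1 : 0;

my $same_char = ( $reg eq chr(174) && length($reg) == 1 ) ? 1 : 0;

print "flag before: $flag_before, flag after: $flag_after, same character: $same_char\n";
```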

After taking a look at the characters actually in the string with:

foreach (split //, $DESCRIPTION) {
     $hold = ord($_);
     %>chr(<%= $hold %>)<br><%
}

I found that ® from the HTML form text input is being received as chr(194).chr(174), that is, as the two UTF-8 bytes 0xC2 0xAE. So:

    $registered = chr(194).chr(174);
    $DESCRIPTION =~ s/$registered/&#174;/g;

allows me to save it to the database without issue...
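The same idea can be generalized beyond this one symbol. The sketch below is not the code from the post: to_entities is a hypothetical helper that assumes the form value is UTF-8 encoded bytes, decodes it with the core Encode module, and then rewrites every non-ASCII character as a numeric HTML entity so that only ASCII reaches the database.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode UTF-8 form bytes, then replace each
# non-ASCII character with its numeric HTML entity.
sub to_entities {
    my ($raw_bytes) = @_;
    my $text = decode('UTF-8', $raw_bytes);
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# Made-up sample input containing the two bytes 0xC2 0xAE.
my $saved = to_entities("Acme\xC2\xAE Widget");

print "$saved\n";
```

The trade-off is that the stored text is HTML-specific, so it suits values that are only ever rendered back into a web page.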
