How to use UTF-8 in CGI scripts?

I am trying to use UTF-8 characters in CGI scripts.

I am using the following header for the CGI script:

#! /usr/bin/perl
#

use utf8;

use open ':std' => ':encoding(UTF-8)';

use CGI '-utf8';

my $q      = CGI->new();
my %params = $q->Vars;

print $q->header( -type => "text/html", -charset => "UTF-8" );
print $q->start_html( -encoding => "UTF-8" );

The issue is that whenever I print something to standard output, the browser shows:

st\xE1n

instead of

stán

Any ideas what's wrong?

With use CGI '-utf8'; you tell CGI.pm that its inputs (the request parameters) are encoded using UTF-8.

The warning utf8 "\xE1" does not map to Unicode means your input wasn't encoded using UTF-8.

The script doesn't output stán because stán wasn't provided to the script.
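To see why the byte \xE1 is rejected, here is a minimal sketch that checks whether a byte string is well-formed UTF-8. The helper name is_valid_utf8 and the two sample strings are assumptions made for illustration; they are not part of the original script.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns 1 if $bytes is a well-formed UTF-8
# byte sequence, 0 otherwise. FB_CROAK makes decode() die on
# malformed input, which the eval turns into a false result.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

my $latin1_bytes = "st\xE1n";        # "stán" encoded as ISO-8859-1
my $utf8_bytes   = "st\xC3\xA1n";    # "stán" encoded as UTF-8

printf "latin1 bytes valid UTF-8? %s\n", is_valid_utf8($latin1_bytes) ? "yes" : "no";
printf "utf8 bytes valid UTF-8?   %s\n", is_valid_utf8($utf8_bytes)   ? "yes" : "no";
```

In UTF-8, á is the two-byte sequence \xC3\xA1. A lone \xE1 is how ISO-8859-1 (Latin-1) encodes it, which suggests the form or client sending the data is not using UTF-8.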

As @ikegami mentioned, your input does not look like UTF-8.

In general, to make your CGI output valid UTF-8, you should do two things:

  1. Make sure your browser will understand that you're giving UTF-8 to it. You already did that.

  2. Make sure the values you print are decoded Perl character strings, not raw bytes. This is the part that causes the most problems. For example, a value read from a database, a file, or another external source usually arrives as UTF-8 encoded bytes and has to be decoded before you print it through a UTF-8 output layer. (CGI parameters are already decoded when you load CGI with '-utf8'.) In most cases this means explicitly running utf8::decode on the scalar. For example, if $stan is the variable holding the value you print, put the following line before printing it:

utf8::decode($stan);

The use utf8; directive tells Perl that the script's source code is encoded in UTF-8. String literals in the script are therefore decoded for you, and you don't need to call utf8::decode on them. But if your stán comes from an external source such as a database, you still need to utf8::decode it.
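The decode step described in this answer can be sketched as follows. This is only an illustration under stated assumptions: the helper name from_external and the sample bytes are hypothetical, and Encode::decode is used in place of utf8::decode because it returns the decoded string instead of modifying its argument in place.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn raw UTF-8 bytes from an external source
# (database, file, socket) into a Perl character string.
sub from_external {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes);
}

my $raw  = "st\xC3\xA1n";          # bytes as they might arrive from a database
my $stan = from_external($raw);    # characters: s, t, U+00E1, n

printf "bytes: %d, characters: %d\n", length($raw), length($stan);

# Encode once on the way out. In the CGI script from the question,
# use open ':std' => ':encoding(UTF-8)' does this for STDOUT instead.
my $out = encode('UTF-8', $stan);
print $out eq $raw ? "round trip ok\n" : "round trip failed\n";
```

The point of the sketch is to decode exactly once at the input boundary and encode exactly once at the output boundary. Decoding a string that has already been decoded can corrupt it or fail.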
