
How to use UTF-8 in a Perl cgi-bin script?

I have the following cgi-bin script:

 #! /usr/bin/perl
 #

 use utf8;

 use CGI;
 my $q = CGI->new();
 my %params = $q->Vars;

 print $q->header('text/html');

 $w = $params{"words"};

 print "$w\n";

I want to be able to call it as cgi-bin/script.pl?words=É for example, but when I do that, what's printed is not UTF-8, but instead garbled:

   É 

Is there any way to use cgi-bin with utf8?

Your line use utf8 only tells Perl that the source file itself is written in UTF-8; it has no effect on input or output. You must make sure that the output handles (STDOUT as well as any files you open) have a UTF-8 encoding layer. One easy way to do this is the utf8::all module. Also, make sure you are sending the correct headers (a Content-Type that includes charset=UTF-8), and use the -utf8 CGI pragma so that incoming parameters are decoded as UTF-8. Finally, as always, be sure to use strict and warnings.

The following should get you started:

#!/usr/bin/perl

use strict;
use warnings;

use utf8::all;
use CGI qw(-utf8);

my $q = CGI->new;
print $q->header("text/html;charset=UTF-8");
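# Assign to a scalar so param() is not called in list context,
# and fall back to an empty string when "words" is missing.
my $words = $q->param("words") // "";
print "$words\n";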

exit;
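
If utf8::all is not installed, the same two steps (decode incoming parameters, encode output) can be done by hand with core modules only. The following is a minimal sketch, not a replacement for CGI.pm: decode_query_value is a hypothetical helper introduced here for illustration, and the value '%C3%89' is simply the percent-encoded UTF-8 form of the character from the question.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Encode qw(decode);

# Hypothetical helper (not part of CGI.pm): percent-decode one
# query-string value, then decode the resulting UTF-8 bytes into
# Perl characters.
sub decode_query_value {
    my ($value) = @_;
    $value =~ tr/+/ /;                               # "+" means space
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX -> raw byte
    return decode('UTF-8', $value);                  # bytes -> characters
}

# Encode everything printed to STDOUT as UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');

my $word = decode_query_value('%C3%89');
print length($word), "\n";   # 1: a single character, U+00C9
print "$word\n";
```

In a real script the value would come from the request rather than a literal, and use CGI qw(-utf8) does the decoding step for you.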

You will save yourself a lot of pain in the long run if you start out by reading the perlunitut and perlunicode documentation pages. They will give you the basics on exactly what Unicode and character encodings are, and how to work with them in Perl.
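
As a small illustration of the bytes-versus-characters distinction those pages describe, here is a sketch using only the core Encode module. It assumes the two UTF-8 bytes for U+00C9 as input and shows one common way garbling arises: encoding a string that was never decoded.

```perl
use strict;
use warnings;

use Encode qw(decode encode);

# The two bytes that encode U+00C9 (E with acute) in UTF-8, as a
# script would see them after percent-decoding the query string.
my $bytes = "\xC3\x89";

# Decoding turns the byte string into a one-character string.
my $chars = decode('UTF-8', $bytes);

# Encoding the *undecoded* bytes treats each byte as its own
# character, which yields four bytes: double encoding.
my $double = encode('UTF-8', $bytes);

printf "bytes:  %d\n", length $bytes;    # 2
printf "chars:  %d\n", length $chars;    # 1
printf "double: %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $double;   # C3 83 C2 89
```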

Also, what you're asking for is more complex than you think. There are many layers hidden in the phrase "use cgi-bin with utf8", starting with your interface to whatever tool you're using to send requests to the web server and ending with that tool having parsed a response and presenting it to you. You need to understand all those layers well enough to at least be able to tell whether the problem lies in your CGI script or not. For example, a perfectly working script doesn't help if the real problem is that bash and curl don't agree on the encoding of your command-line arguments.
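
One way to take the terminal and shell out of the picture when testing is to build the query string from explicit code points and percent-encode it yourself, so the request contains only ASCII. A minimal sketch, assuming the parameter name words from the question; percent_encode_utf8 is a hypothetical helper, not a library function:

```perl
use strict;
use warnings;

use Encode qw(encode);

# Hypothetical helper: percent-encode every byte of the UTF-8 form
# of a string. Encoding ASCII bytes too is verbose but still valid.
sub percent_encode_utf8 {
    my ($text) = @_;
    return join '',
        map { sprintf '%%%02X', ord }
        split //, encode('UTF-8', $text);
}

# U+00C9 written as an escape, so the source file's own encoding
# does not matter either.
my $query = 'words=' . percent_encode_utf8("\x{C9}");
print "$query\n";   # words=%C3%89
```

If the script still prints garbage when called with this ASCII-only query string, the problem is in the script or the response headers rather than in the client.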

I have been having this problem of intermittent failure of utf8 encoding with my CGI script. I tried everything but couldn't reliably repeat the problem.

I finally discovered that it is absolutely critical to be consistent in your use of the -utf8 pragma in every module that uses CGI:

use CGI qw(-utf8);

What seems to happen is that mod_perl invokes the CGI module just once per request. If the CGI module is included inconsistently (say, in a utility module that only uses a redirect function, where you haven't bothered to set the -utf8 pragma), then that invocation can be the one that mod_perl decides to use to decode requests.
