How to connect to MySQL using UTF8 within a perl script?

In a nutshell:

Within a Perl script: how do I connect to MySQL in a way that lets me transmit the four-byte Unicode character U+1F61C ("😜") from the Perl script to a MySQL table where this character should be stored?

Using {mysql_enable_utf8 => 1} doesn't solve the problem.

In detail:

I have exactly the same problem as described in the question "ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x9C' for column 'comment' at row 1", and even with the same Unicode character (😜 = U+1F61C = FACE WITH STUCK-OUT TONGUE AND WINKING EYE), which produces the error message:

DBD::mysql::st execute failed: Incorrect string value: '\xF0\x9F\x98\x9C' for column ...

But I don't use PHP, I use Perl.

The accepted answer in the other question says:

  • Run MySQL 5.5 or later.
    I check the version:

     mysql> select version();
     +-------------------------+
     | version()               |
     +-------------------------+
     | 5.7.13-0ubuntu0.16.04.2 |
     +-------------------------+

    So it is 5.7, which is later than 5.5.
    ✅ checked

  • Set the table's character set to utf8mb4.
    I check the character set of my database, my table and even of the reported column:

     mysql> SELECT default_character_set_name FROM information_schema.SCHEMATA
         -> WHERE schema_name = "myDatabase";
     +----------------------------+
     | default_character_set_name |
     +----------------------------+
     | utf8mb4                    |
     +----------------------------+

     mysql> SELECT CCSA.character_set_name FROM information_schema.`TABLES` T,
         ->        information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA
         -> WHERE CCSA.collation_name = T.table_collation
         ->   AND T.table_schema = "myDatabase"
         ->   AND T.table_name = "myTable";
     +--------------------+
     | character_set_name |
     +--------------------+
     | utf8mb4            |
     +--------------------+

     mysql> SELECT character_set_name FROM information_schema.`COLUMNS`
         -> WHERE table_schema = "myDatabase"
         ->   AND table_name = "myTable"
         ->   AND column_name = "myColumn";
     +--------------------+
     | character_set_name |
     +--------------------+
     | utf8mb4            |
     +--------------------+

    So my database, my table and the reported column all use the character set utf8mb4.
    ✅ checked

  • Enable UTF8 on your MySQL connection.
    This seems to be the problem. The answer to the other question says:

    SET NAMES utf8, or use an option when connecting that similarly enables it.

I don't know how to SET NAMES utf8 within a Perl script, so I did it the way I have done it for the last few years. I think that this is "an option when connecting that similarly enables it".
It's at the end of the long line that begins with my $dbh = DBI->connect:

#!/usr/bin/perl -w
use strict;
use warnings;
use utf8;
use Encode;
use DBI;
binmode STDOUT, ":utf8";

#Here I connect using the parameter mysql_enable_utf8 (create database handle):
my $dbh = DBI->connect('DBI:mysql:database=myDatabase;host=localhost','aUser','aPassword',{mysql_enable_utf8 => 1});

#Prepare the statement (create statement handle):
my $sth = $dbh->prepare('INSERT INTO `myTable` (`myColumn`) VALUES(?);');

#This doesn't work: 
$sth->execute('😜');

#This doesn't work either: 
$sth->execute(encode_utf8('😜'));

#end processing:
$dbh->disconnect();
exit(0);

Both execute calls throw the same error (only the line number at the end changes):

DBD::mysql::st execute failed: Incorrect string value: '\xF0\x9F\x98\x9C' for column 'myColumn' at row 1 at myTestScript.pl line 16.

What am I doing wrong?
How can I do it better?

The problem is with the SET NAMES utf8 command. In MySQL, the utf8 character set is not true UTF-8: it supports characters of up to 3 bytes only, and the character in question has 4 bytes:

The utf8 character set in MySQL has these characteristics:

• No support for supplementary characters (BMP characters only).

• A maximum of three bytes per multibyte character.

The true UTF-8 is utf8mb4, which you already use as the character set of the column itself. So, use SET NAMES utf8mb4.
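
To make the three-byte limit concrete, here is a small sketch that counts how many bytes a code point occupies once encoded as UTF-8. It uses only the core Encode module; the helper name utf8_byte_length and the two BMP sample characters (U+00E9 and U+20AC) are illustrative choices, not part of the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Return the number of bytes a single code point occupies in UTF-8.
sub utf8_byte_length {
    my ($code_point) = @_;
    return length(encode('UTF-8', chr($code_point)));
}

# U+00E9 and U+20AC lie in the BMP; U+1F61C is a supplementary character.
for my $cp (0x00E9, 0x20AC, 0x1F61C) {
    my $bytes = utf8_byte_length($cp);
    printf "U+%04X needs %d byte(s): %s\n", $cp, $bytes,
        $bytes <= 3 ? 'fits MySQL utf8' : 'needs utf8mb4';
}
```

The two BMP characters come out at 2 and 3 bytes, while U+1F61C comes out at 4 bytes (the \xF0\x9F\x98\x9C from the error message), which is why a connection declared as utf8 rejects it.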

So from Perl you should use {mysql_enable_utf8mb4 => 1} instead of {mysql_enable_utf8 => 1}.
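
As a rough sketch of that change, the connection from the question could be wrapped like this. It assumes a DBD::mysql release recent enough to know the mysql_enable_utf8mb4 attribute; the helper names connect_utf8mb4 and session_charsets are made up for illustration, and RaiseError is an addition so that failures surface immediately.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Open a connection whose session character set is utf8mb4.
# (Hypothetical helper; the attribute name is the one from the answer.)
sub connect_utf8mb4 {
    my ($dsn, $user, $password) = @_;
    return DBI->connect($dsn, $user, $password, {
        RaiseError           => 1,   # die with the driver's message on failure
        mysql_enable_utf8mb4 => 1,   # instead of mysql_enable_utf8
    });
}

# Report which character sets the session ended up with, as a sanity check.
# (Hypothetical helper; the @@ variables are MySQL session variables.)
sub session_charsets {
    my ($dbh) = @_;
    my ($client, $connection, $results) = $dbh->selectrow_array(
        'SELECT @@character_set_client, @@character_set_connection, '
      . '@@character_set_results');
    return { client => $client, connection => $connection, results => $results };
}
```

With the placeholder values from the question, this would be called as connect_utf8mb4('DBI:mysql:database=myDatabase;host=localhost', 'aUser', 'aPassword'). The prepare and execute calls from the question can then stay as they are, passing the character string itself (for example "\x{1F61C}"). If session_charsets still reports utf8 rather than utf8mb4, the attribute was probably not recognised by the installed driver.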

I tried many times, in many different ways, to get a CGI script to read input from STDIN, read an HTML file, print it to STDOUT and search MySQL for the entered text correctly. Setting the attribute mysql_enable_utf8mb4 and issuing "SET NAMES utf8mb4" right after connecting works correctly together with "meta charset='UTF-8'".

#!/usr/bin/perl
use DBI;

print "Content-type: text/html; charset=UTF-8\n\n";

#use utf8;
#use open ':utf8';
#binmode STDOUT, ":utf8";
#binmode STDIN , ":utf8";
#use encoding 'utf8';

our $dbh = DBI->connect("DBI:mysql:database=$database;host=$servername;port=$port",$username,$password, {PrintWarn => 0, PrintError => 0, mysql_enable_utf8mb4 => 1}) || die;
$dbh->do("SET NAMES utf8mb4");
