简体   繁体   中英

Reading Unicode characters from MySQL with PHP

I've inherited a MySQL database which contains a field named Description of type text and collation of latin1_swedish_ci .

The problem with this field is it contains utf-8 data with some Unicode characters, eg character 733, etc. Sometimes this character also exists in the field represented as HTML encoded "&#733" as well.

I'm trying to read the table and export the data to a CSV file and I need to represent this character as a double quote.

Reading the HTML encoded character is easy enough. However, it appears that the actual Unicode character is converted to utf-8 before I can do anything with it resulting in a "?".

How do I read in the Unicode character 733 (U+02DD), recognize it and convert it?

Here's a simplified (not tested) version of the code.

<?
$testconn=odbc_connect ("TESTLIB", "......", "......");

$query="SELECT Description FROM TestTable";

$rsWeb=mysql_query($query));

$WebRow=mysql_fetch_row($rsWeb));
$Desc = $WebRow[0];
$Desc = str_replace('"','""',$Desc);

fwrite($output,"\"".$Desc."\",\r\n");
%>

Also set charset to utf-8 when connecting to SQL server:

http://php.net/manual/en/mysqli.set-charset.php

$mysqli->set_charset("utf8");

I think your connection charset is not utf8, that's why chars are being converted to '?'.

Read this: http://dev.mysql.com/doc/refman/5.1/en/charset-connection.html

Post result for query: show variables like 'char%';

You really should put only non-entity (Unicode) version in the database, and entity-decode the rest. However, when you want to use UTF-8 with MySQL, there are a few things to remember:

  • Your table column's collation should be utf8_bin or similar.
  • Your table's collation and database collation should also be utf8_bin just in case.
  • Your connection charset should be UTF8. Do this by executing the "SET NAMES utf8" query.

Also, if you're outputting a HTML page, that should have the UTF8 charset as well. If everything is correct, the UTF8 characters should come out fine.

Good luck!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM