I am dealing with third-party XML that contains special characters such as bullets and long dashes.
Sample XML:
$xml = "<xml><node>• Special Characters</node></xml>";
My goal is to parse this XML using PHP and insert it into a MySQL database. I am using DOMDocument to parse the XML, and simplexml_import_dom to get a SimpleXMLElement object from the DOM node.
The loadXML method of DOMDocument fails unless I use utf8_encode to encode the XML first:
$doc = new DOMDocument();
$doc->loadXML(utf8_encode($xml));
So I understand that I need the utf8_encode function to be able to parse the XML. However, once the XML is parsed, the special characters appear as ? or garbage after being inserted into the MySQL table. They are also garbage when the parsed text is displayed in a browser.
The MySQL table column is of the TEXT datatype and uses the latin1_swedish_ci collation. I saw similar questions on SO and tried their solutions, such as running mysql_query('SET NAMES utf8') or changing the column encoding, but they didn't work for me.
Please advise.
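The issue is that your table uses the latin1 character set by default (latin1_swedish_ci is a latin1 collation). You'll want to change the encoding of the table, and ideally the database default as well, to UTF-8.
You could try
alter table TABLE_NAME charset utf8
Note that this statement only changes the table's default character set for columns added later. To convert the existing text columns too, use
alter table TABLE_NAME convert to character set utf8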
http://wolfram.kriesing.de/blog/index.php/2007/convert-mysql-db-to-utf8
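On the PHP side, the pieces discussed here could fit together roughly as follows. This is only a sketch under stated assumptions: it assumes the third-party XML is actually Windows-1252 encoded (where the bullet is the single byte 0x95), which is a guess about the feed and not something the question confirms. utf8_encode always treats its input as ISO-8859-1, so it would turn that byte into a control character rather than a bullet. The table name `items`, the column `body`, and the helper `insertText` are hypothetical names used for illustration.

```php
<?php
// Assumption: the feed is Windows-1252 encoded; byte 0x95 is the bullet there.
$xml = "<xml><node>\x95 Special Characters</node></xml>";

// Convert from the assumed source encoding to UTF-8 before parsing
// (requires the mbstring extension).
$utf8Xml = mb_convert_encoding($xml, 'UTF-8', 'Windows-1252');

$doc = new DOMDocument();
$doc->loadXML($utf8Xml);

// simplexml_import_dom returns the root <xml> element; ->node is its child.
$text = (string) simplexml_import_dom($doc)->node; // UTF-8 string

// Hypothetical helper: stores the text through a connection that talks UTF-8.
// The table "items" and column "body" are placeholders for illustration.
function insertText(mysqli $db, string $text): void
{
    // Tell MySQL that this connection sends and receives UTF-8.
    $db->set_charset('utf8');

    $stmt = $db->prepare('INSERT INTO items (body) VALUES (?)');
    $stmt->bind_param('s', $text);
    $stmt->execute();
    $stmt->close();
}
```

With a real connection this would be called as `insertText(new mysqli($host, $user, $pass, $dbname), $text);`, where the connection details are placeholders. The target column also needs a UTF-8 character set for the bytes to be stored unchanged, and a page that displays the text should declare UTF-8 as its encoding.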