I am making use of PHP tidy like so:
$config = array(
    'wrap' => 0,
    'lower-literals' => 1,
    'preserve-entities' => 1,
    'drop-empty-paras' => 0
);
$tidy = new tidy;
$tidy->parseString($html, $config, 'utf8');
$tidy->cleanRepair();
When I pass in HTML with English text, it comes out fine. With French text, however, it has trouble with the encoding. If I pass in something like vérifier, it appears as vérifier in the output. How can I get tidy to play nicely with all languages, or at least the Latin ones?
In addition, I will be passing the output of tidy through to PHP's DOMDocument. Is there anything I should be careful with here?
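It looks very much like the UTF-8 handling is working fine, but you're interpreting the result as Latin-1 instead of UTF-8. In UTF-8, é is stored as the two bytes 0xC3 0xA9, and a reader that assumes Latin-1 shows those bytes as the two characters Ã and ©, which is exactly the output you are seeing. Set an appropriate HTTP header or meta tag instructing the browser to read the document as UTF-8.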
header('Content-Type: text/html; charset=utf-8');
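To check that the bytes themselves are intact and only the interpretation is wrong, the misreading can be reproduced outside the browser. The following is a minimal sketch, assuming the mbstring extension is available; the variable names ($utf8, $misread, $restored) are made up for illustration and are not part of the original code.

```php
<?php
// UTF-8 bytes of "vérifier": the "é" is the two bytes 0xC3 0xA9.
$utf8 = "v\xC3\xA9rifier";

// Simulate a reader that wrongly treats those bytes as Latin-1 (ISO-8859-1):
// every byte becomes its own character, re-encoded as UTF-8 for display.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// Reverse the misreading to show that no data was lost along the way.
$restored = mb_convert_encoding($misread, 'ISO-8859-1', 'UTF-8');

echo bin2hex($utf8), "\n";   // 76c3a9726966696572
echo $misread, "\n";         // vÃ©rifier
echo $restored === $utf8 ? "bytes intact\n" : "bytes changed\n";
```

If the hex dump of the tidy output contains c3a9 where é is expected, the string is valid UTF-8 and only the declared charset needs fixing.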