Spanish characters are incorrect

I downloaded a page with cURL and parsed the HTML with the "PHP Simple HTML DOM Parser". The issue is that when it displays the outer HTML of an element, the Spanish characters are incorrect. For example:

The original text

la puja por la compra de los derechos de publicación ha sido la más reñida del año.

The displayed text

la puja por la compra de los derechos de publicaciÃ³n ha sido la mÃ¡s reÃ±ida del aÃ±o.

What would cause the letters to change?

I'm pretty sure this is happening because you're displaying multi-byte UTF-8 characters in a single-byte charset (probably ISO-8859-1), which is why each accented letter appears as multiple characters in the output.

Have a look at this blog post that I wrote a while ago which should talk you through all of the potential problem areas.
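
As a minimal sketch of that mechanism (not taken from the linked post, and assuming the mbstring extension is available), the snippet below takes the UTF-8 bytes of "publicación" and decodes them as if they were ISO-8859-1. The variable names are illustrative only.

```php
<?php
// "publicación" as UTF-8 bytes: ó is the two-byte sequence C3 B3.
$utf8 = "publicaci\xC3\xB3n";

// Simulate a viewer that wrongly treats those bytes as ISO-8859-1:
// every byte becomes its own character (C3 -> Ã, B3 -> ³).
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";  // 7075626c6963616369c3b36e
echo $misread, "\n";        // publicaciÃ³n
echo strlen($utf8), ' bytes, ', mb_strlen($utf8, 'UTF-8'), " characters\n"; // 12 bytes, 11 characters
```

One accented letter turning into two characters such as "Ã³" is the usual sign that the bytes are UTF-8 but are being read with a single-byte charset.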

The character encoding is incorrect. Make sure the encoding is consistent throughout; I recommend using UTF-8.

You have to determine the encoding of the downloaded page and then convert it to your encoding (with iconv, for example).

See PHP: Convert curl_exec output to UTF8
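
The detect-then-convert approach might look roughly like the sketch below. `toUtf8` is a hypothetical helper, not part of cURL or the DOM parser. It assumes the mbstring and iconv extensions are available, and its fallback to ISO-8859-1 when no charset is declared is a guess that may need adjusting per site.

```php
<?php
/**
 * Hypothetical helper: return $body as UTF-8, given the Content-Type
 * header value sent by the server (may be null or lack a charset).
 */
function toUtf8(string $body, ?string $contentType): string
{
    $charset = null;

    // 1. Prefer the charset declared in the HTTP header, e.g.
    //    "text/html; charset=ISO-8859-1".
    if ($contentType !== null
        && preg_match('/charset\s*=\s*"?([\w.:-]+)/i', $contentType, $m)) {
        $charset = strtoupper($m[1]);
    }

    // 2. Otherwise guess: valid UTF-8 is kept as is; anything else is
    //    assumed to be ISO-8859-1 (an assumption, not a certainty).
    if ($charset === null) {
        $charset = mb_check_encoding($body, 'UTF-8') ? 'UTF-8' : 'ISO-8859-1';
    }

    if ($charset === 'UTF-8') {
        return $body;
    }

    // 3. Convert. iconv returns false for an unknown charset or a failed
    //    conversion; in that case hand back the raw bytes unchanged.
    $converted = @iconv($charset, 'UTF-8', $body);
    return $converted === false ? $body : $converted;
}

// Offline demo: "año" as ISO-8859-1 bytes (ñ is the single byte F1).
echo toUtf8("a\xF1o", 'text/html; charset=ISO-8859-1'), "\n"; // año
```

With cURL, `$body` would be the result of `curl_exec($ch)` with `CURLOPT_RETURNTRANSFER` enabled, and `$contentType` the value of `curl_getinfo($ch, CURLINFO_CONTENT_TYPE)`. The converted string can then be passed to the parser, and the page that prints the result should itself declare UTF-8, for example with `header('Content-Type: text/html; charset=utf-8')`.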
