
Curl: Encoding problem

I fetch a Hebrew web page with curl, but I get garbled characters (e.g. ЧђЧ§Ч©Чџ) instead of Hebrew. What should I do to receive the text in Hebrew?

You may be receiving the Hebrew correctly but not displaying it correctly. Make sure the generated page is UTF-8 encoded. Put this line at the top of the page output:

 echo '<?xml version="1.0" encoding="UTF-8"?>';

And this in the HTML <head> section:

 <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
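As a rough sketch of declaring the encoding in one place, the snippet below also sends the charset in the HTTP Content-Type header, which browsers consult in addition to the meta tag. The helper name render_utf8_page is hypothetical, and the sketch assumes the text passed in is already UTF-8.

```php
<?php
// Hypothetical helper: wrap text that is already UTF-8 in a page that declares its encoding.
function render_utf8_page(string $utf8Text): string
{
    // Declare the charset in the HTTP header while headers can still be sent.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=UTF-8');
    }

    // Escape the text so markup characters in the fetched content are shown literally.
    $safe = htmlspecialchars($utf8Text, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n<html>\n<head>\n"
        . '<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">' . "\n"
        . "</head>\n<body>\n<p>$safe</p>\n</body>\n</html>\n";
}

echo render_utf8_page('שלום');
```

If the page still shows the wrong characters with both declarations in place, the fetched bytes are probably not UTF-8 to begin with.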

These declarations may fix the problem. If they do not, try converting the text you receive to another encoding with this function:

 $text = iconv("Windows-1252","UTF-8",$text);

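The first argument is the source encoding, so replace it with whatever the page actually uses; for Hebrew pages that is often Windows-1255 or ISO-8859-8. If you do not know it, try different combinations (UTF-8, ISO-8859-1, Windows-1252).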
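A minimal sketch of that conversion, assuming the source encoding is Windows-1255: the helper name to_utf8 and the sample bytes are made up for illustration. It leaves text alone when it is already valid UTF-8, so the same bytes are never converted twice.

```php
<?php
// Hypothetical helper: convert $text to UTF-8 unless it is already valid UTF-8.
function to_utf8(string $text, string $sourceEncoding): string
{
    // preg_match with the /u modifier only succeeds on valid UTF-8 input.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }

    $converted = iconv($sourceEncoding, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $sourceEncoding to UTF-8");
    }
    return $converted;
}

// Sample input: the Hebrew word "שלום" encoded as Windows-1255 bytes.
$raw = "\xF9\xEC\xE5\xED";
echo to_utf8($raw, 'WINDOWS-1255'), "\n";
```

The validity check matters because running iconv on bytes that are already UTF-8 produces exactly the kind of garbled output described in the question.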

cURL does not care what language the content is in. It just retrieves the raw bytes from whatever source you point it at.

The issue you are seeing comes down to character encoding. After you get the output with curl, try using PHP's iconv library to convert it to another encoding (UTF-8, perhaps). You can check the response headers to see which encoding the server is sending back to you.
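One way to act on the response headers, sketched under a few assumptions: the function names charset_from_content_type and fetch_as_utf8 are hypothetical, the curl extension is assumed to be loaded, and the server is assumed to report its charset honestly in the Content-Type header. When no charset is reported, the sketch simply assumes UTF-8.

```php
<?php
// Hypothetical helper: extract the charset from a Content-Type value,
// e.g. "text/html; charset=windows-1255". Returns null when none is given.
function charset_from_content_type(?string $contentType): ?string
{
    if ($contentType !== null
        && preg_match('/charset\s*=\s*"?([^";\s]+)/i', $contentType, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: fetch a URL with curl and return the body as UTF-8.
function fetch_as_utf8(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    $body = curl_exec($ch);
    $type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    curl_close($ch);

    if ($body === false) {
        throw new RuntimeException("Request to $url failed");
    }

    // Assumption: fall back to UTF-8 when the server does not name a charset.
    $charset = charset_from_content_type(is_string($type) ? $type : null) ?? 'UTF-8';
    if (strcasecmp($charset, 'UTF-8') === 0) {
        return $body;
    }

    $converted = iconv($charset, 'UTF-8', $body);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $charset to UTF-8");
    }
    return $converted;
}
```

Some pages declare their charset only in a meta tag, or declare the wrong one, so the header is a starting point rather than a guarantee.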
