Curl: Encoding problem

I fetch a web page that is in Hebrew with the help of curl, but I get strange characters (e.g. ЧђЧ§Ч©Чџ) instead of Hebrew. What should I do to receive it all in Hebrew?

You may be receiving the Hebrew correctly but not displaying it correctly. Make sure that the page you generate is UTF-8 encoded. Put this line at the top of the page output:

 echo '<?xml version="1.0" encoding="UTF-8"?>';

And this in the HTML <head> section:

 <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

This may correct your problem. If nothing else works, try converting the text you receive with this function:

 $text = iconv("Windows-1252","UTF-8",$text);

Of course, the first argument has to be the encoding the text actually arrives in, and the second the encoding you want (for Hebrew pages, Windows-1255 and ISO-8859-8 are common legacy source encodings). Try different combinations (UTF-8, ISO-8859-1, Windows-1252).
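To check the idea that the bytes may already be correct Hebrew, here is a minimal sketch. It assumes the sample from the question is UTF-8 text that was displayed as Windows-1251; the variable names are made up for illustration.

```php
<?php
// The Hebrew letters alef, qof, shin, final nun as raw UTF-8 bytes.
$hebrew = "\xD7\x90\xD7\xA7\xD7\xA9\xD7\x9F";

// Misreading those bytes as Windows-1251 (assumption: this is what the
// viewer did) produces the Cyrillic-looking text from the question.
$garbled = iconv('Windows-1251', 'UTF-8', $hebrew);

// Undoing the misreading gives back the original UTF-8 bytes.
$restored = iconv('UTF-8', 'Windows-1251', $garbled);

echo $garbled, "\n";                                        // ЧђЧ§Ч©Чџ
echo ($restored === $hebrew ? 'round trip ok' : 'mismatch'), "\n";
```

If this matches what you see, the download itself is fine and only the declared or assumed display encoding needs fixing.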

cURL doesn't really think in terms of what language it's fetching. It just retrieves whatever data source you point it to.

The issue you are seeing comes down to character encoding. After you get the output with curl, try using PHP's iconv library to change the character encoding (maybe to UTF-8?). You can check the headers of the response to see which encoding the service you are hitting is sending back to you.
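A rough sketch of that approach, not a definitive implementation: read the Content-Type that curl reports, pull out the charset, and convert the body with iconv. The helper names charsetFromContentType and fetchAsUtf8 and the fallback parameter are hypothetical.

```php
<?php
// Hypothetical helper: extract the charset parameter from a Content-Type
// value such as "text/html; charset=windows-1255". Returns null if absent.
function charsetFromContentType(?string $contentType): ?string
{
    if ($contentType !== null
        && preg_match('/charset\s*=\s*"?([^\s;"]+)/i', $contentType, $m)) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: fetch a URL and return its body converted to UTF-8.
// $fallbackCharset is an assumption used when the server declares nothing.
function fetchAsUtf8(string $url, string $fallbackCharset = 'UTF-8'): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);

    if ($body === false) {
        throw new RuntimeException("Request to $url failed");
    }

    $charset = charsetFromContentType($contentType ?: null) ?? $fallbackCharset;
    if (strcasecmp($charset, 'UTF-8') === 0) {
        return $body; // already UTF-8, nothing to convert
    }

    $converted = iconv($charset, 'UTF-8', $body);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $charset");
    }
    return $converted;
}
```

Some servers declare the charset only in a <meta> tag inside the page rather than in the HTTP header, in which case the header lookup returns nothing and the fallback is used.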
