
CURL import character encoding problem

I'm using cURL to import some code. However, in French, all the characters come out garbled. For example: Bonjour ...

I don't have access to change anything in the imported code. Is there anything I can do on my side to fix this?

Thanks

As Jon Skeet pointed out, it's difficult to understand your situation. However, if you only have access to the final text, you can try using iconv to change the text encoding.

For example:

$text = iconv("Windows-1252","UTF-8",$text);

I had a similar issue some time ago (with Italian and special characters) and solved it this way.

Try different combinations (UTF-8, ISO-8859-1, Windows-1252).
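Before trying combinations blindly, it can help to check whether the text is already valid UTF-8, since converting it a second time would double-encode it. A minimal sketch, assuming the mbstring and iconv extensions are available; `to_utf8` is a hypothetical helper name, and Windows-1252 as the source encoding is only a guess:

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the text is not already valid UTF-8.
// $assumedSource is a guess about the remote encoding, not something detected from the data.
function to_utf8(string $text, string $assumedSource = 'Windows-1252'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8; converting again would double-encode it
    }
    $converted = @iconv($assumedSource, 'UTF-8', $text);
    // iconv() returns false on failure; fall back to the untouched input in that case
    return $converted === false ? $text : $converted;
}

// "\xE7" is the single-byte Windows-1252 / ISO-8859-1 encoding of "ç"
echo to_utf8("Fran\xE7ais"), "\n";
```

Swapping the second argument lets you test the other candidates from the list above one at a time.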

I had a similar problem. I tried looping through all combinations of input and output charsets. Nothing helped! :(

However, I was able to access the code that actually fetched the data, and that is where the culprit lay. The data was fetched via cURL. Adding

 curl_setopt($ch,CURLOPT_BINARYTRANSFER,true);

fixed it.

A handy snippet to try out all possible combinations of a list of charsets:

$charsets = array(  
        "UTF-8", 
        "ASCII", 
        "Windows-1252", 
        "ISO-8859-15", 
        "ISO-8859-1", 
        "ISO-8859-6", 
        "CP1256"
        ); 

foreach ($charsets as $ch1) { 
    foreach ($charsets as $ch2){ 
        echo "<h1>Combination $ch1 to $ch2 produces: </h1>".iconv($ch1, $ch2, $text_2_convert); 
    } 
} 
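The loop above prints every result, including conversions that fail (iconv() returns false and raises a notice on input it cannot convert). A possible variant, sketched under the assumption that you only want the successful conversions; `try_charset_combinations` is a hypothetical name:

```php
<?php
// Hypothetical helper: collect only the conversions that iconv() accepts.
function try_charset_combinations(string $text, array $charsets): array
{
    $results = [];
    foreach ($charsets as $from) {
        foreach ($charsets as $to) {
            if ($from === $to) {
                continue; // converting a charset to itself tells us nothing
            }
            // @ silences the notice iconv() raises on illegal input
            $converted = @iconv($from, $to, $text);
            if ($converted !== false) {
                $results["$from -> $to"] = $converted;
            }
        }
    }
    return $results;
}

$results = try_charset_combinations("Fran\xE7ais", ["ISO-8859-1", "UTF-8"]);
foreach ($results as $label => $converted) {
    echo $label, ": ", $converted, "\n";
}
```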

PHP seems to use UTF-8 by default, so I found that the following works:

$text = iconv("UTF-8","Windows-1252",$text);

You could replace your

$data = curl_exec($ch);

by

$data = utf8_decode(curl_exec($ch));

I had this same issue and it worked well for me.
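Note that utf8_decode() is deprecated as of PHP 8.2. A rough equivalent, assuming the mbstring extension is available and the response really is UTF-8, is mb_convert_encoding(); the `$data` value below is a stand-in for the result of curl_exec($ch):

```php
<?php
// Stand-in for the body returned by curl_exec($ch); assumed to be UTF-8 here.
$data = "Fran\xC3\xA7ais";

// Convert UTF-8 input to ISO-8859-1, the same direction utf8_decode() works in.
$latin1 = mb_convert_encoding($data, 'ISO-8859-1', 'UTF-8');

// Print the result as hex so the single-byte "ç" (e7) is visible regardless of terminal encoding.
echo bin2hex($latin1), "\n";
```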

I'm currently suffering from a similar problem: I'm trying to write a simple HTML <title> importer via cURL. So I'm going to give an idea of what I've done so far:

  1. Retrieve the HTML via cURL
  2. Check whether the response headers give any hint of the encoding via curl_getinfo(), and match it with a regex
  3. Parse the HTML to look at the content-type meta tag and the <title> tag (yes, I know the consequences)
  4. Compare the content type from the header with the one from the meta tag, and choose the meta one if they differ, because we know no one cares about their httpd configuration and there are a lot of dirty workarounds that rely on the meta tag
  5. iconv() the string
  6. Wish every day that when someone does not follow the standards, $DEITY punishes him/her until the end of days, because it would save me the meta parsing
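Steps 1 to 5 could be sketched roughly as follows. This is only an illustration under stated assumptions: all function names are hypothetical, the regexes are a crude stand-in for real HTML parsing, and UTF-8 is assumed when neither the header nor the meta tag names a charset.

```php
<?php
// Step 2: pull a charset out of a Content-Type value such as "text/html; charset=ISO-8859-1".
function charset_from_content_type(?string $contentType): ?string
{
    if ($contentType !== null
        && preg_match('/charset\s*=\s*["\']?([\w.:-]+)/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Step 3: look for a charset declared inside a <meta> tag (regex-based, so only a rough guess).
function charset_from_meta(string $html): ?string
{
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)/i', $html, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Steps 3 to 5: extract <title>, prefer the meta charset over the header, convert to UTF-8.
function extract_title_utf8(string $html, ?string $contentType): ?string
{
    if (!preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return null; // no <title> found
    }
    $title = trim($m[1]);
    // Assumption: fall back to UTF-8 when nothing declares a charset
    $charset = charset_from_meta($html) ?? charset_from_content_type($contentType) ?? 'UTF-8';
    if ($charset === 'UTF-8') {
        return $title;
    }
    $converted = @iconv($charset, 'UTF-8', $title);
    return $converted === false ? $title : $converted;
}

// Step 1: fetch the page with cURL and hand the body and header to the helpers above.
function fetch_title(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    $html = curl_exec($ch);
    $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    if ($html === false) {
        return null; // request failed
    }
    return extract_title_utf8($html, is_string($contentType) ? $contentType : null);
}
```

Keeping the parsing in functions that take plain strings means the charset logic can be tried on saved HTML without making a network request.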
