Outputting file contents as UTF-8 leads to character encoding issues
I set my header as follows:
header( 'Content-Type: text/html; charset="utf-8"' );
and then output a local file on my server to the browser using the following code segment:
$content = file_get_contents($sPath);
$content = mb_convert_encoding($content, 'UTF-8');
echo $content;
The files on my server are created by Lua, and the output of the following is FALSE (before conversion):
var_dump( mb_detect_encoding($content) );
The files contain some characters like ™ etc., and these appear as plain square boxes in browsers. I've read the threads that were suggested as similar questions, and none of the variations I tried in my code helped.
There seem to be no problems when I simply use the following:
header( 'Content-Type: text/html; charset="iso-8859-1"' );
// setting path here
$content = file_get_contents($sPath);
echo $content;
So this means the file content is actually encoded in ISO-8859-1. If you want to output it as UTF-8, explicitly convert from ISO-8859-1 to UTF-8:
$content = mb_convert_encoding($content, 'UTF-8', 'ISO-8859-1');
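You always need to know what you're converting from. If you just tell PHP to "convert to UTF-8" and omit the source encoding, mb_convert_encoding assumes the internal encoding (see mb_internal_encoding()), which is not what your file uses, so in your case it does not work. Also note that ™ has no code point in strict ISO-8859-1: browsers treat that label as Windows-1252, where ™ is byte 0x99. If ™ still does not show up after the conversion, use 'Windows-1252' as the source encoding instead.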
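As a minimal sketch of the explicit conversion described in this answer: the helper below leaves input alone when it is already valid UTF-8 and otherwise converts from a named source encoding. The function name `to_utf8` and the `$fallback` parameter are hypothetical, and the default of 'Windows-1252' is an assumption about how the Lua-generated files are encoded, not something detected from the bytes.

```php
<?php
// Hypothetical helper: return $bytes as UTF-8, converting only when needed.
// $fallback is an assumption about the file's real encoding, not a detected value.
function to_utf8(string $bytes, string $fallback = 'Windows-1252'): string
{
    // Already valid UTF-8 (this includes plain ASCII): leave it untouched.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    // Otherwise convert from the explicitly named source encoding.
    return mb_convert_encoding($bytes, 'UTF-8', $fallback);
}

// Byte 0x99 is the trademark sign in Windows-1252.
echo to_utf8("Acme\x99"), "\n";
```

In the question's code this would replace the bare mb_convert_encoding call, for example `echo to_utf8(file_get_contents($sPath));`, while keeping the UTF-8 Content-Type header.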
Check the file encoding: is it UTF-8 without BOM? For example, use Notepad++ to check the file encoding.
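If no editor is at hand, the same check can be roughly approximated in PHP. This is only a sketch: `describe_utf8` is a hypothetical name, and the function can say whether the bytes start with a UTF-8 BOM or form valid UTF-8, but it cannot tell which legacy encoding a non-UTF-8 file actually uses.

```php
<?php
// Hypothetical helper: classify raw file bytes without an editor.
function describe_utf8(string $bytes): string
{
    // A UTF-8 byte order mark is the three bytes EF BB BF at the very start.
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8 with BOM';
    }
    // Plain ASCII also passes this check, since ASCII is a subset of UTF-8.
    return mb_check_encoding($bytes, 'UTF-8')
        ? 'valid UTF-8 without BOM (or plain ASCII)'
        : 'not valid UTF-8';
}

// A lone 0x99 byte is not a valid UTF-8 sequence.
echo describe_utf8("Acme\x99"), "\n";
```

Running it on the real file would look like `echo describe_utf8(file_get_contents($sPath));`.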
Or maybe this is useful:
$content = file_get_contents($sPath);
$content = htmlentities($content);
echo $content;
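One caveat with the htmlentities approach: when no encoding is passed, the function assumes PHP's default_charset (normally UTF-8) and returns an empty string for input that is not valid in that encoding. A hedged variant that names the source encoding is sketched below; `escape_legacy` is a hypothetical name, and 'Windows-1252' is an assumption about the file's encoding.

```php
<?php
// Hypothetical helper; 'Windows-1252' is an assumption about the file's encoding.
function escape_legacy(string $bytes, string $encoding = 'Windows-1252'): string
{
    // Naming the encoding lets htmlentities interpret bytes such as 0x99
    // instead of rejecting them as invalid UTF-8.
    return htmlentities($bytes, ENT_QUOTES, $encoding);
}

echo escape_legacy("Acme\x99 <b>"), "\n";
```

Because characters with named entities are emitted as ASCII entity references, the result displays the same regardless of the charset in the Content-Type header. Note that this also escapes any HTML markup in the file, so it only fits when the file is meant to be shown as text.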
Or try this in .htaccess:
AddDefaultCharset utf-8
AddCharset utf-8 *
<IfModule mod_charset.c>
CharsetSourceEnc utf-8
CharsetDefault utf-8
</IfModule>