Non-English characters in PHP

I'm having a problem writing non-English characters to a file (.txt) using PHP. This is my code:

$str = "â€êþÿûîœøîô‘ë’ðüïlæ߀¿×÷¡ï";
$str = htmlentities($str, ENT_QUOTES, mb_detect_encoding($str));
$str =htmlspecialchars_decode(html_entity_decode($str),ENT_QUOTES);
$f = fopen("test.txt","w");
fputs($f,$str);
fclose($f);

When I open the file, the result is: â€êþÿûîœøîô'ë'ðüïlæ߀¿×÷¡ï

As you can see, the euro symbol, for example, still does not appear correctly in the file, and neither do some other symbols.

Does anyone have an idea how to fix this problem?

The conversion of € to &euro; is done by the htmlentities() function; since you are encoding into HTML entities and decoding right after, I'd suggest leaving this step out:

$str = "â€êþÿûîœøîô‘ë’ðüïlæ߀¿×÷¡ï";
$f = fopen("test.txt","w");
fputs($f,$str);
fclose($f);
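
To see that PHP writes the bytes of a string to disk unchanged, here is a minimal sketch. It assumes the desired output is UTF-8, spells the characters as explicit UTF-8 byte escapes so the result does not depend on how the script file itself was saved, and uses a hypothetical scratch file created with tempnam():

```php
<?php
// Assumption: we want UTF-8 output. The characters are written as raw
// UTF-8 bytes so the example does not depend on this file's own encoding.
$str = "caf\xC3\xA9 \xE2\x82\xAC"; // "café €" in UTF-8

// Hypothetical scratch file, used only for this demonstration.
$path = tempnam(sys_get_temp_dir(), 'enc');
file_put_contents($path, $str);

// Read the file back and compare byte for byte.
$readBack = file_get_contents($path);
var_dump($readBack === $str);                    // same bytes as written
var_dump(mb_check_encoding($readBack, 'UTF-8')); // still valid UTF-8

unlink($path);
```

If the bytes round-trip like this but the file still looks wrong, the likely cause is the editor or viewer opening the file with a different encoding than the one the string was written in.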

Assuming you want to keep this encoding/decoding business (it looks like you're trying to use the encode/decode process to convert between character sets?):

In your encoding step, you use mb_detect_encoding on the input string and pass the result to htmlentities, which allows the euro sign in your input to be correctly detected (most of the time).

However, in your decoding step, you don't specify any charset, so html_entity_decode will pick its default. In PHP versions before 5.4 that default is ISO-8859-1, which doesn't include the euro sign.
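
The effect of the target charset can be checked with a small sketch that decodes the same entity into three candidate charsets and prints the resulting bytes as hex. The list of charsets and the `$bytes` array are just illustrative choices:

```php
<?php
// Decode the same entity into three candidate output charsets and
// collect the resulting raw bytes as hex strings.
$bytes = [];
foreach (['ISO-8859-1', 'ISO-8859-15', 'UTF-8'] as $charset) {
    $decoded = html_entity_decode('&euro;', ENT_QUOTES, $charset);
    $bytes[$charset] = bin2hex($decoded);
}

print_r($bytes);
```

The euro sign is the single byte a4 in ISO-8859-15 and the three bytes e2 82 ac in UTF-8. ISO-8859-1 has no code point for it, so that target cannot produce a euro sign at all.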

If you want to keep this code block mostly the same, you need to pick a charset to decode to that includes all the characters you want (like UTF-8 or ISO-8859-15).

Edit: Here's an example based on your code (I picked ISO-8859-15, but you really need to know or decide what output character set you want):

$str = "â€êþÿûîœøîô‘ë’ðüïlæ߀¿×÷¡ï";
$str = htmlentities($str, ENT_QUOTES, mb_detect_encoding($str));
$str = html_entity_decode($str, ENT_QUOTES, 'ISO-8859-15');
$f = fopen("test.txt","w");
fputs($f,$str);
fclose($f);
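
If the real goal is converting between character sets, the entity round trip can be skipped. This is a different technique from the code above, not a drop-in equivalent: a minimal sketch using mb_convert_encoding, assuming the input is UTF-8 and the desired output is ISO-8859-15:

```php
<?php
// Assumption: the input is UTF-8 and the target is ISO-8859-15.
$utf8 = "\xE2\x82\xAC"; // euro sign as UTF-8 bytes

// Convert directly between charsets, without going through HTML entities.
$latin9 = mb_convert_encoding($utf8, 'ISO-8859-15', 'UTF-8');
echo bin2hex($latin9), "\n"; // a4

// Converting back should restore the original UTF-8 bytes.
$back = mb_convert_encoding($latin9, 'UTF-8', 'ISO-8859-15');
var_dump($back === $utf8);
```

Characters that do not exist in the target charset cannot survive such a conversion, so this only works if the chosen output charset covers everything in the input.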
