PHP and accent characters (Ba\u015f\u00e7\u0131l)

I have a string like "Ba\u015f\u00e7\u0131l". I'm assuming those are escape codes for some special accented characters. How do I:

1) Display the string with the accents (i.e. replace each code with the actual character)?

2) What is the best practice for storing strings like this?

3) If I don't want to allow such characters, how do I replace them with "normal" characters?

My educated guess is that you obtained these values from a JSON string. If that's the case, you should properly decode the whole piece of data with json_decode():

<?php

header('Content-Type: text/plain; charset=utf-8');

$data = '"Ba\u015f\u00e7\u0131l"';
var_dump( json_decode($data) );

?>

  1. To display the characters, see the question: How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?

  2. You can store the string either escaped like that or decoded; just make sure your storage can handle the UTF-8 charset.

  3. Use iconv with the //TRANSLIT flag.

Here's an example:

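// Convert one \uXXXX match (four hex digits) into its UTF-8 character.
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}

// Single quotes keep the backslashes literal.
$str = 'Ba\u015f\u00e7\u0131l';
$str = preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $str);

echo $str;

echo '<br/>';
// Transliterate to ASCII; the exact result depends on the platform's iconv.
$str = iconv('UTF-8', 'ASCII//TRANSLIT', $str);

echo $str;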
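
One limitation of the callback above is that it converts each escape on its own as UCS-2, so a character outside the Basic Multilingual Plane, which JSON writes as a surrogate pair of two escapes, will not decode correctly. Here is a minimal sketch of a variant that converts each run of consecutive escapes as UTF-16BE instead. The function name decode_unicode_escapes is hypothetical, the mbstring extension is assumed to be available, and unpaired surrogates are not validated:

```php
<?php
// Hypothetical helper, not part of PHP: decodes \uXXXX escapes, including
// surrogate pairs, by converting each run of escapes as UTF-16BE.
function decode_unicode_escapes($str) {
    return preg_replace_callback(
        '/(?:\\\\u[0-9a-fA-F]{4})+/',
        function ($match) {
            // Strip the "\u" prefixes to leave a plain string of hex digits.
            $hex = str_replace('\u', '', $match[0]);
            // Pack the hex digits into bytes and convert them to UTF-8.
            return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
        },
        $str
    );
}

echo decode_unicode_escapes('Ba\u015f\u00e7\u0131l'), "\n";
echo decode_unicode_escapes('\ud83d\ude00'), "\n";
```

Text that contains no escapes passes through unchanged, so the helper can be applied to mixed input.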

Here's another option:

<html><head>
    <!-- don't forget to tell the browser what encoding you're using: -->
    <meta http-equiv="Content-type" content="text/html;charset=UTF-8" />
</head><body><?php

// Single quotes keep the \u escapes as literal text for json_decode().
$string = 'Ba\u015f\u00e7\u0131l';
echo json_decode('"'.str_replace('"', '\"', $string).'"');

?></body></html>

This works because the \uXXXX syntax is what JSON uses. Note that json_decode() requires the JSON module, which is now part of the standard PHP installation.
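
If the input might not be well-formed, the same trick can be wrapped with an error check. The following is only a sketch: the helper name decode_json_escapes is hypothetical, and the nullable return type assumes PHP 7.1 or later. Input containing a backslash that is not part of a valid JSON escape is rejected rather than repaired.

```php
<?php
// Hypothetical helper: decode \uXXXX escapes by treating the input as the
// body of a JSON string literal. Returns null when decoding fails.
function decode_json_escapes(string $raw): ?string
{
    // Escape bare double quotes so they cannot end the literal early.
    $literal = '"' . str_replace('"', '\"', $raw) . '"';
    $decoded = json_decode($literal);

    // json_decode() returns null on failure; json_last_error() confirms it.
    if (json_last_error() !== JSON_ERROR_NONE || !is_string($decoded)) {
        return null;
    }
    return $decoded;
}

var_dump(decode_json_escapes('Ba\u015f\u00e7\u0131l')); // the decoded UTF-8 string
var_dump(decode_json_escapes('bad \x escape'));          // NULL: \x is not a JSON escape
```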

There is no native support in PHP to decode such strings.

There are several tricks that use native functions, though I am not sure that any of them is safe and injection-proof.

Another option, using Zend Framework, is to download the Zend_Utf8 proposal class. See the Zend_Utf8 proposal for Zend Framework for more information.

  1. Outputting them will output the appropriate character. If you don't declare an encoding for the output document, the browser will try to guess the best one to use. Otherwise, you should work out which encoding you need and declare it explicitly.
  2. Simply store them as they are, or turn them into normal characters and store them as binary.
  3. Use the iconv functions to convert from one encoding to another; you should then save your source file in the desired encoding to support it.
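
For the third point, iconv's transliteration tables differ between platforms, so the result for a character such as ı is not guaranteed. One possible workaround, sketched below, is to map the characters you expect explicitly and let iconv handle only what is left. The constant ASCII_FALLBACKS and the function to_plain_ascii are hypothetical names, the map is illustrative rather than complete, and the source file is assumed to be saved as UTF-8:

```php
<?php
// Illustrative map only: extend it for the characters your data contains.
const ASCII_FALLBACKS = [
    'ş' => 's', 'Ş' => 'S',
    'ç' => 'c', 'Ç' => 'C',
    'ı' => 'i', 'İ' => 'I',
    'ğ' => 'g', 'Ğ' => 'G',
    'ö' => 'o', 'Ö' => 'O',
    'ü' => 'u', 'Ü' => 'U',
];

function to_plain_ascii(string $utf8): string
{
    // Replace known characters first so they do not depend on iconv's tables.
    $mapped = strtr($utf8, ASCII_FALLBACKS);

    // Let iconv attempt whatever the map missed; it returns false on failure.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $mapped);
    return $ascii === false ? $mapped : $ascii;
}

echo to_plain_ascii("Ba\u{15f}\u{e7}\u{131}l"), "\n"; // Bascil
```

Characters that are neither in the map nor known to iconv may come out as "?" or be left as they are, depending on the platform.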
