How to match JavaScript Unicode string processing with PHP?
I am trying to handle strings in both PHP and JavaScript, and I want them to behave the same. I wrote a JavaScript version of the PHP chr() function to implement this. However, I ran into a UTF-8 Unicode issue. For example, I want to create a string containing Chinese characters, "a大小b", which I can do correctly in PHP but not in JavaScript using the code below. I would like to ask the experts what is wrong with the implementation.
The output is:
php str=a----
php str=a�----
php str=a��----
php str=a大----
php str=a大�----
php str=a大��----
php str=a大小----
php str=a大小b----
--------
js str=a---
js str=aå---
js str=aå¤---
js str=a大---
js str=a大å---
js str=a大å°---
js str=a大å°---
js str=a大å°b---
The code I used is as follows:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<div class="container">
<?php
$string5 = "" ;
$str_a = chr(97) ;
$string5 .= $str_a ; echo "php str=$string5----<br>" ;
$str_c1 = chr(229) ;
$string5 .= $str_c1 ; echo "php str=$string5----<br>" ;
$str_c2 = chr(164) ;
$string5 .= $str_c2 ; echo "php str=$string5----<br>" ;
$str_c3 = chr(167) ;
$string5 .= $str_c3 ; echo "php str=$string5----<br>" ;
$str_cs1 = chr(229) ;
$string5 .= $str_cs1 ; echo "php str=$string5----<br>" ;
$str_cs2 = chr(176) ;
$string5 .= $str_cs2 ; echo "php str=$string5----<br>" ;
$str_cs3 = chr(143) ;
$string5 .= $str_cs3 ; echo "php str=$string5----<br>" ;
$str_b= chr(98) ;
$string5 .= $str_b ; echo "php str=$string5----<br>" ;
echo "<br><br>--------<br><br>" ;
?>
<script language = "JavaScript">
function chr2(codePt) {
if (codePt > 0xFFFF) {
codePt -= 0x10000;
return String.fromCharCode(0xD800 + (codePt >> 10), 0xDC00 + (codePt & 0x3FF));
}
return String.fromCharCode(codePt);
}
var string5 = "" ;
var str_a = chr2(97) ;
string5 += str_a ; document.write( "js str="+string5+"---<br>" );
var str_c1 = chr2(229) ;
string5 += str_c1 ; document.write( "js str="+string5+"---<br>" );
var str_c2 = chr2(164) ;
string5 += str_c2 ; document.write( "js str="+string5+"---<br>" );
var str_c3 = chr2(167) ;
string5 += str_c3 ; document.write( "js str="+string5+"---<br>" );
var str_cs1 = chr2(229) ;
string5 += str_cs1 ; document.write( "js str="+string5+"---<br>" );
var str_cs2 = chr2(176) ;
string5 += str_cs2 ; document.write( "js str="+string5+"---<br>" );
var str_cs3 = chr2(143) ;
string5 += str_cs3 ; document.write( "js str="+string5+"---<br>" );
var str_b = chr2(98) ;
string5 += str_b ; document.write( "js str="+string5+"---<br>" );
</script>
</div>
</body>
</html>
PHP and JavaScript strings are fundamentally different. A PHP string is a series of bytes. A JavaScript string is a series of characters. (Actually a series of UTF-16 code units, but that's irrelevant to this example.)
大 is character U+5927 (the Han ideograph for "big"). To generate it in JavaScript you would use String.fromCharCode(0x5927) (or chr2(0x5927) using the above helper function).
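As a minimal sketch of the code-point approach described above, the whole string can be built from Unicode code points rather than UTF-8 byte values. The variable names here are illustrative; String.fromCodePoint is the standard ES2015 built-in, which also handles code points above 0xFFFF the way the chr2 helper does.

```javascript
// Build "a大小b" from code points, not from UTF-8 byte values.
// 0x61 = a, 0x5927 = 大, 0x5C0F = 小, 0x62 = b
const codePoints = [0x61, 0x5927, 0x5c0f, 0x62];

// String.fromCodePoint accepts any number of code points and emits
// surrogate pairs automatically for values above 0xFFFF.
const fromCodePoints = String.fromCodePoint(...codePoints);

console.log(fromCodePoints);        // a大小b
console.log(fromCodePoints.length); // 4 (each character is one UTF-16 code unit)
```

Note that the length is 4 here, whereas PHP's strlen() on the same text would report 8, because PHP counts bytes.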
229, 164, 167 is the byte representation of 大 in the UTF-8 encoding ("\xE5\xA4\xA7"). Splitting the character in the middle of the byte sequence is invalid, which is why you get the � error in the PHP output. In JavaScript, chr2(229) does not produce a byte at all: it produces the single character U+00E5 (å), which is why å shows up in the JavaScript output. You can't split the byte sequence in the middle in JavaScript because its string model is character-based, so the code will never work the same.
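If the input really has to be the same list of byte values that the PHP code uses, one possible workaround is to collect the bytes first and decode them as UTF-8 afterwards. The sketch below assumes an environment that provides the standard TextDecoder API (modern browsers and recent Node.js versions); the names bytes, text, and pieces are only illustrative.

```javascript
// The same byte values that the PHP code passes to chr().
const bytes = [97, 229, 164, 167, 229, 176, 143, 98];

// Decode the complete byte sequence as UTF-8 in one step.
const text = new TextDecoder("utf-8").decode(new Uint8Array(bytes));
console.log(text); // a大小b

// Byte-by-byte decoding: with { stream: true } the decoder buffers an
// incomplete multi-byte sequence and returns "" until the character is complete.
const streamDecoder = new TextDecoder("utf-8");
const pieces = bytes.map((b) =>
  streamDecoder.decode(new Uint8Array([b]), { stream: true })
);
console.log(pieces);          // ["a", "", "", "大", "", "", "小", "b"]
console.log(pieces.join("")); // a大小b
```

Unlike the PHP version, the intermediate steps never contain a half-finished character: the decoder holds the partial bytes back instead of emitting �.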