How to match JavaScript Unicode string processing with PHP?

I am trying to handle strings in both PHP and JavaScript, and I want them to behave the same. I wrote a JavaScript version of the PHP chr() function to do this. However, I ran into a UTF-8/Unicode issue. For example, I want to create a string with the Chinese characters "a大小b". I can do this correctly in PHP, but it fails in JavaScript using the code below. What is wrong with the implementation?

The output is:

  php str=a----
  php str=a�----
  php str=a��----
  php str=a大----
  php str=a大�----
  php str=a大��----
  php str=a大小----
  php str=a大小b----

  --------

  js str=a---
  js str=aå---
  js str=aå¤---
  js str=a大---
  js str=a大å---
  js str=a大å°---
  js str=a大å°---
  js str=a大å°b---

The code I used is as follows:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<div class="container">

<?php 
    $string5 = "" ; 
    $str_a = chr(97) ; 
    $string5 .= $str_a ;   echo "php str=$string5----<br>" ; 


    $str_c1 = chr(229) ; 
    $string5 .= $str_c1 ;   echo "php str=$string5----<br>" ; 
    $str_c2 = chr(164) ; 
    $string5 .= $str_c2 ;   echo "php str=$string5----<br>" ; 
    $str_c3 = chr(167) ; 
    $string5 .= $str_c3 ;   echo "php str=$string5----<br>" ; 


    $str_cs1 = chr(229) ; 
    $string5 .= $str_cs1 ;   echo "php str=$string5----<br>" ; 
    $str_cs2 = chr(176) ; 
    $string5 .= $str_cs2 ;   echo "php str=$string5----<br>" ; 
    $str_cs3 = chr(143) ; 
    $string5 .= $str_cs3 ;   echo "php str=$string5----<br>" ; 


    $str_b= chr(98) ; 
    $string5 .= $str_b ;   echo "php str=$string5----<br>" ; 

    echo "<br><br>--------<br><br>" ; 
?>


<script language = "JavaScript">   

    function chr2(codePt) {
      if (codePt > 0xFFFF) { 
        codePt -= 0x10000;
        return String.fromCharCode(0xD800 + (codePt >> 10), 0xDC00 + (codePt & 0x3FF));
      }
      return String.fromCharCode(codePt);
    }

    var string5 = "" ; 
    var str_a = chr2(97) ; 
    string5 += str_a ;     document.write( "js str="+string5+"---<br>"  ); 

    var str_c1 = chr2(229) ; 
    string5 += str_c1 ;   document.write( "js str="+string5+"---<br>"  ); 
    var str_c2 = chr2(164) ; 
    string5 += str_c2 ;   document.write( "js str="+string5+"---<br>"  ); 
    var str_c3 = chr2(167) ; 
    string5 += str_c3 ;   document.write( "js str="+string5+"---<br>"  ); 


    var str_cs1 = chr2(229) ; 
    string5 += str_cs1 ;   document.write( "js str="+string5+"---<br>"  ); 
    var str_cs2 = chr2(176) ; 
    string5 += str_cs2 ;   document.write( "js str="+string5+"---<br>"  ); 
    var str_cs3 = chr2(143) ; 
    string5 += str_cs3 ;   document.write( "js str="+string5+"---<br>"  ); 

    var str_b = chr2(98) ; 
    string5 += str_b ;   document.write( "js str="+string5+"---<br>"  ); 

</script>


</div> 
</body>
</html>

PHP and JavaScript strings are fundamentally different. A PHP string is a series of bytes. A JavaScript string is a series of characters. (Actually a series of UTF-16 code units, but that's irrelevant to this example.)

大 is character U+5927 (the Han ideograph "big"). To generate it in JavaScript you would use String.fromCharCode(0x5927) (or chr2(0x5927) using the above helper function).
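A minimal sketch of that difference, using illustrative variable names that are not from the question: the first call passes code points, which is what String.fromCharCode expects, while the second passes the UTF-8 byte values from the question and gets one unrelated character per value.

```javascript
// Build "a大小b" from Unicode code points:
// U+0061 (a), U+5927 (大), U+5C0F (小), U+0062 (b).
var fromCodePoints = String.fromCharCode(0x61, 0x5927, 0x5C0F, 0x62);

// Passing the UTF-8 byte values of 大 instead yields three separate
// characters: U+00E5 (å), U+00A4 (¤), U+00A7 (§).
var fromByteValues = String.fromCharCode(229, 164, 167);
```

This is why the JavaScript output in the question shows "å¤" where the PHP output eventually shows "大".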

229, 164, 167 is the byte representation of 大 in the UTF-8 encoding ("\xE5\xA4\xA7"). Splitting the character in the middle of the byte sequence is invalid, which is why the PHP output shows the � error marker at the intermediate steps. You can't split the byte sequence in the middle in JavaScript, because its string model is character-based, so the code will never behave the same.
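If byte-level behaviour is really wanted on the JavaScript side, one possible approach (an assumption on my part, not something the question's code does) is to keep the values in a byte array and decode the array as UTF-8 only when text is needed. The sketch below uses the standard TextDecoder API; bytesToString is a hypothetical helper name.

```javascript
// Hypothetical helper: treat the values as raw bytes (0-255), the way
// PHP's chr() does, and decode the whole buffer as UTF-8.
function bytesToString(bytes) {
  // In its default (non-fatal) mode, TextDecoder substitutes U+FFFD for
  // incomplete or invalid sequences, similar to the � in the PHP output.
  return new TextDecoder("utf-8").decode(new Uint8Array(bytes));
}

// The same byte values the question passes to chr().
var bytes = [97, 229, 164, 167, 229, 176, 143, 98];

// Decode each growing prefix, mirroring the step-by-step echo in PHP.
var steps = [];
for (var i = 1; i <= bytes.length; i++) {
  steps.push(bytesToString(bytes.slice(0, i)));
}
```

Here steps[3] is "a大" and steps[7] is "a大小b", while the prefixes that end in the middle of a character contain the replacement character U+FFFD.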

