
PHP iconv_strlen question

What does it mean when iconv_strlen fails on bad character sequences? Specifically, I want to know what "character sequences" are. Thanks

A character sequence is a series of bytes. When using UTF-8, not all combinations of bytes are valid.

The byte sequence \xc2\xbc forms the Unicode character U+00BC, which is the VULGAR FRACTION ONE QUARTER symbol (¼), when using UTF-8 encoding.

The byte sequence \xe2\x88\x9c forms the Unicode character U+221C, which is the FOURTH ROOT symbol (∜), when using UTF-8 encoding.
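To see the difference between bytes and characters for these two sequences, here is a minimal sketch. It assumes the iconv extension is enabled; the variable names are only illustrative.

```php
<?php
// Double-quoted "\x.." escapes produce raw bytes.
$quarter    = "\xc2\xbc";     // U+00BC, encoded as 2 bytes in UTF-8
$fourthRoot = "\xe2\x88\x9c"; // U+221C, encoded as 3 bytes in UTF-8

// strlen() counts bytes; iconv_strlen() counts characters in the given encoding.
$quarterBytes = strlen($quarter);                    // 2
$quarterChars = iconv_strlen($quarter, 'UTF-8');     // 1
$rootBytes    = strlen($fourthRoot);                 // 3
$rootChars    = iconv_strlen($fourthRoot, 'UTF-8');  // 1

echo "$quarterBytes bytes, $quarterChars character\n";
echo "$rootBytes bytes, $rootChars character\n";
```

Both sequences are valid UTF-8, so iconv_strlen can decode them and reports one character each.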

A bad character sequence for UTF-8 encoding is any byte combination that doesn't fit the required schema for UTF-8 byte streams. For example, the byte sequence \xbc\xbc is illegal because a two-byte character must have 110xxxxx in its first byte, but \xbc is 10111100 written as bits, which matches the 10xxxxxx pattern reserved for continuation bytes.
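A rough sketch of what that failure looks like in practice. It assumes the iconv extension is enabled and that the underlying iconv library rejects the sequence (glibc does; other implementations may differ in detail). On failure, iconv_strlen raises a notice and returns false, so the result should be compared with `===`.

```php
<?php
$bad = "\xbc\xbc"; // two continuation bytes with no lead byte: not valid UTF-8

// Show the bit pattern of \xbc: it starts with 10, not 110.
$bits = sprintf('%08b', ord("\xbc"));
echo $bits, "\n";

// The @ operator suppresses the notice so only the return value is inspected.
$length = @iconv_strlen($bad, 'UTF-8');

if ($length === false) {
    echo "Bad character sequence: input is not valid UTF-8\n";
} else {
    echo "Length: $length\n";
}
```

Checking for `false` explicitly matters because a valid empty string has length 0, which is also falsy in a loose comparison.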
