
PHP strlen and mb_strlen not working as expected

The PHP functions strlen() and mb_strlen() both return the wrong number of characters when I run them on a string.

Here is a piece of the code I'm using:

 $foo = mb_strlen($itemDetails['ITEMDESC'], 'UTF-8');
 echo $foo;

It is telling me that this string - "4½" Straight Iris Scissors" - is 45 characters long. It's 27.

It also tells me that this string - "Infant Heel Warmer, No Adhesive Attachment Pad, 100/cs" - is 54, which is correct.

I assume it's some issue with character encoding; everything should be UTF-8, I think. I've tried feeding mb_strlen() several different character encodings, and they all return this oddball count for the string that has those non-standard characters.

I've no idea why this is happening.

Double-check whether your text really is UTF-8 or not. That "Â" character makes it look like a classic character encoding problem to me. You should check the entire path from the origin of the text to the point in your code that you quoted above, because there are a lot of places where the encoding can get munged.

Did the text originate from an HTML form? Ensure your <form> element includes the accept-charset="UTF-8" attribute.

Did the text get stored in a database along the way? Make sure the database stores and returns the data in UTF-8. This means checking the server's global defaults, the defaults for the database or schema, and the table itself.
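One way to carry out the check described in this answer is to look at the raw bytes instead of the rendered text. The sketch below is only an illustration: the helper name describeString() and the sample byte strings are made up for this example, and it assumes the mbstring extension is loaded. It also shows how a stray "Â" typically appears when UTF-8 bytes are misread as ISO-8859-1 and then converted to UTF-8 a second time.

```php
<?php
// Hypothetical helper (not from the original post): summarise a string's bytes.
function describeString(string $s): array
{
    return [
        'bytes'      => strlen($s),                     // raw byte count
        'chars_utf8' => mb_strlen($s, 'UTF-8'),         // character count if read as UTF-8
        'valid_utf8' => mb_check_encoding($s, 'UTF-8'), // false if the bytes are not valid UTF-8
        'hex'        => bin2hex($s),                    // exact bytes, for inspection
    ];
}

// "4½" correctly encoded as UTF-8: ½ is the two bytes C2 BD.
$good = "4\xC2\xBD";

// The same text as ISO-8859-1: ½ is the single byte BD, which is not valid UTF-8.
$latin1 = "4\xBD";

// Mojibake: UTF-8 bytes misread as ISO-8859-1 and converted to UTF-8 again.
// The C2 byte becomes a visible "Â", so the text renders as "4Â½".
$doubleEncoded = mb_convert_encoding($good, 'UTF-8', 'ISO-8859-1');

$samples = ['good' => $good, 'latin1' => $latin1, 'double' => $doubleEncoded];
foreach ($samples as $label => $s) {
    echo $label, ': ', json_encode(describeString($s)), PHP_EOL;
}
```

Running a helper like this at each stage (form input, database read, just before the mb_strlen() call) can narrow down where the bytes stop being what you expect.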

It is very likely that your input is encoded in UTF-16. You can convert it to UTF-8 before measuring it:

$foo = mb_strlen(mb_convert_encoding($itemDetails['ITEMDESC'], "UTF-8", "UTF-16"), "UTF-8");

Or, if you call mb_strlen() on the original string, be sure to pass the correct encoding as the second parameter:

$foo = mb_strlen($itemDetails['ITEMDESC'], "UTF-16");

Without the correct encoding, mb_strlen() interprets the bytes by the wrong rules and returns wrong results. It's easy to get into trouble when you're dealing with UTF-8/16/32 encoded strings. mb_detect_encoding() will not solve this problem, because it can only guess at the encoding.
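To make the point about the second parameter concrete, the following sketch counts the same three-character text under different encodings. The sample string and the array keys are illustrative assumptions, not data from the question; UTF-16BE is used instead of plain UTF-16 only to keep the byte order explicit, and the mbstring extension is assumed to be available.

```php
<?php
// Sample text 4½" written as explicit UTF-8 bytes (illustrative, not the asker's data).
$utf8 = "4\xC2\xBD\"";

// The same text converted to big-endian UTF-16: two bytes per character here.
$utf16 = mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');

$lengths = [
    'utf8_bytes'     => strlen($utf8),                  // 4 bytes
    'utf8_as_utf8'   => mb_strlen($utf8, 'UTF-8'),      // 3 characters
    'utf8_as_latin1' => mb_strlen($utf8, 'ISO-8859-1'), // 4: every byte counted as a character
    'utf16_bytes'    => strlen($utf16),                 // 6 bytes
    'utf16_as_utf16' => mb_strlen($utf16, 'UTF-16BE'),  // 3 characters
    // Convert back to UTF-8 first, then count as UTF-8: 3 characters again.
    'utf16_to_utf8'  => mb_strlen(mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE'), 'UTF-8'),
];

print_r($lengths);
```

The count is only right when the encoding name matches the actual bytes; a mismatched name changes the result silently instead of raising an error.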
