How to get the strlen of a multi-language string

Question

I want to get the strlen() of Shift-JIS and UTF-8 strings and then compare them. A string can be mixed, e.g. "ああ12345678sdfdszzz". I tried strlen, but it gives different results. mb_strlen doesn't help either, because the string is mixed.

For example:

ああ12345678 >> strlen() = 24 chars
ああああああああああああああああ >> strlen() = 48 chars
ああああああああああああああああああ >> strlen() = 54 chars

There seems to be no rule. So what is the best way to calculate the length of multi-language strings and compare them?

Answer 1

strlen only counts bytes, so it is only useful for single-byte character encodings. For multi-byte character encodings, use mb_strlen, which counts the actual characters.
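To make the byte/character difference concrete, here is a minimal sketch. It assumes PHP 7 or later (for the "\u{...}" escape, which always produces UTF-8 bytes regardless of how the source file is saved) and that the mbstring extension is loaded. The variable names are only for illustration.

```php
<?php
// "\u{3042}" is the hiragana "あ"; in UTF-8 it takes three bytes.
$text = "\u{3042}\u{3042}abc"; // "ああabc"

$byteCount = strlen($text);             // counts bytes: 3 + 3 + 3 = 9
$charCount = mb_strlen($text, 'UTF-8'); // counts characters: 2 + 3 = 5

echo $byteCount, "\n";
echo $charCount, "\n";
```

Passing the encoding explicitly to mb_strlen avoids depending on the configured internal encoding.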

Answer 2

I would write a function that checks from where to where a particular encoding exists.

Then I would split the string by encoding, run mb_strlen on each part, and sum up the lengths afterwards. Then repeat for the second string and compare.

I guess you understand my point ;)

PS: Use mb_detect_encoding to detect the encoding (see the comments for further ideas from the PHP community).
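A rough sketch of this idea for the simpler case where each string as a whole is either UTF-8 or Shift-JIS (not mixed within one string). Note that it swaps mb_detect_encoding for an explicit mb_check_encoding test: bytes that form valid UTF-8 can also form valid Shift-JIS, so the sketch tests UTF-8 first and treats everything else as Shift-JIS. The function name char_length is hypothetical, and PHP 7+ with mbstring is assumed.

```php
<?php
// Hypothetical helper: count characters in a string that is assumed
// to be entirely UTF-8 or entirely Shift-JIS.
function char_length(string $s): int
{
    // Test UTF-8 first; anything that is not valid UTF-8 is treated
    // as Shift-JIS (an assumption of this sketch, not a general rule).
    $encoding = mb_check_encoding($s, 'UTF-8') ? 'UTF-8' : 'SJIS';
    return mb_strlen($s, $encoding);
}

$utf8 = "\u{3042}\u{3042}123";                       // "ああ123" as UTF-8
$sjis = mb_convert_encoding($utf8, 'SJIS', 'UTF-8'); // same text as Shift-JIS

// The byte lengths differ, but the character counts match.
$sameLength = char_length($utf8) === char_length($sjis);
var_dump($sameLength);
```

If the encoding of each input is already known (for example from a form's charset), passing it directly to mb_strlen is more reliable than any detection.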

Answer 3

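// Read the submitted value (empty string if the field is missing).
$field = $_POST['field'] ?? '';
// Count characters, not bytes, assuming the input is UTF-8.
$field_length = mb_strlen($field, 'UTF-8');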

3 answers

solution1
6 ACCEPTED 2012-02-13 07:03:42

solution2
2 2012-02-13 07:13:36

solution3
0 2014-11-15 14:55:51
