
strlen and non-Latin characters in a for loop

I have an HTML form that sends 索索索 via POST to my PHP file.

I tried strlen on it, but it gives me 24 instead of 3 (?!)...and this then breaks my for loop:

$in = $_POST['inn'];
$length = strlen($in); // strlen counts bytes, not characters
for ($i = 0; $i < $length; $i++) {
    $cleanchar = $in[$i]; // picks out a single byte of the string
}

I want $cleanchar to hold one individual character, as if only that one character had been sent with the POST.

How can I separate each character using PHP?

Try using mb_strlen for multi-byte character operations:

echo mb_strlen('索索索', 'utf-8'); // or omit second parameter or change to your encoding

From documentation:

Returns the number of characters in string str having character encoding encoding. A multi-byte character is counted as 1.

http://php.net/manual/en/function.mb-strlen.php
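To get each character on its own, the loop from the question can be rewritten with the multi-byte functions. This is only a minimal sketch: it assumes the mbstring extension is enabled and that the input is UTF-8, and split_chars is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper (not from the original post): split a UTF-8 string
// into an array of single characters using the mbstring functions.
function split_chars(string $in): array
{
    $chars = [];
    $length = mb_strlen($in, 'UTF-8'); // counts characters, not bytes
    for ($i = 0; $i < $length; $i++) {
        // mb_substr takes a character offset and a character count
        $chars[] = mb_substr($in, $i, 1, 'UTF-8');
    }
    return $chars;
}

$in = '索索索'; // stand-in for $_POST['inn']
foreach (split_chars($in) as $cleanchar) {
    echo $cleanchar, "\n";
}
```

On PHP 7.4 or later, mb_str_split($in, 1, 'UTF-8') should give the same array without a hand-written loop.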

Your form is not sending three characters, it is sending three sequences of &#12345; (where 12345 is the character code for those symbols - I don't know what it actually is).

That's eight characters, times three symbols, makes a string length of 24.

If you were to run echo htmlspecialchars($_POST['inn']); you would see this effect clearly.
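If the data really does arrive as numeric character references, one possible workaround is to decode them before measuring. A rough sketch, assuming UTF-8 as the target encoding: 索 is U+7D22, so its decimal reference would be &#32034;, and decode_refs is a hypothetical helper name.

```php
<?php
// Hypothetical helper: turn numeric character references back into UTF-8 text.
function decode_refs(string $raw): string
{
    return html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Assumed payload: what a browser may send for 索索索 when the page
// encoding cannot represent the characters directly.
$raw = '&#32034;&#32034;&#32034;';

echo strlen($raw), "\n";                 // 24: eight ASCII characters per symbol
$decoded = decode_refs($raw);
echo mb_strlen($decoded, 'UTF-8'), "\n"; // 3: one character per symbol
```

Decoding only treats the symptom. Serving the form page as UTF-8 should stop the browser from producing the references in the first place.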

I'm pretty sure there's a way to fix this... I think you need to make sure the document character set is UTF-8: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> in your <head> section.

Even then, you will get a length of 6 or maybe 9 depending on the byte length of those symbols, since that's what strlen measures.
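The byte-versus-character difference can be checked directly. In this sketch the symbol is written as a code point escape (PHP 7+) so the result does not depend on how the source file is saved. byte_and_char_length is a hypothetical helper, and mbstring is assumed to be available.

```php
<?php
// Hypothetical helper: report both measurements for a UTF-8 string.
function byte_and_char_length(string $s): array
{
    return [
        'bytes' => strlen($s),             // strlen counts bytes
        'chars' => mb_strlen($s, 'UTF-8'), // mb_strlen counts characters
    ];
}

$s = str_repeat("\u{7D22}", 3); // 索索索, each symbol encoded as UTF-8
print_r(byte_and_char_length($s));

// The three UTF-8 bytes that make up one 索
echo bin2hex("\u{7D22}"), "\n"; // e7b4a2
```

For this particular symbol UTF-8 uses three bytes, so strlen reports 9 for the three-symbol string.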
