
Turkish Character issue in PHP and MySQL

I'm trying to count occurrences of every letter of the Turkish alphabet in a MySQL database.

When I count the letter "a" like this, I get the correct result:

$a = 0; // initialise the counter before the loop
while($nt=mysql_fetch_array($rt))
{
    $mystring = $nt["word"];

    for($i = 0; $i < strlen($mystring) ; $i++)
    {
        if($mystring[$i] == 'a')
        {
            $a++;
        }
    }
}

When I replace "a" with "ç", I get zero. I have already added this code:

$bd = mysql_connect($mysql_hostname, $mysql_user, $mysql_password) or die("database unavailable");
mysql_set_charset('utf8', $bd);

How can I fix my code for Turkish characters? Thanks.

In UTF-8, ç is encoded as two bytes (C3 A7), so a byte-by-byte comparison won't work: $mystring[$i] returns a single byte, which can never equal the two-byte string 'ç'. Consider substr_count:

$s = "abçdeç";
print substr_count($s, 'ç'); // 2

or use a Unicode-aware function like this:

function utf8_char_count($s) {
    $count = [];
    preg_match_all('~.~u', $s, $m);
    foreach($m[0] as $c)
        $count[$c] = isset($count[$c]) ? $count[$c] + 1 : 1;
    return $count;
}

print_r(utf8_char_count('çAüθç')); // [ç] => 2 [A] => 1 [ü] => 1 [θ] => 1
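If you would rather keep the shape of the original loop, here is a minimal sketch that first splits the word into whole characters and then compares them. It assumes the data really is UTF-8; the sample value of $word is made up for illustration, and "ç" is written as explicit bytes so the example does not depend on how the file is saved.

```php
<?php
// "ç" as explicit UTF-8 bytes (C3 A7).
$c = "\xC3\xA7";
// Hypothetical sample standing in for $nt["word"]: "açakç".
$word = "a" . $c . "ak" . $c;

echo strlen($c), "\n";   // 2, because strlen() counts bytes, not characters
echo bin2hex($c), "\n";  // c3a7

// The "u" modifier makes PCRE treat the subject as UTF-8, so an empty
// pattern splits between characters rather than between bytes.
$chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);

$count = 0;
foreach ($chars as $ch) {
    if ($ch === $c) {
        $count++;
    }
}
echo $count, "\n"; // 2
```

The same loop with $word[$i] would see C3 and A7 as two separate one-byte strings, which is why the comparison in the question never matches.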

This assumes that your strings are actually UTF-8. If that is not the case (hint: inspect the raw bytes with var_dump(rawurlencode($str))), check your DB and connection settings (see the linked thread).
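To make the hint concrete, here is a rough sketch of what the check might look like. The helper name looks_like_utf8 is made up for this example, and it relies on preg_match() failing on malformed UTF-8 when the "u" modifier is set. The single-byte value E7 is how "ç" appears in ISO-8859-9 (and Latin-1), which is a likely suspect if the connection charset is not applied.

```php
<?php
// Hypothetical helper: true when $s is well-formed UTF-8.
function looks_like_utf8($s) {
    // With the "u" modifier, preg_match() returns false on invalid UTF-8.
    return preg_match('//u', $s) === 1;
}

$utf8   = "\xC3\xA7"; // "ç" encoded as UTF-8 (two bytes)
$latin5 = "\xE7";     // "ç" encoded as ISO-8859-9 (one byte)

echo rawurlencode($utf8), "\n";   // %C3%A7
echo rawurlencode($latin5), "\n"; // %E7

var_dump(looks_like_utf8($utf8));   // bool(true)
var_dump(looks_like_utf8($latin5)); // bool(false)
```

If the value coming back from the database prints as %E7 rather than %C3%A7, the counting code is not the problem; the data is arriving in a single-byte encoding.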
