简体   繁体   中英

Case insensitive unicode collation in MySQL

I've got a database where we store usernames with a capital first letter of each name -- ie, IsaacSparling. I'm trying to do case insensitive autocomplete against my MySQL (v5.1.46) db. Table has a charset of UTF8 and a collation of utf8_unicode_ci. I've done these tests against the utf8_general_ci collation as well.

Plain ASCII text works fine:

mysql> select username from users where username like 'j%';
+----------------+
| username       |
+----------------+
| J********      |
| J***********   |
| J************* |
+----------------+
3 rows in set (0.00 sec)

mysql> select username from users where username like 'J%';
+----------------+
| username       |
+----------------+
| J********      |
| J***********   |
| J************* |
+----------------+
3 rows in set (0.00 sec)

(names redacted, but they're there).

However, when I try to do the same for unicode characters outside the ASCII set, no such luck:

mysql> select username from users where username like 'ø%';
Empty set (0.00 sec)


mysql> select username from users where username like 'Ø%';
+-------------+
| username    |
+-------------+
| Ø*********  |
+-------------+
1 row in set (0.00 sec)

Some investigation has lead me to this: http://bugs.mysql.com/bug.php?id=19567 (tl;dr, this is a known bug with the unicode collations, and fixing it is at 'new feature' priority -- ie, won't be finished in any reasonable timeframe).

Has anybody discovered any effective workarounds that allow for case-insensitive searching for unicode characters in MySQL? Any thoughts appreciated!

Works fine for me with version 5.1.42-community

Maybe your mysql client did not send the unicode characters properly. I tested with sqlYog and it worked just fine with both utf8_unicode_ci and utf8_general_ci collations

I just resolved the same problem using the query

show variables like '%char%';

My character_set_client was set to 'utf8', but character_set_connection and character_set_results were set to 'latin1'. Thus, the functions UPPER, LOWER, LIKE did not work as expected.

I just inserted the line

mysql_query("SET NAMES utf8");

right after connection to get the case-insensitive searching work.

如果您关心的是能够通过文本订购字段值而不关心它是大写还是小写我认为您可以做的最好的事情是在寻址字段而不是只键入username ,键入LOWER(username) username ,然后您可以完美地使用该字段的命令,通过其名称调用它

Have you tried using CONVERT? Something like

WHERE `lastname` LIKE CONVERT( _utf8 'ø%' USING latin1 )

might work for you.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM