Best MySQL collate for search

I've implemented a search with PHP and MySQL. At the moment the collation of my table is "utf8_unicode_ci". The problem is that with this collation "ä" = "a". If I change the collation to "utf8_bin" everything works, but that collation isn't case-insensitive.

So I want both, without changing the SQL or PHP code to use "upper" or "lower" :)

What is the best MySQL collate for my search?

In general, you cannot do that; using lower in your code is the safe approach and works for all kinds of characters and languages. For some languages there are specialized collations that support your comparison, but they come with complications of their own. For 'ä' = 'Ä' != 'A', you can use utf8_german2_ci (German phonebook ordering). It treats the following characters as equal in comparisons:

Ä = Æ = AE
Ö = Œ = OE
Ü = UE
ß = ss

But "comparison" (e.g. =, <, >) is meant literally: because a collation actually relates to sorting, this collation has the strange side effect that 'AE' = 'Ä' is true, but 'AE' like 'Ä' is not! It might be harder to account for that in your code than to simply add a lower everywhere, and it might result in some brain-twisting effects later. But if you can live with that, and you don't have to support special characters other than German umlauts (e.g. 'à', 'á' and 'å' will still all be treated as 'a'), you can give it a try.

Example:

create table germanumlaut (
  word varchar(20) collate utf8_german2_ci
);

insert into germanumlaut (word) 
values ('Ä'), ('ä'), ('A'), ('á'), ('AE');

select * from germanumlaut where word = 'A';
-- result: 'A', 'á', as 'á' is not a german umlaut and treated as 'a'

select * from germanumlaut where word = 'Ä';
-- result: 'Ä', 'ä', 'AE', as 'AE' = 'Ä'

select * from germanumlaut where word > 'Ad';
-- result: 'Ä', 'ä', 'AE', as 'Ä' = 'AE', which sorts after 'Ad'

select * from germanumlaut where word like 'A';
-- result: 'A', 'á'

select * from germanumlaut where word like 'Ä';
-- result: 'Ä', 'ä'

select * from germanumlaut where word like 'A%';
-- result: 'A', 'á', 'AE'
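
For comparison, the lower approach mentioned at the start could look roughly like the following. This is only a sketch: the table name binsearch is made up for illustration, and it assumes the column uses the utf8 character set with utf8_bin and that the connection character set is utf8 as well.

```sql
-- Hypothetical table; the name and column are assumptions for illustration.
create table binsearch (
  word varchar(20) character set utf8 collate utf8_bin
);

insert into binsearch (word)
values ('Ä'), ('ä'), ('A'), ('á'), ('AE');

-- utf8_bin compares code points, so a plain = is case-sensitive
-- and accent-sensitive.
select * from binsearch where word = 'Ä';

-- Lowercasing both sides makes the match case-insensitive,
-- while 'ä' and 'a' stay different characters.
select * from binsearch where lower(word) = lower('Ä');
```

Under those assumptions, the first query should match only 'Ä' and the second should match 'Ä' and 'ä', but not 'A', 'á' or 'AE'. Keep in mind that wrapping the column in lower usually prevents MySQL from using an ordinary index on it.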
