
Best MySQL collation for search

I've implemented a search with PHP and MySQL. At the moment the collation of my table is "utf8_unicode_ci". The problem is that with this collation "ä" = "a". If I change the collation to "utf8_bin", that comparison works as I want, but that collation is case-sensitive.

So I want both, without changing the SQL or PHP code to use "upper" or "lower" :)

What is the best MySQL collation for my search?

In general, you cannot do that, and using lower in your code is the safe approach that will work for all kinds of characters and languages. For some languages there are specialized collations that would support your comparison, but they might bring complications of their own. For 'ä' = 'Ä' != 'A', you can use utf8_german2_ci (German phone book ordering). It will treat the following characters as equal in comparisons:

Ä = Æ = AE
Ö = Œ = OE
Ü = UE
ß = ss

But comparison (e.g. =, <, >) is meant literally: because a collation actually relates to sorting, this collation has the strange side effect that 'AE' = 'Ä' holds, but 'AE' like 'Ä' does not! It might be harder to account for that in your code than to simply add a lower everywhere, and it might result in some brain-twisting effects later. But if you can live with that, and you don't have to support special characters other than German umlauts (e.g. 'à', 'á' and 'å' will still all be treated as 'a'), you can give it a try.

Example:

create table germanumlaut (
  word varchar(20) collate utf8_german2_ci
);

insert into germanumlaut (word) 
values ('Ä'), ('ä'), ('A'), ('á'), ('AE');

select * from germanumlaut where word = 'A';
-- result: 'A', 'á', as 'á' is not a German umlaut and is treated as 'a'

select * from germanumlaut where word = 'Ä';
-- result: 'Ä', 'ä', 'AE', as 'AE' = 'Ä'

select * from germanumlaut where word > 'Ad';
-- result: 'Ä', 'ä', 'AE', as 'Ä' sorts like 'AE', which comes after 'Ad'

select * from germanumlaut where word like 'A';
-- result: 'A', 'á'

select * from germanumlaut where word like 'Ä';
-- result: 'Ä', 'ä'

select * from germanumlaut where word like 'A%';
-- result: 'A', 'á', 'AE'
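
For comparison, here is a minimal sketch of the lower approach mentioned at the start, using a binary collation so that accents are never folded. The table name binsearch is made up for illustration, and the sketch assumes a MySQL version that still accepts the utf8_ collation names (newer versions list them as utf8mb3_).

```sql
-- Hypothetical table: utf8_bin compares by code point,
-- so 'ä' <> 'a' and also 'A' <> 'a'.
create table binsearch (
  word varchar(20) collate utf8_bin
);

insert into binsearch (word)
values ('Ä'), ('ä'), ('A'), ('a'), ('á');

-- lower() folds case on both sides but leaves the accent in place,
-- so this should match 'Ä' and 'ä' only.
select * from binsearch where lower(word) = lower('Ä');

-- Same idea for a prefix search: this should match 'A' and 'a',
-- but not 'Ä', 'ä' or 'á'.
select * from binsearch where lower(word) like concat(lower('A'), '%');
```

Here = and like agree with each other, at the price of a function call in every comparison. Wrapping the column in lower() generally prevents a plain index on word from being used.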
