I have created a MySQL 5.6 table with a column encoded in UTF-8, holding text in Romanian, Czech, Hungarian, Polish, French, German and the Scandinavian languages, i.e. European characters, but many of them non-ASCII.
However, I would like to query this column using just ASCII characters, e.g. in a LIKE clause, so that characters such as ă, î, â, ș, ț, ü, ä, ö can be successfully matched using a, e, i, o, u, s, t, etc.
Is that even possible?
Well, I don't see a conventional way to do this using only SQL. You could write a query preprocessor that automatically replaces ASCII characters with their European counterparts, for example with https://php.net/manual/en/function.str-replace.php (assuming you are using PHP), but you would still need to feed every query through it.
I found a partial answer to my question:
If the collation you define for the column is utf8_general_ci, then many (if not all) accented variants of a, e, o, u will be found by a query using plain a, e, o, u. I even found the ń in Wołoszyńska using plain n.
UNFORTUNATELY, the lowercase L with stroke (ł) in the same word was not found by a plain l.
The answer was suggested by dddd's answer here
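To check how a given collation treats a pair of letters, a quick comparison like the one below can help. It is only a sketch: it assumes the connection character set is utf8 (hence the SET NAMES), and the comments restate what was observed above rather than guaranteeing the result on every server version.

```sql
-- Assumption: the client sends UTF-8 and the connection uses the utf8 character set.
SET NAMES utf8;

-- 1 means the two letters compare equal under the collation, 0 means they do not.
SELECT 'ń' = 'n' COLLATE utf8_general_ci AS n_acute_equals_n,   -- observed above: found by plain n
       'ł' = 'l' COLLATE utf8_general_ci AS l_stroke_equals_l;  -- observed above: not found by plain l
```

Swapping in another collation name in the COLLATE clause lets you repeat the test for any letter pair you care about.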
There is a cheat sheet showing which letters compare "equal" under which utf8 collations. It agrees that Ł is not treated as equal to L by most collations: utf8_general_ci sorts it after Z; utf8_unicode_520_ci is the exception and sorts it with L; the rest sort it before M.
polish_ci treats Ę as distinct from the rest of the E-like characters. Ditto for Ą. The Baltic states tend to keep certain accented consonants separate.
In polish_ci, ń (UTF-8 hex C584) collates after N and before O; the other collations treat it as equal to N.
utf8_unicode_520_ci is probably the best collation for you.
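As a rough illustration of that suggestion, the sketch below declares the column with utf8_unicode_520_ci and then searches it with a plain ASCII pattern. The table name `people`, the column `surname` and the sample rows are hypothetical, made up for this example; the expectation that the Polish name matches rests on the cheat sheet's statement that this collation sorts Ł with L.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE people (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  surname VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_unicode_520_ci NOT NULL
);

INSERT INTO people (surname) VALUES ('Wołoszyńska'), ('Müller'), ('Brâncuși');

-- Plain ASCII pattern; the column's collation decides which letters compare equal.
SELECT surname FROM people WHERE surname LIKE '%woloszynska%';

-- If the column was created with a different utf8 collation, the collation
-- can also be overridden for a single query instead of altering the table.
SELECT surname FROM people
WHERE surname LIKE '%muller%' COLLATE utf8_unicode_520_ci;
```

Declaring the collation on the column keeps every query simple, while the per-query COLLATE form is handy for experimenting before you commit to an ALTER TABLE.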
Also, you might consider "combining" accents, where two utf8 'characters' "combine" to make a single character. utf8_unicode_ci collates 'correctly' for most of them, as seen here.
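One way to see the effect of combining accents is to compare a decomposed sequence with its precomposed form under two collations. This is a sketch assuming a utf8 connection; the byte values are 'a' plus U+0306 COMBINING BREVE versus the single code point ă, and the variable names are arbitrary. I would expect utf8_unicode_ci to report them as equal and utf8_general_ci not to, but check on your own server.

```sql
SET NAMES utf8;

-- 0x61CC86 is 'a' followed by U+0306 COMBINING BREVE; 0xC483 is the precomposed 'ă'.
SET @decomposed  = CONVERT(UNHEX('61CC86') USING utf8);
SET @precomposed = CONVERT(UNHEX('C483') USING utf8);

-- 1 means the two strings compare equal under that collation.
SELECT @decomposed = @precomposed COLLATE utf8_unicode_ci AS equal_unicode_ci,
       @decomposed = @precomposed COLLATE utf8_general_ci AS equal_general_ci;
```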