How can I perform an accent-sensitive but case-insensitive UTF-8 search in MySQL? utf8_bin is case-sensitive, and utf8_general_ci is accent-insensitive.
If you want to distinguish "café" from "cafe", you can use:
SELECT word FROM table_words WHERE HEX(word) LIKE HEX("café");
This returns only 'café'.
If you instead use:
SELECT word FROM table_words WHERE HEX(word) LIKE HEX("cafe");
it returns only 'cafe'. I'm using the latin1_german2_ci collation.
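To see why the HEX comparison separates the two words, it helps to look at the bytes. The sketch below is Python rather than MySQL and assumes the text is stored as UTF-8 (in latin1, é would be the single byte E9 instead). The helper name mysql_hex is hypothetical and only approximates what HEX() returns for a string. It also shows that the comparison is case-sensitive, because 'Café' and 'café' differ in their first byte.

```python
def mysql_hex(text: str) -> str:
    """Approximate MySQL HEX() for a string: uppercase hex of its UTF-8 bytes."""
    return text.encode("utf-8").hex().upper()


# The accented and unaccented words produce different byte sequences,
# so comparing the hex strings keeps them apart.
print(mysql_hex("café"))  # 636166C3A9
print(mysql_hex("cafe"))  # 63616665

# Upper and lower case also differ at the byte level, so this
# comparison on its own is case-sensitive as well.
print(mysql_hex("Café"))  # 436166C3A9
```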
There doesn't seem to be one, because case sensitivity is tough to do in Unicode.
There is a utf8_general_cs collation, but it seems to be experimental and, according to this bug report, doesn't do what is expected when used with LIKE.
If your data consists of Western umlauts only (i.e. umlauts that are included in ISO-8859-1), you might be able to collate your search operation to latin1_german2_ci or create a separate search column with it (that specific collation is accent-sensitive according to this page; latin1_general_ci might be as well, but I don't know and can't test right now).
You can use HEX to make the search accent-sensitive. Then simply add LCASE to make it case-insensitive again. That gives:
SELECT name FROM people WHERE HEX(LCASE(name)) = HEX(LCASE("René"))
Wrapping the column in functions like this means no index can be used. If you want to avoid a full table scan and you have an index on name, also search for the same value without HEX and LCASE:
SELECT name FROM people WHERE name = "René" and HEX(LCASE(name)) = HEX(LCASE("René"))
This way the index on name is used to narrow the search to, for example, only the rows "René" and "Rene", and the HEX comparison then needs to be done only on those two rows instead of on the complete table.
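The two-step idea (a cheap accent-insensitive match first, then an exact HEX(LCASE(...)) check on the few survivors) can be sketched outside the database. This is Python, not MySQL. The function names are hypothetical, str.lower() only approximates LCASE, and stripping combining marks is only a rough stand-in for an accent- and case-insensitive collation such as utf8_general_ci.

```python
import unicodedata
from typing import List


def hex_lcase(text: str) -> str:
    """Approximate HEX(LCASE(text)): hex of the lowercased UTF-8 bytes."""
    return text.lower().encode("utf-8").hex().upper()


def loose_key(text: str) -> str:
    """Rough stand-in for an accent- and case-insensitive collation key."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower()


def search(names: List[str], needle: str) -> List[str]:
    # Step 1: the cheap, index-friendly comparison (name = "René").
    candidates = [n for n in names if loose_key(n) == loose_key(needle)]
    # Step 2: the exact byte comparison, run only on the candidates.
    return [n for n in candidates if hex_lcase(n) == hex_lcase(needle)]


names = ["René", "Rene", "RENÉ", "rené", "Renée"]
print(search(names, "René"))  # ['René', 'RENÉ', 'rené']
```

"Rene" passes the first step but is rejected by the second, while "Renée" never reaches the second step.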