
Sphinx search doesn't understand special characters (accents)

I have a MySQL db in utf8_general_ci.

And my sphinx.conf is like this:

source jobs
{
    type                = mysql
    sql_sock            = /var/run/mysqld/mysqld.sock
    sql_query_pre       = SET NAMES utf8
    ...
}

When I query "système" I would like sphinx to search for "système" & "systeme" in the DB.

AND when I query "systeme" I would like sphinx to search for "système" & "systeme" too.

What it does now is strip everything up to and including the accented character. So "système" becomes "me" and "dév" becomes "v"...

PS: I'm using sphinxapi.php, which I know isn't preferred over SphinxQL, but it should still work with the API. I use the EXTENDED match mode.

You need to set up your charset_table to be able to do this:

http://sphinxsearch.com/docs/current.html#charsets

Alas, there is no 'magic' config option that just works with text in every language; you need to set up charset_table to handle the language(s) you deal with.

Although this comes pretty close: http://sphinxsearch.com/forum/view.html?id=9312 (i.e. it borrows the hard work MySQL has done with collations and mimics it in charset_table)
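
As a rough illustration of what such a charset_table could look like for French text, here is a minimal sketch of an index section. The index name "jobs" and the choice of letters are assumptions, not taken from the question, and the list covers only common French accented letters. Each accented letter (upper and lower case) is mapped to its unaccented lowercase base letter, so "système" and "systeme" should end up as the same token both at indexing and at query time. The charset_type line is only relevant on older Sphinx releases that still have that directive.

```
index jobs
{
    source          = jobs
    ...

    # Only on older Sphinx releases that still have this directive.
    charset_type    = utf-8

    # Digits, underscore, ASCII letters folded to lowercase, then
    # accented French letters folded to their unaccented base letter.
    charset_table   = 0..9, _, a..z, A..Z->a..z, \
        U+C0->a, U+C2->a, U+C4->a, U+E0->a, U+E2->a, U+E4->a, \
        U+C7->c, U+E7->c, \
        U+C8->e, U+C9->e, U+CA->e, U+CB->e, \
        U+E8->e, U+E9->e, U+EA->e, U+EB->e, \
        U+CE->i, U+CF->i, U+EE->i, U+EF->i, \
        U+D4->o, U+D6->o, U+F4->o, U+F6->o, \
        U+D9->u, U+DB->u, U+DC->u, U+F9->u, U+FB->u, U+FC->u, \
        U+FF->y, U+178->y
}
```

Any character not listed in charset_table is treated as a separator, which would explain why words currently get cut at the accent. The index has to be rebuilt after changing charset_table for the new mapping to take effect.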
