简体   繁体   中英

REGEXP with MySQL isn't working properly

If I want to search all data with the string léon in the name field, I use a query like this:

SELECT * FROM `my_table` WHERE `name` REGEXP 'léon' 

This request is OK.

Now, I need to retrieve all rows with the string léon or leon or lèon , so I tried:

SELECT * FROM `my_table` WHERE `name` REGEXP 'l[éeè]on' 

This query doesn't work... I have no error, but no result. Do you have any ideas?

MySQL's regexes don't work reliably with multibyte character sets .

However, if you're using UTF-8, you can instead make use of the accent insensitive LIKE and = searches which looks like it may be what you're really after anyway.

From the docs:

The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.

What you could try is handling the multibyte character like this so it is handling 1 byte or 2. It won't be perfect because it will matchin on "lion" for example, but may get you closer.

REGEXP 'l.{1,2}on'

Another thing that might be affecting you is the way that REGEXP works in MySQL is that there are implied wildcards on the end because you are not specifying the the beginning and end so

REGEXP 'l[éeè]on' 

is equivalent to

REGEXP '^.*l[éeè]on.*$' 

This means that you might be matching MORE than what you expect. Sounds like your problem is the first scenario I mention

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM