简体   繁体   中英

how to improve MySQL REGEXP search?

Question are,

1.How can I improve the performance of SELECT queries in mysql utilizing REGEXP ?

The table looks like

create table `tweets`(
    `id` bigint auto_increment,
    `tweet` varchar(140),
    `time` datetime,
    primary key(`id`)
);

Here the following query takes about 0.35 seconds .

select tweet from tweets where tweet regexp '^[abcdef]{1,4}$';
  1. Will indexing tweet make it faster? If so, what type of index should I use?
  2. My table engine is InnoDB , Is there any other table engine that will become beneficial?

Your best bet is to reduce the result set to evaluate against the regular expression before evaluating. Regular expressions are, for all intents and purposes, impossible to index for.

If I had to come up with a way for this, I would examine patterns that are commonly searched against, and mark them in some indexible way at insert time. For example if you use the ^[abcdef]{1,4}$ expression to search against a lot, I'd make a boolean column first4AThruF and on an insert/ update trigger, update the column to true or false based on whether or not it matched the regular expression. If I indexed the first4AThruF column, and the column had enough selectivity, I could write the query:

select tweet from tweets where first4AThruF = true;

and this should be pretty zippy.

Other possibilities to consider are full-text queries or LIKE clauses, although in the case mentioned above I don't expect them to work well.

If the search you're looking for is at the start of a string, you can use LIKE as a high-level filter then check again with REGEXP :

select tweet from tweets 
where 
    ( 
      tweet LIKE 'a%' OR
      tweet LIKE 'b%' OR
      tweet LIKE 'c%' OR
      tweet LIKE 'd%' OR
      tweet LIKE 'e%'
    )
    AND LENGTH(tweet) <= 4 -- try taking this line out line too
    AND tweet regexp '^[abcdef]{1,4}$';

In spite of being a little convoluted, this should be a lot faster.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM