
Mysql, PHP, searching for multiple words

I'm trying to search a table for specific words.

Say I have a list of words: printer,network,wireless,urgent

I only want to return the rows that contain all of these words.

SELECT * FROM tickets WHERE concat(subject,body) REGEXP "printer|network|wireless|urgent" 

will return any row that contains at least one of these words. How can I make it return only the rows that contain all of them?

Thanks,

There are two ways to do this. The first is the rather obvious approach. Let's say you have all the words that need to appear in an array called $necessaryWords:

$sql = 'SELECT ... FROM ...'; // and so on
$sql .= ' WHERE 1';

foreach ($necessaryWords as $word) {
    // Escape each word before embedding it; assumes an open mysqli connection in $mysqli
    $sql .= " AND CONCAT(subject, body) LIKE '%" . $mysqli->real_escape_string($word) . "%'";
}

However, a pattern with a leading wildcard such as %foo% is rather slow because no index can be used, so every row has to be scanned. This query might cause performance issues with huge tables and/or a large number of necessary words.
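The same LIKE approach can be sketched with bound parameters instead of string concatenation. This is only a minimal sketch: the helper name buildLikeAllQuery is hypothetical, the tickets table with its subject and body columns is taken from the question, and the commented-out usage assumes a PDO connection in $pdo that is not shown here.

```php
<?php
// Hypothetical helper: builds a query that requires every word to appear
// in the concatenated subject and body, with one bound placeholder per word.
function buildLikeAllQuery(array $necessaryWords): array
{
    $sql = 'SELECT * FROM tickets WHERE 1';
    $params = [];

    foreach ($necessaryWords as $word) {
        // Escape LIKE wildcards so "%" and "_" inside a word match literally
        // (backslash is MySQL's default LIKE escape character).
        $escaped = addcslashes($word, '%_\\');
        $sql .= ' AND CONCAT(subject, body) LIKE ?';
        $params[] = '%' . $escaped . '%';
    }

    return [$sql, $params];
}

[$sql, $params] = buildLikeAllQuery(['printer', 'network', 'wireless', 'urgent']);

// Assumed usage with an existing PDO connection in $pdo:
// $stmt = $pdo->prepare($sql);
// $stmt->execute($params);
// $rows = $stmt->fetchAll();
```

One thing to watch for: CONCAT returns NULL if either column is NULL, so such rows never match. If that can happen in your table, CONCAT_WS(' ', subject, body) skips NULL values and may be the safer expression.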

The other approach would be a FULLTEXT index on subject and body. You could then use a fulltext MATCH ... AGAINST search IN BOOLEAN MODE like this:

$sql = 'SELECT ... FROM ...'; // and so on
$sql .= ' WHERE MATCH(subject,body) AGAINST("';

foreach ($necessaryWords as $word)
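    $sql .= ' +' . $word; // the + operator marks the word as required
$sql .= '" IN BOOLEAN MODE)'; // without this modifier the + operators are not interpreted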

Note that before MySQL 5.6 your table must use MyISAM in order to use FULLTEXT indexes. UPDATE: As of MySQL 5.6, InnoDB supports FULLTEXT indexes as well. I guess this could be the better choice performance-wise. Further documentation on fulltext search in boolean mode can be found in the manual.
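A hedged sketch of the boolean-mode variant with a bound parameter follows. The helper name buildBooleanAllTerms is hypothetical, and the sketch assumes a FULLTEXT index covering exactly (subject, body) already exists; stripping the operator characters is my own precaution, not something the original answer does.

```php
<?php
// Assumed one-time setup (run once against the database):
//   ALTER TABLE tickets ADD FULLTEXT (subject, body);

// Hypothetical helper: turns a word list into a boolean-mode search string
// in which every word is mandatory, e.g. "+printer +network".
function buildBooleanAllTerms(array $necessaryWords): string
{
    // Characters that act as operators in boolean mode.
    $operators = ['+', '-', '<', '>', '(', ')', '~', '*', '"', '@'];
    $terms = [];

    foreach ($necessaryWords as $word) {
        // Replace operators with spaces so input cannot alter the expression,
        // then require each remaining token individually.
        $clean = str_replace($operators, ' ', $word);
        foreach (preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY) as $token) {
            $terms[] = '+' . $token;
        }
    }

    return implode(' ', $terms);
}

$sql = 'SELECT * FROM tickets WHERE MATCH(subject, body) AGAINST(? IN BOOLEAN MODE)';
$search = buildBooleanAllTerms(['printer', 'network', 'wireless', 'urgent']);

// Assumed usage with an existing PDO connection in $pdo:
// $stmt = $pdo->prepare($sql);
// $stmt->execute([$search]);
```

Keep in mind that stopwords and words shorter than the minimum token length (ft_min_word_len, default 4, for MyISAM; innodb_ft_min_token_size, default 3, for InnoDB) are not indexed, so such terms may not behave as expected.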

not the best way, but:

SELECT * FROM tickets WHERE
concat(subject,body) REGEXP "printer" AND
concat(subject,body) REGEXP "network" AND
concat(subject,body) REGEXP "wireless" AND
concat(subject,body) REGEXP "urgent"

SELECT * FROM tickets WHERE
   concat(subject,body) LIKE "%printer%" AND
   concat(subject,body) LIKE "%network%" AND
   concat(subject,body) LIKE "%wireless%" AND
   concat(subject,body) LIKE "%urgent%"

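I'm not sure this works with the MySQL regex engine (the POSIX-style engine used before MySQL 8.0 has no lookarounds, while the ICU-based engine in 8.0.4 and later does support them), but the regex below uses lookaheads to achieve what you are looking for. It finds the words of interest irrespective of the order in which they occur: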

^(?=.*printer)(?=.*network)(?=.*wireless)(?=.*urgent).*$

Demo: http://www.rubular.com/r/XcVz5xMZcb

Some regex lookaround examples here: http://www.rexegg.com/regex-lookarounds.html
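To illustrate how the lookahead pattern behaves, here is a small sketch that applies it on the PHP side with preg_match rather than inside MySQL. The function name containsAllWords and the sample strings are made up for illustration; in practice this would only be useful for filtering rows that have already been fetched.

```php
<?php
// Hypothetical helper: builds a lookahead pattern that requires every word,
// in any order, and tests a string against it with PHP's PCRE engine.
function containsAllWords(string $text, array $words): bool
{
    $pattern = '^';
    foreach ($words as $word) {
        // preg_quote() escapes regex metacharacters and the "/" delimiter.
        $pattern .= '(?=.*' . preg_quote($word, '/') . ')';
    }

    // The "s" modifier lets "." match newlines, which multi-line bodies need.
    return preg_match('/' . $pattern . '/s', $text) === 1;
}

$words = ['printer', 'network', 'wireless', 'urgent'];

// All four words present, in a different order than the list.
var_dump(containsAllWords('urgent: wireless printer lost from network', $words));

// Only one of the words present.
var_dump(containsAllWords('printer is offline', $words));
```

Each (?=.*word) group is a zero-width assertion anchored at the start of the string, so all of them must succeed for the match to succeed, which is what gives the "all words, any order" behaviour.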

Alternative answer, just because I thought of it when I looked at your question. I do not know whether it would be faster than the other answers (most likely not):

(SELECT * FROM tickets WHERE subject LIKE "%printer%" OR body LIKE "%printer%")
UNION
(SELECT * FROM tickets WHERE subject LIKE "%network%" OR body LIKE "%network%")
UNION
(SELECT * FROM tickets WHERE subject LIKE "%wireless%" OR body LIKE "%wireless%")
UNION
(SELECT * FROM tickets WHERE subject LIKE "%urgent%" OR body LIKE "%urgent%")

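UPDATE: This is wrong. UNION combines the result sets, so the query returns rows that contain any one of the words rather than only the rows that contain all of them.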

