PHP: find 3-char words in query string to augment MySQL full-text search

I'm working on a simple MySQL full-text search feature on a CakePHP site, and noticed that MySQL strips short words (3 characters or fewer) out of the query. Some of the items on the site have 3-character titles, however, and I'd like to include them in the results. (I've ruled out more robust search appliances like Solr due to budget constraints.)

So I want to find any 3-character words in the query string and do a quick lookup on just the title field. The easiest way I can think of is to explode() the string and iterate over the resulting array with strlen() to find words of 3 characters. Then I'll take those words and do a LIKE search on the title field, just to make sure nothing that should obviously be in the results was missed.
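A minimal sketch of that idea, assuming PHP 7+ and a query driver that accepts `?` placeholders. The function names findShortWords and buildTitleLikeClause are hypothetical, and strlen() counts bytes, so multibyte titles would need mb_strlen() instead:

```php
<?php
// Hypothetical helper: return the words of exactly $length characters in a query string.
function findShortWords(string $query, int $length = 3): array
{
    // Split on runs of whitespace, commas and periods, dropping empty pieces.
    $words = preg_split('/[\s,.]+/', $query, -1, PREG_SPLIT_NO_EMPTY);

    // Keep only the words of the requested length (strlen counts bytes).
    $short = array_filter($words, function ($word) use ($length) {
        return strlen($word) === $length;
    });

    // Re-index so the result is a plain list.
    return array_values($short);
}

// Hypothetical helper: build "title LIKE ? OR ..." plus the values to bind to it.
function buildTitleLikeClause(array $words): array
{
    $conditions = [];
    $params = [];
    foreach ($words as $word) {
        $conditions[] = 'title LIKE ?';
        // Escape LIKE wildcards so the word is matched literally.
        $escaped = strtr($word, ['\\' => '\\\\', '%' => '\\%', '_' => '\\_']);
        $params[] = '%' . $escaped . '%';
    }
    return [implode(' OR ', $conditions), $params];
}
```

Binding the words as parameters, rather than concatenating them into the SQL, avoids having to quote them by hand.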

Is there a better / easier way to approach this?

UPDATE: Yes, I know about the ft_min_word_len setting in MySQL. I don't think I want to do this.

There is a system option named “ft_min_word_len” that defines the minimum length of words to be indexed. You can set this configuration directive to a lower value (e.g. 2) in the [mysqld] section of your MySQL configuration file. This file is typically found under “/etc/mysql” or “/etc”. On Windows, look in the Windows directory or the MySQL home folder. After changing the value, restart the server and rebuild the FULLTEXT indexes (for MyISAM tables, REPAIR TABLE tbl_name QUICK), since existing indexes were built with the old minimum.

    [mysqld]
    ft_min_word_len=2

I'm going with my original idea for now, unless someone has a better approach not involving ft_min_word_len. (If I could set it at a per-database level I might consider it, but otherwise it is too far-reaching.)

I have a function like this:

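    // Strip punctuation, plus quote and backslash characters that could
    // break out of the quoted SQL strings built below.
    $query = str_replace(array(',', '.', '"', "'", '\\'), '', $query);
    $terms = explode(' ', $query);
    $short = array();

    // Collect each 3-character term, wrapped in double quotes.
    foreach ($terms as $term) {
        if (strlen($term) == 3) {
            $short[] = '"' . $term . '"';
        }
    }

    // Comma-separated list for use in IN (...); empty string if none matched.
    return implode(', ', $short);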

I then use the returned string to search the title column, WHERE title IN ($short), to supplement the full-text search. (This only works when $short is non-empty, since IN () is a syntax error in MySQL.) I arbitrarily assign these rows a score of 3.5 so that they can be sorted along with the other full-text hits; I chose a relatively high score because each one is an exact match for the title of the record.

This doesn't feel very elegant to me, but it resolves the problem.
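For illustration, here is one way the two searches might be combined into a single statement. This is only a sketch under assumptions: the `items` table and its `id`, `title` and `body` columns are hypothetical, buildSearchQuery is a made-up name, and the short terms are passed as an array and bound as parameters instead of being quoted into the SQL by hand. The 3.5 score is the arbitrary value mentioned above.

```php
<?php
// Sketch only: the `items` table and its `id`, `title`, `body` columns are hypothetical.
function buildSearchQuery(string $query, array $shortTerms): array
{
    // Regular full-text search; MATCH ... AGAINST supplies the relevance score.
    $fullText = 'SELECT id, title, MATCH(title, body) AGAINST (?) AS score '
              . 'FROM items WHERE MATCH(title, body) AGAINST (?)';
    $params = [$query, $query];

    // With no short terms there is nothing to add, and IN () would be invalid.
    if (count($shortTerms) === 0) {
        return [$fullText . ' ORDER BY score DESC', $params];
    }

    // One placeholder per short term keeps user input out of the SQL text.
    $placeholders = implode(', ', array_fill(0, count($shortTerms), '?'));
    $titleMatch = 'SELECT id, title, 3.5 AS score FROM items '
                . 'WHERE title IN (' . $placeholders . ')';

    // Merge both result sets and sort them by score together.
    $sql = '(' . $fullText . ') UNION (' . $titleMatch . ') ORDER BY score DESC';
    return [$sql, array_merge($params, array_values($shortTerms))];
}
```

The returned pair could be passed to something like PDO's prepare() and execute(). Note that a row matched by both halves would appear twice, once with each score, so the caller may still want to de-duplicate by id.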
