Search code optimisation - CakePHP

I have a database containing a list of files (around 6000 files). All these files have certain additional details linked to them (such as project numbers, sectors, clients, comments, disciplines).

Although the code and the search work, it's very slow. A simple search with two terms takes about a minute to complete.

My code is below. What I want to know is, what can I do to simplify and optimize my search function?

public function search() {
    $Terms = explode(' ',$this->request->data['KmFiles']['search']);
    $possible = 0;
    $Matches = array();
    foreach($Terms as $Term) {
        $Files = $this->KmFile->find('list',
            array(
                'conditions' => array(
                    'file_name LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('id')
            )
        );
        $possible++;
        $Clients = $this->KmClient->find('list',
            array(
                'conditions' => array(
                    'clients LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Disciplines = $this->KmDiscipline->find('list',
            array(
                'conditions' => array(
                    'disciplines LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Projects = $this->KmProject->find('list',
            array(
                'conditions' => array(
                    'projects LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Sectors = $this->KmSector->find('list',
            array(
                'conditions' => array(
                    'sectors LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Comments = $this->KmComment->find('list',
            array(
                'conditions' => array(
                    'comments LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Matches = array_merge($Matches,$Files,$Clients,$Disciplines,$Projects,$Sectors,$Comments);
    }
    if(count($Matches) > 0) {
        $NumberOfMatches = array_count_values($Matches);
        $Matches = array_unique($Matches);
        $k=0;
        foreach($Matches as $Match) {
            $Result = $this->KmFile->find('all',
                array(
                    'conditions' => array(
                        'id' => $Match
                    )
                )
            );
            $Results[$k] = $Result[0];
            $Results[$k]['Relevance'] = round(($NumberOfMatches[$Match] / $possible) * 100,2);
            $relevance[] = $Results[$k]['Relevance'];
            $k++;
        }
        array_multisort($relevance,SORT_DESC,$Results);
        $Stats['Count'] = count($Results);
        $Stats['Terms'] = $this->request->data['KmFiles']['search'];
        $this->set(compact('Results','Stats'));
    } else {
        $Stats['Count'] = 0;
        $Stats['Terms'] = $this->request->data['KmFiles']['search'];
        $this->set(compact('Stats'));
    }
}

I know it's a long piece of code, but I'm fairly new to CakePHP so have no idea what to do to improve it.

Any assistance will be appreciated.

To make it faster, you have to push as much of the work as possible onto the database (databases are REALLY fast these days!) and minimise the round trips between PHP and the database. Right now your code runs six finds per search term, plus one more find for every matching file. Ideally, you'd fetch all search results in a single query (i.e. a single find call).

You'd specify joins, such that your KmFile model is left-joined to your KmClient, KmProject, etc. tables.
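As a rough sketch of what those joins could look like, here is a small helper that builds the array format CakePHP 2.x accepts under the 'joins' key of find(). The helper name buildFileJoins and the table names (km_clients and so on) are assumptions based on Cake's naming conventions; only the km_file_id foreign key comes from your code, so adjust the rest to your schema.

```php
<?php
// Hypothetical helper: builds LEFT JOIN definitions for the 'joins' key of find().
// $related maps a model alias to its table name (table names are assumed here).
function buildFileJoins(array $related)
{
    $joins = array();
    foreach ($related as $alias => $table) {
        $joins[] = array(
            'table' => $table,
            'alias' => $alias,
            'type' => 'LEFT',
            // Every related table points back at the file via km_file_id.
            'conditions' => array($alias . '.km_file_id = KmFile.id'),
        );
    }
    return $joins;
}

$joins = buildFileJoins(array(
    'KmClient' => 'km_clients',
    'KmDiscipline' => 'km_disciplines',
    'KmProject' => 'km_projects',
    'KmSector' => 'km_sectors',
    'KmComment' => 'km_comments',
));

// In the controller this would be used roughly like:
// $rows = $this->KmFile->find('all', array('joins' => $joins, 'conditions' => $conditions));
```

Because these are manual joins, a file with several matching clients or comments will come back as several rows, and you may need to list the joined columns in 'fields' if you want them in the result.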

Then, it's just a matter of building up a long conditions array. In Cake, you can specify 'OR' conditions like this:

array('OR' => array(
    array('Post.title LIKE' => '%one%'),
    array('Post.title LIKE' => '%two%')
))

Check out the documentation on complex find conditions. Your conditions array would look something like:

array('OR' => array(
    array('KmFile.file_name LIKE' => '%term1%'),
    array('KmFile.file_name LIKE' => '%term2%'),
    array('KmDiscipline.disciplines LIKE' => '%term1%'),
    array('KmDiscipline.disciplines LIKE' => '%term2%'),
    array('KmProject.projects LIKE' => '%term1%'),
    array('KmProject.projects LIKE' => '%term2%'),
    // and so on...
))

Obviously you'd want to use a loop to build your conditions array.
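A minimal sketch of such a loop is below. The function name buildSearchConditions is made up for illustration; the field names are taken from the finds in your question. It also skips empty terms, so a double space in the search box doesn't turn into a '%%' condition that matches everything.

```php
<?php
// Hypothetical helper: builds one 'OR' block with a LIKE condition
// for every combination of searchable field and search term.
function buildSearchConditions($search, array $fields)
{
    // Split on any whitespace and drop empty terms.
    $terms = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    $or = array();
    foreach ($fields as $field) {
        foreach ($terms as $term) {
            $or[] = array($field . ' LIKE' => '%' . $term . '%');
        }
    }
    // No terms means no conditions at all.
    return $or ? array('OR' => $or) : array();
}

$fields = array(
    'KmFile.file_name',
    'KmClient.clients',
    'KmDiscipline.disciplines',
    'KmProject.projects',
    'KmSector.sectors',
    'KmComment.comments',
);
$conditions = buildSearchConditions('term1 term2', $fields);
```

With two terms and six fields this gives twelve LIKE conditions, all sent to the database in one query.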

Then do a single find, on your KmFile model, joined to all your related models, with your big list of conditions. That will return a list of matches, and shouldn't take too long.

It's probably possible to calculate some kind of relevancy score in the same single query, though I don't know how. And in any case, once you get your find results back in a single query, it shouldn't take too long to loop through them in PHP code and calculate relevancy for each one.
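For the PHP side, something along these lines might do. It is only a sketch: scoreRows is a hypothetical helper, it assumes the rows have the usual find('all') shape (one sub-array per model alias), and it uses a case-insensitive substring check as a stand-in for LIKE. Unlike your original code, it counts each term/field pair at most once per file, so the score can't go above 100%.

```php
<?php
// Hypothetical helper: scores each file by the share of (term, field)
// pairs that matched in any of its joined rows.
function scoreRows(array $rows, array $terms, array $fields)
{
    $hits = array(); // file id => set of "term|field" keys that matched
    foreach ($rows as $row) {
        $id = $row['KmFile']['id'];
        if (!isset($hits[$id])) {
            $hits[$id] = array();
        }
        foreach ($terms as $term) {
            foreach ($fields as $field) {
                list($model, $column) = explode('.', $field);
                // LEFT JOINs can leave columns missing or NULL.
                $value = isset($row[$model][$column]) ? $row[$model][$column] : '';
                if ($term !== '' && stripos((string) $value, $term) !== false) {
                    $hits[$id][$term . '|' . $field] = true;
                }
            }
        }
    }

    $possible = count($terms) * count($fields);
    $scores = array();
    foreach ($hits as $id => $matched) {
        $scores[$id] = $possible > 0 ? round(count($matched) / $possible * 100, 2) : 0.0;
    }
    arsort($scores); // highest relevance first, file ids kept as keys
    return $scores;
}

// Made-up sample rows: file 1 appears twice because it has two clients.
$rows = array(
    array('KmFile' => array('id' => 1, 'file_name' => 'bridge report'), 'KmClient' => array('clients' => 'Acme')),
    array('KmFile' => array('id' => 1, 'file_name' => 'bridge report'), 'KmClient' => array('clients' => 'Bridge Co')),
    array('KmFile' => array('id' => 2, 'file_name' => 'road plan'), 'KmClient' => array('clients' => 'Acme')),
);
$scores = scoreRows($rows, array('bridge', 'acme'), array('KmFile.file_name', 'KmClient.clients'));
```

The returned array maps file ids to a percentage, already sorted, so you can loop over it to build your Results list without any further queries.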
