Search code optimisation - CakePHP

I have a database containing a list of files (around 6000 files). Each file has additional details linked to it (such as project numbers, sectors, clients, comments, disciplines).

Although the code and the search work, they are very slow. A simple search with two terms takes about a minute to complete.

My code is below. What I want to know is: what can I do to simplify and optimize my search function?

public function search() {
    $Terms = explode(' ',$this->request->data['KmFiles']['search']);
    $possible = 0;
    $Matches = array();
    foreach($Terms as $Term) {
        $Files = $this->KmFile->find('list',
            array(
                'conditions' => array(
                    'file_name LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('id')
            )
        );
        $possible++;
        $Clients = $this->KmClient->find('list',
            array(
                'conditions' => array(
                    'clients LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Disciplines = $this->KmDiscipline->find('list',
            array(
                'conditions' => array(
                    'disciplines LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Projects = $this->KmProject->find('list',
            array(
                'conditions' => array(
                    'projects LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Sectors = $this->KmSector->find('list',
            array(
                'conditions' => array(
                    'sectors LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Comments = $this->KmComment->find('list',
            array(
                'conditions' => array(
                    'comments LIKE' => '%' . $Term . '%'
                ),
                'fields' => array('km_file_id')
            )
        );
        $possible++;
        $Matches = array_merge($Matches,$Files,$Clients,$Disciplines,$Projects,$Sectors,$Comments);
    }
    if(count($Matches) > 0) {
        $NumberOfMatches = array_count_values($Matches);
        $Matches = array_unique($Matches);
        $k=0;
        foreach($Matches as $Match) {
            $Result = $this->KmFile->find('all',
                array(
                    'conditions' => array(
                        'id' => $Match
                    )
                )
            );
            $Results[$k] = $Result[0];
            $Results[$k]['Relevance'] = round(($NumberOfMatches[$Match] / $possible) * 100,2);
            $relevance[] = $Results[$k]['Relevance'];
            $k++;
        }
        array_multisort($relevance,SORT_DESC,$Results);
        $Stats['Count'] = count($Results);
        $Stats['Terms'] = $this->request->data['KmFiles']['search'];
        $this->set(compact('Results','Stats'));
    } else {
        $Stats['Count'] = 0;
        $Stats['Terms'] = $this->request->data['KmFiles']['search'];
        $this->set(compact('Stats'));
    }
}

I know it's a long piece of code, but I'm fairly new to CakePHP, so I have no idea what to do to improve it.

Any assistance will be appreciated.

To make it faster, you have to push as much work as possible to the database (databases are REALLY fast these days!) and minimise the back and forth between PHP and the database. As written, your code runs six queries per search term and then one more query for every matched file. Ideally, you'd fetch all search results in a single query (i.e., a single find call).

You'd specify joins, such that your KmFile model is left-joined to your KmClient, KmProject, etc. tables.
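As a rough sketch of what that could look like, the helper below builds a `joins` array in the shape CakePHP 2.x accepts as a find option. The function name `buildSearchJoins` is hypothetical, the table names are guessed from Cake's naming conventions, and the `km_file_id` foreign key is taken from the fields used in the question, so adjust all of them to your actual schema.

```php
<?php
// Hypothetical helper: builds the 'joins' option for a CakePHP 2.x find() call.
// Assumes every related table has a km_file_id column pointing at km_files.id.
function buildSearchJoins(array $models) {
    $joins = array();
    foreach ($models as $alias => $table) {
        $joins[] = array(
            'table' => $table,      // database table name (assumed)
            'alias' => $alias,      // alias used in the conditions array
            'type' => 'LEFT',       // keep files that have no related rows
            'conditions' => array($alias . '.km_file_id = KmFile.id'),
        );
    }
    return $joins;
}

// Table names below are assumptions based on Cake conventions.
$joins = buildSearchJoins(array(
    'KmClient' => 'km_clients',
    'KmDiscipline' => 'km_disciplines',
    'KmProject' => 'km_projects',
    'KmSector' => 'km_sectors',
    'KmComment' => 'km_comments',
));
```

The resulting array would be passed as the `'joins'` key of the find options. Because a file can have several related rows, the joined result may contain the same file more than once, so grouping by `KmFile.id` is probably needed as well.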

Then, it's just a matter of building up a long conditions array. In Cake, you can specify 'OR' conditions like this:

array('OR' => array(
    array('Post.title LIKE' => '%one%'),
    array('Post.title LIKE' => '%two%')
))

Check out the documentation on complex find conditions. Your conditions array would look something like:

array('OR' => array(
    array('KmFile.file_name LIKE' => '%term1%'),
    array('KmFile.file_name LIKE' => '%term2%'),
    array('KmDiscipline.disciplines LIKE' => '%term1%'),
    array('KmDiscipline.disciplines LIKE' => '%term2%'),
    array('KmProject.projects LIKE' => '%term1%'),
    array('KmProject.projects LIKE' => '%term2%'),
    // and so on...
))

Obviously you'd want to use a loop to build your conditions array.
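A minimal sketch of such a loop is shown here. The function name `buildSearchConditions` is made up for illustration, and the field list simply mirrors the columns searched in the question. It only builds the array, so it runs without CakePHP.

```php
<?php
// Hypothetical helper: one LIKE condition per (field, term) pair, combined with OR.
function buildSearchConditions($search, array $fields) {
    // Split on any whitespace and drop empty pieces (e.g. double spaces).
    $terms = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    $or = array();
    foreach ($fields as $field) {
        foreach ($terms as $term) {
            $or[] = array($field . ' LIKE' => '%' . $term . '%');
        }
    }
    // No terms means no conditions at all.
    return empty($or) ? array() : array('OR' => $or);
}

$conditions = buildSearchConditions('bridge report', array(
    'KmFile.file_name',
    'KmClient.clients',
    'KmDiscipline.disciplines',
    'KmProject.projects',
    'KmSector.sectors',
    'KmComment.comments',
));
```

With two terms and six fields this yields twelve OR-ed LIKE conditions, which can be passed as the `'conditions'` key of the find options. Note that `%` and `_` typed by the user are still treated as wildcards by LIKE.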

Then do a single find on your KmFile model, joined to all your related models, with your big list of conditions. That will return a list of matches, and shouldn't take too long.

It's probably possible to calculate some kind of relevancy score in the same single query, though I don't know how. In any case, once you get your find results back from a single query, it shouldn't take too long to loop through them in PHP code and calculate relevancy for each one.
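One possible way to do that PHP-side scoring is sketched below. It is a simplified stand-in for the question's relevance formula (matches divided by terms times fields), not an exact reproduction of it. The names `scoreRow` and `rankRows` are hypothetical, and it assumes each row is shaped like a Cake `find('all')` result that includes the searched columns, which means they would have to be requested via the `'fields'` option.

```php
<?php
// Hypothetical helper: percentage of (field, term) pairs that match in one row.
function scoreRow(array $row, array $terms, array $fields) {
    $possible = count($terms) * count($fields);
    if ($possible === 0) {
        return 0.0;
    }
    $hits = 0;
    foreach ($fields as $field) {
        list($model, $column) = explode('.', $field, 2);
        $value = isset($row[$model][$column]) ? (string) $row[$model][$column] : '';
        foreach ($terms as $term) {
            // Case-insensitive substring test, roughly what LIKE '%term%' does
            // under a case-insensitive collation (an assumption).
            if ($term !== '' && stripos($value, $term) !== false) {
                $hits++;
            }
        }
    }
    return round(($hits / $possible) * 100, 2);
}

// Adds a 'Relevance' key to every row and sorts with the best match first.
function rankRows(array $rows, array $terms, array $fields) {
    foreach ($rows as $i => $row) {
        $rows[$i]['Relevance'] = scoreRow($row, $terms, $fields);
    }
    usort($rows, function ($a, $b) {
        if ($a['Relevance'] == $b['Relevance']) {
            return 0;
        }
        return ($a['Relevance'] < $b['Relevance']) ? 1 : -1;
    });
    return $rows;
}

// Made-up sample rows for illustration only.
$rows = array(
    array('KmFile' => array('id' => 1, 'file_name' => 'notes.txt'),
          'KmClient' => array('clients' => 'Acme Bridge Co')),
    array('KmFile' => array('id' => 2, 'file_name' => 'Bridge report.pdf'),
          'KmClient' => array('clients' => 'Acme Bridge Co')),
);
$ranked = rankRows($rows, array('bridge', 'report'),
    array('KmFile.file_name', 'KmClient.clients'));
```

Here file 2 matches three of the four possible (field, term) pairs and file 1 matches one, so file 2 sorts first.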
