Multiple term query

I have a search engine which scans all the words in a given web page and records their occurrences. The pages are then ranked by the number of times the word appears in the document. But it doesn't return results for multiple term queries.

Below is my SQL query. I would like it to check all the words entered and then rank by the number of times those words appear in the document. It only works for single term queries at the moment.

         $result = mysql_query(" SELECT p.page_url AS url,
                       COUNT(*) AS occurrences 
                       FROM page p, word w, occurrence o
                       WHERE p.page_id = o.page_id AND
                       w.word_id = o.word_id AND
                       w.word_word = \"$keyword\"
                       GROUP BY p.page_id
                       ORDER BY occurrences DESC
                       LIMIT $results" );

If you want to get all the words, then this condition in your WHERE clause will not allow you to do so:

w.word_word = \"$keyword\"

Your query can be written as follows:

// Same joins as the original query, without the keyword filter.
// The table name is "occurrence", as in the question.
$sql = "SELECT p.page_url AS url, COUNT(*) AS occurrences "
     . "FROM page p "
     . "INNER JOIN occurrence o ON p.page_id = o.page_id "
     . "INNER JOIN word w ON w.word_id = o.word_id "
     . "GROUP BY p.page_id "
     . "ORDER BY occurrences DESC "
     . "LIMIT {$results}";
$result = mysql_query($sql);

This will grab all the words in the word table, giving you the results that (as I understand it) you need.

If you are interested in only a few words, you can use the IN operator (as suggested by Dev in the comments) and your query becomes:

$my_keywords = array('apple', 'banana');
// This produces: "apple","banana" and assumes that all of your
// keywords are in lower case. If not, you can transform them to lower
// case or, if you don't want that, remove the LOWER() function below
// from the WHERE
$keywords    = '"' . implode('","', $my_keywords) . '"';
$sql = "SELECT p.page_url AS url, COUNT(*) AS occurrences "
     . "FROM page p "
     . "INNER JOIN occurrence o ON p.page_id = o.page_id "
     . "INNER JOIN word w ON w.word_id = o.word_id "
     . "WHERE LOWER(w.word_word) IN ({$keywords}) "
     . "GROUP BY p.page_id "
     . "ORDER BY occurrences DESC "
     . "LIMIT {$results}";
$result = mysql_query($sql);

Finally, try using mysqli or PDO instead of the mysql extension; the old mysql_* functions were deprecated in PHP 5.5 and removed in PHP 7.
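
As a rough sketch of the PDO route, the IN query above could be built with one placeholder per keyword instead of interpolating the values into the SQL string. The function name `buildKeywordQuery` is an assumption for illustration and not part of the original code; the table and column names are taken from the question.

```php
<?php
// Hypothetical helper: builds the multi-keyword query with one "?" placeholder
// per keyword, so the values are bound rather than interpolated.
function buildKeywordQuery(array $keywords, int $limit): array
{
    if (count($keywords) === 0) {
        throw new InvalidArgumentException('At least one keyword is required.');
    }

    // Lower-case the keywords to match LOWER(w.word_word) in the WHERE clause.
    $params = array_map('strtolower', array_values($keywords));
    $placeholders = implode(', ', array_fill(0, count($params), '?'));

    $sql = "SELECT p.page_url AS url, COUNT(*) AS occurrences "
         . "FROM page p "
         . "INNER JOIN occurrence o ON p.page_id = o.page_id "
         . "INNER JOIN word w ON w.word_id = o.word_id "
         . "WHERE LOWER(w.word_word) IN ({$placeholders}) "
         . "GROUP BY p.page_id "
         . "ORDER BY occurrences DESC "
         . "LIMIT " . max(1, $limit); // $limit is an integer, so inlining it is safe

    return array($sql, $params);
}

list($sql, $params) = buildKeywordQuery(array('Apple', 'banana'), 10);
```

With an existing PDO connection (assumed here to be in `$pdo`), this would be run as `$stmt = $pdo->prepare($sql); $stmt->execute($params);` and the ranked rows read with `$stmt->fetchAll(PDO::FETCH_ASSOC)`.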

HTH

I would go with MATCH ... AGAINST, which is better suited to search-engine-style searches in MySQL. Have a look at full-text searching: http://dev.mysql.com/doc/refman/5.5/en//fulltext-search.html

NOTE: the keyword column in the MySQL table should have a FULLTEXT index. This gives better search performance.

Example:

Example of input keywords:

$keywords = '+Word +Word2 +Word3';

SELECT p.page_url AS url,
       COUNT(*) AS occurrences,
       MATCH (w.word_word) AGAINST ('$keywords') AS keyword
FROM page p, occurrence o, word w
WHERE MATCH (w.word_word) AGAINST ('$keywords' IN BOOLEAN MODE)
  AND p.page_id = o.page_id
  AND w.word_id = o.word_id
GROUP BY p.page_id
ORDER BY occurrences DESC
LIMIT $results
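
The boolean-mode string shown above can be assembled from user input with a small helper. This is only a sketch: `toBooleanQuery` is a hypothetical name, and dropping every character other than letters, digits and underscores is a simplifying assumption so that user input cannot inject boolean operators.

```php
<?php
// Hypothetical helper: turns a list of words into a boolean-mode search string
// in which every word is required, e.g. "+honda +jazz".
function toBooleanQuery(array $words): string
{
    $terms = array();
    foreach ($words as $word) {
        // Assumption: keep only letters, digits and underscores, so characters
        // that act as boolean operators (+ - * " and so on) are removed.
        $clean = preg_replace('/[^\p{L}\p{N}_]+/u', '', $word);
        if (is_string($clean) && $clean !== '') {
            $terms[] = '+' . $clean;
        }
    }
    // Terms are separated by spaces, each prefixed with "+".
    return implode(' ', $terms);
}

$keywords = toBooleanQuery(array('Word', 'Word2', 'Wo-rd3*'));
```

The resulting string is what would be passed as the argument of `AGAINST (... IN BOOLEAN MODE)`, ideally as a bound parameter rather than interpolated into the SQL.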

Otherwise, queries that are not optimized (too many groups, WHERE clauses and conditionals) risk slowing down the server. As an alternative, you can use a regular expression in MySQL, for example:

REGEXP "(honda)|(jazz)|(manual)"

Regular expressions can also perform well (not recommended for a huge database):

Loop over the keywords, build one condition per keyword, and then put them into REGEXP:

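$keywords = "keyword1,keyword2,keyword3";

$expl = explode(",", $keywords);

// Build one REGEXP condition per keyword, then join them with OR.
// [[:<:]] and [[:>:]] are word-boundary markers; MySQL 8.0.4+ uses \\b instead.
$conditions = array();
foreach ($expl as $keyone)
{
    $conditions[] = "w.word_word REGEXP '[[:<:]]" . $keyone . "[[:>:]]'";
}

// Parenthesized so the ORs do not override the ANDs in the WHERE clause.
$all = '(' . implode(' OR ', $conditions) . ')';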

// Double quotes are needed so that $all and $results are interpolated.
$sql = "SELECT p.page_url AS url,
COUNT(*) AS occurrences
FROM page p, word w, occurrence o
WHERE p.page_id = o.page_id AND
w.word_id = o.word_id AND
($all)
GROUP BY p.page_id
ORDER BY occurrences DESC
LIMIT $results";

$result_query = mysql_query($sql);
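
The OR chain above can also be collapsed into a single pattern that uses alternation. A minimal sketch, assuming the keywords are reduced to ASCII letters and digits (so no regex metacharacters survive) and assuming a MySQL version that supports the `[[:<:]]` and `[[:>:]]` word-boundary markers; `buildWordPattern` is a hypothetical name.

```php
<?php
// Hypothetical helper: turns "honda,jazz,manual" into one whole-word pattern.
function buildWordPattern(string $csv): string
{
    $words = array();
    foreach (explode(',', $csv) as $word) {
        // Assumption: strip everything except ASCII letters and digits.
        $clean = preg_replace('/[^A-Za-z0-9]+/', '', $word);
        if (is_string($clean) && $clean !== '') {
            $words[] = $clean;
        }
    }
    if (count($words) === 0) {
        throw new InvalidArgumentException('No usable keywords.');
    }
    // One alternation group wrapped in word-boundary markers.
    return '[[:<:]](' . implode('|', $words) . ')[[:>:]]';
}

$pattern = buildWordPattern('honda, jazz,manual');
```

The pattern would then replace the OR chain as a single condition, `w.word_word REGEXP ?`, with `$pattern` bound as the parameter.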
