MySQL query taking too long

I am new to advanced queries, so I likely have something conceptually wrong: when the database has over 1 million records, I get this response from my query...

ERROR 2013: Lost connection to MySQL server during query

Yes! It actually takes so long that it pukes before it finishes.

My query is this...

SELECT users.username,
    table_1.field_abc, table_1.field_def,
    table_2.field_ghi, table_2.field_jkl
FROM users
LEFT JOIN table_1 ON table_1.username = users.username
LEFT JOIN table_2 ON table_2.username = users.username
WHERE
    table_1.field_abc REGEXP "(spork|yellow)" OR
    table_1.field_def REGEXP "(spork|yellow)" OR
    table_2.field_ghi REGEXP "(spork|yellow)" OR
    table_2.field_jkl REGEXP "(spork|yellow)"
GROUP BY users.username
ORDER BY
(
    ( CASE WHEN table_1.field_abc LIKE "%spork%" THEN 1 ELSE 0 END ) +
    ( CASE WHEN table_1.field_abc LIKE "%yellow%" THEN 1 ELSE 0 END ) +
    ( CASE WHEN table_1.field_def LIKE "%spork%" THEN 1 ELSE 0 END ) +
    ( CASE WHEN table_1.field_def LIKE "%yellow%" THEN 1 ELSE 0 END ) +
    ( CASE WHEN table_2.field_ghi LIKE "%spork%" THEN 1 ELSE 0 END ) +
    ( CASE WHEN table_2.field_ghi LIKE "%yellow%" THEN 1 ELSE 0 END ) +
    ( CASE WHEN table_2.field_jkl LIKE "%spork%" THEN 1 ELSE 0 END ) +
    ( CASE WHEN table_2.field_jkl LIKE "%yellow%" THEN 1 ELSE 0 END )
)DESC;

I posted a sample dataset (with only a few records) at http://sqlfiddle.com/#!2/cbbda/28

The sample at sqlfiddle runs quickly because there are only a few records. When I duplicated records on my own server, the query was still quick with a few records but extremely slow once I had added a million.

Is there any possible way to get my results quickly?

Well folks... With your help we have a solution... See... http://sqlfiddle.com/#!2/fcfbd/5 BUT I DO STILL HAVE A QUESTION...

I altered the tables to add the indexes...

ALTER TABLE  `users` ADD FULLTEXT ( `username` );
ALTER TABLE  `table_1` ADD FULLTEXT ( `field_abc`,`field_def` );
ALTER TABLE  `table_2` ADD FULLTEXT ( `field_ghi`,`field_jkl` );

I then took the advice of @Barmar and changed the code to this...

SELECT users.username,
    table_1.field_abc, table_1.field_def,
    table_2.field_ghi, table_2.field_jkl
FROM users
LEFT JOIN table_1 ON table_1.username = users.username
LEFT JOIN table_2 ON table_2.username = users.username
WHERE
    MATCH(table_1.field_abc,table_1.field_def,table_2.field_ghi,table_2.field_jkl)
    AGAINST ("spork yellow" IN BOOLEAN MODE)
GROUP BY users.username
ORDER BY
(
    ( CASE WHEN MATCH(table_1.field_abc) AGAINST ("spork" IN BOOLEAN MODE) THEN 1 ELSE 0 END ) +
    ( CASE WHEN MATCH(table_1.field_abc) AGAINST ("yellow" IN BOOLEAN MODE) THEN 1 ELSE 0 END ) +

    ( CASE WHEN MATCH(table_1.field_def) AGAINST ("spork" IN BOOLEAN MODE) THEN 1 ELSE 0 END ) +
    ( CASE WHEN MATCH(table_1.field_def) AGAINST ("yellow" IN BOOLEAN MODE) THEN 1 ELSE 0 END ) +

    ( CASE WHEN MATCH(table_2.field_ghi) AGAINST ("spork" IN BOOLEAN MODE) THEN 1 ELSE 0 END ) +
    ( CASE WHEN MATCH(table_2.field_ghi) AGAINST ("yellow" IN BOOLEAN MODE) THEN 1 ELSE 0 END ) +

    ( CASE WHEN MATCH(table_2.field_jkl) AGAINST ("spork" IN BOOLEAN MODE) THEN 1 ELSE 0 END ) +
    ( CASE WHEN MATCH(table_2.field_jkl) AGAINST ("yellow" IN BOOLEAN MODE) THEN 1 ELSE 0 END )
)DESC;

With over 1,000,000 records in my real database, I got my result in 6.5027 seconds. That is A LOT better than... well, taking so long that it puked!

My only question now is... Why does it only work with IN BOOLEAN MODE and not the other 2 options mentioned at http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html#function_match or http://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html ?
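One possible explanation, assuming the tables use the MyISAM engine (the post does not say): the MySQL manual requires the column list in MATCH() to exactly match a FULLTEXT index definition unless the search is IN BOOLEAN MODE on a MyISAM table. The query above passes columns from two tables, and single columns, neither of which matches the indexes as created. In addition, MyISAM natural-language searches ignore words present in at least 50% of the rows, which duplicated test data can easily trigger. A minimal sketch of a natural-language search whose column list does match the index on table_1:

```sql
-- Assumes the FULLTEXT index (field_abc, field_def) created above exists on table_1.
-- With no modifier, AGAINST() runs a natural-language search and MATCH() returns
-- a relevance score.
SELECT username,
       MATCH(field_abc, field_def) AGAINST ('spork yellow') AS score
FROM table_1
WHERE MATCH(field_abc, field_def) AGAINST ('spork yellow')
ORDER BY score DESC;
```

If this returns nothing on a test table built from duplicated rows, the 50% threshold is a plausible cause rather than a problem with the index.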

I don't think so. With this table as-is, I doubt you'll make that run fast with all of those LIKEs in it. Those have to run a ridiculous number of times.

If those values are fixed, then you can add new columns to the table called abc_like_yellow, abc_like_spork, etc., populate those values one time, and then easily query off of those columns.

But if you're trying to do this dynamically, you might be out of luck.
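A minimal sketch of the precomputed-column idea for one field, assuming the search terms really are fixed. The column names follow the answer's suggestion; the TINYINT type and the IFNULL handling of NULL fields are assumptions:

```sql
-- Add one flag column per (field, term) pair; only field_abc is shown here.
ALTER TABLE table_1
    ADD COLUMN abc_like_spork  TINYINT(1) NOT NULL DEFAULT 0,
    ADD COLUMN abc_like_yellow TINYINT(1) NOT NULL DEFAULT 0;

-- Populate the flags once. LIKE yields 1, 0, or NULL (when field_abc is NULL),
-- so IFNULL maps NULL to 0.
UPDATE table_1
SET abc_like_spork  = IFNULL(field_abc LIKE '%spork%', 0),
    abc_like_yellow = IFNULL(field_abc LIKE '%yellow%', 0);

-- Later queries read the flags instead of re-running LIKE on every row.
SELECT username,
       (abc_like_spork + abc_like_yellow) AS score
FROM table_1
WHERE abc_like_spork = 1 OR abc_like_yellow = 1
ORDER BY score DESC;
```

The flags go stale when field_abc changes, so they would need to be refreshed on insert and update.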

Since we're joining on username, an index on this column will likely speed things up.
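A sketch of adding those indexes, assuming username is a VARCHAR column in each table and is not already indexed (for example as a primary key). The index names are hypothetical:

```sql
-- Ordinary (B-tree) indexes on the join column of each joined table.
CREATE INDEX idx_table_1_username ON table_1 (username);
CREATE INDEX idx_table_2_username ON table_2 (username);

-- Confirm which indexes now exist on a table.
SHOW INDEX FROM table_1;
```

These help the joins locate matching rows. They do not speed up the REGEXP or LIKE '%...%' conditions, which still have to examine every candidate row.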

Also, are you able to use an inner join as opposed to a left join? This can also speed up the query considerably.

And finally, if necessary, the ordering can be done in memory as opposed to asking the database to do it (i.e., order the result set after it is returned).

I was using my first solution but found that it gave some false positives that I could not figure out, so I came up with this...

(SELECT username, MATCH(field_abc,field_def) AGAINST ("spork yellow" IN BOOLEAN MODE) AS score FROM table_1 HAVING score>0)
UNION ALL
(SELECT username, MATCH(field_ghi,field_jkl) AGAINST ("spork yellow" IN BOOLEAN MODE) AS score FROM table_2 HAVING score >0)

Since each record was returned separately and I can't use GROUP BY, I added this PHP code to run after my query finished:

$usernames_count = array(); // username => summed score

while($row = mysql_fetch_array($result) )
{
    $username = $row['username'];
    if( isset($usernames_count[$username]) )
    {
        // Seen before: add this row's score to the running total
        $usernames_count[$username] += $row['score'];
    }
    else
    {
        // First row for this username
        $usernames_count[$username] = $row['score'];
    }
}
arsort($usernames_count); // Sort the results high->low

foreach($usernames_count as $key=>$value)
{
    echo "Username: ".$key." had a score of ".$value." in the search results<br/>";
}

It now seems so simple compared to the other attempts I made.
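GROUP BY cannot be attached to the UNION directly, but the same per-username totals can probably be computed in SQL by wrapping the UNION in a derived table. A sketch under that assumption (the alias names matches and total_score are hypothetical):

```sql
-- Sum the per-row scores from both tables for each username.
SELECT username, SUM(score) AS total_score
FROM (
    SELECT username,
           MATCH(field_abc, field_def) AGAINST ('spork yellow' IN BOOLEAN MODE) AS score
    FROM table_1
    WHERE MATCH(field_abc, field_def) AGAINST ('spork yellow' IN BOOLEAN MODE)
    UNION ALL
    SELECT username,
           MATCH(field_ghi, field_jkl) AGAINST ('spork yellow' IN BOOLEAN MODE) AS score
    FROM table_2
    WHERE MATCH(field_ghi, field_jkl) AGAINST ('spork yellow' IN BOOLEAN MODE)
) AS matches
GROUP BY username
ORDER BY total_score DESC;
```

This moves the summing and sorting done by the PHP loop into the database; whether it is faster would need to be measured on the real data.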

When your server has to scan through millions of entries, it simply may not be powerful enough to process the query quickly.

In general, to improve the speed of your website, you could try CloudFlare.

If you are specifically trying to speed up your SQL, Google Cloud SQL may be able to help. Google's powerful servers are designed to scan through billions of SQL entries, for example when a Google search is performed.

As long as there are no errors being returned, the above two services will help to dramatically speed up your query time.

I hope I could help!

VCNinc

If you have access to SQL Server, highlight your complete query in SQL Server and press Ctrl + L.

This will show the query execution plan. Optimize the query based on these results; for example, if you see table scans, then an index may assist. Write queries that do not use DISTINCT. Do not order results if the order is unimportant.
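Since the question is about MySQL rather than SQL Server, the closest equivalent there is the EXPLAIN statement. A minimal sketch on a simplified version of the asker's query:

```sql
-- Prefix a SELECT with EXPLAIN to see how MySQL plans to execute it.
EXPLAIN
SELECT users.username, table_1.field_abc
FROM users
LEFT JOIN table_1 ON table_1.username = users.username
WHERE table_1.field_abc LIKE '%spork%';
-- In the output, a row with type = ALL means that table is read with a full
-- table scan; the key column shows which index, if any, is used.
```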

In your sample, the complicated final ORDER BY expression is very expensive.

Rather, follow these steps:

1. Pull the core information into a temporary table with 9 extra columns (type int, initially set to 0).
2. After populating the core data, update each of the first 8 columns based on the 0 or 1 criteria.
3. Update the last column as the sum of the other 8 columns.
4. Retrieve the info from the table, with only a single ORDER BY based on column 9.

In my experience, this approach takes only about 20% of the time needed when the ordering is computed inside the original query.
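The temporary-table steps could look roughly like this in MySQL. The table name tmp_scores and its column names are hypothetical, and the original query's GROUP BY is omitted for brevity:

```sql
-- Step 1: core data plus 9 integer columns that start at 0.
CREATE TEMPORARY TABLE tmp_scores (
    abc_spork INT NOT NULL DEFAULT 0, abc_yellow INT NOT NULL DEFAULT 0,
    def_spork INT NOT NULL DEFAULT 0, def_yellow INT NOT NULL DEFAULT 0,
    ghi_spork INT NOT NULL DEFAULT 0, ghi_yellow INT NOT NULL DEFAULT 0,
    jkl_spork INT NOT NULL DEFAULT 0, jkl_yellow INT NOT NULL DEFAULT 0,
    total     INT NOT NULL DEFAULT 0
)
SELECT users.username,
       table_1.field_abc, table_1.field_def,
       table_2.field_ghi, table_2.field_jkl
FROM users
LEFT JOIN table_1 ON table_1.username = users.username
LEFT JOIN table_2 ON table_2.username = users.username
WHERE table_1.field_abc REGEXP '(spork|yellow)'
   OR table_1.field_def REGEXP '(spork|yellow)'
   OR table_2.field_ghi REGEXP '(spork|yellow)'
   OR table_2.field_jkl REGEXP '(spork|yellow)';

-- Step 2: set each flag to 0 or 1 (IFNULL covers NULL fields from the LEFT JOINs).
UPDATE tmp_scores
SET abc_spork  = IFNULL(field_abc LIKE '%spork%', 0),
    abc_yellow = IFNULL(field_abc LIKE '%yellow%', 0),
    def_spork  = IFNULL(field_def LIKE '%spork%', 0),
    def_yellow = IFNULL(field_def LIKE '%yellow%', 0),
    ghi_spork  = IFNULL(field_ghi LIKE '%spork%', 0),
    ghi_yellow = IFNULL(field_ghi LIKE '%yellow%', 0),
    jkl_spork  = IFNULL(field_jkl LIKE '%spork%', 0),
    jkl_yellow = IFNULL(field_jkl LIKE '%yellow%', 0);

-- Step 3: the ninth column is the sum of the eight flags.
UPDATE tmp_scores
SET total = abc_spork + abc_yellow + def_spork + def_yellow
          + ghi_spork + ghi_yellow + jkl_spork + jkl_yellow;

-- Step 4: a single, simple ORDER BY.
SELECT username, field_abc, field_def, field_ghi, field_jkl
FROM tmp_scores
ORDER BY total DESC;

DROP TEMPORARY TABLE tmp_scores;
```

The populating SELECT still scans every row with REGEXP, so this restructures the sorting cost rather than the filtering cost.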
