Very slow query with MySQL (“Sending Data”)

I'm currently developing a PHP/MySQL application using the CodeIgniter framework.

I've got a fairly lengthy query that's causing a few problems. The problem occurs when altering the date range to a longer period, say 30 days, as opposed to the default of 7 days. The query time increases massively, from 1/2 seconds to 90 seconds, but I can only presume this is because of the increase in the amount of data.

Before I paste the query, the following is a quick explanation of the tables:

  • flagged_cases: list of unique cases (main table) - 352 rows
  • data_sources: list of data sources; each case references this table using a foreign key - 20 rows
  • matches: rows of text matches for a case (one-to-many relationship, i.e. one case, many matches) - 22000 rows
  • flagged_cases_keywords_hits: mapping of case ids to keywords (and number of hits) - 2500 rows
  • keywords: list of keywords - 121 rows
  • reviewed_state: id/description for 3 states; only ever checking reviewed_state = 1 for this query - 3 rows

The following is the query. I realise it's pretty sizeable, but I think there must be an underlying issue with indexes that unfortunately I just don't have the knowledge to fully troubleshoot, so any help is appreciated.

SELECT    flagged_cases.id, 
          data_source_id, 
          title, 
          fetch_date, 
          publish_date, 
          case_id, 
          case_title, 
          case_link, 
          relevance_score, 
          ( 
                   SELECT   group_concat(match_string_highlighted ORDER BY matches.id SEPARATOR "")
                   FROM     matches 
                   WHERE    flagged_case_id=flagged_cases.id) AS all_matches, 
          reviewed_state_id, 
          ( 
                   SELECT   group_concat(concat(k.keyword, " ", "x", cast(kh.hits AS CHAR), "") SEPARATOR "")
                   FROM     flagged_cases_keywords_hits kh 
                   JOIN     keywords k 
                   ON       kh.keyword_id = k.id 
                   WHERE    kh.flagged_case_id = flagged_cases.id 
                   ORDER BY k.weighting DESC) AS hitcount 
FROM      flagged_cases 
JOIN      data_sources 
ON        flagged_cases.data_source_id = data_sources.id 
JOIN      reviewed_state 
ON        flagged_cases.reviewed_state_id = reviewed_state.id 
LEFT JOIN matches 
ON        flagged_cases.id = matches.flagged_case_id 
WHERE     reviewed_state_id = 1 
AND       data_source_id IN('1', 
                            '3', 
                            '4', 
                            '5', 
                            '6', 
                            '7', 
                            '8', 
                            '9', 
                            '10', 
                            '11', 
                            '12', 
                            '13', 
                            '14', 
                            '15', 
                            '16', 
                            '17', 
                            '18', 
                            '19', 
                            '20') 
AND       fetch_date >= '2015-05-10 00:00:00' 
AND       fetch_date <= '2015-05-17 23:59:59' 
GROUP BY  flagged_cases.id 
ORDER BY  title DESC 
LIMIT     10;

As a result of doing SHOW FULL PROCESSLIST I can see the query stays in the "Sending data" state, which from some research I understand is basically MySQL fetching and selecting data, so I can only presume there must be a missing index or something else causing this to slow down.

I've also obtained the EXPLAIN of the query, which is as follows:

+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
| id | select_type        | table          | type   | possible_keys                    | key             | key_len | ref                        | rows | Extra                                        |
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
|  1 | PRIMARY            | reviewed_state | const  | PRIMARY                          | PRIMARY         | 4       | const                      |    1 | Using index; Using temporary; Using filesort |
|  1 | PRIMARY            | data_sources   | range  | PRIMARY                          | PRIMARY         | 4       | NULL                       |   19 | Using where                                  |
|  1 | PRIMARY            | flagged_cases  | ref    | data_source_id,reviewed_state_id | data_source_id  | 4       | proactive.data_sources.id  |   14 | Using where                                  |
|  1 | PRIMARY            | matches        | ref    | flagged_case_id                  | flagged_case_id | 4       | proactive.flagged_cases.id |   32 | Using index                                  |
|  3 | DEPENDENT SUBQUERY | kh             | ref    | flagged_case_id,keyword_id       | flagged_case_id | 5       | func                       |    3 | Using where; Using temporary                 |
|  3 | DEPENDENT SUBQUERY | k              | eq_ref | PRIMARY                          | PRIMARY         | 4       | proactive.kh.keyword_id    |    1 | Using where                                  |
|  2 | DEPENDENT SUBQUERY | matches        | ref    | flagged_case_id                  | flagged_case_id | 4       | func                       |   32 |                                              |
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
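
For what it's worth, the EXPLAIN shows flagged_cases being reached only through the single-column data_source_id index, with the reviewed_state_id and fetch_date filters applied afterwards ("Using where"). A composite index whose column order matches the WHERE clause (equality column first, IN list second, range column last) might be worth trying; the following is only a rough sketch based on the columns appearing in the query above, and the index name is made up:

-- hypothetical composite index covering the WHERE clause on flagged_cases;
-- verify the column types and adjust the name before using
ALTER TABLE flagged_cases
    ADD INDEX idx_state_source_fetchdate (reviewed_state_id, data_source_id, fetch_date);

After adding it, re-running the EXPLAIN should show whether flagged_cases switches to the new key and whether its rows estimate drops.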

Any help / advice / hints massively appreciated! :)

Can you try this to see if it results in any benefit?

The subqueries in your select list are replaced with inline views that are grouped by the values that get joined to your other tables.

select      flagged_cases.id, 
            data_source_id, 
            title, 
            fetch_date, 
            publish_date, 
            case_id, 
            case_title, 
            case_link, 
            relevance_score, 
            v1.all_matches, 
            reviewed_state_id, 
            v2.hitcount
from        flagged_cases 
       join data_sources 
         on flagged_cases.data_source_id = data_sources.id 
       join reviewed_state 
         on flagged_cases.reviewed_state_id = reviewed_state.id
       join (
                select      flagged_case_id,
                            group_concat(match_string_highlighted order by matches.id separator "") as all_matches
                from        matches
                group by    flagged_case_id
            ) v1
         on v1.flagged_case_id = flagged_cases.id
       join (
                select      kh.flagged_case_id,
                            group_concat(concat(k.keyword, " ", "x", cast(kh.hits as char), "") order by k.weighting desc separator "") as hitcount
                from        flagged_cases_keywords_hits kh 
                       join keywords k 
                         on kh.keyword_id = k.id 
                group by    kh.flagged_case_id
            ) v2
         on v2.flagged_case_id = flagged_cases.id 
  left join matches 
         on flagged_cases.id = matches.flagged_case_id 
where       reviewed_state_id = 1 
        and data_source_id in('1','3','4','5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20') 
        and fetch_date >= '2015-05-10 00:00:00' 
        and fetch_date <= '2015-05-17 23:59:59' 
group by    flagged_cases.id 
order by    title desc
limit       10;
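
One caveat with the rewrite above: plain joins against v1 and v2 will drop any flagged_cases row that has no matches or no keyword hits, whereas the original correlated subqueries simply returned NULL for those rows. If those rows need to stay in the result, the derived tables can be left-joined instead; only the join keyword changes, for example for v1 (and likewise for v2):

  left join (
                select      flagged_case_id,
                            group_concat(match_string_highlighted order by matches.id separator "") as all_matches
                from        matches
                group by    flagged_case_id
            ) v1
         on v1.flagged_case_id = flagged_cases.id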
