
Very slow query with MySQL (“Sending Data”)

I'm currently developing a PHP/MySQL application using the CodeIgniter framework.

I've got a fairly lengthy query that's causing a few problems. The problem occurs when I change the date range to a longer period, say 30 days, instead of the default 7 days. The query time increases massively, from 1/2 seconds to 90 seconds, and I can only presume this is because of the larger amount of data.

Before I paste out the query, the following is a quick explanation of the tables:

  • flagged_cases: list of unique cases (main table) - 352 rows
  • data_sources: list of data sources, each case references this table using a foreign key - 20 rows
  • matches: rows of text matches for a case (one-to-many relationship, i.e. one case, many matches) - 22000 rows
  • flagged_cases_keywords_hits: mapping of case ids to keywords (and number of hits) - 2500 rows
  • keywords: list of keywords - 121 rows
  • reviewed_state: id/description for 3 states, only ever checking reviewed_state = 1 for this query - 3 rows

The following is the query. I realise it's pretty sizeable, but I think there must be an underlying issue with indexes that I unfortunately don't have the knowledge to troubleshoot fully, so any help is appreciated.

SELECT    flagged_cases.id, 
          data_source_id, 
          title, 
          fetch_date, 
          publish_date, 
          case_id, 
          case_title, 
          case_link, 
          relevance_score, 
          ( 
                   SELECT   group_concat(match_string_highlighted ORDER BY matches.id SEPARATOR "")
                   FROM     matches 
                   WHERE    flagged_case_id=flagged_cases.id) AS all_matches, 
          reviewed_state_id, 
          ( 
                   SELECT   group_concat(concat(k.keyword, " ", "x", cast(kh.hits AS CHAR), "") SEPARATOR "")
                   FROM     flagged_cases_keywords_hits kh 
                   JOIN     keywords k 
                   ON       kh.keyword_id = k.id 
                   WHERE    kh.flagged_case_id = flagged_cases.id 
                   ORDER BY k.weighting DESC) AS hitcount 
FROM      flagged_cases 
JOIN      data_sources 
ON        flagged_cases.data_source_id = data_sources.id 
JOIN      reviewed_state 
ON        flagged_cases.reviewed_state_id = reviewed_state.id 
LEFT JOIN matches 
ON        flagged_cases.id = matches.flagged_case_id 
WHERE     reviewed_state_id = 1 
AND       data_source_id IN('1', 
                            '3', 
                            '4', 
                            '5', 
                            '6', 
                            '7', 
                            '8', 
                            '9', 
                            '10', 
                            '11', 
                            '12', 
                            '13', 
                            '14', 
                            '15', 
                            '16', 
                            '17', 
                            '18', 
                            '19', 
                            '20') 
AND       fetch_date >= '2015-05-10 00:00:00' 
AND       fetch_date <= '2015-05-17 23:59:59' 
GROUP BY  flagged_cases.id 
ORDER BY  title DESC 
LIMIT     10;

Running SHOW FULL PROCESSLIST shows that the query stays in the "Sending data" state. From some research, this state basically means MySQL is reading and filtering rows, so I can only presume a missing index or something similar is causing the slowdown.
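One way to test the missing-index suspicion is sketched below. It rests on assumptions: that fetch_date and reviewed_state_id are columns of flagged_cases (the query does not qualify fetch_date), and that no existing index already covers fetch_date (the EXPLAIN output in this post lists only data_source_id and reviewed_state_id as possible keys for that table). The index name idx_state_fetch is made up for illustration, and whether the index actually helps would need to be confirmed by re-running EXPLAIN.

```sql
-- List the indexes that currently exist on the main table.
SHOW INDEX FROM flagged_cases;

-- Hypothetical composite index (name and column choice are assumptions):
-- the equality column comes first, the range column second, so the
-- reviewed_state_id = 1 filter and the fetch_date range could use one index.
ALTER TABLE flagged_cases
  ADD INDEX idx_state_fetch (reviewed_state_id, fetch_date);
```

With only 352 rows in flagged_cases, this index alone may not account for a 90-second run time, so it is worth checking alongside the dependent subqueries rather than instead of them.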

I've also obtained the EXPLAIN of the query, which is as follows:

+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
| id | select_type        | table          | type   | possible_keys                    | key             | key_len | ref                        | rows | Extra                                        |
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
|  1 | PRIMARY            | reviewed_state | const  | PRIMARY                          | PRIMARY         | 4       | const                      |    1 | Using index; Using temporary; Using filesort |
|  1 | PRIMARY            | data_sources   | range  | PRIMARY                          | PRIMARY         | 4       | NULL                       |   19 | Using where                                  |
|  1 | PRIMARY            | flagged_cases  | ref    | data_source_id,reviewed_state_id | data_source_id  | 4       | proactive.data_sources.id  |   14 | Using where                                  |
|  1 | PRIMARY            | matches        | ref    | flagged_case_id                  | flagged_case_id | 4       | proactive.flagged_cases.id |   32 | Using index                                  |
|  3 | DEPENDENT SUBQUERY | kh             | ref    | flagged_case_id,keyword_id       | flagged_case_id | 5       | func                       |    3 | Using where; Using temporary                 |
|  3 | DEPENDENT SUBQUERY | k              | eq_ref | PRIMARY                          | PRIMARY         | 4       | proactive.kh.keyword_id    |    1 | Using where                                  |
|  2 | DEPENDENT SUBQUERY | matches        | ref    | flagged_case_id                  | flagged_case_id | 4       | func                       |   32 |                                              |
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+

Any help / advice / hints massively appreciated! :)

Can you try this to see if it results in any benefit?

The subqueries in your select list are replaced with inline views that are grouped by the values that get joined to your other tables.

select      flagged_cases.id, 
            data_source_id, 
            title, 
            fetch_date, 
            publish_date, 
            case_id, 
            case_title, 
            case_link, 
            relevance_score, 
            v1.all_matches, 
            reviewed_state_id, 
            v2.hitcount
from        flagged_cases 
       join data_sources 
         on flagged_cases.data_source_id = data_sources.id 
       join reviewed_state 
         on flagged_cases.reviewed_state_id = reviewed_state.id
       -- use "left join" for v1 and v2 if cases without matches or keyword hits
       -- must still be returned, as they are with the original subqueries
       join (
                select      flagged_case_id,
                            group_concat(match_string_highlighted order by matches.id separator "") as all_matches
                from        matches
                group by    flagged_case_id
            ) v1
         on v1.flagged_case_id = flagged_cases.id
       join (
                select      kh.flagged_case_id,
                            group_concat(concat(k.keyword, " ", "x", cast(kh.hits as char), "") order by k.weighting desc separator "") as hitcount
                from        flagged_cases_keywords_hits kh 
                       join keywords k 
                         on kh.keyword_id = k.id 
                group by    kh.flagged_case_id
            ) v2
         on v2.flagged_case_id = flagged_cases.id 
  left join matches 
         on flagged_cases.id = matches.flagged_case_id 
where       reviewed_state_id = 1 
        and data_source_id in('1','3','4','5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20') 
        and fetch_date >= '2015-05-10 00:00:00' 
        and fetch_date <= '2015-05-17 23:59:59' 
group by    flagged_cases.id 
order by    title desc
limit       10;
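
To check whether the rewrite changes the plan, EXPLAIN can be run on it. The sketch below uses a reduced version of the query with only the matches inline view, and assumes that reviewed_state_id and fetch_date are columns of flagged_cases; the alias fc is introduced here only for brevity. If the inline view is materialised once, its row in the EXPLAIN output should show a select_type of DERIVED rather than DEPENDENT SUBQUERY.

```sql
-- Sketch only: reduced form of the rewritten query, prefixed with EXPLAIN.
EXPLAIN
SELECT      fc.id,
            v1.all_matches
FROM        flagged_cases fc
  LEFT JOIN (
                -- grouped once for all cases instead of once per outer row
                SELECT      flagged_case_id,
                            GROUP_CONCAT(match_string_highlighted ORDER BY id SEPARATOR "") AS all_matches
                FROM        matches
                GROUP BY    flagged_case_id
            ) v1
         ON v1.flagged_case_id = fc.id
WHERE       fc.reviewed_state_id = 1
        AND fc.fetch_date >= '2015-05-10 00:00:00'
        AND fc.fetch_date <= '2015-05-17 23:59:59';
```

A LEFT JOIN is used here so that cases without any matches still appear, with all_matches as NULL, which mirrors what the correlated subquery in the question returns.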
