
How to optimise a query containing joins and subqueries

I have inherited the following query and DB structure, and I want to optimise it because it is slow. It contains both joins and subqueries, which I've read is not a good combination. I've tried various ways to improve it but keep getting stuck.

If it is fine as it is then great, but if there are suggestions for improving it I would be immensely grateful.

The query draws data from various tables to produce a report on how many clickthroughs to a supplier's website there have been, how many telephone number 'reveals' there have been for a supplier, and how many emails have been sent to a supplier.

The WHERE clause uses 1=1 because conditions are sometimes appended to filter the report by region, county, and the supplier's business type.

The code is copied from the mysql_slow log so that all the $variables are already interpolated. The table structures are output from a MySQL dump.

The query:

SELECT Business.*, 
       ( SELECT Count(Message.id) FROM messages as Message 
         WHERE (U.id = Message.from_to OR U.id = Message.user_id)  
           AND Message.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'
       ) as message_no, 
       ( SELECT Count(DISTINCT(MessageUnique.user_id)) FROM messages as MessageUnique 
         WHERE (U.id = MessageUnique.from_to OR U.id = MessageUnique.user_id) 
           AND (MessageUnique.parent_message_id is null OR MessageUnique.parent_message_id = MessageUnique.id)  
           AND MessageUnique.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'
       ) as message_unique_no, 
       ( SELECT Count(*) FROM business_counties as bc2 
         WHERE Business.id = bc2.business_id ) as county_no, 
       ( SELECT Count(click.id) FROM business_clickthroughs as click 
         WHERE Business.id = click.business_id  
           AND click.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'
       ) as clicks, 
       ( SELECT Count(*) FROM business_regions as br2 
         WHERE Business.id = br2.business_id ) as region_no, 
       ( SELECT count(BusinessReveal.id) as reveal_no FROM business_reveals as BusinessReveal
         WHERE 1=1  
           AND BusinessReveal.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59' 
           AND BusinessReveal.business_id = Business.id
       ) as reveals_no 
FROM businesses as Business 
LEFT JOIN users as U ON Business.id = U.business_id  
LEFT JOIN business_counties as bc ON Business.id = bc.business_id 
LEFT JOIN businesses_business_types as bt ON Business.id = bt.business_id 
LEFT JOIN business_regions as br ON Business.id = br.business_id 
WHERE 1=1  
Group By Business.id;

The table structures:

/*
 Navicat MySQL Data Transfer

 Source Server         : _Localhost
 Source Server Type    : MySQL
 Source Server Version : 50530
 Target Server Type    : MySQL
 Target Server Version : 50530
 File Encoding         : utf-8
*/


-- ----------------------------
--  Table structure for `business_clickthroughs`
-- ----------------------------
DROP TABLE IF EXISTS `business_clickthroughs`;
CREATE TABLE `business_clickthroughs` (
  `id` bigint(12) unsigned NOT NULL AUTO_INCREMENT,
  `business_id` int(8) unsigned NOT NULL,
  `registered_user` tinyint(1) unsigned DEFAULT '0',
  `created` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `bid` (`business_id`)
) ENGINE=InnoDB AUTO_INCREMENT=29357 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;

-- ----------------------------
--  Table structure for `business_counties`
-- ----------------------------
DROP TABLE IF EXISTS `business_counties`;
CREATE TABLE `business_counties` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `business_id` int(11) NOT NULL,
  `county_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `bcid` (`business_id`)
) ENGINE=MyISAM AUTO_INCREMENT=20124 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=FIXED;

-- ----------------------------
--  Table structure for `business_regions`
-- ----------------------------
DROP TABLE IF EXISTS `business_regions`;
CREATE TABLE `business_regions` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `business_id` int(11) NOT NULL,
  `region_id` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=2719 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=FIXED;

-- ----------------------------
--  Table structure for `business_reveals`
-- ----------------------------
DROP TABLE IF EXISTS `business_reveals`;
CREATE TABLE `business_reveals` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `business_id` int(11) NOT NULL,
  `customer_id` int(11) DEFAULT NULL,
  `created` datetime NOT NULL,
  `modified` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `bid` (`business_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3172 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPACT;

-- ----------------------------
--  Table structure for `businesses_business_types`
-- ----------------------------
DROP TABLE IF EXISTS `businesses_business_types`;
CREATE TABLE `businesses_business_types` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `business_id` int(11) NOT NULL,
  `business_type_id` int(11) NOT NULL,
  `level` int(2) NOT NULL DEFAULT '2',
  PRIMARY KEY (`id`),
  KEY `bid` (`business_id`) COMMENT '(null)'
) ENGINE=MyISAM AUTO_INCREMENT=4484 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=FIXED;

-- ----------------------------
--  Table structure for `messages`
-- ----------------------------
DROP TABLE IF EXISTS `messages`;
CREATE TABLE `messages` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `subject` varchar(500) DEFAULT NULL,
  `message` text,
  `user_id` int(11) DEFAULT NULL,
  `message_folder_id` int(11) DEFAULT NULL,
  `parent_message_id` int(11) DEFAULT NULL,
  `status` int(11) DEFAULT NULL,
  `direction` int(11) DEFAULT NULL,
  `from_to` varchar(500) DEFAULT NULL,
  `attachment` varchar(500) DEFAULT NULL,
  `created` datetime DEFAULT NULL,
  `modified` datetime DEFAULT NULL,
  `guest_sender` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fromto` (`from_to`(255)),
  KEY `uid` (`user_id`),
  KEY `pmid` (`parent_message_id`)
) ENGINE=InnoDB AUTO_INCREMENT=4582 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;

-- ----------------------------
--  Table structure for `users`
-- ----------------------------
DROP TABLE IF EXISTS `users`;
CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `login` varchar(255) COLLATE latin1_general_ci NOT NULL,
  `password` varchar(255) COLLATE latin1_general_ci NOT NULL,
  `name` varchar(255) COLLATE latin1_general_ci NOT NULL,
  `email` varchar(255) COLLATE latin1_general_ci NOT NULL,
  `title` varchar(20) COLLATE latin1_general_ci NOT NULL,
  `firstname` varchar(255) COLLATE latin1_general_ci NOT NULL,
  `lastname` varchar(255) COLLATE latin1_general_ci NOT NULL,
  `active` tinyint(1) NOT NULL DEFAULT '0',
  `first_visit` tinyint(1) NOT NULL DEFAULT '1',
  `signature` text COLLATE latin1_general_ci,
  `type` varchar(45) COLLATE latin1_general_ci DEFAULT 'customer',
  `business_id` int(11) DEFAULT NULL,
  `admin_monitor` tinyint(1) NOT NULL DEFAULT '0',
  `partner_name` varchar(255) COLLATE latin1_general_ci DEFAULT NULL,
  `postcode` varchar(255) COLLATE latin1_general_ci DEFAULT NULL,
  `venue_postcode` varchar(255) COLLATE latin1_general_ci DEFAULT NULL,
  `wedding_date` datetime DEFAULT NULL,
  `phone` varchar(255) COLLATE latin1_general_ci NOT NULL,
  `register_date` datetime DEFAULT NULL,
  `event` text COLLATE latin1_general_ci,
  `mailing_list` tinyint(1) NOT NULL DEFAULT '0',
  `created` datetime NOT NULL,
  `modified` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=2854 DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci ROW_FORMAT=DYNAMIC;

The EXPLAIN plan:

id  select_type         table           type    possible_keys       key     key_len     ref             rows    Extra

1   PRIMARY             Business        ALL     -                   -       -           -               444     Using temporary; Using filesort
1   PRIMARY             U               ALL     -                   -       -           -               2658    -
1   PRIMARY             bc              ref     bcid                bcid    4           Business.id     7       Using index
1   PRIMARY             bt              ref     bid                 bid     4           Business.id     9       Using index
1   PRIMARY             br              ALL     -                   -       -           -               440     -
7   DEPENDENT SUBQUERY  BusinessReveal  ref     bid                 bid     4           func            5       Using where
6   DEPENDENT SUBQUERY  br2             ALL     -                   -       -           -               440     Using where
5   DEPENDENT SUBQUERY  click           ref     bid                 bid     4           func            22      Using where
4   DEPENDENT SUBQUERY  bc2             ref     bcid                bcid    4           func            7       Using index
3   DEPENDENT SUBQUERY  MessageUnique   ALL     fromto,uid,pmid     -       -           -               4958    Using where
2   DEPENDENT SUBQUERY  Message         ALL     fromto,uid          -       -           -               4958    Using where

Your query has 6 correlated sub queries and returns 444 rows in total. Each of those correlated sub queries is effectively executed once for each returned row, so your single query results in just under 3000 queries (6 × 444 = 2664).

Personally I prefer to avoid them, using either one large join or joins against sub queries. However, which approach is faster depends on the number of rows returned.

Further, the outer query also LEFT JOINs directly to several of the tables, which generates a lot of duplicate rows that the GROUP BY then collapses. As you select nothing directly from most of those tables, and the GROUP BY is on what appears to be a unique key, those joins seem irrelevant.
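
As a rough illustration of that point, here is a minimal sketch of the outer query with the joins to business_counties, businesses_business_types and business_regions dropped. It keeps only one of the six sub queries for brevity (the others would stay as they are), and it keeps the join to users because the message sub queries need U.id. The EXISTS filter and the value region_id = 3 are invented for the example, on the assumption that the optional region filter is the reason the join was there.

```sql
-- Sketch only, not tested against the real data.
SELECT Business.*,
       ( SELECT COUNT(*) FROM business_counties AS bc2
         WHERE bc2.business_id = Business.id ) AS county_no
FROM businesses AS Business
LEFT JOIN users AS U ON Business.id = U.business_id
WHERE 1=1
  -- Hypothetical filter: region_id = 3 is an invented value.
  -- EXISTS tests for a matching row without multiplying the result rows.
  AND EXISTS ( SELECT 1 FROM business_regions AS br
               WHERE br.business_id = Business.id
                 AND br.region_id = 3 )
GROUP BY Business.id;
```

The GROUP BY is still needed here if a business can have more than one user.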

If you keep the correlated sub queries:

SELECT Count(Message.id) FROM messages as Message 
WHERE (U.id = Message.from_to OR U.id = Message.user_id)  
AND Message.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'

There is no useful index on this table for this sub query. As you are checking 2 different columns against U.id there is not much that can be done about that, but an index on created would help. It might be better to duplicate this sub query, once checking from_to and once checking user_id, and add the results together, as you could then have an index on the relevant id column and the date.
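
A minimal sketch of that idea follows. The index names are made up, and the extra condition in the second sub query is my own addition so that a message matching on both columns is not counted twice. Note also that from_to is declared varchar(500) while users.id is an int; comparing a string column with a number generally stops MySQL using an index on that column, so the from_to half may not benefit until the column type is sorted out.

```sql
-- Hypothetical index names; the columns come from the table dump above.
ALTER TABLE messages
  ADD INDEX idx_messages_user_created (user_id, created),
  ADD INDEX idx_messages_created (created);

-- The message count split into two sub queries whose results are added.
SELECT U.id,
       ( SELECT COUNT(*) FROM messages AS m1
         WHERE m1.user_id = U.id
           AND m1.created BETWEEN '2014-04-01 00:00:00' AND '2014-04-30 23:59:59' )
     + ( SELECT COUNT(*) FROM messages AS m2
         WHERE m2.from_to = U.id
           -- skip rows already counted by the first sub query
           AND (m2.user_id IS NULL OR m2.user_id <> U.id)
           AND m2.created BETWEEN '2014-04-01 00:00:00' AND '2014-04-30 23:59:59' ) AS message_no
FROM users AS U;
```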

Also, you are doing a count on a column which appears to be the unique key, so it should never be null; COUNT(*) would give the same result.

SELECT Count(DISTINCT(MessageUnique.user_id)) FROM messages as MessageUnique 
WHERE (U.id = MessageUnique.from_to OR U.id = MessageUnique.user_id) 
AND (MessageUnique.parent_message_id is null OR MessageUnique.parent_message_id = MessageUnique.id)  
AND MessageUnique.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'

This has the same problem as the previous sub query.

SELECT Count(*) FROM business_counties as bc2 
WHERE Business.id = bc2.business_id

This has a key on business_id and should be OK.

SELECT Count(click.id) FROM business_clickthroughs as click 
WHERE Business.id = click.business_id  
AND click.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'

While this table is indexed on business_id, there is no index that covers both business_id and the created date, which would probably help here.
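
Something along these lines, with an index name I have made up:

```sql
-- Equality column first, range column second, so the BETWEEN on created
-- can be resolved inside the index for each business_id.
ALTER TABLE business_clickthroughs
  ADD INDEX idx_clicks_business_created (business_id, created);
```

The existing bid key would then probably be redundant, since business_id is the leading column of the new index.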

SELECT Count(*) FROM business_regions as br2 
WHERE Business.id = br2.business_id

This requires an index on business_id on the business_regions table.
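
For example (the index name is only a suggestion):

```sql
-- business_regions currently has only its primary key, hence the ALL scans
-- for br and br2 in the EXPLAIN output.
ALTER TABLE business_regions
  ADD INDEX idx_regions_business (business_id);
```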

SELECT count(BusinessReveal.id) as reveal_no FROM business_reveals as BusinessReveal
WHERE 1=1  
AND BusinessReveal.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59' 
AND BusinessReveal.business_id = Business.id

Here the key doesn't cover the created date, just the business id.
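
A composite key would look roughly like this (again, the index name is an assumption), and re-running EXPLAIN on the report query afterwards should show whether MySQL picks it up:

```sql
ALTER TABLE business_reveals
  ADD INDEX idx_reveals_business_created (business_id, created);
```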

If you want to try joining against sub queries instead (which can be more efficient, despite MySQL being poor at joining onto sub queries), then something like this (not tested):

SELECT Business.*, 
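       -- COALESCE turns the NULLs from unmatched LEFT JOINs into 0
       -- note: a message matching on both from_to and user_id is counted twice here
       COALESCE(mess_1.mess_count, 0) + COALESCE(mess_2.mess_count, 0) as message_no, 
       COALESCE(mess_3.mess_count, 0) + COALESCE(mess_4.mess_count, 0) as message_unique_no, 
       COALESCE(business1.county_no, 0) as county_no, 
       COALESCE(click1.clicks, 0) as clicks, 
       COALESCE(business_regions.region_no, 0) as region_no, 
       COALESCE(business_reveals1.reveals_no, 0) as reveals_no 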
FROM businesses as Business 
LEFT JOIN users as U ON Business.id = U.business_id  
LEFT OUTER JOIN
(
    SELECT Message.from_to, Count(Message.id) AS mess_count
    FROM messages as Message 
    WHERE Message.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'
    GROUP BY  Message.from_to
) AS mess_1
ON U.id = mess_1.from_to
LEFT OUTER JOIN
(
    SELECT Message.user_id, Count(Message.id) AS mess_count
    FROM messages as Message 
    WHERE Message.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'
    GROUP BY  Message.user_id
) AS mess_2
ON U.id = mess_2.user_id
LEFT OUTER JOIN
( 
    SELECT MessageUnique.from_to, Count(DISTINCT(MessageUnique.user_id))  AS mess_count
    FROM messages as MessageUnique 
    WHERE (MessageUnique.parent_message_id is null OR MessageUnique.parent_message_id = MessageUnique.id)  
    AND MessageUnique.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'
    GROUP BY  MessageUnique.from_to
) AS mess_3
ON U.id = mess_3.from_to
LEFT OUTER JOIN
( 
    SELECT MessageUnique.user_id, Count(DISTINCT(MessageUnique.user_id))  AS mess_count
    FROM messages as MessageUnique 
    WHERE (MessageUnique.parent_message_id is null OR MessageUnique.parent_message_id = MessageUnique.id)  
    AND MessageUnique.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'
    GROUP BY  MessageUnique.user_id
) AS mess_4
ON U.id = mess_4.user_id
LEFT OUTER JOIN
( 
    SELECT business_id, Count(*)  AS county_no
    FROM business_counties as bc2 
    GROUP BY bc2.business_id 
) as business1
ON Business.id = business1.business_id 
LEFT OUTER JOIN
( 
    SELECT click.business_id, Count(click.id) AS clicks
    FROM business_clickthroughs as click 
    WHERE click.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59'
    GROUP BY click.business_id 
) as click1 
ON Business.id = click1.business_id  
LEFT OUTER JOIN
( 
    SELECT br2.business_id, Count(*) AS region_no 
    FROM business_regions as br2 
    GROUP BY br2.business_id 
) as business_regions 
ON Business.id = business_regions.business_id 
LEFT OUTER JOIN
( 
    SELECT BusinessReveal.business_id, count(BusinessReveal.id) as reveals_no 
    FROM business_reveals as BusinessReveal
    WHERE BusinessReveal.created BETWEEN '2014-04-01 00:00:00' and '2014-04-30 23:59:59' 
    GROUP BY BusinessReveal.business_id
) as business_reveals1 
ON business_reveals1.business_id = Business.id
