
Is it possible to partition a view, like a table in MySQL?

I have created a view from a UNION ALL of around 10 tables and want to run search queries on it by date range. But as the number of records increases, the query takes longer to execute. Right now the view has 2 billion rows.

The table structure is:

CREATE TABLE IF NOT EXISTS `tbl_queue_stats_0716` (
    `id` int(11) NOT NULL AUTO_INCREMENT, 
    `server_id` int(11) NOT NULL, 
    `uniqueid` varchar(100) DEFAULT NULL, 
    `queue_datetime` datetime NOT NULL, 
    `queue_timestamp` varchar(100) NULL, 
    `qname_id` int(11) NOT NULL, 
    `qagent_id` int(11) NOT NULL, 
    `qevent_id` int(11) NOT NULL, 
    `info1` varchar(100) DEFAULT NULL, 
    `info2` varchar(100) DEFAULT NULL, 
    `info3` varchar(100) DEFAULT NULL, 
    `info4` varchar(100) DEFAULT NULL, 
    `info5` varchar(100) DEFAULT NULL, 
    PRIMARY KEY (`id`)
);

Tables are created on a monthly basis, so there can be tables like tbl_queue_stats_0616, tbl_queue_stats_0516, tbl_queue_stats_0416...

I want to run the search query across multiple tables when the date range spans 2 or more months.

The search query looks like:

select  server_id,server_name,queue_id,queue_name,qevent_id,event,
        count(id) as cnt,sum(info1) as info1, sum(info2) as info2,
        sum(info3) as info3, sum(info4) as info4, sum(info5) as info5,
        max(cast(info2 AS SIGNED)) as max_info2,
        max(cast(info3 AS SIGNED)) as max_info3
   from
      ( SELECT  a.server_id as server_id,e.server_name as server_name,
                a.id,a.`queue_datetime`, b.agent, a.qname_id as queue_id ,
               c.queue as queue_name,d.event,a.qevent_id,a.info1,a.info2,
               a.info3,a.info4,a.info5
            FROM  view_queue_stats a,tbl_qagent b, tbl_qname c, tbl_qevent d,
                tbl_server e
            WHERE  a.qagent_id=b.id
              AND  a.qname_id=c.id
              AND  a.qevent_id=d.id
              AND  a.server_id=e.id
              AND  DATE(a.queue_datetime) between '" . $start_date .
                                           "' AND '" . $end_date . "'
              AND  a.server_id IN ($server_name) 
      )as total
    GROUP BY  qevent_id,queue_id,server_id
    ORDER BY  length(server_name), server_name,queue_id,qevent_id

I think searching through a partitioned view could execute my query faster. To achieve this I tried applying partition-related parameters when creating the view, but did not succeed.

Below is the output of SHOW CREATE VIEW view_queue_stats;

CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER 
       VIEW `view_queue_stats`
       AS    select  `tbl_queue_stats_0116`.`id` AS `id`,
       `tbl_queue_stats_0116`.`server_id` AS `server_id`,
       `tbl_queue_stats_0116`.`uniqueid` AS `uniqueid`,
       `tbl_queue_stats_0116`.`queue_datetime` AS `queue_datetime`,
       `tbl_queue_stats_0116`.`queue_timestamp` AS `queue_timestamp`,
       `tbl_queue_stats_0116`.`qname_id` AS `qname_id`,
       `tbl_queue_stats_0116`.`qagent_id` AS `qagent_id`,
       `tbl_queue_stats_0116`.`qevent_id` AS `qevent_id`,
       `tbl_queue_stats_0116`.`info1` AS `info1`,
       `tbl_queue_stats_0116`.`info2` AS `info2`,
       `tbl_queue_stats_0116`.`info3` AS `info3`,
       `tbl_queue_stats_0116`.`info4` AS `info4`,
       `tbl_queue_stats_0116`.`info5` AS `info5`
    from  `tbl_queue_stats_0116`
    union  all 
select  `tbl_queue_stats_0216`.`id` AS `id`,
       `tbl_queue_stats_0216`.`server_id` AS `server_id`,
       `tbl_queue_stats_0216`.`uniqueid` AS `uniqueid`,
       `tbl_queue_stats_0216`.`queue_datetime` AS `queue_datetime`,
       `tbl_queue_stats_0216`.`queue_timestamp` AS `queue_timestamp`,
       `tbl_queue_stats_0216`.`qname_id` AS `qname_id`,
       `tbl_queue_stats_0216`.`qagent_id` AS `qagent_id`,
       `tbl_queue_stats_0216`.`qevent_id` AS `qevent_id`,
       `tbl_queue_stats_0216`.`info1` AS `info1`,
       `tbl_queue_stats_0216`.`info2` AS `info2`,
       `tbl_queue_stats_0216`.`info3` AS `info3`,
       `tbl_queue_stats_0216`.`info4` AS `info4`,
       `tbl_queue_stats_0216`.`info5` AS `info5`
    from  `tbl_queue_stats_0216`
    union  all
    ...

| utf8                 | utf8_general_ci      |

So, is there any way to partition a view?

Will you have a billion server_ids? Perhaps you could use a smaller int, such as MEDIUMINT UNSIGNED, which is 3 bytes (instead of 4) and has a limit of 16M. Ditto for other ids. (Smaller -> more cacheable -> less I/O -> faster)

Is queue_timestamp a timestamp? If so, why VARCHAR?

cast(info2 AS SIGNED) -- You would be better off cleansing the data before inserting it, and then using an appropriate datatype (INT?).

Important: Don't hide columns in functions (DATE(a.queue_datetime)); it inhibits using indexes. See below.
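As a minimal sketch of that rewrite, the date filter can be expressed as a half-open range on the bare column. The literal dates stand in for the question's $start_date and $end_date and are assumed to be in 'YYYY-MM-DD' form; the single monthly table is used only to keep the example short.

```sql
-- Original form: wrapping the column in DATE() prevents an index range scan.
--   AND DATE(a.queue_datetime) BETWEEN '2016-07-01' AND '2016-07-15'

-- Equivalent filter on the bare column: start of the first day (inclusive)
-- up to the start of the day after the last day (exclusive).
SELECT COUNT(*)
  FROM tbl_queue_stats_0716 AS a
 WHERE a.queue_datetime >= '2016-07-01'
   AND a.queue_datetime <  '2016-07-15' + INTERVAL 1 DAY;
```

The half-open form also avoids having to spell out '23:59:59' for the end of the last day.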

Are most of the fields really optional? If not, say NOT NULL instead of NULL.
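The datatype suggestions above could be applied to the next monthly table along these lines. This is only a sketch: the table name tbl_queue_stats_0816 follows the question's naming pattern but is hypothetical, and changing info2 and info3 to INT assumes they only ever hold integers (the query casts them to SIGNED). queue_timestamp and the other info columns are left alone because the question does not say what they store.

```sql
-- Hypothetical next monthly table with narrower id columns.
CREATE TABLE IF NOT EXISTS `tbl_queue_stats_0816` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `server_id` MEDIUMINT UNSIGNED NOT NULL,   -- 3 bytes, up to ~16M values
    `uniqueid` VARCHAR(100) DEFAULT NULL,
    `queue_datetime` DATETIME NOT NULL,
    `queue_timestamp` VARCHAR(100) NULL,       -- unchanged: contents unknown
    `qname_id` MEDIUMINT UNSIGNED NOT NULL,
    `qagent_id` MEDIUMINT UNSIGNED NOT NULL,
    `qevent_id` MEDIUMINT UNSIGNED NOT NULL,
    `info1` VARCHAR(100) DEFAULT NULL,
    `info2` INT DEFAULT NULL,                  -- assumption: integer data only
    `info3` INT DEFAULT NULL,                  -- assumption: integer data only
    `info4` VARCHAR(100) DEFAULT NULL,
    `info5` VARCHAR(100) DEFAULT NULL,
    PRIMARY KEY (`id`)
);
```

If the id columns are narrowed here, the id columns of the lookup tables (tbl_server, tbl_qname, tbl_qagent, tbl_qevent) would ideally be changed to the same type so the joins compare like with like.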

Important: Back to the question... UNION ALL of 10 tables will perform similarly to a PARTITIONed table where no "partition pruning" can occur. But the UNION is likely to be worse because it seems to generate a temp table containing all the data, then start filtering. Please provide EXPLAIN SELECT ... for the query. (This should confirm or deny this supposition. It could make a big difference.)
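One way to check that supposition is to run EXPLAIN on a cut-down version of the query against the view. The sketch below drops the lookup-table joins for brevity; the dates and server ids are placeholder values, not taken from the question.

```sql
-- Placeholder dates and server ids; substitute real values.
EXPLAIN
SELECT a.server_id, a.qname_id, a.qevent_id, COUNT(a.id) AS cnt
  FROM view_queue_stats AS a
 WHERE a.queue_datetime >= '2016-06-01'
   AND a.queue_datetime <  '2016-07-16'
   AND a.server_id IN (1, 2)
 GROUP BY a.qevent_id, a.qname_id, a.server_id;
```

If the plan shows the view as a derived table with one UNION row per monthly table and very large row estimates, that would suggest every table is being read before the date filter is applied.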

Important: INDEX(server_id, queue_datetime) is likely to help performance.
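A sketch of adding that index to one of the monthly tables follows. The index name idx_server_datetime is made up for illustration, and the statement would need to be repeated for each monthly table behind the view.

```sql
-- Composite index: the IN-list column first, the range column last.
ALTER TABLE tbl_queue_stats_0716
  ADD INDEX idx_server_datetime (server_id, queue_datetime);
```

The index only pays off if the WHERE clause compares queue_datetime directly rather than through DATE().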

So, the question now is whether "pruning" can occur. The likely case is when queue_datetime would limit the result to a few partitions. Are the tables based on queue_datetime? Are the SELECTs usually limited to one or two of the tables?

Given the correct answers to the above, and given the changes suggested, then changing from a VIEW to this will help significantly:

PARTITION BY RANGE(TO_DAYS(queue_datetime)) ...
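Spelled out, a single partitioned table replacing the monthly tables might look like the sketch below. The table name tbl_queue_stats, the partition names, and the index name are all hypothetical. Two details are assumptions worth flagging: id is widened to BIGINT UNSIGNED because 2 billion rows in one table is close to the signed INT limit of 2,147,483,647, and queue_datetime is added to the primary key because MySQL requires the partitioning column to be part of every unique key.

```sql
-- Hypothetical single table with one partition per month.
CREATE TABLE tbl_queue_stats (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- assumption: widened from INT
    server_id INT NOT NULL,
    uniqueid VARCHAR(100) DEFAULT NULL,
    queue_datetime DATETIME NOT NULL,
    queue_timestamp VARCHAR(100) NULL,
    qname_id INT NOT NULL,
    qagent_id INT NOT NULL,
    qevent_id INT NOT NULL,
    info1 VARCHAR(100) DEFAULT NULL,
    info2 VARCHAR(100) DEFAULT NULL,
    info3 VARCHAR(100) DEFAULT NULL,
    info4 VARCHAR(100) DEFAULT NULL,
    info5 VARCHAR(100) DEFAULT NULL,
    -- The partitioning column must appear in every unique key.
    PRIMARY KEY (id, queue_datetime),
    INDEX idx_server_datetime (server_id, queue_datetime)
)
PARTITION BY RANGE (TO_DAYS(queue_datetime)) (
    PARTITION p201606 VALUES LESS THAN (TO_DAYS('2016-07-01')),
    PARTITION p201607 VALUES LESS THAN (TO_DAYS('2016-08-01')),
    PARTITION pfuture VALUES LESS THAN MAXVALUE
);
```

Pruning only happens when the WHERE clause constrains queue_datetime directly, so the query's DATE(...) filter would still need to be rewritten as a plain range.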

But, as it turns out, partitioning is not really necessary. The INDEX suggested above (together with the change to the WHERE) will do just as well on a single table.

But... Some more questions. You mentioned one SELECT; are there others? Fixing the query/schema for one query may or may not help other queries. Do you delete "old" tables/partitions? If so, partitioning can help nicely.
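To illustrate the point about deleting old data: with range partitions, removing a month is a quick metadata operation rather than a large DELETE. The sketch assumes a hypothetical partitioned table named tbl_queue_stats with monthly partitions such as p201606 and a catch-all partition pfuture defined with VALUES LESS THAN MAXVALUE; none of these names come from the question.

```sql
-- Drop the oldest month in one step (hypothetical partition name).
ALTER TABLE tbl_queue_stats DROP PARTITION p201606;

-- Carve the next month out of the catch-all partition.
ALTER TABLE tbl_queue_stats REORGANIZE PARTITION pfuture INTO (
    PARTITION p201608 VALUES LESS THAN (TO_DAYS('2016-09-01')),
    PARTITION pfuture VALUES LESS THAN MAXVALUE
);
```

With separate monthly tables the equivalent is DROP TABLE plus redefining the view, which is also cheap, so the benefit here is mostly about not having to maintain the view.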

Answer those issues, then we can make a mid-course correction.

Check the link given below. It may help you.

http://dev.mysql.com/doc/refman/5.5/en/partitioning.html
