Crafting a Subquery-able UNION ALL based on the results of a query

Data

I have a couple of tables like so:

CREATE TABLE cycles (
  `cycle` varchar(6) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `cycle_type` varchar(140) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `start` date DEFAULT NULL,
  `end` date DEFAULT NULL
);

CREATE TABLE rsvn (
  `str` varchar(140) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `start_date` date DEFAULT NULL,
  `end_date` date DEFAULT NULL
);

INSERT INTO `cycles` (`cycle`, `cycle_type`, `start`, `end`) values
('202013', 'a', '2021-01-04', '2021-01-31'),
('202013', 'b', '2021-01-04', '2021-01-31'),
('202101', 'a', '2021-01-04', '2021-01-31'),
('202101', 'b', '2021-01-04', '2021-01-31'),
('202102', 'a', '2021-02-01', '2021-02-28'),
('202102', 'b', '2021-02-01', '2021-02-28'),
('202103', 'a', '2021-03-01', '2021-03-28'),
('202103', 'b', '2021-03-01', '2021-03-28');

INSERT INTO `rsvn` (str, start_date, end_date) values
('STR01367', '2020-12-07', '2020-06-21'),
('STR00759', '2020-12-07', '2021-04-25'),
('STR01367', '2021-01-04', '2021-09-12'),
('STR01367', '2021-06-21', '2022-02-27');

Desired Results

For any given range of cycles, I want to count, for each str, the number of cycles in which it occurs. So between cycle 2108 - 2108 (one cycle), I see:

str       count
STR01367  1
STR00759  1

And between 2108 - 2109 (two cycles) I see:

str       count
STR01367  2
STR00759  1

What I've tried

I'm trying to figure out how to obtain those results dynamically. I don't see any option other than a UNION ALL query (one query per cycle), so I tried writing a PROCEDURE. However, that didn't work because I want to post-process the query results, and I don't believe you can use the results of a PROCEDURE in a CTE or subquery.

My PROCEDURE (it works, but I can't include its results in a subquery, e.g. SELECT * FROM call count_cycles (?)):

CREATE PROCEDURE `count_cycles`(start_cycle CHAR(6), end_cycle CHAR(6))
BEGIN
    SET @cycles := (
        SELECT CONCAT('WITH installed_cycles_count AS (',
            GROUP_CONCAT(
                CONCAT('
        SELECT rsvn.str, 1 AS installed_cycles
        FROM rsvn
        WHERE "', `cy`.`start`, '" BETWEEN rsvn.start_date AND COALESCE(rsvn.end_date, "9999-01-01")
           OR "', `cy`.`end`, '" BETWEEN rsvn.start_date AND COALESCE(rsvn.end_date, "9999-01-01")
        GROUP BY rsvn.str
    '
                )
                 SEPARATOR ' UNION ALL '
            ),
    ')

    SELECT
             store.chain AS "Chain"
            ,store.division AS "Division"
            ,r.str AS "Store"
            ,SUM(installed_cycles) AS "Installed Cycles"
    FROM installed_cycles_count r
    LEFT JOIN store ON store.name = r.str
    GROUP BY r.str
    ORDER BY chain, division, r.str, installed_cycles'
        )
        FROM cycles `cy`
        WHERE `cy`.`cycle_type` = 'Ad Cycle'
            AND `cy`.`cycle` >= CONCAT('20', RIGHT(start_cycle, 4))
            AND `cy`.`cycle` <= CONCAT('20', RIGHT(end_cycle, 4))
        GROUP BY `cy`.`cycle_type`
    );

    EXECUTE IMMEDIATE @cycles;
END

Alternatively, I attempted to use a recursive query to obtain my results by incrementing my cycle. This gave me the cycles I wanted:

WITH RECURSIVE xyz AS (
    SELECT cy.`cycle`, cy.`start`, cy.`end`
    FROM cycles cy
    WHERE cycle_type = 'Ad Cycle'
    AND `cycle` = '202101'

    UNION ALL
    
    SELECT cy.`cycle`, cy.`start`, cy.`end`
    FROM xyz
    JOIN cycles cy
        ON cy.`cycle` = increment_cycle(xyz.`cycle`, 1)
        AND cy.`cycle_type` = 'Ad Cycle'
    WHERE cy.`cycle` <= '202110'
)
SELECT * FROM xyz;

But I can't get it working once I join in the reservations table (an infinite loop?):

WITH RECURSIVE xyz AS (
    SELECT cy.`cycle`, dr.str, 1 AS installed_cycles
    FROM cycles cy
    LEFT JOIN rsvn dr
        ON cy.`start` BETWEEN dr.start_date AND COALESCE(dr.end_date, "9999-01-01")
            OR cy.`end` BETWEEN dr.start_date AND COALESCE(dr.end_date, "9999-01-01")
    WHERE cy.`cycle_type` = 'Ad Cycle'
        AND cy.`cycle` = '202101'

    UNION ALL

    SELECT cy.`cycle`, dr.str, 1 AS installed_cycles
    FROM xyz
    JOIN cycles cy
        ON cy.`cycle` = increment_cycle(xyz.`cycle`, 1)
        AND cy.`cycle_type` = 'Ad Cycle'
    LEFT JOIN rsvn dr
        ON cy.`start` BETWEEN dr.start_date AND COALESCE(dr.end_date, "9999-01-01")
            OR cy.`end` BETWEEN dr.start_date AND COALESCE(dr.end_date, "9999-01-01")
    WHERE cy.`cycle` <= '202102'
)
SELECT * FROM xyz
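
One possible explanation for the apparent infinite loop is row multiplication rather than a true loop: every row already in xyz joins to every matching reservation of the next cycle, so the row count is multiplied at each step. The sketch below illustrates this with Python's sqlite3 standing in for MariaDB. The table layout, the integer cycle numbers (so that `cycle + 1` can replace increment_cycle), the column names first_day and last_day, and the data are all assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cycles (cycle INTEGER, first_day TEXT, last_day TEXT);
    CREATE TABLE rsvn (str TEXT, start_date TEXT, end_date TEXT);
    -- Hypothetical data: three consecutive cycles and
    -- two reservations that span all of them.
    INSERT INTO cycles VALUES
        (1, '2021-01-04', '2021-01-31'),
        (2, '2021-02-01', '2021-02-28'),
        (3, '2021-03-01', '2021-03-28');
    INSERT INTO rsvn VALUES
        ('STR_A', '2020-12-07', '2021-12-31'),
        ('STR_B', '2020-12-07', '2021-12-31');
""")

# Same shape as the recursive query above: the recursive member joins
# each row of xyz to the next cycle AND to every matching reservation.
sql = """
WITH RECURSIVE xyz(cycle, str) AS (
    SELECT cy.cycle, dr.str
    FROM cycles cy
    JOIN rsvn dr
        ON (cy.first_day BETWEEN dr.start_date AND dr.end_date)
        OR (cy.last_day BETWEEN dr.start_date AND dr.end_date)
    WHERE cy.cycle = 1

    UNION ALL

    SELECT cy.cycle, dr.str
    FROM xyz
    JOIN cycles cy ON cy.cycle = xyz.cycle + 1
    JOIN rsvn dr
        ON (cy.first_day BETWEEN dr.start_date AND dr.end_date)
        OR (cy.last_day BETWEEN dr.start_date AND dr.end_date)
    WHERE cy.cycle <= 3
)
SELECT cycle, COUNT(*) FROM xyz GROUP BY cycle ORDER BY cycle
"""

# Number of rows the recursion produced for each cycle.
rows_per_cycle = conn.execute(sql).fetchall()
print(rows_per_cycle)
```

With only two reservations the row count already doubles at every cycle. With thousands of reservations and ten cycles the result set would become enormous, which could easily look like a query that never finishes.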

What options do I have for getting the results I need in a form that I can use in a CTE or subquery?

The results I am looking for are easily obtained via a two-stage grouping. Something like this:

WITH sbc AS (
    -- Stage 1: one row per (cycle, str) pair whose date ranges intersect.
    SELECT cy.`cycle`, dr.str, 1 AS `count`
    FROM cycles cy
    LEFT JOIN rsvn dr
        ON cy.`start` BETWEEN dr.start_date AND dr.end_date
            OR cy.`end` BETWEEN dr.start_date AND dr.end_date
    WHERE cy.`cycle_type` = 'Ad Cycle'
        AND cy.`cycle` BETWEEN '202201' AND '202205'
    GROUP BY cy.`cycle`, dr.str
)
-- Stage 2: count the cycles for each str.
SELECT str, SUM(`count`) AS `count`
FROM sbc
GROUP BY str;

The CTE produces one row per str per cycle. After that, all that is needed is to group by str and count the number of occurrences.
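
The two-stage grouping can be tried out with a minimal, runnable sketch. It uses Python's sqlite3 in place of MariaDB, and the sample rows, the column names first_day and last_day, and the helper name count_cycles are assumptions made for the illustration, not the original schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cycles (cycle TEXT, cycle_type TEXT,
                         first_day TEXT, last_day TEXT);
    CREATE TABLE rsvn (str TEXT, start_date TEXT, end_date TEXT);
    -- Hypothetical sample data.
    INSERT INTO cycles VALUES
        ('202101', 'Ad Cycle', '2021-01-04', '2021-01-31'),
        ('202102', 'Ad Cycle', '2021-02-01', '2021-02-28'),
        ('202103', 'Ad Cycle', '2021-03-01', '2021-03-28');
    INSERT INTO rsvn VALUES
        ('STR00759', '2020-12-07', '2021-02-14'),
        ('STR01367', '2021-01-04', '2021-01-20'),
        ('STR01367', '2021-01-21', '2021-09-12');
""")

SQL = """
WITH sbc AS (
    -- Stage 1: collapse to one row per (cycle, str), even when several
    -- reservations of the same str touch the same cycle.
    SELECT cy.cycle, dr.str, 1 AS cnt
    FROM cycles cy
    JOIN rsvn dr
        ON (cy.first_day BETWEEN dr.start_date AND dr.end_date)
        OR (cy.last_day BETWEEN dr.start_date AND dr.end_date)
    WHERE cy.cycle_type = 'Ad Cycle'
      AND cy.cycle BETWEEN ? AND ?
    GROUP BY cy.cycle, dr.str
)
-- Stage 2: count the cycles per str.
SELECT str, SUM(cnt) AS cnt
FROM sbc
GROUP BY str
ORDER BY str
"""

def count_cycles(conn, first_cycle, last_cycle):
    """Return (str, number of cycles) pairs for the given cycle range."""
    return conn.execute(SQL, (first_cycle, last_cycle)).fetchall()

one_cycle = count_cycles(conn, "202101", "202101")
three_cycles = count_cycles(conn, "202101", "202103")
print(one_cycle)
print(three_cycles)
```

In this sample STR01367 has two reservations that both touch cycle 202101. Stage 1 collapses them into a single row, so that cycle is counted once. The sketch uses an inner JOIN rather than the LEFT JOIN above, so cycles without any reservation do not contribute a row with a NULL str.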

Besides being simpler, I suspect that this query is faster than the UNION ALL approach I was stuck on when I asked the question, since, among other things, the server does not need to perform a union over multiple grouping queries. However, I do not understand how MariaDB optimizes such queries, and while I am curious, I don't have the time to run benchmarks to find out.
