[英]Crafting a Subquery-able UNION ALL based on the results of a query
I have a couple of tables like so:我有几个这样的表:
CREATE TABLE cycles (
`cycle` varchar(6) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`cycle_type` varchar(140) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`start` date DEFAULT NULL,
`end` date DEFAULT NULL
);
CREATE TABLE rsvn (
`str` varchar(140) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`start_date` date DEFAULT NULL,
`end_date` date DEFAULT NULL
);
INSERT INTO `cycles` (`cycle`, `cycle_type`, `start`, `end`) values
('202013', 'a', '2021-01-04', '2021-01-31'),
('202013', 'b', '2021-01-04', '2021-01-31'),
('202101', 'a', '2021-01-04', '2021-01-31'),
('202101', 'b', '2021-01-04', '2021-01-31'),
('202102', 'a', '2021-02-01', '2021-02-28'),
('202102', 'b', '2021-02-01', '2021-02-28'),
('202103', 'a', '2021-03-01', '2021-03-28'),
('202103', 'b', '2021-03-01', '2021-03-28');
INSERT INTO `rsvn` (str, start_date, end_date) values
('STR01367', '2020-12-07', '2020-06-21'),
('STR00759', '2020-12-07', '2021-04-25'),
('STR01367', '2021-01-04', '2021-09-12'),
('STR01367', '2021-06-21', '2022-02-27');
For any given cycle, I want to count the number of occurrences of str across cycles.对于任何给定的周期,我想计算 str 跨周期的出现次数。 So between cycle 2108 - 2108 (one cycle), I see:
所以在周期 2108 - 2108(一个周期)之间,我看到:
str![]() |
count![]() |
---|---|
STR01367 ![]() |
1 ![]() |
STR00759 ![]() |
1 ![]() |
And from between 2108 - 2109 (two cycles) I see:从 2108 年到 2109 年(两个周期),我看到:
str![]() |
count![]() |
---|---|
STR01367 ![]() |
2 ![]() |
STR00759 ![]() |
1 ![]() |
I'm trying to figure out how to dynamically obtain those results.我试图弄清楚如何动态获得这些结果。 I don't see any options outside a UNION ALL query (one query for each cycles), so I tried writing a PROCEDURE.
我在 UNION ALL 查询(每个周期一个查询)之外看不到任何选项,所以我尝试编写一个 PROCEDURE。 However, that didn't work because I want to do post-processing on the query results, and I don't believe you can use the results of a PROCEDURE in a CTE or subquery.
但是,这不起作用,因为我想对查询结果进行后处理,而且我不相信您可以在 CTE 或子查询中使用 PROCEDURE 的结果。
My PROCEDURE (works, can't include results in a subquery like SELECT * FROM call count_cycles
(?)):我的程序(有效,不能在子查询中包含结果,例如
SELECT * FROM call count_cycles
(?)):
CREATE PROCEDURE `count_cycles`(start_cycle CHAR(6), end_cycle CHAR(6))
BEGIN
SET @cycles := (
SELECT CONCAT('WITH installed_cycles_count AS (',
GROUP_CONCAT(
CONCAT('
SELECT rsvn.str, 1 AS installed_cycles
FROM rsvn
WHERE "', `cy`.`start`, '" BETWEEN rsvn.start_date AND COALESCE(rsvn.end_date, "9999-01-01")
OR "', `cy`.`end`, '" BETWEEN rsvn.start_date AND COALESCE(rsvn.end_date, "9999-01-01")
GROUP BY rsvn.str
'
)
SEPARATOR ' UNION ALL '
),
')
SELECT
store.chain AS "Chain"
,store.division AS "Division"
,dividers_store AS "Store"
,SUM(installed_cycles) AS "Installed Cycles"
FROM installed_cycles_count r
LEFT JOIN store ON store.name = r.dividers_store
GROUP BY dividers_store
ORDER BY chain, division, dividers_store, installed_cycles'
)
FROM cycles `cy`
WHERE `cy`.`cycle_type` = 'Ad Cycle'
AND `cy`.`cycle` >= CONCAT('20', RIGHT(start_cycle, 4))
AND `cy`.`cycle` <= CONCAT('20', RIGHT(end_cycle, 4))
GROUP BY `cy`.`cycle_type`
);
EXECUTE IMMEDIATE @cycles;
END
Alternatively, I attempted to use a recursive query to obtain my results by incrementing my cycle.或者,我尝试使用递归查询通过增加我的周期来获得我的结果。 This gave me the cycles I wanted:
这给了我想要的周期:
WITH RECURSIVE xyz AS (
SELECT cy.`cycle`, cy.`start`, cy.`end`
FROM cycles cy
WHERE cycle_type = 'Ad Cycle'
AND `cycle` = '202101'
UNION ALL
SELECT cy.`cycle`, cy.`start`, cy.`end`
FROM xyz
JOIN cycles cy
ON cy.`cycle` = increment_cycle(xyz.`cycle`, 1)
AND cy.`cycle_type` = 'Ad Cycle'
WHERE cy.`cycle` <= '202110'
)
SELECT * FROM xyz;
But I can't get it working when I add in the reservations table: infinite loop?但是当我在预订表中添加时,我无法让它工作:无限循环?
WITH RECURSIVE xyz AS (
SELECT cy.`cycle`, 'dr.dividers_store', 1 AS installed_cycles
FROM cycles cy
LEFT JOIN rsvn dr
ON cy.`start` BETWEEN dr.start_date AND COALESCE(dr.end_date, "9999-01-01")
OR cy.`end` BETWEEN dr.start_date AND COALESCE(dr.end_date, "9999-01-01")
WHERE cy.`cycle_type` = 'Ad Cycle'
AND cy.`cycle` = '202101'
UNION ALL
SELECT cy.`cycle`, 'dr.dividers_store', 1 AS installed_cycles
FROM xyz
JOIN cycles cy
ON cy.`cycle` = increment_cycle(xyz.`cycle`, 1)
AND cy.`cycle_type` = 'Ad Cycle'
LEFT JOIN rsvn dr
ON cy.`start` BETWEEN dr.start_date AND COALESCE(dr.end_date, "9999-01-01")
OR cy.`end` BETWEEN dr.start_date AND COALESCE(dr.end_date, "9999-01-01")
WHERE cy.`cycle` <= '202102'
)
SELECT * FROM xyz
What options do I have to get the results I need, in such a way that I can use them in a CTE or subquery?我必须有哪些选项才能获得我需要的结果,以便我可以在 CTE 或子查询中使用它们?
The results I am looking for are easily obtained via a two-stage grouping.我正在寻找的结果很容易通过两阶段分组获得。 Something like this:
像这样的东西:
WITH sbc AS (
SELECT cy.`cycle`, dr.str, 1 AS 'count'
FROM cycles cy
LEFT JOIN rsvn dr
ON cy.`start` BETWEEN dr.start_date AND dr.end_date
OR cy.`end` BETWEEN dr.start_date AND dr.end_date
WHERE cy.`cycle_type` = 'Ad Cycle'
AND cy.`cycle` BETWEEN '202201' AND '202205'
GROUP BY cy.`cycle`, dr.str
ORDER BY dr.str, cy.`cycle`
)
SELECT `cycle`, str, SUM(`count`) as `count`
FROM sbc
GROUP BY str
The CTE produces one result per rsvn per cycle. CTE 每个周期每个 rsvn 产生一个结果。 Afterwards all that is needed is to group by store and count the number of occurrences.
之后所需要做的就是按商店分组并计算出现次数。
Besides being simpler, I suspect that this query is faster than the union concept I was stuck on when I asked the question, since among other things the server does not need to perform a union on multiple grouping queries.除了更简单之外,我怀疑这个查询比我提出问题时坚持的联合概念更快,因为除其他外,服务器不需要对多个分组查询执行联合。 However, I do not understand how MariaDB optimizes such queries, and while I am curious I don't have the time to run benchmarks to find out.
但是,我不明白 MariaDB 如何优化此类查询,虽然我很好奇,但我没有时间运行基准测试来找出答案。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.