SQL to UNION ALL n times depending on rows of table
I have a table that has quiz entries submitted by users. Each row has a date, an id, a source and a score.
I want to create a report, per source, that calculates the average score for each day taking all users into account. Users who did not reply on a given day are included through their latest response.
Right now I have one query that pulls all quiz responses for a given day, source and user. If a user submitted multiple quizzes on the same day, only the latest one is retrieved:
Query
select
id,
score,
submission_date,
source
from
quiz
WHERE
id != ""
qualify ROW_NUMBER() OVER (PARTITION BY source, DATE(submission_date), id ORDER BY submission_date DESC) = 1;
Now, for each row returned by the query above, I want to UNION ALL a similar query, except that the submission_date, instead of being the latest, should be <= the submission_date of the row it is going to be 'unioned' with.
In other words, the query above could return 'n' rows (call the result table A), and each row of table A should be followed by the rows of a UNION ALL query based on that row's submission date:
Table A:
row 1
row 2
row 3
...
row n
Final table:
row 1
(multiple rows from UNION ALL)
row 2
(multiple rows from UNION ALL)
row 3
(multiple rows from UNION ALL)
...
row n
(multiple rows from UNION ALL)
I don't know if UNION ALL is the solution here, but this is what I could think of.
Edit
This is the output of the query above, which reads from a table named quiz:
id | score | submission_date | source |
---|---|---|---|
user1 | 30 | 2022-09-16 | foxmedia |
user2 | 29 | 2022-09-16 | foxmedia |
user3 | 44 | 2022-09-14 | foxmedia |
user4 | 58 | 2022-09-13 | foxmedia |
user5 | 94 | 2022-09-13 | branding |
user2 | 25 | 2022-09-11 | foxmedia |
user1 | 21 | 2022-09-11 | foxmedia |
user4 | 50 | 2022-09-10 | foxmedia |
user2 | 23 | 2022-09-10 | foxmedia |
user1 | 22 | 2022-09-10 | foxmedia |
user5 | 90 | 2022-09-09 | branding |
As you can see, on 2022-09-16 user1 and user2 submitted responses, but the average for this day should also consider the last response from all other foxmedia users as if they had answered the quiz on that same day. That means including the rows from user3 on 2022-09-14 and user4 on 2022-09-13 in the average calculation. user5 should not count towards this average because their source is branding, not foxmedia; branding should have a separate average, which in this case would only include data from user5.
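To make the expected numbers concrete, here is a minimal Python sketch of the calculation described above. It is not part of the original query; the function name `average_as_of` is made up for illustration, and the rows are copied from the table. It assumes one row per user, source and day, as in the sample.

```python
from datetime import date

# Sample rows copied from the table above: (id, score, submission_date, source).
ROWS = [
    ("user1", 30, date(2022, 9, 16), "foxmedia"),
    ("user2", 29, date(2022, 9, 16), "foxmedia"),
    ("user3", 44, date(2022, 9, 14), "foxmedia"),
    ("user4", 58, date(2022, 9, 13), "foxmedia"),
    ("user5", 94, date(2022, 9, 13), "branding"),
    ("user2", 25, date(2022, 9, 11), "foxmedia"),
    ("user1", 21, date(2022, 9, 11), "foxmedia"),
    ("user4", 50, date(2022, 9, 10), "foxmedia"),
    ("user2", 23, date(2022, 9, 10), "foxmedia"),
    ("user1", 22, date(2022, 9, 10), "foxmedia"),
    ("user5", 90, date(2022, 9, 9), "branding"),
]


def average_as_of(rows, source, day):
    """Average of each user's latest score on or before `day` for one source."""
    latest = {}  # user id -> (submission_date, score)
    for user, score, submitted, src in rows:
        # Skip other sources and submissions made after the reporting day.
        if src != source or submitted > day:
            continue
        # Keep only the most recent submission per user.
        if user not in latest or submitted > latest[user][0]:
            latest[user] = (submitted, score)
    if not latest:
        return None  # nobody from this source has answered yet
    return sum(score for _, score in latest.values()) / len(latest)


print(average_as_of(ROWS, "foxmedia", date(2022, 9, 16)))  # 40.25
print(average_as_of(ROWS, "branding", date(2022, 9, 16)))  # 94.0
```

For foxmedia on 2022-09-16 this averages 30, 29, 44 and 58 (user1 to user4), and for branding it uses only user5's 94.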
Any help would be appreciated.
Thanks!
We start by building some sample data. Then we cross join this with every date in the dataset (all_dates). Next we group the data by all_dates and user id. To obtain the last entry of each user, an array of the scores is built. The if condition is needed so that score data from the future is not used.
with tbl as
(
Select "user1" as id, 30 as score, date("2022-09-16") as submission_date, "foxmedia" as S,
union all Select "user2", 29 ,date("2022-09-16"), "foxmedia",
union all Select "user3", 44 ,date("2022-09-14"), "foxmedia",
union all Select "user4", 58 ,date("2022-09-13"), "foxmedia",
union all Select "user5", 94 ,date("2022-09-13"), "branding",
union all Select "user2", 25 ,date("2022-09-11"), "foxmedia",
union all Select "user1", 21 ,date("2022-09-11"), "foxmedia",
union all Select "user4", 50 ,date("2022-09-10"), "foxmedia",
union all Select "user2", 23 ,date("2022-09-10"), "foxmedia",
union all Select "user1", 22 ,date("2022-09-10"), "foxmedia",
union all Select "user5", 90 ,date("2022-09-09"), "branding"
)
select
S,
all_dates,
id,
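-- date of the user's latest submission on or before all_dates
array_agg(if(submission_date<=all_dates,submission_date,null) ignore nulls order by submission_date desc limit 1)[safe_offset(0)] as last_date,
-- score of that submission; null if the user has not submitted anything by all_dates
array_agg(if(submission_date<=all_dates,score,null) ignore nulls order by submission_date desc limit 1)[safe_offset(0)] as last_score,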
from tbl
cross join (select distinct submission_date as all_dates from tbl)
group by 1,2,3
order by 1 desc,2 desc,3
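The query above stops at one row per source, date and user. The report asked for in the question still needs an average over those rows. The sketch below shows that last step end to end, under some loud assumptions: it runs against SQLite through Python's `sqlite3` module as a local stand-in for BigQuery, it stores dates as ISO strings, and it replaces the `array_agg(... limit 1)` trick with a correlated subquery because SQLite has no `array_agg`. The names `QUERY` and `daily_averages` are made up for illustration.

```python
import sqlite3

# Sample rows from the question: (id, score, submission_date, source).
ROWS = [
    ("user1", 30, "2022-09-16", "foxmedia"),
    ("user2", 29, "2022-09-16", "foxmedia"),
    ("user3", 44, "2022-09-14", "foxmedia"),
    ("user4", 58, "2022-09-13", "foxmedia"),
    ("user5", 94, "2022-09-13", "branding"),
    ("user2", 25, "2022-09-11", "foxmedia"),
    ("user1", 21, "2022-09-11", "foxmedia"),
    ("user4", 50, "2022-09-10", "foxmedia"),
    ("user2", 23, "2022-09-10", "foxmedia"),
    ("user1", 22, "2022-09-10", "foxmedia"),
    ("user5", 90, "2022-09-09", "branding"),
]

# SQLite dialect (an assumption; the answer itself targets BigQuery).
QUERY = """
WITH days AS (SELECT DISTINCT submission_date AS all_dates FROM quiz),
     users AS (SELECT DISTINCT source, id FROM quiz),
     latest AS (
       -- one row per source, date and user with that user's latest score so far
       SELECT u.source, d.all_dates, u.id,
              (SELECT q.score
                 FROM quiz q
                WHERE q.source = u.source
                  AND q.id = u.id
                  AND q.submission_date <= d.all_dates
                ORDER BY q.submission_date DESC
                LIMIT 1) AS last_score
         FROM users u CROSS JOIN days d
     )
SELECT source, all_dates, AVG(last_score) AS avg_score
  FROM latest
 GROUP BY source, all_dates
HAVING COUNT(last_score) > 0   -- drop days before a source has any answers
 ORDER BY source, all_dates
"""


def daily_averages(rows):
    """Load rows into an in-memory table and return (source, date, average) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quiz (id TEXT, score INTEGER, submission_date TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO quiz VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for source, day, avg_score in daily_averages(ROWS):
    print(source, day, round(avg_score, 2))
```

In BigQuery the equivalent final step would presumably be to wrap the answer's query in a subquery and select `avg(last_score)` grouped by `S` and `all_dates`; `avg` skips the null scores of users who have not answered yet.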