SQL to UNION ALL n times, depending on the rows of a table

Question

I have a table of quiz entries submitted by users. Each row has a date, an id, a source, and a score.

I want to create a report, per source, that calculates the average score for each day across all users. Users who did not respond on a given day should still be counted, using their latest earlier response.

Right now I have one query that pulls the quiz responses for a given day, source and user. If a user submitted multiple quizzes on the same day, only the latest one is retrieved:

Query

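      select
        id,
        score,
        submission_date,
        source
      from
        quiz
      where
        id != ""
      -- keep only the latest submission per source, day and user
      -- (assumes submission_date is a timestamp that DATE() truncates to the day)
      qualify ROW_NUMBER() OVER (PARTITION BY source, DATE(submission_date), id ORDER BY submission_date DESC) = 1;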

Now, for each row returned by the query above, I want to UNION ALL the result of a similar query, but one where the submission_date, instead of being the latest, is <= the submission_date of the row it is being 'unioned' with.

In other words, the query above could return 'n' rows, and each row from table A should be followed, via UNION ALL, by the rows of a query based on that row's submission date, 'n' times in total:

Table A:
row 1
row 2
row 3
...
row n

Final table:
row 1
(multiple rows from UNION ALL)
row 2
(multiple rows from UNION ALL)
row 3
(multiple rows from UNION ALL)
...
row n
(multiple rows from UNION ALL)

I don't know if UNION ALL is the solution here, but this is what I could think of.

Edit

This is the output of the query above, which reads from a table named quiz:

id	score	submission_date	source
user1	30	2022-09-16	foxmedia
user2	29	2022-09-16	foxmedia
user3	44	2022-09-14	foxmedia
user4	58	2022-09-13	foxmedia
user5	94	2022-09-13	branding
user2	25	2022-09-11	foxmedia
user1	21	2022-09-11	foxmedia
user4	50	2022-09-10	foxmedia
user2	23	2022-09-10	foxmedia
user1	22	2022-09-10	foxmedia
user5	90	2022-09-09	branding

As you can see, on 2022-09-16 user1 and user2 submitted responses, but the average for this day should also consider the last response from every other foxmedia user as if they had answered the quiz on that same day. That means the average calculation should include the rows from user3 on 2022-09-14 and user4 on 2022-09-13. user5 should not count toward this average because their source is branding, not foxmedia. branding should have its own separate average, which in this case would include only data from user5.

Any help would be appreciated.

Thanks!

Answer 1

You have scores for each user, but only for some days.
For each day you want to have an entry for every user.
The score for each user is that day's score. If a user has no score on a given day, take their most recent earlier score.
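
As a quick sanity check of this rule against the sample data in the question, here is a minimal sketch in BigQuery syntax (the alias expected_avg is only illustrative). It hard-codes the four foxmedia scores that should count on 2022-09-16 and averages them:

```sql
-- foxmedia on 2022-09-16:
--   user1 and user2 answered that day (30, 29);
--   user3 and user4 carry forward their latest earlier scores (44, 58).
select avg(score) as expected_avg   -- (30 + 29 + 44 + 58) / 4 = 40.25
from unnest([30, 29, 44, 58]) as score;
```

Any full solution should reproduce 40.25 for foxmedia on that day, and 94 for branding, where user5 is the only user.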

We start by building some sample data. Then we cross join it with every distinct date in the dataset (all_dates). Next we group the data by source, all_dates and user id. To obtain the latest entry for each user, an array of the scores is built, ordered by submission date descending and limited to one element. The if condition is needed so that scores from the future (later than all_dates) are not used.

with tbl as
(
-- sample data taken from the question (S is the source column)
select "user1" as id, 30 as score, date("2022-09-16") as submission_date, "foxmedia" as S
 union all select "user2", 29, date("2022-09-16"), "foxmedia"
 union all select "user3", 44, date("2022-09-14"), "foxmedia"
 union all select "user4", 58, date("2022-09-13"), "foxmedia"
 union all select "user5", 94, date("2022-09-13"), "branding"
 union all select "user2", 25, date("2022-09-11"), "foxmedia"
 union all select "user1", 21, date("2022-09-11"), "foxmedia"
 union all select "user4", 50, date("2022-09-10"), "foxmedia"
 union all select "user2", 23, date("2022-09-10"), "foxmedia"
 union all select "user1", 22, date("2022-09-10"), "foxmedia"
 union all select "user5", 90, date("2022-09-09"), "branding"
)
select
  S,
  all_dates,
  id,
  -- date of the user's latest submission on or before all_dates
  array_agg(if(submission_date <= all_dates, submission_date, null) ignore nulls order by submission_date desc limit 1)[safe_offset(0)] as last_date,
  -- score of that submission; the if() keeps scores from the future out
  array_agg(if(submission_date <= all_dates, score, null) ignore nulls order by submission_date desc limit 1)[safe_offset(0)] as last_score
from tbl
-- one row per (submission, report date) pair
cross join (select distinct submission_date as all_dates from tbl)
group by 1, 2, 3
order by 1 desc, 2 desc, 3
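
The query above returns one row per source, date and user. To get the report the question asks for, that result still has to be averaged per source and day. The following is a minimal sketch of that last step in BigQuery syntax, not a definitive implementation. The names carried, report_date, users_counted and avg_score are illustrative, and the sample rows are copied from the question.

```sql
with tbl as (
  -- sample rows from the question
  select * from unnest([
    struct("user1" as id, 30 as score, date "2022-09-16" as submission_date, "foxmedia" as source),
    ("user2", 29, date "2022-09-16", "foxmedia"),
    ("user3", 44, date "2022-09-14", "foxmedia"),
    ("user4", 58, date "2022-09-13", "foxmedia"),
    ("user5", 94, date "2022-09-13", "branding"),
    ("user2", 25, date "2022-09-11", "foxmedia"),
    ("user1", 21, date "2022-09-11", "foxmedia"),
    ("user4", 50, date "2022-09-10", "foxmedia"),
    ("user2", 23, date "2022-09-10", "foxmedia"),
    ("user1", 22, date "2022-09-10", "foxmedia"),
    ("user5", 90, date "2022-09-09", "branding")
  ])
),
carried as (
  -- for every (source, report date, user): the latest score on or before that date
  select
    source,
    all_dates as report_date,
    id,
    array_agg(if(submission_date <= all_dates, score, null) ignore nulls
              order by submission_date desc limit 1)[safe_offset(0)] as last_score
  from tbl
  cross join (select distinct submission_date as all_dates from tbl)
  group by 1, 2, 3
)
select
  source,
  report_date,
  count(*) as users_counted,
  avg(last_score) as avg_score
from carried
where last_score is not null   -- skip users with no submission yet on that date
group by 1, 2
order by 1, 2 desc;
```

With the sample data, foxmedia on 2022-09-16 averages the scores 30, 29, 44 and 58, giving 40.25, and branding on the same day contains only user5's 94. Because the date list comes from the whole table, each source gets a row for every date on which any source had a submission, provided at least one of its users has a score by then. If a source should only be reported on its own submission dates, the cross join would need to be restricted per source.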

Solution 1
1 accepted 2022-09-16 08:37:50