SQL/Postgres - collapse every N rows into 1 based on row position in group
I have a set of ordered results from a Postgres table, where every group of 4 rows represents a set of related data. I want to process this set of results further, so that every group of 4 rows is collapsed into 1 row with aliased column names, where the value for each column is based on that row's position in the group. I'm close, but I can't quite get the query right (nor am I confident that I'm approaching this in the optimal manner).
Here's the scenario:
I am collecting survey results - each survey has 4 questions, but each answer is stored in a separate row in the database. However, they are associated with each other by a submission event_id, and the results are guaranteed to be returned in a fixed order. A set of survey_results will look something like:
event_id | answer
----------------------------
a | 10
a | foo
a | 9
a | bar
b | 2
b | baz
b | 4
b | zip
What I would like to be able to do is query this result so that the final output comes out with each set of 4 results on their own line, with aliased column names.
event_id | score_1 | reason_1 | score_2 | reason_2
----------------------------------------------------------
a | 10 | foo | 9 | bar
b | 2 | baz | 4 | zip
The closest that I've been able to get is:
SELECT survey_answers.event_id,
(SELECT survey_answers.answer FROM survey_answers FETCH NEXT 1 ROWS ONLY) AS score_1,
(SELECT survey_answers.answer FROM survey_answers OFFSET 1 ROWS FETCH NEXT 1 ROWS ONLY) AS reason_1,
(SELECT survey_answers.answer FROM survey_answers OFFSET 2 ROWS FETCH NEXT 1 ROWS ONLY) AS score_2,
(SELECT survey_answers.answer FROM survey_answers OFFSET 3 ROWS FETCH NEXT 1 ROWS ONLY) AS reason_2
FROM survey_answers
GROUP BY survey_answers.event_id
But this, understandably, returns the correct number of rows, but with the same values (other than event_id):
event_id | score_1 | reason_1 | score_2 | reason_2
----------------------------------------------------------
a | 10 | foo | 9 | bar
b | 10 | foo | 9 | bar
How can I structure my query so that it applies the OFFSET / FETCH behaviors to every batch of 4 rows, or, maybe more accurately, within every unique set of event_ids?
demo: db<>fiddle
First of all, this looks like a very bad design:
There is no guaranteed order! Without an ORDER BY, the database is free to store rows and return them in any order. You really need an order column. In this small case it might work by accident.
You should generate two columns, one for score, one for reason. Mixing types in one column is not a good idea.
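As a rough sketch of the two-column idea, a table along these lines would make both the order and the types explicit. The table name survey_results_typed and the columns question_no, score and reason are assumptions made up for illustration; they are not part of the original question.

```sql
-- Hypothetical redesign: one row per question, typed columns, explicit position.
CREATE TABLE survey_results_typed (
    event_id    text     NOT NULL,
    question_no smallint NOT NULL,  -- assumed name: position within a submission
    score       integer,            -- numeric part of the answer
    reason      text,               -- free-text part of the answer
    PRIMARY KEY (event_id, question_no)
);

INSERT INTO survey_results_typed (event_id, question_no, score, reason) VALUES
    ('a', 1, 10, 'foo'),
    ('a', 2, 9,  'bar'),
    ('b', 1, 2,  'baz'),
    ('b', 2, 4,  'zip');

-- The pivot no longer depends on physical row order or on mixed types.
SELECT event_id,
       max(score)  FILTER (WHERE question_no = 1) AS score_1,
       max(reason) FILTER (WHERE question_no = 1) AS reason_1,
       max(score)  FILTER (WHERE question_no = 2) AS score_2,
       max(reason) FILTER (WHERE question_no = 2) AS reason_2
FROM survey_results_typed
GROUP BY event_id
ORDER BY event_id;
```

With this layout the score columns come back as integers rather than text, so no casting is needed downstream.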
Nevertheless, for this simple and short example this could be a solution (remember, this is not recommended for production tables):
WITH data AS (
SELECT
*,
row_number() OVER (PARTITION BY event_id) -- 1
FROM
survey_results
)
SELECT
event_id,
MAX(CASE WHEN row_number = 1 THEN answer END) AS score_1, -- 2
MAX(CASE WHEN row_number = 2 THEN answer END) AS reason_1,
MAX(CASE WHEN row_number = 3 THEN answer END) AS score_2,
MAX(CASE WHEN row_number = 4 THEN answer END) AS reason_2
FROM
data
GROUP BY event_id
1. The row_number() window function adds a row count within each event_id, in this case from 1 to 4. This can be used to identify the types of answer (see intermediate step in fiddle). In production code you should use some order column to ensure the order: PARTITION BY event_id ORDER BY order_column
2. A simple pivot on event_id and the type id (row_number), which does exactly what you expect.
You need a column that specifies the ordering. In your case, that should probably be a serial column, which is guaranteed to be increasing for each insert. I would call such a column survey_result_id.
With such a column, you can do:
select event_id,
max(case when seqnum = 1 then answer end) as score_1,
max(case when seqnum = 2 then answer end) as reason_1,
max(case when seqnum = 3 then answer end) as score_2,
max(case when seqnum = 4 then answer end) as reason_2
from (select sr.*,
row_number() over (partition by event_id order by survey_result_id) as seqnum
from survey_results sr
) sr
group by event_id;
Without such a column, you cannot reliably do what you want, because SQL tables represent unordered sets.
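To try the ordered approach end to end, here is a minimal, self-contained sketch. It assumes the table from the question with an added survey_result_id serial column and the sample rows inserted in submission order. The last query uses the aggregate FILTER clause (PostgreSQL 9.4 and later) as an alternative spelling of the CASE pivot; it is a variant, not the query from the answers.

```sql
-- Sample table; survey_result_id is the assumed ordering column.
CREATE TABLE survey_results (
    survey_result_id serial PRIMARY KEY,
    event_id         text NOT NULL,
    answer           text NOT NULL
);

-- Rows are listed in the order they were submitted.
INSERT INTO survey_results (event_id, answer) VALUES
    ('a', '10'), ('a', 'foo'), ('a', '9'), ('a', 'bar'),
    ('b', '2'),  ('b', 'baz'), ('b', '4'), ('b', 'zip');

-- Intermediate step: each row gets its position (1..4) within its event_id.
SELECT sr.*,
       row_number() OVER (PARTITION BY event_id
                          ORDER BY survey_result_id) AS seqnum
FROM survey_results sr
ORDER BY survey_result_id;

-- Pivot: one row per event_id, one column per position.
SELECT event_id,
       max(answer) FILTER (WHERE seqnum = 1) AS score_1,
       max(answer) FILTER (WHERE seqnum = 2) AS reason_1,
       max(answer) FILTER (WHERE seqnum = 3) AS score_2,
       max(answer) FILTER (WHERE seqnum = 4) AS reason_2
FROM (SELECT sr.*,
             row_number() OVER (PARTITION BY event_id
                                ORDER BY survey_result_id) AS seqnum
      FROM survey_results sr) sr
GROUP BY event_id
ORDER BY event_id;
```

On the sample data the pivot returns the two rows asked for in the question: a with 10, foo, 9, bar and b with 2, baz, 4, zip. All four output columns are text, because answer is a text column; cast the score columns if you need numbers.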