
How to write this query efficiently in PostgreSQL: aggregating consecutive rows into arrays (identified by a pair of columns)?

I have a query like this:

(
    SELECT
        t1.person_id,
        t1.created_at,
        't1' AS type,
        t1.extra_data AS extra_data
    FROM table1 AS t1
)
UNION
(
    SELECT
        t2.person_id,
        t2.created_at,
        't2' AS type,
        t2.extra_data AS extra_data
    FROM table2 AS t2
)
UNION
(
    SELECT
        t3.person_id,
        t3.created_at,
        't3' AS type,
        t3.extra_data AS extra_data
    FROM table3 AS t3
)
ORDER BY created_at DESC;

This produces a result like the following (created_at is a timestamp; I have omitted the actual values and used simple integers to show the order):

person_id  | type  |  created_at | extra_data
---------  | ----  |  ---------- | ----------
1          | t1    |  9          | a
1          | t1    |  8          | b
2          | t2    |  7          | c
2          | t2    |  6          | c
2          | t2    |  5          | d
1          | t3    |  4          | e
3          | t3    |  3          | f

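I want to group consecutive (person_id, type) pairs, take the largest created_at as the resulting created_at, and aggregate extra_data into an array. That is, I want the following result: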

person_id  | type  |  created_at | extra_data_array
---------  | ----  |  ---------- | ----------------
1          | t1    |  9          | [a, b]
2          | t2    |  7          | [c, c, d]
1          | t3    |  4          | [e]
3          | t3    |  3          | [f]

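I have tried window functions but could not figure out how to do this.

My questions are:

1) How can I write a single query that achieves this?

2) Can the query be made fast with indexes?

My concern with the second question is that, because the base result comes from a UNION query, I doubt there is any chance of using an index.

Any help is appreciated!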

The two aggregate functions you need are MAX and array_agg.

I think indexes are more likely to be used if you GROUP before the UNION, so I would do it like this:

(
    SELECT
        t1.person_id,
        MAX(t1.created_at) AS created_at,
        't1' AS type,
        array_agg(t1.extra_data) AS extra_data
    FROM table1 AS t1
    GROUP BY t1.person_id
)
UNION
(
    SELECT
        t2.person_id,
        MAX(t2.created_at) AS created_at,
        't2' AS type,
        array_agg(t2.extra_data) AS extra_data
    FROM table2 AS t2
    GROUP BY t2.person_id
)
UNION
(
    SELECT
        t3.person_id,
        MAX(t3.created_at) AS created_at,
        't3' AS type,
        array_agg(t3.extra_data) AS extra_data
    FROM table3 AS t3
    GROUP BY t3.person_id
)
ORDER BY created_at DESC;

However, you can also UNION first and then GROUP the result, as long as you GROUP BY person_id, type.
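A minimal sketch of that UNION-then-GROUP variant, using the table and column names from the question. The subquery alias combined is made up for illustration, and UNION ALL is swapped in for UNION on the assumption that identical rows should be kept rather than deduplicated before aggregation:

```sql
SELECT
    person_id,
    type,
    MAX(created_at)       AS created_at,        -- newest timestamp in the group
    array_agg(extra_data) AS extra_data_array   -- element order is not guaranteed here
FROM (
    -- "combined" is a hypothetical alias; UNION ALL keeps duplicate rows
    SELECT person_id, created_at, 't1' AS type, extra_data FROM table1
    UNION ALL
    SELECT person_id, created_at, 't2' AS type, extra_data FROM table2
    UNION ALL
    SELECT person_id, created_at, 't3' AS type, extra_data FROM table3
) AS combined
GROUP BY person_id, type
ORDER BY created_at DESC;
```

Since type is a constant within each table, this should return the same groups as grouping each table by person_id separately.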

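A few more notes:

  • This will still do a full table scan on every table, because you need created_at and extra_data from every row. It is like the interview question, "What is the time complexity of printing the nodes of a binary tree?"

  • If you want the arrays sorted, you can use array_agg(t1.extra_data ORDER BY t1.created_at) or some other ordering. This is where an index might help you.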
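Note that grouping by person_id and type merges every row of a pair, whether or not the rows are consecutive. On the sample data that matches the desired result only because each pair forms a single run. If the same pair can reappear later and should start a new group, as the question asks, one common approach is the "gaps and islands" technique based on the difference of two row numbers, combined with the ordered array_agg mentioned above. The following is a sketch, not a tuned solution. The CTE names all_rows and numbered and the column run_id are made up for illustration, and it assumes created_at values are distinct across the three tables (otherwise a tiebreaker column is needed in the ORDER BY clauses):

```sql
WITH all_rows AS (
    SELECT person_id, created_at, 't1' AS type, extra_data FROM table1
    UNION ALL
    SELECT person_id, created_at, 't2' AS type, extra_data FROM table2
    UNION ALL
    SELECT person_id, created_at, 't3' AS type, extra_data FROM table3
),
numbered AS (
    SELECT
        *,
        -- Position in the overall ordering minus position within the
        -- (person_id, type) pair: constant inside one consecutive run,
        -- and it changes when the pair is interrupted by other rows.
        row_number() OVER (ORDER BY created_at DESC)
          - row_number() OVER (PARTITION BY person_id, type
                               ORDER BY created_at DESC) AS run_id
    FROM all_rows
)
SELECT
    person_id,
    type,
    MAX(created_at) AS created_at,
    array_agg(extra_data ORDER BY created_at DESC) AS extra_data_array
FROM numbered
GROUP BY person_id, type, run_id
ORDER BY created_at DESC;
```

On the sample rows from the question, the run_id values come out as 0, 0, 2, 2, 2, 5, 6, which gives the four groups in the desired result. The window functions have to sort the combined rows, so this does not avoid reading every row either.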
