
SELECT max value of each group by column

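I have read several StackOverflow questions similar to mine, but I could not find one that matches my problem exactly. I have read: "Select max value of each group including other columns", "Select max value of each group", and "Select max value of each group".

So here is my problem.

I have a table that looks like this: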

+---------+---------------+-----------------------+
|column_1 |   column_2    |      column_3         | 
+---------+---------------+-----------------------+
|    A    |      200      | 1618558797853684118   |     
|    A    |      198.7    | 1618558797854783205   | 
|    A    |      201.3    | 1618558797855282263   |    
|    B    |      350.5    | 1618558775580928115   |  
|    B    |      349.9    | 1618558775581128138   |  
|    B    |      350.1    | 1618558775580856107   |
|    C    |      532      | 1618558797852667035   |
|    C    |      531      | 1618558775580345051   |
|    A    |      300      | 1618558797855492289   |
|    A    |      302      | 1618558797852512023   |   
|   ...   |  ........     |        ...            | 
+---------+---------------+-----------------------+

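As you can see, the consecutive rows that share a letter in column_1 (about three per run) have almost the same column_2 value. I need one row from each such run, but only within a consecutive sequence: the same letter appearing again later counts as a new run. To make this clearer, here is the desired output: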

Desired output
+---------+---------------------------------------------------------------+-------------------------+
|column_1 |                         column_2                              |      column_3           | 
+---------+---------------------------------------------------------------+-------------------------+
|    A    | it can be (200 or 198.7 or 201.3) does not matter which one   | (depends on column_2)   |     
|    B    | it can be (350.5 or 349.9 or 350.1) does not matter which one | (depends on column_2)   | 
|    C    | it can be (532 or 531) does not matter which one              | (depends on column_2)   |    
|    A    | it can be (300 or 302) does not matter which one              | (depends on column_2)   |     
|   ...   |                        ........                               |          ...            | 
+---------+---------------------------------------------------------------+-------------------------+

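So my idea was to group by each column and take the max or min of column_3 (it does not matter which), but I could not get that to work.

I am sorry for the complicated question, but could you help me? Thanks.

Consider below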

#standardSQL
with `project.dataset.table` as (
  select 1 id, 'A' column_1, 200 column_2, 1618558797853684118 column_3 union all
  select 2, 'A', 198.7, 1618558797854783205 union all
  select 3, 'A', 201.3, 1618558797855282263 union all
  select 4, 'B', 350.5, 1618558775580928115 union all
  select 5, 'B', 349.9, 1618558775581128138 union all
  select 6, 'B', 350.1, 1618558775580856107 union all
  select 7, 'C', 532, 1618558797852667035 union all
  select 8, 'C', 531, 1618558775580345051 union all
  select 9, 'A', 300, 1618558797855492289 union all
  select 10, 'A', 302, 1618558797852512023 union all
  select 12, 'C', 709, 1618558797852562325 union all
  select 13, 'C', 803, 1618558797851315651
)
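-- For each run (column_1, grp), keep one row: here the one with the smallest column_2
select as value array_agg(struct(column_1, column_2, column_3) order by column_2 limit 1)[offset(0)]
from (
  -- grp: running count of run starts, so each consecutive run gets its own number
  select *, countif(flag) over(order by id) grp
  from (
    -- flag: true when column_1 differs from the previous row (ordered by id)
    select *, column_1 != lag(column_1) over(order by id) flag
    from `project.dataset.table`
  )
)
group by column_1, grp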

with output

[image: query output]
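To see how the two helper columns in the query above label each run, it may help to look at them directly. The following is only a sketch on a reduced sample: the CTE name demo is hypothetical, only id and column_1 are reproduced, and id is assumed to define the row order, as in that answer.

```sql
#standardSQL
-- Sketch: expose the flag and grp helper columns on a reduced sample.
-- `demo` is a hypothetical name; id is assumed to define the row order.
with demo as (
  select 1 id, 'A' column_1 union all
  select 2, 'A' union all
  select 3, 'A' union all
  select 4, 'B' union all
  select 5, 'B' union all
  select 9, 'A' union all
  select 10, 'A'
)
select id, column_1, flag,
       countif(flag) over(order by id) as grp  -- running count of run starts
from (
  -- flag is NULL on the first row (no previous row), true where a new run starts
  select *, column_1 != lag(column_1) over(order by id) as flag
  from demo
)
order by id
-- grp by id: 0, 0, 0, 1, 1, 2, 2
```

Because the second run of A gets grp 2 while the first gets grp 0, grouping by column_1 and grp keeps the two A runs apart, which a plain group by column_1 would not.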

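You seem to have a form of gaps-and-islands problem. You want one row for each run of adjacent rows that share the same column_1 value.

I would suggest lag() (to keep the first row of each group) or lead() (to keep the last row):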

select t.*
from (select t.*,
             lag(column_1) over (order by column_3) as prev_column_1
      from t
     ) t
where prev_column_1 is null or prev_column_1 <> column_1;
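The lead() variant mentioned above is not spelled out, so here is a minimal sketch of it. It assumes, like the lag() query, that the table is named t and that column_3 defines the row order; if another column (such as an id) carries the intended order, it would go in the order by instead. The alias next_column_1 is introduced here for illustration.

```sql
-- Sketch: keep the LAST row of each run of identical column_1 values.
-- Assumes table t and that column_3 defines the row order.
select t.*
from (select t.*,
             lead(column_1) over (order by column_3) as next_column_1
      from t
     ) t
-- a row ends a run when there is no next row or the next row has a different letter
where next_column_1 is null or next_column_1 <> column_1;
```

Both variants return exactly one row per run; they differ only in whether that row is the first or the last one in the chosen order.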

