Generate the same UUID for a group of columns in SQL

Question

Can someone suggest a way to write a SQL query in Amazon Athena that generates a UUID which is the same for every row sharing the same values in a group of columns, and different between groups?

For example, I have a table like this, and I want to create one UUID per combination of column1, column2, and column3.

column1 | column2 | column3 | column4
2016    | 101     | 1       | 25
2016    | 101     | 1       | 59
2017    | 105     | 2       | 57
2017    | 105     | 2       | 78

Output 1 must look like this:

ID      | column1 | column2 | column3 | column4
UUID-1  | 2016    | 101     | 1       | 25
UUID-1  | 2016    | 101     | 1       | 59
UUID-2  | 2017    | 105     | 2       | 57
UUID-2  | 2017    | 105     | 2       | 78

Output 2 should look like this:

ID      | count | column1 | column2 | column3
UUID-1  | 2     | 2016    | 101     | 1
UUID-2  | 2     | 2017    | 105     | 2

I understand that output 2 can be produced by grouping output 1. Can someone suggest how I can generate output 1?

Thanks.

Answer 1

You can try the uuid() function together with GROUP BY. Because uuid() is evaluated once per grouped row, this produces output 2:

SELECT uuid() id,
       COUNT(*),
       column1 ,
       column2 ,
       column3
FROM T
GROUP BY column1 ,
       column2 ,
       column3

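EDIT

I saw that you edited your question. To get output 1, you can generate one UUID per group in a subquery and join it back to the table:

SELECT t1.*, t2.column4
FROM (
    -- GROUP BY (not DISTINCT) so that uuid() is evaluated once per group;
    -- DISTINCT would keep every row, since each uuid() call returns a new value
    SELECT uuid() id,
           column1,
           column2,
           column3
    FROM T
    GROUP BY column1,
             column2,
             column3
) t1 INNER JOIN T t2
ON t1.column1 = t2.column1
AND t1.column2 = t2.column2
AND t1.column3 = t2.column3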

Another way is to generate a UUID for every row and then use the max window function to keep only one UUID per combination of column1, column2, and column3.

select max(id) over (partition by column1,column2,column3) as id, 
       column1,
       column2,
       column3,
       column4
from 
(
    SELECT uuid() id,*
    FROM T
) t1
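
If the ID only needs to be identical within a group rather than a true random UUID, a deterministic hash of the grouping columns may be another option. The query below is a minimal sketch and not part of the original answer: it assumes the Presto/Athena functions md5, to_utf8, and to_hex, a '|' separator that never appears in the data, and grouping columns that are never NULL (a NULL would make the whole ID NULL). The inline VALUES rows only reproduce the sample table from the question.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
WITH T AS (
    SELECT *
    FROM (
        VALUES
            (2016, 101, 1, 25),
            (2016, 101, 1, 59),
            (2017, 105, 2, 57),
            (2017, 105, 2, 78)
    ) AS v (column1, column2, column3, column4)
)
-- Hash the concatenated grouping columns; equal inputs give equal IDs.
SELECT to_hex(md5(to_utf8(
           CAST(column1 AS varchar) || '|' ||
           CAST(column2 AS varchar) || '|' ||
           CAST(column3 AS varchar)
       ))) AS id,
       column1,
       column2,
       column3,
       column4
FROM T
```

Unlike uuid(), this returns the same ID on every run, and the value is a 32-character hex MD5 digest rather than a UUID in the standard format, so it only fits if a stable group key is acceptable.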
