Generate a UUID that is the same for a group of columns in SQL
Can someone please suggest a way to write a SQL query in Amazon Athena that generates a UUID which is unique per group but identical for every row sharing the same values in a group of columns?
For example, I have a table like this, and I want to create a UUID for each combination of column1, column2 and column3.
column1 | column2 | column3 | column4
2016 | 101 | 1 | 25
2016 | 101 | 1 | 59
2017 | 105 | 2 | 57
2017 | 105 | 2 | 78
Output 1 must look like:
ID | column1 | column2 | column3 | column4
UUID-1 | 2016 | 101 | 1 | 25
UUID-1 | 2016 | 101 | 1 | 59
UUID-2 | 2017 | 105 | 2 | 57
UUID-2 | 2017 | 105 | 2 | 78
Output 2 should look like:
ID | count | column1 | column2 | column3
UUID-1 | 2 | 2016 | 101 | 1
UUID-2 | 2 | 2017 | 105 | 2
I understand that Output 2 can be produced by grouping Output 1. Can someone suggest how I can generate Output 1?
Thanks.
You can try the uuid() function. Grouping by the three columns gives one UUID per group, which is Output 2:
SELECT uuid() id,
COUNT(*),
column1 ,
column2 ,
column3
FROM T
GROUP BY column1 ,
column2 ,
column3
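EDIT
I saw that you edited your question. To get Output 1, you can generate one UUID per group in a subquery and join it back to the table. The subquery should use GROUP BY rather than SELECT DISTINCT: uuid() returns a different value for every row, so DISTINCT would not collapse the groups and the join would return duplicate rows.
SELECT t1.*, t2.column4
FROM (
    -- one output row, and therefore one uuid() value, per group
    SELECT uuid() AS id,
           column1,
           column2,
           column3
    FROM T
    GROUP BY column1,
             column2,
             column3
) t1 INNER JOIN T t2
    ON t1.column1 = t2.column1
   AND t1.column2 = t2.column2
   AND t1.column3 = t2.column3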
Another way is to give every row its own UUID and then use the max window function to keep only one of them per combination of column1, column2 and column3.
select max(id) over (partition by column1,column2,column3) as id,
column1,
column2,
column3,
column4
from
(
SELECT uuid() id,*
FROM T
) t1
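To try the window-function approach without an existing table, here is a minimal sketch that inlines the sample rows from the question and then groups Output 1 into Output 2. The CTE name output1 and the alias row_count are made up for illustration, and the cast to varchar is an assumption added so that max() compares plain strings regardless of how the engine orders the UUID type.

```sql
-- Sample rows from the question, inlined so the query is self-contained.
WITH T AS (
    SELECT *
    FROM (
        VALUES
            (2016, 101, 1, 25),
            (2016, 101, 1, 59),
            (2017, 105, 2, 57),
            (2017, 105, 2, 78)
    ) AS v (column1, column2, column3, column4)
),
-- Output 1: every row tagged with a single id shared by its group.
output1 AS (
    SELECT max(id) OVER (PARTITION BY column1, column2, column3) AS id,
           column1,
           column2,
           column3,
           column4
    FROM (
        -- Assumption: cast to varchar so max() works on plain strings.
        SELECT CAST(uuid() AS varchar) AS id, *
        FROM T
    ) t1
)
-- Output 2: collapse Output 1 to one row per id.
SELECT id,
       COUNT(*) AS row_count,
       column1,
       column2,
       column3
FROM output1
GROUP BY id, column1, column2, column3
```

With the four sample rows this should return two rows, each with a row_count of 2. The ids are random, so they change on every run; if you need them to stay stable, store Output 1 in a table instead of recomputing it.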