SQL: how to get records with a unique column value and sum the values in another column
I have a table called file:
id integer primary key,
created_on timestamp,
updated_on timestamp,
file_name text not null,
path text not null unique,
hash text not null,
size bigint not null,
size_mb bigint not null
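I want to get all records with a unique hash value (i.e. a single instance of each duplicated file), and then sum the values in the size column to get the total number of bytes of disk space I would need to back up a single copy of each file.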
This returns only the rows whose hash is unique, i.e. hashes that have no duplicates:
select *,
-- group sum of all files
sum(size) over ()
from
(
select *,
-- rows per hash
count(*) over (partition by hash) as cnt
from file
) as dt
where cnt = 1
Edit: This returns exactly one row per hash:
select *,
-- group sum of all files
sum(size) over ()
from
(
select *,
-- unique number per hash
row_number() over (partition by hash order by hash) as rn
from file
) as dt
where rn = 1
Both of the queries above are standard SQL, but PostgreSQL also supports a proprietary syntax:
select *,
-- group sum of all files
sum(size) over ()
from
(
select DISTINCT ON (hash) *
from file
order by hash
) as dt
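To make the difference between the cnt = 1 and rn = 1 filters concrete, here is a minimal sketch on made-up data. The file_demo table, its three rows, and the total_size alias are assumptions for illustration only and are not part of the original schema. The sketch also orders by id inside the partition so that the row kept for each hash is deterministic, whereas ordering by hash leaves that choice arbitrary.

```sql
-- Hypothetical sample data: a cut-down stand-in for the file table.
create temporary table file_demo (
    id   integer primary key,
    path text not null unique,
    hash text not null,
    size bigint not null
);

insert into file_demo (id, path, hash, size) values
    (1, '/a/one.txt', 'h1', 100),
    (2, '/b/one.txt', 'h1', 100),  -- same content as id 1
    (3, '/a/two.txt', 'h2', 250);

-- cnt = 1 keeps only hashes that occur once:
-- returns id 3 alone, total_size = 250.
select id, hash, size, sum(size) over () as total_size
from (
    select *, count(*) over (partition by hash) as cnt
    from file_demo
) as dt
where cnt = 1;

-- rn = 1 keeps one row per hash:
-- returns ids 1 and 3, total_size = 350.
select id, hash, size, sum(size) over () as total_size
from (
    select *, row_number() over (partition by hash order by id) as rn
    from file_demo
) as dt
where rn = 1;
```

For the backup-size question, the second form is the one that counts each duplicated file once, so 350 is the figure you would want here.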