SQL: how to get records with a unique column value and sum the values in another column
I have a table called file:
id integer primary key,
created_on timestamp
updated_on timestamp
file_name text not null
path text not null unique
hash text not null
size bigint not null
size_mb bigint not null
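I want to get all records with a unique hash value (that is, a single instance of each duplicated file), and then sum the values in the size column to get the total number of bytes of disk space I need to back up a single copy of each file.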
This returns only the rows whose hash is unique, i.e. files that have no duplicates:
select *,
-- group sum of all files
sum(size) over ()
from
(
select *,
-- rows per hash
count(*) over (partition by hash) as cnt
from file
) as dt
where cnt = 1
Edit: This returns exactly one row per hash:
select *,
-- group sum of all files
sum(size) over ()
from
(
select *,
-- unique number per hash
row_number() over (partition by hash order by hash) as rn
from file
) as dt
where rn = 1
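The difference between the two queries above is easy to miss, so here is a small sketch that runs both against made-up data. It assumes Python's bundled sqlite3 module with an SQLite build that supports window functions (3.25 or later); the reduced table definition, the sample paths and the sizes are invented for illustration, and the inner ordering uses id rather than hash so that the surviving row is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the file table from the question.
conn.execute(
    "create table file ("
    " id integer primary key,"
    " path text not null unique,"
    " hash text not null,"
    " size bigint not null)"
)
conn.executemany(
    "insert into file (path, hash, size) values (?, ?, ?)",
    [
        ("/a/one.txt", "h1", 100),
        ("/b/one_copy.txt", "h1", 100),  # duplicate of /a/one.txt
        ("/a/two.txt", "h2", 250),
        ("/a/three.txt", "h3", 40),
    ],
)

# First query: keeps only hashes that occur once, so the
# duplicated file h1 is dropped from the result entirely.
only_unique = conn.execute("""
    select path, sum(size) over () as total
    from (select *, count(*) over (partition by hash) as cnt
          from file) as dt
    where cnt = 1
    order by path
""").fetchall()

# Second query: keeps one row per hash, so h1 is counted once.
one_per_hash = conn.execute("""
    select hash, sum(size) over () as total
    from (select *, row_number() over (partition by hash order by id) as rn
          from file) as dt
    where rn = 1
    order by hash
""").fetchall()

print(only_unique)   # two rows, total 290
print(one_per_hash)  # three rows, total 390
```

With this data the first query reports a total of 290 bytes because it skips the duplicated file altogether, while the second reports 390 bytes, which is the space needed for one copy of every distinct file.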
Both of these queries are standard SQL, but PostgreSQL also supports a proprietary syntax:
select *,
-- group sum of all files
sum(size) over ()
from
(
select DISTINCT ON (hash) *
from file
order by hash
) as dt
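DISTINCT ON is specific to PostgreSQL. If only the total is needed and not the individual rows, a plain GROUP BY might be enough on other databases. The sketch below is not part of the original answer; it assumes that files with the same hash also have the same size (so min(size) picks the size of any copy), and it uses Python's sqlite3 module with invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the file table from the question.
conn.execute(
    "create table file ("
    " path text not null unique,"
    " hash text not null,"
    " size bigint not null)"
)
conn.executemany(
    "insert into file (path, hash, size) values (?, ?, ?)",
    [
        ("/a/one.txt", "h1", 100),
        ("/b/one_copy.txt", "h1", 100),  # duplicate of /a/one.txt
        ("/a/two.txt", "h2", 250),
        ("/a/three.txt", "h3", 40),
    ],
)

# Collapse each hash to a single size, then add the sizes up.
# Assumption: equal hashes imply equal sizes, so min() is arbitrary.
backup_bytes = conn.execute("""
    select sum(size)
    from (select hash, min(size) as size
          from file
          group by hash) as dt
""").fetchone()[0]

print(backup_bytes)  # 390
```

This returns a single number rather than one row per file, so it only fits the case where the per-file columns are not needed alongside the total.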