How to remove duplicates in one column based on the value of other columns in psql

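I have a database that is supposed to imitate a library management system. I want to write a query that displays a table of the top 3 borrowed books from each publisher, along with their corresponding ranking (so the most-borrowed book from publisher X would show rank 1). I have a query that displays the following information: the titles of the borrowed books and their corresponding publishers, along with how many times each book has been borrowed. As you can see, Bloomsbury (UK) appears 7 times (once for each Harry Potter book), but I want it to show the number of borrows for only the 3 most popular Harry Potter books. I would greatly appreciate any help.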

                  title                   |       publisher        | times
------------------------------------------+------------------------+------
 Harry Potter and the Philosopher's Stone | Bloomsbury (UK)        |    2
 Harry Potter and the Deathly Hallows     | Bloomsbury (UK)        |    2
 Harry Potter the Goblet of Fire          | Bloomsbury (UK)        |    3
 The Fellowship of the Ring               | George Allen & Unwin   |    1
 Calculus                                 | Paerson Addison Wesley |    1
 Go Set a Watchman                        | HarperCollins          |    1
 Harry Potter the Half-Blood Prince       | Bloomsbury (UK)        |    4
 Harry Potter and the Chamber of Secrets  | Bloomsbury (UK)        |    3
 Harry Potter and Prisoner of Azkaban     | Bloomsbury (UK)        |    2
 Nineteen Eighty-Four                     | Secker & Warburg       |    1
 Harry Potter the Order of the Phoenix    | Bloomsbury (UK)        |    4
 To Kill a Mockingbird                    | J.B.Lippincott & Co    |    1

The query below produces the output above.

SELECT title, publisher, COUNT(borrowed.resid) AS times 
FROM borrowed 
  CROSS JOIN book 
  CROSS JOIN bookinfo 
WHERE borrowed.resid = book.resid 
  AND book.isbn = bookinfo.isbn 
  AND book.copynumber = borrowed.copynumber 
GROUP BY title, publisher;

Rank the counts within each publisher with a window function, then keep the top three:

SELECT title, publisher, times
FROM (
    SELECT *, RANK() OVER (PARTITION BY publisher ORDER BY times DESC) AS ranking
    FROM (
        SELECT title, publisher, COUNT(resid) AS times 
        FROM borrowed 
        JOIN book USING (resid, copynumber)
        JOIN bookinfo USING (isbn)
        GROUP BY title, publisher
    ) AS counts
) AS ranks
WHERE ranking <= 3
ORDER BY publisher, times DESC;

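counts is the part you wrote, adjusted to use USING to join on the columns that have the same name on both sides (which makes it shorter).

ranks is the part that uses the RANK function (a window function) to rank the books within each publisher.

Finally, we get the top 3 by keeping only the rows whose ranking is 3 or lower.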
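
To see the three layers working together, here is a minimal sketch that runs the same nested query against a throwaway in-memory database. It uses Python's sqlite3 module as a stand-in for PostgreSQL (it assumes the bundled SQLite is 3.25 or newer, which is when window functions were added). The table layouts, the sample rows, and the helper names `build_demo_db` and `top_three_per_publisher` are assumptions made up for illustration; only the column names come from the query above. The outer SELECT also returns `ranking`, since the question asks for the rank to be displayed.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bookinfo (isbn TEXT PRIMARY KEY, title TEXT, publisher TEXT);
        CREATE TABLE book (resid INTEGER, copynumber INTEGER, isbn TEXT);
        CREATE TABLE borrowed (resid INTEGER, copynumber INTEGER);
    """)
    # (resid, isbn, title, publisher, number of borrows) -- invented data
    sample = [
        (1, "i1", "Book A", "Pub X", 4),
        (2, "i2", "Book B", "Pub X", 3),
        (3, "i3", "Book C", "Pub X", 2),
        (4, "i4", "Book D", "Pub X", 1),
        (5, "i5", "Book E", "Pub Y", 1),
    ]
    for resid, isbn, title, publisher, borrows in sample:
        conn.execute("INSERT INTO bookinfo VALUES (?, ?, ?)", (isbn, title, publisher))
        conn.execute("INSERT INTO book VALUES (?, 1, ?)", (resid, isbn))
        # One row in borrowed per loan of copy 1 of this book.
        conn.executemany("INSERT INTO borrowed VALUES (?, 1)", [(resid,)] * borrows)
    return conn


def top_three_per_publisher(conn):
    """Count borrows per book, rank within publisher, keep ranks 1 to 3."""
    return conn.execute("""
        SELECT title, publisher, times, ranking
        FROM (
            SELECT *, RANK() OVER (PARTITION BY publisher ORDER BY times DESC) AS ranking
            FROM (
                SELECT title, publisher, COUNT(resid) AS times
                FROM borrowed
                JOIN book USING (resid, copynumber)
                JOIN bookinfo USING (isbn)
                GROUP BY title, publisher
            ) AS counts
        ) AS ranks
        WHERE ranking <= 3
        ORDER BY publisher, times DESC
    """).fetchall()


if __name__ == "__main__":
    for row in top_three_per_publisher(build_demo_db()):
        print(row)
```

With this sample data, "Book D" (the fourth most borrowed title of "Pub X") is filtered out, while "Pub Y" keeps its single book at rank 1.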

Fix the joins and add RANK:

select *
from 
 (
    SELECT title, publisher, COUNT(*) AS cnt,
       -- rank the counts
       rank() over (partition by publisher order by count(*) desc) as rnk 
    FROM borrowed 
      JOIN book 
        ON borrowed.resid = book.resid 
       AND book.copynumber = borrowed.copynumber 
      JOIN bookinfo 
        ON book.isbn = bookinfo.isbn 
    GROUP BY title, publisher
 ) as dt
where rnk <= 3;

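You might want to switch to ROW_NUMBER (at most 3 rows per publisher, with ties broken arbitrarily) or DENSE_RANK (every row holding one of the 3 highest counts) instead of RANK (3 rows, or possibly more if the 4th and later rows have the same count as the 3rd).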
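
The difference between the three functions shows up clearly on the Bloomsbury (UK) counts from the question, which contain ties. The sketch below loads those seven counts into an in-memory SQLite table and reports how many rows survive `rnk <= 3` for each function. SQLite (3.25 or newer assumed) stands in for PostgreSQL here, and the pre-aggregated `counts` table and the helper `rows_kept` are hypothetical names introduced for the example.

```python
import sqlite3

# Borrow counts for Bloomsbury (UK), copied from the output in the question.
BLOOMSBURY_COUNTS = [
    ("Harry Potter and the Philosopher's Stone", 2),
    ("Harry Potter and the Deathly Hallows", 2),
    ("Harry Potter the Goblet of Fire", 3),
    ("Harry Potter the Half-Blood Prince", 4),
    ("Harry Potter and the Chamber of Secrets", 3),
    ("Harry Potter and Prisoner of Azkaban", 2),
    ("Harry Potter the Order of the Phoenix", 4),
]


def rows_kept(window_function):
    """Return how many rows pass `rnk <= 3` for the given ranking function."""
    # Only these three names are ever spliced into the SQL text.
    if window_function not in ("RANK", "DENSE_RANK", "ROW_NUMBER"):
        raise ValueError(f"unsupported window function: {window_function}")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counts (title TEXT, publisher TEXT, times INTEGER)")
    conn.executemany(
        "INSERT INTO counts VALUES (?, 'Bloomsbury (UK)', ?)", BLOOMSBURY_COUNTS
    )
    query = f"""
        SELECT COUNT(*)
        FROM (
            SELECT title,
                   {window_function}() OVER (
                       PARTITION BY publisher ORDER BY times DESC
                   ) AS rnk
            FROM counts
        ) AS dt
        WHERE rnk <= 3
    """
    return conn.execute(query).fetchone()[0]


if __name__ == "__main__":
    for name in ("ROW_NUMBER", "RANK", "DENSE_RANK"):
        print(name, rows_kept(name))
```

The counts sorted in descending order are 4, 4, 3, 3, 2, 2, 2. ROW_NUMBER keeps 3 rows, RANK keeps 4 (ranks 1, 1, 3, 3), and DENSE_RANK keeps all 7 because the three distinct counts get ranks 1, 2 and 3.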

