PostgreSQL: select from 2 tables, but only the latest element from table 2

Question

Hey, I have 2 tables in PostgreSQL:

1 - documents: id, title
2 - updates: id, document_id, date

and some data:

documents:

| 1 | Test Title |

updates:

| 1 | 1 | 2006-01-01 |
| 2 | 1 | 2007-01-01 |
| 3 | 1 | 2008-01-01 |

So all updates point to the same document, but each has a different date.

What I am trying to do is select from the documents table, but also include the latest update based on the date.

What should a query like this look like? This is the one I currently have, but it lists all updates rather than only the latest one, which is what I need:

SELECT * FROM documents,updates WHERE documents.id=1 AND documents.id=updates.document_id ORDER BY date

To add: the reason I need this in the query is that I want to order by the date from the updates table!

Edit: This script is heavily simplified, so I should be able to create a query that returns any number of results, each including the latest update date. I was thinking of using an inner join or left join or something like that!?

Answer 1

Use the PostgreSQL extension DISTINCT ON:

SELECT  DISTINCT ON (documents.id) *
FROM    documents
JOIN    updates
ON      updates.document_id = documents.id
ORDER BY
        documents.id, updates.date DESC

This takes the first row from each documents.id group in ORDER BY order. Because each group is sorted by updates.date DESC, that first row is the latest update.

Test script to check:

SELECT  DISTINCT ON (documents.id) *
FROM    (
        VALUES
        (1, 'Test Title'),
        (2, 'Test Title 2')
        ) documents (id, title)
JOIN    (
        VALUES
        (1, 1, '2006-01-01'::DATE),
        (2, 1, '2007-01-01'::DATE),
        (3, 1, '2008-01-01'::DATE),
        (4, 2, '2009-01-01'::DATE),
        (5, 2, '2010-01-01'::DATE)
        ) updates (id, document_id, date)
ON      updates.document_id = documents.id
ORDER BY
        documents.id, updates.date DESC
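
The question also asks to order the final result by the update date. DISTINCT ON requires the ORDER BY to start with documents.id, so one way to get a different final ordering is to wrap the query in a subquery and sort again outside. The following is a minimal sketch under that assumption; the alias "latest" and the column alias "update_id" are illustrative names, not part of the original answer.

```sql
-- Pick the latest update per document, then re-sort the result by that date.
SELECT  *
FROM    (
        SELECT  DISTINCT ON (documents.id)
                documents.id, documents.title,
                updates.id AS update_id, updates.date
        FROM    documents
        JOIN    updates
        ON      updates.document_id = documents.id
        ORDER BY
                documents.id, updates.date DESC
        ) latest
ORDER BY
        latest.date DESC;
```

The columns are listed explicitly in the subquery because SELECT * would produce two columns named id.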

Answer 2

You may create a derived table which contains only the most recent "updates" record per document_id, and then join "documents" against that:

SELECT d.id, d.title, u.update_id, u."date"
FROM documents d
LEFT JOIN
-- JOIN "documents" against the most recent update per document_id
(
SELECT recent.document_id, id AS update_id, recent."date"
FROM updates
INNER JOIN
(SELECT document_id, MAX("date") AS "date" FROM updates GROUP BY 1) recent
ON updates.document_id = recent.document_id
WHERE
  updates."date" = recent."date"
) u
ON d.id = u.document_id;

This will handle "un-updated" documents, like so:

pg=> select * from documents;
 id | title 
----+-------
  1 | foo
  2 | bar
  3 | baz
(3 rows)

pg=> select * from updates;
 id | document_id |    date    
----+-------------+------------
  1 |           1 | 2009-10-30
  2 |           1 | 2009-11-04
  3 |           1 | 2009-11-07
  4 |           2 | 2009-11-09
(4 rows)

pg=> SELECT d.id ...
 id | title | update_id |    date    
----+-------+-----------+------------
  1 | foo   |         3 | 2009-11-07
  2 | bar   |         4 | 2009-11-09
  3 | baz   |           | 
(3 rows)
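
One caveat: if two updates for the same document share the same latest date, the MAX("date") join above returns both rows. A possible variation, sketched here with a window function (available in PostgreSQL 8.4 and later), keeps exactly one row per document. Breaking ties by the highest update id is an assumption made for this example, and "rn" is an illustrative alias.

```sql
-- Number each document's updates from newest to oldest and keep only the first.
SELECT d.id, d.title, u.update_id, u."date"
FROM documents d
LEFT JOIN
(
SELECT document_id, id AS update_id, "date",
       ROW_NUMBER() OVER (PARTITION BY document_id
                          ORDER BY "date" DESC, id DESC) AS rn
FROM updates
) u
ON d.id = u.document_id AND u.rn = 1;
```

Keeping the rn = 1 condition in the ON clause rather than in WHERE preserves documents that have no updates.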

Answer 3

select *
from documents
left join updates
  on updates.document_id=documents.id
  and updates.date=(select max(date) from updates where document_id=documents.id)
where documents.id=?;

It has some advantages over the previous answers:

you write the document id in only one place, which is convenient;
you can omit the where clause and get a table of all documents with their latest updates;
you can use broader selection criteria, for example where documents.id in (1,2,3).
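
As a sketch of the last two points, the same query can select several documents and sort them by their latest update date. The column alias "update_id" and the choice to put documents without updates last are assumptions made for this example.

```sql
-- Latest update for several documents, newest first.
select documents.id, documents.title,
       updates.id as update_id, updates.date
from documents
left join updates
  on updates.document_id = documents.id
  and updates.date = (select max(date) from updates where document_id = documents.id)
where documents.id in (1, 2, 3)
order by updates.date desc nulls last;
```

Without nulls last, a descending sort in PostgreSQL places documents that have no updates first.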

You can also avoid the subselect by using group by, but you'll have to list all fields of documents in the group by clause:

select documents.*, max(date) as max_date
  from documents
  left join updates on documents.id=document_id
  where documents.id=1
  group by documents.id, title;
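
On PostgreSQL 9.1 and later, grouping by a table's primary key is enough to select its other columns, so the field list can be shortened. This sketch assumes documents.id is declared as the primary key, which the question does not state.

```sql
-- Assumes documents.id is the PRIMARY KEY of documents (PostgreSQL 9.1+).
select documents.*, max(updates.date) as max_date
  from documents
  left join updates on documents.id = updates.document_id
  group by documents.id
  order by max_date desc nulls last;
```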

Answer 4

Off the top of my head:

ORDER BY date DESC LIMIT 1

If you really want only id 1, you can use this query:

SELECT * FROM documents,updates 
    WHERE documents.id=1 AND updates.document_id=1 
    ORDER BY date DESC LIMIT 1

http://www.postgresql.org/docs/8.4/interactive/queries-limit.html
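
The LIMIT 1 approach covers a single document. To apply the same idea to every document, one option on PostgreSQL 9.3 and later is a LATERAL subquery, which runs the ORDER BY ... LIMIT 1 once per document. This is a sketch, not part of the original answer; the aliases d, u and update_id are illustrative.

```sql
-- For each document, fetch its single most recent update (if any).
SELECT d.id, d.title, u.update_id, u.date
FROM documents d
LEFT JOIN LATERAL (
    SELECT updates.id AS update_id, updates.date
    FROM updates
    WHERE updates.document_id = d.id
    ORDER BY updates.date DESC
    LIMIT 1
) u ON true
ORDER BY u.date DESC NULLS LAST;
```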

Answer 5

This should also work:

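SELECT * FROM documents, updates 
    WHERE documents.id=1 AND updates.document_id=1
    -- restrict MAX to this document; otherwise the latest date of any document is used
    AND updates.date = (SELECT MAX(date) FROM updates WHERE document_id=1)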

5 answers

Answer 1
25 2009-11-10 15:33:50

Answer 2
10 accepted 2009-11-09 20:46:18

Answer 3
4 2009-11-10 15:24:05

Answer 4
2 2009-11-09 20:23:21

Answer 5
-1 2009-11-09 20:36:42
