简体   繁体   中英

Sorting by one-to-many relationship and return distinct records

I have a list of IDs (page__96de2231 etc.) that I want to sort by a lookup on another table (where the property = "p_published_at", also including null values). I only want to return the IDs, properly sorted. But when doing distinct on the resulting IDs the order from the inner query is lost:

SELECT distinct id FROM (

SELECT entity.id, t1.property, t1.postgres_timestamptz
    FROM entity
    
    JOIN triple t1
    on entity.id = t1.subject
    
    WHERE entity.id IN ('page__96de2231', 'page__1fd94de8', 'page__898f13ec', 'page__81503fa2')

    ORDER BY t1.property='p_published_at' DESC NULLS LAST, t1.postgres_timestamptz ASC NULLS LAST

) as inner_query

How can I resolve this by either improving or refactoring the query?

If you want to return ids in a particular order, don't use a subquery. You need an order by in the outer most query.

In this case, you can use group by :

SELECT e.id
FROM entity e JOIN     
    triple t1
    ON entity.id = t1.subject    
WHERE e.id IN ('page__96de2231', 'page__1fd94de8', 'page__898f13ec', 'page__81503fa2')
GROUP BY e.id
ORDER BY COUNT(*) FILTER (WHERE t1.property = 'p_published_at') DESC NULLS LAST,
         MAX(t1.postgres_timestamptz) ASC NULLS LAST;

By the way, you don't need a JOIN for this either:

SELECT t1.subject as id
FROM triple t1
WHERE t1.subject IN ('page__96de2231', 'page__1fd94de8', 'page__898f13ec', 'page__81503fa2')
GROUP BY t1.subject
ORDER BY COUNT(*) FILTER (WHERE t1.property = 'p_published_at') DESC NULLS LAST,
         MAX(t1.postgres_timestamptz) ASC NULLS LAST

One entity can have many triples. If an entity has two triples, one sorted at the beginning of the list, one at the end of it, where would you place the entity in your results? At the beginning or the end?

You can use DENSE_RANK to give the rows sortkeys. Then decide whether to take the minimum or maximum sortkey per entity:

select id
from
(
  select
    e.id,
    dense_rank() over (order by t.property = 'p_published_at' desc nulls last,
                                t.postgres_timestamptz asc nulls last) as sortkey
  from entity e
  join triple t on e.id = t.subject
  where e.id in ('page__96de2231', 'page__1fd94de8', 'page__898f13ec', 'page__81503fa2')
) ranked
group by id
order by min(sortkey), id;

(And Gordon is right of course, you can remove the table entity from the query; it is not needed.)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM