Join 2 tables and keep only the first closest event

I have the following tables:

table_1
id | timestamp | origin | info

table_2
id | timestamp | origin | type

My aim is to find, for each row in table_2, the origin event in table_1. I want to keep only one: the closest event that precedes it. For instance:

table 1
1 | 1000 | "o1" | "i1"
2 | 2000 | "o2" | "i2"
3 | 2010 | "o2" | "i2"

table 2
1 | 1010 | "o1" | "t1"
2 | 2100 | "o2" | "t2"

My expected result is:

table_2.id | table_2.timestamp | table_2.origin | table_2.type | table_1.info | table_1.timestamp
1          | 1010              | "o1"           | "t1"         | "i1"         | 1000
2          | 2100              | "o2"           | "t2"         | "i2"         | 2010

Currently I'm just using a simple join on origin and table_2.timestamp > table_1.timestamp, which gives me:

table_2.id | table_2.timestamp | table_2.origin | table_2.type | table_1.info | table_1.timestamp
1          | 1010              | "o1"           | "t1"         | "i1"         | 1000
2          | 2100              | "o2"           | "t2"         | "i2"         | 2000
2          | 2100              | "o2"           | "t2"         | "i2"         | 2010

As you can see, I don't want the second line above (the one with table_1.timestamp = 2000), because I just want the first closest event in table_1.

Any ideas?

A cross-database solution is to join and filter with a correlated subquery:

select 
    t2.*,
    t1.info,
    t1.timestamp t1_timestamp
from 
    table_2 t2
    inner join table_1 t1
        on t1.origin = t2.origin
        and t1.timestamp = (
            select max(t11.timestamp) 
            from table_1 t11
            where t11.origin = t2.origin and t11.timestamp < t2.timestamp
        )
order by t2.id
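
One way to check the correlated-subquery approach against the sample data from the question is a small script using Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for the real database, the column types are assumptions (the question does not give a schema), and `t2.*` is spelled out so the result tuples are easy to read.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption: the
# question targets Postgres, but this query uses only portable SQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table_1 (id integer, timestamp integer, origin text, info text);
    create table table_2 (id integer, timestamp integer, origin text, type text);
    insert into table_1 values
        (1, 1000, 'o1', 'i1'), (2, 2000, 'o2', 'i2'), (3, 2010, 'o2', 'i2');
    insert into table_2 values
        (1, 1010, 'o1', 't1'), (2, 2100, 'o2', 't2');
""")

# For each table_2 row, keep only the table_1 row whose timestamp is the
# latest one that is still earlier than the table_2 timestamp.
QUERY = """
    select t2.id, t2.timestamp, t2.origin, t2.type,
           t1.info, t1.timestamp as t1_timestamp
    from table_2 t2
    inner join table_1 t1
        on t1.origin = t2.origin
        and t1.timestamp = (
            select max(t11.timestamp)
            from table_1 t11
            where t11.origin = t2.origin and t11.timestamp < t2.timestamp
        )
    order by t2.id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Note that if table_1 holds two rows with the same origin and the same timestamp, this query returns both of them for the matching table_2 row.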

Since you are using Postgres, you can use the handy distinct on syntax; this might actually perform better:

select 
    distinct on(t2.id)
    t2.*,
    t1.info,
    t1.timestamp t1_timestamp
from 
    table_2 t2
    inner join table_1 t1 
        on t1.origin = t2.origin and t1.timestamp < t2.timestamp
order by t2.id, t1.timestamp desc

Demo on DB Fiddle - both queries yield:

id | timestamp | origin | type | info | t1_timestamp
-: | --------: | :----- | :--- | :--- | -----------:
 1 |      1010 | o1     | t1   | i1   |         1000
 2 |      2100 | o2     | t2   | i2   |         2010
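
To see why distinct on combined with that order by keeps the closest earlier event, the same selection can be mimicked in plain Python. This is an illustration of the semantics only, not how Postgres executes the query; the `joined` list is hand-written from the question's intermediate join result, and the variable names are made up for this sketch.

```python
from itertools import groupby
from operator import itemgetter

# Rows produced by the plain join (same origin, t1.timestamp < t2.timestamp):
# (t2_id, t2_timestamp, origin, type, info, t1_timestamp)
joined = [
    (1, 1010, "o1", "t1", "i1", 1000),
    (2, 2100, "o2", "t2", "i2", 2000),
    (2, 2100, "o2", "t2", "i2", 2010),
]

# Equivalent of: order by t2.id, t1.timestamp desc
ordered = sorted(joined, key=lambda r: (r[0], -r[5]))

# Equivalent of: distinct on (t2.id), i.e. keep the first row of each t2.id group
closest = [next(group) for _, group in groupby(ordered, key=itemgetter(0))]

for row in closest:
    print(row)
```

Because the rows are sorted with the latest table_1 timestamp first inside each t2.id group, the first row per group is the closest preceding event.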
