What's the approach to joining these 2 tables?

Say I have 2 tables, A and B, which contain information for start and end times respectively. The primary key is a combination of id and the timestamp. Thus, no 2 records can have the same id and timestamp.

A

id | start time
1 | 2016-02-06 17:03
1 | 2016-03-09 18:09
2 | 2017-02-07 23:34
3 | 2016-02-07 19:12
3 | 2016-02-07 23:52
...

B

id | end time
1 | 2016-02-06 18:32
1 | 2016-03-09 21:11
2 | 2017-02-08 01:22
3 | 2016-02-07 21:32
3 | 2016-02-08 02:11
...

My end result should be something like:

id | start time | end time
1 | 2016-02-06 17:03 | 2016-02-06 18:32
1 | 2016-03-09 18:09 | 2016-03-09 21:11
2 | 2017-02-07 23:34 | 2017-02-08 01:22
3 | 2016-02-07 19:12 | 2016-02-07 21:32
3 | 2016-02-07 23:52 | 2016-02-08 02:11
...

Obviously I can't join on just the id, as ids 1 and 3 each appear twice. I can't join on the day either, as the 3rd and 5th records span 2 different days. So is there a way to join these 2 tables? Any help would be much appreciated! Thanks!

I agree with Barmar and encourage you to revisit your data model. I would expect start time and end time to be in the same table.

And while the existing ID may be for something like user_id, if that ID is duplicated in this table then there should be some other unique identifier, maybe transaction_id, that uniquely identifies each record.
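
As a rough illustration of that suggestion, here is a minimal sketch of a single table that holds both times and has its own unique key. Everything in it is an assumption made for the example: the table name `sessions`, the column names, and the use of an in-memory SQLite database through Python's `sqlite3` module instead of the asker's actual database.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sessions (
        transaction_id INTEGER PRIMARY KEY,  -- unique per record
        user_id        INTEGER NOT NULL,     -- may repeat across records
        start_time     TEXT    NOT NULL,
        end_time       TEXT                  -- NULL while the record is still open
    )
""")

# A few rows taken from the sample data in the question.
conn.executemany(
    "INSERT INTO sessions (user_id, start_time, end_time) VALUES (?, ?, ?)",
    [
        (1, "2016-02-06 17:03", "2016-02-06 18:32"),
        (1, "2016-03-09 18:09", "2016-03-09 21:11"),
        (3, "2016-02-07 23:52", "2016-02-08 02:11"),
    ],
)

# Start and end already sit on the same row, so no join is needed.
sessions = conn.execute(
    "SELECT user_id, start_time, end_time FROM sessions ORDER BY transaction_id"
).fetchall()
for row in sessions:
    print(row)
```

With this layout the pairing of a start with its end is recorded when the row is written, rather than being inferred later from timestamps.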

Since the ids are the same and each end time is later than its start time, you can join on id and take the earliest end time that comes after each start time.

If those times are stored as strings, convert them with STR_TO_DATE before comparing:

SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b 
  ON b.id = a.id
 AND STR_TO_DATE(b.`end time`, '%Y-%m-%d %H:%i') > STR_TO_DATE(a.`start time`, '%Y-%m-%d %H:%i')
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;

If those are timestamps, compare them directly:

SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b
  ON b.id = a.id
 AND b.`end time` > a.`start time`
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;
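
To try the timestamp version against the sample data from the question without a MySQL server, the sketch below loads the rows into an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in here, not what this answer was tested on. The column names `start_time` and `end_time` replace the backticked names, and the times are stored as zero-padded text, which sorts the same way the timestamps would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, start_time TEXT, PRIMARY KEY (id, start_time));
    CREATE TABLE B (id INTEGER, end_time   TEXT, PRIMARY KEY (id, end_time));
""")

# Sample rows from the question.
conn.executemany("INSERT INTO A VALUES (?, ?)", [
    (1, "2016-02-06 17:03"),
    (1, "2016-03-09 18:09"),
    (2, "2017-02-07 23:34"),
    (3, "2016-02-07 19:12"),
    (3, "2016-02-07 23:52"),
])
conn.executemany("INSERT INTO B VALUES (?, ?)", [
    (1, "2016-02-06 18:32"),
    (1, "2016-03-09 21:11"),
    (2, "2017-02-08 01:22"),
    (3, "2016-02-07 21:32"),
    (3, "2016-02-08 02:11"),
])

# For each start, keep the earliest end of the same id that comes after it.
QUERY = """
    SELECT a.id, a.start_time, MIN(b.end_time) AS end_time
    FROM A a
    LEFT JOIN B b
      ON b.id = a.id
     AND b.end_time > a.start_time
    GROUP BY a.id, a.start_time
    ORDER BY a.id, a.start_time
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The five rows printed should match the result table the question asks for.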

A test is available on rextester.

What if there are many timestamps per B.id? Then it might be more performant to limit the range to a day or less, so that each start time is only compared with end times in the following 24 hours.

SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b
  ON b.id = a.id
 AND b.`end time` > a.`start time` 
 AND b.`end time` < TIMESTAMPADD(HOUR,24,a.`start time`)
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;
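
One side effect of the 24-hour limit is worth seeing on data: because this is a LEFT JOIN, a start whose end falls outside the window still appears, but with a NULL end time. The sketch below shows that with Python's `sqlite3` module as a stand-in for MySQL. The rows for id 2 are made up for the example, the column names are `start_time` and `end_time` instead of the backticked ones, and SQLite's `strftime(..., '+24 hours')` takes the place of `TIMESTAMPADD`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, start_time TEXT);
    CREATE TABLE B (id INTEGER, end_time   TEXT);
""")
conn.executemany("INSERT INTO A VALUES (?, ?)", [
    (1, "2016-02-06 17:03"),
    (2, "2016-02-07 09:00"),  # made-up row: its end is more than 24 hours later
])
conn.executemany("INSERT INTO B VALUES (?, ?)", [
    (1, "2016-02-06 18:32"),
    (2, "2016-02-09 10:00"),
])

# strftime with '+24 hours' plays the role of TIMESTAMPADD(HOUR, 24, ...).
QUERY = """
    SELECT a.id, a.start_time, MIN(b.end_time) AS end_time
    FROM A a
    LEFT JOIN B b
      ON b.id = a.id
     AND b.end_time > a.start_time
     AND b.end_time < strftime('%Y-%m-%d %H:%M', a.start_time, '+24 hours')
    GROUP BY a.id, a.start_time
    ORDER BY a.id, a.start_time
"""
windowed = conn.execute(QUERY).fetchall()
for row in windowed:
    print(row)
```

Here id 1 gets its end time, while id 2 comes back with `None`, so the window should only be used if no start/end pair can be further apart than the limit.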

Assuming that there are no overlaps between the start/end times of the same id, you could join the tables with a join condition based on a correlated subquery, which ensures that the record of tableb with the closest end_time after the current start_time of tablea is picked:

select
    a.*,
    b.end_time
from
    tablea a
    inner join tableb b
        on  b.id = a.id
        and b.end_time = (
            select min(b1.end_time)
            from tableb b1 
            where b1.id = a.id and b1.end_time > a.start_time
        )

Demo on DB Fiddle:

id | start_time       | end_time        
-: | :--------------- | :---------------
 1 | 2016-02-06 17:03 | 2016-02-06 18:32
 1 | 2016-03-09 18:09 | 2016-03-09 21:11
 2 | 2017-02-07 23:34 | 2017-02-08 01:22
 3 | 2016-02-07 19:12 | 2016-02-07 21:32
 3 | 2016-02-07 23:52 | 2016-02-08 02:11
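
The no-overlap assumption matters. As a small sketch of what happens when it does not hold, the example below runs the same correlated-subquery join on made-up rows where a second start arrives before the first end. It uses Python's `sqlite3` module as a stand-in for the real database, and the sample values are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id INTEGER, start_time TEXT);
    CREATE TABLE tableb (id INTEGER, end_time   TEXT);
""")

# Made-up overlapping intervals for the same id: 10:00-11:30 and 10:30-11:00.
conn.executemany("INSERT INTO tablea VALUES (?, ?)", [
    (1, "2016-01-01 10:00"),
    (1, "2016-01-01 10:30"),
])
conn.executemany("INSERT INTO tableb VALUES (?, ?)", [
    (1, "2016-01-01 11:00"),
    (1, "2016-01-01 11:30"),
])

# Same join as above, with an ORDER BY added so the output order is fixed.
QUERY = """
    SELECT a.id, a.start_time, b.end_time
    FROM tablea a
    INNER JOIN tableb b
        ON  b.id = a.id
        AND b.end_time = (
            SELECT MIN(b1.end_time)
            FROM tableb b1
            WHERE b1.id = a.id AND b1.end_time > a.start_time
        )
    ORDER BY a.id, a.start_time
"""
overlapping = conn.execute(QUERY).fetchall()
for row in overlapping:
    print(row)
```

Both starts are paired with the 11:00 end and the 11:30 end is never used, so with overlapping data the timestamps alone cannot say which end belongs to which start.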
