What's the approach to joining these 2 tables?
Say I have 2 tables, A and B, which contain start and end times respectively. The primary key is a combination of id and the timestamp. Thus, no 2 records can have the same id and timestamp.
A
id | start time
1 | 2016-02-06 17:03
1 | 2016-03-09 18:09
2 | 2017-02-07 23:34
3 | 2016-02-07 19:12
3 | 2016-02-07 23:52
...
B
id | end time
1 | 2016-02-06 18:32
1 | 2016-03-09 21:11
2 | 2017-02-08 01:22
3 | 2016-02-07 21:32
3 | 2016-02-08 02:11
...
My end result should be something like:
id | start time | end time
1 | 2016-02-06 17:03 | 2016-02-06 18:32
1 | 2016-03-09 18:09 | 2016-03-09 21:11
2 | 2017-02-07 23:34 | 2017-02-08 01:22
3 | 2016-02-07 19:12 | 2016-02-07 21:32
3 | 2016-02-07 23:52 | 2016-02-08 02:11
...
Obviously I can't join on just the id, as ids 1 and 3 each appear twice. I can't join on the day either, as the 3rd and 5th records span 2 different days. So is there a way to join these 2 tables? Any help would be much appreciated! Thanks!
I agree with Barmar and encourage you to revisit your data model. I would expect start time and end time to be in the same table.
And while the existing ID may be for something like user_id, if that ID is duplicated in this table then there should be some other unique identifier, maybe transaction_id, that uniquely identifies each record.
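As a rough sketch of what such a single-table model could look like: the table name `sessions` and the exact column layout below are hypothetical, not the asker's actual schema, and SQLite (through Python's `sqlite3`) is used only so the example runs standalone, whereas the thread is about MySQL.

```python
import sqlite3

# In-memory SQLite database; a stand-in for the real MySQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per session, start and end stored together.
# transaction_id is a surrogate key that identifies each record on its own.
conn.execute("""
    CREATE TABLE sessions (
        transaction_id INTEGER PRIMARY KEY,
        user_id        INTEGER NOT NULL,
        start_time     TEXT    NOT NULL,
        end_time       TEXT,                 -- NULL while a session is open
        UNIQUE (user_id, start_time)
    )
""")

# Sample rows taken from the question's expected result.
rows = [
    (1, "2016-02-06 17:03", "2016-02-06 18:32"),
    (1, "2016-03-09 18:09", "2016-03-09 21:11"),
    (2, "2017-02-07 23:34", "2017-02-08 01:22"),
    (3, "2016-02-07 19:12", "2016-02-07 21:32"),
    (3, "2016-02-07 23:52", "2016-02-08 02:11"),
]
conn.executemany(
    "INSERT INTO sessions (user_id, start_time, end_time) VALUES (?, ?, ?)",
    rows,
)

# No join is needed any more: each row already pairs a start with its end.
result = conn.execute(
    "SELECT user_id, start_time, end_time FROM sessions "
    "ORDER BY user_id, start_time"
).fetchall()
for row in result:
    print(row)
```

With this layout the pairing problem from the question does not arise, because the end time is written onto the same row as its start.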
Since the ids are the same and the end time is later than the start time, you can join on id and take the earliest end time that falls after each start time.
If those times are strings, then use STR_TO_DATE:
SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b
ON b.id = a.id
AND STR_TO_DATE(b.`end time`, '%Y-%m-%d %H:%i') > STR_TO_DATE(a.`start time`, '%Y-%m-%d %H:%i')
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;
If those are timestamps:
SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b
ON b.id = a.id
AND b.`end time` > a.`start time`
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;
A test on rextester here
If there are many timestamps per B.id, it might be more performant to limit the range to a day or less:
SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b
ON b.id = a.id
AND b.`end time` > a.`start time`
AND b.`end time` < TIMESTAMPADD(HOUR,24,a.`start time`)
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;
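Note that the 24-hour bound is not only a performance tweak: it also changes the result for any start whose end comes more than a day later. The sketch below illustrates that under some assumptions: it runs on SQLite through Python's `sqlite3` rather than MySQL, so `strftime(..., '+24 hours')` stands in for `TIMESTAMPADD`; the tables are named `tablea`/`tableb` with underscore column names; and the row with id 4 is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id INTEGER, start_time TEXT, PRIMARY KEY (id, start_time));
    CREATE TABLE tableb (id INTEGER, end_time   TEXT, PRIMARY KEY (id, end_time));
    -- id 1 comes from the question; id 4 is a hypothetical session longer than a day.
    INSERT INTO tablea VALUES (1, '2016-02-06 17:03'), (4, '2016-05-01 10:00');
    INSERT INTO tableb VALUES (1, '2016-02-06 18:32'), (4, '2016-05-03 09:00');
""")

# Earliest end after each start, with no upper bound on the gap.
UNBOUNDED = """
    SELECT a.id, a.start_time, MIN(b.end_time) AS end_time
    FROM tablea a
    LEFT JOIN tableb b
      ON b.id = a.id
     AND b.end_time > a.start_time
    GROUP BY a.id, a.start_time
    ORDER BY a.id, a.start_time
"""

# Same query, but only ends within 24 hours of the start are considered.
BOUNDED = """
    SELECT a.id, a.start_time, MIN(b.end_time) AS end_time
    FROM tablea a
    LEFT JOIN tableb b
      ON b.id = a.id
     AND b.end_time > a.start_time
     AND b.end_time < strftime('%Y-%m-%d %H:%M', a.start_time, '+24 hours')
    GROUP BY a.id, a.start_time
    ORDER BY a.id, a.start_time
"""

unbounded = conn.execute(UNBOUNDED).fetchall()
bounded = conn.execute(BOUNDED).fetchall()
print("unbounded:", unbounded)
print("bounded:  ", bounded)
```

Because of the LEFT JOIN, the start for id 4 is still returned by the bounded query, but its end time comes back as NULL. So the bound is only safe if no session is expected to last longer than the chosen window.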
Assuming that there are no overlaps between start/end times of the same id, you could join the tables, with a join condition based on a correlated subquery that ensures that the record of tableb that has the closest end_time after the current start_time of tablea is picked:
select
    a.*,
    b.end_time
from
    tablea a
    inner join tableb b
        on  b.id = a.id
        and b.end_time = (
            select min(b1.end_time)
            from tableb b1
            where b1.id = a.id and b1.end_time > a.start_time
        )
Demo on DB Fiddle:
id | start_time | end_time
1 | 2016-02-06 17:03 | 2016-02-06 18:32
1 | 2016-03-09 18:09 | 2016-03-09 21:11
2 | 2017-02-07 23:34 | 2017-02-08 01:22
3 | 2016-02-07 19:12 | 2016-02-07 21:32
3 | 2016-02-07 23:52 | 2016-02-08 02:11
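Since the approach depends on the no-overlap assumption, it may be worth checking the data for violations first. Here is a minimal sketch of one way to do that, assuming the same `tablea`/`tableb` names; the helper `find_unclosed_starts` is a hypothetical name, and SQLite via Python's `sqlite3` stands in for MySQL so the example is runnable on its own. It reports every pair of starts for the same id that has no end time between them, which is exactly the case where two starts would be matched to the same end.

```python
import sqlite3

def find_unclosed_starts(conn):
    """Return (id, start_time, later_start) rows where a later start for the
    same id occurs before any end_time has closed the earlier start."""
    return conn.execute("""
        SELECT a1.id, a1.start_time, a2.start_time AS later_start
        FROM tablea a1
        JOIN tablea a2
          ON a2.id = a1.id
         AND a2.start_time > a1.start_time
        WHERE NOT EXISTS (
            SELECT 1
            FROM tableb b
            WHERE b.id = a1.id
              AND b.end_time >  a1.start_time
              AND b.end_time <= a2.start_time
        )
        ORDER BY a1.id, a1.start_time
    """).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (id INTEGER, start_time TEXT, PRIMARY KEY (id, start_time));
    CREATE TABLE tableb (id INTEGER, end_time   TEXT, PRIMARY KEY (id, end_time));
    -- Sample data from the question.
    INSERT INTO tablea VALUES
        (1, '2016-02-06 17:03'), (1, '2016-03-09 18:09'),
        (2, '2017-02-07 23:34'),
        (3, '2016-02-07 19:12'), (3, '2016-02-07 23:52');
    INSERT INTO tableb VALUES
        (1, '2016-02-06 18:32'), (1, '2016-03-09 21:11'),
        (2, '2017-02-08 01:22'),
        (3, '2016-02-07 21:32'), (3, '2016-02-08 02:11');
""")

# The question's data satisfies the assumption, so nothing is reported.
clean = find_unclosed_starts(conn)

# A made-up extra start that begins before the 17:03 session has ended.
conn.execute("INSERT INTO tablea VALUES (1, '2016-02-06 17:30')")
flagged = find_unclosed_starts(conn)

print("clean:  ", clean)
print("flagged:", flagged)
```

If the check returns rows, the join above would pair both of those starts with the same end time, so the result should not be trusted for those ids.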