Say I have two tables, A and B, which hold start and end times respectively. In each table the primary key is the combination of id and the timestamp, so no two records can share both the same id and the same timestamp.
A
id | start time
1 | 2016-02-06 17:03
1 | 2016-03-09 18:09
2 | 2017-02-07 23:34
3 | 2016-02-07 19:12
3 | 2016-02-07 23:52
...
B
id | end time
1 | 2016-02-06 18:32
1 | 2016-03-09 21:11
2 | 2017-02-08 01:22
3 | 2016-02-07 21:32
3 | 2016-02-08 02:11
...
My end result should be something like
id | start time | end time
1 | 2016-02-06 17:03 | 2016-02-06 18:32
1 | 2016-03-09 18:09 | 2016-03-09 21:11
2 | 2017-02-07 23:34 | 2017-02-08 01:22
3 | 2016-02-07 19:12 | 2016-02-07 21:32
3 | 2016-02-07 23:52 | 2016-02-08 02:11
...
Obviously I can't join on just ID as the ids 1 and 3 each appear twice. I can't join on the day either as the 3rd and 5th records span across 2 different days. So is there a way to join these 2 tables? Any help would be much appreciated! Thanks!
I agree with Barmar and encourage you to revisit your data model. I would expect start time and end time to be in the same table.
And while the existing ID may be for something like user_id, if that ID is duplicated in this table then there should be some other unique identifier, maybe transaction_id, that uniquely identifies each record.
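As a rough illustration of that suggestion, the sketch below keeps each interval in a single row with its own key. It uses Python's built-in sqlite3 module only so that it runs without a database server; the table name sessions and the columns transaction_id and user_id are hypothetical and do not come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sessions (
        transaction_id INTEGER PRIMARY KEY,  -- hypothetical surrogate key
        user_id        INTEGER NOT NULL,     -- plays the role of the question's id
        start_time     TEXT    NOT NULL,
        end_time       TEXT                  -- may stay NULL until the interval closes
    )
    """
)
conn.executemany(
    "INSERT INTO sessions (user_id, start_time, end_time) VALUES (?, ?, ?)",
    [
        (1, "2016-02-06 17:03", "2016-02-06 18:32"),
        (1, "2016-03-09 18:09", "2016-03-09 21:11"),
        (2, "2017-02-07 23:34", "2017-02-08 01:22"),
    ],
)

# Start and end already live in the same row, so no join is needed.
single_table_rows = conn.execute(
    "SELECT user_id, start_time, end_time FROM sessions ORDER BY transaction_id"
).fetchall()
for row in single_table_rows:
    print(row)
```

With this layout the pairing problem from the question does not arise, because each start time is stored next to its end time.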
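Since the ids match and each end time is later than its start time, you can join on id and, for each start time, take the smallest end time that comes after it.
If those times are stored as strings, convert them with STR_TO_DATE: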
SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b
ON b.id = a.id
AND STR_TO_DATE(b.`end time`, '%Y-%m-%d %H:%i') > STR_TO_DATE(a.`start time`, '%Y-%m-%d %H:%i')
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;
If those columns are already timestamps, compare them directly:
SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b
ON b.id = a.id
AND b.`end time` > a.`start time`
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;
If there are many timestamps per B.id, it might be more performant to limit the range, provided that every end time falls within 24 hours of its start time:
SELECT a.id, a.`start time`, MIN(b.`end time`) AS `end time`
FROM A a
LEFT JOIN B b
ON b.id = a.id
AND b.`end time` > a.`start time`
AND b.`end time` < TIMESTAMPADD(HOUR,24,a.`start time`)
GROUP BY a.id, a.`start time`
ORDER BY a.id, a.`start time`;
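To see how the LEFT JOIN with MIN behaves, here is a minimal sketch that runs the unbounded query against the question's sample data. It is only an approximation of the MySQL setup: it uses SQLite through Python's sqlite3 module, names the tables tablea and tableb, uses underscored column names, and stores the times as text, which still compares correctly because the format sorts chronologically. The row with id 4 is a made-up start time with no matching end, added to show what the LEFT JOIN returns in that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tablea (id INTEGER, start_time TEXT, PRIMARY KEY (id, start_time));
    CREATE TABLE tableb (id INTEGER, end_time   TEXT, PRIMARY KEY (id, end_time));
    INSERT INTO tablea VALUES
        (1, '2016-02-06 17:03'), (1, '2016-03-09 18:09'),
        (2, '2017-02-07 23:34'),
        (3, '2016-02-07 19:12'), (3, '2016-02-07 23:52'),
        (4, '2018-01-01 09:00');  -- hypothetical start with no end row
    INSERT INTO tableb VALUES
        (1, '2016-02-06 18:32'), (1, '2016-03-09 21:11'),
        (2, '2017-02-08 01:22'),
        (3, '2016-02-07 21:32'), (3, '2016-02-08 02:11');
    """
)

# For each start row, keep the earliest end time that is later than the start.
paired = conn.execute(
    """
    SELECT a.id, a.start_time, MIN(b.end_time) AS end_time
    FROM tablea a
    LEFT JOIN tableb b
      ON b.id = a.id
     AND b.end_time > a.start_time
    GROUP BY a.id, a.start_time
    ORDER BY a.id, a.start_time
    """
).fetchall()
for row in paired:
    print(row)
```

Because the join is a LEFT JOIN, the start time without an end is kept and its end time comes back as NULL (None in Python); an INNER JOIN would drop that row instead.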
Assuming that there are no overlaps between start/end times of the same id, you could join the tables, with a join condition based on a correlated subquery that ensures that the record of tableb that has the closest end_time after the current start_time of tablea is picked:
select
    a.*,
    b.end_time
from
    tablea a
    inner join tableb b
        on  b.id = a.id
        and b.end_time = (
            select min(b1.end_time)
            from tableb b1
            where b1.id = a.id and b1.end_time > a.start_time
        )
id | start_time | end_time
1 | 2016-02-06 17:03 | 2016-02-06 18:32
1 | 2016-03-09 18:09 | 2016-03-09 21:11
2 | 2017-02-07 23:34 | 2017-02-08 01:22
3 | 2016-02-07 19:12 | 2016-02-07 21:32
3 | 2016-02-07 23:52 | 2016-02-08 02:11
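The no-overlap assumption matters, and a small sketch can show why. The example below runs the same correlated-subquery join on made-up data in which two intervals of the same id overlap (starts at 10:00 and 11:00, ends at 12:00 and 13:00). It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, and the rows are hypothetical, not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tablea (id INTEGER, start_time TEXT, PRIMARY KEY (id, start_time));
    CREATE TABLE tableb (id INTEGER, end_time   TEXT, PRIMARY KEY (id, end_time));
    -- Hypothetical overlapping intervals for id 1: 10:00-13:00 and 11:00-12:00
    INSERT INTO tablea VALUES (1, '2020-01-01 10:00'), (1, '2020-01-01 11:00');
    INSERT INTO tableb VALUES (1, '2020-01-01 12:00'), (1, '2020-01-01 13:00');
    """
)

overlap_rows = conn.execute(
    """
    SELECT a.id, a.start_time, b.end_time
    FROM tablea a
    INNER JOIN tableb b
        ON  b.id = a.id
        AND b.end_time = (
            SELECT MIN(b1.end_time)
            FROM tableb b1
            WHERE b1.id = a.id AND b1.end_time > a.start_time
        )
    ORDER BY a.id, a.start_time
    """
).fetchall()
for row in overlap_rows:
    print(row)
```

Both start times are paired with the 12:00 end time and the 13:00 end time is never used, so when intervals of the same id can overlap, the nearest-later-end rule alone cannot recover the original pairs.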