I could do this rather easily in Python (or any other language), but I'm trying to see if this is possible with pure T-SQL.
I have two tables:
Table A has a bunch of general data and timestamps with each row
+------+------+------+-----------+
| Col1 | Col2 | Col3 | Timestamp |
+------+------+------+-----------+
| A | B | C | 17:00 |
| D | E | F | 18:00 |
| G | H | I | 23:00 |
+------+------+------+-----------+
Table B is considered metadata
+-------+-----------+
| RunNo | Timestamp |
+-------+-----------+
| 1 | 16:50 |
| 2 | 17:30 |
| 3 | 18:00 |
| 4 | 19:00 |
+-------+-----------+
So the general data is referenced to a "RunNo". The timestamp in table B is just when that "Run" was created in the DB. You can match the General data to its proper run number by comparing the timestamps. For example the timestamp for the first row in Table A is 17:00 which is greater than 16:50 and less than 17:30, so obviously this row belongs to RunNo 1. How can I perform this query so the resulting table is
+------+------+------+-----------+-------+
| Col1 | Col2 | Col3 | Timestamp | RunNo |
+------+------+------+-----------+-------+
| A | B | C | 17:00 | 1 |
| D | E | F | 18:00 | 2 |
| G | H | I | 23:00 | 4 |
+------+------+------+-----------+-------+
I thought maybe using CASE would be helpful here, but I couldn't figure out how to put it together:
SELECT a.*,
CASE WHEN a.TIMESTAMP < b.TIMESTAMP AND a.TIMESTAMP > b.TIMESTAMP THEN b.RunNo END AS RunNo
FROM A as a, B as b
Any help would be greatly appreciated.
CASE allows you to return different values (i.e. columns or expressions) based on a condition. This is not what you want here. You want to join the tables and filter matching rows based on a condition.
I have replaced the name Timestamp with ts because, even escaped, I had difficulties with it on SQL Fiddle. It is a reserved keyword.
SELECT A.Col1, A.Col2, A.Col3, A.ts, MAX(B.RunNo) AS RunNo
FROM
A
INNER JOIN B
ON A.ts > B.ts
GROUP BY A.Col1, A.Col2, A.Col3, A.ts
With A.ts > B.ts this returns RunNo 2 for the second entry. With A.ts >= B.ts this returns RunNo 3 for the second entry.
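To try the query above without creating permanent tables, here is a minimal self-contained sketch. The table variables @A and @B, the time(0) column type, and the column name ts are assumptions made for illustration; the sample rows come from the question. It also assumes RunNo increases along with the run timestamp, which is what makes MAX(RunNo) pick the latest earlier run.

```sql
-- Sample data from the question. @A, @B and the time(0) type are
-- illustrative assumptions, not the asker's real schema.
DECLARE @A TABLE (Col1 char(1), Col2 char(1), Col3 char(1), ts time(0));
DECLARE @B TABLE (RunNo int, ts time(0));

INSERT INTO @A (Col1, Col2, Col3, ts) VALUES
    ('A', 'B', 'C', '17:00'),
    ('D', 'E', 'F', '18:00'),
    ('G', 'H', 'I', '23:00');

INSERT INTO @B (RunNo, ts) VALUES
    (1, '16:50'), (2, '17:30'), (3, '18:00'), (4, '19:00');

-- Join every data row to all runs created strictly before it,
-- then keep the highest (latest) run number per data row.
SELECT a.Col1, a.Col2, a.Col3, a.ts, MAX(b.RunNo) AS RunNo
FROM @A AS a
INNER JOIN @B AS b
    ON a.ts > b.ts   -- change to >= to count a run created at the same instant
GROUP BY a.Col1, a.Col2, a.Col3, a.ts
ORDER BY a.ts;
```

With the strict comparison this should assign the runs 1, 2 and 4 to the three rows, which matches the result table in the question. A data row that is older than every run would drop out of the result because of the INNER JOIN.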
with TableA as (
Select [Col1] = 'A',[Col2] = 'B',[Col3] = 'C',[Timestamp] = '17:00'
Union all Select [Col1] = 'D',[Col2] = 'E',[Col3] = 'F',[Timestamp] = '18:00'
Union all Select [Col1] = 'G',[Col2] = 'H',[Col3] = 'I',[Timestamp] = '23:00'
)
, TableB as (
Select [RunNo] = '1',[Timestamp] = '16:50'
Union all Select [RunNo] = '2',[Timestamp] = '17:30'
Union all Select [RunNo] = '3',[Timestamp] = '18:00'
Union all Select [RunNo] = '4',[Timestamp] = '19:00'
)
, TableBWithRowNumber as (
select b.RunNo, ROW_NUMBER() over (order by b.timestamp asc) as number, cast(b.Timestamp as time) as timestamp
from TableB b
)
, TableBWithNextRun as (
select b1.RunNo, startTime = b1.timestamp , endTime = b2.timestamp
from TableBWithRowNumber b1
left join TableBWithRowNumber b2 on b1.number + 1= b2.number
)
select *
from TableA a
inner join TableBWithNextRun B
on a.Timestamp >= b.startTime and (a.Timestamp < b.endTime or b.endTime is null)
This converts your timestamps to time. I wasn't sure what your data type is internally.
This outputs the following
Col1 Col2 Col3 Timestamp RunNo startTime endTime
A B C 17:00 1 16:50:00.0000000 17:30:00.0000000
D E F 18:00 3 18:00:00.0000000 19:00:00.0000000
G H I 23:00 4 19:00:00.0000000 NULL
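You can use the LEAD function to get the next run's timestamp (LAG would give you the prior one) and then just join.
WITH Runs AS
(
SELECT
RunNo,
[Timestamp] AS Start_TS,
-- LEAD returns the next run's timestamp; it is NULL for the last run
LEAD([Timestamp]) OVER (ORDER BY [Timestamp]) AS End_TS
FROM TableB
)
SELECT B.RunNo, A.*
FROM TableA A
-- the last run has no end, so it matches every later row
JOIN Runs B ON A.[Timestamp] >= B.Start_TS AND (A.[Timestamp] < B.End_TS OR B.End_TS IS NULL)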
This should be faster than any group by solution on larger datasets.