How can I select the nearest value less-than and greater-than a given value efficiently?

I have two tables, one for values and one for locations, and I am trying to interpolate the location for each value. The tables have been simplified to the following:

CREATE TABLE value(
    Timestamp DATETIME2,
    Value float NOT NULL,
    PRIMARY KEY(Timestamp)
);

CREATE TABLE location(
    Timestamp DATETIME2,
    Position INT NOT NULL,
    PRIMARY KEY(Timestamp)
); 

INSERT INTO value VALUES 
    ('2011/12/1 16:55:01', 1),
    ('2011/12/1 16:55:02', 5),
    ('2011/12/1 16:55:05', 10),
    ('2011/12/1 16:55:08', 6);

INSERT INTO location VALUES 
    ('2011/12/1 16:55:00', 0),
    ('2011/12/1 16:55:05', 10),
    ('2011/12/1 16:55:10', 5);

The expected results would be:

TimeStamp, Value, LowerTime, LowerLocation, UpperTime, UpperLocation
2011-12-01 16:55:01,  1, 2011-12-01 16:55:00,  0, 2011-12-01 16:55:05, 10
2011-12-01 16:55:02,  5, 2011-12-01 16:55:00,  0, 2011-12-01 16:55:05, 10
2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10
2011-12-01 16:55:08,  6, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:10,  5

(Keep in mind that this is simplified sample data, meant only to convey the idea of the query I am trying to run.)

To do the interpolation, I need to find the times and locations immediately before and after a given value's time. I am currently doing this with a query that looks like:

SELECT 
    V.Timestamp, 
    V.Value, 
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
 FROM 
    dbo.value V 

This works, but it runs four correlated subqueries against location for every row in value, which is a lot of work. I think there must be a simplification that I'm missing, but I've been playing with it all morning and haven't come up with anything concrete. I'm hoping someone here has a better idea.

I am currently exploring whether there is a way to work out LowerTime and UpperTime first and then use them to look up the locations. Something like:

SELECT 
    V.Timestamp, 
    V.Value, 
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT Position FROM dbo.location WHERE Timestamp = LowerTime) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT Position FROM dbo.location WHERE Timestamp = UpperTime) as UpperLocation
 FROM 
    dbo.value V 

but this doesn't work.
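
One likely reason for the failure: in T-SQL, a column alias such as LowerTime cannot be referenced by another expression in the same SELECT list, so the inner subqueries cannot see it. A minimal sketch of how the same idea could be expressed is to compute the two timestamps in a derived table first and then join back to location on them. The aliases B, LL and UL are hypothetical names introduced only for this illustration.

```sql
-- Sketch only: compute LowerTime/UpperTime in a derived table (B),
-- then look up the matching positions with joins on the primary key.
SELECT
    B.Timestamp,
    B.Value,
    B.LowerTime,
    LL.Position AS LowerLocation,
    B.UpperTime,
    UL.Position AS UpperLocation
FROM (
    SELECT
        V.Timestamp,
        V.Value,
        (SELECT MAX(L.Timestamp) FROM dbo.location L WHERE L.Timestamp <= V.Timestamp) AS LowerTime,
        (SELECT MIN(L.Timestamp) FROM dbo.location L WHERE L.Timestamp >= V.Timestamp) AS UpperTime
    FROM dbo.value V
) AS B
-- LEFT JOIN keeps values that have no earlier or no later location (NULLs).
LEFT JOIN dbo.location LL ON LL.Timestamp = B.LowerTime
LEFT JOIN dbo.location UL ON UL.Timestamp = B.UpperTime;
```

On the sample data this should return the expected results listed earlier. Whether it is any faster than the four-subquery version would need to be measured on the real tables.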

EDIT1: Updated query as suggested. However no visible change in execution time.

EDIT2: Added my thoughts of the approach I am currently trying.

For simplicity, you can at least use the MAX() and MIN() functions to query the timestamp field instead of TOP 1 and ORDER BY.

The full query would be:

SELECT 
    V.Timestamp, 
    V.Value, 
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
 FROM 
    dbo.value V 

This might do the trick (although I think the join looks quite ugly):

;with OrderedLocations as (
    select
        v.Timestamp,
        v.Value,
        l.Timestamp as tsl,
        l.Position,
        ROW_NUMBER() OVER (PARTITION BY v.Timestamp ORDER BY CASE WHEN l.Timestamp <= v.Timestamp THEN l.Timestamp ELSE '00010101' END desc) as PrevRN,
        ROW_NUMBER() OVER (PARTITION BY v.Timestamp ORDER BY CASE WHEN l.Timestamp >= v.Timestamp THEN l.Timestamp ELSE '99991231' END asc) as NextRN
    from
        value v
            cross join
        location l
)
select
    ol1.Timestamp,
    ol1.Value,
    ol1.tsl as LowerTime,
    ol1.Position as LowerLocation,
    ol2.tsl as UpperTime,
    ol2.Position as UpperLocation
from
    OrderedLocations ol1
        inner join
    OrderedLocations ol2
        on
            ol1.Timestamp = ol2.Timestamp and
            ol1.Value = ol2.Value
where
    ol1.PrevRN = 1 and
    ol2.NextRN = 1

Unfortunately, as with most efficiency/performance questions, the answer tends to be: try lots of different combinations with your actual tables and data, and measure how each one performs.
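
One way to do that measuring in SQL Server is to switch on the session statistics before running each candidate, then compare the logical reads and CPU/elapsed times reported in the Messages output. A small sketch, using the original subquery version as the candidate:

```sql
-- Report I/O and timing figures for every statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate query under test (swap in each variant in turn).
SELECT
    V.Timestamp,
    V.Value,
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) AS LowerTime,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) AS UpperTime
FROM dbo.value V;

-- Switch the reporting off again.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the actual execution plans side by side is a useful complement, since the tiny sample tables here say little about how each variant scales.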


An alternative (avoiding the join) using the same CTE as above would be:

SELECT Timestamp, Value,
    MAX(CASE WHEN PrevRN = 1 THEN tsl END) AS LowerTime,
    MAX(CASE WHEN PrevRN = 1 THEN Position END) AS LowerLocation,
    MAX(CASE WHEN NextRN = 1 THEN tsl END) AS UpperTime,
    MAX(CASE WHEN NextRN = 1 THEN Position END) AS UpperLocation
FROM
    OrderedLocations
WHERE PrevRN = 1 OR NextRN = 1
GROUP BY Timestamp, Value

The CTE (OrderedLocations) constructs a rowset in which every row from location is matched to every row in value. For each resulting row, we calculate two ROW_NUMBERs. PrevRN numbers the location rows in descending timestamp order, with rows later than the value's timestamp pushed to the end by the CASE expression, so PrevRN = 1 marks the nearest location at or before the value. NextRN does the same in ascending order, with earlier rows pushed to the end, so NextRN = 1 marks the nearest location at or after the value. We then construct the final result by considering only those rows where one of those row numbers is 1.
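
Another variant that may be worth measuring avoids the cross join by using OUTER APPLY with TOP (1), so each neighbour can be found with a seek on the location primary key. This is a sketch, not something taken from the answers above. The aliases Lo and Up and the InterpolatedPosition column are hypothetical, and the arithmetic assumes plain linear interpolation with neighbours less than about 24 days apart, so that DATEDIFF in milliseconds does not overflow.

```sql
SELECT
    V.Timestamp,
    V.Value,
    Lo.Timestamp AS LowerTime,
    Lo.Position  AS LowerLocation,
    Up.Timestamp AS UpperTime,
    Up.Position  AS UpperLocation,
    CASE
        -- Exact match: both neighbours are the same row, avoid dividing by zero.
        WHEN Lo.Timestamp = Up.Timestamp THEN CAST(Lo.Position AS float)
        -- Linear interpolation between the two neighbours (NULL if one is missing).
        ELSE Lo.Position
             + (Up.Position - Lo.Position)
             * CAST(DATEDIFF(MILLISECOND, Lo.Timestamp, V.Timestamp) AS float)
             / DATEDIFF(MILLISECOND, Lo.Timestamp, Up.Timestamp)
    END AS InterpolatedPosition
FROM dbo.value V
-- Nearest location at or before the value's timestamp.
OUTER APPLY (
    SELECT TOP (1) L.Timestamp, L.Position
    FROM dbo.location L
    WHERE L.Timestamp <= V.Timestamp
    ORDER BY L.Timestamp DESC
) AS Lo
-- Nearest location at or after the value's timestamp.
OUTER APPLY (
    SELECT TOP (1) L.Timestamp, L.Position
    FROM dbo.location L
    WHERE L.Timestamp >= V.Timestamp
    ORDER BY L.Timestamp ASC
) AS Up;
```

For the sample row at 16:55:08, the neighbours are position 10 at 16:55:05 and position 5 at 16:55:10, so the interpolated position works out to 10 + (5 - 10) * 3000 / 5000 = 7.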
