How can I select the nearest value less-than and greater-than a given value efficiently?
I have two tables, one for values and one for locations, and I am trying to interpolate the location. The tables have been simplified to the following:
CREATE TABLE value(
    Timestamp DATETIME2,
    Value float NOT NULL,
    PRIMARY KEY(Timestamp)
);
CREATE TABLE location(
    Timestamp DATETIME2,
    Position INT NOT NULL,
    PRIMARY KEY(Timestamp)
);
INSERT INTO value VALUES
    ('2011/12/1 16:55:01', 1),
    ('2011/12/1 16:55:02', 5),
    ('2011/12/1 16:55:05', 10),
    ('2011/12/1 16:55:08', 6);
INSERT INTO location VALUES
    ('2011/12/1 16:55:00', 0),
    ('2011/12/1 16:55:05', 10),
    ('2011/12/1 16:55:10', 5);
The expected results would be:
TimeStamp, Value, LowerTime, LowerLocation, UpperTime, UpperLocation
2011-12-01 16:55:01, 1, 2011-12-01 16:55:00, 0, 2011-12-01 16:55:05, 10
2011-12-01 16:55:02, 5, 2011-12-01 16:55:00, 0, 2011-12-01 16:55:05, 10
2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10
2011-12-01 16:55:08, 6, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:10, 5
(Keep in mind this is simplified sample data, meant only to convey the idea of the query I am trying to perform.)
To do the interpolation, I need to find the times and locations immediately before and after a given value's time. I am currently doing this with a query that looks like:
SELECT
    V.Timestamp,
    V.Value,
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
FROM
    dbo.value V
Now this works, but it is obviously doing a lot of work. I think there must be a query simplification that I'm missing, but I've been playing with it all morning and haven't come up with anything concrete. Hoping someone here has a better idea.
I am currently exploring whether there is a way to work out the LowerTime and UpperTime first and then use those to determine the locations. Something like:
SELECT
    V.Timestamp,
    V.Value,
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT Position FROM dbo.location WHERE Timestamp = LowerTime) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT Position FROM dbo.location WHERE Timestamp = UpperTime) as UpperLocation
FROM
    dbo.value V
but this doesn't work.
EDIT1: Updated the query as suggested. However, there is no visible change in execution time.
EDIT2: Added my thoughts on the approach I am currently trying.
For simplicity, you can at least use the MAX() and MIN() functions to query the timestamp field instead of TOP 1 and ORDER BY.
The full query would be:
SELECT
    V.Timestamp,
    V.Value,
    (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
    (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
    (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
FROM
    dbo.value V
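The second query in the question fails because a column alias such as LowerTime cannot be referenced by another expression in the same SELECT list. One way to get the effect it was aiming for, fetching the time and the position of each neighbouring row in a single lookup, is OUTER APPLY. The following is only a sketch: it assumes the dbo.value and dbo.location tables from the question, the aliases lo and hi are illustrative names, and whether it actually beats the four scalar subqueries has to be measured on the real data.

```sql
-- Sketch only: assumes the dbo.value and dbo.location tables from the question.
-- lo and hi are illustrative aliases, not names from the original posts.
SELECT
    V.Timestamp,
    V.Value,
    lo.Timestamp AS LowerTime,
    lo.Position  AS LowerLocation,
    hi.Timestamp AS UpperTime,
    hi.Position  AS UpperLocation
FROM dbo.value AS V
OUTER APPLY (
    -- nearest location row at or before the value's timestamp
    SELECT TOP (1) L.Timestamp, L.Position
    FROM dbo.location AS L
    WHERE L.Timestamp <= V.Timestamp
    ORDER BY L.Timestamp DESC
) AS lo
OUTER APPLY (
    -- nearest location row at or after the value's timestamp
    SELECT TOP (1) L.Timestamp, L.Position
    FROM dbo.location AS L
    WHERE L.Timestamp >= V.Timestamp
    ORDER BY L.Timestamp ASC
) AS hi;
```

OUTER APPLY (rather than CROSS APPLY) keeps a value row even when no location exists on one side, returning NULLs for that side, which matches the behaviour of the scalar subqueries.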
This might do the trick (although I think the join looks quite ugly):
;with OrderedLocations as (
    select
        v.Timestamp,
        v.Value,
        l.Timestamp as tsl,
        l.Position,
        -- rows after v.Timestamp get a sentinel key so they sort last
        ROW_NUMBER() OVER (PARTITION BY v.Timestamp ORDER BY CASE WHEN l.Timestamp <= v.Timestamp THEN l.Timestamp ELSE '00010101' END desc) as PrevRN,
        -- rows before v.Timestamp get a sentinel key so they sort last
        ROW_NUMBER() OVER (PARTITION BY v.Timestamp ORDER BY CASE WHEN l.Timestamp >= v.Timestamp THEN l.Timestamp ELSE '99991231' END asc) as NextRN
    from
        value v
            cross join
        location l
)
select
    ol1.Timestamp,
    ol1.Value,
    ol1.tsl as LowerTime,
    ol1.Position as LowerLocation,
    ol2.tsl as UpperTime,
    ol2.Position as UpperLocation
from
    OrderedLocations ol1
        inner join
    OrderedLocations ol2
        on
            ol1.Timestamp = ol2.Timestamp and
            ol1.Value = ol2.Value
where
    ol1.PrevRN = 1 and
    ol2.NextRN = 1
Unfortunately, as with most efficiency/performance questions, the answer tends to be: try lots of different combinations with your actual tables and data, and measure how each one performs.
An alternative (avoiding the join) using the same CTE as above would be:
SELECT Timestamp, Value,
    MAX(CASE WHEN PrevRN=1 THEN tsl END) as LowerTime,
    MAX(CASE WHEN PrevRN=1 THEN Position END) as LowerLocation,
    MAX(CASE WHEN NextRN=1 THEN tsl END) as UpperTime,
    MAX(CASE WHEN NextRN=1 THEN Position END) as UpperLocation
FROM
    OrderedLocations
where PrevRN=1 or NextRN=1
group by Timestamp, Value
The CTE (OrderedLocations) constructs a rowset in which every row from location is matched to every row in value. For each resulting row, we calculate two ROW_NUMBERs: one (PrevRN) that numbers the rows with a lower or equal timestamp in descending order, and another (NextRN) that numbers the rows with a greater or equal timestamp in ascending order. Rows on the wrong side of the comparison are given a sentinel sort key so that they are numbered last. We then construct our final result by considering only those rows where one of those row numbers is 1.
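Once the bracketing rows are known, the interpolation the question is ultimately after could be computed along these lines. This is a sketch under assumptions: it takes the interpolation to be linear in time, it wraps the question's own working query in a CTE, and the names Bracketed and InterpolatedPosition are illustrative rather than taken from the original posts.

```sql
-- Sketch only: linear interpolation of Position at each value's Timestamp.
-- Bracketed and InterpolatedPosition are illustrative names.
WITH Bracketed AS (
    -- the question's working query, unchanged
    SELECT
        V.Timestamp,
        V.Value,
        (SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) AS LowerTime,
        (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY Timestamp DESC) AS LowerLocation,
        (SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) AS UpperTime,
        (SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY Timestamp ASC) AS UpperLocation
    FROM dbo.value V
)
SELECT
    Timestamp,
    Value,
    CASE
        -- no location on one side: nothing to interpolate between
        WHEN LowerTime IS NULL OR UpperTime IS NULL THEN NULL
        -- exact match: avoids dividing by a zero-length interval
        WHEN LowerTime = UpperTime THEN CAST(LowerLocation AS float)
        ELSE LowerLocation
             + (UpperLocation - LowerLocation)
               * CAST(DATEDIFF(MILLISECOND, LowerTime, Timestamp) AS float)
               / DATEDIFF(MILLISECOND, LowerTime, UpperTime)
    END AS InterpolatedPosition
FROM Bracketed;
```

With the sample data, this should give interpolated positions of 2, 4, 10 and 7 for the four value rows. DATEDIFF returns an int, so with MILLISECOND it overflows once two timestamps are more than roughly 24 days apart; in that case a coarser unit, or DATEDIFF_BIG on SQL Server 2016 and later, would be needed.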