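Efficient SQL query to find gap in consecutive numeric data (MySQL)
I have a table with a "time" column (INT unsigned) in which each row represents one second, and I need to find the gaps (missing seconds) in time.
I tried this query (it finds the first time before a gap):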
SELECT t1.time
FROM `table` AS t1
LEFT JOIN `table` AS t2 ON t2.time=(t1.time+1)
WHERE t2.time IS NULL
ORDER BY TIME ASC
LIMIT 1
It works, but it is too slow on a large table (nearly 100M rows).
Is there a faster solution?
SHOW CREATE TABLE:
CREATE TABLE `candles` (
`time` int(10) unsigned NOT NULL,
`open` float unsigned NOT NULL,
`high` float unsigned NOT NULL,
`low` float unsigned NOT NULL,
`close` float unsigned NOT NULL,
`vb` int(10) unsigned NOT NULL,
`vs` int(10) unsigned NOT NULL,
`trades` int(10) unsigned NOT NULL,
PRIMARY KEY (`time`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
If your database version is 8.0, you can use a recursive common table expression, for example:
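-- For ranges longer than 1000 values, raise cte_max_recursion_depth first (default 1000).
WITH RECURSIVE cte AS
(
-- Start at the first stored second rather than 1, so only the stored range is generated.
SELECT MIN(time) AS n FROM tab
UNION ALL
SELECT n + 1
FROM cte
WHERE cte.n < (SELECT MAX(time) FROM tab)
)
SELECT n AS gaps
FROM cte
LEFT JOIN tab
ON n = time
WHERE time IS NULL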
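To try the generate-a-series-then-anti-join idea on a small data set, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL 8.0, which is an assumption made only so the example runs anywhere. The function name missing_seconds and the single-column tab table are hypothetical. The series starts at MIN(time), and the input is assumed to be non-empty with distinct values.

```python
import sqlite3


def missing_seconds(times):
    """Return every value between min(times) and max(times) that is not stored."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical one-column table; real data would carry more columns.
        conn.execute("CREATE TABLE tab (time INTEGER PRIMARY KEY)")
        conn.executemany("INSERT INTO tab (time) VALUES (?)", [(t,) for t in times])
        rows = conn.execute(
            """
            WITH RECURSIVE cte(n) AS (
                SELECT MIN(time) FROM tab          -- first stored second
                UNION ALL
                SELECT n + 1 FROM cte
                WHERE n < (SELECT MAX(time) FROM tab)
            )
            SELECT cte.n
            FROM cte
            LEFT JOIN tab ON tab.time = cte.n      -- anti-join: keep unmatched n
            WHERE tab.time IS NULL
            ORDER BY cte.n
            """
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

Called with the times 10, 11, 13, 14, 17, it returns [12, 15, 16], that is, each missing second rather than only the boundaries of each gap.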
In MySQL 5.7, this is a use case where user variables may come in handy:
select max(time)
from (
select t.time, @rn := @rn + 1 as rn
from (select time from mytable order by time) t
cross join (select @rn := 0) r
) t
group by time - rn
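This treats the question as a gaps-and-islands problem. The idea is to identify groups of records whose times increase without gaps (islands). To do that, we assign each row an incrementing id, ordered by time; whenever the difference between time and that incrementing id changes, you know there is a gap. The query returns the last time of each island, so every value except the overall maximum marks the second just before a gap.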
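The "time minus row number" trick is easier to see outside SQL. Below is a minimal sketch in plain Python, not the query itself; the function name islands is hypothetical and the input is assumed to contain distinct integers.

```python
from itertools import groupby


def islands(times):
    """Group distinct integer times into runs of consecutive values.

    Returns a list of (first, last) pairs, one per island.
    """
    ordered = sorted(times)
    # Pair every time with its 1-based row number, like @rn in the query.
    numbered = enumerate(ordered, start=1)
    result = []
    # time - rn stays constant inside an island and jumps at every gap.
    for _, group in groupby(numbered, key=lambda pair: pair[1] - pair[0]):
        run = [t for _, t in group]
        result.append((run[0], run[-1]))
    return result
```

For the times 10, 11, 13, 14, 17 the differences are 9, 9, 10, 10, 12, so the result is [(10, 11), (13, 14), (17, 17)]. The second element of each pair is what max(time) returns per group in the query.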
For MySQL 8, you can use LEAD():
select time from (
select time, lead(time, 1) over (order by time) next_time
from `table`
) t
where time+1 != next_time
In earlier versions, I would probably do something like this:
select prev_time as time from (
select @prev_time+0 as prev_time,if(@prev_time:=time,time,time) as time
from (select @prev_time:=null) initvars
cross join (select time from `table` order by time) t
) t
where time != prev_time+1
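Neither of these will include the maximum time, which your original query does include.
I think the GROUP BY needed to treat this as a strict gaps-and-islands problem would be too expensive with that many records.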
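To check the LEAD() behaviour on a few rows, including the point about the maximum time, here is a minimal sketch in Python. It again assumes the standard-library sqlite3 module as a stand-in for MySQL 8 (window functions need SQLite 3.25 or later). The function name times_before_gaps is hypothetical, and candles is reduced to its time column.

```python
import sqlite3


def times_before_gaps(times):
    """Return (time, next_time) pairs where next_time is not time + 1."""
    conn = sqlite3.connect(":memory:")
    try:
        # Reduced, hypothetical version of the candles table: time column only.
        conn.execute("CREATE TABLE candles (time INTEGER PRIMARY KEY)")
        conn.executemany("INSERT INTO candles (time) VALUES (?)", [(t,) for t in times])
        rows = conn.execute(
            """
            SELECT time, next_time
            FROM (
                SELECT time,
                       LEAD(time, 1) OVER (ORDER BY time) AS next_time
                FROM candles
            ) AS t
            -- For the last row next_time is NULL, the comparison yields NULL,
            -- and the row is filtered out.
            WHERE time + 1 != next_time
            ORDER BY time
            """
        ).fetchall()
    finally:
        conn.close()
    return rows
```

For the times 10, 11, 13, 14, 17 it returns [(11, 13), (14, 17)]. The maximum, 17, is absent because its next_time is NULL.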