Time interval SQL query with MySQL

I've got a table in a database that stores log data by time. For one day there can be a million rows in the db. The times are not at any regular interval. The table has several indexes, including one on the time. What I want to do is build a query that will return a set of rows with one row per time interval. For example, I could do a query to return 1 row every 15 minutes for a day. This would return 24*4=96 rows. Each row returned would actually be the nearest row in the db prior to the interval requested (since the times in the database will not line up exactly with the requested interval).

I am at a loss for how to do it. I can't just query all rows for a particular set of indexes and time interval, as it would load more than a gigabyte of data into memory, which is too slow. Is there any efficient way to do this using SQL? I'm using a MySQL database. I would be open to changing the table indexes, etc.

TIME

11:58
12:03
12:07
12:09
12:22
12:27
12:33
12:38
12:43
12:49
12:55

If I wanted to query this for a 15 minute interval from 12:00 to 1:00, I'd get back:

11:58 (nearest 12:00)
12:09 (nearest 12:15)
12:27 (nearest 12:30)
12:43 (nearest 12:45)
12:55 (nearest 1:00) 

If it makes it any easier, I can also store the time as a number (i.e. ms since 1970). In the above query, this would then be an interval of 900000 ms.

So, I had thought something like:

SELECT 
  MIN(timeValue)
FROM e
GROUP BY (to_seconds(timeValue) - (to_seconds(timeValue) % (60 * 5)))

...would do it for you, but this only returns the MIN(timeValue) over the whole table. It works if the seconds value, rounded down to a multiple of 5 minutes, is stored in its own column.

See SQL Fiddle

Edit per Andiry, this works: ( http://sqlfiddle.com/#!2/bb870/6 )

SELECT MIN(t)
FROM e
GROUP BY to_seconds(t) DIV (60 * 5)

But this just gives one row: ( http://sqlfiddle.com/#!2/bb870/7 )

SELECT MIN(t)
FROM e
GROUP BY to_seconds(t) - (to_seconds(t) % (60 * 5))

Anyone know why?
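For reference, the bucketing that `GROUP BY to_seconds(t) DIV (60 * 5)` with `MIN(t)` performs can be mimicked outside the database. The following is a minimal Python sketch, not MySQL: the helper names `to_seconds` and `min_per_bucket` are hypothetical, and times are treated as seconds since midnight rather than MySQL's `TO_SECONDS` value. It uses the sample times from the question.

```python
def to_seconds(hhmm):
    """Convert 'HH:MM' to seconds since midnight (a stand-in for TO_SECONDS)."""
    hours, mins = hhmm.split(":")
    return int(hours) * 3600 + int(mins) * 60


def min_per_bucket(times, bucket_seconds=300):
    """Return the earliest time in each non-empty bucket, in bucket order.

    The bucket key is integer division of the seconds value by the bucket
    width, which is what `to_seconds(t) DIV (60 * 5)` computes.
    """
    buckets = {}
    for t in times:
        key = to_seconds(t) // bucket_seconds
        if key not in buckets or to_seconds(t) < to_seconds(buckets[key]):
            buckets[key] = t
    return [buckets[key] for key in sorted(buckets)]


SAMPLE = ["11:58", "12:03", "12:07", "12:09", "12:22", "12:27",
          "12:33", "12:38", "12:43", "12:49", "12:55"]

five_minute = min_per_bucket(SAMPLE)          # 5-minute buckets
fifteen_minute = min_per_bucket(SAMPLE, 900)  # 15-minute buckets
print(five_minute)
print(fifteen_minute)
```

With 5-minute buckets, 12:09 is dropped because 12:07 falls in the same bucket. Note that this kind of grouping returns the first row inside each bucket and skips empty buckets, which is not the same as the nearest row before each boundary that the question asks for.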

I can't think of a good way to do it all in one query. Perhaps someone else can think of a better way, but perhaps you could use something like this:

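// Build one "latest row before the boundary" query per 15-minute boundary.
$startTime = mktime(12, 0);
$endTime = mktime(13, 0);
$queries = array();
for ($i = $startTime; $i <= $endTime; $i += 900) { // 900 s = 15 minutes
    $queries[] = "SELECT MAX(timeValue) FROM table1 WHERE timeValue < '". date("G:i", $i) ."'";
}

// Join the per-boundary queries into a single statement.
$query = implode("\nUNION\n", $queries);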

I just realized that this assumes that you are using PHP. If you are not, then just use the resulting query, which will look like:

SELECT MAX(timeValue) FROM table1 WHERE timeValue < '12:00'
UNION
SELECT MAX(timeValue) FROM table1 WHERE timeValue < '12:15'
UNION
SELECT MAX(timeValue) FROM table1 WHERE timeValue < '12:30'
UNION
SELECT MAX(timeValue) FROM table1 WHERE timeValue < '12:45'
UNION
SELECT MAX(timeValue) FROM table1 WHERE timeValue < '13:00'

Not sure if the < comparison will work 100% correctly with these string values, but I definitely think it would be a good idea to switch them to Unix timestamps (or ms since 1970, if you need that much granularity). I have found it's always easier to work with integer values for date/time instead of strings.
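To illustrate the integer-timestamp variant of this per-boundary lookup, here is a minimal Python sketch. It rests on several assumptions: SQLite (via the standard `sqlite3` module) stands in for MySQL, the table `log`, the column `ts`, and the helper names are hypothetical, and seconds since midnight stand in for real Unix timestamps. Each boundary runs one `MAX(ts) ... WHERE ts < ?` query, which an index on `ts` can typically answer without scanning the rows in between.

```python
import sqlite3


def minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def nearest_prior_rows(conn, start, end, step):
    """For each boundary from start to end (inclusive), return a tuple of
    (boundary, latest ts strictly before it). The second item is None
    when no row precedes the boundary."""
    results = []
    for boundary in range(start, end + 1, step):
        row = conn.execute(
            "SELECT MAX(ts) FROM log WHERE ts < ?", (boundary,)
        ).fetchone()
        results.append((boundary, row[0]))
    return results


# Hypothetical schema: integer timestamps with an index on the time column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (ts INTEGER NOT NULL)")
conn.execute("CREATE INDEX idx_log_ts ON log (ts)")

sample = ["11:58", "12:03", "12:07", "12:09", "12:22", "12:27",
          "12:33", "12:38", "12:43", "12:49", "12:55"]
conn.executemany("INSERT INTO log (ts) VALUES (?)",
                 [(minutes(t) * 60,) for t in sample])

# 15-minute boundaries (900 s) from 12:00 to 13:00.
result = nearest_prior_rows(conn, minutes("12:00") * 60,
                            minutes("13:00") * 60, 900)
for boundary, ts in result:
    print(boundary, ts)
```

Running one small query per boundary keeps the result at one row per interval (96 for a day at 15 minutes) instead of pulling every log row into memory.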

I think using functions is pretty easy and I haven't noticed big performance implications, although a cursor would probably perform better depending on how many rows there are between times.

CREATE TABLE TEST_TIMES (EventTime datetime)
-- skipping INSERTS of your times

CREATE FUNCTION fn_MyTimes ( @StartTime datetime, @EndTime datetime, @Minutes int )
    RETURNS @TimeTable TABLE (TimeValue datetime)
AS BEGIN
    DECLARE @CurrentTime datetime
    SET @CurrentTime = @StartTime
    WHILE @CurrentTime <= @EndTime
    BEGIN
        INSERT INTO @TimeTable VALUES (@CurrentTime)
        SET @CurrentTime = DATEADD(minute, @Minutes, @CurrentTime)
    END
    RETURN
END

CREATE FUNCTION fn_ClosestTime ( @CheckTime datetime )
    RETURNS datetime
AS BEGIN
    DECLARE @LowerTime datetime, @HigherTime datetime

    -- latest event at or before the check time
    SELECT @LowerTime = MAX(EventTime)
    FROM TEST_TIMES
    WHERE EventTime <= @CheckTime

    -- earliest event at or after the check time (MIN, not MAX)
    SELECT @HigherTime = MIN(EventTime)
    FROM TEST_TIMES
    WHERE EventTime >= @CheckTime

    IF @LowerTime IS NULL RETURN @HigherTime -- both null?  then null
    IF @HigherTime IS NULL RETURN @LowerTime

    -- on a tie, prefer the earlier event
    IF DATEDIFF(ms, @LowerTime, @CheckTime) <= DATEDIFF(ms, @CheckTime, @HigherTime)
        RETURN @LowerTime
    RETURN @HigherTime
END

SELECT TimeValue, dbo.fn_ClosestTime(TimeValue) as ClosestTime
FROM fn_MyTimes('2012-05-17 12:00', '2012-05-17 13:00', 15)

Results:

TimeValue               ClosestTime
----------------------- -----------------------
2012-05-17 12:00:00.000 2012-05-17 11:58:00.000
2012-05-17 12:15:00.000 2012-05-17 12:09:00.000
2012-05-17 12:30:00.000 2012-05-17 12:27:00.000
2012-05-17 12:45:00.000 2012-05-17 12:43:00.000
2012-05-17 13:00:00.000 2012-05-17 12:55:00.000
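The idea behind a closest-time lookup (take the latest event at or before the check time and the earliest event at or after it, then keep whichever is nearer) can also be sketched outside the database. This is a minimal Python sketch under stated assumptions: the names `closest_time` and `minutes` are hypothetical, times are minutes since midnight, the events are held in a sorted in-memory list, and a tie goes to the earlier event.

```python
from bisect import bisect_left, bisect_right


def minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def closest_time(sorted_times, check):
    """Return the event nearest to `check`, or None if there are no events.

    sorted_times must be sorted ascending. A tie goes to the earlier event.
    """
    i = bisect_right(sorted_times, check)  # count of events <= check
    lower = sorted_times[i - 1] if i > 0 else None
    j = bisect_left(sorted_times, check)   # index of first event >= check
    higher = sorted_times[j] if j < len(sorted_times) else None

    if lower is None:
        return higher
    if higher is None:
        return lower
    return lower if check - lower <= higher - check else higher


EVENTS = sorted(minutes(t) for t in [
    "11:58", "12:03", "12:07", "12:09", "12:22", "12:27",
    "12:33", "12:38", "12:43", "12:49", "12:55"])

# 15-minute boundaries from 12:00 to 13:00.
closest = [closest_time(EVENTS, b)
           for b in range(minutes("12:00"), minutes("13:00") + 1, 15)]
print(closest)
```

For the sample data this picks 11:58, 12:09, 12:27, 12:43 and 12:55 (as minutes since midnight), matching the table above. The 12:30 boundary is a tie between 12:27 and 12:33, so the tie rule decides that row.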
