[英]MySQL group by intervals in a date range
我將繪制存儲在MySQL數據庫中的netflow數據,我需要一種有效的方法來獲取相關的數據點。 自紀元以來,它們的記錄以日期作為int存儲了幾秒鍾。 我想能夠像:
Select SUM(bytes) from table where stime > x and stime < Y
group by (10 second intervals)
反正有沒有這樣做? 或者,在python本地處理它會更快嗎? 即使是500K行表?
編輯我的錯誤,時間存儲為無符號雙精度而不是INT。 我目前正在使用GROUP BY (FLOOR(stime / I))
,其中我是所需的間隔。
您可以使用整數除法來執行此操作。 不確定性能。
讓我成為你想要的間隔秒。
SELECT SUM(bytes), ((stime - X) DIV I) as interval
FROM table
WHERE (stime > X) and (stime < Y)
GROUP BY interval
Example, let X = 1500 and I = 10
stime = 1503 -> (1503 - 1500) DIV 10 = 0
stime = 1507 -> (1507 - 1500) DIV 10 = 0
stime = 1514 -> (1514 - 1500) DIV 10 = 1
stime = 1523 -> (1523 - 1500) DIV 10 = 2
你試過以下嗎? 只需將tyiem列除以10並將結果向下舍入。
SELECT SUM(bytes)
FROM table
WHERE stime > x
AND stime < Y
GROUP BY ROUND(stime/10, -1)
我不知道ROUND()函數和函數調用分組在MySQL中工作,但上面是T-SQL。
組中的FLOOR
有時會失敗。 它有時將不同的時間分組為一個值,例如當您將值除以3但是當您除以4時它不會相同,盡管這兩個值之間的差異遠大於3或4它應該分組為兩個不同的群體。 更好地將其投射到地板上的無符號,其工作方式如下:
CAST(FLOOR(UNIX_TIMESTAMP(time_field)/I) AS UNSIGNED INT)
問題:
有時GROUP BY FLOOR(UNIX_TIMESTAMP(time_field)/3)
與GROUP BY FLOOR(UNIX_TIMESTAMP(time_field)/4)
相比給出較少的組,這在數學上是不可能的。
SELECT sec_to_time(time_to_sec(datefield)- time_to_sec(datefield)%(10)) as intervals,SUM(bytes)
FROM table
WHERE where stime > x and stime < Y
group by intervals
我使用了答案和同事的建議。 最終結果如下:
Select FROM_UNIXTIME(stime), bytes
from argusTable_2009_10_22
where stime > (UNIX_TIMESTAMP()-600)
group by floor(stime /10)
我也嘗試了舍入解決方案,但結果不一致。
機會
我之前做過這個,所以我創建了一些函數(使用sql server,但我認為它幾乎相同):
首先,我創建了一個標量函數,它根據間隔和日期部分(分鍾,小時,日,蛾,年)返回日期的ID:
CREATE FUNCTION [dbo].[GetIDDate]
(
@date datetime,
@part nvarchar(10),
@intervalle int
)
RETURNS int
AS
BEGIN
-- Declare the return variable here
DECLARE @res int
DECLARE @date_base datetime
SET @date_base = convert(datetime,'01/01/1970',103)
set @res = case @part
WHEN 'minute' THEN datediff(minute,@date_base,@date)/@intervalle
WHEN 'hour' THEN datediff(hour,@date_base,@date)/@intervalle
WHEN 'day' THEN datediff(day,@date_base,@date)/@intervalle
WHEN 'month' THEN datediff(month,@date_base,@date)/@intervalle
WHEN 'year' THEN datediff(year,@date_base,@date)/@intervalle
ELSE datediff(minute,@date_base,@date)/@intervalle END
-- Return the result of the function
RETURN @res
END
然后我創建了一個表函數,它返回一個日期范圍之外的所有id:
CREATE FUNCTION [dbo].[GetTableDate]
(
-- Add the parameters for the function here
@start_date datetime,
@end_date datetime,
@interval int,
@unite varchar(10)
)
RETURNS @res TABLE (StartDate datetime,TxtStartDate nvarchar(50),EndDate datetime,TxtEndDate nvarchar(50),IdDate int)
AS
begin
declare @current_date datetime
declare @end_date_courante datetime
declare @txt_start_date nvarchar(50)
declare @txt_end_date nvarchar(50)
set @current_date = case @unite
WHEN 'minute' THEN dateadd(minute, datediff(minute,0,@start_date),0)
WHEN 'hour' THEN dateadd(hour, datediff(hour,0,@start_date),0)
WHEN 'day' THEN dateadd(day, datediff(day,0,@start_date),0)
WHEN 'month' THEN dateadd(month, datediff(month,0,@start_date),0)
WHEN 'year' THEN dateadd(year, datediff(year,0,dateadd(year,@interval,@start_date)),0)
ELSE dateadd(minute, datediff(minute,0,@start_date),0) END
while @current_date < @end_date
begin
set @end_date_courante =
case @unite
WHEN 'minute' THEN dateadd(minute, datediff(minute,0,dateadd(minute,@interval,@current_date)),0)
WHEN 'hour' THEN dateadd(hour, datediff(hour,0,dateadd(hour,@interval,@current_date)),0)
WHEN 'day' THEN dateadd(day, datediff(day,0,dateadd(day,@interval,@current_date)),0)
WHEN 'month' THEN dateadd(month, datediff(month,0,dateadd(month,@interval,@current_date)),0)
WHEN 'year' THEN dateadd(year, datediff(year,0,dateadd(year,@interval,@current_date)),0)
ELSE dateadd(minute, datediff(minute,0,dateadd(minute,@interval,@current_date)),0) END
SET @txt_start_date = case @unite
WHEN 'minute' THEN CONVERT(VARCHAR(20), @current_date, 100)
WHEN 'hour' THEN CONVERT(VARCHAR(20), @current_date, 100)
WHEN 'day' THEN REPLACE(CONVERT(VARCHAR(11), @current_date, 106), ' ', '-')
WHEN 'month' THEN REPLACE(RIGHT(CONVERT(VARCHAR(11), @current_date, 106), 8), ' ', '-')
WHEN 'year' THEN CONVERT(VARCHAR(20), datepart(year,@current_date))
ELSE CONVERT(VARCHAR(20), @current_date, 100) END
SET @txt_end_date = case @unite
WHEN 'minute' THEN CONVERT(VARCHAR(20), @end_date_courante, 100)
WHEN 'hour' THEN CONVERT(VARCHAR(20), @end_date_courante, 100)
WHEN 'day' THEN REPLACE(CONVERT(VARCHAR(11), @end_date_courante, 106), ' ', '-')
WHEN 'month' THEN REPLACE(RIGHT(CONVERT(VARCHAR(11), @end_date_courante, 106), 8), ' ', '-')
WHEN 'year' THEN CONVERT(VARCHAR(20), datepart(year,@end_date_courante))
ELSE CONVERT(VARCHAR(20), @end_date_courante, 100) END
INSERT INTO @res (
StartDate,
EndDate,
TxtStartDate,
TxtEndDate,
IdDate) values(
@current_date,
@end_date_courante,
@txt_start_date,
@txt_end_date,
dbo.GetIDDate(@current_date,@unite,@interval)
)
set @current_date = @end_date_courante
end
return
end
因此,如果我想計算每個33分鍾間隔添加的所有用戶:
SELECT count(id_user) , timeTable.StartDate
FROM user
INNER JOIn dbo.[GetTableDate]('1970-01-01',datedate(),33,'minute') as timeTable
ON dbo.getIDDate(user.creation_date,'minute',33) = timeTable.IDDate
GROUP BY dbo.getIDDate(user.creation_date,'minute',33)ORDER BY timeTable.StartDate
:)
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.