
Grouping into 5-minute intervals within a time range

I'm having some difficulty with a MySQL query I'm trying to write.

SELECT a.timestamp, name, count(b.name) 
FROM time a, id b 
WHERE a.user = b.user
  AND a.id = b.id
  AND b.name = 'John'
  AND a.timestamp BETWEEN '2010-11-16 10:30:00' AND '2010-11-16 11:00:00' 
GROUP BY a.timestamp

This is my current output.

timestamp            name  count(b.name)
-------------------  ----  -------------
2010-11-16 10:32:22  John  2
2010-11-16 10:35:12  John  7
2010-11-16 10:36:34  John  1
2010-11-16 10:37:45  John  2
2010-11-16 10:48:26  John  8
2010-11-16 10:55:00  John  9
2010-11-16 10:58:08  John  2

How do I group these into 5-minute intervals?

I want my output to look like

timestamp            name  count(b.name)
-------------------  ----  -------------
2010-11-16 10:30:00  John  2
2010-11-16 10:35:00  John  10
2010-11-16 10:40:00  John  0
2010-11-16 10:45:00  John  8
2010-11-16 10:50:00  John  0
2010-11-16 10:55:00  John  11 

This works with any interval.

PostgreSQL

SELECT
    TIMESTAMP WITH TIME ZONE 'epoch' +
    -- floor (not round) so each row lands in the interval that contains it
    INTERVAL '1 second' * floor(extract('epoch' from timestamp) / 300) * 300 as timestamp,
    name,
    count(b.name)
FROM time a, id b
WHERE …
GROUP BY 
floor(extract('epoch' from timestamp) / 300), name


MySQL

SELECT
    -- start of the 5-minute interval each row falls into
    FROM_UNIXTIME((UNIX_TIMESTAMP(a.timestamp) DIV 300) * 300) AS interval_start,
    name,
    count(b.name)
FROM time a, id b
WHERE …
GROUP BY 
interval_start, name
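
To see what the DIV 300 arithmetic does to the sample rows from the question, here is a minimal Python sketch. It is only an illustration, not part of either query: the helper names `parse` and `bucket_start` are made up for this example, and the timestamps are assumed to be UTC.

```python
from collections import Counter
from datetime import datetime, timezone

BUCKET_SECONDS = 300  # 5 minutes


def parse(text: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS' as a UTC timestamp (UTC is an assumption)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)


def bucket_start(ts: datetime, width: int = BUCKET_SECONDS) -> datetime:
    """Floor a timestamp to the start of its bucket, like (epoch DIV width) * width."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch // width * width, tz=timezone.utc)


# (timestamp, count) pairs copied from the question's current output.
rows = [
    ("2010-11-16 10:32:22", 2),
    ("2010-11-16 10:35:12", 7),
    ("2010-11-16 10:36:34", 1),
    ("2010-11-16 10:37:45", 2),
    ("2010-11-16 10:48:26", 8),
    ("2010-11-16 10:55:00", 9),
    ("2010-11-16 10:58:08", 2),
]

# Sum the per-row counts into their 5-minute buckets.
totals = Counter()
for text, n in rows:
    totals[bucket_start(parse(text))] += n

for start in sorted(totals):
    print(start.strftime("%Y-%m-%d %H:%M:%S"), totals[start])
```

Only buckets that contain at least one row show up in `totals`; the 10:40 and 10:50 intervals from the desired output are absent, which is the same limitation the plain GROUP BY has.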

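I ran into the same problem.

I found that grouping by any minute interval is easy: divide the epoch by the interval length in seconds, then either round or use floor to drop the remainder. So for 5-minute intervals you would use 300 seconds.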

    SELECT COUNT(*) cnt, 
    to_timestamp(floor((extract('epoch' from timestamp_column) / 300 )) * 300) 
    AT TIME ZONE 'UTC' as interval_alias
    FROM TABLE_NAME GROUP BY interval_alias
interval_alias       cnt
-------------------  ----  
2010-11-16 10:30:00  2
2010-11-16 10:35:00  10
2010-11-16 10:45:00  8
2010-11-16 10:55:00  11 

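This returns the data correctly grouped by the selected minute interval; however, it does not return intervals that contain no data. To get those empty intervals we can use the function generate_series.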

    SELECT generate_series(MIN(date_trunc('hour',timestamp_column)),
    max(date_trunc('minute',timestamp_column)),'5m') as interval_alias FROM 
    TABLE_NAME

Result:

interval_alias       
-------------------    
2010-11-16 10:30:00  
2010-11-16 10:35:00
2010-11-16 10:40:00   
2010-11-16 10:45:00
2010-11-16 10:50:00   
2010-11-16 10:55:00   

Now, to get a result that also shows the intervals with zero occurrences, we just outer join the two result sets.

    SELECT series.minute as interval,  coalesce(cnt.amnt,0) as count from 
       (
       SELECT count(*) amnt,
       to_timestamp(floor((extract('epoch' from timestamp_column) / 300 )) * 300)
       AT TIME ZONE 'UTC' as interval_alias
       from TABLE_NAME  group by interval_alias
       ) cnt
    
    RIGHT JOIN 
       (    
       SELECT generate_series(min(date_trunc('hour',timestamp_column)),
       max(date_trunc('minute',timestamp_column)),'5m') as minute from TABLE_NAME 
       ) series
  on series.minute = cnt.interval_alias

The final result includes the full series of 5-minute intervals, even those that have no values.

interval             count
-------------------  ----  
2010-11-16 10:30:00  2
2010-11-16 10:35:00  10
2010-11-16 10:40:00  0
2010-11-16 10:45:00  8
2010-11-16 10:50:00  0 
2010-11-16 10:55:00  11 

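The interval can easily be changed by adjusting the last parameter of generate_series. In our case we use '5m', but it could be any interval we want (the 300 in the counting subquery has to be changed to match).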
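
The gap-filling step is not specific to generate_series. As a rough sketch of the same idea in Python (the function name `fill_gaps` and the hard-coded counts are assumptions made for this illustration), one can walk the full series of interval starts and look each one up in the grouped counts, defaulting to 0:

```python
from datetime import datetime, timedelta


def fill_gaps(counts, start, end, step):
    """Return (interval_start, count) for every step from start to end inclusive.

    Intervals missing from `counts` get 0, like COALESCE over an outer join.
    """
    result = []
    current = start
    while current <= end:
        result.append((current, counts.get(current, 0)))
        current += step
    return result


# Grouped counts as returned by the first query (empty intervals are missing).
counts = {
    datetime(2010, 11, 16, 10, 30): 2,
    datetime(2010, 11, 16, 10, 35): 10,
    datetime(2010, 11, 16, 10, 45): 8,
    datetime(2010, 11, 16, 10, 55): 11,
}

filled = fill_gaps(
    counts,
    start=datetime(2010, 11, 16, 10, 30),
    end=datetime(2010, 11, 16, 10, 55),
    step=timedelta(minutes=5),
)

for interval_start, count in filled:
    print(interval_start.strftime("%Y-%m-%d %H:%M:%S"), count)
```

Doing this in the database with the outer join keeps everything in one query; the client-side version is mainly useful where a series generator is not available.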

You should use GROUP BY UNIX_TIMESTAMP(time_stamp) DIV 300 instead of round(../300), because with rounding I found that some records were counted in two grouped result sets.
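
One way to see why rounding misplaces rows is to compute both bucketings for a single timestamp. The Python sketch below is only an illustration: the helper names are hypothetical and the timestamp is assumed to be UTC. Rounding sends a row to the nearest 5-minute boundary, while integer division keeps it in the interval that actually contains it.

```python
from datetime import datetime, timezone

WIDTH = 300  # 5 minutes in seconds


def to_epoch(text: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed UTC) to whole epoch seconds."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def floor_bucket(epoch: int) -> int:
    """Start of the interval containing epoch, like epoch DIV 300 * 300."""
    return epoch // WIDTH * WIDTH


def round_bucket(epoch: int) -> int:
    """Nearest interval boundary, like ROUND(epoch / 300) * 300.

    Tie handling differs between systems; this example avoids exact ties.
    """
    return round(epoch / WIDTH) * WIDTH


def fmt(epoch: int) -> str:
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%H:%M:%S")


ts = to_epoch("2010-11-16 10:37:45")
print(fmt(floor_bucket(ts)))  # 10:35:00, the interval the row belongs to
print(fmt(round_bucket(ts)))  # 10:40:00, the nearest boundary instead
```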

For Postgres, I ended up using the date_trunc function, like:

select name, sum(count), date_trunc('minute',timestamp) as timestamp
FROM table
WHERE xxx
GROUP BY name,date_trunc('minute',timestamp)
ORDER BY timestamp

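You can give date_trunc various resolutions, such as 'minute', 'hour', 'day', etc. Note that it truncates to whole units only, so by itself it cannot produce 5-minute buckets.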

The query will be something like:

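SELECT 
  DATE_FORMAT(
    MIN(timestamp),
    '%d/%m/%Y %H:%i:00'
  ) AS tmstamp,  -- minute of the earliest row in each interval
  name,
  COUNT(id) AS cnt 
FROM
  `table`
-- FLOOR rather than ROUND, so rows stay in the interval that contains them
GROUP BY FLOOR(UNIX_TIMESTAMP(timestamp) / 300), name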

Not sure if you still need it.

SELECT FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(timestamp) / 300) * 300) AS t,
       timestamp,
       COUNT(1) AS c
FROM users
GROUP BY t
ORDER BY t;

2016-10-29 19:35:00 | 2016-10-29 19:35:50 | 4 |
2016-10-29 19:40:00 | 2016-10-29 19:40:37 | 5 |
2016-10-29 19:45:00 | 2016-10-29 19:45:09 | 6 |
2016-10-29 19:50:00 | 2016-10-29 19:51:14 | 4 |
2016-10-29 19:55:00 | 2016-10-29 19:56:17 | 1 |

You will probably have to break the timestamp up into ymd:HM and use DIV 5 to split the minutes into 5-minute bins, something like

select year(a.timestamp), 
       month(a.timestamp), 
       day(a.timestamp),
       hour(a.timestamp), 
       minute(a.timestamp) DIV 5,
       name, 
       count(b.name)
FROM time a, id b
WHERE a.user = b.user AND a.id = b.id AND b.name = 'John' 
      AND a.timestamp BETWEEN '2010-11-16 10:30:00' AND '2010-11-16 11:00:00'
GROUP BY year(a.timestamp), 
       month(a.timestamp), 
       day(a.timestamp),
       hour(a.timestamp), 
       minute(a.timestamp) DIV 5,  -- same expression as in the SELECT list
       name

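...and then present the output however you like in client code. Or, if you prefer, you can build the whole date string with the SQL concat function instead of getting separate columns.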

select concat(year(a.timestamp), "-", month(a.timestamp), "-" ,day(a.timestamp), 
       " " , lpad(hour(a.timestamp),2,'0'), ":", 
       lpad((minute(a.timestamp) DIV 5) * 5, 2, '0'))

...and then group on that.
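
For the client-side route, a minimal Python sketch of the same component-wise bucketing might look like this. The function name `five_minute_label` is made up for the illustration; it mirrors the concat expression, including the unpadded month and day.

```python
from datetime import datetime


def five_minute_label(ts: datetime) -> str:
    """Build a 'Y-M-D HH:MM' label with the minute floored to a multiple of 5."""
    minute = ts.minute // 5 * 5  # same idea as (minute(ts) DIV 5) * 5
    return f"{ts.year}-{ts.month}-{ts.day} {ts.hour:02d}:{minute:02d}"


print(five_minute_label(datetime(2010, 11, 16, 10, 37, 45)))
print(five_minute_label(datetime(2010, 1, 5, 9, 4, 59)))
```

Because month and day are not zero-padded, these labels group correctly but do not sort chronologically as strings; padding them as well would fix that.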

How about this one:

select 
    from_unixtime(unix_timestamp(timestamp) - unix_timestamp(timestamp) mod 300) as ts,  
    sum(value)
from group_interval 
group by ts 
order by ts
;

select 
CONCAT(CAST(CREATEDATE AS DATE),' ',datepart(hour,createdate),':',ROUND(CAST((CAST((CAST(DATEPART(MINUTE,CREATEDATE) AS DECIMAL (18,4)))/5 AS INT)) AS DECIMAL (18,4))/12*60,2)) AS '5MINDATE'
,count(something)
from TABLE
group by CONCAT(CAST(CREATEDATE AS DATE),' ',datepart(hour,createdate),':',ROUND(CAST((CAST((CAST(DATEPART(MINUTE,CREATEDATE) AS DECIMAL (18,4)))/5 AS INT)) AS DECIMAL (18,4))/12*60,2))

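This should help with what you want.

Replace dt with your datetime column, c with your call field, and astro_transit1 with your table. 300 corresponds to 5 minutes, so add 300 each time you want to widen the interval by another 5 minutes.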

SELECT FROM_UNIXTIME( 300 * ROUND( UNIX_TIMESTAMP( r.dt ) /300 ) ) AS 5datetime, (
SELECT r.c
FROM astro_transit1 ra
WHERE ra.dt = r.dt
ORDER BY ra.dt DESC
LIMIT 1
) AS first_val FROM astro_transit1 r GROUP BY UNIX_TIMESTAMP( r.dt )
DIV 300
LIMIT 0 , 30

Based on @boecko's answer for MySQL, I used a CTE (Common Table Expression) to speed up query execution time:

So this:

SELECT
    `timestamp`,
    `name`,
     count(b.`name`)
FROM `time` a, `id` b
WHERE …
GROUP BY 
UNIX_TIMESTAMP(`timestamp`) DIV 300, name  

becomes:

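WITH cte AS (
    SELECT
        a.`timestamp`,
        b.`name`,
        UNIX_TIMESTAMP(a.`timestamp`) DIV 300 AS `intervals`
    FROM `time` a, `id` b
    WHERE …
)
-- aggregate in the outer query so there is one row per interval and name
SELECT MIN(`timestamp`) AS `timestamp`, `name`, count(`name`)
FROM cte
GROUP BY `intervals`, `name`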

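On a very large dataset, this made the query more than 10 times faster for me!

Since timestamp and time are keywords in MySQL, don't forget to wrap each table and column name in `...`!

Hope it helps some of you.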

The query I found that is probably the correct one for MySQL is the following:

SELECT SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,  
                                 '%Y-%m-%d %H:%i:%S' ) , 1, 19 ) AS ts_CEILING,
SUM(value)
FROM group_interval
GROUP BY SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,  
                                   '%Y-%m-%d %H:%i:%S' ) , 1, 19 )
ORDER BY SUBSTRING( FROM_UNIXTIME( CEILING( timestamp /300 ) *300,  
                                   '%Y-%m-%d %H:%i:%S' ) , 1, 19 ) DESC

Let me know what you think.
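
One thing worth noting about the CEILING variant is that it labels each interval by its end rather than its start. The Python sketch below illustrates the difference using seconds since midnight instead of real epoch values; the helper names are hypothetical, and it assumes the column holds integer seconds, as the query above appears to.

```python
WIDTH = 300  # 5 minutes in seconds


def hms(hours: int, minutes: int, seconds: int) -> int:
    """Seconds since midnight, standing in for an epoch value in this sketch."""
    return hours * 3600 + minutes * 60 + seconds


def floor_bucket(value: int) -> int:
    """Start of the interval containing value, like FLOOR(value / 300) * 300."""
    return value // WIDTH * WIDTH


def ceil_bucket(value: int) -> int:
    """End of the interval containing value, like CEILING(value / 300) * 300."""
    return -(-value // WIDTH) * WIDTH


inside = hms(10, 32, 22)
print(floor_bucket(inside) == hms(10, 30, 0))  # labelled by interval start
print(ceil_bucket(inside) == hms(10, 35, 0))   # labelled by interval end

boundary = hms(10, 55, 0)
print(floor_bucket(boundary) == ceil_bucket(boundary))  # exact boundaries are unchanged
```

So with CEILING, a row at 10:32:22 is reported under 10:35:00, while a row exactly at 10:55:00 stays at 10:55:00; whether that labelling is wanted depends on how the result is read.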
