How can I optimize the following MySQL query that computes concurrent calls per second?

Question

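The following query reads data from the DB1.Data table. It works correctly, but it is very slow. The result is the number of concurrent calls, derived from CDR information.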

MySQL query

select sql_calc_found_rows H,M,S,(TCNT+ADCNT) as CNT from
(
select H,M,S,sum(CNT) as TCNT,
(
select 
count(id) as CNT
from DB1.Data force index (datetimeOrgination)  where 1=1 and 
(datetimeOrgination<UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))  and (datetimeOrgination+callDuration)>UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))) 
  and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))   
) as ADCNT 
 from 
(
(select 
hour(from_unixtime(datetimeOrgination)) as H,
minute(from_unixtime(datetimeOrgination)) as M,
second(from_unixtime(datetimeOrgination)) as S,
count(id) as CNT  
from DB1.Data where 1=1  and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))    
group by hour(from_unixtime(datetimeOrgination)),minute(from_unixtime(datetimeOrgination)),second(from_unixtime(datetimeOrgination)))

Union  all

(select 
hour(from_unixtime(datetimeOrgination+callDuration)) as H,
minute(from_unixtime(datetimeOrgination+callDuration)) as M,
second(from_unixtime(datetimeOrgination+callDuration)) as S,
count(id) as CNT 
from DB1.Data  force index (datetimeOrgination) where 1=1 and  
(second(from_unixtime(datetimeOrgination+callDuration))>second(from_unixtime(datetimeOrgination)))   and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))    
group by hour(from_unixtime(datetimeOrgination+callDuration)),minute(from_unixtime(datetimeOrgination+callDuration)),second(from_unixtime(datetimeOrgination+callDuration)))
) as T1  group by H,M,S
) as T2;

Here is the EXPLAIN output

Here is the query output in JSON format:

{
"meta": {
    "count": 18,
    "totalCount": 18
},
"calls": [{
    "H": 10,
    "M": 30,
    "S": 44,
    "CNT": 1
}, {
    "H": 11,
    "M": 27,
    "S": 1,
    "CNT": 1
}, {
    "H": 11,
    "M": 28,
    "S": 44,
    "CNT": 1
}, {
    "H": 12,
    "M": 23,
    "S": 52,
    "CNT": 1
}, {
    "H": 12,
    "M": 29,
    "S": 27,
    "CNT": 1
}, {
    "H": 12,
    "M": 30,
    "S": 38,
    "CNT": 1
}, {
    "H": 14,
    "M": 26,
    "S": 17,
    "CNT": 1
}, {
    "H": 14,
    "M": 26,
    "S": 44,
    "CNT": 1
}, {
    "H": 14,
    "M": 26,
    "S": 51,
    "CNT": 1
}, {
    "H": 14,
    "M": 27,
    "S": 2,
    "CNT": 1
}, {
    "H": 14,
    "M": 27,
    "S": 8,
    "CNT": 1
}, {
    "H": 14,
    "M": 40,
    "S": 27,
    "CNT": 1
}, {
    "H": 14,
    "M": 40,
    "S": 57,
    "CNT": 1
}, {
    "H": 14,
    "M": 40,
    "S": 58,
    "CNT": 1
}, {
    "H": 15,
    "M": 8,
    "S": 4,
    "CNT": 1
}, {
    "H": 15,
    "M": 8,
    "S": 31,
    "CNT": 1
}, {
    "H": 15,
    "M": 56,
    "S": 38,
    "CNT": 1
}, {
    "H": 16,
    "M": 27,
    "S": 30,
    "CNT": 1
}]

}

The first record in the result

  "H": 10,
    "M": 30,
    "S": 44,
    "CNT": 1

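shows that we had 1 concurrent call at 10:30:44.

More details

To calculate the number of concurrent calls per second, we need to count 3 types of calls for each second.

For example, to calculate the concurrent calls at 10:51:20, we need to count all of the following:

Step 1: count all calls that started at 10:51:20.

Step 2: count all calls that ended at 10:51:20 but did not start in the same second (20).

Step 3: count all calls that started before 10:51:20 and ended after 10:51:20.

Step 4: finally, add up all of these counts to get the number of concurrent calls.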

This query handles step 1:

(select 
hour(from_unixtime(datetimeOrgination)) as H,
minute(from_unixtime(datetimeOrgination)) as M,
second(from_unixtime(datetimeOrgination)) as S,
count(id) as CNT  
from DB1.Data where 1=1  and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))    
group by hour(from_unixtime(datetimeOrgination)),minute(from_unixtime(datetimeOrgination)),second(from_unixtime(datetimeOrgination)))

This query handles step 2:

(select 
hour(from_unixtime(datetimeOrgination+callDuration)) as H,
minute(from_unixtime(datetimeOrgination+callDuration)) as M,
second(from_unixtime(datetimeOrgination+callDuration)) as S,
count(id) as CNT 
from DB1.Data  force index (datetimeOrgination) where 1=1 and  
(second(from_unixtime(datetimeOrgination+callDuration))>second(from_unixtime(datetimeOrgination)))   and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))    
group by hour(from_unixtime(datetimeOrgination+callDuration)),minute(from_unixtime(datetimeOrgination+callDuration)),second(from_unixtime(datetimeOrgination+callDuration)))

This query handles step 3; it runs against the union of the first two queries:

(
select 
count(id) as CNT
from DB1.Data force index (datetimeOrgination)  where 1=1 and 
(datetimeOrgination<UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))  and (datetimeOrgination+callDuration)>UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))) 
  and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))   
) as ADCNT

This query collects the results of all these queries and returns the final result:

select sql_calc_found_rows H,M,S,(TCNT+ADCNT) as CNT from
(

As mentioned, the query works, but it is very slow and complicated, and I know it needs to be optimized and simplified.

Column types

`datetimeOrgination` BIGINT(20) NOT NULL DEFAULT
`callDuration` BIGINT(20) NOT NULL DEFAULT '0',

And the indexes

INDEX `datetimeOrgination` (`datetimeOrgination`),
INDEX `callDuration` (`callDuration`),

Answer 1

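Caveat: some of my suggestions are for clarity or simplification, not necessarily for speed.

Potential bug: and (second(from_unixtime(datetimeOrgination+callDuration)) > second(from_unixtime(datetimeOrgination))) does not make much sense. It will catch a 2-second call that starts at 11:22:00, but not one that starts at 11:21:59 (that call ends at 11:22:01, and 1 is not greater than 59). Is that really what you want? In any case, please explain what the query is supposed to do.

Don't work with H, M, S; work only with seconds, either by extracting the hh:mm:ss string from the date or by getting the time of day in seconds. Convert to H, M, S as the last step, not the first.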
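
A minimal sketch of that advice, assuming datetimeOrgination holds Unix seconds as the BIGINT column in the question suggests: group on the raw value and convert to H, M, S only in the outer SELECT. The aliases sec, started, and per_sec are illustrative names, not part of the original query.

```sql
-- Group on the raw Unix second; convert to H, M, S only at the very end.
SELECT HOUR(FROM_UNIXTIME(per_sec.sec))   AS H,
       MINUTE(FROM_UNIXTIME(per_sec.sec)) AS M,
       SECOND(FROM_UNIXTIME(per_sec.sec)) AS S,
       per_sec.started                    AS CNT
FROM (
    SELECT datetimeOrgination AS sec,      -- hypothetical alias
           COUNT(*)           AS started   -- calls that started in this second
    FROM DB1.Data
    WHERE datetimeOrgination >= UNIX_TIMESTAMP('2018-02-09')
      AND datetimeOrgination <  UNIX_TIMESTAMP('2018-02-09' + INTERVAL 1 DAY)
    GROUP BY datetimeOrgination
) AS per_sec
ORDER BY per_sec.sec;
```

Grouping on the bare column, rather than on three function calls, may also let the optimizer use the datetimeOrgination index for the grouping, though that is not guaranteed.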

Don't use FORCE INDEX: it may help today but hurt tomorrow.

Change and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') AND UNIX_TIMESTAMP('2018-02-09 23:59:59')) to

  AND  DB1.Data.datetimeOrgination >= '2018-02-09'
  AND  DB1.Data.datetimeOrgination  < '2018-02-09' + INTERVAL 1 DAY

(Again, this is for clarity, not speed.)
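
Note that the snippet above presumes a DATETIME-like column. For the BIGINT column shown in the question, which appears to store Unix seconds, comparing against a date string directly would probably not behave as intended, because MySQL converts the string to a number for the comparison. A sketch of the same half-open range for that case:

```sql
-- Half-open day range on a BIGINT column holding Unix seconds (assumption).
-- UNIX_TIMESTAMP() interprets the date in the session time zone.
SELECT COUNT(*)
FROM DB1.Data
WHERE datetimeOrgination >= UNIX_TIMESTAMP('2018-02-09')
  AND datetimeOrgination <  UNIX_TIMESTAMP('2018-02-09' + INTERVAL 1 DAY);
```

Unlike the BETWEEN form, the half-open range needs no '23:59:59' literal for the end of the day.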

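Use COUNT(*) instead of COUNT(id).

I am doing a lot of guessing; help us by providing SHOW CREATE TABLE. It smells like you are using the wrong datatype for datetimeOrgination.

After converting to seconds (from H, M, S),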

 datetimeOrgination < UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))

becomes something like

 datetimeOrgination < '2018-02-09' + INTERVAL secs SECOND

Better still would be to get a datetime out of the subquery and then move to something like

 datetimeOrgination < datetime_from_subquery

This may make better use of the index.
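
One way this could look for the step-3 count, sketched under the assumption that both columns hold seconds (the aliases t, d, and sec are hypothetical): the subquery hands back the raw second, and the correlated count compares the column with it directly, with no CONCAT or UNIX_TIMESTAMP call per row.

```sql
SELECT t.sec,
       (SELECT COUNT(*)
          FROM DB1.Data AS d
         WHERE d.datetimeOrgination < t.sec                    -- started before
           AND d.datetimeOrgination + d.callDuration > t.sec   -- ended after
       ) AS ADCNT
FROM (
    -- every second of the day in which at least one call started
    SELECT DISTINCT datetimeOrgination AS sec
    FROM DB1.Data
    WHERE datetimeOrgination >= UNIX_TIMESTAMP('2018-02-09')
      AND datetimeOrgination <  UNIX_TIMESTAMP('2018-02-09' + INTERVAL 1 DAY)
) AS t;
-- Unlike the original subquery, d is not limited to the same day, so calls
-- that began before midnight are included; add the day range on d to
-- reproduce the original behavior.
```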

Clean up the code and explain the goal; I will then try to come up with more speedups.

Answer 2

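(Since the definition of the problem is changing, I am starting a new answer.)

The number of calls (of all types) at a specific point in time is simple: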

SELECT COUNT(*) FROM tbl
    WHERE call_start            <= '2018-02-14 15:11:35'
      AND call_start + duration >= '2018-02-14 15:11:35';

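However, I suspect that answer is "high", because it does not take into account in which part of the given second a call started or ended. So I think this is closer to correct: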

SELECT COUNT(*) FROM tbl
    WHERE call_start            <  '2018-02-14 15:11:35'
      AND call_start + duration >= '2018-02-14 15:11:35';

This should be as close as possible to saying exactly how many concurrent calls were in progress at '2018-02-14 15:11:35.000000'; it is an approximation for '2018-02-14 15:11:35.5'.
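
Adapted to the schema in the question, the second query might look like the sketch below. It assumes datetimeOrgination is a Unix timestamp in seconds and callDuration is a length in seconds; @t is a user variable introduced here for illustration.

```sql
-- The instant of interest, as Unix seconds (session time zone applies).
SET @t := UNIX_TIMESTAMP('2018-02-09 10:51:20');

SELECT COUNT(*) AS concurrent_calls
FROM DB1.Data
WHERE datetimeOrgination                <  @t    -- started before the instant
  AND datetimeOrgination + callDuration >= @t;   -- had not yet ended
```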

By changing COUNT(*) to SUM(...) (as mentioned earlier), you can get the count of calls of a given type.
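
The earlier SUM(...) discussion is not reproduced on this page, so the following is only one reading of it: MySQL evaluates a comparison to 1 or 0, so SUM(condition) counts the rows that satisfy the condition. The sketch applies this to the three call types from the question, using the question's columns and a hypothetical @t variable.

```sql
SET @t := UNIX_TIMESTAMP('2018-02-09 10:51:20');

SELECT SUM(datetimeOrgination = @t)                      AS started_at_t,
       SUM(datetimeOrgination < @t
           AND datetimeOrgination + callDuration = @t)   AS ended_at_t,
       SUM(datetimeOrgination < @t
           AND datetimeOrgination + callDuration > @t)   AS spanning_t
FROM DB1.Data
-- keep only calls that touch the instant at all
WHERE datetimeOrgination                <= @t
  AND datetimeOrgination + callDuration >= @t;
```

If no row matches the WHERE clause, each SUM returns NULL rather than 0.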

Then you can add a GROUP BY, using datetime or timestamp arithmetic, to complete the task.
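
A sketch of what that GROUP BY could look like on the question's table, evaluating the count at every second in which some call started on the chosen day. It reuses the predicates of the second query above and assumes both columns are in seconds and that id is never NULL; the aliases t, d, and sec are hypothetical, and the query has not been benchmarked.

```sql
SELECT t.sec,
       COUNT(d.id) AS concurrent_calls   -- unmatched LEFT JOIN rows count as 0
FROM (
    -- the instants to evaluate: every second with at least one call start
    SELECT DISTINCT datetimeOrgination AS sec
    FROM DB1.Data
    WHERE datetimeOrgination >= UNIX_TIMESTAMP('2018-02-09')
      AND datetimeOrgination <  UNIX_TIMESTAMP('2018-02-09' + INTERVAL 1 DAY)
) AS t
LEFT JOIN DB1.Data AS d
       ON d.datetimeOrgination                  <  t.sec
      AND d.datetimeOrgination + d.callDuration >= t.sec
GROUP BY t.sec
ORDER BY t.sec;
```

Wrapping this in an outer SELECT that applies FROM_UNIXTIME(sec) would give H, M, S as the last step.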

One day

To catch all calls that started in one day:

WHERE call_start >= '2018-02-09'
  AND call_start  < '2018-02-09' + INTERVAL 1 DAY

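The problem definition is wrong

"To calculate the number of concurrent calls per second, we need to count 3 types of calls for each second..."

I claim this is mathematically wrong.

"Concurrent calls" is an instantaneous measure, not something spanning a second (or an hour or a day). It says "how many phone connections are in use at that instant".

Let me change the statement of the problem to "concurrent calls per hour". Does that make sense? You could ask for "calls per hour", which could be interpreted as "calls initiated per hour" and could be computed with datetimeOrgination and a GROUP BY.

Suppose I make a phone call at the start of each minute, and each call lasts 59 seconds. A single phone line can handle that. I suggest that this is "1 concurrent call".

In contrast, what if 60 people all start their 59-second calls at noon? That would require 60 phone lines. That would be 60 concurrent calls during the busy time of the day.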

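The metric you have involves a datetimeOrgination that is truncated (or rounded) to a 1-second boundary.

Let me now modify the example to better explain why your 3 steps are wrong. I would like to group by hour, and I am willing to measure the number of calls at the top of the hour. Specifically, let's look at the 10 o'clock hour.

09:55-10:05: a 10-minute call that, according to your algorithm, is counted in both the 09 and the 10 hours.
10:20-10:30: a 10-minute call that, according to your algorithm, is counted only in the 10 hour.

Why count the 10-minute call in two hours? That inflates the "concurrent" count.

09:05-10:55: a 110-minute call that is also counted in both the 09 and the 10 hours.
09:30-11:30: a 120-minute call that is counted in 3 hours. Again, over-counting.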

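Hence, I claim the only reasonable computation is

Step 1: count all calls that started at 10:51:20. These are counted as happening at the :20 instant.

Step 2: count all calls that ended at 10:51:20 but did not start in the same second (20). These are not counted for :20.

Step 3: count all calls that started before 10:51:20 and ended after 10:51:20. These are counted for the :20 instant.

My suggested solution implements this modification, and it is simpler and mathematically "correct".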


2 solutions

Solution 1
3 2018-02-13 22:07:33

Solution 2
1 2018-02-14 23:16:45
