How can I optimize the following MySQL query that computes concurrent calls per second?

Question

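The following query reads from the DB1.Data table. It returns correct results but is very slow. Its result is the number of concurrent calls, derived from CDR (call detail record) information.

MySQL query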

select sql_calc_found_rows H,M,S,(TCNT+ADCNT) as CNT from
(
select H,M,S,sum(CNT) as TCNT,
(
select 
count(id) as CNT
from DB1.Data force index (datetimeOrgination)  where 1=1 and 
(datetimeOrgination<UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))  and (datetimeOrgination+callDuration)>UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))) 
  and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))   
) as ADCNT 
 from 
(
(select 
hour(from_unixtime(datetimeOrgination)) as H,
minute(from_unixtime(datetimeOrgination)) as M,
second(from_unixtime(datetimeOrgination)) as S,
count(id) as CNT  
from DB1.Data where 1=1  and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))    
group by hour(from_unixtime(datetimeOrgination)),minute(from_unixtime(datetimeOrgination)),second(from_unixtime(datetimeOrgination)))

Union  all

(select 
hour(from_unixtime(datetimeOrgination+callDuration)) as H,
minute(from_unixtime(datetimeOrgination+callDuration)) as M,
second(from_unixtime(datetimeOrgination+callDuration)) as S,
count(id) as CNT 
from DB1.Data  force index (datetimeOrgination) where 1=1 and  
(second(from_unixtime(datetimeOrgination+callDuration))>second(from_unixtime(datetimeOrgination)))   and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))    
group by hour(from_unixtime(datetimeOrgination+callDuration)),minute(from_unixtime(datetimeOrgination+callDuration)),second(from_unixtime(datetimeOrgination+callDuration)))
) as T1  group by H,M,S
) as T2;

Here is the EXPLAIN output: [image not available]

Here is the query output in JSON format:

{
"meta": {
    "count": 18,
    "totalCount": 18
},
"calls": [{
    "H": 10,
    "M": 30,
    "S": 44,
    "CNT": 1
}, {
    "H": 11,
    "M": 27,
    "S": 1,
    "CNT": 1
}, {
    "H": 11,
    "M": 28,
    "S": 44,
    "CNT": 1
}, {
    "H": 12,
    "M": 23,
    "S": 52,
    "CNT": 1
}, {
    "H": 12,
    "M": 29,
    "S": 27,
    "CNT": 1
}, {
    "H": 12,
    "M": 30,
    "S": 38,
    "CNT": 1
}, {
    "H": 14,
    "M": 26,
    "S": 17,
    "CNT": 1
}, {
    "H": 14,
    "M": 26,
    "S": 44,
    "CNT": 1
}, {
    "H": 14,
    "M": 26,
    "S": 51,
    "CNT": 1
}, {
    "H": 14,
    "M": 27,
    "S": 2,
    "CNT": 1
}, {
    "H": 14,
    "M": 27,
    "S": 8,
    "CNT": 1
}, {
    "H": 14,
    "M": 40,
    "S": 27,
    "CNT": 1
}, {
    "H": 14,
    "M": 40,
    "S": 57,
    "CNT": 1
}, {
    "H": 14,
    "M": 40,
    "S": 58,
    "CNT": 1
}, {
    "H": 15,
    "M": 8,
    "S": 4,
    "CNT": 1
}, {
    "H": 15,
    "M": 8,
    "S": 31,
    "CNT": 1
}, {
    "H": 15,
    "M": 56,
    "S": 38,
    "CNT": 1
}, {
    "H": 16,
    "M": 27,
    "S": 30,
    "CNT": 1
}]

}

The first record in the result,

  "H": 10,
    "M": 30,
    "S": 44,
    "CNT": 1

shows that there was 1 concurrent call at 10:30:44.

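More details

To compute the number of concurrent calls per second, we need to count three kinds of calls for each second.

For example, to compute the concurrent calls at 10:51:20, we need to count all of the following:

Step 1: Count all calls that started at 10:51:20.

Step 2: Count all calls that ended at 10:51:20 but did not start in the same second (20).

Step 3: Count all calls that started before 10:51:20 and ended after 10:51:20.

Step 4: Finally, add up all of these counts to get the concurrent calls.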
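The four steps above can be sketched for a single second as follows. This is only an illustration in Python, not the query itself: the function name concurrent_at and the sample data are hypothetical, times are whole epoch seconds, and "did not start in the same second" is read literally as start != t (the step 2 SQL in the question compares second-of-minute values instead).

```python
def concurrent_at(calls, t):
    """Count calls at second t using the question's three categories.

    calls: list of (start, duration) pairs in whole seconds (made-up data).
    """
    # Step 1: calls that started exactly at t.
    started = sum(1 for s, d in calls if s == t)
    # Step 2: calls that ended at t but did not start at t.
    ended = sum(1 for s, d in calls if s + d == t and s != t)
    # Step 3: calls that started before t and ended after t.
    spanning = sum(1 for s, d in calls if s < t < s + d)
    # Step 4: add the three counts.
    return started + ended + spanning


calls = [(100, 0), (95, 5), (90, 20), (101, 3)]
print(concurrent_at(calls, 100))  # one call in each category
```

Because the three categories are disjoint, their sum is the same as counting calls with start <= t <= end.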

This query handles step 1:

(select 
hour(from_unixtime(datetimeOrgination)) as H,
minute(from_unixtime(datetimeOrgination)) as M,
second(from_unixtime(datetimeOrgination)) as S,
count(id) as CNT  
from DB1.Data where 1=1  and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))    
group by hour(from_unixtime(datetimeOrgination)),minute(from_unixtime(datetimeOrgination)),second(from_unixtime(datetimeOrgination)))

This query handles step 2:

(select 
hour(from_unixtime(datetimeOrgination+callDuration)) as H,
minute(from_unixtime(datetimeOrgination+callDuration)) as M,
second(from_unixtime(datetimeOrgination+callDuration)) as S,
count(id) as CNT 
from DB1.Data  force index (datetimeOrgination) where 1=1 and  
(second(from_unixtime(datetimeOrgination+callDuration))>second(from_unixtime(datetimeOrgination)))   and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))    
group by hour(from_unixtime(datetimeOrgination+callDuration)),minute(from_unixtime(datetimeOrgination+callDuration)),second(from_unixtime(datetimeOrgination+callDuration)))

This query handles step 3; it runs against the UNION result of the first two queries:

(
select 
count(id) as CNT
from DB1.Data force index (datetimeOrgination)  where 1=1 and 
(datetimeOrgination<UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))  and (datetimeOrgination+callDuration)>UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))) 
  and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') and UNIX_TIMESTAMP('2018-02-09 23:59:59'))   
) as ADCNT

This query collects all of the above and returns the final result:

select sql_calc_found_rows H,M,S,(TCNT+ADCNT) as CNT from
(

As mentioned, the query works but is very slow and complicated; I know it needs to be optimized and simplified.

Column types

`datetimeOrgination` BIGINT(20) NOT NULL DEFAULT
`callDuration` BIGINT(20) NOT NULL DEFAULT '0',

and indexes

INDEX `datetimeOrgination` (`datetimeOrgination`),
INDEX `callDuration` (`callDuration`),

Answer 1

Caveat: some of my suggestions are for clarity or simplification, not necessarily for speed.

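Potential bug: and (second(from_unixtime(datetimeOrgination+callDuration)) > second(from_unixtime(datetimeOrgination))) does not make much sense. It will catch a 2-second call starting at 11:22:00, but not one starting at 11:21:59. Is that really what you want? In any case, please explain what the query is meant to do.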
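To make the problem concrete, here is a small Python sketch that mimics the second(end) > second(start) filter on the two calls mentioned above. The helper name passes_second_filter is hypothetical; it only reproduces the comparison of second-of-minute values.

```python
from datetime import datetime, timedelta


def passes_second_filter(start, duration_s):
    """Mimic second(end) > second(start) from the question's step 2 query."""
    end = start + timedelta(seconds=duration_s)
    return end.second > start.second


# 2-second call starting at 11:22:00 ends at 11:22:02, so 2 > 0.
a = passes_second_filter(datetime(2018, 2, 9, 11, 22, 0), 2)
# 2-second call starting at 11:21:59 ends at 11:22:01, so 1 > 59 fails.
b = passes_second_filter(datetime(2018, 2, 9, 11, 21, 59), 2)
print(a, b)  # True False
```

The second call is dropped only because it crosses a minute boundary, which has nothing to do with whether it started and ended in the same second.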

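Don't work with H, M, S; work with seconds only, either by extracting the hh:mm:ss string from the date or by taking the time of day in seconds. Convert to H, M, S as the last step, not the first.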
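As a rough illustration of "convert last", the sketch below aggregates by seconds since midnight and splits into H, M, S only when producing output. The names to_hms and starts_per_second are hypothetical, and the input is assumed to already be seconds of the day.

```python
from collections import Counter


def to_hms(sec_of_day):
    """Split seconds since midnight into (hour, minute, second)."""
    h, rem = divmod(sec_of_day, 3600)
    m, s = divmod(rem, 60)
    return h, m, s


# Made-up start times, in seconds since midnight.
starts = [37844, 37844, 41221]
starts_per_second = Counter(starts)  # group on one integer, not on H, M, S

# Convert to H, M, S only when formatting the result.
rows = [(*to_hms(sec), cnt) for sec, cnt in sorted(starts_per_second.items())]
print(rows)
```

Grouping on a single integer keeps the aggregation simple; the three-column form is needed only for display.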

Don't use FORCE INDEX; it may help today but hurt tomorrow.

Change and (DB1.Data.datetimeOrgination between UNIX_TIMESTAMP('2018-02-09 00:00:00') AND UNIX_TIMESTAMP('2018-02-09 23:59:59')) to

  AND  DB1.Data.datetimeOrgination >= '2018-02-09'
  AND  DB1.Data.datetimeOrgination  < '2018-02-09' + INTERVAL 1 DAY

(Again, this is for clarity, not speed.)
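Since the question's schema declares datetimeOrgination as BIGINT, the same half-open range can be expressed in epoch seconds. The sketch below is an assumption-laden illustration: it treats the day as a UTC day (MySQL's UNIX_TIMESTAMP uses the session time zone instead), and in_day is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the day boundary is taken in UTC.
day = datetime(2018, 2, 9, tzinfo=timezone.utc)
lo = int(day.timestamp())                       # first second of the day
hi = int((day + timedelta(days=1)).timestamp())  # first second of the next day


def in_day(ts):
    """Half-open test: lo <= ts < hi, no '23:59:59' literal needed."""
    return lo <= ts < hi


print(hi - lo, in_day(lo), in_day(hi - 1), in_day(hi))
```

The half-open form avoids spelling out the last second of the day and keeps working if the column ever gains sub-second precision.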

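Use COUNT(*) instead of COUNT(id).

I am doing a lot of guessing; help us by providing SHOW CREATE TABLE. It smells like you used the wrong datatype for datetimeOrgination.

After converting to seconds (from H, M, S),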

 datetimeOrgination < UNIX_TIMESTAMP(concat('2018-02-09',' ',T1.H,':',T1.M,':',T1.S))

becomes something like

 datetimeOrgination < '2018-02-09' + INTERVAL secs SECOND

Better yet, have the subquery produce a datetime, and then move to something like

 datetimeOrgination < datetime_from_subquery

That may make better use of the index.

Clean up the code and state the goal; then I will try to come up with more speedups.

Answer 2

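(Since the definition of the problem is changing, I am starting a new answer.)

The number of calls (of all types) at a particular point in time is simple: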

SELECT COUNT(*) FROM tbl
    WHERE call_start            <= '2018-02-14 15:11:35'
      AND call_start + duration >= '2018-02-14 15:11:35';

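However, I would suspect that answer is "high", because it does not take into account in which part of the given second a call started or ended. So I think this is closer to correct: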

SELECT COUNT(*) FROM tbl
    WHERE call_start            <  '2018-02-14 15:11:35'
      AND call_start + duration >= '2018-02-14 15:11:35';

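This should come as close as possible to saying exactly how many concurrent calls were in progress at '2018-02-14 15:11:35.000000'; it is an approximation of the number at '2018-02-14 15:11:35.5'.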
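The difference between the two comparisons shows up only at the boundary second. The sketch below runs both variants against an in-memory SQLite table purely so the example is self-contained; it is not MySQL, times are stored as integer seconds, and the data and the count_at helper are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (call_start INTEGER, duration INTEGER)")
# Made-up calls: two end at second 110, one starts at second 110.
conn.executemany("INSERT INTO tbl VALUES (?, ?)", [(100, 10), (105, 5), (110, 5)])


def count_at(t, start_op):
    """Count calls in progress at t; start_op is '<=' or '<'."""
    sql = (
        "SELECT COUNT(*) FROM tbl "
        f"WHERE call_start {start_op} ? AND call_start + duration >= ?"
    )
    return conn.execute(sql, (t, t)).fetchone()[0]


inclusive = count_at(110, "<=")  # also counts the call that starts at 110
strict = count_at(110, "<")      # excludes it
print(inclusive, strict)
```

With the inclusive form, a call that starts in the same second another one ends is counted alongside it, which is the "high" result described above.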

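By changing COUNT(*) to SUM(...) (as discussed before), you can get the count of calls of a given type.

Then you can add a GROUP BY, using datetime or timestamp arithmetic, to finish the task.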
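One possible reading of "add a GROUP BY" is to join each second of interest against the calls in progress at that second. The sketch below is an assumption, not the answerer's query: it uses SQLite with integer seconds so that it runs standalone, it samples only the seconds at which some call starts, and it uses the start <= t < end convention. Table and column names are the placeholders from the snippets above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (call_start INTEGER, duration INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", [(100, 10), (105, 5), (110, 5)])

# For every distinct start second, count the calls in progress at that second.
sql = """
SELECT t.sec, COUNT(*) AS cnt
FROM (SELECT DISTINCT call_start AS sec FROM tbl) AS t
JOIN tbl AS c
  ON c.call_start <= t.sec
 AND c.call_start + c.duration > t.sec
GROUP BY t.sec
ORDER BY t.sec
"""
per_second = conn.execute(sql).fetchall()
print(per_second)
```

Concurrency can only rise at a second where a call starts, so sampling those seconds is enough to find peaks; whether this join performs well on a large MySQL table is not something this sketch demonstrates.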

One day

To get all calls that started within one day:

WHERE call_start >= '2018-02-09'
  AND call_start  < '2018-02-09' + INTERVAL 1 DAY

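The problem definition is wrong

"To compute the number of concurrent calls per second, we need to count three kinds of calls for each second..."

I think this is mathematically wrong.

"Concurrent calls" is a property of an instant, not of a second (or an hour or a day). It means "how many phone connections are in use at that moment".

Let me change the statement of the problem to "concurrent calls per hour". Does that make sense? You can ask for "calls per hour", which can be interpreted as "calls initiated in each hour" and can be computed with datetimeOrgination and GROUP BY.

Suppose I place a call at the start of every minute, and each call lasts 59 seconds. A single phone line can handle that. I suggest that is "1 concurrent call".

What if, instead, I have 60 people who all start their 59-second calls at noon? That would need 60 phone lines. That would be 60 concurrent calls at the busy time of the day.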
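The two scenarios above can be checked with a small event-based sketch. This is an illustration in Python, not part of the answer: peak_concurrency is a hypothetical helper, and times are seconds relative to an arbitrary origin.

```python
def peak_concurrency(calls):
    """Return the maximum number of calls in progress at any instant.

    calls: list of (start, duration) pairs in seconds.
    """
    events = []
    for start, duration in calls:
        events.append((start, 1))              # a line becomes busy
        events.append((start + duration, -1))  # a line is freed
    # At equal times (t, -1) sorts before (t, 1), so a line freed at t
    # can be reused by a call starting at t.
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


back_to_back = [(60 * i, 59) for i in range(60)]  # one call per minute
all_at_noon = [(0, 59)] * 60                      # 60 calls at once
print(peak_concurrency(back_to_back), peak_concurrency(all_at_noon))
```

Both data sets contain 60 calls of 59 seconds each, yet the number of lines needed differs by a factor of 60, which is why the count must be taken at an instant.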

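The metric you have involves a datetimeOrgination that is truncated (or rounded) to a 1-second boundary.

Let me now modify the example to better explain why your 3 steps are wrong. I want to group by hour, and I am willing to measure the number of calls at the top of the hour. In particular, let's look at the 10 o'clock hour.

09:55 - 10:05: by your algorithm, a 10-minute call is counted in both the 09 and the 10 hours.
10:20 - 10:30: by your algorithm, a 10-minute call is counted only in the 10 hour.

Why count one 10-minute call in two hours? That inflates the "concurrent" count.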

09:05 - 10:55: a 110-minute call is also counted in both the 09 and the 10 hours.
09:30 - 11:30: a 120-minute call is counted in 3 hours. Again, over-counting.
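The over-counting in these examples can be reproduced with a short sketch. It is only an illustration: times are minutes since midnight, and hours_touched and in_progress_at_top_of are hypothetical helpers contrasting "hour buckets a call overlaps" with "is the call in progress at the top of the hour".

```python
def hours_touched(start_min, end_min):
    """Hour buckets a call overlaps, which is what bucket counting uses."""
    return list(range(start_min // 60, end_min // 60 + 1))


def in_progress_at_top_of(hour, start_min, end_min):
    """True if the call is in progress at hour:00 (start <= t < end)."""
    return start_min <= hour * 60 < end_min


calls = {
    "09:55-10:05": (9 * 60 + 55, 10 * 60 + 5),
    "10:20-10:30": (10 * 60 + 20, 10 * 60 + 30),
    "09:05-10:55": (9 * 60 + 5, 10 * 60 + 55),
    "09:30-11:30": (9 * 60 + 30, 11 * 60 + 30),
}
for label, (s, e) in calls.items():
    print(label, hours_touched(s, e), in_progress_at_top_of(10, s, e))
```

Measured at the instant 10:00, each call contributes at most once, whereas the bucket approach counts the same call in two or three hours.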

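So I think the only reasonable computation is

Step 1: Count all calls that started at 10:51:20. Counted as in progress at the instant :20.

Step 2: Count all calls that ended at 10:51:20 but did not start in the same second (20). Not counted for :20.

Step 3: Count all calls that started before 10:51:20 and ended after 10:51:20. Counted as in progress at the instant :20.

The solution I proposed implements this modification, and it is both simpler and mathematically "correct".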


Solution 1
3 2018-02-13 22:07:33

Solution 2
1 2018-02-14 23:16:45
