How can I optimize this MySQL query to find maximum simultaneous calls?
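I'm trying to calculate the maximum number of simultaneous calls. My query, which I believe is accurate, takes far too long on roughly 250,000 rows. The cdrs table looks like this: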
+---------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-----------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| CallType | varchar(32) | NO | | NULL | |
| StartTime | datetime | NO | MUL | NULL | |
| StopTime | datetime | NO | | NULL | |
| CallDuration | float(10,5) | NO | | NULL | |
| BillDuration | mediumint(8) unsigned | NO | | NULL | |
| CallMinimum | tinyint(3) unsigned | NO | | NULL | |
| CallIncrement | tinyint(3) unsigned | NO | | NULL | |
| BasePrice | float(12,9) | NO | | NULL | |
| CallPrice | float(12,9) | NO | | NULL | |
| TransactionId | varchar(20) | NO | | NULL | |
| CustomerIP | varchar(15) | NO | | NULL | |
| ANI | varchar(20) | NO | | NULL | |
| ANIState | varchar(10) | NO | | NULL | |
| DNIS | varchar(20) | NO | | NULL | |
| LRN | varchar(20) | NO | | NULL | |
| DNISState | varchar(10) | NO | | NULL | |
| DNISLATA | varchar(10) | NO | | NULL | |
| DNISOCN | varchar(10) | NO | | NULL | |
| OrigTier | varchar(10) | NO | | NULL | |
| TermRateDeck | varchar(20) | NO | | NULL | |
+---------------+-----------------------+------+-----+---------+----------------+
I have the following indexes:
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| cdrs | 0 | PRIMARY | 1 | id | A | 269622 | NULL | NULL | | BTREE | | |
| cdrs | 1 | id | 1 | id | A | 269622 | NULL | NULL | | BTREE | | |
| cdrs | 1 | call_time_index | 1 | StartTime | A | 269622 | NULL | NULL | | BTREE | | |
| cdrs | 1 | call_time_index | 2 | StopTime | A | 269622 | NULL | NULL | | BTREE | | |
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
The query I'm running is this:
SELECT MAX(cnt) AS max_channels FROM
(SELECT cl1.StartTime, COUNT(*) AS cnt
FROM cdrs cl1
INNER JOIN cdrs cl2
ON cl1.StartTime
BETWEEN cl2.StartTime AND cl2.StopTime
GROUP BY cl1.id)
AS counts;
It looks like I may have to chunk this data by day and store the results in a separate table, such as simultaneous_calls.
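The per-day chunking idea could look roughly like the sketch below. It is only an illustration under stated assumptions: it runs against an in-memory SQLite database rather than MySQL, the layout of simultaneous_calls (day, max_channels) is hypothetical, and summarize_day is a made-up helper name. In MySQL the INSERT OR REPLACE would become REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cdrs (id INTEGER PRIMARY KEY, StartTime TEXT NOT NULL, StopTime TEXT NOT NULL);
-- Hypothetical summary table: one row per calendar day.
CREATE TABLE simultaneous_calls (day TEXT PRIMARY KEY, max_channels INTEGER NOT NULL);
""")
conn.executemany(
    "INSERT INTO cdrs (StartTime, StopTime) VALUES (?, ?)",
    [
        ("2014-05-13 23:50:00", "2014-05-14 00:10:00"),  # spans midnight
        ("2014-05-14 00:05:00", "2014-05-14 00:06:00"),
        ("2014-05-14 09:00:00", "2014-05-14 09:30:00"),
        ("2014-05-14 09:10:00", "2014-05-14 09:20:00"),
        ("2014-05-14 09:15:00", "2014-05-14 09:16:00"),
    ],
)

def summarize_day(conn, day):
    """Store the peak for calls that start on `day` (format YYYY-MM-DD)."""
    lo, hi = day + " 00:00:00", day + " 23:59:59"
    conn.execute(
        """
        INSERT OR REPLACE INTO simultaneous_calls (day, max_channels)
        SELECT ?, COALESCE(MAX(cnt), 0) FROM (
            SELECT COUNT(*) AS cnt
            FROM cdrs cl1
            JOIN cdrs cl2
              ON cl1.StartTime BETWEEN cl2.StartTime AND cl2.StopTime
            WHERE cl1.StartTime BETWEEN ? AND ?
              -- Implied by the join condition; stated so cl2 can be narrowed too.
              AND cl2.StartTime <= ? AND cl2.StopTime >= ?
            GROUP BY cl1.id
        )
        """,
        (day, lo, hi, hi, lo),
    )

for day in ("2014-05-13", "2014-05-14"):
    summarize_day(conn, day)

rows = conn.execute("SELECT day, max_channels FROM simultaneous_calls ORDER BY day").fetchall()
print(rows)
```

Calls that started the previous day and are still running are still counted, because only cl1 is tied to the day being summarized.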
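I'm sure you want to know not only the maximum number of simultaneous calls, but also when it happened.
I would create a table containing a timestamp for each minute: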
CREATE TABLE times (ts DATETIME NOT NULL PRIMARY KEY);
INSERT INTO times (ts) VALUES ('2014-05-14 00:00:00');
. . . until 1440 rows, one for each minute . . .
Then join it to the calls.
SELECT ts, COUNT(*) AS count FROM times
JOIN cdrs ON times.ts BETWEEN cdrs.starttime AND cdrs.stoptime
GROUP BY ts ORDER BY count DESC LIMIT 1;
Here's my test result (MySQL 5.6.17 on a Linux VM running on a MacBook Pro):
+---------------------+----------+
| ts | count(*) |
+---------------------+----------+
| 2014-05-14 10:59:00 | 1001 |
+---------------------+----------+
1 row in set (1 min 3.90 sec)
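For anyone who wants to try the minute-grid join on a small scale, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL (datetimes are stored as ISO strings, which compare correctly with BETWEEN), and the helper names minute_grid and busiest_minute as well as the sample calls are invented for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

def minute_grid(day):
    """Return the 1440 'YYYY-MM-DD HH:MM:SS' strings for one calendar day."""
    start = datetime.strptime(day, "%Y-%m-%d")
    return [(start + timedelta(minutes=m)).strftime("%Y-%m-%d %H:%M:%S")
            for m in range(1440)]

def busiest_minute(conn):
    """Count the calls in progress at each minute mark and return the busiest one."""
    return conn.execute(
        """
        SELECT ts, COUNT(*) AS cnt
        FROM times
        JOIN cdrs ON times.ts BETWEEN cdrs.StartTime AND cdrs.StopTime
        GROUP BY ts
        ORDER BY cnt DESC, ts
        LIMIT 1
        """
    ).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE times (ts TEXT NOT NULL PRIMARY KEY)")
conn.execute("CREATE TABLE cdrs (id INTEGER PRIMARY KEY, "
             "StartTime TEXT NOT NULL, StopTime TEXT NOT NULL)")
conn.executemany("INSERT INTO times (ts) VALUES (?)",
                 [(t,) for t in minute_grid("2014-05-14")])

# Made-up sample calls, not data from the question.
sample_calls = [
    ("2014-05-14 10:00:30", "2014-05-14 10:05:10"),
    ("2014-05-14 10:02:00", "2014-05-14 10:03:30"),
    ("2014-05-14 10:02:45", "2014-05-14 10:09:00"),
    ("2014-05-14 23:10:00", "2014-05-14 23:10:40"),
]
conn.executemany("INSERT INTO cdrs (StartTime, StopTime) VALUES (?, ?)", sample_calls)

print(busiest_minute(conn))
```

Keep in mind that the grid only samples on the minute, so a call that starts and ends between two minute marks is never counted. The result is the peak at minute resolution, not necessarily the exact peak.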
This achieves several goals, including reporting when the peak occurred as well as how large it was.
Here is the EXPLAIN for my query:
explain select ts, count(*) from times join cdrs on times.ts between cdrs.starttime and cdrs.stoptime group by ts order by count(*) desc limit 1;
+----+-------------+-------+-------+---------------+---------+---------+------+--------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+------------------------------------------------+
| 1 | SIMPLE | times | index | PRIMARY | PRIMARY | 5 | NULL | 1440 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | cdrs | ALL | starttime | NULL | NULL | NULL | 260727 | Range checked for each record (index map: 0x4) |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+------------------------------------------------+
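Note the numbers in the rows column and compare them with the EXPLAIN for your original query. You can estimate the total number of rows examined by multiplying those values together (though it gets more complicated if your query is anything other than SIMPLE).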
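As a rough worked example of that multiplication, the sketch below uses the two rows values from the EXPLAIN above. The figure for the original self-join is an assumption: its EXPLAIN is not shown here, so the sketch supposes that both sides of that join report about the same row count as cdrs.

```python
# rows values taken from the EXPLAIN output above
times_rows = 1440
cdrs_rows = 260727

# Estimated row combinations examined by the times-to-cdrs join
grid_join_estimate = times_rows * cdrs_rows

# ASSUMPTION: the original cdrs-to-cdrs join reports ~cdrs_rows on both sides
self_join_estimate = cdrs_rows * cdrs_rows

ratio = self_join_estimate / grid_join_estimate

print(f"minute grid join: {grid_join_estimate:,}")
print(f"self-join:        {self_join_estimate:,}")
print(f"ratio:            {ratio:.0f}x")
```

Under that assumption the self-join examines about 181 times as many row combinations as the minute-grid join.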
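The inline view isn't strictly necessary. (You'll spend a long time running EXPLAIN on a query that contains an inline view: EXPLAIN materializes the inline view, i.e. it runs the inline view query and populates the derived table, and only then gives an EXPLAIN of the outer query.)
Note that this query will return an equivalent result: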
SELECT COUNT(*) AS max_channels
FROM cdrs cl1
JOIN cdrs cl2
ON cl1.StartTime BETWEEN cl2.StartTime AND cl2.StopTime
GROUP BY cl1.id
ORDER BY max_channels DESC
LIMIT 1
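It still has to do all the same work and probably won't perform any better, but the EXPLAIN should run a lot faster. (We expect to see "Using temporary; Using filesort" in the Extra column.)
The result set will have as many rows as the table (~250,000), and those rows need to be sorted, so that is going to take a while. The bigger problem (my gut tells me) is the join operation.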
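The equivalence of the two forms can be checked on a handful of rows. This is just a sanity-check sketch: it uses an in-memory SQLite database instead of MySQL, the sample calls are made up, and the constant names NESTED and FLAT are mine.

```python
import sqlite3

# The original form: inline view plus MAX().
NESTED = """
SELECT MAX(cnt) AS max_channels FROM
  (SELECT cl1.StartTime, COUNT(*) AS cnt
   FROM cdrs cl1
   INNER JOIN cdrs cl2
     ON cl1.StartTime BETWEEN cl2.StartTime AND cl2.StopTime
   GROUP BY cl1.id) AS counts
"""

# The rewritten form: no inline view, ORDER BY ... LIMIT 1 instead.
FLAT = """
SELECT COUNT(*) AS max_channels
FROM cdrs cl1
JOIN cdrs cl2
  ON cl1.StartTime BETWEEN cl2.StartTime AND cl2.StopTime
GROUP BY cl1.id
ORDER BY max_channels DESC
LIMIT 1
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cdrs (id INTEGER PRIMARY KEY, "
             "StartTime TEXT NOT NULL, StopTime TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO cdrs (StartTime, StopTime) VALUES (?, ?)",
    [
        ("2014-05-14 09:00:00", "2014-05-14 09:30:00"),
        ("2014-05-14 09:10:00", "2014-05-14 09:20:00"),
        ("2014-05-14 09:15:00", "2014-05-14 09:16:00"),
        ("2014-05-14 12:00:00", "2014-05-14 12:05:00"),
    ],
)

nested_result = conn.execute(NESTED).fetchone()[0]
flat_result = conn.execute(FLAT).fetchone()[0]
print(nested_result, flat_result)
```

One edge case differs: on an empty table the MAX() form returns a single NULL row, while the ORDER BY ... LIMIT 1 form returns no rows at all.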
I wonder whether the EXPLAIN (or the performance) would be any different if cl1 and cl2 were swapped in the predicate, i.e.
ON cl2.StartTime BETWEEN cl1.StartTime AND cl1.StopTime
I'm only thinking about that because I'm tempted to try a correlated subquery. That's ~250,000 executions, and it's not likely to be any faster...
SELECT ( SELECT COUNT(*)
FROM cdrs cl2
WHERE cl2.StartTime BETWEEN cl1.StartTime AND cl1.StopTime
) AS max_channels
, cl1.StartTime
FROM cdrs cl1
ORDER BY max_channels DESC
LIMIT 11
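You can run an EXPLAIN on that; we'd still see "Using temporary; Using filesort", and it will also show "DEPENDENT SUBQUERY"...
Obviously, adding a predicate on the cl1 table to cut down the number of rows returned (for example, checking only the past 15 days) should speed things up, but it doesn't get you the answer you want.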
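One thing worth checking before comparing timings: the swapped predicate, which the correlated subquery also uses, counts the calls that started during cl1's call, not the calls in progress at the moment cl1 started, so the two forms can give different peaks. The sketch below shows this on three made-up calls, again using an in-memory SQLite database as a stand-in for MySQL; the helper name peak is mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cdrs (id INTEGER PRIMARY KEY, "
             "StartTime TEXT NOT NULL, StopTime TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO cdrs (StartTime, StopTime) VALUES (?, ?)",
    [
        ("2014-05-14 10:00:00", "2014-05-14 11:00:00"),  # one long call
        ("2014-05-14 10:10:00", "2014-05-14 10:20:00"),  # short call inside it
        ("2014-05-14 10:30:00", "2014-05-14 10:40:00"),  # another, not overlapping the previous one
    ],
)

def peak(conn, predicate):
    """Return the largest per-call count produced by the given join predicate."""
    sql = f"""
        SELECT MAX(cnt) FROM (
            SELECT COUNT(*) AS cnt
            FROM cdrs cl1
            JOIN cdrs cl2 ON {predicate}
            GROUP BY cl1.id
        )
    """
    return conn.execute(sql).fetchone()[0]

original = peak(conn, "cl1.StartTime BETWEEN cl2.StartTime AND cl2.StopTime")
swapped = peak(conn, "cl2.StartTime BETWEEN cl1.StartTime AND cl1.StopTime")
print(original, swapped)
```

Here at most two calls are ever active at once, which is what the original predicate reports. The swapped predicate reports three, because the long call is credited with every call that began while it was running.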
WHERE cl1.StartTime > NOW() - INTERVAL 15 DAY
(None of these musings is a definitive answer to your question or a solution to the performance problem; they're just musings.)