How to speed up a MySQL query that is a Cartesian
Suppose there is a table named contact with the following structure:
id INT -- primary key,autoincrement,index
firstname VARCHAR (255),
lastname VARCHAR(255),
type ENUM
and a query like this is run against it:
SELECT c1.id AS c1_id, c2.id AS c2_id
FROM contact c1
INNER JOIN contact c2 ON c1.firstname = c2.firstname AND c1.lastname = c2.lastname
WHERE c1.id <> c2.id AND c1.type=c2.type
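It works fine for a small number of records... but when the number of records grows from 30-40 to 1000, this query becomes very slow. I need to make this query as fast as possible regardless of the number of records. Any suggestions?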
You can try something like this:
SELECT GROUP_CONCAT(DISTINCT id), firstname, lastname
FROM contact
GROUP BY firstname, lastname
HAVING COUNT(DISTINCT id)>1
This returns all duplicated names, one row per firstname/lastname pair. If you want one row per id instead, you can use a JOIN:
SELECT
contact.id
FROM
contact INNER JOIN (SELECT firstname, lastname
FROM contact
GROUP BY firstname, lastname
HAVING COUNT(DISTINCT id)>1) dup
ON contact.firstname=dup.firstname AND contact.lastname=dup.lastname
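Note that the query in the question also requires c1.type = c2.type, which the two queries above do not take into account. If that condition matters, one possible variation (a sketch based on an assumption about the asker's intent, not part of the original answer) is to add type to the grouping:

```sql
-- Groups of contacts where firstname, lastname AND type all match,
-- mirroring the c1.type = c2.type condition of the original query.
-- id is the primary key, so COUNT(*) is equivalent to COUNT(DISTINCT id) here.
SELECT GROUP_CONCAT(id ORDER BY id) AS ids, firstname, lastname, type
FROM contact
GROUP BY firstname, lastname, type
HAVING COUNT(*) > 1;
```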
A slight variation on fthiella's answer (just in case this is what you need):
SELECT group_concat(id) as ids, firstname, lastname
FROM contact
GROUP BY firstname, lastname
The query above fills the ids column with a comma-separated list of the ids that share each firstname/lastname combination.
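One caveat worth keeping in mind: MySQL truncates the result of GROUP_CONCAT to group_concat_max_len bytes (1024 by default), so a large group may come back with an incomplete id list. A minimal sketch of raising the limit for the current session; the value 1000000 is an arbitrary illustration, not a recommendation:

```sql
-- Raise the GROUP_CONCAT length limit for this session only
-- (1000000 is an arbitrary example value).
SET SESSION group_concat_max_len = 1000000;

SELECT GROUP_CONCAT(id ORDER BY id) AS ids, firstname, lastname
FROM contact
GROUP BY firstname, lastname;
```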
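Apart from the "<>", your query is fine (in fact, it may even be slightly faster than the proposed alternatives!). With "<>" every matching pair is returned twice, once as (a, b) and once as (b, a); using "<" returns each pair only once. Other than that, it is just the indexing that needs some work...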
CREATE TABLE contact
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,firstname VARCHAR(255) NOT NULL
,lastname VARCHAR(255) NOT NULL
,type ENUM('small','medium','large') NOT NULL
);
INSERT INTO contact VALUES
(NULL,'John','Brown','small'),
(NULL,'Bill','Red','small'),
(NULL,'Paul','Orange','medium'),
(NULL,'Mike','Green','large'),
(NULL,'John','Scarlet','small'),
(NULL,'John','Cyan','medium'),
(NULL,'Fiona','Brown','large'),
(NULL,'John','Brown','small'),
(NULL,'Chris','Copper','medium'),
(NULL,'Steve','Silver','large');
INSERT INTO contact SELECT NULL,x.firstname, y.lastname, z.type FROM contact x, contact y, contact z;
SELECT COUNT(*) FROM contact;
+----------+
| COUNT(*) |
+----------+
| 1010 |
+----------+
1 row in set (0.01 sec)
SELECT c1.id c1_id
, c2.id c2_id
FROM contact c1
JOIN contact c2
ON c1.firstname = c2.firstname
AND c1.lastname = c2.lastname
AND c1.type=c2.type
WHERE c1.id < c2.id;
...
...
| 1006 | 1008 |
+-------+-------+
5634 rows in set (0.16 sec)
So now, let's add an index on (firstname, lastname, type)...
DROP TABLE IF EXISTS contact;
CREATE TABLE contact
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,firstname VARCHAR(255) NOT NULL
,lastname VARCHAR(255) NOT NULL
,type ENUM('small','medium','large') NOT NULL
,INDEX(firstname,lastname,type)
);
INSERT INTO contact VALUES
(NULL,'John','Brown','small'),
(NULL,'Bill','Red','small'),
(NULL,'Paul','Orange','medium'),
(NULL,'Mike','Green','large'),
(NULL,'John','Scarlet','small'),
(NULL,'John','Cyan','medium'),
(NULL,'Fiona','Brown','large'),
(NULL,'John','Brown','small'),
(NULL,'Chris','Copper','medium'),
(NULL,'Steve','Silver','large');
INSERT INTO contact SELECT NULL,x.firstname, y.lastname, z.type FROM contact x, contact y, contact z;
SELECT c1.id c1_id
, c2.id c2_id
FROM contact c1
JOIN contact c2
ON c1.firstname = c2.firstname
AND c1.lastname = c2.lastname
AND c1.type = c2.type
AND c1.id < c2.id;
| 775 | 776 |
...
| 1006 | 1008 |
+-------+-------+
5634 rows in set (0.05 sec)
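On a table that already exists without the index (like the first CREATE TABLE above), there is no need to drop and recreate it. As a rough sketch, the same index can be added in place, and EXPLAIN can be used to check whether the optimizer picks it up. The index name idx_name_type is hypothetical, chosen only for this illustration:

```sql
-- Add the composite index to an existing table
-- (idx_name_type is a made-up name for this example).
ALTER TABLE contact
  ADD INDEX idx_name_type (firstname, lastname, type);

-- Inspect the execution plan of the self-join.
EXPLAIN
SELECT c1.id c1_id
     , c2.id c2_id
  FROM contact c1
  JOIN contact c2
    ON c1.firstname = c2.firstname
   AND c1.lastname = c2.lastname
   AND c1.type = c2.type
   AND c1.id < c2.id;
```

If the index is used, it should appear in the key column of the plan for at least one of the two table aliases; the exact output depends on the MySQL version, storage engine, and data.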