
MySQL query is slow with one WHERE condition inside EXISTS

I have 3 tables, and the relations between them are many-to-many.

[Image: the tables and their columns]

I need to get the names of the hashtags that have files with a specific category_id.

The problem is that when I use the query below without specifying the category, it performs well, at about 0.05 s.

select `hashtags`.`slug` 
from `hashtags` 
where EXISTS (
   select * from `files` 
   inner join `file_hashtags` on `files`.`id` = `file_hashtags`.`file_id` 
   where `hashtags`.`id` = `file_hashtags`.`hashtag_id` 
);

But when I run the query below with a specified category, it takes about 3 s.

select `hashtags`.`slug` 
from `hashtags` 
where EXISTS (
   select * from `files` 
   inner join `file_hashtags` on `files`.`id` = `file_hashtags`.`file_id` 
   where `hashtags`.`id` = `file_hashtags`.`hashtag_id` 
   and `files`.`category_id`=2 
);

What can I do to get a better query time? I also tried this query using IN instead of EXISTS, but the result is the same, only about 0.1 s faster.

About the indexes:

  • The files table has id as the primary key, category_id as a BTREE index (I need this for simple queries such as getting the files in a specific category), and slug as a unique index.

  • The hashtags table has id as the primary key and slug as a unique index.

  • The file_hashtags table has two foreign keys, one to each of the other tables, and (file_id, hashtag_id) is its primary key.

There are about 150k rows in the files table, 75 in the hashtags table, and 260k in the pivot table.
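For reference, the schema described above might look roughly like the following sketch. The column types, lengths, index names, and the ENGINE clause are assumptions for illustration only; the real tables (shown in the image) may have more columns.

```sql
-- Sketch only: column types, lengths, key names and ENGINE are assumptions,
-- not taken from the question. Only the keys mirror the description above.
CREATE TABLE files (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
    category_id INT UNSIGNED NOT NULL,
    slug        VARCHAR(191) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY files_slug_unique (slug),        -- hypothetical key name
    KEY files_category_id_index (category_id)   -- hypothetical key name
) ENGINE=InnoDB;

CREATE TABLE hashtags (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    slug VARCHAR(191) NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY hashtags_slug_unique (slug)      -- hypothetical key name
) ENGINE=InnoDB;

-- Pivot table: composite primary key plus one foreign key per side.
CREATE TABLE file_hashtags (
    file_id    INT UNSIGNED NOT NULL,
    hashtag_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (file_id, hashtag_id),
    FOREIGN KEY (file_id)    REFERENCES files (id),
    FOREIGN KEY (hashtag_id) REFERENCES hashtags (id)
) ENGINE=InnoDB;
```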

You can use an inner join without EXISTS:

select distinct `hashtags`.`slug` 
from `hashtags`
inner join `file_hashtags` on `hashtags`.`id` = `file_hashtags`.`hashtag_id`
inner join `files` on `files`.`id` = `file_hashtags`.`file_id`
where `files`.`category_id` = 2;

Try it like this:

SELECT hashtags.slug
FROM hashtags
WHERE EXISTS
(SELECT * FROM (SELECT * 
FROM files
WHERE category_id = 2) A
INNER JOIN file_hashtags ON A.id = file_hashtags.file_id
WHERE hashtags.id = file_hashtags.hashtag_id)

If the number of rows in the files table is large, the number of joined records can be reduced by filtering before the join operation.

Please use SHOW CREATE TABLE. It shows what indexes you have.
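For the three tables named in the question, that would be something like the following. The output includes every column, key, and the storage engine, which makes it easier to confirm the indexes than a prose description.

```sql
-- Table names are taken from the question.
SHOW CREATE TABLE files;
SHOW CREATE TABLE hashtags;
SHOW CREATE TABLE file_hashtags;
```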

I assume that file_hashtags is a many-to-many mapping? Then it needs

PRIMARY KEY(file_id, hashtag_id)
INDEX(hashtag_id, file_id)

More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
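Since the question says (file_id, hashtag_id) is already the primary key, only the reverse index would need to be added. A minimal sketch, where the index name idx_hashtag_file is made up for illustration:

```sql
-- idx_hashtag_file is a hypothetical name; pick whatever fits your conventions.
-- With both column orders indexed, the mapping table can be traversed
-- efficiently from either side (files -> hashtags or hashtags -> files).
ALTER TABLE file_hashtags
    ADD INDEX idx_hashtag_file (hashtag_id, file_id);
```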

Because of the condition files.category_id = 2, files needs

INDEX(category_id)

So, have those indexes, and simplify the query to:

SELECT  h.slug
    FROM  files AS f
    JOIN  file_hashtags AS fh  ON fh.file_id = f.id
    JOIN  hashtags AS h  ON h.id = fh.hashtag_id
    WHERE  f.category_id = 2;

(I assume id is the PRIMARY KEY of each table, though id is not needed in file_hashtags.)

I do not believe EXISTS helps with performance for this query.
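One way to check this on your own data is to compare the plans with EXPLAIN. The sketch below uses the table and column names from the question. Note that the plain JOIN form can return the same slug more than once (one row per matching file), so add DISTINCT if you need each hashtag only once, as the EXISTS form gives you.

```sql
-- Plan for the JOIN form.
EXPLAIN
SELECT DISTINCT h.slug
FROM files AS f
JOIN file_hashtags AS fh ON fh.file_id = f.id
JOIN hashtags AS h ON h.id = fh.hashtag_id
WHERE f.category_id = 2;

-- Plan for the EXISTS form, for comparison.
EXPLAIN
SELECT h.slug
FROM hashtags AS h
WHERE EXISTS (
    SELECT 1
    FROM files AS f
    JOIN file_hashtags AS fh ON fh.file_id = f.id
    WHERE fh.hashtag_id = h.id
      AND f.category_id = 2
);
```

Look at the key and rows columns of each plan to see which indexes are used and roughly how many rows are examined per table.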

If you are not using ENGINE=InnoDB, my answer is inadequate. You should be using InnoDB.
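If you are not sure which engine the tables use, a query like this against information_schema should tell you (it assumes the tables live in the currently selected database):

```sql
-- Lists the storage engine of the three tables from the question.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('files', 'hashtags', 'file_hashtags');

-- To convert a table, something like the following works, but it rebuilds
-- the table, so try it on a copy or during a quiet period first:
-- ALTER TABLE files ENGINE=InnoDB;
```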
