I have a problem with the speed of query. Simple mysql query, but when I have a lot of records (currently > 1 000 000), the performance is really slow. Question is similar to this one, but can't find solution. Explain says that MySQL is using: Using index; Using temporary; Using filesort. Has anyone any suggestions to speed it up? Slow query:
select
`books`.`id`
from `books`
join `books_data` on `books_data`.`book_id` = `books`.`id`
where
`books`.`is_status` = 'active'
order by `books_data`.`date_add` DESC
limit 0, 10
Result:
10 rows (0.525 s)
Explain:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE books NULL ref PRIMARY,is_status,is_status_read_num,is_status_reviews_count,is_status_year_read_num,is_status_poster_name is_status 1 const 112342 100.00 Using index; Using temporary; Using filesort
1 SIMPLE books_data NULL ref book_id,book_id_date_add book_id 4 mon.books.id 1 100.00 NULL
My tables:
CREATE TABLE `books` (
`id` int NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`name_original` varchar(255) DEFAULT NULL,
`annotation` text CHARACTER SET utf8 COLLATE utf8_general_ci,
`year` int DEFAULT NULL,
`year_original` int DEFAULT NULL,
`year_publishing` int DEFAULT NULL,
`poster` varchar(255) DEFAULT NULL,
`isbn` varchar(255) DEFAULT NULL,
`read_num` int NOT NULL DEFAULT '0',
`reviews_count` int NOT NULL DEFAULT '0',
`views_count` int NOT NULL DEFAULT '0',
`rating` decimal(4,1) NOT NULL DEFAULT '0.0',
`is_status` enum('new','active','duplicate','deleted') NOT NULL DEFAULT 'new',
`deleted_reason` varchar(255) DEFAULT NULL,
`duplicate_id` int DEFAULT NULL,
`new_id` int DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `is_status` (`is_status`),
KEY `read_num` (`read_num`),
KEY `reviews_count` (`reviews_count`),
KEY `views_count` (`views_count`),
KEY `is_status_read_num` (`is_status`,`read_num`),
KEY `is_status_reviews_count` (`is_status`,`reviews_count`),
KEY `year_original` (`year_original`),
KEY `year` (`year`),
KEY `is_status_year_read_num` (`is_status`,`year`,`read_num`),
KEY `year_publishing` (`year_publishing`),
KEY `new_id` (`new_id`),
KEY `is_status_poster_name` (`is_status`,`poster`,`name`),
FULLTEXT KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;
CREATE TABLE `books_data` (
`book_id` int NOT NULL,
`date_add` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00' ON UPDATE CURRENT_TIMESTAMP,
`file_id` int DEFAULT NULL,
KEY `book_id` (`book_id`),
KEY `file_id` (`file_id`),
KEY `date_add` (`date_add`),
KEY `book_id_date_add` (`book_id`,`date_add`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;
The problem is this combination:
WHERE
)ORDER BY
)LIMIT
Formulation 1
The query will be executed this way:
bools.is_status
to do the filtering. Note: The cardinality of status
probably does not warrant using an index.books_data
.ORDER BY...
LIMIT 10
rows.Cost:
books.is_status
books_data.book_id
that many times.books_data
that many times.Formulation 2
You should Drop KEY book_id
( book_id
) since it is in the way of using the better ("covering") book_id_date_add
. This eliminates the third bullet item.
Formulation 3
If you could eliminate the test on status
(step 1), then INDEX(date_add)
could be used, and the execution would be:
INDEX(date_add)
, find the last 10 items.books
for books.id
.Cost:
The second bullet could be circumvented by delivering books_data.book_id
instead of books.id
.
However, the second bullet has a side effect that may not be needed -- It verifies that there is an entry in books
for the desired books_data.book_id
Formulation 4
Consider this rewrite:
SELECT bd.book_id
FROM books_book_data AS bd
WHERE EXISTS ( SELECT 1 FROM books WHERE id = bd.book_id );
That eliminates the caveat mentioned in Formulation 3.
The Cost may be:
INDEX(book_id, date_add)
status = 'active'
. (10+ more rows) This formulation 4
Notes
PRIMARY KEY
. If there is no 'natural' PK, use an AUTO_INCREMENT. If the combination
AUTO_INCREMENT. If the combination
(book_id, file_id)` is unique, then it would be a likely PK. INDEX(a)
and INDEX(a,b)
, drop the former; its existence can lead to a less optimal execution plan.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.