I have created two tables with some dummy data. The ARTICLES table consists of id, author_id, title, description, and the AUTHOR table consists of author_id, name, article_list.
In the application flow, I first fetch the list of authors, which gives me each author's name, id and article list. When the user navigates into an author, I can get the list of all that author's articles in two ways.
First:
SELECT * FROM articles WHERE author_id = 100;
Second, if I keep the ids of all of an author's articles as a list in my author table, I can use:
SELECT *
FROM articles
WHERE id IN (100, 1100, 2100, 3100, 4100, 5100, 6100,
7100, 8100, 9100, 10100, 11100, 12100, 13100,
14100, 15100, 16100, 17100, 18100, 19100, 20100,
21100, 22100, 23100, 24100, 25100, 26100, 27100,
28100, 29100, 30100, 31100, 32100, 33100, 34100);
The first query took 0.0329 sec while the second query took 0.0017 sec.
I do not understand how the first query can take more time than the second.
As far as I know, the second query will execute like:
SELECT *
FROM articles
WHERE id = 100
OR id = 1100
OR id = 2100... and so on
Caching.
If you start up the server, then run a query, nothing is yet in the buffer_pool (or table cache or ...). So several files need opening and several blocks need reading. 32.9ms could indicate that you needed to hit the disk (if HDD) 3 times at about 10ms each.
If you run the identical query a second time, everything will be cached, and it will take only a few milliseconds, typically under 10ms.
Since the first query primed the cache with some stuff, the second query found most, maybe all, the blocks it needed. So, it was probably CPU-only, no I/O. 1.7ms is reasonable.
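To make the "run it twice" point concrete, here is a minimal sketch of the measurement pattern. It uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL, so it will not reproduce a cold buffer_pool or disk reads; it only shows how to time repeated runs of the same query. The helper name time_query and the dummy data (35,000 rows, author_id = id % 1000) are assumptions made for illustration.

```python
import sqlite3
import time

def time_query(conn, sql, params=(), runs=2):
    """Run the same query `runs` times; return each elapsed time in ms."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings

# Hypothetical dummy data: ids 1..35000, author_id = id % 1000.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?)",
    [(i, i % 1000, f"title {i}", f"description {i}") for i in range(1, 35001)],
)

timings = time_query(conn, "SELECT * FROM articles WHERE author_id = ?", (100,))
print([f"{t:.3f} ms" for t in timings])
```

On a real MySQL server you would do the same thing with your client of choice: run the query, ignore the first timing, and compare the later ones.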
A possible issue: do you have the "Query cache" turned on? If so, then (in certain situations) a subsequent execution of the same SELECT will find the resultset in the QC and return very fast, possibly < 1ms. One way to be sure to avoid the QC (for realistic timing) is to do SELECT SQL_NO_CACHE ... (The Query cache was removed in MySQL 8.0, so this applies only to older versions.)
The OR query you present is optimized into the IN that you present. That is, they end up being identical. (Using OR with different columns is a performance killer; that is not the situation here.)
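As a quick sanity check that the two forms are interchangeable, the sketch below runs both against the same data and compares the results. It uses Python's sqlite3 with hypothetical dummy data rather than MySQL, so it only demonstrates that the result sets match; it says nothing about how MySQL's optimizer rewrites the query internally.

```python
import sqlite3

# Hypothetical dummy data: ids 1..35000, author_id = id % 1000.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(i, i % 1000, f"title {i}") for i in range(1, 35001)],
)

ids = [100, 1100, 2100]

# Build "id IN (?, ?, ?)" and "id = ? OR id = ? OR id = ?" for the same ids.
in_clause = ", ".join("?" * len(ids))
or_clause = " OR ".join(["id = ?"] * len(ids))

in_rows = conn.execute(
    f"SELECT * FROM articles WHERE id IN ({in_clause}) ORDER BY id", ids
).fetchall()
or_rows = conn.execute(
    f"SELECT * FROM articles WHERE {or_clause} ORDER BY id", ids
).fetchall()

print(in_rows == or_rows)  # True
```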
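Timing tips:
Use SELECT SQL_NO_CACHE ... (to avoid the QC).
Now to analyze what happens if you do not have any index on author_id: the first query has to scan the whole table. If you had INDEX(author_id), the first query would probably run faster, cached or not; the second query already looks rows up by id (presumably the PRIMARY KEY) and would not use that index.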
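To illustrate what an index on author_id changes, here is a small sketch that prints the query plan before and after creating one. It uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 as a stand-in; in MySQL you would use EXPLAIN instead, and the output format differs. The index name idx_articles_author_id, the helper plan and the dummy data are assumptions for the example.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)

# Hypothetical dummy data: ids 1..35000, author_id = id % 1000.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(i, i % 1000, f"title {i}") for i in range(1, 35001)],
)

query = "SELECT * FROM articles WHERE author_id = ?"

before = plan(conn, query, (100,))  # no index yet: full table scan
conn.execute("CREATE INDEX idx_articles_author_id ON articles (author_id)")
after = plan(conn, query, (100,))   # index available: index search

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but the first line should report a scan of articles and the second a search using the new index.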
This may be because there can be thousands of author_ids. For:
SELECT * FROM articles WHERE author_id = 100;
every row needs to be examined, because without an index the condition is checked against the author_id of every row in the table.
And for:
SELECT * FROM articles WHERE id IN (100, 1100, 2100, 3100, 4100, 5100, 6100, 7100, 8100, 9100, 10100, 11100, 12100, 13100, 14100, 15100, 16100, 17100, 18100, 19100, 20100, 21100, 22100, 23100, 24100, 25100, 26100, 27100, 28100, 29100, 30100, 31100, 32100, 33100, 34100);
only a limited number of rows is requested, and each one can be located directly by its id, which is much faster than traversing the whole table.
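A rough sketch of that difference, again using Python's sqlite3 as a stand-in for MySQL: it assumes id is the primary key (the question does not say so explicitly) and uses hypothetical dummy data in which author 100 owns ids 100, 1100, ..., 34100. It prints the plan for the IN lookup and checks that both approaches return the same rows.

```python
import sqlite3

# Hypothetical dummy data: ids 1..35000, author_id = id % 1000.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(i, i % 1000, f"title {i}") for i in range(1, 35001)],
)

# The 35 ids from the question: 100, 1100, ..., 34100.
ids = list(range(100, 35000, 1000))
placeholders = ", ".join("?" * len(ids))
in_sql = f"SELECT * FROM articles WHERE id IN ({placeholders})"

# Plan for the IN lookup: a search by primary key, not a full scan.
pk_plan = " | ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + in_sql, ids)
)
print(pk_plan)

by_ids = sorted(conn.execute(in_sql, ids).fetchall())
by_author = sorted(
    conn.execute("SELECT * FROM articles WHERE author_id = ?", (100,)).fetchall()
)
print(by_ids == by_author)  # True: both approaches return the same 35 rows
```

Note that the two queries only agree as long as the stored article list is kept in sync with the articles table.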