I am writing a script that lists 25 items for each of 12 categories. The database structure looks like this:
tbl_items
---------------------------------------------
item_id | item_name | item_value | timestamp
---------------------------------------------
tbl_categories
-----------------------------
cat_id | item_id | timestamp
-----------------------------
There are around 600,000 rows in the table tbl_items. I am using this SQL query:
SELECT e.item_id, e.item_value
FROM tbl_items AS e
JOIN tbl_categories AS cat WHERE e.item_id = cat.item_id AND cat.cat_id = 6001
LIMIT 25
I run the same query in a loop for cat_id from 6000 to 6012, but I want the latest records of every category. If I use something like:
SELECT e.item_id, e.item_value
FROM tbl_items AS e
JOIN tbl_categories AS cat WHERE e.item_id = cat.item_id AND cat.cat_id = 6001
ORDER BY e.timestamp
LIMIT 25
...the query runs for approximately 10 minutes, which is not acceptable. Can I use LIMIT more cleverly to get the latest 25 records for each category?
Can anyone help me achieve this without ORDER BY? Any ideas or help will be highly appreciated.
EDIT
tbl_items
+---------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+-------+
| item_id | int(11) | NO | PRI | 0 | |
| item_name | longtext | YES | | NULL | |
| item_value | longtext | YES | | NULL | |
| timestamp | datetime | YES | | NULL | |
+---------------------+--------------+------+-----+---------+-------+
tbl_categories
+----------------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------+------+-----+---------+-------+
| cat_id | int(11) | NO | PRI | 0 | |
| item_id | int(11) | NO | PRI | 0 | |
| timestamp | datetime | YES | | NULL | |
+----------------+------------+------+-----+---------+-------+
Can you add indices? If you add an index on the timestamp and other appropriate columns, the ORDER BY won't take 10 minutes.
First of all:
There seems to be an N:M relation between items and categories: an item may be in several categories. I say this because categories has an item_id foreign key.
If it is not an N:M relationship, you should consider changing the design. If it is a 1:N relationship, where a category has several items, then item must contain a category_id foreign key.
Working with N:M:
I have rewritten your query to use an inner join instead of a cross join:
SELECT e.item_id, e.item_value
FROM
tbl_items AS e
JOIN
tbl_categories AS cat
on e.item_id = cat.item_id
WHERE
cat.cat_id = 6001
ORDER BY
e.timestamp DESC -- newest first, since the question asks for the latest records
LIMIT 25
To optimize performance, the required index is:
create index idx_1 on tbl_categories( cat_id, item_id)
An index on items is not mandatory because the primary key is already indexed. An index that contains timestamp doesn't help as much. To be sure, you can try an index on items with item_id and timestamp, to avoid accessing the table and take the values from the index:
create index idx_2 on tbl_items( item_id, timestamp)
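To increase performance you can replace your loop over categories with a single query. The limit has to apply per category rather than to the whole result, so the subquery ranks the rows inside each category with ROW_NUMBER() (available in MySQL 8.0 and later) and the outer query keeps the first 25 of each:
select T.cat_id, T.item_id, T.item_value from
(SELECT cat.cat_id, e.item_id, e.item_value,
-- rn = 1 is the newest item of its category
ROW_NUMBER() OVER (PARTITION BY cat.cat_id ORDER BY e.timestamp DESC) AS rn
FROM
tbl_items AS e
JOIN
tbl_categories AS cat
on e.item_id = cat.item_id
WHERE
cat.cat_id between 6000 and 6012
) T
WHERE
T.rn <= 25
ORDER BY
T.cat_id, T.item_id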
Please try these queries and come back with your comments so that we can refine them if necessary.
Leaving aside all other factors, I can tell you that the main reason why the query is so slow is that the result involves longtext columns.
BLOB and TEXT fields in MySQL are mostly meant to store complete files, textual or binary. They are stored separately from the row data for InnoDB tables. Each time a query involves sorting (explicitly or for a group by), MySQL is sure to use disk for the sorting (because it cannot be sure in advance how large any file is).
As a rule of thumb: if you need to return more than a single row of a column in a query, the type of the field should almost never be TEXT or BLOB; use VARCHAR or VARBINARY instead.
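If changing the column types is not an option, one thing that may be worth trying is to sort only the narrow columns (item_id and timestamp) in a derived table and join back to tbl_items for item_value afterwards, so the wide column is only read for the rows that survive the LIMIT. The sketch below is an assumption-laden illustration, not a measured fix: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the helper names latest_items and make_demo_db as well as the demo rows are made up. Whether it actually helps depends on the MySQL version and the execution plan.

```python
import sqlite3

# Sort only the narrow columns first, then fetch item_value for the
# surviving rows. The "?" placeholders are sqlite3 style; MySQL drivers
# typically use "%s" instead.
DEFERRED_JOIN_SQL = """
SELECT e.item_id, e.item_value
FROM (
    SELECT i.item_id, i.timestamp
    FROM tbl_items AS i
    JOIN tbl_categories AS cat ON cat.item_id = i.item_id
    WHERE cat.cat_id = ?
    ORDER BY i.timestamp DESC
    LIMIT ?
) AS newest
JOIN tbl_items AS e ON e.item_id = newest.item_id
ORDER BY newest.timestamp DESC
"""


def latest_items(conn, cat_id, limit):
    """Return the newest `limit` (item_id, item_value) rows of one category."""
    return conn.execute(DEFERRED_JOIN_SQL, (cat_id, limit)).fetchall()


def make_demo_db():
    """Build a tiny in-memory stand-in for the two tables (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tbl_items (
            item_id INTEGER PRIMARY KEY, item_name TEXT,
            item_value TEXT, timestamp TEXT);
        CREATE TABLE tbl_categories (
            cat_id INTEGER, item_id INTEGER, timestamp TEXT,
            PRIMARY KEY (cat_id, item_id));
    """)
    for item_id in range(1, 7):
        ts = "2020-01-%02d 00:00:00" % item_id
        conn.execute("INSERT INTO tbl_items VALUES (?, ?, ?, ?)",
                     (item_id, "name%d" % item_id, "value%d" % item_id, ts))
        # Items 1-5 go to category 6001, item 6 goes to category 6002.
        cat_id = 6001 if item_id <= 5 else 6002
        conn.execute("INSERT INTO tbl_categories VALUES (?, ?, ?)",
                     (cat_id, item_id, ts))
    return conn


if __name__ == "__main__":
    print(latest_items(make_demo_db(), 6001, 3))
```

The SQL text itself should carry over to MySQL unchanged apart from the placeholder style. Comparing EXPLAIN output for both forms on the real tables would show whether the derived table is worth it.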
UPD
If you cannot alter the table, the query will hardly be fast with the current indexes and column types. Anyway, here is a similar question with a popular solution to your problem: How to SELECT the newest four items per category?
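For reference, here is a minimal sketch of one classic "newest N per category" formulation that does not need window functions: a row is kept when fewer than N rows of the same category are newer than it. It is not necessarily the approach from the linked question. It runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, and the helper names build_demo_db and newest_per_category and the demo data are hypothetical. The correlated subquery can be slow on large tables, so treat this as an illustration of the idea rather than a tuned solution.

```python
import sqlite3

# Keep a row when fewer than N rows of the same category are newer.
# Rows that share a timestamp can push a category above N rows.
NEWEST_PER_CATEGORY_SQL = """
SELECT c.cat_id, i.item_id
FROM tbl_categories AS c
JOIN tbl_items AS i ON i.item_id = c.item_id
WHERE (
    SELECT COUNT(*)
    FROM tbl_categories AS c2
    JOIN tbl_items AS i2 ON i2.item_id = c2.item_id
    WHERE c2.cat_id = c.cat_id
      AND i2.timestamp > i.timestamp
) < ?
ORDER BY c.cat_id, i.timestamp DESC
"""


def newest_per_category(conn, n):
    """Return up to n newest (cat_id, item_id) pairs for every category."""
    return conn.execute(NEWEST_PER_CATEGORY_SQL, (n,)).fetchall()


def build_demo_db():
    """Create ten demo items split over two categories (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tbl_items (
            item_id INTEGER PRIMARY KEY, item_name TEXT,
            item_value TEXT, timestamp TEXT);
        CREATE TABLE tbl_categories (
            cat_id INTEGER, item_id INTEGER, timestamp TEXT,
            PRIMARY KEY (cat_id, item_id));
    """)
    for item_id in range(1, 11):
        ts = "2020-01-%02d 00:00:00" % item_id
        conn.execute("INSERT INTO tbl_items VALUES (?, ?, ?, ?)",
                     (item_id, "name%d" % item_id, "value%d" % item_id, ts))
        # Even item ids land in category 6000, odd ones in 6001.
        conn.execute("INSERT INTO tbl_categories VALUES (?, ?, ?)",
                     (6000 + item_id % 2, item_id, ts))
    return conn


if __name__ == "__main__":
    for row in newest_per_category(build_demo_db(), 2):
        print(row)
```

With N set to 25 and a filter on cat_id between 6000 and 6012 added to the outer WHERE clause, the same statement would cover the categories from the question in one round trip.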