
How can I make this query more efficient?

edit: here is a simplified version of the original query (runs in 3.6 secs on a products table of 475K rows)

SELECT p.*, shop FROM products p JOIN
users u ON p.date >= u.prior_login and u.user_id = 22 JOIN
shops s ON p.shop_id = s.shop_id
ORDER BY shop, date, product_id;

this is the explain plan

id  select_type  table  type   possible_keys                     key      key_len  ref                     rows  Extra
1   SIMPLE       u      const  PRIMARY,prior_login,user_id       PRIMARY  4        const                   1     Using temporary; Using filesort
1   SIMPLE       s      ALL    PRIMARY                           NULL     NULL     NULL                    90
1   SIMPLE       p      ref    shop_id,date,shop_id_2,shop_id_3  shop_id  4        bitt3n_minxa.s.shop_id  5338  Using where

The bottleneck seems to be ORDER BY date, product_id. With these two orderings removed, the query runs in 0.06 seconds. (Removing only one of the two has virtually no effect; the query still takes over 3 seconds.) I have indexes on both product_id and date in the products table. I have also added an index on (product_id, date), with no improvement.

newtover suggests that the problem is the INNER JOIN users u1 ON products.date >= u1.prior_login condition, which prevents use of the index on products.date.

Two variations of the query that execute in ~0.006 secs (as opposed to 3.6 secs for the original) have been suggested to me (not from this thread).

this one uses a subquery, which appears to force the order of the joins

SELECT p.*, shop 
  FROM 
  (
    SELECT p.*
    FROM products p 
    WHERE p.date >= (select prior_login FROM users where user_id = 22)
  ) as p
  JOIN shops s 
    ON p.shop_id = s.shop_id
  ORDER BY shop, date, product_id;

this one uses the WHERE clause to do the same thing (although the presence of SQL_SMALL_RESULT doesn't change the execution time, 0.006 secs without it as well)

SELECT SQL_SMALL_RESULT p . * , shop
FROM products p
INNER JOIN shops s ON p.shop_id = s.shop_id
WHERE p.date >= ( 
SELECT prior_login
FROM users
WHERE user_id =22 ) 
ORDER BY shop, DATE, product_id;

My understanding is that these queries work much faster on account of reducing the relevant number of rows of the product table before joining it to the shops table. I am wondering if this is correct.

Use the EXPLAIN statement to see the execution plan. Also you can try adding an index to products.date and u1.prior_login .
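As a minimal sketch of that suggestion, the simplified query from the question can be prefixed with EXPLAIN. Nothing here goes beyond the table and column names already used in the question; the columns worth reading in the output are key, rows and Extra.

```sql
-- Show the plan MySQL chooses for the simplified query from the question.
-- "Using temporary; Using filesort" in the Extra column indicates that the
-- ORDER BY is resolved by a separate sort step rather than by an index.
EXPLAIN
SELECT p.*, shop
FROM products p
JOIN users u ON p.`date` >= u.prior_login AND u.user_id = 22
JOIN shops s ON p.shop_id = s.shop_id
ORDER BY shop, `date`, product_id;
```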

Also please just make sure you have defined your foreign keys and they are indexed.

Good luck.

We do need an explain plan... but

Be very careful with select * from table where id in (select id from another_table). This pattern is notorious for poor performance. Generally it can be replaced by a join. The following query might run, although I haven't tested it.

SELECT shop,
       shops.shop_id AS shop_id,
       products.product_id AS product_id,
       brand,
       title,
       price,
       image AS image,
       image_width,
       image_height,
       0 AS sex,
       products.date AS date,
       fav1.favorited AS circle_favorited,
       fav2.favorited AS session_user_favorited,
       u2.username AS circle_username
  FROM products
       LEFT JOIN favorites fav2
          ON     fav2.product_id = products.product_id
             AND fav2.user_id = 22
             AND fav2.current = 1
       INNER JOIN shops
          ON shops.shop_id = products.shop_id
       INNER JOIN users u1
          ON products.date >= u1.prior_login AND u1.user_id = 22
       LEFT JOIN favorites fav1
          ON products.product_id = fav1.product_id
       LEFT JOIN friends f1
          ON f1.star_id = fav1.user_id
       LEFT JOIN users u2
          ON fav1.user_id = u2.user_id
 WHERE f1.fan_id = 22 OR fav1.user_id = 22
ORDER BY shop,
         DATE,
         product_id,
         circle_favorited

The fact that the query is slow because of the ordering is rather obvious, since it is hard to find an index that would apply to the ORDER BY in this case. The main problem is the products.date >= comparison, which prevents any index from being used for the ORDER BY. And since you have a lot of data to output, MySQL starts using temporary tables for sorting.

What I would do is try to force MySQL to output the data in the order of an index that already has the required order, and remove the ORDER BY clause.

I am not at a computer to test, but here is how I would do it:

  • I would do all the inner joins first
  • then I would LEFT JOIN to a subquery which does all the computations on favorites, ordered by product_id, circle_favorited (which would provide the last ordering criterion).

So the remaining question is how to get the data sorted on shop, date, product_id.

I am going to write about it a bit later =)

UPD1:

You should probably read up on how B-tree indexes work in MySQL. There is a good article on mysqlperformanceblog.com about it (I am writing from a mobile and don't have the link at hand). In short, you seem to be talking about single-column indexes, which arrange pointers to rows according to the sorted values of one column. Compound indexes store an order based on several columns. Indexes are mostly used to narrow down clearly defined ranges of values, so that as much information as possible is obtained before the rows they point at are retrieved. Indexes usually know nothing about other indexes on the same table, and as a result they are rarely merged. When there is no more information to take from the index, MySQL starts to operate directly on the data.

That is, an index on date cannot make use of the index on product_id, but an index on (date, product_id) can provide further information about product_id after a condition on date (for example, sorting on product_id for a specific date match).

Nevertheless, a range condition on date (>=) breaks this. That is what I was talking about.
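To make the difference concrete, here is a rough sketch comparing an equality and a range condition against a compound index. The index name and the date literal are assumptions made up for illustration; only the products table and its date and product_id columns come from the question.

```sql
-- Hypothetical index name; what matters is the column order (date first).
CREATE INDEX idx_products_date_pid ON products (`date`, product_id);

-- Equality on `date`: the matching index entries are already ordered by
-- product_id, so the ORDER BY can usually be served from the index.
EXPLAIN
SELECT product_id
FROM products
WHERE `date` = '2011-06-01'   -- illustrative literal, not from the question
ORDER BY product_id;

-- Range on `date`: the entries come back ordered by date first, so
-- product_id is only sorted within each date and an extra sort is needed.
EXPLAIN
SELECT product_id
FROM products
WHERE `date` >= '2011-06-01'  -- illustrative literal, not from the question
ORDER BY product_id;
```

Comparing the Extra column of the two plans should show whether a filesort appears only in the range case.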

UPD2:

As I understand it, the problem can be reduced to the following (this is where most of the time is spent):

SELECT p.*, shop
FROM products p
JOIN users u ON p.`date` >= u.prior_login and u.user_id = 22
JOIN shops s ON p.shop_id = s.shop_id
ORDER BY shop, `date`, product_id;

Now add an index (user_id, prior_login) on users and (date) on products, and try the following query:

SELECT STRAIGHT_JOIN p.*, shop
FROM (
  SELECT product_id, shop
  FROM users u
  JOIN products p
    ON u.user_id = 22 AND p.`date` >= u.prior_login
  JOIN shops s
    ON p.shop_id = s.shop_id
  ORDER BY shop, p.`date`, product_id
) as s
JOIN products p USING (product_id);

If I am correct, the query should return the same result, but more quickly. It would be nice if you could post the result of EXPLAIN for the query.
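The two indexes mentioned above could be created roughly as follows. The index names are hypothetical, and the EXPLAIN output in the question suggests that products may already have an index on date, so it is worth checking the existing indexes first.

```sql
-- Check which indexes already exist before adding new ones.
SHOW INDEX FROM users;
SHOW INDEX FROM products;

-- Hypothetical index names; the column lists are the ones suggested above.
CREATE INDEX idx_users_uid_prior_login ON users (user_id, prior_login);
CREATE INDEX idx_products_date ON products (`date`);
```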

