MySQL stops using index when additional constraints are added

Using EXPLAIN reveals that the following query does not use my index. Could somebody please explain what is going on?

    SELECT  u.id AS userId, firstName, profilePhotoId, preferredActivityId, preferredSubActivityId, availabilityType,
            3959 * ACOS(COS(radians(requestingUserLat)) * COS(radians(u.latitude)) * COS(radians(u.longitude) - radians(requestingUserLon)) + SIN(radians(requestingUserLat)) * SIN(radians(u.latitude))) AS distanceInMiles
    FROM users u
    WHERE u.latitude     between lat1    and lat2 -- MySQL 5.7 supports Point data type, but it is not indexed in InnoDB. I store latitude and longitude as DOUBLE for now
      AND u.longitude    between lon1    and lon2
      AND u.dateOfBirth  between maxAge  and minAge -- dates are in millis, therefore maxAge will have a smaller value than minAge and so it needs to go first
      AND IF(gender       is null, TRUE, u.gender = gender)
      AND IF(activityType is null, TRUE, u.preferredActivityType = activityType)
      AND u.accountState = 'A'
      AND u.id != userId
    HAVING distanceInMiles < searchRadius ORDER BY distanceInMiles LIMIT pagingStart, pagingLength;


    CREATE INDEX `findMatches` ON `users` (`latitude` ASC, `longitude` ASC, `dateOfBirth` ASC) USING BTREE;


The index is not used at all at this stage. To get it to work, I need to comment out a bunch of columns from the SELECT statement and also remove any unindexed columns from the WHERE clause. The following works:

    SELECT  u.id AS userId, -- firstName, profilePhotoId, preferredActivityId, preferredSubActivityId, availabilityType,
            3959 * ACOS(COS(radians(requestingUserLat)) * COS(radians(u.latitude)) * COS(radians(u.longitude) - radians(requestingUserLon)) + SIN(radians(requestingUserLat)) * SIN(radians(u.latitude))) AS distanceInMiles
    FROM users u
    WHERE u.latitude     between lat1    and lat2 -- MySQL 5.7 supports Point data type, but it is not indexed in InnoDB. We store latitude and longitude as DOUBLE for now
      AND u.longitude    between lon1    and lon2
      AND u.dateOfBirth  between maxAge  and minAge -- dates are in millis, therefore maxAge will have a smaller value than minAge and so it needs to go first
      -- AND IF(gender       is null, TRUE, u.gender = gender)
      -- AND IF(activityType is null, TRUE, u.preferredActivityType = activityType)
      -- AND u.accountState = 'A'
      -- AND u.id != userId
    HAVING distanceInMiles < searchRadius ORDER BY distanceInMiles LIMIT pagingStart, pagingLength;


Other things I tried:
I tried creating three separate single-column indexes in addition to my multi-part index that contains all three keys. Based on the docs here, shouldn't the optimizer merge them by creating a UNION of their qualifying rows, further speeding up execution? It does not; it still selects the multi-part (covering) index.


Any help greatly appreciated!

This is a little difficult to explain.

The query that uses the index does so because the index is a "covering" index for it. That is, every column the query references is in the index, so the query can be answered from the index alone. The only part of the index really being used effectively is the condition on latitude.

Normally, a covering index has to contain every column mentioned in the query, and u.id is not listed in this one. However, the primary key is stored in each secondary index entry to reference the records, so I'm guessing that users.id is the primary key on the table. The index is then scanned for the valid range of latitude values.
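One way to see the covering effect is to compare EXPLAIN output for a covered and a non-covered variant of the query. This is only a sketch: it assumes InnoDB, that id is the primary key, and the literal bounds are placeholder values rather than anything from the question.

```sql
-- Covered: id (via the primary key), latitude and longitude are all
-- stored in the findMatches index, so no row lookup is needed.
EXPLAIN
SELECT u.id, u.latitude, u.longitude
FROM users u
WHERE u.latitude  BETWEEN 40.0  AND 41.0
  AND u.longitude BETWEEN -75.0 AND -73.0;

-- Not covered: firstName is stored only with the full row,
-- so every matching index entry would need an extra row lookup.
EXPLAIN
SELECT u.id, u.firstName
FROM users u
WHERE u.latitude  BETWEEN 40.0  AND 41.0
  AND u.longitude BETWEEN -75.0 AND -73.0;
```

When the index covers the query, the Extra column of EXPLAIN shows "Using index". Whether the second query uses findMatches at all depends on how selective the optimizer estimates the latitude range to be.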

The query that is not using the index avoids it for two reasons. First, the conditions on the indexed columns are all ranges (BETWEEN). An index seek can only use equality conditions on the leading columns plus one range condition, so the index can narrow the search on latitude only; longitude and dateOfBirth can merely be checked as filters. Second, the additional columns in the query are not in the index, which requires going to the data pages anyway.

In other words, the optimizer is, in effect, saying: "Why bother going to the index to scan through the index and then scan the data pages? Instead, I can just scan the data pages and get everything all at once."
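The "equality conditions plus one range" rule suggests one thing to try: an index that leads with the equality column. The sketch below is an assumption, not something from the question. The index name findMatchesByState is hypothetical, the literals are placeholders, and it will not help much if nearly every row has accountState = 'A'.

```sql
-- Equality column first, then the one range column the B-tree can seek on;
-- longitude and dateOfBirth are included so they can be filtered in the index.
CREATE INDEX findMatchesByState
    ON users (accountState, latitude, longitude, dateOfBirth);

-- Placeholder literals stand in for the procedure parameters.
EXPLAIN
SELECT u.id, u.firstName
FROM users u
WHERE u.accountState = 'A'
  AND u.latitude    BETWEEN 40.0  AND 41.0
  AND u.longitude   BETWEEN -75.0 AND -73.0
  AND u.dateOfBirth BETWEEN 315532800000 AND 946684800000;
```

If the optimizer picks this index, Extra may show "Using index condition", meaning the longitude and dateOfBirth filters are applied to index entries before the rows are read. Adding FORCE INDEX (findMatches) after the table alias in the original query is another way to check whether the optimizer's choice of a table scan was actually the faster plan.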

Your next question is undoubtedly: "But how do I make my query faster?" My suggestion would be to investigate spatial indexes.
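As a rough sketch of what that could look like: InnoDB accepts SPATIAL indexes from MySQL 5.7.5 onward, so the DOUBLE columns can be mirrored into a POINT column. The names location and idx_users_location are hypothetical, the coordinates and radius are placeholders, and the sketch assumes every existing row has a latitude and longitude.

```sql
-- A spatially indexed column must be NOT NULL, so fill it before adding the index.
ALTER TABLE users ADD COLUMN location POINT NULL;
UPDATE users SET location = POINT(longitude, latitude);  -- x = longitude, y = latitude
ALTER TABLE users
    MODIFY location POINT NOT NULL,
    ADD SPATIAL INDEX idx_users_location (location);

-- The bounding box goes through the spatial index; the exact distance is
-- computed only for the rows that fall inside the box.
SELECT u.id AS userId,
       ST_Distance_Sphere(u.location, POINT(-74.0060, 40.7128)) / 1609.344 AS distanceInMiles
FROM users u
WHERE MBRContains(
          ST_GeomFromText('POLYGON((-75 40, -73 40, -73 41, -75 41, -75 40))'),
          u.location)
HAVING distanceInMiles < 25
ORDER BY distanceInMiles
LIMIT 0, 20;
```

ST_Distance_Sphere returns meters, hence the division by 1609.344 to get miles. On MySQL 8.0 the column would also need an SRID attribute for the optimizer to use the spatial index, so treat this as a 5.7-oriented starting point rather than a drop-in replacement.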
