简体   繁体   中英

Mysql, Complex Where clause slows down query

I have a map tables which is structured like this:

CREATE TABLE `map` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `occupied` tinyint(2) NOT NULL DEFAULT '0',
  `c_type` tinyint(4) NOT NULL DEFAULT '0',
  `x` int(11) NOT NULL,
  `y` int(11) NOT NULL,
  `terrain` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `get_map_w_radius` (`x`,`y`,`id`,`terrain`,`occupied`,`c_type`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8mb4_general_ci

There is 40k records with x and y both from 1 to 200.

And in my script I use it like this:

SELECT id, terrain, occupied, c_type FROM map WHERE x >= $x-$radius AND x <= $x+$radius AND y >= $y-$radius AND y <= $y+$radius LIMIT 30

For example $x is 15 and y is 95 and radius is 5. When I profile the query, sending data is 0.000496 milliseconds, but without covering index (only x and y instead) it runs faster, even though in theory it should be other way around? With covering index, when I do a select query with simple where clause using x and y only one time:

SELECT id, terrain, c_type, occupied FROM map WHERE x >= $x And y <= $y limit 30;

It's executes much faster, sending data only in 0.000059. Is there something I am missing or misunderstanding? Maybe it's supposed to be like this?

For a mere 40K rows, suggest adding

INDEX(x),
INDEX(y)

That way, the Optimizer can look at the BETWEENs and pick the one thw might work better and shrink the work some .

Further optimizations are tricky. They are discussed repeatedly in Questions tagged [latitude-longitude].

(Terminology quibble) "Radius" implies a 2D "distance". What you have is a "bounding box".

Without ORDER BY , the query will return any 30 rows, not necessarily the closest 30. If you are satisfied with that, fine, since it is faster.

"Covering index" vs INDEX(x) -- I have a rule: Don't create an index of more than 5 columns. There is nothing really wrong with doing so, it gets gets bulky. My suggestion of also having INDEX(y) is based on the assumption that y is sometimes a better filter.

Beware of the Query Cache -- If it is turned on, your 'faster' runs may be due to that. Run your timings with SELECT SQL_NO_CACHE ... to get honest comparisons.

Is your 6-col index UNIQUE? If so, make it the PK, and totally get rid of id .

If x and y are always 0..200, then make them TINYINT UNSIGNED (range of 0..255, and 1 byte instead of 4).

All flavors of 'range' ( BETWEEN , <= - 2-sided or 1-sided) are performed the same. So any performance differences are artifacts of other things...

  1. drill down the BTree to the starting value (which could be the beginning of the table)
  2. scan forward
  3. stop if LIMIT reached and there is no ORDER BY
  4. stop at the ending value (or end of table)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM