
How to improve a simple MySQL-Query

I have to run a rather simple query on a live system in order to get a count. The problem is that the table and database are rather inefficiently designed, and since it is a live system, altering it is not an option at this point.
So I have to figure out a query that runs fast and won't slow the system down too much. While the query executes, the system basically stops, which is not what I want a live system to do, so I need to streamline the query until it runs in an acceptable time.

SELECT id1, COUNT(id2) AS count FROM table GROUP BY id1 ORDER BY count DESC;

So here is the query. Unfortunately it is so simple that I am out of ideas on how to improve it further. Maybe someone else has an idea?

Application

Get "good enough" results via application changes:

If you have access to the application, but not the database, then there are possibilities:

Periodically run that slow query and capture the results. Then use the cached results.

Do you need all

What is the goal? Find a few of the most common id1's? Rank all of them?

Back to the query

COUNT(id2) checks for id2 being not null; this is usually unnecessary, so COUNT(*) is better. However, the speedup is insignificant.

ORDER BY NULL is irrelevant if you are picking off the rows with the highest COUNT -- the sort needs to be done somewhere. Moving it to the application does not help; at least not much.

Adding LIMIT 10 would help only by cutting down the time needed to send the data back to the client.

INDEX(id1) is the best index for the query (after changing to COUNT(*)). But the operation still requires:

  • full index scan to do the COUNT and GROUP BY
  • sort the grouped results -- for the ORDER BY
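
The difference between COUNT(id2) and COUNT(*) can be tried out on a toy table. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so it is self-contained; that substitution is an assumption, and so are the table name `t`, the index name `idx_id1`, and the sample rows. The NULL handling of COUNT is standard SQL, but optimizer behavior and timings on the real table will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id1 INTEGER, id2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    # Made-up rows; note the NULL id2 in the id1 = 1 group.
    [(1, 10), (1, 11), (1, None), (2, 20), (3, 30), (3, 31)],
)
# The index suggested above, on the grouping column only.
conn.execute("CREATE INDEX idx_id1 ON t (id1)")

# COUNT(id2) skips rows where id2 IS NULL.
count_col = conn.execute(
    "SELECT id1, COUNT(id2) AS cnt FROM t GROUP BY id1 ORDER BY cnt DESC, id1"
).fetchall()

# COUNT(*) counts every row in the group, so it never has to look at id2.
count_star = conn.execute(
    "SELECT id1, COUNT(*) AS cnt FROM t GROUP BY id1 ORDER BY cnt DESC, id1"
).fetchall()

print(count_col)   # [(1, 2), (3, 2), (2, 1)]
print(count_star)  # [(1, 3), (3, 2), (2, 1)]
conn.close()
```

The two results differ only for the group that contains a NULL id2, so switching to COUNT(*) is only a safe rewrite if id2 is never NULL or if NULL rows should be counted.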

Zero or near-zero downtime

Do you have replication established? Galera Clustering?

Look into pt-online-schema-change and gh-ost.

What is the real goal?

We cannot fix the query as written. What things can we change? Better yet, what is the ultimate goal -- perhaps there is an approach that does not involve any query that looks the least like the one you are trying to speed up.

In the end I dumped the table, imported it into a MySQL Docker container, and ran the query there. It took ages, and I actually had to move my entire Docker setup because the dump was so huge, but I got my results, and now I know how many id2s are associated with specific id1s (apostrophe to form a plural? You may want to double-check that ;) ).
As was already pointed out, there wasn't much room left to improve the query.

FYI, suddenly nobody cared about stopping the system anymore, and now we are indexing the table. So far it has taken 6 hours, no end in sight :D

Anyways, thanks for the help everyone.
