There is a table called `transactions` with ~6 million rows. The query below computes the current user balance. Here is the log after I enabled `slow_query_log = 'ON'`:
# Time: 170406 9:51:48
# User@Host: root[root] @ [xx.xx.xx.xx]
# Thread_id: 13 Schema: main_db QC_hit: No
# Query_time: 38.924823 Lock_time: 0.000034 Rows_sent: 1 Rows_examined: 773550
# Rows_affected: 0
SET timestamp=1491456108;
SELECT SUM(`Transaction`.`amount`) as total
FROM `main_db`.`transactions` AS `Transaction`
WHERE `Transaction`.`user_id` = 1008
AND `Transaction`.`confirmed` = 1
LIMIT 1;
As you can see, it took ~38 seconds!
Here is the `transactions` table EXPLAIN:
This query sometimes runs fast (~1 second) and sometimes really slow! Any help would be greatly appreciated.
PS: It's InnoDB, and the `transactions` table has frequent INSERT and SELECT operations.
I tried running the query with `SQL_NO_CACHE`, but it is still sometimes fast, sometimes slow.
`transactions` table schema:
CREATE TABLE `transactions` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL,
`ref_id` varchar(40) COLLATE utf8_persian_ci NOT NULL,
`payment_id` tinyint(3) unsigned NOT NULL,
`amount` decimal(10,1) NOT NULL,
`created` datetime NOT NULL,
`private_note` varchar(6000) COLLATE utf8_persian_ci NOT NULL,
`public_note` varchar(200) COLLATE utf8_persian_ci NOT NULL,
`confirmed` tinyint(3) NOT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=13133663 DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci
MySQL is running on a VPS with 12GB RAM and 9 Logical CPU cores.
Here is a part of my `my.cnf`:
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
default_storage_engine = InnoDB
# you can't just change log file size, requires special procedure
innodb_buffer_pool_size = 9G
innodb_log_buffer_size = 8M
innodb_file_per_table = 1
innodb_open_files = 400
innodb_io_capacity = 400
innodb_flush_method = O_DIRECT
innodb_thread_concurrency = 0
innodb_read_io_threads = 64
innodb_write_io_threads = 64
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
#bind-address = 127.0.0.1
#
# * Fine Tuning
#
max_connections = 500
connect_timeout = 5
wait_timeout = 600
max_allowed_packet = 16M
thread_cache_size = 128
sort_buffer_size = 4M
bulk_insert_buffer_size = 16M
tmp_table_size = 32M
max_heap_table_size = 32M
(Sorry to step on all the good comments. I hope I have added enough to justify claiming an "Answer".)
Are there 6M rows in the table? But 773K rows with that `user_id`?
A 9GB buffer_pool? The table is about 4GB of data? So it fits in the buffer_pool, if there is not much else to bump it out. (Run `SHOW TABLE STATUS` and check "Data_length".)
The existing `INDEX(user_id)` might be 20MB, easily cacheable.
If the user_ids are sufficiently scattered around the table, the query probably needs to fetch virtually every 16KB block of the data. So, the original query with the original index will go something like this: scan `INDEX(user_id)` for the matching entries, then for each of the ~773K entries, reach into the row data to check `confirmed` and fetch `amount`. The index scan itself will be a minor part of the total effort. If you change to the optimal, "covering", `INDEX(user_id, confirmed, amount)`, things change some...
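A sketch of that covering-index change (the index name is my own choice):

```sql
-- With (user_id, confirmed, amount) all in one index, the SUM can be
-- answered by scanning the index alone, without touching the ~4GB of
-- row data scattered across 16KB blocks.
ALTER TABLE transactions
    ADD INDEX idx_user_conf_amount (user_id, confirmed, amount);
```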
If there were also a date range in the `WHERE` clause, I would push for building and maintaining a "Summary table". This might speed up similar queries by a factor of 10.
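A minimal sketch of such a summary table, assuming a hypothetical daily rollup named `transactions_daily` (all names and column sizes are illustrative):

```sql
-- One row per (user, day), maintained by a nightly job.
CREATE TABLE transactions_daily (
    user_id    INT UNSIGNED  NOT NULL,
    day        DATE          NOT NULL,
    amount_sum DECIMAL(14,1) NOT NULL,
    PRIMARY KEY (user_id, day)
) ENGINE=InnoDB;

-- Nightly maintenance: fold yesterday's confirmed rows into the rollup.
INSERT INTO transactions_daily (user_id, day, amount_sum)
SELECT user_id, DATE(created), SUM(amount)
  FROM transactions
 WHERE confirmed = 1
   AND created >= CURDATE() - INTERVAL 1 DAY
   AND created <  CURDATE()
 GROUP BY user_id, DATE(created);
```

A range query then sums a handful of daily subtotals instead of hundreds of thousands of raw rows.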
If you do add a composite index starting with `user_id`, you should (not must) `DROP` the index on just `user_id`, as it becomes redundant. (If you don't drop it, it will mostly just waste disk space.)
As for doing it in production, there are two main options:
- `ALTER TABLE ... ALGORITHM=INPLACE ...`, which is feasible for adding/dropping indexes with minimal impact.
- `pt-online-schema-change`. It requires that there be no other Triggers, and does take a very short downtime. The Trigger takes care of the 200 writes/minute 'transparently'.
`ALGORITHM=INPLACE` was added in MySQL 5.6 and MariaDB 10.0.
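For example (index name is illustrative), adding the covering index and dropping the now-redundant one could be done in a single online ALTER:

```sql
-- ALGORITHM=INPLACE makes the statement fail with an error rather than
-- silently falling back to a full table copy; LOCK=NONE keeps concurrent
-- reads and writes flowing during the build.
ALTER TABLE transactions
    ADD INDEX idx_user_conf_amount (user_id, confirmed, amount),
    DROP INDEX user_id,
    ALGORITHM=INPLACE, LOCK=NONE;
```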
(Yes, I am adding another answer. Justification: It addresses the underlying problem a different way.)
The underlying problem seems to be that there is an ever-growing "transaction" table from which various statistics are derived, such as `SUM(amount)`. The performance of this will only get worse and worse as the table(s) grow.
The basis for this Answer will be to look at the data in two ways: "History" and "Current". `Transactions` is the History. A new table would be the `Current` totals for each User. But I see multiple ways to do that. Each involves some form of subtotal(s), so as to avoid adding 773K rows to get the answer.
- The traditional banking way: each night, tally up the day's `Transactions` and add them to `Current`.
- The materialized-view way: as each row is added to `Transactions`, increment `Current`.
- The hybrid way: keep periodic subtotals; sum them to get the `SUM` through last night.
More discussion in my blog on Summary Tables.
Note that the up-to-the-second balance for the banking or hybrid way is a little tricky: take the stored total through last night, then add any `Transactions` that have occurred since then.
Any of the approaches will be a lot faster than scanning all 773K rows for the user, but it will be more complex code.
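As an illustrative sketch of that up-to-the-second lookup, assuming the hypothetical daily rollup `transactions_daily(user_id, day, amount_sum)` (all names are my own, not from the question):

```sql
-- Balance = subtotals through last night + today's confirmed rows.
SELECT (SELECT IFNULL(SUM(amount_sum), 0)
          FROM transactions_daily
         WHERE user_id = 1008)         -- cheap: a few thousand rows at most
     + (SELECT IFNULL(SUM(amount), 0)
          FROM transactions
         WHERE user_id = 1008
           AND confirmed = 1
           AND created >= CURDATE())   -- cheap: only today's rows
       AS total;
```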
One thing you might try is adding a composite index to see if it speeds up the select part of the query:
ALTER TABLE `transactions` ADD INDEX `user_confirmed` (`user_id`, `confirmed`);
Also, as @wajeeh pointed out in a comment, the `LIMIT` clause is unnecessary here since you're already calling an aggregate function.
It would be helpful if you can post the table schema in your question as well.
Take a look at this answer: Any way to select without causing locking in MySQL?
And this article: Consistent Nonlocking Reads.
In your case, as I think and as @billynoah mentioned, the table has to handle many write operations since it acts as a log table, so this may help you.
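As a hedged sketch of that idea (values taken from the question; whether relaxing isolation is acceptable depends on your consistency requirements):

```sql
-- Plain SELECTs in InnoDB are already consistent nonlocking reads;
-- READ COMMITTED further reduces the read view's interaction with the
-- heavy concurrent INSERT traffic, at the cost of repeatable reads.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

SELECT SUM(amount) AS total
  FROM transactions
 WHERE user_id = 1008
   AND confirmed = 1;
```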