简体   繁体   中英

MYSQL: query doesn't use the index some time

Database of test:

SET NAMES utf8;
SET foreign_key_checks = 0;
SET time_zone = '+02:00';
SET sql_mode = 'NO_AUTO_VALUE_ON_ZERO';

CREATE TABLE `account` (
  `idAccount` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(128) NOT NULL,
  PRIMARY KEY (`idAccount`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8; 

CREATE TABLE `users` (
  `idUser` int(11) NOT NULL AUTO_INCREMENT,
  `idAccount` int(11) NOT NULL,
  `firstName` varchar(128) NOT NULL,
  PRIMARY KEY (`idUser`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;


DROP TABLE IF EXISTS `transactions`;
CREATE TABLE `transactions` (
  `idTransactions` int(11) NOT NULL AUTO_INCREMENT,
  `idUser` int(11) NOT NULL,
  `dateTransaction` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`idTransactions`),
  KEY `index_dateTransaction` (`dateTransaction`) USING BTREE
) ENGINE=MyISAM DEFAULT CHARSET=utf8;


INSERT INTO `transactions` (`idTransactions`, `idUser`, `dateTransaction`) VALUES
(1, 1,  '2012-12-16 15:52:32'),
(2, 1,  '2012-12-20 15:52:37'),
(3, 1,  '2013-02-01 15:52:37'),
(4, 2,  '2013-03-16 15:52:37'),
(5, 2,  '2013-03-18 15:52:37'),
(6, 3,  '2014-04-19 15:52:37'),
(7, 3,  '2014-05-20 15:52:37'),
(8, 4,  '2014-06-21 15:58:46');

INSERT INTO `account` (`idAccount`, `name`) VALUES
(1, 'Burger & Burger');

INSERT INTO `users` (`idUser`, `idAccount`, `firstName` ) VALUES
(1, 1,  'Roberto'),
(2, 1,  'Alessandro');

Depending with the date passed, sometimes MYSQL doesn't use the INDEX.

I know that I need to add / edit INDEX, please, could you please help me to perform this query very well?

This query doesn't use the INDEX:

SELECT 
    users.firstName,
    ts1.*,
    COUNT(transactions.dateTransaction) AS num_transactions
FROM users
    INNER JOIN transactions ON transactions.idUser = users.idUser
    INNER JOIN ( 
        SELECT 
            users.idUser,
            MIN(transactions.dateTransaction) AS first_transaction,
            MAX(transactions.dateTransaction) AS last_transaction
        FROM transactions
            INNER JOIN users ON transactions.idUser = users.idUser
        WHERE (users.idAccount = 1) 
        GROUP BY users.idUser 
    ) AS ts1 ON users.idUser = ts1.idUser
WHERE 
    transactions.dateTransaction BETWEEN ('2012-01-01') AND ('2013-12-31')
AND users.idAccount = 1
GROUP BY users.idUser

EXPLAIN link: http://sqlfiddle.com/#!2/059d8/7/0

This query use it:

SELECT 
    users.firstName,
    ts1.*,
    COUNT(transactions.dateTransaction) AS num_transactions
FROM users
        INNER JOIN transactions ON transactions.idUser = users.idUser
        INNER JOIN ( 
            SELECT 
                users.idUser,
                MIN(transactions.dateTransaction) AS first_transaction,
                MAX(transactions.dateTransaction) AS last_transaction
            FROM transactions
                INNER JOIN users ON transactions.idUser = users.idUser
            WHERE users.idAccount = 1
            GROUP BY users.idUser
        ) AS ts1 ON users.idUser = ts1.idUser
WHERE 
    transactions.dateTransaction BETWEEN ('2012-01-01') AND ('2012-12-31')
AND users.idAccount = 1
GROUP BY users.idUser

Change only the year.

But the biggest problem is that in production environment, with ~65.000 rows of transactions, query hangs on over 60 seconds (!)

I created a sqlfiddle, this is the link: http://sqlfiddle.com/#!2/059d8/1/0

Thank you very much!

Add the following two indexes:

ALTER TABLE `users` ADD KEY `bk1_account_user` (idAccount, idUser);

ALTER TABLE `transactions` KEY `bk2_user_datetrans` (idUser, dateTransaction);

This allows all the tables to be accessed by covering indexes, and eliminates some of the ALL type tables. See the SQLfiddle for details: http://sqlfiddle.com/#!2/b11bb/4

Also, consider upgrading to 5.6, to get rid of the "using join buffer".

This is interesting. I played with the dates, and if the filter is obviously off (using year 2001 for example) mysql uses its CONST tables to compute the query:

Impossible WHERE noticed after reading const tables

I suspect there's a strong optimization on the date columns which I guess is interfering with the index calculations. But I'm not sure about this...

Nonetheless, your query can be improved.

Take a look at this one:

SELECT 
    users.firstName,
    ts1.*
FROM users
    JOIN ( 
        SELECT 
            users.idUser,
            MIN(transactions.dateTransaction) AS first_transaction,
            MAX(transactions.dateTransaction) AS last_transaction,
            COUNT(transactions.dateTransaction) AS num_transactions
        FROM transactions
            JOIN users ON transactions.idUser = users.idUser AND users.idAccount = 1
        WHERE 
            transactions.dateTransaction BETWEEN ('2011-01-01') AND ('2011-07-31')
        GROUP BY users.idUser
    ) AS ts1 ON users.idUser = ts1.idUser
WHERE 
   users.idAccount = 1
GROUP BY users.idUser;

I moved in the subquery the COUNT and the WHERE clause, so you only have to use the transactions table once. But it means that the meaning of the query changed, you have to check if it is what you want. Now, the count will count only the transactions between these 2 dates while before, it was counting them in general for the given user, regardless of the date. If you don't think it fits your needs, just ignore my change.

From the DDL perspective, I think you can improve it like this:

  1. IF and only IF, you have many different user accounts (cardinality of idAccount >20-30), spread more or less equally:

KEY index_idAccount ( idAccount ) on the user table.

2. Change your existing index index_dateTransaction to use the idUser too:

KEY index_dateTransaction ( idUser , dateTransaction )

Final result would be as follows:

在此处输入图片说明

您应该在transactions.idUser,users.idUser和transactions.dateTransaction上具有索引

If I understand you right, you need the dates of the first and last transaction for each user with account = 1, plus the total number of the user's transactions within a certain period.

This is best done like this:

SELECT  u.*,
        (
        SELECT  MIN(dateTransaction)
        FROM    transactions t
        WHERE   t.idUser = u.idUser
        ) minDate,
        (
        SELECT  MAX(dateTransaction)
        FROM    transactions t
        WHERE   t.idUser = u.idUser
        ) maxDate,
        (
        SELECT  COUNT(*)
        FROM    transactions t
        WHERE   t.idUser = u.idUser
                AND t.dateTransaction BETWEEN '2012-01-01' AND '2012-02-02'
        ) cnt
FROM    users u
WHERE   u.idAccount = 1

Create the following indexes:

users (idAccount)
transactions (idUser, dateTransaction)

I don't include the primary keys into the indexes which I should have done on MyISAM tables, however, you should not use MyISAM unless you have a specific reason for that (which I don't think you have). Change your engine to InnoDB.

See this fiddle: http://sqlfiddle.com/#!2/d92e6/3

On a side note, if this query is frequent, you should consider materializing some of its results. If you keep the daily or monthly transaction counts per user in a separate table which would be updated with a trigger, the most costly part of your query, the COUNT , would go away, which would improve the query greatly.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM