Optimizing an InnoDB table and a problematic query

I have a biggish InnoDB table which at this moment contains about 20 million rows with ~20000 new rows inserted every day. They contain messages for different topics.

CREATE TABLE IF NOT EXISTS `Messages` (
  `ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `TopicID` bigint(20) unsigned NOT NULL,
  `DATESTAMP` int(11) DEFAULT NULL,
  `TIMESTAMP` int(10) unsigned NOT NULL,
  `Message` mediumtext NOT NULL,
  `Checksum` varchar(50) DEFAULT NULL,
  `Nickname` varchar(80) NOT NULL,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `TopicID` (`TopicID`,`Checksum`),
  KEY `DATESTAMP` (`DATESTAMP`),
  KEY `Nickname` (`Nickname`),
  KEY `TIMESTAMP` (`TIMESTAMP`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=25195126 ;

NOTE: The Checksum column stores an MD5 checksum which prevents the same message from being inserted twice into the same topic. (nickname + timestamp + topicid + last 20 chars of message)

The site I'm building has a newsfeed in which users can choose to view the newest messages from different Nicknames on different forums. The query is as follows:

SELECT
Messages.ID AS MessageID,
Messages.Message,
Messages.TIMESTAMP,
Messages.Nickname,
Topics.ID AS TopicID,
Topics.Title AS TopicTitle,
Forums.Title AS ForumTitle

FROM Messages   

JOIN FollowedNicknames ON FollowedNicknames.UserID = 'MYUSERID'
JOIN Forums ON Forums.ID = FollowedNicknames.ForumID
JOIN Subforums ON Subforums.ForumID = Forums.ID
JOIN Topics ON Topics.SubforumID = Subforums.ID

WHERE 

Messages.Nickname = FollowedNicknames.Nickname AND 
Messages.TopicID = Topics.ID AND Messages.DATESTAMP = '2013619'
ORDER BY Messages.TIMESTAMP DESC

The TIMESTAMP column contains a Unix timestamp, and DATESTAMP is simply a date generated from that timestamp so that rows can be matched with the '=' operator instead of a range scan over Unix timestamps.

The problem is that this query takes about 13 seconds (or more) unbuffered. That is of course unacceptable for the intended usage. Adding the DATESTAMP column seemed to speed things up, but not by much.

At this point, I don't really know what I should do. I've read about composite primary keys, but I am still unsure whether they would do any good and how to correctly implement one in this particular case.

I know that using BIGINTs may be a little overkill, but do they affect performance that much?

EXPLAIN:

+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+
| id | select_type | table                 | type   | possible_keys                         | key        | key_len | ref                                           | rows | Extra                                        |
+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | FollowedNicknames     | ALL    | UserID,ForumID,Nickname               | NULL       | NULL    | NULL                                          |    8 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | Forums                | eq_ref | PRIMARY                               | PRIMARY    | 8       | database.FollowedNicknames.ForumiID           |    1 | NULL                                         |
|  1 | SIMPLE      | Messages              | ref    | TopicID,DATETIME,Nickname             | Nickname   | 242     | database.FollowedNicknames.Nickname           |   15 | Using where                                  |
|  1 | SIMPLE      | Topics                | eq_ref | PRIMARY,SubforumID                    | PRIMARY    | 8       | database.Messages.TopicID                     |    1 | NULL                                         |
|  1 | SIMPLE      | Subforums             | eq_ref | PRIMARY,ForumID                       | PRIMARY    | 8       | database.Topics.SubforumID                    |    1 | Using where                                  |
+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+

You shouldn't be JOINing on a VARCHAR column (Nickname); you should use the user ID to join those tables. That is definitely slowing the query down and is probably the biggest issue. It would also be easier to follow if you wrote all of the JOIN conditions explicitly in the ON clauses instead of at the end in the WHERE clause, like this:

SELECT
    Messages.ID AS MessageID,
    Messages.Message,
    Messages.TIMESTAMP,
    Messages.Nickname,
    Topics.ID AS TopicID,
    Topics.Title AS TopicTitle,
    Forums.Title AS ForumTitle
FROM Messages   
    JOIN FollowedNicknames ON Messages.Nickname = FollowedNicknames.Nickname
        AND FollowedNicknames.UserID = 'MYUSERID'
    JOIN Forums ON Forums.ID = FollowedNicknames.ForumID
    JOIN Subforums ON Subforums.ForumID = Forums.ID
    JOIN Topics ON Messages.TopicID = Topics.ID
        AND Topics.SubforumID = Subforums.ID
WHERE Messages.DATESTAMP = '2013619'
ORDER BY Messages.TIMESTAMP DESC
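
The EXPLAIN output in the question shows Messages being read through the single-column Nickname index and then filtered with "Using where". One thing that might be worth trying, as a sketch rather than a guaranteed fix, is a composite index that covers both the join column and the date filter. The index name Nickname_DATESTAMP is made up for this example and is not part of the original schema.

```sql
-- Hypothetical composite index (the name is an assumption for illustration).
-- Nickname comes first because the join looks rows up by nickname;
-- DATESTAMP comes second so the equality filter can be resolved inside the index.
ALTER TABLE `Messages`
  ADD INDEX `Nickname_DATESTAMP` (`Nickname`, `DATESTAMP`);

-- Re-run EXPLAIN afterwards to check whether the optimizer picks the new index.
EXPLAIN
SELECT Messages.ID
FROM Messages
    JOIN FollowedNicknames ON Messages.Nickname = FollowedNicknames.Nickname
        AND FollowedNicknames.UserID = 'MYUSERID'
WHERE Messages.DATESTAMP = '2013619';
```

If the new index does get used, the existing single-column Nickname index would be a prefix of it and could probably be dropped.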

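Instead of INT as the data type for the DATESTAMP column, I would use DATE. The Checksum column should probably use latin1_general_ci as the collation, since an MD5 hex digest contains only ASCII characters and does not need utf8. I would use INT UNSIGNED for the ID columns as long as their values are expected to stay below 2,000,000,000 or so, since INT UNSIGNED can store values up to roughly 4,000,000,000 (4,294,967,295 to be exact). InnoDB is affected by the primary key much more than MyISAM, because every InnoDB secondary index also stores the primary key value, so it could make a noticeable difference.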
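
As a rough sketch of what those column changes could look like on the Messages table: the column and index name MessageDate is an assumption made for this example, and the sketch assumes no foreign keys reference the ID columns. On a table of this size each ALTER is likely to rebuild the table, so it would be sensible to try it on a copy first.

```sql
-- Assumption: a new DATE column (hypothetical name MessageDate) is added
-- next to the INT DATESTAMP column rather than converting it in place.
ALTER TABLE `Messages`
  ADD COLUMN `MessageDate` DATE DEFAULT NULL,
  ADD INDEX `MessageDate` (`MessageDate`);

-- Fill it from the Unix timestamp. FROM_UNIXTIME uses the session time zone,
-- so the resulting dates depend on that setting.
UPDATE `Messages`
SET `MessageDate` = DATE(FROM_UNIXTIME(`TIMESTAMP`));

-- Single-byte collation for the hex checksum, and 4-byte unsigned IDs.
-- Topics.ID would presumably be changed to the same type as TopicID.
ALTER TABLE `Messages`
  MODIFY `Checksum` VARCHAR(50) CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  MODIFY `ID` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  MODIFY `TopicID` INT UNSIGNED NOT NULL;

-- The date filter in the query would then become:
--   WHERE Messages.MessageDate = '2013-06-19'
```

A DATE value also avoids the ambiguity of an unpadded integer such as 2013111, which could be read as either 2013-1-11 or 2013-11-1.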
