
MySQL I/O bound InnoDB query optimization problem without setting innodb_buffer_pool_size to 5GB

I got myself into a MySQL design scalability issue. Any help would be greatly appreciated.

The requirements:

Storing users' SOCIAL_GRAPH and USER_INFO about each user in their social graph. Many concurrent reads and writes per second occur. Dirty reads acceptable.

Current design:

We have 2 (relevant) tables. Both InnoDB for row locking, instead of table locking.

  1. USER_SOCIAL_GRAPH table that maps a logged-in user (user_id) to another user (related_user_id). The PRIMARY key is the composite (user_id, related_user_id).

  2. USER_INFO table with information about each related user. PRIMARY key is (related_user_id).

Note 1: No relationships defined.

Note 2: Each table is now about 1GB in size, with 8 million and 2 million records, respectively.

Simplified table SQL creates:

CREATE TABLE `user_social_graph` (
  `user_id` int(10) unsigned NOT NULL,
  `related_user_id` int(11) NOT NULL,
  PRIMARY KEY (`user_id`,`related_user_id`),
  KEY `user_idx` (`user_id`)
) ENGINE=InnoDB;

CREATE TABLE `user_info` (
  `related_user_id` int(10) unsigned NOT NULL,
  `screen_name` varchar(20) CHARACTER SET latin1 DEFAULT NULL,
  [... and many other non-indexed fields irrelevant]
  `last_updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`related_user_id`),
  KEY `last_updated_idx` (`last_updated`)
) ENGINE=InnoDB;

MY.CFG values set:

innodb_buffer_pool_size = 256M
key_buffer_size         = 320M

Note 3: Available memory is 1GB; these 2 tables total 2GB, and the other InnoDB tables total 3GB.

Problem:

The following example SQL statement, which needs to access all matching records, takes 15 seconds to execute (!!) for a user with num_results = 220,000:

SELECT SQL_NO_CACHE COUNT(u.related_user_id) 
FROM user_info u LEFT JOIN user_social_graph u2 ON u.related_user_id = u2.related_user_id
WHERE u2.user_id = '1' 
AND u.related_user_id = u2.related_user_id 
AND (NOT (u.related_user_id IS NULL));

For a user_id with a count of 30,000, it takes about 3 seconds (!).

EXPLAIN EXTENDED for the 220,000 count user. It uses indices:

+----+-------------+-------+--------+------------------------+----------+---------+--------------------+--------+----------+--------------------------+
| id | select_type | table | type   | possible_keys          | key      | key_len | ref                | rows   | filtered | Extra                    |
+----+-------------+-------+--------+------------------------+----------+---------+--------------------+--------+----------+--------------------------+
|  1 | SIMPLE      | u2    | ref    | user_user_idx,user_idx | user_idx | 4       | const              | 157320 |   100.00 | Using where              |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                | PRIMARY  | 4       | u2.related_user_id |      1 |   100.00 | Using where; Using index |
+----+-------------+-------+--------+------------------------+----------+---------+--------------------+--------+----------+--------------------------+

How do we speed these up without setting innodb_buffer_pool_size to 5GB?

Thank you!

The user_social_graph table is not indexed correctly!

You have this:

CREATE TABLE `user_social_graph` (
  `user_id` int(10) unsigned NOT NULL,
  `related_user_id` int(11) NOT NULL,
  PRIMARY KEY (`user_id`,`related_user_id`),
  KEY `user_idx` (`user_id`)
) ENGINE=InnoDB;

The second index (user_idx) is redundant because user_id is already the leading column of the PRIMARY KEY. You are joining the related_user_id column over to the user_info table, so that column needs an index in which it is the leading column.

Change user_social_graph as follows:

CREATE TABLE `user_social_graph` (
  `user_id` int(10) unsigned NOT NULL,
  `related_user_id` int(11) NOT NULL,
  PRIMARY KEY (`user_id`,`related_user_id`),
  UNIQUE KEY `related_user_idx` (`related_user_id`,`user_id`)
) ENGINE=InnoDB;
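
If the table already exists and is populated, the same change can probably be applied in place instead of recreating the table. This is a minimal sketch, assuming the index names used in this thread (user_idx from the question, related_user_idx from the definition above). On a table with 8 million rows the index build may take a while, and depending on the MySQL version it may block writes while it runs.

```sql
-- Drop the redundant single-column index and add the reversed composite
-- index in a single ALTER statement.
ALTER TABLE user_social_graph
  DROP INDEX user_idx,
  ADD UNIQUE KEY related_user_idx (related_user_id, user_id);

-- List the indexes that now exist on the table to confirm the change.
SHOW INDEX FROM user_social_graph;
```

Trying this on a copy of the table first is a reasonable precaution.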

This should change the EXPLAIN plan. Keep in mind that the column order within an index matters, depending on the way you query the columns.
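
To illustrate the point about column order: a composite index is generally only usable for lookups that constrain its leading column. The sketch below assumes the table has PRIMARY KEY (user_id, related_user_id) plus an index related_user_idx (related_user_id, user_id); the literal values 1 and 42 are placeholders. The last statement is a simplified form of the query in the question. It should return the same count, because the WHERE condition on u2.user_id already discards the unmatched rows a LEFT JOIN would produce.

```sql
-- Constrains user_id, the leading column of the PRIMARY KEY.
EXPLAIN SELECT related_user_id
FROM user_social_graph
WHERE user_id = 1;

-- Constrains related_user_id, the leading column of related_user_idx.
EXPLAIN SELECT user_id
FROM user_social_graph
WHERE related_user_id = 42;

-- Simplified form of the count query from the question, written as an
-- INNER JOIN with the redundant conditions removed.
EXPLAIN SELECT COUNT(*)
FROM user_social_graph AS g
INNER JOIN user_info AS u ON u.related_user_id = g.related_user_id
WHERE g.user_id = 1;
```

Comparing the key and rows columns of each EXPLAIN result before and after the index change shows whether the optimizer actually picks the new index.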

Give it a try!

  1. What is the MySQL version? The manual for your version contains important information on speeding up statements and code in general.

  2. Change your paradigm to a data warehouse capable of managing tables up to terabytes in size. Migrate your legacy MySQL database to the new paradigm with a free tool or application. One example is http://www.infobright.org/Downloads/What-is-ICE/ and there are many others (free and commercial).

  3. PostgreSQL is not commercial, and there are a lot of tools for migrating from MySQL to it!
