
[Partition Benefit on Indexed Column]

 CREATE TABLE ofRoster (
  `rosterID` bigint(20) NOT NULL,
  `username` varchar(64) NOT NULL,
  `jid` varchar(1024) NOT NULL,
  `sub` tinyint(4) NOT NULL,
  `ask` tinyint(4) NOT NULL,
  `recv` tinyint(4) NOT NULL,
  `nick` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`rosterID`),
  KEY `ofRoster_unameid_idx` (`username`),
  KEY `ofRoster_jid_idx` (`jid`(255))
) ENGINE=InnoDB;


 CREATE TABLE `ofRoster_par` (
  `rosterID` bigint(20) NOT NULL AUTO_INCREMENT,
  `username` int(64) NOT NULL,
  `jid` varchar(1024) NOT NULL,
  `sub` tinyint(4) NOT NULL,
  `ask` tinyint(4) NOT NULL,
  `recv` tinyint(4) NOT NULL,
  `nick` varchar(255) DEFAULT NULL,
  UNIQUE KEY `rosterID` (`rosterID`,`username`),
  KEY `ofRoster_unameid_idx` (`username`),
  KEY `ofRoster_jid_idx` (`jid`(255))
) ENGINE=InnoDB AUTO_INCREMENT=412595 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (username)
PARTITIONS 10 */ ;

I partitioned the table on username so that a SELECT only needs to search one partition. But I am not sure whether this will be beneficial, since there is already an index on username.

explain SELECT count(*) FROM ofRoster_par WHERE username='1';
+----+-------------+--------------+------+----------------------+----------------------+---------+-------+------+-------------+
| id | select_type | table        | type | possible_keys        | key                  | key_len | ref   | rows | Extra       |
+----+-------------+--------------+------+----------------------+----------------------+---------+-------+------+-------------+
|  1 | SIMPLE      | ofRoster_par | ref  | ofRoster_unameid_idx | ofRoster_unameid_idx | 4       | const |  120 | Using index |
+----+-------------+--------------+------+----------------------+----------------------+---------+-------+------+-------------+


explain SELECT count(*) FROM ofRoster WHERE username='1';
+----+-------------+----------+------+----------------------+----------------------+---------+-------+------+--------------------------+
| id | select_type | table    | type | possible_keys        | key                  | key_len | ref   | rows | Extra                    |
+----+-------------+----------+------+----------------------+----------------------+---------+-------+------+--------------------------+
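|  1 | SIMPLE      | ofRoster | ref  | ofRoster_unameid_idx | ofRoster_unameid_idx | 66      | const |  120 | Using where; Using index |
+----+-------------+----------+------+----------------------+----------------------+---------+-------+------+--------------------------+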

Right now there are just 400,000 records in the table, but in production there will be around 80 million.

The time taken by both queries is also the same :-(

PARTITION BY HASH is, in my opinion, useless.

In your example, INDEX(username) on a non-partitioned table would probably be faster than using PARTITION BY HASH(username).

You already have such an index. How fast was it?

Here's what is happening:

With partitioning:

  1. pick partition
  2. use KEY(username) (and not the data) to do the COUNT(*) inside the index (note "Using index")

Without partitioning:

  1. use KEY(username) (and not the data) to do the COUNT(*) inside the index (note "Using index")
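
One way to check that both paths do the same index work is to compare the session handler counters after each query. The following is only a sketch: it assumes both tables from the question hold the same data, and that your account is allowed to run FLUSH STATUS (it needs the RELOAD privilege).

```sql
-- Reset the session counters, run the query, then inspect the handler reads.
FLUSH STATUS;
SELECT COUNT(*) FROM ofRoster WHERE username = '1';
SHOW SESSION STATUS LIKE 'Handler_read%';

-- Repeat for the partitioned table (username is an INT column there).
FLUSH STATUS;
SELECT COUNT(*) FROM ofRoster_par WHERE username = 1;
SHOW SESSION STATUS LIKE 'Handler_read%';
```

If the Handler_read_key and Handler_read_next values come out roughly equal for the two tables, partitioning is not saving any index reads for this query.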

Other comments:

  • If username is unique, consider making it the PRIMARY KEY and getting rid of rosterID. (You may want to keep rosterID because it is smaller and used for JOINing to several other tables.)
  • Bug: You say INT(64) where you meant VARCHAR(64). This may have impacted your timing test.
  • "Prefix indexes" (jid(255)) are rarely useful. Let's see how you are using it.
  • 80M rows does not warrant BIGINT (8 bytes); INT UNSIGNED (4 bytes) can handle about 4 billion (up to 4,294,967,295).
  • You understand that latin1 limits you to western European languages?
  • When using EXPLAIN with partitioned tables, use EXPLAIN PARTITIONS SELECT ... to see which partitions are touched. You may get some surprises.
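
A few of the comments above can be sketched in SQL. This is an illustration under assumptions, not a drop-in schema: the table name ofRoster_v2 is hypothetical, and shrinking rosterID to INT UNSIGNED assumes that any tables joining on it are changed to match.

```sql
-- Check which partitions the query touches.
-- EXPLAIN PARTITIONS works on MySQL 5.1 to 5.7; on 8.0 the keyword was removed
-- and plain EXPLAIN already includes a "partitions" column.
EXPLAIN PARTITIONS
SELECT COUNT(*) FROM ofRoster_par WHERE username = 1;

-- Hypothetical non-partitioned table applying the suggestions:
-- username back to VARCHAR(64), rosterID as a 4-byte INT UNSIGNED.
CREATE TABLE `ofRoster_v2` (
  `rosterID` int unsigned NOT NULL AUTO_INCREMENT,
  `username` varchar(64) NOT NULL,
  `jid` varchar(1024) NOT NULL,
  `sub` tinyint(4) NOT NULL,
  `ask` tinyint(4) NOT NULL,
  `recv` tinyint(4) NOT NULL,
  `nick` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`rosterID`),
  KEY `ofRoster_unameid_idx` (`username`),
  KEY `ofRoster_jid_idx` (`jid`(255))
) ENGINE=InnoDB;
```

The prefix index on jid is carried over unchanged because the question does not show how jid is queried.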

