Improve speed on MySQL JOIN with big data

Below are the two tables I want to fetch data from (note that these are dummy databases, so don't pay attention to what data I'm pulling).

> DESCRIBE bigdata;

+----------------+----------------------+------+-----+---------+-------+
| Field          | Type                 | Null | Key | Default | Extra |
+----------------+----------------------+------+-----+---------+-------+
| galaxy         | int(2)               | NO   | PRI | 0       |       |
| system         | int(3)               | NO   | PRI | 0       |       |
| planet         | int(2)               | NO   | PRI | 0       |       |
| ogame_playerid | int(11) unsigned     | NO   | MUL | 0       |       |
| moon           | enum('true','false') | NO   |     | false   |       |
| moonsize       | smallint(5) unsigned | NO   |     | 0       |       |
| metal          | int(10) unsigned     | NO   |     | 0       |       |
| crystal        | int(10) unsigned     | NO   |     | 0       |       |
| planetname     | varchar(40)          | NO   |     |         |       |
+----------------+----------------------+------+-----+---------+-------+

Here is the CREATE TABLE statement as well, as @drew requested:

CREATE TABLE `bigdata` (
    `galaxy` INT(2) NOT NULL DEFAULT '0',
    `system` INT(3) NOT NULL DEFAULT '0',
    `planet` INT(2) NOT NULL DEFAULT '0',
    `ogame_playerid` INT(11) UNSIGNED NOT NULL DEFAULT '0',
    `moon` ENUM('true','false') NOT NULL DEFAULT 'false',
    `moonsize` SMALLINT(5) UNSIGNED NOT NULL DEFAULT '0',
    `metal` INT(10) UNSIGNED NOT NULL DEFAULT '0',
    `crystal` INT(10) UNSIGNED NOT NULL DEFAULT '0',
    `planetname` VARCHAR(40) NOT NULL DEFAULT '',
    PRIMARY KEY (`galaxy`, `system`, `planet`),
    INDEX `player_id` (`ogame_playerid`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM;

And the second table (note that it contains rows for other userid values as well):

SELECT * FROM smalldata WHERE userid = 1;
+----+--------+----------------+---------------------+
| id | userid | ip             | logintime           |
+----+--------+----------------+---------------------+
|  1 |      1 | 127.0.0.1      | 2016-02-25 13:50:59 |
|  2 |      1 | ::1            | 2016-02-29 23:22:18 |
|  3 |      1 | 127.0.0.1      | 2016-03-14 22:52:22 |
|  4 |      1 | 127.0.0.1      | 2016-03-22 23:27:02 |
+----+--------+----------------+---------------------+

My query is as follows:

SELECT smalldata.id, SUM(bigdata.planet) 
FROM smalldata LEFT JOIN bigdata ON smalldata.id = bigdata.galaxy 
WHERE smalldata.userid = 1 
GROUP BY smalldata.id;

My concern is that when I run this query to cover all 4 of the smalldata IDs at once, it takes around 10 seconds to complete. However, if I skip the LEFT JOIN and execute 4 individual queries on bigdata, "hardcoding" WHERE galaxy = 1 (or 2, 3, 4 respectively), each one takes around 0.05 seconds.
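For reference, the individual queries have roughly the following shape. This is a sketch reconstructed from the description above, not necessarily the exact statements that were run; only the galaxy value changes between the four runs.

```sql
-- One hardcoded lookup per smalldata.id value (1, 2, 3, 4).
-- The filter on galaxy can use the leading column of the
-- primary key (galaxy, system, planet) on bigdata.
SELECT SUM(planet)
FROM bigdata
WHERE galaxy = 1;
```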

I wonder why that happens. I presume the LEFT JOIN may be reading a lot of data from bigdata columns that I'm not selecting, such as moon, moonsize, etc. Or perhaps the JOIN itself is the time-consuming part, when in fact I could perform 4 selections from bigdata without joining the tables at all.

Am I misusing JOIN here?

Try writing the code like this:

SELECT smalldata.id,
       (SELECT SUM(bigdata.planet) 
        FROM bigdata 
        WHERE smalldata.id = bigdata.galaxy 
       )
FROM smalldata
WHERE smalldata.userid = 1 ;

Be sure you have an index on smalldata(userid, id). Based on your description, you would seem to have the right index on bigdata already: galaxy should be the first key in the index and planet should be in the index as well, and the primary key (galaxy, system, planet) satisfies both.
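As a minimal sketch of the index advice above: the index name idx_userid_id is a placeholder chosen for illustration, and the EXPLAIN output will depend on your data and MySQL version, so treat this as a starting point for checking the plan rather than a guaranteed fix.

```sql
-- Composite index so the WHERE on userid and the returned id
-- can both be served from the index (name is hypothetical).
ALTER TABLE smalldata ADD INDEX idx_userid_id (userid, id);

-- Inspect the plan of the rewritten query. The aim is for the
-- subquery on bigdata to look up rows through the PRIMARY key
-- on galaxy instead of scanning the whole table.
EXPLAIN
SELECT smalldata.id,
       (SELECT SUM(bigdata.planet)
        FROM bigdata
        WHERE smalldata.id = bigdata.galaxy
       ) AS planet_sum
FROM smalldata
WHERE smalldata.userid = 1;
```

Running EXPLAIN on the original LEFT JOIN query as well and comparing the two plans should show whether the slow version is failing to use the index on bigdata.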
