I have two tables I am trying to join in a third query and it seems to be taking far too long.
Here is the syntax I am using
CREATE TABLE active_users
(PRIMARY KEY ix_all (platform_id, login_year, login_month, person_id))
SELECT platform_id
, YEAR(my_timestamp) AS login_year
, MONTH(my_timestamp) AS login_month
, person_id
, COUNT(*) AS logins
FROM
my_login_table
GROUP BY 1,2,3,4;
CREATE TABLE active_alerts
(PRIMARY KEY ix_all (platform_id, alert_year, alert_month, person_id))
SELECT platform_id
, YEAR(alert_datetime) AS alert_year
, MONTH(alert_datetime) AS alert_month
, person_id
, COUNT(*) AS alerts
FROM
my_alert_table
GROUP BY 1,2,3,4;
CREATE TABLE all_data
(PRIMARY KEY ix_all (platform_id, theYear, theMonth, person_id))
SELECT a.platform_id
, a.login_year AS theyear
, a.login_month AS themonth
, a.person_id
, IFNULL(a.logins,0) AS logins
, IFNULL(b.alerts,0) AS job_alerts
FROM
active_users a
LEFT OUTER JOIN
active_alerts b
ON a.platform_id = b.platform_id
AND a.login_year = b.alert_year
AND a.login_month = b.alert_month
AND a.person_id = b.person_id;
The first table (logins) returns about half a million rows and takes less than 1 minute, the second table (alerts) returns about 200k rows and takes less than 1 minute.
If I run just the SELECT part of the third statement it runs in a few seconds, however as soon as I run it with the CREATE TABLE syntax it takes more than 30 minutes.
I have tried different types of indexes than a primary key, such as UNIQUE or INDEX as well as no key at all, but that doesn't seem to make much difference.
Is there something I can do to speed up the creation / insertion of this table?
EDIT: Here is the output of the show create table statements
CREATE TABLE `active_users` (
`platform_id` int(11) NOT NULL,
`login_year` int(4) DEFAULT NULL,
`login_month` int(2) DEFAULT NULL,
`person_id` varchar(40) NOT NULL,
`logins` bigint(21) NOT NULL DEFAULT '0',
KEY `ix_all` (`platform_id`,`login_year`,`login_month`,`person_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
CREATE TABLE `alerts` (
`platform_id` int(11) NOT NULL,
`alert_year` int(4) DEFAULT NULL,
`alert_month` int(2) DEFAULT NULL,
`person_id` char(36) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
`alerts` bigint(21) NOT NULL DEFAULT '0',
KEY `ix_all` (`platform_id`,`alert_year`,`alert_month`,`person_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
and the output of the EXPLAIN
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE a (null) ALL (null) (null) (null) (null) 503504 100 (null)
1 SIMPLE b (null) ALL ix_all (null) (null) (null) 220187 100 Using where; Using join buffer (Block Nested Loop)
It's a bit of a hack but I figured out how to get it to run much faster.
I added a primary key to the third table on platform, year, month, person
I inserted the intersect data using an inner join, then insert ignore the left table plus a zero for alerts in a separate statement.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.