
Faster SQL query than join

I have a big table with more than 10,000 rows, and it will grow to 1,000,000 in the near future. I need to run a query that returns a Time value for each keyword for each user. The one I have right now is quite slow because it uses left joins and needs one subquery per keyword:

SELECT rawdata.user, t1.Facebook_Time, t2.Outlook_Time, t3.Excel_time
FROM
rawdata left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Facebook_Time'
FROM rawdata 
WHERE MainWindowTitle LIKE '%Facebook%'
GROUP by user)t1 on rawdata.user = t1.user left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Outlook_Time'
FROM rawdata 
WHERE MainWindowTitle LIKE '%Outlook%'
GROUP by user)t2 on rawdata.user = t2.user left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Excel_Time'
FROM rawdata 
WHERE MainWindowTitle LIKE '%Excel%'
GROUP by user)t3 on rawdata.user = t3.user

The table looks like this:

WindowTitle | StartTime | EndTime | User
------------|-----------|---------|---------
Form1       | DateTime  | DateTime| user1
Form2       | DateTime  | DateTime| user2
...         | ...       | ...     | ...
Form_n      | DateTime  | DateTime| user_n

The output should look like this:

User   | Keyword   | SUM(EndTime-StartTime)
-------|-----------|-----------------------
User1  | 'Facebook'|              00:34:12
User1  | 'Outlook' |              00:12:34
User1  | 'Excel'   |              00:43:13
User2  | 'Facebook'|              00:34:12
User2  | 'Outlook' |              00:12:34
User2  | 'Excel'   |              00:43:13
...    | ...       | ...  
User_n | ...       | ...

And the question is, which is the fastest way in MySQL to do this?

I think your wildcard searches are probably what's slowing it down the most: a pattern with a leading wildcard such as LIKE '%Facebook%' can't use an index on that column, so every subquery has to scan the whole table. Avoiding the sub-queries and doing a straight join might also help, but the wildcard searches are far worse. Is there any way you could change the table to have a categoryName or categoryID that can be indexed and doesn't require a wildcard search? Like "where categoryName = 'Outlook'".

To optimize the data in your tables, add a categoryID (ideally this would reference a separate table, but let's just use arbitrary numbers for this example):

alter table rawdata add column categoryID int not null;

alter table rawdata add index (categoryID);
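
If you do go with the separate table mentioned above, a minimal sketch could look like the following. The table name `category` and its columns are my own assumptions, not something from the question; the IDs are just the arbitrary numbers used in this answer.

```sql
-- Hypothetical lookup table: the name `category` and both columns are assumptions.
CREATE TABLE category (
    categoryID   INT         NOT NULL PRIMARY KEY,
    categoryName VARCHAR(50) NOT NULL UNIQUE
);

-- Same arbitrary numbering as the rest of this answer.
INSERT INTO category (categoryID, categoryName) VALUES
    (1, 'Outlook'),
    (2, 'Facebook'),
    (3, 'Excel');
```

Note that rows that don't match any keyword end up with categoryID 0 (the implicit default of the new NOT NULL column), so you would need a matching row in the lookup table before adding a foreign key.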

Then populate the categoryID field for the existing data:

update rawdata set categoryID=1 where MainWindowTitle like '%Outlook%';
update rawdata set categoryID=2 where MainWindowTitle like '%Facebook%';
-- etc...

Then change your INSERT statements to set categoryID by the same rules.
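
One way to do that inside the INSERT itself is a CASE expression, roughly like this. It's only a sketch: I'm assuming the title column is called MainWindowTitle as in your query, and the literal values are made up for illustration.

```sql
-- Sketch only: the sample row is invented, and the column names follow the question's query.
INSERT INTO rawdata (MainWindowTitle, StartTime, EndTime, user, categoryID)
SELECT t.title, t.started, t.ended, t.usr,
       CASE
           WHEN t.title LIKE '%Outlook%'  THEN 1
           WHEN t.title LIKE '%Facebook%' THEN 2
           WHEN t.title LIKE '%Excel%'    THEN 3
           ELSE 0                          -- no keyword matched
       END
FROM (SELECT 'Inbox - Outlook'     AS title,
             '2024-01-01 09:00:00' AS started,
             '2024-01-01 09:12:34' AS ended,
             'user1'               AS usr) AS t;
```

The wildcard match still happens, but only once per inserted row instead of on every row each time the report runs.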

Then make your SELECT query like this (wildcards changed to categoryID):

SELECT rawdata.user, t1.Facebook_Time, t2.Outlook_Time, t3.Excel_time
FROM
rawdata left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Facebook_Time'
FROM rawdata 
WHERE categoryID = 2
GROUP by user)t1 on rawdata.user = t1.user left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Outlook_Time'
FROM rawdata 
WHERE categoryID = 1
GROUP by user)t2 on rawdata.user = t2.user left join
(SELECT user, sec_to_time(SuM(time_to_sec(EndTime-StartTime))) as 'Excel_Time'
FROM rawdata 
WHERE categoryID = 3
GROUP by user)t3 on rawdata.user = t3.user
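
Once categoryID exists you may not need the joins at all. Here is a sketch of a single-pass version that returns one row per user and keyword, which is the shape of the output you asked for. Two things here are my own choices rather than anything from the question: the Keyword and Total_Time aliases, and the use of TIMESTAMPDIFF instead of EndTime-StartTime. Subtracting DATETIME values directly is done numerically in MySQL and can give wrong results when an interval crosses a minute or hour boundary.

```sql
-- Sketch: one scan of rawdata, grouped by user and category.
-- Category numbers are the arbitrary ones used above (1 = Outlook, 2 = Facebook, 3 = Excel).
SELECT user,
       CASE categoryID
           WHEN 1 THEN 'Outlook'
           WHEN 2 THEN 'Facebook'
           WHEN 3 THEN 'Excel'
       END AS Keyword,
       SEC_TO_TIME(SUM(TIMESTAMPDIFF(SECOND, StartTime, EndTime))) AS Total_Time
FROM rawdata
WHERE categoryID IN (1, 2, 3)
GROUP BY user, categoryID
ORDER BY user, categoryID;
```

If you'd rather keep one column per keyword, the same idea works with SUM(CASE WHEN categoryID = 2 THEN ... ELSE 0 END) per column and a plain GROUP BY user. I haven't benchmarked either variant against your data, so check the plan with EXPLAIN.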
