简体   繁体   中英

Optimize sql query containing left joins

I have 1 table called errors it has the following structure:

Errors

| id | UserID        | CrashDump   | ErrorCode| Timestamp
| 1  | user1         | Crash 1     | 100      | 2015-04-08 21:00:00 
| 2  | user2         | Crash 2     | 102      | 2015-04-10 22:00:00
| 3  | user3         | Crash 4     | 105      | 2015-05-08 12:00:00
| 4  | user4         | Crash 4     | 105      | 2015-06-02 21:22:00
| 5  | user4         | Crash 4     | 105      | 2015-06-03 04:16:00

i wanted to get a result set with the following data:

Desired resultset

   CrashDump        | Error Count| Affected Users| 
    Crash 4         | 3          | 2             |  
    Crash 2         | 1          | 1             | 
    Crash 1         | 1          | 1             | 

The result set would hold the count of each error as error count and affected users (the distinct users who received this error).

i have been able to get the desired outcome using the following query, however it has proven to be very resource intensive and on huge data sets MySQL crashes. Could you please guide me as how i can optimize my current query or guide me towards a better approach in implementing its logic? Any help would be greatly appreciated.

Current Query:

select B.CrashDump as CrashDump, B.B_UID as affected users, C.C_UID as ErrorCount  
from
(
    Select count(A.UserID) as B_UID, A.CrashDump, (A.timestamp) as timestmp, 
    (a.errorcode) as errorCde, (a.ID) as uniqueId
    from
    (   
        select UserID , CrashDump, timestamp,errorcode,id
        from errors 
        where Timestamp >='2015-04-08 21:00:00' and Timestamp <='2015-06-10 08:18:15'
        group by userID,CrashDump
    ) as A
    group by A.CrashDump
) as B

left outer join 
(
    select CrashDump , count(UserID) as C_UID
    from errors 
    where Timestamp >='2015-04-08 21:00:00' and Timestamp <='2015-06-10 08:18:15'
    group by CrashDump
) as C

On B.CrashDump = C.CrashDump

order by ErrorCount desc limit 0,10

Try

SELECT CrashDump, COUNT(ErrorCode) AS ErrorCount, COUNT(DISTINCT UserID) AS AffectedUser
FROM errors
WHERE Timestamp >='2015-04-08 21:00:00' AND Timestamp <='2015-06-10 08:18:15'
GROUP BY CrashDump

Can't you just do this?:

SELECT
    CrashDump,
    COUNT(ErrorCode) AS ErrorCount,
    COUNT(DISTINCT UserID) AS AffectedUsers
FROM
    Errors
WHERE 
    Timestamp >='2015-04-08 21:00:00' and Timestamp <='2015-06-10 08:18:15'
GROUP BY
    CrashDump
SELECT CrashDump, SUM(e) AS "Error Count", MAX(u) AS "Affected Users"
FROM(
SELECT crashdump, count(errorcode) as e, count(userid) as u
FROM errors
WHERE Time_stamp BETWEEN '2015-04-08 21:00:00' and '2015-06-10 08:18:15'
GROUP BY crashdump, userid) a
GROUP BY crashdump
ORDER BY crashdump DESC

OUTPUT

crashdump   Error Count Affected Users
Crash 4     3           2
Crash 2     1           1
Crash 1     1           1

SQL FIDDLE: http://sqlfiddle.com/#!9/13eab/1/0

This is the solution that worked:

Select A.CrashDump, sum(A.ErrorCount) as ErrorC, count(A.AffectedUsers) 
From
(
SELECT
    CrashDump,
    COUNT(ErrorCode) AS ErrorCount,
    COUNT(DISTINCT UserID) AS AffectedUsers, UserID
FROM
    errors
WHERE 
    Timestamp >='2015-05-13 10:00:00' and Timestamp <='2015-05-14 03:07:00'

GROUP BY
    CrashDump, userID
) AS A
group by A.CrashDump

order by ErrorC desc limit 0,10

Thank you everyone for helping in achieving the desired result.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM