简体   繁体   中英

MySQL View performance issue with aggregate query

I'm using mysql version 5.6.47. I have the following table for student marks:

CREATE TABLE `studentmarks` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `StudentID` int(11) NOT NULL,
  `subjectName` varchar(255) DEFAULT NULL,
  `MARKS` int(11) NOT NULL,
  PRIMARY KEY (`ID`),
  KEY `idx_studentmarks_StudentID` (`StudentID`)
);

and created a View on the table:

CREATE OR REPLACE VIEW `vw_student_marks` AS
    SELECT 
        `s1`.`StudentID` AS `StudentID`,
        `s1`.`subjectName` AS `subjectName`,
        `s1`.`MARKS` AS `marks`,
        (SELECT 
                SUM(`s2`.`MARKS`)
            FROM
                `studentmarks` `s2`
            WHERE
                (`s2`.`StudentID` = `s1`.`StudentID`)) AS `totalMarks`
    FROM
        `studentmarks` `s1`;

When testing with around 20K rows, there is a noticeable difference in performance in running SELECT query vs SELECT * FROM VIEW . The select query shows an optimized execution plan with just 1 full table scan, while for view there are 2 full table scans.

Query Stats (Measured by MySQL Workbench):

SELECT QUERY

Timing: 0:00:0.07677120 (as measured by the server)

Rows Examined: 108285

SELECT FROM VIEW QUERY :

Timing: 0:00:1.6082441 (as measured by the server)

Rows Examined: 2985730

What is the reason behind this difference in performance?

Query Execution Plans: https://i.stack.imgur.com/noOxI.jpg

UPDATE: I tested with MySQL version 8.0.19, the same issue occurs

MySQL must be using TEMPTABLE algorithm for the view in this case (aggregate functions). This might be the reason for the difference.

You can refer https://dev.mysql.com/doc/refman/5.6/en/view-algorithms.html for more details.

If the MERGE algorithm cannot be used, a temporary table must be used instead. MERGE cannot be used if the view contains any of the following constructs:

Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)

DISTINCT

GROUP BY

HAVING

LIMIT

UNION or UNION ALL

Subquery in the select list

Assignment to user variables

Refers only to literal values (in this case, there is no underlying table)

FWIW, I'd write this using an uncorrelated subquery (but I accept that it won't necessarily improve performance) - and dispense with the idea of using a view altogether...

SELECT s1.StudentID
     , s1.subjectName
     , s1.MARKS
     , s2.totalmarks
  FROM studentmarks s1
  JOIN
     ( SELECT studentid
            , SUM(s2.MARKS) totalmarks
         FROM studentmarks
        GROUP 
           BY studentid
     ) s2
    ON s2.studentid = s1.studentid;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM