
Is it faster to run an SQL count(*) query in a loop, or try to merge it into the parent query?

I have an SQL query.

SELECT `shifts`.*, `races`.`race_attrition_rate`
FROM `shifts`
JOIN `races` ON `races`.`race_id` = `shifts`.`race_id`
WHERE `shifts`.`race_id` = 'X'
AND `shift_deleted` =0
ORDER BY `shift_name` ASC, `shift_id` ASC

That query pulls a list of volunteer shifts from a database. Then I have a PHP loop that, for each shift that was pulled in the above query, runs this SQL query.

SELECT COUNT(*) AS `numrows`
FROM `volunteer_shifts`
WHERE `shift_id` = 'Y'
AND `shift_deleted` =0

So if there are 5 shifts pulled in the first query, the second query is run 5 times, one time for each shift.

1) Can these two queries be merged together? What would the combined code look like?

2) Is merging these two queries together faster?

3) Merging them together would probably make the code less readable. So what is best practice? Two readable queries or one hard to read but fast query?

We can't tell which one would run faster unless you post your table schema. If I were you, I would probably run query 1, collect all of the shift_id values, and then run one more query that pulls the counts for that list of shift_ids using IN.

Something like this.

SELECT COUNT(*) AS `numrows`, `shift_id`
FROM `volunteer_shifts`
WHERE `shift_id` IN ('42','other number', 'more numbers'...)
AND `shift_deleted` =0
GROUP BY `shift_id`
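
As a rough illustration of the two-query approach above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PHP and MySQL, and the table columns and sample rows are assumptions made up for the demo, not the asker's real schema. Note that shifts with no matching volunteer rows do not appear in the grouped result, so the application has to fill in 0 for them.

```python
import sqlite3

# Toy schema and data (assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shifts (shift_id INTEGER PRIMARY KEY, race_id INTEGER,
                         shift_name TEXT, shift_deleted INTEGER DEFAULT 0);
    CREATE TABLE volunteer_shifts (volunteer_id INTEGER, shift_id INTEGER,
                                   shift_deleted INTEGER DEFAULT 0);
    INSERT INTO shifts VALUES (1, 17, 'Aid station', 0),
                              (2, 17, 'Finish line', 0),
                              (3, 17, 'Parking', 0);
    INSERT INTO volunteer_shifts VALUES (10, 1, 0), (11, 1, 0),
                                        (12, 2, 0), (13, 2, 1);
""")

# Query 1: fetch the shifts for one race.
shift_ids = [row[0] for row in conn.execute(
    "SELECT shift_id FROM shifts WHERE race_id = ? AND shift_deleted = 0 "
    "ORDER BY shift_name, shift_id", (17,))]

# Query 2: one grouped count for all of those shifts, with one bound
# placeholder per id instead of values pasted into the SQL string.
counts = {}
if shift_ids:  # an empty IN () list is not valid SQL, so skip the query
    placeholders = ",".join("?" * len(shift_ids))
    sql = (f"SELECT shift_id, COUNT(*) FROM volunteer_shifts "
           f"WHERE shift_id IN ({placeholders}) AND shift_deleted = 0 "
           f"GROUP BY shift_id")
    counts = dict(conn.execute(sql, shift_ids))

# Shifts without volunteer rows are missing from the grouped result: default to 0.
volunteers_per_shift = {sid: counts.get(sid, 0) for sid in shift_ids}
print(volunteers_per_shift)  # {1: 2, 2: 1, 3: 0}
```

This always costs two statements, no matter how many shifts the first query returns.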

In this case, pure SQL would be more maintainable, readable, and efficient than looping at the application layer (i.e., PHP). Consider joining the aggregate query as a derived table (notice that shift_id is now a grouping column). The count then appears inline with the other fields in one query. A LEFT JOIN with COALESCE keeps shifts that have no volunteer rows and reports 0 for them, which matches what the loop returns; a plain JOIN would drop those shifts from the result:

SELECT s.*, r.`race_attrition_rate`, COALESCE(agg.`numrows`, 0) AS `numrows`
FROM `shifts` s
JOIN `races` r ON r.`race_id` = s.`race_id`
LEFT JOIN (
      SELECT `shift_id`, COUNT(*) AS `numrows`
      FROM `volunteer_shifts`
      WHERE `shift_deleted` = 0
      GROUP BY `shift_id`
     ) AS agg
  ON agg.`shift_id` = s.`shift_id`
WHERE r.`race_id` = '17'
AND s.`shift_deleted` = 0
ORDER BY s.`shift_name` ASC, s.`shift_id` ASC
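
To see how the join type against the aggregated derived table affects the result, here is a small sketch on made-up data. It uses Python's sqlite3 module rather than MySQL, and the schema and rows are assumptions for the demo only. With an inner join, a shift that has no volunteer rows disappears from the output; with a LEFT JOIN plus COALESCE it stays and shows a count of 0.

```python
import sqlite3

# Toy schema and data (assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE races (race_id INTEGER PRIMARY KEY, race_attrition_rate REAL);
    CREATE TABLE shifts (shift_id INTEGER PRIMARY KEY, race_id INTEGER,
                         shift_name TEXT, shift_deleted INTEGER DEFAULT 0);
    CREATE TABLE volunteer_shifts (volunteer_id INTEGER, shift_id INTEGER,
                                   shift_deleted INTEGER DEFAULT 0);
    INSERT INTO races VALUES (17, 0.1);
    INSERT INTO shifts VALUES (1, 17, 'Aid station', 0),
                              (2, 17, 'Finish line', 0),
                              (3, 17, 'Parking', 0);
    INSERT INTO volunteer_shifts VALUES (10, 1, 0), (11, 1, 0),
                                        (12, 2, 0), (13, 2, 1);
""")

# Same query shape as the derived-table answer; only the join type varies.
QUERY = """
    SELECT s.shift_id, s.shift_name, r.race_attrition_rate,
           COALESCE(agg.numrows, 0) AS numrows
    FROM shifts s
    JOIN races r ON r.race_id = s.race_id
    {join_type} JOIN (
        SELECT shift_id, COUNT(*) AS numrows
        FROM volunteer_shifts
        WHERE shift_deleted = 0
        GROUP BY shift_id
    ) AS agg ON agg.shift_id = s.shift_id
    WHERE r.race_id = ? AND s.shift_deleted = 0
    ORDER BY s.shift_name, s.shift_id
"""

inner_rows = conn.execute(QUERY.format(join_type="INNER"), (17,)).fetchall()
left_rows = conn.execute(QUERY.format(join_type="LEFT"), (17,)).fetchall()

# The inner join omits 'Parking' (no volunteers); the left join keeps it with 0.
print([(name, n) for _, name, _, n in inner_rows])
print([(name, n) for _, name, _, n in left_rows])
```

Which behavior is wanted depends on the page: if empty shifts should still be listed, the left join is the safer choice.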

2) Is merging these two queries together faster?

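A single query will almost always be faster, because it needs one round trip to the database instead of one for the parent query plus one per shift. Imagine that the DB is located on another server, which is quite a common case: every extra query adds network latency.

Also, the separate-queries approach doesn't let the built-in DB query optimizer do its work, because it only ever sees one small query at a time.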
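
The round-trip argument can be made concrete by counting statements. The sketch below is an assumption-laden demo: it uses Python's sqlite3 module with an in-memory database, so there is no real network, and the hypothetical helper `run` simply counts each statement as one round trip. The schema and rows are invented for the example.

```python
import sqlite3

# Toy schema and data (assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shifts (shift_id INTEGER PRIMARY KEY, race_id INTEGER,
                         shift_name TEXT, shift_deleted INTEGER DEFAULT 0);
    CREATE TABLE volunteer_shifts (volunteer_id INTEGER, shift_id INTEGER,
                                   shift_deleted INTEGER DEFAULT 0);
    INSERT INTO shifts VALUES (1, 17, 'Aid station', 0),
                              (2, 17, 'Finish line', 0),
                              (3, 17, 'Parking', 0);
    INSERT INTO volunteer_shifts VALUES (10, 1, 0), (11, 1, 0),
                                        (12, 2, 0), (13, 2, 1);
""")

statements_run = 0

def run(sql, params=()):
    """Hypothetical helper: execute one statement and count it as one round trip."""
    global statements_run
    statements_run += 1
    return conn.execute(sql, params).fetchall()

# Approach 1: parent query, then one COUNT(*) query per shift in a loop.
statements_run = 0
shifts = run("SELECT shift_id FROM shifts "
             "WHERE race_id = ? AND shift_deleted = 0", (17,))
loop_counts = {}
for (shift_id,) in shifts:
    loop_counts[shift_id] = run(
        "SELECT COUNT(*) FROM volunteer_shifts "
        "WHERE shift_id = ? AND shift_deleted = 0", (shift_id,))[0][0]
loop_statements = statements_run

# Approach 2: one merged query that returns every shift with its count.
statements_run = 0
merged_counts = dict(run("""
    SELECT s.shift_id, COUNT(v.shift_id)
    FROM shifts s
    LEFT JOIN volunteer_shifts v
           ON v.shift_id = s.shift_id AND v.shift_deleted = 0
    WHERE s.race_id = ? AND s.shift_deleted = 0
    GROUP BY s.shift_id
""", (17,)))
merged_statements = statements_run

print(loop_statements, merged_statements)  # 4 1
print(loop_counts == merged_counts)        # True
```

With N shifts the loop issues N + 1 statements while the merged query issues one, so the gap grows with the number of shifts and with the latency of each round trip.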

1) Can these two queries be merged together? What would the combined code look like?

The following query may work for you:

SELECT
  `shifts`.*,
  `races`.`race_attrition_rate`,
  (SELECT
      COUNT(*)
    FROM
      `volunteer_shifts`
    WHERE
      `volunteer_shifts`.`shift_id` = `shifts`.`shift_id`
    AND
      `volunteer_shifts`.`shift_deleted` = 0) AS `numrows`
FROM
  `shifts`
  JOIN `races` ON `races`.`race_id` = `shifts`.`race_id`
WHERE
  `shifts`.`race_id` = 'X'
AND
  `shifts`.`shift_deleted` = 0
ORDER BY
  `shifts`.`shift_name` ASC, `shifts`.`shift_id` ASC
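
A quick way to check the correlated-subquery form is to run it on toy data. The sketch below uses Python's sqlite3 module instead of MySQL, with an assumed schema and invented rows. Because the subquery is evaluated per shift, a shift with no volunteer rows still appears and gets a count of 0, and a deleted shift is filtered out by the outer WHERE clause.

```python
import sqlite3

# Toy schema and data (assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE races (race_id INTEGER PRIMARY KEY, race_attrition_rate REAL);
    CREATE TABLE shifts (shift_id INTEGER PRIMARY KEY, race_id INTEGER,
                         shift_name TEXT, shift_deleted INTEGER DEFAULT 0);
    CREATE TABLE volunteer_shifts (volunteer_id INTEGER, shift_id INTEGER,
                                   shift_deleted INTEGER DEFAULT 0);
    INSERT INTO races VALUES (17, 0.1);
    INSERT INTO shifts VALUES (1, 17, 'Aid station', 0),
                              (2, 17, 'Finish line', 0),
                              (3, 17, 'Parking', 0),
                              (4, 17, 'Cancelled shift', 1);
    INSERT INTO volunteer_shifts VALUES (10, 1, 0), (11, 1, 0),
                                        (12, 2, 0), (13, 2, 1);
""")

# The scalar subquery refers to the outer row's shift_id (correlation).
rows = conn.execute("""
    SELECT s.shift_id, s.shift_name,
           (SELECT COUNT(*)
            FROM volunteer_shifts vs
            WHERE vs.shift_id = s.shift_id
              AND vs.shift_deleted = 0) AS numrows
    FROM shifts s
    JOIN races r ON r.race_id = s.race_id
    WHERE s.race_id = ? AND s.shift_deleted = 0
    ORDER BY s.shift_name, s.shift_id
""", (17,)).fetchall()

for shift_id, shift_name, numrows in rows:
    print(shift_id, shift_name, numrows)
```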

3) So what is best practice? Two readable queries or one hard to read but fast query?

The general rule is "Readability is the main point until you run into performance problems", simply because computing resources are cheaper than human resources.

If all you want is the count that the second query produces, a single standalone query would be more readable and much shorter:

SELECT COUNT(*) AS numrows
FROM shifts
WHERE shift_id = 42
  AND race_id = '17'
  AND shift_deleted = 0

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please credit the site URL or the original address. For any questions, please contact: yoyou2525@163.com.