Stuck with MySQL and GROUP BY

I have a table with some stats. Example:

date        userId  attempts  good  bad
2010-08-23  1       5         4     1
2010-08-23  2       10        6     4
2010-08-23  3       6         3     3
2010-08-23  4       8         2     6

Each user has to do something, and the outcome is either good or bad. I'd like to find out what the relative score is for each user, compared to the other users on that day. Example:

User 1 made 5 attempts, and 4 of those were good, so 4 / 5 = 80% of his attempts were good. For the other users that day it was 60%, 50% and 25%. So the relative score of successful attempts for user 1 on that day is 80 / (80 + 60 + 50 + 25) ≈ 37%.

But I'm stuck at this point:

SELECT
  date,
  userId,
  ( (good / attempts) / x ) * 100 AS score_good,
  ( (bad / attempts) / y ) * 100 AS score_bad
FROM stats
GROUP BY date, userId -- ?

Where x is the sum of all the (good / attempts) for that day, and y is the sum of all the (bad / attempts) for the same day. Can this be done in the same query?

I'd like the result to be, e.g.:

date        userId  score_good
2010-08-23  1       37%
2010-08-23  2       28% (60 / (80 + 60 + 50 + 25))
etc

Or:

userId   score_good_total
1        ...

Where score_good_total would be the sum of all the score_good scores, divided by the amount of days.

I can replace x and y with a subquery, but that doesn't seem the way to go, and it will probably cause too much load when I want the data grouped by month, or total scores for all the available dates.

I don't see a better way than the subquery, as any creative way to do this will have to sum all the rows anyway. The optimizer should make your subqueries pretty well-performing, and it certainly is simple.
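For illustration, here is a minimal sketch of what the subquery approach could look like, assuming the column names from the question (`date`, `userId`, `attempts`, `good`, `bad`). The alias names `s`, `t`, `sum_good` and `sum_bad` are made up for this example. The per-day sums are computed once in a derived table and joined back to each row, so no GROUP BY is needed in the outer query:

```sql
-- Sketch only: x and y from the question become t.sum_good and t.sum_bad.
SELECT
  s.date,
  s.userId,
  (s.good / s.attempts) / t.sum_good * 100 AS score_good,
  (s.bad  / s.attempts) / t.sum_bad  * 100 AS score_bad
FROM stats AS s
JOIN (
  -- one row per day holding the sums of the per-user ratios
  SELECT
    date,
    SUM(good / attempts) AS sum_good,
    SUM(bad  / attempts) AS sum_bad
  FROM stats
  GROUP BY date
) AS t ON t.date = s.date;
```

With the sample data, user 1 should come out at roughly 37 for score_good (0.8 / 2.15), matching the arithmetic in the question. Rows with attempts = 0 would produce NULL ratios and need separate handling.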

If you indeed need better performance, you'll have to run a separate job that saves all the "daily total" scores in another table, since they don't change once the day is done. Then you can change your query to calculate it only if it is today; otherwise, use the data in said "daily total scores" table.
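A rough sketch of that idea follows. The table name `daily_totals` and its columns are hypothetical, and scheduling the job to run once per day (cron, a MySQL event, or similar) is an assumption, not something specified here:

```sql
-- Hypothetical summary table: one row per finished day.
CREATE TABLE daily_totals (
  date     DATE NOT NULL PRIMARY KEY,
  sum_good DECIMAL(12,4),
  sum_bad  DECIMAL(12,4)
);

-- Daily job: store yesterday's sums once the day is done.
INSERT INTO daily_totals (date, sum_good, sum_bad)
SELECT date, SUM(good / attempts), SUM(bad / attempts)
FROM stats
WHERE date = CURRENT_DATE - INTERVAL 1 DAY
GROUP BY date;

-- Scores for finished days then only need a lookup join.
SELECT
  s.date,
  s.userId,
  (s.good / s.attempts) / d.sum_good * 100 AS score_good,
  (s.bad  / s.attempts) / d.sum_bad  * 100 AS score_bad
FROM stats AS s
JOIN daily_totals AS d ON d.date = s.date;
```

Today's rows are not in `daily_totals` yet, so they would still have to be computed on the fly with a subquery.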

This pulls out a bit of SQL-fu but it's perfectly doable in a very simple query.

-- this would be the working query
SELECT
   *,
   @score := good / attempts * 100 AS score,
   @t_score := (SELECT SUM(good / attempts * 100) FROM stats) AS t_score,
   @score / @t_score AS relative_score_good
FROM stats;

Below are the values I used, so you can replicate and play with the results.

The first thing to notice here is the inner subquery: it is an uncorrelated scalar subquery and will therefore only run once for all rows (just run the query with EXPLAIN to see that there are really only two queries here).

The second thing to notice (and the really important one!) is the user-defined variables, which are written as @variable.
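Two caveats apply, as far as I know. The MySQL manual states that the order of evaluation for expressions involving user variables within a single statement is undefined, so reading @score in the same SELECT that assigns it is not guaranteed to work. Also, the subquery above totals over the whole table rather than per day. If MySQL 8.0 or later is available, a window function is one possible alternative that avoids both issues. The sketch below assumes the demo table from this answer, where the user column is named `id`:

```sql
-- Sketch for MySQL 8.0+: per-day total via a window function,
-- with no user-defined variables involved.
SELECT
  `date`,
  `id`,
  good / attempts * 100 AS score,
  (good / attempts)
    / SUM(good / attempts) OVER (PARTITION BY `date`) * 100
    AS relative_score_good
FROM stats;
```

Here relative_score_good is expressed as a percentage, so user 1 in the sample data should land at roughly 37.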


For replication purposes you can rebuild the sample table with these two commands (it's always nice if you can give the SQL to generate the demo values to the community).

-- create the demo table
CREATE TABLE `test`.`stats` (
   `date` DATE NOT NULL ,
   `id` INT NOT NULL ,
   `attempts` INT NOT NULL ,
   `good` INT NOT NULL ,
   `bad` INT NOT NULL ,
   INDEX ( `id` , `attempts` , `good` , `bad` ) 
) ENGINE = MYISAM;

-- inject some values
INSERT INTO `test`.`stats` (`date`,`id`,`attempts` ,`good` ,`bad`)
VALUES 
   ('2010-08-23', '1', '5', '4', '1'), 
   ('2010-08-23', '2', '10', '6', '4'), 
   ('2010-08-23', '3', '6', '3', '3'), 
   ('2010-08-23', '4', '8', '2', '6');

Hope it helps! I saw the question just as I was leaving the office and thought someone would beat me to it. A movie and 4 hours later, and no answers yet, hurray! ;)
