MySQL GROUP BY returning incorrect result
I have two tables (timesheet and task). Each contains an hours column: the task table holds the allocated hours and the timesheet table holds the actual hours. I am trying to get the sum of each of these values. The timesheet table also contains an integer "staff_id", which corresponds to "assigned_to" in the task table.
The task table contains:
task_id INT(11)
assigned_to INT(11)
date_start DATE
hrs DECIMAL (10,0)
The timesheet table contains:
timesheet_id INT
name VARCHAR(100)
hours DECIMAL(10,0)
staff_id INT(11)
My query looks like:
SELECT
timesheet.staff_id,
task.assigned_to,
SUM(task.hrs) AS assigned_hrs,
timesheet.name,
SUM(timesheet.hours) AS actual_hours
FROM timesheet
INNER JOIN task
ON timesheet.staff_id = task.assigned_to
GROUP BY timesheet.name
which (incorrectly) results in:
staff_id | assigned_to | assigned_hrs | name        | actual_hours
---------|-------------|--------------|-------------|-------------
4        | 4           | 1364         | John Smith  | 52
2        | 2           | 80           | Jane Doe    | 14.5
6        | 6           | 454          | Test User 1 | 40
9        | 9           | 262          | Test User 2 | 4
The above is the shape of result I am trying to get, and the results are correct except that John Smith's assigned hours get doubled. I know it has to do with the "Grouping Pitfall" as described here:
http://wikido.isoftdata.com/index.php/The_GROUPing_pitfall
but I just go cross-eyed trying to figure this out. Can someone point me in the right direction?
(Edit again) If I run a query on just the task table:
SELECT
task.assigned_to,
SUM(task.hrs) AS allocated_hrs
FROM task
GROUP BY task.assigned_to
it (correctly) results in:
assigned_to | allocated_hrs |
----------------------------
4 | 682
7 | 378
2 | 40
6 | 227
9 | 262
Comparing this with the joined result, you can see that the total for user ID 4 (John Smith) has doubled from 682 to 1364, and the total for ID 6 has doubled as well (227 to 454).
Running a query on just the timesheet table:
SELECT
timesheet.name,
SUM(timesheet.hours) AS actual_hours
FROM timesheet
GROUP BY timesheet.name
correctly results in:
name | Actual_hrs
-------------------------
Jane Doe | 19.5
John Smith | 6.5
Test User1 | 4
Test User2 | 5
Running the query supplied by JoachimL results in:
staff_id | assigned_to | assigned_hrs | name | actual_hours
----------------------------------------------------------------------
2 2 40 Jane Doe 19.5
4 4 24 John Smith 6.5
4 4 7 John Smith 6.5
4 4 21 John Smith 6.5
4 4 210 John Smith 6.5
4 4 28 John Smith 6.5
4 4 91 John Smith 6.5
6 6 14 Test User 1 8
6 6 91 Test User 1 8
6 6 28 Test User 1 8
6 6 3 Test User 1 8
9 9 24 Test User 2 1
9 9 91 Test User 2 1
9 9 56 Test User 2 1
Here's a fiddle: http://sqlfiddle.com/#!2/ef680
No comment privileges, so I am posting this as an answer.
Do IDs 4 and 6 have two rows in timesheet, and the others just one? If so, each of their task rows is joined twice, and the sum of task.hrs is doubled.
Something like the query below should avoid that. If task_id is unique you don't have to sum task.hrs. (Test data would help.)
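To make the doubling concrete, here is a minimal sketch that reproduces the effect. It is not the asker's data: the rows are invented for illustration, and it runs on Python's built-in sqlite3 rather than MySQL, on the assumption that the join and SUM behave the same way for this case.

```python
import sqlite3

# Hypothetical miniature data set; the values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        task_id INTEGER PRIMARY KEY, assigned_to INTEGER, hrs NUMERIC);
    CREATE TABLE timesheet (
        timesheet_id INTEGER PRIMARY KEY, name TEXT, hours NUMERIC,
        staff_id INTEGER);
    -- staff 4 has two tasks and two timesheet rows; staff 2 has one of each
    INSERT INTO task (assigned_to, hrs) VALUES (4, 10), (4, 20), (2, 40);
    INSERT INTO timesheet (name, hours, staff_id) VALUES
        ('John Smith', 3, 4), ('John Smith', 4, 4), ('Jane Doe', 5, 2);
""")

# Joining before aggregating pairs every task row with every timesheet row
# of the same person, so each value is counted once per matching row.
joined = conn.execute("""
    SELECT timesheet.staff_id, SUM(task.hrs), SUM(timesheet.hours)
    FROM timesheet
    INNER JOIN task ON timesheet.staff_id = task.assigned_to
    GROUP BY timesheet.staff_id
    ORDER BY timesheet.staff_id
""").fetchall()

print(joined)  # staff 4 shows 60 and 14; the true totals are 30 and 7
```

Staff 2 has a single row on each side, so its totals come out right, which is why only some people in the question look wrong.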
EDIT
SELECT
ts.staff_id,
task.assigned_to,
task.hrs AS assigned_hrs,
ts.name,
ts.actual_hours
FROM task
INNER JOIN (SELECT staff_id, name, SUM(hours) as actual_hours FROM timesheet GROUP BY staff_id, name) as ts
ON ts.staff_id = task.assigned_to
The above groups timesheet by staff_id and name first, then joins the result with task, which should give just one row per task.
SELECT
timesheet.staff_id,
task.assigned_to,
SUM(task.hrs) AS assigned_hrs,
timesheet.name,
SUM(timesheet.hours) AS actual_hours
FROM task
LEFT JOIN timesheet ON timesheet.staff_id = task.assigned_to
GROUP BY timesheet.staff_id
Try a LEFT JOIN and make sure you group by a unique field. "name" may not be unique.
Note: with FROM task LEFT JOIN timesheet, any timesheets whose staff member has no assigned task are left out. You can reverse this by selecting FROM timesheet LEFT JOIN task instead.
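As a rough illustration of that note, the sketch below shows which rows each join direction keeps. The data is invented, and it uses Python's built-in sqlite3 as a stand-in for MySQL; LEFT JOIN semantics are assumed to match for this simple case.

```python
import sqlite3

# Hypothetical data: Jane Doe has a timesheet row but no task.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        task_id INTEGER PRIMARY KEY, assigned_to INTEGER, hrs NUMERIC);
    CREATE TABLE timesheet (
        timesheet_id INTEGER PRIMARY KEY, name TEXT, hours NUMERIC,
        staff_id INTEGER);
    INSERT INTO task (assigned_to, hrs) VALUES (4, 10);
    INSERT INTO timesheet (name, hours, staff_id) VALUES
        ('John Smith', 3, 4), ('Jane Doe', 5, 2);
""")

# task on the left: only people who have a task appear.
task_left = conn.execute("""
    SELECT timesheet.name
    FROM task
    LEFT JOIN timesheet ON timesheet.staff_id = task.assigned_to
""").fetchall()

# timesheet on the left: everyone with a timesheet appears,
# with NULL (None in Python) where no task matches.
timesheet_left = conn.execute("""
    SELECT timesheet.name, task.hrs
    FROM timesheet
    LEFT JOIN task ON timesheet.staff_id = task.assigned_to
    ORDER BY timesheet.staff_id
""").fetchall()

print(task_left)
print(timesheet_left)
```

Changing the join direction only changes which unmatched rows survive. It does not by itself stop matched rows from being multiplied.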
Edit: See this answer: Select multiple sums with MySQL query and display them in separate columns
Sorry, no comment privileges yet.
SELECT x.*
, SUM(y.hrs) n
FROM
( SELECT t.staff_id
, t.name
, SUM(t.hours) actual_hours
FROM timesheet t
GROUP
BY t.staff_id
) x
JOIN task y
ON y.assigned_to = x.staff_id
GROUP
BY staff_id;
http://sqlfiddle.com/#!2/ef680/14
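A variation on the same idea is to aggregate both tables in their own subqueries before joining, so neither sum can be multiplied by rows from the other table. The sketch below is one way to do that, not the query from any of the answers. It assumes invented sample rows and runs on Python's built-in sqlite3 in place of MySQL; the aliases ts and tk are illustrative.

```python
import sqlite3

# Hypothetical miniature data set; the values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        task_id INTEGER PRIMARY KEY, assigned_to INTEGER, hrs NUMERIC);
    CREATE TABLE timesheet (
        timesheet_id INTEGER PRIMARY KEY, name TEXT, hours NUMERIC,
        staff_id INTEGER);
    INSERT INTO task (assigned_to, hrs) VALUES (4, 10), (4, 20), (2, 40);
    INSERT INTO timesheet (name, hours, staff_id) VALUES
        ('John Smith', 3, 4), ('John Smith', 4, 4), ('Jane Doe', 5, 2);
""")

# Each subquery collapses its table to one row per person,
# so the join is one-to-one and nothing is counted twice.
fixed = conn.execute("""
    SELECT ts.staff_id, ts.name, tk.assigned_hrs, ts.actual_hours
    FROM (SELECT staff_id, name, SUM(hours) AS actual_hours
          FROM timesheet
          GROUP BY staff_id, name) AS ts
    JOIN (SELECT assigned_to, SUM(hrs) AS assigned_hrs
          FROM task
          GROUP BY assigned_to) AS tk
      ON tk.assigned_to = ts.staff_id
    ORDER BY ts.staff_id
""").fetchall()

print(fixed)
```

Grouping the timesheet subquery by both staff_id and name keeps the query valid under strict GROUP BY settings, at the cost of producing two rows if one staff_id ever appears with two spellings of the name.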