简体   繁体   中英

Getting wrong sum when using group by with inner join

With reference to this question ( How to get the sum in a joined table when using group by - getting wrong results ) I have two tables orders and order_items . I need to group the results by days. But I also need to get the sum of energy_used for each day from another table. When I try that using a join, I get wrong order_sum for each day (they are not being summed up). Not sure what I am doing wrong.

I would like to get for each day

  • the sum of order_items.energy_used for all orders created that day
  • the sum of orders.order_sum for all orders created that day
  • the created_at and order_sum that correspond to the latest order created on that day

Here is my orders table

+----+-----------+---------+---------------------+
| id | order_sum | user_id | created_at          |
+----+-----------+---------+---------------------+
| 1  | 25.13     | 7       | 2020-01-25 09:13:00 |
| 2  | 10.00     | 7       | 2020-01-25 15:23:00 |
| 3  | 14.00     | 5       | 2020-01-26 10:14:00 |
| 4  | 35.00     | 1       | 2020-01-27 11:13:00 |
+----+-----------+---------+---------------------+

And here is my order_items table

+----+----------+-------------+---------------------+
| id | order_id | energy_used | created_at          |
+----+----------+-------------+---------------------+
| 1  | 1        | 65          | 2020-01-25 09:13:00 |
| 2  | 1        | 12          | 2020-01-25 09:13:00 |
| 3  | 2        | 70          | 2020-01-26 10:14:00 |
| 4  | 2        | 5           | 2020-01-26 10:14:00 |
| 5  | 3        | 0           | 2020-01-27 11:13:00 |
+----+----------+-------------+---------------------+

And this is the desired result that I am trying to achieve

+---------------+-----------------+-------------------+---------------------+----------------+
| date_of_month | total_order_sum | total_energy_used | last_order_date     | last_order_sum |
+---------------+-----------------+-------------------+---------------------+----------------+
| 2020-01-25    | 35.13           | 77                | 2020-01-25 09:13:00 | 25.13          |
| 2020-01-26    | 14.00           | 75                | 2020-01-26 10:14:00 | 14.00          |
| 2020-01-27    | 35.00           | 0                 | 2020-01-27 11:13:00 | 35.00          |
+---------------+-----------------+-------------------+---------------------+----------------+

And here is the query that I have tried but I'm getting wrong results, the order_sum is not being calculated correctly. It is showing the same as last_order_sum

select 
    date(o.created_at) date_of_month,
    i.total_energy_used,
    o.created_at last_order_date,
    o.order_sum last_order_sum,
    sum(order_sum) as total_order_sum
from orders o
inner join (
    select date(o1.created_at) date_of_month, sum(i1.energy_used) total_energy_used
    from orders o1
    inner join order_items i1 on o1.id = i1.order_id
    group by date(o1.created_at)
) i on i.date_of_month = date(o.created_at)
where o.created_at = (
    select max(o1.created_at)
    from orders o1
    where date(o1.created_at) = date(o.created_at)
)

Here is a fiddle: https://dbfiddle.uk/?rdbms=mysql_5.6&fiddle=92b8cc2920ad9f7a7cdd56bded5a3bf2

Always join tables together on their relationships (in this case orders.id with order_items.order_id) and then group. to avoid duplicating order_sums for multiple order_items when joining, first group order_items by order_id.

select 
    date(o.created_at) date_of_month,
    sum(i.total_energy_used),
    max(o.created_at),
    sum(order_sum) as total_order_sum
from orders o
inner join (
    select order_id, sum(total_energy_used) total_energy_used
    from order_items i
    group by order_id
) i on o.id = i.order_id
group by date(o.created_at)

from this point onwards you can do a join again on orders with max(o.created_at) to get the order_sum of the last order. moral of the story: keep an eye on your granularity.

Your problem is that you are selecting from orders , where you actually want an aggregate of orders by date. So select from two aggregtating subqueries that you join. Only problem is last_order_sum , which we can select in a further subquery, once we know the last order date.

select 
  order_date,
  o.total_order_sum,
  oi.total_energy_used,
  o.last_order_date,
  (
    select order_sum
    from orders last_order
    where lastorder.created_at = o.last_order_date
  ) as last_order_sum
from 
(
  select
    date(created_at) as order_date,
    sum(order_sum) as total_order_sum,
    max(created_date) as last_order_date
  from orders
  group by date(created_at)
) o
inner join 
(
  select
    date(created_at) as order_date,
    sum(energy_used) as total_energy_used
  from order_items
  group by date(created_at)
) oi using(order_date)
order by order_date;

What you are asking and what you show us in output are not correlated. Assuming that it is a typo:

select so.dtDay as date_of_month, so.order_sum as total_order_sum,
   eu.energy_used as total_energy_used,
   o.created_at as last_order_date,
   o.order_sum as last_order_sum
from (
select left(created_at,10) as dtDay, sum(order_sum) as order_sum, max(id) as last_insert_id
from orders
group by left(created_at,10)
order by created_at
) so
inner join orders o on o.id = so.last_insert_id
left join (select left(created_at,10) as dtDay, sum(energy_used) as energy_used
from order_items
group by left(created_at,10)) eu on so.dtDay = eu.dtDay;

DBFiddle

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM