SQL Server Query aggregating incorrectly when no where or having clause present
Edit: I've tried the first two solutions, but I still have the same issue: the query returns correct results for a single customer when a WHERE clause is present, but incorrect results for the same customer without it. How could this be happening? What is going on under the hood that could lead to this?
I am building a query to join and aggregate customer information on a big table, so I started by building the query with a WHERE clause for a single customer, to make sure the logic works before applying it to the whole population of customers.
The tables I'm joining look something like this:
Table A:
| customer | order_id |
----------------------
| abc | 1 |
| abc | 2 |
| xyz | 3 |
| xyz | 4 |
| xyz | 5 |
| xyz | 6 |
...
Table B:
| order_id | return_date |
----------------------------
| 1 | Mon |
| 3 | Tues |
| 5 | Wed |
...
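For anyone who wants to reproduce the behaviour, the sample rows above could be loaded with a script along these lines. This is only a sketch: the column types are assumptions (the question does not give them), and it contains just the rows that are shown, not the elided ones.

```sql
-- Hypothetical setup script: the column types below are assumptions,
-- chosen only so the sample rows from the question can be loaded.
CREATE TABLE A (
    customer VARCHAR(50) NOT NULL,
    order_id INT         NOT NULL
);

CREATE TABLE B (
    order_id    INT         NOT NULL,
    return_date VARCHAR(10) NOT NULL  -- the sample shows day names, so text is used here
);

-- Only the rows visible in the question; the "..." rows are not reproduced.
INSERT INTO A (customer, order_id) VALUES
    ('abc', 1), ('abc', 2),
    ('xyz', 3), ('xyz', 4), ('xyz', 5), ('xyz', 6);

INSERT INTO B (order_id, return_date) VALUES
    (1, 'Mon'), (3, 'Tues'), (5, 'Wed');
```

With only these rows, a correct per-customer aggregation should report 2 orders and 1 return for abc, and 4 orders and 2 returns for xyz.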
I need to aggregate these by customer name and, essentially, count the number of times each customer's information appears in each table.
So the query looks something like this:
SELECT
a.customer as customer_name
,COUNT(DISTINCT(a.order_id)) as total_orders
,COUNT(DISTINCT(B.order_id)) as num_returns
FROM B
RIGHT JOIN (
SELECT
customer
,order_id
FROM A
) as a
ON B.order_id = a.order_id
WHERE customer = 'xyz'
GROUP BY a.customer
This works perfectly when the WHERE clause is present (it also works with HAVING customer = 'xyz' after the GROUP BY instead). But when I remove the WHERE clause to apply this to the whole population of customers, the results are completely incorrect. How can I fix this to work for the population?
This query should work:
SELECT a.customer as customer_name,
COUNT(DISTINCT a.order_id) as total_orders,
COUNT(DISTINCT B.order_id) as num_returns
FROM A LEFT JOIN
B
ON B.order_id = a.order_id
WHERE a.customer = 'xyz'
GROUP BY a.customer;
If xyz has no rows in A, then this returns no rows.
I would recommend pre-aggregation on b, and a left join:
select a.customer, count(*) total_orders, coalesce(sum(b.num_returns), 0) num_returns
from a
left join (
select order_id, count(*) num_returns
from b
group by order_id
) b on b.order_id = a.order_id
group by a.customer
The results are consistent, regardless of whether a where clause is used or not. Note that this assumes no duplicate (customer, order_id) pairs in a, as shown in your sample data.
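The no-duplicates assumption can be checked directly before relying on the counts. The queries below are a sketch using the table and column names from the question; the aliases copies and customers are made up for illustration. The last query is one possible variant that de-duplicates a first, in case the check does find repeated pairs.

```sql
-- Check 1: any (customer, order_id) pair that occurs more than once in a.
SELECT customer, order_id, COUNT(*) AS copies
FROM a
GROUP BY customer, order_id
HAVING COUNT(*) > 1;

-- Check 2: any order_id that is attached to more than one customer.
SELECT order_id, COUNT(DISTINCT customer) AS customers
FROM a
GROUP BY order_id
HAVING COUNT(DISTINCT customer) > 1;

-- Possible variant if Check 1 returns rows: collapse duplicate pairs in a
-- before joining, so each order is counted once per customer.
SELECT a.customer,
       COUNT(*) AS total_orders,
       COALESCE(SUM(b.num_returns), 0) AS num_returns
FROM (SELECT DISTINCT customer, order_id FROM a) AS a
LEFT JOIN (
    SELECT order_id, COUNT(*) AS num_returns
    FROM b
    GROUP BY order_id
) AS b ON b.order_id = a.order_id
GROUP BY a.customer;
```

If both checks return no rows, the pre-aggregation query can be used as written. Note that num_returns here counts rows in b, so an order with two return rows contributes 2, unlike COUNT(DISTINCT B.order_id) in the original query.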
A lateral join would also do:
select a.customer, count(*) total_orders, sum(b.num_returns) num_returns
from a
cross apply (
select count(*) num_returns
from b
where b.order_id = a.order_id
) b
group by a.customer