SQL Server Query aggregating incorrectly when no where or having clause present

Edit: I've tried the first two solutions, but I still have the same problem: the query returns correct results for a single customer when a WHERE clause is present, but incorrect results for the same customer without it. How could this be happening? What is going on under the hood that could lead to this?

I am building a query to join and aggregate customer information on a big table. I started by building the query with a WHERE clause for a single customer, to make sure the logic works before applying it to the whole population of customers.

The tables I'm joining look something like this:

Table A:

| customer | order_id |
-----------------------
| abc      | 1        |
| abc      | 2        |
| xyz      | 3        |
| xyz      | 4        |
| xyz      | 5        |
| xyz      | 6        |
...

Table B:

| order_id | return_date   |
----------------------------
| 1        |       Mon     |
| 3        |       Tues    |
| 5        |       Wed     |
...

I need to aggregate these by customer name and count the number of times each customer's orders appear in each table.

So the query looks something like this:

SELECT 
  a.customer as customer_name
  ,COUNT(DISTINCT(a.order_id)) as total_orders
  ,COUNT(DISTINCT(B.order_id)) as num_returns
FROM B

RIGHT JOIN (
  SELECT 
    customer
    ,order_id
  FROM A
  ) as a

ON B.order_id = a.order_id
WHERE customer = 'xyz'
GROUP BY a.customer

This works perfectly when the WHERE clause is present (it also works with HAVING customer = 'xyz' after the GROUP BY instead). But when I remove the WHERE clause to apply this to the whole population of customers, the results are completely incorrect. How can I fix this so it works for the population?

This query should work:

SELECT a.customer as customer_name,
       COUNT(DISTINCT a.order_id) as total_orders,
       COUNT(DISTINCT B.order_id) as num_returns
FROM A LEFT JOIN
     B
     ON B.order_id = a.order_id
WHERE a.customer = 'xyz'
GROUP BY a.customer;

If xyz has no rows in A, then this returns no rows.
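To check the LEFT JOIN form against the sample data from the question, here is a minimal reproduction sketch. It runs on an in-memory SQLite database through Python's sqlite3 module rather than on SQL Server, which is an assumption made only so the example is self-contained. The rows elided by "..." in the question are not filled in, and the {where} placeholder is a hypothetical helper for toggling the filter.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (customer TEXT, order_id INTEGER);
    CREATE TABLE B (order_id INTEGER, return_date TEXT);
    INSERT INTO A VALUES ('abc', 1), ('abc', 2),
                         ('xyz', 3), ('xyz', 4), ('xyz', 5), ('xyz', 6);
    INSERT INTO B VALUES (1, 'Mon'), (3, 'Tues'), (5, 'Wed');
""")

# Same shape as the query above; {where} lets us run it with and without a filter.
QUERY = """
    SELECT A.customer AS customer_name,
           COUNT(DISTINCT A.order_id) AS total_orders,
           COUNT(DISTINCT B.order_id) AS num_returns
    FROM A
    LEFT JOIN B ON B.order_id = A.order_id
    {where}
    GROUP BY A.customer
    ORDER BY A.customer
"""

all_customers = conn.execute(QUERY.format(where="")).fetchall()
only_xyz = conn.execute(QUERY.format(where="WHERE A.customer = 'xyz'")).fetchall()

print(all_customers)  # [('abc', 2, 1), ('xyz', 4, 2)]
print(only_xyz)       # [('xyz', 4, 2)]
```

On this sample the row for xyz is identical with and without the filter. If the real tables behave differently, the difference is likely to come from data not shown in the sample, which this sketch cannot reproduce.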

I would recommend pre-aggregation on b, and a left join:

select a.customer, count(*) total_orders, coalesce(sum(b.num_returns), 0) num_returns
from a
left join (
    select order_id, count(*) num_returns
    from b
    group by order_id
) b on b.order_id = a.order_id
group by a.customer

The results are consistent regardless of whether a where clause is used. Note that this assumes no duplicate (customer, order_id) pairs in a, as shown in your sample data.

A lateral join would also do:
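The no-duplicates assumption matters because count(*) counts rows of a, not distinct orders. The sketch below runs the pre-aggregation query on the sample data and then again with one duplicated row in a. It uses SQLite via Python's sqlite3 module instead of SQL Server, and the run helper and the duplicated ('xyz', 3) row are hypothetical, added only for illustration.

```python
import sqlite3

# The pre-aggregation query from above, with an ORDER BY for stable output.
PREAGG = """
    SELECT a.customer,
           COUNT(*) AS total_orders,
           COALESCE(SUM(b.num_returns), 0) AS num_returns
    FROM A AS a
    LEFT JOIN (
        SELECT order_id, COUNT(*) AS num_returns
        FROM B
        GROUP BY order_id
    ) AS b ON b.order_id = a.order_id
    GROUP BY a.customer
    ORDER BY a.customer
"""

def run(a_rows):
    """Load the given rows into A, the sample rows into B, and run PREAGG."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (customer TEXT, order_id INTEGER)")
    conn.execute("CREATE TABLE B (order_id INTEGER, return_date TEXT)")
    conn.executemany("INSERT INTO A VALUES (?, ?)", a_rows)
    conn.executemany("INSERT INTO B VALUES (?, ?)",
                     [(1, "Mon"), (3, "Tues"), (5, "Wed")])
    rows = conn.execute(PREAGG).fetchall()
    conn.close()
    return rows

sample = [("abc", 1), ("abc", 2), ("xyz", 3), ("xyz", 4), ("xyz", 5), ("xyz", 6)]

clean = run(sample)                    # sample data as posted
with_dup = run(sample + [("xyz", 3)])  # hypothetical duplicate row in A

print(clean)     # [('abc', 2, 1), ('xyz', 4, 2)]
print(with_dup)  # [('abc', 2, 1), ('xyz', 5, 3)]
```

With the duplicate, xyz is reported with 5 orders and 3 returns, because the repeated row is counted once more in both columns. If duplicates are possible in the real table, COUNT(DISTINCT a.order_id) or deduplicating a first would be safer.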

select a.customer, count(*) total_orders, sum(b.num_returns) num_returns
from a
cross apply (
    select count(*) num_returns
    from b
    where b.order_id = a.order_id
) b
group by a.customer
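
For readers who want to try the per-order counting idea without a SQL Server instance, here is a rough equivalent. SQLite has no CROSS APPLY, so this sketch swaps in a correlated scalar subquery inside a derived table. That is a different construct, not the query above, but it computes the same per-order return count before grouping. The per_order alias is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (customer TEXT, order_id INTEGER);
    CREATE TABLE B (order_id INTEGER, return_date TEXT);
    INSERT INTO A VALUES ('abc', 1), ('abc', 2),
                         ('xyz', 3), ('xyz', 4), ('xyz', 5), ('xyz', 6);
    INSERT INTO B VALUES (1, 'Mon'), (3, 'Tues'), (5, 'Wed');
""")

# The inner SELECT yields one row per row of A with that order's return count
# (0 when there is no match); the outer SELECT then groups by customer.
rows = conn.execute("""
    SELECT customer,
           COUNT(*) AS total_orders,
           SUM(num_returns) AS num_returns
    FROM (
        SELECT A.customer,
               (SELECT COUNT(*) FROM B WHERE B.order_id = A.order_id) AS num_returns
        FROM A
    ) AS per_order
    GROUP BY customer
    ORDER BY customer
""").fetchall()

print(rows)  # [('abc', 2, 1), ('xyz', 4, 2)]
```

Because the inner count always returns a value, no COALESCE is needed around the SUM, which matches the cross apply version.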
