SQL Server: join results in too many rows

I have two SQL Server tables.

The first table is called Content and contains, among other things, columns called:

FileID, FileHighResolutionID, FileHighResolutionProID, FileVectorID

The second table is called Analytics and contains, among other things, a column called FileID. This column contains the value from one of the four aforementioned columns in Content.

Executing the following...

SELECT 
    * 
FROM 
    Analytics a
WHERE 
    a.Created BETWEEN '2017-03-07' AND '2017-03-08'

results in 782 rows.

But executing the following...

SELECT 
    * 
FROM 
    Analytics a
INNER JOIN 
    Content c ON (c.FileID = a.FileID OR c.FileHighResolutionID = a.FileID OR c.FileHighResolutionProID = a.FileID OR c.FileVectorID = a.FileID)
WHERE 
    a.Created BETWEEN '2017-03-07' AND '2017-03-08'

results in 843 rows.

I know I have something wrong with my JOIN, because now I have 61 records too many. I have tried INNER JOINs, LEFT OUTER JOINs, and RIGHT OUTER JOINs, but each results in 61 mysterious extra records.

Can some SQL expert please review and tell me what I am doing wrong?

You must have figured out by now that the issue is that the OR lets a single Analytics row match more than one row in c, for example when the same file ID is stored in different columns of different Content rows. You get a separate row for each match. Voila! Unexpected rows.
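To see which rows are being multiplied, a diagnostic along these lines may help. It is only a sketch: the table variables, the sample values, and the key columns ContentID and AnalyticsID are assumptions made up for illustration, since the question does not show the real schema.

```sql
-- Hypothetical sample data; table variables stand in for the real tables.
DECLARE @Content TABLE (
    ContentID int,                -- assumed key column, not named in the question
    FileID int,
    FileHighResolutionID int,
    FileHighResolutionProID int,
    FileVectorID int
);
DECLARE @Analytics TABLE (
    AnalyticsID int,              -- assumed key column, not named in the question
    FileID int,
    Created datetime
);

INSERT INTO @Content VALUES
    (1, 100, 101, 102, 103),
    (2, 200, 100, 202, 203);      -- file 100 appears again, in a different column of row 2

INSERT INTO @Analytics VALUES
    (1, 100, '2017-03-07T10:00:00'),
    (2, 203, '2017-03-07T11:00:00');

-- List every Analytics row that matches more than one Content row.
SELECT a.AnalyticsID, a.FileID, COUNT(*) AS MatchCount
FROM @Analytics a
INNER JOIN @Content c
    ON a.FileID IN (c.FileID, c.FileHighResolutionID, c.FileHighResolutionProID, c.FileVectorID)
GROUP BY a.AnalyticsID, a.FileID
HAVING COUNT(*) > 1;
-- With the sample data: one row (AnalyticsID 1, FileID 100, MatchCount 2).
```

Run against the real tables (with whatever key Analytics actually has), the MatchCount values minus one should add up to the number of extra rows.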

One method to resolve this uses a "lateral join". This is like a correlated subquery, but it can return more than one column and more than one row (not needed here). In SQL Server, this is implemented using APPLY:

SELECT a.*, c.*
FROM Analytics a CROSS APPLY
     (SELECT TOP 1 c.*
      FROM Content c 
      WHERE a.FileID IN (c.FileId, c.FileHighResolutionID, c.FileHighResolutionProID, c.FileVectorID)
    ) c
WHERE a.Created BETWEEN '2017-03-07' AND '2017-03-08';

This returns an arbitrary matching row. You can get a specific row with an ORDER BY:

SELECT a.*, c.*
FROM Analytics a CROSS APPLY
     (SELECT TOP 1 c.*
      FROM Content c 
      WHERE a.FileID IN (c.FileId, c.FileHighResolutionID, c.FileHighResolutionProID, c.FileVectorID)
      ORDER BY (CASE a.FileId
                    WHEN c.FileId THEN 1
                    WHEN c.FileHighResolutionID THEN 2
                    WHEN c.FileHighResolutionProID THEN 3
                    WHEN c.FileVectorID THEN 4
                END)
    ) c
WHERE a.Created BETWEEN '2017-03-07' AND '2017-03-08';
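One thing to keep in mind is that CROSS APPLY drops Analytics rows that have no matching Content row at all. If the goal is to keep every Analytics row (the 782 from the question), OUTER APPLY is the variant that preserves them. The sketch below uses made-up table variables and assumed key columns (AnalyticsID, ContentID) purely for illustration.

```sql
-- Hypothetical sample data; key columns are assumptions, not part of the question.
DECLARE @Content TABLE (
    ContentID int,
    FileID int,
    FileHighResolutionID int,
    FileHighResolutionProID int,
    FileVectorID int
);
DECLARE @Analytics TABLE (AnalyticsID int, FileID int, Created datetime);

INSERT INTO @Content VALUES
    (1, 100, 101, 102, 103),
    (2, 200, 100, 202, 203);      -- file 100 also matches row 2 via FileHighResolutionID

INSERT INTO @Analytics VALUES
    (1, 100, '2017-03-07T10:00:00'),
    (2, 999, '2017-03-07T11:00:00');  -- no matching Content row

-- OUTER APPLY keeps every Analytics row; unmatched rows get NULL for ContentID.
SELECT a.AnalyticsID, a.FileID, m.ContentID
FROM @Analytics a
OUTER APPLY (
    SELECT TOP 1 c.ContentID
    FROM @Content c
    WHERE a.FileID IN (c.FileID, c.FileHighResolutionID, c.FileHighResolutionProID, c.FileVectorID)
    ORDER BY CASE a.FileID
                 WHEN c.FileID THEN 1
                 WHEN c.FileHighResolutionID THEN 2
                 WHEN c.FileHighResolutionProID THEN 3
                 WHEN c.FileVectorID THEN 4
             END
) m
ORDER BY a.AnalyticsID;
-- With the sample data: (1, 100, 1) and (2, 999, NULL).
```

Whether unmatched rows should be kept or dropped depends on what the report is meant to show, so treat the choice between CROSS APPLY and OUTER APPLY as a design decision rather than a fix.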

Note: I do agree with the answer that questions the use of BETWEEN with date/time values. This is dangerous, because BETWEEN includes both endpoints, so a time component makes the range behave differently from how it reads. I strongly recommend one of the following:

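-- March 7 only (an equality test covers the whole day only if Created has no time component):
WHERE a.Created = '2017-03-07'
-- March 7 and March 8, regardless of the time of day:
WHERE a.Created >= '2017-03-07' AND a.Created < '2017-03-09';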
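The difference is easy to see on a small example. The following is a sketch with invented timestamps in a table variable; the AnalyticsID column is an assumption. Date literals are written as YYYYMMDD, which for the datetime type does not depend on the session's language or DATEFORMAT settings.

```sql
-- Hypothetical timestamps around the two boundary dates.
DECLARE @Analytics TABLE (AnalyticsID int, Created datetime);

INSERT INTO @Analytics VALUES
    (1, '2017-03-07T00:00:00'),
    (2, '2017-03-07T15:30:00'),
    (3, '2017-03-08T00:00:00'),   -- exactly midnight on March 8
    (4, '2017-03-08T09:00:00');

-- BETWEEN is inclusive at both ends: returns rows 1, 2 and 3.
-- Row 3 (midnight on March 8) is included, row 4 (later that day) is not.
SELECT AnalyticsID FROM @Analytics
WHERE Created BETWEEN '20170307' AND '20170308';

-- Half-open range covering March 7 only: returns rows 1 and 2.
SELECT AnalyticsID FROM @Analytics
WHERE Created >= '20170307' AND Created < '20170308';

-- Half-open range covering March 7 and March 8: returns rows 1 through 4.
SELECT AnalyticsID FROM @Analytics
WHERE Created >= '20170307' AND Created < '20170309';
```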

If you don't need data from the Content table, you could go for EXISTS:

SELECT *
FROM Analytics AS A
WHERE A.Created >= '2017-03-07'
    AND A.Created < '2017-03-08'
    AND EXISTS (
        SELECT *
        FROM Content AS C
        WHERE A.FileID IN (C.FileID, C.FileHighResolutionID, C.FileHighResolutionProID, C.FileVectorID)
        );

EXISTS evaluates to either true or false in the WHERE condition, so it filters Analytics rows without creating duplicates.
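As a rough check of that claim, the two approaches can be compared side by side on a tiny data set. The table variables, values, and key columns below are assumptions for illustration only.

```sql
-- Hypothetical sample data; key columns are assumptions, not part of the question.
DECLARE @Content TABLE (
    ContentID int,
    FileID int,
    FileHighResolutionID int,
    FileHighResolutionProID int,
    FileVectorID int
);
DECLARE @Analytics TABLE (AnalyticsID int, FileID int);

INSERT INTO @Content VALUES
    (1, 100, 101, 102, 103),
    (2, 200, 100, 202, 203);      -- file 100 is referenced by both Content rows

INSERT INTO @Analytics VALUES
    (1, 100),                     -- matches Content rows 1 and 2
    (2, 203),                     -- matches Content row 2 only
    (3, 999);                     -- matches nothing

SELECT
    (SELECT COUNT(*)
     FROM @Analytics a
     INNER JOIN @Content c
         ON a.FileID IN (c.FileID, c.FileHighResolutionID, c.FileHighResolutionProID, c.FileVectorID)
    ) AS JoinRows,                -- 3: Analytics row 1 is counted twice
    (SELECT COUNT(*)
     FROM @Analytics a
     WHERE EXISTS (
         SELECT *
         FROM @Content c
         WHERE a.FileID IN (c.FileID, c.FileHighResolutionID, c.FileHighResolutionProID, c.FileVectorID)
     )
    ) AS ExistsRows;              -- 2: each matching Analytics row is counted once
```

Note that EXISTS, like an inner join, still leaves out Analytics rows with no match at all (row 3 here).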

Another bad practice is using BETWEEN in a WHERE clause when filtering on dates. Use a half-open range instead:

WHERE a.Created >= '20170307' AND a.Created < '20170308'

The second query returns more rows because it joins on multiple columns, so a single Analytics row can match more than one Content row. The first query doesn't join to another table at all.
