
Use UNION ALL and GROUP BY to combine 3 TSQL statements into one query

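I have 3 TSQL queries that each work on their own, but I need to combine them into one query. In Microsoft Access they are combined successfully with the SQL statement below; however, I am trying to reverse engineer everything in TSQL. How can I do the same thing in TSQL? Keep in mind that ap.property_id and af1.property_id come from different tables.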

SELECT property_id
FROM [Insured (see TSQL Statement 1)] 
GROUP BY property_id
UNION ALL
SELECT property_id
FROM [Uninsured (see TSQL Statement 2)] 
GROUP BY property_id
UNION ALL
SELECT property_id
FROM [IE > 90 Days (see TSQL Statement 3)] 
GROUP BY property_id;

TSQL Statement 1 (Insured)

SELECT
    rtrim(STR_REPLACE(ap.region_name,' Region','')) AS 'Region',
    ap.property_id,
    ap.is_under_management_ind,
    ap.is_insured_ind,
    ap.has_active_assistance_ind,
    af1.is_pipeline_ind,

    CASE
        WHEN ap.is_insured_ind = "Y" THEN "Insured"
        ELSE "Other"
    END AS 'Classification 1',

    CASE
        WHEN ap.is_insured_ind = "Y" AND ap.is_under_management_ind = "Y" AND af1.is_pipeline_ind = "N" THEN "Insured Only"
        WHEN ap.is_insured_ind = "Y" AND ap.is_under_management_ind = "Y" AND af1.is_pipeline_ind = "Y" AND ap.has_active_assistance_ind = "Y" THEN "Insured and Assisted"
        ELSE "Other Insured"
    END AS 'Classification 2'

FROM rems_dmart.dbo.active_financing af1
    INNER JOIN rems_dmart.dbo.active_property ap
    ON af1.property_id = ap.property_id

GROUP BY
    ap.region_name,
    ap.property_id,
    ap.is_under_management_ind,
    ap.is_insured_ind,
    ap.has_active_assistance_ind,
    af1.is_pipeline_ind

HAVING
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "Y"
    AND af1.is_pipeline_ind = "N" )
    OR
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "Y"
    AND af1.is_pipeline_ind = "Y"
    AND ap.has_active_assistance_ind = "Y" )

TSQL Statement 2 (Uninsured)

SELECT
    rtrim(STR_REPLACE(ap.region_name,' Region','')) AS 'Region',
    ap.property_id,
    ap.is_under_management_ind,
    ap.is_insured_ind,
    ap.has_use_restriction_ind,
    ap.has_active_irp_ind,
    ap.has_active_assistance_ind,
    ap.is_service_coordinator_ind,

    CASE
        WHEN ap.is_insured_ind = "N" THEN "Uninsured"
        ELSE "Other"
    END AS 'Classification 1',

    CASE
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "Y" THEN "Assisted Only"
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "N" AND ap.has_use_restriction_ind = "Y" THEN "Use Agreement Only"
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "N" AND ap.has_use_restriction_ind = "N" AND ap.has_active_irp_ind = "Y" AND ap.is_service_coordinator_ind = "N" THEN "IRP"
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "N" AND ap.has_use_restriction_ind = "N" AND ap.has_active_irp_ind = "N" AND ap.is_service_coordinator_ind = "Y" THEN "Service Coordinator"
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "N" AND ap.has_use_restriction_ind = "N" AND ap.has_active_irp_ind = "Y" AND ap.is_service_coordinator_ind = "Y" THEN "IRP & Service Coordinator"
        ELSE "Other Uninsured"
    END AS 'Classification 2'

FROM rems_dmart.dbo.active_property ap

GROUP BY
    ap.region_name,
    ap.property_id,
    ap.is_under_management_ind,
    ap.is_insured_ind,
    ap.has_use_restriction_ind,
    ap.has_active_irp_ind,
    ap.has_active_assistance_ind,
    ap.is_service_coordinator_ind

HAVING
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "N"
    AND ap.has_active_assistance_ind = "Y" )
    OR 
    ( ap.region_name <> "OHP"
    AND ap.is_insured_ind = "N"
    AND ap.has_use_restriction_ind = "Y" )

TSQL Statement 3 (IE > 90 Days)

SELECT
    rtrim(STR_REPLACE(ap.region_name,' Region','')) AS 'Region',
    af1.property_id,
    af1.initial_endorsement_date,
    af1.final_endorsement_date,
    ap.is_under_management_ind,

    CASE
        WHEN af1.initial_endorsement_date IS NOT NULL AND af1.final_endorsement_date IS NULL THEN "IE > 90 Days"
        ELSE "Other"
    END AS 'Classification 1',

    CASE
        WHEN af1.initial_endorsement_date IS NOT NULL AND af1.final_endorsement_date IS NULL AND ap.is_under_management_ind = "Y" THEN "IE > 90 Days_Under Mgmt"
        WHEN af1.initial_endorsement_date IS NOT NULL AND af1.final_endorsement_date IS NULL AND ap.is_under_management_ind = "N" THEN "IE > 90 Days_Not Under Mgmt"
        ELSE "Other IE > 90 Days"
    END AS 'Classification 2'

FROM rems_dmart.dbo.active_property ap
    INNER JOIN rems_dmart.dbo.active_financing af1
        ON ap.property_id = af1.property_id

WHERE
    ( ap.region_name <> "OHP"
    AND DATEDIFF(DAY, af1.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
    AND af1.final_endorsement_date IS NULL 
    AND ap.is_under_management_ind = "N" )
    OR 
    ( ap.region_name <> "OHP"
    AND DATEDIFF(DAY, af1.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
    AND af1.final_endorsement_date IS NULL 
    AND ap.is_under_management_ind = "Y" )

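First, the end result you want is just the property_id values. A natural first step, then, is to remove everything else from the select lists of (copies of) the TSQL queries. That already makes them a lot simpler.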

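Second, the first two original TSQL queries use GROUP BY without any aggregate functions, and they include all of the grouping columns in their select lists. This has one major and at least one minor effect: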

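  • The major effect is to deduplicate rows, which is equivalent to SELECT DISTINCT.
  • The minor effect, which may be the real point, is that the filter conditions are applied via a HAVING clause, so that (logically) they are tested after the rows have been deduplicated.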

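However, if neither of the two base tables contains duplicate property_id values, then the grouping and distinct selection in those two queries is pointless: with property_id among the grouping columns, no group can have more than one row, and there will be no duplicate result rows in any case.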

That suggests dropping the GROUP BY clauses and converting the HAVING clauses to WHERE clauses.
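To make that concrete, here is a minimal sketch on a throwaway temp table. The table #ap and its rows are made up for illustration, and SQL Server style syntax with single-quoted string literals is assumed. Queries (a) and (b) should return the same single row, while (c) shows why the deduplication only matters when the source actually holds duplicates.

```sql
-- Hypothetical stand-in for active_property, with one deliberately duplicated row.
CREATE TABLE #ap (property_id INT, region_name VARCHAR(20), is_insured_ind CHAR(1));
INSERT INTO #ap VALUES (1, 'East', 'Y');
INSERT INTO #ap VALUES (1, 'East', 'Y');  -- exact duplicate of the row above
INSERT INTO #ap VALUES (2, 'OHP',  'Y');
INSERT INTO #ap VALUES (3, 'West', 'N');

-- (a) Original style: GROUP BY every selected column, filter in HAVING.
SELECT property_id, region_name, is_insured_ind
FROM #ap
GROUP BY property_id, region_name, is_insured_ind
HAVING region_name <> 'OHP' AND is_insured_ind = 'Y';

-- (b) Same rows: filter in WHERE, then deduplicate with DISTINCT.
SELECT DISTINCT property_id, region_name, is_insured_ind
FROM #ap
WHERE region_name <> 'OHP' AND is_insured_ind = 'Y';

-- (c) No deduplication at all: property 1 comes back twice here,
--     but only because the sample data contains a duplicate row.
SELECT property_id, region_name, is_insured_ind
FROM #ap
WHERE region_name <> 'OHP' AND is_insured_ind = 'Y';

DROP TABLE #ap;
```

The HAVING and WHERE forms agree here because every filtered column is also a grouping column; a HAVING condition on an aggregate could not be moved this way.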

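Third, all three queries select from roughly the same join of the same two tables. The second selects from just one table, but since there are no duplicates in the join column, we can make it effectively the same thing by converting the inner join to an outer join. If you can rely on every row in ap having a corresponding row in af1, then even that is unnecessary. Additionally, although you make a point of the queries selecting the property_id column from different tables, that is a technicality. Query 3 can be modified to select ap.property_id instead of af1.property_id without changing the result, because the two are guaranteed to be equal in every row of the joined row set.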
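The role of the outer join can be sketched on two small temp tables. The tables #ap and #af1 and their rows are hypothetical, and SQL Server style syntax is assumed. Property 2 has no financing row, which is the case the outer join is meant to cover.

```sql
-- Hypothetical stand-ins for active_property and active_financing.
CREATE TABLE #ap  (property_id INT, is_insured_ind CHAR(1));
CREATE TABLE #af1 (property_id INT, is_pipeline_ind CHAR(1));
INSERT INTO #ap  VALUES (1, 'Y');
INSERT INTO #ap  VALUES (2, 'N');  -- property with no financing row
INSERT INTO #af1 VALUES (1, 'N');

-- INNER JOIN drops property 2, so the "uninsured" filter finds no rows.
SELECT ap.property_id
FROM #ap ap
    INNER JOIN #af1 af1 ON af1.property_id = ap.property_id
WHERE ap.is_insured_ind = 'N';

-- LEFT OUTER JOIN keeps property 2 (its af1 columns are NULL), so it is returned.
SELECT ap.property_id
FROM #ap ap
    LEFT OUTER JOIN #af1 af1 ON af1.property_id = ap.property_id
WHERE ap.is_insured_ind = 'N';

-- A filter on an af1 column still excludes unmatched rows, because
-- NULL = 'N' is not true; only property 1 is returned.
SELECT ap.property_id
FROM #ap ap
    LEFT OUTER JOIN #af1 af1 ON af1.property_id = ap.property_id
WHERE af1.is_pipeline_ind = 'N';

DROP TABLE #ap;
DROP TABLE #af1;
```

The third query is why the conditions taken from Statements 1 and 3 keep behaving as they did under an inner join, even when the join itself is an outer one.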

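The three queries can therefore be combined into a single query without relying on a union. Each can be adjusted to an ordinary query with the same select list and the same source row set, so all that remains is to unify the filter conditions. Before doing anything with the filter conditions, that looks like this: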

-- can be made SELECT DISTINCT if duplicates are present after all:
SELECT ap.property_id

FROM rems_dmart.dbo.active_property ap
    LEFT OUTER JOIN rems_dmart.dbo.active_financing af1
        ON af1.property_id = ap.property_id

WHERE
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "Y"
    AND af1.is_pipeline_ind = "N" )
    OR
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "Y"
    AND af1.is_pipeline_ind = "Y"
    AND ap.has_active_assistance_ind = "Y" )
    OR
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "N"
    AND ap.has_active_assistance_ind = "Y" )
    OR 
    ( ap.region_name <> "OHP"
    AND ap.is_insured_ind = "N"
    AND ap.has_use_restriction_ind = "Y" )
    OR
    ( ap.region_name <> "OHP"
    AND DATEDIFF(DAY, af1.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
    AND af1.final_endorsement_date IS NULL 
    AND ap.is_under_management_ind = "N" )
    OR 
    ( ap.region_name <> "OHP"
    AND DATEDIFF(DAY, af1.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
    AND af1.final_endorsement_date IS NULL 
    AND ap.is_under_management_ind = "Y" )

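Again, as long as there are no duplicate property_id values in either table, no GROUP BY or DISTINCT selection is needed, but if you must accommodate duplicates after all, you can simply change the SELECT to SELECT DISTINCT.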

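The filter condition could be factored better, but in this form it closely mirrors the original queries, which may make it easier to verify or review. I would consider at least factoring the ap.region_name condition out of all the individual alternatives, but I leave that and any other refactoring you want to perform to you.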
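One possible factoring is sketched below. It is not part of the original answer, and it assumes the tables and columns from the question. String literals are written with single quotes, since double quotes are treated as identifier delimiters when QUOTED_IDENTIFIER is ON in SQL Server. The condition on is_under_management_ind in the last branch is kept as IN ('N', 'Y') rather than dropped, so that rows with a NULL or any other value are still excluded as in the original.

```sql
SELECT ap.property_id
FROM rems_dmart.dbo.active_property ap
    LEFT OUTER JOIN rems_dmart.dbo.active_financing af1
        ON af1.property_id = ap.property_id
WHERE ap.region_name <> 'OHP'   -- common to every alternative, so stated once
  AND (
        -- From Statement 1 (Insured)
        ( ap.is_under_management_ind = 'Y'
          AND ap.is_insured_ind = 'Y'
          AND ( af1.is_pipeline_ind = 'N'
                OR ( af1.is_pipeline_ind = 'Y'
                     AND ap.has_active_assistance_ind = 'Y' ) ) )
        OR
        -- From Statement 2 (Uninsured)
        ( ap.is_insured_ind = 'N'
          AND ( ( ap.is_under_management_ind = 'Y'
                  AND ap.has_active_assistance_ind = 'Y' )
                OR ap.has_use_restriction_ind = 'Y' ) )
        OR
        -- From Statement 3 (IE > 90 Days)
        ( DATEDIFF(DAY, af1.initial_endorsement_date,
                   CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
          AND af1.final_endorsement_date IS NULL
          AND ap.is_under_management_ind IN ('N', 'Y') )
      );
```

Unlike the UNION ALL in the Access query, a single query like this returns each joined row at most once, even when a property satisfies more than one of the three alternatives.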

Your code can be rewritten as

--1
SELECT
        AP.property_id
FROM rems_dmart.dbo.active_financing AS AF
INNER JOIN rems_dmart.dbo.active_property AS AP
    ON AF.property_id = AP.property_id
WHERE AP.region_name <> 'OHP' AND AP.is_under_management_ind = 'Y' AND AP.is_insured_ind = 'Y' AND AF.is_pipeline_ind = 'N'

UNION 

--2
SELECT
        property_id
FROM rems_dmart.dbo.active_property
WHERE region_name <> 'OHP' AND is_insured_ind = 'N' AND has_active_assistance_ind = 'Y'

UNION

--3
SELECT
        AF.property_id
FROM rems_dmart.dbo.active_property AS AP
INNER JOIN rems_dmart.dbo.active_financing AS AF
    ON AP.property_id = AF.property_id
WHERE AP.region_name <> 'OHP' AND DATEDIFF(DAY, AF.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90 AND AF.final_endorsement_date IS NULL AND AP.is_under_management_ind IN ('N','Y')
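Note that this rewrite uses UNION where the Access query in the question uses UNION ALL, and the two differ in how they treat duplicates. A minimal sketch of the difference, using hypothetical temp tables #insured and #ie_90days and assuming SQL Server style syntax:

```sql
-- Hypothetical result sets of two of the statements; property 2 is in both.
CREATE TABLE #insured   (property_id INT);
CREATE TABLE #ie_90days (property_id INT);
INSERT INTO #insured   VALUES (1);
INSERT INTO #insured   VALUES (2);
INSERT INTO #ie_90days VALUES (2);
INSERT INTO #ie_90days VALUES (3);

-- UNION ALL keeps every row: 4 rows, with property 2 appearing twice.
SELECT property_id FROM #insured
UNION ALL
SELECT property_id FROM #ie_90days;

-- UNION removes duplicates across the branches: 3 rows (1, 2 and 3).
SELECT property_id FROM #insured
UNION
SELECT property_id FROM #ie_90days;

DROP TABLE #insured;
DROP TABLE #ie_90days;
```

If a property that qualifies under more than one statement should be listed once per statement, as in the Access version, UNION ALL is the operator to keep.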

Or, taking it even further:

SELECT
        AP.property_id
FROM rems_dmart.dbo.active_financing AS AF
INNER JOIN rems_dmart.dbo.active_property AS AP
    ON AF.property_id = AP.property_id
WHERE AP.region_name <> 'OHP' AND
(
    (AP.is_under_management_ind = 'Y' AND AP.is_insured_ind = 'Y' AND AF.is_pipeline_ind = 'N') 
    OR 
    (DATEDIFF(DAY, AF.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90 AND AF.final_endorsement_date IS NULL AND AP.is_under_management_ind IN ('N','Y'))
)

UNION 

SELECT
        property_id
FROM rems_dmart.dbo.active_property
WHERE region_name <> 'OHP' AND is_insured_ind = 'N' AND has_active_assistance_ind = 'Y'

@john-bollinger made a lot of good points about your queries, so please try his answer as well. You will also need good indexes on both tables for the query to run well.
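As a rough sketch of what that could mean here, the statements below index the join column on each table. The index names are hypothetical, SQL Server syntax is assumed, and whether these indexes help (or already exist, for example as primary keys) depends on the actual schema and on having permission to alter the data mart.

```sql
-- Hypothetical index names; check the existing indexes before creating new ones.
CREATE INDEX IX_active_financing_property_id
    ON rems_dmart.dbo.active_financing (property_id);

CREATE INDEX IX_active_property_property_id
    ON rems_dmart.dbo.active_property (property_id);
```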

Here is an example of how to use the UNION ALL operator and the GROUP BY clause to combine three TSQL statements into one query:

WITH CTE AS (
    SELECT column1, column2, SUM(column3) AS Total_Column3
    FROM table1
    WHERE column4 = 'value1'
    GROUP BY column1, column2

    UNION ALL

    SELECT column1, column2, SUM(column3) AS Total_Column3
    FROM table2
    WHERE column5 = 'value2'
    GROUP BY column1, column2

    UNION ALL

    SELECT column1, column2, SUM(column3) AS Total_Column3
    FROM table3
    WHERE column6 = 'value3'
    GROUP BY column1, column2
)
SELECT column1, column2, SUM(Total_Column3) AS Grand_Total
FROM CTE
GROUP BY column1, column2

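This query builds a common table expression (CTE) from the results of three SELECT statements, each of which aggregates data from a different table and filters it on its own condition. The UNION ALL operator merges the results of those SELECT statements into a single result set. Finally, the outer query uses a GROUP BY clause to sum Total_Column3 by column1 and column2.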

