Use UNION ALL and GROUP BY to combine 3 TSQL statements into one query

I have 3 TSQL queries that work individually, but I need to combine them into a single query. In Microsoft Access, they were successfully combined using the following SQL statement; however, I am trying to reverse engineer all the data using TSQL. How can I do the same thing in TSQL? Keep in mind that ap.property_id and af1.property_id are from different tables.

SELECT property_id
FROM [Insured (see TSQL Statement 1)] 
GROUP BY property_id
UNION ALL
SELECT property_id
FROM [Uninsured (see TSQL Statement 2)] 
GROUP BY property_id
UNION ALL
SELECT property_id
FROM [IE > 90 Days (see TSQL Statement 3)] 
GROUP BY property_id;

TSQL Statement 1 (Insured)

SELECT
    rtrim(STR_REPLACE(ap.region_name,' Region','')) AS 'Region',
    ap.property_id,
    ap.is_under_management_ind,
    ap.is_insured_ind,
    ap.has_active_assistance_ind,
    af1.is_pipeline_ind,

    CASE
        WHEN ap.is_insured_ind = "Y" THEN "Insured"
        ELSE "Other"
    END AS 'Classification 1',

    CASE
        WHEN ap.is_insured_ind = "Y" AND ap.is_under_management_ind = "Y" AND af1.is_pipeline_ind = "N" THEN "Insured Only"
        WHEN ap.is_insured_ind = "Y" AND ap.is_under_management_ind = "Y" AND af1.is_pipeline_ind = "Y" AND ap.has_active_assistance_ind = "Y" THEN "Insured and Assisted"
        ELSE "Other Insured"
    END AS 'Classification 2'

FROM rems_dmart.dbo.active_financing af1
    INNER JOIN rems_dmart.dbo.active_property ap
    ON af1.property_id = ap.property_id

GROUP BY
    ap.region_name,
    ap.property_id,
    ap.is_under_management_ind,
    ap.is_insured_ind,
    ap.has_active_assistance_ind,
    af1.is_pipeline_ind

HAVING
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "Y"
    AND af1.is_pipeline_ind = "N" )
    OR
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "Y"
    AND af1.is_pipeline_ind = "Y"
    AND ap.has_active_assistance_ind = "Y" )

TSQL Statement 2 (Uninsured)

SELECT
    rtrim(STR_REPLACE(ap.region_name,' Region','')) AS 'Region',
    ap.property_id,
    ap.is_under_management_ind,
    ap.is_insured_ind,
    ap.has_use_restriction_ind,
    ap.has_active_irp_ind,
    ap.has_active_assistance_ind,
    ap.is_service_coordinator_ind,

    CASE
        WHEN ap.is_insured_ind = "N" THEN "Uninsured"
        ELSE "Other"
    END AS 'Classification 1',

    CASE
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "Y" THEN "Assisted Only"
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "N" AND ap.has_use_restriction_ind = "Y" THEN "Use Agreement Only"
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "N" AND ap.has_use_restriction_ind = "N" AND ap.has_active_irp_ind = "Y" AND ap.is_service_coordinator_ind = "N" THEN "IRP"
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "N" AND ap.has_use_restriction_ind = "N" AND ap.has_active_irp_ind = "N" AND ap.is_service_coordinator_ind = "Y" THEN "Service Coordinator"
        WHEN ap.is_insured_ind = "N" AND ap.has_active_assistance_ind = "N" AND ap.has_use_restriction_ind = "N" AND ap.has_active_irp_ind = "Y" AND ap.is_service_coordinator_ind = "Y" THEN "IRP & Service Coordinator"
        ELSE "Other Uninsured"
    END AS 'Classification 2'

FROM rems_dmart.dbo.active_property ap

GROUP BY
    ap.region_name,
    ap.property_id,
    ap.is_under_management_ind,
    ap.is_insured_ind,
    ap.has_use_restriction_ind,
    ap.has_active_irp_ind,
    ap.has_active_assistance_ind,
    ap.is_service_coordinator_ind

HAVING
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "N"
    AND ap.has_active_assistance_ind = "Y" )
    OR 
    ( ap.region_name <> "OHP"
    AND ap.is_insured_ind = "N"
    AND ap.has_use_restriction_ind = "Y" )

TSQL Statement 3 (IE > 90 Days)

SELECT
    rtrim(STR_REPLACE(ap.region_name,' Region','')) AS 'Region',
    af1.property_id,
    af1.initial_endorsement_date,
    af1.final_endorsement_date,
    ap.is_under_management_ind,

    CASE
        WHEN af1.initial_endorsement_date IS NOT NULL AND af1.final_endorsement_date IS NULL THEN "IE > 90 Days"
        ELSE "Other"
    END AS 'Classification 1',

    CASE
        WHEN af1.initial_endorsement_date IS NOT NULL AND af1.final_endorsement_date IS NULL AND ap.is_under_management_ind = "Y" THEN "IE > 90 Days_Under Mgmt"
        WHEN af1.initial_endorsement_date IS NOT NULL AND af1.final_endorsement_date IS NULL AND ap.is_under_management_ind = "N" THEN "IE > 90 Days_Not Under Mgmt"
        ELSE "Other IE > 90 Days"
    END AS 'Classification 2'

FROM rems_dmart.dbo.active_property ap
    INNER JOIN rems_dmart.dbo.active_financing af1
        ON ap.property_id = af1.property_id

WHERE
    ( ap.region_name <> "OHP"
    AND DATEDIFF(DAY, af1.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
    AND af1.final_endorsement_date IS NULL 
    AND ap.is_under_management_ind = "N" )
    OR 
    ( ap.region_name <> "OHP"
    AND DATEDIFF(DAY, af1.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
    AND af1.final_endorsement_date IS NULL 
    AND ap.is_under_management_ind = "Y" )

In the first place, the final results you want are only the property_id values. A natural first step, then, would be to remove everything else from the select lists of (copies of) the TSQL queries. That already makes them much simpler.

In the second place, the first two original TSQL queries use GROUP BY but no aggregate functions. They do include all the grouping columns in their select lists. This has one primary and at least one secondary effect:

  • the primary effect is de-duplication of the rows, equivalent to a SELECT DISTINCT
  • the secondary effect may be the real point: the filter conditions are applied via a HAVING clause, such that (logically) they are tested after row de-duplication.
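Both effects can be checked on a toy table. The sketch below uses Python's built-in sqlite3 module only so that it is self-contained; the table name props and its rows are invented for illustration, and SQLite stands in for SQL Server here, so read it as a demonstration of general SQL behaviour and not of the asker's database.

```python
import sqlite3

# Hypothetical toy data: property 1 appears twice to simulate duplicate rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE props (property_id INTEGER, is_insured_ind TEXT);
    INSERT INTO props VALUES (1, 'Y'), (1, 'Y'), (2, 'N'), (3, 'Y');
""")

def ids(sql):
    """Run a query and return the property_id values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))

# GROUP BY with no aggregates, filtered through HAVING (the shape of Statements 1 and 2).
grouped = ids("""
    SELECT property_id FROM props
    GROUP BY property_id, is_insured_ind
    HAVING is_insured_ind = 'Y'
""")

# The same filter as a WHERE clause, with DISTINCT doing the de-duplication.
distinct = ids("""
    SELECT DISTINCT property_id FROM props
    WHERE is_insured_ind = 'Y'
""")

# Without GROUP BY or DISTINCT the duplicate row survives.
plain = ids("SELECT property_id FROM props WHERE is_insured_ind = 'Y'")

print(grouped, distinct, plain)
```

With this data the grouped and distinct forms return the same ids, while the plain form keeps the repeated row. The HAVING filter here refers only to grouping columns, which is why it can be moved to WHERE unchanged.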

However, if there are no duplicate property_id values in either of the two base tables, then grouping and distinct selection in these two queries are both moot: including one of the property_id columns among the grouping columns means that no group can contain more than one row, and there would be no duplicate result rows in any case.

This suggests removing the GROUP BY clauses and converting the HAVING clauses to WHERE clauses.

In the third place, the three queries all select from approximately the same join of the same two tables. The second selects from just one of the tables, but since there are no duplicates in the join columns, we can make it effectively the same thing by converting the inner join to an outer one. Even that would be unnecessary if you could rely on every row in ap having a corresponding row in af1. Additionally, although you note that the queries select property_id columns from different tables, that's a technicality. Query 3 could be modified to select ap.property_id instead of af1.property_id without changing the results, because these are guaranteed to be equal in every row of the rowset.
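A small sketch of why the outer join matters for the single-table query. The table and column names are borrowed from the question, but the rows are invented, the schema is reduced to two columns per table, and SQLite (via Python's sqlite3) stands in for SQL Server, so this only illustrates the join semantics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE active_property (property_id INTEGER PRIMARY KEY, is_insured_ind TEXT);
    CREATE TABLE active_financing (property_id INTEGER PRIMARY KEY, is_pipeline_ind TEXT);
    INSERT INTO active_property VALUES (1, 'Y'), (2, 'N'), (3, 'N');
    -- Assumption for the demo: property 3 has no financing row.
    INSERT INTO active_financing VALUES (1, 'N'), (2, 'Y');
""")

# A condition on ap only, in the spirit of Statement 2.
FILTER = "ap.is_insured_ind = 'N'"

def ids(join_kind):
    """Return matching property ids using the given kind of join."""
    sql = f"""
        SELECT ap.property_id
        FROM active_property ap
        {join_kind} JOIN active_financing af1
            ON af1.property_id = ap.property_id
        WHERE {FILTER}
        ORDER BY ap.property_id
    """
    return [row[0] for row in conn.execute(sql)]

# Baseline: the single-table form of the query.
single_table = [row[0] for row in conn.execute(
    f"SELECT ap.property_id FROM active_property ap WHERE {FILTER} ORDER BY ap.property_id"
)]
inner = ids("INNER")
outer = ids("LEFT OUTER")

print(single_table, inner, outer)
```

The left outer join reproduces the single-table result, whereas the inner join silently drops property 3 because it has no matching financing row. If every property is known to have a financing row, the two joins agree.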

Thus, these three queries can usefully be combined into a single one without relying on a union. They can be adjusted to be ordinary queries with the same select list and source row set, so all that remains is unifying the filter conditions. Before any work on the filter condition, the result is this:

-- can be made SELECT DISTINCT if duplicates are present after all:
SELECT ap.property_id

FROM rems_dmart.dbo.active_property ap
    LEFT OUTER JOIN rems_dmart.dbo.active_financing af1
        ON af1.property_id = ap.property_id

WHERE
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "Y"
    AND af1.is_pipeline_ind = "N" )
    OR
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "Y"
    AND af1.is_pipeline_ind = "Y"
    AND ap.has_active_assistance_ind = "Y" )
    OR
    ( ap.region_name <> "OHP"
    AND ap.is_under_management_ind = "Y"
    AND ap.is_insured_ind = "N"
    AND ap.has_active_assistance_ind = "Y" )
    OR 
    ( ap.region_name <> "OHP"
    AND ap.is_insured_ind = "N"
    AND ap.has_use_restriction_ind = "Y" )
    OR
    ( ap.region_name <> "OHP"
    AND DATEDIFF(DAY, af1.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
    AND af1.final_endorsement_date IS NULL 
    AND ap.is_under_management_ind = "N" )
    OR 
    ( ap.region_name <> "OHP"
    AND DATEDIFF(DAY, af1.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90
    AND af1.final_endorsement_date IS NULL 
    AND ap.is_under_management_ind = "Y" )

Again, as long as there are no duplicate property_id values in either table, there is no need for a GROUP BY or a DISTINCT selection, but if you have to accommodate duplicates after all then you can just change the SELECT to SELECT DISTINCT.

That filter condition could be better factored, but at this point it's highly reflective of the original queries. That may make it easier to validate or check. I would consider at least lifting the condition on ap.region_name out of all the individual alternatives, but I leave that and any other refactoring you want to perform to you.
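As a sketch of the kind of factoring meant here, the example below lifts a shared region_name test out of two OR branches and checks that both forms select the same rows. The table props, its rows, and the reduced set of columns are assumptions made for the demo, and SQLite (through Python's sqlite3) stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE props (
        property_id INTEGER,
        region_name TEXT,
        is_insured_ind TEXT,
        has_active_assistance_ind TEXT,
        has_use_restriction_ind TEXT
    );
    INSERT INTO props VALUES
        (1, 'Boston Region', 'N', 'Y', 'N'),
        (2, 'OHP',           'N', 'Y', 'Y'),
        (3, 'Denver Region', 'N', 'N', 'Y'),
        (4, 'Denver Region', 'Y', 'Y', 'Y'),
        (5, NULL,            'N', 'Y', 'N'),
        (6, 'Boston Region', 'N', 'N', 'N');
""")

def ids(where):
    """Return the property ids that satisfy the given WHERE condition."""
    sql = f"SELECT property_id FROM props WHERE {where} ORDER BY property_id"
    return [row[0] for row in conn.execute(sql)]

# Each alternative repeats the shared conditions, as in the combined query.
unfactored = ids("""
    ( region_name <> 'OHP' AND is_insured_ind = 'N' AND has_active_assistance_ind = 'Y' )
    OR
    ( region_name <> 'OHP' AND is_insured_ind = 'N' AND has_use_restriction_ind = 'Y' )
""")

# The shared conditions are tested once; only the differing parts stay in the OR.
factored = ids("""
    region_name <> 'OHP'
    AND is_insured_ind = 'N'
    AND ( has_active_assistance_ind = 'Y' OR has_use_restriction_ind = 'Y' )
""")

print(unfactored, factored)
```

Both forms pick the same rows, including for the row with a NULL region_name, which neither form selects. The same rewrite applies to the longer condition, though keeping the unfactored form around makes it easier to compare against the original three statements.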

Your code can be translated into

--1
SELECT
        AP.property_id
FROM rems_dmart.dbo.active_financing AS AF
INNER JOIN rems_dmart.dbo.active_property AS AP
    ON AF.property_id = AP.property_id
WHERE AP.region_name <> 'OHP' AND AP.is_under_management_ind = 'Y' AND AP.is_insured_ind = 'Y' AND AF.is_pipeline_ind = 'N'

UNION 

--2
SELECT
        property_id
FROM rems_dmart.dbo.active_property
WHERE region_name <> 'OHP' AND is_insured_ind = 'N' AND has_active_assistance_ind = 'Y'

UNION

--3
SELECT
        AF.property_id
FROM rems_dmart.dbo.active_property AS AP
INNER JOIN rems_dmart.dbo.active_financing AS AF
    ON AP.property_id = AF.property_id
WHERE AP.region_name <> 'OHP' AND DATEDIFF(DAY, AF.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90 AND AF.final_endorsement_date IS NULL AND AP.is_under_management_ind IN ('N','Y')

or even further to

SELECT
        AP.property_id
FROM rems_dmart.dbo.active_financing AS AF
INNER JOIN rems_dmart.dbo.active_property AS AP
    ON AF.property_id = AP.property_id
WHERE AP.region_name <> 'OHP' AND
(
    (AP.is_under_management_ind = 'Y' AND AP.is_insured_ind = 'Y' AND AF.is_pipeline_ind = 'N') 
    OR 
    (DATEDIFF(DAY, AF.initial_endorsement_date, CONVERT(VARCHAR(20), GETDATE(), 101)) > 90 AND AF.final_endorsement_date IS NULL AND AP.is_under_management_ind IN ('N','Y'))
)

UNION 

SELECT
        property_id
FROM rems_dmart.dbo.active_property
WHERE region_name <> 'OHP' AND is_insured_ind = 'N' AND has_active_assistance_ind = 'Y'

@john-bollinger made many good points about your query, so try to use his answer as well. You will also need good indexes on the two tables for your query to run smoothly.
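One difference worth keeping in mind: the Access query in the question uses UNION ALL, while the queries above use UNION. The sketch below shows the effect on a property that qualifies under two classifications. The tables insured and ie_over_90 and their rows are hypothetical stand-ins for the results of the individual statements, and SQLite (via Python's sqlite3) is used only to keep the example runnable on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE insured (property_id INTEGER);
    CREATE TABLE ie_over_90 (property_id INTEGER);
    INSERT INTO insured VALUES (1), (2);
    -- Assumption for the demo: property 2 qualifies under both classifications.
    INSERT INTO ie_over_90 VALUES (2), (3);
""")

def ids(sql):
    """Run a query and return the property_id values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))

# UNION ALL concatenates the branches and keeps repeats.
union_all = ids("""
    SELECT property_id FROM insured
    UNION ALL
    SELECT property_id FROM ie_over_90
""")

# UNION removes duplicate rows from the combined result.
union = ids("""
    SELECT property_id FROM insured
    UNION
    SELECT property_id FROM ie_over_90
""")

print(union_all, union)
```

UNION ALL lists property 2 once per branch it appears in, and UNION lists it once. Which behaviour is wanted depends on whether the result is meant to be a list of distinct properties or a count per classification.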

Here is an example of how you could combine three TSQL statements into one query using the UNION ALL operator and GROUP BY clause:

WITH CTE AS (
    SELECT column1, column2, SUM(column3) AS Total_Column3
    FROM table1
    WHERE column4 = 'value1'
    GROUP BY column1, column2

    UNION ALL

    SELECT column1, column2, SUM(column3) AS Total_Column3
    FROM table2
    WHERE column5 = 'value2'
    GROUP BY column1, column2

    UNION ALL

    SELECT column1, column2, SUM(column3) AS Total_Column3
    FROM table3
    WHERE column6 = 'value3'
    GROUP BY column1, column2
)
SELECT column1, column2, SUM(Total_Column3) AS Grand_Total
FROM CTE
GROUP BY column1, column2

This query creates a common table expression (CTE) from the results of three SELECT statements, each of which aggregates data from a different table and filters it based on a specific condition. The UNION ALL operator combines the results of these SELECT statements into a single result set. Finally, the outer query uses the GROUP BY clause to sum Total_Column3 into a Grand_Total for each combination of column1 and column2.
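To see the pattern produce numbers, here is a reduced version run against throwaway data. The tables keep the generic names from the answer, but the rows are invented, the WHERE filters and column2 are left out for brevity, and SQLite (via Python's sqlite3) stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column1 TEXT, column3 INTEGER);
    CREATE TABLE table2 (column1 TEXT, column3 INTEGER);
    CREATE TABLE table3 (column1 TEXT, column3 INTEGER);
    INSERT INTO table1 VALUES ('a', 1), ('a', 2), ('b', 5);
    INSERT INTO table2 VALUES ('a', 10);
    INSERT INTO table3 VALUES ('b', 20), ('c', 7);
""")

# Each branch pre-aggregates its own table; the outer query sums the subtotals.
rows = conn.execute("""
    WITH CTE AS (
        SELECT column1, SUM(column3) AS Total_Column3 FROM table1 GROUP BY column1
        UNION ALL
        SELECT column1, SUM(column3) AS Total_Column3 FROM table2 GROUP BY column1
        UNION ALL
        SELECT column1, SUM(column3) AS Total_Column3 FROM table3 GROUP BY column1
    )
    SELECT column1, SUM(Total_Column3) AS Grand_Total
    FROM CTE
    GROUP BY column1
    ORDER BY column1
""").fetchall()

for column1, grand_total in rows:
    print(column1, grand_total)
```

UNION ALL matters in this pattern: with plain UNION, two branches that happened to produce the same subtotal for the same key would collapse into one row and the grand total would come out too low.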

 