How to join the results of multiple tables based on one column (via UNION ALL)

Well, the SQL statement that I have written works fine, but I would like to make it less bulky and specify 'ACCEPTANCE_DATE' only once, since it is the same in all tables.

I am trying to combine the results of multiple tables using UNION ALL. The example below works perfectly fine.

SEL COUNT(*)FROM
MY_DATABASE.HUMAN_RESOURCES
WHERE ACCEPTANCE_DATE='2015-08-09'
UNION ALL
SEL COUNT(*)FROM
MY_DATABASE.FINANCIAL_RESOURCES
WHERE ACCEPTANCE_DATE='2015-08-09'
UNION ALL
SEL COUNT(*)FROM
MY_DATABASE.INFRASTRUCTURE_RESOURCES
WHERE ACCEPTANCE_DATE='2015-08-09';

All the tables have the same column types, e.g. each table has a column called 'ACCEPTANCE_DATE'. The result I get is correct. However, I am combining a lot of tables in one query (using UNION ALL), and I am wondering whether there is a way to transform this query so that I do not have to update ACCEPTANCE_DATE='2015-08-09' in each SELECT statement. Ideally, I would like to define it just once, especially when I use over 30 UNION ALL clauses, e.g.:

SEL * FROM
    (SEL COUNT(*)FROM
    MY_DATABASE.HUMAN_RESOURCES
    UNION ALL
    SEL COUNT(*)FROM
    MY_DATABASE.FINANCIAL_RESOURCES
    UNION ALL
    SEL COUNT(*)FROM
    MY_DATABASE.INFRASTRUCTURE_RESOURCES) AS T1
WHERE ACCEPTANCE_DATE='2015-08-09'; 

Maybe this is what you are looking for:

SEL COUNT(*) FROM
    (SEL ACCEPTANCE_DATE, 1 AS ORIGIN FROM
    MY_DATABASE.HUMAN_RESOURCES
    UNION ALL
    SEL ACCEPTANCE_DATE, 2 AS ORIGIN FROM
    MY_DATABASE.FINANCIAL_RESOURCES
    UNION ALL
    SEL ACCEPTANCE_DATE, 3 AS ORIGIN FROM
    MY_DATABASE.INFRASTRUCTURE_RESOURCES) AS T1
WHERE ACCEPTANCE_DATE='2015-08-09'
GROUP BY ORIGIN;

You could even give ORIGIN more meaningful values and display them in the output:

SEL ORIGIN, COUNT(*) FROM
    (SEL ACCEPTANCE_DATE, 'HUMAN' AS ORIGIN FROM
    MY_DATABASE.HUMAN_RESOURCES
    UNION ALL
    SEL ACCEPTANCE_DATE, 'FINANCIAL' AS ORIGIN FROM
    MY_DATABASE.FINANCIAL_RESOURCES
    UNION ALL
    SEL ACCEPTANCE_DATE, 'INFRASTRUCTURE' AS ORIGIN FROM
    MY_DATABASE.INFRASTRUCTURE_RESOURCES) AS T1
WHERE ACCEPTANCE_DATE='2015-08-09'
GROUP BY ORIGIN;

This results in two columns. It still doesn't solve the problem of the third value not being displayed when its table has no matching rows, but this way you know which values are missing and can easily tell which ones should be zero. If this is not sufficient for you, then the code gets nastier. I might think of a solution later.
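To make the missing-row behaviour concrete, here is a minimal sketch of the grouped UNION ALL query run against made-up data. It assumes Python's built-in sqlite3 module as a stand-in for Teradata (so SELECT is spelled out and the MY_DATABASE prefix is dropped), and the sample rows are purely hypothetical:

```python
import sqlite3

# In-memory SQLite database standing in for MY_DATABASE (assumption for illustration).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical sample rows: INFRASTRUCTURE_RESOURCES has nothing on the target date.
sample = {
    "HUMAN_RESOURCES": ["2015-08-09", "2015-08-09", "2015-08-10"],
    "FINANCIAL_RESOURCES": ["2015-08-09"],
    "INFRASTRUCTURE_RESOURCES": ["2015-08-10"],
}
for table, dates in sample.items():
    cur.execute(f"CREATE TABLE {table} (ACCEPTANCE_DATE TEXT)")
    cur.executemany(f"INSERT INTO {table} VALUES (?)", [(d,) for d in dates])

# The date appears exactly once, as a bound parameter on the outer query.
query = """
SELECT ORIGIN, COUNT(*) FROM
    (SELECT ACCEPTANCE_DATE, 'HUMAN' AS ORIGIN FROM HUMAN_RESOURCES
     UNION ALL
     SELECT ACCEPTANCE_DATE, 'FINANCIAL' AS ORIGIN FROM FINANCIAL_RESOURCES
     UNION ALL
     SELECT ACCEPTANCE_DATE, 'INFRASTRUCTURE' AS ORIGIN FROM INFRASTRUCTURE_RESOURCES) AS T1
WHERE ACCEPTANCE_DATE = ?
GROUP BY ORIGIN
ORDER BY ORIGIN
"""
rows = cur.execute(query, ("2015-08-09",)).fetchall()
print(rows)  # INFRASTRUCTURE is absent because no group is formed for it
```

With this data only two rows come back, which is exactly the gap described above: an origin with no matching rows produces no group at all rather than a zero.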


To address the problem of the missing entry for tables that return no rows, I thought of two possible solutions. The choice depends on whether this is a one-time operation or a recurring one. If you plan to run it multiple times, it might be a good idea to create a table in the database holding the names of all the source tables (or some shorthand for them, you'll get the idea). For this minimal example, let's assume such a table exists under the name SOURCE_TABLES:

SELECT RESOURCE FROM SOURCE_TABLES
/*
    RESOURCE:
    HUMAN
    FINANCIAL
    INFRASTRUCTURE
*/

In this case the previously provided script needs just a little modification:

SEL ST.RESOURCE, COUNT(T1.ACCEPTANCE_DATE) FROM SOURCE_TABLES ST
    LEFT JOIN (SEL ACCEPTANCE_DATE, 'HUMAN' AS ORIGIN FROM
    MY_DATABASE.HUMAN_RESOURCES
    UNION ALL
    SEL ACCEPTANCE_DATE, 'FINANCIAL' AS ORIGIN FROM
    MY_DATABASE.FINANCIAL_RESOURCES
    UNION ALL
    SEL ACCEPTANCE_DATE, 'INFRASTRUCTURE' AS ORIGIN FROM
    MY_DATABASE.INFRASTRUCTURE_RESOURCES) AS T1
    ON ST.RESOURCE = T1.ORIGIN
    AND T1.ACCEPTANCE_DATE='2015-08-09' -- filter in ON, not WHERE, so resources with no matching rows still appear with a count of 0
GROUP BY ST.RESOURCE;

Here, using LEFT JOIN ensures that every entry from SOURCE_TABLES is present in the output, even if T1 contains no rows with the specified origin. COUNT(T1.ACCEPTANCE_DATE) relies on the fact that NULLs are not counted.
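One detail is easy to get wrong with this pattern: where the date filter is placed. The sketch below, again using Python's sqlite3 as a stand-in for Teradata and a hypothetical pre-merged table T1 with made-up rows, compares filtering in the join condition with filtering in a WHERE clause:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE SOURCE_TABLES (RESOURCE TEXT);
INSERT INTO SOURCE_TABLES VALUES ('HUMAN'), ('FINANCIAL'), ('INFRASTRUCTURE');

-- Hypothetical stand-in for the merged UNION ALL result.
CREATE TABLE T1 (ACCEPTANCE_DATE TEXT, ORIGIN TEXT);
INSERT INTO T1 VALUES
    ('2015-08-09', 'HUMAN'),
    ('2015-08-09', 'HUMAN'),
    ('2015-08-10', 'FINANCIAL');
""")

# Date filter inside the join condition: unmatched resources survive with NULLs,
# and COUNT(column) ignores NULLs, so they are reported as 0.
filter_in_on = cur.execute("""
SELECT ST.RESOURCE, COUNT(T1.ACCEPTANCE_DATE)
FROM SOURCE_TABLES ST
LEFT JOIN T1 ON ST.RESOURCE = T1.ORIGIN AND T1.ACCEPTANCE_DATE = ?
GROUP BY ST.RESOURCE
ORDER BY ST.RESOURCE
""", ("2015-08-09",)).fetchall()

# Date filter in WHERE: the NULL-extended rows are discarded after the join,
# so the query behaves like an inner join again.
filter_in_where = cur.execute("""
SELECT ST.RESOURCE, COUNT(T1.ACCEPTANCE_DATE)
FROM SOURCE_TABLES ST
LEFT JOIN T1 ON ST.RESOURCE = T1.ORIGIN
WHERE T1.ACCEPTANCE_DATE = ?
GROUP BY ST.RESOURCE
ORDER BY ST.RESOURCE
""", ("2015-08-09",)).fetchall()

print(filter_in_on)
print(filter_in_where)
```

Only the first variant lists every resource. The same join semantics apply in Teradata and SQL Server, although the exact syntax there differs from this SQLite sketch.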

Now, if for any reason you do not like the idea of creating the table (you can't create objects in the database, or it is too much hassle for a one-off), you could stick with numbers, which are easier to generate on the fly. The solution below uses the same idea as above, but is more flexible in terms of the number of tables it reads from and does not require you to create an additional table. Considering you mentioned 30 tables, this could be a better option. One can argue that it is less readable, though:

WITH numbers AS (
SEL 1 AS number
UNION ALL
SEL number + 1 FROM numbers WHERE number + 1 <= 3 -- Change 3 to the number of sourcing tables
), input_merged AS ( -- if we already use the WITH clause we can do so for merging input. It's more readable
SEL ACCEPTANCE_DATE, 1 AS ORIGIN FROM MY_DATABASE.HUMAN_RESOURCES
    UNION ALL
SEL ACCEPTANCE_DATE, 2 AS ORIGIN FROM MY_DATABASE.FINANCIAL_RESOURCES
    UNION ALL
SEL ACCEPTANCE_DATE, 3 AS ORIGIN FROM MY_DATABASE.INFRASTRUCTURE_RESOURCES
-- add further sources accordingly...
)
SEL COUNT(im.ACCEPTANCE_DATE) FROM numbers n
    LEFT JOIN input_merged im ON n.number = im.ORIGIN
    AND im.ACCEPTANCE_DATE='2015-08-09' -- filter in ON, not WHERE, so sources with no matching rows still yield 0
GROUP BY n.number;

This should produce the output originally asked for.
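As a rough check of the numbers idea, the sketch below runs an equivalent query in SQLite through Python's sqlite3 module. SQLite is only a stand-in here: it requires the WITH RECURSIVE spelling, the column is named n instead of number, and the table contents are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE HUMAN_RESOURCES (ACCEPTANCE_DATE TEXT);
CREATE TABLE FINANCIAL_RESOURCES (ACCEPTANCE_DATE TEXT);
CREATE TABLE INFRASTRUCTURE_RESOURCES (ACCEPTANCE_DATE TEXT);
-- Hypothetical rows: only HUMAN_RESOURCES has entries on the target date.
INSERT INTO HUMAN_RESOURCES VALUES ('2015-08-09'), ('2015-08-09');
INSERT INTO FINANCIAL_RESOURCES VALUES ('2015-08-10');
""")

query = """
WITH RECURSIVE numbers(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n + 1 <= 3  -- 3 = number of source tables
), input_merged AS (
    SELECT ACCEPTANCE_DATE, 1 AS ORIGIN FROM HUMAN_RESOURCES
    UNION ALL
    SELECT ACCEPTANCE_DATE, 2 AS ORIGIN FROM FINANCIAL_RESOURCES
    UNION ALL
    SELECT ACCEPTANCE_DATE, 3 AS ORIGIN FROM INFRASTRUCTURE_RESOURCES
)
SELECT COUNT(im.ACCEPTANCE_DATE)
FROM numbers
LEFT JOIN input_merged im
    ON numbers.n = im.ORIGIN AND im.ACCEPTANCE_DATE = ?
GROUP BY numbers.n
ORDER BY numbers.n
"""
counts = [row[0] for row in cur.execute(query, ("2015-08-09",))]
print(counts)  # one count per source table, in ORIGIN order
```

The explicit ORDER BY is an addition of this sketch: without it, nothing guarantees that the counts come back in the same order as the source tables.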

As for the numbers part of the WITH statement, you might want to refer to this. Note that in this solution I also used WITH to merge the input, as Christoph did. If you use an Oracle database, CONNECT BY LEVEL could be a better way to generate a sequence of numbers.

Hopefully you can now achieve what you wanted!

You can either use a macro, as @ravioli suggested:

REPLACE MACRO my_counts(inDate DATE) AS
 (
   SELECT 'HUMAN_RESOURCES' AS tab, Count(*)
   FROM MY_DATABASE.HUMAN_RESOURCES
   WHERE ACCEPTANCE_DATE=:inDate
   UNION ALL
   SELECT 'FINANCIAL_RESOURCES', Count(*)
   FROM MY_DATABASE.FINANCIAL_RESOURCES
   WHERE ACCEPTANCE_DATE=:inDate
   UNION ALL
   SELECT 'INFRASTRUCTURE_RESOURCES', Count(*)
   FROM MY_DATABASE.INFRASTRUCTURE_RESOURCES
   WHERE ACCEPTANCE_DATE=:inDate;
 );

EXEC my_counts(DATE '2015-08-09');

Create this macro either in a database where you have Create Macro rights or within your own user (but then nobody except you can use it).

Or you can use a Common Table Expression to define the date:

WITH cte AS 
 (
   SELECT DATE '2015-08-09' AS ACCEPTANCE_DATE
 )
SELECT 'HUMAN_RESOURCES' AS tab, Count(*)
FROM MY_DATABASE.HUMAN_RESOURCES
WHERE ACCEPTANCE_DATE=(SELECT ACCEPTANCE_DATE FROM cte)
UNION ALL
SELECT 'FINANCIAL_RESOURCES', Count(*)
FROM MY_DATABASE.FINANCIAL_RESOURCES
WHERE ACCEPTANCE_DATE=(SELECT ACCEPTANCE_DATE FROM cte)
UNION ALL
SELECT 'INFRASTRUCTURE_RESOURCES', Count(*)
FROM MY_DATABASE.INFRASTRUCTURE_RESOURCES
WHERE ACCEPTANCE_DATE=(SELECT ACCEPTANCE_DATE FROM cte);

The (SELECT ACCEPTANCE_DATE FROM cte) will be executed once and then passed as a parameter to each SELECT.

Not sure if this is standard SQL or just Microsoft syntax, but in Microsoft SQL Server you could do it like this:

WITH PreSelect AS (
  SELECT ACCEPTANCE_DATE FROM HUMAN_RESOURCES
  UNION ALL 
  SELECT ACCEPTANCE_DATE FROM FINANCIAL_RESOURCES
  UNION ALL 
  SELECT ACCEPTANCE_DATE FROM INFRASTRUCTURE_RESOURCES
)
SELECT COUNT(*) FROM PreSelect WHERE ACCEPTANCE_DATE = '2015-08-09';

Or add an origin column, as TheDecks suggests, if you need each value separately.

2nd try:

WITH PreSelect AS (
  SELECT 'Infrastructure Resources' AS Origin, ACCEPTANCE_DATE FROM INFRASTRUCTURE_RESOURCES
  UNION ALL 
  SELECT 'Human Resources' AS Origin, ACCEPTANCE_DATE FROM HUMAN_RESOURCES
  UNION ALL 
  SELECT 'Financial Resources' AS Origin, ACCEPTANCE_DATE FROM FINANCIAL_RESOURCES
)
SELECT Origin, COUNT(*) FROM PreSelect 
WHERE ACCEPTANCE_DATE = '2015-08-09' 
GROUP BY Origin
ORDER BY 2 DESC;

This version does not sum the counts up; it provides descriptive labels and orders the rows by count, highest first.

3rd try:

WITH PreSelect AS (
  SELECT 'Infrastructure Resources' AS Origin, ACCEPTANCE_DATE FROM INFRASTRUCTURE_RESOURCES
  UNION ALL 
  SELECT 'Human Resources' AS Origin, ACCEPTANCE_DATE FROM HUMAN_RESOURCES
  UNION ALL 
  SELECT 'Financial Resources' AS Origin, ACCEPTANCE_DATE FROM FINANCIAL_RESOURCES
), 
Categories AS (
  SELECT DISTINCT Origin FROM PreSelect
),
ReferenceDate AS (
    SELECT Origin, COUNT(*) RecordCount FROM PreSelect 
    WHERE ACCEPTANCE_DATE = '2015-08-09' 
    GROUP BY Origin
)
SELECT c.Origin, ISNULL(rd.RecordCount, 0) AS RecordCount FROM Categories c
LEFT OUTER JOIN ReferenceDate rd ON  c.Origin = rd.Origin 
ORDER BY 2 DESC;

This way, origins with 0 matching rows also appear (provided the table itself contains at least one row, since Categories is derived from PreSelect).

Yes, a CTE is the best option for this... in the CTE approach, the WHERE clause filters the data on the combined (UNION ALL) result.

If your issue is that you only want to define ACCEPTANCE_DATE once, then you can keep your original SQL, wrap it in a macro or a stored procedure (SP), and pass ACCEPTANCE_DATE in as a parameter.
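If the query is issued from client code rather than from a macro, the same "define it once" effect can be had by generating the UNION ALL from a list of table names and binding a single named parameter. This is only a sketch under assumptions: it uses Python's sqlite3 instead of a Teradata driver, and TABLES, build_count_query and :in_date are hypothetical names introduced for illustration:

```python
import sqlite3

# Hypothetical list of source tables; with 30+ tables only this list grows.
TABLES = ["HUMAN_RESOURCES", "FINANCIAL_RESOURCES", "INFRASTRUCTURE_RESOURCES"]


def build_count_query(tables):
    """Build one UNION ALL query whose every branch uses the same :in_date parameter.

    Table names are interpolated into the SQL text, so they must come from a
    trusted, hard-coded list, never from user input.
    """
    branches = [
        f"SELECT '{t}' AS tab, COUNT(*) AS cnt FROM {t} WHERE ACCEPTANCE_DATE = :in_date"
        for t in tables
    ]
    return "\nUNION ALL\n".join(branches)


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
for t in TABLES:
    cur.execute(f"CREATE TABLE {t} (ACCEPTANCE_DATE TEXT)")
# Hypothetical sample row.
cur.execute("INSERT INTO HUMAN_RESOURCES VALUES ('2015-08-09')")

# The date is supplied exactly once, however many branches the query has.
counts = dict(cur.execute(build_count_query(TABLES), {"in_date": "2015-08-09"}).fetchall())
print(counts)
```

Because each branch is an ungrouped COUNT(*), tables with no matching rows still return a row with 0, as in the original query.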

If you want to rewrite the SQL, maybe try something like this:

SELECT MyCount FROM (
  SELECT ACCEPTANCE_DATE, MyCount
  FROM (
    SELECT ACCEPTANCE_DATE, COUNT(*) AS MyCount
    FROM MY_DATABASE.HUMAN_RESOURCES
    GROUP BY ACCEPTANCE_DATE
  ) hr -- each derived table is given an alias, which many dialects require

  UNION ALL

  SELECT ACCEPTANCE_DATE, MyCount
  FROM (
    SELECT ACCEPTANCE_DATE, COUNT(*) AS MyCount
    FROM MY_DATABASE.FINANCIAL_RESOURCES
    GROUP BY ACCEPTANCE_DATE
  ) fr

  UNION ALL

  SELECT ACCEPTANCE_DATE, MyCount
  FROM (
    SELECT ACCEPTANCE_DATE, COUNT(*) AS MyCount
    FROM MY_DATABASE.INFRASTRUCTURE_RESOURCES
    GROUP BY ACCEPTANCE_DATE
  ) ir
) src
WHERE ACCEPTANCE_DATE = '2015-08-09';

This will likely not perform very well if you have lots of rows in these tables, unless you have some optimization, such as a PPI (partitioned primary index), defined on the ACCEPTANCE_DATE columns.

I haven't tested this, so you may have some syntax errors to work through, but it should get you what you want.
